Methodology
A transparent synthesis of six public rankings: what we measure, what we do not, and how to read our numbers.
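What FWUR estimates (and what it does not)
FWUR Rank is a transparent synthesis of six public university rankings. We measure (1) where these rankings disagree, (2) where they converge, and (3) how sensitive the consensus is to the choice of rankings included. We do not directly measure the quality of education or research.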
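The three things we measure
- Primary: disagreement
How much the six agencies differ in their ranking of a given institution. This is why FWUR exists; the consensus number is the hook, and the disagreement signal is the substance.
- Secondary: consensus
A robust trimmed-mean summary of where the agencies converge. It is presented as the headline number, but is visually no larger than the disagreement display.
- Tertiary: method sensitivity
How much the answer depends on which agencies are included, shown through the custom-subset (Mode C) view and the method-sensitivity band.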
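To make the first two quantities concrete, here is a minimal sketch of a trimmed-mean consensus and a simple disagreement measure for one institution. It is an illustration under stated assumptions, not the FWUR algorithm: the trim depth (one value from each end), the use of the rank range as the disagreement measure, the agency labels, and the rank values are all hypothetical. The authoritative definitions are in the Smart_Rank formal specification.

```python
from statistics import mean


def trimmed_mean(ranks, trim=1):
    """Mean of the ranks after dropping the `trim` lowest and `trim` highest.

    `trim=1` is an assumed depth chosen for illustration only.
    """
    if len(ranks) <= 2 * trim:
        raise ValueError("not enough ranks left after trimming")
    kept = sorted(ranks)[trim:len(ranks) - trim]
    return mean(kept)


def rank_spread(ranks):
    """Range of the ranks across agencies: a crude disagreement measure."""
    return max(ranks) - min(ranks)


# Hypothetical ranks for one institution from six agencies (labels invented).
ranks = {"A": 12, "B": 15, "C": 9, "D": 40, "E": 14, "F": 11}
values = list(ranks.values())

consensus = trimmed_mean(values)  # ignores the extremes 9 and 40
spread = rank_spread(values)      # dominated by the outlying rank of 40
```

In this example a single outlying agency barely moves the consensus but dominates the spread, which is why the two numbers are reported side by side.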
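Honest limitations
The FWUR v0.1 algorithm and the completed v1.0 product were locked on 2026-05-08 under the sole authority of the project lead, without sign-off from an external statistical consultant or domain-expert reviewer. The decision rests on the project lead's seven years of accumulated thinking about multi-agency aggregation, three rounds of LLM peer review totalling fifteen reviews, and the deterministic v0.1 baseline (62 unit tests and theorem proofs).
Validation is carried out through an internal Saltelli–Sobol method-sensitivity analysis (Track C). External validation paths (a user A/B study; a Bradley–Terry expert pairwise-comparison panel) are recorded as aspirations pending future budget. The Bayesian-model research branch is deferred indefinitely for the same reason.
That is the honest constraint. We do not claim external academic validation that we do not have.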
Methodological honesty — what we deliberately do not do
Why we avoid frequentist uncertainty intervals
The six rankings are not a random sample drawn from a population — they are the population of major published university rankings. Standard frequentist uncertainty quantification (the kind that produces an interval with a coverage guarantee) requires a sampling model that does not exist here, so quoting one would be mathematically misleading. Instead we surface a qualitative disagreement bucket (high agreement / mixed signal / divergent signal) and a method-sensitivity band (planned for v0.2 once the Saltelli–Sobol pipeline runs over the 41 size-≥3 agency subsets). Our naming-discipline lint actively blocks the corresponding language in user-facing copy.
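As a rough illustration of what a subset-based sensitivity band could look like, the sketch below enumerates agency subsets and records how a summary rank moves across them. Several things here are assumptions rather than the project's method: this is plain enumeration, not a Saltelli–Sobol analysis; the median stands in for the real aggregator; the agency labels and ranks are invented; and the count of 41 is reproduced by taking subsets of size 3 to 5 and treating the full six-agency set as the baseline rather than as one of the subsets.

```python
from itertools import combinations
from statistics import median

AGENCIES = ["A", "B", "C", "D", "E", "F"]  # hypothetical labels


def proper_subsets(agencies, min_size=3):
    """Yield every subset with min_size <= size < len(agencies).

    The full set is excluded on the assumption that it is the baseline.
    """
    for k in range(min_size, len(agencies)):
        yield from combinations(agencies, k)


def sensitivity_band(ranks, min_size=3):
    """Return (low, high) of the median rank over all proper subsets.

    The median is a stand-in aggregator used only for illustration.
    """
    summaries = [
        median(ranks[a] for a in subset)
        for subset in proper_subsets(list(ranks), min_size)
    ]
    return min(summaries), max(summaries)


subset_count = len(list(proper_subsets(AGENCIES)))  # 20 + 15 + 6 subsets

# Hypothetical ranks for one institution.
ranks = {"A": 12, "B": 15, "C": 9, "D": 40, "E": 14, "F": 11}
low, high = sensitivity_band(ranks)
```

A narrow band suggests the summary does not hinge on which agencies are included; a wide band flags an institution whose position depends heavily on that choice.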
Why our trajectory chart is overlay, not small multiples
Edward Tufte's rule for time series with more than three lines is small multiples — one mini-chart per agency, faceted side by side. We use overlay (six lines on one chart) because the user task is direct comparison: did agency X agree with agency Y this year? Faceted small multiples answer that less directly than co-located lines. We acknowledge the trade-off: with six overlapping series the chart can look crowded, especially in the middle of the rank range. A small-multiples view is on the v0.2-x backlog as an option toggle, not a default.
Both limits have explicit reactivation triggers in CONSTRAINTS.md §5: when external statistical consultation becomes accessible, or when the Saltelli–Sobol pipeline yields a defensible empirical band, the corresponding methodology section will be amended via a new ADR.
Standards we follow
Leiden Manifesto (Hicks et al. 2015) · Berlin Principles (IREG Observatory) · OECD/JRC Handbook on Constructing Composite Indicators (Saisana 2008/2011) · DORA · AAPOR
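Full details
- Smart_Rank formal specification (docs/SMART_RANK_FORMAL_SPEC.md)
- Identity and direction (docs/DECISIONS/ADR-040)
- Roadmap, including Amendment 2 (docs/DECISIONS/ADR-034)
- Validation framework, including Amendment 1 (docs/DECISIONS/ADR-036)
- Operating constraints (docs/CONSTRAINTS.md)
These documents are part of the project repository; the methodology evolves through versioned ADR amendments, not silent changes.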