`consistency-check`¶

Pack: 100xOS shared skills

Category: review

Field: economics

License: private (curator-owned)

Updated: 2026-05-20

Stages: referee-simulation

Curator-private skill — copy text from 100xOS/shared/skills/review/consistency-check.md.

↗ view SKILL.md on source

Consistency Check¶

Purpose¶

This skill describes how to perform a mechanical number verification audit on an academic paper draft. The goal is to cross-reference every quantitative claim in the text against the source artifacts (data summaries, estimation output, analysis results) and flag any discrepancies.

A consistency check is not about whether the methods are correct — that's the job of the referee and technical reviewer. This is about whether the numbers in the paper accurately reflect what the pipeline actually produced.

Step 1: Extract Numerical Claims from the Draft¶

Systematically scan the draft for every quantitative statement:

In-text claims¶

"The sample contains N observations"
"The coefficient on X is 0.45 (SE = 0.12)"
"This represents a Y% increase relative to the mean"
"The effect is significant at the Z% level"
"The mean of X is M (SD = S)"

Tables¶

Every cell in every table is a checkable claim
Pay special attention to: N rows, coefficient cells, SE/t-stat cells, R-squared values, number of fixed effects/clusters

Figures¶

Axis labels with specific values
Annotated data points
Reported statistics in figure notes

Abstract and conclusion¶

These often contain rounded or paraphrased versions of key results — verify they are consistent with the precise values in the body

Step 2: Match Claims Against Source Artifacts¶

For each numerical claim, identify the source artifact:

Claim type	Primary source	Secondary source
Sample size (N)	summary_statistics.json	data_description.md
Variable means/SDs	summary_statistics.json	data_description.md
Coefficients	estimation_results.json	analysis output
Standard errors	estimation_results.json	analysis output
P-values	estimation_results.json (compute from coef/SE)	analysis output
R-squared	estimation_results.json	analysis output
Robustness coefficients	robustness results in analysis	—
Economic magnitudes	Derived from coefficient + mean	summary_statistics.json

Step 3: Apply Tolerance Rules¶

Not every small difference is a discrepancy. Apply these rules:

Exact match required¶

Sample sizes (integers): must match exactly
Signs of coefficients: must match exactly
Significance levels: if the paper says "significant at 5%", the p-value must actually be < 0.05

Rounding tolerance¶

A value is consistent if it rounds to the reported value at the reported precision level
Example: actual = 0.4527, reported as 0.45 → consistent
Example: actual = 0.4527, reported as 0.453 → consistent
Example: actual = 0.4527, reported as 0.46 → inconsistent (minor)
Example: actual = 0.4527, reported as 0.54 → inconsistent (critical — transposed digits)

Derived quantity tolerance¶

For percentage changes: (coefficient / mean) × 100
Allow 0.5 percentage point tolerance for rounding in the chain
For standard deviation units: coefficient / SD
Allow 0.05 SD tolerance for rounding

Step 4: Classify Discrepancies by Severity¶

Critical¶

Wrong sign on a coefficient
Sample size off by more than 10%
Main result coefficient off by more than 20% of its value
Claiming significance when p-value > reported threshold
Transposed digits (0.45 vs 0.54)

Major¶

Coefficient off by 5-20% of its value
Standard error off by more than 10%
Summary statistic (mean/SD) off by more than 10%
Economic magnitude incorrectly computed
Robustness result direction inconsistent with text description

Minor¶

Rounding differences within 5% of value
R-squared differences in third decimal place
Number of observations off by a small count (< 1%)
Slight imprecision in derived quantities

Step 5: Common Error Patterns¶

Watch for these frequent mistakes:

Transposed digits: 0.453 → 0.435 or 0.543
Wrong specification: Reporting coefficient from column (2) as if it were from column (1)
Stale numbers: Draft references an earlier estimation run; numbers have since been updated in the artifacts
Unit confusion: Reporting a coefficient in levels when the variable is in logs (or vice versa)
Percentage vs. percentage points: "5% increase" vs. "5 percentage point increase" — these are very different
Off-by-one in sample size: Often caused by dropping NAs at different stages of the pipeline
Clustered vs. robust SEs: Draft reports one but estimation used the other
Weighted vs. unweighted: Summary stats computed with/without weights

Step 6: Handle Unverifiable Claims¶

Some claims cannot be checked against available artifacts:

Claims referencing external literature ("Smith (2020) finds...")
Qualitative characterizations ("the effect is economically large")
Claims about data that was collected but not stored in artifacts

Mark these as "unverifiable" rather than "consistent" or "inconsistent". A high proportion of unverifiable claims (> 30%) is itself a flag — it suggests the artifacts don't fully document the analysis.

Output Structure¶

The consistency report should be organized as:

Text Only

## Consistency Check Report

### Summary
- Total claims checked: N
- Consistent: N (X%)
- Inconsistent: N (Y%)
- Unverifiable: N (Z%)
- Overall verdict: PASS / FAIL

### Critical Discrepancies
[If any — these must be fixed before the paper can proceed]

### Major Discrepancies
[These should be fixed]

### Minor Discrepancies
[Nice to fix but not blocking]

### Detailed Check Log
[Every claim, its source, and the verification result]

consistency-check¶