RECAST (Replication and Extension with Causal AI Statistical Toolkit)

external · status: active · focus: replication · discipline: econometrics · started: 2026

Project page: https://github.com/qgallea/recast-causal-ai

Source: projects/landscape/recast-causal-ai.yml

Positioning

An end-to-end autonomous pipeline covering the replication, extension, and peer-review arc of the RISE concept diagram. Given a published econometrics paper and its replication data, RECAST reproduces the original results, extends them with Double/Debiased Machine Learning (7 ML methods, 20+ sample splits) and Causal Forests, submits the extended findings to a structured three-referee AI review, and emits a final synthesis report.
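To make the "sample splits" idea concrete, here is a minimal, dependency-free sketch of cross-fitted Double/Debiased ML for a partially linear model, repeated over 20 random splits. It is not RECAST's code: the function names, the synthetic data, the plain least-squares line standing in for the ML nuisance learners, and the median aggregation across splits are all assumptions made for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Fit y = a + b*x by least squares and return the prediction function.

    Assumption: this stands in for an ML nuisance learner so the sketch
    needs no third-party packages.
    """
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def dml_one_split(y, d, x, n_folds, rng):
    """One cross-fitted estimate of theta in y = theta*d + g(x) + noise."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    y_res = [0.0] * len(y)
    d_res = [0.0] * len(y)
    for k, held_out in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        # Nuisance functions are fitted on the other folds only.
        l_hat = fit_line([x[i] for i in train], [y[i] for i in train])  # E[y | x]
        m_hat = fit_line([x[i] for i in train], [d[i] for i in train])  # E[d | x]
        for i in held_out:
            y_res[i] = y[i] - l_hat(x[i])
            d_res[i] = d[i] - m_hat(x[i])
    # Regressing outcome residuals on treatment residuals gives theta.
    return sum(dr * yr for dr, yr in zip(d_res, y_res)) / sum(dr * dr for dr in d_res)


def dml_repeated(y, d, x, n_folds=5, n_splits=20, seed=0):
    """Repeat cross-fitting over several random splits; report the median."""
    rng = random.Random(seed)
    estimates = [dml_one_split(y, d, x, n_folds, rng) for _ in range(n_splits)]
    return statistics.median(estimates), estimates


def simulate(n, theta, seed=0):
    """Synthetic data with a known treatment effect theta (illustration only)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
    y = [theta * di + 1.2 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return y, d, x


if __name__ == "__main__":
    y, d, x = simulate(2000, theta=0.5)
    theta_hat, estimates = dml_repeated(y, d, x)
    print(f"median theta over {len(estimates)} splits: {theta_hat:.3f}")
```

Repeating the split matters because a single random partition adds its own noise to the estimate; taking the median over many partitions is one common way to damp it. A real extension would replace `fit_line` with flexible learners and add standard errors.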

Distinctive contribution

Couples replication with methodological extension. Most agentic-replication tools stop at "did we get the same numbers?"; RECAST proposes that the same agentic infrastructure should also push existing papers forward with modern causal-ML estimators, and that its own output should be stress-tested by AI referees before publication.

Evaluation scores

Dimension	Score (0–3)	Note
Lifecycle coverage	2	Spans six pipeline stages (replication, data, analysis, code, peer review, drafting) — broader than pure replication tools.
Autonomy level	2	Supervised: user supplies target paper + data; system runs the rest autonomously, including the multi-referee review.
Architectural transparency	3	MIT-licensed; pipeline is implemented as Claude-Code skill (.md) files which are themselves the architecture documentation.
Inputs supported	2	Accepts target paper PDF/text + replication dataset; integrates external knowledge via Claude Code tool use.
Outputs / reproducibility	2	Replicated tables, extension results, code, and referee reports all persisted as durable artifacts.
Internal evaluation	2	Built-in three-referee AI review acts as runtime quality control on each run; no large external benchmark of replication-success rates yet.
Openness	3	MIT license; public GitHub repo; documentation site on GitHub Pages.
Maturity / traction	1	Active early-stage project (single-developer, low star count as of 2026-05).
Cross-family policy	0	Single-LLM-family pipeline (Claude Code); referees are isolated instances of the same model family.
Runtime assurance	3	Structured three-isolated-referee review + synthesis loop is the assurance layer; among the most explicit in the catalog.
Cross-platform portability	1	Claude-Code-specific (skill files); not portable to other agent harnesses without rewriting.

Scored on 2026-05-22. See the evaluation rubric.
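The three-isolated-referee review and synthesis loop scored under "Runtime assurance" can be pictured with a small sketch. Everything here is hypothetical: the referee names, the keyword-based stub referees (which stand in for isolated model instances), and the "most severe recommendation wins" synthesis rule are illustrative assumptions, not RECAST's actual skill-file logic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumed ordering of recommendations, from mildest to most severe.
SEVERITY = {"accept": 0, "minor": 1, "major": 2, "reject": 3}


@dataclass(frozen=True)
class Report:
    referee: str
    recommendation: str
    concerns: Tuple[str, ...]


Referee = Callable[[str], Report]


def make_keyword_referee(name: str, required: Dict[str, str]) -> Referee:
    """Build a stub referee that raises one concern per missing keyword."""
    def review(manuscript: str) -> Report:
        text = manuscript.lower()
        concerns = tuple(msg for kw, msg in required.items() if kw not in text)
        recommendation = "accept" if not concerns else "major"
        return Report(name, recommendation, concerns)
    return review


def default_referees() -> List[Referee]:
    """Three hypothetical referees with different areas of focus."""
    return [
        make_keyword_referee("identification", {"instrument": "no instrument discussed"}),
        make_keyword_referee("estimation", {"cross-fitting": "cross-fitting not described"}),
        make_keyword_referee("robustness", {"placebo": "no placebo test reported"}),
    ]


def run_isolated_review(manuscript: str, referees: List[Referee]) -> List[Report]:
    # Isolation: each referee receives only the manuscript, never another report.
    return [referee(manuscript) for referee in referees]


def synthesize(reports: List[Report]) -> Dict[str, object]:
    """Merge reports: keep the most severe recommendation and all concerns."""
    worst = max(reports, key=lambda r: SEVERITY[r.recommendation])
    concerns = sorted({c for r in reports for c in r.concerns})
    return {
        "recommendation": worst.recommendation,
        "concerns": concerns,
        "n_reports": len(reports),
    }


if __name__ == "__main__":
    draft = "We use cross-fitting and an instrument for treatment."
    print(synthesize(run_isolated_review(draft, default_referees())))
```

The point of the structure is that referees cannot anchor on one another: disagreement only surfaces at the synthesis step. It does not address the same-family limitation, since all three stubs would still be backed by one model family.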

Tags

Pipeline stages: replication data-acquisition data-analysis code-generation peer-review drafting

Architectural features: tool-use multi-agent-review dag-orchestration artifact-versioning claude-code-skill-files

Inputs: target-paper target-dataset

Outputs: replication-report extension-results referee-reports synthesis-report code

Data sources: target-paper-data

Knowledge sources: target-paper doubleml-literature causal-forest-literature

Limitations

Tied to the Claude Code skill-file orchestration model — porting requires reimplementing the pipeline.
Replication success depends on the availability of the target paper's data and the reproducibility of its code.
Same-family AI referees may share blind spots with the executor; no cross-family review by default.

Papers describing this project

Double Machine Learning: A Practical Guide. Baiardi, A., Naghi, A. A. (2024). The Econometrics Journal.
How to Write an Effective Referee Report and Improve the Scientific Review Process. Berk, J. B., Harvey, C. R., Hirshleifer, D. (2017). Journal of Economic Perspectives 31(1):231–244.