RECAST (Replication and Extension with Causal AI Statistical Toolkit)
external · status: active · focus: replication · discipline: econometrics · started: 2026
Project page: https://github.com/qgallea/recast-causal-ai
Source: projects/landscape/recast-causal-ai.yml
Positioning
An end-to-end autonomous pipeline covering the replication, extension, and peer-review arc of the RISE concept diagram. Given a published econometrics paper and its replication data, RECAST reproduces the original results, extends them with Double/Debiased Machine Learning (7 ML methods, 20+ sample splits) and Causal Forests, submits the extended findings to a structured three-referee AI review, and emits a final synthesis report.
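To make the Double/Debiased Machine Learning extension step concrete, here is a minimal sketch of the partialling-out estimator with cross-fitting, repeated over 20 random sample splits. It is not RECAST's code. The ridge learner stands in for any one of the ML methods, the median aggregation across splits is an assumption borrowed from common DML practice, and the function names and simulated data are hypothetical.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Fit ridge regression (with intercept) on the training fold, predict on the test fold."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    coef = np.linalg.solve(
        Xc.T @ Xc + alpha * np.eye(Xc.shape[1]), Xc.T @ (y_train - y_mean)
    )
    return y_mean + (X_test - x_mean) @ coef


def dml_plr_once(y, d, X, n_folds, rng):
    """One cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + e."""
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Nuisance predictions come from models that never saw the test fold.
        y_res[test_idx] = y[test_idx] - ridge_fit_predict(X[train], y[train], X[test_idx])
        d_res[test_idx] = d[test_idx] - ridge_fit_predict(X[train], d[train], X[test_idx])
    # Regress the outcome residual on the treatment residual.
    return float(d_res @ y_res / (d_res @ d_res))


def dml_plr_repeated(y, d, X, n_splits=20, n_folds=5, seed=0):
    """Repeat the cross-fitted estimate over random splits; report the median (assumed rule)."""
    rng = np.random.default_rng(seed)
    estimates = [dml_plr_once(y, d, X, n_folds, rng) for _ in range(n_splits)]
    return float(np.median(estimates)), estimates


# Simulated data with a known treatment effect of 1.5 (illustration only).
sim = np.random.default_rng(42)
n, p = 2000, 5
X = sim.normal(size=(n, p))
d = X @ np.array([0.5, -0.3, 0.2, 0.0, 0.1]) + sim.normal(size=n)
y = 1.5 * d + X @ np.array([1.0, 0.5, -0.5, 0.2, 0.0]) + sim.normal(size=n)

theta_hat, all_estimates = dml_plr_repeated(y, d, X)
print(f"median estimate over {len(all_estimates)} splits: {theta_hat:.3f}")
```

Repeating the split matters because any single cross-fitting partition adds its own noise to the estimate; aggregating over many partitions makes the reported coefficient less dependent on one random draw.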
Distinctive contribution
Couples replication with methodological extension. Most agentic-replication tools stop at "did we get the same numbers?"; RECAST proposes that the same agentic infrastructure should also push existing papers forward with modern causal-ML estimators, and that its own output should be stress-tested by AI referees before publication.
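The referee stage described above can be pictured as a small fan-out and merge: each referee receives its own copy of the findings, reports independently, and a synthesis step combines the reports. The sketch below is an illustration under stated assumptions, not RECAST's implementation. In RECAST the referees are isolated model instances, whereas here they are plain functions; the referee roles, thresholds, field names, and numbers are all hypothetical.

```python
from __future__ import annotations

from copy import deepcopy
from dataclasses import dataclass
from typing import Callable

# Ordering used to pick the most severe recommendation (hypothetical scale).
SEVERITY = {"accept": 0, "minor": 1, "major": 2}


@dataclass(frozen=True)
class RefereeReport:
    referee: str
    recommendation: str  # one of the SEVERITY keys
    concerns: tuple[str, ...]


def replication_referee(findings: dict) -> RefereeReport:
    """Hypothetical referee: does the replicated coefficient match the published one?"""
    gap = abs(findings["replicated_coef"] - findings["published_coef"])
    concerns = () if gap <= 0.01 else (f"replicated coefficient is off by {gap:.3f}",)
    return RefereeReport("replication", "major" if concerns else "accept", concerns)


def methods_referee(findings: dict) -> RefereeReport:
    """Hypothetical referee: were enough sample splits used in the extension?"""
    concerns = () if findings["n_sample_splits"] >= 20 else ("fewer than 20 sample splits",)
    return RefereeReport("methods", "minor" if concerns else "accept", concerns)


def robustness_referee(findings: dict) -> RefereeReport:
    """Hypothetical referee: does the extension flip the sign of the published effect?"""
    flipped = findings["published_coef"] * findings["extension_coef"] < 0
    concerns = ("extension flips the sign of the published effect",) if flipped else ()
    return RefereeReport("robustness", "major" if concerns else "accept", concerns)


def run_review(findings: dict, referees: list[Callable[[dict], RefereeReport]]) -> list[RefereeReport]:
    # Each referee gets a deep copy, so none can read or mutate another's input.
    return [referee(deepcopy(findings)) for referee in referees]


def synthesize(reports: list[RefereeReport]) -> dict:
    """Merge reports: the overall verdict is the most severe individual recommendation."""
    overall = max((r.recommendation for r in reports), key=SEVERITY.__getitem__)
    concerns = [f"[{r.referee}] {c}" for r in reports for c in r.concerns]
    return {"overall": overall, "concerns": concerns}


# Made-up findings for illustration only.
findings = {
    "published_coef": 0.42,
    "replicated_coef": 0.42,
    "extension_coef": 0.31,
    "n_sample_splits": 10,
}
reports = run_review(findings, [replication_referee, methods_referee, robustness_referee])
synthesis = synthesize(reports)
print(synthesis)
```

Taking the most severe recommendation as the overall verdict is one conservative design choice; a real synthesis step could weigh or reconcile the reports differently.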
Evaluation scores
| Dimension | Score (0–3) | Note |
|---|---|---|
| Lifecycle coverage | 2 | Spans six pipeline stages (replication, data, analysis, code, peer review, drafting) — broader than pure replication tools. |
| Autonomy level | 2 | Supervised: user supplies target paper + data; system runs the rest autonomously, including the multi-referee review. |
| Architectural transparency | 3 | MIT-licensed; the pipeline is implemented as Claude Code skill (.md) files, which double as the architecture documentation. |
| Inputs supported | 2 | Accepts target paper PDF/text + replication dataset; integrates external knowledge via Claude Code tool use. |
| Outputs / reproducibility | 2 | Replicated tables, extension results, code, and referee reports all persisted as durable artifacts. |
| Internal evaluation | 2 | Built-in three-referee AI review acts as runtime quality control on each run; no large external benchmark of replication-success rates yet. |
| Openness | 3 | MIT license; public GitHub repo; documentation site on GitHub Pages. |
| Maturity / traction | 1 | Active early-stage project (single-developer, low star count as of 2026-05). |
| Cross-family policy | 0 | Single-LLM-family pipeline (Claude Code); referees are isolated instances of the same model family. |
| Runtime assurance | 3 | Structured three-isolated-referee review + synthesis loop is the assurance layer; among the most explicit in the catalog. |
| Cross-platform portability | 1 | Claude-Code-specific (skill files); not portable to other agent harnesses without rewriting. |
Scored on 2026-05-22. See the evaluation rubric.
Tags
Pipeline stages: replication data-acquisition data-analysis code-generation peer-review drafting
Architectural features: tool-use multi-agent-review dag-orchestration artifact-versioning claude-code-skill-files
Inputs: target-paper target-dataset
Outputs: replication-report extension-results referee-reports synthesis-report code
Data sources: target-paper-data
Knowledge sources: target-paper doubleml-literature causal-forest-literature
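The `dag-orchestration` tag, together with the six pipeline stages listed above, suggests that stages run in dependency order. The sketch below shows one way such an ordering could be resolved with Python's standard-library `graphlib`. The stage names come from the tags, but the dependency edges are assumptions made for illustration; the source does not specify them, and RECAST itself is orchestrated through Claude Code skill files rather than Python.

```python
from graphlib import TopologicalSorter

# Map each stage to the stages it depends on.
# These edges are hypothetical; only the stage names come from the catalog tags.
STAGE_DEPENDENCIES = {
    "data-acquisition": set(),
    "replication": {"data-acquisition"},
    "code-generation": {"replication"},
    "data-analysis": {"data-acquisition", "code-generation"},
    "peer-review": {"replication", "data-analysis"},
    "drafting": {"peer-review"},
}


def execution_order(dependencies):
    """Return the stages in an order that respects every dependency edge."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(STAGE_DEPENDENCIES)
print(" -> ".join(order))
```

With these assumed edges the graph admits a single valid order, ending in peer review followed by drafting; adding a cycle would make `TopologicalSorter` raise `graphlib.CycleError`.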
Limitations
- Tied to the Claude Code skill-file orchestration model — porting requires reimplementing the pipeline.
- Replication success depends on the availability of the target paper's data and the reproducibility of its code.
- Same-family AI referees may share blind spots with the executor; no cross-family review by default.