E2ER — End-to-End Research

owned · status: active · focus: end-to-end · discipline: economics · started: 2025

Project page: https://github.com/bhanneke/E2ER-project

Source: projects/e2er.yml

Positioning

E2ER is a strategist-driven agentic research pipeline that takes a research idea (human- or agent-supplied) and carries it through literature synthesis, identification, data acquisition, analysis, and paper drafting. It targets the full inputs → knowledge production → outputs arc of the RISE diagram, with explicit data and knowledge side-inputs.

Distinctive contribution

E2ER combines a strategist-orchestrated multi-agent design with persona-rich review loops. It emphasizes durable artifact production over chat output and treats methodological skills (econometrics, replication, referee simulation) as first-class reusable modules.
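To illustrate what "skills as first-class reusable modules" combined with a review loop could look like, here is a minimal sketch. It is not E2ER's actual code: the registry `SKILLS`, the `skill` decorator, the toy `referee_simulation` and `revise` functions, and `review_loop` are all hypothetical names chosen for this example, and the critic is a trivial string check standing in for an LLM-backed reviewer.

```python
from typing import Callable, Dict, List

# Hypothetical model: a "skill" is a named function from a draft to text
# (either a critique or a revised draft). Not E2ER's actual API.
Skill = Callable[[str], str]

SKILLS: Dict[str, Skill] = {}


def skill(name: str) -> Callable[[Skill], Skill]:
    """Register a function as a reusable skill under `name`."""
    def decorator(fn: Skill) -> Skill:
        SKILLS[name] = fn
        return fn
    return decorator


@skill("referee_simulation")
def referee_simulation(draft: str) -> str:
    # Toy critic: objects when the draft never mentions identification.
    # An empty string means "no objections".
    return "" if "identification" in draft else "Add an identification strategy."


@skill("revise")
def revise(draft: str) -> str:
    # Toy reviser: appends a placeholder section to the draft.
    return draft + "\nidentification: TODO describe strategy"


def review_loop(draft: str, max_rounds: int = 3) -> List[str]:
    """Alternate critique and revision; return every draft version produced."""
    versions = [draft]
    for _ in range(max_rounds):
        critique = SKILLS["referee_simulation"](versions[-1])
        if not critique:  # the critic is satisfied, so stop iterating
            break
        versions.append(SKILLS["revise"](versions[-1]))
    return versions


versions = review_loop("We study the effect of X on Y.")
```

Returning the full list of versions rather than only the final draft mirrors the profile's emphasis on durable artifacts: each intermediate draft remains available for inspection.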

Evaluation scores

| Dimension | Score (0–3) | Note |
| --- | --- | --- |
| Lifecycle coverage | 3 | Covers ideation through referee simulation: 12 of 14 canonical stages. |
| Autonomy level | 2 | Supervised agent: human sets up the task and reviews final artifacts; intermediate steps are autonomous. |
| Architectural transparency | 2 | Architecture, prompts, and skills published on GitHub; some orchestration internals still evolving. |
| Inputs supported | 3 | Accepts ideas, RQs, prior papers; integrates literature corpora and live data sources. |
| Outputs / reproducibility | 2 | Versioned artifacts persisted; full end-to-end reproducibility from inputs still in progress. |
| Internal evaluation | 1 | Reviewer-simulation loops provide internal evaluation; no external benchmark or peer-reviewed publication yet. |
| Openness | 2 | Public repository under permissive license; some examples reproducible without proprietary credentials. |
| Maturity / traction | 1 | Active research prototype; single-team use as of 2026-05. |
| Cross-family policy | 0 | Single-LLM-family design (Claude Code); skill-based critics within the same family, so no cross-family policy. |
| Runtime assurance | 2 | Reviewer-simulation skills + skills-loader + persistent memory + scheme-aggregated weighted consensus provide moderate runtime gating; no published claim-audit harness. |
| Cross-platform portability | 1 | Docker stack; Claude Code as primary back-end; some skills reusable elsewhere but no native multi-IDE adapters. |

Scored on 2026-05-18. See the evaluation rubric.
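The runtime-assurance note mentions weighted consensus as a gating mechanism without spelling out the scheme. One plausible reading, sketched below purely as an assumption, is a weighted mean of reviewer scores compared against a threshold. The reviewer names, weights, the 0 to 1 score scale, and the 0.7 threshold are all invented for illustration and are not taken from E2ER.

```python
from typing import Dict


def weighted_consensus(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted mean of reviewer scores; weights need not sum to 1."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def passes_gate(scores: Dict[str, float], weights: Dict[str, float],
                threshold: float = 0.7) -> bool:
    """Gate an artifact: accept only if the consensus reaches the threshold."""
    return weighted_consensus(scores, weights) >= threshold


# Hypothetical reviewer personas and scores on an assumed 0-1 scale.
scores = {"econometrician": 0.9, "referee": 0.5, "replicator": 0.8}
weights = {"econometrician": 2.0, "referee": 1.0, "replicator": 1.0}

consensus = weighted_consensus(scores, weights)  # roughly 0.775
accepted = passes_gate(scores, weights)
```

In this toy case the low referee score is outweighed by the doubly weighted econometrician, so the artifact passes; raising the threshold or the referee's weight would flip the outcome.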

Tags

Pipeline stages: rq-formulation hypothesis-generation literature-discovery literature-synthesis research-design data-acquisition data-analysis formal-modeling code-generation paper-drafting revision-editing referee-simulation

Architectural features: multi-agent human-in-loop tool-use rag-knowledge-base persistent-memory dag-orchestration iterative-loop artifact-versioning
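As a rough illustration of the dag-orchestration tag, the sketch below orders a subset of the pipeline stages listed above using Python's standard `graphlib` module. The dependency edges in `DEPENDS_ON` are assumptions made for this example; the profile does not document which stages E2ER actually runs in sequence or in parallel.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical dependencies: each stage maps to the stages it must wait for.
DEPENDS_ON: Dict[str, Set[str]] = {
    "rq-formulation": set(),
    "literature-discovery": {"rq-formulation"},
    "hypothesis-generation": {"rq-formulation"},
    "literature-synthesis": {"literature-discovery"},
    "research-design": {"hypothesis-generation", "literature-synthesis"},
    "data-acquisition": {"research-design"},
    "data-analysis": {"data-acquisition"},
    "paper-drafting": {"data-analysis", "literature-synthesis"},
    "referee-simulation": {"paper-drafting"},
}


def execution_batches(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """Group stages into batches; stages in one batch have no mutual dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished, unlocking successors
    return batches


batches = execution_batches(DEPENDS_ON)
```

Under these assumed edges, `hypothesis-generation` and `literature-discovery` land in the same batch and could run concurrently, while every other stage waits for its predecessors.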

Inputs: human-idea agentic-idea research-question prior-paper

Outputs: paper-draft figures tables code referee-reports replication-package
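One common way to version outputs like these is to record a content hash per artifact. The sketch below shows that idea under stated assumptions: the file names, the in-memory byte contents, and the helper names `build_manifest` and `manifest_id` are hypothetical, and the profile does not say that E2ER versions artifacts this way.

```python
import hashlib
import json
from typing import Dict


def build_manifest(artifacts: Dict[str, bytes]) -> Dict[str, str]:
    """Map each artifact name to the SHA-256 hex digest of its content."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(artifacts.items())}


def manifest_id(manifest: Dict[str, str]) -> str:
    """Identifier for a whole run: hash of the canonical JSON manifest."""
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Two hypothetical runs that differ only in the paper draft.
run_a = build_manifest({"paper-draft.tex": b"draft v1", "tables.csv": b"a,b\n1,2\n"})
run_b = build_manifest({"paper-draft.tex": b"draft v2", "tables.csv": b"a,b\n1,2\n"})

# Artifacts whose content changed between the two runs.
changed = sorted(name for name in run_a if run_a[name] != run_b[name])
```

Comparing manifests pinpoints which outputs differ between runs, which is one building block toward the end-to-end reproducibility the profile lists as still in progress.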

Data sources: fred yfinance ssrn openalex

Knowledge sources: arxiv ssrn openalex semantic-scholar

Limitations

  • End-to-end reproducibility from inputs not yet demonstrated on a public test case.
  • Domain coverage currently focused on economics/finance; portability to other fields untested.
  • External evaluation (peer review, third-party replication) pending.