Open CoScientist Agents

external · status: dormant · focus: ideation · discipline: general · started: 2025

Project page: https://github.com/conradry/open-coscientist-agents

Source: projects/landscape/open-coscientist.yml

Positioning

An open-source implementation of Google DeepMind's AI co-scientist (arXiv:2502.18864), built on LangGraph and GPT Researcher. It implements the co-scientist's multi-agent design: literature review, generation, reflection, evolution, meta-review, and supervisor agents, plus a tournament-style hypothesis competition with Elo ranking.
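The tournament-with-Elo idea can be illustrated with a minimal sketch. This is not the project's code: the function names, the starting rating of 1200, the K-factor of 32, and the round-robin pairing are all assumptions chosen for illustration, and `judge` is a hypothetical stand-in for whatever LLM-driven comparison decides each match.

```python
import itertools
from typing import Callable


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool,
               k: float = 32.0) -> tuple[float, float]:
    """Return the new (rating_a, rating_b) after one match.

    k is the K-factor (assumed value; the project may use another).
    """
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def run_tournament(hypotheses: list[str],
                   judge: Callable[[str, str], bool],
                   initial: float = 1200.0,
                   k: float = 32.0) -> list[tuple[str, float]]:
    """Round-robin tournament; judge(a, b) returns True if a wins.

    Returns (hypothesis, rating) pairs sorted best first.
    """
    ratings = {h: initial for h in hypotheses}
    for a, b in itertools.combinations(hypotheses, 2):
        ratings[a], ratings[b] = update_elo(ratings[a], ratings[b],
                                            judge(a, b), k)
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy judge (hypothetical): the longer hypothesis text wins.
    ranked = run_tournament(["aaa", "bb", "c"], lambda a, b: len(a) > len(b))
    for hypothesis, rating in ranked:
        print(f"{hypothesis}: {rating:.1f}")
```

Because each update adds to one rating exactly what it subtracts from the other, the total rating in the pool stays constant; only the relative ordering of hypotheses changes as matches accumulate.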

Distinctive contribution

Makes a major closed industrial design (DeepMind's co-scientist) inspectable and runnable: every agent role and the Elo tournament loop are present in code, and a Streamlit dashboard exposes the hypothesis competition transcripts. This makes it useful as a reference implementation against which the original paper's claims can be audited.

Evaluation scores

Dimension	Score (0–3)	Note
Lifecycle coverage	1	Four upstream stages culminating in ranked hypotheses; no execution or drafting.
Autonomy level	3	Supervisor agent orchestrates the loop without per-step approval.
Architectural transparency	3	MIT-licensed; every agent role visible in source; references the original DeepMind paper.
Inputs supported	2	Research-goal inputs; multi-LLM (Gemini 2.5 Pro + Claude Sonnet 4 + o3) collaboration.
Outputs / reproducibility	1	Tournament transcripts persisted; LLM nondeterminism and live web search limit run-to-run determinism.
Internal evaluation	1	Demo-quality evaluation; tournament metrics are internal to the loop, not external validation.
Openness	3	MIT-licensed; reproducible setup with PyPI install path.
Maturity / traction	1	53 stars; last push 2025-07; appears semi-maintained.
Cross-family policy	3	Requires Gemini 2.5 Pro + Claude Sonnet 4 + o3 in collaboration; explicitly multi-family by design.
Runtime assurance	2	Elo tournament + meta-review + reflection agents provide debate-based runtime gating.
Cross-platform portability	1	Depends on LangGraph + GPT Researcher; single Python entry point.

Scored on 2026-05-18. See the evaluation rubric.

Tags

Pipeline stages: literature-discovery literature-synthesis hypothesis-generation research-design

Architectural features: multi-agent tool-use rag-knowledge-base iterative-loop debate-consensus

Inputs: research-goal

Outputs: ranked-hypotheses tournament-transcripts research-report

Data sources: web-search

Knowledge sources: web-search gpt-researcher

Limitations

Requires API keys from multiple commercial providers (Gemini + Anthropic + OpenAI) for the full design.
Independent implementation; fidelity to the original DeepMind system is the implementer's best effort, not author-verified.
Last push 2025-07.
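As a hedged illustration of the multi-provider requirement above, a pre-flight check might confirm that a key is present for each provider before starting a run. The environment variable names below are the ones conventionally used by each provider's tooling and are an assumption here; the project's own configuration may read different names.

```python
import os
from typing import Mapping, Optional

# Assumed variable names (conventional defaults, not confirmed by the project).
REQUIRED_KEYS = {
    "Gemini": "GOOGLE_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "OpenAI": "OPENAI_API_KEY",
}


def missing_providers(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the providers whose API key is unset or empty."""
    env = os.environ if env is None else env
    return [provider for provider, var in REQUIRED_KEYS.items()
            if not env.get(var)]


if __name__ == "__main__":
    missing = missing_providers()
    if missing:
        print("Missing API keys for:", ", ".join(missing))
    else:
        print("All three provider keys are set.")
```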

Papers describing this project

Gottweis, J., Weng, W.-H., Daryin, A., Tu, T., Palepu, A., Sirkovic, P., et al. (2025). Towards an AI co-scientist. arXiv:2502.18864 (Google DeepMind).

Wu, J. et al. (2025). Agentic Reasoning: A Streamlined Framework for Enhancing LLM Reasoning with Agentic Tools.
Park, J. S. et al. (2023). Generative Agents: Interactive Simulacra of Human Behavior.