STORM / Co-STORM¶
external · status: active · focus: literature · discipline: general · started: 2024
Project page: https://github.com/stanford-oval/storm
Source: projects/landscape/storm.yml
Positioning¶
An LLM-powered knowledge-curation system that writes Wikipedia-style long-form articles from web search. STORM uses perspective-guided question asking and simulated conversations between a writer and a topic expert; Co-STORM (EMNLP 2024) adds a collaborative discourse protocol with human-in-the-loop participation and a dynamic mind map.
Distinctive contribution¶
Treats the pre-writing problem (deciding what questions to ask) as the central bottleneck of automated long-form writing, and operationalizes it via perspective discovery and simulated expert dialogue. Co-STORM further makes the human–LLM curation loop a first-class architectural element.
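The pre-writing loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `Turn`, `simulate_conversations`, `stub_ask`, and `stub_answer` are hypothetical names, and the two stubs stand in for the LLM writer and the retrieval-grounded expert.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Turn:
    """One question/answer exchange in a simulated conversation."""
    perspective: str
    question: str
    answer: str
    sources: List[str]


def simulate_conversations(
    topic: str,
    perspectives: List[str],
    ask: Callable[[str, str, List[Turn]], str],
    answer: Callable[[str], Tuple[str, List[str]]],
    max_turns: int = 2,
) -> List[Turn]:
    """Run one writer/expert dialogue per perspective and collect every turn."""
    transcript: List[Turn] = []
    for perspective in perspectives:
        history: List[Turn] = []  # each perspective starts a fresh dialogue
        for _ in range(max_turns):
            # The writer asks a question conditioned on perspective and history.
            question = ask(topic, perspective, history)
            # The expert answers and reports the sources it relied on.
            text, sources = answer(question)
            history.append(Turn(perspective, question, text, sources))
        transcript.extend(history)
    return transcript


# Hypothetical stand-ins: a real system would call an LLM and a search back-end.
def stub_ask(topic: str, perspective: str, history: List[Turn]) -> str:
    return f"[{perspective}] Q{len(history) + 1} about {topic}?"


def stub_answer(question: str) -> Tuple[str, List[str]]:
    return f"Answer to: {question}", ["https://example.org/source"]


turns = simulate_conversations(
    "solar sails", ["engineer", "historian"], stub_ask, stub_answer
)
```

The collected turns, with their sources, are the raw material an outline and draft would be built from; how STORM actually structures that step is documented in the papers listed further down.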
Evaluation scores¶
| Dimension | Score (0–3) | Note |
|---|---|---|
| Lifecycle coverage | 1 | Three stages clustered around the pre-writing + drafting block; no analysis, modeling, or review. |
| Autonomy level | 2 | Supervised in STORM (topic → article); Co-STORM adds collaborative human steering. |
| Architectural transparency | 3 | Open under MIT; modular interfaces; two arXiv papers (NAACL 2024 + EMNLP 2024) document the design. |
| Inputs supported | 2 | Single input form (topic) but multiple retrieval back-ends: Bing, You.com, custom vector store. |
| Outputs / reproducibility | 2 | Pip-installable knowledge-storm package; outputs are deterministic given the retrieval back-end and model. |
| Internal evaluation | 2 | Both papers report systematic evaluations against baselines; the STORM paper also collects feedback from experienced Wikipedia editors. |
| Openness | 3 | MIT-licensed, pip-installable, demo site, public papers. |
| Maturity / traction | 3 | 28k+ stars, live research preview with 70k+ users, integrated into multiple downstream projects. |
| Cross-family policy | 0 | Single LLM provider per run. |
| Runtime assurance | 1 | Perspective-guided question asking + simulated conversation provide light internal review. |
| Cross-platform portability | 2 | Multiple retrieval back-ends (Bing, You.com, VectorRM); knowledge-storm pip package usable across providers. |
Scored on 2026-05-18. See the evaluation rubric.
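For readers who want to work with these scores programmatically, the table can be held as plain data. The sketch below copies the scores from the table; the summed total and the "weakest dimension" lookup are illustrative aggregates and an assumption of this example, not something the evaluation rubric is known to define.

```python
# Scores copied from the table above (each on a 0-3 scale).
scores = {
    "Lifecycle coverage": 1,
    "Autonomy level": 2,
    "Architectural transparency": 3,
    "Inputs supported": 2,
    "Outputs / reproducibility": 2,
    "Internal evaluation": 2,
    "Openness": 3,
    "Maturity / traction": 3,
    "Cross-family policy": 0,
    "Runtime assurance": 1,
    "Cross-platform portability": 2,
}

MAX_PER_DIMENSION = 3

# Illustrative aggregates only; the rubric may weight dimensions differently.
total = sum(scores.values())
maximum = MAX_PER_DIMENSION * len(scores)
lowest = min(scores.values())
weakest = sorted(name for name, value in scores.items() if value == lowest)

print(f"{total}/{maximum}, weakest: {', '.join(weakest)}")
```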
Tags¶
Pipeline stages: literature-discovery literature-synthesis paper-drafting
Architectural features: multi-agent human-in-loop tool-use rag-knowledge-base iterative-loop
Inputs: topic
Outputs: long-form-article citations mind-map
Data sources: web-search user-provided-documents
Knowledge sources: bing-search you-search vector-rm
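The interchangeable knowledge sources listed above suggest a common retrieval interface. The following is a minimal sketch of that idea, assuming a single `search(query, k)` method; `Retriever`, `InMemoryRetriever`, and `gather_evidence` are hypothetical names for illustration and are not the knowledge-storm API. The toy keyword retriever stands in for a web-search or vector-store back-end.

```python
from typing import Dict, List, Protocol


class Retriever(Protocol):
    """Anything that can return up to k documents for a query."""

    def search(self, query: str, k: int) -> List[Dict[str, str]]: ...


class InMemoryRetriever:
    """Toy keyword retriever; a real back-end would call a search service."""

    def __init__(self, docs: List[Dict[str, str]]) -> None:
        self.docs = docs

    def search(self, query: str, k: int) -> List[Dict[str, str]]:
        terms = set(query.lower().split())
        scored = []
        for doc in self.docs:
            overlap = len(terms & set(doc["text"].lower().split()))
            if overlap:
                scored.append((overlap, doc))
        # Highest term overlap first; the sort is stable, so ties keep input order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]


def gather_evidence(
    retriever: Retriever, questions: List[str], k: int = 2
) -> List[Dict[str, str]]:
    """Query the back-end once per question and de-duplicate results by URL."""
    seen = set()
    evidence = []
    for question in questions:
        for doc in retriever.search(question, k):
            if doc["url"] not in seen:
                seen.add(doc["url"])
                evidence.append(doc)
    return evidence


docs = [
    {"url": "a", "text": "solar sails use photon pressure"},
    {"url": "b", "text": "ion engines use xenon"},
    {"url": "c", "text": "sails on ships"},
]
evidence = gather_evidence(InMemoryRetriever(docs), ["solar sails", "ion engines"])
urls = [doc["url"] for doc in evidence]
```

Because `gather_evidence` depends only on the `search` method, swapping the toy retriever for another back-end would not change the calling code, which is the property the multiple-back-end design relies on.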
Limitations¶
- The authors state explicitly that the output is not publication-ready and requires significant editing.
- Focused on Wikipedia-style synthesis; not designed to generate novel research.
- Output quality depends on the coverage of the search back-end.
Related projects in this catalog¶
Papers describing this project¶
- Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models — Shao, Y., Jiang, Y., Kanell, T. A., Xu, P., Khattab, O., Lam, M. S. (2024). NAACL 2024. arXiv:2402.14207
- Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations — Jiang, Y., Shao, Y., Ma, D., Semnani, S. J., Lam, M. S. (2024). EMNLP 2024 (Co-STORM). arXiv:2408.15232
Also compared in¶
- Agentic AI for Scientific Discovery: A Survey (gridach2025agenticsurvey). Covered as a flagship literature-synthesis agent.
Related references (literature catalog)¶
- Wu, J. et al. (2025). Agentic Reasoning: A Streamlined Framework for Enhancing LLM Reasoning with Agentic Tools (wu2025agenticreasoning)