ScientistOne: Towards Human-Level Autonomous Research via Chain-of-Evidence
A new arXiv paper introduces ScientistOne, an end-to-end autonomous research system designed to address the verifiability failures common in AI-generated research. The system implements Chain-of-Evidence (CoE), a framework that requires every claim to be traceable to its evidence source, and it maintains these evidence chains throughout literature review, solution discovery, and paper writing. The paper also introduces CoE Audit, a post-hoc audit with four integrity checks: score verification, specification violation, reference verification, and method-code alignment. Across 75 papers spanning five systems and five frontier research tasks, the baselines exhibited systematic failures: hallucinated reference rates reached 21%, score verification passed in as few as 42% of papers, and method-code alignment ranged from 20% to 80%. ScientistOne, by contrast, produced zero hallucinated references, compared with rates of up to 21% among the baselines. The paper is available on arXiv under ID 2605.26340, published May 27, 2026.
The approach aims to make autonomous research trustworthy by eliminating hallucinated references and keeping every claim verifiable.
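
To make the Chain-of-Evidence idea concrete, here is a minimal Python sketch of claims that carry pointers to their evidence, along with two simple checks in the spirit of the audit: one flags claims that cannot be traced, the other computes a hallucinated reference rate. This is an illustration under stated assumptions, not the paper's implementation. The names `Evidence`, `Claim`, `untraceable_claims`, and `hallucinated_reference_rate` are hypothetical, and the sample identifiers are made-up placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    """One piece of evidence a claim can point to (hypothetical schema)."""
    evidence_id: str
    kind: str    # e.g. "reference", "experiment_log", "code"
    source: str  # where the evidence can be looked up


@dataclass
class Claim:
    """A statement in the paper plus the ids of the evidence backing it."""
    text: str
    evidence_ids: list[str] = field(default_factory=list)


def untraceable_claims(claims: list[Claim], evidence: list[Evidence]) -> list[Claim]:
    """Return claims with no evidence, or with an evidence id that does not resolve."""
    known = {e.evidence_id for e in evidence}
    return [
        c for c in claims
        if not c.evidence_ids or any(eid not in known for eid in c.evidence_ids)
    ]


def hallucinated_reference_rate(cited: list[str], verified: list[str]) -> float:
    """Fraction of distinct cited references that are absent from the verified set."""
    cited_set = set(cited)
    if not cited_set:
        return 0.0
    return len(cited_set - set(verified)) / len(cited_set)


# Placeholder data, invented purely for illustration.
evidence = [
    Evidence("ref:placeholder-a", "reference", "bibliography entry A"),
    Evidence("log:placeholder-run", "experiment_log", "experiment run record"),
]
claims = [
    Claim("The method improves the task score.", ["log:placeholder-run"]),
    Claim("Prior work reports similar gains.", ["ref:placeholder-b"]),
    Claim("The approach is novel."),
]

flagged = untraceable_claims(claims, evidence)
print([c.text for c in flagged])
# ['Prior work reports similar gains.', 'The approach is novel.']

print(hallucinated_reference_rate(["a", "b", "c", "d"], ["a", "b", "c"]))
# 0.25
```

In this sketch the second claim is flagged because its evidence id does not resolve, and the third because it cites nothing. A real audit would also need to confirm that each resolved source actually supports the claim, which this sketch does not attempt.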