DeepSciVerify: Verifying Scientific Claim-Citation Alignment via LLM-Driven Evidence Escalation
DeepSciVerify, introduced in an arXiv paper (2605.27710), is a pipeline that verifies whether scientific claims align with the evidence they cite. It first checks each claim against the cited paper's abstract, then escalates only the uncertain cases to a second stage that retrieves and analyzes full-text passages. This selective escalation exploits complementary behaviors across LLMs: some models are more conservative under uncertainty, others more decisive. On the SCitance benchmark, DeepSciVerify achieves 86.7 Micro-F1, outperforming strong abstract-only baselines by 4.5 points. It resolves 67% of instances without full-text retrieval, so only the remaining 33% incur the cost of escalation, which improves both accuracy and efficiency. The system targets a common failure mode of LLM-generated reports, in which claims do not match their citations, limiting reliability in scientific and other high-stakes settings.
In short, DeepSciVerify improves the reliability of LLM-generated scientific reports by efficiently verifying claim-citation alignment.
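
The two-stage flow described above (check the abstract first, escalate to full text only when uncertain) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: the label set ("supported", "unsupported", "uncertain"), the function verify_claim, and the three callables standing in for the LLM judges and the passage retriever. The two judges are kept as separate parameters so that they could be backed by different models, but how the paper actually assigns models to stages is not specified here.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumed label set for illustration; the paper's actual labels may differ.
SUPPORTED, UNSUPPORTED, UNCERTAIN = "supported", "unsupported", "uncertain"


@dataclass
class Verdict:
    label: str  # final alignment label
    stage: str  # "abstract" or "full_text": where the decision was made


def verify_claim(
    claim: str,
    abstract: str,
    judge_abstract: Callable[[str, str], str],
    retrieve_passages: Callable[[str], List[str]],
    judge_passages: Callable[[str, List[str]], str],
) -> Verdict:
    """Hypothetical two-stage check with selective escalation."""
    # Stage 1: cheap check against the abstract only.
    first = judge_abstract(claim, abstract)
    if first != UNCERTAIN:
        return Verdict(first, "abstract")

    # Stage 2: only uncertain cases pay for full-text retrieval.
    passages = retrieve_passages(claim)
    return Verdict(judge_passages(claim, passages), "full_text")


# Toy stand-ins for the LLM judges and retriever (assumptions, not real models).
def toy_judge_abstract(claim: str, abstract: str) -> str:
    return SUPPORTED if claim.lower() in abstract.lower() else UNCERTAIN


def toy_retrieve(claim: str) -> List[str]:
    return ["full-text passage mentioning " + claim]


def toy_judge_passages(claim: str, passages: List[str]) -> str:
    return SUPPORTED if any(claim in p for p in passages) else UNSUPPORTED


if __name__ == "__main__":
    abstract = "We show that method A reduces error."
    for claim in ["method A reduces error", "method A scales linearly"]:
        print(claim, "->", verify_claim(
            claim, abstract, toy_judge_abstract, toy_retrieve, toy_judge_passages
        ))
```

In this sketch the first claim is settled at the abstract stage and never triggers retrieval, while the second is escalated. That early return is the mechanism behind the efficiency figure quoted above: instances resolved from the abstract skip the retrieval and second-stage calls entirely.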