Today's digest · Wednesday, May 27

The 41 things in AI/dev today.

Live · Next issue at 7:00 CET
Stories: 41 (3 top · 38 rest)
#1 / TODAY
arXiv cs.AI · 1 min · 39h ago · FREE

ScientistOne: Towards Human-Level Autonomous Research via Chain-of-Evidence

ScientistOne is an autonomous research system that uses Chain-of-Evidence (CoE) to make every claim traceable to its source. In tests across 75 papers, it produced zero hallucinated references, while baselines hallucinated up to 21% of their references and passed score verification in as few as 42% of papers.

Enables trustworthy autonomous research by eliminating hallucinated references and ensuring verifiable claims.
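
A hallucinated-reference rate like the one quoted above can be pictured as the share of cited references that cannot be matched to a known source. The sketch below is only an illustration of that arithmetic: the function name `hallucination_rate`, the reference IDs, and the matching rule (exact ID lookup) are assumptions, not the paper's actual metric or code.

```python
def hallucination_rate(cited, verified):
    """Return the fraction of cited references absent from the verified set.

    Hypothetical helper: 'verified' stands for whatever source index a
    checker trusts; matching here is a plain membership test.
    """
    cited = list(cited)
    if not cited:
        return 0.0  # nothing cited, so nothing can be hallucinated
    missing = [ref for ref in cited if ref not in verified]
    return len(missing) / len(cited)


# Made-up example data: four cited references, one of which has no source.
cited_refs = ["ref-a", "ref-b", "ref-c", "ref-d"]
verified_refs = {"ref-a", "ref-b", "ref-c"}
rate = hallucination_rate(cited_refs, verified_refs)
```

With these made-up inputs one of four references is unmatched, so `rate` is 0.25. A system with full traceability would score 0.0 on the same check.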

autonomous-research · verifiability · ai-agents · arxiv
#2 / TOP STORY
arXiv cs.AI · FREE

PolyFusionAgent: A Multimodal Foundation Model and Autonomous AI Assistant for Polymer Property Prediction and Inverse Design

Researchers introduced PolyFusionAgent, an autonomous AI assistant that combines a multimodal foundation model (PolyFusion) with a tool-augmented design agent (PolyAgent) for polymer discovery. PolyFusion aligns diverse polymer representations to predict thermophysical properties and generate novel structures. PolyAgent integrates literature retrieval to evaluate and contextualize designs. The stated aim is to overcome fragmented data and speed up the development of new materials for fields such as energy storage and biomedicine by providing actionable design decisions.

#3 / TOP STORY
arXiv cs.AI · FREE

Which Changes Matter? Towards Trustworthy Legal AI via Relevance-Sensitive Evaluation and Solver-Grounded Reasoning

Researchers introduced a new evaluation problem for legal AI, highlighting that current LLMs struggle to distinguish legally relevant changes from irrelevant ones. Their unified evaluation suite revealed existing models are systematically sensitive to legally immaterial variations. To address this, they developed LexGuard, an adversarial multi-agent framework grounded in formal reasoning. LexGuard formalizes statutes into executable constraints and uses SMT solvers to verify legal satisfaction, significantly improving legal reasoning reliability and reducing vulnerability to manipulative framing.
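
The idea of turning a statute into executable constraints, so that a verdict depends only on legally relevant facts, can be sketched in a few lines. This is a toy illustration, not LexGuard's implementation: the statute, the `Case` fields, and the helpers `violated` and `is_lawful` are all hypothetical, and plain Python predicates stand in for a real SMT solver.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Case:
    age: int
    has_license: bool
    blood_alcohol: float  # percent; a legally relevant fact in this toy rule
    narrative: str        # free-text framing; no constraint reads it


# Hypothetical statute expressed as named constraints over case facts.
CONSTRAINTS = {
    "minimum_age": lambda c: c.age >= 18,
    "valid_license": lambda c: c.has_license,
    "sober": lambda c: c.blood_alcohol < 0.05,
}


def violated(case):
    """Return the names of all constraints the case fails, in rule order."""
    return [name for name, check in CONSTRAINTS.items() if not check(case)]


def is_lawful(case):
    """A case is lawful when it violates no constraint."""
    return not violated(case)


base = Case(age=30, has_license=True, blood_alcohol=0.0,
            narrative="A calm driver heading home.")
# Legally immaterial change: only the framing differs.
reframed = replace(base, narrative="A shady-looking driver speeding off.")
# Legally relevant change: a fact that a constraint actually reads.
changed = replace(base, blood_alcohol=0.08)
```

Here `is_lawful(base)` and `is_lawful(reframed)` agree because no constraint inspects `narrative`, while `violated(changed)` names the `sober` rule. A solver-backed system would add proof search and richer logic, but the separation between facts and framing is the same.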

// Today · 38 stories

Enables autonomous AI agents to handle complex scientific data curation and analysis tasks.

agentic-ai · scientific-workflows · llm · rag
arXiv cs.AI · 39h ago · 1m · FREE

Developers relying on LLM introspection for debugging or alignment may need more robust methods.

llm · introspection · metacognition · arxiv
arXiv cs.AI · 39h ago · 1m · FREE
// Yesterday · 38 stories