arXiv cs.AI · Tuesday · June 2, 2026 · FREE

The Deterministic Horizon: When Extended Reasoning Fails and Tool Delegation Becomes Necessary

llm · reasoning · tools · arxiv

A paper on arXiv (2606.00376) establishes the Deterministic Horizon, a fundamental limit on extended reasoning in decoder-only attention models. The authors prove an Attention Bottleneck Theorem bounding state-tracking capacity as O(H·log(L/H)·√d_h), and show that beyond a deterministic horizon d* in [19, 31], tool delegation becomes necessary. Across 12 models and 8 task domains including SWE-Bench, WebArena, and SQL-Multi, tool-integrated reasoning consistently outperforms neural chain-of-thought, reaching 86-94% accuracy versus 24-42%. Fine-tuning on optimal-length traces yields less than 5% improvement, confirming an architectural ceiling. High cross-model correlation (r = 0.81-0.91) indicates these failures are architectural rather than training-specific. The paper introduces the State-Space Jaccard metric to distinguish capability from preference failures.
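To get a feel for how the stated bound grows, here is a minimal sketch that evaluates only the scaling term H·log(L/H)·√d_h. It rests on several assumptions: the summary does not define the symbols, so reading H as the number of attention heads, L as the sequence length, and d_h as the per-head dimension is a guess. The natural logarithm is also an arbitrary choice, and the constant hidden by the O(·) is ignored. The function name `bottleneck_scaling` is hypothetical and does not come from the paper.

```python
import math


def bottleneck_scaling(num_heads: int, seq_len: int, head_dim: int) -> float:
    """Evaluate H * log(L / H) * sqrt(d_h), without the hidden big-O constant.

    Assumed symbol meanings (not confirmed by the source):
      num_heads -> H, seq_len -> L, head_dim -> d_h.
    The result is only useful for comparing configurations, not as an
    absolute capacity in any unit.
    """
    if num_heads <= 0 or head_dim <= 0:
        raise ValueError("num_heads and head_dim must be positive")
    if seq_len < num_heads:
        raise ValueError("seq_len must be at least num_heads so log(L/H) >= 0")
    return num_heads * math.log(seq_len / num_heads) * math.sqrt(head_dim)


if __name__ == "__main__":
    # Illustrative configuration only; these numbers are not from the paper.
    short = bottleneck_scaling(num_heads=32, seq_len=4096, head_dim=128)
    long = bottleneck_scaling(num_heads=32, seq_len=8192, head_dim=128)
    print(f"scaling term at L=4096: {short:.1f}")
    print(f"scaling term at L=8192: {long:.1f}")
```

Under these assumptions, doubling the sequence length adds a fixed increment of H·ln(2)·√d_h instead of doubling the term. That logarithmic growth is consistent with the summary's claim that longer reasoning alone does not lift the ceiling.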

// why it matters

Developers must integrate external tools for tasks requiring deterministic tracking beyond ~30 steps.
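In practice this suggests a routing rule: estimate how many deterministic state updates a task needs, and hand it to a tool once that count passes the horizon. The sketch below is an illustration under stated assumptions, not the paper's method. The names `route`, `HORIZON_LOW`, and `HORIZON_HIGH` are hypothetical, the thresholds are simply the endpoints of the reported range d* in [19, 31], and how to estimate the step count is left to the caller.

```python
# Endpoints of the reported horizon range d* in [19, 31].
# Using them directly as routing thresholds is an assumption of this sketch.
HORIZON_LOW = 19
HORIZON_HIGH = 31


def route(estimated_steps: int, horizon: int = HORIZON_LOW) -> str:
    """Pick a strategy for a task needing `estimated_steps` state updates.

    Returns "tool" when the estimate exceeds the horizon, otherwise
    "chain_of_thought". The conservative default uses the low end of the
    range, so borderline tasks are delegated.
    """
    if estimated_steps < 0:
        raise ValueError("estimated_steps must be non-negative")
    return "tool" if estimated_steps > horizon else "chain_of_thought"


if __name__ == "__main__":
    for steps in (10, 25, 40):
        print(steps, route(steps), route(steps, horizon=HORIZON_HIGH))
```

Defaulting to the low end of the range trades some unnecessary tool calls for fewer silent tracking failures. Passing `horizon=HORIZON_HIGH` makes the opposite trade.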

Sources

Primary · arXiv cs.AI

▸ Read original at arxiv.org
