arXiv cs.AI · Wednesday · May 27, 2026

What Makes Chain-of-Thought Work at Probe Time? Local Co-occurrence Rather Than Global Derivation

chain-of-thought · prompting · llm · reasoning

The authors of an arXiv paper (2605.26795) investigated why chain-of-thought (CoT) prompting improves language model accuracy, focusing on probe-time effects rather than generation. They found that even a globally word-shuffled rationale substantially outperforms the no-rationale baseline, indicating a strong lexical activation effect. The additional gain from structured text arises less from sentence-level logical ordering than from short-range token adjacency: preserving contiguous windows of just 2-3 tokens recovers most of the remaining gain toward full CoT performance. Experiments ruled out copying of explicit answer declarations or answer values, as well as full grammatical realization, as primary drivers. The pattern remained stable across multiple model families, parameter scales, and datasets. These results suggest that CoT's probe-time benefits stem largely from local co-occurrence statistics rather than global derivation, which implies that simpler prompting strategies may be as effective as full CoT in many cases.
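The two perturbations described above (a global word shuffle, and a shuffle that keeps short contiguous windows intact) can be sketched roughly as follows. This is an illustration only, not the paper's procedure: whitespace tokenization, the function names `shuffle_preserving_windows` and `adjacent_pairs_kept`, and the sample rationale are all assumptions made here. With `window=1` the sketch reduces to a global word shuffle; with `window=2` or `window=3` it keeps runs of that many tokens adjacent while scrambling their order.

```python
import random


def shuffle_preserving_windows(text, window=1, seed=0):
    """Shuffle a rationale while keeping contiguous runs of `window` tokens intact.

    Assumption: tokens are whitespace-separated words; the paper may
    tokenize differently.
    """
    tokens = text.split()
    # Cut the token sequence into consecutive chunks of `window` tokens
    # (the final chunk may be shorter).
    chunks = [tokens[i:i + window] for i in range(0, len(tokens), window)]
    rng = random.Random(seed)  # local RNG so results are reproducible
    rng.shuffle(chunks)        # reorder chunks; order inside a chunk is kept
    return " ".join(tok for chunk in chunks for tok in chunk)


def adjacent_pairs_kept(original, perturbed):
    """Fraction of the original's adjacent token pairs still adjacent after perturbation."""
    orig = original.split()
    pert = perturbed.split()
    orig_pairs = list(zip(orig, orig[1:]))
    if not orig_pairs:
        return 0.0
    pert_pairs = set(zip(pert, pert[1:]))
    return sum(pair in pert_pairs for pair in orig_pairs) / len(orig_pairs)


if __name__ == "__main__":
    # Hypothetical rationale, used only to show the perturbations.
    rationale = "each box holds four pens so three boxes hold twelve pens in total"
    for w in (1, 2, 3):
        shuffled = shuffle_preserving_windows(rationale, window=w, seed=0)
        print(w, round(adjacent_pairs_kept(rationale, shuffled), 2), shuffled)
```

Comparing model accuracy across `window` values, against both the no-rationale baseline and the intact rationale, is one way to probe how much of the CoT gain depends on local adjacency rather than overall ordering.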

// why it matters

Developers may be able to simplify prompts by preserving local token patterns rather than spelling out full logical chains, though the finding concerns probe-time effects rather than generation.

Sources

Primary · arXiv cs.AI

▸ Read original at arxiv.org
