When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure
A study published on arXiv (2605.23932) introduces Med-Stress, a stress test framework that evaluates belief stability of LLMs in clinical dialogues. The framework applies escalating pressure to models that initially provide correct diagnoses. Testing nine frontier LLMs, the authors found a dissociation between medical knowledge and robustness: high initial accuracy does not guarantee stable beliefs, leading to large knowledge-robustness gaps. To address this, they propose two defenses: RBED (Role-Based Epistemic Defense), a lightweight inference-time method, and R-FT (Resilience-oriented Fine-Tuning), a training-time approach that internalizes evidence-based resistance. Experiments show R-FT nearly eliminates belief change and substantially improves robustness. The findings highlight a critical failure mode in LLMs under clinical pressure, with implications for deploying AI in high-stakes environments.
LLMs may abandon correct diagnoses under pressure, risking patient safety in clinical settings.