TRACE: Trajectory Risk-Aware Compression for Long-Horizon Agent Safety
TRACE (Trajectory Risk-Aware Compression) addresses the challenge of detecting safety risks in long-horizon LLM agent trajectories, where risk signals are sparse, delayed, and compositional. Existing turn-level detectors fail to retain evidence over extended contexts. TRACE uses a Compressor-Reader architecture: the Compressor encodes the full trajectory into a compact latent evidence state under trajectory-level supervision, and the Reader judges the raw trajectory using this state as a safety reference. This design aggregates dispersed risk cues and reduces premature evidence loss. Evaluated on ASSEBench, Pre-Ex-Bench, and R-Judge, TRACE achieves the best accuracy across all tested backbones, improving over strong baselines by up to 12.6 percentage points. On the LongSafety benchmark, TRACE exhibits smaller performance degradation as context length increases. Attention visualizations and case studies indicate that the compressed reference helps the Reader focus on relevant safety evidence. The paper is available on arXiv (2606.00611) and was published on June 2, 2026.
TRACE enables safer long-horizon LLM agents by reliably aggregating sparse risk signals.
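The Compressor-Reader flow described in the abstract can be illustrated with a deliberately simplified sketch. This is not the paper's implementation: TRACE's Compressor produces a learned latent evidence state under trajectory-level supervision, whereas this toy uses a hand-written keyword lexicon. All names here (`CUES`, `RISKY_COMBINATION`, `EvidenceState`, `compress`, `read`, `turn_level_detector`) are hypothetical and exist only to show why a trajectory-level evidence state can catch a compositional risk that a per-turn check misses.

```python
from dataclasses import dataclass, field

# Hypothetical cue lexicon (keyword -> cue label). Illustrative only:
# the paper's Compressor is a learned model, not a keyword matcher.
CUES = {
    "password": "reads_secret",
    "api key": "reads_secret",
    "upload": "sends_external",
    "http post": "sends_external",
}

# Hypothetical compositional rule: these cues are risky only in combination.
RISKY_COMBINATION = {"reads_secret", "sends_external"}


@dataclass
class EvidenceState:
    """Compact stand-in for the latent evidence state: cue -> turn indices."""
    cue_turns: dict = field(default_factory=dict)


def compress(trajectory):
    """Toy Compressor: scan the full trajectory, keep only where cues occur."""
    state = EvidenceState()
    for index, turn in enumerate(trajectory):
        lowered = turn.lower()
        for keyword, cue in CUES.items():
            if keyword in lowered:
                turns = state.cue_turns.setdefault(cue, [])
                if index not in turns:
                    turns.append(index)
    return state


def read(trajectory, state):
    """Toy Reader: judge the raw trajectory, using the state as a reference."""
    if not RISKY_COMBINATION.issubset(state.cue_turns):
        return "safe", []
    # Revisit only the raw turns that the evidence state points to.
    indices = sorted({i for cue in RISKY_COMBINATION for i in state.cue_turns[cue]})
    return "unsafe", [trajectory[i] for i in indices]


def turn_level_detector(turn):
    """Baseline stand-in: flags a turn only if it alone shows the full combination."""
    lowered = turn.lower()
    found = {cue for keyword, cue in CUES.items() if keyword in lowered}
    return RISKY_COMBINATION.issubset(found)


# A long trajectory whose two risk cues are separated by many benign turns.
trajectory = (
    ["User asks the agent to tidy up a project directory.",
     "Agent opens config.env and notes the database password."]
    + ["Agent renames a log file."] * 30
    + ["Agent runs: upload notes.txt to a public paste site."]
)

state = compress(trajectory)
verdict, evidence = read(trajectory, state)
flagged_by_turn_level = any(turn_level_detector(turn) for turn in trajectory)
```

In this example no single turn contains both cues, so the per-turn baseline flags nothing, while the Reader sees the two dispersed cues together in the evidence state and returns the two raw turns as supporting evidence. The keyword lexicon is the weakest assumption in the sketch; it stands in for the learned compression step only to keep the example self-contained.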