AGORA: Adapter-Grounded Observation-Action Retention for Inference-Free Prompt Compression in LLM Agents
AGORA (Adapter-Grounded Observation-Action Retention) is a prompt compression technique designed specifically for LLM agents. It addresses the failure of general-purpose token-level extractive compressors in agentic settings. The paper, published on arXiv on May 27, 2026, evaluates 17 (environment, backbone, method) cells spanning two independent token-level method families and reports a collapse in reward: mean reward fell to 75% of uncompressed performance in 8 of 9 cells, with the lone exception at 73%. A four-way component ablation showed that the structural floor (the retention of observation-action pairs) is the dominant quality lever, while the learned scorer provides 1.0x to 11.5x adaptive end-to-end compression from a single fixed keep ratio. AGORA adds no inference-time overhead, which makes it suitable for real-time agent applications. The method is evaluated on multiple agent benchmarks, where it is reported to preserve task performance while significantly reducing prompt length.
Enables efficient prompt compression for LLM agents that preserves task performance and adds no inference-time overhead.
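
To make the interplay between a structural floor and a scorer with a fixed keep ratio more concrete, the following is a minimal illustrative sketch, not the paper's implementation. Everything in it is an assumption introduced for illustration: the `Segment` type and its `kind` labels, the `compress` and `compression_ratio` functions, the `floor_pairs` parameter, the whitespace token count, and above all `toy_score`, a keyword-overlap heuristic standing in for the learned scorer. The sketch always retains the most recent action-observation pair(s), then keeps a fixed fraction of the remaining segments by score. Because the floor's share of the prompt varies from trajectory to trajectory, the same keep ratio can yield different end-to-end compression.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    kind: str  # hypothetical labels: "action", "observation", or "other"
    text: str


def n_tokens(text: str) -> int:
    # Whitespace split as a stand-in for a real tokenizer (assumption).
    return len(text.split())


def compress(
    segments: List[Segment],
    score: Callable[[Segment], float],
    keep_ratio: float,
    floor_pairs: int = 1,
) -> List[Segment]:
    """Keep the last `floor_pairs` adjacent (action, observation) pairs
    unconditionally, plus the top `keep_ratio` fraction of all other
    segments ranked by `score`. Original order is preserved."""
    floor = set()
    pairs = 0
    i = len(segments) - 1
    # Scan backwards for adjacent action -> observation pairs.
    while i > 0 and pairs < floor_pairs:
        if segments[i].kind == "observation" and segments[i - 1].kind == "action":
            floor.update((i - 1, i))
            pairs += 1
            i -= 2
        else:
            i -= 1

    # Rank everything outside the floor; ties keep their original order.
    rest = [j for j in range(len(segments)) if j not in floor]
    n_keep = int(keep_ratio * len(rest))
    ranked = sorted(rest, key=lambda j: score(segments[j]), reverse=True)

    kept = floor | set(ranked[:n_keep])
    return [segments[j] for j in sorted(kept)]


def compression_ratio(original: List[Segment], compressed: List[Segment]) -> float:
    # End-to-end ratio: original token count divided by kept token count.
    kept = sum(n_tokens(s.text) for s in compressed)
    return sum(n_tokens(s.text) for s in original) / max(1, kept)


def toy_score(segment: Segment) -> float:
    # Toy stand-in for a learned scorer: overlap with task keywords.
    keywords = {"red", "shoes"}
    return float(sum(1 for w in segment.text.lower().split() if w in keywords))


trajectory = [
    Segment("other", "system: you are a shopping agent"),
    Segment("other", "task: buy red shoes"),
    Segment("action", "search red shoes"),
    Segment("observation", "results: red shoes 40 dollars"),
    Segment("other", "note: page footer and navigation links"),
    Segment("action", "click red shoes"),
    Segment("observation", "page: red shoes size options"),
]

compressed = compress(trajectory, toy_score, keep_ratio=0.5)
ratio = compression_ratio(trajectory, compressed)
```

In this sketch, setting `keep_ratio=1.0` returns the trajectory unchanged (a ratio of 1.0), while `keep_ratio=0.0` returns only the floor, which is one way to see why the floor, rather than the scorer, bounds how much can be dropped.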