arXiv cs.AI · Saturday · May 23, 2026

Evaluating Temporal Semantic Caching and Workflow Optimization in Agentic Plan-Execute Pipelines

agents · caching · latency · workflow-optimization

A new arXiv paper (2605.20630) evaluates caching and workflow optimization in agentic plan-execute pipelines for industrial asset operations. The authors introduce AssetOpsBench (AOB), a benchmark whose queries require coordination across sensor data, work orders, failure modes, and forecasting tools. They find that existing LLM caching techniques, such as KV-cache reuse and embedding-based semantic caching, fail when the validity of an output depends on time, asset, or sensor parameters. To address this, they propose two complementary optimizations: a temporal semantic cache that accounts for time-dependent validity, and MCP workflow optimizations that include disk-backed tool-discovery caching and dependency-aware parallel step execution. On AOB, the MCP workflow optimizations yielded a 1.67x speedup, reducing median end-to-end latency by about 40.0%. The temporal cache achieved a median 30.6x speedup on cache hits. The study also documents a concrete failure mode of pure semantic caching on time-sensitive queries. The paper is available on arXiv as of May 22, 2026.
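To make the temporal-cache idea concrete, here is a minimal sketch, not the paper's implementation. It assumes a cached answer is reusable only when the query is semantically similar, the structured scope (asset, sensor, time window) matches exactly, and the entry has not expired. The class name `TemporalSemanticCache`, the `scope` tuple, the similarity threshold, the TTL, and the bag-of-words `embed` function (standing in for a real embedding model) are all illustrative assumptions.

```python
import math
import time
from collections import Counter
from dataclasses import dataclass


def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words token counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Entry:
    vector: Counter
    scope: tuple       # hypothetical: (asset_id, sensor, window_start, window_end)
    answer: str
    expires_at: float  # clock value after which the entry is no longer valid


class TemporalSemanticCache:
    """Semantic cache whose hits also require a matching scope and a live TTL."""

    def __init__(self, threshold=0.8, ttl_seconds=300.0, clock=time.monotonic):
        self.threshold = threshold
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested deterministically
        self.entries = []

    def put(self, query: str, scope: tuple, answer: str) -> None:
        expires_at = self.clock() + self.ttl_seconds
        self.entries.append(Entry(embed(query), scope, answer, expires_at))

    def get(self, query: str, scope: tuple):
        now = self.clock()
        # Evict expired entries before looking anything up.
        self.entries = [e for e in self.entries if e.expires_at > now]
        vector = embed(query)
        best, best_score = None, self.threshold
        for entry in self.entries:
            if entry.scope != scope:
                continue  # similar wording, but a different asset/sensor/window
            score = cosine(vector, entry.vector)
            if score >= best_score:
                best, best_score = entry, score
        return best.answer if best else None
```

In this sketch, a query such as "average temperature of chiller 6 last week" asked in two different weeks has identical wording, so a purely embedding-based cache would return the first week's answer for the second. Keying on the resolved time window as well turns that lookup into a miss. The asset name and query text here are made up for illustration.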

// why it matters

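Developers building agentic workflows over time-sensitive data can cut latency substantially: the paper reports a median 30.6x speedup on temporal-cache hits and a 1.67x speedup from MCP workflow optimizations, while flagging the failure mode of pure semantic caching on time-sensitive queries.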

Sources

Primary · arXiv cs.AI
▸ Read original at arxiv.org
