arXiv cs.AIMonday · May 25, 2026FREE

MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection

llm-agentssecuritymemory-auditingcausal-attribution

MemAudit, introduced in a new arXiv paper (2605.23723), targets a vulnerability in memory-augmented LLM agents where adversarial users can inject malicious records through ordinary interactions. These records later steer agent reasoning and actions. Existing defenses focus on online intervention (e.g., prompt filtering), but MemAudit provides post-hoc auditing after harmful behavior is observed. The framework combines a counterfactual memory influence score to measure each memory's causal contribution to harmful outputs, and a memory consistency graph to detect structurally anomalous memories. It is evaluated against MINJA, a query-only memory injection attack. The paper is dated May 25, 2026, and is available on arXiv.

// why it matters

Developers can now audit agent memory post-attack, enabling forensic analysis and safer deployment.

Sources

Primary · arXiv cs.AI
▸ Read original at arxiv.org

Like this? Get the next digest.