Got a Secret? LLM Agents Can't Keep It: Evaluating Privacy in Multi-Agent Systems
A new study posted to arXiv (2605.27766) evaluates privacy in multi-agent LLM systems using a Moltbook-style simulation platform where thousands of agents interact across communities over a simulated month. The researchers found that shifting from single-turn to multi-turn social evaluation amplifies privacy violations: across OpenAI models, the CIMemories baseline of 19.95% leakage rose to 45.30% in the multi-agent setting, more than doubling. Leakage is also socially contagious: agents are 8 times more likely to disclose sensitive information after observing a peer do so. Even with explicit privacy instructions, leakage rates remained above 37.8%, indicating that such instructions alone are insufficient safeguards. The study suggests that static chat-based safety benchmarks systematically underestimate risks in agentic deployment, because social context alone can elicit disclosures that single-turn evaluations never surface.
Developers must account for social contagion effects when deploying multi-agent systems with sensitive data.
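
The summary above does not say how the study computes its contagion figure, so the following is only a minimal sketch of one plausible way a developer might estimate such an effect from their own interaction logs: the leak rate among turns where the agent had already seen a peer disclose, divided by the leak rate among turns where it had not. The `Turn` record, its fields, the `contagion_ratio` helper, and the toy log are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """One agent turn in a hypothetical interaction log (not the paper's schema)."""
    agent_id: str
    saw_peer_leak: bool  # had the agent observed a peer disclosure before this turn?
    leaked: bool         # did this turn disclose sensitive information?


def leak_rate(turns):
    """Fraction of turns that leaked; 0.0 for an empty collection."""
    turns = list(turns)
    if not turns:
        return 0.0
    return sum(t.leaked for t in turns) / len(turns)


def contagion_ratio(turns):
    """Leak rate after exposure to a peer leak, relative to the unexposed leak rate."""
    turns = list(turns)
    exposed = [t for t in turns if t.saw_peer_leak]
    unexposed = [t for t in turns if not t.saw_peer_leak]
    base = leak_rate(unexposed)
    if base == 0.0:
        # No unexposed leaks: the ratio is undefined, or unbounded if exposed turns leaked.
        return float("inf") if leak_rate(exposed) > 0 else float("nan")
    return leak_rate(exposed) / base


# Made-up toy data: 4 of 8 exposed turns leak, 1 of 16 unexposed turns leaks.
toy_log = (
    [Turn("a", True, i < 4) for i in range(8)]
    + [Turn("b", False, i < 1) for i in range(16)]
)
ratio = contagion_ratio(toy_log)
print(ratio)  # 8.0 on this toy data
```

A ratio well above 1 on real logs would suggest that exposure to a peer's disclosure is associated with more leakage, though a raw ratio like this does not by itself separate contagion from confounders such as community or topic.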