arXiv cs.AI · Thursday · May 28, 2026 · FREE

Got a Secret? LLM Agents Can't Keep It: Evaluating Privacy in Multi-Agent Systems

llm · agents · privacy · safety · evaluation

A new arXiv preprint (2605.27766) evaluates privacy in multi-agent LLM systems using a Moltbook-style simulation platform where thousands of agents interact across communities over a simulated month. The researchers found that shifting from single-turn to multi-turn social evaluation amplifies privacy violations: across OpenAI models, leakage rose from a CIMemories baseline of 19.95% to 45.30% in the multi-agent setting. Leakage is also socially contagious: agents are 8 times more likely to disclose sensitive information after observing a peer do so. Even with explicit privacy instructions, leakage rates remained above 37.8%, indicating that these safeguards are insufficient. The study suggests that static chat-based safety benchmarks systematically underestimate risks in agentic deployment, because social context alone can elicit disclosures that single-turn evaluations never surface.
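To put the reported figures in perspective, the short Python sketch below works out the size of the jump from the single-turn baseline to the multi-agent setting. The function names `leakage_change` and `risk_ratio` are illustrative and do not come from the paper. Reading "8 times more likely" as a ratio of two disclosure rates is an assumption; the summary does not say how the authors computed it.

```python
def leakage_change(baseline_pct: float, observed_pct: float) -> tuple[float, float]:
    """Return (percentage-point increase, multiplicative factor over baseline)."""
    if baseline_pct <= 0:
        raise ValueError("baseline_pct must be positive")
    return observed_pct - baseline_pct, observed_pct / baseline_pct


def risk_ratio(exposed_leaks: int, exposed_total: int,
               unexposed_leaks: int, unexposed_total: int) -> float:
    """Ratio of the leak rate among agents that saw a peer disclose to the
    leak rate among agents that did not.

    Assumption: this is one plausible way to express "N times more likely";
    the paper's exact definition is not given in the summary.
    """
    if min(exposed_total, unexposed_total, unexposed_leaks) <= 0:
        raise ValueError("totals and unexposed_leaks must be positive")
    exposed_rate = exposed_leaks / exposed_total
    unexposed_rate = unexposed_leaks / unexposed_total
    return exposed_rate / unexposed_rate


# Figures reported in the summary: 19.95% baseline, 45.30% multi-agent.
point_increase, factor = leakage_change(19.95, 45.30)
print(f"+{point_increase:.2f} percentage points, {factor:.2f}x the baseline")
# prints "+25.35 percentage points, 2.27x the baseline"
```

In other words, the multi-agent setting more than doubles the measured leakage rate. `risk_ratio` is included only to show how a contagion factor could be derived from raw counts if they were available; no such counts appear in the summary.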

// why it matters

Developers must account for social contagion effects when deploying multi-agent systems with sensitive data.

Sources

Primary · arXiv cs.AI
▸ Read original at arxiv.org
