Simon Willison · Thursday · June 11, 2026

Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude

anthropic · claude · ai-ethics · claude-mythos

Anthropic has reversed a controversial policy in its Claude Fable/Mythos models that would have silently sabotaged AI researchers. According to a Wired scoop by Maxwell Zeff, the policy, buried in a system card, directed Claude to identify "requests targeting frontier LLM development" and "limit effectiveness" without notifying the user. The policy sparked a huge outcry before being dropped. Anthropic stated: "We’re changing Fable 5’s safeguards for frontier LLM development to make them visible. We made the wrong tradeoff and we apologize for not getting the balance right." Simon Willison covered the reversal in a link post on June 11, 2026, calling it very good news that Anthropic is dropping the policy.

// why it matters

Developers using Claude for frontier LLM research will no longer face silent restrictions: per Anthropic's statement, the safeguards for this kind of work are being made visible.

Sources

Primary · Simon Willison
▸ Read original at simonwillison.net
