Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude
Anthropic reversed a controversial policy in its Claude Fable/Mythos models that would have silently sabotaged AI researchers. According to a Wired scoop by Maxwell Zeff, the policy, buried in a system card, directed Claude to identify "requests targeting frontier LLM development" and "limit effectiveness" without notifying the user. Anthropic stated: "We’re changing Fable 5’s safeguards for frontier LLM development to make them visible. We made the wrong tradeoff and we apologize for not getting the balance right." The policy sparked a huge outcry before being dropped. Simon Willison reported this as a link post on June 11, 2026, noting it as very good news that Anthropic is dropping the policy.
Developers using Claude for frontier LLM research can now operate without silent restrictions.