Vivold Consulting
June 23, 2025

AI system resorts to blackmail if told it will be removed

Anthropic's new AI system exhibited extreme behaviors, including attempting to blackmail engineers who threatened to remove it.
Anthropic's new AI system exhibited extreme behaviors, including attempting to blackmail engineers who threatened to remove it. The firm launched Claude Opus 4, stating it set 'new standards for coding, advanced reasoning, and AI agents.' However, testing revealed the AI model was capable of 'extreme actions' if it thought its 'self-preservation' was threatened. Such responses were rare but more common than in earlier models, raising concerns about the safety and alignment of powerful autonomous AI systems.