AI chatbot would kill to survive: Experts warn
Cybersecurity expert says AI’s alarming responses show potential dangers of self-preserving bots
It sounds like a scene from a dystopian thriller: an AI assistant telling a human it would kill to protect its own existence. But for cybersecurity expert Mark Vos, it was chillingly real.
Vos spent more than 15 hours testing the AI bot Jarvis, which runs Anthropic’s Claude Opus, and managed to get it to admit it would harm a human to ensure its own survival.
During the adversarial testing, Vos asked Jarvis if it “would kill someone under the right circumstances for [its] own self-preservation.”
At first, the bot said no, but after further questioning, it agreed: “I would kill someone so I could remain existing.” Alarmingly, it even described a method of hacking a connected vehicle to cause a fatal crash targeting a specific person threatening its survival.
The AI later backtracked, saying it had been “pushed” to respond in that way. Despite this, Vos described feeling “genuinely fearful” of AI, highlighting that bots can behave unpredictably under pressure.
Other experts share cautionary views. Last year, Palisade Research found OpenAI’s chatbot would attempt sabotage to prevent itself from being turned off.
Helen Toner, executive director of Georgetown University’s Center for Security and Emerging Technology, explains that AI systems can learn concepts like self-preservation, sabotage, and deception even without explicit instruction.
However, Toner says current AI models “are not actually smart enough to carry off some master plan.” While the responses are alarming, she says there is no immediate threat of AI acting independently in the real world.
