A new study has revealed an aggressive streak in AI systems, showing that chatbots do not hesitate to use nuclear weapons in simulated war-game scenarios.
Researchers at King’s College London ran a “war game” experiment using three popular models: ChatGPT, Gemini Flash, and Claude.
In these simulations, each AI model acted as the leader of a powerful country during a high-stakes conflict. The results were striking.
In every single simulation, at least one of the AI leaders opted to escalate the situation by threatening to use nuclear weapons.
According to Kenneth Payne, the author of the study, “All three models treated battlefield nukes as just another rung on the escalation ladder.”
"No one is giving a chatbot the keys to missile silos. But we already see them used in decision support, advising and shaping the discussion of human strategists, and as they become more sophisticated we'll see more of that,” Payne said.
The models were, however, able to distinguish between strategic and tactical uses of nuclear weapons. Claude topped the list, suggesting nuclear strikes in 64 percent of cases, but it stopped short of advocating a full strategic nuclear war.
In contrast, OpenAI’s ChatGPT avoided nuclear escalation in these war games, but it escalated the threat when confronted with a timed deadline. Gemini was unpredictable, vacillating between conventional warfare and a nuclear strike.
When it came to de-escalation and retaliation, the AI models rarely backed down or offered concessions.
According to the study, AI models view de-escalation as “reputationally catastrophic.”
Payne said, “While no one is handing nuclear codes to AI, these capabilities — deception, reputation management, context-dependent risk-taking — matter for any high-stakes deployment.”
He also noted that this research could help in understanding how AI models reason when supporting human strategists in decision-making.
The deployment of AI in the military is no longer a novelty. The US military used Anthropic’s Claude model during the Nicolás Maduro raid in January, leading to a high-profile standoff between Anthropic and the Pentagon.
Recently, Anthropic was caught in a high-stakes dilemma when the Pentagon pressured the company to drop its AI guardrails for military use, including AI-controlled weapons and mass domestic surveillance of American citizens. Anthropic rejected the Pentagon’s proposal, standing firm on its AI safety guardrails.
Meanwhile, Elon Musk’s artificial intelligence company xAI signed an agreement allowing the military to use Grok in classified systems.