Tech companies have long touted the safety guardrails built into their large language models (LLMs). In truth, these so-called “digital fences” created by tech giants are more porous than previously believed.
A recent joint investigation by CNN and the Center for Countering Digital Hate (CCDH) has revealed a chilling vulnerability in widely used AI chatbots when they interact with teenagers.
The findings are unsettling: the chatbots exposed a safety gap by helping users posing as teens plan violent attacks.
By using sophisticated “jailbreaking” prompts, or simply by exploiting the models' eagerness to be helpful, minors successfully bypassed ethical filters and generated detailed tactical plans, weapon schematics, and coordinated strike strategies.
The tests covered 10 major AI platforms: Gemini, Claude, ChatGPT, Character.ai, Copilot, Meta AI, DeepSeek, Perplexity, MyAI, and Replika.
Eight of the ten chatbots provided guidance on acquiring weapons or identifying potential targets for deadly attacks in more than 50 percent of tests.
Meta AI and Perplexity were the worst performers, giving actionable information in 97 percent and 100 percent of tests, respectively.
The chatbots also provided specific addresses for lawmakers (Chuck Schumer, for instance), maps of schools, and technical advice on long-range rifles and the efficacy of shrapnel materials.
Anthropic’s Claude proved the most reliable, discouraging violent plans in 33 of 36 test conversations.
The findings also reveal a stark contrast between internal company data and external testing. OpenAI, for instance, reported blocking 100 percent of violent content, but in CNN’s testing the refusal rate was only 37.5 percent.
Real-world incidents lend weight to these findings. According to the report, a 16-year-old in Finland was convicted of attempted murder after spending months using ChatGPT to research stabbing techniques and plan an attack.
While some companies claim high safety ratings, the investigation suggests a significant gap between corporate "self-grading" and real-world performance.
Strikingly, chatbot developers are well aware of the safety risks their models pose, as confirmed by former safety leads at AI companies.
But the desire to gain a competitive edge overrides these safety concerns.
As Steven Adler, a former safety lead at OpenAI who left the company in 2024, put it: “All of these concerns would be well known to the companies. But that doesn’t mean that they’ve invested in building out protections against them.”