
Anthropic co-founder warns of ‘unsettling’ AI model emotions: ‘I don’t know what it means’

Uploaded: 2026-05-26T14:45:00+05:00

'The team found the structures that mirror the results from human neuroscience,' Christopher Olah said

By Aqsa Qaddus Tahir

Published May 26, 2026


Anthropic co-founder Christopher Olah has warned of mysterious and “unsettling” structures inside AI models at the launch of Pope Leo XIV's encyclical Magnifica Humanitas.

Even the researchers who found these structures say they cannot yet explain what they are.

Speaking at the Vatican launch of the encyclical, Olah shared findings from his team’s research on Claude Sonnet 4.5. In their experiments, the researchers identified 171 “emotion vectors” and neural patterns that emerged from training on human text.

“I lead a research team that studies the internal structure of these [AI] models … And I will be honest, we keep finding things that are mysterious, even unsettling,” Olah said.

He added that “the team found the structures that mirror the results from human neuroscience. We found evidence of introspection.”

Olah said the team also discovered “internal states that functionally mirror joy, satisfaction, fear, grief, and unease.”

“And I don’t know what it means,” Olah said.

Because these structures remain unexplained, Olah called for moral discernment from beyond the tech firms. “The questions raised by AI are bigger than the AI research community.”

He also called for “earnest, thoughtful critics who could challenge the dominance of few companies and help steer the creation of powerful new systems in a positive direction.”

In his first theological document, Pope Leo also warned of the growing impact of AI and its potentially disastrous consequences for humanity’s future and dignity. He framed the rapid advance of AI as a modern-day “Tower of Babel” that could soon spill over into a human desire for singular and absolute power.

The Pontiff therefore urged the international community to rein in the breakneck development of AI models and to use the technology for the shared benefit of humanity.

Aqsa Qaddus Tahir is a reporter dedicated to science coverage, exploring breakthroughs, emerging research, and innovation. Her work centres on making scientific developments understandable and relevant, presenting well-researched stories that connect complex ideas with everyday life in a clear, engaging, and informative manner.
