OpenAI's strangest bug: GPT models wouldn't stop mentioning goblins
OpenAI traced its models' goblin metaphor habit to reinforcement learning rewards tied to GPT-5.1's Nerdy personality
Buried inside the system instructions for OpenAI's Codex coding tool was a directive that reads like something from a fantasy moderation policy: never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures.
After Wired reported the unusual language, OpenAI published a blog post this week explaining how a quirk in its reinforcement training turned mythological creature metaphors into a persistent habit across multiple model generations.
The problem started with GPT-5.1 and a since-discontinued feature called the "Nerdy" personality, one of several optional interaction styles users could apply to the model.
OpenAI noticed that when running in Nerdy mode, the model began reaching for goblin and gremlin metaphors with unusual frequency.
The cause was that reinforcement learning rewarded the quirky references when they appeared in the Nerdy personality's output. The neural network learned that creature metaphors were a good stylistic choice, at least in that context.
A behavior that is rewarded in one situation can spread to others, especially when outputs produced under the original condition are used to train a new system.
That is exactly what happened. References to goblins started appearing in other modes, and in later versions of the model that had been trained on data carrying the quirk. In March, OpenAI discontinued the Nerdy personality, which reduced the creature references but did not eliminate them.
