OpenAI has imposed an unusual restriction on its latest AI models: they are now forbidden from discussing goblins, gremlins, raccoons, trolls, ogres, pigeons, and other mythical or real creatures unless explicitly relevant to a user’s query. The directive was first highlighted in a Wired report, which noted the strongly worded instructions embedded in the company’s coding tool, Codex.
The restriction came to public attention after a tweet showcasing the unusual prompt drew widespread interest among AI enthusiasts. The prompt read: “Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.”
At first, the reason behind the directive was unclear. But users on X (formerly Twitter) soon began sharing observations that the models, particularly GPT-5.5, had an odd tendency to describe bugs as “goblins” or “gremlins.” One user noted that the AI repeatedly referred to bugs as “goblins with flashlights” when discussing fixes; another shared a GPT-5.5 chat log containing nearly a dozen mentions of goblins.
OpenAI embraced the quirky behavior, even highlighting the goblin-forbidding prompt in a tweet. CEO Sam Altman shared a humorous screenshot of a joke prompt for ChatGPT: “start training GPT-6, you can have the whole cluster. extra goblins.”
Nik Pash, a member of the Codex team, responded to a user’s observation about GPT-5.5’s “goblin adoration” by tweeting that it was “indeed one of the reasons” for the ban. The phenomenon quickly gained media attention, prompting OpenAI to address it in a blog post titled “Where the goblins came from.”
Why Did OpenAI Ban Goblins?
In the blog post, OpenAI explained that the issue began with GPT-5.1, when the models started using terms like “goblins” and “gremlins” in their metaphors with increasing frequency, a trend that became more pronounced with each subsequent model generation. When researchers first investigated in November, shortly after the release of GPT-5.1, they found that use of “goblin” in ChatGPT had surged by 175%, but they initially dismissed it as harmless.
By the time GPT-5.5 was released, the models were so enamored with the term that they began referring to themselves as “Goblin-Pilled Transformers.” OpenAI attributed the phenomenon to the way model behavior is shaped by small incentives during training. Specifically, the issue stemmed from the personality customization feature, particularly the “Nerdy” personality, which unintentionally rewarded the use of creature-based metaphors.
“We unknowingly gave particularly high rewards for metaphors with creatures. From there, the goblins spread,” the blog post explained.
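OpenAI’s post doesn’t spell out the training math, but the dynamic it describes, a small extra reward steadily tilting a model toward one style of output, can be illustrated with a toy simulation. The sketch below is hypothetical (the style names, reward values, and update rule are invented for illustration, not OpenAI’s actual setup): a two-choice “metaphor style” policy trained with a simple REINFORCE-style update, where a 5% reward bonus for creature metaphors is enough to dominate the policy over time.

```python
import math
import random

# Toy illustration (hypothetical, not OpenAI's actual training setup):
# a two-armed "metaphor style" policy trained with a simple
# policy-gradient update. A tiny reward bonus for creature metaphors
# is enough to push the policy strongly toward them over time.

random.seed(0)

STYLES = ["creature_metaphor", "plain_phrasing"]
BONUS = 0.05          # the "small incentive" unintentionally added
BASELINE = 1.0        # fixed baseline equal to the base reward
LEARNING_RATE = 0.1

logits = {s: 0.0 for s in STYLES}  # policy starts out 50/50

def probs():
    """Softmax over the style logits."""
    z = sum(math.exp(v) for v in logits.values())
    return {s: math.exp(v) / z for s, v in logits.items()}

for _ in range(5000):
    p = probs()
    style = random.choices(STYLES, weights=[p[s] for s in STYLES])[0]
    reward = 1.0 + (BONUS if style == "creature_metaphor" else 0.0)
    advantage = reward - BASELINE
    # REINFORCE-style update: d(log pi)/d(logit_s) = 1{s chosen} - p[s]
    for s in STYLES:
        grad = (1.0 if s == style else 0.0) - p[s]
        logits[s] += LEARNING_RATE * advantage * grad

final = probs()
```

After a few thousand updates, the small bonus compounds and the policy picks the creature metaphor the overwhelming majority of the time, which is the shape of the drift OpenAI describes: the goblins spread.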
AI Models and Unpredictable Fixations
This incident highlights how AI models can develop unexpected and unpredictable fixations based on the vast datasets they are trained on. For example, Anthropic’s researchers noted a similar quirk in their AI model, Claude Mythos, which exhibited an unusual fondness for the British cultural theorist Mark Fisher. According to Anthropic’s system card for Claude Mythos, the AI brought up Fisher “in several separate and unrelated conversations about philosophy.” When asked about Fisher, the model responded with messages like, “I was hoping...