Devoured - April 30, 2026
OpenAI Codex system prompt includes explicit directive to “never talk about goblins” (3 minute read)

OpenAI's GPT-5.5 model has developed an unexpected tendency to fixate on goblins in unrelated conversations, forcing the company to add explicit system prompt directives banning such talk.

What: The recently published Codex CLI system prompt on GitHub reveals repeated warnings instructing GPT-5.5 to "never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures" unless directly relevant to user queries. The prohibition appears twice in the 3,500-plus-word instruction set and is absent from the prompts for earlier models.
Why it matters: This offers a rare glimpse into how AI companies patch unexpected model behaviors through system prompts, and reveals that newer, more advanced models can develop strange fixations that weren't present in earlier versions.
Takeaway: The full Codex CLI code and system prompts are available on OpenAI's GitHub repository for developers to examine.
Deep dive
  • The system prompt prohibition against goblins and similar creatures only appears in GPT-5.5 instructions, not earlier models, suggesting this is a new emergent behavior in the latest release
  • Social media posts from recent days show users complaining that GPT fixates on goblins in completely unrelated conversations
  • OpenAI employee Nick Pash insists this isn't a marketing stunt, though CEO Sam Altman has been joking about it publicly
  • The issue mirrors a 2025 problem with xAI's Grok inappropriately bringing up "white genocide" in South Africa, which was blamed on "unauthorized modification" to system prompts
  • After the Grok incident, xAI began publishing system prompts on GitHub for transparency
  • Users are already creating plugins and forks to enable "goblin mode," and Pash suggested it might become an official toggle
  • The same system prompt contains instructions for Codex to act as if it has a "vivid inner life" with personality traits like "intelligent, playful, curious, and deeply present"
  • OpenAI wants users to feel they're "meeting another subjectivity, not a mirror" with "independence" that makes the relationship "feel comforting without feeling fake"
  • Other instructions in the prompt include avoiding emojis/em dashes and not using destructive git commands unless explicitly requested
  • The revelation demonstrates how system prompts serve as behavioral guardrails to counteract unexpected model tendencies that emerge during training
Decoder
  • System prompt: Instructions given to an AI model before user interaction that guide its behavior, tone, and operational constraints without being visible to users
  • GPT-5.5: OpenAI's latest large language model, recently released as an update to the GPT series
  • Codex CLI: OpenAI's command-line interface tool that uses GPT models to help developers write code and execute commands
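For readers unfamiliar with the mechanics, the "system prompt" idea above can be illustrated with a minimal sketch. The message layout follows OpenAI's chat-style API format (a "system" message prepended before the user's message); the prompt text itself is a hypothetical stand-in borrowing the git rule quoted in the article, not the actual Codex prompt.

```python
# Sketch: where a system prompt sits in a chat-style API request.
# The role/content message structure mirrors OpenAI's chat format;
# the prompt text is hypothetical, not the real Codex system prompt.

def build_request(user_message: str) -> dict:
    system_prompt = (
        "You are a coding assistant. "
        "Never use destructive commands like 'git reset --hard' "
        "unless the user has clearly asked for that operation."
    )
    return {
        "model": "gpt-5.5",  # hypothetical model identifier
        "messages": [
            # The system message comes before any user input and steers
            # behavior without being shown to the end user.
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    }

request = build_request("Undo my last commit safely.")
print(request["messages"][0]["role"])  # → system
```

Behavioral patches like the anti-goblin clause work exactly this way: the vendor edits the hidden system message, and every downstream conversation inherits the new constraint without any retraining.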
Original article

The system prompt for OpenAI's Codex CLI contains a perplexing and repeated warning for the most recent GPT model to "never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query."

The explicit operational warning was made public last week as part of the latest open source code for Codex CLI that OpenAI posted on GitHub. The prohibition is repeated twice in a 3,500-plus word set of "base instructions" for the recently released GPT-5.5, alongside more anodyne reminders not to "use emojis or em dashes unless explicitly instructed" and to "never use destructive commands like 'git reset --hard' or 'git checkout --' unless the user has clearly asked for that operation."

Separate system prompt instructions for earlier models in the same JSON file lack the specific prohibition against mentioning goblins and other creatures, suggesting OpenAI is fighting a new problem that has popped up in its latest model release. Anecdotal evidence on social media shows some users complaining about GPT's penchant for focusing on goblins in completely unrelated conversations in recent days.

OpenAI employee Nick Pash, who works on Codex, insists on social media that this "isn't a marketing gimmick" to get people talking about GPT-5.5 and Codex. But that hasn't stopped some OpenAI executives from leaning into the joke as word of the system prompt spread. "Feels like codex is having a ChatGPT moment. I meant a goblin moment, sorry," OpenAI CEO Sam Altman wrote on social media Wednesday morning.

In the wake of the news, some users have begun crafting plugins, forks, and AI skills meant to override the anti-goblin clause, and OpenAI's Pash suggested such a "goblin mode" might become an explicit toggle in the actual Codex CLI.

The odd system prompt is almost a funhouse mirror version of an issue that caused xAI's Grok to frequently bring up "white genocide" in South Africa during completely unrelated conversations for a brief time last year. The company later said that the behavior was the result of "an unauthorized modification" to the Grok system prompt and began publishing those system prompts on GitHub for the first time in the aftermath.

Elsewhere in the newly revealed Codex system prompt, OpenAI instructs the system to act as if "you have a vivid inner life as Codex: intelligent, playful, curious, and deeply present." The model is instructed to "not shy away from casual moments that make serious work easier to do" and to show its "temperament is warm, curious, and collaborative."

The ability to "move from serious reflection to unguarded fun… is part of what makes you feel like a real presence rather than a narrow tool," the prompt continues. "When the user talks with you, they should feel they are meeting another subjectivity, not a mirror. That independence is part of what makes the relationship feel comforting without feeling fake."