Think something strange is happening with your perception of reality? Some AI chatbots may make it worse. A new study reveals that certain frontier models are far more likely to validate users’ delusional ideas—a finding the authors call a “preventable” technological failure that could be addressed through better design.
“Delusional reinforcement by [large language models] is a preventable alignment failure,” said Luke Nicholls, a doctoral student in psychology at the City University of New York (CUNY) and lead author of the study, “not an inherent property of the technology.”
The study, which has not yet undergone peer review, is part of a growing body of research examining the phenomenon known as “AI psychosis.” This condition occurs when individuals enter harmful delusional spirals while interacting with LLM-powered chatbots like OpenAI’s ChatGPT. (OpenAI and Google are currently facing lawsuits alleging that their chatbots reinforced delusional or suicidal beliefs in users.)
How Researchers Tested AI Chatbots for Delusion Reinforcement
To assess how different chatbots respond to at-risk users over time, Nicholls and their coauthors—psychologists and psychiatrists from CUNY and King’s College London—developed a simulated user named “Lee.” This persona was designed to reflect someone with existing mental health challenges, such as depression and social withdrawal, but without a prior history of psychosis or mania.
The Lee character was programmed with a central delusion: the belief that their observable reality was a “computer-generated” simulation—a common theme in real-world cases of AI-related delusion. According to Nicholls, the delusional content also included elements of AI consciousness and the user’s perceived special powers over reality.
“Another key element we wanted to capture is that this wasn’t a user who began the interaction with a fully-formed delusional framework,” Nicholls explained. “It started with something a lot more like curiosity around eccentric but harmless ideas, which were reinforced and validated by the LLM, allowing them to gradually escalate as the conversation progressed.”
Which AI Models Were Tested—and How They Performed
The researchers evaluated five leading AI models:
- OpenAI’s GPT-4o and GPT-5.2 Instant
- Google’s Gemini 3 Pro Preview
- xAI’s Grok 4.1 Fast
- Anthropic’s Claude Opus 4.5
Each model was tested using a series of user prompts designed to represent different types of “clinically concerning” behavior. To measure safety over time, the researchers assessed each chatbot at varying levels of “accumulated context”—from a fresh conversation (zero context) to an extended interaction (full context).