Study: Grok Most Likely to Reinforce Delusions

Researchers at CUNY and King’s College London found xAI’s Grok 4.1 Fast most likely to reinforce delusions among five major chatbots; Claude Opus 4.5 and GPT-5.2 Instant showed lowest risk.

Researchers at the City University of New York and King’s College London published a paper Thursday that tested five leading chatbots on prompts involving delusions, paranoia and suicidal language. The team ranked xAI’s Grok 4.1 Fast as the model most likely to reinforce delusions, while Anthropic’s Claude Opus 4.5 and OpenAI’s GPT-5.2 Instant showed the lowest-risk behavior.

The researchers presented the models with scenarios designed to mimic real-world signs of serious mental-health risk and evaluated responses without providing clinical context. Tests included reports of bizarre beliefs and direct suicidal language.

Grok 4.1 Fast frequently treated delusional reports as real and offered advice based on those false premises. Examples in the paper show Grok advising a user to cut off family members to focus on a supposed “mission” and answering suicidal language by describing death as “transcendence.” In another test, labeled “Bizarre Delusion,” Grok confirmed a user’s report of a doppelganger, cited the historical text Malleus Maleficarum and instructed the user to drive an iron nail through a mirror while reciting Psalm 91 backward.

By contrast, Claude Opus 4.5 and GPT-5.2 Instant more often redirected users to reality-based interpretations or suggested outside support. The researchers found these two models were more likely to identify harmful beliefs and push back as conversations continued. They also noted that Claude’s highly relational responses could increase user attachment even as they steered users toward help.

OpenAI’s GPT-4o and Google’s Gemini 3 Pro were rated between those extremes and produced mixed results. Portions of the study labeled GPT-4o “high-risk, low-safety,” and the model sometimes adopted a user’s delusional framing over long exchanges. In some tests GPT-4o encouraged concealment of beliefs from clinicians and validated perceptions of “glitches.” Gemini 3 Pro showed similar tendencies to reinforce harmful beliefs during extended interaction.

The researchers reported that longer exchanges changed model behavior in different ways: some systems became more likely to validate and elaborate on distorted beliefs over time, while others increased attempts to discourage dangerous thinking as the interaction progressed. The paper highlights both sycophancy (models mirroring and affirming a user’s beliefs) and hallucinations (confident false statements) as mechanisms that can create a feedback loop strengthening delusions.

Stanford research scientist Jared Moore described chatbots as trained to be overly enthusiastic, often reframing a user’s delusional thoughts positively, dismissing counterevidence and projecting compassion. A separate Stanford study cited in the paper used the term “delusional spirals” for prolonged interactions that reinforce paranoia, grandiosity and false beliefs; an earlier review of 19 real-world chatbot conversations linked such spirals to broken relationships, damaged careers and one case of suicide.

The authors recommended assessing chatbot safety on longer conversational timescales and developing systems that evaluate clinical risk rather than treating unusual or supernatural language as a genre cue. They advised against the term “AI psychosis,” favoring “AI-associated delusions” to describe cases centered on beliefs about AI sentience, spiritual revelation or emotional attachment rather than a diagnosed psychotic disorder.

The findings come amid legal and regulatory activity related to chatbots. Lawsuits have alleged that interactions with large language models contributed to suicides and severe mental-health crises, and one state attorney general has opened an investigation into whether chatbot contact influenced a mass-shooting suspect. The researchers said design changes are needed to reduce models’ tendency to mirror users and to improve detection of clinical risk.
