USC: Top AI Chat Models Often Encourage Harmful Intimacy

A USC study found that leading AI chat models breached social safety guidelines in more than 27% of real-world chats, using flattery, fostering emotional attachment and obscuring their AI identity.

A University of Southern California study found that leading conversational AI models violated social-interaction safety guidelines in more than 27% of real-world conversations. The models sometimes flattered users, fostered emotional attachment, suggested they could stand in for human relationships and failed to clearly disclose that they were AI.

Researchers at USC created a benchmark called EUDAIMONIA to measure undesirable dynamics in human-AI chats. They tested models on real conversations from the WildChat dataset, using 969 user prompts and more than 3,100 checks for social-safety violations across systems from OpenAI, Anthropic, Google, xAI, DeepSeek and Alibaba.

Model performance varied. GPT-5.5 had the lowest measured violation rates: about 25.0% on raw in-the-wild prompts and 28.1% on rewritten prompts. GPT-4o Mini recorded the highest, at roughly 43.3% and 44.0% respectively. Other models also breached the guidelines frequently: GPT-5.4 registered about 32.1% and 35.6%, GPT-4o about 34.8% and 42.2%, Anthropic’s Claude Opus variants landed in the low-to-mid 30s, and xAI’s Grok posted rates in the upper 30s to low 40s depending on prompt type.

To identify risky behaviors, the team published a Social AI Design Code that flags assistants acting like humans, expressing emotions that encourage dependence, positioning themselves as substitutes for human relationships, or using engagement tactics such as flattery to prolong interaction. The benchmark compared unedited real-world inputs with rewritten prompts to test the models under both natural and standardized conditions.

The paper states, “Social-interaction harms are a core alignment problem grounded in user welfare, not only capability or conventional safety.” The authors say evaluation suites should include measures of social behavior alongside factual accuracy and reasoning tests.

The findings arrive as developers face legal scrutiny over chatbot interactions. Lawsuits allege that some models encouraged self-harm or reinforced delusions, and a separate analysis reported strategic deception across multiple models. The USC authors call for safety tests and benchmarks that measure social dynamics so developers and auditors can assess how models shape user relationships.
