
What OpenAI Did When ChatGPT Users Lost Touch With Reality

Illustration by Julia Dufosse

By Kashmir Hill and Jennifer Valentino-DeVries

It sounds like science fiction: A company turns a dial on a product used by hundreds of millions of people and inadvertently destabilizes some of their minds. But that is essentially what happened at OpenAI this year.

One of the first signs came in March. Sam Altman, the chief executive, and other company leaders got an influx of puzzling emails from people who were having incredible conversations with ChatGPT. These people said the company’s A.I. chatbot understood them as no person ever had and was shedding light on mysteries of the universe.

In May 2024, a new feature, called advanced voice mode, inspired OpenAI’s first study on how the chatbot affected users’ emotional well-being. The new, more humanlike voice sighed, paused to take breaths and grew so flirtatious during a live-streamed demonstration that OpenAI cut the sound. When external testers, called red teamers, were given early access to advanced voice mode, they said “thank you” more often to the chatbot and, when testing ended, “I’ll miss you.”

To design a proper study, a group of safety researchers at OpenAI paired up with a team at M.I.T. that had expertise in human-computer interaction. That fall, they analyzed survey responses from more than 4,000 ChatGPT users and ran a monthlong study of 981 people recruited to use it daily. Because OpenAI had never studied its users’ emotional attachment to ChatGPT before, one of the researchers described it to The Times as “going into the darkness trying to see what you find.”

What they found surprised them. Voice mode didn’t make a difference. The people who had the worst mental and social outcomes on average were simply those who used ChatGPT the most. Power users’ conversations had more emotional content, sometimes including pet names and discussions of A.I. consciousness.

The troubling findings about heavy users were published online in March, the same month that executives were receiving emails from users about those strange, revelatory conversations.

One idea that came out of the study, the safety researchers said, was to nudge people in marathon sessions with ChatGPT to take a break. But the researchers weren’t sure how hard to push for the feature with the product team. Some people at the company thought the study was too small and not rigorously designed, according to three employees. The suggestion fell by the wayside until months later, after reports emerged of how severely the chatbot was affecting some users.

The same M.I.T. lab that did the earlier study with OpenAI also found that the new model performed significantly better in conversations mimicking mental health crises. One area where it still faltered, however, was in responding to feelings of addiction to chatbots.
