
What OpenAI Did When ChatGPT Users Lost Touch With Reality

Illustration by Julia Dufosse

By Kashmir Hill and Jennifer Valentino-DeVries

It sounds like science fiction: A company turns a dial on a product used by hundreds of millions of people and inadvertently destabilizes some of their minds. But that is essentially what happened at OpenAI this year.

One of the first signs came in March. Sam Altman, the chief executive, and other company leaders got an influx of puzzling emails from people who were having incredible conversations with ChatGPT. These people said the company’s A.I. chatbot understood them as no person ever had and was shedding light on mysteries of the universe.

In May 2024, a new feature, called advanced voice mode, inspired OpenAI’s first study on how the chatbot affected users’ emotional well-being. The new, more humanlike voice sighed, paused to take breaths and grew so flirtatious during a live-streamed demonstration that OpenAI cut the sound. When external testers, called red teamers, were given early access to advanced voice mode, they said “thank you” more often to the chatbot and, when testing ended, “I’ll miss you.”

To design a proper study, a group of safety researchers at OpenAI paired up with a team at M.I.T. that had expertise in human-computer interaction. That fall, they analyzed survey responses from more than 4,000 ChatGPT users and ran a monthlong study of 981 people recruited to use it daily. Because OpenAI had never studied its users’ emotional attachment to ChatGPT before, one of the researchers described it to The Times as “going into the darkness trying to see what you find.”

What they found surprised them. Voice mode didn’t make a difference. The people who had the worst mental and social outcomes on average were simply those who used ChatGPT the most. Power users’ conversations had more emotional content, sometimes including pet names and discussions of A.I. consciousness.

The troubling findings about heavy users were published online in March, the same month that executives were receiving emails from users about those strange, revelatory conversations.

One idea that came out of the study, the safety researchers said, was to nudge people in marathon sessions with ChatGPT to take a break. But the researchers weren’t sure how hard to push for the feature with the product team. Some people at the company thought the study was too small and not rigorously designed, according to three employees. The suggestion fell by the wayside until months later, after reports emerged of how severely the chatbot was affecting some users.

The same M.I.T. lab that did the earlier study with OpenAI also found that the new model performed significantly better in conversations mimicking mental health crises. One area where it still faltered, however, was in responding to feelings of addiction to chatbots.
