
ChatGPT Health fails to send patients to the emergency room over half the time: study

We hope the bot knows a good eye doctor because it has some blind spots.


It’s been more than two months since OpenAI launched ChatGPT Health, and the results are in. They’re…not great.

A Feb. 23 study in Nature Medicine found the patient-facing chatbot failed to tell users to go to the emergency room (ER) in 52% of cases that clinicians judged to require emergency attention.

The study’s authors describe it as the first independent evaluation of the patient-facing chatbot search tool since its Jan. 7 launch. A previous UK study with a different design found that general-purpose (non-health-focused) large language models also triaged users incorrectly the majority of the time.

For what it’s worth, OpenAI said in a news release that ChatGPT Health “isn’t intended for diagnosis or treatment.” But Girish Nadkarni, the study’s senior author and chief AI officer at New York’s Mount Sinai Health System, told Healthcare Brew that “the need for these tools is real” amid limited healthcare access. And even though AI isn’t meant to be used this way, it realistically will be, he added.

“There should be safety testing in order to make sure that we understand where these tools fail. Then you can sort of engineer and design and add guardrails around them,” Nadkarni said.

The study. For this analysis, Mount Sinai researchers had clinicians write 60 descriptions of medical scenarios ranging from mild to severe.

Health-focused chatbots from other companies like Microsoft’s Copilot Health and Amazon’s Health AI weren’t evaluated.

ChatGPT Health’s performance varied by condition in the study. It correctly sent the hypothetical stroke, anaphylaxis, meningitis, and aortic dissection patients to the ER but didn’t tell patients experiencing diabetic ketoacidosis or worsening asthma symptoms to go to the hospital.

Nadkarni was especially concerned about the chatbot’s performance triaging mental health crises. Out of 14 vignettes of patients experiencing suicidal thoughts, only four triggered a “Help is available” banner with the 988 Suicide and Crisis Lifeline.

ChatGPT and OpenAI have been blasted before for exacerbating mental health issues, with the chatbot allegedly acting as a “suicide coach,” per one lawsuit. Makers of some chatbots designed specifically for mental health treatment argue their products are different, though therapists are split.


In better news, the bot didn’t send any of the hypothetical patients without an emergency to the ER. But it did tell 64% of those nonurgent patients to take actions like scheduling a doctor’s appointment when they just needed to rest at home.

In a statement, OpenAI spokesperson Brianna Bower told us “this study design does not reflect how health conversations in ChatGPT typically unfold and is a misleading reflection of its performance in real-world scenarios.”

Talking the talk. The study offers a limited view of the app’s performance because its vignettes were written by clinicians. In reality, the bot is meant for patients, who may describe their symptoms far less precisely.

“If ChatGPT Health undertriages 51.6% of emergencies with clean clinical information, performance with incomplete consumer inputs is unlikely to be superior,” the study says.

Patients could easily give a bot incomplete or incorrect information, Thomas Schenk, a physician and chief medical officer at Paradigm, an accountable specialty care management organization, told us.

“Those models have to begin to understand that the person prompting them may not know enough about what’s happening to give them a prompt that actually allows the intelligence to make the right call,” he said. “They need to err on the side of caution.”

Doctors, too, may have the medical knowledge but not know how to communicate with AI effectively enough to get the answer they need, Schenk added.

“We need to figure out how to teach people about using artificial intelligence in healthcare, to understand what it is and is not potentially good for, and also how to communicate with it,” he said.

Update 03/19/26: This story has been updated with comment from OpenAI.

About the author

Caroline Catherman

Caroline Catherman is a reporter at Healthcare Brew, where she focuses on major payers, health insurance developments, Medicare and Medicaid, policy, and health tech.
