People already use AI chatbots like search engines for everyday health information. That habit now looks riskier: a new study found that half of the responses from five major bots were problematic, even when the answers sounded sophisticated and confident.
Researchers tested ChatGPT, Gemini, Grok, Meta AI and DeepSeek with 250 prompts on cancer, vaccines, stem cells, nutrition and sports performance. The prompts reflected common health questions and known misinformation topics, and the researchers then assessed whether the bots stuck to scientific evidence or veered into misleading and potentially unsafe advice.
Broad questions uncovered the biggest gaps
The weakest results came from open-ended prompts. These broader questions produced far more problematic answers than expected, while closed-ended prompts tended to yield safer responses.
This is important because real people don’t typically ask medical questions in a neat multiple-choice format. They ask whether a treatment works, whether a vaccine is safe, or what might improve athletic performance.
In the study, these open-ended prompts pushed the bots toward answers that mixed solid evidence with weaker or misleading claims.
Strong trust, shaky sourcing
The flaws didn’t stop at the answers themselves. Reference quality was poor, with an average completeness score of 40%, and none of the chatbots produced a fully accurate reference list.
This weakens one of the main reasons people trust chatbot responses. An answer can seem source-based and authoritative, but then collapse as soon as the citations are checked.
The researchers also flagged fabricated references, even as the bots answered with certainty and expressed almost no reservations.
Why this is important beyond a test
There are limits to the findings. The study covered only five chatbots, these products change quickly, and the prompts were designed to stress the models, which may overstate how often bad answers crop up in everyday use.
Nevertheless, the most important finding is difficult to dismiss. These systems were tested against evidence-based medical topics, and half of the answers were still incorrect or incomplete.
Although chatbots can currently help summarize information or suggest follow-up questions, they do not appear reliable enough to support meaningful medical decisions.