Journal article

Reliability of LLMs as medical assistants for the general public: a randomized preregistered study

Abstract:
Global healthcare providers are exploring the use of large language models (LLMs) to provide medical advice to the public. LLMs now achieve nearly perfect scores on medical licensing exams, but this does not necessarily translate to accurate performance in real-world settings. We tested whether LLMs can assist members of the public in identifying underlying conditions and choosing a course of action (disposition) in ten medical scenarios in a controlled study with 1,298 participants. Participants were randomly assigned to receive assistance from an LLM (GPT-4o, Llama 3, Command R+) or a source of their choice (control). Tested alone, LLMs complete the scenarios accurately, correctly identifying conditions in 94.9% of cases and disposition in 56.3% on average. However, participants using the same LLMs identified relevant conditions in fewer than 34.5% of cases and disposition in fewer than 44.2%, both no better than the control group. We identify user interactions as a challenge to the deployment of LLMs for medical advice. Standard benchmarks for medical knowledge and simulated patient interactions do not predict the failures we find with human participants. Moving forward, we recommend systematic human user testing to evaluate interactive capabilities before public deployments in healthcare.
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.1038/s41591-025-04074-y

Authors

More by this author
Institution:
University of Oxford
Division:
SSD
Department:
Oxford Internet Institute
Role:
Author
ORCID:
0000-0001-8439-5975
More by this author
Institution:
University of Oxford
Role:
Author
ORCID:
0000-0002-8954-7584
More by this author
Institution:
University of Oxford
Division:
SSD
Department:
Oxford Internet Institute
Role:
Author
ORCID:
0000-0001-5786-2750
More by this author
Institution:
University of Oxford
Division:
SSD
Department:
Oxford Internet Institute
Role:
Author
More by this author
Role:
Author
ORCID:
0000-0001-9971-0050


Publisher:
Nature Research
Journal:
Nature Medicine
Publication date:
2026-02-09
Acceptance date:
2025-10-22
DOI:
10.1038/s41591-025-04074-y
EISSN:
1546-170X
ISSN:
1078-8956


Language:
English
Pubs id:
2374045
Local pid:
pubs:2374045
Source identifiers:
W7128444586
Deposit date:
2026-02-15
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
