Reliability of LLMs as medical assistants for the general public: a randomized preregistered study
University of Oxford · Betsi Cadwaladr University Health Board · +6 more institutions
Abstract
Global healthcare providers are exploring the use of large language models (LLMs) to provide medical advice to the public. LLMs now achieve nearly perfect scores on medical licensing exams, but this does not necessarily translate to accurate performance in real-world settings. We tested whether LLMs can assist members of the public in identifying underlying conditions and choosing a course of action (disposition) in ten medical scenarios in a controlled study with 1,298 participants. Participants were randomly assigned to receive assistance from an LLM (GPT-4o, Llama 3, Command R+) or a source of their choice (control). Tested alone, LLMs complete the scenarios accurately, correctly identifying conditions in…
Citation impact
- FWCI: 268.69
- Percentile: 100%
- References: 25
Authors
11
Topics & keywords
- Reliability (semiconductor)
- Control (management)
- Action (physics)
- Disposition
- Medical advice
- Software deployment
- Public health
- Health care