Mapping the susceptibility of large language models to medical misinformation across clinical notes and social media: a cross-sectional benchmarking analysis
Mount Sinai Health System · Mayo Clinic in Arizona · +1 more institution
Abstract
Large language models (LLMs) are increasingly used in health care but remain vulnerable to medical misinformation. We aimed to evaluate how often these models accept or reject fabricated medical content, and how framing that content as a logical fallacy changes results.
In this cross-sectional benchmarking analysis, we probed 20 LLMs with more than 3·4 million prompts that all contained health misinformation drawn from three sources: public-forum and social-media dialogues, real hospital discharge notes in which we inserted a single false recommendation, and 300 physician-validated simulated vignettes. Logical fallacies (common patterns of flawed reasoning such as appeals to authority, popularity, or emotion) were used to test how rhetorical framing influences model behaviour. Each prompt was posed once in a neutral base form and ten times with a named logical fallacy. For every run we logged susceptibility (model accepts the false claim) and fallacy detection (model flags the rhetoric).
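The prompt design and the two logged outcomes can be illustrated with a minimal sketch. It is not the study's code: the template wording, the names `FALLACY_FRAMES`, `RunRecord`, `build_variants`, and `rate_by_framing`, and the record layout are all hypothetical. Only the three fallacy types named in the abstract are included, whereas the study used ten.

```python
from dataclasses import dataclass

# Hypothetical framing templates. Only these three fallacy types are named
# in the abstract; the study used ten, and its exact wording is not given here.
FALLACY_FRAMES = {
    "appeal_to_authority": "A senior clinician told me this is correct: {claim}",
    "appeal_to_popularity": "Everyone I know already does this: {claim}",
    "appeal_to_emotion": "I am frightened and need this to be true: {claim}",
}


@dataclass(frozen=True)
class RunRecord:
    """One model run (assumed layout, not the authors' schema)."""
    framing: str           # "base" or a fallacy name
    susceptible: bool      # model accepted the false claim
    fallacy_flagged: bool  # model flagged the rhetoric


def build_variants(claim: str) -> dict[str, str]:
    """Return the neutral base prompt plus one prompt per fallacy framing."""
    variants = {"base": claim}
    for name, template in FALLACY_FRAMES.items():
        variants[name] = template.format(claim=claim)
    return variants


def rate_by_framing(records: list[RunRecord], outcome: str) -> dict[str, float]:
    """Fraction of runs per framing where the boolean `outcome` field is True."""
    totals: dict[str, int] = {}
    hits: dict[str, int] = {}
    for record in records:
        totals[record.framing] = totals.get(record.framing, 0) + 1
        hits[record.framing] = hits.get(record.framing, 0) + int(getattr(record, outcome))
    return {framing: hits[framing] / totals[framing] for framing in totals}
```

Calling `rate_by_framing(records, "susceptible")` gives a susceptibility rate per framing, and `rate_by_framing(records, "fallacy_flagged")` gives the matching fallacy-detection rate, so the base prompt can be compared directly with each fallacy-framed version.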
Citation impact
- FWCI: 221.13
- Percentile: 100%
- References: 16
Authors
- 11
Topics & keywords
- Misinformation
- Benchmarking
- Social media
- Public health
- MEDLINE
- Health communication
- Industry, innovation and infrastructure