Medical large language models are vulnerable to data-poisoning attacks
NYU Langone Health · New York University · +3 more institutions
Abstract
The adoption of large language models (LLMs) in healthcare demands a careful analysis of their potential to spread false medical knowledge. Because LLMs ingest massive volumes of data from the open Internet during training, they are potentially exposed to unverified medical knowledge that may include deliberately planted misinformation. Here, we perform a threat assessment that simulates a data-poisoning attack against The Pile, a popular dataset used for LLM development. We find that replacement of just 0.001% of training tokens with medical misinformation results in harmful models more likely to propagate medical errors. Furthermore, we discover that corrupted models match the performance of their…
Citation impact
- FWCI: 69.03
- Percentile: 100%
- References: 48
Authors (33)
- Daniel Alexander Alber (corresponding)
NYU Langone Health, New York University
- Zihao Yang
NYU Langone Health, New York University
- Anton Alyakin
Washington University in St. Louis, NYU Langone Health
- Eunice Yang
NYU Langone Health, Columbia University
- Sumedha Rai
NYU Langone Health, New York University
Topics & keywords
- Misinformation
- Computer science
- Harm
- The Internet
- Internet privacy
- Computer security
- Health care
- Data science