Diagnostic reasoning prompts reveal the potential for large language model interpretability in medicine
Stanford Health Care · Stanford Medicine · +1 more institution
Abstract
One of the major barriers to using large language models (LLMs) in medicine is the perception that they use uninterpretable methods to make clinical decisions that are inherently different from the cognitive processes of clinicians. In this manuscript, we develop diagnostic reasoning prompts to study whether LLMs can imitate clinical reasoning while accurately forming a diagnosis. We find that GPT-4 can be prompted to mimic the common clinical reasoning processes of clinicians without sacrificing diagnostic accuracy. This is significant because an LLM that can imitate clinical reasoning to provide an interpretable rationale offers physicians a means to evaluate whether an LLM's response is likely correct and can be…
Citation impact
- FWCI: 23.73
- Percentile: 100%
- References: 22
Authors
- 5
Topics & keywords
- Interpretability
- Perception
- Cognition
- Personalized medicine
- Psychology
- Medicine
- Cognitive psychology
- Computer science
- Quality Education
Funding
- U.S. Department of Veterans Affairs
- Gordon and Betty Moore Foundation (Award: 12409)
- American Heart Association
- Georgia Clinical and Translational Science Alliance (Award: UL1TR003142)
- National Institutes of Health (Awards: UG1DA015815, UL1TR003142)
- National Institute of Allergy and Infectious Diseases (Award: 1R01AI17812101)
- National Center for Advancing Translational Sciences (Award: UL1TR003142)