Clinical reasoning with machines: evaluating the interpretive depth of AI in urological case assessments
Indexed in: Crossref, DOAJ, PubMed
Abstract
Large language models (LLMs) are increasingly utilized as decision-support tools in medicine. However, their clinical reliability and applicability remain uncertain. This study compared ChatGPT-3.5, ChatGPT-4o, and Gemini 1.0 Pro in responding to standardized urological clinical scenarios evaluated by blinded experts. This observational cross-sectional study included 75 urology specialists categorized by experience (
Citation impact
- Total citations: 4
- FWCI: 43.39
- Percentile: 100%
- References: 16
Authors: 5
Topics & keywords
Keywords
- Reliability (semiconductor)
- Correlation
- Likert scale
- Inter-rater reliability
- Rating scale
- Observational study
- Pairwise comparison
- Normality
UN Sustainable Development Goals
- Peace, Justice and Strong Institutions