Can deepseek and ChatGPT be used in the diagnosis of oral pathologies?

Kaygisiz, Ömer Faruk; Teke, Mehmet Turhan

doi:10.1186/s12903-025-06034-x

Article · BMC Oral Health · Apr 25, 2025 · Gold OA

Ömer Faruk Kaygisiz, Mehmet Turhan Teke

Gaziantep University

Indexed in: Crossref, DOAJ, PubMed

Abstract

Objective

Artificial intelligence (AI) has been widely used in various medical fields to support diagnostic development. The development of different AI techniques has made important contributions to early diagnosis. This research compares and evaluates the diagnostic accuracy of the ChatGPT-4o and DeepSeek-V3 AI applications in 16 clinical case scenarios in oral pathology.

Methodology

The authors prepared 16 imaginary clinical case scenarios of oral pathologies. For each case, two AI applications, DeepSeek-V3 and ChatGPT-4o, were asked to provide three possible preliminary diagnoses and to reference the literature for these diagnoses. The diagnoses of both AI applications were evaluated on a Likert scale by 20 specialists from two different specialties.

Results

The mean score was 4.02 ± 0.36 for DeepSeek-V3 and 3.15 ± 0.41 for ChatGPT-4o. According to the average scores, both models performed at a moderate to high level. In the comparison between the two AI models, DeepSeek-V3 was statistically better in 9 of the 16 clinical scenarios, while ChatGPT-4o was statistically better in 1. Overall, DeepSeek-V3 was statistically more successful than ChatGPT-4o (p = 0.024). In terms of references, ChatGPT-4o provided 62 references, 50 of which were fake, while 8 of the 48 references provided by DeepSeek-V3 were fake.
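To make the reference counts above easier to compare, they can be expressed as proportions. The short Python sketch below is only an illustration of that arithmetic. It uses the counts reported in the Results, while the variable and function names are hypothetical and not taken from the paper. It does not reproduce the authors' statistical analysis.

```python
# Reference counts as reported in the Results paragraph.
# The dictionary layout and names are illustrative assumptions.
reported = {
    "ChatGPT-4o": {"total": 62, "fake": 50},
    "DeepSeek-V3": {"total": 48, "fake": 8},
}


def fake_share(counts):
    """Return the fraction of a model's references that were fake."""
    return counts["fake"] / counts["total"]


# Share of fake references per model.
shares = {model: fake_share(counts) for model, counts in reported.items()}

for model, share in shares.items():
    print(f"{model}: {share:.1%} of references were fake")
```

On these counts, roughly four in five of the ChatGPT-4o references were fake, against about one in six for DeepSeek-V3.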

Citation impact

Total citations: 48

FWCI: 25.25
Percentile: 100%
References: 36

Topics & keywords

Medicine
Oral and maxillofacial surgery
Dermatology
General surgery
Dentistry
