Article · Frontiers in Oral Health · Feb 24, 2026 · Gold OA

Multimodal large language models for oral lesion diagnosis: a systematic review of diagnostic performance and clinical utility

International University · Khalifa University of Science and Technology · +6 more institutions

Indexed in: Crossref, DOAJ, PubMed

Abstract

Background

Diagnosing oral lesions, which range from benign conditions to oral cancer, remains challenging because of overlapping visual features and reliance on histopathology. Large language models (LLMs) can integrate textual and visual cues, but their diagnostic accuracy and clinical utility in real decision-making contexts remain uncertain. This review aimed to systematically evaluate the diagnostic performance, clinical usefulness, and limitations of LLMs in identifying oral lesions.

Methods

PubMed, CINAHL, Embase, Web of Science, and Google Scholar were searched up to 20 July 2025. Eligible studies applied LLMs (e.g., ChatGPT, Gemini, DeepSeek, Copilot, Claude) to the diagnosis or differential diagnosis of oral lesions using text, images, or multimodal inputs. Outcomes included diagnostic accuracy, agreement metrics, and qualitative assessments of explanation quality and clinical applicability. Risk of bias was assessed using an adapted version of QUADAS-2. A narrative synthesis was performed because of heterogeneity among studies.
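
As an illustration of the quantitative outcomes named above, the following minimal sketch computes diagnostic accuracy and one common agreement metric, Cohen's kappa. The choice of kappa is an assumption, since the abstract does not say which agreement metrics the included studies reported. The function names and the case labels are hypothetical and are not data from the review.

```python
from collections import Counter


def accuracy(predicted, reference):
    """Fraction of cases where the predicted diagnosis equals the reference."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    if n != len(rater_b) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: share of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if both raters labelled independently
    # with their own marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used the same single label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical example: six cases, reference standard vs. LLM output.
reference = ["benign", "malignant", "benign", "malignant", "benign", "benign"]
llm_output = ["benign", "malignant", "malignant", "malignant", "benign", "benign"]

print(f"accuracy = {accuracy(llm_output, reference):.3f}")
print(f"kappa    = {cohen_kappa(llm_output, reference):.3f}")
```

Accuracy alone can look favourable when one diagnosis dominates the sample, which is one reason agreement metrics that correct for chance are often reported alongside it.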


Funding