Article · Nature Communications · Mar 6, 2025 · Gold OA

Towards a holistic framework for multimodal LLM in 3D brain CT radiology report generation

National Yang Ming Chiao Tung University · Taipei Veterans General Hospital · +3 more institutions

PubMed
Indexed in: Crossref · DOAJ · PubMed

Abstract

Multimodal large language models (MLLMs) have transformed the landscape of modern healthcare, with automated radiology report generation (RRG) emerging as a cutting-edge application. While 2D MLLM-based RRG has been well established, its utility for 3D medical images remains largely unexplored. To address this gap, we curate the 3D-BrainCT dataset (18,885 text-scan pairs) and develop BrainGPT, a clinically visual instruction-tuned (CVIT) model designed for 3D CT RRG. Because traditional LLM metrics failed to gauge the diagnostic quality of the generated reports, we propose feature-oriented radiology task evaluation (FORTE), an evaluation scheme that captures the clinical essence of the generated reports. Here…

Funding