Towards a holistic framework for multimodal LLM in 3D brain CT radiology report generation
National Yang Ming Chiao Tung University · Taipei Veterans General Hospital · +3 more institutions
Abstract
Multimodal large language models (MLLMs) have transformed the landscape of modern healthcare, with automated radiology report generation (RRG) emerging as a cutting-edge application. While 2D MLLM-based RRG is well established, its utility for 3D medical images remains largely unexplored. To address this gap, we curate the 3D-BrainCT dataset (18,885 text-scan pairs) and develop BrainGPT, a clinically visual instruction-tuned (CVIT) model designed for 3D CT RRG. Because we found that traditional LLM metrics failed to gauge the diagnostic quality of the generated reports, we propose feature-oriented radiology task evaluation (FORTE), an evaluation scheme that captures the clinical essence of the generated reports. Here…
Citation impact
- FWCI: 54.84
- Percentile: 100%
- References: 51
Authors (14)
- Cheng-Yi Li (corresponding)
National Yang Ming Chiao Tung University, Taipei Veterans General Hospital
- Kao-Jung Chang
National Yang Ming Chiao Tung University, University of California, Los Angeles, Taipei Veterans General Hospital
- Cheng-Fu Yang
University of California, Los Angeles
- Hsin-Yu Wu
National Yang Ming Chiao Tung University, Taipei Veterans General Hospital
- Wenting Chen
City University of Hong Kong
Topics & keywords
- Computer science
- Computed tomography
- Medicine
- Radiology
- Medical physics