Assessing the performance of ChatGPT in answering questions regarding cirrhosis and hepatocellular carcinoma

Yeo, Yee Hui; Samaan, Jamil S.; Ng, Wee Han; Ting, Peng–Sheng; Trivedi, Hirsh D.; Vipani, Aarshi; Ayoub, Walid S.; Yang, Ju Dong; Liran, Omer; Spiegel, Brennan; Kuo, Alexander

doi:10.3350/cmh.2023.0089

Article · Clinical and Molecular Hepatology · Mar 22, 2023 · Gold OA

Cedars-Sinai Medical Center · University of Bristol · +1 more institution

Indexed in: Crossref, DOAJ, PubMed

Abstract

Methods

ChatGPT's responses to 164 questions were independently graded by two transplant hepatologists, with disagreements resolved by a third reviewer. ChatGPT's performance was also assessed using two published questionnaires and 26 questions formulated from the quality measures of cirrhosis management. Finally, its emotional support capacity was tested.

Results

We showed that ChatGPT regurgitated extensive knowledge of cirrhosis (79.1% correct) and hepatocellular carcinoma (HCC; 74.0% correct), but only small proportions of its responses (47.3% for cirrhosis, 41.1% for HCC) were labeled as comprehensive. Performance was better in the domains of basic knowledge, lifestyle, and treatment than in diagnosis and preventive medicine. For the quality measures, the model answered 76.9% of questions correctly but failed to specify decision-making cut-offs and treatment durations. ChatGPT lacked knowledge of variations in regional guidelines, such as HCC screening criteria. However, it provided practical and multifaceted advice to patients and caregivers regarding the next steps and adjusting to a new diagnosis.

Citation impact

Total citations: 619
FWCI: 22.33
Percentile: 100%
References: 28

Authors: 11

Keywords

Cirrhosis
Medicine
Hepatocellular carcinoma
Intensive care medicine
Internal medicine
