PMC-LLaMA: toward building open-source language models for medicine

Wu, Chaoyi; Lin, Weixiong; Zhang, Xiaoman; Zhang, Ya; Xie, Weidi; Wang, Yanfeng

doi:10.1093/jamia/ocae045

Article · Journal of the American Medical Informatics Association · Apr 13, 2024 · Bronze OA

Shanghai Jiao Tong University · Shandong Jiaotong University · +1 more institution

Indexed in: Crossref, PubMed

Abstract

Objective

Large language models (LLMs) have recently shown remarkable capabilities in natural language understanding. Although proficient in everyday conversation and question answering (QA), these models frequently struggle in domains that require precision, such as medicine, because they lack domain-specific knowledge. In this article, we describe the procedure for building a powerful, open-source language model designed specifically for medical applications, termed PMC-LLaMA.

Materials and Methods

We adapt a general-purpose LLM to the medical domain in two steps: data-centric knowledge injection, which integrates 4.8M biomedical academic papers and 30K medical textbooks, followed by comprehensive domain-specific instruction fine-tuning on 202M tokens covering medical QA, rationale for reasoning, and conversational dialogues.
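To make the instruction fine-tuning step more concrete, the sketch below shows one way the three data types named above (medical QA, rationales, and dialogues) could be normalized into a single prompt/target format. It is an illustration only: the `InstructionSample` fields, the helper names, the instruction wording, and the `### Instruction` template are all assumptions, not the format used by the authors.

```python
from dataclasses import dataclass


@dataclass
class InstructionSample:
    """One supervised example; the field names are hypothetical."""
    instruction: str
    context: str
    response: str


def from_qa(question: str, options: dict[str, str], answer: str) -> InstructionSample:
    # Multiple-choice medical QA: list options in key order, target is the option letter.
    listed = "\n".join(f"{key}. {text}" for key, text in sorted(options.items()))
    return InstructionSample(
        instruction="Answer the multiple-choice question.",
        context=f"{question}\n{listed}",
        response=answer,
    )


def from_rationale(question: str, answer: str, rationale: str) -> InstructionSample:
    # Rationale data: the target holds the reasoning followed by the answer.
    return InstructionSample(
        instruction="Answer the question and explain the reasoning.",
        context=question,
        response=f"{rationale} Answer: {answer}",
    )


def from_dialogue(patient_turn: str, doctor_turn: str) -> InstructionSample:
    # Conversational data: one patient message paired with one clinician reply.
    return InstructionSample(
        instruction="Respond to the patient's message.",
        context=patient_turn,
        response=doctor_turn,
    )


def render(sample: InstructionSample) -> tuple[str, str]:
    """Return (prompt, target) text using an assumed plain-text template."""
    prompt = (
        f"### Instruction:\n{sample.instruction}\n\n"
        f"### Input:\n{sample.context}\n\n"
        f"### Response:\n"
    )
    return prompt, sample.response
```

A call such as `render(from_qa(question, options, answer))` yields a prompt string and a target string that a standard causal language model fine-tuning loop could consume. Mapping all three sources onto one record type keeps the downstream training code independent of where each example came from.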

Citation impact

Total citations: 255
FWCI: 27.15
Percentile: 100%
References: 12

Keywords

Computer science
Open source
Natural language processing
Programming language
Software
