Preprint · arXiv (Cornell University) · Apr 6, 2019 · Green OA

Publicly Available Clinical BERT Embeddings

Massachusetts Institute of Technology · Microsoft Research (United Kingdom)

Indexed in: arXiv, DataCite

Abstract

Contextual word embedding models such as ELMo (Peters et al., 2018) and BERT (Devlin et al., 2018) have dramatically improved performance for many natural language processing (NLP) tasks in recent months. However, these models have been minimally explored on specialty corpora, such as clinical text; moreover, in the clinical domain, no publicly-available pre-trained BERT models yet exist. In this work, we address this need by exploring and releasing BERT models for clinical text: one for generic clinical text and another for discharge summaries specifically. We demonstrate that using a domain-specific model yields performance improvements on three common clinical NLP tasks as compared to nonspecific…

Citation impact

Total citations: 725
References: 20

Authors

7

Topics & keywords

Keywords
  • Computer science
  • Natural language processing
  • Task (project management)
  • Artificial intelligence
  • Embedding
  • Domain (mathematical analysis)
  • Identification (biology)
  • Language model
UN Sustainable Development Goals
  • Quality Education