Article · Journal of Biomedical Informatics · Jan 4, 2014 · Hybrid OA

NCBI disease corpus: A resource for disease name recognition and concept normalization

National Center for Biotechnology Information · National Institutes of Health · +1 more institution

Indexed in: Crossref, PubMed

Abstract

Information encoded in natural language in biomedical literature publications is only useful if efficient and reliable ways of accessing and analyzing that information are available. Natural language processing and text mining tools are therefore essential for extracting valuable information. However, the development of powerful, highly effective tools to automatically detect central biomedical concepts such as diseases is conditional on the availability of annotated corpora. This paper presents the disease name and concept annotations of the NCBI disease corpus, a collection of 793 PubMed abstracts fully annotated at the mention and concept level to serve as a research resource for the biomedical natural…

Citation impact

Total citations: 832
FWCI: 13.09
Percentile: 100%
References: 50

Authors

3

Topics & keywords

Keywords
  • Computer science
  • Identifier
  • Annotation
  • Natural language processing
  • Information retrieval
  • Unique identifier
  • Resource (disambiguation)
  • Named-entity recognition
UN Sustainable Development Goals
  • Quality Education

Funding