NCBI disease corpus: A resource for disease name recognition and concept normalization

Islamaj, Rezarta; Leaman, Robert; Lu, Zhiyong

doi:10.1016/j.jbi.2013.12.006

Article · Journal of Biomedical Informatics · Jan 4, 2014 · Hybrid OA

National Center for Biotechnology Information · National Institutes of Health · +1 more institution

Indexed in: Crossref, PubMed

Abstract

Information encoded in natural language in biomedical literature publications is only useful if efficient and reliable ways of accessing and analyzing that information are available. Natural language processing and text mining tools are therefore essential for extracting valuable information. However, the development of powerful, highly effective tools to automatically detect central biomedical concepts such as diseases is conditional on the availability of annotated corpora. This paper presents the disease name and concept annotations of the NCBI disease corpus, a collection of 793 PubMed abstracts fully annotated at the mention and concept level to serve as a research resource for the biomedical natural…

Citation impact

Total citations: 832
FWCI: 13.09
Percentile: 100%
References: 50

Authors: 3

Topics & keywords

Keywords

Computer science
Identifier
Annotation
Natural language processing
Information retrieval
Unique identifier
Resource (disambiguation)
Named-entity recognition

UN Sustainable Development Goals

Quality Education

Funding

National Institutes of Health