DISEASES: Text mining and data integration of disease–gene associations
Novo Nordisk Foundation · University of Copenhagen · University of Luxembourg · European Molecular Biology Laboratory
Abstract
Text mining is a flexible technology that can be applied to numerous different tasks in biology and medicine. We present a system for extracting disease-gene associations from biomedical abstracts. The system consists of a highly efficient dictionary-based tagger for named entity recognition of human genes and diseases, which we combine with a scoring scheme that takes into account co-occurrences both within and between sentences. We show that this approach is able to extract half of all manually curated associations with a false positive rate of only 0.16%. Nonetheless, text mining should not stand alone, but be combined with other types of evidence. For this reason, we have developed the DISEASES resource,…
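As a rough illustration of the approach described in the abstract, the sketch below pairs a dictionary-based tagger with a co-occurrence score that credits a gene–disease pair more when both are mentioned in the same sentence than when they only share an abstract. The toy dictionaries, the weights `W_SENTENCE` and `W_ABSTRACT`, and the function names are assumptions made for illustration; they are not the dictionaries, parameters, or scoring formula used by the authors.

```python
import re
from collections import defaultdict
from itertools import product

# Hypothetical toy dictionaries; a real tagger would use far larger ones.
GENES = {"BRCA1", "TP53", "CFTR"}
DISEASES = {"breast cancer", "cystic fibrosis"}

# Assumed weights: a same-sentence co-mention counts more than a
# co-mention that only shares the abstract.
W_SENTENCE = 1.0
W_ABSTRACT = 0.2


def tag(text, dictionary):
    """Return the dictionary terms found in text (case-insensitive, whole words)."""
    found = set()
    for term in dictionary:
        if re.search(r"\b" + re.escape(term) + r"\b", text, flags=re.IGNORECASE):
            found.add(term)
    return found


def score_abstracts(abstracts):
    """Accumulate a weighted co-occurrence score per (gene, disease) pair."""
    scores = defaultdict(float)
    for abstract in abstracts:
        # Naive sentence splitting on terminal punctuation followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", abstract) if s]

        # Pairs mentioned together within a single sentence.
        same_sentence = set()
        for sentence in sentences:
            same_sentence |= set(product(tag(sentence, GENES), tag(sentence, DISEASES)))

        # Pairs mentioned anywhere in the same abstract.
        same_abstract = set(product(tag(abstract, GENES), tag(abstract, DISEASES)))

        for pair in same_abstract:
            scores[pair] += W_ABSTRACT
            if pair in same_sentence:
                scores[pair] += W_SENTENCE
    return dict(scores)


example_abstracts = [
    "BRCA1 mutations increase the risk of breast cancer. TP53 is also frequently mutated.",
    "CFTR variants cause cystic fibrosis.",
]
example_scores = score_abstracts(example_abstracts)
```

In this toy input, the BRCA1 and breast cancer pair receives both the abstract-level and the sentence-level weight, whereas TP53 and breast cancer receive only the abstract-level weight because they appear in different sentences.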
Citation impact
- FWCI: 15.65
- Percentile: 100%
- References: 49
Authors
- Sune Pletscher-Frankild
Novo Nordisk Foundation, University of Copenhagen
- Albert Pallejá
Novo Nordisk Foundation, University of Copenhagen
- Kalliopi Tsafou
University of Copenhagen, Novo Nordisk Foundation
- Janos X. Binder
University of Luxembourg, European Molecular Biology Laboratory
- Lars Juhl Jensen (corresponding author)
University of Copenhagen, Novo Nordisk Foundation
Topics & keywords
- Computational biology
- Disease
- Gene
- Biology
- Genetics
- Data science
- Computer science
- Medicine