ChemDataExtractor: A Toolkit for Automated Extraction of Chemical Information from the Scientific Literature
University of Cambridge · Bridge University
Abstract
The emergence of "big data" initiatives has led to the need for tools that can automatically extract valuable chemical information from large volumes of unstructured data, such as the scientific literature. Since chemical information can be present in figures, tables, and textual paragraphs, successful information extraction often depends on the ability to interpret all of these domains simultaneously. We present a complete toolkit for the automated extraction of chemical entities and their associated properties, measurements, and relationships from scientific documents that can be used to populate structured chemical databases. Our system provides an extensible, chemistry-aware, natural language processing…
Citation impact
- FWCI: 10.10
- Percentile: 100%
- References: 30
Authors: 2
Topics & keywords
- Computer science
- Information retrieval
- Extraction (chemistry)
- Data science
- Information extraction
- Chemistry
- Chromatography
- Quality Education