Automatic Keyword Extraction from Individual Documents
Pacific Northwest National Laboratory
Abstract
Keywords are widely used to define queries within information retrieval (IR) systems because they are easy to define, revise, remember, and share. This chapter describes rapid automatic keyword extraction (RAKE), an unsupervised, domain-independent, and language-independent method for extracting keywords from individual documents. It provides details of the algorithm and its configuration parameters, and presents results on a benchmark dataset of technical abstracts, showing that RAKE is more computationally efficient than TextRank while achieving higher precision and comparable recall scores. The chapter then describes a novel method for generating stoplists, which is used to configure RAKE for specific domains…
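The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the commonly described RAKE procedure: split the text into candidate phrases at stop words and punctuation, score each word by its degree divided by its frequency, and score each phrase as the sum of its word scores. The tiny `STOPWORDS` set, the tokenization rules, and the function names `candidate_phrases`, `word_scores`, and `rake` are illustrative assumptions, not the chapter's own implementation or configuration.

```python
import re
from collections import defaultdict

# Tiny illustrative stoplist (an assumption; practical stoplists are far larger).
STOPWORDS = {"a", "an", "and", "are", "as", "for", "from", "in",
             "is", "of", "on", "the", "to", "with"}


def candidate_phrases(text, stopwords=STOPWORDS):
    """Split text into candidate phrases at punctuation and stop words."""
    phrases = []
    # Punctuation acts as a phrase delimiter.
    for fragment in re.split(r'[.,;:!?()\[\]"\n]+', text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9][a-z0-9\-]*", fragment):
            if word in stopwords:
                # A stop word closes the phrase collected so far.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def word_scores(phrases):
    """Score each word as degree / frequency over all candidate phrases."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurrences within the phrase, the word included.
            degree[word] += len(phrase)
    return {word: degree[word] / freq[word] for word in freq}


def rake(text, stopwords=STOPWORDS):
    """Return (phrase, score) pairs sorted by descending score, then by phrase."""
    phrases = candidate_phrases(text, stopwords)
    scores = word_scores(phrases)
    ranked = {" ".join(p): sum(scores[w] for w in p) for p in set(phrases)}
    return sorted(ranked.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    sample = "Keyword extraction is easy. Automatic keyword extraction from documents."
    for phrase, score in rake(sample):
        print(f"{score:5.2f}  {phrase}")
```

Because the only language-specific inputs in this sketch are the stoplist and the delimiter pattern, swapping them is enough to adapt it to another domain or language, which is in the spirit of the stoplist-generation method the abstract mentions.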
Citation impact
- References: 12
Authors: 4
Topics & keywords
- Computer science
- Keyword extraction
- Rake
- Benchmark (surveying)
- Generality
- Information retrieval
- Domain (mathematical analysis)
- Natural language processing
- Quality Education