Preprint | arXiv (Cornell University) | Dec 18, 2019 | Green OA

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Indexed in: arXiv, DataCite

Abstract

Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks, including text summarization. However, pre-training objectives tailored for abstractive text summarization have not been explored. Furthermore, there is a lack of systematic evaluation across diverse domains. In this work, we propose pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective. In PEGASUS, important sentences are removed/masked from an input document and are generated together as one output sequence from the remaining sentences, similar to an extractive summary. We evaluated our…
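
The gap-sentence objective described in the abstract can be illustrated with a small sketch that turns a document into a (source, target) training pair. This is not the authors' implementation. The abstract does not say how "important" sentences are chosen, so the word-overlap score below is a stand-in assumption, and the `[MASK]` placeholder, the naive sentence splitter, and all function names are hypothetical.

```python
import re
from collections import Counter

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the source


def split_sentences(text):
    # Naive splitter on ., ! or ? followed by whitespace (assumption for illustration).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]


def overlap_score(sentence, others):
    # Stand-in importance score: fraction of the sentence's words
    # that also occur somewhere in the rest of the document.
    words = re.findall(r"\w+", sentence.lower())
    rest = Counter(w for o in others for w in re.findall(r"\w+", o.lower()))
    if not words:
        return 0.0
    return sum(1 for w in words if rest[w] > 0) / len(words)


def make_gap_sentence_example(text, num_gaps=1):
    sentences = split_sentences(text)
    scores = [
        overlap_score(s, sentences[:i] + sentences[i + 1:])
        for i, s in enumerate(sentences)
    ]
    # Rank by score (highest first); ties go to the earlier sentence.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen = sorted(ranked[:num_gaps])
    # Source: the document with each selected sentence replaced by the mask.
    source = " ".join(MASK if i in chosen else s for i, s in enumerate(sentences))
    # Target: the selected sentences, joined into one output sequence.
    target = " ".join(sentences[i] for i in chosen)
    return source, target


if __name__ == "__main__":
    doc = "The cat sat on the mat. Dogs bark. The cat likes the mat."
    src, tgt = make_gap_sentence_example(doc, num_gaps=1)
    print(src)
    print(tgt)
```

An encoder-decoder model would then be trained to produce the target from the source. Swapping `overlap_score` for another importance heuristic changes which sentences are masked without altering the overall shape of the training pair.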

Citation impact

Total citations: 981
References: 45

Authors

4

Topics & keywords

Keywords
  • Automatic summarization
  • Computer science
  • Transformer
  • Encoder
  • Natural language processing
  • Artificial intelligence
  • Downstream (manufacturing)
  • Information retrieval