Article | Jan 1, 2020 | Gold OA
On the Sentence Embeddings from Pre-trained Language Models
Indexed in Crossref
Abstract
Pre-trained contextual representations such as BERT have achieved great success in natural language processing. However, sentence embeddings from pre-trained language models without fine-tuning have been found to capture the semantic meaning of sentences poorly. In this paper, we argue that the semantic information in BERT embeddings is not fully exploited. We first reveal the theoretical connection between the masked language model pre-training objective and the semantic similarity task, and then analyze the BERT sentence embeddings empirically. We find that BERT always induces a non-smooth, anisotropic semantic space of sentences, which harms its performance on semantic similarity. To…
Citation impact
- Total citations: 539
- FWCI: 51.27
- Percentile: 100%
- References: 45
Authors
6
Topics & keywords
Topics
Keywords
- Computer science
- Natural language processing
- Sentence
- Semantic similarity
- Artificial intelligence
- Embedding
- Similarity (geometry)
- Language model
UN Sustainable Development Goals
- Quality Education