On the Sentence Embeddings from Pre-trained Language Models

Li, Bohan; Zhou, Hao; He, Junxian; Wang, Mingxuan; Yang, Yiming; Li, Lei

doi:10.18653/v1/2020.emnlp-main.733

Article · Jan 1, 2020 · Gold OA

Carnegie Mellon University

Indexed in Crossref

Abstract

Pre-trained contextual representations like BERT have achieved great success in natural language processing. However, sentence embeddings from pre-trained language models without fine-tuning have been found to capture the semantic meaning of sentences poorly. In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited. We first reveal the theoretical connection between the masked language model pre-training objective and the semantic similarity task, and then analyze the BERT sentence embeddings empirically. We find that BERT always induces a non-smooth anisotropic semantic space of sentences, which harms its performance on semantic similarity. To…

Citation impact

Total citations: 539

FWCI: 51.27
Percentile: 100%
References: 45

Authors: 6

Topics & keywords

Keywords

Computer science
Natural language processing
Sentence
Semantic similarity
Artificial intelligence
Embedding
Similarity (geometry)
Language model

UN Sustainable Development Goals

Quality Education
