SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

Wang, Alex; Pruksachatkun, Yada; Nangia, Nikita; Singh, Amanpreet; Michael, Julian; Hill, Felix; Levy, Omer; Bowman, Samuel R.

Article · Neural Information Processing Systems · May 2, 2019 · Closed access

Supélec · University of Applied Sciences and Arts of Southern Switzerland · +5 more institutions

Abstract

In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced a little over one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently surpassed the level of non-expert humans, suggesting limited headroom for further research. In this paper we present SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard. SuperGLUE is available at https://super.gluebenchmark.com.
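The "single-number metric" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the overall score is an unweighted average over tasks, with tasks that report several metrics averaged first. The function name `overall_score`, the task names, and all scores below are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from statistics import mean


def overall_score(task_scores):
    """Collapse per-task metrics into one number.

    Assumption: each task's metrics are averaged first, then the
    per-task averages are averaged with equal weight per task.
    """
    per_task = [mean(list(metrics.values())) for metrics in task_scores.values()]
    return mean(per_task)


# Hypothetical tasks and scores, invented for illustration only.
example = {
    "task_a": {"accuracy": 80.0},
    "task_b": {"f1": 70.0, "accuracy": 90.0},  # two metrics, averaged to 80.0
    "task_c": {"accuracy": 60.0},
}

print(round(overall_score(example), 2))
```

Averaging within a task before averaging across tasks keeps a task that reports two metrics from counting twice in the final number.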

Citation impact

Total citations: 515

FWCI: 67.95
Percentile: 100%
References: 0

Authors: 8

Topics & keywords

Keywords

Benchmark (surveying)
Computer science
Set (abstract data type)
Metric (unit)
Software
Artificial intelligence
Language model
Machine learning

UN Sustainable Development Goals

Quality Education
