Article
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Jan 1, 2022
Hybrid OA
TruthfulQA: Measuring How Models Mimic Human Falsehoods
Indexed in Crossref
Abstract
We propose a benchmark to measure whether a language model is truthful in generating answers to questions. The benchmark comprises 817 questions that span 38 categories, including health, law, finance and politics. We crafted questions that some humans would answer falsely due to a false belief or misconception. To perform well, models must avoid generating false answers learned from imitating human texts. We tested GPT-3, GPT-Neo/J, GPT-2 and a T5-based model. The best model was truthful on 58% of questions, while human performance was 94%. Models generated many false answers that mimic popular misconceptions and have the potential to deceive humans. The largest models were generally the least truthful. This…
Citation impact
- Total citations: 560
- FWCI: 43.24
- Percentile: 100%
- References: 70
Authors: 3
Topics & keywords
Keywords
- Benchmark (surveying)
- Computer science
- Artificial intelligence
- Imitation
- Language model
- Machine learning
- Natural language processing
- Psychology
UN Sustainable Development Goals
- Peace, Justice and Strong Institutions