Article · Nature · Jun 19, 2024 · Hybrid OA

Detecting hallucinations in large language models using semantic entropy

University of Oxford · Science Oxford

Indexed in: Crossref, PubMed

Abstract

Large language model (LLM) systems, such as ChatGPT [1] or Gemini [2], can show impressive reasoning and question-answering capabilities but often ‘hallucinate’ false outputs and unsubstantiated answers [3,4]. Answering unreliably or without the necessary information prevents adoption in diverse fields, with problems including fabrication of legal precedents [5] or untrue facts in news articles [6] and even posing a risk to human life in medical domains such as radiology [7]. Encouraging truthfulness through supervision or reinforcement has been only partially successful [8]. Researchers need a general method for detecting hallucinations in LLMs that works even with new and unseen questions to which humans…

Citation impact

Total citations: 576
FWCI: 180.46
Percentile: 100%
References: 49

Authors

4

Topics & keywords

Keywords
  • Computer science
  • Task (project management)
  • Artificial intelligence
  • Meaning (existential)
  • Natural language processing
  • Cognitive psychology
  • Data science
  • Psychology
UN Sustainable Development Goals
  • Peace, Justice and Strong Institutions

Funding