Preprint | arXiv (Cornell University) | Jan 28, 2022 | Green OA

BNAI, NO-TOKEN, and MIND-UNITY: Pillars of a Systemic Revolution in Artificial Intelligence

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter
Indexed in: arXiv, DataCite

Abstract

There is a failure mode in large language models that we do not have a good name for, and that we therefore tend not to treat seriously enough. It is not hallucination: the model is not asserting something false. It is not refusal: the model answers at length. It is the production of responses that carry the complete outward form of careful reasoning while the cognitive work that reasoning is supposed to represent has not, in any meaningful sense, occurred. We call this theatrical compliance, and we argue that it is, in practical terms, more dangerous than either of the failure modes that currently dominate alignment research. This paper identifies the phenomenon, characterizes its five principal forms,…

Citation impact

Total citations: 4,243
References: 7

Authors

9

Topics & keywords

Keywords
  • Computer science
  • Language model
  • Benchmark (surveying)
  • Chain (unit)
  • Cognitive science
  • Artificial intelligence
  • Word (group theory)
  • Natural language processing
UN Sustainable Development Goals
  • Quality Education