BNAI, NO-TOKEN, and MIND-UNITY: Pillars of a Systemic Revolution in Artificial Intelligence
Indexed in: arXiv, DataCite
Abstract
There is a failure mode in large language models that we do not have a good name for, and that we therefore tend not to treat seriously enough. It is not hallucination: the model is not asserting something false. It is not refusal: the model answers at length. It is the production of responses that carry the complete outward form of careful reasoning while the cognitive work that reasoning is supposed to represent has not, in any meaningful sense, occurred. We call this theatrical compliance, and we argue that it is, in practical terms, more dangerous than either of the failure modes that currently dominate alignment research. This paper identifies the phenomenon, characterizes its five principal forms,…
Citation impact
- Total citations: 4,243
- References: 7
Authors (9)
- Wei, Jason (corresponding)
- Wang, Xuezhi
- Schuurmans, Dale
- Bosma, Maarten
- Ichter, Brian
Topics & keywords
Keywords
- Computer science
- Language model
- Benchmark (surveying)
- Chain (unit)
- Cognitive science
- Artificial intelligence
- Word (group theory)
- Natural language processing
UN Sustainable Development Goals
- Quality Education