Bayesian teaching enables probabilistic reasoning in large language models
Massachusetts Institute of Technology · Menlo School · +3 more institutions
Abstract
Large language models (LLMs) are increasingly used as agents that interact with users and with the world. To do so successfully, LLMs must construct representations of the world and form probabilistic beliefs about them. To provide personalized recommendations, for example, the LLM needs to infer a user’s preferences from their behavior over multiple interactions. The Bayesian inference framework lays out the optimal way for an agent to update its beliefs as it receives new information. We first show that LLMs fall far short of the standard defined by the Bayesian framework. We then show that by teaching LLMs to mimic the predictions of the normative Bayesian model, we can dramatically improve their ability to…
Citation impact
- FWCI
- 134.31
- Percentile
- 100%
- References
- 31
Authors
6Topics & keywords
- Probabilistic logic
- Normative
- Construct (python library)
- Inference
- Bayesian inference
- Bayesian probability
- Quality Education