Artificial Intelligence, Values, and Alignment
Google DeepMind (United Kingdom)
Abstract
This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, which combines these elements in a systematic way, has considerable advantages in this context. Third, the central challenge for theorists is not to identify ‘true’…
Citation impact
- FWCI: 38.18
- Percentile: 100%
- References: 85
Authors
1. Iason Gabriel (corresponding author)
Google DeepMind (United Kingdom)
Topics & keywords
- Normative
- Philosophy of mind
- Philosophy of science
- Context (archaeology)
- Reflective equilibrium
- Ideal (ethics)
- Theory of computation