Universal Value Function Approximators
Google DeepMind (United Kingdom) · Google (United Kingdom)
Abstract
Value functions are a core component of reinforcement learning systems. The main idea is to construct a single function approximator V(s; θ) that estimates the long-term reward from any state s, using parameters θ. In this paper we introduce universal value function approximators (UVFAs) V(s, g; θ) that generalise not just over states s but also over goals g. We develop an efficient technique for supervised learning of UVFAs, by factoring observed values into separate embedding vectors for state and goal, and then learning a mapping from s and g to these factored embedding vectors. We show how this technique may be incorporated into a reinforcement learning algorithm that updates the UVFA solely from…
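The factoring step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy table of values M[s, g] = γ^|s − g| on a one-dimensional chain of states, and it uses a truncated SVD as one possible way to obtain low-rank state and goal embeddings. The names `value_table`, `factor_values`, and `uvfa_value` are hypothetical, and the second stage (learning a mapping from s and g to the embeddings) is not shown.

```python
import numpy as np


def value_table(n_states, gamma=0.9):
    """Toy table of observed values (an assumption for illustration):
    M[s, g] = gamma ** |s - g| on a 1-D chain of states."""
    idx = np.arange(n_states)
    return gamma ** np.abs(idx[:, None] - idx[None, :])


def factor_values(M, rank):
    """Factor M into state embeddings phi and goal embeddings psi
    so that M is approximated by phi @ psi.T (truncated SVD)."""
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    root = np.sqrt(S[:rank])          # split singular values evenly
    phi = U[:, :rank] * root          # one row per state
    psi = Vt[:rank, :].T * root       # one row per goal
    return phi, psi


def uvfa_value(phi, psi, s, g):
    """Value estimate for (s, g) as the dot product of the two embeddings."""
    return float(phi[s] @ psi[g])


M = value_table(10)
phi, psi = factor_values(M, rank=3)
# Frobenius norm of the reconstruction error at rank 3
error = np.linalg.norm(phi @ psi.T - M)
```

In this sketch each state and each goal receives its own embedding row, so the factorisation only covers pairs that appear in the table. Generalising to unseen states or goals would require the learned mapping that the abstract mentions.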
Citation impact
- FWCI: 46.29
- Percentile: 100%
- References: 24
Authors
- 4
Topics & keywords
- Reinforcement learning
- Embedding
- Computer science
- Core (optical fiber)
- Factoring
- Function (biology)
- Construct (python library)
- State (computer science)