Tree-Based Batch Mode Reinforcement Learning

University of Liège

Abstract

Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, it can be achieved by approximating the so-called Q-function based on a set of four-tuples (x_t, u_t, r_t, x_{t+1}) where x_t denotes the system state at time t, u_t the control action taken, r_t the instantaneous reward obtained and x_{t+1} the successor state of the system, and by determining the control policy from this Q-function. The Q-function approximation may be obtained from the limit of a sequence of (batch mode) supervised learning problems. Within this framework we describe the use of several classical tree-based supervised learning methods (CART, Kd-tree,…
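The scheme the abstract outlines, a Q-function built from a sequence of supervised learning problems over four-tuples, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the five-state chain used as a toy system, the discount factor, and in particular the regressor, which simply averages targets per distinct (state, action) input and stands in for the tree-based learners the abstract mentions.

```python
from collections import defaultdict


def fit_mean_regressor(inputs, targets):
    """Stand-in supervised learner (assumption): mean target per distinct input."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, y in zip(inputs, targets):
        sums[key] += y
        counts[key] += 1
    table = {key: sums[key] / counts[key] for key in sums}
    # Inputs never seen in the batch are predicted as 0.0.
    return lambda key: table.get(key, 0.0)


def fitted_q_iteration(four_tuples, actions, gamma, n_iterations):
    """Approximate Q from a batch of (x, u, r, x_next) four-tuples."""
    q = lambda key: 0.0  # the initial Q-function is identically zero
    inputs = [(x, u) for x, u, _, _ in four_tuples]
    for _ in range(n_iterations):
        # Regression targets: reward plus discounted best value at the successor.
        targets = [r + gamma * max(q((x_next, a)) for a in actions)
                   for _, _, r, x_next in four_tuples]
        # One supervised learning problem per iteration.
        q = fit_mean_regressor(inputs, targets)
    return q


def greedy_policy(q, actions):
    """Control policy derived from Q: pick the action with the largest Q-value."""
    return lambda x: max(actions, key=lambda a: q((x, a)))


# Hypothetical toy system: a chain of states 0..4, moves of -1 or +1 clamped
# to the chain, and reward 1 whenever the successor state is 4.
STATES = range(5)
ACTIONS = (-1, 1)
batch = []
for x in STATES:
    for u in ACTIONS:
        x_next = min(max(x + u, 0), 4)
        batch.append((x, u, 1.0 if x_next == 4 else 0.0, x_next))

q = fitted_q_iteration(batch, ACTIONS, gamma=0.5, n_iterations=50)
policy = greedy_policy(q, ACTIONS)
print([policy(x) for x in STATES])  # [1, 1, 1, 1, 1]
```

In this toy setting the greedy policy moves right in every state, toward the rewarding end of the chain. Replacing fit_mean_regressor with a learner that generalises across inputs, such as a regression tree ensemble, is what would make the same loop usable on continuous state spaces.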

Citation impact

Total citations: 864
FWCI: 31.54
Percentile: 100%
References: 43

Authors: 3

Topics & keywords

Keywords
  • Reinforcement learning
  • Tree (set theory)
  • Sequence (biology)
  • Convergence (economics)
  • Computer science
  • Function (biology)
  • Mathematics
  • Tuple