Tree-Based Batch Mode Reinforcement Learning

Ernst, Damien; Geurts, Pierre; Wehenkel, Louis

Article, Open Repository and Bibliography (University of Liège), Dec 1, 2005, Green OA

Abstract

Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, it can be achieved by approximating the so-called Q-function based on a set of four-tuples (x_t, u_t, r_t, x_{t+1}), where x_t denotes the system state at time t, u_t the control action taken, r_t the instantaneous reward obtained and x_{t+1} the successor state of the system, and by determining the control policy from this Q-function. The Q-function approximation may be obtained from the limit of a sequence of (batch mode) supervised learning problems. Within this framework we describe the use of several classical tree-based supervised learning methods (CART, Kd-tree,…
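The idea of obtaining the Q-function from a sequence of batch supervised learning problems can be sketched as follows. This is a minimal illustration, not the authors' implementation: the discount factor gamma, the zero initial Q-function, the fixed number of iterations, and all function names are assumptions made here. The regressor fit_table is a hypothetical stand-in that averages targets per (state, action) pair; a tree-based learner would take its place in the setting the abstract describes.

```python
from collections import defaultdict


def fit_table(inputs, targets):
    """Hypothetical stand-in regressor: average the targets per distinct input.

    A tree-based supervised learner would be plugged in here instead.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for key, y in zip(inputs, targets):
        sums[key] += y
        counts[key] += 1
    means = {key: sums[key] / counts[key] for key in sums}
    return lambda key: means.get(key, 0.0)


def fitted_q_iteration(four_tuples, actions, fit, gamma=0.9, n_iterations=50):
    """Approximate Q by solving one regression problem per iteration.

    four_tuples: list of (x_t, u_t, r_t, x_next) samples.
    gamma and n_iterations are illustrative assumptions.
    """
    inputs = [(x, u) for x, u, _r, _x_next in four_tuples]
    q = lambda key: 0.0  # assumed starting point: Q is zero everywhere
    for _ in range(n_iterations):
        # Regression targets: reward plus discounted best value at the successor.
        targets = [
            r + gamma * max(q((x_next, a)) for a in actions)
            for _x, _u, r, x_next in four_tuples
        ]
        q = fit(inputs, targets)
    return q


def greedy_policy(q, actions):
    """Control policy derived from Q: pick the action with the largest value."""
    return lambda x: max(actions, key=lambda a: q((x, a)))


# Toy batch (an assumption for illustration): a 4-state chain where moving
# into state 3 yields reward 1 and every other transition yields 0.
actions = (-1, 1)
batch = []
for x in range(4):
    for u in actions:
        x_next = min(max(x + u, 0), 3)
        r = 1.0 if x_next == 3 else 0.0
        batch.append((x, u, r, x_next))

q_hat = fitted_q_iteration(batch, actions, fit_table)
policy = greedy_policy(q_hat, actions)
print([policy(x) for x in range(4)])
```

Passing the regressor as the fit argument keeps the iteration independent of the supervised learning method, so different learners can be compared on the same batch of four-tuples.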

Citation impact

Total citations: 864
FWCI: 31.54
Percentile: 100%
References: 43

Authors: 3

Topics & keywords

Keywords

Reinforcement learning
Tree
Sequence
Convergence
Computer science
Function
Mathematics
Tuple

UN Sustainable Development Goals

Peace, Justice and strong institutions
