Maximum margin planning

Ratliff, Nathan; Bagnell, J. Andrew; Zinkevich, Martin

doi:10.1145/1143844.1143936

Article · Jan 1, 2006 · Closed access

Nathan Ratliff · J. Andrew Bagnell · Martin Zinkevich

Carnegie Mellon University · University of Alberta

Indexed in Crossref

Abstract

Imitation learning of sequential, goal-directed behavior by standard supervised techniques is often difficult. We frame learning such behaviors as a maximum margin structured prediction problem over a space of policies. In this approach, we learn mappings from features to costs so that an optimal policy in an MDP with these costs mimics the expert's behavior. Further, we demonstrate a simple, provably efficient approach to structured maximum margin learning, based on the subgradient method, that leverages existing fast algorithms for inference. Although the technique is general, it is particularly relevant in problems where A* and dynamic programming approaches make learning policies tractable in problems beyond the…
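
The subgradient idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It makes several assumptions that the abstract does not state: a 3x3 grid with right/down moves only, a linear cost per cell, a per-cell loss of 1 for every cell off the expert path, and an L2 regularizer. All names (`min_cost_path`, `subgradient_step`, `rough`, `step`, `reg`) are hypothetical. A small dynamic program stands in for the "existing fast algorithm for inference".

```python
def min_cost_path(cost):
    """Min-cost path from top-left to bottom-right using right/down moves.

    Returns a grid of 0/1 visit counts. Plain dynamic programming, so
    negative cell costs (which loss augmentation can produce) are fine.
    """
    rows, cols = len(cost), len(cost[0])
    total = [[float("inf")] * cols for _ in range(rows)]
    total[0][0] = cost[0][0]
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                total[r][c] = min(total[r][c], total[r - 1][c] + cost[r][c])
            if c > 0:
                total[r][c] = min(total[r][c], total[r][c - 1] + cost[r][c])
    # Backtrack from the goal to recover which cells the best path visits.
    visits = [[0] * cols for _ in range(rows)]
    r, c = rows - 1, cols - 1
    visits[r][c] = 1
    while (r, c) != (0, 0):
        if r > 0 and (c == 0 or total[r - 1][c] <= total[r][c - 1]):
            r -= 1
        else:
            c -= 1
        visits[r][c] = 1
    return visits


def feature_counts(visits, features):
    """Sum the feature vectors of all visited cells."""
    counts = [0.0] * len(features[0][0])
    for r, row in enumerate(visits):
        for c, visited in enumerate(row):
            if visited:
                for k, value in enumerate(features[r][c]):
                    counts[k] += value
    return counts


def cell_costs(w, features, loss=None):
    """Linear cost w . f(cell), optionally minus a per-cell loss."""
    costs = []
    for r, row in enumerate(features):
        costs.append([])
        for c, f in enumerate(row):
            value = sum(wk * fk for wk, fk in zip(w, f))
            if loss is not None:
                value -= loss[r][c]
            costs[-1].append(value)
    return costs


def subgradient_step(w, features, expert, step, reg):
    """One subgradient step on a regularized structured hinge objective (assumed form)."""
    loss = [[1 - v for v in row] for row in expert]          # 1 for cells off the expert path
    planned = min_cost_path(cell_costs(w, features, loss))   # loss-augmented inference
    f_expert = feature_counts(expert, features)
    f_planned = feature_counts(planned, features)
    return [wk - step * (fe - fp + reg * wk)
            for wk, fe, fp in zip(w, f_expert, f_planned)]


# Toy data (assumed): feature 0 marks rough cells, feature 1 marks smooth cells.
rough = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
features = [[[float(x), float(1 - x)] for x in row] for row in rough]
expert = [[1, 0, 0], [1, 0, 0], [1, 1, 1]]  # expert goes down, then right

w = [0.0, 0.0]
for _ in range(50):
    w = subgradient_step(w, features, expert, step=0.1, reg=0.01)

learned_plan = min_cost_path(cell_costs(w, features))
print(w, learned_plan == expert)
```

In this sketch each step only needs one call to the planner, which is the property the abstract highlights: learning reuses whatever inference routine already solves the planning problem. On the toy grid, the learned weights end up charging more for rough cells than for smooth ones, so the unaugmented planner reproduces the expert path.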

Citation impact

Total citations: 638

FWCI: 15.85
Percentile: 100%
References: 24

Authors: 3

Topics & keywords

Keywords

Margin (machine learning)
Computer science
Subgradient method
Planner
Reinforcement learning
Inference
Artificial intelligence
Machine learning

UN Sustainable Development Goals

Sustainable cities and communities

Funding

Defense Advanced Research Projects Agency