Maximum margin planning
Carnegie Mellon University · University of Alberta
Abstract
Imitation learning of sequential, goal-directed behavior by standard supervised techniques is often difficult. We frame learning such behaviors as a maximum margin structured prediction problem over a space of policies. In this approach, we learn mappings from features to costs so that an optimal policy in an MDP with these costs mimics the expert's behavior. Further, we demonstrate a simple, provably efficient approach to structured maximum margin learning, based on the subgradient method, that leverages existing fast algorithms for inference. Although the technique is general, it is particularly relevant in problems where A* and dynamic programming approaches make learning policies tractable in problems beyond the…
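The subgradient idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the three candidate paths and their feature counts are hypothetical, an exhaustive `argmin` over those paths stands in for a real planner such as A*, the cost is taken to be linear in the features, and the step-size schedule, regularization weight, and nonnegativity projection are arbitrary choices.

```python
import numpy as np

def loss_augmented_plan(w, path_features, losses):
    """Stand-in for a planner (assumption): return the index of the path
    that minimizes cost minus loss, by brute-force enumeration."""
    scores = path_features @ w - losses
    return int(np.argmin(scores))

def mmp_subgradient(path_features, expert_idx, losses,
                    lam=0.01, step=0.1, iters=200):
    """Subgradient descent on a regularized max-margin objective with
    linear costs w . f(path). All hyperparameters are illustrative."""
    w = np.ones(path_features.shape[1])
    for t in range(1, iters + 1):
        best = loss_augmented_plan(w, path_features, losses)
        # Subgradient: expert features minus features of the
        # loss-augmented optimal path, plus the regularization term.
        g = path_features[expert_idx] - path_features[best] + lam * w
        w = w - (step / np.sqrt(t)) * g
        # Assumed projection that keeps every feature cost nonnegative.
        w = np.maximum(w, 0.0)
    return w

# Hypothetical data: rows are paths, columns are feature counts
# (say, "road" cells and "grass" cells traversed).
path_features = np.array([
    [4.0, 0.0],  # expert path: longer, road only
    [1.0, 2.0],  # shortcut, mostly grass
    [2.0, 1.0],  # mixed path
])
losses = np.array([0.0, 1.0, 1.0])  # loss of each path relative to the expert

w = mmp_subgradient(path_features, expert_idx=0, losses=losses)
planned = int(np.argmin(path_features @ w))  # plan without loss augmentation
print("learned weights:", w)
print("planned path index:", planned)
```

With these toy numbers the learned weights make the second feature more expensive than the first, so planning under the learned costs selects the expert's path. In a realistic setting the enumeration would be replaced by whatever fast planner the domain already provides.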
Citation impact
- FWCI: 15.85
- Percentile: 100%
- References: 24
Authors
- 3
Topics & keywords
- Margin (machine learning)
- Computer science
- Subgradient method
- Planner
- Reinforcement learning
- Inference
- Artificial intelligence
- Machine learning
- Sustainable cities and communities