Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper)
Ben-Gurion University of the Negev
Abstract
Because of safety risks and the sample inefficiency of training, controllers are often developed in simulation. However, minor differences between the simulation and the real world can cause a significant sim-to-real gap, which can reduce the effectiveness of the developed controller. In this paper, we examine a case study of transferring an octorotor reinforcement learning controller from simulation to the real world. First, we quantify the effectiveness of the real-world transfer by examining safety metrics. We find that although deviation increases noticeably (by around 100%) in real flights, this deviation may not be considered unsafe, as it remains within safety corridors of more than 2 m. Then, we…
Citation impact
- FWCI: 570.62
- Percentile: 100%
- References: 11
Authors
- Natan, Avraham (corresponding author), Ben-Gurion University of the Negev
- Stern, Roni, Ben-Gurion University of the Negev
- Kalech, Meir, Ben-Gurion University of the Negev
Topics & keywords
- Computer science
- Optimization algorithm
- Algorithm
- Mathematical optimization
- Mathematics