Deep reinforcement learning has been successfully applied to many real-world robotic tasks. However, it is limited to domains in which a simulator is available or to environments that have been tailored and instrumented for the agent’s training.
An interactive learning approach is useful for training more than just industrial robotic systems. Image credit: Auledas via Wikimedia, CC-BY-SA-4.0
Therefore, a recent paper proposes an interactive learning approach in which a human teacher provides evaluative and corrective feedback to the robot during training.
The method does not require a reward function and thus avoids credit assignment and reward exploitation issues. The human teacher can watch the policy's performance improve and decide when to stop training. Furthermore, the presence of the human ensures that the robot can be stopped in case of unsafe behavior. Real-world experiments show that the proposed approach allows a physical robot to be trained to solve complex manipulation tasks in less than one hour.
Learning to solve complex manipulation tasks from visual observations is a dominant challenge for real-world robot learning. Deep reinforcement learning algorithms have recently demonstrated impressive results, although they still require an impractical amount of time-consuming trial-and-error iterations. In this work, we consider the promising alternative paradigm of interactive learning where a human teacher provides feedback to the policy during execution, as opposed to imitation learning where a pre-collected dataset of perfect demonstrations is used. Our proposed CEILing (Corrective and Evaluative Interactive Learning) framework combines both corrective and evaluative feedback from the teacher to train a stochastic policy in an asynchronous manner, and employs a dedicated mechanism to trade off human corrections with the robot’s own experience. We present results obtained with our framework in extensive simulation and real-world experiments that demonstrate that CEILing can effectively solve complex robot manipulation tasks directly from raw images in less than one hour of real-world training.
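To make the idea of combining corrective and evaluative feedback more concrete, here is a minimal Python sketch of a feedback-weighted imitation loss for a stochastic (Gaussian) policy. It is an illustration under stated assumptions, not the CEILing implementation: the article does not describe the paper's actual trade-off mechanism, so the names `Sample`, `sample_weight` and `feedback_weighted_loss`, the one-dimensional action, and the weights `correction_weight` and `experience_weight` are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Sample:
    observation: List[float]
    action: float    # action to imitate (the teacher's correction, if one was given)
    corrected: bool  # True if the teacher overrode the robot's action
    good: bool       # evaluative label the teacher gave to the robot's own action


def gaussian_nll(action: float, mean: float, std: float) -> float:
    """Negative log-likelihood of `action` under a 1-D Gaussian N(mean, std^2)."""
    return 0.5 * math.log(2.0 * math.pi * std ** 2) + (action - mean) ** 2 / (2.0 * std ** 2)


def sample_weight(sample: Sample,
                  correction_weight: float = 1.0,
                  experience_weight: float = 0.5) -> float:
    """Hypothetical weighting: corrections count fully, the robot's own
    actions count less, and actions labeled as bad are not imitated."""
    if sample.corrected:
        return correction_weight
    return experience_weight if sample.good else 0.0


def feedback_weighted_loss(samples: List[Sample],
                           policy: Callable[[List[float]], Tuple[float, float]],
                           correction_weight: float = 1.0,
                           experience_weight: float = 0.5) -> float:
    """Weighted average NLL over the samples; `policy` maps an observation
    to the (mean, std) of a Gaussian over actions."""
    total, weight_sum = 0.0, 0.0
    for s in samples:
        w = sample_weight(s, correction_weight, experience_weight)
        if w == 0.0:
            continue  # bad, uncorrected actions do not contribute
        mean, std = policy(s.observation)
        total += w * gaussian_nll(s.action, mean, std)
        weight_sum += w
    return total / weight_sum if weight_sum > 0.0 else 0.0
```

In this sketch, minimizing the loss pulls the policy toward the teacher's corrections and, more weakly, toward the robot's own actions that the teacher judged good. In a real system the samples would be gathered while the robot acts and the policy would be a neural network updated in the background on this buffer, which is one plausible reading of the "asynchronous" training mentioned in the abstract.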
Link to the project site: https://ceiling.cs.uni-freiburg.de/
Research paper: Chisari, E., Welschehold, T., Boedecker, J., Burgard, W., and Valada, A., “Correct Me if I am Wrong: Interactive Learning for Robotic Manipulation”, 2021. Link to the article: https://arxiv.org/abs/2110.03316