Reinforcement learning (RL) methods hold promise for building general-purpose robotic systems. However, training them is often inefficient because exploring a purely continuous action space is difficult.
Current techniques to improve RL methods rely on better optimization or more efficient exploration. A recent paper on arXiv.org proposes another approach.
The researchers suggest designing a set of primitives with minimal human effort, making them expressive by parameterizing them with arguments, and learning a high-level policy that controls them rather than learning the low-level primitives themselves.
Primitive robot motions are used to redefine the policy-robot interface in robotic reinforcement learning. These parameterized actions are easy to design, need only be defined once, and can be reused without modification across tasks. The paper shows that a simple approach based on parameterized actions outperforms the prior state of the art by a significant margin.
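To make the idea of a parameterized action interface concrete, here is a minimal Python sketch. It is not the authors' code: the primitive names (lift, move_delta, set_gripper), the toy robot state, and the decoding rule (pick the highest-scoring primitive, then read its slice of a flat argument vector) are all assumptions chosen for illustration. On a real robot, each primitive would run a low-level controller over several time steps rather than update a tuple.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy state: end-effector position (x, y, z) and gripper opening in [0, 1].
State = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Primitive:
    """A hand-designed motion with a fixed number of continuous arguments."""
    name: str
    num_args: int
    run: Callable[[State, Sequence[float]], State]


def lift(state: State, args: Sequence[float]) -> State:
    x, y, z, g = state
    return (x, y, z + abs(args[0]), g)  # always moves upward


def move_delta(state: State, args: Sequence[float]) -> State:
    x, y, z, g = state
    dx, dy, dz = args
    return (x + dx, y + dy, z + dz, g)


def set_gripper(state: State, args: Sequence[float]) -> State:
    x, y, z, _ = state
    return (x, y, z, min(1.0, max(0.0, args[0])))  # clamp opening to [0, 1]


# The library is defined once and could be reused across tasks (illustrative only).
LIBRARY: List[Primitive] = [
    Primitive("lift", 1, lift),
    Primitive("move_delta", 3, move_delta),
    Primitive("set_gripper", 1, set_gripper),
]


def apply_action(
    state: State,
    scores: Sequence[float],
    flat_args: Sequence[float],
    library: Sequence[Primitive] = LIBRARY,
) -> Tuple[str, State]:
    """Decode one high-level action and execute the chosen primitive.

    `scores` holds one value per primitive; `flat_args` concatenates the
    arguments of all primitives in library order (an assumed layout).
    """
    if len(scores) != len(library):
        raise ValueError("expected one score per primitive")
    if len(flat_args) != sum(p.num_args for p in library):
        raise ValueError("flat_args has the wrong length")

    index = max(range(len(scores)), key=lambda i: scores[i])
    offset = sum(p.num_args for p in library[:index])
    chosen = library[index]
    args = flat_args[offset:offset + chosen.num_args]
    return chosen.name, chosen.run(state, args)
```

In this sketch, the high-level policy would emit `scores` and `flat_args` at every decision step, so the RL algorithm explores over a handful of meaningful motions and their arguments instead of raw joint commands.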
Despite the potential of reinforcement learning (RL) for building general-purpose robotic systems, training RL agents to solve robotics tasks still remains challenging due to the difficulty of exploration in purely continuous action spaces. Addressing this problem is an active area of research with the majority of focus on improving RL methods via better optimization or more efficient exploration. An alternate but important component to consider improving is the interface of the RL algorithm with the robot. In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy. These parameterized primitives are expressive, simple to implement, enable efficient exploration and can be transferred across robots, tasks and environments. We perform a thorough empirical study across challenging tasks in three distinct domains with image input and a sparse terminal reward. We find that our simple change to the action interface substantially improves both the learning efficiency and task performance irrespective of the underlying RL algorithm, significantly outperforming prior methods which learn skills from offline expert data. Code and videos are linked from the paper's arXiv page.
Research paper: Dalal, M., Pathak, D., and Salakhutdinov, R., “Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives”, 2021. Link: https://arxiv.org/abs/2110.15360