A Framework for Studying Reinforcement Learning and Sim-to-Real in Robot Soccer

In Very Small Size Soccer, two teams of three robots compete to score goals against each other. The robots' behaviour is usually hand-programmed for each situation. Reinforcement learning could improve the robots' abilities; however, training on real robots is impractical because it wears out hardware and consumes energy and time.

Very Small Size robotic soccer competition. Image credit: Hansenclever F. Bassani, Renie A. Delgado, José Nilton de O. Lima Junior, Heitor R. Medeiros, Pedro H. M. Braga, Mateus G. Machado, Lucas H. C. Santos, Alain Tapp, arXiv:2008.12624

A recent study proposes a framework for sim-to-real training: robots are trained in simulation, and the learned policy is then transferred to the real world. The study shows that this strategy yields a broader repertoire of behaviours than human-designed policies, although strikes are slower and less precise. The effectiveness of reinforcement learning was evaluated at the 2019 Latin American Robotics Competition. There, for the first time, a team of robots trained with reinforcement learning won against teams operated by human-designed policies.

This article introduces an open framework, called VSSS-RL, for studying Reinforcement Learning (RL) and sim-to-real in robot soccer, focusing on the IEEE Very Small Size Soccer (VSSS) league. We propose a simulated environment in which continuous or discrete control policies can be trained to control the complete behavior of soccer agents and a sim-to-real method based on domain adaptation to adapt the obtained policies to real robots. Our results show that the trained policies learned a broad repertoire of behaviors that are difficult to implement with handcrafted control policies. With VSSS-RL, we were able to beat human-designed policies in the 2019 Latin American Robotics Competition (LARC), achieving 4th place out of 21 teams, being the first to apply Reinforcement Learning (RL) successfully in this competition. Both environment and hardware specifications are available open-source to allow reproducibility of our results and further studies.
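To make the distinction between "continuous or discrete control policies" concrete, here is a minimal Python sketch of how either kind of policy output could be turned into wheel speeds for a two-wheeled, differential-drive robot. It is an illustration only and is not taken from the VSSS-RL code: the constants, the five-entry discrete action table, and the function names are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical robot constants, chosen only for illustration.
MAX_LINEAR = 1.0    # maximum linear speed, m/s
MAX_ANGULAR = 6.0   # maximum angular speed, rad/s
WHEEL_BASE = 0.075  # distance between the two wheels, m

# Hypothetical discrete action set: (linear fraction, angular fraction).
DISCRETE_ACTIONS: List[Tuple[float, float]] = [
    (0.0, 0.0),    # stop
    (1.0, 0.0),    # drive forward
    (-1.0, 0.0),   # drive backward
    (0.0, 1.0),    # spin left
    (0.0, -1.0),   # spin right
]


def clip(x: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """Clamp x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def to_wheel_speeds(linear_frac: float, angular_frac: float) -> Tuple[float, float]:
    """Convert normalized body speeds to (left, right) wheel speeds in m/s."""
    v = clip(linear_frac) * MAX_LINEAR
    w = clip(angular_frac) * MAX_ANGULAR
    # Standard differential-drive kinematics.
    left = v - w * WHEEL_BASE / 2
    right = v + w * WHEEL_BASE / 2
    return left, right


def from_discrete(index: int) -> Tuple[float, float]:
    """A discrete policy outputs an index into a fixed action table."""
    return to_wheel_speeds(*DISCRETE_ACTIONS[index])


def from_continuous(action: Sequence[float]) -> Tuple[float, float]:
    """A continuous policy outputs two real numbers, clipped to [-1, 1]."""
    return to_wheel_speeds(action[0], action[1])
```

Under these assumptions, `from_discrete(1)` and `from_continuous([1.0, 0.0])` produce the same forward command. The difference is that the continuous policy can also pick any intermediate speed, while the discrete one is limited to the entries in its table.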

Link: https://arxiv.org/abs/2008.12624
