Reinforcement learning algorithms are commonly used for narrowly defined tasks. In contrast, humans plan their actions using a mental model of the world’s physical structure.
Inspired by human intelligence, a recent paper proposes a sandbox environment in which an agent has to learn how a physical world works.
Illustration of the problem setup for modeling and testing OPEn, or Open-ended Physics Environment. Image credit: arXiv:2110.06912 [cs.RO]
The proposed environment combines active learning, in which a learner picks the datapoints it trains on, with world modeling, in which a learner attempts to model the dynamics of an environment. It is built as a 3D physical environment with photorealistic images and provides high-fidelity physical simulation.
Several agents were tested on the benchmark. The best results came from an agent that combines impact-driven exploration with unsupervised contrastive representation learning.
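To give a feel for what impact-driven exploration means, here is a minimal sketch of an intrinsic reward in the style of the RIDE method: the agent is rewarded for actions that change its learned state embedding a lot, and the reward is discounted by how often the resulting state has already been visited in the current episode. Whether the OPEn agents use exactly this formula is an assumption; the function name `impact_reward`, the toy 2-D embeddings, and the use of raw embeddings as visit-count keys are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def impact_reward(phi_s, phi_next, visit_count):
    """Intrinsic reward: L2 change between consecutive state embeddings,
    divided by the square root of the episodic visit count of the next state.
    (Assumed RIDE-style form, not taken from the article.)"""
    if visit_count < 1:
        raise ValueError("visit_count must be at least 1")
    impact = math.sqrt(sum((b - a) ** 2 for a, b in zip(phi_s, phi_next)))
    return impact / math.sqrt(visit_count)


# Toy episode with hypothetical 2-D state embeddings.
episode = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0), (0.0, 0.0)]

# The starting state counts as visited once.
counts = Counter({episode[0]: 1})
rewards = []
for s, s_next in zip(episode, episode[1:]):
    counts[s_next] += 1
    rewards.append(impact_reward(s, s_next, counts[s_next]))

print(rewards)
```

In this toy episode the first move earns the full reward, standing still earns nothing, and returning to an already visited state earns a reduced reward, which is the behavior that pushes an agent toward new, high-impact interactions.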
Humans have mental models that allow them to plan, experiment, and reason in the physical world. How should an intelligent agent go about learning such models? In this paper, we will study if models of the world learned in an open-ended physics environment, without any specific tasks, can be reused for downstream physics reasoning tasks. To this end, we build a benchmark Open-ended Physics ENvironment (OPEn) and also design several tasks to test learning representations in this environment explicitly. This setting reflects the conditions in which real agents (i.e. rolling robots) find themselves, where they may be placed in a new kind of environment and must adapt without any teacher to tell them how this environment works. This setting is challenging because it requires solving an exploration problem in addition to a model building and representation learning problem. We test several existing RL-based exploration methods on this benchmark and find that an agent using unsupervised contrastive learning for representation learning, and impact-driven learning for exploration, achieved the best results. However, all models still fall short in sample efficiency when transferring to the downstream tasks. We expect that OPEn will encourage the development of novel rolling robot agents that can build reusable mental models of the world that facilitate many tasks.
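The unsupervised contrastive learning mentioned in the abstract can be illustrated with an InfoNCE-style loss, a common objective for this family of methods. The sketch below is written in plain Python under stated assumptions: the function name `info_nce`, the temperature value, and the toy vectors are hypothetical, embeddings are assumed to be non-zero, and nothing here is confirmed to match the paper's actual implementation.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    # Scale a non-zero vector to unit length.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    positives[i] is the matching view for anchors[i]; every other
    positive in the batch serves as a negative for that anchor.
    """
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [dot(ai, pj) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_sum - logits[i]
    return total / len(a)


# Matching pairs give a low loss; mismatched pairs give a high one.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

Minimizing this loss pulls embeddings of two views of the same observation together and pushes embeddings of different observations apart, without requiring any labels or task reward.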
Research paper: Gan, C., Bhandwaldar, A., Torralba, A., Tenenbaum, J. B., and Isola, P., “OPEn: An Open-ended Physics Environment for Learning Without a Task”, 2021. Link: https://arxiv.org/abs/2110.06912
Link to the site of the project: https://open.csail.mit.edu/