The ThreeDWorld Transport Challenge: A Visually Guided Task-and-Motion Planning Benchmark for Physically Realistic Embodied AI

The recent development of 3D virtual environments makes it possible to train and evaluate AI algorithms. However, most previous research focused on visual navigation and paid little attention to physical interactions.


A recent paper proposes a new embodied AI task. The agent has to move and change the state of various objects in a realistic virtual environment. In particular, an agent with two nine-degrees-of-freedom arms has to explore a virtual house, find objects scattered across different rooms, and take them to a desired location. Containers are also placed around the house, and the agent can use them to transport more than two objects at a time.
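To see why containers matter, here is a minimal back-of-the-envelope sketch in Python. It is not the benchmark's API. The functions `capacity` and `trips_needed`, and the idea that a container occupies one arm and holds a fixed number of objects (`container_slots`), are assumptions made purely for illustration.

```python
import math


def capacity(arms: int = 2, container_slots: int = 0) -> int:
    """Objects carried per trip in this toy model.

    Assumption (hypothetical, not from the paper): each free arm holds one
    object, and a container takes up one arm but holds `container_slots` objects.
    """
    if container_slots == 0:
        return arms
    return (arms - 1) + container_slots


def trips_needed(num_objects: int, per_trip: int) -> int:
    """Number of trips needed to deliver `num_objects`, carrying `per_trip` each time."""
    if per_trip <= 0:
        raise ValueError("per_trip must be positive")
    return math.ceil(num_objects / per_trip)


if __name__ == "__main__":
    n = 8  # hypothetical number of target objects
    print("without container:", trips_needed(n, capacity()))
    print("with a 3-slot container:", trips_needed(n, capacity(container_slots=3)))
```

Even in this simplified model, a container halves the number of trips for eight objects, which hints at why an agent that plans to fetch a container first can be more efficient.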

A fully physics-based API was developed for training. The results show that a pure reinforcement learning model struggles to accomplish the task. The hierarchical planning-based agent achieves better results but still fails to solve the task.

We introduce a visually-guided and physics-driven task-and-motion planning benchmark, which we call the ThreeDWorld Transport Challenge. In this challenge, an embodied agent equipped with two 9-DOF articulated arms is spawned randomly in a simulated physical home environment. The agent is required to find a small set of objects scattered around the house, pick them up, and transport them to a desired final location. We also position containers around the house that can be used as tools to assist with transporting objects efficiently. To complete the task, an embodied agent must plan a sequence of actions to change the state of a large number of objects in the face of realistic physical constraints. We build this benchmark challenge using the ThreeDWorld simulation: a virtual 3D environment where all objects respond to physics, and where agents can be controlled using a fully physics-driven navigation and interaction API. We evaluate several existing agents on this benchmark. Experimental results suggest that: 1) a pure RL model struggles on this challenge; 2) hierarchical planning-based agents can transport some objects but are still far from solving this task. We anticipate that this benchmark will empower researchers to develop more intelligent physics-driven robots for the physical world.

Research paper: Gan, C., “The ThreeDWorld Transport Challenge: A Visually Guided Task-and-Motion Planning Benchmark for Physically Realistic Embodied AI”, 2021. Link:

