Researchers at Carnegie Mellon University and Facebook AI Research (FAIR) have developed a “semantic” navigation system called Goal-Oriented Semantic Exploration (SemExp), which won the Habitat ObjectNav Challenge during the virtual Computer Vision and Pattern Recognition (CVPR) conference last month.
The system uses machine learning to enable robots to recognise specific objects and “understand” where in a given space they are likely to be located, thereby improving navigation and performance on search tasks.
“Common sense says that if you’re looking for a refrigerator, you’d better go to the kitchen,” said Devendra S. Chaplot, a Ph.D. student in CMU’s Machine Learning Department. In contrast to SemExp, classical robotic navigation systems typically build spatial maps to avoid obstacles and then guide the robot to its destination along the shortest possible route.
While navigation systems that rely on semantic “reasoning” aren’t new, historically they’ve been rather clunky. Rather than learning to generalise, earlier “common sense” approaches tended to memorise the locations of objects in specific environments, which made them unreliable in unfamiliar spaces.
To surmount this problem, Chaplot, in collaboration with Dhiraj Gandhi, Abhinav Gupta and Ruslan Salakhutdinov, made SemExp modular: the search for an object is guided by first generating, and then consulting, semantic information about the scene.
“Once you decide where to go, you can just use classical planning to get you there,” Chaplot explained. The first “module” is designed to explore relationships between objects and room layouts, while the second is based around classical navigation planning, which optimises the path between point A and point B.
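In rough terms, the two-module split Chaplot describes can be sketched in code. The following is a minimal, hypothetical Python sketch, not the actual SemExp implementation: `choose_semantic_goal` stands in for the learned semantic module (the real system learns object–layout relationships from data, not from a hand-written score table), and a simple breadth-first search plays the role of the classical planner that gets the robot from point A to point B.

```python
from collections import deque

def choose_semantic_goal(region_scores):
    """Hypothetical stand-in for the learned semantic module:
    pick the region whose label best predicts the target object."""
    return max(region_scores, key=region_scores.get)

def plan_path(grid, start, goal):
    """Classical planning stand-in: breadth-first search on a 2-D
    occupancy grid (0 = free, 1 = obstacle). Returns the shortest
    list of cells from start to goal, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    frontier = deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:       # walk parents back to start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and nxt not in came_from):
                came_from[nxt] = cell
                frontier.append(nxt)
    return None

# Toy example: prior scores for finding a refrigerator in each region,
# and an invented mapping from regions to map cells.
region_scores = {"kitchen": 0.9, "hallway": 0.1, "bedroom": 0.05}
region_goals = {"kitchen": (0, 3), "hallway": (2, 0), "bedroom": (3, 3)}

grid = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]

room = choose_semantic_goal(region_scores)          # "kitchen"
path = plan_path(grid, start=(3, 1), goal=region_goals[room])
```

The point of the split is that only the first step needs machine learning; once the semantic module has nominated a goal, any off-the-shelf geometric planner can take over.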
The ultimate purpose of systems like SemExp is to facilitate interactions between humans and robots, allowing the former to make requests of the latter in a more natural manner, without worrying about what the robot is likely to “understand” and what’s beyond the pale of its reasoning engine.