Playing with Food: Learning Food Item Representations through Interactive Exploration

To manipulate food items successfully, robots need to know the items' material properties. Estimating these properties is difficult and time-consuming for robotic systems, whereas humans can do it by sight or touch.

A recent study proposes distinguishing the properties of food items using multimodal sensor data gathered through interactive robot exploration.

 Image credit: G.steph.rocket via Wikimedia (CC BY-SA 4.0)

The researchers created a dataset of visual, audio, proprioceptive, and force data collected through robot interactions with food items. The data were collected using 21 different food items cut into 10 different types of slices.

Convolutional neural networks are then used to learn food embeddings that encode similarities between food types. The system outperforms visual-only and audio-only baselines at recognizing properties such as hardness and juiciness.

A key challenge in robotic food manipulation is modeling the material properties of diverse and deformable food items. We propose using a multimodal sensory approach to interact and play with food that facilitates the ability to distinguish these properties across food items. First, we use a robotic arm and an array of sensors, which are synchronized using ROS, to collect a diverse dataset consisting of 21 unique food items with varying slices and properties. Afterwards, we learn visual embedding networks that utilize a combination of proprioceptive, audio, and visual data to encode similarities among food items using a triplet loss formulation. Our evaluations show that embeddings learned through interactions can successfully increase performance in a wide range of material and shape classification tasks. We envision that these learned embeddings can be utilized as a basis for planning and selecting optimal parameters for more material-aware robotic food manipulation skills. Furthermore, we hope to stimulate further innovations in the field of food robotics by sharing this food playing dataset with the research community.
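
The triplet loss mentioned in the abstract can be illustrated with a minimal sketch. The margin value, the squared Euclidean distance, and the function name `triplet_loss` are assumptions made for illustration; the text above does not specify how the authors select triplets or configure the loss. The idea is that an anchor embedding should end up closer to a similar ("positive") food item than to a dissimilar ("negative") one by at least a margin.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Mean triplet loss over a batch of embeddings with shape (batch, dim).

    Hypothetical helper: margin and distance metric are illustrative choices.
    """
    # Squared Euclidean distance from each anchor to its positive and negative.
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    # The loss is zero once the negative is farther than the positive by `margin`.
    return float(np.maximum(d_pos - d_neg + margin, 0.0).mean())

# Toy 2-D embeddings: the positive is near the anchor, the negative is far away.
anchor = np.array([[0.0, 0.0]])
positive = np.array([[0.0, 1.0]])
negative = np.array([[3.0, 0.0]])

well_separated = triplet_loss(anchor, positive, negative)  # constraint satisfied
violated = triplet_loss(anchor, negative, positive)        # roles swapped
```

In this toy case the first call yields zero loss because the negative is already far enough away, while swapping the roles produces a positive loss that training would push down.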


