Generating annotated data from RGB-D video sequences is both time-consuming and expensive, as it requires annotations in both 2D and 3D. Existing techniques either demand extensive manual work or suffer from other limitations.
A recent paper on arXiv.org tackles these obstacles and proposes a simple and effective tool to semi-automatically generate bounding box and segmentation annotations for RGB-D video sequences.
Image credit: pxhere.com, CC0 Public Domain
The copy and interpolation features let the user annotate multiple keyframes by copying the first one and making adjustments. Bounding boxes with full six degrees of freedom (6 DoF) are created up to 33.95 times faster than with a naïve approach, and algorithmically guided generation of segmentation masks speeds up annotation by a factor of up to 8.55. The proposed functionalities make annotation more productive and reduce the workload, and in some cases even improve the quality of the automatically generated data.
Large labeled data sets are one of the essential basics of modern deep learning techniques. Therefore, there is an increasing need for tools that allow labeling large amounts of data as intuitively as possible. In this paper, we introduce SALT, a tool to semi-automatically annotate RGB-D video sequences to generate 3D bounding boxes for full six Degrees of Freedom (DoF) object poses, as well as pixel-level instance segmentation masks for both RGB and depth. Besides bounding box propagation through various interpolation techniques, as well as algorithmically guided instance segmentation, our pipeline also provides built-in pre-processing functionalities to facilitate the data set creation process. By making full use of SALT, annotation time can be reduced by a factor of up to 33.95 for bounding box creation and 8.55 for RGB segmentation without compromising the quality of the automatically generated ground truth.
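The bounding box propagation described above can be illustrated with a minimal sketch. The paper does not publish its exact interpolation formulas, so the following is an assumption of one common approach: linear interpolation for the translation component and spherical linear interpolation (slerp) for the rotation, applied between two user-annotated keyframe poses. All function and field names here are illustrative, not taken from SALT.

```python
import numpy as np

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    q0, q1 = np.asarray(q0, float), np.asarray(q1, float)
    dot = np.dot(q0, q1)
    if dot < 0.0:              # flip one quaternion to take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:           # nearly parallel: fall back to normalized lerp
        q = q0 + t * (q1 - q0)
        return q / np.linalg.norm(q)
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    return (np.sin((1 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)

def interpolate_pose(key_a, key_b, t):
    """Interpolate a 6-DoF box pose between two annotated keyframes.

    Each keyframe is a dict with 't' (xyz translation) and 'q' (unit
    quaternion, wxyz); t in [0, 1] is the fractional frame position.
    """
    trans = (1.0 - t) * np.asarray(key_a["t"], float) + t * np.asarray(key_b["t"], float)
    rot = slerp(key_a["q"], key_b["q"], t)
    return {"t": trans, "q": rot}

# Halfway between an identity pose and a pose translated 2 m along x
# and rotated 90 degrees about z: expect 1 m along x and a 45-degree rotation.
a = {"t": [0.0, 0.0, 0.0], "q": [1.0, 0.0, 0.0, 0.0]}
b = {"t": [2.0, 0.0, 0.0], "q": [np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)]}
mid = interpolate_pose(a, b, 0.5)
```

With such a scheme, the annotator only adjusts the box on sparse keyframes, and every in-between frame receives an automatically propagated pose, which is where the reported speedup over per-frame annotation comes from.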
Research paper: Stumpf, D., Krauß, S., Reis, G., Wasenmüller, O., and Stricker, D., “SALT: A Semi-automatic Labeling Tool for RGB-D Video Sequences”, 2021. Link: https://arxiv.org/abs/2102.10820