Self-Supervised Unseen Object Instance Segmentation via Long-Term Robot Interaction


Yangxiao Lu
UT Dallas
Ninad A Khargonkar
UT Dallas
Zesheng Xu
UT Dallas
Charles Averill
UT Dallas
Kamalesh Palanisamy
UT Dallas
Kaiyu Hang
Rice University
Yunhui Guo
UT Dallas
Nicholas Ruozzi
UT Dallas
Yu Xiang
UT Dallas
Paper Website

Paper ID 17

Session 3. Self-supervision and RL for Manipulation

Poster Session Tuesday, July 11

Poster 17

Abstract: We introduce a novel robotic system that improves unseen object instance segmentation in the real world by leveraging long-term robot interaction with objects. Previous approaches grasp or push an object and then obtain the segmentation mask of that object after a single action. Instead, our system defers the decision on segmenting objects until after a sequence of robot pushing actions. By applying multi-object tracking and video object segmentation to the images collected during pushing, our system generates segmentation masks of all the objects in these images in a self-supervised way, including images where objects are very close to each other, on which existing object segmentation networks typically make errors. We demonstrate the usefulness of our system by fine-tuning segmentation networks trained on synthetic data with real-world data collected by our system. After fine-tuning, the segmentation accuracy of the networks improves significantly, both within the same domain and across different domains. In addition, we verify that the fine-tuned networks improve top-down robotic grasping of unseen objects in the real world.
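The pipeline the abstract describes — collect images over a sequence of pushes, propagate object masks through the sequence with tracking and video object segmentation, then use the resulting masks as self-supervised labels — could be sketched roughly as follows. This is a minimal illustration under stated assumptions: every function and class name here (`collect_pseudo_labels`, `track`, `PseudoLabelDataset`) is a hypothetical placeholder, not the authors' actual code or API.

```python
# Hedged sketch of the self-supervised pseudo-labeling loop described
# in the abstract. All names are hypothetical placeholders.

from dataclasses import dataclass, field


@dataclass
class PseudoLabelDataset:
    """Accumulates (image, masks) pairs for later fine-tuning."""
    images: list = field(default_factory=list)
    masks: list = field(default_factory=list)


def collect_pseudo_labels(frames, initial_masks, track):
    """Propagate per-object masks through a pushing sequence.

    frames: images captured during a sequence of robot pushes,
        ordered in time (early frames may be heavily cluttered).
    initial_masks: per-object masks from the final frame, where
        pushing has separated the objects and segmentation is easy.
    track: a video-object-segmentation callable (hypothetical) that
        maps (current masks, next frame) -> masks in that frame.
    """
    dataset = PseudoLabelDataset()
    masks = initial_masks
    # Walk backward from the separated configuration, so cluttered
    # early frames inherit masks from frames where objects are
    # clearly distinguishable -- this is what lets the system label
    # images on which feed-forward segmentation networks fail.
    for frame in reversed(frames):
        masks = track(masks, frame)
        dataset.images.append(frame)
        dataset.masks.append(masks)
    return dataset
```

A fine-tuning step would then treat `dataset.masks` as ground-truth instance labels for `dataset.images` when updating a segmentation network pretrained on synthetic data.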