Unsupervised Joint 3D Object Model Learning and 6D Pose Estimation for Depth-Based Instance Segmentation
Abstract: In this work, we propose a novel unsupervised approach
to jointly learn the 3D object model and estimate the 6D
poses of multiple instances of a previously unknown object, with applications to depth-based instance segmentation.
The inputs are depth images, and the learned object model
is represented by a 3D point cloud. Traditional 6D pose
estimation approaches are not sufficient to address this unsupervised problem, in which neither a CAD model of the
object nor the ground-truth 6D poses of its instances are
available during training. To solve this problem, we propose
to jointly optimize the model learning and pose estimation
in an end-to-end deep learning framework. Specifically, our
network produces a 3D object model and a list of rigid transformations that are applied to this model to generate instances. When
rendered, these instances must match the observed 3D point cloud, which is enforced by minimizing the Chamfer distance. To render the set of instance
point clouds with occlusions, the network automatically removes the occluded points in a given camera view. Extensive
experiments evaluate our technique on several object models and varying numbers of instances. We demonstrate the
application of our method to instance segmentation of depth
images of small bins of industrial parts. Compared with
popular baselines for instance segmentation, our model not
only demonstrates competitive performance, but also learns
a 3D object model that is represented as a 3D point cloud.
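
The pipeline described in the abstract (one shared model, a list of rigid transformations, occlusion removal, and a Chamfer loss) can be illustrated with a small sketch. This is not the authors' implementation: the function names (`rot_z`, `rigid_transform`, `remove_occluded`, `render_scene`, `chamfer_distance`) are hypothetical, the occlusion step assumes an orthographic camera looking along +z and uses a coarse grid as a crude z-buffer, and the Chamfer distance shown is one common variant (mean squared nearest-neighbour distance in both directions), since the abstract does not give the exact formulation. In the paper the model and poses are produced by a network and optimized end to end; here they are plain fixed values.

```python
import math

def rot_z(yaw):
    """3x3 rotation matrix about the z-axis (one simple way to build a rotation)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rigid_transform(points, rotation, translation):
    """Apply a rigid transformation (rotation then translation) to each 3D point."""
    out = []
    for p in points:
        out.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return out

def remove_occluded(points, cell=0.5):
    """Keep only the point nearest the camera (smallest z) in each grid cell.

    Assumption: orthographic camera looking along +z; `cell` is the grid size.
    """
    nearest = {}
    for p in points:
        key = (math.floor(p[0] / cell), math.floor(p[1] / cell))
        if key not in nearest or p[2] < nearest[key][2]:
            nearest[key] = p
    return list(nearest.values())

def render_scene(model, poses, cell=0.5):
    """Place one copy of the model per pose, then drop occluded points."""
    scene = []
    for rotation, translation in poses:
        scene.extend(rigid_transform(model, rotation, translation))
    return remove_occluded(scene, cell)

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty point sets."""
    def sq_dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    def one_way(src, dst):
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)

# Toy usage: a three-point "model" and two instances in a scene.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
poses = [(rot_z(0.0), (0.0, 0.0, 0.0)), (rot_z(math.pi / 2), (5.0, 0.0, 1.0))]
observed = render_scene(model, poses)
loss = chamfer_distance(render_scene(model, poses), observed)
```

With the poses that generated the observation, the loss is zero; perturbing a pose increases it, which is the signal a learning-based method would minimize.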