Abstract: Category-level 9DoF object pose estimation has emerged
recently, but both correspondence-based and direct regression methods
remain limited in accuracy because of large intra-category variations
in object shape, color, and other attributes. Orthogonal to them, this work presents a
category-level object pose and size refiner, CATRE, which iteratively refines pose estimates from point clouds to produce accurate
results. Given an initial pose estimate, CATRE predicts the relative transformation between the initial pose and the ground truth by aligning
the partially observed point cloud with an abstract shape prior. Specifically,
we propose a novel disentangled architecture that accounts for the inherent
distinctions between rotation and translation/size estimation. Extensive
experiments show that our approach remarkably outperforms state-of-the-art
methods on the REAL275, CAMERA25, and LM benchmarks while running at
speeds of up to ≈85.32 Hz, and achieves competitive results on
category-level tracking. We further demonstrate that CATRE can perform pose
refinement on unseen categories. Code and trained models are available.
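
The iterative refinement described above can be illustrated with a minimal sketch. This is not the authors' implementation: the update rule (left-multiplied rotation, additive translation and size), the names `refine_pose` and `make_toy_predictor`, and the toy predictor standing in for the learned network are all assumptions made for illustration. The toy predictor reads the ground truth directly and only handles rotations about the z-axis, which a real refiner obviously cannot do.

```python
import numpy as np


def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def refine_pose(R, t, s, predict_delta, num_iters=4):
    """Iteratively apply predicted relative updates to a 9DoF pose.

    R: (3, 3) rotation, t: (3,) translation, s: (3,) size.
    predict_delta is a hypothetical callable returning (dR, dt, ds).
    """
    for _ in range(num_iters):
        dR, dt, ds = predict_delta(R, t, s)
        R = dR @ R   # assumed: rotation update composed by left-multiplication
        t = t + dt   # assumed: additive translation update
        s = s + ds   # assumed: additive size update
    return R, t, s


def make_toy_predictor(R_gt, t_gt, s_gt, gain=0.5):
    """Hypothetical stand-in for a learned network.

    Returns a fraction `gain` of the true residual at each call, so the
    error shrinks geometrically over iterations. Valid only for poses
    whose rotations are about the z-axis.
    """
    def predict_delta(R, t, s):
        residual = R_gt @ R.T
        angle = np.arctan2(residual[1, 0], residual[0, 0])
        return rot_z(gain * angle), gain * (t_gt - t), gain * (s_gt - s)
    return predict_delta


# Toy ground truth and a perturbed initial estimate.
R_gt, t_gt, s_gt = rot_z(0.6), np.array([0.1, 0.2, 0.9]), np.array([0.3, 0.2, 0.1])
R0, t0, s0 = rot_z(0.2), np.array([0.0, 0.3, 1.0]), np.array([0.25, 0.25, 0.15])

predictor = make_toy_predictor(R_gt, t_gt, s_gt)
R_ref, t_ref, s_ref = refine_pose(R0, t0, s0, predictor, num_iters=4)

print("translation error before:", np.linalg.norm(t_gt - t0))
print("translation error after: ", np.linalg.norm(t_gt - t_ref))
```

Keeping the rotation update separate from the translation and size updates in the loop mirrors the disentangled treatment the abstract describes, although how the actual network parameterizes each update is not specified here.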