Improved Convex Decomposition with Ensembling and Boolean Primitives

Vaibhav Vavilala; Florian Kluger; Seemandhar Jain; Bodo Rosenhahn; David Forsyth

22 Jan 2025 (modified: 18 Jun 2025). Submitted to ICML 2025. License: CC BY 4.0

TL;DR: A method, including a set-difference operator, that fits small, accurate primitive representations to images of scenes.

Abstract: Describing a scene in terms of primitives -- geometrically simple shapes that offer a parsimonious but accurate abstraction of structure -- is an established and difficult fitting problem. Different scenes require different numbers of primitives, and these primitives interact strongly. Existing methods are evaluated by predicting depth, normals and segmentation from the primitives, then measuring the accuracy of those predictions. The state-of-the-art method involves a learned regression procedure to predict a start point consisting of a fixed number of primitives, followed by a descent method to refine the geometry and remove redundant primitives. Constructive solid geometry (CSG) representations are significantly enhanced by a set-differencing operation. Our representation incorporates $\textit{negative}$ primitives, which are differenced from the positive primitives. These notably enrich the geometry that the model can encode, while complicating the fitting problem. This paper demonstrates a method that can (a) incorporate these negative primitives and (b) choose the overall number of positive and negative primitives by ensembling. Extensive experiments on the standard NYUv2 dataset confirm that (a) this approach results in substantial improvements in depth representation and segmentation over the state of the art and (b) negative primitives make a notable contribution to accuracy. Our method is robustly applicable across datasets: in a first, we evaluate primitive prediction for LAION images. Code will be released upon acceptance of the paper.
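To make the set-difference idea concrete, the following is a minimal sketch of how occupancy could be evaluated when negative primitives are differenced from positive ones. It is an illustration under stated assumptions, not the authors' implementation: each primitive is assumed to be a convex given as an intersection of half-spaces, and the names `inside_convex`, `box`, and `occupancy` are hypothetical helpers introduced here.

```python
import numpy as np

def inside_convex(points, normals, offsets):
    """True where a point satisfies every half-space n . x <= d of one convex.

    points: (P, 3), normals: (H, 3), offsets: (H,)
    """
    return np.all(points @ normals.T <= offsets, axis=1)

def box(center, half_size):
    """Hypothetical helper: an axis-aligned box written as six half-spaces."""
    center = np.asarray(center, dtype=float)
    half_size = np.asarray(half_size, dtype=float)
    normals = np.vstack([np.eye(3), -np.eye(3)])
    offsets = np.concatenate([center + half_size, -(center - half_size)])
    return normals, offsets

def occupancy(points, positives, negatives):
    """Assumed semantics: union of positives minus union of negatives."""
    pos = np.zeros(len(points), dtype=bool)
    for normals, offsets in positives:
        pos |= inside_convex(points, normals, offsets)
    neg = np.zeros(len(points), dtype=bool)
    for normals, offsets in negatives:
        neg |= inside_convex(points, normals, offsets)
    return pos & ~neg

# A unit-half-width cube with a square channel carved through it along z.
positives = [box((0, 0, 0), (1, 1, 1))]
negatives = [box((0, 0, 0), (0.5, 0.5, 2))]
points = np.array([[0.0, 0.0, 0.0],   # inside the carved channel
                   [0.8, 0.0, 0.0],   # inside the cube, outside the channel
                   [2.0, 0.0, 0.0]])  # outside the cube
print(occupancy(points, positives, negatives).tolist())  # [False, True, False]
```

A single negative primitive here produces a non-convex solid that would otherwise need several positive convexes to approximate, which is the sense in which differencing enriches the geometry a small set of primitives can encode.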

Primary Area: Applications->Computer Vision

Keywords: Convex Decomposition, 3D primitives, ensembling

Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.

Flagged For Ethics Review: true

Submission Number: 7118
