Abstract: Multi-model fitting aims to robustly estimate the parameters of multiple model instances in data contaminated by noise and outliers. Most previous works employ only a single type of consensus or an implicit fusion model to represent the correlation between data points and model hypotheses. This often results in unrealistic and incorrect model fitting in the presence of noise and uncertainty. In this paper, we propose a novel method, diverse Consensuses paired with Motion estimation-based multi-Model Fitting (CMMF), which leverages three diverse types of consensus along with inter-model collaboration to enhance the effectiveness of multi-model fusion. We design a Tangent Consensus Residual Reconstruction (TCRR) module to capture motion structure information of two points at the pixel level. Additionally, we introduce a Cross Consensus Affinity (CCA) framework to strengthen the correlation between data points and model hypotheses. To address the challenge of multi-body motion estimation, we propose a Nested Consensus Clustering (NCC) strategy, which formulates multi-model fitting as a motion estimation problem. NCC explicitly establishes motion collaboration between models so that multiple models are well fitted. Extensive quantitative and qualitative experiments on four public datasets (i.e., AdelaideRMF-F, Hopkins155, KITTI, MTPV62) demonstrate that our proposed method outperforms several state-of-the-art methods.
Primary Subject Area: [Content] Multimodal Fusion
Secondary Subject Area: [Content] Vision and Language, [Generation] Multimedia Foundation Models, [Generation] Social Aspects of Generative AI
Relevance To Conference: Multi-model fitting is a fundamental research problem in computer vision and pattern recognition. Its goal is to fit multi-view media and multiple model instances in observation data contaminated by noise and outliers. In this work, we propose a novel method that pairs diverse consensuses with motion estimation-based multi-model fitting to integrate and embed multi-view media information and multiple models. We explore three diverse types of consensus to analyze correlations between data points and hypotheses. In doing so, we contribute to understanding potential motion interactions between multiple models and demonstrate how multi-model fitting approaches can leverage these interactions to comprehensively capture relevant information about data points, hypotheses, and models. This facilitates the fusion of multi-perspective media information and multiple models. We test the proposed method on four publicly available datasets: AdelaideRMF, MTPV62, KITTI, and Hopkins155. Qualitative and quantitative results validate the applicability of the proposed method, which outperforms several state-of-the-art methods in terms of clustering accuracy and fitting performance.
Submission Number: 5293