Abstract: The inherent variability and unpredictability of open multi-view learning scenarios introduce considerable ambiguity into the learning and decision-making processes of predictors. Predictors must therefore not only recognize familiar patterns but also adaptively interpret unknown ones that fall outside the training scope. To address this challenge, we propose an Ambiguity-Aware Multi-view Learning Framework, which integrates four synergistic modules end to end to achieve generalizability and reliability beyond the known. By introducing mixed samples to broaden the learning sample space, accompanied by corresponding soft labels that encapsulate their inherent uncertainty, the proposed method adapts in advance to the distribution of potentially unknown samples. Furthermore, instance-level sparse inference is used to learn sparse approximated points in the multi-view embedding space, and individual view representations are gated by view-level confidence mappings. Finally, a multi-view consistent representation is obtained by dynamically assigning weights based on the degree of cluster-level dispersion. Extensive experiments demonstrate that our approach is effective and stable compared with other state-of-the-art methods in open-world recognition settings.
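Illustrative sketch (not the authors' implementation): the abstract does not specify how mixed samples and their soft labels are built, so the snippet below assumes a standard mixup-style convex combination applied view by view, with a mixing coefficient drawn from a Beta distribution. The function names `one_hot` and `mix_multiview`, the Beta parameters, and the toy data are all hypothetical.

```python
import random


def one_hot(label, num_classes):
    """Return a one-hot list for an integer class label."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def mix_multiview(views_a, label_a, views_b, label_b, num_classes, lam):
    """Convex-combine two multi-view samples and their one-hot labels.

    Assumption: the same coefficient `lam` is shared by every view, so the
    soft label describes the mixed sample consistently across views.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Mix each view separately; views may have different dimensionalities.
    mixed_views = [
        [lam * xa + (1.0 - lam) * xb for xa, xb in zip(va, vb)]
        for va, vb in zip(views_a, views_b)
    ]
    ya = one_hot(label_a, num_classes)
    yb = one_hot(label_b, num_classes)
    # The soft label carries the same mixing proportions as the features.
    soft_label = [lam * p + (1.0 - lam) * q for p, q in zip(ya, yb)]
    return mixed_views, soft_label


# Toy example: two samples, each with a 2-d view and a 3-d view.
rng = random.Random(0)
lam = rng.betavariate(0.4, 0.4)  # hypothetical Beta(0.4, 0.4) prior
views_a = [[1.0, 0.0], [2.0, 2.0, 2.0]]
views_b = [[0.0, 1.0], [0.0, 0.0, 0.0]]
mixed, soft = mix_multiview(views_a, 0, views_b, 2, num_classes=3, lam=lam)
```

Under these assumptions the soft label always sums to one and places mass only on the two source classes, which is one simple way to encode the ambiguity of a sample lying between known classes.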
Primary Subject Area: [Content] Multimodal Fusion
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: This work is designed to reconcile and fuse information from disparate sources, a common requirement in multimedia applications.
Through dedicated modules for sparse inference and confidence assessment, the framework prioritizes relevant information across modalities and reduces the impact of redundant and inconsistent data. In addition, the soft mixed augmentation module broadens the model's training space, enabling it to adapt to open environments.
Submission Number: 3576