Multimodal Dataset Upgrading: a New Challenge for Data Annotation

Published: 04 Mar 2024, Last Modified: 02 May 2024, DPFM 2024 Poster, CC BY 4.0
Keywords: multimodal, data upgrading, data annotation
TL;DR: In this paper, we propose a novel task of multimodal dataset upgrading to enhance the quality of multimodal annotations.
Abstract: In recent years, many large-scale datasets have become available, yet their annotations are coarse and noisy. In this paper, we propose a novel task of multimodal dataset upgrading to enhance the quality of multimodal annotations. Unlike traditional annotation efforts, which focus on creating labels from scratch, multimodal dataset upgrading seeks to refine existing annotations by increasing annotation granularity, reducing errors, and improving multimodal alignment. We propose a framework for multimodal data upgrading that consists of two steps: generating candidates for upgrading, and cross-modality matching to select the upgraded data. We further provide a case study on open-vocabulary segmentation datasets, where improving class name quality yields significant performance gains for state-of-the-art open-vocabulary segmentation models. As an initial exploration, we hope this paper showcases the benefits of data upgrading and opens up new avenues for research on data problems for foundation models.
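The two-step framework described in the abstract (candidate generation, then cross-modality matching) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the toy candidate table, and the hand-written three-dimensional vectors are all hypothetical, and cosine similarity over a shared embedding space is assumed here as one plausible matching score. In practice the vectors would come from a vision-language encoder.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def upgrade_label(
    image_vec: Vector,
    original: str,
    generate_candidates: Callable[[str], list[str]],
    embed_text: Callable[[str], Vector],
) -> str:
    """Return the candidate name that best matches the image embedding.

    Step 1: build a candidate pool from the original (coarse) label.
    Step 2: score each candidate against the image and keep the best one.
    The original label stays in the pool, so it is kept when no candidate
    matches the image better.
    """
    pool = [original] + [c for c in generate_candidates(original) if c != original]
    return max(pool, key=lambda name: cosine(image_vec, embed_text(name)))


# Hypothetical toy data standing in for a candidate generator and a text encoder.
TOY_CANDIDATES = {"vehicle": ["car", "bus", "bicycle"]}
TOY_TEXT_VECS = {
    "vehicle": (0.6, 0.6, 0.5),
    "car": (1.0, 0.1, 0.0),
    "bus": (0.1, 1.0, 0.0),
    "bicycle": (0.0, 0.1, 1.0),
}


def toy_generate(label: str) -> list[str]:
    return TOY_CANDIDATES.get(label, [])


def toy_embed(name: str) -> Vector:
    return TOY_TEXT_VECS.get(name, (0.0, 0.0, 0.0))


if __name__ == "__main__":
    image_vec = (0.9, 0.2, 0.1)  # hypothetical embedding of a car image
    print(upgrade_label(image_vec, "vehicle", toy_generate, toy_embed))
```

Keeping the original label in the candidate pool is a design choice of this sketch: a label is replaced only when some candidate scores strictly higher against the image.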
Submission Number: 16