Keywords: multi-view object classification, learning from noisy labels, dataset and benchmark
TL;DR: This paper presents a dataset and benchmark for multi-view object classification.
Abstract: Combining information from multiple views is essential for discriminating similar objects. However, existing datasets for multi-view object classification have several limitations, such as synthetic and coarse-grained objects, no validation split for hyperparameter tuning, and a lack of view-level information quantity annotations for analyzing multi-view-based methods. To address these limitations, this study proposes a new dataset, MVP-N, which contains 44 retail products, 16k real captured views with human-perceived information quantity annotations, and 9k multi-view sets. The fine-grained categorization of objects naturally generates multi-view label noise owing to inter-class view similarity, allowing the study of learning from noisy labels in the multi-view case. Moreover, this study benchmarks four multi-view-based feature aggregation methods and twelve soft label methods on MVP-N. Experimental results show that MVP-N will be a valuable resource for facilitating the development of real-world multi-view object classification methods. The dataset and code are publicly available at https://github.com/SMNUResearch/MVP-N.
Supplementary Material: zip
Contribution Process Agreement: Yes
In Person Attendance: Yes
Dataset Url: https://github.com/SMNUResearch/MVP-N
License: CC BY-NC-ND License for the dataset; MIT License for the code
Author Statement: Yes