Abstract: 3D image classification plays a crucial role in fields such as computer vision, archaeology, and medical imaging, where accurate object recognition is essential. However, 3D classifiers alone may fail to extract sufficiently discriminative features, which limits their accuracy. To address this, we propose a hybrid framework that integrates 3D and 2D classification techniques. Our approach extracts multi-view 2D projections from each 3D object and uses them alongside 3D structural features. By combining the outputs of both modalities, the framework produces more robust and accurate predictions. We evaluate the framework with three state-of-the-art 3D classifiers (PointNet, PointNet++, and Mamba3D) and three 2D classifiers (ConvNeXt, EfficientNet, and ResNet), and we compare several strategies for combining the classifiers' predictions. Experimental results show that the hybrid approach consistently outperforms 3D-only classification. With PointNet++, accuracy improved from 88.93% to 94.38% on the ModelNet10 dataset and from 87.56% to 92.67% on ModelNet40. These findings highlight the effectiveness of integrating multi-view and 3D classification for improved object recognition.
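The abstract does not say which combination rules were tested, so the following is only a minimal sketch of one plausible late-fusion scheme: average the softmax probabilities of the 2D classifier over all views, then blend them with the 3D classifier's probabilities. The function names, the `weight_3d` parameter, and the example logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(logits_3d, view_logits_2d, weight_3d=0.5):
    """Assumed fusion rule: weighted average of 3D and mean multi-view 2D probabilities.

    logits_3d      -- class scores from a 3D classifier (e.g. on a point cloud)
    view_logits_2d -- one list of class scores per rendered 2D view
    weight_3d      -- hypothetical blending weight in [0, 1]
    Returns (predicted_class_index, fused_probabilities).
    """
    p3d = softmax(logits_3d)
    view_probs = [softmax(v) for v in view_logits_2d]
    n_views = len(view_probs)
    # Average the per-view probabilities class by class.
    p2d = [sum(p[c] for p in view_probs) / n_views for c in range(len(p3d))]
    fused = [weight_3d * a + (1.0 - weight_3d) * b for a, b in zip(p3d, p2d)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Made-up scores for three classes: the 3D model narrowly prefers class 0,
# while all three rendered views prefer class 1.
logits_3d = [2.0, 1.8, 0.1]
view_logits_2d = [[0.5, 3.0, 0.2], [0.4, 2.5, 0.1], [1.0, 2.0, 0.3]]

label, fused = fuse_predictions(logits_3d, view_logits_2d)
label_3d_only, _ = fuse_predictions(logits_3d, view_logits_2d, weight_3d=1.0)
```

In this toy case the fused prediction follows the agreement among the views, whereas setting `weight_3d=1.0` reproduces the 3D-only decision. Other combination rules (for example majority voting or a learned fusion layer) would fit the same interface.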
External IDs: dblp:conf/icdsp/FukudaGLAOM25