Active Object Recognition with Trained Multi-view Based 3D Object Recognition Network

Yunyi Guan; Asako Kanezaki

Published: 16 Apr 2024, Last Modified: 02 May 2024 · MoMa WS 2024 Poster · CC BY 4.0

Keywords: active vision, reinforcement learning, next-best-view planning

TL;DR: The paper presents a novel framework for active object recognition that uses a pre-trained 3D object recognition network and reinforcement learning to efficiently improve object recognition accuracy, outperforming several existing approaches.

Abstract: We tackle the active object recognition (AOR) problem, in which an agent with only partial observations of an object learns exploration actions that actively acquire new images to improve recognition. Previous studies have typically used reinforcement learning to jointly train a single-input classifier and a policy network that selects new observations. However, this joint training is laborious and does not reach the accuracy of existing object classifiers. It has also been reported that, when a highly accurate classifier such as ResNet is used, the active vision capability of such jointly trained models disappears. To overcome these problems, we propose a framework that uses a highly accurate, pre-trained multi-view-based 3D object recognition network to train AOR agents by reinforcement learning, aiming at higher recognition accuracy with a lighter training process. In evaluation experiments on several benchmark datasets, our approach outperforms several previous methods.
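
The separation described in the abstract (a fixed, pre-trained multi-view classifier, with only the view-selection policy left to be learned) can be illustrated with a toy interaction loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the mean-pooling classifier stub, the reward (gain in true-class probability), the discrete viewpoint set, and all function names are hypothetical and do not come from the paper.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def frozen_multiview_classifier(view_logits_seen):
    """Hypothetical stand-in for a pre-trained multi-view network.

    Averages the per-view logits of all views seen so far, then applies
    softmax. It has no trainable state, mirroring a frozen classifier.
    """
    n_classes = len(view_logits_seen[0])
    n_views = len(view_logits_seen)
    pooled = [sum(v[c] for v in view_logits_seen) / n_views
              for c in range(n_classes)]
    return softmax(pooled)


def make_random_policy(seed):
    """Placeholder policy: pick an unvisited viewpoint uniformly at random.

    An RL agent would replace this with a learned mapping from the
    observation history to the next viewpoint.
    """
    rng = random.Random(seed)

    def policy(visited, probs, n_views):
        candidates = [v for v in range(n_views) if v not in visited]
        return rng.choice(candidates) if candidates else visited[-1]

    return policy


def run_episode(policy, view_logits, true_label, start_view, max_steps):
    """Roll out one episode; only the policy decides which view comes next."""
    visited = [start_view]
    probs = frozen_multiview_classifier([view_logits[v] for v in visited])
    rewards = []
    for _ in range(max_steps):
        action = policy(visited, probs, len(view_logits))
        visited.append(action)
        new_probs = frozen_multiview_classifier(
            [view_logits[v] for v in visited])
        # Assumed reward: improvement in the true-class probability.
        rewards.append(new_probs[true_label] - probs[true_label])
        probs = new_probs
    return visited, probs, rewards


# Toy object: 4 viewpoints, 2 classes; views 1 and 3 are informative.
toy_view_logits = [[0.0, 0.0], [2.0, 0.0], [0.0, 0.0], [3.0, 0.0]]
visited, probs, rewards = run_episode(
    make_random_policy(seed=0), toy_view_logits,
    true_label=0, start_view=0, max_steps=2)
```

In this sketch the per-step rewards telescope to the total gain in true-class probability over the episode, so a policy that maximizes return is one that reaches informative viewpoints. Because the classifier stub has no parameters to update, any learning signal can only change the policy, which is the division of labor the abstract describes.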

Submission Number: 3