Effective Fisher vector aggregation for 3D object retrieval

Jean-Baptiste Boin, Andre F. de Araújo, Lamberto Ballan, Bernd Girod

ICASSP 2017 (modified: 26 Jan 2022)

Abstract: We formulate the task of 3D object retrieval as a visual search problem in which a database of videos of objects, captured manually from different viewpoints, is queried using a single image. We propose to aggregate the visual information of similar views and use the Fisher vector (FV) framework to compactly represent a database of objects. Large-scale experiments on an existing video dataset, which we complemented with image queries, show that our aggregation schemes significantly outperform standard retrieval techniques. When representing our database with only 4 FVs per object, our approach achieves a mean average precision (mAP) of 73.0% on our dataset, while the baseline (no aggregation) reaches a mAP of only 43.8%. It can also reach a mAP of 72.0% with a database 10× smaller than the baseline.
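
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how mean average precision is commonly computed for a retrieval system. It is not the authors' evaluation code: the function names, the dictionary-based inputs, and the toy identifiers are all assumptions made for illustration, and the exact protocol used in the paper may differ.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query.

    ranked_ids: database items ordered from most to least similar.
    relevant_ids: items considered correct for this query.
    (Both inputs are hypothetical; the paper does not specify this interface.)
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            # Precision measured at each rank where a relevant item appears.
            precision_sum += hits / rank
    # Relevant items never retrieved contribute zero precision.
    return precision_sum / len(relevant)


def mean_average_precision(rankings, ground_truth):
    """Mean of the per-query average precisions.

    rankings: dict mapping query id -> ranked list of database ids.
    ground_truth: dict mapping query id -> iterable of relevant ids.
    """
    aps = [average_precision(rankings[q], ground_truth[q]) for q in rankings]
    return sum(aps) / len(aps) if aps else 0.0


# Toy example with made-up identifiers.
rankings = {"q1": ["a", "b", "c"], "q2": ["x", "y"]}
ground_truth = {"q1": ["a", "c"], "q2": ["y"]}
map_score = mean_average_precision(rankings, ground_truth)
```

In this toy example, query q1 finds its relevant items at ranks 1 and 3 and query q2 finds its single relevant item at rank 2, so the per-query scores are averaged into one number between 0 and 1; the percentages in the abstract are this quantity multiplied by 100.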
