A Multi-Scale Learning Framework for Visual Categorization

Published: 01 Jan 2010, Last Modified: 03 Oct 2025 | ACCV (1) 2010 | CC BY-SA 4.0
Abstract: Spatial pyramid matching has recently become a promising technique for image classification. Despite its success and popularity, no prior work has tackled the problem of learning the optimal spatial pyramid representation for the given image data and the associated object category. We propose a Multiple Scale Learning (MSL) framework to learn the best weights for each scale in the pyramid. Our MSL algorithm produces class-specific spatial pyramid image representations and thus provides improved recognition performance. We formulate the MSL problem as a multiple kernel learning (MKL) task, which determines the optimal combination of base kernels constructed at different pyramid levels. We conduct a wide range of experiments on the Oxford flower and Caltech-101 datasets, including the use of state-of-the-art feature encoding and pooling strategies. The empirical results on both datasets validate the feasibility of our proposed method.
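The idea of combining base kernels built at different pyramid levels can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names (`pyramid_histograms`, `intersection_kernel`, `combined_kernel`), the use of hard-assigned visual-word histograms, and the histogram intersection kernel as the base kernel. The per-level weights are supplied by hand, whereas in the MSL framework they would be learned per class by an MKL solver.

```python
import numpy as np

def pyramid_histograms(codes, xy, n_words, level):
    """Visual-word histogram for each cell of a 2^level x 2^level grid.

    codes: (n,) integer visual-word indices (hypothetical encoding step).
    xy:    (n, 2) feature coordinates normalized to [0, 1).
    Returns a flat vector of length (4 ** level) * n_words that sums to 1.
    """
    g = 2 ** level
    cell = np.minimum((xy * g).astype(int), g - 1)  # grid cell per feature
    cell_id = cell[:, 1] * g + cell[:, 0]
    hist = np.zeros((g * g, n_words))
    np.add.at(hist, (cell_id, codes), 1.0)          # accumulate counts
    return hist.ravel() / max(len(codes), 1)

def intersection_kernel(A, B):
    """Histogram intersection kernel between the rows of A and B."""
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)

def combined_kernel(level_kernels, weights):
    """Convex combination of per-level base kernels.

    The weights are fixed inputs here; learning them per class is the
    part an MKL solver would handle and is not implemented in this sketch.
    """
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w / w.sum()
    return sum(wl * K for wl, K in zip(w, level_kernels))

# Toy usage: 4 synthetic "images", 5 visual words, pyramid levels 0..2.
rng = np.random.default_rng(0)
images = [(rng.integers(0, 5, size=30), rng.random((30, 2))) for _ in range(4)]
kernels = []
for level in range(3):
    feats = np.stack([pyramid_histograms(c, p, 5, level) for c, p in images])
    kernels.append(intersection_kernel(feats, feats))
K = combined_kernel(kernels, [0.2, 0.3, 0.5])  # hand-picked weights
```

The resulting matrix `K` could be passed to any kernel classifier that accepts a precomputed Gram matrix. Different weight vectors per category would give the class-specific representations the abstract describes.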