A Deconvolutional Strategy for Implementing Large Patch Sizes Supports Improved Image Classification

Xinhua Zhang, Garrett T. Kenyon

2015 (modified: 04 Oct 2024) | BICT 2015 | Readers: Everyone

Abstract: Sparse coding is a widely used technique for learning an overcomplete basis set from unlabeled image data. We hypothesize that as the size of the image patch spanned by each basis vector increases, the resulting dictionary should encompass a broader range of spatial scales, including more features that better discriminate between object classes. Previous efforts to measure the effects of patch size on image classification performance were confounded by the difficulty of maintaining a given level of overcompleteness as the patch size is increased. Here, we employ a type of deconvolutional network in which overcompleteness is independent of patch size. Based on image classification results on the CIFAR10 database, we find that optimizing our deconvolutional network for sparse reconstruction leads to classification performance that improves with the number of training epochs. In contrast to previous reports, we find that enforcing a certain degree of sparsity improves classification performance. We also find that classification performance improves as both the number of learned features (dictionary size) and the size of the image patch spanned by each feature (patch size) are increased, ultimately yielding the best published results for sparse autoencoders.
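
To illustrate why patch size and overcompleteness are coupled in patch-based sparse coding but not in a convolutional (deconvolutional) formulation, the following is a minimal sketch, not the authors' implementation. It assumes the common definition of overcompleteness as the number of coefficients per input dimension, and it assumes the features are tiled across the image at a fixed stride; the function names, the stride parameter, and the example numbers are all hypothetical and do not come from the paper.

```python
def patch_overcompleteness(num_features, patch_size, channels=3):
    """Patch-based coding (assumed definition): each patch is encoded on its
    own, so the input dimension is patch_size * patch_size * channels."""
    return num_features / (patch_size * patch_size * channels)


def conv_overcompleteness(num_features, stride, channels=3):
    """Convolutional coding (assumed definition): every feature is replicated
    once per stride x stride block of pixels, so the ratio of coefficients to
    input values depends on the stride, not on the patch size."""
    return num_features / (stride * stride * channels)


if __name__ == "__main__":
    num_features = 192  # hypothetical dictionary size
    for patch_size in (8, 16, 32):
        print(
            f"patch {patch_size:2d}: "
            f"patch-based = {patch_overcompleteness(num_features, patch_size):.3f}, "
            f"convolutional (stride 2) = {conv_overcompleteness(num_features, 2):.3f}"
        )
```

Under these assumptions, the patch-based ratio falls with the square of the patch size for a fixed dictionary, whereas the convolutional ratio stays constant, which is the property that lets patch size be varied without changing overcompleteness.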
