Abstract: Sparse representations of images are useful in many computer vision applications. Sparse coding with an $l_1$ penalty and a learned linear dictionary requires regularization of the dictionary to prevent a collapse in the $l_1$ norms of the codes. Typically, this regularization entails bounding the Euclidean norms of the dictionary's elements. In this work, we propose a novel sparse coding protocol which prevents a collapse in the codes without the need to regularize the decoder. Our method regularizes the codes directly so that each latent code component has variance greater than a fixed threshold over a set of sparse representations for a given set of inputs. Furthermore, we explore ways to effectively train sparse coding systems with multi-layer decoders, since they can model more complex relationships than linear dictionaries. In our experiments with MNIST and natural image patches, we show that decoders learned with our approach have interpretable features in both the linear and multi-layer cases. Moreover, we show that sparse autoencoders with multi-layer decoders trained using our variance regularization method produce higher-quality reconstructions with sparser representations than autoencoders with linear dictionaries. Additionally, sparse representations obtained with our variance regularization approach are useful in the downstream tasks of denoising and classification in the low-data regime.
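The variance regularization described in the abstract can be illustrated with a minimal sketch. The abstract only states that each latent code component should have variance above a fixed threshold over a set of codes; the squared-hinge form on the standard deviation, the unbiased variance estimate, the default threshold value, and the function name `variance_hinge_penalty` are all assumptions made here for illustration, not the formulation confirmed by this page.

```python
import math


def variance_hinge_penalty(codes, threshold=0.5):
    """Hypothetical penalty that is zero only when every code component
    varies enough across the batch.

    codes: list of code vectors (one per input), all of the same length.
    threshold: assumed lower bound on each component's standard deviation.
    """
    n = len(codes)
    if n < 2:
        raise ValueError("need at least two codes to estimate a variance")
    dim = len(codes[0])

    penalty = 0.0
    for j in range(dim):
        # Values of latent component j across the batch of codes.
        column = [z[j] for z in codes]
        mean = sum(column) / n
        # Unbiased sample variance (an assumption; a biased estimate also works).
        var = sum((v - mean) ** 2 for v in column) / (n - 1)
        std = math.sqrt(var)
        # Squared hinge: penalize only components whose spread is below the threshold.
        penalty += max(0.0, threshold - std) ** 2
    return penalty
```

In this sketch, a component that is identical across all inputs (a collapsed component) contributes `threshold ** 2`, while a component whose standard deviation already exceeds the threshold contributes nothing, so the term discourages collapse without constraining the decoder.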
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
Revision June 21: Added experiments in the Supplementary Materials section.
Revision Aug 19:
- added more related literature as requested by the reviewers
- moved experiments previously in the Supplementary Materials section to the main text
- updated the model visualization (Figure 1)
- re-organized the experiments section for better readability
- added visualizations of LISTA encoders, WDL, WDL-NL, DO, and DO-NL dictionaries in the Appendix
- general improvements to the text, especially to clarify points raised by the reviewers (e.g. clarify the goal of the classification experiments, the description of the non-linear, etc.)
Camera ready 8/31: Camera ready version uploaded.
Assigned Action Editor: ~Vincent_Dumoulin1