Abstract: Underpinning the success of deep learning is the effective regularization that allows a broad range of structures in data to be compactly modeled in a deep architecture. Examples include transformation invariances, robustness to adversarial/random perturbations, and correlations between multiple modalities. However, most existing methods incorporate such priors either by auto-encoders, whose result is used to initialize supervised learning, or by augmenting the data with exemplifications of the transformations which, despite the improved performance of supervised learning, leaves it unclear whether the learned latent representation does encode the desired regularities. To address these issues, this work proposes an \emph{end-to-end} representation learning framework that allows prior structures to be encoded \emph{explicitly} in the hidden layers, and to be trained efficiently in conjunction with the supervised target. Our approach is based on proximal mapping in a reproducing kernel Hilbert space, and leverages differentiable optimization. The resulting technique is applied to generalize dropout and invariant kernel warping, and to develop novel algorithms for multiview modeling and robust temporal learning.
Code: https://github.com/learndeep2019/ProxNet.git
Keywords: representation learning, multiview learning
Original Pdf: pdf
6 Replies
Loading