Neural Processes with Stochastic Attention: Paying more attention to the context dataset


Sep 29, 2021 (edited Nov 22, 2021) · ICLR 2022 Conference Blind Submission
  • Keywords: neural processes, stochastic attention, variational inference, information theory
  • Abstract: Neural processes (NPs) aim to stochastically complete unseen data points based on a given context dataset. NPs essentially leverage a given dataset as a context embedding to derive an identifier suitable for a novel task. To improve prediction accuracy, many NP variants have investigated context-embedding approaches, typically designing novel network architectures and aggregation functions that satisfy permutation invariance. This paper proposes a stochastic attention mechanism for NPs to capture appropriate context information. From an information-theoretic perspective, we demonstrate that the proposed method encourages the context embedding to be differentiated from the target dataset. This differentiated information induces NPs to learn to derive appropriate identifiers by jointly considering context embeddings and features of the target dataset. We empirically show that our approach substantially outperforms various conventional NPs on 1D regression and the Lotka-Volterra problem, as well as image completion. Moreover, we observe that the proposed method maintains performance and captures context embeddings under restricted task distributions, where typical NPs suffer from a lack of effective tasks for learning context embeddings. The proposed method achieves results comparable to state-of-the-art methods on the MovieLens-10k dataset, a real-world problem with limited users, and performs well on the image completion task even with a very limited meta-training dataset.
  • One-sentence Summary: This paper extends the attentive neural process (ANP), replacing the deterministic weights in the cross-attention module of ANP with latent weights.
  • Supplementary Material: zip
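The one-sentence summary describes replacing the deterministic weights in ANP's cross-attention with latent (stochastic) weights. The paper's exact parameterization is not given here, so the following is only a minimal NumPy sketch of the general idea: Gaussian noise is injected into the attention logits via the reparameterization trick, so each forward pass samples a different weighting over the context points. The function name, shapes, and `noise_std` parameter are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def stochastic_cross_attention(q, k, v, noise_std=0.1):
    """Cross-attention with stochastic weights (illustrative sketch):
    noise is added to the scaled dot-product logits before the softmax,
    so the attention weights over the context are sampled rather than
    deterministic, as in standard ANP cross-attention."""
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)           # (n_target, n_context)
    eps = rng.normal(0.0, 1.0, logits.shape)
    logits = logits + noise_std * eps       # reparameterized sample
    weights = softmax(logits, axis=-1)      # rows sum to 1
    return weights @ v                      # (n_target, d_v)

# toy example: 3 target queries attending over 5 context points, dim 4
q = rng.normal(size=(3, 4))
k = rng.normal(size=(5, 4))
v = rng.normal(size=(5, 4))
out = stochastic_cross_attention(q, k, v)
```

Averaging the output over many such samples (or regularizing the noise distribution, as the variational-inference keyword suggests) is what distinguishes this from plain deterministic attention.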