Keywords: multi-output Gaussian processes, exact inference, directed acyclic graphs, conditional independence, structure learning, negative transfer
Abstract: Multi-output Gaussian processes (MOGPs) introduce correlations between outputs, but are subject to negative transfer, where learned correlations associate an output with another that is actually unrelated, leading to diminished predictive accuracy. Negative transfer may be countered by structuring the MOGP to follow conditional independence statements so that independent outputs are not correlated, but to date the imposed structures have been hand-selected for specific applications. We introduce the DAG-GP model, which linearly combines latent Gaussian processes so that MOGP outputs follow a directed acyclic graph structure. Our method exposes a deep connection between MOGPs and Gaussian directed graphical models, a connection that has not been explored explicitly in prior work. We propose to learn the graph from data prior to training the MOGP, so that correlations between outputs are introduced only when justified. Automated structure learning means that no prior knowledge of the conditional independence between outputs is required, and that training multiple MOGPs to identify the best structure is no longer necessary. The graph structure is learned by applying existing structure learning algorithms, developed for graphical models, to a downselected set of MOGP training data. Experiments on real-world data sets show that, with sufficiently expressive kernels, prediction error and likelihood are improved when using the DAG-GP model compared to state-of-the-art exact MOGP methods.
One-sentence Summary: Multi-output Gaussian processes are constructed to follow a directed acyclic graph structure that is learned from training data.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Reviewed Version (pdf): https://openreview.net/references/pdf?id=mPNMJBhVM0