Shared Stochastic Gaussian Process Latent Variable Models: A Multi-modal Generative model for Quasar spectra
Abstract: This work proposes a scalable probabilistic latent variable model based on Gaussian processes in the context of multiple observation spaces. We focus on an application in astrophysics, where data sets typically contain both observed spectral features and scientific properties of astrophysical objects such as galaxies or exoplanets. In our application, we study the spectra of very luminous galaxies known as quasars together with their properties, such as the mass of their central supermassive black hole, their accretion rate and their luminosity, so there are multiple observation spaces. A single data point is then characterised by different classes of observations, which may have different likelihoods. Our proposed model extends the baseline stochastic variational Gaussian process latent variable model (GPLVM) to this setting, yielding a generative model in which the quasar spectra and the scientific labels are generated simultaneously from a shared latent space that acts as input to different sets of Gaussian process decoders, one for each observation space. Further, this framework allows training in the missing data setting, where a large number of dimensions per data point may be unknown or unobserved. We demonstrate high-fidelity reconstructions of the spectra and the scientific labels during test-time inference and briefly discuss the scientific interpretations of the results along with the significance of such a generative model.
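The generative structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that): it only samples from GP priors and performs no stochastic variational training. The function names, the RBF kernel choice, the array sizes, the noise level and the missingness rate are all hypothetical values chosen for illustration. One shared latent matrix `Z` feeds two separate GP decoders, one for spectra and one for scientific labels, and the likelihood is evaluated only on observed entries.

```python
import numpy as np

def rbf_kernel(Z, lengthscale, variance=1.0):
    """Squared-exponential kernel matrix over latent points Z (N x Q)."""
    sq_dists = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale ** 2)

def sample_gp_decoder(Z, n_outputs, lengthscale, rng, jitter=1e-5):
    """Draw one GP prior function per output dimension, evaluated at Z."""
    K = rbf_kernel(Z, lengthscale) + jitter * np.eye(len(Z))
    L = np.linalg.cholesky(K)
    return L @ rng.standard_normal((len(Z), n_outputs))  # shape (N, n_outputs)

def masked_gaussian_loglik(Y, F, noise_std, observed):
    """Gaussian log-likelihood summed over observed entries only."""
    resid = np.where(observed, Y - F, 0.0)  # missing entries contribute nothing
    per_entry = (-0.5 * (resid / noise_std) ** 2
                 - np.log(noise_std) - 0.5 * np.log(2.0 * np.pi))
    return float(np.sum(per_entry[observed]))

# Hypothetical sizes: 20 objects, 2 latent dims, 30 spectral pixels, 3 labels.
rng = np.random.default_rng(0)
N, Q, D_SPEC, D_LAB = 20, 2, 30, 3
NOISE_STD = 0.1

Z = rng.standard_normal((N, Q))                 # shared latent space
F_spec = sample_gp_decoder(Z, D_SPEC, 1.0, rng)  # decoder for spectra
F_lab = sample_gp_decoder(Z, D_LAB, 2.0, rng)    # decoder for labels

# Noisy observations; roughly 30% of spectral pixels are marked missing (assumed rate).
Y_spec = F_spec + NOISE_STD * rng.standard_normal(F_spec.shape)
Y_lab = F_lab + NOISE_STD * rng.standard_normal(F_lab.shape)
obs_spec = rng.random(Y_spec.shape) > 0.3
obs_lab = np.ones(Y_lab.shape, dtype=bool)
Y_spec[~obs_spec] = np.nan

# The joint log-likelihood is the sum over observation spaces.
loglik = (masked_gaussian_loglik(Y_spec, F_spec, NOISE_STD, obs_spec)
          + masked_gaussian_loglik(Y_lab, F_lab, NOISE_STD, obs_lab))
```

Because both decoders take the same `Z` as input, a latent point inferred from a partially observed spectrum can be pushed through the label decoder as well, which is the mechanism the abstract relies on for jointly reconstructing spectra and labels.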
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Final revision and additional scripts uploaded.
Code: https://github.com/vr308/Quasar-GPLVM
Supplementary Material: zip
Assigned Action Editor: ~Manuel_Haussmann1
Submission Number: 3386