Self-supervised Multimodal Model for Astronomy

Published: 11 Oct 2024, Last Modified: 12 Nov 2024, NeurIPS 2024 Workshop FM4Science Poster, CC BY 4.0
Keywords: Multimodal, Self-supervised, Astronomy
Abstract: While machine-learned models are now routinely employed to facilitate astronomical inquiry, model inputs tend to be limited to a primary data source (namely images or time series) and, in the more advanced approaches, some metadata. Yet with the growing use of wide-field, multiplexed observational resources, individual sources of interest often have a broad range of observational modes available. Here we construct an astronomical multimodal dataset and propose a self-supervised pre-training approach that enables a model to learn from multiple modalities simultaneously. Specifically, we extend the CLIP (Contrastive Language-Image Pretraining) model to a trimodal setting, allowing the integration of time-series photometry data, spectra, and astrophysical metadata. In a fine-tuning supervised setting, our results demonstrate that CLIP pre-training improves classification performance for time-series photometry, where accuracy increases from 84.6% to 91.5%. Furthermore, CLIP boosts classification accuracy by up to 12.6% when the availability of labeled data is limited, showing the effectiveness of leveraging larger corpora of unlabeled data. To our knowledge, this is the first construction of an n > 2 mode model in astronomy. Extensions to n > 3 modes are naturally anticipated with this approach.
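
A minimal sketch of what a trimodal CLIP-style objective might look like is given below, assuming each modality (photometry, spectra, metadata) has its own encoder and that the pairwise symmetric InfoNCE losses are simply summed; the encoder outputs, the pairwise-sum combination, and the temperature value are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def clip_pair_loss(z_a, z_b, temperature=0.07):
    """Symmetric InfoNCE (CLIP-style) loss between two batches of embeddings."""
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature  # (B, B) cosine-similarity matrix
    targets = torch.arange(z_a.size(0), device=z_a.device)
    # Matched pairs lie on the diagonal; all other batch items act as negatives.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def trimodal_clip_loss(z_phot, z_spec, z_meta, temperature=0.07):
    """Sum of the three pairwise CLIP losses over photometry, spectra, and metadata."""
    return (clip_pair_loss(z_phot, z_spec, temperature) +
            clip_pair_loss(z_phot, z_meta, temperature) +
            clip_pair_loss(z_spec, z_meta, temperature))

# Toy usage with random tensors standing in for the three encoder outputs.
B, D = 32, 128
z_phot = torch.randn(B, D)  # time-series photometry embedding
z_spec = torch.randn(B, D)  # spectrum embedding
z_meta = torch.randn(B, D)  # astrophysical metadata embedding
loss = trimodal_clip_loss(z_phot, z_spec, z_meta)
```

Extending such a pairwise formulation to n > 3 modes would add one term per modality pair, which is one plausible reading of how the approach generalizes.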
Submission Number: 86