Towards Mitigating Systematics in Large-Scale Surveys via Few-Shot Optimal Transport-Based Feature Alignment
Track: Extended Abstract Track
Keywords: Representation Learning, Similarity and Distance Learning, Optimization for Deep Networks
TL;DR: Few-Shot Optimal Transport for Systematics Mitigation in Surveys
Abstract: Systematics contaminate observables, leading to distribution shifts relative to
theoretically simulated signals—posing a major challenge for using pre-trained
models to label such observables. Since systematics are often poorly understood
and difficult to model, removing them directly and entirely may not be feasible.
To address this challenge, we propose a novel method that aligns learned features
between in-distribution (ID) and out-of-distribution (OOD) samples by optimizing
a feature-alignment loss on the representations extracted from a pre-trained ID
model. We first validate the method experimentally on the MNIST dataset using
several candidate alignment losses, including mean squared error and optimal transport, and
subsequently apply it to large-scale maps of neutral hydrogen. Our results show
that optimal transport is particularly effective at aligning OOD features when parity
between ID and OOD samples is unknown, even with limited data—mimicking
real-world conditions in extracting information from large-scale surveys. Our code
is available at https://github.com/sultan-hassan/feature-alignment-for-OOD-generalization.
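
As a rough illustration of the two alignment losses named in the abstract, the sketch below compares mean squared error with an entropy-regularized optimal transport (Sinkhorn) cost on two small batches of feature vectors. It is not taken from the linked repository: the function names `mse_alignment` and `sinkhorn_alignment`, the squared-Euclidean ground cost, uniform sample weights, and the values of `eps` and `n_iters` are all assumptions made for the example. It also only evaluates the losses and does not optimize them. It reads "parity is unknown" as meaning that the one-to-one correspondence between ID and OOD samples is unknown, which is an interpretation rather than something the abstract states.

```python
import numpy as np


def mse_alignment(feat_id, feat_ood):
    """MSE between row-paired features; assumes row i of both batches corresponds."""
    return float(np.mean((feat_id - feat_ood) ** 2))


def _logsumexp(a, axis):
    """Numerically stable log-sum-exp along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def sinkhorn_alignment(feat_id, feat_ood, eps=0.1, n_iters=200):
    """Entropic OT cost between two feature batches (hypothetical helper).

    Uses uniform weights and a squared-Euclidean ground cost, both assumed
    here for illustration. No pairing between rows is required.
    """
    n, m = len(feat_id), len(feat_ood)
    # Pairwise squared Euclidean distances, shape (n, m).
    cost = ((feat_id[:, None, :] - feat_ood[None, :, :]) ** 2).sum(-1)
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    f = np.zeros(n)  # dual potential for the ID batch
    g = np.zeros(m)  # dual potential for the OOD batch
    for _ in range(n_iters):
        # Log-domain Sinkhorn updates enforcing the row, then column marginals.
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    log_plan = (f[:, None] + g[None, :] - cost) / eps + log_a[:, None] + log_b[None, :]
    plan = np.exp(log_plan)
    # Transport cost under the (approximately) optimal entropic plan.
    return float(np.sum(plan * cost))


# Toy features: the OOD batch holds the same vectors in reversed order,
# so the row-wise pairing is wrong even though the two sets coincide.
feat_id = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0], [3.0, 3.0]])
feat_ood = feat_id[::-1].copy()

mse_value = mse_alignment(feat_id, feat_ood)
ot_value = sinkhorn_alignment(feat_id, feat_ood)
print("MSE:", mse_value, "OT:", ot_value)
```

In this toy case the MSE penalizes the shuffled ordering even though both batches contain the same feature vectors, while the transport cost stays close to zero because it matches samples on its own. That is one plausible reading of why an optimal transport loss could help when the ID/OOD pairing is not known.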
Submission Number: 18