Keywords: Set Representation Learning, Auxiliary Learning, Adversarial Encoding Perturbation
Abstract: Sets are a fundamental data structure, and learning their vectorized representations is crucial for many computational problems. Existing methods typically focus on intra-set properties such as permutation invariance and cardinality independence. While effective at preserving basic intra-set semantics, these approaches may fall short of explicitly modeling inter-set correlations, which are critical for tasks requiring fine-grained comparisons between sets. In this work, we propose SRAL, a Set Representation Auxiliary Learning framework that captures inter-set correlations and is compatible with various downstream tasks. SRAL conceptualizes sets as high-dimensional distributions and leverages the 2-Sliced-Wasserstein distance to incorporate their distributional discrepancies into the set representation encoding. More importantly, we introduce a novel adversarial auxiliary learning scheme. Instead of manipulating the input data, our method perturbs the set encoding process itself and compels the model to be robust against worst-case perturbations through a min-max optimization. Our theoretical analysis shows that this objective, in expectation, directly optimizes the set-wise Wasserstein distances, forcing the model to learn highly discriminative representations. Comprehensive evaluations across four downstream tasks compare SRAL with baseline methods and show consistent effectiveness in both inter-set relation-sensitive retrieval and intra-set information-oriented processing tasks.
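
To make the distance named in the abstract concrete, the following is a minimal sketch of a Monte Carlo estimate of the 2-Sliced-Wasserstein distance between two point sets. It is an illustration only, not the authors' implementation: the function name `sliced_wasserstein_2`, its parameters, and the equal-cardinality restriction are assumptions made here, and neither SRAL's encoder nor its adversarial min-max scheme is reproduced.

```python
import math
import random


def sliced_wasserstein_2(set_a, set_b, num_projections=200, seed=0):
    """Monte Carlo estimate of the 2-Sliced-Wasserstein distance.

    Hypothetical helper for illustration. Each set is a list of
    equal-length coordinate lists; both sets are assumed to have the
    same number of elements so that the 1-D optimal transport plan is
    simply "match sorted projections in order".
    """
    if len(set_a) != len(set_b):
        raise ValueError("this sketch assumes sets of equal cardinality")
    dim = len(set_a[0])
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_projections):
        # Draw a direction uniformly on the unit sphere by normalizing
        # a standard Gaussian vector.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction))
        direction = [d / norm for d in direction]
        # Project every element onto the direction and sort; sorting
        # makes the result independent of element order in each set.
        proj_a = sorted(sum(x * d for x, d in zip(p, direction)) for p in set_a)
        proj_b = sorted(sum(x * d for x, d in zip(p, direction)) for p in set_b)
        # Squared 1-D 2-Wasserstein distance between the projections.
        total += sum((a - b) ** 2 for a, b in zip(proj_a, proj_b)) / len(proj_a)
    # Average over directions, then take the square root.
    return math.sqrt(total / num_projections)
```

Because each projection is sorted before matching, the estimate does not depend on the order in which set elements are listed, which mirrors the permutation invariance the abstract attributes to set representations.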
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 2217