On Training-Test (Mis)alignment in Unsupervised Combinatorial Optimization: Observation, Empirical Exploration, and Analysis
Keywords: Test-time derandomization, Unsupervised combinatorial optimization, Continuous relaxation, Rounding
TL;DR: We discuss the training-test (mis)alignment in unsupervised combinatorial optimization, including an issue we observe together with our empirical exploration and analysis.
Abstract: In *unsupervised combinatorial optimization* (UCO), during training, one aims to have continuous decisions that are promising in a *probabilistic* sense for each training instance, which enables end-to-end training on initially discrete and non-differentiable problems.
At test time, for each test instance, *derandomization* is typically applied to the continuous decisions to obtain the final deterministic ones. Researchers have developed increasingly powerful test-time derandomization schemes to enhance both the empirical performance and the theoretical guarantees of UCO methods. However, we notice a misalignment between training and testing in existing UCO methods. Consequently, lower training losses do not necessarily entail better post-derandomization performance, *even on the training instances themselves, without any data distribution shift*. Empirically, we indeed observe such undesirable cases. We explore a preliminary idea to better align training and testing in UCO by incorporating a differentiable version of derandomization into training. Our empirical exploration shows that this idea indeed improves training-test alignment, but it also introduces nontrivial challenges into training.
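As a concrete illustration of the test-time derandomization the abstract refers to (a generic textbook scheme, not necessarily the submission's specific one), below is a minimal Python sketch of the method of conditional expectations: the continuous decisions are treated as independent Bernoulli probabilities, and each coordinate is fixed in turn to whichever value yields the higher conditional expected objective. The objective `f` and the exact-enumeration expectation are illustrative assumptions, feasible only for small instances.

```python
from itertools import product

def expected_value(f, p, fixed):
    # Exact expectation of f(x) over independent x_i ~ Bernoulli(p[i]),
    # with the coordinates in `fixed` clamped to given 0/1 values.
    # Enumerates all assignments of the free coordinates (small n only).
    n = len(p)
    free = [i for i in range(n) if i not in fixed]
    total = 0.0
    for bits in product([0, 1], repeat=len(free)):
        x = [0] * n
        prob = 1.0
        for i, v in fixed.items():
            x[i] = v
        for i, b in zip(free, bits):
            x[i] = b
            prob *= p[i] if b else (1.0 - p[i])
        total += prob * f(x)
    return total

def derandomize(f, p):
    # Method of conditional expectations (maximization): fix coordinates
    # one by one, each time choosing the value with the higher
    # conditional expectation; the final deterministic objective is
    # guaranteed to be at least the initial expected objective.
    fixed = {}
    for i in range(len(p)):
        fixed[i] = max((0, 1), key=lambda v: expected_value(f, p, {**fixed, i: v}))
    return [fixed[i] for i in range(len(p))]
```

For example, `derandomize(lambda x: sum(x), [0.6, 0.3, 0.9])` fixes every coordinate to 1, and the resulting objective value (3) is at least the initial expectation (1.8), illustrating the guarantee. The misalignment the abstract describes arises because training optimizes the *expected* objective, whereas this sequential fixing procedure is what actually produces the test-time decision.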
Submission Number: 19