Disentangle, Gate, and Optimize: Cross-Domain Transfer Powered by Multi-Objective Bayesian Optimization

ICLR 2026 Conference Submission 15246 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Prompt Tuning, Cross-Domain Transfer, Multi-Objective Bayesian Optimization
Abstract: Prompt Tuning (PT) has recently shown remarkable success across diverse Natural Language Processing (NLP) tasks, providing an efficient knowledge-transfer paradigm for textually instructing models with domain-level guidance. However, existing PT approaches often struggle to accurately distinguish domain-invariant from domain-specific knowledge in input texts, inducing negative transfer that harms model performance across domains. To mitigate this, recent studies have introduced adversarial training to highlight domain-specific nuances and improve the model's adaptation ability, but they often rely on overly complex parameter optimization, which hinders smooth generalization. Motivated by this, we propose a novel prefix tuning framework, named Adaptive Robust Prefix Optimization (ARPO), in which adaptive representation disentanglement precisely decouples domain-specific information from invariant knowledge, while Multi-Objective Bayesian Optimization (MOBO) dynamically adjusts adversarial strategies for improved model robustness. Specifically, we first develop disentangled representation learning based on Information Bottleneck theory with dynamic orthogonality and conditional independence constraints, combined with adaptive adversarial training driven by dynamic thresholds. We then employ MOBO to search the high-dimensional strategy space efficiently. We theoretically prove that the proposed MOBO approach is feasible and guaranteed to converge under reasonable assumptions. Extensive evaluations on GLUE, SuperGLUE, MRQA 2019, GSM8K, and HumanEval show that ARPO achieves around a 6% improvement in two experimental settings, highlighting its robust cross-domain generalization.
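To make the two core ingredients named in the abstract concrete, the following is a minimal, illustrative PyTorch sketch of (1) splitting a prefix representation into domain-invariant and domain-specific components regularized by an orthogonality penalty, and (2) a gradient-reversal adversarial head that discourages the invariant component from encoding the source domain. All class names, function names, and hyperparameters (e.g. DisentangledPrefix, grl_lambda, the 0.1 loss weight) are assumptions made for illustration; this is not the authors' ARPO implementation, and it omits the Information Bottleneck term and the MOBO strategy search.

```python
# Sketch of disentangled prefix representations with an orthogonality penalty
# and a gradient-reversal domain-adversarial head. Names and weights are
# illustrative assumptions, not the paper's released code.

import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class DisentangledPrefix(nn.Module):
    """Maps a pooled input encoding to invariant / specific prefix vectors."""

    def __init__(self, hidden_dim: int, prefix_dim: int, num_domains: int):
        super().__init__()
        self.invariant_proj = nn.Linear(hidden_dim, prefix_dim)
        self.specific_proj = nn.Linear(hidden_dim, prefix_dim)
        # Domain classifier attached through gradient reversal (adversarial branch).
        self.domain_head = nn.Linear(prefix_dim, num_domains)

    def forward(self, pooled: torch.Tensor, grl_lambda: float = 1.0):
        z_inv = self.invariant_proj(pooled)    # domain-invariant component
        z_spec = self.specific_proj(pooled)    # domain-specific component
        # Domain prediction from the *invariant* component with reversed gradients,
        # so training pushes domain cues out of z_inv.
        domain_logits = self.domain_head(GradReverse.apply(z_inv, grl_lambda))
        return z_inv, z_spec, domain_logits


def orthogonality_penalty(z_inv: torch.Tensor, z_spec: torch.Tensor) -> torch.Tensor:
    """Penalize the squared Frobenius norm of the cross-correlation so the two
    components carry non-overlapping information."""
    z_inv = F.normalize(z_inv, dim=-1)
    z_spec = F.normalize(z_spec, dim=-1)
    cross = z_inv.T @ z_spec / z_inv.size(0)
    return (cross ** 2).sum()


if __name__ == "__main__":
    batch, hidden, prefix, domains = 8, 768, 64, 3
    model = DisentangledPrefix(hidden, prefix, domains)
    pooled = torch.randn(batch, hidden)
    domain_labels = torch.randint(0, domains, (batch,))

    z_inv, z_spec, domain_logits = model(pooled, grl_lambda=0.5)
    adv_loss = F.cross_entropy(domain_logits, domain_labels)
    ortho_loss = orthogonality_penalty(z_inv, z_spec)
    total = adv_loss + 0.1 * ortho_loss  # loss weight is a placeholder assumption
    total.backward()
    print(f"adv={adv_loss.item():.4f}  ortho={ortho_loss.item():.4f}")
```

In the paper's framing, the adversarial strength (here the fixed grl_lambda and loss weight) would instead be treated as part of the strategy space that MOBO tunes against multiple objectives such as task accuracy and domain invariance.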
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 15246