Dual Variance Reduction with Momentum for Imbalanced Black-Box Discrete Prompt Learning

23 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: prompt learning; black-box optimization; imbalanced data
Abstract: Black-box prompt learning has proven to be an effective approach for customizing large language models (LLMs) offered as services to address various downstream tasks. Within this domain, policy gradient-based methods have garnered substantial attention as a prominent approach for learning discrete prompts. However, the highly imbalanced data distributions common in the real world limit the applicability of such approaches by amplifying LLMs' tendency to favor certain categories. To tackle the challenge posed by imbalanced data, this paper pioneers the integration of a pairwise AUC loss into the policy gradient optimization of discrete text prompts and proposes learning discrete prompts with a doubly policy gradient. Unfortunately, the doubly policy gradient estimator suffers from two sources of variance, resulting in unstable optimization. As a further improvement, we propose (1) a novel unbiased variance-reduced doubly policy gradient estimator and (2) incorporating the STORM variance reduction technique. Ultimately, we introduce a novel momentum-based discrete prompt learning method with doubly policy gradient (mDP-DPG). Crucially, we provide theoretical convergence guarantees for mDP-DPG within standard frameworks. The experimental results show that mDP-DPG surpasses baseline approaches across diverse imbalanced text classification datasets, emphasizing the advantages of our proposed approach for tackling data imbalance. Our code is available at the following URL: https://anonymous.4open.science/r/DPDPG-1ECB.
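The ingredients named in the abstract can be illustrated in miniature. The sketch below is NOT the authors' mDP-DPG algorithm; it is a minimal, hypothetical illustration of the general recipe: a categorical policy over discrete prompt tokens, a pairwise AUC surrogate as the black-box reward, a REINFORCE-style score-function gradient, and a STORM-style momentum correction on the gradient estimate. The toy "score model" and all constants are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
V, L = 8, 3                 # vocabulary size, prompt length (toy values)
theta = np.zeros((L, V))    # policy logits: one categorical per prompt slot

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def sample_prompt(theta):
    p = softmax(theta)
    return np.array([rng.choice(V, p=p[i]) for i in range(L)])

def auc_reward(prompt, pos_scores, neg_scores):
    # Toy stand-in for the black-box LLM: the prompt shifts decision scores.
    # Reward is the pairwise AUC: mean over (i, j) of 1[s_pos_i > s_neg_j].
    shift = 0.1 * prompt.sum()
    return (pos_scores[:, None] + shift > neg_scores[None, :]).mean()

def grad_log_prob(theta, prompt):
    # Score function for a product of categoricals: e_{a_i} - softmax(theta_i).
    g = -softmax(theta)
    g[np.arange(L), prompt] += 1.0
    return g

# STORM-style momentum (illustrative; importance weights omitted):
#   d_t = g(theta_t; xi_t) + (1 - a) * (d_{t-1} - g(theta_{t-1}; xi_t))
pos = rng.normal(0.5, 1.0, 20)   # synthetic minority-class scores
neg = rng.normal(-0.5, 1.0, 20)  # synthetic majority-class scores
lr, a = 0.1, 0.3
d = np.zeros_like(theta)
theta_prev = theta.copy()
for t in range(50):
    prompt = sample_prompt(theta)
    r = auc_reward(prompt, pos, neg)
    g_cur = r * grad_log_prob(theta, prompt)       # gradient at current params
    g_old = r * grad_log_prob(theta_prev, prompt)  # same sample, old params
    d = g_cur + (1 - a) * (d - g_old)
    theta_prev = theta.copy()
    theta = theta + lr * d                          # ascent on expected AUC
```

The momentum line is the key variance-reduction idea: reusing the same sampled prompt at both the current and previous parameters lets the correction term cancel much of the sampling noise, which is the role STORM plays in the paper's estimator (the paper's version additionally handles the two variance components of the doubly policy gradient, which this toy omits).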
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3010