Grey-box Prompt Optimization and Fine-Tuning for Cloud-Edge LLM Agents

ICLR 2025 Conference Submission 13721 Authors

28 Sept 2024 (modified: 19 Nov 2024), ICLR 2025 Conference Submission, CC BY 4.0
Keywords: large language model, zeroth-order optimization, prompt optimization
Abstract: Large Language Models (LLMs) are transforming the landscape of generative AI, delivering groundbreaking performance across diverse tasks. Yet, their immense model sizes tether most LLMs to the cloud, posing challenges for tasks that demand processing private and proprietary data. In this paper, we introduce a grey-box prompt optimization and fine-tuning framework for cloud-edge LLMs, paving the way for a seamless, hybrid approach that merges the best of both private and public cloud environments. This framework not only boosts flexibility and scalability but also empowers users with heightened security and compliance, optimizing cost and performance. Beyond that, it ensures robust disaster recovery and business continuity through redundancy and smart workload distribution. At the heart of our solution is an efficient algorithm with guaranteed convergence, specifically tailored to the structure of the grey-box optimization problem. We rigorously analyze and derive its non-asymptotic convergence rate. Our extensive experiments reveal that sandwiched tuning, our novel fine-tuning method, delivers up to a 47.9% performance improvement over traditional methods across multiple tasks.
Supplementary Material: pdf
Primary Area: foundation or frontier models, including LLMs
Submission Number: 13721