Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs

ChangHao Li; Yuchen Zhuang; Rushi Qiang; Haotian Sun; Hanjun Dai; Chao Zhang; Bo Dai

Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs

ChangHao Li, Yuchen Zhuang, Rushi Qiang, Haotian Sun, Hanjun Dai, Chao Zhang, Bo Dai

Published: 18 Sept 2025, Last Modified: 11 Dec 2025NeurIPS 2025 posterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Large Language Models, Black-Box LLMs, LLM Reasoning, LLM Planning, LLM Personalization

TL;DR: M-Pilot uses a small, controllable language model to guide a large, complex language model through complex tasks, improving its reasoning, planning, and personalization capability.

Abstract: Despite the impressive generative abilities of black-box large language models (LLMs), their inherent opacity hinders further advancements in capabilities such as reasoning, planning, and personalization. Existing works aim to enhance LLM capabilities via domain-specific adaptation, which require additional training on accessible model parameters, an infeasible option for black-box LLMs. To address this challenge, we introduce Matryoshka Pilot (M-Pilot), a lightweight white-box LLM controller that guides a large-scale black-box LLM generator by decomposing complex tasks into a series of intermediate outputs. Specifically, we consider the black-box LLM as an environment, with M-Pilot serving as a policy to provide intermediate guidance through prompts for driving the black-box LLM. M-Pilot is trained to pivot the outputs of the black-box LLM aligning with preferences during iterative interaction, which enables controllable multi-turn generation and self-improvement in optimizing intermediate guidance. Empirical evaluations on diverse tasks demonstrate that our method effectively enhances the capabilities of black-box LLMs in complex, long-horizon tasks.

Supplementary Material: zip

Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)

Submission Number: 18973

Loading