Navigating Cognitive Manifolds: Optimal Transport for Large Language Model Optimization

16 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Large Language Models, Optimal Transport, Cognitive Manifolds, Wasserstein Distance, Prompt Optimization, Geometric Deep Learning
Abstract: Large language models (LLMs) possess vast knowledge but face inefficiencies in task-specific knowledge organization and activation. Existing prompt engineering relies on empirical trial-and-error and lacks a principled optimization framework. We introduce Cognitive Geometry Optimal Transport (CGOT), a framework that reframes LLM cognitive optimization as geometric navigation in high-dimensional probability spaces. Our key insight is to model cognitive configurations as probability measures over knowledge states, leveraging optimal transport theory to derive principled paths from initial to target configurations. CGOT employs a dual geometric guidance system: Wasserstein distances for radial metrics and Kantorovich potential gradients for directional guidance, enabling continuous optimization on cognitive manifolds. Through systematic experiments on three prominent LLMs (Qwen3-72B, Deepseek-v3-67B, LLaMA-3-70B) across four cognition-intensive benchmarks (GSM8K, HumanEval, CommonsenseQA, BigBench-Hard), we demonstrate: (1) LLM cognitive spaces exhibit low-dimensional manifold structures (intrinsic dimension ~8.7) with strong geometry-performance correlation (Pearson $r = -0.76$, robustified to standardized $\beta = -0.82$ under hierarchical mixed-effects modeling); (2) CGOT achieves consistent 4.8\% average performance gains (Cohen's d $>$ 0.7 in structured tasks), outperforming baselines such as APO, OPRO, GrIPS, and BayesOpt-Prompt by 0.6\% on average (p$<$0.05); (3) the framework generalizes across prompt strategies (Zero-shot: +5.3\%, Few-shot: +4.5\%, Chain-of-Thought: +4.6\%) and model architectures. Ablation studies confirm the critical contributions of Wasserstein metrics (-1.3\% without) and non-linear optimization (-2.2\% without). This work bridges optimal transport theory with LLM optimization, transforming prompt engineering from an empirical art into a geometric science with enhanced process interpretability.
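The abstract's core construction treats two cognitive configurations as probability measures and compares them with an optimal-transport distance. The sketch below illustrates that idea only: it is a minimal NumPy implementation of entropic-regularized optimal transport (Sinkhorn iterations) between two toy histograms over discrete "knowledge states", not the paper's CGOT implementation; the function name, the toy histograms, and the $|i-j|$ ground cost are all illustrative assumptions.

```python
import numpy as np

def sinkhorn_distance(a, b, C, reg=0.1, n_iters=200):
    """Entropic-regularized OT cost between histograms a and b under cost matrix C.

    Standard Sinkhorn iterations: alternately rescale u and v so the
    transport plan P = diag(u) K diag(v) has marginals a and b.
    """
    K = np.exp(-C / reg)           # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]  # approximate optimal transport plan
    return float(np.sum(P * C)), P

# Toy "cognitive configurations": two distributions over 5 knowledge states
a = np.array([0.4, 0.3, 0.2, 0.05, 0.05])
b = np.array([0.05, 0.05, 0.2, 0.3, 0.4])
states = np.arange(5, dtype=float)
C = np.abs(states[:, None] - states[None, :])  # ground cost |i - j|

dist, plan = sinkhorn_distance(a, b, C)
```

The returned `dist` plays the role of the radial metric; in a gradient-based scheme, the dual (Kantorovich) potentials recoverable from `log u` and `log v` would supply the directional guidance the abstract mentions.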
Primary Area: optimization
Submission Number: 7549