Keywords: Representation learning, latent planning, contrastive learning
TL;DR: Planning can be performed directly in an evaluation-aligned embedding space, where actions are ranked by their alignment with a global advantage direction.
Abstract: Planning in high-dimensional decision spaces is increasingly being studied through the lens of learned representations. Rather than training policies or value heads, we investigate whether planning can be carried out directly in an evaluation-aligned embedding space. We introduce SOLIS, which learns such a space using supervised contrastive learning. In this representation, outcome similarity is captured by proximity, and a single global advantage vector orients the space from losing to winning regions. Candidate actions are then ranked according to their alignment with this direction, reducing planning to vector operations in latent space. We demonstrate this approach in chess, where SOLIS uses only a shallow search guided by the learned embedding to reach competitive strength under constrained conditions. More broadly, our results suggest that evaluation-aligned latent planning offers a lightweight alternative to traditional dynamics models or policy learning. All source code and pretrained models will be made available upon publication.
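The planning step described in the abstract can be illustrated with a minimal PyTorch sketch. All names here (`embed`, `advantage_direction`, `rank_actions`, `encoder`) are hypothetical and not drawn from the SOLIS implementation; the sketch also assumes L2-normalized embeddings and that the global advantage vector is estimated as the mean winning embedding minus the mean losing embedding, details the abstract does not specify.

```python
import torch

# Hypothetical encoder mapping states (e.g., board positions) to embeddings.
# The actual SOLIS architecture and training procedure are not given here;
# the abstract states only that the space is learned with supervised
# contrastive learning so that outcome similarity becomes proximity.
def embed(encoder, states):
    z = encoder(states)
    return torch.nn.functional.normalize(z, dim=-1)

# Assumed estimator for the single global advantage direction: the mean
# embedding of winning positions minus the mean embedding of losing ones,
# normalized to unit length.
def advantage_direction(z_win, z_loss):
    d = z_win.mean(dim=0) - z_loss.mean(dim=0)
    return d / d.norm()

# Rank candidate actions by the alignment (dot product) of their
# successor-state embeddings with the advantage direction, so planning
# reduces to vector operations in latent space.
def rank_actions(encoder, successor_states, d):
    z = embed(encoder, successor_states)  # (num_actions, dim)
    scores = z @ d                        # alignment with advantage vector
    return torch.argsort(scores, descending=True)
```

Under these assumptions, scoring a set of candidate moves costs one matrix-vector product, which is consistent with the abstract's claim that this is a lightweight alternative to learned dynamics models or policy heads.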
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 20655