Align While Search: Belief-Guided Exploratory Inference for Test-Time World Alignment

Published: 12 Jun 2025, Last Modified: 09 Jul 2025
Venue: EXAIT@ICML 2025 Poster
License: CC BY 4.0
Track: Language Modeling
Keywords: Epistemic Exploration, Language-Model Agents, Test-time Adaptation
Abstract: We introduce a test-time adaptive agent that performs exploratory inference through posterior-guided belief refinement, requiring no gradient-based updates or additional training for LLM-agent search under partial observability. Our agent maintains a structured belief over the environment state, iteratively updates it from action-conditioned observations, and selects actions by maximizing predicted information gain over the belief space. We estimate information gain with a lightweight LLM-based surrogate and assess world alignment through a novel reward that quantifies the consistency between the posterior belief and the ground-truth environment configuration. Experiments show that our method outperforms inference-time scaling baselines such as prompt-augmented and retrieval-enhanced LLMs in aligning with latent world states, with significantly lower integration overhead.
Serve As Reviewer: ~Seohui_Bae1, ~Jeonghye_Kim1
Submission Number: 101
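The abstract's loop of belief maintenance, action-conditioned posterior updates, and information-gain-maximizing action selection can be illustrated with a minimal sketch. The code below is not the authors' implementation: the belief is a simple discrete distribution, the LLM-based surrogate is replaced by a hypothetical `observation_model` callable, and all function names (`update_belief`, `expected_information_gain`, `select_action`) are illustrative assumptions rather than names from the paper.

```python
import math
from collections import defaultdict

# Hypothetical discrete belief over latent world states: state -> probability.
Belief = dict

def entropy(belief: Belief) -> float:
    """Shannon entropy (in nats) of the current belief."""
    return -sum(p * math.log(p) for p in belief.values() if p > 0)

def update_belief(belief: Belief, likelihood) -> Belief:
    """Bayesian posterior update: weight each state by the observation
    likelihood, then renormalize."""
    posterior = {s: p * likelihood(s) for s, p in belief.items()}
    z = sum(posterior.values()) or 1e-12
    return {s: p / z for s, p in posterior.items()}

def expected_information_gain(belief, action, observation_model):
    """Expected entropy reduction from taking `action`.
    `observation_model(action, state)` returns {obs: P(obs | action, state)};
    in the paper's setting this predictive role would be played by the
    lightweight LLM surrogate."""
    prior_h = entropy(belief)
    # Marginal probability of each predicted observation under the belief.
    obs_prob = defaultdict(float)
    for s, p in belief.items():
        for o, q in observation_model(action, s).items():
            obs_prob[o] += p * q
    # Expected posterior entropy, averaged over predicted observations.
    expected_h = 0.0
    for o, po in obs_prob.items():
        if po == 0:
            continue
        posterior = update_belief(
            belief, lambda s, o=o: observation_model(action, s).get(o, 0.0)
        )
        expected_h += po * entropy(posterior)
    return prior_h - expected_h

def select_action(belief, actions, observation_model):
    """Exploratory step: pick the action with the highest predicted information gain."""
    return max(actions, key=lambda a: expected_information_gain(belief, a, observation_model))

if __name__ == "__main__":
    # Toy two-state world: "probe" is informative, "wait" is not.
    belief = {"door_left": 0.5, "door_right": 0.5}

    def observation_model(action, state):
        if action == "probe":
            return ({"left_open": 0.9, "right_open": 0.1} if state == "door_left"
                    else {"left_open": 0.1, "right_open": 0.9})
        return {"nothing": 1.0}

    print(select_action(belief, ["wait", "probe"], observation_model))  # -> "probe"
```

In this toy run the agent prefers "probe" because it is expected to shrink belief entropy, mirroring the exploratory, alignment-seeking behavior the abstract describes; the paper's reward comparing the posterior belief to the ground-truth configuration is not modeled here.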