Inferring Implicit Goals Across Differing Task Models

Published: 15 Jun 2025, Last Modified: 17 Aug 2025, AIA 2025, CC BY 4.0
Keywords: value alignment, implicit requirements, unspecified user objectives, expectation mismatch, bottleneck identification, query strategy, policy inference, goal inference, subgoal identification
TL;DR: This paper presents a method for AI agents to identify and query users about implicit, unspecified goals by using bottleneck states in Markov Decision Processes to minimize the number of queries needed while ensuring value-aligned behavior.
Abstract: A significant challenge in generating value-aligned behavior is accounting not only for the specified user objectives but also for any implicit or unspecified user requirements. Such implicit requirements may be particularly common in settings where the user's understanding of the task model differs from the agent's estimate of the model. In this scenario, the user may incorrectly expect some agent behavior to be inevitable or guaranteed. This paper addresses such expectation mismatches in the presence of differing models by capturing the possibility of an unspecified user subgoal in a task modeled as a Markov Decision Process (MDP) and querying for it as required. Our method identifies bottleneck states and uses them as candidates for potential implicit subgoals. We then introduce a querying strategy that generates the minimal number of queries required to identify a policy guaranteed to achieve the underlying goal. Our empirical evaluations demonstrate the effectiveness of our approach in inferring and achieving unstated goals across various tasks.
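Illustrative sketch (not taken from the paper): the abstract does not define bottleneck states formally, so the example below assumes one common graph-based reading, in which a state is a bottleneck if every path from the start state to the goal passes through it. The MDP is simplified to a directed graph of transitions with nonzero probability. The names `reachable`, `bottleneck_states`, and the toy `example` graph are hypothetical and chosen only for illustration.

```python
from collections import deque


def reachable(graph, start, goal, removed=None):
    """Breadth-first search: True if goal is reachable from start without visiting `removed`."""
    if start == removed or goal == removed:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        for nxt in graph.get(state, ()):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def bottleneck_states(graph, start, goal):
    """Return states (other than start and goal) whose removal disconnects start from goal."""
    if not reachable(graph, start, goal):
        return []  # no path at all, so no state can be a bottleneck
    states = set(graph) | {t for successors in graph.values() for t in successors}
    return sorted(
        s for s in states - {start, goal}
        if not reachable(graph, start, goal, removed=s)
    )


# Toy transition graph (an assumption for this sketch): two regions joined by a single "door" state.
example = {
    "a": ["b", "c"],
    "b": ["door"],
    "c": ["door"],
    "door": ["d", "e"],
    "d": ["goal"],
    "e": ["goal"],
    "goal": [],
}

candidates = bottleneck_states(example, "a", "goal")
```

Under this assumed definition, `candidates` holds only the `door` state, which an agent could then treat as a candidate implicit subgoal to query the user about. How the paper itself defines bottlenecks and orders its queries may differ from this sketch.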
Paper Type: New Full Paper
Submission Number: 23