Keywords: Human-AI Alignment, Task-Solving Trajectories, Misalignment Detection, Intention Prediction, Abstraction and Reasoning Corpus
TL;DR: Detecting and modeling misalignments in human task-solving trajectories improves AI reasoning by incorporating inferred intentions into decision-transformer-based learning.
Abstract: Understanding misalignments in human task-solving trajectories is critical for improving AI models trained to mimic human reasoning. This study categorizes these misalignments into three types: (1) Functional Inadequacies in Tools, (2) User Unfamiliarity with Tools, and (3) Cognitive Dissonance in Users. We introduce a misalignment detection algorithm and a visualization tool for analyzing discrepancies in user trajectories from O2ARC, and use them to formalize intention-aware trajectory modeling. We also propose an intention prediction algorithm that infers user intentions by identifying frequently visited states and structured transitions. By incorporating intention-aligned supervision into a Decision Transformer-based ARC solver, we demonstrate that aligning AI with inferred human intentions significantly improves task-solving performance. These findings underscore the importance of modeling human task-solving trajectories beyond raw action sequences by capturing the underlying intentions, which leads to better AI alignment.
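To make the idea of "identifying frequently visited states" concrete, here is a minimal Python sketch. It is not the paper's algorithm: the function names `frequent_states` and `segment_by_waypoints`, the `min_support` threshold, and the representation of a trajectory as a list of hashable state labels are all assumptions made for illustration. The sketch treats states shared by many users' trajectories as candidate intention waypoints and splits a single trajectory at those waypoints.

```python
from collections import Counter


def frequent_states(trajectories, min_support=0.5):
    """Return states visited by at least `min_support` of the trajectories.

    Hypothetical helper: each trajectory is a list of hashable state labels.
    """
    if not trajectories:
        return set()
    counts = Counter()
    for traj in trajectories:
        # Count each state once per trajectory, so loops do not inflate support.
        counts.update(set(traj))
    threshold = min_support * len(trajectories)
    return {state for state, count in counts.items() if count >= threshold}


def segment_by_waypoints(trajectory, waypoints):
    """Split one trajectory into segments that each end at a waypoint state.

    Hypothetical helper: a trailing segment that ends on a non-waypoint
    state is kept as the final segment.
    """
    segments, current = [], []
    for state in trajectory:
        current.append(state)
        if state in waypoints:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


# Toy data (invented for illustration, not taken from O2ARC).
toy_trajectories = [
    ["s0", "a", "g"],
    ["s0", "b", "g"],
    ["s0", "a", "c", "g"],
]
shared = frequent_states(toy_trajectories, min_support=1.0)
segments = segment_by_waypoints(toy_trajectories[2], shared)
```

In this toy setting, each segment could be read as one candidate intention: a run of actions that moves the user from one commonly visited state to the next. How the paper actually defines states, transitions, and intentions may differ.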
Submission Type: Long Paper (9 Pages)
Archival Option: This is a non-archival submission
Presentation Venue Preference: ICLR 2025
Submission Number: 30