Keywords: interactive task learning, language models, agents
TL;DR: We explore the Interactive Task Learning capabilities of small-to-medium-sized LLMs on compositional symbolic tasks.
Abstract: Large Language Models (LLMs) can perform tasks specified in natural language,
making them accessible to users regardless of technical background. However,
specifying tasks within a single, static prompt is often both difficult and suboptimal.
Interactive Task Learning (ITL)—a goal for autonomous agents—proposes
to address this challenge through multi-turn interactions: teachers provide a task
description and (optionally) a demonstration, agents attempt the task while asking
clarifying questions, and teachers offer feedback. Despite ITL’s promise, systematic
evaluation of LLMs’ interactive learning capabilities remains limited. We introduce
the ListOps Domain, a novel testbed for evaluating models’ ability to learn
compositional symbolic tasks through ITL. We evaluate small-to-medium-sized LLMs (4 to 32 billion parameters) and find that a limited form of teacher feedback, consisting only of reminders about broken rules rather than explicit identification or correction of errors, enhances generalization. Using this feedback, we compare models’ ITL and Few-Shot Learning (FSL) capabilities and find that ITL frequently outperforms FSL, especially for more powerful models. We conclude with a
discussion of limitations and recommendations for advancing ITL research.
Submission Number: 143
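For context, the abstract references ListOps, a benchmark of nested symbolic expressions. Below is a minimal sketch of an evaluator for ListOps-style expressions, assuming the standard MIN, MAX, MED, and SM (sum modulo 10) operators over single digits; the exact operator set and syntax of the paper's ListOps Domain are not specified in the abstract and may differ.

```python
# Minimal sketch of a ListOps-style evaluator (assumption: standard
# MIN / MAX / MED / SM operators over digits 0-9; the paper's
# ListOps Domain may use a different operator set or syntax).
from statistics import median


def evaluate(expr: str) -> int:
    """Evaluate a nested ListOps expression such as '[MAX 2 [MIN 3 4] 5]'."""
    tokens = expr.replace("[", " [ ").replace("]", " ] ").split()
    value, rest = _eval_tokens(tokens)
    assert not rest, "unconsumed tokens"
    return value


def _eval_tokens(tokens):
    head, *rest = tokens
    if head != "[":                     # bare operand: a single digit
        return int(head), rest
    op, *rest = rest                    # operator name follows '['
    args = []
    while rest[0] != "]":
        arg, rest = _eval_tokens(rest)  # recurse on nested sub-expressions
        args.append(arg)
    rest = rest[1:]                     # consume the closing ']'
    if op == "MIN":
        return min(args), rest
    if op == "MAX":
        return max(args), rest
    if op == "MED":
        return int(median(args)), rest
    if op == "SM":                      # sum modulo 10
        return sum(args) % 10, rest
    raise ValueError(f"unknown operator: {op}")


print(evaluate("[MAX 2 [MIN 3 4] 5]"))  # -> 5
print(evaluate("[SM 4 7 9]"))           # -> 0
```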