Abstract: Understanding user intents from UI interaction trajectories remains a challenging yet crucial frontier in intelligent agent development. While massive, datacenter-based multi-modal large language models (MLLMs) have the capacity to handle the complexities of such sequences, smaller models, which can run on-device to provide a privacy-preserving, low-cost, and low-latency user experience, struggle with accurate intent inference. In this paper, we address these limitations with a novel decomposed approach: first, we perform structured interaction summarization, capturing key information from each user action; second, we apply a fine-tuned intent extraction model to the aggregated summaries. Remarkably, this method enables resource-constrained models not only to improve intent understanding but also to surpass the base performance of large MLLMs.
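The abstract describes a two-stage decomposition but no implementation details; the following is a minimal, illustrative Python sketch of that data flow only. The `UIAction` fields, `summarize_action`, `extract_intent`, and `intent_model` interfaces are assumptions for illustration and are not taken from the submission.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class UIAction:
    """One step of a UI interaction trajectory (hypothetical schema)."""
    event_type: str    # e.g. "tap", "type", "scroll"
    target_text: str   # visible text of the interacted element, if any


def summarize_action(action: UIAction) -> str:
    """Stage 1 (sketch): structured summary of a single user action.

    In practice this would be produced by a small on-device model; a
    template stands in here to show the intermediate representation.
    """
    return f"{action.event_type} on '{action.target_text}'"


def extract_intent(summaries: List[str], intent_model: Callable[[str], str]) -> str:
    """Stage 2 (sketch): fine-tuned intent extraction over aggregated summaries."""
    prompt = "Infer the user's intent from these actions:\n" + "\n".join(
        f"{i + 1}. {s}" for i, s in enumerate(summaries)
    )
    return intent_model(prompt)  # stand-in for any small-model inference call


def infer_intent(trajectory: List[UIAction], intent_model: Callable[[str], str]) -> str:
    """Decomposed pipeline: summarize each action, then extract the intent."""
    summaries = [summarize_action(a) for a in trajectory]
    return extract_intent(summaries, intent_model)
```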
Paper Type: Long
Research Area: Generation
Research Area Keywords: efficient models, model architectures, inference methods, UI-to-text generation
Contribution Types: Approaches for low compute settings-efficiency
Languages Studied: English
Submission Number: 1418