Abstract: Identifying underlying user goals and intents has been recognized as valuable in various settings, such as personalized agents, improved search responses, advertising, and user analytics. In this paper we propose leveraging an additional signal for identifying user intents: observing users' interactions within UI environments. To that end, we introduce the task of goal identification from observed UI trajectories, which aims to infer the user's intended task from their UI interactions. We propose a novel evaluation metric that assesses whether two task descriptions are paraphrases within a specific UI environment. By leveraging the inverse relation with the UI automation task, we utilized Android and web datasets for our experiments. Using our metric and these datasets, we conducted experiments comparing the performance of humans and state-of-the-art models, specifically GPT-4 and Gemini-1.5 Pro. Our results demonstrate that both Gemini and GPT underperform compared to humans, highlighting significant room for improvement.
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: UI Automation, LLM, Multimodality, Intent Identification, Autonomous UI agents
Contribution Types: Model analysis & interpretability, Position papers
Languages Studied: English
Submission Number: 73