UICOMPASS: UI Map Guided Mobile Task Automation via Adaptive Action Generation

ACL ARR 2025 May Submission 6549 Authors

20 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: Mobile task automation is an emerging technology that uses AI to execute routine tasks on mobile platforms such as Android in response to user commands, thereby improving efficiency and productivity. Although large language models (LLMs) handle general mobile tasks well thanks to training on massive datasets, they struggle with app-specific workflows. To address this problem, we design UI Map, a structured representation of a target app's UI information. We further propose UICompass, a UI Map-guided, LLM-based approach to automating mobile tasks. Specifically, UICompass first uses static analysis and LLMs to automatically build a UI Map from either the app's source code or its bytecode (i.e., the APK package). During task execution, UICompass mines task-relevant information from the UI Map and feeds it to the LLM, generates a planned path, and adaptively adjusts that path based on the actual app state and the action history. Experimental results show that UICompass achieves a 15.87% higher task success rate than SOTA approaches. Even when only the APK is available, UICompass maintains superior performance, demonstrating its applicability to closed-source apps.
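
To make the plan-then-adapt idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a UI map can be approximated as a directed graph of screens connected by labeled actions, and it substitutes a plain breadth-first search for the LLM-based planner. The screen names, the `UI_MAP` dictionary, and the `plan_path`, `execute`, and `observe` names are all hypothetical.

```python
from collections import deque

# Hypothetical UI map: screen -> {action label: screen the action leads to}.
UI_MAP = {
    "Home": {"tap Settings": "Settings", "tap Compose": "Compose"},
    "Settings": {"tap Account": "Account", "tap Back": "Home"},
    "Compose": {"tap Back": "Home"},
    "Account": {"tap Back": "Settings"},
}


def plan_path(ui_map, start, goal):
    """Breadth-first search for the shortest action sequence from start to goal.

    Returns a list of (action, expected_screen) pairs, or None if unreachable.
    BFS stands in here for the LLM planner described in the abstract.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        screen, path = queue.popleft()
        if screen == goal:
            return path
        for action, nxt in ui_map.get(screen, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(action, nxt)]))
    return None


def execute(ui_map, start, goal, observe, max_steps=20):
    """Run the task, replanning from the screen actually observed at each step.

    observe(screen, action) is a caller-supplied function that performs the
    action and returns the screen the app really ended up on, which may
    differ from what the map predicts.
    Returns the history of (action, observed_screen) pairs, or None on failure.
    """
    current, history = start, []
    for _ in range(max_steps):
        if current == goal:
            return history
        path = plan_path(ui_map, current, goal)
        if not path:
            return None
        action, _expected = path[0]
        current = observe(current, action)  # the actual state may deviate
        history.append((action, current))
    return None
```

Replanning after every observed step is the simplest way to adapt in this sketch: if an action lands on an unexpected screen, the next iteration plans from that screen rather than from the one the map predicted.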

Paper Type: Long

Research Area: NLP Applications

Research Area Keywords: Task Automation, Large Language Models, App Analysis

Contribution Types: Publicly available software and/or pre-trained models

Languages Studied: English

Submission Number: 6549
