From Screenshots to Hierarchical Code: Android GUI Layout Code Generation via Multi-Agent LLMs

ACL ARR 2025 May Submission 5535 Authors

20 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: UI-to-Code systems have achieved strong performance on web interfaces, yet generating structured Android GUI code remains challenging due to layout complexity. We propose a framework that converts Android screenshots into hierarchical code through multi-agent LLMs. The framework begins with GUI component recognition, extracting both local component information and global layout structure. The LLM is then guided to generate code for each component in context, ensuring consistency and modularity. To improve code quality, we introduce a feedback-driven refinement stage that leverages structural similarity metrics for iterative enhancement. We evaluate our approach on subsets of the Rico dataset. Results show that our method significantly outperforms Pix2Code, direct prompting, and chain-of-thought prompting strategies. Our findings highlight the effectiveness of layout-aware prompting and structured refinement for accurate Android GUI code generation.
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: Special Theme Track, Dialogue and Interactive Systems, Generation
Contribution Types: NLP engineering experiment
Languages Studied: English
Keywords: Special Theme Track, Dialogue and Interactive Systems, Generation
Submission Number: 5535
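
The feedback-driven refinement stage mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the structural similarity metric, so a multiset Jaccard overlap of (depth, tag) pairs is swapped in as a stand-in. The names Node, signature, structural_similarity, refine, and the revise callback (a placeholder for the LLM revision step) are all hypothetical, as are the threshold and round limit.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """Hypothetical layout node, e.g. a LinearLayout or a Button."""
    tag: str
    children: List["Node"] = field(default_factory=list)


def signature(root: Node) -> Counter:
    """Count (depth, tag) pairs over the whole layout tree."""
    counts: Counter = Counter()
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        counts[(depth, node.tag)] += 1
        stack.extend((child, depth + 1) for child in node.children)
    return counts


def structural_similarity(a: Node, b: Node) -> float:
    """Stand-in metric (an assumption): multiset Jaccard overlap in [0, 1]."""
    sa, sb = signature(a), signature(b)
    # Counter '&' keeps minimum counts, '|' keeps maximum counts.
    return sum((sa & sb).values()) / sum((sa | sb).values())


def refine(
    candidate: Node,
    target: Node,
    revise: Callable[[Node, float], Node],
    threshold: float = 0.9,
    max_rounds: int = 3,
) -> Tuple[Node, float]:
    """Ask `revise` for a new candidate until the score is high enough.

    `revise` stands in for an LLM call that receives the current layout
    and its score as feedback. A proposal is kept only if it scores higher.
    """
    score = structural_similarity(candidate, target)
    for _ in range(max_rounds):
        if score >= threshold:
            break
        proposal = revise(candidate, score)
        new_score = structural_similarity(proposal, target)
        if new_score > score:
            candidate, score = proposal, new_score
    return candidate, score
```

In this sketch the target tree would come from the component recognition step and the candidate from parsing the generated layout code. Accepting only proposals that raise the score keeps the loop from regressing when a revision makes the structure worse.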