AttrAppGen: User Attribute-Aware App Usage Trace Generation With Large Language Models

Published: 2026, Last Modified: 21 Jan 2026. Venue: IEEE Trans. Netw. Sci. Eng. 2026. License: CC BY-SA 4.0
Abstract: The exponential growth of smartphones and mobile applications (apps) has generated vast app usage traces, which are critical for stakeholders but accompanied by high costs during data collection, as well as significant privacy risks when data is released or shared. Traditional privacy-preserving methods, such as anonymization and differential privacy, often fail to reconcile data utility with robust privacy guarantees, suffering from re-identification vulnerabilities or excessive noise. In this paper, we propose AttrAppGen, a novel framework leveraging large language models (LLMs) to synthesize app usage traces. AttrAppGen comprises four key stages: i) attribute-aware app text dataset construction via attribute desensitization; ii) cluster-based conditional prompt design to discover behavioral patterns of user and app attributes, and then generate prompt-output app data training pairs; iii) data generation LLM fine-tuning based on app data training pairs; and iv) synthetic app usage data sampling for final data generation. Extensive experiments on a real-world dataset comprising 260,000 records from 6,102 users and 1,129 apps demonstrate the superior performance of AttrAppGen in preserving overall data distribution, ensuring high utility across three downstream tasks, and maintaining strong privacy protection. By effectively balancing the trade-off between utility and privacy, AttrAppGen provides a scalable and secure solution for data sharing in mobile data ecosystems.
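As a rough illustration of stages i) and ii), the sketch below turns raw usage records into a single prompt-output text pair. It is a minimal sketch under stated assumptions, not the authors' implementation: the `UsageRecord` fields, the decade-based age bucketing, the time-slot binning, the prompt wording, and the `cluster_id` argument are all hypothetical, since the abstract does not specify the attribute schema, the desensitization rule, or the clustering method.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class UsageRecord:
    """One app usage event (hypothetical schema, not from the paper)."""
    user_id: str
    age: int
    app: str
    category: str
    hour: int  # hour of day, 0-23


def desensitize_age(age: int) -> str:
    """Coarsen an exact age into a decade bucket (assumed rule)."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def time_slot(hour: int) -> str:
    """Map an hour of day to a coarse slot (assumed binning)."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"


def build_pair(records: list[UsageRecord], cluster_id: int) -> dict[str, str]:
    """Turn one user's records into a prompt/output training pair.

    The prompt carries only coarsened attributes and a cluster label;
    the raw user_id and exact age never appear in the text.
    """
    first = records[0]
    prompt = (
        f"Generate an app usage trace for a user aged "
        f"{desensitize_age(first.age)} in behavior cluster {cluster_id}."
    )
    output = "; ".join(
        f"{time_slot(r.hour)}: {r.app} ({r.category})"
        for r in sorted(records, key=lambda r: r.hour)
    )
    return {"prompt": prompt, "output": output}


# Example with made-up records; cluster_id would come from a separate
# clustering step that is not shown here.
records = [
    UsageRecord("u1", 27, "MapApp", "navigation", 19),
    UsageRecord("u1", 27, "ChatApp", "social", 8),
]
pair = build_pair(records, cluster_id=3)
```

Pairs of this shape could then serve as fine-tuning data for stage iii), with new prompts sampled at generation time for stage iv).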