Track: Security and privacy
Keywords: Browser Fingerprinting, Differential Privacy, Federated Learning
Abstract: Browser fingerprinting is a pervasive online tracking technique increasingly used for profiling and targeted advertising.
Existing research on fingerprinting prevalence relies heavily on automated web crawls, which inherently struggle to replicate the nuances of human-computer interaction.
This raises concerns about how accurately the current understanding reflects real-world fingerprinting deployments.
To address this gap, this paper presents a user study of 30 participants over a 10-week period, capturing telemetry from real browsing sessions across 3,000 top-ranked websites.
Our findings reveal that automated crawls miss nearly half (47.8%) of the fingerprinting websites encountered by real users.
This discrepancy mainly stems from crawlers' inability to access authentication-protected pages, circumvent bot detection mechanisms, and trigger fingerprinting scripts activated by specific user interactions.
We also identify potential new fingerprinting vectors present in real user data but absent from automated crawls.
Finally, we evaluate federated learning for training browser fingerprinting detection models on real user data and show that the resulting models outperform those trained solely on automated crawl data.
Submission Number: 506