OpenMAE: Efficient Masked Autoencoder for Vibration Sensing with Open-domain Data Enrichment

Chenzhi Hu, Yatong Chen, Denizhan Kara, Shengzhong Liu, Tarek Abdelzaher, Fan Wu, Guihai Chen

Published: 09 Jun 2025, Last Modified: 06 Nov 2025
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
CC BY-SA 4.0
Abstract: This paper introduces OpenMAE, a novel data enrichment framework that uses open-world sensor data streams to enable efficient masked autoencoder (MAE) pretraining on vibration signals. Because event occurrences are highly sparse and distributional shifts from downstream tasks are inevitable, directly concatenating large-scale open-domain data with limited in-domain data during pretraining degrades downstream task performance. The problem is further complicated by missing knowledge of the open-world sensor environments and the associated physical event semantics. To address these challenges, OpenMAE makes the following contributions to vibration MAE pretraining with open-domain data. First, it automatically filters out uninformative samples based on event activeness and information consistency, without relying on human annotations. Second, to bridge the gap between open-domain and in-domain distributions, OpenMAE develops a novel data mixing method, FreqCutMix, that combines the two data types in the frequency domain to form augmented pretraining samples, preserving both the events-of-interest semantics of in-domain data and the real-world diversity of open-domain data. The scale of open-domain data used in mixing is dynamically increased as pretraining progresses to stabilize model convergence. We download over 5 million open-world vibration samples from the Raspberry Shake datacenter and conduct extensive experiments with two applications (indoor activity and outdoor transportation analysis). The evaluation results show that OpenMAE improves downstream task accuracy by up to 23% and generalizes better across diverse downstream tasks, domain variations, and sensor-to-target distances.
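The abstract does not spell out how FreqCutMix combines the two data types or how the open-domain scale is increased, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that mixing means replacing one contiguous band of the in-domain spectrum with the same band from an open-domain sample, and that the open-domain share grows linearly with the training step. The names `freq_band_mix`, `open_domain_fraction`, `band_frac`, and `max_frac` are hypothetical.

```python
import numpy as np


def freq_band_mix(in_domain, open_domain, band_frac, rng):
    """Replace one contiguous frequency band of `in_domain` with the
    same band taken from `open_domain` (assumed mixing rule, not the
    paper's exact FreqCutMix procedure).

    Both inputs are real-valued arrays of equal shape; the last axis is time.
    `band_frac` in [0, 1] is the share of frequency bins that is swapped.
    """
    if in_domain.shape != open_domain.shape:
        raise ValueError("signals must have the same shape")
    band_frac = min(max(float(band_frac), 0.0), 1.0)

    n = in_domain.shape[-1]
    spec_in = np.fft.rfft(in_domain)      # one-sided spectrum, in-domain
    spec_open = np.fft.rfft(open_domain)  # one-sided spectrum, open-domain
    n_bins = spec_in.shape[-1]

    width = int(round(band_frac * n_bins))
    if width == 0:
        return np.asarray(in_domain, dtype=float).copy()

    # Pick a random start so the band [start, start + width) fits the spectrum.
    start = int(rng.integers(0, n_bins - width + 1))
    mixed = spec_in.copy()
    mixed[..., start:start + width] = spec_open[..., start:start + width]

    # Back to the time domain, keeping the original signal length.
    return np.fft.irfft(mixed, n=n)


def open_domain_fraction(step, total_steps, max_frac=0.5):
    """Linear ramp of the open-domain share from 0 to `max_frac`
    (assumed schedule; the abstract only says the scale increases)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return max_frac * progress
```

In a pretraining loop, one might call `open_domain_fraction(step, total_steps)` at each step and pass the result as `band_frac`, so that early samples stay close to the in-domain data and later samples carry more open-domain content.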
External IDs: doi:10.1145/3729485