Keywords: HAR, Synthetic data, Regularization
Abstract: Synthetic data has become a common strategy to address data scarcity in Human Activity Recognition (HAR). However, models trained on synthetic samples often overfit to spurious features, leading to a substantial domain gap when transferred to real-world data. To address this challenge, we propose Regularization via Invariant Patterns (RIP), a novel data-centric method that extends the idea of domain randomization to the temporal domain. RIP augments time-series windows by "framing" them with invariant (constant-valued) patterns, compelling models to focus on informative signals rather than irrelevant temporal context.
Evaluated across five HAR datasets, four classifiers, and more than 2,000 experiments, RIP consistently improves F1 scores, achieving gains of up to +53 percentage points (over +160% relative improvement) compared to synthetic baselines, often matching or surpassing real-data baselines. Beyond synthetic scenarios, RIP also boosts performance in real-only training settings, highlighting its broad applicability. Both theoretical analysis and empirical results show that RIP stabilizes weight updates and enhances calibration, all without modifying model architectures.
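The "framing" operation described in the abstract can be illustrated with a minimal sketch. This is one possible reading only, not the authors' implementation: the abstract does not specify where the invariant pattern is placed, how long it is, or which constant is used. The function name `frame_with_constant`, the parameters `pad_len` and `value`, and the choice to frame both ends of a `(time, channels)` window are all assumptions made for illustration.

```python
import numpy as np


def frame_with_constant(window, pad_len, value=0.0):
    """Surround a (time, channels) window with constant-valued segments.

    Hypothetical helper: pad_len and value are illustrative assumptions,
    not parameters taken from the paper.
    """
    window = np.asarray(window, dtype=float)
    if window.ndim != 2:
        raise ValueError("expected a window of shape (time, channels)")
    if pad_len < 0:
        raise ValueError("pad_len must be non-negative")

    # Constant-valued block with the same number of channels as the window.
    frame = np.full((pad_len, window.shape[1]), value, dtype=window.dtype)

    # Place the invariant pattern before and after the original signal.
    return np.concatenate([frame, window, frame], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    window = rng.normal(size=(128, 3))  # e.g. 128 samples of 3-axis data
    framed = frame_with_constant(window, pad_len=16)
    print(framed.shape)  # (160, 3)
```

Under these assumptions the original samples are left unchanged and only the surrounding context is replaced by a constant, which matches the abstract's stated aim of steering models away from irrelevant temporal context.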
Supplementary Material: pdf
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 22098