PRIMUS: Pretraining IMU Encoders with Multimodal Self-Supervision

Published: 10 Oct 2024, Last Modified: 26 Nov 2024. NeurIPS 2024 TSALM Workshop. License: CC BY 4.0
Keywords: Time Series, IMU, Multimodal Training, Self-Supervised Learning
TL;DR: We propose PRIMUS, a method for pretraining IMU encoders using self-supervised, multimodal, and nearest-neighbor supervision.
Abstract: Sensing human motion through Inertial Measurement Units (IMUs) embedded in personal devices has enabled significant applications in health and wellness. While labeled IMU data is scarce, we can collect unlabeled or weakly labeled IMU data to model human motion. For video or text modalities, the "pretrain and adapt" approach uses large volumes of unlabeled or weakly labeled data to build a strong feature extractor, which is then adapted to specific tasks with limited labeled data. For IMU data, however, pretraining methods are poorly understood, and pretraining pipelines are rarely evaluated on out-of-domain tasks. We propose PRIMUS, a method for PRetraining IMU encoderS with a novel pretraining objective that is empirically validated by downstream performance on both in-domain and out-of-domain datasets. The PRIMUS objective effectively enhances downstream performance by combining self-supervision, multimodal supervision, and nearest-neighbor supervision. With fewer than 500 labeled samples per class, PRIMUS improves test accuracy by up to 15% over state-of-the-art baselines. To benefit the broader community, we open-source our code at github.com/nokia-bell-labs/pretrained-imu-encoders.
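The abstract names three supervision signals that are combined into one pretraining objective. The sketch below is only an illustration of that idea, not the authors' implementation (which is in the linked repository): the function names (`info_nce`, `primus_style_loss`), the use of InfoNCE for every term, the equal weighting of the three terms, and the memory-bank neighbor centroid are all assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.1):
    """Standard InfoNCE contrastive loss between two aligned batches of embeddings."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature              # (B, B) similarity matrix
    targets = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, targets)

def primus_style_loss(imu_emb, imu_emb_aug, video_emb, text_emb, bank, k=4):
    """Hypothetical combination of the three supervision signals named in the abstract:
    self-supervision over augmented IMU views, multimodal alignment with paired video
    and text embeddings, and nearest-neighbor supervision from a bank of IMU embeddings."""
    # Self-supervised term: two augmented views of the same IMU window should agree.
    l_ssl = info_nce(imu_emb, imu_emb_aug)

    # Multimodal term: IMU embeddings align with paired video and text embeddings.
    l_mm = info_nce(imu_emb, video_emb) + info_nce(imu_emb, text_emb)

    # Nearest-neighbor term: treat the k closest bank entries as additional positives
    # (here collapsed to their centroid for a single contrastive target).
    sims = F.normalize(imu_emb, dim=-1) @ F.normalize(bank, dim=-1).t()
    nn_idx = sims.topk(k, dim=-1).indices                     # (B, k) neighbor indices
    l_nn = info_nce(imu_emb, bank[nn_idx].mean(dim=1))

    # Equal weighting is an assumption; the paper may weight or schedule these terms.
    return l_ssl + l_mm + l_nn
```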
Submission Number: 67