MultimodalHD: Federated Learning Over Heterogeneous Sensor Modalities using Hyperdimensional Computing
Abstract: Federated Learning (FL) has gained increasing interest as a privacy-preserving distributed learning paradigm in recent years. Although previous works have addressed data and system heterogeneities in FL, there has been less exploration of modality heterogeneity, where clients collect data from various sensor types such as accelerometers and gyroscopes. As a result, traditional FL methods that assume uni-modal sensors are not applicable to multimodal federated learning (MFL). State-of-the-art MFL methods use modality-specific blocks, usually recurrent neural networks, to process each modality. However, executing these methods on edge devices is challenging and resource-intensive. A new MFL algorithm is needed to jointly learn from heterogeneous sensor modalities while operating within limited resource and energy budgets. We propose a novel hybrid framework based on Hyperdimensional Computing (HD) and deep learning, named MultimodalHD, to learn effectively and efficiently on edge devices with different sensor modalities. MultimodalHD uses a static HD encoder to encode raw sensory data from different modalities into high-dimensional, low-precision hypervectors. These multimodal hypervectors are then fed to an attentive fusion module that learns richer representations via inter-modality attention. Moreover, we design a proximity-based aggregation strategy to alleviate modality interference between clients. MultimodalHD is designed to combine the strengths of both worlds: the computational efficiency of HD and the modeling capability of deep learning. We conduct experiments on multimodal human activity recognition datasets. Results show that MultimodalHD delivers accuracy comparable to (if not better than) state-of-the-art MFL algorithms, while being 2x to 8x more efficient in terms of training time. Our code is available online.
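To make the described pipeline concrete, the sketch below illustrates one plausible reading of the local model on a single client: each modality's raw window is mapped to a low-precision hypervector by a static (non-trainable) random-projection HD encoder, and the per-modality hypervectors are then fused with inter-modality attention before classification. This is our own simplification, not the authors' released code; the dimensionality, the `hd_encode` / `InterModalityAttention` names, and all hyperparameters are assumptions for illustration only.

```python
import torch
import torch.nn as nn

HD_DIM = 10_000  # hypervector dimensionality (typical HD choice; an assumption)

def make_static_hd_encoder(in_dim: int, hd_dim: int = HD_DIM) -> torch.Tensor:
    """Fixed (non-trainable) random projection matrix acting as the static HD encoder."""
    return torch.randn(in_dim, hd_dim)

def hd_encode(x: torch.Tensor, proj: torch.Tensor) -> torch.Tensor:
    """Map flattened sensor windows to low-precision bipolar hypervectors."""
    return torch.sign(x @ proj)  # quantized to {-1, 0, +1}; no gradients needed

class InterModalityAttention(nn.Module):
    """Hypothetical attentive fusion over per-modality hypervectors."""
    def __init__(self, hd_dim: int = HD_DIM, proj_dim: int = 128, num_classes: int = 6):
        super().__init__()
        self.proj = nn.Linear(hd_dim, proj_dim)  # compress hypervectors before attention
        self.attn = nn.MultiheadAttention(proj_dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(proj_dim, num_classes)

    def forward(self, modality_hvs: torch.Tensor) -> torch.Tensor:
        # modality_hvs: (batch, num_modalities, hd_dim)
        z = self.proj(modality_hvs)           # (batch, num_modalities, proj_dim)
        fused, _ = self.attn(z, z, z)         # inter-modality attention
        return self.classifier(fused.mean(dim=1))  # pool over modalities, then classify

# Example: a client with two modalities (e.g., accelerometer and gyroscope).
batch, acc_dim, gyro_dim = 32, 3 * 128, 3 * 128  # flattened 3-axis, 128-sample windows
acc_proj = make_static_hd_encoder(acc_dim)
gyro_proj = make_static_hd_encoder(gyro_dim)
acc_hv = hd_encode(torch.randn(batch, acc_dim), acc_proj)
gyro_hv = hd_encode(torch.randn(batch, gyro_dim), gyro_proj)
model = InterModalityAttention()
logits = model(torch.stack([acc_hv, gyro_hv], dim=1))  # (batch, num_classes)
```

Under this reading, only the lightweight fusion module is trained and exchanged during federated rounds, while the static HD encoders stay fixed on each client; the proximity-based aggregation strategy mentioned in the abstract would operate on those fusion-module updates and is not shown here.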