Beyond Hearing: Learning Task-agnostic ExG Representations from Earphones via Physiology-informed Tokenization

ICLR 2026 Conference Submission18309 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Physiology-informed Multi-band Tokenization, ExG, Representation Learning, Free-living ExG dataset, Task-agnostic training
TL;DR: We collected 50h of free-living ExG data with an earphone-based device and propose PiMT, a physiology-informed multi-band tokenization approach designed for task-agnostic representation learning with reconstruction-based pre-training.
Abstract: Electrophysiological (ExG) signals offer valuable insights into human physiology, yet building foundation models that generalize across everyday tasks remains challenging due to two key limitations: (i) insufficient data diversity, as most ExG recordings are collected in controlled labs with bulky, expensive devices; and (ii) task-specific model designs that require tailored processing (e.g., targeted frequency filters) and architectures, which limit generalization across tasks. To address these challenges, we introduce an approach for scalable, task-agnostic ExG monitoring in the wild. We collected 50 hours of unobtrusive free-living ExG data with an earphone-based hardware prototype to narrow the data diversity gap. At the core of our approach is Physiology-informed Multi-band Tokenization (PiMT), which decomposes ExG signals into 12 physiology-informed tokens, followed by a reconstruction task to learn robust representations. This enables adaptive feature recognition across the full frequency spectrum while capturing task-relevant information. Experiments on our new HumanSense dataset, the first to enable ExG-based analysis across five human senses, together with four public ExG benchmarks, demonstrate that PiMT consistently outperforms state-of-the-art methods across diverse tasks.
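As a rough illustration of what a multi-band decomposition into 12 tokens could look like, the sketch below splits a 1-D signal into 12 band-limited components using FFT masking. It is not the authors' PiMT implementation: the band edges in `BAND_EDGES_HZ`, the function name `multiband_tokens`, and the choice of FFT masking are all assumptions made for illustration, since the abstract does not specify the bands or the filtering method. The reconstruction-based pre-training step is not shown.

```python
import numpy as np

# Hypothetical band edges in Hz (assumption: the abstract does not list the
# 12 bands). 13 edges define 12 contiguous bands.
BAND_EDGES_HZ = np.array([0.5, 1, 2, 4, 8, 13, 20, 30, 45, 60, 80, 100, 125])


def multiband_tokens(signal, fs):
    """Split a 1-D signal into one band-limited component per band.

    Each component ("token" in this sketch) is obtained by zeroing all FFT
    bins outside the band and transforming back to the time domain.
    Returns an array of shape (12, len(signal)).
    """
    n = signal.shape[-1]
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    tokens = []
    for lo, hi in zip(BAND_EDGES_HZ[:-1], BAND_EDGES_HZ[1:]):
        mask = (freqs >= lo) & (freqs < hi)  # keep only bins inside [lo, hi)
        tokens.append(np.fft.irfft(spectrum * mask, n=n))
    return np.stack(tokens)
```

For a signal whose energy lies entirely inside the covered range, the 12 components sum back to the original signal, which is one simple way to sanity-check such a decomposition before feeding the tokens to a model.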
Primary Area: applications to neuroscience & cognitive science
Submission Number: 18309