Towards a generalizable, unified framework for decoding from multimodal neural activity
Keywords: brain-computer interfaces, spikes, LFP, multimodal, transformer, neural decoding
TL;DR: We extend the POYO neural decoding framework to process multiple modalities of neural data (spikes and LFPs), and observe promising performance gains as a result.
Abstract: Recent advances in neural decoding have led to the development of large-scale deep learning-based neural decoders that can generalize across sessions and subjects. However, existing approaches predominantly focus on a single modality of neural activity, limiting their applicability to specific recording modalities and tasks. In this work, we present a multimodal extension of the POYO framework that jointly processes neuronal spikes and local field potentials (LFPs) for behavioural decoding. Our approach employs flexible tokenization schemes for both spikes and LFPs, enabling efficient processing of heterogeneous neural populations without preprocessing requirements such as binning. Through experiments on data from nonhuman primates performing motor tasks, we demonstrate that multimodal pretraining yields superior decoding performance compared to unimodal baselines. We also show evidence of cross-modal transfer: models pretrained on both modalities outperform LFP-only models when fine-tuned solely on LFPs, suggesting a path toward more cost-effective brain-computer interfaces that can use performant LFP-based decoders. Our models also exhibit robustness to missing modalities at inference time when trained with modality masking, and scale effectively with both model size and pretraining data. Overall, this work represents an important first step towards unified, general-purpose neural decoders capable of leveraging diverse neural signals for a variety of brain-computer interface applications.
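The abstract describes training with modality masking so that the decoder stays usable when one modality (spikes or LFPs) is unavailable at inference. The submission does not include code here; the sketch below is a minimal, hypothetical illustration of what such masking could look like in a PyTorch-style training pipeline. The function name `mask_modality`, the tensor shapes, and the drop probability are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of modality masking during training (not the authors' code).
# With some probability, one modality's tokens are masked out for a training
# sample, so the model learns to decode behaviour from the remaining modality.
import torch

def mask_modality(spike_tokens: torch.Tensor,
                  lfp_tokens: torch.Tensor,
                  p_drop: float = 0.2):
    """Randomly mask one modality's token set for a single training sample.

    spike_tokens: (num_spike_tokens, d_model) -- assumed shapes for illustration
    lfp_tokens:   (num_lfp_tokens, d_model)
    Returns the (possibly masked) token tensors plus flags indicating which
    modalities remain present, which the model can consume as inputs.
    """
    keep_spikes, keep_lfp = True, True
    if torch.rand(()) < p_drop:
        # Drop exactly one modality, chosen at random -- never both, so the
        # decoder always has some neural input to work with.
        if torch.rand(()) < 0.5:
            spike_tokens = torch.zeros_like(spike_tokens)
            keep_spikes = False
        else:
            lfp_tokens = torch.zeros_like(lfp_tokens)
            keep_lfp = False
    return spike_tokens, lfp_tokens, keep_spikes, keep_lfp
```

In a variable-length token pipeline, the masked modality's tokens would more likely be dropped from the input sequence entirely rather than zeroed; zeroing is used here only to keep the sketch shape-stable. Either way, dropping at most one modality per sample is what lets the trained model tolerate a missing modality at inference time.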
Submission Number: 80