GPT2MEG: Quantizing MEG for Autoregressive Generation

Published: 01 Mar 2026, Last Modified: 01 Mar 2026 · ICLR 2026 TSALM Workshop Poster · CC BY 4.0
Keywords: Magnetoencephalography, Time series, Autoregressive modeling, Generative modeling, GPT, Tokenization, Conditional generation, Multi-subject modeling
TL;DR: We adapt GPT-2 to continuous multichannel MEG using simple quantization and embeddings, enabling scalable, conditioned autoregressive generation that faithfully reproduces MEG dynamics.
Abstract: Large models and language-model training recipes are increasingly repurposed for time series, yet most work emphasizes univariate forecasting and evaluates models primarily via next-step loss. We introduce GPT2MEG, a tokenized GPT-2-style Transformer for multichannel MEG that enables context-informed autoregressive generation via additive embeddings for sensor identity, subject ID, and time-aligned task conditions. Using simple mu-law companding with uniform quantization, we train with cross-entropy and sample long horizons. To support rigorous evaluation of generative time-series models, we complement next-step metrics with spectral fidelity, HMM-based multivariate dynamics, and task-evoked response alignment. GPT2MEG best matches HMM state statistics and conditioned evoked responses, scales across 15 subjects via subject embeddings, and yields interpretable channel embeddings aligned with sensor geometry. Code will be available on GitHub.
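The abstract's tokenization step (mu-law companding followed by uniform quantization) can be illustrated with a minimal sketch. The parameter choices below (mu = 255, a 256-token vocabulary, and amplitude normalization to [-1, 1]) are assumptions for illustration, not values confirmed by the paper.

```python
# Minimal sketch of mu-law companding + uniform quantization for MEG tokens.
# Assumptions (not from the paper): mu = 255, 256 bins, signals pre-normalized to [-1, 1].
import numpy as np

def mu_law_encode(x: np.ndarray, mu: int = 255, n_bins: int = 256) -> np.ndarray:
    """Compand a signal in [-1, 1] and map it to integer tokens in [0, n_bins - 1]."""
    companded = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)  # still in [-1, 1]
    tokens = np.clip(((companded + 1.0) / 2.0 * (n_bins - 1)).round(), 0, n_bins - 1)
    return tokens.astype(np.int64)

def mu_law_decode(tokens: np.ndarray, mu: int = 255, n_bins: int = 256) -> np.ndarray:
    """Invert the uniform quantization and the mu-law companding."""
    companded = tokens.astype(np.float64) / (n_bins - 1) * 2.0 - 1.0
    return np.sign(companded) * np.expm1(np.abs(companded) * np.log1p(mu)) / mu

# Example: tokenize a normalized multichannel segment (channels x time).
meg = np.tanh(np.random.randn(4, 1000))   # stand-in for amplitude-normalized MEG
tokens = mu_law_encode(meg)               # integer tokens fed to the GPT-2-style model
reconstruction = mu_law_decode(tokens)    # dequantized signal, e.g. for spectral evaluation
```

In such a setup, each sensor sample becomes a discrete token, so the model can be trained with standard cross-entropy and sampled autoregressively, with sensor, subject, and condition information injected through additive embeddings as described in the abstract.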
Track: Research Track (max 4 pages)
Submission Number: 6