Supplementary Material: pdf
Keywords: alignment, representational similarity, fMRI, compression, information bottleneck
TL;DR: Compressing sentence-processing brain data improves its alignment with LLM representations, suggesting that the two representation spaces carry different amounts of information.
Abstract: Recent work has discovered similarities between learned representations in large language models (LLMs) and human brain activity during language processing. However, it remains unclear what information LLM and brain representations share. In this work, motivated by the notion that brain data may include information not captured by LLMs, we apply an information bottleneck method to generate compressed representations of fMRI data. For certain brain regions in the frontal cortex, we find that compressing brain representations by a small amount increases their similarity to both BERT and GPT-2 embeddings. Our method thus not only improves LLM-brain alignment scores but also reveals characteristics of the amount of information captured by each representation scheme.
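The abstract does not specify implementation details, so the following is a minimal illustrative sketch of the general idea: compress brain-like features and check whether a modest amount of compression changes their representational similarity to LLM embeddings. PCA stands in for the paper's information bottleneck method, RSA (Spearman-correlated dissimilarity matrices) stands in for the alignment score, and all data, dimensions, and compression levels are hypothetical choices for illustration only.

```python
# Hypothetical sketch: does compressing (simulated) fMRI features change their
# representational similarity to LLM embeddings? PCA is a stand-in for the
# information-bottleneck compression described in the abstract, and RSA is a
# stand-in for the alignment score; neither choice is specified by the paper.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_sentences, n_voxels, llm_dim = 200, 500, 768

# Toy data: a shared latent signal plus voxel noise, mimicking partial overlap
# between brain responses and LLM embeddings for the same sentences.
latent = rng.standard_normal((n_sentences, 20))
brain = latent @ rng.standard_normal((20, n_voxels)) \
        + 2.0 * rng.standard_normal((n_sentences, n_voxels))
llm = latent @ rng.standard_normal((20, llm_dim)) \
      + 0.5 * rng.standard_normal((n_sentences, llm_dim))

def rsa_score(a, b):
    """Spearman correlation between the two spaces' dissimilarity matrices."""
    return spearmanr(pdist(a, metric="correlation"),
                     pdist(b, metric="correlation"))[0]

print(f"uncompressed ({n_voxels} voxels): {rsa_score(brain, llm):.3f}")
for k in (100, 50, 20, 10, 5):
    compressed = PCA(n_components=k).fit_transform(brain)
    print(f"{k:>3} components: {rsa_score(compressed, llm):.3f}")
```

In this toy setup, mild compression can raise the alignment score by discarding voxel noise that the LLM embeddings do not share; the paper's claim concerns real fMRI data and an information bottleneck objective rather than this simplified stand-in.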
Track: Extended Abstract Track
Submission Number: 16