Keywords: disentangled representation learning, face perception, generative models, fMRI, encoding model, generalization
TL;DR: Disentangled deep generative models can be used to interpret face representations in the human brain.
Abstract: How does the human brain recognize faces and represent their many features? Despite decades of research, we still lack a thorough understanding of the computations carried out in face-selective regions of the human brain. Deep networks provide a good match to neural data but lack interpretability. Here we use a new class of deep generative models, disentangled representation learning models, which learn a latent space where each dimension “disentangles” a different interpretable dimension of faces, such as rotation, lighting, or hairstyle. We show that these disentangled networks are a good encoding model for human fMRI data. We further find that the latent dimensions in these models map onto non-overlapping regions in fMRI data, allowing us to "disentangle" different features such as 3D rotation, skin tone, and facial expression in the human brain. These methods provide an exciting alternative to standard “black box” deep learning methods, and have the potential to change the way we understand representations of visual processing in the human brain.
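The core analysis the abstract describes, predicting fMRI voxel responses from a model's disentangled latent dimensions and asking which dimension best explains each voxel, can be sketched as a simple ridge-regression encoding model. The sketch below uses entirely hypothetical, simulated data (random "latent codes" and voxel responses constructed so each voxel is driven by one latent), not the paper's actual model or fMRI dataset; it only illustrates the encoding-model logic.

```python
# Sketch of an encoding-model analysis: predict voxel responses from
# disentangled latent codes, then find each voxel's preferred latent.
# All data here are simulated; no real model or fMRI data are used.
import numpy as np

rng = np.random.default_rng(0)
n_images, n_latents, n_voxels = 200, 8, 50

# Hypothetical disentangled latent codes for a set of face images
# (dimensions could correspond to e.g. rotation, lighting, hairstyle).
latents = rng.standard_normal((n_images, n_latents))

# Simulated voxel responses: each voxel is driven mainly by one latent,
# mimicking the non-overlapping mapping the abstract reports.
preferred = rng.integers(0, n_latents, size=n_voxels)
weights = np.zeros((n_latents, n_voxels))
weights[preferred, np.arange(n_voxels)] = 1.0
voxels = latents @ weights + 0.1 * rng.standard_normal((n_images, n_voxels))

# Ridge-regression encoding model mapping latents -> voxel responses.
lam = 1.0
W_hat = np.linalg.solve(
    latents.T @ latents + lam * np.eye(n_latents),
    latents.T @ voxels,
)

# For each voxel, the latent with the largest absolute weight should
# recover the simulated preferred dimension.
recovered = np.abs(W_hat).argmax(axis=0)
print("fraction of voxels correctly mapped:", (recovered == preferred).mean())
```

In the paper's actual setting, `latents` would come from the disentangled generative model's inference network applied to the stimulus images, and `voxels` from the measured fMRI responses; the same fitted weights then let one localize each interpretable latent dimension in the brain.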