Keywords: face perception, multimodal fusion, computer vision, tensor factorization
Abstract: Machine-vision representations of faces can be aligned to people’s first impressions of others (e.g., perceived trustworthiness) to create highly predictive models of biases in social perception. Here, we use deep tensor fusion to create a unified model of first impressions that combines information from three channels: (1) visual information from pretrained machine-vision models, (2) linguistic information from pretrained language models, and (3) demographic information from self-reported demographic variables. We test the ability of the model to generalize to held-out faces, traits, and participants and measure its fidelity to a large dataset of people’s first impressions of others.
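To make the fusion step concrete, the following is a minimal sketch of one common formulation of tensor fusion: each channel's embedding is extended with a constant 1 and the three vectors are combined by an outer product, so the result contains unimodal, bimodal, and trimodal interaction terms. This is an illustration under stated assumptions, not the authors' implementation: the function name `tensor_fusion`, the embedding sizes, and the toy values are all hypothetical, and the abstract does not specify which fusion variant or downstream network is used.

```python
from itertools import product


def tensor_fusion(visual, linguistic, demographic):
    """Hypothetical helper: fuse three 1-D embeddings by a flattened outer product."""
    # Appending 1.0 to each vector keeps the single-channel and pairwise
    # terms in the product alongside the three-way interactions.
    v = list(visual) + [1.0]
    l = list(linguistic) + [1.0]
    d = list(demographic) + [1.0]
    # product() varies the last factor fastest, i.e. row-major flattening
    # of the (len(v), len(l), len(d)) interaction tensor.
    return [a * b * c for a, b, c in product(v, l, d)]


# Toy embeddings (assumed sizes; real pretrained features would be much larger).
face = [0.5, -1.0, 2.0, 0.25]   # visual channel, e.g. from a vision model
trait = [1.5, 0.0, -0.5]        # linguistic channel, e.g. a trait-word embedding
rater = [1.0, 0.0]              # demographic channel, e.g. encoded self-report

fused = tensor_fusion(face, trait, rater)
print(len(fused))  # (4 + 1) * (3 + 1) * (2 + 1) entries
```

In a deep variant, the fused vector would typically be passed to a small feed-forward network that predicts the rating; because the fused size grows multiplicatively with the channel sizes, the embeddings are usually projected to low dimensions first.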