Comparing Abstraction in Humans and Large Language Models Using Multimodal Serial Reproduction

Published: 02 Mar 2024, Last Modified: 02 Mar 2024
Venue: ICLR 2024 Workshop Re-Align (Poster)
License: CC BY 4.0
Track: long paper (up to 9 pages)
Keywords: serial reproduction, Bayesian inference, multimodal models, GPT-4, vision, language
TL;DR: We adapted a classic paradigm from cognitive psychology, serial reproduction, to compare visuo-linguistic representations in humans and large language models, and show significant differences in how the two build abstractions across modalities.
Abstract: Humans extract useful abstractions of the world from noisy sensory data. Serial reproduction allows us to study how people construe the world through a paradigm similar to the game of telephone, where one person observes a stimulus and reproduces it for the next, forming a chain of reproductions. Past serial reproduction experiments typically employ a single sensory modality, but humans often communicate abstractions of the world to each other through language. To investigate the effect of language on the formation of abstractions, we implement a novel multimodal serial reproduction framework by asking people who receive a visual stimulus to reproduce it in a linguistic format, and vice versa. We ran unimodal and multimodal chains with both humans and GPT-4 and found that adding language as a modality has a larger effect on human reproductions than on GPT-4's. This suggests that human visual and linguistic representations are more dissociable than those of GPT-4.
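To make the paradigm concrete, here is a minimal sketch of a multimodal serial reproduction chain in Python. This is not the authors' implementation: the function names (`describe_image`, `draw_from_text`, `run_chain`) and the stub behavior are hypothetical placeholders standing in for a human participant or a GPT-4 call at each step.

```python
from typing import Callable, List, Tuple

# Hypothetical reproduction operators: each maps a stimulus in one
# modality to a reproduction in the other modality. In the paper's
# setup these would be a human participant or a GPT-4 query
# (image -> caption, caption -> image); here they are stand-in stubs.
def describe_image(image: str) -> str:
    """Placeholder: produce a linguistic reproduction of a visual stimulus."""
    return f"a description of {image}"

def draw_from_text(text: str) -> str:
    """Placeholder: produce a visual reproduction of a linguistic stimulus."""
    return f"an image depicting '{text}'"

def run_chain(stimulus: str,
              steps: List[Callable[[str], str]]) -> List[Tuple[int, str]]:
    """Pass a stimulus down a chain of reproduction steps,
    recording the reproduction produced at each generation."""
    history = [(0, stimulus)]
    for generation, step in enumerate(steps, start=1):
        stimulus = step(stimulus)
        history.append((generation, stimulus))
    return history

# A multimodal chain alternates modalities (image -> text -> image -> ...);
# a unimodal chain would instead repeat a single operator.
chain = [describe_image, draw_from_text] * 3
for generation, reproduction in run_chain("initial_stimulus.png", chain):
    print(generation, reproduction)
```

Under this framing, comparing how quickly reproductions drift in unimodal versus alternating chains is what lets the authors contrast how humans and GPT-4 form abstractions across modalities.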
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 38