Redundancy in Perceptual and Linguistic Experience: Comparing Feature-Based and Distributional Models of Semantic Representation
Abstract: Since their inception, distributional models of semantics have been criticized as inadequate
cognitive theories of human semantic learning and representation. A principal challenge is that the
representations derived by distributional models are purely symbolic and are not grounded in perception
and action; this challenge has led many to favor feature-based models of semantic representation.
We argue that the amount of perceptual and other semantic information that can be learned from
purely distributional statistics has been underappreciated. We compare the representations of three
feature-based and nine distributional models using a semantic clustering task. Several distributional
models demonstrated semantic clustering comparable with clustering based on feature-based
representations. Furthermore, when trained on child-directed speech, the same distributional models
performed as well as sensorimotor-based feature representations of children’s lexical semantic
knowledge. These results suggest that, to a large extent, information relevant for extracting semantic
categories is redundantly coded in perceptual and linguistic experience. Detailed analyses of the
semantic clusters of the feature-based and distributional models also reveal that the models make use
of complementary cues to semantic organization from the two data streams. Rather than conceptualizing
feature-based and distributional models as competing theories, we argue that future focus
should be on understanding the cognitive mechanisms humans use to integrate the two sources.