Keywords: acquisition of color words, multimodal language model
TL;DR: We evaluated a multimodal language model trained on a developmentally plausible corpus to align utterances with visual input, and found that the word-world correspondence is an important factor underlying model performance.
Abstract: We evaluated a multimodal language model trained on a developmentally plausible corpus to align utterances with visual input. Our results show considerable variability in model accuracy across color terms. We also found that the word-world correspondence is an important factor underlying model performance.
Submission Number: 52
Loading