Keywords: Semantic Representation, Human Judgment, Crowdsourcing, Distributional Semantics, Neural Network, Adjective, Noun
Abstract: Human semantic representations are both difficult to capture and hard to fully interpret. Similarity judgments of words are highly sensitive to context, and association norms may only capture coarse similarity. By contrast, feature norms are more interpretable, and the number of norms can be scaled without limit, but they often only exist for sets of nouns described with concrete norms. In this paper, we introduce a new large dataset of nouns normed by a set of continuous adjective ratings both concrete and abstract. We compare our dataset to other forms of representation and find that they capture rich, unique structure, which can be represented by a low-dimensional latent semantic space. We further make relationships between our data and neural network representations from different modalities. Our dataset contributes to an increasingly detailed picture of one relatively sizable swath of human semantic representations, and can be used in a variety of modeling paradigms.