How Does an Adjective Sound Like? Improving Audio Phrase Composition with Text Embeddings

Anonymous

16 Dec 2023, ACL ARR 2023 December Blind Submission, Readers: Everyone

Abstract: We learn matrix representations for the most frequent sound-relevant adjectives of English and compose them with vector representations of their nouns. The matrices are learnt jointly from audio and textual data, via linear regression (LR) and tensor skipgram (TSG). Their quality is assessed on a novel adjective-noun phrase similarity dataset, applied to two tasks: semantic similarity and audio similarity. Joint learning via TSG outperforms audio-only models, and matrix composition outperforms both addition and non-compositional phrase vectors.
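
As a rough illustration of the matrix composition idea in the abstract, the sketch below fits one adjective matrix by regularised least squares so that the matrix applied to a noun vector approximates the corresponding phrase vector. It is only a toy under stated assumptions: the vectors are synthetic, the ridge term, the dimensions, and the names `learn_adjective_matrix` and `cosine` are hypothetical, and the submission's actual LR and TSG training setups are not described on this page.

```python
import numpy as np


def learn_adjective_matrix(nouns, phrases, ridge=1e-3):
    """Fit A so that A @ noun approximates the phrase vector for each pair.

    nouns, phrases: arrays of shape (n_pairs, dim).
    The ridge term is an assumption added here for numerical stability.
    """
    dim = nouns.shape[1]
    gram = nouns.T @ nouns + ridge * np.eye(dim)
    # Solve (N^T N + ridge * I) X = N^T P; the adjective matrix is X transposed.
    x = np.linalg.solve(gram, nouns.T @ phrases)
    return x.T


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Synthetic toy data: a hidden matrix plays the role of one adjective.
rng = np.random.default_rng(0)
dim, n_pairs = 8, 200
true_A = rng.normal(size=(dim, dim))
nouns = rng.normal(size=(n_pairs, dim))
phrases = nouns @ true_A.T + 0.01 * rng.normal(size=(n_pairs, dim))

A = learn_adjective_matrix(nouns, phrases)

# Compose the learnt adjective with an unseen noun and compare
# against the phrase vector the hidden matrix would produce.
test_noun = rng.normal(size=dim)
composed = A @ test_noun
target = true_A @ test_noun
sim = cosine(composed, target)
print(f"cosine(composed, target) = {sim:.4f}")
```

An additive baseline would instead form the phrase as the sum of an adjective vector and the noun vector, which cannot express how an adjective transforms different nouns differently.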

Paper Type: short

Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond

Contribution Types: NLP engineering experiment, Reproduction study, Data resources

Languages Studied: English
