Embedding Multimodal Relational Data

13 Dec 2017 (modified: 25 Jan 2018), ICLR 2018 Conference Withdrawn Submission
Abstract: Representing entities and relations in an embedding space is a well-studied approach to machine learning on relational data. Existing approaches, however, primarily focus on the simple link structure between a finite set of entities, ignoring the variety of data types often found in relational databases, such as text, images, and numerical values. We propose multimodal embeddings that use different neural encoders for this variety of data and combine them with existing relational models to learn embeddings of the entities. We extend existing datasets to create two novel benchmarks, YAGO-10-plus and MovieLens-100k-plus, which contain additional relations such as textual descriptions and images of the original entities. We demonstrate that our model uses this additional information effectively, yielding further gains in accuracy. Moreover, we test the learned multimodal embeddings by using them to predict missing multimodal attributes.
TL;DR: Extending relational modeling to support multimodal data using neural encoders.
Keywords: multimodal, knowledge base, relational modeling, embedding, link prediction, neural network encoders
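As a rough, hypothetical sketch of the idea described in the abstract (not the paper's actual architecture), the snippet below pairs modality-specific neural encoders for text, numeric, and image attributes with a DistMult-style trilinear scorer over entity and relation embeddings. All module choices, dimensions, and names here are illustrative assumptions.

import torch
import torch.nn as nn

# Hypothetical sketch: multimodal link prediction. Attribute values (text,
# images, numbers) are mapped into the entity embedding space by
# modality-specific encoders and scored with a DistMult-style trilinear form.
class MultimodalKGE(nn.Module):
    def __init__(self, num_entities, num_relations, dim, vocab_size, img_feat_dim=2048):
        super().__init__()
        self.entity_emb = nn.Embedding(num_entities, dim)
        self.relation_emb = nn.Embedding(num_relations, dim)
        # Modality-specific encoders (assumptions, not the paper's exact choices):
        self.text_encoder = nn.EmbeddingBag(vocab_size, dim)  # averaged word embeddings
        self.numeric_encoder = nn.Sequential(nn.Linear(1, dim), nn.Tanh())
        self.image_encoder = nn.Sequential(nn.Linear(img_feat_dim, dim), nn.ReLU())  # precomputed CNN features

    def encode_object(self, modality, value):
        # Route the object of a triple to the encoder for its data type.
        if modality == "entity":
            return self.entity_emb(value)
        if modality == "text":
            return self.text_encoder(value)
        if modality == "numeric":
            return self.numeric_encoder(value)
        if modality == "image":
            return self.image_encoder(value)
        raise ValueError(f"unknown modality: {modality}")

    def score(self, subj, rel, obj_modality, obj_value):
        # DistMult-style score <e_s, w_r, e_o>: higher means the triple is more plausible.
        s = self.entity_emb(subj)
        r = self.relation_emb(rel)
        o = self.encode_object(obj_modality, obj_value)
        return (s * r * o).sum(dim=-1)

# Toy usage: score a (subject, relation, textual description) triple.
model = MultimodalKGE(num_entities=1000, num_relations=20, dim=64, vocab_size=5000)
subj = torch.tensor([3])
rel = torch.tensor([7])
tokens = torch.tensor([[11, 42, 99]])  # toy token ids standing in for a description
print(model.score(subj, rel, "text", tokens))

Because every modality is encoded into the same embedding space, the same scoring function can rank candidate objects regardless of type, which is also what allows missing multimodal attributes to be predicted.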