Feat2Vec: Dense Vector Representation for Data with Features


Nov 03, 2017 (modified: Nov 03, 2017) ICLR 2018 Conference Blind Submission readers: everyone
  • Abstract: Methods that calculate dense vector representations for features in unstructured data, such as words in a document, have proven very successful for knowledge representation. Surprisingly, very little work has focused on structured datasets with more than one type of feature, that is, datasets that have arbitrary features beyond words. We study how to estimate dense representations for multiple feature types within a dataset, where each feature type exists in a different higher-dimensional space. Feat2Vec is a novel method that calculates embeddings for data with multiple feature types while enforcing that all feature types exist in a common space. We demonstrate our work on two datasets, and our experiments suggest that Feat2Vec significantly outperforms existing algorithms that do not leverage the structure of the data.
  • TL;DR: Learn dense vector representations of arbitrary types of features in unlabeled datasets
  • Keywords: unsupervised learning, knowledge representation, deep learning