Cat2Vec: Learning Distributed Representation of Multi-field Categorical Data
Ying Wen, Jun Wang, Tianyao Chen, Weinan Zhang
Nov 04, 2016 (modified: Nov 04, 2016); ICLR 2017 conference submission; readers: everyone
Abstract: This paper presents a method for learning distributed representations of multi-field categorical data, a common data format in applications such as recommender systems, social link prediction, and computational advertising. The success of non-linear models, e.g., factorisation machines and boosted trees, has demonstrated the potential of exploring the interactions among inter-field categories. Inspired by Word2Vec, the distributed representation for natural language, we propose the Cat2Vec (categories to vectors) model. In Cat2Vec, a low-dimensional continuous vector is automatically learned for each category in each field. The interactions among inter-field categories are further explored by different neural gates, and the most informative ones are selected by pooling layers. In our experiments, by exploring the interactions between pairwise categories over layers, the model achieves a substantial improvement over state-of-the-art models on a supervised learning task, e.g., click prediction, while capturing the most significant interactions in the data.
TL;DR: An unsupervised pairwise interaction model for learning distributed representations of multi-field categorical data
Keywords: Unsupervised Learning, Deep learning, Applications
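
The pipeline outlined in the abstract (one embedding per category per field, a gate applied to pairs of field embeddings, then pooling to keep the most informative interactions) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the element-wise product gate, the L2-norm score used for pooling, the randomly initialised (untrained) embedding tables, and all names (`embed`, `pairwise_interactions`, `k_max_pool`, `field_sizes`) are assumptions made purely for illustration.

```python
import itertools
import numpy as np

def embed(sample, tables):
    """Look up one embedding vector per field for a single sample."""
    return np.stack([tables[f][c] for f, c in enumerate(sample)])

def pairwise_interactions(vectors):
    """Apply an element-wise product gate (an assumed gate) to every field pair."""
    pairs = list(itertools.combinations(range(len(vectors)), 2))
    return pairs, np.stack([vectors[i] * vectors[j] for i, j in pairs])

def k_max_pool(pairs, interactions, k):
    """Keep the k interactions with the largest L2 norm (an assumed score)."""
    scores = np.linalg.norm(interactions, axis=1)
    top = np.argsort(-scores)[:k]
    return [pairs[t] for t in top], interactions[top]

rng = np.random.default_rng(0)
field_sizes = [5, 3, 4]  # hypothetical number of categories in each field
dim = 8                  # hypothetical embedding dimension
# Untrained, randomly initialised embedding tables, one per field.
tables = [rng.normal(size=(n, dim)) for n in field_sizes]

sample = [2, 0, 3]       # one category index per field
vectors = embed(sample, tables)
pairs, inter = pairwise_interactions(vectors)
kept_pairs, kept = k_max_pool(pairs, inter, k=2)
```

Here `kept` could be fed to a further interaction layer or to a classifier; in a trained model the embedding tables would be learned rather than sampled at random.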