- TL;DR: The SEM approach to multi-label learning models labels using multinomial distributions parametrized by nonlinear functions of the instance features, is scalable and outperforms current state-of-the-art algorithms
- Abstract: Multi-label learning aims to automatically assign to an instance (e.g., an image or a document) the most relevant subset of labels from a large set of possible labels. The main challenge is to maintain accurate predictions while scaling efficiently on data sets with extremely large label sets and many training data points. We propose a simple but effective neural net approach, the Semantic Embedding Model (SEM), that models the labels for an instance as draws from a multinomial distribution parametrized by nonlinear functions of the instance features. A Gauss-Siedel mini-batch adaptive gradient descent algorithm is used to fit the model. To handle extremely large label sets, we propose and experimentally validate the efficacy of fitting randomly chosen marginal label distributions. Experimental results on eight real-world data sets show that SEM garners significant performance gains over existing methods. In particular, we compare SEM to four recent state-of-the-art algorithms (NNML, BMLPL, REmbed, and SLEEC) and find that SEM uniformly outperforms these algorithms in several widely used evaluation metrics while requiring significantly less training time.
- Keywords: Supervised Learning
- Conflicts: berkeley.edu, bjtu.edu.cn