Abstract: Understanding the content of user's image posts is a particularly interesting problem in social networks and web settings. Current machine learning techniques focus mostly on curated training sets of image-label pairs, and perform image classification given the pixels within the image. In this work we instead leverage the wealth of information available from users: firstly, we employ user hashtags to capture the description of image content; and secondly, we make use of valuable contextual information about the user. We show how user metadata (age, gender, etc.) combined with image features derived from a convolutional neural network can be used to perform hashtag prediction. We explore two ways of combining these heterogeneous features into a learning framework: (i) simple concatenation; and (ii) a 3-way multiplicative gating, where the image model is conditioned on the user metadata. We apply these models to a large dataset of de-identified Facebook posts and demonstrate that modeling the user can significantly improve the tag prediction quality over current state-of-the-art methods.
0 Replies
Loading