Abstract: This paper aims to predict the playback volume of online movies on a Video On Demand (VOD) platform using movie characteristics together with related social media data collected from the Internet. An ordinal support vector machine classification approach is employed to distinguish the playback volume levels of movies. The study collected data on 1,266 online movies from 2013 to 2015 and divided them into three classes (high, medium, and low) according to view counts. For each movie, playback data for its first two months are collected from the VOD website. Country, movie type, director, actor, and box office are collected from a professional movie database. The total numbers of news items and tweets about each movie are crawled from Baidu and Sina Weibo to reflect the publicity level of the movie and the attention of the audience. After feature selection and feature creation, the ordinal support vector machine model is used to predict movie view counts, with a focus on identifying movies with low and high play counts. The empirical results show that our ordinal support vector machine approach achieves better out-of-sample prediction accuracy than standard support vector machine models and other classification models.
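To make the three-level labeling concrete, the following is a minimal sketch of how view counts might be binned into ordered classes and then encoded for an ordinal classifier. It is not the paper's procedure: the tercile cutoffs, the sample view counts, and the helper names (`tercile_cutoffs`, `to_level`, `ordinal_targets`) are all assumptions introduced for illustration, and the "is the level greater than k?" encoding is one common reduction of ordinal classification to binary problems rather than the specific ordinal support vector machine formulation used here.

```python
from typing import List, Tuple

LEVELS = ["low", "medium", "high"]  # the three ordered classes named in the abstract


def tercile_cutoffs(view_counts: List[int]) -> Tuple[int, int]:
    """Return two cutoffs that split the sorted counts into three groups.

    Assumption: the abstract does not state the thresholds; terciles are
    used here purely for illustration.
    """
    ordered = sorted(view_counts)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def to_level(count: int, cutoffs: Tuple[int, int]) -> int:
    """Map a view count to 0 (low), 1 (medium), or 2 (high)."""
    low_cut, high_cut = cutoffs
    if count < low_cut:
        return 0
    if count < high_cut:
        return 1
    return 2


def ordinal_targets(level: int, n_levels: int = 3) -> List[int]:
    """Encode an ordered label as n_levels - 1 binary answers to 'level > k?'."""
    return [1 if level > k else 0 for k in range(n_levels - 1)]


# Made-up view counts, for illustration only (not data from the study).
counts = [120, 5400, 88000, 950, 23000, 310000]
cutoffs = tercile_cutoffs(counts)
levels = [to_level(c, cutoffs) for c in counts]
targets = [ordinal_targets(lv) for lv in levels]

for c, lv, t in zip(counts, levels, targets):
    print(c, LEVELS[lv], t)
```

Unlike a plain one-vs-rest encoding, this encoding preserves the ordering of the classes: a "high" movie answers yes to both binary questions, a "medium" movie to only the first, and a "low" movie to neither.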