Abstract: With the wide availability of historical data from baseball, one of the most popular sports, highly accurate winner prediction has become a significant target of statistical analysis and machine learning. However, existing pre-game prediction techniques yield poor accuracy because starting lineups give incomplete player lists and substitutions occur during the game. We exploit the ability of Long Short-Term Memory (LSTM) networks to identify hidden patterns in time-series data and propose an inter-dependent LSTM model that predicts baseball game results from starting lineup information alone. Specifically, we preprocess historical data to generate a pair of pre-game and post-game records for each game. The pre-game record contains the incomplete player lists given in the starting lineups, and the post-game record contains the list of all players who participated in the game. The inter-dependent LSTM model exploits the dependencies between the two records of each pair to predict a game result from pre-game input only. Our experimental results show that the proposed model achieves up to 12% higher accuracy than existing techniques.
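The pairing of pre-game and post-game records described above can be illustrated with a minimal sketch. The abstract does not specify how records are encoded, so the multi-hot encoding, the function names `multi_hot` and `make_record_pair`, and the toy roster below are all assumptions made for illustration, not the authors' preprocessing.

```python
from typing import Dict, List, Tuple


def multi_hot(players: List[str], index: Dict[str, int]) -> List[int]:
    """Encode a list of player IDs as a fixed-length 0/1 vector (assumed encoding)."""
    vec = [0] * len(index)
    for player in players:
        vec[index[player]] = 1
    return vec


def make_record_pair(
    starters: List[str],
    substitutes: List[str],
    index: Dict[str, int],
) -> Tuple[List[int], List[int]]:
    """Build one (pre-game, post-game) record pair for a single team in a single game.

    pre-game:  only the players announced in the starting lineup
    post-game: every player who actually took part (starters plus substitutes)
    """
    pre_game = multi_hot(starters, index)
    post_game = multi_hot(starters + substitutes, index)
    return pre_game, post_game


# Hypothetical four-player roster, used only to make the example concrete.
roster = ["p1", "p2", "p3", "p4"]
index = {player_id: i for i, player_id in enumerate(roster)}

pre, post = make_record_pair(starters=["p1", "p3"], substitutes=["p4"], index=index)
print(pre, post)
```

Under this assumed encoding, every player marked in the pre-game record is also marked in the post-game record, which is the kind of dependency between the two records that a model could learn to exploit when only the pre-game input is available.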