\section{Conclusion}
In this paper, we propose future gradient descent (FGD) that forecasts the gradient information of the future domain for training to address the issue of temporal domain shift in online recommendation systems. We show that FGD gives smaller temporal domain generalization in theory compared with a widely adopted algorithm, Batch Update. Empirical evidence is provided to show that FGD outperforms various representatives algorithms.
\clearpage