Abstract: Highlights
• MSCDP predicts crowd density heatmaps in future time steps by fusing video frame and density heatmap encodings.
• Long-term motion context memory alignment improves prediction accuracy by learning periodic movement patterns in optical flows and matching short-term observations to those patterns.
• MSCDP outperforms state-of-the-art techniques and variants in predicting crowd density heatmaps, as shown in evaluation on two real-world datasets.