Abstract: In this paper, we propose an end-to-end learning network to predict future frames in a point cloud sequence. As the main novelty, an initial layer learns topological information of point clouds as geometric features, which are used to form representative spatiotemporal neighborhoods. This module is followed by multiple Graph-RNN cells. Each cell learns point dynamics (i.e., RNN states) by processing each point jointly with its spatiotemporal neighbors. We tested the network's performance on an MNIST dataset of moving digits, a synthetic dataset of human body motions, and JPEG dynamic body datasets. Simulation results demonstrate that our method outperforms baselines that neglect geometric feature information.