Abstract: The scarcity of training data is a major factor limiting the performance of 3D human pose estimators. To make better use of existing data, we propose MGPose, a graph data augmentation framework. We treat an existing pose estimator as an autoencoder: human joints are randomly masked in the encoder and reconstructed in the decoder. Because of the random masking, the encoder can extract joint features only from the visible subset, and this subset differs on every pass, which helps improve the encoder's feature extraction ability. However, the same encoder produces offset features for masked data relative to the original data. To narrow this gap, we reconstruct the high-dimensional features of the joints in the decoder and introduce an equivariance constraint on the generation of the joint data. The parameter count of MGPose is comparable to that of the original model, so the framework remains simple and efficient. We conduct experiments on the widely used Human3.6M dataset and achieve state-of-the-art results.
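
The random joint masking described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 17-joint skeleton, the 2D input shape, and the mask ratio of 0.25 are all assumptions chosen for illustration, and the abstract does not state the values MGPose actually uses.

```python
import numpy as np


def random_joint_mask(num_joints, mask_ratio, rng):
    """Randomly split joint indices into (visible, masked) index arrays.

    Hypothetical helper: the abstract does not specify the mask ratio
    or how the masked joints are sampled.
    """
    num_masked = int(round(num_joints * mask_ratio))
    perm = rng.permutation(num_joints)
    masked = np.sort(perm[:num_masked])
    visible = np.sort(perm[num_masked:])
    return visible, masked


def split_joints(joint_features, mask_ratio, rng):
    """Keep only the visible joints, as an encoder input would see them."""
    num_joints = joint_features.shape[0]
    visible, masked = random_joint_mask(num_joints, mask_ratio, rng)
    return joint_features[visible], visible, masked


rng = np.random.default_rng(0)
# Assumed input: one pose with 17 joints, each with 2D coordinates.
pose_2d = rng.standard_normal((17, 2))
visible_feats, visible_idx, masked_idx = split_joints(pose_2d, 0.25, rng)
```

Calling `split_joints` again with the same generator draws a new split, so the encoder would see a different visible subset on each pass. A decoder would then be asked to reconstruct features for the joints listed in `masked_idx`.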