3DFaceMAE: Pre-training of Masked Autoencoder Using Patch-Based Random Masking Reconstruction and Super-resolution for 3D Face Recognition

Published: 02 Nov 2024 · Last Modified: 04 Mar 2025 · OpenReview Archive Direct Upload · CC BY 4.0
Abstract: Compared to 2D face recognition, 3D face recognition is more robust to variations such as pose and illumination. However, due to limited training data, the accuracy of existing 3D face recognition methods remains unsatisfactory. In this paper, we introduce 3DFaceMAE, the first masked autoencoder (MAE) based 3D face recognition method using point clouds. Specifically, we first synthesize a large-scale 3D facial point cloud dataset and combine it with small-scale real data. During the pre-training of 3DFaceMAE, we extract key facial regions from the input 3D facial point cloud using normal difference techniques, and reconstruct these key regions via patch-based random masking reconstruction and super-resolution. We finally fine-tune the encoder of 3DFaceMAE on real 3D face point cloud data. In the experiments, we evaluate 3DFaceMAE on three 3D face datasets; it achieves an accuracy of up to 91.17% on the Lock3DFace dataset, the first reported result surpassing 90%. In addition, the experimental results indicate that 3DFaceMAE has strong cross-quality generalization performance. We also validate the effectiveness of the different components of 3DFaceMAE through ablation studies.
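To illustrate the patch-based random masking step the abstract describes, the sketch below groups a point cloud into local patches and randomly masks a fraction of them, as in MAE-style pre-training. This is a minimal illustration, not the paper's implementation: the patch grouping here uses random center sampling plus k-nearest neighbours, and the `num_patches`, `patch_size`, and `mask_ratio` values are assumed defaults, not values taken from the paper.

```python
import numpy as np

def patch_random_mask(points, num_patches=64, patch_size=32, mask_ratio=0.75, seed=0):
    """Split a point cloud into local patches and randomly mask a fraction.

    points: (N, 3) array of 3D coordinates.
    Returns (visible_patches, masked_patches) with shapes
    (num_visible, patch_size, 3) and (num_masked, patch_size, 3).
    """
    rng = np.random.default_rng(seed)
    # sample patch centers (the paper's grouping, e.g. FPS, may differ)
    centers = points[rng.choice(len(points), num_patches, replace=False)]
    # the k nearest neighbours of each center form one patch
    d = np.linalg.norm(points[None, :, :] - centers[:, None, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :patch_size]   # (num_patches, patch_size)
    patches = points[idx]                         # (num_patches, patch_size, 3)
    # randomly split patches into masked and visible sets
    num_masked = int(mask_ratio * num_patches)
    perm = rng.permutation(num_patches)
    return patches[perm[num_masked:]], patches[perm[:num_masked]]

# toy usage: with 64 patches and a 0.75 mask ratio, 16 stay visible, 48 are masked
pts = np.random.default_rng(1).random((2048, 3))
vis, msk = patch_random_mask(pts)
print(vis.shape, msk.shape)  # (16, 32, 3) (48, 32, 3)
```

In MAE pre-training, only the visible patches would be fed to the encoder, while the decoder is trained to reconstruct the masked ones.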