Abstract: In recent years, convolutional neural networks have made significant progress on keypoint detection. However, many approaches must upsample small feature maps several times to produce the final output, and the information lost during this upsampling is a key factor limiting accuracy. This work presents a new network architecture designed to reduce the information loss that previous approaches incur during upsampling and feature fusion. We replace the standard convolutions in the backbone network with dilated convolutions of stride one, which keeps the feature map size constant and avoids repeated upsampling during prediction. We also explore feature fusion methods and propose a feature fusion block (AFB) that uses multi-scale resampling and an attention mechanism to improve accuracy and accelerate model convergence. Strong results are achieved on the MS COCO 2017 dataset and the Ali apparel keypoints dataset. The code will be released for further research.
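As a rough illustration of why stride-one dilated convolutions avoid repeated upsampling, the sketch below applies the standard convolution output-size formula to a stack of stride-two layers and to a stack of dilated stride-one layers. The input size (64), kernel size (3), and dilation rates (2, 4, 8) are assumptions chosen for the example and are not taken from the paper's actual backbone configuration.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


input_size = 64  # hypothetical feature map width

# Three ordinary 3x3 convolutions with stride 2 shrink the map each time,
# so the output would need to be upsampled again to match the input.
strided = input_size
for _ in range(3):
    strided = conv_output_size(strided, kernel=3, stride=2, padding=1)

# Three dilated 3x3 convolutions with stride 1 and padding equal to the
# dilation rate enlarge the receptive field but keep the size unchanged.
dilated = input_size
for rate in (2, 4, 8):  # hypothetical dilation rates
    dilated = conv_output_size(dilated, kernel=3, stride=1,
                               padding=rate, dilation=rate)

print("strided stack output size:", strided)
print("dilated stack output size:", dilated)
```

With these assumed settings, the strided stack reduces the map by a factor of eight, while the dilated stack preserves the input resolution, which is the property the abstract relies on.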