Abstract: Human pose estimation is a fundamental research topic in computer vision, and it has advanced considerably in recent years thanks to the development of convolutional neural networks. This paper introduces an efficient human pose estimator based on Mask R-CNN, a member of the R-CNN family. It uses MobileNetV3 as the backbone and replaces the vanilla convolutions with the proposed expanded depthwise separable convolutions to reduce the model size, FLOPs, and inference time. The model runs in real time at 25 FPS with acceptable scores.
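To illustrate why swapping vanilla convolutions for depthwise separable ones reduces model size and FLOPs, the following minimal sketch compares weight counts for the two layer types. It covers only the standard depthwise separable convolution (a per-channel k x k filter followed by a 1 x 1 pointwise convolution); the expanded variant proposed in the paper is not specified in the abstract and is not modelled here. The function names, the absence of bias terms, and the example layer sizes are assumptions chosen for illustration.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a vanilla k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def conv_macs(params: int, out_h: int, out_w: int) -> int:
    """Multiply-accumulates: every weight is applied once per output position."""
    return params * out_h * out_w


# Hypothetical layer sizes, not taken from the paper.
C_IN, C_OUT, K = 256, 256, 3
vanilla = standard_conv_params(C_IN, C_OUT, K)
separable = depthwise_separable_params(C_IN, C_OUT, K)
reduction = separable / vanilla  # equals 1 / C_OUT + 1 / K**2
```

For this hypothetical 3 x 3 layer with 256 input and 256 output channels, the separable form needs roughly 11.5% of the weights of the vanilla form, and the same ratio applies to multiply-accumulates at a fixed output resolution.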