Abstract: Highlights
• A Lightweight Cross-scale Feature Fusion Network for 2D human pose estimation from monocular images.
• The Dynamic Multi-Scale Convolution reduces model complexity and improves the model's ability to extract representations of variable poses.
• The Cross-Resolution-Aware Semantics Module reduces the semantic gap between different scales.
• The Adapt Feature Fusion Module can fine-tune the positions of key points and improve recognition accuracy.
• Our method achieves state-of-the-art performance on two challenging 2D human pose estimation benchmark datasets.