QueryPose: Sparse Multi-Person Pose Regression via Spatial-Aware Part-Level Query

Yabo Xiao; Kai Su; Xiaojuan Wang; Dongdong Yu; Lei Jin; Mingshu He; Zehuan Yuan

QueryPose: Sparse Multi-Person Pose Regression via Spatial-Aware Part-Level Query

Yabo Xiao, Kai Su, Xiaojuan Wang, Dongdong Yu, Lei Jin, Mingshu He, Zehuan Yuan

Published: 31 Oct 2022, Last Modified: 06 Apr 2025NeurIPS 2022 AcceptReaders: Everyone

Keywords: Sparse, End-to-end, Spatial-aware part-level query.

Abstract: We propose a sparse end-to-end multi-person pose regression framework, termed QueryPose, which can directly predict multi-person keypoint sequences from the input image. The existing end-to-end methods rely on dense representations to preserve the spatial detail and structure for precise keypoint localization. However, the dense paradigm introduces complex and redundant post-processes during inference. In our framework, each human instance is encoded by several learnable spatial-aware part-level queries associated with an instance-level query. First, we propose the Spatial Part Embedding Generation Module (SPEGM) that considers the local spatial attention mechanism to generate several spatial-sensitive part embeddings, which contain spatial details and structural information for enhancing the part-level queries. Second, we introduce the Selective Iteration Module (SIM) to adaptively update the sparse part-level queries via the generated spatial-sensitive part embeddings stage-by-stage. Based on the two proposed modules, the part-level queries are able to fully encode the spatial details and structural information for precise keypoint regression. With the bipartite matching, QueryPose avoids the hand-designed post-processes. Without bells and whistles, QueryPose surpasses the existing dense end-to-end methods with 73.6 AP on MS COCO mini-val set and 72.7 AP on CrowdPose test set. Code is available at https://github.com/buptxyb666/QueryPose.

TL;DR: We propose a sparse end-to-end multi-person pose regression framework with spatial-aware part-level queries. Our sparse solution outperforms all dense end-to-end methods.

Supplementary Material: pdf

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/querypose-sparse-multi-person-pose-regression/code)

17 Replies

Loading