Abstract: Student pose information reflects learning status and is therefore valuable for teaching management and evaluation. However, traditional manual behavior recognition and analysis is complex and slow, which makes processing large volumes of classroom data difficult. Using computer vision to recognize student behavior accurately is thus of great practical significance. Moreover, the performance of existing behavior recognition methods is limited by dense objects and occlusion in classroom scenes. To address these issues, we propose an anchor-free object detector based on center keypoints for behavior recognition in classroom scenes. Specifically, we design a multiscale convolutional neural network (CNN) module that alleviates the scale variation of objects through multiscale receptive fields. The head network follows the anchor-free detection head architecture and integrates keypoint heatmaps to accurately extract the center keypoint position through a center pooling module (CPM). The CPM suppresses irrelevant background information and effectively alleviates missed detections of dense objects. Regressing the distances from the center keypoint of the positive region to the four sides of the bounding box suppresses low-quality bounding boxes, so that the detector can locate objects accurately. Furthermore, during the inference stage, the heatmap scores of keypoints and center keypoints are linearly combined into a novel confidence score to mitigate inaccurate measurement of predicted bounding boxes. Extensive experimental comparisons on the classroom behavior (CB) and SCB-dataset3 benchmark datasets demonstrate that the proposed method accurately detects objects in classroom scenes and achieves good performance compared with other current state-of-the-art methods.
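
The box regression and confidence fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the argument order (left, top, right, bottom), and the weight `alpha` are hypothetical, and the convex form of the linear combination is an assumption, since the abstract states only that the two heatmap scores are linearly combined.

```python
from typing import Tuple


def decode_box(cx: float, cy: float,
               left: float, top: float,
               right: float, bottom: float) -> Tuple[float, float, float, float]:
    """Convert a center keypoint and its distances to the four sides
    into corner coordinates (x1, y1, x2, y2)."""
    return (cx - left, cy - top, cx + right, cy + bottom)


def combined_confidence(keypoint_score: float,
                        center_score: float,
                        alpha: float = 0.5) -> float:
    """Linearly combine two heatmap scores into one confidence value.

    alpha is a hypothetical mixing weight; the convex form
    alpha * a + (1 - alpha) * b is an assumption of this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * keypoint_score + (1.0 - alpha) * center_score


# Example: a center keypoint at (50, 40) with side distances 10, 5, 20, 15.
box = decode_box(50.0, 40.0, 10.0, 5.0, 20.0, 15.0)
score = combined_confidence(0.8, 0.6, alpha=0.5)
```

With a convex weight, the fused score stays within the range of the two inputs, so a box with a strong keypoint response but a weak center response is ranked lower than one where both responses agree.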