AerialGait: Bridging Aerial and Ground Views for Gait Recognition

Published: 20 Jul 2024, Last Modified: 21 Jul 2024, MM2024 Poster, CC BY 4.0
Abstract: In this work, we present AerialGait, a comprehensive dataset for aerial-ground gait recognition. This dataset comprises 82,454 sequences totaling over 10 million frames from 533 subjects, captured from both aerial and ground perspectives. To align with real-life scenarios of aerial and ground surveillance, we utilize a drone and a ground surveillance camera for data acquisition. The drone is operated at various speeds, directions, and altitudes. We collect data across five diverse surveillance sites to ensure a comprehensive simulation of real-world settings. AerialGait has several unique features: 1) The gait sequences exhibit significant variations in views, resolutions, and illumination across five distinct scenes. 2) It incorporates the challenges of motion blur and frame discontinuity caused by drone mobility. 3) The dataset reflects the domain gap caused by the disparity between aerial and ground views, presenting a realistic challenge for drone-based gait recognition. Moreover, we perform a comprehensive analysis of existing gait recognition methods on the AerialGait dataset and propose the Aerial-Ground Gait Network (AGG-Net). AGG-Net learns discriminative features from aerial views through uncertainty learning and clusters features across aerial and ground views through prototype learning. Our model achieves state-of-the-art performance on both the AerialGait and DroneGait datasets. The dataset and code will be made available upon acceptance.
Primary Subject Area: [Content] Media Interpretation
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: We construct AerialGait, a comprehensive dataset for aerial-to-ground gait recognition. It provides multimodal data, including silhouettes, skeletons, and human parsing results. On this dataset, we conduct extensive experiments with the silhouette and skeleton modalities. Moreover, we introduce a silhouette-based method named AGG-Net to bridge the gap between aerial and ground views, which demonstrates outstanding performance on our dataset. The dataset and code will be made available upon acceptance.
Supplementary Material: zip
Submission Number: 2154