Abstract: The goal of gait recognition is to learn the unique spatio-temporal patterns of the human body shape from its temporally changing characteristics. As different body parts behave differently during walking, it is intuitive to model the spatio-temporal patterns of each part separately. However, existing part-based methods equally divide the feature maps of each frame into fixed horizontal stripes to obtain local parts. Such stripe-based partitioning cannot accurately locate body parts. First, different body parts can appear in the same stripe (e.g., arms and the torso), and one part can appear in different stripes across frames (e.g., hands). Second, different body parts have different scales, and even the same part can appear at different locations and scales across frames. Third, different parts exhibit distinct movement patterns (e.g., the frame at which a movement starts, how frequently the position changes, and how long the movement lasts). To overcome these issues,
we propose novel 3D local operations as a generic family of building blocks for 3D gait recognition backbones. The proposed 3D local operations support the extraction of local 3D volumes of body parts from a sequence with adaptive spatial and temporal scales, locations, and lengths. In this way, the spatio-temporal patterns of the body parts are learned from 3D local neighborhoods at part-specific scales, locations, frequencies, and lengths. Experiments demonstrate that our 3D local convolutional neural networks achieve state-of-the-art performance on popular gait datasets. Code is available at: https://github.com/yellowtownhz/3DLocalCNN.
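To make the core idea concrete, the sketch below illustrates one plausible reading of a 3D local operation in PyTorch: predict an adaptive spatio-temporal window (center and scale along T, H, W) for a body part, softly crop that volume from the feature map, and pool it into a part descriptor. All names here (`Local3DBlock`, the sigmoid parameterization, the Gaussian soft mask) are illustrative assumptions, not the authors' implementation; see the linked repository for the real code.

```python
import torch
import torch.nn as nn


class Local3DBlock(nn.Module):
    """Hypothetical sketch of an adaptive 3D local operation."""

    def __init__(self, channels: int):
        super().__init__()
        # Predict 6 window parameters from globally pooled features:
        # (t, h, w) centers and (t, h, w) scales, all squashed to [0, 1].
        self.param_head = nn.Linear(channels, 6)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, T, H, W) feature volume for one silhouette sequence.
        n, c, t, h, w = x.shape
        pooled = x.mean(dim=(2, 3, 4))                    # (N, C)
        params = torch.sigmoid(self.param_head(pooled))   # (N, 6)
        centers, scales = params[:, :3], params[:, 3:]

        # Build a separable soft Gaussian mask over T, H, and W so the
        # "crop" stays differentiable w.r.t. the predicted window.
        masks = []
        for i, size in enumerate((t, h, w)):
            coords = torch.linspace(0.0, 1.0, size, device=x.device)
            mu = centers[:, i:i + 1]                      # (N, 1)
            sigma = 0.05 + 0.5 * scales[:, i:i + 1]       # keep sigma > 0
            masks.append(torch.exp(-((coords - mu) ** 2) / (2 * sigma ** 2)))

        mask = (masks[0][:, None, :, None, None]
                * masks[1][:, None, None, :, None]
                * masks[2][:, None, None, None, :])       # (N, 1, T, H, W)

        # Mask-weighted pooling over the selected local volume yields a
        # part descriptor with its own location, extent, and duration.
        feat = (x * mask).sum(dim=(2, 3, 4)) / mask.sum(dim=(2, 3, 4)).clamp_min(1e-6)
        return feat                                       # (N, C)


if __name__ == "__main__":
    block = Local3DBlock(channels=64)
    clip = torch.randn(2, 64, 30, 16, 11)  # e.g., a 30-frame feature volume
    print(block(clip).shape)               # torch.Size([2, 64])
```

In a full backbone, one such block per body part would let each part choose its own spatial and temporal window, which is the property the abstract attributes to the proposed 3D local operations.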