Linear-Argmax Layer: A Robust and Effective Keypoint Localization Layer With Heatmap Induction Characteristics
Abstract: Keypoint localization is a fundamental task in computer vision, commonly addressed via heatmap regression. The Soft-argmax Layer (SAM) offers an end-to-end alternative, but its internal softmax function makes it prone to training instability and poor robustness. In this paper, we analyze these issues mathematically and show that the polarization characteristic of softmax causes information loss during forward propagation and a biased gradient flow during backpropagation. To resolve these problems, we propose the Linear-argmax Layer (LAM), a novel softmax-free layer that maintains stable information flow through linear weighting, making it more robust and efficient than SAM. In experiments on face alignment datasets such as WFLW and LAPA, LAM consistently achieved a lower prediction failure rate than SAM. Most notably, our lightweight LAM model with a MobileNetV3-Small backbone achieved a 5.39% NME with only 0.08 BFLOPs, whereas a heatmap-based model recorded a 9.41% NME with over six times the computation. Our work therefore demonstrates that LAM is a strong alternative that overcomes the limitations of existing methods and enables effective keypoint localization, especially in resource-constrained environments.
External IDs: doi:10.1109/access.2025.3639770
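The contrast the abstract draws between softmax-based and linear weighting can be illustrated with a minimal NumPy sketch. The soft-argmax function follows the standard formulation (softmax over the score map, then the expected coordinate). The linear variant is only an assumption about what "linear weighting" might look like: it clips negative scores and divides by their sum. The abstract does not give the actual LAM definition, so the function names, the `beta` and `eps` parameters, and the clipping step are all hypothetical.

```python
import numpy as np


def _coordinate_grids(height, width):
    """Return per-pixel x and y coordinate arrays of shape (height, width)."""
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    return xs.astype(np.float64), ys.astype(np.float64)


def soft_argmax_2d(score_map, beta=1.0):
    """Standard soft-argmax: softmax over the map, then expected (x, y)."""
    height, width = score_map.shape
    xs, ys = _coordinate_grids(height, width)
    logits = beta * score_map
    logits = logits - logits.max()      # subtract max for numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()   # softmax normalization
    return float((weights * xs).sum()), float((weights * ys).sum())


def linear_weighted_argmax_2d(score_map, eps=1e-8):
    """Softmax-free weighting (ASSUMPTION, not the paper's exact LAM).

    Negative scores are clipped to zero and the remainder is normalized
    by its sum, so each weight is linear in the non-negative score.
    """
    height, width = score_map.shape
    xs, ys = _coordinate_grids(height, width)
    weights = np.clip(score_map, 0.0, None)
    weights = weights / (weights.sum() + eps)   # eps guards an all-zero map
    return float((weights * xs).sum()), float((weights * ys).sum())


if __name__ == "__main__":
    # Toy 4x6 score map with a single unit peak at x=3, y=1.
    score_map = np.zeros((4, 6))
    score_map[1, 3] = 1.0
    print("soft-argmax (beta=1): ", soft_argmax_2d(score_map, beta=1.0))
    print("soft-argmax (beta=50):", soft_argmax_2d(score_map, beta=50.0))
    print("linear weighting:     ", linear_weighted_argmax_2d(score_map))
```

On this toy map, soft-argmax with `beta=1` is pulled toward the grid center because every zero-score pixel still receives a nonzero softmax weight, and it only approaches the peak when `beta` is large, which is where the softmax output becomes sharply polarized. The linear variant gives zero-score pixels zero weight, so it lands on the peak without a temperature parameter. This illustrates the general idea only and makes no claim about the paper's reported results.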