Efficient Facial Landmark Detection via Prior Knowledge-Guided Agents

ICLR 2026 Conference Submission 16871 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: landmark search, prior knowledge, proximity-weighted contrastive learning

TL;DR: This paper introduces a highly efficient landmark detection algorithm utilizing prior knowledge of landmarks.

Abstract: We present a highly efficient, agent-based framework for facial landmark detection that prioritizes model compactness and computational efficiency over maximum accuracy. Unlike conventional approaches that rely on large, fully supervised models, our method assigns each agent to a specific landmark, enabling it to infer its position solely from local observations and prior knowledge without explicit location awareness or inter-agent communication. Prior knowledge is modeled in two embedding spaces—feature and coordinate—using class-conditional Gaussian distributions. Agents navigate by minimizing deviations from these priors via a lightweight policy network. To enhance representation learning, we introduce a proximity-weighted contrastive learning strategy that incorporates spatial proximity into the training objective. A multi-stage detection strategy further reduces redundant computation by detecting sub-landmarks relative to core landmarks. While our method produces slightly higher normalized mean error than state-of-the-art (SoTA) methods, it achieves over $16\times$ and $41\times$ improvements in space and time complexities, respectively, compared to the SoTA lightweight model, running at $4.19$ and $1.29$ frames per second on an i5 CPU (2.5 GHz) for the COFW and 300W datasets, respectively.
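The idea of navigating by minimizing deviation from a class-conditional Gaussian prior can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states that agents use a lightweight policy network, whereas this sketch substitutes a plain greedy neighbourhood search. The diagonal covariance, the function names (`fit_diag_gaussian`, `deviation`, `locate`), and the toy `embed` function are all assumptions made for illustration.

```python
def fit_diag_gaussian(samples):
    """Fit a per-dimension mean and variance (diagonal Gaussian) to embeddings.

    Assumption: one such prior is fitted per landmark class.
    """
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    # A small constant keeps the variance strictly positive.
    var = [sum((s[d] - mean[d]) ** 2 for s in samples) / n + 1e-6
           for d in range(dim)]
    return mean, var


def deviation(x, prior):
    """Squared Mahalanobis distance of x from a diagonal Gaussian prior."""
    mean, var = prior
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))


def locate(start, embed, prior, max_steps=50):
    """Greedy stand-in for the policy network (hypothetical).

    At each step the agent looks at the embeddings of its 8 neighbours and
    its current position, then moves to whichever deviates least from the
    prior. It stops when staying put is the best option.
    """
    pos = start
    for _ in range(max_steps):
        candidates = [(pos[0] + dx, pos[1] + dy)
                      for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
        best = min(candidates, key=lambda p: deviation(embed(p), prior))
        if best == pos:
            break
        pos = best
    return pos


# Toy setup: the embedding of a patch is its offset from a landmark at (5, 3),
# and the prior is fitted to embeddings clustered around zero offset.
prior = fit_diag_gaussian([(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0)])
embed = lambda p: (p[0] - 5.0, p[1] - 3.0)
found = locate((0, 0), embed, prior)
print(found)
```

In this toy setting the agent never reads the landmark's coordinates directly; it only compares local embeddings against the prior, which mirrors the abstract's description of inference from local observations and prior knowledge.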

Supplementary Material: zip

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Submission Number: 16871
