Keywords: Animal Communication, Behavioral Analysis, Quadruped Pose Prediction, Animal-Robot Interaction, Computer Vision
TL;DR: This paper introduces QuadForecaster, the first diffusion-based model designed to predict the future body poses of four-legged animals, enabling automated systems to decode animal communication and facilitate animal-robot interaction.
Abstract: Animal communication relies on subtle temporal patterns in movement that current pose estimation systems cannot anticipate, which limits their utility. Existing frameworks excel at detecting present configurations but cannot predict future poses, forcing interaction systems to remain reactive rather than proactive. We introduce QuadForecaster, the first diffusion-based model specifically designed for quadrupedal pose prediction, enabling automated systems to decode animal communication through movement forecasting. Our temporally cascaded diffusion architecture treats pose prediction as structured denoising, iteratively refining uncertain future poses while providing the uncertainty quantification needed for safe deployment. Evaluated on the cheetah and cow datasets, QuadForecaster achieves a mean per-joint position error (MPJPE) of 0.116 m for cheetah behaviors and 0.86 m for complex cow social interactions, capturing rapid behavioral transitions and multi-modal dynamics. QuadForecaster paves the way for robust animal motion and communication analysis, enabling proactive cross-species interaction across conservation, agriculture, and research applications.
Submission Number: 33