Animal-JEPA: Advancing Animal Behavior Studies Through Joint Embedding Predictive Architecture in Video Analysis

Published: 01 Jan 2024, Last Modified: 02 Aug 2025. Venue: IEEE Big Data 2024. License: CC BY-SA 4.0
Abstract: Analyzing animal behavior from video data is crucial for understanding brain function, assessing pharmacological interventions, and examining genetic modifications. Traditional methods often struggle to accurately analyze group behaviors in complex environments. To address these challenges, we introduce the Animal Joint Embedding Predictive Architecture (Animal-JEPA), a novel self-supervised learning model for studying animal behavior from video data. Animal-JEPA combines a dynamic scaling mechanism with an elliptical masking strategy to enhance feature extraction and behavioral analysis without labeled data. Our approach significantly outperforms existing models, including Separable 3D ConvNet (S3D) [3], Video Vision Transformer (ViViT) [4], and the original V-JEPA [6], particularly in multi-category and multi-objective classification tasks on our newly developed Mice-Behavior3 (MB3) dataset. The results highlight Animal-JEPA’s potential to improve the accuracy and adaptability of behavioral analysis in animal research, providing a powerful tool for neuroscientists and researchers.