Abstract: Neural networks trained on large-scale data have achieved remarkable success across a wide range of computer vision tasks. However, less attention has been paid to monitoring camouflaged animals, the masters of hiding in their surroundings. Robust and precise segmentation of camouflaged animals is challenging even for domain experts because the animals closely resemble their environment. Although several efforts have been made in camouflaged animal image segmentation, to the best of our knowledge, limited work exists on camouflaged animal video understanding (CAVU). Biologists often prefer videos for monitoring and understanding animal behavior, as videos provide redundant information and temporal consistency. However, the scarcity of labeled video data significantly hinders progress in this area. To address these challenges, we present CamoVid60K, a diverse, large-scale, and accurately annotated video dataset of camouflaged animals. The dataset comprises 218 videos with 62,774 finely annotated frames covering 70 animal categories, surpassing all previous datasets in the number of videos, frames, and species included. CamoVid60K also supports a more diverse set of downstream computer vision tasks, including camouflaged animal classification, detection, and task-specific segmentation (semantic, referring, and motion). We benchmark several state-of-the-art algorithms on CamoVid60K, and the experimental results provide valuable insights for future research directions. Our dataset serves as a novel and challenging benchmark to stimulate the development of more powerful camouflaged animal video segmentation algorithms, with substantial room for further improvement.
External IDs: doi:10.1007/s11263-026-02765-8