SeaTurtleID2022: A long-span dataset for reliable sea turtle re-identification

Published: 01 Jan 2024, Last Modified: 04 Nov 2024, WACV 2024, CC BY-SA 4.0
Abstract: This paper introduces SeaTurtleID2022, the first public large-scale, long-span dataset of sea turtle photographs captured in the wild. The dataset contains 8729 photographs of 438 unique individuals collected over 13 years, giving it the longest time span of any dataset for animal re-identification. Each photograph includes various annotations, e.g., identity, encounter timestamp, and body-part segmentation masks. Instead of a standard "random" split, the dataset allows for two realistic and ecologically motivated splits: (i) time-aware: a closed-set split with training, validation, and test data from different days/years, and (ii) open-set: a split with new, unknown individuals in the test and validation sets. We show that time-aware splits are essential for benchmarking re-identification methods, as random splits lead to performance overestimation. Furthermore, baseline instance segmentation and re-identification performance over various body parts is provided. Finally, an end-to-end system for sea turtle re-identification is proposed and evaluated. The proposed system, based on Hybrid Task Cascade for head instance segmentation and an ArcFace-trained feature extractor, achieved an accuracy of 86.8%.
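To make the two split types concrete, the following is a minimal Python sketch of how a time-aware, closed-set partition differs from an open-set one. It is an illustration under stated assumptions, not the authors' actual splitting procedure: the function name `time_aware_split`, the `(identity, encounter_date)` record layout, the single cutoff date, and the toy records are all hypothetical.

```python
from datetime import date


def time_aware_split(records, cutoff):
    """Partition (identity, encounter_date) records around a cutoff date.

    Hypothetical helper: the real dataset may define its splits differently.
    Returns (train, closed_test, unseen):
      train       - photos taken before the cutoff
      closed_test - later photos of individuals that also appear in train
      unseen      - later photos of individuals never seen in train (open-set)
    """
    train = [r for r in records if r[1] < cutoff]
    later = [r for r in records if r[1] >= cutoff]
    known = {identity for identity, _ in train}
    closed_test = [r for r in later if r[0] in known]
    unseen = [r for r in later if r[0] not in known]
    return train, closed_test, unseen


# Toy records, invented for illustration only.
records = [
    ("t001", date(2012, 7, 1)),
    ("t001", date(2019, 6, 3)),
    ("t002", date(2014, 8, 9)),
    ("t002", date(2014, 8, 9)),
    ("t003", date(2021, 7, 2)),
]

train, closed_test, unseen = time_aware_split(records, date(2018, 1, 1))
```

In this sketch no test photo shares an encounter day with a training photo, which is the property a random split cannot guarantee: near-duplicate photos from the same encounter can land on both sides and inflate accuracy.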