Keywords: Insects Dataset, Classification, Detection, and Segmentation, entomological research
Abstract: Visual understanding of insects from historical collections is crucial for insect biodiversity research, ecological sustainability, and agricultural management. However, most existing datasets focus on semantic labels and lack the spatial annotations that are essential for real-world applications and morphological analysis. To address these limitations, we introduce \textbf{LabInsect-48K}, the first comprehensive dataset of high-resolution entomological specimen images sourced from museum archives. LabInsect-48K contains 48,400 images spanning 643 species across 4 major insect orders. It delivers comprehensive and precise annotations of insects in both semantic and spatial dimensions, with the aim of advancing research in the biodiversity community. Specifically, on the semantic side, the dataset provides hierarchical taxonomic labels that support categorization at different levels. More importantly, the dataset also provides fine-grained spatial annotations to support quantitative analysis of morphology, \eg, insect shape, size, and structural traits. Our dataset supports three core computer vision tasks: species-level classification, object detection, and instance segmentation. We benchmark a wide range of state-of-the-art models across these tasks and demonstrate that high-resolution imaging, coupled with fine-grained annotations, enables ecological insight and provides a foundation for developing multi-task, morphology-aware learning systems.
Supplementary Material: pdf
Primary Area: datasets and benchmarks
Submission Number: 15189