BenthicNet: A global compilation of seafloor images for deep learning applications

Published: 09 Oct 2025, Last Modified: 09 Oct 2025
Venue: NeurIPS 2025 Workshop Imageomics
License: CC BY 4.0
Submission Track: Full papers that have been published at a peer-reviewed venue after January 1st, 2024 (up to 9 pages, excluding references)
Keywords: vision, image, dataset, benthic, ocean, seafloor, sea, underwater, habitat, biodiversity, coral
TL;DR: We collate 2000 small, diverse seafloor imagery datasets into one large dataset of 11M pretraining images, and standardize the labels for 3M annotations covering 190k of the images.
Abstract: Advances in underwater imaging enable collection of extensive seafloor image datasets necessary for monitoring important benthic ecosystems. The ability to collect seafloor imagery has outpaced our capacity to analyze it, hindering mobilization of this crucial environmental information. Machine learning approaches provide opportunities to increase the efficiency with which seafloor imagery is analyzed, yet large and consistent datasets to support development of such approaches are scarce. Here we present BenthicNet: a global compilation of seafloor imagery designed to support the training and evaluation of large-scale image recognition models. An initial set of over 11.4 million images was collected and curated to represent a diversity of seafloor environments using a representative subset of 1.3 million images. These are accompanied by 3.1 million annotations translated to the CATAMI scheme, which span 190,000 of the images. A large deep learning model was trained on this compilation, and preliminary results suggest it has utility for automating large- and small-scale image analysis tasks.
Submission Number: 72