Data Augmentation using Foundation model for Fine-grained Fish Species Identification

Published: 01 Jan 2024, Last Modified: 10 Jun 2025, Venue: GCCE 2024, License: CC BY-SA 4.0
Abstract: Many fish species look alike, so identifying the species from an image is a challenging task, an instance of fine-grained image recognition (FGIR). Although several model architectures that focus on local regions have been proposed for FGIR and shown to be effective, expanding the training dataset remains important for improving model performance and preventing overfitting. In this study, we develop a method that improves the accuracy of fish-species FGIR through data augmentation with the foundation model Grounding DINO. Our method uses Grounding DINO to crop fish regions from images and uses these crops together with the original dataset to train the FGIR model. Experiments on the WildFish dataset demonstrate the effectiveness of our data augmentation method.
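
The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the detector (for example Grounding DINO, which is not invoked here) has already returned bounding boxes with confidence scores for each image, and it uses toy nested-list images so that it runs without any image library. The names `crop`, `augment_with_crops`, and the `min_score` threshold of 0.35 are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]            # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]    # (x0, y0, x1, y1) in pixels, x1/y1 exclusive
Sample = Tuple[Image, str]         # (image, species label)
Detection = Tuple[Box, float]      # (box, confidence score) from a detector


def crop(image: Image, box: Box) -> Optional[Image]:
    """Crop `image` to `box`, clamping the box to the image bounds."""
    if not image or not image[0]:
        return None
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = box
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    if x0 >= x1 or y0 >= y1:
        return None  # box is empty or lies entirely outside the image
    return [row[x0:x1] for row in image[y0:y1]]


def augment_with_crops(
    dataset: List[Sample],
    detections: List[List[Detection]],
    min_score: float = 0.35,  # hypothetical threshold, not from the paper
) -> List[Sample]:
    """Return the original samples plus one cropped sample per kept box.

    `detections[i]` holds the boxes found in `dataset[i]`. Each crop
    inherits the species label of the image it was cut from.
    """
    augmented = list(dataset)  # originals are kept, as in the abstract
    for (image, label), boxes in zip(dataset, detections):
        for box, score in boxes:
            if score < min_score:
                continue  # skip low-confidence detections
            region = crop(image, box)
            if region is not None:
                augmented.append((region, label))
    return augmented
```

In a real pipeline the nested lists would be replaced by image arrays, and the boxes would come from running the detector with a text prompt such as "fish"; how the paper prompts, filters, or resizes the crops is not stated in the abstract.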