A bimodal image dataset for seed classification from the visible and near-infrared spectrum

Maksim Kukushkin, Martin Bogdan, Simon Goertz, Jan-Ole Callsen, Eric Oldenburg, Matthias Enders, Thomas Schmid

Published: 08 Oct 2025, Last Modified: 07 Nov 2025, Venue: Scientific Data, License: CC BY-SA 4.0
Abstract: The success of deep learning in image classification has been largely underpinned by large-scale datasets, such as ImageNet, which have significantly advanced multi-class classification for RGB and grayscale images. However, datasets that capture spectral information beyond the visible spectrum remain scarce, despite their high potential, especially in agriculture, medicine and remote sensing. To address this gap in the agricultural domain, we present a thoroughly curated bimodal seed image dataset comprising paired RGB and hyperspectral images for 10 plant species, making it one of the largest bimodal seed datasets available. We describe the methodology for data collection and preprocessing and benchmark several deep learning models on the dataset to evaluate their multi-class classification performance. By contributing a high-quality dataset, our manuscript offers a valuable resource for studying spectral, spatial and morphological properties of seeds, thereby opening new avenues for research and applications.