IRRISIGHT: A Large-Scale Multimodal Dataset and Scalable Pipeline to Address Irrigation and Water Management in Agriculture
Keywords: Remote Sensing, Multimodal Dataset, Vision, LLM
TL;DR: IRRISIGHT is a large-scale multimodal dataset that combines satellite, soil, crop, and hydrological data across 20 U.S. states to advance irrigation mapping and agricultural water management research.
Abstract: The lack of fine-grained, large-scale datasets on water availability presents a critical barrier to applying machine learning (ML) for agricultural water management. Because water availability depends on multiple natural and anthropogenic factors, incorporating diverse multimodal features can significantly improve modeling performance. However, integrating such heterogeneous data is challenging due to spatial misalignments, inconsistent formats, semantic label ambiguities, and class imbalances. To address these challenges, we introduce IRRISIGHT, a large-scale, multimodal dataset spanning 20 U.S. states. It consists of 1.4 million pixel-aligned 224×224 patches that fuse satellite imagery with rich environmental attributes. We develop a robust geospatial fusion pipeline that aligns raster, vector, and point-based data on a unified 10 m grid, and employ domain-informed structured prompts to convert tabular attributes into natural language. Using irrigation type classification as a representative task, we provide an AI-ready dataset with a spatially disjoint train/test split and extensive benchmarks of both vision and vision–language models. Our results demonstrate that multimodal representations substantially improve model performance, establishing a foundation for future research on water availability.
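The abstract's step of converting tabular attributes into natural language can be illustrated with a minimal sketch. The attribute names, template wording, and the `attributes_to_prompt` helper below are all hypothetical and chosen for illustration only; the actual IRRISIGHT schema and prompt templates may differ.

```python
from typing import Mapping

# Hypothetical attribute names and sentence templates (assumptions for
# illustration; not taken from the IRRISIGHT pipeline itself).
TEMPLATES = {
    "crop": "The dominant crop is {}.",
    "soil_texture": "The soil texture is {}.",
    "annual_precip_mm": "Annual precipitation is about {} mm.",
}


def attributes_to_prompt(attrs: Mapping[str, object]) -> str:
    """Render known, non-missing attributes as sentences in a fixed order."""
    sentences = []
    for key, template in TEMPLATES.items():
        value = attrs.get(key)
        if value is None:
            continue  # skip missing attributes instead of inventing a value
        sentences.append(template.format(value))
    return " ".join(sentences)


example = {"crop": "corn", "soil_texture": None, "annual_precip_mm": 610}
print(attributes_to_prompt(example))
# prints: The dominant crop is corn. Annual precipitation is about 610 mm.
```

Keeping the templates in a fixed order makes the generated text deterministic for a given patch, which is one plausible way to keep prompts comparable across samples.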
Croissant File:  json
Dataset URL: https://huggingface.co/datasets/OBH30/IRRISIGHT
Code URL: https://github.com/Nibir088/IRRISIGHT
Supplementary Material:  pdf
Primary Area: Datasets & Benchmarks for applications in computer vision
Submission Number: 2002