Keywords: wildfire smoke, deep learning, dataset, remote sensing, satellite, semi-supervised learning
TL;DR: We use physics-guided semi-supervised learning to align human-labeled wildfire smoke annotations with GOES satellite imagery, creating SmokeViz, a large-scale dataset for smoke plume segmentation.
Abstract: The global rise in wildfire frequency and intensity over the past decade underscores the need for improved fire monitoring techniques. To advance deep learning research on wildfire detection and its associated human health impacts, we introduce **SmokeViz**, a large-scale machine learning dataset of smoke plumes in satellite imagery. The dataset is derived from expert annotations created by smoke analysts at the National Oceanic and Atmospheric Administration, which provide coarse temporal and spatial approximations of smoke presence. To enhance annotation precision, we propose **pseudo-label dimension reduction (PLDR)**, a generalizable method that applies pseudo-labeling to refine datasets with mismatched temporal and/or spatial resolutions. Unlike typical pseudo-labeling applications, which aim to increase the number of labeled samples, PLDR keeps the original labels but improves dataset quality by solving for intermediary pseudo-labels (IPLs) that align each annotation with the most representative input data. For SmokeViz, a parent model produces IPLs to identify the single satellite image within each annotation's time window that best corresponds to the smoke plume. This refinement process produces a succinct and relevant deep learning dataset consisting of over 160,000 manual annotations. The SmokeViz dataset is expected to be a valuable resource for developing further wildfire-related machine learning models and is publicly available at https://noaa-gsl-experimental-pds.s3.amazonaws.com/index.html#SmokeViz/.
Croissant File: json
Dataset URL: https://noaa-gsl-experimental-pds.s3.amazonaws.com/index.html#SmokeViz/
Code URL: https://github.com/reykoki/SmokeViz
Primary Area: Datasets & Benchmarks for applications in computer vision
Submission Number: 2027
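The frame-selection step described in the abstract can be illustrated with a minimal sketch. It assumes that the parent model's IPLs are binary masks, one per candidate satellite image in the annotation's time window, and that "best corresponds" is scored by intersection over union (IoU) against the analyst annotation. The scoring metric, the function names `iou` and `select_best_frame`, and the toy arrays are all assumptions made for illustration and are not taken from the SmokeViz code.

```python
import numpy as np


def iou(mask_a, mask_b):
    """Intersection over union of two binary masks (0.0 if both are empty)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def select_best_frame(annotation_mask, pseudo_labels):
    """Return (index, score) of the pseudo-label that best matches the annotation.

    pseudo_labels: sequence of binary masks, one per candidate image
    in the annotation's time window (hypothetical parent-model output).
    """
    scores = [iou(p, annotation_mask) for p in pseudo_labels]
    best = int(np.argmax(scores))
    return best, scores[best]


# Toy example: a 4x4 annotation covering the top-left 2x2 block.
annotation = np.zeros((4, 4), dtype=int)
annotation[:2, :2] = 1

frame0 = np.zeros((4, 4), dtype=int)   # no smoke predicted
frame1 = annotation.copy()
frame1[2, 0] = 1                       # 4 overlapping pixels, 1 extra
frame2 = np.zeros((4, 4), dtype=int)
frame2[0, 0] = 1                       # only 1 of 4 annotated pixels

best_idx, best_score = select_best_frame(annotation, [frame0, frame1, frame2])
print(best_idx, round(best_score, 2))  # prints: 1 0.8
```

In this sketch the annotation itself is left unchanged; only the choice of which image to pair it with is made from the pseudo-labels, which mirrors the abstract's statement that PLDR maintains the original labels.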