Validation of Prerequisites for Correct Performance Evaluation of Image-based Plant Disease Diagnosis using Reliable 221K Images Collected from Actual Fields
Keywords: plant disease diagnosis, convolutional neural networks, image classification, covariate shift
TL;DR: We investigate evaluation bias and the importance of pre-detecting regions of interest for disease diagnosis on a practical large-scale dataset covering multiple crops.
Abstract: Although many image-based plant disease diagnosis systems have recently reported high diagnostic performance, most of them do not appear to separate training and evaluation images properly. Because images taken in the same field tend to be similar, the true performance of a system whose training and evaluation images come from the same field is much worse than it appears. However, no systematic evaluation based on large-scale data has been conducted so far. To suppress overfitting caused by such similarity, several studies have attempted to detect regions of interest (ROI), such as leaves, in advance, but the effectiveness of this approach has not been studied systematically. In this study, we used a total of 221,842 leaf images of four crops from 24 prefectures with reliable labels to investigate (i) the performance bias due to evaluation within the same farm and (ii) the effect of ROI detection on performance. We found that even when a large number of training images with sufficient resolution are prepared, diagnostic performance on images from fields different from those of the training images is greatly degraded due to large differences in image characteristics, i.e., covariate shift. In this situation, the benefit of ROI detection becomes smaller.
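The separation issue raised in the abstract can be illustrated with a field-disjoint split, in which every image from a given field goes entirely to either the training set or the evaluation set. The following is a minimal sketch, not the authors' procedure: the function name `split_by_field`, the `(image_id, field_id, label)` tuple layout, the `test_fraction` parameter, and the toy data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_field(samples, test_fraction=0.25, seed=0):
    """Split samples so that no field contributes to both sets.

    samples: list of (image_id, field_id, label) tuples (hypothetical layout).
    Returns (train, test), each a list of the same tuples.
    """
    # Group samples by the field they were photographed in.
    by_field = defaultdict(list)
    for sample in samples:
        by_field[sample[1]].append(sample)

    # Shuffle the field identifiers, not the individual images.
    fields = sorted(by_field)
    random.Random(seed).shuffle(fields)

    # Hold out whole fields for evaluation (at least one field).
    n_test = max(1, round(len(fields) * test_fraction))
    test_fields = set(fields[:n_test])

    train = [s for f in fields if f not in test_fields for s in by_field[f]]
    test = [s for f in fields if f in test_fields for s in by_field[f]]
    return train, test


# Toy data: 12 images spread over 4 hypothetical fields.
samples = [
    (f"img{i}", f"field{i % 4}", "healthy" if i % 2 == 0 else "diseased")
    for i in range(12)
]
train, test = split_by_field(samples)
```

A random per-image split would instead place images from the same field on both sides, which is the setting the abstract argues overstates performance.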