Ensembling YOLO and ViT for Plant Disease Detection

Published: 2024, Last Modified: 12 Nov 2025, ICPR (21) 2024, CC BY-SA 4.0
Abstract: As the world's population grows, the demand for food grows with it. Early and cost-effective detection of plant diseases can reduce food loss. Current methods for image-based plant disease detection tend to fail in field conditions. The proposed pipeline ensembles two components: a YOLOv8 model trained for disease detection, and a disease detection module made of YOLOv8 (as the localizer) and a Vision Transformer (as the classifier). The ensembling is performed with a method called Soft-NMS. Our pipeline achieves 46.12% mAP on the Plant-Doc dataset, outperforming YOLOv8 alone, which reaches 31.4% mAP, by 14.72 percentage points.
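To illustrate the ensembling step, the following is a minimal sketch of Gaussian Soft-NMS applied to the pooled detections of two sources. It is not the authors' implementation: the abstract does not state which Soft-NMS variant or parameters are used, so the Gaussian decay, the values of `sigma` and `score_threshold`, the class-aware rescoring, the `(box, score, label)` layout, and the example detections and labels are all assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(detections, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS over (box, score, label) tuples.

    Instead of discarding boxes that overlap a higher-scoring box, their
    scores are decayed by exp(-IoU^2 / sigma). Decay is applied only between
    boxes with the same label (an assumption of this sketch).
    """
    remaining = list(detections)
    kept = []
    while remaining:
        # Select the highest-scoring detection still in the pool.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best = remaining.pop(best_idx)
        kept.append(best)

        # Decay the scores of same-label boxes that overlap the selected one.
        rescored = []
        for box, score, label in remaining:
            if label == best[2]:
                score *= math.exp(-(iou(best[0], box) ** 2) / sigma)
            if score >= score_threshold:
                rescored.append((box, score, label))
        remaining = rescored
    return kept


# Hypothetical detections from the two branches of the ensemble.
yolo_dets = [((10, 10, 110, 110), 0.90, "rust")]
module_dets = [
    ((12, 12, 112, 112), 0.80, "rust"),    # near-duplicate of the box above
    ((300, 300, 400, 400), 0.70, "scab"),  # separate, non-overlapping box
]

# Pool both sources, then let Soft-NMS rescore the overlapping boxes.
merged = soft_nms(yolo_dets + module_dets)
```

In this example the near-duplicate "rust" box is kept but with a strongly reduced score, whereas hard NMS would have removed it outright. That softer treatment is the usual motivation for Soft-NMS when merging outputs from several detectors.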