PaddyFormer: An Improved RT-DETRv2 based Approach for Paddy Crop Growth Stage Detection on Drone based RGB Imagery

Published: 09 Dec 2025, Last Modified: 25 Jan 2026 | AgriAI 2026 Poster | CC BY 4.0
Keywords: Deep Learning, Drone, Detection Transformer, Growth Stage Recognition, YOLO, RT-DETR
TL;DR: *PaddyFormer* is a model for recognizing paddy crop growth stages using drone-based RGB imagery. It handles class imbalance and field variability, achieving 84.5% mAP@[0.5], 75.6% precision, and 82.5% recall, enabling real-time crop monitoring.
Abstract: Accurately recognizing crop growth stages is vital in precision agriculture, particularly for predicting yield and determining harvest times. However, the task is challenging because crop morphology varies significantly across growth stages, which often degrades model performance. This study addresses the five-class growth stage recognition task for paddy crops using high-resolution RGB imagery captured by a DJI Inspire-1 Pro drone equipped with a Zenmuse X5 camera. We propose PaddyFormer, an enhanced version of RT-DETRv2 integrated with a weighted dataloader and asymmetric loss to handle class imbalance and field-level variability. In our experiments, PaddyFormer achieves the highest mAP@[0.5] of 84.5%, with a precision of 75.6% and a recall of 82.5%, indicating robustness under complex agricultural conditions. Overall, these results emphasize the importance of drone-based image acquisition and transformer-based models in developing scalable, real-time solutions for crop monitoring.
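The abstract names two imbalance-handling components, a weighted dataloader and an asymmetric loss, without giving their formulation. The sketch below illustrates one common way each idea is realized, and it is not the authors' implementation. It assumes inverse-frequency sampling weights per image and the widely used asymmetric loss form with separate focusing exponents for positives and negatives plus a probability margin. All function names, the per-image weighting rule, and the default values of `gamma_pos`, `gamma_neg`, and `margin` are hypothetical choices for illustration.

```python
import math
import random
from collections import Counter


def image_sampling_weights(image_labels):
    """Return one sampling weight per image (assumed rule, not from the paper).

    image_labels: list of lists of class ids, one inner list per image.
    Each class gets weight 1 / (number of annotated instances); an image
    takes the weight of the rarest class it contains.
    """
    counts = Counter(c for labels in image_labels for c in labels)
    class_w = {c: 1.0 / n for c, n in counts.items()}
    # Images without annotations fall back to the smallest class weight.
    floor = min(class_w.values(), default=1.0)
    return [max((class_w[c] for c in labels), default=floor)
            for labels in image_labels]


def sample_epoch(weights, seed=0):
    """Draw len(weights) image indices with replacement, proportional to weight."""
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=weights, k=len(weights))


def asymmetric_loss(p, y, gamma_pos=0.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
    """Per-class asymmetric loss for predicted probability p and target y in {0, 1}.

    Positives: -(1 - p)^gamma_pos * log(p)
    Negatives: -(p_m)^gamma_neg * log(1 - p_m), with p_m = max(p - margin, 0)
    A larger gamma_neg down-weights easy negatives more strongly.
    """
    if y == 1:
        return -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
    p_m = max(p - margin, 0.0)  # negatives below the margin contribute nothing
    return -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
```

Under these assumptions, an image containing a rare growth stage is drawn more often per epoch, while confidently rejected negatives contribute almost nothing to the loss, so gradients are dominated by positives and hard negatives.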
Submission Number: 26