Keywords: deep learning, dynamic discard mechanism, yield prediction, mixed-accuracy data integration
Abstract: Effectively fusing scarce high-accuracy data with abundant but noisy low-accuracy data is a common challenge for machine learning in fields such as agriculture, medicine, and remote sensing. Existing methods either concatenate the datasets directly, ignoring their accuracy differences, or train with static weights, and both struggle to achieve optimal performance. To address this, we introduce a deep learning framework with a dynamic discard mechanism (DDL) that manages mixed-accuracy data by selectively and dynamically removing low-accuracy instances with high Mean Absolute Error (MAE) and by applying an adaptive weighting scheme. We validated this approach using rice cultivation data from China's four major rice-growing regions: South, Central, North, and Northeast China. Site characteristics and nitrogen application rates served as feature variables, rice yield as the target variable, and the high-accuracy dataset as the test set. Compared with machine learning models trained on a single-accuracy dataset and with other models designed for mixed-accuracy data, the DDL framework improved metrics such as RMSE, MAE, and MAPE by over 10%, achieving significantly higher prediction accuracy. A crop yield prediction model that can handle multiple datasets simultaneously holds significant practical value for policymakers and other stakeholders. The dynamic discard mechanism and adaptive weighting algorithm used by DDL may also serve as a reference for applications in other domains.
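The abstract does not specify how instances are discarded or how weights are adapted, so the following is only a minimal sketch of one way such a step might look, not the authors' implementation. It assumes a per-sample absolute error is available for the low-accuracy data at each epoch. The function name `discard_and_weight`, the parameters `discard_frac` and `temperature`, and the exponential weighting rule are all hypothetical choices made for illustration.

```python
import numpy as np

def discard_and_weight(abs_err, discard_frac=0.2, temperature=1.0):
    """Hypothetical helper (not from the paper): drop the low-accuracy
    samples with the largest absolute error, then weight the remaining
    samples so that smaller errors receive larger weights."""
    abs_err = np.asarray(abs_err, dtype=float)
    n_drop = int(np.floor(discard_frac * abs_err.size))

    # Start by keeping everything, then mask out the n_drop worst samples.
    keep = np.ones(abs_err.size, dtype=bool)
    if n_drop > 0:
        worst = np.argsort(abs_err)[-n_drop:]
        keep[worst] = False

    # Assumed weighting rule: exponential decay in the error,
    # rescaled so that the kept weights average to 1.
    weights = np.zeros_like(abs_err)
    if keep.any():
        weights[keep] = np.exp(-abs_err[keep] / temperature)
        weights[keep] /= weights[keep].mean()
    return keep, weights

# Illustrative errors for five low-accuracy samples (made-up numbers).
abs_err = np.array([0.1, 0.5, 3.0, 0.2, 2.0])
keep, weights = discard_and_weight(abs_err, discard_frac=0.5)
```

In a training loop, a function like this would be called once per epoch with the current model's errors on the low-accuracy data, which is what would make the discard set and the weights dynamic rather than static.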
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 18039