Robust Outlier Detection in Multi-environment Trial Data: Comparative Analysis and Application to T3/Wheat Dataset

Dupuy Rony Charles, Andrea G. B. Tettamanzi, Pascal Pultrini

Published: 01 Jan 2025, Last Modified: 21 Jan 2026 | Crossref | Everyone | CC BY-SA 4.0
Abstract: In plant breeding, Multi-Environment Field Trials (MET) are essential for evaluating genotypes across multiple traits and estimating their genetic breeding value through Genomic Prediction (GP). Outliers in MET data reduce the accuracy of GP, which calls for robust Outlier Detection (OD) mechanisms, yet OD in MET is frequently neglected. MET data are prone to heteroscedasticity, which leads to swamping, where normal observations are misclassified as outliers, and masking, where true outliers go undetected. A robust OD algorithm is therefore critical for improving GP accuracy, particularly with limited sample sizes. Our previous study compared various OD methods, including Mahalanobis Distance, Principal Component Analysis (PCA), and Local Outlier Factor (LOF), across eleven real-world MET datasets and identified the Subspace Outlier Detection Method as the most effective.