Abstract: Linear discriminant analysis is a widely used method for classification. However, the
high dimensionality of predictors combined with small sample sizes often results in large
classification errors. To address this challenge, it is crucial to leverage data from related
source models to enhance the classification performance of a target model. This paper
proposes a transfer learning approach via regularized random-effects linear discriminant analysis, where the discriminant direction is estimated as a weighted combination
of ridge estimates obtained from both the target and source models. Multiple strategies
for determining these weights are introduced and evaluated, including one that minimizes the estimation risk of the discriminant vector and another that minimizes the
classification error. Using results from random matrix theory, we derive explicit
asymptotic values of these weights and of the associated classification error rates in the
high-dimensional setting where p/n → γ, with p the predictor dimension
and n the sample size. Extensive numerical studies, including simulations and an analysis
of proteomics-based 10-year cardiovascular disease risk classification, demonstrate the
effectiveness of the proposed approach.
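
The weighted-combination idea described above can be illustrated with a minimal sketch. It is not the estimator derived in the paper: the function names, the single shared ridge penalty `lam`, and the user-supplied `weights` are all assumptions made for illustration. The paper chooses the weights to minimize estimation risk or classification error, and that step is not reproduced here.

```python
import numpy as np


def ridge_lda_direction(X0, X1, lam):
    """Ridge-regularized LDA direction (S + lam * I)^{-1} (mean1 - mean0).

    X0, X1: arrays of shape (n0, p) and (n1, p) holding the two classes.
    lam:    ridge penalty (hypothetical tuning parameter, must be > 0 when p > n).
    """
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    R0, R1 = X0 - mu0, X1 - mu1
    # Pooled within-class sample covariance.
    S = (R0.T @ R0 + R1.T @ R1) / (len(X0) + len(X1) - 2)
    p = S.shape[0]
    return np.linalg.solve(S + lam * np.eye(p), mu1 - mu0)


def combined_direction(datasets, weights, lam):
    """Weighted combination of per-dataset ridge directions.

    datasets: list of (X0, X1) pairs, target first, then sources.
    weights:  one weight per dataset (supplied by the caller in this sketch).
    """
    directions = [ridge_lda_direction(X0, X1, lam) for X0, X1 in datasets]
    return sum(w * d for w, d in zip(weights, directions))


def classify(x, direction, midpoint):
    """Assign class 1 if x lies on the positive side of the discriminant."""
    return int((x - midpoint) @ direction > 0)
```

In use, `datasets` would hold the target sample followed by the source samples, and `midpoint` would typically be the average of the two target class means. Setting the weights to `[1.0, 0.0, ...]` recovers the target-only ridge direction.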