Abstract: DNA microarray data often contain missing expression values. Estimating these missing values is a necessary step in microarray analysis, because data mining procedures require a complete expression matrix as their input. In this paper, we propose a missing-value estimation algorithm, named KPCAimpute, based on kernel principal component analysis. We consider a family of heavy-tailed kernel functions that generalizes the widely used Gaussian kernel. The performance of KPCAimpute is compared with that of two state-of-the-art linear regression methods, namely Bayesian principal component analysis imputation (BPCA) and local least squares imputation (LLSimpute). KPCAimpute outperforms LLSimpute as the percentage of missing values increases, and its performance is similar to that of BPCA imputation. It is therefore an effective and promising algorithm for estimating missing values in DNA microarray profiles.
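To make the idea of a heavy-tailed generalization of the Gaussian kernel concrete, the following is a minimal sketch under stated assumptions. The abstract does not give the kernel's formula, so the form used here, K(x, y) = exp(-rho * sum_i |x_i - y_i|^b) with 0 < b <= 2, is an assumption chosen for illustration and may differ from the family used in the paper. The function name `heavy_tailed_kernel` and the parameters `rho` and `b` are hypothetical. With b = 2 the expression reduces to the Gaussian kernel; smaller b makes the kernel decay more slowly for distant points.

```python
import numpy as np

def heavy_tailed_kernel(X, Y, rho=1.0, b=2.0):
    """Illustrative kernel (assumed form, not taken from the paper):
    K(x, y) = exp(-rho * sum_i |x_i - y_i|**b), with 0 < b <= 2.
    b = 2 gives the Gaussian kernel; b = 1 gives a Laplacian-type kernel.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    Y = np.atleast_2d(np.asarray(Y, dtype=float))
    # Pairwise absolute differences, shape (n_x, n_y, n_features)
    diff = np.abs(X[:, None, :] - Y[None, :, :])
    return np.exp(-rho * np.sum(diff ** b, axis=2))

# Two toy expression profiles that are far apart in every coordinate
profiles = np.array([[0.0, 0.0],
                     [3.0, 3.0]])

K_gauss = heavy_tailed_kernel(profiles, profiles, rho=0.5, b=2.0)
K_heavy = heavy_tailed_kernel(profiles, profiles, rho=0.5, b=1.0)

# The b = 1 kernel assigns more similarity to the distant pair
print(K_gauss[0, 1], K_heavy[0, 1])
```

In a kernel PCA setting, a matrix such as `K_heavy` computed over the complete rows or columns would play the role of the Gram matrix; how KPCAimpute then reconstructs the missing entries is not described in the abstract and is not sketched here.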