Joint Bayesian Variable Selection and Graph Estimation for Non-linear SVM with Application to Genomics Data
Abstract: The support vector machine (SVM) is a powerful classification tool for the analysis of high-dimensional data such as genomic data. Regularized linear and nonlinear SVM methods with feature selection have been developed. Meanwhile, a growing body of literature shows that incorporating prior biological knowledge, such as functional genomics information typically represented by graphs, into the analysis of genomic data can improve feature selection and prediction. In practice, however, such biological knowledge is often inaccurate or unavailable. To address this problem, we propose a Bayesian modeling approach that learns the graph structure among features and performs feature selection simultaneously. Our approach employs a Gaussian graphical model to infer the graph and exploits the inferred graph to guide feature selection for the SVM. We develop an efficient MCMC algorithm, and simulations and an application to glioblastoma patient data demonstrate that the proposed method has advantages over existing methods in feature selection and prediction.