ccSVM: correcting Support Vector Machines for confounding factors in biological data classification

Published: 01 Jan 2011, Last Modified: 16 May 2025. Bioinform. 2011. CC BY-SA 4.0
Abstract: Classifying biological data into different groups is a central task of bioinformatics: for instance, to predict the function of a gene or protein, the disease state of a patient, or the phenotype of an individual based on its genotype. Support Vector Machines are a widespread approach for classifying biological data, due to their high accuracy, their ability to deal with structured data such as strings, and the ease of integrating various types of data. However, it is unclear how to correct for confounding factors such as population structure, age, gender, or experimental conditions in Support Vector Machine classification.