Abstract: The immense volume and rapid growth of human genomic data, especially single nucleotide polymorphisms (SNPs), pose particular challenges for both biomedical researchers and automated algorithms. SNPs account for a major share of the polymorphism in the human genome and are suitable genetic markers for disease characteristics, so they hold much promise as a basis for genome-wide disease-gene association studies. Determining the relationship between disease complexity and SNPs, however, requires genotyping large SNP data sets, which is expensive and labor-intensive. Tag SNP selection reduces this cost by genotyping only a representative subset of SNPs from which the remaining SNPs can be inferred. In this paper, we propose two novel approaches to tag SNP selection: one clusters the SNPs with self-organizing maps (SOM), and the other clusters them with fuzzy c-means (FCM). Both methods are shown to select a set of tag SNPs that captures the remaining SNPs more efficiently than Haploview Tagger, and thus better satisfy the goal of tag SNP selection.
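As an illustration of the clustering idea only, the following minimal sketch applies a textbook fuzzy c-means update to a small genotype matrix and picks one representative SNP per cluster. It is not the authors' implementation: the 0/1/2 genotype encoding, the Euclidean distance, the rule of taking the highest-membership SNP as the tag, and the names `fuzzy_c_means` and `select_tag_snps` are all assumptions made for this example.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means. X has shape (n_points, n_features).

    Returns (centers, U), where U[i, c] is the membership of point i in cluster c.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def select_tag_snps(genotypes, n_clusters, **kwargs):
    """Hypothetical helper: cluster SNPs and return one tag SNP per cluster.

    genotypes: (n_individuals, n_snps) array, assumed encoded as 0/1/2.
    Assumption: the tag of a cluster is its member with the highest membership.
    """
    X = np.asarray(genotypes, dtype=float).T  # one row per SNP
    _, U = fuzzy_c_means(X, n_clusters, **kwargs)
    labels = U.argmax(axis=1)
    tags = []
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        if members.size:  # skip clusters that ended up empty
            tags.append(int(members[U[members, c].argmax()]))
    return sorted(tags), labels


# Toy data: SNP 0 and SNP 1 are identical, as are SNP 2 and SNP 3.
genotypes = np.array([
    [0, 0, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 0, 0],
    [2, 2, 2, 2],
    [2, 2, 0, 0],
    [2, 2, 2, 2],
])
tags, labels = select_tag_snps(genotypes, n_clusters=2)
print("tag SNPs:", tags, "cluster labels:", labels)
```

In this toy case each pair of identical SNPs should fall into one cluster, so genotyping one SNP per pair would suffice. A real evaluation would need a linkage disequilibrium measure such as r² to check how well the tags capture the other SNPs, which this sketch does not attempt.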