Analysis and Comparison of Genomes of HIV-1 and HIV-2 Using Apriori Algorithm, Decision Tree, and Support Vector Machine

Published: 01 Jan 2016, Last Modified: 30 Jul 2025, ICIC (1) 2016, CC BY-SA 4.0
Abstract: AIDS is caused by HIV, which can be divided into two strains: HIV-1 and HIV-2. Whereas HIV-1 is distributed worldwide and is the major cause of global infections, HIV-2 is less infectious and less transmissible and is therefore generally confined to West Africa. This research aims to account for the difference by analyzing genome sequences of HIV-1 and HIV-2 with three methods: the Apriori algorithm, a decision tree, and a support vector machine (SVM). Apriori shows that lysine, arginine, and serine are the typical amino acids of HIV-1, while glycine, lysine, leucine, and arginine are typical of HIV-2. The decision tree identifies the significant amino acid positions that can distinguish the two viruses: pos5 in the 9 window, pos13 in the 13 window, and pos16 in the 19 window. The SVM indicates that the two viruses appear similar but are in fact different. Together, the results provide a biologically verifiable background for developing effective HIV vaccines, especially for HIV-2.
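The windowing and Apriori steps mentioned in the abstract might look roughly like the following. This is a minimal sketch and not the authors' pipeline: the abstract does not say how windows are built, so contiguous sliding windows are an assumption, and the function names, the toy sequence, and the support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def sliding_windows(sequence, size):
    """Return every contiguous window of `size` residues (assumed windowing)."""
    return [sequence[i:i + size] for i in range(len(sequence) - size + 1)]


def frequent_itemsets(transactions, min_support, max_len=2):
    """Apriori-style search for residue sets with support >= min_support.

    Each transaction is reduced to the set of distinct residues it contains.
    Returns a dict mapping frozenset(residues) -> support in [0, 1].
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    frequent = {}
    # Level-1 candidates: every single residue seen in the data.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    for k in range(1, max_len + 1):
        counts = Counter()
        for basket in baskets:
            for cand in candidates:
                if cand <= basket:
                    counts[cand] += 1
        level = {c: counts[c] / n for c in candidates if counts[c] / n >= min_support}
        frequent.update(level)
        # Join step: build (k+1)-itemsets from surviving residues, then prune
        # any candidate that has an infrequent k-subset (the Apriori property).
        items = sorted({i for c in level for i in c})
        candidates = {
            frozenset(combo)
            for combo in combinations(items, k + 1)
            if all(frozenset(sub) in level for sub in combinations(combo, k))
        }
        if not candidates:
            break
    return frequent


# Made-up toy string for illustration only; it is not real HIV sequence data.
toy_sequence = "MKRSKRLGKS"
windows = sliding_windows(toy_sequence, 5)
result = frequent_itemsets(windows, min_support=0.8)
```

On this toy string, `K`, `R`, and `S` and their pairwise combinations meet the 0.8 support threshold, while `L`, `G`, and `M` do not. Real input would be translated protein sequences from each virus, with thresholds chosen per data set.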