B3SafirBiyo: Genomic variant analysis with big data technologies

Tugce Dongel, Yasemin Timar

Published: 2017, Last Modified: 14 Jan 2026, IEEE BigData 2017, CC BY-SA 4.0

Abstract: In this information age, DNA sequence data and the genomic variations of individuals are prominent examples of big data to be processed. When thousands of individuals are analyzed, the data set grows large enough to require big data processing technologies. To support studies in bioinformatics, specifically on genomic variants and population genetics, we have implemented B3SafirBiyo, a framework built on recent big data technologies: web-based user interfaces, the Spark engine, and machine learning libraries. We demonstrate the efficiency of basic filtering and querying operations on large variant files. The performance of population clustering on the 1000 Genomes dataset is also presented.

External IDs: dblp:conf/bigdataconf/DongelT17