A Nucleotide-Position-Based Data Format for Fast Variant Calling and Its Hardware Analyzer Design

Hao-Wei Liu, Zhe-Wei Shen, Yang-Ming Yeh, Yi-Chang Lu

Published: 2022, Last Modified: 08 Mar 2026, BioCAS 2022, CC BY-SA 4.0
Abstract: In this paper, we propose a file format, vBAM, to improve the performance of variant calling tasks. The vBAM format removes data irrelevant to variant calling and compresses base and quality information by nucleotide position to reduce the number of bits stored. As a result, vBAM files are smaller than conventional BAM/pileup files and take less time to process during variant calling. Our C++ software supports BAM-to-vBAM conversion, vBAM decoding, and variant calling. We also implement a hardware accelerator to shorten the computing time of the decoding and calling stages. The hardware achieves at least a 7.2X speed-up over its software counterpart.
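
To make the idea of organizing base/quality information by position more concrete, here is a minimal Python sketch of regrouping read-oriented records into per-position columns. It is only an illustration under stated assumptions, not the vBAM encoding itself: the function name `group_by_position`, the `(start, bases, quals)` tuple layout, and the restriction to ungapped alignments (no CIGAR handling) are all hypothetical and do not come from the paper.

```python
from collections import defaultdict


def group_by_position(reads):
    """Regroup read-oriented records into per-position columns.

    Assumption (not from the paper): each read is a tuple
    (start, bases, quals) describing an ungapped alignment, where
    `start` is the reference position of the first base.
    Returns a dict mapping reference position -> (bases, quals),
    with keys in ascending position order.
    """
    columns = defaultdict(lambda: ([], []))
    for start, bases, quals in reads:
        if len(bases) != len(quals):
            raise ValueError("bases and quals must have the same length")
        for offset, (base, qual) in enumerate(zip(bases, quals)):
            col_bases, col_quals = columns[start + offset]
            col_bases.append(base)
            col_quals.append(qual)
    # Join the bases of each column into one string, sorted by position.
    return {
        pos: ("".join(col_bases), col_quals)
        for pos, (col_bases, col_quals) in sorted(columns.items())
    }


# Two hypothetical overlapping reads.
reads = [
    (100, "ACGT", [30, 31, 32, 33]),
    (102, "GTTA", [20, 21, 22, 23]),
]
columns = group_by_position(reads)
```

With this layout, everything a caller needs for one reference position sits in a single record (for example, `columns[102]` holds the bases and qualities of both reads covering position 102), which is the general property a position-based format can exploit. How vBAM actually encodes and compresses these columns is not specified in the abstract.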