Abstract: Trimmomatic is a de facto standard trimmer for Illumina sequencing data. However, its sub-optimal implementation prevents it from fully exploiting the computational power of common multi-core platforms. We therefore propose RabbitTrim, a highly optimized implementation of Trimmomatic based on efficient I/O strategies, parallel (de)compression engines, block-based memory pools, bitwise operations, and vectorization techniques. On a 48-core Intel server, RabbitTrim achieves speedups of 1.5x to 3.3x when processing plain FASTQ files and 3.7x to 8.0x when processing gzip-compressed FASTQ files. Overall, RabbitTrim can process 101 GB of gzip-compressed sequencing data in only 5 min, whereas Trimmomatic requires at least 21 min. The source code is available at https://github.com/RabbitBio/RabbitTrim.