KIMI: Knockoff Inference for Motif Identification from molecular sequences with controlled false discovery rate

Published: 01 Jan 2021, Last Modified: 15 May 2025, Venue: Bioinform. 2021, License: CC BY-SA 4.0
Abstract: The rapid development of sequencing technologies has made it possible to generate large numbers of metagenomic reads from the genetic material in microbial communities, and thereby to gain deep insight into the differences between the genetic material of different groups, such as bacteria, viruses, and plasmids. Computational methods based on k-mer frequencies have been shown to be highly effective for classifying metagenomic sequencing reads into these groups. However, such methods usually use all k-mers as features for prediction, without selecting the k-mers that are relevant to each group of sequences, i.e. unique nucleotide patterns with biological significance.
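To make the notion of k-mer frequency features concrete, the following is a minimal Python sketch of how a single read might be turned into a normalized k-mer frequency vector. It is an illustration only and not the implementation used in the paper: the function name `kmer_frequencies`, the default `k=3`, the `alphabet` parameter, and the choice to skip windows containing ambiguous bases such as `N` are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGT"):
    """Return normalized k-mer frequencies for one sequence as a dict.

    Hypothetical helper for illustration; not taken from the paper.
    """
    seq = seq.upper()
    allowed = set(alphabet)
    # Count every length-k window made up only of alphabet characters
    # (assumption: windows with ambiguous bases such as 'N' are skipped).
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= allowed
    )
    total = sum(counts.values())
    # Enumerate all len(alphabet)**k possible k-mers in a fixed order,
    # so that vectors from different reads share the same feature layout.
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {km: (counts[km] / total if total else 0.0) for km in all_kmers}


freqs = kmer_frequencies("ACGTAC", k=2)
print(len(freqs), freqs["AC"])
```

With `k=2` the dictionary has 16 entries, one per possible dinucleotide. A classifier that uses all k-mers would consume the full vector, whereas a feature-selection step of the kind the abstract motivates would retain only a subset of its keys.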