Continuous Embeddings of DNA Sequencing Reads and Application to Metagenomics

2019 (modified: 12 May 2023), J. Comput. Biol. 2019
Abstract: We propose a new model for fast classification of DNA sequences output by next-generation sequencing machines. The model, which we call fastDNA, embeds DNA sequences in a vector space by learning continuous low-dimensional representations of the k-mers they contain. We show on metagenomics benchmarks that it outperforms state-of-the-art methods in both accuracy and scalability.
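
To make the idea of embedding a read through its k-mers concrete, here is a minimal sketch. It is not the fastDNA implementation: the abstract does not say how k-mer vectors are combined or how classes are scored, so the averaging step and the linear scoring layer are assumptions, as are the names `kmers`, `embed_read`, and `classify`, the values `K = 3` and `DIM = 4`, and the random parameters standing in for learned ones.

```python
import random
from itertools import product

K = 3    # k-mer length (hypothetical, kept small for illustration)
DIM = 4  # embedding dimension (hypothetical)

# Index every possible k-mer over the DNA alphabet.
KMER_INDEX = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=K))}


def kmers(read, k=K):
    """Return the overlapping k-mers of a read, skipping any with non-ACGT symbols."""
    return [read[i:i + k] for i in range(len(read) - k + 1)
            if read[i:i + k] in KMER_INDEX]


def embed_read(read, table):
    """Embed a read as the mean of its k-mer vectors (an assumed aggregation)."""
    found = kmers(read)
    if not found:
        return [0.0] * DIM
    vec = [0.0] * DIM
    for km in found:
        row = table[KMER_INDEX[km]]
        for j in range(DIM):
            vec[j] += row[j]
    return [v / len(found) for v in vec]


def classify(read, table, weights):
    """Score each class with a linear layer on the read embedding; return the argmax."""
    x = embed_read(read, table)
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


rng = random.Random(0)
# Randomly initialised parameters stand in for learned ones (assumption).
table = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(len(KMER_INDEX))]
weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(2)]

print(kmers("ACGTN"))                      # k-mers containing N are skipped
print(classify("ACGTAC", table, weights))  # index of the highest-scoring class
```

In a trained model the embedding table and the class weights would be learned jointly from labelled reads; the sketch only shows the forward pass, whose cost grows linearly with read length.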