Towards the Latent Transcriptome

Assya Trofimov; Francis Dutil; Claude Perreault; Sebastien Lemieux; Yoshua Bengio; Joseph Paul Cohen

27 Sept 2018 (modified: 08 Feb 2026) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: In this work we propose a method to compute continuous embeddings for k-mers from raw RNA-seq data in a reference-free fashion. We report that our model captures information about both DNA sequence similarity and DNA sequence abundance in the embedding latent space. We confirm the quality of these vectors by comparing them to known gene sub-structures and report that the latent space recovers exon information from raw RNA-seq data from acute myeloid leukemia patients. Furthermore, we show that this latent space allows the detection of genomic abnormalities such as translocations, as well as patient-specific mutations, making this representation space useful for both visualization and analysis.

Keywords: representation learning, RNA-Seq, gene expression, bioinformatics, computational biology, transcriptomics, deep learning, genomics

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/towards-the-latent-transcriptome/code)
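For readers unfamiliar with the terms in the abstract, the sketch below illustrates what "k-mers from raw RNA-seq data" and "sequence abundance" could look like as model inputs. It is not the authors' method: the embedding model itself is not described on this page, and the function names `kmers` and `count_kmers`, the toy reads, and the choice of `k = 3` are all assumptions made only for illustration.

```python
from collections import Counter


def kmers(read, k):
    """Yield every overlapping substring of length k from a read (hypothetical helper)."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def count_kmers(reads, k):
    """Count how often each k-mer occurs across all reads of one sample.

    No reference genome is consulted; only the raw read strings are used.
    """
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


# Toy reads standing in for one patient's sequencing data (illustrative only).
reads = ["ACGTAC", "CGTACG"]
counts = count_kmers(reads, 3)
```

Under these assumptions, each distinct k-mer would be the unit that receives an embedding, and its per-sample count is one plausible notion of the abundance signal the abstract mentions.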
