RNA Alternative Splicing Prediction with Discrete Compositional Energy NetworkDownload PDF

28 Sept 2020 (modified: 05 May 2023)ICLR 2021 Conference Withdrawn SubmissionReaders: Everyone
Keywords: RNA splicing, Computational Biology, RNA
Abstract: A single gene can encode for different protein versions through a process called alternative splicing. Since proteins play major roles in cellular functions, aberrant splicing profiles can result in a variety of diseases, including cancers. Alternative splicing is determined by the gene's primary sequence and other regulatory factors such as RNA-binding protein levels. With these as input, we formulate the prediction of RNA splicing as a regression task and build a new training dataset (CAPD) to benchmark learned models. We propose discrete compositional energy network (DCEN) which leverages the compositionality of key components to approach this task. In the case of alternative splicing prediction, DCEN models mRNA transcript probabilities through its constituent splice junctions' energy values. These transcript probabilities are subsequently mapped to relative abundance values of key nucleotides and trained with ground-truth experimental measurements. Through our experiments on CAPD, we show that DCEN outperforms baselines and its ablation variants.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We construct an RNA alternative splicing regression dataset (CAPD) and propose DCEN to predict splicing outcomes by modeling mRNA transcript probabilities through its constituent splice junctions' energy.
Reviewed Version (pdf): https://openreview.net/references/pdf?id=uzxQ1r5ul
5 Replies

Loading