SplicedVAE: Learning Splicing Ratios from scRNA-seq to Enhance RNA Velocity and Cellular Trajectories

04 Feb 2026 (modified: 04 Mar 2026) | Submitted to ICLR 2026 Workshop LMRL | CC BY 4.0

Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.

Track: tiny / short paper (up to 5 pages)

Keywords: RNA velocity, single-cell transcriptomics, variational autoencoders, splicing dynamics, trajectory inference, generative modeling, multitask learning, cellular dynamics, scRNA-seq, representation learning

TL;DR: SplicedVAE uses multitask variational autoencoders to predict splicing ratios directly from gene expression, enabling RNA velocity estimation in datasets lacking spliced/unspliced counts and improving cellular trajectory inference.

Abstract: Single-cell RNA sequencing (scRNA-seq) provides high-resolution snapshots of cellular states, yet inferring temporal dynamics such as RNA velocity and differentiation trajectories remains challenging due to dropout, the limited availability of reliable spliced/unspliced (S/U) counts, and the static nature of most models. Existing velocity methods rely on strong kinetic assumptions and require S/U measurements, while trajectory inference approaches reconstruct only static pseudotime orderings. To address these limitations, we present SplicedVAE, a supervised generative framework that augments the scVI variational autoencoder with a dedicated decoder that predicts per-gene splicing ratios from raw counts alone. Trained across multi-species datasets from the Arc Virtual Cell Atlas, SplicedVAE jointly optimizes gene expression reconstruction and splicing-ratio prediction, enabling biologically informed regularization. Compared to standard scVI, our model achieves improved splicing-ratio prediction accuracy (RMSE 0.1271, Pearson r = 0.67), enhanced latent representations, and substantially more coherent velocity fields. When reconstructed S/U counts are passed into scVelo, SplicedVAE recapitulates developmental flow patterns and yields high cosine similarity to ground-truth velocities. These results demonstrate that multitask learning can improve velocity-based trajectory reconstruction, and they establish a foundation for future diffusion-based models capable of generating fully stochastic cellular trajectories.
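
The abstract mentions reconstructing S/U counts from predicted splicing ratios and scoring the result with RMSE, Pearson r, and cosine similarity to ground-truth velocities. The following is a minimal sketch of those quantities, not the authors' implementation. It assumes the splicing ratio is the per-gene spliced fraction S / (S + U), which the abstract does not state, and all function names are hypothetical.

```python
import numpy as np


def splicing_ratio(spliced, unspliced, eps=1e-8):
    """Per-gene spliced fraction S / (S + U). ASSUMED definition, not from the paper."""
    spliced = np.asarray(spliced, dtype=float)
    unspliced = np.asarray(unspliced, dtype=float)
    return spliced / (spliced + unspliced + eps)


def split_counts(total, ratio):
    """Split total counts into (spliced, unspliced) using a predicted ratio in [0, 1]."""
    total = np.asarray(total, dtype=float)
    ratio = np.clip(np.asarray(ratio, dtype=float), 0.0, 1.0)
    spliced = total * ratio
    return spliced, total - spliced


def rmse(pred, target):
    """Root-mean-square error over all entries."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def pearson_r(pred, target):
    """Pearson correlation over all flattened entries."""
    return float(np.corrcoef(np.ravel(pred), np.ravel(target))[0, 1])


def mean_cosine_similarity(v_pred, v_true, eps=1e-12):
    """Mean per-cell cosine similarity between velocity matrices (cells x genes)."""
    v_pred = np.asarray(v_pred, dtype=float)
    v_true = np.asarray(v_true, dtype=float)
    num = np.sum(v_pred * v_true, axis=1)
    den = np.linalg.norm(v_pred, axis=1) * np.linalg.norm(v_true, axis=1) + eps
    return float(np.mean(num / den))
```

Under these assumptions, the two matrices returned by `split_counts` could be stored as the `spliced` and `unspliced` layers of an AnnData object, which is the input scVelo expects for velocity estimation.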

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.

Presenter: ~Sahil_Gupta3

Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).

Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.

Submission Number: 24
