MoRE: Batch-Robust Multi-Omics Representations from Frozen Language Models

20 Sept 2025 (modified: 12 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: large language models; adapter tuning; frozen transformers; masked reconstruction; supervised contrastive learning; alignment loss; multi-omics integration; batch effect; modality fusion
TL;DR: We repurpose frozen LLM transformers for multi-omics via lightweight modality adapters and alignment objectives, delivering stronger integration while training only a small fraction of parameters.
Abstract: Representation learning on multi-omics data is challenging due to extreme dimensionality, modality heterogeneity, and cohort-specific batch effects. While transformer-based large language models (LLMs) generalize broadly, their use in omics integration remains limited. We present MoRE (Multi-Omics Representation Embedding), an LLM-inspired framework that repurposes frozen language-model backbones for omics and aligns heterogeneous assays into a shared latent space for downstream analysis. Unlike purely generative approaches, MoRE prioritizes cross-sample and cross-modality alignment over sequence reconstruction. Concretely, MoRE attaches parameter-efficient, modality-specific adapters and a task-adaptive fusion layer to the frozen backbone, and optimizes a language-modeling-style masked reconstruction objective jointly with supervised contrastive and batch-invariant alignment losses, yielding structure-preserving embeddings that generalize to unseen cell types, donors, and platforms. We compare MoRE against strong baselines (including scGPT, scVI, Scrublet, and Harmony) across single-cell applications, evaluating integration fidelity, rare-population detection, and modality transfer. These evaluations position MoRE as a practical, batch-robust representation learner for high-dimensional biological data and a concrete step toward general-purpose omics foundation models built on LLM backbones.
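The abstract describes the overall recipe (frozen backbone, per-modality adapters, fusion, and a combined masked-reconstruction / supervised-contrastive / batch-alignment objective) without implementation details. Below is a minimal, hedged sketch of how such a setup could look; it is not the authors' code, the module names (ModalityAdapter, MoRESketch) and loss formulations are illustrative assumptions, and a generic nn.TransformerEncoder stands in for the frozen pretrained LLM backbone.

```python
# Illustrative sketch only: frozen backbone + trainable modality adapters,
# fusion layer, and the three loss terms named in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityAdapter(nn.Module):
    """Projects one omics modality (e.g., RNA or ATAC features) into the
    backbone's token space; only adapter parameters are trained."""
    def __init__(self, in_dim: int, d_model: int, n_tokens: int = 16):
        super().__init__()
        self.proj = nn.Linear(in_dim, n_tokens * d_model)
        self.n_tokens, self.d_model = n_tokens, d_model

    def forward(self, x):                          # x: (batch, in_dim)
        return self.proj(x).view(x.size(0), self.n_tokens, self.d_model)

class MoRESketch(nn.Module):
    def __init__(self, modality_dims: dict, d_model: int = 256, latent: int = 64):
        super().__init__()
        # Stand-in for a frozen pretrained transformer backbone.
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        for p in self.backbone.parameters():
            p.requires_grad = False                # backbone stays frozen
        self.adapters = nn.ModuleDict(
            {m: ModalityAdapter(d, d_model) for m, d in modality_dims.items()})
        self.fusion = nn.Linear(d_model, latent)   # simplified task-adaptive fusion
        self.decoders = nn.ModuleDict(
            {m: nn.Linear(latent, d) for m, d in modality_dims.items()})

    def forward(self, inputs: dict):
        # Concatenate adapter tokens from all modalities, pass through the frozen backbone.
        tokens = torch.cat([self.adapters[m](x) for m, x in inputs.items()], dim=1)
        hidden = self.backbone(tokens).mean(dim=1)  # pooled representation
        z = self.fusion(hidden)
        recon = {m: self.decoders[m](z) for m in inputs}
        return z, recon

def masked_recon_loss(recon: torch.Tensor, target: torch.Tensor, mask: torch.Tensor):
    # LM-style objective: only masked-out features contribute to the loss.
    return F.mse_loss(recon[mask], target[mask])

def sup_contrastive_loss(z: torch.Tensor, labels: torch.Tensor, tau: float = 0.1):
    # Supervised contrastive loss: pull together embeddings that share a label.
    z = F.normalize(z, dim=-1)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(-1e9)
    pos = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos.fill_diagonal_(False)                      # exclude self-pairs
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    return -log_prob[pos].mean()                   # assumes each label appears >= 2x per batch

def batch_alignment_loss(z: torch.Tensor, batch_ids: torch.Tensor):
    # Simple batch-invariance proxy: pull per-batch embedding means together.
    means = torch.stack([z[batch_ids == b].mean(0) for b in batch_ids.unique()])
    return (means - means.mean(0)).pow(2).sum(dim=1).mean()
```

In this sketch, training would minimize a weighted sum of the three losses while gradients flow only into the adapters, fusion layer, and decoders, matching the abstract's claim of training only a small fraction of parameters; the weighting scheme is not specified in the abstract and would be a tuning choice.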
Primary Area: foundation or frontier models, including LLMs
Submission Number: 23403