BIOLOGICAL SYMMETRY CONSISTENCY: A FRAMEWORK FOR EVALUATING REPRESENTATIONAL MEANINGFULNESS IN COMPUTATIONAL BIOLOGY
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Track: tiny / short paper (up to 5 pages)
Keywords: Representation learning, biological embeddings, symmetry consistency, self-supervised learning, single-cell transcriptomics, cellular morphology, model evaluation, invariant representations
TL;DR: We introduce Biological Symmetry Consistency (BSC), a lightweight framework to evaluate whether embeddings respect biologically meaningful transformations, revealing differences between models that standard task accuracy cannot capture.
Abstract: The growing use of representation learning in computational biology has not been matched by principled methods for assessing whether learned embeddings capture biologically meaningful structure. Current practice relies largely on downstream predictive performance, which offers limited insight into whether embeddings respect known biological invariances and symmetries. We introduce Biological Symmetry Consistency (BSC), a lightweight evaluation framework that measures the stability of learned representations under biologically meaningful transformations. We define the Symmetry Consistency Score (SCS), which quantifies embedding similarity between original samples and their symmetry-preserving variants, enabling assessment of representational invariance independent of downstream tasks. Although BSC is agnostic to model architecture, it requires transformations appropriate to each data modality (e.g., rotations for cellular images, gene permutations for transcriptomics). We evaluate BSC on cellular morphology (BBBC021) and single-cell transcriptomics (PBMC 3k) using both supervised and self-supervised embedding models. We observe SCS differences of 0.10–0.17 between models with similar task performance (p < 0.01, bootstrap 95% CI: ±0.02), highlighting a gap between predictive accuracy and symmetry adherence. These findings demonstrate that BSC provides a complementary diagnostic for evaluating biologically faithful embeddings and can guide model selection in any data modality where meaningful transformations can be defined.
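The abstract describes SCS as an embedding similarity between original samples and their symmetry-preserving variants, but does not state the similarity metric or the aggregation. The sketch below is one possible reading, not the authors' implementation: it assumes cosine similarity, averaged over samples and transformations. The names `symmetry_consistency_score`, `rot90`, `invariant_embed`, and `pixel_embed` are hypothetical and used only for illustration, and the data are random toy arrays rather than BBBC021 or PBMC 3k.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Row-wise cosine similarity between two (n_samples, dim) arrays."""
    num = np.sum(a * b, axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + eps
    return num / den


def symmetry_consistency_score(embed, samples, transforms):
    """Mean cosine similarity between embeddings of the original samples
    and embeddings of their transformed variants.

    ASSUMPTION: cosine similarity and plain averaging are illustrative
    choices; the abstract does not specify either.
    """
    base = embed(samples)
    sims = [cosine_similarity(base, embed(t(samples))) for t in transforms]
    return float(np.mean(sims))


# Hypothetical transformation: rotate each image in a batch by 90 degrees.
def rot90(images):
    return np.rot90(images, k=1, axes=(1, 2))


# Toy "embedding" built from rotation-invariant statistics (mean, std).
def invariant_embed(images):
    return np.stack(
        [images.mean(axis=(1, 2)), images.std(axis=(1, 2))], axis=-1
    )


# Toy "embedding" that flattens raw pixels, so it is not rotation-invariant.
def pixel_embed(images):
    return images.reshape(len(images), -1)


rng = np.random.default_rng(0)
images = rng.random((8, 16, 16))  # random stand-in for cellular images

scs_invariant = symmetry_consistency_score(invariant_embed, images, [rot90])
scs_pixel = symmetry_consistency_score(pixel_embed, images, [rot90])
print(f"invariant embedding SCS: {scs_invariant:.3f}")
print(f"raw-pixel embedding SCS: {scs_pixel:.3f}")
```

Under these assumptions, the rotation-invariant toy embedding scores close to 1, while the raw-pixel embedding scores lower. For transcriptomics, the same function could in principle be called with a gene-permutation transform in place of `rot90`.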
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Mahule_Roy1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 10