Track: Full / long paper (5-8 pages)
Keywords: sequence design, phylogenetic inference, ancestral sequence reconstruction, generative models
TL;DR: Joint Sequence Generation and Phylogenetic Inference with Generative Flow Networks
Abstract: Phylogenetic inference remains computationally challenging due to the exponentially growing tree topology search space, and current methods rely heavily on multiple sequence alignments (MSAs), which are expensive and error-prone. We propose AncestorGFN, a proof-of-concept approach leveraging Generative Flow Networks (GFlowNets) for simultaneous sequence generation and phylogenetic exploration without requiring explicit MSAs. Our method learns to generate sequences matching a target distribution while the flow trajectories implicitly encode structural relationships among sequences. We demonstrate that greedy traceback on maximum-flow trajectories recovers shared intermediate states suggestive of common ancestry, and evaluate on the let-7 microRNA family, where the learned flow structure qualitatively captures phylogenetic branching patterns. Furthermore, beam search at inference time discovers novel sequences clustering near known targets, suggesting applications in $\textit{de novo}$ sequence design. This work establishes an initial foundation for alignment-free phylogenetic exploration using generative models.
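As an illustration of the greedy traceback idea mentioned in the abstract, the following is a minimal sketch and not the AncestorGFN implementation. It assumes that learned edge flows are available as a lookup from each state to its candidate parent states, and that states are partial sequences grown by appending or prepending one character. The names `greedy_traceback`, `shared_intermediates`, and `parent_flows`, along with all flow values, are hypothetical and chosen only for this toy example.

```python
from collections import Counter


def greedy_traceback(terminal, parent_flows, root=""):
    """Walk from a terminal state back to the root, always following
    the incoming edge with the largest flow. Returns the path root-first."""
    path = [terminal]
    state = terminal
    while state != root:
        parents = parent_flows[state]          # parent state -> flow on edge parent -> state
        state = max(parents, key=parents.get)  # greedy choice: highest-flow parent
        path.append(state)
    return path[::-1]


def shared_intermediates(terminals, parent_flows, root=""):
    """Return intermediate states that lie on the greedy traceback
    of at least two terminal sequences."""
    counts = Counter()
    for terminal in terminals:
        # Skip the root and the terminal itself; keep only intermediate states.
        for state in greedy_traceback(terminal, parent_flows, root)[1:-1]:
            counts[state] += 1
    return {state for state, n in counts.items() if n >= 2}


# Toy edge flows (made-up values, for illustration only).
parent_flows = {
    "U": {"": 1.0},
    "G": {"": 0.2},
    "UG": {"U": 0.9, "G": 0.1},
    "GA": {"G": 0.1},
    "UGA": {"UG": 0.6, "GA": 0.1},
    "UGG": {"UG": 0.3},
}

path = greedy_traceback("UGA", parent_flows)
shared = shared_intermediates(["UGA", "UGG"], parent_flows)
```

In this toy setting both terminal sequences trace back through the same partial sequences, which is the kind of shared intermediate state the abstract describes as suggestive of common ancestry.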
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 11