Sequence Generation and Phylogenetic Inference with Generative Flow Networks

Published: 02 Mar 2026, Last Modified: 10 Mar 2026, Gen² 2026 Poster, Everyone, CC BY 4.0
Track: Full / long paper (5-8 pages)
Keywords: sequence design, phylogenetic inference, ancestral sequence reconstruction, generative models
TL;DR: Joint Sequence Generation and Phylogenetic Inference with Generative Flow Networks
Abstract: Phylogenetic inference remains computationally challenging due to the exponentially growing tree topology search space, and current methods rely heavily on multiple sequence alignments (MSAs), which are expensive and error-prone. We propose AncestorGFN, a proof-of-concept approach leveraging Generative Flow Networks (GFlowNets) for simultaneous sequence generation and phylogenetic exploration without requiring explicit MSAs. Our method learns to generate sequences matching a target distribution while the flow trajectories implicitly encode structural relationships among sequences. We demonstrate that greedy traceback on maximum-flow trajectories recovers shared intermediate states suggestive of common ancestry, and evaluate on the let-7 microRNA family, where the learned flow structure qualitatively captures phylogenetic branching patterns. Furthermore, beam search at inference time discovers novel sequences clustering near known targets, suggesting applications in $\textit{de novo}$ sequence design. This work establishes an initial foundation for alignment-free phylogenetic exploration using generative models.
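Illustrative sketch: The greedy traceback mentioned in the abstract can be pictured on a toy example. The snippet below is a minimal sketch under stated assumptions, not the submission's implementation: the state graph, the edge-flow values in `EDGE_FLOWS`, and the helper names `greedy_traceback` and `shared_states` are all hypothetical. States are partial sequences, the empty string is the root, and from each terminal sequence the traceback repeatedly steps to the parent with the largest incoming flow. States that appear on the traced paths of two different terminals play the role of shared intermediate states.

```python
from typing import Dict, List, Tuple

# Hypothetical learned edge flows F(parent -> child) on a toy acyclic state graph.
# States are partial sequences; "" is the root (empty sequence).
EDGE_FLOWS: Dict[Tuple[str, str], float] = {
    ("", "U"): 5.0,
    ("", "G"): 1.0,
    ("U", "UG"): 4.0,
    ("G", "UG"): 0.5,    # "UG" can also be reached by prepending to "G"
    ("G", "GA"): 0.5,
    ("UG", "UGA"): 2.5,
    ("GA", "UGA"): 0.2,  # "UGA" has two candidate parents
    ("UG", "UGU"): 1.5,
}


def greedy_traceback(
    terminal: str, flows: Dict[Tuple[str, str], float], root: str = ""
) -> List[str]:
    """Walk from a terminal state back to the root, always taking the
    parent whose edge into the current state carries the most flow."""
    path = [terminal]
    state = terminal
    while state != root:
        incoming = {p: f for (p, c), f in flows.items() if c == state}
        if not incoming:
            raise ValueError(f"state {state!r} has no parent in the flow graph")
        state = max(incoming, key=incoming.get)
        path.append(state)
    return path[::-1]  # root first, terminal last


def shared_states(path_a: List[str], path_b: List[str]) -> List[str]:
    """States visited by both traced paths, in root-to-leaf order of path_a."""
    common = set(path_a) & set(path_b)
    return [s for s in path_a if s in common]


path_uga = greedy_traceback("UGA", EDGE_FLOWS)
path_ugu = greedy_traceback("UGU", EDGE_FLOWS)
common = shared_states(path_uga, path_ugu)
print(path_uga)    # ['', 'U', 'UG', 'UGA']
print(path_ugu)    # ['', 'U', 'UG', 'UGU']
print(common[-1])  # UG
```

In this toy graph the deepest shared state, `UG`, is the analogue of a candidate common ancestor for the two terminal sequences. In a trained model the flows would come from the network rather than a hand-written table.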
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 11