Differentiable Search of Evolutionary Trees

Published: 20 Jun 2023, Last Modified: 11 Oct 2023, SODS 2023 Poster
Keywords: Soft Combinatorial Optimization, End-to-end Differentiable Models, Phylogenetic Inference, Evolutionary Biology, Maximum Parsimony, Differentiable Tree Sampling
TL;DR: We introduce a differentiable approach to phylogenetic tree search that optimizes the tree and the ancestral sequences directly in their original representation, and thus requires no prior training data.
Abstract: Inferring the most probable evolutionary tree given leaf nodes is an important problem in computational biology that reveals the evolutionary relationships between species. Because the number of possible tree topologies grows exponentially with the number of leaves, an exhaustive search for the best tree is computationally infeasible. In this work, we propose a novel differentiable approach as an alternative to traditional heuristic-based combinatorial tree search methods in phylogeny. The optimization objective of interest in this work is to find the most parsimonious tree (i.e., the tree that minimizes the total number of evolutionary changes). We empirically evaluate our method using randomly generated trees of up to 128 leaves, with each node represented by a protein sequence of length 256. Our method exhibits promising convergence ($<1$% error for trees up to 32 leaves, $<8$% error up to 128 leaves, given only leaf node information), illustrating its potential in much broader phylogenetic inference problems and possible integration with end-to-end differentiable models. The code to reproduce the experiments in this paper can be found at https://github.ramith.io/diff-evol-tree-search.
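To make the objective concrete, the following is a minimal sketch, not the authors' implementation, of the two quantities the abstract refers to: the parsimony cost of a fixed tree and the size of the topology search space. It assumes that the cost of a tree is the sum of Hamming distances along its edges once every node (leaves and ancestors) has a sequence, and that topologies are counted as rooted, leaf-labelled binary trees, of which there are $(2n-3)!!$ for $n$ leaves. The names `parsimony_cost`, `num_rooted_topologies`, and the toy tree are hypothetical and chosen for illustration only.

```python
from math import prod


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def parsimony_cost(parent: dict, seqs: dict) -> int:
    """Total number of changes along all edges of a tree.

    `parent` maps each node to its parent (None for the root);
    `seqs` maps each node, leaf or ancestor, to its sequence.
    """
    return sum(
        hamming(seqs[child], seqs[par])
        for child, par in parent.items()
        if par is not None
    )


def num_rooted_topologies(n_leaves: int) -> int:
    """(2n-3)!!: count of rooted, leaf-labelled binary trees on n leaves."""
    if n_leaves < 2:
        return 1
    return prod(range(1, 2 * n_leaves - 2, 2))


# Toy example (assumed data): four leaves, two ancestors, one root.
parent = {
    "root": None,
    "anc_a": "root", "anc_b": "root",
    "leaf1": "anc_a", "leaf2": "anc_a",
    "leaf3": "anc_b", "leaf4": "anc_b",
}
seqs = {
    "root": "AAAA", "anc_a": "AAAA", "anc_b": "TTAA",
    "leaf1": "AAAA", "leaf2": "AAAT", "leaf3": "TTAA", "leaf4": "TTAT",
}

print(parsimony_cost(parent, seqs))
print([num_rooted_topologies(n) for n in (3, 4, 5)])
```

In this sketch both the topology (`parent`) and the ancestral sequences are given; the search problem described in the abstract is to choose them, given only the leaf sequences, so that this cost is as small as possible.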
Submission Number: 37