Solving the Tree Containment Problem Using Graph Neural Networks

Arkadiy Dushatskiy; Esther Julien; Leen Stougie; Leo van Iersel

Published: 12 Jun 2024, Last Modified: 17 Sept 2024. Accepted by TMLR. License: CC BY 4.0

Abstract: \textsc{Tree containment} is a fundamental problem in phylogenetics, useful for verifying a proposed phylogenetic network that represents the evolutionary history of a set of species. \textsc{Tree containment} asks whether a given phylogenetic tree (for instance, one constructed from a DNA fragment showing tree-like evolution) is contained in a given phylogenetic network. In general, this problem is NP-complete. We propose to solve it approximately using Graph Neural Networks. In particular, we combine the given network and tree into a single graph and apply a Graph Neural Network to this network-tree graph. This allows us to solve tree containment instances with more species than the instances in the training dataset (i.e., our algorithm has inductive learning ability). Our algorithm achieves an accuracy of over $95\%$ on tree containment instances with up to 100 leaves.
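
To make the idea of a combined network-tree graph concrete, the following minimal Python sketch builds one edge list from a network and a tree. It is an illustration only, not the authors' implementation: the abstract does not say how the two inputs are combined, so the rule used here (identifying leaves that carry the same species label) is an assumption, and the function name `combine_network_and_tree` and the `N:`/`T:`/`L:` prefixes are hypothetical.

```python
def combine_network_and_tree(network_edges, tree_edges):
    """Return one directed edge list covering both inputs.

    ASSUMPTION (not stated in the abstract): leaves with the same label
    are identified, so they become single nodes shared by both parts.
    Internal nodes are prefixed to keep the two inputs apart.
    """
    def leaves(edges):
        # A leaf is a node that has an incoming edge but no outgoing edge.
        parents = {u for u, _ in edges}
        children = {v for _, v in edges}
        return children - parents

    def relabel(edges, prefix):
        leaf_set = leaves(edges)

        def name(node):
            return f"L:{node}" if node in leaf_set else f"{prefix}:{node}"

        return [(name(u), name(v)) for u, v in edges]

    return relabel(network_edges, "N") + relabel(tree_edges, "T")


# Toy instance on leaves a, b, c; node h is a reticulation (two parents).
network_edges = [("r", "u"), ("r", "v"), ("u", "a"), ("u", "h"),
                 ("v", "h"), ("v", "c"), ("h", "b")]
tree_edges = [("r", "x"), ("r", "c"), ("x", "a"), ("x", "b")]

combined = combine_network_and_tree(network_edges, tree_edges)
nodes = {n for edge in combined for n in edge}
```

In this toy instance the combined graph keeps all edges of both inputs, while the three leaves are shared. A graph built this way could then be passed to a Graph Neural Network for binary classification (contained or not contained); the node features and model architecture are left open here.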

Submission Length: Regular submission (no more than 12 pages of main content)

Code: https://github.com/ArkadiyD/PhyloGNN

Assigned Action Editor: ~Ellen_Vitercik1

Submission Number: 2229
