Abstract: We introduce PHILHARMONIC, a computational framework that couples deep learning-based de novo network inference with robust unsupervised spectral clustering algorithms to uncover functional relationships and high-level organization in non-model organisms. Our novel clustering approach produces highly informative functional modules by de-noising the predicted network. We also develop a novel algorithm called ReCIPE, which aims to reconnect disconnected clusters, increasing functional enrichment and biological interpretability. We first perform remote homology-based functional annotation, leveraging hmmscan and GODomainMiner to assign initial functions to proteins at large evolutionary distances; our clusters then enable us to newly assign functions to uncharacterized proteins through “function by association.” We validate the ability of PHILHARMONIC to recover gold-standard functional enrichments in the well-annotated fruit fly D. melanogaster, and apply it to investigate stress response in the reef-building coral P. damicornis and its algal symbiont C. goreaui. Easy to run end-to-end and requiring only a sequenced proteome, PHILHARMONIC is an engine for biological hypothesis generation and discovery in non-model organisms.