SemiRetro: Semi-template framework boosts deep retrosynthesis prediction

Published: 28 Jan 2022, Last Modified: 22 Oct 2023, ICLR 2022 Submitted, Readers: Everyone
Keywords: Retrosynthesis prediction, molecular graph learning
Abstract: Retrosynthesis brings scientific and societal benefits by inferring possible reaction routes toward novel molecules. Recently, template-based (TB) and template-free (TF) molecular graph learning methods have shown promising results on this problem. TB methods are more accurate because they use pre-encoded reaction templates, while TF methods are more scalable because they decompose retrosynthesis into subproblems, i.e., center identification and synthon completion. To combine the advantages of both TB and TF, we suggest breaking a full template into several semi-templates and embedding them into the two-step TF framework. Since many semi-templates are duplicated, template redundancy can be reduced while the essential chemical knowledge is still preserved to facilitate synthon completion. We call our method SemiRetro and introduce a directed relational graph attention (DRGAT) layer to extract expressive features for better center identification. Experimental results show that SemiRetro significantly outperforms both existing TB and TF methods. In scalability, SemiRetro covers 96.9% of the data using 150 semi-templates, while the previous template-based GLN requires 11,647 templates to cover 93.3% of the data. In top-1 accuracy, SemiRetro exceeds the template-free G2G by 3.4% (class known) and 6.4% (class unknown). In addition, SemiRetro has better interpretability and training efficiency than existing methods.
One-sentence Summary: We propose a deep graph learning framework that uses semi-templates to facilitate retrosynthesis prediction.
Supplementary Material: zip
Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/arxiv:2202.08205/code)
