Differential top-k learning for template-based single-step retrosynthesis

Andres M Bran; Philippe Schwaller

Published: 22 Nov 2022, Last Modified: 05 May 2023

Venue: AI4Mat 2022 Spotlight

Readers: Everyone

Keywords: retrosynthesis, loss function, template based, differential top-k

TL;DR: Using a new family of loss functions, we improve top-k accuracies for organic retrosynthesis in a baseline model.

Abstract: Retrosynthesis is one of the core tasks in the organic molecule design cycle, yet producing suitable sets of precursors for a desired product remains a computational challenge. Commonly used template-based approaches reduce single-step retrosynthesis to a multi-class classification task. However, reactions in available datasets are noisy and incomplete, which makes standard training methods problematic. In this work, considering that multiple disconnections are possible for a product, we propose training models using differential top-k losses. We show that these loss functions yield improvements in every top-N metric, with little overhead relative to cross-entropy. More powerful models, more diverse and complete datasets, and other methodologies are expected to yield significant improvements on this task when combined with the training approach presented here.
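To make the top-N metric mentioned in the abstract concrete, the following is a minimal sketch of how top-N accuracy could be computed for a template classifier. It is not the authors' code: the function name `top_n_accuracy`, the toy scores, and the template indices are all hypothetical, and the sketch covers only the evaluation metric, not the top-k loss itself.

```python
def top_n_accuracy(scores, targets, n):
    """Fraction of products whose recorded template is among the
    n highest-scoring templates (illustrative helper, not from the paper)."""
    hits = 0
    for row, target in zip(scores, targets):
        # Template indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if target in ranked[:n]:
            hits += 1
    return hits / len(targets)


# Hypothetical model scores for 3 products over 4 templates.
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.1, 0.3, 0.1],
    [0.3, 0.1, 0.5, 0.1],
]
# Hypothetical index of the template recorded in the dataset for each product.
targets = [1, 2, 0]

top1 = top_n_accuracy(scores, targets, 1)
top2 = top_n_accuracy(scores, targets, 2)
```

In this toy data the recorded template is ranked first for only one product but falls within the top two for all three, which is the situation the abstract describes: several disconnections can be valid for a product, so a metric (or loss) that credits the top k predictions is less sensitive to which single template happens to be recorded.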

Paper Track: Papers

Submission Category: Automated Chemical Synthesis

Supplementary Material: pdf
