Automatic Generation of Mechanistic Pathways of Organic Reactions with Dual Templates

Published: 27 Oct 2023, Last Modified: 03 Nov 2023, AI4Mat-2023 Poster
Keywords: Organic Reaction Mechanisms, Mechanistic Annotation, Mechanistic Dataset, Reaction Dataset
TL;DR: We present MechFinder, the first attempt to automatically label reaction mechanisms in order to curate a high-quality mechanistic reaction dataset, which would help the chemistry community develop mechanism-level reactivity prediction models.
Abstract: Understanding organic reaction mechanisms is crucial for interpreting the formation of products at the atomic and electronic level, but it remains the domain of expert chemists. The lack of a large-scale dataset with chemically reasonable mechanistic sequences also hinders the development of reliable machine learning models that predict organic reactions based on mechanisms, as human chemists do. Here, we propose a method that automatically generates reaction mechanisms for a large dataset of organic reactions using autonomously extracted reaction templates and expert-coded mechanistic templates. By applying this method, we labeled 94.8% of 33k USPTO reactions with chemically reasonable arrow-pushing diagrams, validated by expert chemists. Our method is simple, flexible, and can be expanded to cover a wider range of reactions, regardless of type or complexity. We envision it becoming an invaluable tool for proposing reaction mechanisms, developing future reaction outcome prediction models, and discovering new reactions.
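The dual-template idea in the abstract can be illustrated with a minimal sketch. This is not MechFinder's actual code or data format: the template key, the atom labels, the `MechanisticTemplate` class, and the `label_mechanism` function are all hypothetical, and the single hand-written entry (amine addition to an acid chloride followed by chloride loss) only stands in for an expert-coded mechanistic template. The sketch assumes a reaction template has already been extracted and that its labeled atoms have been matched to atom indices in a concrete reaction.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# An arrow is (electron source, electron sink). Each side is a tuple of atom
# labels: one label means an atom (lone pair or charge site), two labels mean
# the bond between those atoms. This encoding is an assumption of the sketch.
Arrow = Tuple[Tuple[str, ...], Tuple[str, ...]]


@dataclass
class MechanisticTemplate:
    """Hypothetical expert-coded mechanism: a sequence of arrow-pushing steps."""
    name: str
    steps: List[List[Arrow]]


# Hypothetical reaction-template key (simplified, SMARTS-like string).
ACYL_KEY = "[C:1](=[O:2])[Cl:3].[N:4]>>[C:1](=[O:2])[N:4]"

# Hypothetical library mapping reaction templates to mechanistic templates.
MECHANISTIC_TEMPLATES: Dict[str, MechanisticTemplate] = {
    ACYL_KEY: MechanisticTemplate(
        name="addition-elimination at an acid chloride (illustrative)",
        steps=[
            # Step 1: N lone pair attacks C; the C=O pi bond moves onto O.
            [(("N4",), ("N4", "C1")), (("C1", "O2"), ("O2",))],
            # Step 2: O lone pair reforms C=O; the C-Cl bond leaves with Cl.
            [(("O2",), ("C1", "O2")), (("C1", "Cl3"), ("Cl3",))],
        ],
    )
}


def label_mechanism(
    reaction_template: str, atom_map: Dict[str, int]
) -> Optional[List[List[Tuple[Tuple[int, ...], Tuple[int, ...]]]]]:
    """Translate template-level arrows into arrows over concrete atom indices.

    Returns None when no mechanistic template is registered for the
    reaction template, i.e. the reaction stays unlabeled.
    """
    mech = MECHANISTIC_TEMPLATES.get(reaction_template)
    if mech is None:
        return None
    return [
        [
            (tuple(atom_map[a] for a in src), tuple(atom_map[a] for a in dst))
            for src, dst in step
        ]
        for step in mech.steps
    ]


# Usage: atom indices 5, 6, 7, 12 are made-up positions in a concrete reaction.
arrows = label_mechanism(ACYL_KEY, {"C1": 5, "O2": 6, "Cl3": 7, "N4": 12})
for i, step in enumerate(arrows, start=1):
    print(f"step {i}: {step}")
```

In this sketch, a reaction whose extracted template has no registered mechanistic template simply returns `None`, which mirrors the fact that a fraction of the dataset remains unlabeled.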
Submission Track: Papers
Submission Category: Automated Chemical Synthesis
Supplementary Material: pdf
Submission Number: 21