Searching for High-Value Molecules Using Reinforcement Learning and Transformers

Raj Ghugare; Santiago Miret; Adriana Hugessen; Mariano Phielipp; Glen Berseth

Published: 27 Oct 2023, Last Modified: 03 Nov 2023, AI4Mat-2023 Poster

Keywords: chemistry, reinforcement learning, language models, molecular docking, pytdc

TL;DR: A new RL algorithm for better molecular discovery.

Abstract: Reinforcement learning (RL) over text representations can be effective for finding high-value policies that can search over graphs. However, RL requires careful structuring of the search space and algorithm design to be effective in this challenge. Through extensive experiments, we explore how different design choices for text grammar and algorithmic choices for training can affect an RL policy's ability to generate molecules with desired properties. We arrive at a new RL-based molecular design algorithm (ChemRLformer) and perform a thorough analysis using 25 molecule design tasks, including computationally complex protein docking simulations. From this analysis, we discover unique insights in this problem space and show that ChemRLformer achieves state-of-the-art performance while being more straightforward than prior work by demystifying which design choices are actually helpful for text-based molecule design.

Submission Track: Papers

Submission Category: AI-Guided Design

Digital Discovery Special Issue: Yes

Submission Number: 38
