De novo Drug Design using Reinforcement Learning with Dynamic Vocabulary

Xiuyuan Hu; Guoqing Liu; Yang Zhao; Hao Zhang

17 Sept 2023 (modified: 25 Mar 2024) | ICLR 2024 Conference Withdrawn Submission

Keywords: De novo drug design, Molecular generation, Reinforcement learning, Dynamic vocabulary

Abstract: De novo drug design constitutes a fundamental challenge within the domain of computer-aided drug discovery (CADD). Generative models relying on SMILES molecular strings have emerged as promising tools for this purpose. However, extant SMILES-based generative models all adopt a fixed vocabulary, leading to deficiencies in both sampling efficiency and interpretability. In this paper, we propose RLDV, a reinforcement learning (RL) algorithm based on a GPT agent, which uses a dynamic chemical vocabulary (DV) during RL iterations. Specifically, we utilize SMILES pair encoding to analyze high-scoring molecular SMILES strings generated during the RL process, and extract their high-frequency common substrings, which are then added as new tokens to the agent's vocabulary. These additions aid in the generation of molecules during subsequent RL steps. Experimental results on the GuacaMol benchmark demonstrate that our algorithm outperforms existing models across multiple tasks, highlighting the practical significance of the dynamic vocabulary in drug design. Furthermore, the application of our algorithm in the design of protein-targeting drugs for SARS-CoV-2 underscores its substantial practical relevance.
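
The vocabulary-extension step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it substitutes a generic byte-pair-style merge for SMILES pair encoding, omits the GPT agent and the RL scoring loop, and uses a simplified tokenizer. All names (`tokenize`, `most_frequent_pair`, `merge_pair`, `extend_vocabulary`) and the parameters `num_new_tokens` and `min_count` are hypothetical.

```python
import re
from collections import Counter

# Simplified atom-level SMILES tokenizer (assumption: handles bracket atoms,
# Br/Cl, two-digit ring closures, and treats everything else as one character).
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into atom-level tokens."""
    return TOKEN_RE.findall(smiles)


def most_frequent_pair(corpus):
    """Return ((left, right), count) for the most common adjacent token pair."""
    counts = Counter()
    for tokens in corpus:
        counts.update(zip(tokens, tokens[1:]))
    return counts.most_common(1)[0] if counts else None


def merge_pair(tokens, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def extend_vocabulary(vocab, high_scoring_smiles, num_new_tokens=2, min_count=2):
    """Add frequent substrings of high-scoring molecules to `vocab` in place.

    Returns the list of newly added tokens, in the order they were merged.
    """
    corpus = [tokenize(s) for s in high_scoring_smiles]
    new_tokens = []
    for _ in range(num_new_tokens):
        best = most_frequent_pair(corpus)
        if best is None or best[1] < min_count:
            break  # no pair is frequent enough to justify a new token
        pair = best[0]
        corpus = [merge_pair(tokens, pair) for tokens in corpus]
        token = pair[0] + pair[1]
        if token not in vocab:
            vocab.add(token)
            new_tokens.append(token)
    return new_tokens


if __name__ == "__main__":
    vocab = {"c", "1", "O", "N", "C"}
    top_molecules = ["c1ccccc1O", "c1ccccc1N", "c1ccccc1C"]  # toy examples
    print(extend_vocabulary(vocab, top_molecules))
```

In an RL setting of the kind the abstract describes, a step like `extend_vocabulary` would presumably run periodically on the best-scoring molecules sampled so far, with the agent's embedding and output layers grown to cover the new tokens; how RLDV handles that is not specified here.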

Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)

Submission Number: 929
