Discovering Chemical Space from First Principles with Reinforcement Learning

Bjarke Hastrup; François R J Cornet; Tejs Vegge; Arghya Bhowmik

Published: 31 Oct 2025, Last Modified: 24 Nov 2025, SIMBIOCHEM 2025 Spotlight, CC BY 4.0

Keywords: Reinforcement Learning, 3D Molecular Design, Physics Guided Discovery

TL;DR: We present an online reinforcement learning framework that autonomously generates valid 3D molecular structures across diverse compositions without pretraining, achieving up to an order of magnitude improvement in novel isomer discovery.

Abstract: Discovering novel stable molecules without training data remains a grand scientific challenge. Current molecular generative models are trained on large, pre-curated datasets, which introduce biases and limit exploration of novel chemistry. In contrast, we propose a new paradigm: autonomous, generalized agents capable of mapping vast, unknown chemical spaces without any pretraining. For the first time, we present a self-guided agent that autonomously constructs valid 3D isomers under stoichiometric constraints and is trained exclusively online using reinforcement learning. Unlike existing approaches, which generally overfit to a specific chemical formula, we establish a multi-composition training scheme that enables broad generalization across diverse chemistry, guided by energy- and validity-based rewards. Our agent discovers up to an order of magnitude more valid isomers on unseen test formulas than the baseline. These results fulfil the promise of online RL as a powerful paradigm for scalable tabula rasa exploration of the chemical configuration space.
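To make the phrases "stoichiometric constraints" and "multi-composition training" concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that a chemical formula is treated as a bag of atoms, that each episode draws one formula from a training set, and that the agent may only place elements still left in the bag. The names `parse_formula`, `allowed_elements`, and `run_episode` are hypothetical, a uniform random choice stands in for the learned policy, and 3D coordinates and the energy- and validity-based rewards are omitted.

```python
import random
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Turn a formula such as 'C2H6O' into a bag of atoms {C: 2, H: 6, O: 1}."""
    bag = Counter()
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        bag[element] += int(count) if count else 1
    return bag


def allowed_elements(bag: Counter) -> list[str]:
    """Elements the agent may still place: those with atoms left in the bag."""
    return sorted(e for e, n in bag.items() if n > 0)


def run_episode(formulas: list[str], rng: random.Random) -> list[str]:
    """Sample one formula, then place atoms until its bag is empty.

    Assumption: a random choice stands in for the learned policy, and
    atom positions are not modelled here.
    """
    bag = parse_formula(rng.choice(formulas))
    placed = []
    while allowed_elements(bag):
        element = rng.choice(allowed_elements(bag))
        bag[element] -= 1
        placed.append(element)
    return placed
```

Because actions are restricted to the remaining bag, every finished episode matches the sampled formula exactly, so any reward signal only has to judge geometry and stability rather than composition.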

Release To Public: Yes, please release this paper to the public

Submission Number: 31
