Keywords: Reinforcement Learning, 3D Molecular Design, Physics Guided Discovery
TL;DR: We present an online reinforcement learning framework that autonomously generates valid 3D molecular structures across diverse compositions without pretraining, achieving up to an order-of-magnitude improvement in novel isomer discovery.
Abstract: Discovering novel stable molecules without training data remains a grand scientific
challenge. Current molecular generative models are trained on large, pre-curated
datasets, which introduce biases and limit exploration of novel chemistry. In
contrast, we propose a new paradigm: autonomous, generalized agents capable of
mapping vast, unknown chemical spaces without any pretraining. For the first time,
we present a self-guided agent that autonomously constructs valid 3D isomers under
stoichiometric constraints and is trained exclusively online using reinforcement
learning. Unlike existing approaches that generally overfit to a specific chemical
formula, we establish a multi-composition training scheme that enables broad
generalization across diverse chemistry, guided by energy- and validity-based
rewards. Our agent can discover up to an order of magnitude more valid isomers
on unseen test formulas than the baseline. These results fulfil the promise of online
RL as a powerful paradigm for scalable tabula rasa exploration of the chemical
configuration space.
Release To Public: Yes, please release this paper to the public
Submission Number: 31