Abstract: Because the design space of molecules is vast, generating molecules conditioned on a specific sub-structure relevant to a particular function or therapeutic target is a crucial task in computer-aided drug design. Existing works mainly focus on specific tasks, such as linker design or scaffold hopping. Each task requires training a model from scratch, and the parameters of many well-pretrained de novo molecule generation models are not effectively utilized. To this end, we propose a two-stage training approach consisting of condition learning and condition optimization. In the condition learning stage, we adopt the idea of ControlNet and design meaningful adjustments so that an unconditional generative model learns sub-structure-conditioned generation. In the condition optimization stage, we use human preference learning to further enhance the stability and robustness of sub-structure control. In our experiments, although trained only on randomly partitioned sub-structure data, the proposed method outperforms previous techniques by generating more valid and diverse molecules. Our method is easy to implement and can be quickly applied to various pre-trained molecule generation models.