Keywords: generative machine learning, molecular machine learning, materials discovery, transfer learning, reinforcement learning
TL;DR: This study benchmarks transfer learning strategies for open-source models to generate novel covalent triazine frameworks, demonstrating that an optimized methodology significantly increases the yield of valid candidates.
Abstract: Generative AI holds immense promise for accelerating materials discovery. However, current AI models, often pre-trained on general chemical datasets, fail to generalize to materials science due to divergent chemical spaces and unique property constraints. In this work, we explore the transferability of knowledge from existing generative AI solutions to the design of porous carbon materials for supercapacitors. We evaluate the ability of open-source models to adhere to stringent material-specific requirements, such as high porosity, electronic conductivity, and the potential for cyclization into solid frameworks. Furthermore, we investigate which transfer learning strategies work best, assessing the trade-offs between retraining, fine-tuning, and reinforcement learning. Preliminary experiments with the REINVENT, MolMIM, and Mol-AIR generative models demonstrate that applying fine-tuning and reinforcement learning raises the yield of valid candidate molecules from at most 5.8\% before transfer learning to as much as 88.4\% afterwards. Critically, our findings reveal that measures commonly used for benchmarking generative models in chemistry, such as validity, novelty, and uniqueness, are not aligned with the true goals of de novo molecule generation for materials science.
Submission Track: Findings, Tools & Open Challenges
Submission Category: AI-Guided Design
Institution Location: Poznan, Poland
AI4Mat RLSF: Yes
Submission Number: 86