NEXT-MOL: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation

ICLR 2025 Conference Submission7088 Authors

Published: 22 Jan 2025, Last Modified: 22 Jan 2025ICLR 2025EveryoneRevisionsBibTeXCC BY 4.0
Keywords: 3D molecule generation, molecular conformer generation, large language models, diffusion models, geometric deep learning
TL;DR: We develop a foundation model that combines the strengths of 1D language modeling and 3D diffusion for 3D molecule generation.
Abstract: 3D molecule generation is crucial for drug discovery and material design. While prior efforts focus on 3D diffusion models for their benefits in modeling continuous 3D conformers, they overlook the advantages of 1D SELFIES-based Language Models (LMs), which can generate 100% valid molecules and leverage the billion-scale 1D molecule datasets. To combine these advantages for 3D molecule generation, we propose a foundation model -- NEXT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation. NEXT-Mol uses an extensively pretrained molecule LM for 1D molecule generation, and subsequently predicts the generated molecule's 3D conformers with a 3D diffusion model. We enhance NEXT-Mol's performance by scaling up the LM's model size, refining the diffusion neural architecture, and applying 1D to 3D transfer learning. Notably, our 1D molecule LM significantly outperforms baselines in distributional similarity while ensuring validity, and our 3D diffusion model achieves leading performances in conformer prediction. Given these improvements in 1D and 3D modeling, NEXT-Mol achieves a 26\% relative improvement in 3D FCD for de novo 3D generation on GEOM-DRUGS, and a 13% average relative gain for conditional 3D generation on QM9-2014. Our codes and pretrained checkpoints are available at https://anonymous.4open.science/r/NEXT-Mol.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7088
Loading