Lyrebird: Toward Robust and Generalizable 3D Molecular Conformer Generation via Equivariant Flows

Published: 02 Mar 2026, Last Modified: 13 Apr 2026, Venue: GEM 2026, License: CC BY 4.0
Keywords: Conformer, benchmarks, energy, split, flow matching, equivariant, chemistry, transformer, molecules
TL;DR: Butina splits are more realistic than random splits for conformer generation. Joint training on QM9, Drugs, and CREMP improves macrocycle coverage without hurting drug-like performance. We introduce two new benchmarks for macrocycles and energy.
Abstract: Recent generative models for 3D molecular conformer generation have made impressive progress, but training data and benchmarks remain limited and often fail to evaluate how useful and trustworthy these models are as computational chemistry tools. We introduce Lyrebird, a general-purpose model for 3D molecular conformer generation built on the ET-Flow (Equivariant Transformer Flow) architecture, and evaluate its generalization by training jointly on Butina-split datasets of drug-like molecules from GEOM-Drugs and GEOM-QM9 and of macrocyclic peptides from CREMP. We also introduce two benchmarks: MPCONF196GEN, a macrocycle conformer generation benchmark set derived from the MPCONF196 energy benchmark set, and an energy-based benchmark that evaluates both conformer sampling within the lowest-energy basin and the degree of structural relaxation in generated conformers. Lyrebird matches state-of-the-art ML methods and outperforms ETKDGv3 on coverage and matching metrics for drug-like molecules, and it improves performance on macrocycles over models trained only on GEOM-Drugs.
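The abstract refers to coverage and matching metrics without defining them. The sketch below shows how such metrics are commonly computed in the conformer generation literature, given a precomputed matrix of RMSD values between reference and generated conformers. It is an illustration under assumptions, not the paper's evaluation code: the function name `coverage_and_matching`, the strict `<` comparison, and the 0.75 Å threshold in the usage example are all hypothetical choices made here.

```python
def coverage_and_matching(rmsd, threshold):
    """Compute coverage (COV) and matching (MAT) scores for one molecule.

    rmsd[i][j] is the RMSD (in angstroms) between reference conformer i
    and generated conformer j. Names and conventions are assumptions made
    for illustration; the paper may define its metrics differently.
    """
    # Recall view: how well does the generated set cover each reference?
    best_per_ref = [min(row) for row in rmsd]
    # Precision view: how close is each generated conformer to any reference?
    best_per_gen = [min(col) for col in zip(*rmsd)]

    def fraction_below(values):
        # Share of conformers whose nearest counterpart is within threshold.
        return sum(v < threshold for v in values) / len(values)

    def mean(values):
        return sum(values) / len(values)

    return {
        "cov_recall": fraction_below(best_per_ref),
        "mat_recall": mean(best_per_ref),
        "cov_precision": fraction_below(best_per_gen),
        "mat_precision": mean(best_per_gen),
    }


# Toy example: 2 reference conformers, 3 generated conformers.
example_rmsd = [
    [0.3, 1.2, 0.9],
    [1.1, 0.8, 1.5],
]
scores = coverage_and_matching(example_rmsd, threshold=0.75)
```

In this toy case only the first reference conformer has a generated conformer within the threshold, so recall coverage is one half, while the matching scores average the nearest-neighbor RMSD in each direction. Higher coverage and lower matching values indicate better agreement with the reference ensemble.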
Submission Number: 121