Keywords: Variational Inference, Probabilistic Learning, Molecule Generation
Abstract: Autoregressive models (ARMs) have become the workhorse for sequence generation tasks because of their simplicity and their ability to evaluate their log-likelihood exactly. Classical Fixed-Order (FO) ARMs factorize high-dimensional data according to a fixed canonical ordering, framing the task as next-token prediction. While a natural ordering exists for text (left-to-right), canonical orderings are less obvious for many other data modalities, such as molecular graphs and sequences. Learned-Order (LO) ARMs address this limitation, but their training relies on optimizing an Evidence Lower Bound (ELBO) rather than the exact log-likelihood, so FO-ARMs often retain a performance advantage. In this paper, we introduce LO-ARMs++, an improved version of LO-ARMs that narrows this gap through several technical improvements: an improved training objective, the $\alpha$-$\beta$-ELBO, together with network architectural changes. We demonstrate the general applicability of the $\alpha$-$\beta$-ELBO, which improves distribution-learning metrics on both molecular graph and string generation. Moreover, on the challenging domain of molecular sequence generation, LO-ARMs++ match or surpass state-of-the-art FO-ARM results on the GuacaMol and MOSES SMILES benchmarks on key distribution-similarity metrics.
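To make the contrast in the abstract concrete, the following is a minimal sketch (not from the paper) of the fixed-order chain-rule factorization that gives FO-ARMs their exact log-likelihood, $\log p(x) = \sum_t \log p(x_t \mid x_{<t})$. The bigram conditional table and the start marker `^` are hypothetical toy choices for illustration only; a learned-order model would instead sum over orderings and can only bound this quantity via an ELBO.

```python
import math

# Hypothetical toy character-level FO-ARM: conditionals p(x_t | x_{t-1})
# given by a fixed table; "^" is an assumed start-of-sequence marker.
COND = {
    ("^", "a"): 0.6, ("^", "b"): 0.4,
    ("a", "a"): 0.1, ("a", "b"): 0.9,
    ("b", "a"): 0.7, ("b", "b"): 0.3,
}

def exact_log_likelihood(seq: str) -> float:
    """Exact log-likelihood under the fixed left-to-right ordering:
    log p(x) = sum_t log p(x_t | x_{t-1})."""
    logp, prev = 0.0, "^"
    for tok in seq:
        logp += math.log(COND[(prev, tok)])
        prev = tok
    return logp

# e.g. exact_log_likelihood("ab") == log(0.6) + log(0.9) == log(0.54)
```

Because the ordering is fixed, this sum is the true log-likelihood; an LO-ARM would average over sampled orderings, yielding only a lower bound.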
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Submission Number: 20780