Stepping Back to SMILES Transformers for Fast Molecular Representation Inference

Published: 28 Jan 2022, Last Modified: 13 Feb 2023, ICLR 2022 Submission
Keywords: molecular representation learning, knowledge distillation
Abstract: At the intersection of molecular science and deep learning, tasks like virtual screening have driven the need for a high-throughput molecular representation generator on large chemical databases. However, since SMILES strings are the most common storage format for molecules, using deep graph models to extract molecular features from raw SMILES data requires a SMILES-to-graph conversion, which significantly slows down the whole process. Deriving molecular representations directly from SMILES is feasible, yet there exists a large performance gap between existing SMILES-based models and graph-based models on benchmark results. To address this issue, we propose ST-KD, an end-to-end SMILES Transformer for molecular representation learning boosted by Knowledge Distillation. To conduct knowledge transfer from graph Transformers to ST-KD, we redesign the attention layers and introduce a pre-transformation step that tokenizes the SMILES strings and injects structure-based positional embeddings. ST-KD shows competitive results on the latest standard molecular datasets PCQM4M-LSC and QM9, with $3\text{-}14\times$ faster inference than existing graph models.
One-sentence Summary: SMILES Transformer for fast molecular representation inference with knowledge distillation.
Supplementary Material: zip
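The abstract does not specify the distillation objective used to transfer knowledge from the graph-Transformer teacher to the SMILES-Transformer student. The following is a minimal sketch of one plausible formulation, assuming embedding-level matching against a frozen teacher combined with the downstream regression loss; the class name `DistillationLoss`, the projection layer, and the `alpha` weighting are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistillationLoss(nn.Module):
    """Hypothetical embedding-level knowledge distillation: the SMILES-Transformer
    student is trained to mimic the molecular representation of a frozen
    graph-Transformer teacher while also fitting the property target."""

    def __init__(self, student_dim: int, teacher_dim: int, alpha: float = 0.5):
        super().__init__()
        # Linear projection in case student and teacher embedding widths differ.
        self.proj = nn.Linear(student_dim, teacher_dim)
        self.alpha = alpha  # trade-off between task loss and distillation loss

    def forward(self, student_emb, teacher_emb, student_pred, target):
        # Task loss on the regression target (e.g. a quantum property from QM9).
        task_loss = F.l1_loss(student_pred, target)
        # Distillation loss: match the teacher's representation (teacher is frozen).
        distill_loss = F.mse_loss(self.proj(student_emb), teacher_emb.detach())
        return self.alpha * task_loss + (1.0 - self.alpha) * distill_loss
```

In such a setup, the teacher only needs to be run once per training molecule (or its embeddings cached), so at inference time only the SMILES-based student is evaluated, which is where the reported speedup over graph models would come from.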