Chemically Interpretable Molecular Representation for Property Prediction
Abstract: Predicting molecular properties from a molecule's structure is a crucial step in drug and novel-material discovery, because computational screening approaches rely on predicted properties to refine existing molecular designs. Although the problem has existed for decades, it has recently gained attention with the advent of big data and deep learning. On average, one drug receives FDA approval for every 250 compounds that enter the preclinical research stage, which requires screening chemical libraries of more than 20,000 compounds. In-silico property prediction with learnable representations speeds up development and reduces the cost of discovery. To address the problem of deciphering the relationship between a molecule's structure and its properties, we propose building molecular representations from chemical functional groups. Functional groups are substructures of a molecule with distinctive chemical properties that influence the molecule's chemical characteristics. We obtain these substructures by (i) curating functional groups annotated by chemists and (ii) mining a large corpus of molecules for frequent substructures with a pattern-mining algorithm. We show that the Functional Group Representation (FGR) framework outperforms state-of-the-art models on several benchmark datasets while giving experimentalists an explainable link between the predicted property and the molecular structure.
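The two sources of substructures described in the abstract can be illustrated with a toy sketch. This is not the FGR implementation: it assumes each molecule has already been reduced to a list of fragment labels (a real pipeline would match substructures on molecular graphs), and the group names, the `min_support` threshold, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical curated vocabulary; the paper's actual list is not given here.
CURATED_GROUPS = ["hydroxyl", "carboxyl", "amine"]


def mine_frequent_fragments(corpus, min_support):
    """Return fragments that occur in at least `min_support` molecules."""
    counts = Counter()
    for fragments in corpus:
        counts.update(set(fragments))  # count each fragment once per molecule
    return sorted(f for f, c in counts.items() if c >= min_support)


def build_vocabulary(curated, mined):
    """Curated groups first, then mined fragments not already listed."""
    return list(curated) + [f for f in mined if f not in curated]


def encode(fragments, vocab):
    """Multi-hot vector: 1 where the molecule contains the vocabulary entry."""
    present = set(fragments)
    return [1 if group in present else 0 for group in vocab]


# Toy corpus: each molecule is a list of pre-extracted fragment labels (assumed).
corpus = [
    ["hydroxyl", "benzene_ring"],
    ["carboxyl", "benzene_ring", "benzene_ring"],
    ["amine", "thiophene"],
]

mined = mine_frequent_fragments(corpus, min_support=2)
vocab = build_vocabulary(CURATED_GROUPS, mined)
vector = encode(corpus[1], vocab)
```

Because every position in `vector` corresponds to a named entry in `vocab`, a model's per-feature weights or attributions can be read back as statements about specific substructures, which is the kind of structure-to-property link the abstract refers to.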
Article: pdf