MolFeSCue: Enhancing Molecular Property Prediction in Data-Limited and Imbalanced Contexts using Few-Shot and Contrastive Learning

Published: 29 Mar 2024, Last Modified: 03 Jul 2024, Bioinformatics, CC BY-NC-ND 4.0
Abstract: Motivation: Predicting molecular properties is a pivotal task in various scientific domains, including drug discovery, material science, and computational chemistry. The task is often hindered by scarce annotated data and imbalanced class distributions, both of which make it difficult to develop accurate and robust predictive models. Results: This study tackles these issues by employing pre-trained molecular models within a few-shot learning framework. The proposed MolFeSCue framework facilitates rapid generalization from minimal samples, and it uses a novel dynamic contrastive loss function to extract meaningful molecular representations from imbalanced datasets and improve performance under class imbalance. MolFeSCue was evaluated extensively against state-of-the-art algorithms on multiple benchmark datasets, and the experimental data demonstrate the algorithm's effectiveness in learning molecular representations and its broad applicability across various pre-trained models. Our findings underscore MolFeSCue's potential to accelerate advancements in drug discovery. Contacts: Fengfeng Zhou (FengfengZhou@gmail.com) and Kewei Li (kwbb1997@gmail.com). Availability: All source code used in this study is publicly available at http://www.healthinformaticslab.org/supp/ and on GitHub (https://github.com/zhangruochi/MolFeSCue).
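For readers unfamiliar with contrastive objectives, the following is a minimal sketch of a generic margin-based pairwise contrastive loss. It is not the dynamic contrastive loss proposed in MolFeSCue, whose exact form is not given in the abstract. The function names, the `margin` parameter, and the toy 2-D embeddings are all assumptions made for illustration; in practice the embeddings would come from a pre-trained molecular encoder.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(embeddings, labels, margin=1.0):
    """Mean margin-based contrastive loss over all unordered pairs.

    Same-class pairs are penalized by their squared distance (pulled together);
    different-class pairs are penalized only when closer than `margin`
    (pushed apart). This is the classic pairwise formulation, used here
    as an illustrative stand-in.
    """
    total, count = 0.0, 0
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            d = euclidean(embeddings[i], embeddings[j])
            if labels[i] == labels[j]:
                total += d ** 2
            else:
                total += max(0.0, margin - d) ** 2
            count += 1
    return total / count if count else 0.0


# Hypothetical imbalanced support set: three inactive molecules, one active.
support = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [2.0, 2.0]]
labels = [0, 0, 0, 1]
loss = contrastive_loss(support, labels, margin=1.0)
```

In this toy set the minority-class point already lies farther than the margin from every majority-class point, so only the small within-class distances contribute to the loss. Moving the active molecule closer than the margin to an inactive one would add a positive penalty for that pair.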