Abstract: Progress in materials science and drug discovery is impeded by the scarcity of labeled data and the high cost of manual annotation, driving the need for efficient strategies that capture molecular representations and enable accurate predictions. Pretrained Graph Neural Networks have shown promise in capturing universal molecular representations, but adapting them to task-specific applications remains challenging. In this paper, we propose Multilevel Informed Prompt-Tuning (MIPT), a novel framework for effectively tailoring pretrained models to molecule-related tasks. MIPT uses a lightweight, multi-level prompt learning module to capture node-level and graph-level task-specific knowledge, enabling adaptable and efficient tuning. Additionally, a noise penalty mechanism is introduced to address mismatches between pretrained representations and downstream tasks, reducing irrelevant or noisy information. Experimental results show that MIPT surpasses all baselines, aligning the graph space with the task space and achieving significant improvements on molecule-related tasks, which demonstrates its scalability and versatility.
Lay Summary: Discovering new drugs and materials is tough, partly because it’s hard and expensive to collect labeled data for molecules. This makes it difficult to train models that can understand and predict molecular behavior. To address this, we developed a method called Multilevel Informed Prompt-Tuning (MIPT). It builds on powerful pretrained models but adds lightweight “prompts” that guide the model to focus on the task at hand, both at the atom level and at the whole-molecule level. We also introduced a noise-reduction step to avoid misleading signals carried over from the original pretraining. Our approach worked better than existing methods on a range of molecular tasks. This matters because it helps AI models learn faster and more accurately in chemistry and biology, even with limited data, making drug discovery and materials design more efficient and accessible.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Primary Area: Applications->Chemistry, Physics, and Earth Sciences
Keywords: graph prompt, molecule property prediction
Submission Number: 495