Zero-Shot Learning for Materials Science Texts: Leveraging Duck Typing Principles

Published: 01 Jan 2025, Last Modified: 19 May 2025, AAAI 2025, CC BY-SA 4.0
Abstract: Materials science text mining (MSTM), involving tasks such as property extraction and synthesis action retrieval, is pivotal for advancing research by deriving critical insights from the scientific literature. Descriptors, which serve as essential task labels, often vary in meaning depending on how researchers use them across different mining tasks (e.g., 'Material' can refer both to synthesis components and to participants in a fuel cell experiment). This variation in meaning makes it difficult for existing methods, fine-tuned to a specific task, to handle the same descriptors in other tasks. To overcome this limitation, we propose MatDuck, a simple and effective approach to Zero-Shot MSTM that evokes material knowledge within Large Language Models (LLMs). Specifically, inspired by the Duck Typing principle in programming languages, we present a ClassDefinition-Style Descriptor generation method that evokes task-specific characteristics to address usage variation. We then introduce code-style in-context learning for zero-shot tasks, reframing them as code to leverage LLMs' proficiency in code understanding. Extensive experiments on eight benchmark datasets demonstrate that MatDuck, as a plug-and-play approach, significantly improves the Zero-Shot MSTM performance of LLMs by an average of 11.3% across seven tasks.
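To make the idea concrete, here is a minimal sketch of what a class-definition-style descriptor and a code-style zero-shot prompt could look like. The abstract does not publish the actual prompt format, so every name below (the Material dataclass, its fields, build_prompt, and the example sentence) is an illustrative assumption rather than MatDuck's implementation: the class docstring pins down the task-specific meaning of the label 'Material' for a synthesis-procedure task, and the extraction task is then framed as code that instantiates that class.

```python
# Illustrative sketch only: class names, fields, and prompt wording are
# assumptions, not MatDuck's actual descriptor or prompt format.

import inspect
from dataclasses import dataclass


@dataclass
class Material:
    """Descriptor for the synthesis-procedure task: a 'Material' is a
    chemical substance that participates in a synthesis step, either as
    a precursor/reagent/solvent or as the target product."""
    name: str   # surface form as it appears in the text
    role: str   # e.g. "precursor", "solvent", "target"


def build_prompt(descriptor_source: str, sentence: str) -> str:
    """Compose a code-style zero-shot prompt: the class definition conveys
    the task-specific meaning of the label, and the LLM is asked to emit
    instances of that class for the given sentence."""
    return (
        f"{descriptor_source}\n\n"
        f'sentence = "{sentence}"\n'
        "# List every Material mentioned in `sentence` as Material(...) instances:\n"
        "materials = ["
    )


if __name__ == "__main__":
    sentence = "LiFePO4 was synthesized by ball-milling Li2CO3, FeC2O4 and NH4H2PO4."
    print(build_prompt(inspect.getsource(Material), sentence))
```

Under this reading, switching tasks (say, to fuel cell experiments) would amount to swapping in a different class definition for the same label name, which is the duck-typing intuition the abstract appeals to.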