Abstract: Graph neural networks have recently demonstrated remarkable performance in predicting material properties.
Crystalline material data is manually encoded into graph representations.
Existing methods incorporate different attributes when constructing these representations in order to satisfy the constraints arising from the symmetries of material structures.
However, existing methods for obtaining graph representations are tailored to specific constraints and become ineffective when new constraints arise.
In this work, we propose Rep-CodeGen, a code generation framework in which multiple large language model agents produce code for obtaining representations through three iterative stages that simulate an evolutionary algorithm.
To the best of our knowledge, Rep-CodeGen is the first framework for automatically generating code to obtain representations that can be used when facing new constraints.
Furthermore, starting from base code that satisfies three constraints, our framework generates code for a representation that satisfies six constraints.
Extensive experiments on two real-world material datasets show that a property prediction method based on such a graph representation achieves state-of-the-art performance in material property prediction tasks.
Lay Summary: Predicting material properties with AI holds great promise, but current methods face a key limitation: scientists must manually design complex rules to convert material data into computable graph structures. These rule-based approaches often fail when encountering new scientific scenarios beyond their original design.
We present Rep-CodeGen, an innovative AI system that automatically writes and improves its own code through an evolutionary process. Like natural selection, our framework uses multiple AI agents that collaboratively generate, test, and refine code representations through iterative cycles. This breakthrough allows automatic adaptation to completely new material constraints - a capability traditional methods lack.
This technology could revolutionize materials discovery by eliminating the bottleneck of manual representation design, potentially accelerating development in critical areas like battery technology and semiconductor materials. Our open framework also provides a foundation for addressing representation challenges in other scientific domains.
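The generate, test, and refine cycle described above can be illustrated with a small sketch. It is not the Rep-CodeGen implementation: the three candidate functions stand in for code that language model agents might propose, the three checks (translation, permutation, rotation) are example symmetry constraints chosen for illustration rather than the constraints used in the paper, and all names (`STRUCTURE`, `score`, `evolve`, and so on) are hypothetical.

```python
import math

# Toy structure: (element, (x, y, z)) per atom. Hypothetical stand-in for crystal data.
STRUCTURE = [("Na", (0.0, 0.0, 0.0)), ("Cl", (1.0, 0.0, 0.0)), ("Cl", (0.0, 2.0, 0.0))]

# Symmetry operations used as example constraints (assumed for illustration only).
def translate(atoms):
    return [(e, (x + 0.5, y - 1.25, z + 2.0)) for e, (x, y, z) in atoms]

def permute(atoms):
    return list(reversed(atoms))

def rotate(atoms, angle=0.7):
    c, s = math.cos(angle), math.sin(angle)
    return [(e, (c * x - s * y, s * x + c * y, z)) for e, (x, y, z) in atoms]

CONSTRAINTS = {"translation": translate, "permutation": permute, "rotation": rotate}

# Candidate representations, standing in for agent-generated code.
def rep_raw_coords(atoms):
    # Ordered list of rounded coordinates: changes under every operation above.
    return tuple((e, tuple(round(v, 6) for v in p)) for e, p in atoms)

def rep_sorted_coords(atoms):
    # Sorting removes the dependence on atom order only.
    return tuple(sorted(rep_raw_coords(atoms)))

def rep_pair_distances(atoms):
    # Sorted (element pair, distance) entries: unchanged by all three operations.
    pairs = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (e1, p1), (e2, p2) = atoms[i], atoms[j]
            pairs.append((tuple(sorted((e1, e2))), round(math.dist(p1, p2), 6)))
    return tuple(sorted(pairs))

def score(rep, atoms=STRUCTURE):
    # "Test" step: count constraints under which the representation is unchanged.
    # This is a spot check on one structure, not a proof of invariance.
    return sum(rep(op(atoms)) == rep(atoms) for op in CONSTRAINTS.values())

def evolve(proposals, atoms=STRUCTURE):
    # "Generate" is replaced by iterating over fixed proposals;
    # "refine" is reduced to keeping the best-scoring candidate so far.
    best, best_score = proposals[0], score(proposals[0], atoms)
    for candidate in proposals[1:]:
        candidate_score = score(candidate, atoms)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score

best, best_score = evolve([rep_raw_coords, rep_sorted_coords, rep_pair_distances])
print(best.__name__, best_score)
```

In this toy setting the selection loop moves from a representation that passes none of the checks to one that passes all three. A real system would replace the fixed proposal list with agent-generated code and feed test failures back into the next round of generation.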
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Primary Area: Applications->Chemistry, Physics, and Earth Sciences
Keywords: Material structure representation, Material property prediction, Multi-agent systems, Large language models
Submission Number: 9534