TDCM25: A Multi-Modal Multi-Task Benchmark for Temperature-Dependent Crystalline Materials

Published: 03 Mar 2025, Last Modified: 09 Apr 2025
Venue: AI4MAT-ICLR-2025 Poster
License: CC BY 4.0
Submission Track: Multi-Modal Data for Materials Design - Full Paper
Submission Category: All of the above
Keywords: dftb, multi-modal, llm, materials science, density functional theory, density functional tight binding, graph neural networks, benchmark
Supplementary Material: pdf
TL;DR: TDCM25 is a multi-modal dataset of ~100,000 TiO₂ entries across three phases at temperatures between 0 K and 1000 K. It includes 3D structures, molecular images, and metadata, providing a benchmark for machine learning models in temperature-dependent materials research.
Abstract: Materials exhibit phase- and temperature-dependent properties that are critical for applications ranging from catalysis to energy storage and environmental remediation. Accurate modeling of these dependencies requires high-quality, multi-modal datasets. In this work, TDCM25 (Temperature Dependent Crystalline Materials 2025) is introduced as a comprehensive dataset featuring approximately 100,000 entries spanning three crystalline phases of TiO$_2$ (anatase, brookite, and rutile) sampled over 21 temperatures from 0 K to 1000 K. Each entry comprises 3D atomic coordinates, corresponding RGB molecular images, and detailed textual metadata including Ti:O ratios, temperature, spatial dimensions, and transformation parameters. TDCM25 provides a benchmark for developing and evaluating machine learning methods that integrate multi-modal data to capture temperature-dependent material behavior. The dataset is publicly available at https://github.com/KurbanIntelligenceLab/TDCM25.
AI4Mat Journal Track: Yes
Submission Number: 4
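
Illustrative sketch: the abstract describes each entry as a combination of 3D atomic coordinates, an RGB image, and textual metadata, sampled over 21 temperatures for three phases. The Python snippet below shows one way such a record might be represented in memory. It is not the schema used in the TDCM25 repository: the class `Entry`, its field names, and the helper `temperature_grid` are hypothetical, and the evenly spaced 50 K grid is an assumption (the abstract states only that there are 21 temperatures from 0 K to 1000 K).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# The three TiO2 phases named in the abstract.
PHASES = ("anatase", "brookite", "rutile")


def temperature_grid(t_min: float = 0.0, t_max: float = 1000.0, n: int = 21) -> List[float]:
    """Return n evenly spaced temperatures in kelvin.

    ASSUMPTION: even spacing. The abstract gives only the count (21)
    and the range (0 K to 1000 K), not the actual sampling points.
    """
    step = (t_max - t_min) / (n - 1)
    return [t_min + i * step for i in range(n)]


@dataclass
class Entry:
    """Hypothetical in-memory record for one multi-modal entry."""
    phase: str                                  # one of PHASES
    temperature_k: float                        # temperature in kelvin
    symbols: List[str]                          # element symbol per atom
    coords: List[Tuple[float, float, float]]    # 3D coordinates per atom
    image_path: str                             # path to the RGB image (hypothetical)
    metadata: Dict[str, str] = field(default_factory=dict)  # free-form text metadata

    def __post_init__(self) -> None:
        # Basic consistency checks on the structural modality.
        if self.phase not in PHASES:
            raise ValueError(f"unknown phase: {self.phase!r}")
        if len(self.symbols) != len(self.coords):
            raise ValueError("symbols and coords must have the same length")

    def ti_o_ratio(self) -> float:
        """Ti:O ratio computed from the atom list."""
        n_o = self.symbols.count("O")
        if n_o == 0:
            raise ValueError("entry contains no oxygen atoms")
        return self.symbols.count("Ti") / n_o


# Usage: a single stoichiometric TiO2 unit with made-up coordinates.
example = Entry(
    phase="rutile",
    temperature_k=temperature_grid()[6],
    symbols=["Ti", "O", "O"],
    coords=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    image_path="rutile_300K.png",
)
```

Keeping the three modalities in one record makes it straightforward to check that a structure, its image, and its metadata refer to the same phase and temperature before they are passed to a model.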