Keywords: information extraction, scientific charts, materials science
TL;DR: This paper explores the challenges and advancements in using transformer-based models to extract materials science data from scientific figures, introducing new benchmarks and fine-tuning techniques to improve performance.
Abstract: The rapid advancements in machine learning necessitate parallel improvements in the size and quality of domain-specific datasets, especially in fields like materials science, where such datasets are often lacking due to the unstructured nature of real-world information. Despite the wealth of knowledge generated in this domain, much of it remains underutilized because experimental data is often buried in charts. In this paper, we curate two new benchmarks and introduce Relative Coordinate-Label Similarity (RCLS), a novel metric for measuring the state of the art in extracting materials science data from scientific figures. We find that existing pretrained image-to-text Transformer-based models for chart-to-table translation struggle with the diverse and complex nature of materials science figures, leading to issues such as inconsistent extraction of axis labels, irregular presentation of tabular data, and the omission of critical elements such as legend labels. We further fine-tune the LLaMA 3.2-Vision 11B model to enhance its performance. Our study focuses on two subdomains of materials science, demonstrating both the successes and the ongoing challenges of using multimodal models to extract scientific chart data.
Submission Number: 29