CaLuX: A Catalyst and Lubricant Properties Extraction System for Domain Experts

Catalina Riano, Fumian Chen, Hui Fang

Published: 2025, Last Modified: 02 Jan 2026, NLDB (2) 2025, CC BY-SA 4.0

Abstract: The scientific literature is a valuable asset in many areas, and every discipline has a large number of documents available. This volume of information grows rapidly, leaving stakeholders with the challenge of reviewing the documents manually and creating the need for tools that extract relevant data more efficiently and automatically. Such data are particularly important for domain experts in fields such as chemical engineering and materials science, who rely on specialized records for their work, including the development of new models and experiments. While existing tools can extract some common chemical properties from text, there is currently no comprehensive system that focuses on extracting catalyst and lubricant properties from scientific literature. Catalysts and lubricants play pivotal roles in scientific research and industrial applications. Despite their significance, the lack of dedicated tools and structured databases presents a barrier to innovation in these fields. To address this gap, we introduce CaLuX, a Catalyst and Lubricant Properties Extraction System for domain experts, designed to extract catalyst and lubricant properties by leveraging an existing chemistry-oriented Natural Language Processing (NLP) pipeline. CaLuX stands out for its ability to handle multiple document formats and produce structured data that can be used to build valuable databases. It offers a ready-to-use, straightforward interface for extracting properties from the scientific literature in the catalyst and lubricant fields, so domain experts need neither their own computational resources nor any coding skills.

External IDs: dblp:conf/nldb/RianoCF25