Text-to-ES Bench: A Comprehensive Benchmark for Converting Natural Language to Elasticsearch Query

ACL ARR 2025 February Submission1284 Authors

13 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0

Abstract: Elasticsearch (ES) is a distributed, RESTful search engine optimized for large-scale and long-text search. Recent Text-to-Query research has explored using large language models (LLMs) to convert a user's query intent into executable code, and the topic is attracting growing attention. We are the first to introduce text-to-ES, a semantic parsing task that aims to bridge the gap between LLMs and ES: an LLM generates a Domain-Specific Language (DSL) query together with the post-processing code needed to support multi-index ES queries. We propose the text-to-ES benchmark, which consists of two datasets: the Large Elasticsearch Dataset (LED), containing 26,207 text-ES pairs derived from a 224.9 GB schema-free database, and the ElasticSearch (BirdES) dataset, containing 10,926 pairs sourced from the Bird dataset on a 33.4 GB schema-fixed database. Our trained model outperforms DeepSeek-R1 by 15.63% on the LED dataset, setting a new state of the art, and achieves 78% of DeepSeek-R1's performance on the BirdES dataset. We also provide in-depth experimental analyses, suggest future research directions for this task, and plan to release our code and datasets.
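To make the task concrete, the following is a minimal sketch of what a single text-ES pair might look like: a natural-language question, one Query DSL body per index, and a post-processing function that combines the per-index results. The question, the index names (`papers`, `authors`), the field names, and the helper `post_process` are all hypothetical illustrations and are not taken from LED or BirdES. The responses are mocked in the shape of an Elasticsearch search response, so the sketch runs without a cluster.

```python
import json

# Hypothetical natural-language question (not from the benchmark).
question = "Which authors of 2024 papers about semantic parsing work at MIT?"

# Step 1: one Query DSL body per index. Index and field names are assumptions.
dsl_queries = {
    "papers": {
        "query": {
            "bool": {
                "must": [{"match": {"abstract": "semantic parsing"}}],
                "filter": [{"term": {"year": 2024}}],
            }
        },
        "_source": ["author_id", "title"],
    },
    "authors": {
        "query": {"term": {"affiliation.keyword": "MIT"}},
        "_source": ["author_id", "name"],
    },
}


def post_process(papers_resp, authors_resp):
    """Step 2: join hits from the two indices on author_id, client-side."""
    paper_author_ids = {
        hit["_source"]["author_id"] for hit in papers_resp["hits"]["hits"]
    }
    return sorted(
        hit["_source"]["name"]
        for hit in authors_resp["hits"]["hits"]
        if hit["_source"]["author_id"] in paper_author_ids
    )


# Mock responses shaped like Elasticsearch search responses (hits.hits[]._source).
mock_papers = {
    "hits": {"hits": [{"_source": {"author_id": 1, "title": "Parsing with LLMs"}}]}
}
mock_authors = {
    "hits": {
        "hits": [
            {"_source": {"author_id": 1, "name": "Ada"}},
            {"_source": {"author_id": 2, "name": "Bo"}},
        ]
    }
}

# The DSL bodies are plain JSON, so they can be sent to a search endpoint as-is.
request_bodies = {index: json.dumps(body) for index, body in dsl_queries.items()}
answer = post_process(mock_papers, mock_authors)
```

Because a single DSL query is scoped to the indices it targets, any cross-index join in this sketch happens in the post-processing step. That is one plausible reading of why the task pairs each DSL query with post-processing code.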

Paper Type: Long

Research Area: Resources and Evaluation

Research Area Keywords: benchmarking; NLP datasets

Contribution Types: Data resources

Languages Studied: English

Submission Number: 1284
