Structuring the Unstructured: A Systematic Review of Text-to-Structure Generation with a Universal Evaluation Framework

ACL ARR 2025 July Submission1339 Authors

29 Jul 2025 (modified: 01 Sept 2025), ACL ARR 2025 July Submission, CC BY 4.0
Abstract: The evolution of AI systems toward agentic operation and context-aware retrieval necessitates transforming unstructured text into structured formats like tables, knowledge graphs, and charts. While such conversions enable critical applications from summarization to data mining, current research lacks a comprehensive synthesis of methodologies, datasets, and metrics. This systematic review examines text-to-structure techniques and the encountered challenges, evaluates current datasets and assessment criteria, and outlines potential directions for future research. We also introduce a universal evaluation framework for structured outputs, establishing text-to-structure as foundational infrastructure for next-generation AI systems.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Large Language Models, Text-to-Structure Generation, Information Extraction
Contribution Types: NLP engineering experiment, Surveys
Languages Studied: English, Chinese
Submission Number: 1339