Keywords: LLM, prompt engineering, context engineering, NLP
Abstract: Large language models (LLMs) have demonstrated remarkable potential across a broad range of applications. However, producing reliable text that faithfully represents data remains a challenge. While prior work has shown that task-specific conditioning through in-context learning and knowledge augmentation can improve performance, LLMs continue to struggle with interpreting and reasoning about numerical data. To address this, we introduce wordalisations, a methodology for generating stylistically natural narratives from data. Much like how visualisations display numerical data in a way that is easy to digest, wordalisations abstract data insights into descriptive texts. To illustrate its versatility, we apply our method to three application areas: scouting football players, personality tests, and international survey data. In the absence of standardised benchmarks for this task, we conduct LLM-as-a-judge and human-as-a-judge evaluations to assess accuracy across the three applications. We find that the wordalisation method reduces misrepresentation of the data and shows potential to improve communication about data. We further describe best-practice methods for open and transparent development of communication about data.
Paper Type: Long
Research Area: Natural Language Generation
Research Area Keywords: LLM/AI agents, prompting, safety and alignment, human evaluation, automatic evaluation, few-shot generation, analysis, domain adaptation, data-to-text generation
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: gemini-2.5-flash, gpt-4o-mini
Submission Number: 9750