Abstract: Large language models (LLMs) are effective at code generation. Some code tasks, such as data wrangling or data analysis, are inherently data-dependent. We introduce two novel taxonomies to characterize (1) the extent to which a code generation task depends on data and (2) the effect of data redaction. We curate two new datasets for Python code generation from natural language for data-centric tasks. We evaluate these datasets under varying configurations across our taxonomies and find that code generation performance varies with task class, data access, and prompting strategy. This is the first empirical measurement of the impact of data access in the NL-to-code setting with LLMs for data-centric tasks.
Paper Type: long