nbDescribe: A Dataset for Text Description Generation from Tables and Code in Jupyter Notebooks with Guidelines

ACL ARR 2024 April Submission506 Authors

16 Apr 2024 (modified: 18 May 2024), ACL ARR 2024 April Submission, Readers: Everyone, License: CC BY 4.0

Abstract: Generating cell-level descriptions for Jupyter Notebooks, a major resource combining code, tables, and documentation, has attracted increasing research attention. However, existing methods mostly generate descriptions from code snippets alone or from table outputs alone. Moreover, descriptions for a Jupyter cell should be personalized, since users have their own preferences and user-written guidelines, yet previous work ignores these informative guidelines during description generation. In this work, we formulate a new task, personalized description generation from code, tables, and user-written guidelines in Jupyter Notebooks, and collect a new dataset for it, nbDescribe. Specifically, the proposed benchmark contains code, tables, and user-written guidelines paired with target personalized descriptions. Extensive experiments show that existing text generation models can produce fluent and readable text, and can produce different types of text for the same input under different user-written guidelines. However, they still struggle to produce faithful descriptions that are factually correct. To understand how each component contributes to the generated descriptions, we conduct further experiments and show that guidelines significantly enhance model performance, helping users create accurately oriented and reasonable descriptions. Moreover, by analyzing the error patterns of the model-generated text, we find that the most frequent error is text that is incorrectly oriented with respect to the guidelines, followed by incorrect value generation and reasoning mistakes.
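
To make the task formulation concrete, the following is a minimal sketch of how one example (code, table, user-written guideline, target description) might be represented and turned into a single model input. The class name `CellExample`, the field names, the table linearization, and the prompt layout are all illustrative assumptions; the abstract does not specify the dataset schema or the input format used in the experiments.

```python
from dataclasses import dataclass


@dataclass
class CellExample:
    """One hypothetical record: three inputs plus the target description.

    Field names are assumptions for illustration, not the dataset's schema.
    """
    code: str                       # source of the notebook code cell
    table: list[dict[str, object]]  # table output, one dict per row
    guideline: str                  # user-written guideline
    description: str = ""           # target text; empty at inference time


def linearize_table(table: list[dict[str, object]]) -> str:
    """Flatten a table into pipe-separated lines, header row first."""
    if not table:
        return ""
    headers = list(table[0].keys())
    lines = [" | ".join(headers)]
    for row in table:
        lines.append(" | ".join(str(row.get(h, "")) for h in headers))
    return "\n".join(lines)


def build_prompt(example: CellExample) -> str:
    """Concatenate guideline, code, and table into one text input.

    The section labels and their order are an assumed layout.
    """
    return (
        f"Guideline:\n{example.guideline}\n\n"
        f"Code:\n{example.code}\n\n"
        f"Table:\n{linearize_table(example.table)}\n\n"
        "Description:"
    )
```

Under this sketch, pairing the same code and table with two different guidelines yields two different inputs, which is the sense in which the target descriptions are personalized.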

Paper Type: Long

Research Area: Resources and Evaluation

Research Area Keywords: Resources and Evaluation, Human-Centered NLP

Contribution Types: Data resources

Languages Studied: English

Submission Number: 506
