Abstract: Code-generating large language models (LLMs) are transforming programming. Their ability to generate multi-step solutions gives even non-programmers a way to harness the power of coding. Non-programmers often use spreadsheets to manage tabular data, as spreadsheets offer an intuitive view of data manipulation and formula outcomes. Because LLMs can generate complex, potentially incorrect code, our focus is on enabling users to trust the accuracy of LLM-generated code. We present ColDeco, the first end-user inspection tool for comprehending code produced by LLMs for tabular data tasks. ColDeco integrates two new inspection features with a grid-based interface. First, users can decompose a generated solution into intermediate helper columns to understand how the problem is solved step by step. Second, users can interact with a filtered table of summary rows that highlight interesting cases in the program. We evaluate our tool in a within-subjects user study (n=24) in which participants verify the correctness of programs generated by an LLM. We found that while each feature is independently useful, participants preferred them in combination. Users especially noted the usefulness of helper columns, but wanted more transparency in how summary rows are generated in order to understand and trust them. Users also highlighted the potential of ColDeco in collaborative settings for explaining and understanding existing formulas.