Empowering Data Analysis with Program Synthesis

Published: 01 Jan 2021, Last Modified: 30 Sept 2024, License: CC BY-SA 4.0
Abstract: Data manipulation and visualization support data scientists' efforts to explore and understand data throughout the exploratory analysis process. Today, experienced data scientists use programming languages like SQL and R for efficient and flexible analysis, while inexperienced users can readily learn interactive tools to accomplish simple analysis tasks. However, the lack of tools between interactive tools and programming systems leaves a programmability gap: inexperienced users cannot conduct the expressive analyses that users with programming experience can. To help end users cross this gap, we apply program synthesis to build tools that synthesize programs from examples. We first introduce Falx, a visualization-by-example tool that lets the user create expressive visualizations by demonstrating how a few data points are mapped to the canvas. Falx's compositional algorithm design lets it synthesize both data transformation and visualization programs directly from end-to-end demonstrations. We next introduce Scythe, a SQL query synthesizer that lets the user author advanced SQL queries using input-output examples. Using a language of abstract queries, Scythe prunes families of infeasible queries to achieve synthesis efficiency. To help inexperienced users tell complex synthesized queries apart, we developed a symbolic engine that computes a distinguishing input, that is, an input on which two queries return different outputs. Finally, we distill our synthesizer-building experience into a framework, Kopis, that illustrates how to build an efficient relational query synthesizer using value-preserving abstractions. Together, these three contributions demonstrate the value of using program synthesizers to empower future data science, and offer guidance on how to build such synthesis-powered tools efficiently for new domains.
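To make the idea of synthesizing a query from an input-output example concrete, the following is a minimal sketch under strong assumptions. It is not Scythe's algorithm: where Scythe prunes with abstract queries, this sketch enumerates every candidate by brute force, and where the thesis computes distinguishing inputs symbolically, this sketch only checks a given input. The toy query language (a projection plus one comparison filter) and the names `run_query`, `synthesize`, and `distinguishes` are hypothetical and chosen for illustration.

```python
from itertools import product
import operator

# Hypothetical toy language: SELECT <cols> FROM t WHERE <col> <op> <const>
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def run_query(table, cols, pred):
    """Evaluate one toy query; a table is a list of dicts."""
    col, op, const = pred
    return [tuple(row[c] for c in cols) for row in table if OPS[op](row[col], const)]

def synthesize(table, expected_cols, expected_rows):
    """Brute force: return every filter whose output matches the example."""
    consistent = []
    for col in table[0]:
        # Only constants that occur in the input column are tried.
        constants = sorted({row[col] for row in table})
        for op, const in product(OPS, constants):
            pred = (col, op, const)
            if run_query(table, expected_cols, pred) == expected_rows:
                consistent.append(pred)
    return consistent

def distinguishes(table, cols, p1, p2):
    """True if the two queries return different outputs on this input."""
    return run_query(table, cols, p1) != run_query(table, cols, p2)

# Input-output example supplied by the user.
table = [
    {"name": "a", "age": 20, "score": 5},
    {"name": "b", "age": 30, "score": 9},
    {"name": "c", "age": 40, "score": 7},
]
candidates = synthesize(table, ["name"], [("b",), ("c",)])

# Several queries agree on the example; a new input can separate them.
probe = [{"name": "d", "age": 25, "score": 3}]
separated = distinguishes(probe, ["name"], ("age", ">", 20), ("score", ">", 5))
```

On this example, more than one filter reproduces the requested output, which is the ambiguity that a distinguishing input is meant to resolve: the user inspects the two outputs on the probe table and picks the query that matches their intent.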