Abstract: Data manipulation and visualization support data scientists' efforts to explore and understand data throughout the exploratory analysis process. Today, experienced data scientists can use programming languages like SQL and R to perform efficient and flexible analysis, and inexperienced users can easily learn and use interactive tools to accomplish simple analysis tasks. However, the lack of tools between interactive tools and programming systems creates a programmability gap that prevents inexperienced users from conducting the expressive analyses that only users with programming experience can achieve. To help end users cross this gap, we apply program synthesis to build tools that synthesize programs from examples. We first introduce Falx, a visualization-by-example tool that lets the user create expressive visualizations by demonstrating how a few data points are mapped to the canvas. Falx's compositional algorithm design lets it synthesize both data transformation and visualization programs directly from end-to-end demonstrations. We next introduce Scythe, a SQL query synthesizer that lets the user author advanced SQL queries using input-output examples. Using a language of abstract queries, Scythe prunes families of infeasible queries to achieve synthesis efficiency. To help inexperienced users distinguish between complex synthesized queries, we developed a symbolic engine that computes a distinguishing input on which two queries return different outputs. Finally, we distill our synthesizer-building experience into a framework, Kopis, that illustrates how to build an efficient relational query synthesizer using value-preserving abstractions. Together, these three contributions demonstrate the value of using program synthesizers to empower future data science, and offer guidance on how to build such synthesis-powered tools efficiently for new domains.