Automating Common Data Science Matrix Transformations

2019 (modified: 25 Jan 2023), PKDD/ECML Workshops (1) 2019, Readers: Everyone
Abstract: Programming languages such as R and Python are commonplace in data science projects. However, transforming data is often tricky, and composing the right primitives (from the appropriate libraries) into an elegant transformation is not always easy. In this paper, we present the first system able to automatically synthesise program snippets in R given an input data matrix and an output matrix that the user has partially filled in to represent the required transformation. We use the type information given by the dimensions of the matrix primitives (and other constraints) to reduce the combinatorial explosion of primitive compositions. We test the performance of our approach on a set of artificial data and on real examples from Stack Overflow questions.
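
To make the idea of dimension-based pruning concrete, here is a minimal sketch in Python. It is not the paper's system or algorithm: the brute-force enumeration, the `synthesise` and `matches` helpers, and the four-primitive set are all assumptions chosen for illustration. The names `t`, `colSums` and `rowSums` mirror R functions, while `rev_rows` is a hypothetical primitive. Each primitive carries a shape rule, so a candidate composition is executed on the data only if its output dimensions equal those of the partially filled output matrix.

```python
from itertools import product

# Hypothetical primitive set: name -> (shape rule, implementation).
# Matrices are lists of rows; shapes are (rows, cols) tuples.
PRIMITIVES = {
    "t": (lambda s: (s[1], s[0]), lambda m: [list(c) for c in zip(*m)]),
    "colSums": (lambda s: (1, s[1]), lambda m: [[sum(c) for c in zip(*m)]]),
    "rowSums": (lambda s: (s[0], 1), lambda m: [[sum(r)] for r in m]),
    "rev_rows": (lambda s: s, lambda m: m[::-1]),  # illustrative only
}


def shape(m):
    return (len(m), len(m[0]))


def matches(result, partial):
    """True if result agrees with every filled (non-None) cell of partial."""
    return all(
        p is None or p == r
        for res_row, par_row in zip(result, partial)
        for r, p in zip(res_row, par_row)
    )


def synthesise(inp, partial, max_depth=2):
    """Return (matching compositions, number of candidates executed)."""
    target = shape(partial)
    found, executed = [], 0
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            # Propagate shapes only; no data is touched at this stage.
            s = shape(inp)
            for name in names:
                s = PRIMITIVES[name][0](s)
            if s != target:
                continue  # pruned: output dimensions cannot match
            executed += 1
            m = inp
            for name in names:
                m = PRIMITIVES[name][1](m)
            if matches(m, partial):
                found.append(names)
    return found, executed


# 3x2 input; the user fills in one cell of a 1x2 output.
programs, executed = synthesise([[1, 2], [3, 4], [5, 6]], [[9, None]])
```

In this toy setting there are 20 candidate compositions of length one or two, and the shape check leaves only 4 to execute. Shorter compositions are enumerated first, so `programs[0]` is the single-step `("colSums",)`.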