Abstract: We demonstrate the Approximate Selection Query Processing (ASQP-RL) system, which uses Reinforcement Learning to select a subset of a large external dataset to process locally in a notebook during data exploration. Given a query workload over an external database and the notebook's memory size, the system translates the workload into select-project-join (non-aggregate) queries and finds a subset of each relation such that the data subset, called the approximation set, fits into the notebook's memory and maximizes query result quality. The approximation set can then be loaded into the notebook and queried rapidly by the analyst. Our demonstration shows how ASQP-RL can be used during data exploration to achieve results comparable to those of external queries over the large dataset at significantly reduced query times. It also shows how ASQP-RL can be used for aggregation queries, achieving surprisingly good results compared to state-of-the-art techniques.