Abstract: Creating a data analytics pipeline is a tedious task. The algorithm search space for building a suitable solution for a given goal on a given constrained infrastructure is generally very large, and the exploratory work needed to choose the best possible solution is an effort-, time- and intellect-intensive process. Current industry practice relies largely on domain experts for this work. To improve a domain expert’s productivity, we propose a model- and rule-based system that automates the creation of data analytics pipelines. The proposed system provides a mechanism to specify domain knowledge in the form of an object model and a set of rules defined over it. Based on the problem context, it recommends suitable algorithms for carrying out the various data analytics tasks. Once the pipeline has been created successfully, the system generates the pipeline code. The system also generates trace data to support cognitive knowledge upgrade. We discuss the approach using a case study of a sensor-data-based health monitoring system, and present its efficacy and the lessons learnt.