An Empirical Study on Quality Issues of eBay's Big Data SQL Analytics PlatformDownload PDFOpen Website

Published: 01 Jan 2022, Last Modified: 12 May 2023ICSE (SEIP) 2022Readers: Everyone
Abstract: Big data SQL analytics platform has evolved as the key infrastructure for business data analysis. Compared with traditional costly commercial RDBMS, scalable solutions with open-source projects, such as SQL-on-Hadoop, are more popular and attractive to enter-prises. In eBay, we build Carmel, a company-wide interactive SQL analytics platform based on Apache Spark. Carmel has been serving thousands of customers from hundreds of teams globally for more than 3 years. Meanwhile, despite the popularity of open-source based big data SQL analytics platforms, few empirical studies on service quality issues (e.g., job failure) were carried out for them. However, a deep understanding of service quality issues and taking right mitigation are significant to the ease of manual maintenance efforts. To fill this gap, we conduct a comprehensive empirical study on 1,884 real-word service quality issues from Carmel. We summa-rize the common symptoms and identify the root causes with typical cases. Stakeholders including system developers, researchers, and platform maintainers can benefit from our findings and implications. Furthermore, we also present lessons learned from critical cases in our daily practice, as well as insights to motivate automatic tool support and future research directions.
0 Replies

Loading