SliceTeller: A Data Slice-Driven Approach for Machine Learning Model Validation

Published: 01 Jan 2023, Last Modified: 06 Nov 2023
IEEE Trans. Vis. Comput. Graph. 2023
Readers: Everyone
Abstract: Real-world machine learning applications need to be thoroughly evaluated to meet critical product requirements for model release, to ensure fairness for different groups or individuals, and to achieve consistent performance in various scenarios. For example, in autonomous driving, an object classification model should achieve high detection rates under different conditions of weather, distance, etc. Similarly, in the financial setting, credit-scoring models must not discriminate against minority groups. These conditions or groups are called "Data Slices". In product MLOps cycles, product developers must identify such critical data slices and adapt models to mitigate data slice problems. Discovering where models fail, understanding why they fail, and mitigating these problems are therefore essential tasks in the MLOps life-cycle. In this paper, we present SliceTeller, a novel tool that allows users to debug, compare, and improve machine learning models driven by critical data slices. SliceTeller automatically discovers problematic slices in the data and helps the user understand why models fail. More importantly, we present an efficient algorithm, SliceBoosting, to estimate trade-offs when prioritizing the optimization over certain slices. Furthermore, our system empowers model developers to compare and analyze different model versions during model iterations, allowing them to choose the model version best suited to their applications. We evaluate our system with three use cases, including two real-world use cases of product development, to demonstrate the power of SliceTeller in the debugging and improvement of product-quality ML models.