Abstract: At the same time that artificial intelligence (AI) and machine learning are becoming central to human life, their potential harms become more vivid. In the presence of such drawbacks, a critical question to address before using individual predictions for critical decision-making is whether those are reliable. Aligned with recent efforts on data-centric AI, this paper proposes a novel approach, complementary to the existing work on trustworthy AI, to address the reliability question through the lens of data. Specifically, it associates data sets with distrust quantification that specifies their scope of use for individual predictions. It develops novel algorithms for efficient and effective computation of distrust values. The proposed algorithms learn the necessary components of the measures from the data itself and are sublinear, which makes them scalable to very large and multi-dimensional settings. Furthermore, an estimator is designed to enable no-data access during the query time. Besides theoretical analyses, the algorithms are evaluated experimentally, using multiple real and synthetic data sets and different tasks. The experiment results reflect a consistent correlation between distrust values and model performance. This highlights the necessity of dismissing prediction outcomes for cases with high distrust values, at least for critical decisions.
0 Replies
Loading