[Re] Classwise-Shapley values for data valuation

TMLR Paper2227 Authors

16 Feb 2024 (modified: 13 Mar 2024), under review for TMLR
Abstract: We evaluate CS-Shapley, a data valuation method for classification problems introduced by Schoch et al. (2022). We repeat the experiments of the original paper and add two further methods, the Least Core (Yan & Procaccia, 2021) and Data Banzhaf (Wang & Jia, 2023), a comparison not found in the literature. We also report more conservative error estimates and additional metrics, such as rank stability, as well as a variance-corrected version of Weighted Accuracy Drop, a metric originally introduced in Schoch et al. (2022). We conclude that while CS-Shapley helps in the scenarios it was originally tested in, in particular the detection of corrupted labels, it is outperformed by the conceptually simpler Data Banzhaf in the task of detecting highly influential points.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Addressed all comments by reviewer vPWC
Assigned Action Editor: ~Matthew_J._Holland1
Submission Number: 2227