From Fairness to Truthfulness: Rethinking Data Valuation Design

Published: 06 Mar 2025, Last Modified: 13 Apr 2025
Venue: ICLR 2025 Workshop Data Problems Poster
License: CC BY 4.0
Keywords: Data Valuation, Data Market, Data Pricing, Truthful Mechanism Design
Abstract: As large language models increasingly rely on external data sources, fairly compensating data contributors has become a central concern. In this paper, we revisit the design of data markets through a game-theoretic lens, where data owners face private, heterogeneous costs for data sharing. We show that commonly used valuation methods, such as Leave-One-Out and Data Shapley, fail to ensure truthful reporting of these costs, leading to inefficient market outcomes. To address this, we adapt well-established payment rules from mechanism design, namely Myerson and Vickrey-Clarke-Groves (VCG), to the data market setting. We demonstrate that the Myerson payment is the minimal truthful payment mechanism, optimal from the buyer’s perspective, and that VCG and Myerson payments coincide in unconstrained allocation settings. Our findings highlight the importance of incorporating incentive compatibility into data valuation, paving the way for more robust and efficient data markets.
Submission Number: 86
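
The abstract's claim that VCG and Myerson payments coincide in unconstrained allocation settings can be illustrated with a small sketch. It uses a standard textbook procurement formulation and is not necessarily the paper's exact model. The assumed setup: the buyer picks the set of sellers that maximizes value minus reported costs, the Myerson payment is the threshold (the largest cost a seller could report and still be selected), and the VCG payment is the reported cost plus the seller's marginal contribution to welfare. All names (`best_welfare`, `myerson_payment`, `vcg_payment`), the value function, and the cost figures are hypothetical.

```python
from itertools import combinations

def best_welfare(value, costs, sellers, must_include=None):
    """Max over subsets S of `sellers` of value(S) minus reported costs in S."""
    best = float("-inf")
    for r in range(len(sellers) + 1):
        for subset in combinations(sellers, r):
            if must_include is not None and must_include not in subset:
                continue
            welfare = value(frozenset(subset)) - sum(costs[j] for j in subset)
            best = max(best, welfare)
    return best

def myerson_payment(value, costs, i):
    """Threshold payment: largest report at which seller i is still selected."""
    others = [j for j in costs if j != i]
    # Best welfare with i forced in, ignoring i's own cost.
    with_i = best_welfare(value, {**costs, i: 0}, list(costs), must_include=i)
    without_i = best_welfare(value, costs, others)
    threshold = with_i - without_i
    # Ties are broken in the seller's favor (assumption).
    return threshold if costs[i] <= threshold else 0

def vcg_payment(value, costs, i):
    """Reported cost plus seller i's marginal contribution to optimal welfare."""
    sellers = list(costs)
    total = best_welfare(value, costs, sellers)
    forced = best_welfare(value, costs, sellers, must_include=i)
    if forced < total:  # seller i is not in any optimal set
        return 0
    without_i = best_welfare(value, costs, [j for j in sellers if j != i])
    return costs[i] + total - without_i

# Hypothetical diminishing-returns value: depends only on how many sellers join.
def value(subset):
    return [0, 10, 16, 19][len(subset)]

costs = {"A": 2, "B": 5, "C": 7}  # made-up private costs, reported truthfully
myerson = {i: myerson_payment(value, costs, i) for i in costs}
vcg = {i: vcg_payment(value, costs, i) for i in costs}
print(myerson, vcg)
```

In this toy instance the two payment rules agree for every seller. A selected seller's payment also does not depend on its own report: if seller A reports 4 instead of 2 it is still paid the same threshold, and if it reports above the threshold it is dropped and paid nothing. That independence is the intuition behind truthfulness in this sketch.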