Fair Revenue Compensation in Data Markets
Keywords: Fairness, Data Markets, Linear Programs, Approximation Algorithms, Optimization
Abstract: The success of ML systems relies heavily on large-scale, high-quality data. The resulting demand for data has led to several paradigms for selling and exchanging it. A key model in this landscape is the data market: a two-sided marketplace that (i) receives ML task requests from buyers, (ii) addresses these requests using the datasets hosted by the sellers, (iii) collects revenue from the buyers, and (iv) compensates the sellers from the earned revenue. Typically, a data market compensates sellers based on their contributions to the total earned revenue, usually measured by standard credit-sharing rules such as Shapley shares. However, we observe that multiple data allocations can yield the same optimal revenue while resulting in vastly different compensation outcomes. For example, when multiple sellers offer equally valuable, *non-complementary* datasets, a revenue-maximizing allocation may select only one of them, thereby excluding the others from compensation despite their comparable data quality. Such discrepancies highlight the need for fairness in revenue distribution. In this paper, we develop a revenue maximization framework for data markets that incorporates fairness constraints for seller compensation. We show that although this problem is NP-hard, we can still obtain an $\mathcal{O}(\log n)$-bi-criteria approximation (approximating both revenue and fairness) in polynomial time.
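The discrepancy described in the abstract can be illustrated with a small sketch. It is not the paper's framework: the two sellers `A` and `B`, the revenue of 10, the `revenue` function, and the rule of computing Shapley shares only among allocated sellers are all hypothetical assumptions chosen for illustration.

```python
from itertools import permutations


def shapley_shares(players, value):
    """Exact Shapley shares: average marginal contribution over all orderings."""
    shares = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            shares[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: s / len(orderings) for p, s in shares.items()}


# Hypothetical toy market: sellers A and B host interchangeable datasets,
# and the buyer pays 10 as soon as at least one of them is allocated.
REVENUE = 10.0


def revenue(allocated):
    return REVENUE if allocated else 0.0


# Allocation 1 (assumed): only A is selected, so B earns nothing.
alloc_only_a = shapley_shares(["A"], revenue)
alloc_only_a["B"] = 0.0

# Allocation 2 (assumed): both sellers are selected and split the credit.
alloc_both = shapley_shares(["A", "B"], revenue)

print(alloc_only_a, alloc_both)
```

Under these assumptions both allocations earn the same total revenue, yet the first pays seller `B` nothing while the second splits the revenue evenly, which is the kind of gap a fairness constraint on compensation is meant to close.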
Area: Game Theory and Economic Paradigms (GTEP)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 1516