Abstract: We develop a framework for capturing the instrumental
value of data production processes, which
accounts for two key factors: (a) the context of
the agent’s decision-making; (b) how much data
or information the buyer already possesses. We
"micro-found" our data valuation function by establishing
its connection to classic notions of signals
and information design in economics. When
instantiated in Bayesian linear regression, our
value naturally corresponds to information gain.
Applying our proposed data value in Bayesian linear
regression for monopoly pricing, we show that
if the seller can fully customize data production,
she can extract the first-best revenue (i.e., full surplus)
from any population of buyers, thereby achieving
first-degree price discrimination. If data can
only be constructed from an existing data pool,
this limits the seller’s ability to customize, and
achieving first-best revenue becomes generally
impossible. However, we design a mechanism
that achieves seller revenue at most $\log(\kappa)$ less
than the first-best, where $\kappa$ is the condition number
associated with the data matrix. As a corollary,
the seller extracts the first-best revenue in the
multi-armed bandit special case.
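To make the information-gain claim concrete, the following minimal sketch computes the standard Gaussian mutual information between the regression parameter and new observations, $\frac{1}{2}\log\det(I + \sigma^{-2} X \Sigma X^\top)$. It assumes a zero-mean Gaussian prior with covariance $\Sigma$ and i.i.d. Gaussian noise with variance $\sigma^2$; the function names and the synthetic data are hypothetical, and the paper's exact value function may differ in scaling or conditioning. It illustrates factor (b): the same batch of data is worth less to a buyer who already holds other data.

```python
import numpy as np

def information_gain(X, prior_cov, noise_var=1.0):
    """Mutual information (in nats) between theta ~ N(0, prior_cov)
    and y = X @ theta + noise, with noise ~ N(0, noise_var * I)."""
    X = np.atleast_2d(X)
    n = X.shape[0]
    M = np.eye(n) + X @ prior_cov @ X.T / noise_var
    _, logdet = np.linalg.slogdet(M)  # numerically stable log-determinant
    return 0.5 * logdet

def posterior_cov(X, prior_cov, noise_var=1.0):
    """Posterior covariance of theta after observing responses at rows of X."""
    precision = np.linalg.inv(prior_cov) + X.T @ X / noise_var
    return np.linalg.inv(precision)

# Synthetic illustration (assumed data, not from the paper).
rng = np.random.default_rng(0)
d = 3
prior = np.eye(d)
X_owned = rng.normal(size=(5, d))  # data the buyer already possesses
X_new = rng.normal(size=(4, d))    # data offered by the seller

gain_fresh = information_gain(X_new, prior)
gain_after = information_gain(X_new, posterior_cov(X_owned, prior))
gain_owned = information_gain(X_owned, prior)
gain_joint = information_gain(np.vstack([X_owned, X_new]), prior)

print(f"value to a buyer with no data:   {gain_fresh:.4f}")
print(f"value to a buyer holding X_owned: {gain_after:.4f}")
```

Under these assumptions the gains obey the chain rule, so `gain_owned + gain_after` equals `gain_joint`, and `gain_after` never exceeds `gain_fresh`.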
Lay Summary: How do we determine the value of data to an agent? It depends on the problem the agent is facing and the amount of information they already possess. From the perspective of rational agent decision-making, we propose an instrumental value framework that characterizes valid data valuation. Notably, we show that in the case of Bayesian linear regression, this value coincides with information gain. We then apply our instrumental value framework to a monopoly data pricing setting. We find that when the seller can perfectly customize data production, the buyer's surplus is zero, leading to severe market asymmetry and unfairness. In contrast, under limited customization, we derive an upper bound on the buyer's surplus. This prompts broader reflections on how to price such novel products in the data era and the resulting concerns about market fairness.
Primary Area: Theory->Game Theory
Keywords: Instrumental Value, Data Production Process, Data Pricing, Data Customization
Submission Number: 8273