DataBright: Towards a Global Exchange for Decentralized Data Ownership and Trusted Computation
David Dao, Dan Alistarh, Claudiu Musat, Ce Zhang
Feb 12, 2018 (modified: Feb 12, 2018). ICLR 2018 Workshop Submission. Readers: everyone
Abstract: It is safe to assume that, for the foreseeable future, machine learning, especially deep learning, will remain both data- and computation-hungry. In this paper, we ask: Can we build a global exchange where everyone can contribute computation and data to train the next generation of machine learning applications?
We present an early but running prototype of DataBright, a system that turns the creation of training examples and the sharing of computation into an investment mechanism. Unlike most crowdsourcing platforms, where contributors are paid when they submit their data, DataBright pays dividends whenever a contributor's data or hardware is used by someone to train a machine learning model. The contributor becomes a shareholder in the dataset they created. Measuring usage also requires a computation platform that contributors can trust. DataBright thus merges a data market and a trusted computation market.
We illustrate that trusted computation can enable the creation of an AI market, where each data point has an exact value that should be paid to its creator. DataBright allows data creators to retain ownership of their contribution and attaches a measurable value to it. The value of the data is given by its utility in subsequent distributed computation done on the DataBright computation market.
The computation market allocates tasks and subsequent payments to pooled hardware. This leads to the creation of a decentralized AI cloud. Our experiments show that trusted hardware such as Intel SGX can be added to the usual ML pipeline with no additional costs. We use this setting to orchestrate distributed computation that enables the creation of a computation market. DataBright is available for download at https://github.com/ds3lab/databright.
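The dividend idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how utility is measured or how payments are split, so everything here is an assumption made for illustration: the function name `split_dividends`, the per-data-point utility scores taken as given inputs, and the proportional split rule are all hypothetical and are not taken from the DataBright code base.

```python
from collections import defaultdict


def split_dividends(fee, usage):
    """Split a training fee among contributors in proportion to utility.

    fee:   total amount paid for one training run (hypothetical unit).
    usage: list of (contributor, utility) pairs, one per data point used.
           The utility scores are assumed to be supplied by the platform.
    Returns a dict mapping each contributor to their payout.
    """
    total_utility = sum(utility for _, utility in usage)
    if total_utility <= 0:
        # Nothing useful was consumed, so nothing is paid out.
        return {}

    payouts = defaultdict(float)
    for contributor, utility in usage:
        # Each data point earns a share of the fee equal to its
        # share of the total utility (assumed proportional rule).
        payouts[contributor] += fee * utility / total_utility
    return dict(payouts)


if __name__ == "__main__":
    run = [("alice", 3.0), ("bob", 1.0), ("alice", 1.0)]
    print(split_dividends(100.0, run))
```

Under this assumed rule, a contributor is paid again each time their data points are used in a new training run, which is what distinguishes a dividend from a one-time payment at submission.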
Keywords: trusted computation, data market, model parallelism, data parallelism, distributed training