Abstract: The ubiquity of data lakes has created fascinating new challenges for data management research. In this tutorial, we review the state of the art in data management for data lakes.
We consider how data lakes are introducing new problems,
including dataset discovery, and how they are changing the
requirements for classic problems, including data extraction,
data cleaning, data integration, data versioning, and metadata management.