Abstract: In industry practice, zone models have become established as the data lake architecture of choice for enabling the reuse of data preparation, data modeling, and analytical results across the entire platform. However, developers and architects implementing a zone-based data lake lack guidance, which leads to costly, complex solutions that may apply to only a single case. This paper addresses this challenge. Based on a comprehensive literature review and practical experience from a large-scale enterprise context, we identify different groups of implementation approaches for zone models. We then derive nine systematic implementation patterns for zone-based data lake architectures. We evaluate the applicability and benefit of these patterns using three real-world case studies from different business contexts. Our assessment shows how the developed patterns support an effective and standardized implementation process for zone-based data lakes in enterprises.