Bigbird: Exabyte-Scale Big Data Storage and Analytics Framework in Hybrid Cloud

Published: 23 Jan 2023, Last Modified: 03 Feb 2025. Venue: OpenReview Archive Direct Upload. Readers: Everyone. License: CC BY-ND 4.0
Abstract: Implementing big data storage at scale is a complex and arduous task that requires advanced infrastructure. With the rise of public cloud computing, various big data management services can be readily leveraged. As a critical part of Twitter’s “Project Partly Cloudy”, the cold storage data and analytics systems are being moved to the public cloud. This paper presents our approach to designing a scalable big data storage and analytics management framework using BigQuery on Google Cloud Platform while ensuring security, privacy, and data protection. The paper also discusses the limitations of public cloud resources and how they can be overcome when designing an exabyte-scale big data storage and analytics solution. Although the paper discusses the framework’s implementation on Google Cloud Platform, the same design can be applied to other major cloud providers.