Ultra-large scale services

Published: 01 Jan 2012, Last Modified: 11 Feb 2025, CASCON 2012, CC BY-SA 4.0
Abstract: Ultra-large-scale (ULS) services are extremely large software systems across all dimensions, such as the size of their code base, the number of users, the amount of data transferred, the number of developers, and the infrastructure for running these services. ULS services are at the core of many current Web 2.0 applications and are the building blocks of future Web 3.0 applications. ULS services provide the facilities needed for communication (the BlackBerry platform and the Rogers or Bell wireless networks), international banking (Interac and Visa), e-commerce (the eBay auction system and the Amazon Elastic Compute Cloud), social communities (Facebook and MySpace), massively multiplayer online games (World of Warcraft) and future e-healthcare deployments (Health Canada's Infostructure). ULS services are causing a revolution in computing. The scale of ULS services is often hard to grasp. Current ULS services are composed of thousands of hardware nodes, petabytes of data in databases, billions of lines of code, and millions of stakeholders.

The Software Engineering Institute (SEI at Carnegie Mellon) report, Ultra-Large-Scale Systems: The Software Challenge of the Future (June 2006), highlighted the dire need for unique and modern approaches to cope with the scale, the wide and varied worldwide user base, the frequent failures and the evolving requirements of these systems. The National Science Foundation (NSF) and the UK Engineering and Physical Sciences Research Council (EPSRC), which recently established research and training centres for Ultra-Large-Scale Software Intensive Systems, have heard these sentiments. More recently (2010), the IEEE "Future Directions Committee" highlighted the need for a shift within IEEE to cater to this critical and important domain. Research on a variety of challenging topics is needed to ensure that ULS services can be created and managed reliably in a cost-effective manner.
Figure 1: Delivering Ultra Large Scale Services (ULSS)

Figure 1 presents the four main perspectives of an efficient ULS service, namely Focused Quality, Flexible Delivery, Scalable Quality, and Adaptive Infrastructure. Focused Quality implies delivering high quality ULS services, which is a top priority for all service providers. However, the characteristics of ULS services make this a very challenging and complex goal to achieve. The dynamic nature of ULS services (being composed of many other ULS services) and their large and varied user base (often consisting of millions of users with varying needs and expectations) increase the complexity of ensuring the quality of ULS services. For instance, it is impossible to test and verify every possible configuration and usage pattern, as these configurations and patterns are continuously changing. This situation has led many ULS providers to depend heavily on monitoring approaches as a way to track the quality of their ULS services post-release. However, quality issues should be addressed early, rather than delayed until post-release, where quality improvement options are very limited and costly. Techniques and approaches are needed to assist practitioners in focusing their limited resources on the quality improvement efforts which have the highest return, that is, those that are most likely to improve the customer experience.

In the world of ULS services, services must handle millions of users with varying needs and capabilities, which is why flexible delivery is recognized as an important aspect. For example, a medical remote diagnostic service that is part of Health Canada's Infostructure might be used by a senior citizen with limited knowledge about medical terms and options, or by a seasoned emergency response worker who requires a more elaborate tool to support decision making. Providing a single service to satisfy the needs of both users is not feasible today. Instead, different services must be created and maintained. Moreover, users of such a medical service might need to perform resource-intensive operations (e.g., watching videos or performing complex simulations based on input data); these operations must be delivered in a flexible manner instead of simply performing all the operations either locally or centrally (i.e., at the data center). ULS services need to support both flexible personalization of the service and access through a variety of devices, including smart phones and tablets, in order to accommodate all potential users.

ULS systems provide computing, storage and abstraction services, allowing applications to access services with limited knowledge of, expertise with, or control over, the technology infrastructure that supports them. For example, the BlackBerry platform connects the wired and wireless internet infrastructure to enable the seamless integration of enterprise and entertainment services between both types of networks. ULS systems are deployed at a worldwide scale and most systems provide an application programming interface (API) to extend various functionalities. These APIs create an ecosystem through which ULS systems grow their user base. However, this leads to varying and innovative usage patterns that must be taken into account, increasing the importance of the reliability of ULS systems while also increasing the complexity of ensuring such reliability. Current industrial approaches to coping with the characteristics of ULS services do ensure that they are of high quality, but they are usually ad hoc, last-resort efforts. Principled approaches are needed to provide scalable high quality services and to evolve these services in an efficient manner while handling their continuously growing user base. For example, current monitoring approaches must be enhanced and adapted to deal with the characteristics of ULS services; current testing techniques and approaches must be revised based on an understanding of the impact of very large user bases on the performance and correctness of software systems in general and ULS services in particular; and traditional security approaches must be adapted to cope with the complexity of ULS services and their evolving usage.

To remain competitive, the cost of providing ULS services must scale well as new users and/or units are added. Some ULS providers advocate the need for sub-linear growth in cost, that is, the cost per user/unit should decrease as more users are added to their services. Given the vast number of users, the fluctuation in workload intensities and complexities, the large number of services that must be supported and the unprecedented volumes of different types of data to be processed, it is imperative that ULS infrastructures, on which ULS services run, adapt to the growth, expansion and ever-increasing demands placed on them. The infrastructures, therefore, need to provide autonomic management of the resources, elastic provisioning of storage and other resources to services, and support for high availability.