Content Replication in a Distributed and Controlled Environment

Bo Li

Published: 1999, Last Modified: 08 Aug 2024. J. Parallel Distributed Comput. 1999. CC BY-SA 4.0

Abstract: Recently, we have witnessed phenomenal growth in the Internet/Intranet coupled with rapid deployment of new services. Information dissemination over the network has become one of the most important activities in our daily life. The existing systems, however, often suffer from notoriously long delays experienced by clients, especially during peak hours. This is caused by a combination of factors, including inadequate link bandwidth, server overload, and network congestion. Content replication has been shown to be one of the most effective mechanisms to cope with this problem. The basic idea is to replicate the information across a network so that clients' requests can be spread out. One of the major issues is at which locations inside the network these replications should take place, i.e., where to place the replicated servers. In this paper we investigate content replication in a controlled and distributed environment. The salient feature of this environment is that the decision of where to replicate information can be made by a single authority, the Intranet being the typical example. We consider the problem of placing multiple replicated servers within a network, given that there exist multiple target web servers as information providers. We formulate this as an optimization problem by taking into consideration the characteristics of the network topology. We first show that this is an NP-complete problem, and then we present a number of heuristic-based algorithms for content replication. Finally, in order to investigate various trade-offs in terms of cost and algorithm complexity, we carry out comparison studies among the different heuristic algorithms and also with an optimal placement algorithm we recently proposed for the single target web server environment.
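To make the placement problem concrete, the following is a minimal illustrative sketch of one generic greedy heuristic. It is not taken from the paper, and the abstract does not specify the cost model or the heuristics studied. Everything here is an assumption made for illustration: the cost is taken to be request rate times hop distance to the nearest copy, every replica is assumed to hold the content of all target servers, and the names `placement_cost`, `greedy_placement`, `dist`, and `demand` are hypothetical.

```python
# Illustrative sketch only: a generic greedy replica placement, NOT the
# paper's algorithm. Assumed cost model: each request costs
# (request rate) x (distance to the nearest copy of the content).

def placement_cost(dist, demand, replicas):
    """Total cost when each request goes to the closer of its target
    server and the nearest replica (replicas are assumed to hold the
    content of every target server)."""
    total = 0
    for client, per_target in demand.items():
        nearest_replica = min(
            (dist[client][r] for r in replicas), default=float("inf")
        )
        for target, rate in per_target.items():
            total += rate * min(dist[client][target], nearest_replica)
    return total


def greedy_placement(dist, demand, candidates, k):
    """Pick up to k replica sites, one at a time, each time choosing the
    candidate that lowers the total cost the most."""
    chosen = []
    for _ in range(k):
        remaining = [n for n in candidates if n not in chosen]
        if not remaining:
            break
        best = min(
            remaining,
            key=lambda n: placement_cost(dist, demand, chosen + [n]),
        )
        # Stop early if no remaining site improves the cost.
        if placement_cost(dist, demand, chosen + [best]) >= placement_cost(
            dist, demand, chosen
        ):
            break
        chosen.append(best)
    return chosen


# Hypothetical example: a 5-node line network 0-1-2-3-4 with target
# servers at nodes 0 and 4; dist[i][j] is the hop count |i - j|.
dist = [[abs(i - j) for j in range(5)] for i in range(5)]
# demand[client][target] = request rate from client to that target.
demand = {2: {0: 10, 4: 10}, 1: {4: 1}}

sites = greedy_placement(dist, demand, candidates=[1, 2, 3], k=2)
print(sites, placement_cost(dist, demand, sites))
```

A greedy choice like this is cheap to compute but carries no optimality guarantee in general, which is the kind of cost versus complexity trade-off the comparison studies in the abstract refer to.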