G-RCA: a generic root cause analysis platform for service quality management in large IP networks

Published: 01 Jan 2010, Last Modified: 06 Aug 2024CoNEXT 2010EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: As IP networks have become the mainstay of an increasingly diverse set of applications ranging from Internet games and streaming videos, to e-commerce and online-banking, and even to mission-critical 911, best effort service is no longer acceptable. This requires a transformation in network management from detecting and replacing individual faulty network elements to managing the service quality as a whole.In this paper we describe the design and development of a Generic Root Cause Analysis platform (G-RCA) for service quality management (SQM) in large IP networks. G-RCA contains a comprehensive service dependency model that includes network topological and cross-layer relationships, protocol interactions, and control plane dependencies. G-RCA abstracts the RCA process into signature identification for symptom and diagnostic events, temporal and spatial event correlation, and reasoning and inference logic. G-RCA provides a flexible rule specification language that allows operators to quickly customize G-RCA into different RCA tools as new problems need to be investigated. G-RCA is also integrated with the data trending, manual data exploration, and statistical correlation mining capabilities. G-RCA has proven to be a highly effective SQM platform in several different applications and we present results regarding BGP flaps, PIM flaps in Multicast VPN service, and end-to-end throughput drop in CDN service.
Loading