He Yan, Lee Breslau, Zihui Ge, Dan Massey, Dan Pei, Jennifer Yates
2010 Proceedings of the 6th International Conference on - Co-NEXT '10
As IP networks have become the mainstay of an increasingly diverse set of applications ranging from Internet games and streaming videos, to e-commerce and online banking, and even to mission-critical 911 over VoIP, best effort service is no longer acceptable. This requires a transformation in network management, changing its focus from detecting and replacing individual faulty network elements, such as routers and line cards, to managing the service quality as a whole for end-users. In this paper we describe the design and development of a Generic Root Cause Analysis platform (G-RCA) for service quality management (SQM) in large IP networks. G-RCA contains a comprehensive service dependency model that includes network topological and cross-layer relationships, protocol interactions, and routing and control plane dependencies. G-RCA abstracts the RCA process into signature identification for symptom and diagnostic events, temporal and spatial event correlation, and reasoning and inference logic. G-RCA provides a simple yet flexible rule specification language that allows operators to quickly customize G-RCA into different RCA tools as new problems need to be investigated and understood. G-RCA is also integrated with the data trending, manual data exploration, and statistical correlation mining capabilities that are tailored for SQM. G-RCA has proven to be a highly effective SQM platform in several different applications and we present results regarding BGP flaps, PIM flaps in Multicast VPN service, and end-to-end throughput drop in CDN service.
doi:10.1145/1921168.1921175 dblp:conf/conext/YanBGMPY10 fatcat:rye4ohf23zdgldqu6l2yamfotq