Hadoop Technology for Flow Analysis of the Internet Traffic
International Journal of Innovative Research in Computer and Communication Engineering
Flow analysis of internet traffic reveals the sequence and pattern of traffic in a network. It helps the network administrator monitor network operations, understand network usage, and examine the behaviour of the network's users. Analysing internet traffic can prevent a wide range of problems. Flow analysis supports fault tolerance, traffic engineering, resource allocation, and network capacity planning. Due to the fast-growing network,
the volume of traffic grows larger every day, so it is very difficult to collect, store, and analyse this data on a single machine. Hadoop is a leading framework designed to process very large datasets, ranging from hundreds of terabytes to petabytes. Hadoop performs a brute-force scan over multiple traces of input data and produces output for traffic flow identification and flow clustering. This paper presents a Hadoop-based analysis of internet traffic. The system accepts a large number of packets from various networks, appends the input to the Hadoop Distributed File System (HDFS), and processes it using MapReduce. The output is then presented graphically to network operators, and a detailed analysis of the internet traffic is performed.
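The MapReduce stage described above can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation and does not use Hadoop itself; it only imitates the map, shuffle, and reduce phases in memory. The packet record layout, the choice of the 5-tuple as flow key, and the function names `map_packet`, `shuffle`, `reduce_flow`, and `analyse_flows` are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical packet record (an assumption, not taken from the paper):
# (src_ip, dst_ip, src_port, dst_port, protocol, size_in_bytes)
Packet = Tuple[str, str, int, int, str, int]
FlowKey = Tuple[str, str, int, int, str]
Counts = Tuple[int, int]  # (packet count, byte count)


def map_packet(packet: Packet) -> Iterator[Tuple[FlowKey, Counts]]:
    """Map phase: emit one (flow key, (1, size)) pair per packet."""
    src_ip, dst_ip, src_port, dst_port, protocol, size = packet
    yield (src_ip, dst_ip, src_port, dst_port, protocol), (1, size)


def shuffle(pairs: Iterable[Tuple[FlowKey, Counts]]) -> Dict[FlowKey, List[Counts]]:
    """Shuffle phase: group all emitted values by flow key."""
    grouped: Dict[FlowKey, List[Counts]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_flow(key: FlowKey, values: List[Counts]) -> Tuple[FlowKey, Counts]:
    """Reduce phase: sum packet and byte counts for one flow."""
    packets = sum(v[0] for v in values)
    total_bytes = sum(v[1] for v in values)
    return key, (packets, total_bytes)


def analyse_flows(packets: Iterable[Packet]) -> Dict[FlowKey, Counts]:
    """Run map, shuffle, and reduce in memory over a packet trace."""
    pairs = (pair for packet in packets for pair in map_packet(packet))
    grouped = shuffle(pairs)
    return dict(reduce_flow(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    trace = [
        ("10.0.0.1", "10.0.0.2", 1234, 80, "TCP", 100),
        ("10.0.0.1", "10.0.0.2", 1234, 80, "TCP", 200),
        ("10.0.0.3", "10.0.0.2", 5555, 53, "UDP", 60),
    ]
    for flow, (count, size) in analyse_flows(trace).items():
        print(flow, count, size)
```

In a real Hadoop deployment the grouping step is carried out by the framework across the cluster, and the input would be read from HDFS rather than from an in-memory list; the sketch only shows how per-packet records collapse into per-flow statistics.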