A New MapReduce Framework Based on Virtual IP Mechanism and Load Balancing Strategy

Song Yang, Hao Pingting, Hu Jiejun, Hu Liang, Che Xilong
2015 Open Cybernetics and Systemics Journal  
MapReduce is an important method for large-scale data processing on parallel architectures. In the Hadoop ecosystem, MapReduce runs at the application level, which gives the system flexibility. MapReduce is well suited to offline batch processing and can reduce overall execution time. Its weakness is a lack of load balancing and scalability, which leads to low efficiency when dealing with large-scale data. In this paper, we propose a new MapReduce framework that ... more suitable for the Hadoop ecosystem. The framework is based on a virtual IP mechanism and a load balancing strategy. Comparative experiments indicate that the new framework achieves twice the performance of the original MapReduce. In addition, the framework is fully compatible with the Hadoop ecosystem and provides stable and efficient data processing.
doi:10.2174/1874110x01509010253 fatcat:fvkdcz4dtbatpcbsr6wsgj2o5a