Scalability Study of Hadoop MapReduce and Hive in Big Data Analytics

Khadija Jabeen
2016 International Journal Of Engineering And Computer Science  
Hadoop is a data management solution for the analysis of Big Data. In Hadoop, Hive is used to store the metadata. This study compares the scalability of Hadoop MapReduce and Hive for small and medium datasets, and also shows how metadata can be created, loaded, accessed and stored using Hive, a data warehousing solution built on top of Hadoop. To compare the scalability of the two, a word count program was investigated using both data management solutions: Hadoop MapReduce and Hive. The comparison demonstrates that the Hadoop MapReduce programming model is very low level and requires developers to write custom programs that are hard to maintain and reuse, whereas Hive uses an SQL-like query language called HiveQL to store large amounts of data in less time and can also plug MapReduce scripts into queries.
doi:10.18535/ijecs/v5i11.11 fatcat:6jjayno47fci5kf7532kf7qssq
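
The contrast the abstract draws between the two programming models can be illustrated with a minimal sketch of a word count. This is not the code used in the study: the pure-Python functions below only imitate the map, shuffle and reduce phases in memory, without Hadoop, and the names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are hypothetical. The HiveQL string is included for comparison only and is not executed; the table `docs` and its column `line` are assumptions.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as the framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reducer: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = chain.from_iterable(map_phase(line) for line in lines)
    return reduce_phase(shuffle_phase(pairs))


# Declarative counterpart in HiveQL, shown for contrast only (not executed).
# The table `docs` and its column `line` are hypothetical.
HIVEQL_WORD_COUNT = """
SELECT word, COUNT(*) AS total
FROM docs
LATERAL VIEW explode(split(line, ' ')) t AS word
GROUP BY word
"""

if __name__ == "__main__":
    sample = ["Hive runs on Hadoop", "Hadoop runs MapReduce jobs"]
    print(word_count(sample))
```

In the MapReduce style each phase is written by hand, while the HiveQL version states only the desired result and leaves the job plan to Hive, which is the maintainability difference the abstract describes.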