Yusuf Perwej
2018 International Journal of Advanced Research  
Today we live in the data age, and a key metric of the present time is the amount of data that originates ubiquitously around us. There is currently an intense increase in the number of Internet subscribers and connected devices, as well as the rise of the IoT. As a result, large quantities of data are generated (so-called Big Data), such as user data (structured, unstructured, or semi-structured), sensor data, and log files. Collecting and analyzing Big Data and providing insights to clients is a growing business for companies. In general, processing such large amounts of data in diverse formats can be time-consuming. Hadoop is an open-source framework used to process large amounts of data in an economical and efficient way, and job scheduling has become a significant factor in attaining high performance in a Hadoop cluster. Job scheduling algorithms are essential for making efficient use of cluster resources and executing jobs in a short time. The fundamental purpose of this paper is to present a classification of Hadoop schedulers along with the existing scheduling algorithms in the Hadoop environment. In addition, this paper summarizes the features, advantages, disadvantages, and limitations of several Hadoop scheduling algorithms.
doi:10.21474/ijar01/6672 fatcat:bdyexu2po5a3fl3wz5i2k4odxy