Configuring a MapReduce Framework for Performance-Heterogeneous Clusters

Jessica Hartog, Renan Delvalle, Madhusudhan Govindaraju, Michael J. Lewis
2014 2014 IEEE International Congress on Big Data  
Our study further suggests the opportunity for cluster managers to build performance-heterogeneous clusters by design, if they also run MapReduce frameworks that can exploit them.  ...  Our results suggest that frameworks should support finer grained sub-tasking and dynamic data partitioning when running on some performance-heterogeneous clusters.  ...  We identify beneficial framework configurations for adapting to performance-heterogeneous clusters.  ... 
doi:10.1109/bigdata.congress.2014.26 dblp:conf/bigdata/HartogDGL14 fatcat:7xkyijf3sjhtvms5evdvbjfdty

Improving MapReduce performance in heterogeneous environments with adaptive task tuning

Dazhao Cheng, Jia Rao, Yanfei Guo, Xiaobo Zhou
2014 Proceedings of the 15th International Middleware Conference on - Middleware '14  
Despite existing optimizations on task scheduling and load balancing, MapReduce still performs poorly on heterogeneous clusters.  ...  As most MapReduce implementations assume homogeneous clusters, heterogeneity can cause significant load imbalance in task execution, leading to poor performance and low cluster utilizations.  ...  [2] identified key reasons for MapReduce poor performance on heterogeneous clusters.  ... 
doi:10.1145/2663165.2666089 dblp:conf/middleware/ChengRGZ14 fatcat:cv2ljujxkrexfcnkwdqaunf6xq

Heterogeneous cores for MapReduce processing: Opportunity or challenge?

Feng Yan, Ludmila Cherkasova, Zhuoyao Zhang, Evgenia Smirni
2014 2014 IEEE Network Operations and Management Symposium (NOMS)  
In this work, we design a new Hadoop scheduler, called DyScale, that exploits capabilities offered by heterogeneous cores for achieving a variety of performance objectives.  ...  Our preliminary performance evaluation results confirm potential benefits of heterogeneous multi-core processors for "faster" processing of the small, interactive MapReduce jobs, while at the same time  ...  FRAMEWORK DESIGN In this section, we outline a new Hadoop scheduling framework DyScale, which can efficiently use the heterogeneous multi-core processors for MapReduce processing.  ... 
doi:10.1109/noms.2014.6838339 dblp:conf/noms/YanCZS14 fatcat:jliihgk4ozbl3il4slco4r5ihe

Hadoop+

Wenting He, Youliang Yan, Huimin Cui, Binbin Lu, Jiacheng Zhao, Shengmei Li, Gong Ruan, Jingling Xue, Xiaobing Feng, Wensen Yang
2015 Proceedings of the 29th ACM on International Conference on Supercomputing - ICS '15  
In a CPU/GPU hybrid heterogeneous cluster, allocating more computing resources to a MapReduce application does not always mean better performance, since simultaneously running CPU and GPU tasks will contend  ...  Despite the widespread adoption of heterogeneous clusters in modern data centers, modeling heterogeneity is still a big challenge, especially for large-scale MapReduce applications.  ...  We would like to thank all the reviewers for their valuable comments and suggestions.  ... 
doi:10.1145/2751205.2751236 dblp:conf/ics/HeCLZLRXFYY15 fatcat:vcdt3vdwlrczlbkxy3iqcdg4ey

DyScale: A MapReduce Job Scheduler for Heterogeneous Multicore Processors

Feng Yan, Ludmila Cherkasova, Zhuoyao Zhang, Evgenia Smirni
2017 IEEE Transactions on Cloud Computing  
Here, we prototype and evaluate a new Hadoop scheduler, called DyScale, that exploits capabilities offered by heterogeneous cores within a single multi-core processor for achieving a variety of performance  ...  MapReduce jobs, while offering improved throughput (up to 40 percent) for large, batch jobs.  ...  This new framework aims at taking advantage of capabilities of heterogeneous cores for achieving a variety of performance objectives.  ... 
doi:10.1109/tcc.2015.2415772 fatcat:ls62o7yt2vbulnpw5zfwzj5lui

Configuring a MapReduce Framework for Dynamic and Efficient Energy Adaptation

Jessica Hartog, Zacharia Fadika, Elif Dede, Madhusudhan Govindaraju
2012 2012 IEEE Fifth International Conference on Cloud Computing  
MapReduce has become a popular framework for Big Data applications.  ...  Our work shows that given an ideal framework configuration, certain nodes may consume only 62.3% of the dynamic power they consumed when the same framework was configured as it would be in a traditional  ...  The primary difference between our framework and MARLA is that we have scheduled this framework for energy awareness, whereas MARLA schedules for performance in heterogeneous clusters alone.  ... 
doi:10.1109/cloud.2012.137 dblp:conf/IEEEcloud/HartogFDG12 fatcat:tjnqvrl5ljfsrhbbvsuerlthla

Observations on Factors Affecting Performance of MapReduce based Apriori on Hadoop Cluster [article]

Sudhakar Singh, Rakhi Garg, P. K. Mishra
2017 arXiv   pre-print
In this paper, we have focused on the performance of MapReduce based Apriori on homogeneous as well as on heterogeneous Hadoop cluster.  ...  We have investigated a number of factors that significantly affects the execution time of MapReduce based Apriori running on homogeneous and heterogeneous Hadoop Cluster.  ...  Section 3 summarizes works related to optimization of Apriori on MapReduce framework and performance improvement of MapReduce job on heterogeneous clusters.  ... 
arXiv:1701.05982v1 fatcat:mt3nog3purhwbkh2pxwnhoc75a

Enhance Performance of Mapreduce Job on Hadoop Framework using Setup and Cleanup

Priyam Jain, Satyaranjan Patra, Pankaj Richhariya
2016 International Journal of Computer Applications  
To improve the performance of MapReduce in heterogeneous or shared environments, a data prefetching mechanism is proposed. In this paper, we can fetch the data to corresponding compute nodes in advance  ...  MapReduce is an effective programming model for large-scale data-intensive computing applications. Hadoop is an open-source implementation of MapReduce which has been widely used.  ...  Improve MapReduce Performance through Data Placement in Heterogeneous Hadoop Clusters [5].  ... 
doi:10.5120/ijca2016912400 fatcat:oyt5osvutraydkgvwnieemxmpm

A time–energy performance analysis of MapReduce on heterogeneous systems with GPUs

Dumitrel Loghin, Lavanya Ramapantulu, Oana Barbu, Yong Meng Teo
2015 Performance evaluation (Print)  
We evaluate the time and energy performance of three MapReduce applications with diverse resource demands on a Hadoop-CUDA framework.  ...  To investigate this, we perform a time-energy analysis of MapReduce on intra-node and intra-chip heterogeneous systems.  ...  Acknowledgements We are grateful to Nvidia for providing us with four Jetson TK1 boards.  ... 
doi:10.1016/j.peva.2015.06.015 fatcat:h2e3dk2dwjawdlgc4t3g7g3lme

Enabling Computational Steering with an Asynchronous-Iterative Computation Framework

Alexandre di Costanzo, Chao Jin, Carlos A. Varela, Rajkumar Buyya
2009 2009 Fifth IEEE International Conference on e-Science  
The framework supports steerable applications by introducing an asynchronous iterative MapReduce programming model that is deployed using Hadoop over a set of virtual machines executing on a multi-cluster  ...  In this paper, we present a framework that enables scientists to steer computations executing over large-scale grid computing environments.  ...  Due to the heterogeneity between machines from different sites and clusters, the performance does not increase linearly.  ... 
doi:10.1109/e-science.2009.43 dblp:conf/eScience/CostanzoJVB09 fatcat:vtp4777lofhtfapgu5jfvpocve

The Creation and Placement of VMs and Tasks in Virtualized Hadoop Cluster Environments

Tae-Won Kim, Hae-jin Chung, Joon-Mo Kim
2012 Journal of Korea Multimedia Society  
But, when we configure distributed processing system for big data in virtual machine environments, many problems occur.  ...  In this paper, we did an experiment on the optimization of I/O bandwidth according to the creation and placement of VMs and tasks with composing Hadoop cluster in virtual environments and evaluated the  ...  For the third experiment heterogeneous environments of Hadoop virtual machine cluster builds as cluster C in Fig. 5 below.  ... 
doi:10.9717/kmms.2012.15.12.1499 fatcat:7liqhah4fbcbnm5xd3ga434ppe

User-Centric Heterogeneity-Aware MapReduce Job Provisioning in the Public Cloud

Eric Pettijohn, Yanfei Guo, Palden Lama, Xiaobo Zhou
2014 IEEE International Conference on Autonomic Computing  
In this paper, we propose and develop U-CHAMPION, a user-centric middleware that automates job provisioning and configuration of the Hadoop MapReduce framework in a public cloud to improve job performance  ...  In our case study on Amazon's EC2 public cloud, we observe that the average execution time of Hadoop MapReduce jobs vary by up to 30% in spite of using identical VM instances for the Hadoop cluster.  ...  AROMA [16] provides a novel framework for automated parameter estimation and cluster resource provisioning in order to maximize job performance in a given cluster.  ... 
dblp:conf/icac/PettijohnGLZ14 fatcat:6dv427e47vgifhhmwyapbwnhpy

A Survey on Job Scheduling in Big Data

M. Senthilkumar, P. Ilango
2016 Cybernetics and Information Technologies  
In this paper, we discussed various tools and frameworks used for monitoring and the ways to improve the performance in MapReduce.  ...  The Hadoop framework becomes very popular and most used frameworks in a distributed data processing. Hadoop is also open source software that allows the user to effectively utilize the hardware.  ...  In a heterogeneous environment synchronization becomes a problem; each node in the Hadoop cluster has unique computation facility, hardware and bandwidth and this leads to decreased performance of Hadoop  ... 
doi:10.1515/cait-2016-0033 fatcat:psrtc3l3dzgkfklqjl6hr3qczi

A Comprehensive View of MapReduce Aware Scheduling Algorithms in Cloud Environments

Hadi Yazdanpanah, Amin Shouraki, Abbas Ali
2015 International Journal of Computer Applications  
MapReduce has been widely used as a Big Data processing platform, proposed by Google in 2004 and has become a popular parallel computing framework for large-scale data processing since then.  ...  This paper tries to illustrate and analyze the overview of thirteen different aware scheduling algorithms with different techniques and approaches for MapReduce in Hadoop and their scheduling issues and  ...  This scheduler increases the performance in heterogeneous Hadoop clusters. Although still in a simulation stage, this approach seeks performance gains by using the best of each node on the cluster.  ... 
doi:10.5120/ijca2015906395 fatcat:m5fax4qzcbfgrhk6p7p2csxpw4

AdMap: a framework for advertising using MapReduce pipeline

Abhay Chaudhary, K R Batwada, Namita Mittal, Emmanuel S. Pilli
2022 Computer Science and Information Technologies  
Hence there is a void formed between the producer and the client. To fill that void, there is the need for a framework which can facilitate all the needs for query updating of the data.  ...  There is a vast collection of data for consumers due to tremendous development in digital marketing.  ...  Claim 3, in which this pipeline data charge framework is running sequentially, requires MapReduce jobs on that Hadoop cluster for most of the stage A corresponding task is to be performed. Claim 7.  ... 
doi:10.11591/csit.v3i2.p82-93 fatcat:b4p3un3mpbgy5df2dutd4qbrhu