Locality and Network-Aware Reduce Task Scheduling for Data-Intensive Applications

Engin Arslan, Mrigank Shekhar, Tevfik Kosar
2014 2014 5th International Workshop on Data-Intensive Computing in the Clouds  
In this paper, we propose a new algorithm (LoNARS) for reduce task scheduling, which takes both data locality and network traffic into consideration.  ...  Effective scheduling of the reduce tasks to the resources becomes especially important for the performance of data-intensive applications where large amounts of data are moved between the map and reduce  ...  In this paper, we propose a locality and network-aware reduce task scheduling algorithm (LoNARS) in order to optimize the shuffle phase of data-intensive MapReduce applications.  ... 
doi:10.1109/datacloud.2014.10 dblp:conf/sc/ArslanSK14 fatcat:grbli3ckprgxhpuda7oglwtxry

A Comprehensive View of MapReduce Aware Scheduling Algorithms in Cloud Environments

Hadi Yazdanpanah, Amin Shouraki, Abbas Ali
2015 International Journal of Computer Applications  
It is best suited for embarrassingly parallel and data-intensive tasks.  ...  This paper tries to illustrate and analyze the overview of thirteen different aware scheduling algorithms with different techniques and approaches for MapReduce in Hadoop and their scheduling issues and  ...  It is best suited for processing parallel and data-intensive tasks.  ... 
doi:10.5120/ijca2015906395 fatcat:m5fax4qzcbfgrhk6p7p2csxpw4

Network-aware scheduling of mapreduce framework on distributed clusters over high speed networks

Praveenkumar Kondikoppa, Chui-Hui Chiu, Cheng Cui, Lin Xue, Seung-Jong Park
2012 Proceedings of the 2012 workshop on Cloud services, federation, and the 8th open cirrus summit - FederatedClouds '12  
Placement of map tasks close to its data split is critical for performance of Hadoop. In this work, we add network awareness in Hadoop while scheduling the map tasks over federated clusters.  ...  We observe 12% to 15% reduction of execution time in FIFO and FAIR schedulers of Hadoop for varying workloads.  ...  The proposed work focuses on improving data locality while scheduling map tasks for global MapReduce so that all types of data intensive applications such as, map-only, map-mostly and map-reduce application  ... 
doi:10.1145/2378975.2378985 fatcat:fokcusmn7zfmzi2m63hk33z6pe

A simulation framework for energy efficient data grids

Ziliang Zong, Kiranmai Bellam, Xiao Qin, Yiming Yang, Xiaojun Ruan, Adam Manzanares
2007 2007 Winter Simulation Conference  
Our framework aims at simulating a data grid that can conserve energy for data-intensive applications running on data grids.  ...  High performance data grids are increasingly becoming popular platforms to support data-intensive applications. Reducing high energy consumption caused by data grids is a challenging issue.  ...  Therefore, the global level scheduler has to communicate with and appropriately trigger several local level schedulers to complete data-intensive tasks submitted by users.  ... 
doi:10.1109/wsc.2007.4419751 dblp:conf/wsc/ZongQRBYM07 fatcat:fbaol3yxmnhkdbaef6nsmgx3ba

Nebula: Distributed edge cloud for data-intensive computing

Mathew Ryden, Kwangsung Oh, Abhishek Chandra, Jon Weissman
2014 2014 International Conference on Collaboration Technologies and Systems (CTS)  
However, they suffer from inefficient data mobility due to the centralization of cloud resources, and hence, are highly unsuited for dispersed data-intensive applications, where the data may be spread at  ...  Centralized cloud infrastructures have become the de-facto platform for data-intensive computing today.  ...  Third, the schedulers are different. In CSCI and CSDI, tasks are assigned randomly without concern for data locality, while the default Nebula scheduler is locality-aware.  ... 
doi:10.1109/cts.2014.6867613 dblp:conf/cts/RydenOCW14 fatcat:aswwg5wi3nbdnpzhfkyujmfx7y

Nebula: Distributed Edge Cloud for Data Intensive Computing

Mathew Ryden, Kwangsung Oh, Abhishek Chandra, Jon Weissman
2014 2014 IEEE International Conference on Cloud Engineering  
However, they suffer from inefficient data mobility due to the centralization of cloud resources, and hence, are highly unsuited for dispersed data-intensive applications, where the data may be spread at  ...  Centralized cloud infrastructures have become the de-facto platform for data-intensive computing today.  ...  Third, the schedulers are different. In CSCI and CSDI, tasks are assigned randomly without concern for data locality, while the default Nebula scheduler is locality-aware.  ... 
doi:10.1109/ic2e.2014.34 dblp:conf/ic2e/RydenOCW14 fatcat:exr2j63wxvcerlfd5axtkvuyuq

Improving MapReduce Performance in Heterogeneous Network Environments and Resource Utilization

Zhenhua Guo, Geoffrey Fox
2012 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)  
We investigate network heterogeneity aware scheduling of both map and reduce tasks.  ...  In MapReduce, map and reduce tasks are assigned to map and reduce slots hosted by worker nodes. Usually the numbers of map and reduce slots are carefully chosen to gain optimal resource usage.  ...  Their native support of data locality aware scheduling dramatically reduces data movement.  ... 
doi:10.1109/ccgrid.2012.12 dblp:conf/ccgrid/GuoF12 fatcat:7cmx4wtff5g2dcsjpzdjqiqfqy

Data-Aware Resource Scheduling for Multicloud Workflows: A Fine-Grained Simulation Approach

Wei Tang, Jonathan Jenkins, Folker Meyer, Robert Ross, Rajkumar Kettimuthu, Linda Winkler, Xi Yang, Thomas Lehman, Narayan Desai
2014 2014 IEEE 6th International Conference on Cloud Computing Technology and Science  
We develop a workflow simulator based on a network simulation framework for fine-grained simulation for workflow computation and data movement.  ...  In this work, we study the impact of data-aware resource management and scheduling on scientific workflows in multicloud environments.  ...  Thus, we need data-aware task allocation to reduce the data movement overhead. E. Simulation for Data-Aware Scheduling In this section, we evaluate data-aware task allocation.  ... 
doi:10.1109/cloudcom.2014.19 dblp:conf/cloudcom/TangJMRKWYLD14 fatcat:tma42lfk5vfc7haxlvcdfv7vr4

Cross-Phase Optimization in MapReduce [chapter]

Benjamin Heintz, Abhishek Chandra, Jon Weissman
2014 Cloud Computing for Data-Intensive Applications  
Using Hadoop, we show that the absence of network and node homogeneity and locality of data lead to poor performance.  ...  Similarly, we propose techniques that optimize the map and reduce phases to enable shuffle cost to feed back and affect map scheduling decisions.  ...  ACKNOWLEDGMENT The authors would like to acknowledge NSF Grant IIS-0916425 and NSF Grant CNS-0643505, which supported this research.  ... 
doi:10.1007/978-1-4939-1905-5_12 fatcat:izaj33b62zhblnxh2vmtno5ivq

CAM

Min Li, Dinesh Subhraveti, Ali R. Butt, Aleksandr Khasymski, Prasenjit Sarkar
2012 Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing - HPDC '12  
MapReduce has emerged as a prevailing distributed computation paradigm for enterprise and large-scale data-intensive computing.  ...  In this paper we propose, CAM, a cloud platform that provides an innovative resource scheduler particularly designed for hosting MapReduce applications in the cloud.  ...  Michela Taufer for her feedback in preparing the final draft of the paper, and to Guanying Wang for his initial ideas on using min-cost flow approach for VM and data placement for supporting virtualized  ... 
doi:10.1145/2287076.2287110 dblp:conf/hpdc/LiSBKS12 fatcat:d6uxuoodvjh3hiflarcl5e6qim

Cross-Phase Optimization in MapReduce

B. Heintz, Chenyu Wang, A. Chandra, J. Weissman
2013 2013 IEEE International Conference on Cloud Engineering (IC2E)  
Using Hadoop, we show that the absence of network and node homogeneity and locality of data lead to poor performance.  ...  Similarly, we propose techniques that optimize the map and reduce phases to enable shuffle cost to feed back and affect map scheduling decisions.  ...  ACKNOWLEDGMENT The authors would like to acknowledge NSF Grant IIS-0916425 and NSF Grant CNS-0643505, which supported this research.  ... 
doi:10.1109/ic2e.2013.26 dblp:conf/ic2e/HeintzWCW13 fatcat:hucgibm4ezb3pd3ek3f3e3aw4y

Hierarchical MapReduce Programming Model and Scheduling Algorithms

Yuan Luo, Beth Plale
2012 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)  
Two scheduling algorithms are introduced: Compute Capacity Aware Scheduling for compute-intensive jobs and Data Location Aware Scheduling for data-intensive jobs.  ...  The applications implemented in this framework adopt the Map-Reduce-GlobalReduce model where computations are expressed as three functions: Map, Reduce, and GlobalReduce.  ...  Philip Papadopoulos and Cindy Zheng of the PRAGMA community for access to PRAGMA Cloud and help bringing up our IU PRAGMA node.  ... 
doi:10.1109/ccgrid.2012.132 dblp:conf/ccgrid/LuoP12 fatcat:56nbhhlslbezrhgxlfzhgxv7nm

Interference and locality-aware task scheduling for MapReduce applications in virtual clusters

Xiangping Bu, Jia Rao, Cheng-zhong Xu
2013 Proceedings of the 22nd international symposium on High-performance parallel and distributed computing - HPDC '13  
In a virtual MapReduce cluster, the interference between virtual machines (VMs) causes performance degradation of map and reduce tasks and renders existing data locality-aware task scheduling policy, like  ...  In this paper, we present a task scheduling strategy to mitigate interference and meanwhile preserving task data locality for MapReduce applications.  ...  NSF grants CCF-1016966 and CNS-0914330.  ... 
doi:10.1145/2493123.2462904 fatcat:u5njufiirnazdhwfcxft7hqsji

THE AMBIENT SCRUTINIZE OF SCHEDULING ALGORITHMS IN BIG DATA TERRITORY

Yusuf Perwej
2018 International Journal of Advanced Research  
It is an increasingly business for companies to collect and analysis Big Data and provides insights to their client.  ...  The job scheduling algorithms are essential for efficient make use of cluster resources and executing them in short time.  ...  This technique to designs a locality-aware, skew-aware reduce task scheduler for stop MapReduce network traffic.  ... 
doi:10.21474/ijar01/6672 fatcat:bdyexu2po5a3fl3wz5i2k4odxy

Software Design and Implementation for MapReduce across Distributed Data Centers

Lizhe Wang, Jie Tao, Yan Ma, Samee U. Khan, Joanna Kołodziej, Dan Chen
2013 Applied Mathematics & Information Sciences  
The MapReduce paradigm has emerged as a highly successful programming model for large-scale data-intensive computing applications.  ...  Recently, the computational requirements for large-scale data-intensive analysis of scientific data have grown significantly.  ...  It is optimized for wide-area operation and offers the required location awareness to allow data-aware scheduling among clusters.  ... 
doi:10.12785/amis/071l13 fatcat:7ip6kgxvc5dgzcyicdhxpyaxem