
Network-aware scheduling of MapReduce framework on distributed clusters over high speed networks

Praveenkumar Kondikoppa, Chui-Hui Chiu, Cheng Cui, Lin Xue, Seung-Jong Park
2012 Proceedings of the 2012 workshop on Cloud services, federation, and the 8th open cirrus summit - FederatedClouds '12  
In this work, we add network awareness in Hadoop while scheduling the map tasks over federated clusters.  ...  Google's MapReduce has gained significant popularity as a platform for large scale distributed data processing.  ...  In our implementation of global MapReduce architecture, network topology script has information about virtual machines and physical location of the cluster from which they are provisioned.  ... 
doi:10.1145/2378975.2378985 fatcat:fokcusmn7zfmzi2m63hk33z6pe

vLocality: Revisiting Data Locality for MapReduce in Virtualized Clouds

Xiaoqiang Ma, Xiaoyi Fan, Jiangchuan Liu, Hongbo Jiang, Kai Peng
2017 IEEE Network  
In this article, through real-world experiments, we show strong evidence that the conventional notion of data locality is unfortunately not always beneficial for MapReduce in a virtualized environment.  ...  State-of-the-art public clouds heavily rely on virtualization to enable resource sharing and scaling for massive users, however.  ...  This work was supported in part by the National Natural Science Foundation of China under Grants 61572219 and 61502192, a Canada NSERC Discovery Grant, and a Canada NSERC Strategic Project Grant.  ... 
doi:10.1109/mnet.2016.1500133nm fatcat:2bhvbzxsxvagff5tq4hzezztia

Morpho: A decoupled MapReduce framework for elastic cloud computing

Lu Lu, Xuanhua Shi, Hai Jin, Qiuyue Wang, Daxing Yuan, Song Wu
2014 Future generations computer systems  
A load-aware data placement strategy is complementary to the VM placement. Abstract MapReduce as a service enjoys wide adoption in commercial clouds today [1, 2] .  ...  In cloud environments, the basic executing units of data processing are virtual machines.  ...  Conclusion This paper presents Morpho, a decoupled but locality-aware MapReduce framework for cloud computing.  ... 
doi:10.1016/j.future.2013.12.026 fatcat:ff4nnpbpa5buhjy7kaniyjkx2u


Min Li, Dinesh Subhraveti, Ali R. Butt, Aleksandr Khasymski, Prasenjit Sarkar
2012 Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing - HPDC '12  
The model is also increasingly used in the massively-parallel cloud environment, where MapReduce jobs are run on a set of virtual machines (VMs) on pay-as-needed basis.  ...  In this paper we propose, CAM, a cloud platform that provides an innovative resource scheduler particularly designed for hosting MapReduce applications in the cloud.  ...  MapReduce.  ... 
doi:10.1145/2287076.2287110 dblp:conf/hpdc/LiSBKS12 fatcat:d6uxuoodvjh3hiflarcl5e6qim

Locality and loading aware virtual machine mapping techniques for optimizing communications in MapReduce applications

Ching-Hsien Hsu, Kenn D. Slagter, Yeh-Ching Chung
2015 Future generations computer systems  
Highlights: • Improving performance of MapReduce programs in heterogeneous environments and hybrid clouds. • Enhancing data locality through a virtual machine mapping technique. • Optimizing shuffle  ...  In this paper we propose a method to improve MapReduce execution in heterogeneous environments.  ...  Research into on MapReduce clouds [26] looked at how locality aware VM reconfiguration could be attained.  ... 
doi:10.1016/j.future.2015.04.006 fatcat:zygcfqyn2ne4neif3vpiwc7yki

The Study of a Hierarchical Hadoop Architecture in Multiple Data Centers Environment

Sun Shengtao, Wu Aizhi, Liu Xiaoyang
2015 Open Cybernetics and Systemics Journal  
Hadoop is a reasonable tool for cloud computing in big data era and MapReduce paradigm may be a highly successful programming model for large-scale data-intensive computing application, but the conventional  ...  The job submitted by user can be decomposed automatically into several sub-jobs which are then allocated and executed on corresponding clusters by location-aware manner.  ...  ACKNOWLEDGEMENTS The authors would like to thank the advices of Professor Lizhe Wang and the help from team members of Data Technology Department in CEODE (Center of Earth Observation and Digital Earth  ... 
doi:10.2174/1874110x01509010131 fatcat:ic5m5se3hrcf3ov7jbpan7wruu

Interference and locality-aware task scheduling for MapReduce applications in virtual clusters

Xiangping Bu, Jia Rao, Cheng-zhong Xu
2013 Proceedings of the 22nd international symposium on High-performance parallel and distributed computing - HPDC '13  
In a virtual MapReduce cluster, the interference between virtual machines (VMs) causes performance degradation of map and reduce tasks and renders existing data locality-aware task scheduling policy, like  ...  We implement the interference and locality-aware (ILA) scheduling strategy in a virtual MapReduce framework. We evaluated its effectiveness and efficiency on a 72-node Xen-based virtual cluster.  ...  This research was supported in part by U.S. NSF grants CCF-1016966 and CNS-0914330.  ... 
doi:10.1145/2493123.2462904 fatcat:u5njufiirnazdhwfcxft7hqsji


Balaji Palanisamy, Aameek Singh, Ling Liu, Bhushan Jain
2011 Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '11  
Purlieus provisions virtual MapReduce clusters in a locality-aware manner enabling MapReduce virtual machines (VMs) access to input data and importantly, intermediate data from local or close-by physical  ...  We present Purlieus, a MapReduce resource allocation system aimed at enhancing the performance of MapReduce jobs in the cloud.  ...  Using virtual machines (VMs) and storage hosted by the cloud, enterprises can simply create virtual MapReduce clusters to analyze their data.  ... 
doi:10.1145/2063384.2063462 dblp:conf/sc/PalanisamySLJ11 fatcat:gslnfk3k3jgjpdb4rilqphz5sq

Dependency-Aware Data Locality for MapReduce

Xiaoyi Fan, Xiaoqiang Ma, Jiangchuan Liu, Dan Li
2014 2014 IEEE 7th International Conference on Cloud Computing  
Beside algorithmic design within the framework, we have also closely examined the deployment challenges, particularly in public virtualized cloud environments.  ...  In this thesis, we present DALM (Dependency-Aware Locality for MapReduce), a comprehensive and practical solution toward dependency-aware locality for processing the real-world input data that can be highly  ...  In the virtualized environment, a location-aware data allocation strategy was proposed in [20] , which allocates file blocks across all physical machines evenly and the replicas are located in different  ... 
doi:10.1109/cloud.2014.62 dblp:conf/IEEEcloud/FanMLL14 fatcat:sqyn6akjdbfxxafdrt34n2mjhm

Provisioning and Evaluating Multi-domain Networked Clouds for Hadoop-based Applications

Anirban Mandal, Yufeng Xin, Ilia Baldine, Paul Ruth, Chris Heerman, Jeff Chase, Victor Orlikowski, Aydan Yumerefendi
2011 2011 IEEE Third International Conference on Cloud Computing Technology and Science  
Hadoop's topology-awareness feature can mitigate these penalties to a modest degree in these hybrid bandwidth scenarios.  ...  The evaluations examine conditions in which multi-cloud Hadoop deployments pose significant advantages or disadvantages during Map/Reduce/Shuffle operations.  ...  We also observed that resource contention plays a major role in determining performance when there are co-located virtual machines on a single node.  ... 
doi:10.1109/cloudcom.2011.107 dblp:conf/cloudcom/MandalXBRHCOY11 fatcat:knqkz634tjbgnkawpv5clcfltq

On interference-aware provisioning for cloud-based big data processing

Yi Yuan, Haiyang Wang, Dan Wang, Jiangchuan Liu
2013 2013 IEEE/ACM 21st International Symposium on Quality of Service (IWQoS)  
To address this problem, we re-model the resource provisioning problem in the cloud-based big data systems and present an interference-aware solution that smartly allocates the MapReduce jobs to different  ...  In this paper, we take the first steps towards a better understanding of the big data system on the cloud platforms.  ...  Modeling of Interference-aware Resource Provisioning In order to remodel resource provisioning problem with considering interference in cloud. We remodel completion time of a MapReduce job first.  ... 
doi:10.1109/iwqos.2013.6550282 dblp:conf/iwqos/YuanWWL13 fatcat:kqxfhmx6cra6nkbhgm56arca2e

Hierarchical MapReduce Programming Model and Scheduling Algorithms

Yuan Luo, Beth Plale
2012 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)  
Two scheduling algorithms are introduced: Compute Capacity Aware Scheduling for compute-intensive jobs and Data Location Aware Scheduling for data-intensive jobs.  ...  We present a Hierarchical MapReduce framework that gathers computation resources from different clusters and runs MapReduce jobs across them.  ...  Philip Papadopoulos and Cindy Zheng of the PRAGMA community for access to PRAGMA Cloud and help bringing up our IU PRAGMA node.  ... 
doi:10.1109/ccgrid.2012.132 dblp:conf/ccgrid/LuoP12 fatcat:56nbhhlslbezrhgxlfzhgxv7nm

Improving Hadoop Service Provisioning in a Geographically Distributed Cloud

Qi Zhang, Ling Liu, Kisung Lee, Yang Zhou, Aameek Singh, Nagapramod Mandagere, Sandeep Gopisetty, Gabriel Alatorre
2014 2014 IEEE 7th International Conference on Cloud Computing  
In this paper, we first compare multi-datacenter Hadoop deployment with single-datacenter Hadoop deployment to identify the performance issues inherent in a geographically distributed cloud.  ...  A generalization of the problem characterization in the context of geographically distributed cloud datacenters is also provided with discussions on general optimization strategies.  ...  Problem Definition We define a geographically distributed cloud as a virtual cloud datacenter that manages multiple geographically dispersed datacenters.  ... 
doi:10.1109/cloud.2014.65 dblp:conf/IEEEcloud/ZhangLLZSMGA14 fatcat:pjzhqt633vavnbylfnqopiand4

A Survey of Big Data Machine Learning Applications Optimization in Cloud Data Centers and Networks [article]

Sanaa Hamid Mohamed, Taisir E.H. El-Gorashi, Jaafar M.H. Elmirghani
2019 arXiv pre-print
The MapReduce programming model and its widely-used open-source platform; Hadoop, are enabling the development of a large number of cloud-based services and big data applications.  ...  This survey article reviews the challenges associated with deploying and optimizing big data applications and machine learning algorithms in cloud data centers and networks.  ...  All data are provided in full in the results section of this paper.  ... 
arXiv:1910.00731v1 fatcat:kvi3br4iwzg3bi7fifpgyly7m4

MapReduce: A Technical Review

T. Y. J. Naga Malleswari, G. Vadivu
2016 Indian Journal of Science and Technology  
MapReduce is implemented in vHadoop (Virtual Hadoop), a scalable hadoop virtual cluster to process machine learning algorithms.  ...  The scenarios discussed in this paper help developers and researchers how to customize and use MapReduce in their applications.  ...  MapReduce is a parallel data processing approach 5 and is implemented in cloud environment on a computer cluster.  ... 
doi:10.17485/ijst/2016/v9i1/78964 fatcat:ewly2owceraszhv6qx32jootma