A New Framework for Evaluating Straggler Detection Mechanisms in MapReduce

Tien-Dat Phan, Guillaume Pallez, Shadi Ibrahim, Padma Raghavan
2019 ACM Transactions on Modeling and Performance Evaluation of Computing Systems  
These results demonstrate how our framework can offer useful insights and be applied in practical settings to characterize and design new straggler detection mechanisms for MapReduce systems.  ...  In this paper, we consider a complete framework for straggler detection and mitigation.  ...  Contributions: In this paper, we suggest a new framework that characterizes the straggler detection mechanisms by utilizing a comprehensive list of evaluation metrics, including Precision, Recall,  ... 
doi:10.1145/3328740 fatcat:occqb5nr25cnrfns6inwsg4e6i

Improving MapReduce Performance through Process Migration

Rahul R. Ghule, Sachine N. Deshmukh
2015 International Journal of Engineering Research and  
MapReduce is a widely used and popular programming model for processing huge amounts of data. Hadoop is an open source implementation of the MapReduce framework.  ...  Hadoop MapReduce is used for large data processing. It computes large amounts of data in less time.  ...  MapReduce was proposed by Google in 2004 and is a popular parallel computing framework for large-scale data processing. In a MapReduce job, the master divides the files into multiple map and reduce tasks.  ... 
doi:10.17577/ijertv4is070392 fatcat:nwmrbbkz5nhihib77skbsswk7u

Failure detector abstractions for MapReduce-based systems

Bunjamin Memishi, María S. Pérez, Gabriel Antoniu
2017 Information Sciences  
In the case of MapReduce-based systems, many state-of-the-art approaches have preferred to explore and extend speculative execution mechanisms.  ...  In this paper, we have studied the omission failures in MapReduce systems, formalizing their failure detector abstraction by means of three different algorithms for defining the timeout.  ...  Acknowledgments The research leading to these results has received funding from the H2020 project reference number 642963 in the call H2020-MSCA-ITN-2014.  ... 
doi:10.1016/j.ins.2016.08.013 fatcat:aduhhah2qfhfvnt2p2a7y7sxmu

Mitigate data skew caused stragglers through ImKP partition in MapReduce

Xue Ouyang, Huan Zhou, Stephen Clement, Paul Townend, Jie Xu
2017 2017 IEEE 36th International Performance Computing and Communications Conference (IPCCC)  
Speculative execution is the mechanism adopted by the current MapReduce framework when dealing with the straggler problem, and it functions by creating redundant copies for identified stragglers.  ...  In this paper, we focus on mitigating Reduce stragglers caused by data skew and propose ImKP, an Intermediate Key Pre-processing framework that enables an evenly distributed partition of Reduce inputs.  ... 
doi:10.1109/pccc.2017.8280475 dblp:conf/ipccc/OuyangZCTX17 fatcat:ory4bvmpofdpfnzxnjcjumut2i

Fault Tolerance in MapReduce: A Survey [chapter]

Bunjamin Memishi, Shadi Ibrahim, María S. Pérez, Gabriel Antoniu
2016 Computer Communications and Networks  
Data-intensive computing systems, such as Hadoop MapReduce, have as main goal the processing of an enormous amount of data in a short time, by transmitting the computation where the data resides.  ...  In particular, MapReduce frameworks tolerate machine failures (crash failures) by re-executing all the tasks of the failed machine by the virtue of data replication.  ...  Acknowledgments The research leading to these results has received funding from the H2020 project reference number 642963 in the call H2020-MSCA-ITN-2014.  ... 
doi:10.1007/978-3-319-44881-7_11 dblp:series/ccn/MemishiIPA16 fatcat:m5x33gpzunhzzgrdslagndiwzy

Enhancing MapReduce Fault Recovery Through Binocular Speculation [article]

Huansong Fu, Yue Zhu, Amit Kumar Nath, Md. Muhib Khan, Weikuan Yu
2019 arXiv pre-print
To address the speculation myopia caused by MapReduce dichotomy, we introduce a new scheme called binocular speculation to help MapReduce increase its assessment scope for speculation.  ...  MapReduce speculation plays an important role in finding potential task stragglers and failures.  ...  ACKNOWLEDGMENT This work is supported in part by the National Science Foundation awards 1561041, 1564647, and 1744336.  ... 
arXiv:1901.07715v1 fatcat:bwchplbyzncgxhpn773rfhkfsq

An Approach for Modeling and Ranking Node-Level Stragglers in Cloud Datacenters

Xue Ouyang, Peter Garraghan, Changjian Wang, Paul Townend, Jie Xu
2016 2016 IEEE International Conference on Services Computing (SCC)  
Different sample sets have been filtered in order to evaluate the generality of our framework, and the analytic results demonstrate that node abilities of executing parallel tasks tend to follow a 3-parameter-loglogistic  ...  We exploit a graph-based algorithm for partitioning server nodes into five levels, with 0.83% of node-level stragglers identified.  ...  An approach for node-level straggler detection.  ... 
doi:10.1109/scc.2016.93 dblp:conf/IEEEscc/OuyangGWTX16 fatcat:2edokdyrdve6xiu3cfqgnphllu

Improved Hadoop Cluster Performance by Dynamic Load and Resource Aware Speculative Execution and Straggler Node Detection

2020 International Journal of Engineering and Advanced Technology  
For the lightly loaded case, a task cloning scheme, namely, the combined file task cloning algorithm, which is based on maximizing the overall system utility, a straggler detection algorithm is proposed  ...  The detection and cloning of tasks assigned with the stragglers only will not be enough to enhance the performance unless cloning of tasks is allocated in a resource aware method.  ...  Design of a straggler-detection-based Algorithm for the heavily loaded regime For a heavily loaded cluster, i.e.  ... 
doi:10.35940/ijeat.d8017.049420 fatcat:6b7d6gkh4nc3xkz7qpu65anqia

Collaborative Learning Based Straggler Prevention in Large-Scale Distributed Computing Framework

Shyam Deshmukh, Komati Thirupathi Rao, Mohammad Shabaz, Manjit Kaur
2021 Security and Communication Networks  
To alleviate such problems, we propose a novel collaborative learning-based approach for straggler prediction, the alternate direction method of multipliers (ADMM), which is resource-efficient and learns  ...  how to efficiently deal with mitigating stragglers without moving data to a centralized location.  ...  Figure 1: Workflow of proposed straggler detection framework.  ... 
doi:10.1155/2021/8340925 fatcat:7fe5onujrjbaveoykoqrcux7uu

Tolhit – A Scheduling Algorithm for Hadoop Cluster

M. Brahmwar, M. Kumar, G. Sikka
2016 Procedia Computer Science  
In this work, a new scheme is introduced to aid the scheduler in identifying the nodes on which stragglers can be executed.  ...  Use of MapReduce as a programming model has become pervasive for processing a wide range of Big Data applications in cloud computing environments.  ...  In order to evaluate the performance of the algorithm, a Wordcount job is run. Wordcount is a MapReduce application which is used for counting the words in the input file.  ... 
doi:10.1016/j.procs.2016.06.043 fatcat:mpa7x4kifzdhzaiqnoqijcj6ii

Proxy Responses by FPGA-Based Switch for MapReduce Stragglers

2018 IEICE transactions on information and systems  
We also propose how to offload detecting stragglers and computing their results in the network switch with no additional communications between worker nodes.  ...  In this paper, we propose a network switch based straggler handling system to mitigate the burden of the compute nodes.  ...  Backup Task [1] is the conventional solution for the straggler problem in the MapReduce framework. With Backup Task, the master node monitors all the task progress and detects delayed ones.  ... 
doi:10.1587/transinf.2017edp7287 fatcat:zx4pmkyq4fh5hlju5i4y2hru3m

Chronos: A Unifying Optimization Framework for Speculative Execution of Deadline-critical MapReduce Jobs [article]

Maotong Xu, Sultan Alamro, Tian Lan, Suresh Subramaniam
2018 arXiv pre-print
In this paper, we bring several speculative scheduling strategies together under a unifying optimization framework, called Chronos, which defines a new metric, Probability of Completion before Deadlines  ...  While a number of strategies have been developed in existing work to mitigate stragglers by launching speculative or clone task attempts, none of them provides a quantitative framework that optimizes the  ...  They proposed new mechanisms to detect stragglers reactively and proactively and launch speculative tasks accordingly [2] [3] [4] [6] [7] [8] [9] [10] .  ... 
arXiv:1804.05890v1 fatcat:io5yllvmvzdodhqv6pvabdcqne

A Survey of Load Balancing Techniques for Data Intensive Computing [chapter]

Zhiquan Sui, Shrideep Pallickara
2011 Handbook of Data Intensive Computing  
Data Intensive Computing Frameworks: Google MapReduce Framework. MapReduce [1] is a framework introduced by Google that is well suited for concurrent processing of large datasets (usually more than 1 Tb)  ...  In Sect. 2 we discuss several popular data intensive computing frameworks. APIs available for the development of cloud-scale applications are discussed in Sect. 3.  ...  In such a case, a straggler detection and avoidance mechanism becomes necessary.  ... 
doi:10.1007/978-1-4614-1415-5_6 fatcat:aj7uycimt5f65aeku64lbc7chm

Dominoes: Speculative Repair in Erasure-Coded Hadoop System

Xi Yang, Chen Feng, Zhiwei Xu, Xian-He Sun
2015 2015 IEEE 22nd International Conference on High Performance Computing (HiPC)  
In an erasure-coded system, data reconstruction time will be paid while tasks access the missing blocks during MapReduce job processing.  ...  Erasure coding can provide equivalent three-way fault tolerance to HDFS's default three replication mechanism but degrades data availability for task scheduling.  ...  This research is also supported in part by NSF under NSF grants CNS-0751200, CNS-1162540, and CNS-1526887.  ... 
doi:10.1109/hipc.2015.39 dblp:conf/hipc/YangFXS15 fatcat:k66mwepd5nh3dd5l5pffueil24

Cracking Down MapReduce Failure Amplification through Analytics Logging and Migration

Yandong Wang, Huansong Fu, Weikuan Yu
2015 2015 IEEE International Parallel and Distributed Processing Symposium  
In this paper, we introduce a new fault-tolerant framework that can crack down failure amplification and gracefully handle failure scenarios.  ...  Our performance evaluation demonstrates that these techniques can eliminate failure amplification and deliver fast job execution compared to the existing task re-execution mechanism in MapReduce.  ...  Yandong Wang contributed to the research as a graduate student at Auburn. He is currently affiliated with IBM Watson.  ... 
doi:10.1109/ipdps.2015.111 dblp:conf/ipps/WangFY15 fatcat:hoo7elreo5d3hntvev76nhuemi