Boosting MapReduce with Network-Aware Task Assignment
[chapter]
2014
Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
We further design a network-aware task assignment strategy to shorten the completion time of MapReduce jobs in shared clusters. ...
task assignment strategies, yet with an acceptable computational overhead. ...
Effectiveness of Network-Aware Task Assignment To illustrate the effectiveness of our network-aware task assignment strategy, we compared it with three widely-used strategies, i.e., random assignment, ...
doi:10.1007/978-3-319-05506-0_8
fatcat:ph7646za2zeq5j4reho2odov24
Software Design and Implementation for MapReduce across Distributed Data Centers
2013
Applied Mathematics & Information Sciences
G-Hadoop uses the Gfarm file system as an underlying file system and executes MapReduce tasks across distributed clusters. ...
The MapReduce paradigm has emerged as a highly successful programming model for large-scale data-intensive computing applications. ...
In traditional Hadoop clusters with HDFS, map tasks are preferably assigned to nodes where the required input data is locally present. ...
doi:10.12785/amis/071l13
fatcat:7ip6kgxvc5dgzcyicdhxpyaxem
Improving MapReduce Performance in Heterogeneous Network Environments and Resource Utilization
2012
2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
We investigate network heterogeneity aware scheduling of both map and reduce tasks. ...
In MapReduce, map and reduce tasks are assigned to map and reduce slots hosted by worker nodes. Usually the numbers of map and reduce slots are carefully chosen to gain optimal resource usage. ...
However, long tasks may get starved and priority boosting can be used to avoid starvation. The second heuristic is to choose the task with maximum expected completion time among C_j, shown in (8). ...
doi:10.1109/ccgrid.2012.12
dblp:conf/ccgrid/GuoF12
fatcat:7cmx4wtff5g2dcsjpzdjqiqfqy
A survey on bandwidth-aware geo-distributed frameworks for big-data analytics
2021
Journal of Big Data
In this article, we discuss challenges and survey the latest geo-distributed big-data analytics frameworks and schedulers (based on MapReduce and Spark) with WAN-bandwidth awareness. ...
While cluster computing applications, such as MapReduce and Spark, have been widely deployed in data centres to support commercial applications and scientific research, they are not designed for running ...
MapReduce jobs are submitted to a resource manager that supervises and assigns the execution of tasks to node managers. ...
doi:10.1186/s40537-021-00427-9
fatcat:u2jx7x6hkfc47kn2iqpkcquhi4
MAPREDUCE CHALLENGES ON PERVASIVE GRIDS
2014
Journal of Computer Science
context-awareness and fault-tolerance features; and providing an alternative pervasive grid implementation, fully adapted to dynamic environments. ...
fault-tolerance features to provide efficient and reliable MapReduce services on pervasive grids. ...
Container assignment with context-awareness configuration simulating heterogeneous environment Table 1. ...
doi:10.3844/jcssp.2014.2194.2210
fatcat:7nd7azrvifc6xi6wqrdwp274cu
FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce
2016
IEEE Transactions on Systems, Man & Cybernetics. Systems
JobTracker is responsible for assigning and scheduling tasks; each TaskTracker handles Map or Reduce tasks assigned by JobTracker. ...
The overarching goal of FiDoop-DP is to boost the performance. A similarity metric to facilitate data-aware partitioning. ...
doi:10.1109/tsmc.2015.2437327
fatcat:sgodseagojalzpvby7svpxgt74
Locality-Aware Reduce Task Scheduling for MapReduce
2011
2011 IEEE Third International Conference on Cloud Computing Technology and Science
LARTS attempts to collocate reduce tasks with the maximum required data computed after recognizing input data network locations and sizes. ...
This paper describes Locality-Aware Reduce Task Scheduler (LARTS), a practical strategy for improving MapReduce performance. ...
Thus, similar to map task scheduling, we suggest making MapReduce aware of partitions' network locations in order to apply locality to reduce task scheduling. ...
doi:10.1109/cloudcom.2011.87
dblp:conf/cloudcom/HammoudS11
fatcat:iyrlovqosnfoldayb4lq7qoh4q
Investigation of data locality and fairness in MapReduce
2012
Proceedings of the Third International Workshop on MapReduce and its Applications - MapReduce '12
Its data locality aware scheduling strategy exploits the locality of data accessing to minimize data movement and thus reduce network traffic. ...
In data-intensive computing, MapReduce is an important tool that allows users to process large amounts of data easily. ...
For typical MapReduce clusters where most jobs are small, scheduling delay of several seconds is sufficient to generate performance boost. ...
doi:10.1145/2287016.2287022
fatcat:to2ism2yxvfmhfuko52htnfzqe
A Survey of Big Data Machine Learning Applications Optimization in Cloud Data Centers and Networks
[article]
2019
arXiv
pre-print
This survey article reviews the challenges associated with deploying and optimizing big data applications and machine learning algorithms in cloud data centers and networks. ...
MapReduce and Hadoop thus introduce innovative, efficient, and accelerated intensive computations and analytics. ...
8 servers (using Pica8 3297), trace-driven simulations | [277]* Network-aware MapReduce tasks placement to reduce transmission costs in DCNs | Hadoop 1.2.1 | Probabilistic tasks scheduling algorithm ...
arXiv:1910.00731v1
fatcat:kvi3br4iwzg3bi7fifpgyly7m4
MapReduce across Distributed Clusters for Data-intensive Applications
2012
2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum
G-Hadoop uses the Gfarm file system as an underlying file system and executes MapReduce tasks across distributed clusters. ...
The MapReduce paradigm has emerged as a highly successful programming model for large-scale data-intensive computing applications. ...
In traditional Hadoop clusters with HDFS, map tasks are preferably assigned to nodes where the required input data is locally present. ...
doi:10.1109/ipdpsw.2012.249
dblp:conf/ipps/WangTMSKKC12
fatcat:xnrdnqzpubgm5lm7jwhwejuo5m
A cross-job framework for MapReduce scheduling
2014
2014 IEEE International Conference on Big Data (Big Data)
(2) It can support all the existing MapReduce applications with no changes to their source code. (3) It is a general framework, which can work with different scheduling algorithms. ...
Our experimental results show that the cross-job Hadoop can significantly reduce both the total processing time of a job sequence and the size of data transferred over the network. ...
HaLoop not only extends MapReduce with programming support for iterative applications, but also dramatically improves their efficiency by making the task scheduler loop-aware and by adding various caching ...
doi:10.1109/bigdata.2014.7004222
dblp:conf/bigdataconf/XiaoTCXW14
fatcat:pd34gbqp2nccbkok3424ck7hxm
H-WorD: Supporting Job Scheduling in Hadoop with Workload-Driven Data Redistribution
[chapter]
2016
Lecture Notes in Computer Science
Today's distributed data processing systems typically follow a query shipping approach and exploit data locality for reducing network traffic. ...
We exemplify our algorithm in the context of MapReduce jobs in a Hadoop ecosystem. Finally, we evaluate our approach and demonstrate the benefits of automatic data redistribution. ...
Using such information, we can proactively perform data redistribution in advance for boosting tasks' data locality and parallelism of the MapReduce jobs. ...
doi:10.1007/978-3-319-44039-2_21
fatcat:fs73k4otknhnhbj25wlvfhttwi
Support Vector Regression based Mapreduce Throttled Load Balancer for Data Centers
2019
VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE
In order to improve the load balancing with maximum throughput and minimum makespan, Support Vector Regression based MapReduce Throttled Load Balancing (SVR-MTLB) technique is introduced. ...
Therefore, the incoming tasks are allocated with better utilization of resources to minimize the workload across the server in the cloud. ...
The task assigner performs priority task classification of incoming tasks using gradient Boosting ensemble classifier. ...
doi:10.35940/ijitee.a6102.119119
fatcat:zsqjyqn2yff7phntko5k6yipau
Toward scalable internet traffic measurement and analysis with Hadoop
2012
Computer communication review
We also explain the performance issues related to traffic analysis MapReduce jobs. ...
From experiments with a 200-node testbed, we achieved 14 Gbps throughput for 5 TB files with IP and HTTP-layer analysis MapReduce jobs. ...
such as CPU, memory, hard disk, and network, and the other with MapReduce algorithm optimization. ...
doi:10.1145/2427036.2427038
fatcat:43elfcm5kbdbbojjvj7ljevwmm
Hadoop MapReduce for Mobile Clouds
2016
IEEE Transactions on Cloud Computing
... caused by unexpected device failures or topology changes in a dynamic network). ...
We have developed the Hadoop MapReduce framework over MDFS and have studied its performance by varying input workloads in a real heterogeneous mobile cluster. ...
Energy-aware task scheduling: The Hadoop MapReduce framework relies on data locality for boosting overall system throughput. Computation is moved closer to the nodes where the data resides. ...
doi:10.1109/tcc.2016.2603474
fatcat:2kdyoj2xefeztc3yt2akk5bqo4