Dynamic Job Ordering and Slot Configurations for MapReduce Workloads
2016
IEEE Transactions on Services Computing
This paper proposes two classes of algorithms to minimize the makespan and the total completion time for an offline MapReduce workload. ...
MapReduce is a popular parallel computing paradigm for large-scale data processing in clusters and data centers. ...
[33] presented an I/O-efficient MapReduce system called Themis that improves the performance of MapReduce by minimizing the number of I/O operations. ...
doi:10.1109/tsc.2015.2426186
fatcat:wcvmhd63ubfofg3t5wtjoz7gka
Achieving cost-efficient, data-intensive computing in the cloud
2015
Proceedings of the Sixth ACM Symposium on Cloud Computing - SoCC '15
Themis MapReduce derives much of its I/O efficiency from its pipelined implementation, which limits the amount of extraneous I/O relative to frameworks like Hadoop [39]. ...
We now model the performance of Themis under several assumptions about I/O efficiency and data durability.
2-IO: Because Themis eschews traditional task-level fault tolerance, it exhibits the 2-IO property ...
doi:10.1145/2806777.2806781
dblp:conf/cloud/ConleyVP15
fatcat:tfphsj7yfjfrtdfsuukjgxgtci
Efficient Mapreduce Workloads based on Slot Configuration and Job Ordering
2017
International Journal of Computer & Mathematical Sciences IJCMS
unpublished
Hadoop, an open-source implementation of MapReduce, is deployed in large clusters consisting of many machines at organizations such as Instagram and Twitter. ...
In these cluster and data center environments, MapReduce and Hadoop are used for batch-processing jobs submitted by multiple users (i.e., MapReduce workloads). ...
Rasmussen et al. presented an I/O-efficient MapReduce system called Themis that improves the performance of MapReduce by minimizing the number of I/O operations. ...
fatcat:bryoydeeqbchziddeiehbrlbsq
PortHadoop: Support direct HPC data processing in Hadoop
2015
2015 IEEE International Conference on Big Data (Big Data)
In this study, we propose PortHadoop, an enhanced Hadoop architecture that enables MapReduce applications to read data directly from HPC parallel file systems (PFS). ...
The success of the Hadoop MapReduce programming model has greatly propelled research in big data analytics. ...
Therefore, this procedure guarantees both data integrity and I/O efficiency. In summary, PortHadoop implements three split alignment strategies, as shown in Figure 5, to support split alignment. ...
doi:10.1109/bigdata.2015.7363759
dblp:conf/bigdataconf/YangLFSZ15
fatcat:7mubyjhbvfgmzlevy4x5x4c6dm
29th International Conference on Data Engineering [book of abstracts]
2013
2013 IEEE 29th International Conference on Data Engineering Workshops (ICDEW)
Then, based on a novel cost model, we propose an I/O-efficient strategy to evaluate SPARQL queries as quickly as possible, especially queries with solution modifiers specified, e.g., PROJECTION, ORDER ...
WeD/10
EAGRE: Towards Scalable I/O Efficient SPARQL Query Evaluation on the Cloud
Xiaofei Zhang, Lei Chen, Yongxin Tong (Hong Kong University of Science and Technology) Min Wang (HP Labs China) To ...
doi:10.1109/icdew.2013.6547409
fatcat:wadzpuh3b5htli4mgb4jreoika
Program book
2010
2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010)
RIOT makes R programs I/O-efficient in a way transparent to users. It features a flexible array storage manager and an optimization engine suitable for statistical and numerical operations. ...
I/O-Efficient Statistical Computing with RIOT. Yi Zhang, Weiping Zhang, Jun Yang; Duke University, USA. Statistical analysis of massive data is becoming indispensable to science, commerce, and society today ...
SMDB'10 will be a one-day workshop where accepted papers are presented in an informal and interactive setting. Participation in the workshop is not limited to authors of accepted papers. ...
doi:10.1109/icdew.2010.5452773
fatcat:oyq2tujbvjfpxjlyixux5q57vu
A Framework for Integrating IoT Streaming Data from Multiple Sources
2021
In detail, their index is on the segmented sections of an incoming records stream, which is stored in files for disk I/O efficiency. ...
Pairwise document similarity in large collections with mapreduce. ...
doi:10.26181/17211713
fatcat:6ahlrck3t5gs3onejnojd72ray