Characterizing and subsetting big data workloads
2014
2014 IEEE International Symposium on Workload Characterization (IISWC)
Big data benchmark suites must include a diversity of data and workloads to be useful in fairly evaluating big data systems and architectures. ...
In this paper, we first use Principal Component Analysis (PCA) to identify the most important characteristics from 45 metrics to characterize big data workloads from BigDataBench, a comprehensive big data ...
We can successfully subset big data workloads. ...
doi:10.1109/iiswc.2014.6983058
dblp:conf/iiswc/JiaZWHMYLL14
fatcat:bqi6cq3qw5hlnmhhdzswi5swgm
A characterization of big data benchmarks
2013
2013 IEEE International Conference on Big Data
However, benchmarking big data systems is much more challenging than ever before. First, big data systems are still in their infant stage and consequently they are not well understood. ...
In this paper, we first analyze the redundancy among benchmarks from ICTBench, HiBench and typical workloads from real world applications: spatio-temporal data analysis for Shenzhen transportation system ...
BigDataBench was proposed for large-scale systems and architecture research and for characterizing big data applications; each benchmark in BigDataBench is equal to a single big application [7]. ...
doi:10.1109/bigdata.2013.6691707
dblp:conf/bigdataconf/XiongYBZZZBLX13
fatcat:tearw6y2cva55arivabjhhottu
Characterization and architectural implications of big data workloads
2016
2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
Big data areas are expanding rapidly in terms of increasing workloads and runtime systems, and this situation imposes a serious challenge to workload characterization, which is the foundation of ...
Second, corroborating the previous work, Hadoop and Spark based big data workloads have higher front-end stalls. Compared with the traditional workloads, i.e. ...
WCRT is a comprehensive workload characterization tool, which can subset the whole workload set by removing redundant ones to facilitate workload characterization and other architecture research. ...
doi:10.1109/ispass.2016.7482083
dblp:conf/ispass/0004RZJ16
fatcat:u5s763kbgra5rpbv5cly3tvxgi
Benchmarking Big Data Systems: State-of-the-Art and Future Directions
[article]
2015
arXiv
pre-print
and veracity), as well as implement application-specific but still comprehensive workloads. ...
The complexity, diversity, and rapid evolution of big data systems gives rise to various new challenges about how we design generators to produce data with the 4V properties (i.e. volume, velocity, variety ...
ACKNOWLEDGMENTS This technical report is a significant extended version of its preliminary version entitled "On Big Data Benchmarking", which is published in BPOE-4 (Co-located with ASPLOS 2014) [36] ...
arXiv:1506.01494v1
fatcat:3icae6wgjjfj7afsmlzppd4e2q
Memory system characterization of big data workloads
2013
2013 IEEE International Conference on Big Data
This paper examines how these trends may intersect by characterizing the memory access patterns of various Hadoop and NoSQL big data workloads. ...
Two recent trends that have emerged include (1) rapid growth in big data technologies with new types of computing models to handle unstructured data, such as MapReduce and NoSQL, and (2) a growing focus on ...
Recent studies have also proposed characterizing and understanding these big data usage cases. ...
doi:10.1109/bigdata.2013.6691693
dblp:conf/bigdataconf/DimitrovKLVW13
fatcat:x5kheseoh5cyfdclaisevd3lwq
Workload characterization for MG-RAST metagenomic data analytics service in the cloud
2014
2014 IEEE International Conference on Big Data (Big Data)
In this paper, we characterize the MG-RAST workloads running in the cloud, from the perspectives of computation, I/O, and data transfer. ...
The consequent data deluge has imposed big burdens for data analysis applications. ...
ACKNOWLEDGMENTS This work was supported in part by the NIH award U01HG006537 "OSDF: Support infrastructure for NextGen sequence storage, analysis, and management", and U.S. ...
doi:10.1109/bigdata.2014.7004394
dblp:conf/bigdataconf/TangBDMGHWM14
fatcat:4qcliocqhbfyxam2ch26dmst2u
BenchCouncil's View on Benchmarking AI and Other Emerging Workloads
[article]
2019
arXiv
pre-print
This paper outlines BenchCouncil's view on the challenges, rules, and vision of benchmarking modern workloads like Big Data, AI or machine learning, and Internet Services. ...
We summarize the challenges of benchmarking modern workloads as FIDSS (Fragmented, Isolated, Dynamic, Service-based, and Stochastic), and propose the PRDAERS benchmarking rules that the benchmarks should ...
On the basis of the data motif methodology, we are proposing a new benchmark suite, named BENCHCPU [6] , to characterize emerging workloads, including Big Data, AI, and Internet Services. ...
arXiv:1912.00572v2
fatcat:oc73gvvw2behdiq27ib2yvdifu
Classifying Student's Learning Experience using Improved Apriori and CART
2017
International Journal of Computer Applications
The experiments are evaluated on various parameters, such as precision, recall, and final score. ...
Students' learning experiences are classified here using Fuzzy-Apriori and CART, which provide a better way to find and address problems in various fields. ...
By definition, Big Data is characterized by volume, velocity, and variety, for which traditional data processing methods and tools are inadequate. ...
doi:10.5120/ijca2017915311
fatcat:blwi3ksoinbx5dzngllqupxkre
System-Level Characterization of Datacenter Applications
2015
Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering - ICPE '15
A large volume of recent literature in characterizing "Big Data" applications has largely focused on two extremes of the characterization spectrum. ...
In recent years, a number of benchmark suites have been created for the "Big Data" domain, and a number of such applications fit the client-server paradigm. ...
... and how. Characterization results are presented for several "Big Data" workloads, concentrating on the coarse-grain, per-server, system-level behavior rather than the fine-grained microarchitectural ...
doi:10.1145/2668930.2688059
dblp:conf/wosp/AwasthiSGSGB15
fatcat:sxlhqvtmljfvxnbe5y226yefzu
Resource Distribution Estimation for Data-Intensive Workloads: Give Me My Share & No One Gets Hurt!
[chapter]
2016
Communications in Computer and Information Science
Robust resource share estimation of data-intensive workloads is integral to efficient workload management in a cluster where multiple systems co-exist and share the same infrastructure. ...
To address the above challenges, we propose an inclusive framework and related techniques for workload profiling, similar job identification, and resource distribution prediction in a cluster. ...
Modern big data clusters run a diverse mix of applications and production workloads [18], so characterizing similar jobs is challenging. ...
doi:10.1007/978-3-319-33313-7_17
fatcat:goglen56ejggrba5oxphfz2eqy
ShenZhen transportation system (SZTS): a novel big data benchmark suite
2016
Journal of Supercomputing
Big data workloads, however, are placing unprecedented demands on computing technologies, calling for a deep understanding and characterization of these emerging workloads. ...
We also study the sensitivity of workload behavior with respect to input data size, and we propose a methodology for identifying representative input data sets. ...
Background: big data and MapReduce. Big data applications are often characterized using the four Vs: volume, velocity, variety and veracity. ...
doi:10.1007/s11227-016-1742-7
fatcat:2uszi5spwjhopi3xme75pnhfcm
Data Motifs: A Lens Towards Fully Understanding Big Data and AI Workloads
[article]
2018
arXiv
pre-print
This paper proposes a new approach to modelling and characterizing big data and AI workloads. ...
/BigDataBench), and perform comprehensive characterization of those data motifs from perspective of data sizes, types, sources, and patterns as a lens towards fully understanding big data and AI workloads ...
In this paper, we propose a new approach to modelling and characterizing big data and AI workloads. ...
arXiv:1808.08512v1
fatcat:4tmagnlfmvfbfj2kwqpdujfcbu
BigDataBench: A Scalable and Unified Big Data and AI Benchmark Suite
[article]
2018
arXiv
pre-print
the combination of one or more data motifs, to represent diversity of big data and AI workloads. ...
Unfortunately, complexity, diversity, frequently-changed workloads, and rapid evolution of big data and AI systems raise great challenges. ...
We thoroughly perform workload characterizations of big data and AI benchmarks on CPUs and GPUs, respectively. ...
arXiv:1802.08254v2
fatcat:6ktsa3yowvaqtjbez26akp7a7e
Energy efficient job scheduling in single-ISA heterogeneous chip-multiprocessors
2014
Fifteenth International Symposium on Quality Electronic Design
In recent years, single-ISA heterogeneous chip multiprocessors (CMP) consisting of big high-performance cores and small power-saving cores on the same die have been proposed for the exploration of high ...
In this work, we pay attention to reducing the energy consumption for workloads running on heterogeneous CMPs and propose a scheduling algorithm based on dynamic execution behaviors to exploit better energy-efficiency ...
Therefore, the rules for the small core essentially characterize the execution phases that are not likely to result in extremely high power on a big core. ...
doi:10.1109/isqed.2014.6783390
dblp:conf/isqed/ZhangDLPS14
fatcat:dvqmwnjbmjajhjala7xwruk3v4
Big Data Benchmark Compendium
[chapter]
2016
Lecture Notes in Computer Science
employ for Big Data systems. ...
The goal is to understand the current state in Big Data benchmarking and guide practitioners in their approaches and use cases. ...
The dataload is characterized by the size and the nature of the data sets used as inputs for a benchmark, and the workload is characterized by the number of concurrent clients and the distribution of the ...
doi:10.1007/978-3-319-31409-9_9
fatcat:n7lwtxainnblpf2xp4c5o2eynq