PRIMEBALL: a Parallel Processing Framework Benchmark for Big Data Applications in the Cloud [article]

Jaume Ferrarons, Sandra Pietrowska
2013 arXiv   pre-print
In this paper, we draw the specifications of a novel benchmark for comparing parallel processing frameworks in the context of big data applications hosted in the cloud.  ...  The main strengths of our benchmark are parallelization capabilities supporting cloud features and big data properties.  ...  Conclusions We propose in this paper the specifications for PRIMEBALL, a complete and unified benchmark for measuring the characteristics of parallel cloud processing frameworks for big data applications  ... 
arXiv:1312.6293v1 fatcat:wdiutxlvmrckloyzneu5sszb54

PRIMEBALL: A Parallel Processing Framework Benchmark for Big Data Applications in the Cloud [chapter]

Jaume Ferrarons, Mulu Adhana, Carlos Colmenares, Sandra Pietrowska, Fadila Bentayeb, Jérôme Darmont
2014 Lecture Notes in Computer Science  
In this position paper, we draw the specifications for a novel benchmark for comparing parallel processing frameworks in the context of big data applications hosted in the cloud.  ...  The main strengths of our benchmark definition are parallelization capabilities supporting cloud features and big data properties.  ...  Conclusions We propose in this paper the specifications for PRIMEBALL, a complete and unified benchmark for measuring the characteristics of parallel cloud processing frameworks for big data applications  ... 
doi:10.1007/978-3-319-04936-6_8 fatcat:3s36gsmp5bbrxlirfjkhyincqu

Big Data Benchmark Compendium [chapter]

Todor Ivanov, Tilmann Rabl, Meikel Poess, Anna Queralt, John Poelman, Nicolas Poggi, Jeffrey Buell
2016 Lecture Notes in Computer Science  
Therefore, the traditional way of specifying a standardized benchmark with pre-defined workloads, which have been in use for years in the transaction and analytical processing systems, is not trivial to  ...  employ for Big Data systems.  ...  Acknowledgment This research has been supported by the Research Group of the Standard Performance Evaluation Corporation (SPEC).  ... 
doi:10.1007/978-3-319-31409-9_9 fatcat:n7lwtxainnblpf2xp4c5o2eynq

Data Processing Benchmarks [article]

Jérôme Darmont
2017 arXiv   pre-print
We also address the newer trends in cloud benchmarking. Finally, we discuss the issues, tradeoffs and future trends for data processing benchmarks.  ...  The aim of this article is to present an overview of the major families of state-of-the-art data processing benchmarks, namely transaction processing benchmarks and decision support benchmarks.  ...  Cloud benchmarks In the timely context of cloud computing and big data processing and analysis, benchmarking needs are as high as ever to compare parallel processing capability or infrastructure scalability  ... 
arXiv:1701.08634v1 fatcat:22lp2aqhubc7bhya5y52oqsoty

Big Data Methodologies, Tools And Infrastructures

Kim Hee, Todor Ivanov, Roberto V. Zicari, Rut Waldenfels, Hevin Özmen, Naveed Mushtaq, Minsung Hong, Tharsis Teoh, Rajendra Akerkar
2018 Zenodo  
This report, which is a follow up of Deliverable 1.1, offers an in-depth introduction to relevant technologies for Big Data Analytics and Big Data Management.  ...  In order to tackle the demands and challenges in the transportation domain, an optimal stack of Big Data technologies needs to be selected and designed based on the application requirements.  ...  The future of data analytics in transportation has many applications and opportunities.  ... 
doi:10.5281/zenodo.1465539 fatcat:mkad5yu2tnfw7fdi3xqcermac4

TextBenDS: a generic Textual data Benchmark for Distributed Systems [article]

Ciprian-Octavian Truica, Ira Assent
2021 arXiv   pre-print
Therefore, in a Big Data context, it is crucial to lower the runtime of computing weighting schemes, without hindering the analysis process and the accuracy of the machine learning algorithms.  ...  Our benchmark offers a generic data model designed with a multidimensional approach for storing text documents.  ...  There are other types of benchmarks that evaluate parallel text processing in Big Data, cloud applications.  ... 
arXiv:2108.05689v1 fatcat:ozt5dleqnnattgwp3ic5w4uxga

Benchmarking top-k keyword and top-k document processing with T^2K^2 and T^2K^2D^2

Ciprian-Octavian Truică, Jérôme Darmont, Alexandru Boicea, Florin Rădulescu
2018 Future generations computer systems  
Hence, in this paper, we present T^2K^2, a top-k keywords and documents benchmark, and its decision support-oriented evolution T^2K^2D^2.  ...  Both benchmarks feature a real tweet dataset and queries with various complexities and selectivities.  ...  This last family of benchmarks evaluates parallel text processing in big data, cloud applications.  ... 
doi:10.1016/j.future.2018.02.037 fatcat:lnn7yxjvave3lfjzi474brwcvu

Data-Centric Benchmarking [chapter]

Jérôme Darmont
Advances in Computer and Electrical Engineering  
The Transaction Processing Performance Council (TPC), a non-profit organization founded in 1988, plays a preponderant role in data-centric benchmarking.  ...  We survey benchmarks from three families: transaction benchmarks aimed at On-Line Transaction Processing (OLTP), decision-support benchmarks aimed at On-Line Analysis Processing (OLAP) and big data benchmarks  ...  MalStone (Open Cloud Consortium, 2009) is a benchmark for assessing data intensive parallel processing.  ... 
doi:10.4018/978-1-5225-7598-6.ch025 fatcat:6zfmlk6bhfdahag5g52xjqhzai

Data-Centric Benchmarking [chapter]

Jérôme Darmont
Encyclopedia of Information Science and Technology, Fourth Edition  
The Transaction Processing Performance Council (TPC), a non-profit organization founded in 1988, plays a preponderant role in data-centric benchmarking.  ...  We survey benchmarks from three families: transaction benchmarks aimed at On-Line Transaction Processing (OLTP), decision-support benchmarks aimed at On-Line Analysis Processing (OLAP) and big data benchmarks  ...  MalStone (Open Cloud Consortium, 2009) is a benchmark for assessing data intensive parallel processing.  ... 
doi:10.4018/978-1-5225-2255-3.ch154 fatcat:u26qfefzebgwpbuewscnnv4rpa

Frameworks for distributed big data processing: a comparison in the domain of predictive maintenance

Rudolf Plettenberg, Manuel Wimmer, Alexandra Mazak-Huemer
2018
While there are currently many benchmarks available for other domains such as retail, social network, or search engines, there are none available for Big Data analytic frameworks in the application area  ...  This thesis introduces the predictive maintenance benchmark (PMB). The PMB is a benchmark aimed at measuring the performance of Big Data analytic frameworks in the field of predictive maintenance.  ...  [28] present a benchmark called Primeball, which focuses on data processing. They propose a fictitious news site hosted in the cloud to serve as a benchmark.  ... 
doi:10.34726/hss.2018.52507 fatcat:hhmxgo7wvfgufi6dqj47snfw3a

An evaluation of deep hashing for high-dimensional similarity search on embedded data [article]

Rutuja Shivraj Pawar, Universitäts- Und Landesbibliothek Sachsen-Anhalt, Martin-Luther Universität, Gunter Saake, Gabriel Campero Durand
2019
However, Big Data, due to its characteristics, poses a variety of challenges to ML applications, such as high class imbalance, the need for feature engineering to support heterogeneous data and the need  ...  In such a scenario, high-dimensional similarity search serves as a popular method to extract relevant information from large data volumes or Big Data, and it further drives different Machine Learning (  ...  39 An open-source end-to-end benchmark to measure the performance of Apache Pig systems PRIMEBALL [158] Measures and compares the performance of Big Data parallel processing frameworks in cloud SparkBench  ... 
doi:10.25673/31719 fatcat:76okmnvxnrgyliqq3vbky3e5zq