A Comparison of Big Data Frameworks on a Layered Dataflow Model [article]

Claudia Misale and Maurizio Drocco and Marco Aldinucci and Guy Tremblay
2016 arXiv   pre-print
In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters.  ...  Second, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level.  ...  In Section 2.1 we review the Dataflow model of computation, as presented by Lee and Parks [16].  ...
arXiv:1606.05293v1 fatcat:l5xkqpqcjbbjfk7aih2t54af6q

A Comparison of Big Data Frameworks on a Layered Dataflow Model

Claudia Misale, Maurizio Drocco, Marco Aldinucci, Guy Tremblay
2017 Parallel Processing Letters  
In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters.  ...  Second, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level.  ...  With this abstraction, we showed that Big Data analytics tools have similar expressiveness at all levels and we proceeded with the description of a layered model capturing different levels of Big Data  ... 
doi:10.1142/s0129626417400035 fatcat:bwsjg4qs7rf6jpkvqd5mnablqm

Practical Multi-level Modeling on MOF-compliant Modeling Frameworks

Kosaku Kimura, Yoshihide Nomura, Yuka Tanaka, Hidetoshi Kurihara, Rieko Yamamoto
2015 ACM/IEEE International Conference on Model Driven Engineering Languages and Systems  
We design modeling patterns for achieving the multi-level modeling methodologies on Eclipse Modeling Framework, and implement the dataflow model by applying the patterns.  ...  This paper describes practices for multi-level modeling by only using existing modeling frameworks that comply with Meta-Object Facility (MOF).  ...  This paper describes practices for achieving multi-level models on EMF. We use a hierarchy of a dataflow model as an example model that is used on graphical editing tools.  ...
dblp:conf/models/KimuraNTKY15 fatcat:bbpa2nl6fzdzhdjtxovc7usxpm

DOT

Yin Huai, Rubao Lee, Simon Zhang, Cathy H. Xia, Xiaodong Zhang
2011 Proceedings of the 2nd ACM Symposium on Cloud Computing - SOCC '11  
Under the DOT model, we provide a set of optimization guidelines, which are framework and implementation independent, and applicable to a wide variety of big data analytics jobs.  ...  With the DOT model, any big data analytics job execution in various software frameworks can be represented by a specific or non-specific number of elementary/composite DOT blocks, each of which performs  ...  We appreciate many discussions with big data researchers and engineers in Facebook and Intel. We thank Bill Bynum for reading the paper and for his suggestions.  ... 
doi:10.1145/2038916.2038920 dblp:conf/cloud/HuaiLZX011 fatcat:g66j72for5e3fa74727zsmofse

A Data-flow Language for Big RDF Data Processing

Fadi Maali
2014 International Semantic Web Conference  
My PhD work aims at enhancing programmability of big RDF data. The goal is to augment the existing tools with a declarative dataflow language that focuses on the analysis of large-scale RDF data.  ...  On the other hand, a graph-based data model and support for pattern matching as in SPARQL are to be adopted.  ...  Furthermore, there has been a surge of activity on layering declarative languages on top of these platforms.  ...
dblp:conf/semweb/Maali14a fatcat:mju6owljxjenfi7vr7kiqxnaue

Pico: A Domain-Specific Language For Data Analytics Pipelines

Claudia Misale, Marco Aldinucci, Guy Tremblay
2017 Zenodo  
of layers that build a prototypical framework for Big Data analytics.  ...  This analysis can be considered as a first step toward a formal model to be exploited in the design of a (new) framework for Big Data analytics.  ...  Acknowledgements Funding This work has been partially supported by the Italian Ministry of Education and Research (MIUR), by the EU-H2020 RIA project "Toreador" (no. 688797), the EU-H2020 RIA project  ... 
doi:10.5281/zenodo.579753 fatcat:aadje57qh5hk3ijmqn4j7vkhpm

PiCo: A Novel Approach to Stream Data Analytics [chapter]

Claudia Misale, Maurizio Drocco, Guy Tremblay, Marco Aldinucci
2018 Lecture Notes in Computer Science  
PiCo proposes a programming model based on pipelines and operators that are polymorphic with respect to data types in the sense that it is possible to re-use the same algorithms and pipelines on different  ...  PiCo's programming model aims at making easier the programming of data analytics applications while preserving or enhancing their performance.  ...  Acknowledgements This work has been partially supported by the OptiBike experiment of the EU-H2020-IA "Fortissimo2" project (no. 680481), the EU-H2020-RIA "Rephrase" project (no. 644235), the EU-H2020-  ... 
doi:10.1007/978-3-319-75178-8_10 fatcat:2yaxzubpibhtnlniose365ww6m

Deep learning accelerators: a case study with MAESTRO

Hamidreza Bolhasani, Somayyeh Jafarali Jassbi
2020 Journal of Big Data  
Measured performance indicators of novel optimized architecture, NVDLA shows higher L1 and L2 computation reuse, and lower total runtime (cycles) in comparison to the other one.  ...  Performance of a deep learning task is measured and compared in two different data flow strategies: NLR (No Local Reuse) and NVDLA (NVIDIA Deep Learning Accelerator), using an open source tool called MAESTRO  ...  The main focus of the suggested model is on the memory structure to be optimized for big neural network computations.  ... 
doi:10.1186/s40537-020-00377-8 fatcat:3lxclxhhivearodnx6xpxrcoa4

SciFlow: A dataflow-driven model architecture for scientific computing using Hadoop

Pengfei Xuan, Yueli Zheng, Sapna Sarupria, Amy Apon
2013 2013 IEEE International Conference on Big Data  
on a Hadoop platform.  ...  It provides an efficient mechanism for building a parallel scientific application with dataflow patterns, and enables the design, deployment, and execution of data intensive, many-task computing tasks  ...  ACKNOWLEDGMENTS This research is sponsored in part by NSF awards CNS-1228312 and OCI-1212680, and used HPC resources of the Clemson Computing and Information Technology.  ... 
doi:10.1109/bigdata.2013.6691725 dblp:conf/bigdataconf/XuanZSA13 fatcat:7sptcg6ge5cxbcemhfomxkbyli

Applying CNN on a scientific application accelerator based on dataflow architecture

Xiaochun Ye, Taoran Xiang, Xu Tan, Yujing Feng, Haibin Wu, Meng Wu, Dongrui Fan
2019 CCF Transactions on High Performance Computing  
In this paper, we propose a scheme for implementing and optimizing CNN on fine-grained dataflow architecture designed for scientific applications, namely Scientific Processing Unit (SPU).  ...  The experiment results reveal that by using our scheme, the performance of AlexNet and VGG-19 running on SPU is averagely 2.29 × higher than that on NVIDIA Titan Xp, and the energy consumption of our hardware  ...  Chen et al. (2016) proposed a novel dataflow model that can minimize the energy consumption of data movement.  ... 
doi:10.1007/s42514-019-00015-7 fatcat:4n5kyzorsfdvph3uuvaaz65chi

An Architecture for Predictive Maintenance of Railway Points Based on Big Data Analytics [chapter]

Giulio Salierno, Sabatino Morvillo, Letizia Leonardi, Giacomo Cabri
2020 Lecture Notes in Business Information Processing  
In this paper, we propose a four-layers big data architecture with the goal of establishing a data management policy to manage massive amounts of data produced by railway switch points and perform analytical  ...  An implementation of the architecture is given along with the realization of a Long Short-Term Memory prediction model for detecting failures on the Italian Railway Line of Milano-Monza-Chiasso.  ...  Ingestion layer has been realized through Apache NiFi, a dataflow system based on the concepts of flow-based programming.  ...
doi:10.1007/978-3-030-49165-9_3 fatcat:ngecozyruvdgjozxbssknf7pri

In-Network Accumulation: Extending the Role of NoC for DNN Acceleration [article]

Binayak Tiwari, Mei Yang, Xiaohang Wang, Yingtao Jiang
2022 arXiv   pre-print
In this paper, we propose the In-Network Accumulation (INA) method to further accelerate a DNN workload execution on a many-core spatial DNN accelerator for the Weight Stationary (WS) dataflow model.  ...  Network-on-Chip (NoC) plays a significant role in the performance of a DNN accelerator.  ...  Fig. 3 shows an example of the WS dataflow model on a 4 × 4 mesh.  ... 
arXiv:2209.10056v1 fatcat:u5yyfjp27bgjjf2lvulpu4nygy

Polyhedral Dataflow Programming: A Case Study

Romain Fontaine, Laure Gonnord, Lionel Morel
2018 2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)  
This approach is validated through the development of a prototype toolchain based on an extended version of the ΣC language.  ...  We demonstrate the benefit of this approach and the potentiality of further improvements on relevant case studies.  ...  Some of the experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities  ... 
doi:10.1109/cahpc.2018.8645947 dblp:conf/sbac-pad/FontaineGM18 fatcat:iinv4zjxznbhblp2tfblt47kr4

Evaluation of high-level query languages based on MapReduce in Big Data

Marouane Birjali, Abderrahim Beni-Hssane, Mohammed Erritali
2018 Journal of Big Data  
Abstract MapReduce (MR) is a criterion of Big Data processing model with parallel and distributed large datasets.  ...  Introduction Since it was presented by Google in 2004, MapReduce (MR) [1] has emerged as a popular framework for Big Data processing model in cluster environment and cloud computing [2].  ...  Availability of data and materials Data will not be shared at this moment, as the datasets are for use in extension for my research work.  ...
doi:10.1186/s40537-018-0146-3 fatcat:5zkczkf3ancrvd4wt3xkhndua4

The Berlin Big Data Center (BBDC)

Christoph Boden, Tilmann Rabl, Volker Markl
2018 it - Information Technology  
However, writing efficient implementations of data analysis programs on these systems requires a deep understanding of systems programming, prohibiting large groups of data scientists and analysts from  ...  Framework for transparent and reproducible benchmark experiments of distributed data processing systems, approaches to foster the interpretability of machine learning models and finally provide an overview  ...  Flink provides a programming model that provides support of iterative algorithms and complex user-defined functions which simplifies the process of creating data analysis programs in comparison with other  ... 
doi:10.1515/itit-2018-0016 fatcat:mnxe772elba5rekvxds5j4xgpm