Review of Big Data and Processing Frameworks for Disaster Response Applications

Silvino Pedro Cumbane, Gyozo Gidófalvi
2019 ISPRS International Journal of Geo-Information  
Secondly, the big data processing frameworks are characterized and grouped based on the sources of data they handle.  ...  The availability of different big data such as satellite imageries, Global Positioning System (GPS) traces, mobile Call Detail Records (CDRs), social media posts, etc., in conjunction with advances in  ...  The overall architecture of Apache Flink can be found on [66].  ... 
doi:10.3390/ijgi8090387 fatcat:2fhh4kol2nfatbokwiggnfm5ge

Spatiotemporal Aspects of Big Data

Saadia Karim, Tariq Rahim Soomro, S. M. Aqil Burney
2018 Applied Computer Systems  
The analysis of big data involves determined attempts on previous data.  ...  Frameworks have been compared on the basis of architectural characteristics and feature attributes as well.  ...  , SPARK, AND FLINK ON THE BASIS OF SPATIOTEMPORAL BIG DATA Apache Hadoop TABLE III FEATURE COMPARISON OF HADOOP, SPATIAL HADOOP, SAMZA, STORM, SPARK, GEOSPARK, SPATIAL SPARK, FLINK BASED ON SPATIOTEMPORAL  ... 
doi:10.2478/acss-2018-0012 fatcat:rxp74qe4bvd4rk6bchql7ajaeu

An experimental survey on big data frameworks

Wissem Inoubli, Sabeur Aridhi, Haithem Mezni, Mondher Maddouri, Engelbert Mephu Nguifo
2018 Future generations computer systems  
Yet, many research works focus on Big Data, a buzzword referring to the processing of massive volumes of (unstructured) data.  ...  Social media Social media is another representative data source for big data that requires real-time processing and results.  ...  The Name Node will seek their location in its indexing system and subsequently sends their address back to the client.  ... 
doi:10.1016/j.future.2018.04.032 fatcat:dxl42yu54retblcgttysadacqu

A New Big Data Architecture for Real-Time Student Attention Detection and Analysis

Tarik Hachad, Abdelalim Sadiq, Fadoua Ghanimi
2020 International Journal of Advanced Computer Science and Applications  
Flink and Storm perform real streaming while Spark is able to do micro-batching.  ...  Fig. 3 explains the architecture of Storm. Fig. 2. Architecture of Flink Cluster. Fig. 3. Storm Cluster Architecture.  ... 
doi:10.14569/ijacsa.2020.0110831 fatcat:3v6jtx66wjcopovtgnnjjkhl6a

A Survey of Distributed Data Stream Processing Frameworks

Haruna Isah, Tariq Abughofa, Sazia Mahfuz, Dharmitha Ajerla, Farhana Zulkernine, Shahzad Khan
2019 IEEE Access  
, Spark Streaming, Flink, Kafka Streams) and commercial (IBM Streams) distributed data stream processing frameworks.  ...  INDEX TERMS Dataflow architectures, data stream architectures, distributed processing systems comparison, survey, taxonomy.  ...  These DSPEs include Storm, Spark Streaming, Flink, Kafka Streams, and IBM Streams.  ... 
doi:10.1109/access.2019.2946884 fatcat:lu6oknfpkraybmtuqxismmlqda

Maximum Sustainable Throughput Evaluation Using an Adaptive Method for Stream Processing Platforms

Zheng Chu, Jiong Yu, Askar Hamdull
2020 IEEE Access  
INDEX TERMS Streaming data, stream processing platform, maximum sustainable throughput, throughput evaluation, adaptive algorithm, throughput volatility.  ...  The experimental results from four open-source benchmarks running on three mainstream stream processing platforms show that the adaptive MST evaluation method has a lower error rate and executes faster  ...  Flink, Twitter Heron and Apache Storm.  ... 
doi:10.1109/access.2020.2976738 fatcat:a4ynyqdw7jgove5w3df36edxdm

Real Time Analytics: Algorithms and Systems [article]

Arun Kejariwal, Sanjeev Kulkarni, Karthik Ramasamy
2017 arXiv   pre-print
Velocity is one of the 4 Vs commonly used to characterize Big Data.  ...  We shall walk through how the field has evolved over the last decade and then discuss the current challenges - the impact of the other three Vs, viz., Volume, Variety and Veracity, on Big Data streaming  ...  Examples include, S4 [131], Samza [8], Sonora [167], Millwheel [37], Photon [40], Storm [158], Flink [4], Spark [9], Pulsar [130] and Heron [118].  ... 
arXiv:1708.02621v1 fatcat:jiltbmbzmjee5gdnc7vdwkbisa

Challenges and Solutions for Processing Real-Time Big Data Stream: a systematic literature review

Erum Mehmood, Tayyaba Anees
2020 IEEE Access  
Background: Published surveys and reviews either cover papers focusing on stream analysis in applications other than real-time DWH or focusing on extraction, transformation, loading (ETL) challenges for  ...  This study included Storm, Spark Streaming, Flink, Kafka Streams and IBM Streams as data stream processing engines in their review.  ...  This study provided technical experimental comparison among three big data stream processing applications: Spark Streaming, Storm and Flink.  ... 
doi:10.1109/access.2020.3005268 fatcat:b2xlblvarrgenctrnpqrpvijau

An Empirical Exploration of the Yarn in Big Data

Yusuf Perwej, Bedine Kerim, Mohmed Sirelkhtem Adrees, Osama E. Sheta
2017 International Journal of Applied Information Systems  
Yarn allow multiple applications to run simultaneously on the coequal shared cluster and assent applications to negotiate resources based on necessity.  ...  Storm on Yarn is strong for scenarios in need of actual time analytics, machine learning and sustained monitoring of operations. The Storm concatenates with Yarn via Apache slider.  ...  Apache Giraph is actual time graph processing software that is for the most part used to analyze social media data.  ... 
doi:10.5120/ijais2017451730 fatcat:oncpioqbmjfqhmqvtpqef5rpza

Real-Time Spatial Queries for Moving Objects Using Storm Topology

Feng Zhang, Ye Zheng, Dengping Xu, Zhenhong Du, Yingzhi Wang, Renyi Liu, Xinyue Ye
2016 ISPRS International Journal of Geo-Information  
In this paper, we present a distributed spatial index based on Apache Storm, an open-source distributed real-time computation system.  ...  Storm.  ...  contributed to the data acquisition and experimental study; Zhenhong Du was involved in data acquisition and revision of the manuscript; Yingzhi Wang was involved in data acquisition and analysis, worked on  ... 
doi:10.3390/ijgi5100178 fatcat:jzlxahk3pbbw7kvudw3fpfvnjy

Business Process Analytics and Big Data Systems: A Roadmap to Bridge the Gap

Sherif Sakr, Zakaria Maamar, Ahmed Awad, Boualem Benatallah, Wil M. P. Van Der Aalst
2018 IEEE Access  
These are generated as business processes are executed and stored in transaction logs, databases, e-mail correspondences, free form text on (enterprise) social media, and so on.  ...  Unfortunately, the business process management (BPM) community has not kept up to speed with such developments and often rely merely on traditional modeling-based approaches.  ...  We are witnessing the uptake of a new generation of Big Data systems like Spark, Flink, Storm, and Impala [4].  ... 
doi:10.1109/access.2018.2881759 fatcat:2fcc4au7bfgklf3zemq7xfxcii

A survey of open source tools for machine learning with big data in the Hadoop ecosystem

Sara Landset, Taghi M. Khoshgoftaar, Aaron N. Richter, Tawfiq Hasanin
2015 Journal of Big Data  
We discuss the advantages and disadvantages of three different processing paradigms along with a comparison of engines that implement them, including MapReduce, Spark, Flink, Storm, and H2O.  ...  In order to evaluate tools, one should have a thorough understanding of what to look for.  ...  Social media or IoT data may require real-time results, necessitating the use of Storm or Flink along with their associated ML libraries.  ... 
doi:10.1186/s40537-015-0032-1 fatcat:zgcsiokrynfhzbmaudqf7rcll4

Big Data Analytics = Machine Learning + Cloud Computing [chapter]

C. Wu, R. Buyya, K. Ramamohanarao
2016 Big Data  
Flink and other data process engines Apart from Spark, there are several data processing engines such as Microsoft Dryad, Storm, Tez, Flink and CIEL (see Figure 18) that are capable of supporting MapReduce  ...  It led to the development of complimentary ecosystems, such as Hama, Storm, Spark, and Flink that addressed weakness of MapReduce-based systems.  ...  Due to the demand for processing all types of BDA workloads, many Hadoop's ecosystems have been developed, such as Spark, Storm, Hama, Tachyon, TEZ, S4 and Flink.  ... 
doi:10.1016/b978-0-12-805394-2.00001-5 fatcat:2a2avnxwivbztmp7iksxqgkv2a

Big Data Analytics = Machine Learning + Cloud Computing [article]

Caesar Wu, Rajkumar Buyya, Kotagiri Ramamohanarao
2016 arXiv   pre-print
We show that Big Data is not just 3Vs, but 32 Vs, that is, 9 Vs covering the fundamental motivation behind Big Data, which is to incorporate Business Intelligence (BI) based on different hypothesis or  ...  History of Big Data has demonstrated that the most cost effective way of performing BDA is to employ Machine Learning (ML) on the Cloud Computing (CC) based infrastructure or simply, ML + CC -> BDA.  ...  Flink and other data process engines Apart from Spark, there are several data processing engines such as Microsoft Dryad, Storm, Tez, Flink and CIEL (see Figure 18) that are capable of supporting MapReduce  ... 
arXiv:1601.03115v1 fatcat:ogzvtaigsngelj7hhlkqzheraa

Technical Report: On the Usability of Hadoop MapReduce, Apache Spark & Apache Flink for Data Science [article]

Bilal Akil, Ying Zhou, Uwe Röhm
2018 arXiv   pre-print
Our findings show that Spark and Flink are preferred platforms over MapReduce.  ...  We report on the design, execution and results of a usability study with a cohort of masters students, who were learning and working with all three platforms in order to solve different use cases set in  ...  Apache Spark's method of micro-batching has been found to be slower but more resilient to failure, than native streaming in Apache Storm and Apache Flink [21].  ... 
arXiv:1803.10836v1 fatcat:o4gpa6gvsbdepjg2qez2aqzaiq