Data Optimization using Apache Flink

2019 International Journal of Engineering and Advanced Technology  
Flink is an open-source Apache Big Data processing system for batch and streaming data.  ...  MapReduce, Flink, and Spark have also become more popular in the processing of big data lately.  ...  Apache Flink integrates windowing into a state-of-the-art operator controlled by a dynamic statement consisting of 3 processes: a window assigner, optionally associated with a trigger and an evictor.  ... 
doi:10.35940/ijeat.b3081.129219 fatcat:ey6yjpryengfldhwf67t2a5zpq

Flink-ER: An Elastic Resource-Scheduling Strategy for Processing Fluctuating Mobile Stream Data on Flink

Ziyang Li, Jiong Yu, Chen Bian, Yonglin Pu, Yuefei Wang, Yitian Zhang, Binglei Guo
2020 Mobile Information Systems  
As real-time and immediate feedback becomes increasingly important in tasks related to mobile information, big data stream processing systems are increasingly applied to process massive amounts of mobile  ...  (Flink-ER), which consists of a capacity detection algorithm, an elastic resource reallocation algorithm, and a data redistribution algorithm.  ...  stateful data management.  ... 
doi:10.1155/2020/5351824 fatcat:nb4igucv7vf2ldhiy42tj4tkbe

Towards autoscaling of Apache Flink jobs

Balázs Varga, Márton Balassi, Attila Kiss
2021 Acta Universitatis Sapientiae: Informatica  
Apache Flink is an open-source distributed stream processing engine that is able to process a large amount of data in real time with low latency.  ...  Data stream processing has been gaining attention in the past decade.  ...  Introduction Apache Flink [5, 18, 10] is an open-source distributed data stream processing engine and framework.  ... 
doi:10.2478/ausi-2021-0003 fatcat:ek4pmqbypve3nca2jphmxn6g44

FogGuru: a Fog Computing platform based on Apache Flink

Davaadorj Battulga, Daniele Miorandi, Cedric Tedeschi
2020 2020 23rd Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN)  
FogGuru builds upon the stream processing paradigm for fog applications design; its implementation makes use of the open-source Apache Flink stream processing engine.  ...  Systems and methods for developing and deploying an application in Fog nodes are still in their infancy.  ...  Pierre and colleagues from INRIA in University of Rennes 1 for assembling the Raspberry Pi cluster.  ... 
doi:10.1109/icin48450.2020.9059374 dblp:conf/icin/BattulgaMT20 fatcat:chiilke6lrdnfjxtgdvejhebpe

SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink

Oscar Ceballos, Carlos Alberto Ramírez Restrepo, María Constanza Pabón, Andres M. Castillo, Oscar Corcho
2021 Applied Sciences  
New trends in Big Data technologies have also emerged (e.g., Apache Spark, Apache Flink); they use distributed in-memory processing and promise to deliver higher data processing performance.  ...  In this paper, we present a formal interpretation of some PACT transformations implemented in the Apache Flink DataSet API.  ...  Acknowledgments: The scalability test results on local cluster presented in this paper were obtained thanks to ViveLab Nariño, an initiative of Ministerio de Tecnologías de la Información y las Comunicaciones-MinTIC  ... 
doi:10.3390/app11157033 fatcat:kqtyvqp645bctbpriwhwb5qgxu

In-transit molecular dynamics analysis with Apache Flink

Henrique C. Zanúz, Bruno Raffin, Omar A. Mures, Emilio J. Padrón
2018 Proceedings of the Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization - ISAV '18  
In this paper we propose to leverage Apache Flink, a scalable stream processing engine from the Big Data domain, in this HPC context.  ...  We build a complete in transit analytics workflow, connecting an MD simulation to Apache Flink and to a distributed database, Apache HBase, to persist all the desired data.  ...  We use Apache Flink, a distributed streaming dataflow engine, to process in transit the data from the simulation.  ... 
doi:10.1145/3281464.3281469 dblp:conf/sc/ZanuzRMP18 fatcat:b7sc4mynvbfapk6n6y73mxzhqm

Real-time analysis of market data leveraging Apache Flink

Cecilia Calavaro, Gabriele Russo Russo, Valeria Cardellini
2022 Proceedings of the 16th ACM International Conference on Distributed and Event-Based Systems  
Our solution leverages Apache Flink, an open-source, scalable stream processing platform, which allows us to process incoming data streams with low latency and exploit the parallelism offered by the underlying  ...  CCS CONCEPTS • Information systems → Data analytics; Stream management.  ...  Our solution relies on two well-known open-source distributed frameworks: Apache Kafka for data ingestion and Apache Flink for stream processing.  ... 
doi:10.1145/3524860.3539650 fatcat:267aw324bbejdmqhm6flmgjlyi

Detecting trading trends in streaming financial data using Apache Flink

Emmanouil Kritharakis, Shengyao Luo, Vivek Unnikrishnan, Karan Vombatkere
2022 Proceedings of the 16th ACM International Conference on Distributed and Event-Based Systems  
This paper aims to solve the above challenge with a distributed, event-streaming solution built using Apache Flink.  ...  the input stream uses sinks to collect and output results, and scales easily on a distributed Flink cluster.  ...  This paper uses Apache Flink [5, 6], an open-source, unified stream-processing and batch-processing framework.  ... 
doi:10.1145/3524860.3539647 fatcat:5nfmj2o4g5e5hhgoebrwdxi7pu

Reproducible Experiments for Comparing Apache Flink and Apache Spark on Public Clouds [article]

Shelan Perera, Ashansa Perera, Kamal Hakimzadeh
2016 arXiv   pre-print
Therefore data processing engines such as Apache Flink and Apache Spark emerged in open source world to fulfil that efficient and high performing data processing requirement.  ...  Big data processing is a hot topic in today's computer science world. There is a significant demand for analysing big data to satisfy many requirements of many industries.  ...  CPU utilization of Apache Flink in Stream processing. Fig. 15. CPU utilization of Apache Spark in Stream processing. Fig. 16.  ... 
arXiv:1610.04493v1 fatcat:5f7pyp4vqja53jcu42zesvve7u

Analyzing extended property graphs with Apache Flink

Martin Junghanns, André Petermann, Niklas Teichmann, Kevin Gómez, Erhard Rahm
2016 Proceedings of the 1st ACM SIGMOD Workshop on Network Data Analytics - NDA '16  
Our current implementation is based on the distributed dataflow framework Apache Flink.  ...  Thus, graph analytics plays an important role in research and industry.  ...  Our model includes declarative operators for graph analytics. • We describe the first implementation of the EPGM on top of Apache Flink, a state-of-the-art distributed dataflow framework. • We present  ... 
doi:10.1145/2980523.2980527 dblp:conf/sigmod/JunghannsPTGR16 fatcat:bjb6tchnurgx7luw2ozgt3k7j4

Continuous Outlier Mining of Streaming Data in Flink [article]

Theodoros Toliopoulos, Anastasios Gounaris, Kostas Tsichlas, Apostolos Papadopoulos, Sandra Sampaio
2019 arXiv   pre-print
Our proposal fills this gap and investigates the challenges in transferring state-of-the-art techniques to Apache Flink, a modern platform for intensive streaming analytics.  ...  In recent years, several solutions have tackled the problem of distance-based outliers in data streams, where outliers must be mined continuously as new elements become available.  ...  Apache Storm is the first widely used large scale stream processing framework.  ... 
arXiv:1902.07901v1 fatcat:w7pqkipvgvffndtcfhr6e4hawe

Detecting technical trading patterns in financial data with Apache Flink

Quan Pham, Quang Nguyen, Ryte Richard, Shekhar Sharma, Xavier Ruiz
2022 Proceedings of the 16th ACM International Conference on Distributed and Event-Based Systems  
This paper aims to solve the above challenge using Apache Flink [1], an open-source, unified stream-processing and batch-processing framework developed by the Apache Software Foundation.  ...  The DEBS Grand Challenge is an annual competition in which participants strive to build the fastest and most scalable distributed and event-based systems that solve a practical problem.  ...  BACKGROUND Apache Flink [1] is an open-source framework and distributed processing engine for stateful computations over unbounded data streams.  ... 
doi:10.1145/3524860.3539648 fatcat:zltbhguvrnhjffiuktxrlxz5sq

A comparison on scalability for batch big data processing on Apache Spark and Apache Flink

Diego García-Gil, Sergio Ramírez-Gallego, Salvador García, Francisco Herrera
2017 Big Data Analytics  
Recently a novel framework called Apache Flink has emerged, focused on distributed stream and batch data processing.  ...  Apache Spark is a fast and general engine for large-scale data processing based on the MapReduce model. The main feature of Spark is the in-memory computation.  ...  Apache Flink offers a high fault tolerance mechanism to consistently recover the state of data streaming applications.  ... 
doi:10.1186/s41044-016-0020-2 fatcat:b6uqpjj7nfei7lckkafbrdktpi

A Distributed Online Learning Approach for Pattern Prediction over Movement Event Streams with Apache Flink

Ehab Qadah, Michael Mock, Elias Alevizos, Georg Fuchs
2018 International Conference on Extending Database Technology  
Apache Flink.  ...  In this paper, we present a distributed online prediction system for user-defined patterns over multiple massive streams of movement events, built using the general purpose stream processing framework  ...  Apache Flink is an open source project that provides a large-scale, distributed, and stateful stream processing platform [6].  ... 
dblp:conf/edbt/QadahMAF18 fatcat:3v45ahlx5ndonnzm4bpbg4fvqm

Spark Versus Flink: Understanding Performance in Big Data Analytics Frameworks

Ovidiu-Cristian Marcu, Alexandru Costan, Gabriel Antoniu, Maria S. Perez-Hernandez
2016 2016 IEEE International Conference on Cluster Computing (CLUSTER)  
Spark and Flink are two Apache-hosted data analytics frameworks that facilitate the development of multi-step data pipelines using directed acyclic graph patterns.  ...  This paper aims to bring some justice in this respect, by directly evaluating the performance of Spark and Flink.  ...  The experiments presented in this paper were carried out on the Grid'5000 testbed [49].  ... 
doi:10.1109/cluster.2016.22 dblp:conf/cluster/MarcuCAP16 fatcat:e6e6aftulved5c3qlj6634b6me