
Exploring Shared State in Key-Value Store for Window-Based Multi-pattern Streaming Analytics

Ovidiu-Cristian Marcu, Radu Tudoran, Bogdan Nicolae, Alexandru Costan, Gabriel Antoniu, Maria S. Perez-Hernandez
2017 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)  
We design a deduplication method specifically for window-based operators that rely on key-value stores to hold a shared state.  ...  In this paper, we explore the feasibility of deduplication techniques to address the challenge of reducing memory footprint for window-based stream processing without significant impact on performance.  ...  Second, we plan to explore multi-pattern aggregations using key-value stores that support group queries.  ... 
doi:10.1109/ccgrid.2017.126 dblp:conf/ccgrid/MarcuTNCAP17 fatcat:5oxtilgnibhj5dhp3zb47jhe2a

A Comprehensive Survey on Parallelization and Elasticity in Stream Processing

Henriette Röger, Ruben Mayer
2019 ACM Computing Surveys  
Mencagli and De Matteis [36] investigated parallelization patterns for window-based SP operators. They neither took other parallelization strategies into account nor discussed elasticity methods.  ...  Stream Processing (SP) has evolved as the leading paradigm to process and gain value from the high volume of streaming data produced, e.g., in the domain of the Internet of Things.  ...  It provides data integration, a "marketplace" for operators and operator topologies, and billing. A special feature is a shared key-value store for state that prevents state migrations.  ... 
doi:10.1145/3303849 fatcat:hq3byyhqvjg2dpryb2thwz4vfe

A Survey of Distributed Data Stream Processing Frameworks

Haruna Isah, Tariq Abughofa, Sazia Mahfuz, Dharmitha Ajerla, Farhana Zulkernine, Shahzad Khan
2019 IEEE Access  
One of the challenges in developing a streaming analytics infrastructure is the difficulty in selecting the right stream processing framework for the different use cases.  ...  The study also reports our ongoing study on a multilevel streaming analytics architecture that can serve as a guide for organizations and individuals planning to implement a real-time data stream processing  ...  State information during Flink's operations is maintained in an embedded key/value store [78].  ... 
doi:10.1109/access.2019.2946884 fatcat:lu6oknfpkraybmtuqxismmlqda

Complex event analytics

Yingmei Qi, Lei Cao, Medhabi Ray, Elke A. Rundensteiner
2014 Proceedings of the 2014 ACM SIGMOD international conference on Management of data - SIGMOD '14  
Complex Event Processing (CEP) is a technology of choice for high performance analytics in time-critical decision-making applications.  ...  Yet while effective technologies for complex pattern detection on continuous event streams have been developed, the problem of scalable online aggregation of such patterns has been overlooked.  ...  We also compare our multi A-Seq technique against the non-shared A-Seq technique and the state-of-the-art multi-query sharing strategy for sequence queries [9] to demonstrate the effectiveness  ... 
doi:10.1145/2588555.2593684 dblp:conf/sigmod/QiCRR14 fatcat:mrgaqkzedbchxeyjoj5anmle64


Badrish Chandramouli, Jonathan Goldstein, Mike Barnett, Robert DeLine, Danyel Fisher, John C. Platt, James F. Terwilliger, John Wernsing
2014 Proceedings of the VLDB Endowment  
This paper introduces Trill, a new query processor for analytics.  ...  Trill fulfills a combination of three requirements for a query processor to serve the diverse big data analytics space: (1) Query Model: Trill is based on a tempo-relational model that enables it to handle  ...  Phoenix++ [7] is a variant of map-reduce for in-memory analytics; unlike Trill, it is neither temporal nor streaming, and exposes a low-level key-value API.  ... 
doi:10.14778/2735496.2735503 fatcat:6cwt4vpxufgpzl2biaevkjoqay


Hongyu Miao, Myeongjae Jeon, Gennady Pekhimenko, Kathryn S. McKinley, Felix Xiaozhu Lin
2019 Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '19  
StreamBox-HBM solely uses HBM to store Key Pointer Array (KPA) data structures that contain only partial records (keys and pointers to full records) for grouping operations.  ...  Stream analytics have an insatiable demand for memory and performance. Emerging hybrid memories combine commodity DDR4 DRAM with 3D-stacked High Bandwidth Memory (HBM) DRAM to meet such demands.  ...  Acknowledgments For this project: the authors affiliated with Purdue ECE were supported in part by NSF Award 1718702, NSF Award 1619075, and a Google Faculty Award.  ... 
doi:10.1145/3297858.3304031 dblp:conf/asplos/MiaoJPML19 fatcat:djks3mdja5fgjjuwudbus42wmi

A Survey of State Management in Big Data Processing Systems [article]

Quoc-Cuong To, Juan Soto, Volker Markl
2018 arXiv   pre-print
Given the pivotal role that state management plays in various use cases, in this survey, we present some of the most important uses of state as an enabler, discuss the alternative approaches used to handle  ...  State management and its use in diverse applications varies widely across big data processing systems.  ...  They present four parallel patterns for window-based stateful operators on data streams: window farming, key partitioning, pane farming, and window partitioning.  ... 
arXiv:1702.01596v4 fatcat:474aqppfpjhdrkslqrawrmnck4

In-situ feature-based objects tracking for data-intensive scientific and enterprise analytics workflows

Solomon Lasluisa, Fan Zhang, Tong Jin, Ivan Rodero, Hoang Bui, Manish Parashar
2014 Cluster Computing  
tracking, and that it can be effectively used for in-situ analytics in large scale simulations.  ...  data from the simulations directly via on-chip shared memory.  ...  (6) The trajectory for each temporal cluster is updated by storing features in sorted order according to the analysis window in which they were found.  ... 
doi:10.1007/s10586-014-0396-6 fatcat:rnlbq3djgvd7jivd6mhlv6q5by


Fabian Fischer, Daniel A. Keim
2014 Proceedings of the Eleventh Workshop on Visualization for Cyber Security - VizSec '14  
are presented to analysts in a web-based visual analytics application, called NVisAware.  ...  Furthermore, we visually guide the user in the feature selection process to summarize the slices to focus on the most interesting parts of the stream based on introduced expert knowledge of the analyst  ...  A key-value list can be used to count the number of occurrences for all words to gather a list of frequent words. The key-array list can be used to store for each key an array of values.  ... 
doi:10.1145/2671491.2671495 dblp:conf/vizsec/FischerK14 fatcat:czluay556rgbtad75f7ftn5iry

In-transit molecular dynamics analysis with Apache flink

Henrique C. Zanúz, Bruno Raffin, Omar A. Mures, Emilio J. Padrón
2018 Proceedings of the Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization - ISAV '18  
In this paper, an on-line parallel analytics framework is proposed to process and store in transit all the data being generated by a Molecular Dynamics (MD) simulation run using staging nodes in the same  ...  Flink makes it possible to program analyses within a simple window-based map/reduce model, while the runtime takes care of the deployment, load balancing and fault tolerance.  ...  Key-value Store: Apache HBase is an open-source distributed database based on Google's proprietary BigTable.  ... 
doi:10.1145/3281464.3281469 dblp:conf/sc/ZanuzRMP18 fatcat:b7sc4mynvbfapk6n6y73mxzhqm

Pico: A Domain-Specific Language For Data Analytics Pipelines

Claudia Misale, Marco Aldinucci, Guy Tremblay
2017 Zenodo  
., from the runtime to the user API), it is easier for a programmer or software designer to avoid mixing low level with high level aspects, as we are often used to see in state-of-the-art Big Data analytics  ...  Second, we propose a programming environment based on such layered model in the form of a Domain-Specific Language (DSL) for processing data collections, called PiCo (Pipeline Composition). T [...]  ...  It targets shared-memory multi-core architectures, and exposes parallel patterns for exploiting both stream parallelism and data parallelism.  ... 
doi:10.5281/zenodo.579753 fatcat:aadje57qh5hk3ijmqn4j7vkhpm

Multi-Machine Gaussian Topic Modeling for Predictive Maintenance

Alexander Karlsson, Ebru Turanoglu Bekar, Anders Skoogh
2021 IEEE Access  
Section IV presents a multi-machine simulation; application of our framework for discovering patterns across multiple machines; results with exploration of shared clusters as well as results from a comparison  ...  This allows us to describe the current state of the machine for each point in time corresponding to the sliding window, as shown in Figure 3.  ... 
doi:10.1109/access.2021.3096387 fatcat:ai4kxusdh5bvfgvtfasrkgqs7e

Accelerating dynamic graph analytics on GPUs

Mo Sha, Yuchen Li, Bingsheng He, Kian-Lee Tan
2017 Proceedings of the VLDB Endowment  
Furthermore, we propose parallel update algorithms to support efficient stream updates so that the maintained graph is immediately available for high-speed analytic processing on GPUs.  ...  In this paper, we propose a GPU-based dynamic graph storage scheme to support existing graph algorithms easily.  ...  All values stored in PMA are displayed in the array.  ... 
doi:10.14778/3151113.3151122 fatcat:zjyzodyd2vczjakz77ecdro6km

Partition and compose

Martin Hirzel
2012 Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems - DEBS '12  
Complex event processing uses patterns to detect composite events in streams of simple events. Typically, the events are logically partitioned by some key.  ...  For instance, the key can be the stock symbol in stock quotes, the author in tweets, the vehicle in transportation, or the patient in health-care.  ...  Thanks to the entire System S team for their encouragement and feedback for the operator described in this paper.  ... 
doi:10.1145/2335484.2335506 dblp:conf/debs/Hirzel12 fatcat:fgmph2y6encs7jmjl6odsq43gi

Multi-query Stream Processing on FPGAs

Mohammad Sadoghi, Rija Javed, Naif Tarafdar, Harsh Singh, Rohan Palaniappan, Hans-Arno Jacobsen
2012 2012 IEEE 28th International Conference on Data Engineering  
A multi-query optimized event processing network is comprised of Rete-specific elements, e.g., pattern detect nodes and join nodes, that share functional resemblance with the key relational algebra operators  ...  We propose key opportunities for intra- and inter-parallelized execution of window-based-join semantics.  ... 
doi:10.1109/icde.2012.39 dblp:conf/icde/SadoghiJTSPJ12 fatcat:3ztsskxvdre2dmgrmybvjvbwpy