
Information Discovery in Polystores: The Augmented Way

Antonio Maccioni, Riccardo Torlone
2019 Sistemi Evoluti per Basi di Dati  
Augmentation can be used to implement augmented search and augmented exploration: two effective methods for information discovery in polystores that avoid middleware layers, abstract query languages, and  ...  We address this problem by illustrating query augmentation, a data manipulation operator for polystores based on the automatic enrichment of the answer to a local query with related data in the rest of  ...  The augmented construct can be at the basis of two alternative ways to access a polystore, as described in the following. Augmented Search.  ... 
dblp:conf/sebd/MaccioniT19 fatcat:dmhkmzban5cd5ciqhfir6rahpa

Supporting Polystore Queries using Provenance in a Hyperknowledge Graph

Leonardo Azevedo, Renan Souza, Elton F. S. Soares, Raphael Melo, Anna Oliveira, Márcio Ferreira Moreno
2021 International Semantic Web Conference  
Current modern applications commonly need to manage various types of datasets, usually composed of heterogeneous data and schema manipulated by disparate tools and techniques in an ad-hoc way.  ...  This demo presents HKPoly - a solution that tackles the challenge of mapping and linking heterogeneous data, providing data access encapsulation by employing semantic, provenance, and data linkage.  ...  The third activity uses the high-quality data files and augments the raw geodata files with extra knowledge informed by geoscience experts (stored in AllegroGraph - T-DBMS).  ... 
dblp:conf/semweb/0001SSMOM21 fatcat:6xuknrkmlva6fkla4dhnhwyafa

Handling Evolution in Big Data Architectures

Darja Solodovnikova, Laila Niedrite
2020 Baltic Journal of Modern Computing  
In this paper, we analyze architectures designed for Big Data processing and analysis described in the literature with the purpose to identify the most appropriate solution for the evolution problem.  ...  We concentrate on four architecture types: data lakes, virtual integration, polystores, and λ-architecture, and, in addition to them, we consider solutions that apply data warehouse/OLAP methods to Big  ...  Acknowledgments This work has been partly supported by the European Regional Development Fund (ERDF) project No. 1.1.1.2./VIAA/1/16/057.  ... 
doi:10.22364/bjmc.2020.8.1.02 fatcat:ghvrfrud7rcblj7a3ww3pzn2dm

Exploring complex and big data

Jerzy Stefanowski, Krzysztof Krawiec, Robert Wrembel
2017 International Journal of Applied Mathematics and Computer Science  
We then survey the dedicated solutions for storing and processing big data, including a data lake, virtual integration, and a polystore architecture.  ...  All in all, we consider it to be the truly defining feature of big data (posing particular research and technological challenges), which ultimately seems to be of greater importance than the sheer data  ...  Introduction Big data can be defined in several ways.  ... 
doi:10.1515/amcs-2017-0046 fatcat:q6ugvobzi5cmbos4ct52mb3d34

Data lake concept and systems: a survey [article]

Rihan Hai, Christoph Quix, Matthias Jarke
2021 arXiv   pre-print
It poses a huge difficulty to efficiently integrate, access, and query the large volume of diverse data in information silos with the traditional 'schema-on-write' approaches such as data warehouses.  ...  We hope that the thorough comparison of existing solutions and the discussion of open research challenges in this survey would motivate the future development of data lake research and practice.  ...  For example, for data augmentation it also tries to find tables that can bring new data instances to improve the information gain.  ... 
arXiv:2106.09592v1 fatcat:qqwgp52s6vdhhjvx6y24pvj6jm

Draining the Data Swamp

Will Brackenbury, Rui Liu, Mainack Mondal, Aaron J. Elmore, Blase Ur, Kyle Chard, Michael J. Franklin
2018 Proceedings of the Workshop on Human-In-the-Loop Data Analytics - HILDA'18  
So called "data lakes" embrace the storage of data in its natural form, integrating and organizing in a Pay-as-you-go fashion.  ...  While this model defers the upfront cost of integration, the result is that data is unusable for discovery or analysis until it is processed.  ...  suggest other data sets that are similar in such ways.  ... 
doi:10.1145/3209900.3209911 dblp:conf/sigmod/BrackenburyLMEU18 fatcat:gy7rni6pdvh6pfjvbwwmup64ha

Big Data Semantics

Paolo Ceravolo, Antonia Azzini, Marco Angelini, Tiziana Catarci, Philippe Cudré-Mauroux, Ernesto Damiani, Alexandra Mazak, Maurice Van Keulen, Mustafa Jarrar, Giuseppe Santucci, Kai-Uwe Sattler, Monica Scannapieco (+3 others)
2018 Journal on Data Semantics  
In this paper, the third of its kind co-authored by members of IFIP WG 2.6 on Data Semantics, we propose a review of the literature addressing these topics and discuss relevant challenges for future research  ...  It is, however, largely recognized that Big Data impose novel challenges in data and infrastructure management.  ...  An example architecture of a polystore system is shown in Fig. 3. It is composed of two information islands: a relational one and a NoSQL one.  ... 
doi:10.1007/s13740-018-0086-2 fatcat:bhbeyntbtzdkvf5t3dcko42jpy

Dataset Relationship Management

Zack Ives, Yi Zhang, Soonbo Han, Nan Zheng
2019 Conference on Innovative Data Systems Research  
We argue that the recent adoption of computational notebooks (particularly JupyterLab and Jupyter Notebook), as a unified interface over data tools, provides an ideal way of gathering detailed information  ...  We briefly outline our experiences in building towards JuNEAU, the first prototype DRMS.  ...  the same or similar ways.  ... 
dblp:conf/cidr/IvesZHZ19 fatcat:rpngzqwzfndchlpdpjpi2r5lbm

AI Enabling Technologies: A Survey [article]

Vijay Gadepally, Justin Goodwin, Jeremy Kepner, Albert Reuther, Hayley Reynolds, Siddharth Samsi, Jonathan Su, David Martinez
2019 arXiv   pre-print
Artificial Intelligence (AI) has the opportunity to revolutionize the way the United States Department of Defense (DoD) and Intelligence Community (IC) address the challenges of evolving threats, data  ...  While much of the popular press today surrounds advances in algorithms and computing, most modern AI systems leverage advances across numerous different fields.  ...  ACKNOWLEDGEMENTS The authors wish to thank the following individuals for their contributions, thoughts, and comments toward the development of this section: Charlie Dagli, Arjun Majumdar, Lauren Milechin  ... 
arXiv:1905.03592v1 fatcat:dui76274qvb5bie7pok2gpek6u

Survive the Schema Changes: Integration of Unmanaged Data Using Deep Learning [article]

Zijie Wang, Lixi Zhou, Amitabh Das, Valay Dave, Zhanpeng Jin, Jia Zou
2020 arXiv   pre-print
Data is the king in the age of AI. However, data integration is often a laborious task that is hard to automate.  ...  Although there exist mechanisms such as query discovery and schema modification language to handle the problem, these approaches can only work with the assumption that the schema is maintained by a database  ...  Data Discovery Data discovery is to find related tables in a data lake.  ... 
arXiv:2010.07586v1 fatcat:kaux2y3uvzfivon7cusggsyudm

Agora: A Unified Asset Ecosystem Going Beyond Marketplaces and Cloud Services [article]

Jonas Traub, Jorge-Arnulfo Quiané-Ruiz, Zoi Kaoudi, Volker Markl
2020 arXiv   pre-print
Agora presents novel research directions for the data management community as a whole: It requires to combine our traditional expertise in scalable data processing and management with infrastructure provisioning  ...  The Agora system provides the technical infrastructure that allows for offering and using data and algorithms, as well as physical infrastructure components.  ...  Only a unified specification enables easy asset discovery and composition across all the marketplaces in the ecosystem.  ... 
arXiv:1909.03026v3 fatcat:y2naveva4fcw3hyb5ioi2bylje

Cloud and Distributed Architectures for Data Management in Agriculture 4.0 : Review and Future Trends

Olivier Debauche, Saïd Mahmoudi, Pierre Manneback, Frédéric Lebeau
2021 Journal of King Saud University: Computer and Information Sciences  
(4) What are the vertical valuations possibilities to move from algorithms trained in the cloud to embedded or stand-alone products?  ...  The Agriculture 4.0, also called Smart Agriculture or Smart Farming, is at the origin of the production of a huge amount of data that must be collected, stored, and processed in a very short time.  ...  Fabrice Nolack Fote for his help in the elaboration of the conceptual framework.  ... 
doi:10.1016/j.jksuci.2021.09.015 fatcat:jjo2444jvjgxbduepcerewoowe

Tackling the veracity and variety of big data [article]

Ruochun Jin, University Of Edinburgh, Wenfei Fan, Leonid Libkin
2022
Given expected support and recall bounds, this method is able to deduce samples in H and mine rules from H to satisfy the bounds in the entire G.  ...  We formalize association deduction with GARs in terms of the chase, and prove its Church-Rosser property.  ...  As shown above, BigDAWG and Myria handle graphs essentially in the same way as a conventional RDBMS does, although they are polystores.  ... 
doi:10.7488/era/2412 fatcat:me4heaka7fhszhgntvifgw3x6y

Saga: A Platform for Continuous Construction and Serving of Knowledge At Scale [article]

Ihab F. Ilyas, Theodoros Rekatsinas, Vishnu Konda, Jeffrey Pound, Xiaoguang Qi, Mohamed Soliman
2022 pre-print
In this paper, we discuss the unique challenges associated with knowledge graph construction at industrial scale, and review the main components of Saga and how they address these challenges.  ...  An example of such annotations is shown in Figure 13 where short text highlights are augmented with information from the KG using NERD.  ...  A federated polystore approach [28] is used to support the wide variety of workloads against the graph, both in view computation and query APIs.  ... 
doi:10.1145/3514221.3526049 arXiv:2204.07309v1 fatcat:44bwr4zb6zgs5hqwwxvazo54mm

Just-in-time Analytics Over Heterogeneous Data and Hardware

Manolis Karpathiotakis
2017
Acknowledgements Putting together this thesis required the support and feedback of numerous people; without them, neither my PhD journey nor its destination would have been the same.  ...  Polystores.  ...  At the same time, the variety of data keeps increasing, along several axes.  ... 
doi:10.5075/epfl-thesis-8077 fatcat:y4a3i5zgcfewjbaarrkejthcgi