
A Practice of TPC-DS Multidimensional Implementation on NoSQL Database Systems [chapter]

Hongwei Zhao, Xiaojun Ye
2014 Lecture Notes in Computer Science  
cuboids instances • Not all dimensions are used in a query • Not all queries are used with the same frequency • ...  OLAP engine practice on NoSQL systems for low-latency?  ...  WHY MOLAP  MOLAP is online analytical processing that indexes directly into a multidimensional database  Users can view different aspects or facets of data aggregates stored in a multidimensional  ...  "d_moy" • Choice of cube model: • Demand-driven & data-driven • Generation for cube data: • Model-driven & requirement-driven  ... 
doi:10.1007/978-3-319-04936-6_7 fatcat:qz4i46zuvjhchkekyggkhdkdii

Big Data Analytics and Processing Platform in Czech Republic Healthcare

Martin Štufi, Boris Bačić, Leonid Stoimenov
2020 Applied Sciences  
The aim of this study is to improve the existing healthcare eSystem by implementing a Big Data Analytics (BDA) platform and to meet the requirements of the Czech Republic National Health Service (Tender-Id  ...  The reported PoC BDA platform, artefacts, and concepts are transferrable to healthcare systems in other countries interested in developing or upgrading their own national healthcare infrastructure in a  ...  Data Storage (DS) Data storage (DS) represents a system module that contains cluster-based, horizontally scalable physical architectures built onto NoSQL Vertica databases.  ... 
doi:10.3390/app10051705 fatcat:mogab2rkq5bnvpsow3cqjtrxqq

Big Data Architecture in Czech Republic Healthcare Service: Requirements, TPC-H Benchmarks and Vertica [article]

Martin Štufi, Boris Bačić, Leonid Stoimenov
2020 arXiv   pre-print
The platform, based on analytical Vertica NoSQL database for massive data processing, complies with the TPC-H for decision support benchmark, the European Union (EU) and the Czech Republic requirements  ...  The purpose of this study is to improve existing clinical care by implementing a big data platform for the Czech Republic National Health Service.  ...  Acknowledgments We express our appreciation of the contribution of colleagues from the Solutia company who participated in the realisation of the IHIS project.  ... 
arXiv:2001.01192v1 fatcat:54znjmigbbexhhynqm6dyuisce

Benchmarking data warehouse systems in the cloud

Rim Moussa
2013 2013 ACS International Conference on Computer Systems and Applications (AICCSA)  
Finally, we present new requirements for implementing a benchmark for data warehouse systems in the cloud.  ...  The proposed requirements should allow a fair comparison of different cloud systems providers' offerings.  ...  The current TPC-H specification (ditto for TPC-DS) assumes TPC-H deployment on a parallel machine (shared-disk or shared-memory system architecture), and not on a shared-nothing architecture.  ... 
doi:10.1109/aiccsa.2013.6616442 dblp:conf/aiccsa/Moussa13 fatcat:y4mcf7xk3zdjzgd473dufhpeau

Holistic evaluation in multi-model databases benchmarking

Chao Zhang, Jiaheng Lu
2019 Distributed and parallel databases  
A multi-model database (MMDB) is designed to support multiple data models against a single, integrated back-end. Examples of data models include document, graph, relational, and key-value.  ...  Finally, the extensive experiments based on the proposed benchmark were performed on four representatives of MMDBs: ArangoDB, OrientDB, AgensGraph and Spark SQL.  ... 
doi:10.1007/s10619-019-07279-6 fatcat:abvnwc6m2ne27ppdjqcwociwim

An Iterative Methodology for Defining Big Data Analytics Architectures

Roberto Tardio, Alejandro Mate, Juan Trujillo
2020 IEEE Access  
TPC-DS or TPC-H, which are widely used and standardized.  ...  Conf. on Big Data, 2015) and "A novel multidimensional approach to integrate big data in business intelligence" (Journal of Database Management, 2015).  ... 
doi:10.1109/access.2020.3039455 fatcat:2o6ee6cvjveurgbh4s4mrwv65e


Tim Kraska
2018 Proceedings of the VLDB Endowment  
Most importantly, enabling a broader range of users to unfold the potential of (their) data requires a change in the interface and the "protection" we offer them.  ...  On the one hand, visual interfaces for data science have to be intuitive, easy, and interactive to reach users without a strong background in computer science or statistics.  ...  5) Visualizations are not like SQL queries: The workload created by visualization systems is very different from what TPC-H and TPC-DS make us believe analytical workloads look like.  ... 
doi:10.14778/3229863.3240493 fatcat:7v3sxuhth5gnpa2d5lc6rmu4b4

A Unified View of Data-Intensive Flows in Business Intelligence Systems: A Survey [chapter]

Petar Jovanovic, Oscar Romero, Alberto Abelló
2016 Lecture Notes in Computer Science  
In this paper we present a survey of today's research on data-intensive flows and the related fundamental fields of database theory.  ...  As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows.  ...  This work has been partially supported by the Secretaria d'Universitats i Recerca de la Generalitat de Catalunya under 2014 SGR 1534, and by the Spanish Ministry of Education grant FPU12/04915.  ... 
doi:10.1007/978-3-662-54037-4_3 fatcat:il3vwhf22rhzvmimjybvfa5npe

Flexible Integration and Efficient Analysis of Multidimensional Datasets from the Web

Benedikt Kämpgen
The broad acceptance of the RDF Data Cube Vocabulary (QB) for publishing multidimensional datasets and of Online Analytical Processing (OLAP) interfaces for intuitive and interactive knowledge discovery  ...  Second, aggregation and filtering operations, a varying selectivity of queries, and chains of joins over a growing number of possibly large datasets together with background information from the Web render  ...  NoSQL Systems As an alternative to traditional relational databases, NoSQL (not only SQL) data management systems focus on scalability.  ... 
doi:10.5445/ksp/1000047013 fatcat:jivaxlbyujgbtkqirbcpfxawxm

Business Intelligence on Non-Conventional Data

Enrico Coordinatore, Dottorato Relatore, Paolo Ciaccia, Rizzi
2017 unpublished
Schemaless data refers to the storage of data in NoSQL databases that do not force a predefined schema, but let database instances embed their own local schemata.  ...  In this context, this thesis proposes an approach to determine the schema profile of a document-based database; the goal is to facilitate users in a schema-on-read analysis process by understanding the  ...  Some comparison between CubeLoad and TPC-DS is useful at this point. Overall, the focus in the TPC-DS is more on the complexity of single queries rather than on query sessions.  ...