1,911 Hits in 7.2 sec

Clustering cubes with binary dimensions in one pass

Carlos Garcia-Alvarado, Carlos Ordonez
2013 Proceedings of the sixteenth international workshop on Data warehousing and OLAP - DOLAP '13  
In this research, we focus on obtaining aggregations of groups of similar records by turning the problem into efficient binary clustering of a fact table as a relaxation of a GROUP BY clause.  ...  Finding aggregations of records with high dimensionality in large data warehouses is a crucial and costly task. These groups of similar records are the result of partitions obtained with GROUP BYs.  ...  SUM, MAX, MIN, AVG) of groups of tuples from a fact table, where the aggregation is the result of using clustering as a relaxation of a GROUP BY clause.  ... 
doi:10.1145/2513190.2513192 dblp:conf/dolap/Garcia-AlvaradoO13 fatcat:imhicwnrefgonic2hq6rd2vw4i

Data Cube Materialization and Mining over MapReduce

Arnab Nandi, Cong Yu, Philip Bohannon, Raghu Ramakrishnan
2012 IEEE Transactions on Knowledge and Data Engineering  
Computing interesting measures for data cubes and subsequent mining of interesting cube groups over massive datasets are critical for many important analyses done in the real world.  ...  Specifically, we identify an important subset of holistic measures and introduce MR-Cube, a MapReduce based framework for efficient cube computation and identification of interesting cube groups on holistic  ...  Cube analysis provides the users with a convenient way to discover insights from the data by computing aggregate measures (e.g., total sales) over all possible groups defined by the two dimensions (e.g  ... 
doi:10.1109/tkde.2011.257 fatcat:e4qm3n4zbjfizafqbqr4r3qb6a

Distributed cube materialization on holistic measures

Arnab Nandi, Cong Yu, Philip Bohannon, Raghu Ramakrishnan
2011 2011 IEEE 27th International Conference on Data Engineering  
Unlike commonly studied algebraic measures such as SUM that are amenable to parallel computation, efficient cube computation of holistic measures such as TOP-K is non-trivial and often impossible with  ...  We demonstrate that, unlike existing techniques which cannot scale to the 100 million tuple mark for our datasets, MR-Cube successfully and efficiently computes cubes with holistic measures over billion-tuple  ...  Cube analysis provides the users with a convenient way to discover insights from the data by computing aggregate measures (e.g., total sales) over all possible groups defined by the two dimensions (e.g  ... 
doi:10.1109/icde.2011.5767884 dblp:conf/icde/NandiYBR11 fatcat:sa6y445tkrdonnacoebqnwtpom

Fusion OLAP: Fusing the Pros of MOLAP and ROLAP Together for In-memory OLAP

Yansong Zhang, Yu Zhang, Shan Wang, Jiaheng Lu
2018 IEEE Transactions on Knowledge and Data Engineering  
This is achieved by mapping the relation tables into virtual multidimensional model and binding the multidimensional operations into a set of vector indexes to enable multidimensional computing on relation  ...  In particular, MOLAP is efficient in multidimensional computing at the cost of cube maintenance, while ROLAP reduces the data storage size at the cost of expensive multidimensional join operations.  ...  This work was supported by the National Natural Science Foundation of China (61732014, 61772533) and Academy of Finland (310321). Yu Zhang is the corresponding author.  ... 
doi:10.1109/tkde.2018.2867522 fatcat:vfrtcmiqsvfodeahtx6oake2uu

Computational methods and optimizations for containment and complementarity in web data cubes

Marios Meimaris, George Papastefanatos, Panos Vassiliadis, Ioannis Anagnostopoulos
2018 Information Systems  
We provide an experimental evaluation over real-world and synthetic datasets and we compare our approach to a SPARQL-based and a rule-based alternative, which prove to be inefficient for increasing input  ...  The increasing volume and diversity of these data pose the challenge of finding relations between them in a most efficient and accurate way, by taking advantage of their overlapping schemes.  ...  This work is supported by the EU-funded ICT project "DIACHRON" (agreement no 601043).  ... 
doi:10.1016/j.is.2018.02.010 fatcat:xtzdi5t6fvcnxhzz7czyhs3cxy

Scalable Informative Rule Mining

Guoyao Feng, Lukasz Golab, Divesh Srivastava
2017 2017 IEEE 33rd International Conference on Data Engineering (ICDE)  
The objective is to produce a concise set of rules (patterns) over the values of the dimension attributes that provide the most information about the distribution of a numeric measure attribute.  ...  Informative rules have recently been studied in several contexts, including data summarization, data cube exploration and data quality.  ...  Acknowledgements First, I would like to give the most sincere thanks to my supervisor, Professor Lukasz Golab, for his invaluable support and guidance, his patience and encouragement.  ... 
doi:10.1109/icde.2017.101 dblp:conf/icde/FengGS17 fatcat:iotetdbca5fmpgewbosnabtzkm

Partial materialization for online analytical processing over multi-tagged document collections

Grzegorz Drzadzewski, Frank Wm. Tompa
2015 Knowledge and Information Systems  
By adopting this strategy, summary measures dependent on centroids (including measures involving medoids, sets of representative documents, or sets of representative terms) can be efficiently computed.  ...  Such collections can be viewed as high-dimensional document cubes against which browsers and search systems can be applied in a manner similar to online analytical processing against data cubes.  ...  Thus a d-dimensional cuboid stores aggregated values in cells indexed by the possible values for each of the d unaggregated dimensions, and if each dimension is binary requires O(2^d) space.  ... 
doi:10.1007/s10115-015-0871-2 fatcat:nevi7g7fk5fx3in54cthrg5qam

Structure-Aware Machine Learning over Multi-Relational Databases

Maximilian Schleich
2021 Proceedings of the 2021 International Conference on Management of Data  
We have shown that the k-dimensional data cube with dimensions S_k requires the computation of 2^k group-by aggregate queries of the form (4.1).  ...  Sharing Computation Prior techniques for data cubes use a lattice of sub-queries to capture sharing across the group-by aggregates defining data cubes [65, 91].  ...  As a result, Rkmeans can scale easily to large datasets, and can compute the clusters with a much lower memory footprint than mlpack.  ... 
doi:10.1145/3448016.3461670 fatcat:xhqcdfxkdbdrnkn3gklevd6tmq
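
The snippet above notes that a k-dimensional data cube corresponds to 2^k group-by aggregates, one per subset of the dimensions. The following is a minimal illustrative sketch of that count, not code from any of the listed papers: the function name `cube`, the sample `rows`, and the choice of SUM as the measure are all assumptions made for the example, and each grouping set is computed naively in its own pass with no sharing across the lattice.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Return {grouping_set: {group_key: SUM(measure)}} for every subset of dims.

    Hypothetical helper for illustration: one independent pass per grouping set.
    """
    result = {}
    for size in range(len(dims) + 1):
        for group_by in combinations(dims, size):
            totals = defaultdict(int)
            for row in rows:
                # The group key is the tuple of values of the grouped dimensions.
                key = tuple(row[d] for d in group_by)
                totals[key] += row[measure]
            result[group_by] = dict(totals)
    return result


# Made-up fact table with k = 2 dimensions and one numeric measure.
rows = [
    {"region": "EU", "year": 2020, "sales": 10},
    {"region": "EU", "year": 2021, "sales": 5},
    {"region": "US", "year": 2020, "sales": 7},
]

result = cube(rows, ["region", "year"], "sales")
print(len(result))            # number of grouping sets: 2**2
print(result[("region",)])    # totals per region
```

The empty grouping set `()` holds the grand total, and the full set `("region", "year")` holds the finest-grained groups; the lattice-based techniques the snippet refers to aim to avoid recomputing each of these from scratch.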

Materialization of fragmented views in multidimensional databases

Matteo Golfarelli, Vittorio Maniezzo, Stefano Rizzi
2004 Data & Knowledge Engineering  
In the classical approach to materialization, each view includes all and only the measures of the cube it aggregates.  ...  The most effective technique to enhance performances of multidimensional databases consists in materializing redundant aggregates called views.  ...  dimension tables (grouping set) and computes summarized values for measures by means of some aggregation operators (see Fig. 1).  ... 
doi:10.1016/j.datak.2003.11.001 fatcat:7fjokqe2xvfvvanspt3gz44use

Beyond Roll-Up's and Drill-Down's: An Intentional Analytics Model to Reinvent OLAP (long-version) [article]

Panos Vassiliadis and Patrick Marcel and Stefano Rizzi
2018 arXiv   pre-print
data cell of a cube with information about the models that pertain to it -- practically converting the small parts that build up the models to data that annotate each cell.  ...  We exploit this data-to-model mapping to provide highlights of the data, by isolating data and models that maximize the delivery of new information to the user.  ...  We are particularly thankful to the reviewers of both this paper and its preliminary version in DOLAP 2018 for their comments that helped us enrich the breadth of the related work, and the clarity of concepts  ... 
arXiv:1812.07854v1 fatcat:5oflgb75hfgwlgfx3gy3u4pwmy

Clustering-Based Materialized View Selection in Data Warehouses [chapter]

Kamel Aouiche, Pierre-Emmanuel Jouve, Jérôme Darmont
2006 Lecture Notes in Computer Science  
In this paper, we propose a framework for materialized view selection that exploits a data mining technique (clustering), in order to determine clusters of similar queries.  ...  A judicious choice of views must be cost-driven and influenced by the workload experienced by the system.  ...  D_d), where S is a conjunction of simple range predicates on dimension table attributes, G is a set of attributes from dimension tables D_i (grouping set), and M is a set of aggregated measures each defined  ... 
doi:10.1007/11827252_9 fatcat:b5ymtpysenanpikzxim5r5wqgy

Clustering-Based Materialized View Selection in Data Warehouses [article]

Kamel Aouiche and Pierre-Emmanuel Jouve and Jerome Darmont
2007 arXiv   pre-print
In this paper, we propose a framework for materialized view selection that exploits a data mining technique (clustering), in order to determine clusters of similar queries.  ...  A judicious choice of views must be cost-driven and influenced by the workload experienced by the system.  ...  . ⋈ D_d), where S is a conjunction of simple range predicates on dimension table attributes, G is a set of attributes from dimension tables D_i (grouping set), and M is a set of aggregated measures each  ... 
arXiv:cs/0703114v1 fatcat:wegfq3caknf55e25zby2l7ynfa

Cubrick

Pedro Pedreira, Chris Croswhite, Luis Bona
2016 Proceedings of the VLDB Endowment  
Cubrick has a strictly multidimensional data model composed of cubes, dimensions and metrics, supporting sub-second OLAP operations such as slice and dice, roll-up and drill-down over terabytes of data  ...  All data stored in Cubrick is range partitioned by every dimension and stored within containers called bricks in an unordered and sparse fashion, providing high data ingestion rates and indexed access  ...  For instance, a query that groups by the dimension Region from the example shown in Figure 1, is expected to have an output of, at most, 8 rows, the maximum cardinality defined for that dimension.  ... 
doi:10.14778/3007263.3007269 fatcat:krf5qjcnjrg47m34gvb53apum4

On the formation of trapezium-like systems

Richard J. Allison, Simon P. Goodwin
2011 Monthly notices of the Royal Astronomical Society  
We perform ensembles of N-body simulations of the evolution of N=1000 Orion-like clusters with initial conditions ranging from cool and clumpy to relatively smooth and relaxed.  ...  We investigate the formation and evolution of high-order massive star multiples similar to the Trapezium in the Orion Nebula Cluster.  ...  This work has made use of the Iceberg computing facility, part of the White Rose Grid computing facilities at the University of Sheffield.  ... 
doi:10.1111/j.1365-2966.2011.18849.x fatcat:7jplc275gnb3zp6btkonntiw3q

Distribution of local relaxation events in an aging three-dimensional glass: Spatiotemporal correlation and dynamical heterogeneity

Anton Smessaert, Jörg Rottler
2013 Physical Review E  
Dynamical heterogeneity is spatially resolved as the aggregation of hops into clusters, and we analyze their volume distribution and growth during aging.  ...  We introduce an efficient algorithm to directly identify hops during the simulation, which allows the creation of a map of relaxation events for the whole system.  ...  This work was supported by the Natural Sciences and Engineering Council of Canada (NSERC). Computing time was provided by WestGrid.  ... 
doi:10.1103/physreve.88.022314 pmid:24032839 fatcat:imbknox4ajfbnf2soxujnp6hca