A Data Mining-Based OLAP Aggregation of Complex Data

Riadh Ben Messaoud, Omar Boussaid, Sabine Loudcher Rabaséda
2006 International Journal of Data Warehousing and Mining  
Nowadays, most organizations deal with complex data that come in different formats and from different sources. The XML formalism is evolving and becoming a promising solution for modelling and warehousing these data in decision support systems. Nevertheless, classical OLAP tools are still not capable of analyzing such data. In this paper, we associate OLAP and data mining to cope with advanced analysis of complex data. We provide a generalized OLAP operator, called OpAC, based on agglomerative hierarchical clustering (AHC). OpAC is ... ted for all types of data since it deals with data cubes modelled within XML. Our operator produces significant aggregates of facts that express semantic similarities. Evaluation criteria for partitions of aggregates are proposed in order to assist the choice of the best partition. Furthermore, we developed a Web application for our operator. We also provide performance experiments and conduct a case study on XML documents dealing with the breast cancer research domain.

... measures. For example, a user wants to observe the sum of the sales amounts of products according to years and regions. This aggregation should use attributes to describe the targeted facts and perform computations over their measures. In recent years, as more organizations have come to see the web as an integral part of their communication and business, we have been dealing with a proliferation of new data formats. These data are complex, quite different from classical data, and harder to process. They need new methodologies to be warehoused first, and then to be analyzed. XML (eXtensible Markup Language) is providing some promising solutions for integrating complex
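The sum-of-sales example above can be illustrated with a minimal sketch of a classical OLAP-style aggregation. This is not the OpAC operator described in the paper. The fact rows, the field layout, and the function name `sum_sales_by_year_region` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows (not from the paper): (year, region, product, sales_amount).
FACTS = [
    (2004, "North", "A", 120.0),
    (2004, "North", "B", 80.0),
    (2004, "South", "A", 50.0),
    (2005, "North", "A", 200.0),
    (2005, "South", "B", 75.0),
]


def sum_sales_by_year_region(facts):
    """Group facts by the (year, region) attributes and sum the sales measure."""
    totals = defaultdict(float)
    for year, region, _product, amount in facts:
        # The attributes describe the targeted facts; the measure is aggregated.
        totals[(year, region)] += amount
    return dict(totals)


totals = sum_sales_by_year_region(FACTS)
for (year, region), amount in sorted(totals.items()):
    print(year, region, amount)
```

Here the attributes (year, region) select and group the facts, while the computation runs over the measure (sales amount), which is the pattern the example sentence describes.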
doi:10.4018/jdwm.2006100101 fatcat:ok34fyhiknfgldkzujes7w45dm