Multi-Sorted Inverse Frequent Itemsets Mining [article]

Domenico Saccà, Edoardo Serra, Pietro Dicosta, Antonio Piccolo
2013 arXiv pre-print
The development of novel platforms and techniques for emerging "Big Data" applications requires the availability of real-life datasets for data-driven experiments, which are however out of reach for academic research in most cases as they are typically proprietary. A possible solution is to use synthesized datasets that reflect patterns of real ones in order to ensure high quality experimental findings. A first step in this direction is to use inverse mining techniques such as inverse frequent itemset mining (IFM) that consists of generating a transactional database satisfying given support constraints on the itemsets in an input set, that are typically the frequent ones. This paper introduces an extension of IFM, called many-sorted IFM, where the schemes for the datasets to be generated are those typical of Big Tables as required in emerging big data applications, e.g., social network analytics.
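To make the notion of "satisfying given support constraints" concrete, the following minimal Python sketch checks whether a candidate transactional database meets a set of constraints. It is an illustration only and not the paper's algorithm: the interval form of each constraint (a lower and an upper bound on absolute support) and the names `support` and `satisfies` are assumptions introduced here, and the sketch only verifies a database rather than generating one.

```python
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]


def support(database: List[Itemset], itemset: Itemset) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for transaction in database if itemset <= transaction)


def satisfies(
    database: List[Itemset],
    constraints: Dict[Itemset, Tuple[int, int]],
) -> bool:
    """Return True if each itemset's support lies in its (low, high) interval.

    The interval form of the constraints is an assumption of this sketch.
    """
    return all(
        low <= support(database, itemset) <= high
        for itemset, (low, high) in constraints.items()
    )


# Hypothetical toy database of three transactions.
database = [
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
    frozenset({"c"}),
]

# Hypothetical constraints: {a, b} must occur exactly twice, {c} once or twice.
constraints = {
    frozenset({"a", "b"}): (2, 2),
    frozenset({"c"}): (1, 2),
}

result = satisfies(database, constraints)
```

An IFM procedure works in the opposite direction: given only `constraints`, it has to construct some `database` for which a check like `satisfies` holds.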
arXiv:1310.3939v1 fatcat:aj54nykpfzakbkenmpl63fx3fy