Empirical Evaluation of Big Data Analytics using Design of Experiment: Case Studies on Telecommunication Data

Samneet Singh, Yan Liu, Wayne Ding, Zheng Li
2016 Services Transactions on Big Data  
Data analytics involves the process of data collection, data analysis, and report generation. Data mining workflow tools usually orchestrate this process. The data analysis step in turn consists of a series of machine learning algorithms. There exists a variety of data mining tools and machine learning algorithms. Each tool or algorithm has its own set of features, which become factors that affect both the functional and nonfunctional attributes of the data analytics system. Given ...-specific requirements of data analytics, understanding the effects of these factors and their combinations provides a guideline for selecting workflow tools and machine learning algorithms. In this paper, we develop an empirical evaluation method based on the principle of Design of Experiment. We apply this method to evaluate data mining tools and machine learning algorithms for building big data analytics on telecommunication monitoring data. Two case studies are conducted to provide insight into the relations between the requirements of data analytics and the choice of a tool or algorithm in the context of data analysis workflows. The demonstration also shows that our evaluation method facilitates the replication of this evaluation study and can be conveniently extended to evaluate other tools and algorithms.
doi:10.29268/stbd.2016.3.2.1 fatcat:kgv2koygkjcfbpg3ps4z45uvbm