SQL-Based KDD with Infobright's RDBMS: Attributes, Reducts, Trees [chapter]

Jakub Wróblewski, Sebastian Stawicki
2014 Lecture Notes in Computer Science  
We present a framework for the KDD process implemented using SQL procedures, consisting of constructing new attributes, finding rough set-based reducts, and inducing decision trees. We focus particularly on attribute reduction, which is especially important for high-dimensional data sets. The main technical contribution of this paper is a complete framework for calculating short reducts using SQL queries on data stored in a relational form, without the need for any external tools generating or ... their syntax. A case study of large real-world data is presented. The paper also recalls some other examples of SQL-based data mining implementations. The experimental results are based on Infobright's analytic RDBMS, whose performance characteristics perfectly fit the requirements of the presented algorithms.
doi:10.1007/978-3-319-08729-0_3 fatcat:i4wt3l3tvbgrpih7ze35ahyabi
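
The abstract describes calculating reducts with SQL queries over relationally stored data. As a rough illustration of that general idea only, and not the authors' procedure, the sketch below checks with a single GROUP BY query whether a subset of attributes still determines the decision, then greedily drops attributes that are not needed. Everything here is an assumption made for illustration: the table name `decision_table`, the columns `a`, `b`, `c`, `d`, the toy data, the helper names, and the use of Python's built-in `sqlite3` in place of Infobright's RDBMS. It also assumes the full attribute set is consistent with the decision.

```python
import sqlite3


def is_consistent(conn, table, attrs, decision):
    """Return True if rows that agree on `attrs` never disagree on `decision`."""
    if not attrs:
        # With no attributes, all rows fall into one group.
        (n,) = conn.execute(
            f"SELECT COUNT(DISTINCT {decision}) FROM {table}"
        ).fetchone()
        return n <= 1
    cols = ", ".join(attrs)
    # Count groups (indiscernibility classes) holding more than one decision.
    (conflicts,) = conn.execute(
        f"SELECT COUNT(*) FROM ("
        f"SELECT 1 FROM {table} GROUP BY {cols} "
        f"HAVING COUNT(DISTINCT {decision}) > 1)"
    ).fetchone()
    return conflicts == 0


def greedy_reduct(conn, table, attrs, decision):
    """Backward elimination: drop each attribute whose removal keeps consistency.

    Hypothetical helper; one simple heuristic, not the paper's algorithm.
    """
    kept = list(attrs)
    for attr in attrs:
        trial = [a for a in kept if a != attr]
        if is_consistent(conn, table, trial, decision):
            kept = trial
    return kept


# Toy data (invented): d depends on a and b together, c is irrelevant.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE decision_table (a INTEGER, b INTEGER, c INTEGER, d TEXT)")
conn.executemany(
    "INSERT INTO decision_table VALUES (?, ?, ?, ?)",
    [
        (0, 0, 0, "no"),
        (0, 1, 1, "yes"),
        (1, 0, 0, "yes"),
        (1, 1, 1, "no"),
        (0, 0, 1, "no"),
        (1, 1, 0, "no"),
    ],
)

reduct = greedy_reduct(conn, "decision_table", ["a", "b", "c"], "d")
print(reduct)  # prints ['a', 'b']
```

Because consistency can only be lost, never regained, when attributes are removed, a single pass of this kind leaves a subset from which no further attribute can be dropped. Which attributes survive depends on the order in which they are tried, so the result is one reduct and not necessarily the shortest.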