Using automated planning for improving data mining processes

Susana Fernández, Tomás de la Rosa, Fernando Fernández, Rubén Suárez, Javier Ortiz, Daniel Borrajo, David Manzano
2013 Knowledge Engineering Review (Print)
This paper presents a distributed architecture for automating data mining processes using standard languages. Data mining is a difficult task that relies on an exploratory, analytic process of examining large quantities of data in order to discover meaningful patterns. The increasing heterogeneity and complexity of available data require expert knowledge of how to combine the many alternative data-mining tasks that can process the data. Here, we describe data-mining tasks in terms of Automated Planning, which allows us to automate the construction of the data-mining knowledge flow. The work builds on standards defined in both the data mining and automated planning communities. We use PMML (Predictive Model Markup Language) to describe data-mining tasks. From the PMML, a problem description in PDDL (Planning Domain Definition Language) can be generated, so any current planning system can be used to produce a plan. This plan is then translated into a data-mining workflow description in KFML format (the Knowledge Flow file format of the WEKA tool), so that the workflow can be executed in WEKA (Waikato Environment for Knowledge Analysis).
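
The translation chain described above (task description, then PDDL problem, then plan, then workflow) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `MiningTask` fields stand in for values that would be read from a PMML file, and the domain name, predicates (`available`, `target`, `model-built`, `evaluated`), and plan actions are all hypothetical names chosen for illustration. The workflow is a plain list of linked steps, not actual KFML.

```python
from dataclasses import dataclass


@dataclass
class MiningTask:
    """Hypothetical stand-in for the fields that would be read from a PMML file."""
    name: str
    dataset: str
    target: str
    goal: str  # e.g. "classification"


def to_pddl_problem(task, domain="data-mining"):
    """Render a PDDL problem string; predicate names are assumptions, not the paper's."""
    lines = [
        f"(define (problem {task.name})",
        f"  (:domain {domain})",
        f"  (:objects {task.dataset} - dataset {task.target} - attribute)",
        f"  (:init (available {task.dataset}) (target {task.dataset} {task.target}))",
        f"  (:goal (and (model-built {task.dataset} {task.goal}) (evaluated {task.dataset}))))",
    ]
    return "\n".join(lines)


def plan_to_workflow(plan):
    """Turn a sequential plan into workflow steps, each fed by the previous step."""
    steps = []
    for i, action in enumerate(plan):
        # "(train-classifier iris class)" -> operator "train-classifier", args ["iris", "class"]
        operator, *args = action.strip("() \n").split()
        steps.append({
            "id": i,
            "operator": operator,
            "args": args,
            "input_from": i - 1 if i > 0 else None,
        })
    return steps


if __name__ == "__main__":
    task = MiningTask("p1", "iris", "class", "classification")
    print(to_pddl_problem(task))
    # A plan as a planner might return it; these action names are made up.
    for step in plan_to_workflow(["(load-dataset iris)", "(train-classifier iris class)"]):
        print(step)
```

In this sketch the planner is left out entirely: any PDDL planner could sit between the two functions, which is the point of using a standard language at each boundary.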
doi:10.1017/s0269888912000409 fatcat:6nagvnjaovdntof3nce7eaiqgu