Computational Estimation by Scientific Data Mining with Classical Methods to Automate Learning Strategies of Scientists
ACM Transactions on Knowledge Discovery from Data
Experimental results are often plotted as 2-dimensional graphical plots (aka graphs) in scientific domains depicting dependent versus independent variables to aid visual analysis of processes. Repeatedly performing laboratory experiments consumes significant time and resources, motivating the need for computational estimation. The goals are to estimate the graph obtained in an experiment given its input conditions, and to estimate the conditions that would lead to a desired graph. Existing
... ation approaches often do not meet accuracy and efficiency needs of targeted applications. We develop a computational estimation approach called AutoDomainMine that integrates clustering and classification over complex scientific data in a framework so as to automate classical learning methods of scientists. Knowledge discovered thereby from a database of existing experiments serves as the basis for estimation. Challenges include preserving domain semantics in clustering, finding matching strategies in classification, striking a good balance between elaboration and conciseness while displaying estimation results based on needs of targeted users, and deriving objective measures to capture subjective user interests. These and other challenges are addressed in this work. The AutoDomainMine approach is used to build a computational estimation system, rigorously evaluated with real data in Materials Science. Our evaluation confirms that AutoDomainMine provides desired accuracy and efficiency in computational estimation. It is extendable to other science and engineering domains as proved by adaptation of its sub-processes within fields such as Bioinformatics and Nanotechnology.