Analysing chromatographic data using data mining to monitor petroleum content in water [chapter]

Geoffrey Holmes, Dale Fletcher, Peter Reutemann, Eibe Frank
2009 Information Technologies in Environmental Engineering  
Chromatography is an important analytical technique that has widespread use in environmental applications. A typical application is the monitoring of water samples to determine if they contain petroleum. These tests are mandated in many countries to enable environmental agencies to determine if tanks used to store petrol are leaking into local water systems. Chromatographic techniques, typically using gas or liquid chromatography coupled with mass spectrometry, allow an analyst to detect a vast array of compounds, potentially in the order of thousands. Accurate analysis relies heavily on the skills of a limited pool of experienced analysts utilising semi-automatic techniques to analyse these datasets, making the outcomes subjective. The focus of current laboratory data analysis systems has been on refinements of existing approaches. The work described here represents a paradigm shift achieved through applying data mining techniques to tackle the problem. These techniques are compelling because the efficacy of preprocessing methods, which are essential in this application area, can be objectively evaluated. This paper presents preliminary results using a data mining framework to predict the concentrations of petroleum compounds in water samples. Experiments demonstrate that the framework can be used to produce models of sufficient accuracy, measured in terms of root mean squared error and correlation coefficients, to offer the potential for significantly reducing the time spent by analysts on this task.
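The abstract names root mean squared error and the correlation coefficient as its accuracy measures but does not define them. As a minimal sketch, assuming the standard definitions (RMSE as the square root of the mean squared difference, and the Pearson correlation between actual and predicted values), the two measures could be computed as follows. The function names and the sample concentrations are hypothetical and chosen only for illustration; they are not taken from the chapter.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_p)


# Hypothetical concentrations (illustrative values only, not data from the chapter).
actual_conc = [0.5, 1.0, 2.0, 4.0]
predicted_conc = [0.6, 0.9, 2.2, 3.8]

print("RMSE:", rmse(actual_conc, predicted_conc))
print("Correlation:", pearson_r(actual_conc, predicted_conc))
```

A lower RMSE means predicted concentrations sit closer to the reference values, while a correlation near 1 means the predictions track the reference values' ordering and spread, so the two measures are usually read together.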
doi:10.1007/978-3-540-88351-7_21 dblp:conf/itee/HolmesFRF09 fatcat:mishncgkwvcihibtexxtds3foi