Multivariate techniques and their application in nutrition: a metabolomics case study

E. Katherine Kemsley, Gwénaëlle Le Gall, Jack R. Dainty, Andrew D. Watson, Linda J. Harvey, Henri S. Tapp, Ian J. Colquhoun
2007 British Journal of Nutrition  
The post-genomic technologies are generating vast quantities of data, but many nutritional scientists are not trained or equipped to analyse them. In high-resolution NMR spectra of urine, for example, the number and complexity of spectral features mean that computational techniques are required to interrogate and display the data in a manner intelligible to the researcher. In addition, there are often multiple underlying biological factors influencing the data, and it is difficult to pinpoint which are having the most significant effect. This is especially true in nutritional studies, where small variations in diet can trigger multiple changes in gene expression and metabolite concentration. One class of computational tools useful for analysing such highly multivariate data includes the well-known 'whole spectrum' methods of principal component analysis and partial least squares. In this work, we present a nutritional case study in which NMR data generated from a human dietary Cu intervention study are analysed using multivariate methods, and the advantages and disadvantages of each technique are discussed. It is concluded that an alternative approach, called feature subset selection, will be important in this type of work; here we have used a genetic algorithm to identify the small peaks (arising from metabolites of low concentration) that have been altered significantly following a dietary intervention.
doi:10.1017/s0007114507685365 pmid:17381968 fatcat:gtdvs6y42vagfougrv33svqpfu
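
The abstract names feature subset selection by genetic algorithm but gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. Everything in it is an assumption made for illustration: the synthetic "binned spectra" produced by `make_data`, the leave-one-out nearest-centroid fitness, the per-feature penalty, and all parameter values (`pop_size`, `generations`, `p_mut`).

```python
import random

def make_data(rng, n_per_group=12, n_features=30, informative=(3, 17), shift=2.0):
    """Synthetic stand-in for binned spectra (hypothetical data, not from the study).
    Rows are samples; columns are peak intensities. Group 1 is shifted in a few columns."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y

def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen columns."""
    if not subset:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dists = {}
        for label in (0, 1):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroid = [sum(r[j] for r in rows) / len(rows) for j in subset]
            dists[label] = sum((X[i][j] - c) ** 2 for j, c in zip(subset, centroid))
        if min(dists, key=dists.get) == y[i]:
            correct += 1
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Assumed fitness: classification accuracy minus a small cost per selected feature."""
    subset = [j for j, bit in enumerate(mask) if bit]
    return loo_accuracy(X, y, subset) - penalty * len(subset)

def run_ga(X, y, rng, pop_size=30, generations=25, p_mut=0.03):
    """Evolve bit masks over features; returns (best_mask, best_fitness_per_generation)."""
    n = len(X[0])
    pop = [[1 if rng.random() < 0.1 else 0 for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(((fitness(X, y, m), m) for m in pop),
                        key=lambda t: t[0], reverse=True)
        history.append(scored[0][0])
        next_pop = [scored[0][1][:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            # Tournament selection of two parents, three candidates each.
            a = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            b = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            cut = rng.randrange(1, n)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop
    best_score, best = max(((fitness(X, y, m), m) for m in pop), key=lambda t: t[0])
    history.append(best_score)
    return best, history

if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_data(rng)
    best, history = run_ga(X, y, rng)
    print("selected columns:", [j for j, bit in enumerate(best) if bit])
    print("best fitness per generation:", [round(h, 3) for h in history])
```

The size penalty is one simple way to push the search toward small subsets, which matches the stated aim of isolating a few low-intensity peaks; a real analysis would need proper validation on held-out samples, since a genetic algorithm can easily overfit when features far outnumber samples.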