Metabolomics by numbers: acquiring and understanding global metabolite data
Trends in Biotechnology
In this postgenomic era, there is a specific need to assign function to orphan genes in order to validate potential targets for drug therapy and to discover new biomarkers of disease. Metabolomics is an emerging field that is complementary to the other 'omics and is proving to have unique advantages. As in transcriptomics or proteomics, a typical metabolic fingerprint or metabolomic experiment is likely to generate thousands of data points, of which only a handful might be needed to describe the problem adequately. Extracting the most meaningful elements of these data is thus key to generating useful new knowledge with mechanistic or explanatory power.
Since the completion of the first whole-genome sequence of a free-living organism (that of the bacterium Haemophilus influenzae, although the sequencing of a human mitochondrion long predates it), we have begun to realize the paucity of our knowledge with respect to the existence, let alone the function, of the novel genes thereby uncovered. Sequencing of the microbiologist's pet organism, Escherichia coli, revealed that a staggering 38% of the total 4288 open reading frames had not been observed or studied before. More recently, completion of the human genome sequence [4,5] has further accelerated the demand for determining the biochemical function of orphan genes and for validating them as molecular targets for therapeutic intervention. The search for biomarkers that can serve as indicators of disease progression or response to therapeutic intervention has also intensified. Functional studies have thus emphasized analyses at the level of gene expression (transcriptomics), protein translation (proteomics) including post-translational modifications, and the metabolic network (metabolomics), with a view to a 'systems biology' approach of defining the phenotype and bridging the genotype-to-phenotype gap.
There is active debate in the research community over the exact definition of the 'metabolome', but it was first defined by Oliver et al. as the quantitative complement of all of the low molecular weight molecules present in cells in a particular physiological or developmental state. Another definition states that the metabolome consists 'only of those native small molecules (definable nonpolymeric compounds) that are participants in general metabolic reactions and that are required for the maintenance, growth and normal function of a cell'.
Although metabolomics is certainly 'complementary' to transcriptomics and proteomics, it might be seen to have special advantages. In particular, it is known from both the theory underlying metabolic control analysis [9,10] and experiment that, although changes in the quantities of individual enzymes might be expected to have little effect on metabolic fluxes, they can and do have significant effects on the concentrations of numerous individual metabolites. In addition, the metabolome is further down the line from gene to function and so reflects more closely the activities of the cell at a functional level. Thus, as the 'downstream' result of gene expression, changes in the metabolome are expected to be amplified relative to changes in the transcriptome and the proteome. As expected, metabolic fluxes (at least as exemplified by glycolysis in trypanosomes) are not regulated by gene expression alone, which provides a further rationale for pursuing metabolomics. In this review we describe the growing field of metabolomics, the needs and means by which metabolome data can be generated, and how this information can be turned into knowledge.
Measuring the metabolome
The ultimate starting point of a metabolomic experiment is to quantify all of the metabolites in a cellular system (i.e. the cell or tissue in a given state at a given point in time). Currently this is impossible, given the lack of simple automated analytical strategies that can achieve this in a reproducible and robust way. The main challenges are the chemical complexity and heterogeneity of metabolites, the dynamic range of the measuring technique, the throughput of the measurements, and the extraction protocols. Ideally, metabolomics should be non-biased but, considering the above, at best it can be thought of as 'non-targeted'.
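The control-analysis argument made earlier, that altering the amount of one enzyme can leave flux almost untouched while shifting metabolite concentrations substantially, can be illustrated with a deliberately simple toy model. The sketch below is not taken from the cited studies: the two-step pathway, the rate laws, and all parameter values (`k1`, `kcat2`, `km`) are assumptions chosen only for illustration. Step 1 supplies the intermediate S at a constant rate, and step 2 removes it with irreversible Michaelis-Menten kinetics operating close to saturation.

```python
# Toy two-step pathway: X0 --(enzyme 1)--> S --(enzyme 2)--> X1
# Assumptions (illustrative only, not from the review):
#   step 1: constant supply, v1 = e1 * k1 (insensitive to S)
#   step 2: irreversible Michaelis-Menten, v2 = e2 * kcat2 * S / (km + S)


def steady_state(e1, e2, k1=1.0, kcat2=1.25, km=1.0):
    """Return (flux, concentration of S) at steady state, where v1 == v2."""
    flux = e1 * k1
    vmax2 = e2 * kcat2
    if vmax2 <= flux:
        raise ValueError("no steady state: step 2 cannot carry the supply flux")
    # Solve e2 * kcat2 * S / (km + S) = flux for S.
    conc = km * flux / (vmax2 - flux)
    return flux, conc


def relative_change(new, old):
    """Fractional change of `new` relative to `old`."""
    return (new - old) / old


# Reference state versus a state with 10% less enzyme 2.
j_ref, s_ref = steady_state(e1=1.0, e2=1.0)
j_low, s_low = steady_state(e1=1.0, e2=0.9)

print(f"flux change: {relative_change(j_low, j_ref):+.0%}")
print(f"[S] change:  {relative_change(s_low, s_ref):+.0%}")
```

Under these assumed parameters, a 10% reduction in enzyme 2 leaves the steady-state flux unchanged while the concentration of S doubles (from 4 to 8 in arbitrary units). In control-analysis terms, the flux control coefficient of enzyme 2 is zero in this toy case, whereas its concentration control coefficient is -5. Real pathways are far more complex, but the example shows why metabolite concentrations can be a more sensitive readout than flux.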
Moreover, the paucity of our knowledge with respect to known metabolites is staggering, although perhaps
Corresponding author: Royston Goodacre (firstname.lastname@example.org).