Construction of Confidence Regions for Isotopic Abundance Patterns in LC/MS Data Sets for Rigorous Determination of Molecular Formulas

Andreas Ipsen, Elizabeth J. Want, Timothy M. D. Ebbels
2010 Analytical Chemistry  
It has long been recognized that estimates of isotopic abundance patterns may be instrumental in identifying the many unknown compounds encountered when conducting untargeted metabolic profiling using Liquid Chromatography-Mass Spectrometry. While numerous methods have been developed for assigning heuristic scores to rank the degree of fit of the observed abundance patterns with theoretical ones, little work has been done to quantify the errors that are associated with the measurements made.
Thus, it is generally not possible to determine, in a statistically meaningful manner, whether a given chemical formula would likely be capable of producing the observed data. In this article, we present a method for constructing confidence regions for the isotopic abundance patterns based on the fundamental distribution of the ion arrivals. Moreover, we develop a method for doing so that makes use of the information pooled together from the measurements obtained across an entire chromatographic peak, as well as from any adducts, dimers and fragments observed in the mass spectra. This greatly increases the statistical power, thus enabling the analyst to rule out a potentially much larger number of candidate formulas while explicitly guarding against false positives. In practice, small departures from the model assumptions are possible due to detector saturation and interferences between adjacent isotopologues. While these factors form impediments to statistical rigor, they can to a large extent be overcome by restricting the analysis to moderate ion counts and by applying robust statistical methods. Using real metabolic data, we demonstrate that the method is capable of reducing the number of candidate formulas by a substantial amount, even when no bromine or chlorine atoms are present. We argue that further developments in our ability to characterize the data mathematically could enable much more powerful statistical analyses.
... construction of the simple multinomial confidence region follow immediately. However, owing to the distorting effects of the heavy tails of the mass peaks, this is not generally the case, and the distribution of the x²(M) statistics has a somewhat heavier tail than the appropriate χ²-distribution. It therefore remains to determine whether the robust confidence region is sufficiently small to be useful in excluding candidate formulas.
Preparation of synthetic urine.
Eighty-three endogenous mammalian metabolites were weighed into a 1 L bottle and dissolved in 1 L of HPLC-grade water (Sigma-Aldrich, St Louis, MO). The remaining solids were removed by vacuum filtration. The final metabolite concentrations were targeted to fall between 1 and 20 mM, with sodium azide added at 0.05% v/v as a preservative. In order to eliminate the effect of salt suppression in the sample introduction interfaces, the ordinarily high levels of inorganic salts found in urine were not added. The stock solution was stored at -80 °C.
Instrumentation. The synthetic urine samples (5 µL) were injected onto a 2.1 × 100 mm (1.7 µm) HSS T3 Acquity column (Waters Corporation, Milford, USA) and were eluted using an 18 min gradient of 100% A to 100% B (A = water, 0.1% formic acid; B = acetonitrile, 0.1% formic acid). The column temperature was 40 °C, the sample temperature was 4 °C, and the flow rate was 500 µL/min. Samples were analyzed using a UPLC system (UPLC Acquity, Waters Ltd., Elstree, U.K.) coupled online to a Q-ToF Premier mass spectrometer (Waters MS Technologies, Ltd., Manchester, U.K.) in both positive and negative ion electrospray mode, using a scan range of 50-1000 m/z and a scan time of 0.08 s. A total of three technical replicates were run. The data were acquired in continuum mode in order to obtain data that were as raw as possible. Similarly, the Dynamic Range Enhancement (DRE) lens, which the Q-ToF Premier employs in order to minimize detector saturation, was switched off.
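The idea of a simple multinomial confidence region, as mentioned in the abstract, can be illustrated with a short sketch. It assumes that the ion counts for the isotopologues of one compound are multinomially distributed, that counts from scans across a chromatographic peak can be pooled by summation, and that a candidate formula is retained when its Pearson statistic falls below the χ² critical value. The function names, the ion counts, and the two candidate abundance patterns are hypothetical and chosen only for illustration; this is not the robust procedure developed in the paper.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Start from a closed form, then use Q(a + 1, h) = Q(a, h) + h^a e^-h / Gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


def pool_counts(scans):
    """Sum the ion counts of each isotopologue over all scans of a peak."""
    return [sum(column) for column in zip(*scans)]


def pearson_statistic(counts, probs):
    """Pearson chi-square statistic of observed counts against a candidate pattern."""
    total = sum(counts)
    return sum((n - total * p) ** 2 / (total * p) for n, p in zip(counts, probs))


def in_confidence_region(counts, probs, alpha=0.05):
    """Return (retained, statistic, p_value) for one candidate abundance pattern."""
    stat = pearson_statistic(counts, probs)
    p_value = chi2_sf(stat, len(counts) - 1)
    return p_value >= alpha, stat, p_value


# Hypothetical ion counts (M, M+1, M+2) from two scans of the same peak.
scans = [[450, 40, 10], [450, 40, 10]]
pooled = pool_counts(scans)

# Hypothetical theoretical abundance patterns of two candidate formulas.
candidate_a = [0.90, 0.08, 0.02]
candidate_b = [0.50, 0.30, 0.20]

keep_a, stat_a, p_a = in_confidence_region(pooled, candidate_a)
keep_b, stat_b, p_b = in_confidence_region(pooled, candidate_b)
print(keep_a, keep_b)
```

Pooling raises the total ion count, which shrinks the region of abundance patterns compatible with the data and so allows more candidates to be excluded at the same significance level. Because the source notes that the observed statistics are heavier-tailed than the χ²-distribution, a plain test of this kind would likely reject true formulas more often than the nominal level suggests.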
doi:10.1021/ac101278x pmid:20690638 pmcid:PMC2930401 fatcat:6yylzv7bgrgolhw5f3yv3bawoy