ABSTRACT
Tile-based Fisher ratio (F-ratio) analysis has recently been developed and validated for discovery-based studies of highly complex data collected using comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC×GC-TOFMS). In previous studies, interpretation and utilization of F-ratio hit lists has relied upon manual decomposition and quantification performed by chemometric methods such as parallel factor analysis (PARAFAC), or upon manual translation of the F-ratio hit list information to peak table quantitative information provided by the instrument software (ChromaTOF). Both of these quantification approaches are bottlenecks in the overall workflow. To address this issue, a more automatable approach to provide accurate relative quantification for F-ratio analyses was investigated, based upon the mass spectral selectivity provided via the F-ratio spectral output. Diesel fuel spiked with 15 analytes at four concentration levels (80, 40, 20, and 10 ppm) produced three sets of two-class comparisons that were submitted to tile-based F-ratio analysis to obtain three hit lists, with an F-ratio spectrum for each hit. A novel algorithm that calculates the signal ratio (S-ratio) between two classes (e.g., 80 ppm versus 40 ppm) was applied to all mass channels (m/z) in the F-ratio spectrum for each hit. A lack of fit (LOF) metric was utilized as a measure of peak purity and combined with F-ratio and p-values to study the relationship of each of these metrics with m/z purity. Application of a LOF threshold coupled with a p-value threshold yielded a subset of the purest m/z for each of the 15 spiked analytes, as evidenced by the low deviations (< 5%) in S-ratio relative to the true concentration ratio. A key outcome of this study was to demonstrate the isolation of pure m/z without the need for higher level signal decomposition algorithms.
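The S-ratio idea can be illustrated with a minimal sketch. It assumes the S-ratio at each m/z is the ratio of the mean signal in one class to the mean signal in the other; the abstract does not give the exact formula, and the LOF and p-value steps are not reproduced here. The function names and the signal values are hypothetical.

```python
from statistics import mean

def s_ratios(class_a, class_b):
    """Per-m/z ratio of mean signal in class A to mean signal in class B.

    Each argument is a list of replicate rows; each row holds one signal
    value per m/z. (Assumed definition, not taken from the abstract.)
    """
    mean_a = [mean(col) for col in zip(*class_a)]
    mean_b = [mean(col) for col in zip(*class_b)]
    return [a / b for a, b in zip(mean_a, mean_b)]

def percent_deviation(ratios, true_ratio):
    """Signed deviation of each S-ratio from the known concentration ratio."""
    return [100.0 * (r - true_ratio) / true_ratio for r in ratios]

# Hypothetical signals: 3 replicates x 4 m/z for an 80 ppm vs 40 ppm comparison.
high = [[200.0, 80.0, 40.0, 120.0],
        [204.0, 82.0, 41.0, 118.0],
        [196.0, 78.0, 39.0, 122.0]]
low = [[100.0, 40.0, 20.0, 100.0],
       [102.0, 41.0, 20.5, 98.0],
       [98.0, 39.0, 19.5, 102.0]]

ratios = s_ratios(high, low)
deviations = percent_deviation(ratios, true_ratio=2.0)
# Flag m/z whose S-ratio falls within 5% of the true ratio.
within_5_percent = [abs(d) < 5.0 for d in deviations]
```

In this toy data the fourth m/z carries interfering background signal in the low class, so its S-ratio departs from the true ratio of 2 while the other three do not.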
Subjects
Gas Chromatography-Mass Spectrometry/methods, Algorithms, Aniline Compounds/chemistry, Bromobenzenes/chemistry, Fatty Alcohols/chemistry, Gasoline/analysis, Mass Spectrometry
ABSTRACT
Evaluation of a recently developed data reduction method for gas chromatography time-of-flight mass spectrometry (GC-TOFMS) is presented in the context of the statistical model of overlap (SMO) using simulated chromatographic data. The two-dimensional mass cluster plot method (2D m/z cluster plot method) significantly improves separation visualization by measuring the retention time, tR, and peak width-at-base, wb, of each analyte peak on a per mass channel, m/z, basis and plotting wb versus tR as a single point for each peak. Additional selectivity is provided by the peak width dimension, allowing for the differentiation of "pure" or selective m/z and shared or overlapped m/z. Analyte clusters in the 2D mass cluster plot are defined based on clustering of individual points, representing the selective m/z for those analytes, and encompassed by a box of user-specified size. The method is applied to simulated chromatographic data with a random, independent distribution of analyte peaks and constant peak wb. Two levels of chromatographic saturation factor, α, and two sets of analyte mass spectra with varying spectral similarity are studied to assess method performance. The percentage of analyte clusters found relative to the number of analytes simulated in the chromatogram increases as the box size (analogous to chromatographic resolution, Rs) is decreased, resulting in an Rs limit of 0.05 for the method. Additionally, the percentage of analyte clusters discovered also increases with lower α and greater dissimilarity between analyte mass spectra, demonstrating the immense benefit of improving the chromatographic separation and chemical selectivity in analyte discovery, identification, and quantification.
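The box-based cluster definition can be sketched as a simple greedy grouping. The abstract does not describe the grouping rule, so this sketch assumes a point joins a cluster only if the cluster still fits inside a box of the given size; `box_clusters`, `min_points`, and the sample points are hypothetical.

```python
def box_clusters(points, box_tr, box_wb, min_points=2):
    """Greedily group (tR, wb) points into clusters that fit inside a box.

    A point joins the first cluster whose extent, including the new point,
    stays within box_tr along tR and box_wb along wb. Clusters with fewer
    than min_points points are discarded as likely shared m/z.
    """
    clusters = []
    for tr, wb in sorted(points):
        for cluster in clusters:
            trs = [p[0] for p in cluster] + [tr]
            wbs = [p[1] for p in cluster] + [wb]
            if max(trs) - min(trs) <= box_tr and max(wbs) - min(wbs) <= box_wb:
                cluster.append((tr, wb))
                break
        else:
            clusters.append([(tr, wb)])
    return [c for c in clusters if len(c) >= min_points]

# Hypothetical points: two analytes plus one overlapped (broadened) m/z.
points = [(10.00, 2.00), (10.01, 2.02), (10.00, 2.01),
          (10.30, 2.00), (10.31, 2.01),
          (10.15, 2.60)]
found = box_clusters(points, box_tr=0.05, box_wb=0.10)
cluster_sizes = [len(c) for c in found]
```

Shrinking `box_tr` and `box_wb` in this sketch plays the role of decreasing the box size described in the abstract: closer analytes are then kept apart rather than merged.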
Subjects
Gas Chromatography-Mass Spectrometry/methods, Statistical Models, Cluster Analysis
ABSTRACT
A new approach is presented to determine the probability of achieving a successful quantitative analysis for gas chromatography coupled with mass spectrometry (GC-MS). The proposed theory is based upon a probabilistic description of peak overlap in GC-MS separations to determine the probability of obtaining a successful quantitative analysis, which has its lower limit of chromatographic resolution Rs at some minimum chemometric resolution, Rs*; that is to say, successful quantitative analysis can be achieved when Rs ≥ Rs*. The value of Rs* must be experimentally determined and is dependent on the chemometric method to be applied. The approach presented makes use of the assumption that analyte peaks are independent and randomly distributed across the separation space or are at least locally random, namely, that each analyte represents an independent Bernoulli random variable, which is then used to predict the binomial probability of successful quantitative analysis. The theoretical framework is based on the chromatographic-saturation factor and chemometric-enhanced peak capacity. For a given separation, the probability of quantitative success can be improved via two pathways, a chromatographic-efficiency pathway that reduces the saturation of the sample and a chemometric pathway that reduces Rs* and improves the chemometric-enhanced peak capacity. This theory is demonstrated through a simulation-based study to approximate the resolution limit, Rs*, of multivariate curve resolution-alternating least-squares (MCR-ALS). For this study, Rs* was determined to be ∼0.3, and depending on the analytical expectations for the quantitative bias and the obtained mass-spectral match value, a lower value of Rs* ∼ 0.2 may be achievable.
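The Bernoulli/binomial framing can be sketched as follows. The per-analyte success probability used here, exp(-2α) with α the number of analytes divided by the peak capacity, is the classical singlet probability from the statistical model of overlap and is an assumption standing in for the paper's own expression, which the abstract does not give. The function names and the numbers are hypothetical.

```python
from math import comb, exp

def success_probability(n_analytes, peak_capacity):
    """Assumed per-analyte success probability, exp(-2 * alpha).

    alpha = n_analytes / peak_capacity is the saturation factor; passing a
    chemometric-enhanced peak capacity lowers alpha and raises the result.
    """
    alpha = n_analytes / peak_capacity
    return exp(-2.0 * alpha)

def binomial_pmf(k, n, p):
    """Probability that exactly k of n independent analytes succeed."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

def prob_at_least(k_min, n, p):
    """Probability that at least k_min of n analytes succeed."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

# Hypothetical case: 50 analytes, peak capacity 100 versus an enhanced 500.
p_plain = success_probability(50, 100)     # alpha = 0.5
p_enhanced = success_probability(50, 500)  # alpha = 0.1
chance_plain = prob_at_least(40, 50, p_plain)
chance_enhanced = prob_at_least(40, 50, p_enhanced)
```

Under these assumptions, either raising the peak capacity (the chromatographic pathway) or raising the enhanced peak capacity by lowering Rs* (the chemometric pathway) lowers α and increases the chance that most analytes are quantified successfully.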
ABSTRACT
We report a quantitative approach to optimize implementation of discovery-based software for comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC × GC-TOFMS). The software performs a tile-based Fisher ratio (F-ratio) analysis and facilitates a supervised nontargeted analysis based upon the experimental design to aid in the discovery of analytes with statistically different variances between sample classes. The quantitative approach for software optimization uses receiver operating characteristic (ROC) curves. The area under the curve (AUC) for each ROC curve serves as a quantitative metric to optimize two key algorithm parameters: the signal-to-noise ratio (S/N) threshold of the data prior to calculating F-ratios at each m/z mass channel and the number of these F-ratios per m/z used to calculate the average F-ratio of a tile. A total of 25 combinations of S/N threshold by number of m/z were studied. Fifty analytes were spiked into a diesel fuel at two concentration levels to produce two sample classes that should in principle produce 50 positive instances in the ROC curves. The "sweet spot" for F-ratio analysis was determined to be a S/N threshold of 10 coupled with a maximum of the 10 most chemically selective m/z (requiring a minimum of 3 m/z), corresponding to a ∼21% improvement in the discrimination of true positives relative to prior studies. This equates to an additional 9 true positives being discovered at a false positive probability of 0.2 and 5 additional true positives being found overall. Furthermore, optimization of these software parameters did not depend upon a priori determination of the statistically correct number of positive instances in the sample classes. The AUC metric appears to be suitable for the evaluation of all data analysis methods that utilize the proper experimental design.
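The AUC metric can be sketched for a ranked hit list. This minimal example assumes each hit carries an F-ratio score and a label marking whether it is a spiked analyte (true positive); it sweeps the threshold down the ranked list and integrates by the trapezoidal rule, ignoring tied scores. The scores and labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr) lists by stepping down the hit list from the top score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    fpr, tpr = [0.0], [0.0]
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for j in range(1, len(fpr)):
        area += (fpr[j] - fpr[j - 1]) * (tpr[j] + tpr[j - 1]) / 2.0
    return area

# Hypothetical hit list: F-ratio scores and whether each hit is a spiked analyte.
f_ratios = [9.0, 8.0, 7.0, 6.0, 5.0, 4.0]
is_spiked = [1, 1, 0, 1, 0, 0]
fpr, tpr = roc_points(f_ratios, is_spiked)
auc = area_under_curve(fpr, tpr)
```

Repeating this calculation for each parameter combination (for example, each S/N threshold by number of m/z) and keeping the combination with the largest AUC mirrors the optimization described in the abstract.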
Subjects
ROC Curve, Software, Algorithms, Gas Chromatography, Mass Spectrometry, Signal-to-Noise Ratio, Time Factors
ABSTRACT
A novel analytical workflow is presented for the analysis of time-dependent ¹³C-labeling of the metabolites in the methylotrophic bacterium Methylobacterium extorquens AM1 using gas chromatography time-of-flight mass spectrometry (GC-TOFMS). Using ¹³C-methanol as the substrate in a time course experiment, the method provides an accurate determination of the number of carbons converted to the stable isotope. The method also extracts a quantitative isotopic dilution time course profile for ¹³C uptake of each metabolite labeled that could in principle be used to obtain metabolic flux rates. The analytical challenges encountered require novel analytical platforms and chemometric techniques. GC-TOFMS offers advanced separation of mixtures, identification of individual components, and high data density for the application of advanced chemometrics. This workflow combines both novel and traditional chemometric techniques, including the recently reported two-dimensional mass cluster plot method (2D m/z cluster plot method) as well as principal component analysis (PCA). The 2D m/z cluster plot method effectively indexed all metabolites present in the sample and deconvoluted metabolites at ultra-low chromatographic resolution (Rs ≈ 0.04). Using the pure mass spectra extracted, two PCA models were created. Firstly, PCA was used on the first and last time points of the time course experiment to determine and quantify the extent of ¹³C uptake. Secondly, PCA modeled the full time course in order to quantitatively extract the time course profile for each metabolite. The 2D m/z cluster plot method found 152 analytes (metabolites and reagent peaks), of which 54 were pure and 98 were convoluted, with 65 of the 98 requiring mathematical deconvolution. Of the 152 analytes surveyed, 83 were metabolites determined by the PCA model to have incorporated ¹³C, while 69 were determined to be either metabolites or reagent peaks that remained unlabeled.
Subjects
Metabolome, Methylobacterium extorquens/metabolism, Carbon Isotopes, Gas Chromatography-Mass Spectrometry/methods, Principal Component Analysis
ABSTRACT
A novel data reduction and representation method for gas chromatography time-of-flight mass spectrometry (GC-TOFMS) is presented that significantly facilitates separation visualization and analyte peak deconvolution. The method utilizes the rapid mass spectral data collection rate (100 scans/s or greater) of current generation TOFMS detectors. Chromatographic peak maxima (serving as the retention time, tR) above a user-specified signal threshold are located, and the chromatographic peak width, W, is determined on a per mass channel (m/z) basis for each analyte peak. The peak W (per m/z) is then plotted against its respective tR (with 10 ms precision) in a two-dimensional (2D) format, producing a cluster of points (i.e., one point per peak W versus tR in the 2D plot). Analysis of GC-TOFMS data by this method produces what is referred to as a two-dimensional mass channel cluster plot (2D m/z cluster plot). We observed that adjacent eluting (even coeluting) peaks in a temperature programmed separation can have their peak W vary by as much as ∼10-15%. Hence, the peak W provides useful chemical selectivity when viewed in the 2D m/z cluster plot format. Pairs of overlapped analyte peaks with one-dimensional GC resolution as low as Rs ≈ 0.03 can be visually identified as fully resolved in a 2D m/z cluster plot and readily deconvoluted using chemometrics (demonstrated here using classical least-squares analysis). Using the 2D m/z cluster plot method, the effective peak capacity of one-dimensional GC separations is magnified nearly 40-fold, and potentially ∼100-fold in the context of comparing it to a two-dimensional separation. The method was studied using a 73 component test mixture separated on a 30 m × 250 µm i.d. RTX-5 column with a LECO Pegasus III TOFMS.
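The per-m/z measurement of tR and W can be sketched on simulated traces. The abstract does not state how W is measured, so this sketch assumes a crude full width at half maximum read off the sampled points; `peak_points`, the Gaussian test peaks, and the m/z values are hypothetical.

```python
from math import exp

def peak_points(trace, times, threshold):
    """Return (tR, W) for each local maximum above threshold in one m/z trace.

    W is taken as the time between the first samples at or below half the
    apex height on either side of the apex (an assumed width definition).
    """
    points = []
    for i in range(1, len(trace) - 1):
        is_apex = trace[i] > trace[i - 1] and trace[i] >= trace[i + 1]
        if trace[i] > threshold and is_apex:
            half = trace[i] / 2.0
            left = i
            while left > 0 and trace[left] > half:
                left -= 1
            right = i
            while right < len(trace) - 1 and trace[right] > half:
                right += 1
            points.append((times[i], times[right] - times[left]))
    return points

def gaussian(t, center, sigma, height):
    """Simulated chromatographic peak shape."""
    return height * exp(-0.5 * ((t - center) / sigma) ** 2)

# 100 scans/s for 10 s; two m/z belonging to two closely eluting analytes.
times = [i * 0.01 for i in range(1000)]
traces = {
    57: [gaussian(t, 4.00, 0.20, 1000.0) for t in times],
    91: [gaussian(t, 4.05, 0.25, 800.0) for t in times],
}
cluster_plot = {mz: peak_points(trace, times, threshold=50.0)
                for mz, trace in traces.items()}
```

Plotting W against tR for every entry in `cluster_plot` gives one point per peak per m/z; in this toy case the two m/z land at distinct (tR, W) positions even though the peaks overlap heavily in the total signal.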