ABSTRACT
Solid-phase extraction (SPE) sample preparation for the analysis of complex organic mixtures is often applied on the assumption that all analytes of interest will preconcentrate on the stationary phase. This assumption ignores the reality that extraction is a dynamic, interactive process: a diverse range of affinities for the stationary phase results in equally diverse breakthrough volumes due to competitive sorption. To study this process, and to take advantage of it, we extracted a JP-8 jet fuel spiked with 40 ppm of a polar compound mix using silica and alumina SPE cartridges and analyzed sequentially extracted fractions of the fuel to assess both the shifting chemical landscape during the extraction and the impact of each SPE stationary phase on this process. Tile-based 1v1 comparative analysis (a recently reported extension of tile-based Fisher ratio analysis) was used to discover the (polar) compounds whose concentrations change between extracted fractions, yielding 21 compounds extracted with silica and 27 compounds extracted with alumina with at least a 2-fold change in concentration between the neat sample and the first 1 mL pass fraction. These compounds were quantified in each fraction to construct concentration ratio profiles, defined per analyte as the concentration in a given SPE fraction relative to the concentration in the neat fuel, from which the extraction behavior of each analyte could be assessed. The analytes were found to break through at different rates, with some remaining on the column indefinitely (until extracted with a subsequent polar solvent) and others eluting before the extraction was complete. Furthermore, in a comparison of the two stationary phases, alumina was found to retain oxygen-containing phenolic compounds to a greater extent than silica.
Principal component analysis (PCA) was used to analyze the concentration ratio profiles of the various trace analytes in the JP-8 fuel (phenols, indoles, etc.) in the context of their stationary phase affinity (silica or alumina) and competitive sorption behavior.
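The concentration ratio profile defined in this abstract can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' workflow: the function names, the example concentrations, and the way the 2-fold criterion is applied to the first pass fraction are all assumptions made for the example.

```python
def concentration_ratio_profile(fraction_conc, neat_conc):
    """Ratio of an analyte's concentration in each SPE fraction to its
    concentration in the neat fuel (one value per sequential fraction)."""
    if neat_conc <= 0:
        raise ValueError("neat concentration must be positive")
    return [c / neat_conc for c in fraction_conc]


def changed_at_least_twofold(neat_conc, first_pass_conc):
    """True if the first pass fraction differs from the neat sample by a
    factor of two or more (assumed reading of the 2-fold criterion)."""
    if first_pass_conc == 0:
        return True  # analyte fully retained in the first fraction
    ratio = neat_conc / first_pass_conc
    return ratio >= 2 or ratio <= 0.5


# Hypothetical analyte at 40 ppm in the neat fuel, measured in four fractions.
profile = concentration_ratio_profile([0.0, 4.0, 20.0, 40.0], neat_conc=40.0)
print(profile)  # [0.0, 0.1, 0.5, 1.0]
```

A profile that stays near 0 across fractions corresponds to an analyte retained on the cartridge, while a profile that climbs toward 1 indicates breakthrough.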
ABSTRACT
Chemometric methods like partial least squares (PLS) regression are valuable for correlating sample-based differences hidden in comprehensive two-dimensional gas chromatography (GC × GC) data to independently measured physicochemical properties. This work establishes the first implementation of tile-based variance ranking as a selective data reduction methodology to improve PLS modeling performance for 58 diverse aerospace fuels. Tile-based variance ranking discovered a total of 521 analytes with a squared relative standard deviation (RSD²) in signal ranging from 0.07 to 22.84. The goodness of fit of the models was determined by their normalized root-mean-square error of cross-validation (NRMSECV) and normalized root-mean-square error of prediction (NRMSEP). PLS models developed for viscosity, hydrogen content, and heat of combustion using all 521 features discovered by tile-based variance ranking had NRMSECV (NRMSEP) values of 10.5 % (10.2 %), 8.3 % (7.6 %), and 13.1 % (13.5 %), respectively. In contrast, use of a single-grid binning scheme, a common data reduction strategy for PLS analysis, resulted in less accurate models for viscosity (NRMSECV = 14.2 %; NRMSEP = 14.3 %), hydrogen content (NRMSECV = 12.1 %; NRMSEP = 11.0 %), and heat of combustion (NRMSECV = 14.4 %; NRMSEP = 13.6 %). Further, the features discovered by tile-based variance ranking can be optimized for each PLS model with RReliefF analysis, a machine learning algorithm. RReliefF feature optimization selected 48, 125, and 172 analytes out of the original 521 discovered by tile-based variance ranking to model viscosity, hydrogen content, and heat of combustion, respectively. The RReliefF-optimized features produced highly accurate property-composition models for viscosity (NRMSECV = 7.9 %; NRMSEP = 5.8 %), hydrogen content (NRMSECV = 7.0 %; NRMSEP = 4.9 %), and heat of combustion (NRMSECV = 7.9 %; NRMSEP = 8.4 %). 
This work also demonstrates that processing the chromatograms with a tile-based approach allows the analyst to directly identify the analytes of importance in a PLS model. Coupling tile-based feature selection with PLS analysis allows for deeper understanding in any property-composition study.
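The two metrics named in this abstract, RSD² for ranking features and a normalized RMSE for judging models, can be illustrated as follows. This is a hedged sketch: the abstract does not state how the error is normalized or which standard deviation estimator is used, so range normalization and the sample standard deviation are assumptions, and all names and data are hypothetical.

```python
import math
import statistics


def rsd_squared(signals):
    """Square of the relative standard deviation of one feature's signal
    across samples (sample standard deviation is an assumption)."""
    return (statistics.stdev(signals) / statistics.mean(signals)) ** 2


def rank_by_variance(features):
    """Sort feature names by RSD^2, largest first."""
    return sorted(features, key=lambda name: rsd_squared(features[name]),
                  reverse=True)


def nrmse_percent(measured, predicted):
    """RMSE normalized by the range of the measured values, in percent
    (range normalization is an assumption of this sketch)."""
    n = len(measured)
    rmse = math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)
    return 100.0 * rmse / (max(measured) - min(measured))


# Hypothetical tile signals across three fuels.
features = {"tile_a": [10.0, 10.0, 10.5], "tile_b": [1.0, 5.0, 9.0]}
print(rank_by_variance(features))  # ['tile_b', 'tile_a']
```

The same `nrmse_percent` function would serve for both NRMSECV and NRMSEP; only the source of the predictions (cross-validation folds or a held-out set) differs.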
Subjects
Algorithms, Least-Squares Analysis, Gas Chromatography-Mass Spectrometry/methods
ABSTRACT
Tile-based Fisher ratio (F-ratio) analysis of comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GC × GC-TOFMS) data is a powerful, supervised discovery methodology for pinpointing class-distinguishing analytes between two or more sample classes. Herein, we extend this analytical methodology to focus upon specific chemical groups in kerosene-based aerospace fuel using solid-phase extraction (SPE). Treating samples with SPE removes specific compounds depending on the SPE stationary phase (e.g., silica), creating an altered "pass" sample that is identical to the original "neat" sample except for the extracted compounds. Application of F-ratio analysis to the neat samples against the pass samples provides global discovery with a numerically sorted hit list of all analytes affected by the SPE procedure. Sections of GC × GC-TOFMS data from the top analyte hits are reconstructed to form a "stitch" chromatogram to visualize the class-distinguishing compounds, revealing excellent agreement with the extract chromatogram. Additionally, utilizing the four-grid tiling scheme developed for tile-based F-ratio analysis, we demonstrate a tile-based pairwise analysis method, referred to as 1v1 analysis, to discover analytes that differ in concentration between two fuel chromatograms. The 1v1 analysis is highly efficient because replicates do not necessarily need to be run on the GC × GC-TOFMS instrument, which is beneficial for sample-limited applications. The 1v1 analyses discovered 69 to 81% of the features discovered by F-ratio analysis while requiring one-sixth of the data. Lastly, the overall methodology is applied to three candidate rocket fuels to better understand the compound class-distinguishing differences. 
The separate hit lists produced for high-concentration bulk hydrocarbon differences and low-concentration level polar compound differences provided valuable insight into these candidate rocket fuels.
ABSTRACT
Tile-based variance rank initiated-unsupervised sample indexing (VRI-USI) analysis is introduced for comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC-TOFMS). VRI-USI analysis addresses the challenge that irrelevant variables can often obscure true chemical variation when using other unsupervised chemometric tools. Implementation of VRI-USI analysis with GC×GC-TOFMS data incorporates the tile-based Fisher ratio (F-ratio) analysis software platform, which mitigates the effects of retention shifting in both separation dimensions, with an unsupervised variance metric (instead of the F-ratio metric) as the initial step of ranking the hitlist. Next, k-means clustering (with k clusters) is applied to each hit and evaluated with the silhouette metric, Smax, to reveal the extent to which recurring indexed sample clusters are uncovered. Finally, an unsupervised class membership is revealed by a probability-based evaluation of how the individual samples cluster throughout the hitlist. For a JP8 jet fuel dataset spiked with a sulfur-containing analyte mix at 30 ppm, at 15 ppm, and neat, clustering by spike level at k = 3 was the most common recurring set of index assignments, occurring for 11 out of 14 spiked analytes. Upon application of these k-means index assignments to the entire hitlist, all 14 spiked hits had one-way ANOVA p-values < 0.05, validating the presumption of classes. Next, application of VRI-USI to a 3 ppm spiked and neat JP8 jet fuel comparison exhibited similar performance to F-ratio analysis for analyte discovery. In the last study, for a dataset of J1800A, JP4, and JP8 jet fuel, each spiked with the sulfur-containing analyte mix at 30 ppm and neat, 453 out of 520 hits in the hitlist exhibited index assignments indicative of fuel type clustering, with the remaining 67 hits having contradictory assignments. 
Scrutiny of these 67 hits revealed nine hits with "split combinations" in index assignments, whereby the spiked and neat samples for a given fuel were in separate clusters. Eight of these hits were identified as spiked sulfur analytes. Interestingly, these hits also had large Smax values, indicative of a true sub-cluster. Thus, tile-based VRI-USI analysis appears to be a promising tool for unsupervised multi-class classification studies using GC×GC-TOFMS data.
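The per-hit clustering step can be illustrated with a small, dependency-free sketch. It is not the VRI-USI software: it assumes each hit is reduced to one signal value per sample, solves 1-D k-means exactly by trying every contiguous split of the sorted values, and picks the k with the largest mean silhouette. The function names, the candidate k values, and the data are hypothetical.

```python
from itertools import combinations


def kmeans_1d(values, k):
    """Exact 1-D k-means: lowest-SSE contiguous split of the sorted values.
    Returns one cluster index per input value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    xs = [values[i] for i in order]
    n = len(xs)

    def sse(segment):
        centre = sum(segment) / len(segment)
        return sum((x - centre) ** 2 for x in segment)

    best_cost, best_bounds = None, None
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        cost = sum(sse(xs[bounds[j]:bounds[j + 1]]) for j in range(k))
        if best_cost is None or cost < best_cost:
            best_cost, best_bounds = cost, bounds
    labels = [0] * n
    for j in range(k):
        for pos in range(best_bounds[j], best_bounds[j + 1]):
            labels[order[pos]] = j
    return labels


def mean_silhouette(values, labels):
    """Mean silhouette width; singleton clusters score 0 by convention."""
    n = len(values)
    cluster_ids = sorted(set(labels))
    scores = []
    for i in range(n):
        own = [abs(values[i] - values[j]) for j in range(n)
               if j != i and labels[j] == labels[i]]
        if not own or len(cluster_ids) < 2:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(abs(values[i] - values[j]) for j in range(n) if labels[j] == c)
            / labels.count(c)
            for c in cluster_ids if c != labels[i]
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / n


def best_k(values, candidate_ks=(2, 3, 4)):
    """Return (k, s_max): the candidate k with the largest mean silhouette."""
    s_max, k = max((mean_silhouette(values, kmeans_1d(values, k)), k)
                   for k in candidate_ks)
    return k, s_max


# Hypothetical hit: three replicates at each of three spike levels.
signal = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 9.0, 9.1, 8.9]
k, s_max = best_k(signal)
print(k, kmeans_1d(signal, k))
```

Repeating this for every hit and tallying which index assignments recur would give the kind of hitlist-wide evidence the abstract describes; that tallying step is not shown here.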
Subjects
Software, Sulfur, Cluster Analysis, Gas Chromatography-Mass Spectrometry/methods
ABSTRACT
A new tile-based pairwise analysis workflow, termed 1v1 analysis, is presented to discover and identify analytes that differentiate two chromatograms collected using comprehensive two-dimensional (2D) gas chromatography coupled with time-of-flight mass spectrometry (GC × GC-TOFMS). Tile-based 1v1 analysis easily discovered all 18 non-native analytes spiked in diesel fuel within the top 30 hits, outperforming standard pairwise chromatographic analyses. However, eight spiked analytes could not be identified with either multivariate curve resolution-alternating least-squares (MCR-ALS) or parallel factor analysis (PARAFAC) due to background contamination. Analyte identification was achieved with class comparison enabled-mass spectrum purification (CCE-MSP), which obtains a pure analyte spectrum by normalizing the spectra to an interferent mass channel (m/z) identified from 1v1 analysis and subtracting the two spectra. This report also details the development of CCE-MSP assisted MCR-ALS, which removes the identified interferent m/z from the data prior to decomposition. In total, 17 out of 18 spiked analytes had a match value (MV) > 800 with both versions of CCE-MSP. For example, MCR-ALS and PARAFAC were unable to decompose the pure spectrum of methyl decanoate (MVs < 200) due to its low 2D chromatographic resolution (~0.34) and high interferent-to-analyte signal ratio (~30:1). By leveraging information gained from 1v1 analysis, CCE-MSP and CCE-MSP assisted MCR-ALS obtained a pure spectrum with an average MV of 908 and 964, respectively. Furthermore, tile-based 1v1 analysis was applied to track moisture damage in cacao beans, where 86 analytes with at least a 2-fold concentration change were discovered between the unmolded and molded samples. This 1v1 analysis workflow is beneficial for studies where multiple replicates are either unavailable or undesirable to save analysis time.
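The normalize-and-subtract idea behind CCE-MSP can be sketched as below. This is an illustrative assumption about the arithmetic, not the published implementation: spectra are plain dictionaries of m/z to intensity, the reference spectrum is scaled so that the interferent channel matches before subtraction, and negative residuals are dropped. The function name and intensities are invented for the example.

```python
def purify_spectrum(with_analyte, reference, interferent_mz):
    """Scale the reference (analyte-free) spectrum so the interferent
    channel matches, subtract it, and keep only positive residuals.
    Dropping non-positive residuals is an assumption of this sketch."""
    ref_intensity = reference.get(interferent_mz, 0.0)
    if ref_intensity <= 0:
        raise ValueError("interferent m/z must be present in the reference")
    scale = with_analyte.get(interferent_mz, 0.0) / ref_intensity
    purified = {}
    for mz, intensity in with_analyte.items():
        residual = intensity - scale * reference.get(mz, 0.0)
        if residual > 0:
            purified[mz] = residual
    return purified


# Hypothetical spectra at one hit location: m/z 57 and 71 come from the
# background, m/z 74 and 87 from the analyte.
spiked = {57: 300.0, 71: 150.0, 74: 40.0, 87: 20.0}
unspiked = {57: 100.0, 71: 50.0}
print(purify_spectrum(spiked, unspiked, interferent_mz=57))
# {74: 40.0, 87: 20.0}
```

The purified spectrum would then be the one submitted to a library search to obtain a match value.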
Subjects
Gasoline, Gas Chromatography-Mass Spectrometry/methods, Gasoline/analysis, Least-Squares Analysis, Mass Spectrometry
ABSTRACT
Integration of rice and fish farming, e.g., pacu fish in Argentina, has raised concern that herbicides used in rice paddies may adversely affect the fish metabolome. To study this issue, tile-based Fisher ratio (F-ratio) analysis was applied to comprehensive two-dimensional gas chromatography with time-of-flight mass spectrometry (GC × GC-TOFMS) data of pacu fish raised in an integrated rice-fish farming system (farmed class) versus fish raised in tanks (tank-raised class) to discover class-distinguishing analytes. F-ratio analysis resulted in hit lists initially dominated by artifact peaks from the sample derivatization process, as well as some redundant hits. These challenges were addressed by developing an automated artifact removal algorithm and an improvement to redundant hit removal in the tile-based F-ratio analysis using a pooled farmed fish sample, either spiked with 29 metabolites (spiked class) or unspiked (i.e., the background serving as the control class). Of the 29 spiked metabolites, 23 were discovered by standard F-ratio analysis, improving to 28 with control-normalized F-ratio analysis. The standard and control-normalized F-ratio hit lists, initially containing 185 and 246 hits, were reduced to 56 and 49 hits, respectively, after artifact and redundant hit removal. Next, we returned to the F-ratio analysis of the farmed fish versus the control fish. Here, we introduce a minimum variance optimized (MVO) F-ratio calculation (MVOF-ratio) that provides a comprehensive hit list ranking. The initial MVOF-ratio hit list of 537 hits was reduced to 110 hits following artifact removal. Of the 110 hits, 70 expressed a concentration ratio statistically different from 1 (p < 0.05). The MVOF-ratio hit list discovered more true positives than the standard, tank-normalized, and farm-normalized F-ratio hit lists, providing the combination of results from the farm-normalized and tank-normalized hit lists. 
A majority of analytes (54 out of 70) important for normal biological functioning of pacu fish were significantly downregulated in the farmed fish, suggesting the integrated farming system may negatively impact pacu fish quality.
Subjects
Algorithms, Metabolome, Gas Chromatography-Mass Spectrometry/methods
ABSTRACT
Tile-based Fisher ratio (F-ratio) analysis is emerging as a versatile data analysis tool for supervised discovery-based experimentation using comprehensive two-dimensional (2D) gas chromatography coupled with time-of-flight mass spectrometry (GC × GC-TOFMS). Nonetheless, analyte identification can often be marred by poor 2D resolution and low analyte abundance relative to overlapping compounds. Linear algebra-based chemometric methods, in particular multivariate curve resolution alternating least squares (MCR-ALS), parallel factor analysis (PARAFAC), and PARAFAC2, are often applied in an effort to address this situation. However, these chemometric methods can fail to produce an accurate spectrum when the analyte is at low 2D resolution and/or in low relative abundance. To address this challenge, we introduce class comparison enabled mass spectrum purification (CCE-MSP), a method that utilizes the underlying requirement for signal consistency of the background interference compounds between the two classes in the F-ratio analysis to purify the mass spectrum of the analyte hits. CCE-MSP is validated using a dataset obtained for a neat JP-8 jet fuel spiked with 14 sulfur-containing compounds at two levels (15 ppm and 30 ppm), using the p-value and lack-of-fit (LOF) for each analyte hit as consistency metrics. A purified mass spectrum was produced for each spiked analyte hit, and its mass spectrum match value (MV) was compared to the MV obtained by MCR-ALS, PARAFAC, and PARAFAC2. The resulting MVs for CCE-MSP were found to be as good as or better than those of these chemometric methods; e.g., for 2-butyl-5-ethylthiophene with an analyte-to-interference relative signal abundance of 1:87 and a 2D resolution of 0.2, CCE-MSP produced an MV of 831, compared to 476 for MCR-ALS, 403 for PARAFAC, and 336 for PARAFAC2. CCE-MSP is also extended to obtain the purified spectrum for more than one analyte, e.g., two analyte hits in overlapping hit locations. 
The spectra produced by CCE-MSP can also be utilized as estimates to facilitate quantitative signal decomposition using MCR-ALS.
Subjects
Mass Spectrometry, Factor Analysis, Gas Chromatography-Mass Spectrometry, Least-Squares Analysis
ABSTRACT
Comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC×GC-TOFMS) is followed by tile-based Fisher ratio (F-ratio) analysis to investigate the "limit of discovery" for low concentration levels of sulfur-containing compounds in JP8 jet fuel. A mixture of 14 sulfur-containing compounds was spiked at 30 ppm, 15 ppm, 3 ppm, and 1.5 ppm into the neat fuel prior to GC×GC-TOFMS analysis with a "reversed" column format (i.e., a polar first dimension (¹D) column and a non-polar second dimension (²D) column). The prior standard implementation of tile-based F-ratio analysis utilized an average F-ratio requiring a minimum of 3 mass channels (m/z) with the highest F-ratios. Herein, we explore the notion that use of the top F-ratio m/z for hitlist ranking is superior to the standard implementation for analytes near their limit of quantitation (LOQ), defined as an analyte concentration that produces a signal equal to ten times the standard deviation of the baseline noise (10σn). Hitlist ranking comparisons revealed that using only the top F-ratio m/z substantially improved discoverability for the low concentration comparisons. Specifically, for the 3 ppm versus neat hitlist, 1,4-oxathiane (LOQ = 2.5 ppm) improved from hit 114 via standard F-ratio analysis to hit 25. For the 1.5 ppm versus neat hitlist, 2-propylthiophene (LOQ = 0.64 ppm) improved from hit 59 to 17, benzo[b]thiophene (LOQ = 1.1 ppm) from hit 98 to 28, and 2,5-dimethylthiophene (LOQ = 1.3 ppm) from hit 262 to 39. Additional hitlist ranking comparisons revealed the importance of proper tile size selection, as analyte discoverability deteriorated when the tile was either too small or too large.
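The difference between the two ranking rules compared in this abstract can be shown with a toy example. This is only a sketch of the scoring idea, assuming each hit already has one F-ratio per m/z; the hit names, F-ratio values, and function names are hypothetical.

```python
def average_top_f(f_ratios, n_channels=3):
    """Standard-style score: mean of the n highest per-m/z F-ratios."""
    top = sorted(f_ratios, reverse=True)[:n_channels]
    return sum(top) / len(top)


def top_f(f_ratios):
    """Alternative score: the single highest per-m/z F-ratio."""
    return max(f_ratios)


def rank_hits(hits, score):
    """Order hit names by the chosen score, largest first."""
    return sorted(hits, key=lambda name: score(hits[name]), reverse=True)


# Hypothetical hits: a trace analyte with one selective m/z and a matrix
# peak with several moderately discriminating m/z.
hits = {
    "trace_analyte": [50.0, 2.0, 1.0, 0.5],
    "matrix_peak": [20.0, 19.0, 18.0, 3.0],
}
print(rank_hits(hits, average_top_f))  # ['matrix_peak', 'trace_analyte']
print(rank_hits(hits, top_f))          # ['trace_analyte', 'matrix_peak']
```

An analyte near its LOQ may have only one m/z that rises above the noise, which is why averaging over three channels can push it down the hitlist.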
Subjects
Gas Chromatography-Mass Spectrometry/methods, Limit of Detection, Hydrocarbons/analysis, Sulfur/analysis, Thiophenes/chemistry
ABSTRACT
An innovative form of Fisher ratio (F-ratio) analysis (FRA) is developed for use with comprehensive two-dimensional gas chromatography/time-of-flight mass spectrometry (GC × GC-TOFMS) data and applied to the investigation of the changes in the metabolome in human plasma for patients with injury to their anterior cruciate ligament (ACL). Specifically, FRA provides a supervised discovery of metabolites that express a statistically significant variance in a two-sample class comparison: patients and healthy controls. The standard F-ratio utilizes the between-class variance relative to the pooled within-class variance. Because standard FRA is adversely impacted by metabolites expressed with a large within-class variance in the patient class, "control-normalized FRA" has been developed to provide complementary information, by normalizing the between-class variance to the variance of the control class only. Thirty plasma samples from patients who recently suffered from an ACL injury, along with matched controls, were subjected to GC × GC-TOFMS analysis. Following both standard and control-normalized FRA, the concentration ratio for the top 30 "hits" in each comparison was obtained and then t-tested for statistical significance. Twenty four out of 30 metabolites plus the therapeutic agent, naproxen (24/30), passed the t-test for the control-normalized FRA, which included 8/24 unique to control-normalized FRA and 16/24 in common with the standard FRA. Likewise, standard FRA provided 21/30 metabolites passing the t-test, with 5/21 undiscovered by control-normalized FRA. The complementary information obtained by both F-ratio analyses demonstrates the general utility of the new approach for a variety of applications.
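The two F-ratio variants described in this abstract can be written out for a single variable. This is a minimal sketch based only on the definitions given in the text (between-class variance over pooled within-class variance, or over the control-class variance alone); the function names and the example signals are hypothetical, and the tile-based processing is not shown.

```python
import statistics


def between_class_variance(control, patient):
    """Between-class sum of squares divided by (number of classes - 1)."""
    groups = (control, patient)
    grand_mean = statistics.mean(control + patient)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    return ss_between / (len(groups) - 1)


def standard_f_ratio(control, patient):
    """Between-class variance over the pooled within-class variance."""
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g)
                    for g in (control, patient))
    pooled = ss_within / (len(control) + len(patient) - 2)
    return between_class_variance(control, patient) / pooled


def control_normalized_f_ratio(control, patient):
    """Between-class variance over the variance of the control class only."""
    return between_class_variance(control, patient) / statistics.variance(control)


# Hypothetical metabolite: tight controls, widely scattered patients.
control = [1.0, 2.0, 3.0]
patient = [0.0, 5.0, 10.0]
print(round(standard_f_ratio(control, patient), 3))  # 1.038
print(control_normalized_f_ratio(control, patient))  # 13.5
```

In this example the large patient-class scatter suppresses the standard F-ratio, while the control-normalized version still flags the metabolite.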
Subjects
Anterior Cruciate Ligament Injuries/metabolism, Gas Chromatography-Mass Spectrometry/methods, Metabolomics/methods, Anterior Cruciate Ligament Injuries/blood, Biomarkers/blood, Biomarkers/metabolism, Humans, Limit of Detection, Time Factors
ABSTRACT
Tile-based Fisher ratio (F-ratio) analysis has recently been developed and validated for discovery-based studies of highly complex data collected using comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC×GC-TOFMS). In previous studies, interpretation and utilization of F-ratio hit lists has relied upon manual decomposition and quantification performed by chemometric methods such as parallel factor analysis (PARAFAC), or upon manual translation of the F-ratio hit list information to peak table quantitative information provided by the instrument software (ChromaTOF). Both of these quantification approaches are bottlenecks in the overall workflow. To address this issue, a more automatable approach to provide accurate relative quantification for F-ratio analyses was investigated, based upon the mass spectral selectivity provided via the F-ratio spectral output. Diesel fuel spiked with 15 analytes at four concentration levels (80, 40, 20, and 10 ppm) produced three sets of two-class comparisons that were submitted to tile-based F-ratio analysis to obtain three hit lists, with an F-ratio spectrum for each hit. A novel algorithm that calculates the signal ratio (S-ratio) between two classes (e.g., 80 ppm versus 40 ppm) was applied to all mass channels (m/z) in the F-ratio spectrum for each hit. A lack of fit (LOF) metric was utilized as a measure of peak purity and combined with F-ratio and p-values to study the relationship of each of these metrics with m/z purity. Application of a LOF threshold coupled with a p-value threshold yielded a subset of the purest m/z for each of the 15 spiked analytes, as evidenced by the low deviations (< 5%) in S-ratio relative to the true concentration ratio. A key outcome of this study was to demonstrate the isolation of pure m/z without the need for higher level signal decomposition algorithms.
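A per-m/z signal ratio and its deviation from a known concentration ratio can be illustrated as follows. This sketch assumes the S-ratio is the ratio of mean class signals at one m/z, which the abstract does not spell out, and it does not reproduce the LOF or p-value thresholds; all names and signal values are hypothetical.

```python
import statistics


def s_ratio(class_a_signals, class_b_signals):
    """Ratio of the mean signal in class A to the mean signal in class B
    at one m/z (assumed definition for this sketch)."""
    return statistics.mean(class_a_signals) / statistics.mean(class_b_signals)


def percent_deviation(observed_ratio, true_ratio):
    """Absolute deviation of an S-ratio from the true concentration ratio."""
    return 100.0 * abs(observed_ratio - true_ratio) / true_ratio


# Hypothetical hit from an 80 ppm versus 40 ppm comparison (true ratio 2):
# m/z 91 is selective for the analyte, m/z 57 is dominated by the matrix.
hit = {
    91: ([200.0, 210.0, 190.0], [100.0, 105.0, 95.0]),
    57: ([500.0, 520.0, 480.0], [450.0, 460.0, 440.0]),
}
for mz, (high, low) in hit.items():
    ratio = s_ratio(high, low)
    print(mz, round(ratio, 3), round(percent_deviation(ratio, 2.0), 1))
```

A pure m/z returns an S-ratio close to the true concentration ratio, whereas an m/z shared with the matrix is pulled toward 1.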
Subjects
Gas Chromatography-Mass Spectrometry/methods, Algorithms, Aniline Compounds/chemistry, Bromobenzenes/chemistry, Fatty Alcohols/chemistry, Gasoline/analysis, Mass Spectrometry
ABSTRACT
Basic principles are introduced for implementing discovery-based analysis with automated quantification of data obtained using comprehensive three-dimensional gas chromatography with flame ionization detection (GC³-FID). The GC³-FID instrument employs dynamic pressure gradient modulation, providing full modulation (100% duty cycle) with a fast modulation period (PM) of 100 ms. Specifically, tile-based Fisher-ratio analysis, previously developed for comprehensive two-dimensional gas chromatography with time-of-flight mass spectrometry (GC×GC-TOFMS), is adapted and applied for GC³-FID where the third chromatographic dimension (³D) is treated as the "spectral" dimension. To evaluate the instrumental platform and software implementation, ten "non-native" compounds were spiked into a ninety-component base mixture to create two classes with a concentration ratio of two for the spiked analyte compounds. The Fisher ratio software identified 95 locations of potential interest (i.e., hits), with all ten spiked analytes discovered within the top fourteen hits. All 95 hits were quantified by a novel signal ratio (S-ratio) algorithm portion of the F-ratio software, which determines the time-dependent S-ratio of the ³D chromatograms from one class to another, thus providing relative quantification. The average S-ratio for spiked analytes was 1.94 ± 0.14 mean absolute error (close to the nominal concentration ratio of two), and 1.06 ± 0.16 mean absolute error for unspiked (i.e., matrix) components. The appearance of the S-ratio as a function of ³D retention time in the GC³ dataset, referred to as an S-ratiogram, provides indication of peak purity for each hit. The unique shape of the S-ratiogram for hit 1, α-pinene, suggested likely ³D overlap. Parallel factor analysis (PARAFAC) decomposition of the hit location confirmed that overlap was occurring and successfully decomposed α-pinene from a highly overlapped (³Rs = 0.1) matrix interferent.
Subjects
Gas Chromatography/methods, Flame Ionization, Algorithms, Bicyclic Monoterpenes/analysis, Factor Analysis, Mass Spectrometry/methods, Software
ABSTRACT
We report the discovery, preliminary investigation, and demonstration of a novel form of differential flow modulation for comprehensive two-dimensional (2D) gas chromatography (GC×GC). Commercially available components are used to apply a flow of carrier gas with a suitable applied auxiliary gas pressure (Paux) to a T-junction joining the first (¹D) and second (²D) dimension columns. The ¹D eluate is confined at the T-junction and introduced for ²D separation with a cyclic rhythm, dependent upon the relationship of the modulation period (PM) to the pulse width (pw), where pw is defined as the time interval when the auxiliary gas flow at the T-junction is off. We refer to this flow modulation technique as "dynamic pressure gradient modulation" (DPGM) since a pressure gradient oscillates with the PM along the ¹D and ²D column ensemble, providing temporary stop-flow conditions and fast ²D flow rates, resulting in 100% duty cycle and full modulation. A 90-component test mixture was used to evaluate the technique with a pw of 60 ms and a PM of 750 ms. The resulting peaks were narrow, with ²Wb ranging from about 20 to 180 ms. With an average ¹Wb of 3 s and a ²nc of 10, the 2D peak capacity, nc,2D, for the 25 min separation was 5000. The detector response enhancement factor (DREF) is reported, defined as the peak height of the highest modulated ²D peak divided by the unmodulated ¹D peak height (DREF = ²h/¹h). The DREF ranged from about 7 to 87, depending on the ¹Wb and ²Wb for a given analyte. A diesel sample was analyzed to demonstrate performance with a complex sample. Based upon the average ¹Wb of 5 s and an average ²Wb of 168 ms, an nc,2D of 8640 was obtained for the 60 min diesel separation. Finally, the modulation principle was investigated as a function of PM, pw, and the volumetric flow rates, ¹F and ²F. The measured ²Wb correlate well with the theoretical ²D injected width, given by ²Winj = (¹F/²F)·PM. 
However, the relevant ¹F appears to be dictated by the ¹D flow rate when no pressure is applied (during the pw interval), rather than by the average flow rate on ¹D (defined by the ¹D dead time). The findings provide strong evidence for a differential flow modulation mechanism.
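The peak capacity and injected-width figures in this abstract can be checked with simple arithmetic. The sketch below assumes that the 2D peak capacity is the first-dimension peak capacity (run time divided by the average first-dimension peak width) multiplied by the second-dimension peak capacity; under that assumption it reproduces the reported value of 5000 for the test mixture. The flow rates passed to the injected-width function are hypothetical, since the abstract reports none.

```python
def peak_capacity_2d(run_time_s, first_dim_width_s, second_dim_capacity):
    """Assumed form: (run time / first-dimension peak width) multiplied by
    the second-dimension peak capacity."""
    return (run_time_s / first_dim_width_s) * second_dim_capacity


def injected_width_s(first_dim_flow, second_dim_flow, modulation_period_s):
    """Theoretical second-dimension injected width, (1F / 2F) * PM, with
    both flow rates in the same units."""
    return (first_dim_flow / second_dim_flow) * modulation_period_s


# Test mixture: 25 min run, 3 s first-dimension width, capacity of 10.
print(peak_capacity_2d(25 * 60, 3.0, 10))  # 5000.0

# Diesel: 60 min run, 5 s first-dimension width, reported capacity of 8640;
# the second-dimension peak capacity implied by the assumed formula.
print(8640 / (60 * 60 / 5.0))  # 12.0

# Hypothetical flows of 1 and 20 mL/min with the 750 ms modulation period.
print(injected_width_s(1.0, 20.0, 0.75))
```

The implied second-dimension capacity of 12 for the diesel run is a back-calculation under the stated assumption, not a value given in the abstract.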