ABSTRACT
Multiple reaction monitoring (MRM) is a powerful and popular technique used for metabolite quantification in targeted metabolomics. Accurate and consistent quantitation of metabolites from the MRM data is essential for subsequent analyses. Here, we developed an automated tool, MRMQuant, for targeted metabolomic quantitation using high-throughput liquid chromatography-tandem mass spectrometry MRM data to provide users with an easy-to-use tool for accurate MRM data quantitation with minimal human intervention. This tool has many user-friendly functions and features to inspect and correct the quantitation results as required. MRMQuant possesses the following features to ensure accurate quantitation: (1) dynamic signal smoothing, (2) automatic deconvolution of coeluted peaks, (3) absolute quantitation via standard curves and/or internal standards, (4) visualized inspection and correction, (5) corrections applicable to multiple samples, and (6) batch-effect correction.
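As a rough illustration of feature (3), absolute quantitation with a standard curve and an internal standard can be sketched as below. This is not MRMQuant's implementation: the function names, the unweighted linear fit, and the use of the analyte-to-internal-standard area ratio as the response are all assumptions made for the example.

```python
def fit_standard_curve(concentrations, responses):
    """Ordinary least-squares line: response = slope * concentration + intercept.

    Assumption: an unweighted linear model; real tools may use weighted
    or nonlinear fits.
    """
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(analyte_area, istd_area, slope, intercept):
    """Convert a peak-area ratio (analyte / internal standard) to a concentration."""
    response = analyte_area / istd_area
    return (response - intercept) / slope


# Hypothetical calibration points: known concentrations and measured area ratios.
slope, intercept = fit_standard_curve([1.0, 2.0, 4.0, 8.0], [0.5, 1.0, 2.0, 4.0])
concentration = quantify(analyte_area=300.0, istd_area=100.0,
                         slope=slope, intercept=intercept)
```

Dividing by the internal-standard area before applying the curve is one common way to cancel injection-to-injection variation; whether MRMQuant combines the two in this order is not stated in the abstract.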
Subject(s)
Metabolomics, Tandem Mass Spectrometry, Metabolomics/methods, Tandem Mass Spectrometry/methods, Humans, Automation, Liquid Chromatography/methods, Software
ABSTRACT
Examination of changes in urinary metabolomic profiles after vegetable ingestion may lead to new methods of assessing plant food intake. In this regard, we developed a proof-of-principle methodology to identify urinary metabolomic signatures for spinach, celery, and onion. Three feeding studies were conducted. In the first study, healthy individuals were fed spinach, celery, onion, or no vegetables in four separate experiments, and urine samples were pooled for metabolite discovery. The same protocol was used to validate the findings at the individual level in the second study and when feeding all three vegetables simultaneously in the third study. An LC-MS-based metabolomics approach was adopted to search for indicative metabolites in urine samples collected during multiple time periods before and after the meal. In total, 1, 9, and 3 nonoverlapping urinary metabolites were associated with the intake of spinach, celery, and onion, respectively. The PCA signature of these metabolites followed a similar "time cycle" pattern, which peaked approximately 2-4 h after intake. In addition, the metabolite profiles for the same vegetable were consistent across samples, regardless of whether it was consumed individually or in combination. The developed methodology, along with the identified urinary metabolomic signatures, is a potential tool for assessing plant food intake.
Subject(s)
Eating, Metabolomics/methods, Urine/chemistry, Vegetables/metabolism, Biomarkers/urine, Liquid Chromatography, Humans, Mass Spectrometry, Proof of Concept Study
ABSTRACT
Non-alcoholic fatty liver disease (NAFLD) is a global health problem with clinical manifestations ranging from simple non-alcoholic fatty liver (NAFL) to non-alcoholic steatohepatitis (NASH), cirrhosis, and cancer. The role of different types of fatty acids in driving the early progression of NAFL to NASH is not understood. Lipid overload causing lipotoxicity and inflammation is considered an essential pathogenic factor. To correlate lipid profiles with cellular lipotoxicity, we used HepG2 cell models of lipid overload induced by palmitic acid (C16:0) and, notably, a previously unreported model induced by palmitoleic acid (C16:1), coupled with lipidomic technology involving stable-isotope labeling. C16:0 induced inflammation and cell death, whereas C16:1 induced significant lipid droplet accumulation. Moreover, inhibition of de novo sphingolipid synthesis by myriocin (Myr) aggravated C16:0-induced lipoapoptosis. Lipid profiles differed between C16:0- and C16:1-treated cells. Stable isotope-labeled lipidomics elucidates the roles of specific fatty acids that affect lipid metabolism and cause lipotoxicity or lipid droplet formation. The results indicate not only that saturation or monounsaturation of fatty acids plays a role in hepatic lipotoxicity but also that Myr inhibition exacerbates lipoapoptosis through a ceramide-indirect pathway. Using the techniques presented in this study, we can potentially investigate the mechanism of lipid metabolism and the heterogeneous development of NAFLD.
Subject(s)
Isotope Labeling, Lipid Metabolism, Metabolome, Metabolomics, Fatty Acids/metabolism, Monounsaturated Fatty Acids/metabolism, Hep G2 Cells, Humans, Isotope Labeling/methods, Metabolomics/methods, Non-alcoholic Fatty Liver Disease/metabolism, Palmitic Acid/metabolism, Sphingolipids/biosynthesis
ABSTRACT
Metabolite identification remains a bottleneck in mass spectrometry (MS)-based metabolomics. Currently, this process relies heavily on tandem mass spectrometry (MS/MS) spectra generated separately for peaks of interest identified from previous MS runs. Such a delayed and labor-intensive procedure creates a barrier to automation. Further, information embedded in MS data has not been used to its full extent for metabolite identification. Multimers, adducts, multiply charged ions, and fragments of given metabolites occupy a substantial proportion (40-80%) of the peaks of a quantitation result. However, extensive information on these derivatives, especially fragments, may facilitate metabolite identification. We propose a procedure with automation capability to group and annotate peaks associated with the same metabolite in the quantitation results of opposite modes and to integrate this information for metabolite identification. In addition to the conventional mass and isotope ratio matches, we match annotated fragments with low-energy MS/MS spectra in public databases. For identification of metabolites without accessible MS/MS spectra, we have developed characteristic fragment and common substructure matches. The accuracy and effectiveness of the procedure were evaluated using one public and two in-house liquid chromatography-mass spectrometry (LC-MS) data sets. The procedure accurately identified 89% of 28 standard metabolites with derivative ions in the data sets. With respect to effectiveness, the procedure confidently identified the correct chemical formula of at least 42% of metabolites with derivative ions via MS/MS spectrum, characteristic fragment, and common substructure matches. The confidence level was determined according to the fulfilled identification criteria of the various matches and the relative retention time.
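The idea of grouping peaks that belong to the same metabolite can be sketched for the simplest case: co-eluting positive-mode adducts whose m/z values differ by known ion masses. This is an illustrative sketch only, not the published procedure. The adduct table, the tolerances, and the function name `group_adducts` are assumptions, and multimers, fragments, multiply charged ions, and the opposite-mode matching described above are not covered.

```python
# Monoisotopic mass added by common positive-mode adduct ions (in Da).
ADDUCTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def group_adducts(peaks, mz_tol=0.005, rt_tol=0.1):
    """Group co-eluting peaks consistent with one neutral mass.

    peaks: list of (mz, retention_time) tuples.
    Each peak is tentatively treated as [M+H]+; other peaks within rt_tol
    whose m/z matches another adduct of the same neutral mass are attached.
    Returns a list of (neutral_mass, {adduct_name: peak_index}) for groups
    with at least two members.
    """
    groups = []
    for i, (mz_i, rt_i) in enumerate(peaks):
        neutral = mz_i - ADDUCTS["[M+H]+"]
        members = {"[M+H]+": i}
        for name, shift in ADDUCTS.items():
            if name == "[M+H]+":
                continue
            for j, (mz_j, rt_j) in enumerate(peaks):
                same_time = abs(rt_j - rt_i) <= rt_tol
                same_mass = abs(mz_j - (neutral + shift)) <= mz_tol
                if j != i and same_time and same_mass:
                    members[name] = j
        if len(members) > 1:
            groups.append((neutral, members))
    return groups


# Hypothetical peak list: a hexose-like [M+H]+ / [M+Na]+ pair at about 5.0 min,
# an unrelated peak, and a potassium-adduct-like m/z that elutes far too late.
example_peaks = [(181.0707, 5.02), (203.0526, 5.03), (300.1234, 5.02), (219.0265, 9.0)]
example_groups = group_adducts(example_peaks)
```

Requiring both a retention-time match and a mass-difference match keeps the last peak out of the group even though its m/z fits the potassium adduct.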
Subject(s)
Metabolomics/methods, Tandem Mass Spectrometry/methods, Animals, Liquid Chromatography/methods, Experimental Diabetes Mellitus/metabolism, Diet, Ions/analysis, Ions/metabolism, Metabolome, Mice, Rats
ABSTRACT
Glycosylation is a highly complex modification influencing the functions and activities of proteins. Interpretation of intact glycopeptide spectra is crucial but challenging. In this paper, we present a mass spectrometry-based automated glycopeptide identification platform (MAGIC) to identify peptide sequences and glycan compositions directly from intact N-linked glycopeptide collision-induced dissociation spectra. The identification of the Y1 (peptideY0 + GlcNAc) ion is critical for the correct analysis of unknown glycoproteins, especially without prior knowledge of the proteins and glycans present in the sample. To ensure accurate Y1-ion assignment, we propose a novel algorithm called Trident that detects a triplet pattern corresponding to [Y0, Y1, Y2] or [Y0-NH3, Y0, Y1] from the fragmentation of the common trimannosyl core of N-linked glycopeptides. To facilitate the subsequent peptide sequence identification by common database search engines, MAGIC generates in silico spectra by overwriting the original precursor with the naked peptide m/z and removing all of the glycan-related ions. Finally, MAGIC computes the glycan compositions and ranks them. For the model glycoprotein horseradish peroxidase (HRP) and a 5-glycoprotein mixture, a 2- to 31-fold increase in the relative intensities of the peptide fragments was achieved, which led to the identification of 7 tryptic glycopeptides from HRP and 16 glycopeptides from the mixture via Mascot. In the HeLa cell proteome data set, MAGIC processed over a thousand MS2 spectra in 3 min on a PC and reported 36 glycopeptides from 26 glycoproteins. Finally, a false discovery rate of 0 was achieved on the N-glycosylation-free Escherichia coli data set. MAGIC is available at http://ms.iis.sinica.edu.tw/COmics/Software_MAGIC.html.
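The triplet pattern can be illustrated with a small peak-spacing search. This is a hedged sketch, not the Trident algorithm itself: it assumes that Y2 is Y1 plus one further GlcNAc, that all three ions share the same charge, and that a fixed m/z tolerance is adequate. The function names and the tolerance value are hypothetical.

```python
HEXNAC = 203.07937  # monoisotopic mass of a GlcNAc residue (Da)
NH3 = 17.02655      # monoisotopic mass of ammonia (Da)


def closest_peak(mzs, target, tol):
    """Return the observed m/z closest to target, or None if outside tol."""
    best = min(mzs, key=lambda m: abs(m - target))
    return best if abs(best - target) <= tol else None


def find_y1_candidates(mzs, charge=1, tol=0.02):
    """Return (Y0, Y1) m/z pairs supported by a third peak.

    A pair is accepted if either [Y0, Y1, Y2] or [Y0-NH3, Y0, Y1] is present,
    where consecutive Y ions are spaced by one GlcNAc residue.
    """
    d_hexnac = HEXNAC / charge
    d_nh3 = NH3 / charge
    hits = []
    for y0 in mzs:
        y1 = closest_peak(mzs, y0 + d_hexnac, tol)
        if y1 is None:
            continue
        has_y2 = closest_peak(mzs, y1 + d_hexnac, tol) is not None
        has_nh3_loss = closest_peak(mzs, y0 - d_nh3, tol) is not None
        if has_y2 or has_nh3_loss:
            hits.append((y0, y1))
    return hits


# Hypothetical singly charged spectrum containing a Y0/Y1/Y2 ladder.
spectrum = [500.0, 1000.50, 1203.58, 1406.66, 1500.0]
candidates = find_y1_candidates(spectrum)
```

Requiring a third supporting peak is what separates a triplet from a chance 203 Da spacing between two unrelated ions; a real implementation would presumably also score intensities and try several charge states.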
Subject(s)
Algorithms, Computational Biology, Glycopeptides/analysis, Software, Automation, Factual Databases, Escherichia coli/chemistry, Glycopeptides/chemistry, HeLa Cells, Humans
ABSTRACT
BACKGROUND: Lack of power and reproducibility are caveats of genetic association studies of common complex diseases. Indeed, the heterogeneity of disease etiology demands that causal models consider the simultaneous involvement of multiple genes. Rothman's sufficient-cause model, which is well known in epidemiology, provides a framework for such a concept. In the present work, we developed a three-stage algorithm to construct gene clusters resembling Rothman's causal model for a complex disease, starting from finding influential gene pairs followed by grouping homogeneous pairs. RESULTS: The algorithm was trained and tested on 2,772 hypertensives and 6,515 normotensives extracted from four large Caucasian and Taiwanese databases. The constructed clusters, each featuring a major gene interacting with many other genes and identifying a distinct group of patients, were reproduced in both ethnic populations and across three genotyping platforms. We present the 14 largest gene clusters, which were capable of identifying 19.3% of hypertensives in all the datasets and 41.8% if one dataset was excluded for lack of phenotype information. Although a few normotensives were also identified by the gene clusters, they usually carried fewer risky combinatory genotypes (insufficient causes) than their hypertensive counterparts. After establishing a cut-off percentage for risky combinatory genotypes in each gene cluster, the 14 gene clusters achieved a classification accuracy of 82.8% for all datasets and 98.9% if the information-short dataset was excluded. Furthermore, not only 10 of the 14 major genes but also many other contributing genes in the clusters are associated with either hypertension or hypertension-related diseases or functions. CONCLUSIONS: We have shown with the constructed gene clusters that a multi-causal-pie, multi-component approach can indeed improve the reproducibility of genetic markers for complex disease. In addition, our novel findings, including a major gene in each cluster and sufficient risky genotypes in a cluster for disease onset (which coincides with Rothman's sufficient-cause theory), may not only provide a new research direction for complex diseases but also help to reveal the disease etiology.
Subject(s)
Computational Biology, Hypertension/etiology, Hypertension/genetics, Multigene Family/genetics, Age of Onset, Algorithms, Female, Genotype, Humans, Middle Aged, Reproducibility of Results, Time Factors, Transcriptome
ABSTRACT
The tumour metabolomics and transcriptomics co-expression network, as related to biological folate alteration and cancer malignancy, remains unexplored in human non-small cell lung cancers (NSCLC). To probe diagnostic biomarkers, tumour and paired lung tissue samples (n = 56) from 97 NSCLC patients were profiled for ultra-performance liquid chromatography tandem mass spectrometry (UPLC/MS/MS)-analysed metabolomics, targeted transcriptomics, and clinical folate traits. Weighted Gene Co-expression Network Analysis (WGCNA) was performed. Tumour lactate was identified as the top VIP marker to predict advanced NSCLC (AUC = 0.765, Sig = 0.017, CI 0.58-0.95). Low folate (LF)-tumours vs. adjacent lungs displayed a higher glycolytic index of lactate and glutamine-associated amino acids in enriched biological pathways of amino sugar and glutathione metabolism specific to advanced NSCLCs. WGCNA classified the green module for hub serine-navigated glutamine metabolites inversely associated with tumour and RBC folate; the module metabolites were co-expressed with a predominant up-regulation of LF-responsive metabolic genes in glucose transport (GLUT1), de novo serine synthesis (PHGDH, PSPH, and PSAT1), and the folate cycle (SHMT1/2 and PCFR), and with down-regulation in glutaminolysis (SLC1A5, SLC7A5, GLS, and GLUD1). The LF-responsive WGCNA markers predicted poor survival rates in lung cancer patients, which could aid in optimizing folate intervention for better prognosis of NSCLCs susceptible to folate malnutrition.
Subject(s)
Non-Small-Cell Lung Carcinoma, Lung Neoplasms, Humans, Non-Small-Cell Lung Carcinoma/metabolism, Lung Neoplasms/metabolism, Folic Acid, Glutamine/metabolism, Tandem Mass Spectrometry, Prognosis, Metabolomics/methods, Minor Histocompatibility Antigens, Amino Acid Transport System ASC
ABSTRACT
Isotope labeling combined with liquid chromatography-mass spectrometry (LC-MS) provides a robust platform for analyzing differential protein expression in proteomics research. We present a web service, called MaXIC-Q Web (http://ms.iis.sinica.edu.tw/MaXIC-Q_Web/), for quantitation analysis of large-scale datasets generated from proteomics experiments using various stable isotope-labeling techniques, e.g. SILAC, ICAT and user-developed labeling methods. It accepts spectral files in the standard mzXML format and search results from SEQUEST, Mascot and ProteinProphet as input. Furthermore, MaXIC-Q Web uses statistical and computational methods to construct two kinds of elution profiles for each ion from MS data, namely, the PIMS (projected ion mass spectrum) and the XIC (extracted ion chromatogram). For accurate quantitation, a stringent validation procedure is performed on PIMSs to filter out peptide ions that suffer interference from co-eluting peptides or noise. The areas of XICs determine ion abundances, which are used to calculate peptide and protein ratios. Because MaXIC-Q Web applies stringent validation to spectral data, it achieves high accuracy, so manual validation effort can be substantially reduced. Furthermore, it provides various visualization diagrams and comprehensive quantitation reports so that users can conveniently inspect quantitation results. In summary, MaXIC-Q Web is a user-friendly, interactive, robust, generic web service for quantitation based on ICAT and SILAC labeling techniques.
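The step from XIC areas to peptide and protein ratios can be sketched as follows. This is an assumption-laden illustration rather than MaXIC-Q Web's actual method: trapezoidal integration, the heavy-over-light orientation of the ratio, and the median as the protein-level summary are choices made here for the example, and all function names are hypothetical.

```python
import statistics


def xic_area(times, intensities):
    """Integrate an extracted ion chromatogram with the trapezoidal rule."""
    area = 0.0
    for k in range(1, len(times)):
        width = times[k] - times[k - 1]
        area += width * (intensities[k] + intensities[k - 1]) / 2.0
    return area


def peptide_ratio(light_xic, heavy_xic):
    """Heavy/light abundance ratio; each XIC is a (times, intensities) pair."""
    return xic_area(*heavy_xic) / xic_area(*light_xic)


def protein_ratio(peptide_ratios):
    """Summarize peptide ratios for one protein (assumption: median)."""
    return statistics.median(peptide_ratios)


# Hypothetical triangular elution profiles for a light/heavy peptide pair.
light = ([0.0, 1.0, 2.0], [0.0, 10.0, 0.0])
heavy = ([0.0, 1.0, 2.0], [0.0, 20.0, 0.0])
ratio = peptide_ratio(light, heavy)
summary = protein_ratio([ratio, 1.0, 4.0])
```

A median is used here because it is less sensitive than a mean to a single peptide with an interfered XIC, which is the kind of outlier the validation step described above is meant to remove.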
Subject(s)
Liquid Chromatography, Mass Spectrometry, Proteins/analysis, Software, Computational Biology, Statistical Data Interpretation, Endothelial Cells/metabolism, Internet, Isotope Labeling, Nitric Acid/metabolism, Peptides/chemistry, Reproducibility of Results, User-Computer Interface
ABSTRACT
MOTIVATION: Identification of disease-related genes using high-throughput microarray data is more difficult for complex diseases as compared with monogenic ones. We hypothesized that an endophenotype derived from transcriptional data is associated with a set of genes corresponding to a pathway cluster. We assumed that a complex disease is associated with multiple endophenotypes and can be induced by their up/downregulated gene expression patterns. Thus, a neural network model was adopted to simulate the gene-endophenotype-disease relationship in which endophenotypes were represented by hidden nodes. RESULTS: We successfully constructed a three-endophenotype model for Taiwanese hypertensive males with high identification accuracy. Of the three endophenotypes, one is strongly protective, another is weakly protective and the third is highly correlated with developing young-onset male hypertension. Sixteen of the involved 101 genes were highly and consistently influential to the endophenotypes. Identification of SLC4A5, SLC5A10 and LDOC1 indicated that sodium/bicarbonate transport, sodium/glucose transport and cell-proliferation regulation may play important upstream roles and identification of BNIP1, APOBEC3F and LDOC1 suggested that apoptosis, innate immune response and cell-proliferation regulation may play important downstream roles in hypertension. The involved genes not only provide insights into the mechanism of hypertension but should also be considered in future gene mapping endeavors.
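The gene-endophenotype-disease structure can be pictured as a network with one hidden layer in which each hidden node stands for an endophenotype. The sketch below shows only a forward pass. The sigmoid activations, the two-gene input, and every weight value are invented for illustration and are not taken from the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(expression, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: gene expression -> endophenotype activations -> disease score.

    hidden_weights has one row of per-gene weights for each hidden node.
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, expression)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    score = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)
    return hidden, score


# Hypothetical three-endophenotype model over two genes. With zero input
# weights every hidden node sits at 0.5, so only the output weights matter.
zero_rows = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
hidden, neutral_score = forward([1.0, -1.0], zero_rows, [0.0, 0.0, 0.0],
                                [0.0, 0.0, 0.0], 0.0)
# Negative output weights play the role of protective endophenotypes and a
# positive one the role of a risk endophenotype (values are made up).
_, mixed_score = forward([1.0, -1.0], zero_rows, [0.0, 0.0, 0.0],
                         [-2.0, -0.5, 2.0], 0.0)
```

In this picture, the sign and size of each output weight is what would make a hidden node read as strongly protective, weakly protective, or disease-associated.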
Subject(s)
Hypertension/epidemiology, Hypertension/genetics, Neural Networks (Computer), Oligonucleotide Array Sequence Analysis, Age of Onset, Computer Simulation, Humans, Male, Phenotype
ABSTRACT
Efficient and accurate quantitation of metabolites from LC-MS data has become an important topic. Here we present an automated tool, called iMet-Q (intelligent Metabolomic Quantitation), for label-free metabolomics quantitation from high-throughput MS1 data. By performing peak detection and peak alignment, iMet-Q provides a summary of quantitation results and reports ion abundance at both the replicate level and the sample level. Furthermore, it gives the charge states and isotope ratios of detected metabolite peaks to facilitate metabolite identification. An in-house standard mixture and a public Arabidopsis metabolome data set were analyzed by iMet-Q. Three public quantitation tools, XCMS, MetAlign, and MZmine 2, were used for performance comparison. From the mixture data set, seven standard metabolites were detected by the four quantitation tools, for which iMet-Q had a smaller quantitation error of 12% in both profile and centroid data sets. Our tool also correctly determined the charge states of the seven standard metabolites. By searching the mass values for those standard metabolites against the Human Metabolome Database, we obtained a total of 183 metabolite candidates. With the isotope ratios calculated by iMet-Q, 49% (89 out of 183) of the metabolite candidates were filtered out. From the public Arabidopsis data set reported with two internal standards and 167 elucidated metabolites, iMet-Q detected all of the peaks corresponding to the internal standards and the 167 metabolites. Meanwhile, our tool had small abundance variation (≤ 0.19) when quantifying the two internal standards and had higher abundance correlation (≥ 0.92) when quantifying the 167 metabolites. iMet-Q provides user-friendly interfaces and is publicly available for download at http://ms.iis.sinica.edu.tw/comics/Software_iMet-Q.html.
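How an isotope ratio can prune database candidates is sketched below. This is not iMet-Q's procedure: the example approximates the M+1/M intensity ratio from the carbon count alone (ignoring the smaller contributions of H, N, O, and S), and the relative tolerance and function names are assumptions.

```python
import re

# Natural-abundance ratio of 13C to 12C (about 1.07% versus 98.93%).
C13_RATIO = 0.0107 / 0.9893


def carbon_count(formula):
    """Count carbon atoms in a simple formula such as 'C6H12O6'.

    The lookahead skips two-letter elements that start with C (Cl, Ca, ...).
    """
    match = re.search(r"C(?![a-z])(\d*)", formula)
    if match is None:
        return 0
    return int(match.group(1) or 1)


def expected_m1_ratio(formula):
    """Carbon-only approximation of the M+1 / M peak intensity ratio."""
    return carbon_count(formula) * C13_RATIO


def filter_candidates(candidates, observed_ratio, rel_tol=0.2):
    """Keep formulas whose expected ratio is within rel_tol of the observation."""
    return [
        f for f in candidates
        if abs(expected_m1_ratio(f) - observed_ratio) <= rel_tol * observed_ratio
    ]


# Hypothetical candidate formulas and an observed M+1/M ratio of 6.6%.
kept = filter_candidates(["C6H12O6", "C12H22O11", "C3H7NO2"], observed_ratio=0.066)
```

In practice, candidates returned by a single mass search have nearly the same mass, so the discrimination comes from differences in elemental composition, and a full isotope-pattern calculation would be more selective than this carbon-only estimate.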
Subject(s)
Metabolome, Metabolomics/methods, Software, Arabidopsis/metabolism, Humans
ABSTRACT
Nature determines the complexity of disease etiology and the likelihood of revealing disease genes. While culprit genes for many monogenic diseases have been successfully unraveled, efforts to map major complex disease genes have not been as productive as hoped. The conceptual framework currently adopted to deal with the heterogeneous nature of complex diseases focuses on using homogeneous internal features of the disease phenotype for mapping. However, phenotypic homogeneity does not equal genotypic homogeneity. In this report, we advocate working with well-measured phenotypes portrayed by amounts of transcripts and activities of gene products or their metabolites, which are pertinent to relatively small pathway clusters. Reliable and controlled measures for oligogenic traits resulting from proper dissection efforts may enhance statistical power. The large amounts of information obtained on gene and protein expression from technological advances can add to the power of gene finding, particularly for diseases with unclear etiology. Data-mining tools for dimension reduction can help biologists reveal novel molecular endophenotypes. However, there are still hurdles to overcome, including high cost, relatively poor reproducibility and comparability among platforms, the cross-sectional nature of the information, and the accessibility of human tissues. Concerted efforts are required to carry out large-scale prospective studies that are integrated at the levels of phenotype characterization, high-throughput experimental techniques, data analyses, and beyond.