Results 1 - 20 of 33
1.
Bioinformatics ; 35(7): 1159-1166, 2019 04 01.
Article in English | MEDLINE | ID: mdl-30184069

ABSTRACT

MOTIVATION: As the time and cost of sequencing decrease, the number of available genomes and transcriptomes rapidly increases. Yet the quality of the assemblies and the gene annotations varies considerably and often remains poor, affecting downstream analyses. This is particularly true when fragments of the same gene are annotated as distinct genes, which may cause them to be mistaken for paralogs. RESULTS: In this study, we introduce two novel phylogenetic tests to infer non-overlapping or partially overlapping genes that are in fact parts of the same gene. One approach collapses branches with low bootstrap support and the other computes a likelihood ratio test. We extensively validated these methods by (i) introducing and recovering fragmentation on the bread wheat, Triticum aestivum cv. Chinese Spring, chromosome 3B; (ii) applying the methods to the low-quality 3B assembly and validating predictions against the high-quality 3B assembly; and (iii) comparing the performance of the proposed methods to the performance of existing methods, namely Ensembl Compara and ESPRIT. Application of this combination to a draft shotgun assembly of the entire bread wheat genome revealed 1221 pairs of genes that are highly likely to be fragments of the same gene. Our approach demonstrates the power of fine-grained evolutionary inferences across multiple species to improve genome assemblies and annotations. AVAILABILITY AND IMPLEMENTATION: An open source software tool is available at https://github.com/DessimozLab/esprit2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Triticum , Genome, Plant , Molecular Sequence Annotation , Phylogeny , Software
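
The likelihood ratio test mentioned in the abstract above can be illustrated generically. The sketch below is not the esprit2 implementation: it assumes two nested hypotheses that differ by one free parameter and a chi-square reference distribution, and the log-likelihood values are invented for illustration.

```python
import math


def likelihood_ratio_test(lnl_null, lnl_alt):
    """Return (statistic, p_value) for a likelihood ratio test with 1 degree of freedom.

    Assumption: the two models are nested and differ by one free parameter,
    so the statistic is compared against a chi-square(1) distribution.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    stat = max(stat, 0.0)  # guard against tiny negative values from optimisation noise
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for the null and alternative hypotheses
stat, p = likelihood_ratio_test(lnl_null=-1234.5, lnl_alt=-1230.0)
print(f"LRT statistic = {stat:.2f}, p = {p:.4f}")
```

A small p-value would favour the alternative hypothesis; which hypothesis corresponds to "fragments of the same gene" depends on how the test is set up in the actual tool.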
2.
Bioinformatics ; 34(17): i612-i619, 2018 09 01.
Article in English | MEDLINE | ID: mdl-30423067

ABSTRACT

Motivation: A key goal in plant biotechnology applications is the identification of genes associated to particular phenotypic traits (for example: yield, fruit size, root length). Quantitative Trait Loci (QTL) studies identify genomic regions associated with a trait of interest. However, to infer potential causal genes in these regions, each of which can contain hundreds of genes, these data are usually intersected with prior functional knowledge of the genes. This process is however laborious, particularly if the experiment is performed in a non-model species, and the statistical significance of the inferred candidates is typically unknown. Results: This paper introduces QTLSearch, a method and software tool to search for candidate causal genes in QTL studies by combining Gene Ontology annotations across many species, leveraging hierarchical orthologous groups. The usefulness of this approach is demonstrated by re-analysing two metabolic QTL studies: one in Arabidopsis thaliana, the other in Oryza sativa subsp. indica. Even after controlling for statistical significance, QTLSearch inferred potential causal genes for more QTL than BLAST-based functional propagation against UniProtKB/Swiss-Prot, and for more QTL than in the original studies. Availability and implementation: QTLSearch is distributed under the LGPLv3 license. It is available to install from the Python Package Index (as qtlsearch), with the source available from https://bitbucket.org/alex-warwickvesztrocy/qtlsearch. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Quantitative Trait Loci , Software , Arabidopsis/genetics , Genomics , Molecular Sequence Annotation , Oryza/genetics
3.
ACS Synth Biol ; 7(4): 1163-1166, 2018 04 20.
Article in English | MEDLINE | ID: mdl-29558112

ABSTRACT

Computational systems biology methods enable rational design of cell factories on a genome-scale and thus accelerate the engineering of cells for the production of valuable chemicals and proteins. Unfortunately, the majority of these methods' implementations are either not published, rely on proprietary software, or do not provide documented interfaces, which has precluded their mainstream adoption in the field. In this work we present cameo, a platform-independent software that enables in silico design of cell factories and targets both experienced modelers as well as users new to the field. It is written in Python and implements state-of-the-art methods for enumerating and prioritizing knockout, knock-in, overexpression, and down-regulation strategies and combinations thereof. Cameo is an open source software project and is freely available under the Apache License 2.0. A dedicated Web site including documentation, examples, and installation instructions can be found at http://cameo.bio . Users can also give cameo a try at http://try.cameo.bio .


Subject(s)
Computational Biology/methods , Metabolic Engineering/methods , Software , Gene Knockout Techniques , Models, Biological , Programming Languages , Systems Biology/methods , Workflow
4.
Nucleic Acids Res ; 46(D1): D477-D485, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29106550

ABSTRACT

The Orthologous Matrix (OMA) is a leading resource to relate genes across many species from all of life. In this update paper, we review the recent algorithmic improvements in the OMA pipeline, describe increases in species coverage (particularly in plants and early-branching eukaryotes) and introduce several new features in the OMA web browser. Notable improvements include: (i) a scalable, interactive viewer for hierarchical orthologous groups; (ii) protein domain annotations and domain-based links between orthologous groups; (iii) functionality to retrieve phylogenetic marker genes for a subset of species of interest; (iv) a new synteny dot plot viewer; and (v) an overhaul of the programmatic access (REST API and semantic web), which will facilitate incorporation of OMA analyses in computational pipelines and integration with other bioinformatic resources. OMA can be freely accessed at https://omabrowser.org.


Subject(s)
Biological Evolution , Databases, Genetic , Genome , Molecular Sequence Annotation , Proteins/genetics , Synteny , Algorithms , Animals , Archaea/classification , Archaea/genetics , Archaea/metabolism , Bacteria/classification , Bacteria/genetics , Bacteria/metabolism , Computational Biology/methods , Fungi/classification , Fungi/genetics , Fungi/metabolism , Gene Ontology , Humans , Internet , Phylogeny , Plants/classification , Plants/genetics , Plants/metabolism , Protein Domains , Proteins/chemistry , Proteins/metabolism , Web Browser
5.
Plant Methods ; 13: 13, 2017.
Article in English | MEDLINE | ID: mdl-28331535

ABSTRACT

BACKGROUND: Growth is an important parameter to consider when studying the impact of treatments or mutations on plant physiology. Leaf area and growth rates can be estimated efficiently from images of plants, but the experiment setup, image analysis, and statistical evaluation can be laborious, often requiring substantial manual effort and programming skills. RESULTS: Here we present rosettR, a non-destructive and high-throughput phenotyping protocol for the measurement of total rosette area of seedlings grown in plates in sterile conditions. We demonstrate that our protocol can be used to accurately detect growth differences among different genotypes and in response to light regimes and osmotic stress. rosettR is implemented as a package for the statistical computing software R and provides easy to use functions to design an experiment, analyze the images, and generate reports on quality control as well as a final comparison across genotypes and applied treatments. Experiment procedures are included as part of the package documentation. CONCLUSIONS: Using rosettR it is straight-forward to perform accurate, reproducible measurements of rosette area and relative growth rate with high-throughput using inexpensive equipment. Suitable applications include screening mutant populations for growth phenotypes visible at early growth stages and profiling different genotypes in a wide variety of treatments.
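
The relative growth rate named in the abstract above is commonly computed from two area measurements as the difference of their logarithms divided by the elapsed time. The following is a minimal Python sketch of that textbook formula; it is not rosettR code (rosettR is an R package and may compute the rate differently), and the areas and time span are invented.

```python
import math


def relative_growth_rate(area_start, area_end, days):
    """Mean relative growth rate (per day) between two area measurements."""
    if area_start <= 0 or area_end <= 0 or days <= 0:
        raise ValueError("areas and elapsed time must be positive")
    return (math.log(area_end) - math.log(area_start)) / days


# Hypothetical rosette areas (mm^2) measured 7 days apart
rgr = relative_growth_rate(area_start=12.0, area_end=48.0, days=7)
print(f"RGR = {rgr:.3f} per day")
```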

6.
Mitochondrion ; 33: 58-71, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27476757

ABSTRACT

Cellular signaling pathways are regulated in a highly dynamic fashion in order to quickly adapt to distinct environmental conditions. Acetylation of lysine residues represents a central process that orchestrates cellular metabolism and signaling. In mitochondria, acetylation seems to be the most prevalent post-translational modification, presumably linked to the compartmentation and high turnover of acetyl-CoA in this organelle. Similarly, the elevated pH and the higher concentration of metabolites in mitochondria seem to favor non-enzymatic lysine modifications, as well as other acylations. Hence, elucidating the mechanisms for metabolic control of protein acetylation is crucial for our understanding of cellular processes. Recent advances in mass spectrometry-based proteomics have considerably increased our knowledge of the regulatory scope of acetylation. Here, we review the current knowledge and functional impact of mitochondrial protein acetylation across species. We first cover the experimental approaches to identify and analyze lysine acetylation on a global scale, we then explore both commonalities and specific differences of plant and animal acetylomes and the evolutionary conservation of protein acetylation, as well as its particular impact on metabolism and diseases. Important future directions and technical challenges are discussed, and it is pointed out that the transfer of knowledge between species and diseases, both in technology and biology, is of particular importance for further advancements in this field.


Subject(s)
Acetyl Coenzyme A/metabolism , Lysine/metabolism , Mitochondria/metabolism , Mitochondrial Proteins/metabolism , Protein Processing, Post-Translational , Acetylation , Animals , Computational Biology , Mass Spectrometry , Plants , Proteomics
7.
Trends Plant Sci ; 21(7): 609-621, 2016 07.
Article in English | MEDLINE | ID: mdl-27021699

ABSTRACT

The evolutionary history of nearly all flowering plants includes a polyploidization event. Homologous genes resulting from allopolyploidy are commonly referred to as 'homoeologs', although this term has not always been used precisely or consistently in the literature. With several allopolyploid genome sequencing projects under way, there is a pressing need for computational methods for homoeology inference. Here we review the definition of homoeology in historical and modern contexts and propose a precise and testable definition highlighting the connection between homoeologs and orthologs. In the second part, we survey experimental and computational methods of homoeolog inference, considering the strengths and limitations of each approach. Establishing a precise and evolutionarily meaningful definition of homoeology is essential for understanding the evolutionary consequences of polyploidization.


Subject(s)
Genes, Plant/genetics , Evolution, Molecular , Gene Expression Regulation, Plant/genetics , Genome, Plant/genetics , Polyploidy
8.
Nucleic Acids Res ; 43(Database issue): D240-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25399418

ABSTRACT

The Orthologous Matrix (OMA) project is a method and associated database that infers evolutionary relationships among currently 1706 complete proteomes (i.e. the protein sequences associated with every protein-coding gene in all genomes). In this update article, we present six major new developments in OMA: (i) a new web interface; (ii) Gene Ontology function predictions as part of the OMA pipeline; (iii) better support for plant genomes and in particular homeologs in the wheat genome; (iv) a new synteny viewer providing the genomic context of orthologs; (v) statically computed hierarchical orthologous groups subsets downloadable in OrthoXML format; and (vi) the possibility to export parts of the all-against-all computations and to combine them with custom data for 'client-side' orthology prediction. OMA can be accessed through the OMA Browser and various programmatic interfaces at http://omabrowser.org.


Subject(s)
Databases, Protein , Plant Proteins/genetics , Proteome/chemistry , Sequence Homology, Amino Acid , Algorithms , Gene Ontology , Genome, Plant , Humans , Internet , Plant Proteins/chemistry , Proteome/genetics , Synteny , Triticum/genetics
9.
Front Plant Sci ; 5: 668, 2014.
Article in English | MEDLINE | ID: mdl-25506350

ABSTRACT

An attempt has been made to define the extent to which metabolic flux in central plant metabolism is reflected by changes in the transcriptome and metabolome, based on an analysis of in vitro cultured immature embryos of two oilseed rape (Brassica napus) accessions which contrast for seed lipid accumulation. Metabolic flux analysis (MFA) was used to constrain a flux balance metabolic model which included 671 biochemical and transport reactions within the central metabolism. This highly confident flux information was eventually used for comparative analysis of flux vs. transcript (metabolite). Metabolite profiling succeeded in identifying 79 intermediates within the central metabolism, some of which differed quantitatively between the two accessions and displayed a significant shift corresponding to flux. An RNA-Seq based transcriptome analysis revealed a large number of genes which were differentially transcribed in the two accessions, including some enzymes/proteins active in major metabolic pathways. With a few exceptions, differential activity in the major pathways (glycolysis, TCA cycle, amino acid, and fatty acid synthesis) was not reflected in contrasting abundances of the relevant transcripts. The conclusion was that transcript abundance on its own cannot be used to infer metabolic activity/fluxes in central plant metabolism. This limitation needs to be borne in mind in evaluating transcriptome data and designing metabolic engineering experiments.

10.
BMC Genomics ; 13: 79, 2012 Feb 21.
Article in English | MEDLINE | ID: mdl-22353141

ABSTRACT

BACKGROUND: The importance of peptide microarrays as a tool for serological diagnostics has strongly increased over the last decade. However, interpretation of the binding signals is still hampered by our limited understanding of the technology. This is in particular true for arrays probed with antibody mixtures of unknown complexity, such as sera. To gain insight into how signals depend on peptide amino acid sequences, we probed random-sequence peptide microarrays with sera of healthy and infected mice. We analyzed the resulting antibody binding profiles with regression methods and formulated a minimal model to explain our findings. RESULTS: Multivariate regression analysis relating peptide sequence to measured signals led to the definition of amino acid-associated weights. Although these weights do not contain information on amino acid position, they predict up to 40-50% of the binding profiles' variation. Mathematical modeling shows that this position-independent ansatz is only adequate for highly diverse random antibody mixtures which are not dominated by a few antibodies. Experimental results suggest that sera from healthy individuals correspond to that case, in contrast to sera of infected ones. CONCLUSIONS: Our results indicate that position-independent amino acid-associated weights predict linear epitope binding of antibody mixtures only if the mixture is random, highly diverse, and contains no dominant antibodies. The discovered ensemble property is an important step towards an understanding of peptide-array serum-antibody binding profiles. It has implications for both serological diagnostics and B cell epitope mapping.


Subject(s)
Antibodies, Bacterial/immunology , Antibodies/immunology , Models, Immunological , Peptides/immunology , Algorithms , Animals , Antibodies, Bacterial/blood , Antibodies, Monoclonal/immunology , Computer Simulation , Epitope Mapping , Immunoglobulin G/blood , Immunoglobulin G/immunology , Mice , Mice, Inbred BALB C , Nematospiroides dubius/immunology , Peptides/chemistry , Protein Binding/immunology , Regression Analysis , Sensitivity and Specificity
11.
BMC Syst Biol ; 5: 176, 2011 Oct 28.
Article in English | MEDLINE | ID: mdl-22034874

ABSTRACT

BACKGROUND: Increasing awareness of limitations to natural resources has set high expectations for plant science to deliver efficient crops with increased yields, improved stress tolerance, and tailored composition. Collections of representative varieties are a valuable resource for compiling broad breeding germplasms that can satisfy these diverse needs. RESULTS: Here we show that the untargeted high-coverage metabolomic characterization of such core collections is a powerful approach for studying the molecular backgrounds of quality traits and for constructing predictive metabolome-trait models. We profiled the metabolic composition of kernels from field-grown plants of the rice diversity research set using 4 complementary analytical platforms. We found that the metabolite profiles were correlated with both the overall population structure and fine-grained genetic diversity. Multivariate regression analysis showed that 10 of the 17 studied quality traits could be predicted from the metabolic composition independently of the population structure. Furthermore, the model of amylose ratio could be validated using external varieties grown in an independent experiment. CONCLUSIONS: Our results demonstrate the utility of metabolomics for linking traits with quantitative molecular data. This opens up new opportunities for trait prediction and construction of tailored germplasms to support modern plant breeding.


Subject(s)
Breeding/methods , Genetic Variation , Metabolomics/methods , Models, Genetic , Oryza/genetics , Oryza/metabolism , Phenotype , Genetics, Population , Regression Analysis
12.
BMC Syst Biol ; 5: 192, 2011 Nov 21.
Article in English | MEDLINE | ID: mdl-22104211

ABSTRACT

BACKGROUND: 14-3-3 proteins are considered master regulators of many signal transduction cascades in eukaryotes. In plants, 14-3-3 proteins have major roles as regulators of nitrogen and carbon metabolism, conclusions based on the studies of a few specific 14-3-3 targets. RESULTS: In this study, extensive novel roles of 14-3-3 proteins in plant metabolism were determined through combining the parallel analyses of metabolites and enzyme activities in 14-3-3 overexpression and knockout plants with studies of protein-protein interactions. Decreases in the levels of sugars and nitrogen-containing-compounds and in the activities of known 14-3-3-interacting-enzymes were observed in 14-3-3 overexpression plants. Plants overexpressing 14-3-3 proteins also contained decreased levels of malate and citrate, which are intermediate compounds of the tricarboxylic acid (TCA) cycle. These modifications were related to the reduced activities of isocitrate dehydrogenase and malate dehydrogenase, which are key enzymes of TCA cycle. In addition, we demonstrated that 14-3-3 proteins interacted with one isocitrate dehydrogenase and two malate dehydrogenases. There were also changes in the levels of aromatic compounds and the activities of shikimate dehydrogenase, which participates in the biosynthesis of aromatic compounds. CONCLUSION: Taken together, our findings indicate that 14-3-3 proteins play roles as crucial tuners of multiple primary metabolic processes including TCA cycle and the shikimate pathway.


Subject(s)
14-3-3 Proteins/physiology , Arabidopsis Proteins/physiology , Arabidopsis/metabolism , 14-3-3 Proteins/genetics , 14-3-3 Proteins/metabolism , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Citric Acid Cycle , Metabolic Networks and Pathways , Protein Interaction Maps , Shikimic Acid/metabolism , Signal Transduction
13.
Anal Chem ; 83(14): 5645-51, 2011 Jul 15.
Article in English | MEDLINE | ID: mdl-21630645

ABSTRACT

Metabolomics has become an integral part of many life-science applications but is technically still very challenging. Numerous analytical approaches are needed as metabolites have very broad concentration ranges and extremely diverse chemical properties. Configuring a metabolomics pipeline and exploring its merits is a complex task that depends on effective and transparent evaluation procedures. Unfortunately, there are no widely applicable methods to evaluate how well acquired data can approximate actual concentration differences. Here, we introduce a powerful approach that provides semiquantitative calibration curves over a biologically defined concentration range for all detected compounds. By performing metabolomics on a stepwise gradient between two biological specimens, we obtain a data set where each peak would ideally show a linear dependency on the mixture ratio. An example gradient between extracts of tomato leaf and fruit demonstrates good calibration statistics for a large proportion of the peaks but also highlights cases with strong background-dependent signal interference. Analysis of artificial biological gradients is a general and inexpensive tool for calibration that greatly facilitates data interpretation, quality control and method comparisons.


Subject(s)
Metabolomics/methods , Plant Extracts/chemistry , Solanum lycopersicum/chemistry , Calibration , Fruit/chemistry , Plant Leaves/chemistry
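
The calibration idea in the abstract above (each peak should ideally depend linearly on the mixture ratio) can be sketched as an ordinary least-squares fit per peak. This is an illustrative assumption about how such a check could be coded, not the authors' procedure; the mixture ratios and peak intensities are invented.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear fit, R^2 equals the squared Pearson correlation
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r2


# Hypothetical gradient: fraction of specimen B mixed into specimen A
ratios = [0.0, 0.25, 0.5, 0.75, 1.0]
well_behaved_peak = [100.0, 150.0, 200.0, 250.0, 300.0]
interfered_peak = [100.0, 260.0, 240.0, 120.0, 300.0]

for name, intensities in [("well-behaved", well_behaved_peak), ("interfered", interfered_peak)]:
    slope, intercept, r2 = fit_line(ratios, intensities)
    print(f"{name}: slope={slope:.1f}, intercept={intercept:.1f}, R^2={r2:.2f}")
```

A peak with a low R^2 along the gradient would be a candidate for the background-dependent signal interference that the abstract mentions.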
14.
Bioinformatics ; 27(13): i357-65, 2011 Jul 01.
Article in English | MEDLINE | ID: mdl-21685093

ABSTRACT

MOTIVATION: Studying the interplay between gene expression and metabolite levels can yield important information on the physiology of stress responses and adaptation strategies. Performing transcriptomics and metabolomics in parallel during time-series experiments represents a systematic way to gain such information. Several combined profiling datasets have been added to the public domain and they form a valuable resource for hypothesis generating studies. Unfortunately, detecting coresponses between transcript levels and metabolite abundances is non-trivial: they cannot be assumed to overlap directly with underlying biochemical pathways and they may be subject to time delays and obscured by considerable noise. RESULTS: Our aim was to predict pathway comemberships between metabolites and genes based on their coresponses to applied stress. We found that in the presence of strong noise and time-shifted responses, a hidden Markov model-based similarity outperforms the simpler Pearson correlation but performs comparably or worse in their absence. Therefore, we propose a supervised method that applies pathway information to summarize similarity statistics to a consensus statistic that is more informative than any of the single measures. Using four combined profiling datasets, we show that comembership between metabolites and genes can be predicted for numerous KEGG pathways; this opens opportunities for the detection of transcriptionally regulated pathways and novel metabolically related genes. AVAILABILITY: A command-line software tool is available at http://www.cin.ufpe.br/~igcf/Metabolites. CONTACT: henning@psc.riken.jp; igcf@cin.ufpe.br


Subject(s)
Gene Expression Profiling , Metabolic Networks and Pathways , Metabolomics , Models, Statistical , Adaptation, Physiological , Arabidopsis/genetics , Arabidopsis/metabolism , Computational Biology , Markov Chains , Software
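
The abstract above notes that time delays weaken the plain Pearson correlation between transcript and metabolite profiles. The sketch below illustrates that effect with a simple lag scan. It is not the hidden Markov model-based similarity or the supervised consensus statistic from the paper; the function names and the time series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lagged_pearson(transcript, metabolite, max_lag):
    """Correlate transcript[t] with metabolite[t + lag] for lag = 0..max_lag.

    Returns (lag, r) for the lag with the largest absolute correlation.
    """
    best = (0, pearson(transcript, metabolite))
    for lag in range(1, max_lag + 1):
        r = pearson(transcript[:-lag], metabolite[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Hypothetical time series: the metabolite follows the transcript two time points later
transcript = [0, 1, 4, 2, 1, 0, 0, 0]
metabolite = [0, 0, 0, 1, 4, 2, 1, 0]

print("unshifted r:", round(pearson(transcript, metabolite), 3))
print("best (lag, r):", best_lagged_pearson(transcript, metabolite, max_lag=3))
```

Scanning lags shortens the overlapping series and inflates the chance of spurious matches, so the result would need a significance check on real data.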
15.
J Biol Chem ; 286(28): 25224-35, 2011 Jul 15.
Article in English | MEDLINE | ID: mdl-21558269

ABSTRACT

The genome of Synechocystis PCC 6803 contains a single gene encoding an aquaporin, aqpZ. The AqpZ protein functioned as a water-permeable channel in the plasma membrane. However, the physiological importance of AqpZ in Synechocystis remains unclear. We found that growth in glucose-containing medium inhibited proper division of ΔaqpZ cells and led to cell death. Deletion of a gene encoding a glucose transporter in the ΔaqpZ background alleviated the glucose-mediated growth inhibition of the ΔaqpZ cells. The ΔaqpZ cells swelled more than the wild type after the addition of glucose, suggesting an increase in cytosolic osmolarity. This was accompanied by a down-regulation of the pentose phosphate pathway and concurrent glycogen accumulation. Metabolite profiling by GC/TOF-MS of wild-type and ΔaqpZ cells revealed a relative decrease of intermediates of the tricarboxylic acid cycle and certain amino acids in the mutant. The changed levels of metabolites may have been the cause for the observed decrease in growth rate of the ΔaqpZ cells along with decreased PSII activity at pH values ranging from 7.5 to 8.5. A mutant in sll1961, encoding a putative transcription factor, and a Δhik31 mutant, lacking a putative glucose-sensing kinase, both exhibited higher glucose sensitivity than the ΔaqpZ cells. Examination of protein expression indicated that sll1961 functioned as a positive regulator of aqpZ gene expression but not as the only regulator. Overall, the ΔaqpZ cells showed defects in macronutrient metabolism, pH homeostasis, and cell division under photomixotrophic conditions, consistent with an essential role of AqpZ in glucose metabolism.


Subject(s)
Aquaporins/metabolism , Bacterial Proteins/metabolism , Cell Membrane/metabolism , Glucose/metabolism , Synechocystis/metabolism , Aquaporins/genetics , Bacterial Proteins/genetics , Cell Membrane/genetics , Cytosol/metabolism , Gene Deletion , Glucose Transport Proteins, Facilitative/genetics , Glucose Transport Proteins, Facilitative/metabolism , Osmolar Concentration , Pentose Phosphate Pathway/physiology , Synechocystis/genetics
16.
PLoS One ; 6(2): e16989, 2011 Feb 16.
Article in English | MEDLINE | ID: mdl-21359231

ABSTRACT

As metabolomics can provide a biochemical snapshot of an organism's phenotype it is a promising approach for charting the unintended effects of genetic modification. A critical obstacle for this application is the inherently limited metabolomic coverage of any single analytical platform. We propose using multiple analytical platforms for the direct acquisition of an interpretable data set of estimable chemical diversity. As an example, we report an application of our multi-platform approach that assesses the substantial equivalence of tomatoes over-expressing the taste-modifying protein miraculin. In combination, the chosen platforms detected compounds that represent 86% of the estimated chemical diversity of the metabolites listed in the LycoCyc database. Following a proof-of-safety approach, we show that % had an acceptable range of variation while simultaneously indicating a reproducible transformation-related metabolic signature. We conclude that multi-platform metabolomics is an approach that is both sensitive and robust and that it constitutes a good starting point for characterizing genetically modified organisms.


Subject(s)
Food, Genetically Modified , Metabolomics , Solanum lycopersicum/chemistry , Algorithms , Food Safety/methods , Gene Expression Regulation, Plant , Glycoproteins/genetics , Solanum lycopersicum/genetics , Solanum lycopersicum/metabolism , Metabolome , Metabolomics/methods , Models, Biological , Nutritive Value , Plants, Genetically Modified/chemistry , Plants, Genetically Modified/genetics , Plants, Genetically Modified/metabolism , Quality Control , Taste/physiology
17.
J Exp Bot ; 62(4): 1439-53, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21220784

ABSTRACT

Plants can assimilate inorganic nitrogen (N) sources to organic N such as amino acids. N is the most important of the mineral nutrients required by plants and its metabolism is tightly coordinated with carbon (C) metabolism in the fundamental processes that permit plant growth. Increased understanding of N regulation may provide important insights for plant growth and improvement of quality of crops and vegetables because N as well as C metabolism are fundamental components of plant life. Metabolomics is a global biochemical approach useful to study N metabolism because metabolites not only reflect the ultimate phenotypes (traits), but can mediate transcript levels as well as protein levels directly and/or indirectly under different N conditions. This review outlines analytical and bioinformatic techniques particularly used to perform metabolomics for studying N metabolism in higher plants. Examples are used to illustrate the application of metabolomic techniques to the model plants Arabidopsis and rice, as well as other crops and vegetables.


Subject(s)
Arabidopsis/metabolism , Metabolomics , Nitrogen/metabolism , Crops, Agricultural/metabolism , Gas Chromatography-Mass Spectrometry , Mass Spectrometry , Models, Biological , Nitrogen/chemistry , Nuclear Magnetic Resonance, Biomolecular , Oryza/metabolism
18.
BMC Syst Biol ; 5: 1, 2011 Jan 01.
Article in English | MEDLINE | ID: mdl-21194489

ABSTRACT

BACKGROUND: Deciphering the metabolome is essential for a better understanding of the cellular metabolism as a system. Typical metabolomics data show a few but significant correlations among metabolite levels when data sampling is repeated across individuals grown under strictly controlled conditions. Although several studies have assessed topologies in metabolomic correlation networks, it remains unclear whether highly connected metabolites in these networks have specific functions in known tissue- and/or genotype-dependent biochemical pathways. RESULTS: In our study of metabolite profiles we subjected root tissues to gas chromatography-time-of-flight/mass spectrometry (GC-TOF/MS) and used published information on the aerial parts of 3 Arabidopsis genotypes, Col-0 wild-type, methionine over-accumulation 1 (mto1), and transparent testa4 (tt4) to compare systematically the metabolomic correlations in samples of roots and aerial parts. We then applied graph clustering to the constructed correlation networks to extract densely connected metabolites and evaluated the clusters by biochemical-pathway enrichment analysis. We found that the number of significant correlations varied by tissue and genotype and that the obtained clusters were significantly enriched for metabolites included in biochemical pathways. CONCLUSIONS: We demonstrate that the graph-clustering approach identifies tissue- and/or genotype-dependent metabolomic clusters related to the biochemical pathway. Metabolomic correlations complement information about changes in mean metabolite levels and may help to elucidate the organization of metabolically functional modules.


Subject(s)
Arabidopsis/metabolism , Metabolomics/methods , Algorithms , Arabidopsis/genetics , Cluster Analysis , Databases, Factual , Gas Chromatography-Mass Spectrometry , Genotype , Metabolome , Multivariate Analysis , Phenotype , Plant Components, Aerial/metabolism , Plant Roots/metabolism
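
The network step described in the abstract above (build a correlation network, then extract densely connected metabolites) can be sketched in simplified form. The abstract does not name the graph-clustering algorithm, so the code below swaps in plain connected components of a thresholded network as a crude stand-in; the metabolite pairs, correlation values, and threshold are invented.

```python
from collections import defaultdict, deque


def correlation_clusters(correlations, threshold):
    """Group metabolites linked by |r| >= threshold (connected components).

    `correlations` maps (metabolite_a, metabolite_b) pairs to a correlation value.
    """
    neighbours = defaultdict(set)
    nodes = set()
    for (a, b), r in correlations.items():
        nodes.update((a, b))
        if abs(r) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    clusters, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:  # breadth-first traversal of one component
            node = queue.popleft()
            members.append(node)
            for nxt in sorted(neighbours[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        clusters.append(sorted(members))
    return clusters


# Hypothetical pairwise correlations between metabolite levels
correlations = {
    ("citrate", "malate"): 0.91,
    ("malate", "fumarate"): 0.85,
    ("citrate", "fumarate"): 0.40,
    ("glucose", "fructose"): -0.88,
    ("glucose", "citrate"): 0.10,
    ("proline", "glucose"): 0.05,
}
for cluster in correlation_clusters(correlations, threshold=0.8):
    print(cluster)
```

Each resulting cluster could then be tested for pathway enrichment, as the abstract describes, using an annotation source of the reader's choice.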
19.
BMC Bioinformatics ; 11: 214, 2010 Apr 29.
Article in English | MEDLINE | ID: mdl-20426876

ABSTRACT

BACKGROUND: Analysis of data from high-throughput experiments depends on the availability of well-structured data that describe the assayed biomolecules. Procedures for obtaining and organizing such meta-data on genes, transcripts and proteins have been streamlined in many data analysis packages, but are still lacking for metabolites. Chemical identifiers are notoriously incoherent, encompassing a wide range of different referencing schemes with varying scope and coverage. Online chemical databases use multiple types of identifiers in parallel but lack a common primary key for reliable database consolidation. Connecting identifiers of analytes found in experimental data with the identifiers of their parent metabolites in public databases can therefore be very laborious. RESULTS: Here we present a strategy and a software tool for integrating metabolite identifiers from local reference libraries and public databases that do not depend on a single common primary identifier. The program constructs groups of interconnected identifiers of analytes and metabolites to obtain a local metabolite-centric SQLite database. The created database can be used to map in-house identifiers and synonyms to external resources such as the KEGG database. New identifiers can be imported and directly integrated with existing data. Queries can be performed in a flexible way, both from the command line and from the statistical programming environment R, to obtain data set tailored identifier mappings. CONCLUSIONS: Efficient cross-referencing of metabolite identifiers is a key technology for metabolomics data analysis. We provide a practical and flexible solution to this task and an open-source program, the metabolite masking tool (MetMask), available at http://metmask.sourceforge.net, that implements our ideas.


Subject(s)
Metabolomics/methods , Software , Databases, Factual
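
The strategy in the abstract above (grouping interconnected identifiers without a common primary key and storing the groups in SQLite) can be sketched with a union-find structure. This is an illustration under stated assumptions, not MetMask's code: the table layout, function names, and in-house identifiers are hypothetical.

```python
import sqlite3


def group_identifiers(links):
    """Union-find over (identifier, identifier) pairs; returns {identifier: group_root}."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a
    return {ident: find(ident) for ident in list(parent)}


def build_database(links):
    """Store every identifier with its group number in an in-memory SQLite table."""
    roots = group_identifiers(links)
    group_ids = {}
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE identifier (scheme TEXT, value TEXT, group_id INTEGER)")
    for (scheme, value), root in sorted(roots.items()):
        gid = group_ids.setdefault(root, len(group_ids) + 1)
        conn.execute("INSERT INTO identifier VALUES (?, ?, ?)", (scheme, value, gid))
    conn.commit()
    return conn


def map_identifier(conn, scheme, value, target_scheme):
    """Return all identifiers of target_scheme that share a group with (scheme, value)."""
    rows = conn.execute(
        "SELECT t.value FROM identifier AS s "
        "JOIN identifier AS t ON s.group_id = t.group_id "
        "WHERE s.scheme = ? AND s.value = ? AND t.scheme = ? ORDER BY t.value",
        (scheme, value, target_scheme),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical links; identifiers are (scheme, value) tuples
links = [
    (("inhouse", "A-0042"), ("synonym", "D-glucose")),
    (("synonym", "D-glucose"), ("kegg", "C00031")),
    (("inhouse", "A-0107"), ("kegg", "C00073")),
]
conn = build_database(links)
print(map_identifier(conn, "inhouse", "A-0042", "kegg"))
```

Because groups are formed transitively, the in-house identifier reaches the KEGG entry through the shared synonym even though the two were never linked directly.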
20.
Amino Acids ; 39(4): 1013-21, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20354740

ABSTRACT

Methionine (Met) is an essential amino acid for all organisms. In plants, Met also functions as a precursor of plant hormones, polyamines, and defense metabolites. The regulatory mechanism of Met biosynthesis is highly complex and, despite its great importance, remains unclear. To investigate how accumulation of Met influences metabolism as a whole in Arabidopsis, three methionine over-accumulation (mto) mutants were examined using a gas chromatography-mass spectrometry-based metabolomics approach. Multivariate statistical analyses of the three mto mutants (mto1, mto2, and mto3) revealed distinct metabolomic phenotypes. Orthogonal projection to latent structures-discriminant analysis highlighted discriminative metabolites contributing to the separation of each mutant and the corresponding control samples. Though Met accumulation in mto1 had no dramatic effect on other metabolic pathways except for the aspartate family, metabolite profiles of mto2 and mto3 indicated that several extensive pathways were affected in addition to over-accumulation of Met. The pronounced changes in metabolic pathways in both mto2 and mto3 were associated with polyamines. The findings suggest that our metabolomics approach not only can reveal the impact of Met over-accumulation on metabolism, but also may provide clues to identify crucial pathways for regulation of metabolism in plants.


Subject(s)
Arabidopsis/genetics , Arabidopsis/metabolism , Methionine/biosynthesis , Methionine/genetics , Methionine/metabolism , Amino Acids/metabolism , Arabidopsis/enzymology , Arabidopsis/growth & development , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , DNA, Plant/genetics , Gas Chromatography-Mass Spectrometry , Gene Expression Profiling , Gene Expression Regulation, Enzymologic , Gene Expression Regulation, Plant , Genes, Plant , Metabolic Networks and Pathways , Metabolomics , Mutation , Plants, Genetically Modified