Search | Biblioteca Virtual em Saúde

Point estimates in phylogenetic reconstructions.

Benner, Philipp; Bacák, Miroslav; Bourguignon, Pierre-Yves.

Bioinformatics ; 30(17): i534-40, 2014 Sep 01.

Article in English | MEDLINE | ID: mdl-25161244

ABSTRACT

MOTIVATION: The construction of statistics for summarizing posterior samples returned by a Bayesian phylogenetic study has so far been hindered by the poor geometric insights available into the space of phylogenetic trees. Ad hoc methods such as the derivation of a consensus tree make up for the ill-defined usual concept of a posterior mean, while bootstrap methods mitigate the absence of a sound concept of variance. Although they yield satisfactory results with sufficiently concentrated posterior distributions, such methods fall short of providing a faithful summary of posterior distributions if the data do not offer compelling evidence for a single topology. RESULTS: Building upon previous work of Billera et al., summary statistics such as sample mean, median and variance are defined as the Fréchet mean, geometric median and variance, respectively. Their computation is enabled by recently published works and embeds an algorithm for computing shortest paths in the space of trees. In a study of the phylogeny of a set of plants, where several tree topologies occur in the posterior sample, the posterior mean correctly balances the contributions from the different topologies, where a consensus tree would be biased. Comparisons of the posterior mean, median and consensus trees with the ground truth using simulated data also reveal the benefits of a sound averaging method when reconstructing phylogenetic trees. AVAILABILITY AND IMPLEMENTATION: We provide two independent implementations of the algorithm for computing Fréchet means, geometric medians and variances in the space of phylogenetic trees. TFBayes: https://github.com/pbenner/tfbayes, TrAP: https://github.com/bacak/TrAP.
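
The summary statistics named in this abstract can be illustrated outside tree space. The sketch below is only an analogy under a strong assumption: it works with points in the Euclidean plane, where shortest paths are straight lines, rather than in the space of phylogenetic trees, and it is not taken from TFBayes or TrAP. The function names are hypothetical. It shows that the Fréchet mean minimizes the sum of squared distances, while the geometric median minimizes the sum of unsquared distances and is therefore less sensitive to an outlying sample.

```python
import math

def frechet_mean(points):
    """Minimizer of the sum of squared Euclidean distances.

    In Euclidean space this is the arithmetic mean (assumption: flat space,
    not the space of phylogenetic trees).
    """
    n = len(points)
    dim = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dim)]

def frechet_variance(points):
    """Mean squared distance of the sample to its Frechet mean."""
    m = frechet_mean(points)
    return sum(math.dist(p, m) ** 2 for p in points) / len(points)

def geometric_median(points, iterations=200, eps=1e-12):
    """Weiszfeld iteration for the minimizer of the sum of distances."""
    estimate = frechet_mean(points)
    dim = len(estimate)
    for _ in range(iterations):
        # Inverse-distance weights; eps guards against division by zero.
        weights = [1.0 / max(math.dist(p, estimate), eps) for p in points]
        total = sum(weights)
        estimate = [
            sum(w * p[i] for w, p in zip(weights, points)) / total
            for i in range(dim)
        ]
    return estimate

# Hypothetical sample: three nearby points and one distant outlier.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
print(frechet_mean(sample))
print(geometric_median(sample))
print(frechet_variance(sample))
```

In this toy sample the mean is pulled toward the outlier while the median stays near the cluster of three points, which is the kind of difference between summaries that the abstract compares on tree samples.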

Subjects

Statistical Models, Phylogeny, Algorithms, Bayes Theorem

Core and panmetabolism in Escherichia coli.

Vieira, Gilles; Sabarly, Victor; Bourguignon, Pierre-Yves; Durot, Maxime; Le Fèvre, François; Mornico, Damien; Vallenet, David; Bouvet, Odile; Denamur, Erick; Schachter, Vincent; Médigue, Claudine.

J Bacteriol ; 193(6): 1461-72, 2011 Mar.

Article in English | MEDLINE | ID: mdl-21239590

ABSTRACT

Escherichia coli exhibits a wide range of lifestyles encompassing commensalism and various pathogenic behaviors, which its highly dynamic genome helps to develop. How environmental and host factors shape the genetic structure of E. coli strains remains, however, largely unknown. Following a previous study of E. coli genomic diversity, we investigated its diversity at the metabolic level by building and analyzing the genome-scale metabolic networks of 29 E. coli strains (8 commensal and 21 pathogenic strains, including 6 Shigella strains). Using a tailor-made reconstruction strategy, we significantly improved the completeness and accuracy of the metabolic networks over default automatic reconstruction processes. Among the 1,545 reactions forming E. coli panmetabolism, 885 reactions were common to all strains. This high proportion of core reactions (57%) was found to be in sharp contrast to the low proportion (13%) of core genes in the E. coli pangenome, suggesting less diversity of metabolic functions compared to that of all gene functions. Core reactions were significantly overrepresented among biosynthetic reactions compared to the more variable degradation processes. Differences between metabolic networks were found to follow E. coli phylogeny rather than pathogenic phenotypes, except for Shigella networks, which were significantly more distant from the others. This suggests that most metabolic changes in non-Shigella strains were not driven by their pathogenic phenotypes. Using a supervised method, we were nevertheless able to identify small sets of reactions related to pathogenicity or commensalism. The quality of our reconstructed networks also makes them reliable bases for building metabolic models.
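
The core and pan figures quoted in this abstract follow from simple set operations: the panmetabolism is the union of the reaction sets of all strains, and the core is their intersection. The sketch below is a minimal illustration on invented data; the strain names and reaction identifiers are hypothetical and the function is not part of the authors' pipeline. The last lines only recheck the reported proportion (885 of 1,545 reactions).

```python
def core_and_pan(strain_reactions):
    """Return (core, pan) reaction sets for a mapping strain -> reactions."""
    reaction_sets = [set(reactions) for reactions in strain_reactions.values()]
    pan = set.union(*reaction_sets)            # present in at least one strain
    core = set.intersection(*reaction_sets)    # present in every strain
    return core, pan

# Hypothetical toy input, not data from the study.
toy_networks = {
    "strain_A": {"R1", "R2", "R3"},
    "strain_B": {"R1", "R2", "R4"},
    "strain_C": {"R1", "R2", "R3", "R5"},
}

core, pan = core_and_pan(toy_networks)
print(sorted(core), sorted(pan), len(core) / len(pan))

# Proportion of core reactions using the counts given in the abstract.
core_fraction = 885 / 1545
print(f"{core_fraction:.0%}")
```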

Subjects

Escherichia coli/genetics, Escherichia coli/metabolism, Bacterial Genome, Metabolic Networks and Pathways/genetics, Computational Biology, Genetic Variation

Genome-scale models of bacterial metabolism: reconstruction and applications.

Durot, Maxime; Bourguignon, Pierre-Yves; Schachter, Vincent.

FEMS Microbiol Rev ; 33(1): 164-90, 2009 Jan.

Article in English | MEDLINE | ID: mdl-19067749

ABSTRACT

Genome-scale metabolic models bridge the gap between genome-derived biochemical information and metabolic phenotypes in a principled manner, providing a solid interpretative framework for experimental data related to metabolic states, and enabling simple in silico experiments with whole-cell metabolism. Models have been reconstructed for almost 20 bacterial species, so far mainly through expert curation efforts integrating information from the literature with genome annotation. A wide variety of computational methods exploiting metabolic models have been developed and applied to bacteria, yielding valuable insights into bacterial metabolism and evolution, and providing a sound basis for computer-assisted design in metabolic engineering. Recent advances in computational systems biology and high-throughput experimental technologies pave the way for the systematic reconstruction of metabolic models from genomes of new species, and a corresponding expansion of the scope of their applications. In this review, we provide an introduction to the key ideas of metabolic modeling, survey the methods and resources that enable model reconstruction and refinement, and chart applications to the investigation of global properties of metabolic systems, the interpretation of experimental results, and the re-engineering of their biochemical capabilities.

Subjects

Bacteria/metabolism, Bacterial Genome, Metabolic Networks and Pathways, Biological Models, Bacteria/genetics, Genetic Databases, Systems Biology

Noncanonical DNA polymerization by aminoadenine-based siphoviruses.

Pezo, Valerie; Jaziri, Faten; Bourguignon, Pierre-Yves; Louis, Dominique; Jacobs-Sera, Deborah; Rozenski, Jef; Pochet, Sylvie; Herdewijn, Piet; Hatfull, Graham F; Kaminski, Pierre-Alexandre; Marliere, Philippe.

Science ; 372(6541): 520-524, 2021 Apr 30.

Article in English | MEDLINE | ID: mdl-33926956

ABSTRACT

Bacteriophage genomes harbor the broadest chemical diversity of nucleobases across all life forms. Certain DNA viruses that infect hosts as diverse as cyanobacteria, proteobacteria, and actinobacteria exhibit wholesale substitution of aminoadenine for adenine, thereby forming three hydrogen bonds with thymine and violating Watson-Crick pairing rules. Aminoadenine-encoded DNA polymerases, homologous to the Klenow fragment of bacterial DNA polymerase I that includes 3'-exonuclease but lacks 5'-exonuclease, were found to preferentially select for aminoadenine instead of adenine in deoxynucleoside triphosphate incorporation templated by thymine. Polymerase genes occur in synteny with genes for a biosynthesis enzyme that produces aminoadenine deoxynucleotides in a wide array of Siphoviridae bacteriophages. Congruent phylogenetic clustering of the polymerases and biosynthesis enzymes suggests that aminoadenine has propagated in DNA alongside adenine since archaic stages of evolution.

Subjects

2-Aminopurine/analogs & derivatives, DNA Replication, Viral DNA/biosynthesis, DNA-Directed DNA Polymerase/chemistry, Polymerization, Siphoviridae/chemistry, Siphoviridae/enzymology, Viral Nonstructural Proteins/chemistry, 2-Aminopurine/chemistry, DNA-Directed DNA Polymerase/classification, DNA-Directed DNA Polymerase/genetics, Viral Genome, Phylogeny, Siphoviridae/genetics, Viral Nonstructural Proteins/classification, Viral Nonstructural Proteins/genetics

Challenges in experimental data integration within genome-scale metabolic models.

Bourguignon, Pierre-Yves; Samal, Areejit; Képès, François; Jost, Jürgen; Martin, Olivier C.

Algorithms Mol Biol ; 5: 20, 2010 Apr 22.

Article in English | MEDLINE | ID: mdl-20412574

ABSTRACT

A report of the meeting "Challenges in experimental data integration within genome-scale metabolic models", Institut Henri Poincaré, Paris, October 10-11, 2009, organized by the CNRS-MPG joint program in Systems Biology.

Variation in crossing-over rates across chromosome 4 of Arabidopsis thaliana reveals the presence of meiotic recombination "hot spots".

Drouaud, Jan; Camilleri, Christine; Bourguignon, Pierre-Yves; Canaguier, Aurélie; Bérard, Aurélie; Vezon, Daniel; Giancola, Sandra; Brunel, Dominique; Colot, Vincent; Prum, Bernard; Quesneville, Hadi; Mézard, Christine.

Genome Res ; 16(1): 106-14, 2006 Jan.

Article in English | MEDLINE | ID: mdl-16344568

ABSTRACT

Crossover (CO) is a key process for the accurate segregation of homologous chromosomes during the first meiotic division. In most eukaryotes, meiotic recombination is not homogeneous along the chromosomes, suggesting a tight control of the location of recombination events. We genotyped 71 single nucleotide polymorphisms (SNPs) covering the entire chromosome 4 of Arabidopsis thaliana on 702 F2 plants, representing 1404 meioses and allowing the detection of 1171 COs, to study CO localization in a higher plant. The genetic recombination rates varied along the chromosome from 0 cM/Mb near the centromere to 20 cM/Mb on the short arm next to the NOR region, with a chromosome average of 4.6 cM/Mb. Principal component analysis showed that CO rates negatively correlate with the G+C content (P = 3 × 10^-4), in contrast to that reported in other eukaryotes. COs also significantly correlate with the density of single repeats and the CpG ratio, but not with genes, pseudogenes, transposable elements, or dispersed repeats. Chromosome 4 has, on average, 1.6 COs per meiosis, and these COs are subjected to interference. A detailed analysis of several regions having high CO rates revealed "hot spots" of meiotic recombination contained in small fragments of a few kilobases. Both the intensity and the density of these hot spots explain the variation of CO rates along the chromosome.
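
The counts in this abstract can be related to one another with a few lines of arithmetic. The sketch below is a rough consistency check, not the authors' estimation procedure. It assumes that each F2 plant results from two meioses and that a crossover involves two of the four chromatids, so that a transmitted chromosome carries a given crossover with probability 1/2. The helper `recombination_rate` and its example interval are hypothetical.

```python
# Figures quoted in the abstract.
f2_plants = 702
crossovers_detected = 1171

# Each F2 plant results from two meioses (one per parental gamete).
meioses = 2 * f2_plants

# Crossovers observed per transmitted chromosome.
cos_per_gamete = crossovers_detected / meioses

# Assumption: only half of the crossovers of a meiosis are visible on any
# single transmitted chromatid, so the per-meiosis number is about twice that.
cos_per_meiosis_estimate = 2 * cos_per_gamete

def recombination_rate(map_length_cm, physical_length_mb):
    """Recombination rate of an interval in cM/Mb."""
    return map_length_cm / physical_length_mb

print(meioses)
print(round(cos_per_gamete, 3))
print(round(cos_per_meiosis_estimate, 2))

# Hypothetical interval: 9.2 cM spanning 2.0 Mb.
print(recombination_rate(9.2, 2.0))
```

Under these assumptions the per-meiosis estimate lands in the same range as the 1.6 COs per meiosis reported in the abstract; the exact published value presumably depends on the authors' own estimator.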

Subjects

Arabidopsis/genetics, Centromere/genetics, Plant Chromosomes/genetics, Genetic Crossing Over/genetics, Meiosis/genetics, Single Nucleotide Polymorphism, Base Composition/genetics, Genetic Variation, Repetitive Nucleic Acid Sequences/genetics

seq++: analyzing biological sequences with a range of Markov-related models.

Miele, Vincent; Bourguignon, Pierre-Yves; Robelin, David; Nuel, Grégory; Richard, Hugues.

Bioinformatics ; 21(11): 2783-4, 2005 Jun 01.

Article in English | MEDLINE | ID: mdl-15774554

ABSTRACT

SUMMARY: The seq++ package offers a reference set of programs and an extensible library to biologists and developers working on sequence statistics. Its generality arises from the ability to handle sequences described with any alphabet (nucleotides, amino acids, codons and others). seq++ enables sequence modelling with various types of Markov models, including variable length Markov models and the newly developed parsimonious Markov models, all of them potentially phased. Simulation modules are supplied for Monte Carlo methods. Hence, this toolbox allows the study of any biological process which can be described by a series of states taken from a finite set.
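
As an illustration of the kind of model this abstract describes, the sketch below fits a fixed-order Markov model to a sequence over an arbitrary alphabet by counting transitions. It is written in Python for brevity and does not use or reproduce the seq++ API; the function name `fit_markov` is hypothetical, and variable length, parsimonious and phased models are not covered.

```python
from collections import Counter, defaultdict

def fit_markov(sequence, order=1):
    """Maximum-likelihood transition probabilities of an order-k Markov model.

    `sequence` may be a string or any list of hashable symbols, so the same
    code handles nucleotides, amino acids or codons.
    """
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])   # the k preceding symbols
        counts[context][sequence[i]] += 1
    return {
        context: {sym: n / sum(nxt.values()) for sym, n in nxt.items()}
        for context, nxt in counts.items()
    }

# Nucleotide alphabet, order 1.
model = fit_markov("ACACAG", order=1)
print(model[("A",)])
print(model[("C",)])

# Codon alphabet: the sequence is a list of three-letter symbols.
codon_model = fit_markov(["ATG", "GCC", "ATG", "GCC", "TAA"], order=1)
print(codon_model[("GCC",)])
```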

Subjects

Algorithms, Genetic Models, Sequence Alignment/methods, Sequence Analysis/methods, Software, Markov Chains, Statistical Models, Sequence Homology
