Results 1 - 7 of 7
1.
Bioinformatics ; 30(17): i534-40, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-25161244

ABSTRACT

MOTIVATION: The construction of statistics for summarizing posterior samples returned by a Bayesian phylogenetic study has so far been hindered by the poor geometric insight available into the space of phylogenetic trees. Ad hoc methods such as the derivation of a consensus tree make up for the ill-defined concept of a posterior mean, while bootstrap methods mitigate the absence of a sound concept of variance. Although they yield satisfactory results with sufficiently concentrated posterior distributions, such methods fall short of providing a faithful summary of posterior distributions if the data do not offer compelling evidence for a single topology. RESULTS: Building upon previous work of Billera et al., summary statistics such as sample mean, median and variance are defined as the Fréchet mean, geometric median and variance, respectively. Their computation is enabled by recently published works and embeds an algorithm for computing shortest paths in the space of trees. In a study of the phylogeny of a set of plants, where several tree topologies occur in the posterior sample, the posterior mean correctly balances the contributions from the different topologies, where a consensus tree would be biased. Comparisons of the posterior mean, median and consensus trees with the ground truth using simulated data also reveal the benefits of a sound averaging method when reconstructing phylogenetic trees. AVAILABILITY AND IMPLEMENTATION: We provide two independent implementations of the algorithm for computing Fréchet means, geometric medians and variances in the space of phylogenetic trees. TFBayes: https://github.com/pbenner/tfbayes, TrAP: https://github.com/bacak/TrAP.
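
The definitions in the abstract can be illustrated with a minimal sketch. It is not the TFBayes or TrAP implementation: it assumes ordinary Euclidean points instead of phylogenetic trees, so a straight segment stands in for the shortest path in tree space, and all function names and step sizes are hypothetical choices made for illustration. The Fréchet mean minimizes the sum of squared distances to the sample, the geometric median minimizes the sum of distances, and the variance is the mean squared distance to the Fréchet mean.

```python
import math

def distance(a, b):
    """Euclidean distance, standing in for the geodesic distance between trees."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def move_toward(a, b, t):
    """Point a fraction t of the way along the segment from a to b."""
    return tuple(x + t * (y - x) for x, y in zip(a, b))

def frechet_mean(points, iterations=20000):
    """Approximate the minimizer of the summed squared distances.

    Cycles through the sample and steps toward each point with weight 1/(k+1).
    """
    mean = points[0]
    for k in range(1, iterations + 1):
        mean = move_toward(mean, points[k % len(points)], 1.0 / (k + 1))
    return mean

def geometric_median(points, iterations=20000):
    """Approximate the minimizer of the summed (unsquared) distances.

    Each step moves toward one sample point by at most 1/sqrt(k),
    an illustrative choice of diminishing step size.
    """
    median = points[0]
    for k in range(1, iterations + 1):
        target = points[k % len(points)]
        d = distance(median, target)
        if d > 0.0:
            step = 1.0 / math.sqrt(k)
            median = move_toward(median, target, min(1.0, step / d))
    return median

def frechet_variance(points, mean):
    """Mean squared distance from the sample points to the given mean."""
    return sum(distance(p, mean) ** 2 for p in points) / len(points)

# Toy sample with one outlier: the mean is pulled toward it, the median much less.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
mean = frechet_mean(sample)
print(mean, geometric_median(sample), frechet_variance(sample, mean))
```

Only `distance` and `move_toward` depend on the geometry, so in principle the same iterations could be driven by a shortest-path routine for tree space; that substitution is not shown here.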


Subject(s)
Models, Statistical , Phylogeny , Algorithms , Bayes Theorem
2.
J Bacteriol ; 193(6): 1461-72, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21239590

ABSTRACT

Escherichia coli exhibits a wide range of lifestyles, encompassing commensalism and various pathogenic behaviors, to which its highly dynamic genome contributes. How environmental and host factors shape the genetic structure of E. coli strains remains, however, largely unknown. Following a previous study of E. coli genomic diversity, we investigated its diversity at the metabolic level by building and analyzing the genome-scale metabolic networks of 29 E. coli strains (8 commensal and 21 pathogenic strains, including 6 Shigella strains). Using a tailor-made reconstruction strategy, we significantly improved the completeness and accuracy of the metabolic networks over default automatic reconstruction processes. Among the 1,545 reactions forming E. coli panmetabolism, 885 reactions were common to all strains. This high proportion of core reactions (57%) was found to be in sharp contrast to the low proportion (13%) of core genes in the E. coli pangenome, suggesting less diversity of metabolic functions compared to that of all gene functions. Core reactions were significantly overrepresented among biosynthetic reactions compared to the more variable degradation processes. Differences between metabolic networks were found to follow E. coli phylogeny rather than pathogenic phenotypes, except for Shigella networks, which were significantly more distant from the others. This suggests that most metabolic changes in non-Shigella strains were not driven by their pathogenic phenotypes. Using a supervised method, we were nevertheless able to identify small sets of reactions related to pathogenicity or commensalism. The quality of our reconstructed networks also makes them reliable bases for building metabolic models.
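
The core and pan figures quoted above follow from simple set operations, sketched below. The strain names and reaction identifiers are made up for illustration and are not taken from the study; only the counts 885 and 1,545 come from the abstract.

```python
def core_and_pan(reactions_by_strain):
    """Return (core, pan): reactions shared by all strains, and reactions in any strain."""
    sets = [set(reactions) for reactions in reactions_by_strain.values()]
    if not sets:
        raise ValueError("at least one strain is required")
    core = set.intersection(*sets)
    pan = set.union(*sets)
    return core, pan

# Hypothetical toy networks for three strains.
toy_networks = {
    "strain_A": {"R1", "R2", "R3"},
    "strain_B": {"R1", "R2", "R4"},
    "strain_C": {"R1", "R2", "R3", "R5"},
}
core, pan = core_and_pan(toy_networks)
toy_core_fraction = len(core) / len(pan)

# Core fraction recomputed from the counts reported in the abstract.
reported_core_fraction = 885 / 1545  # about 0.57, i.e. the 57% quoted above
```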


Subject(s)
Escherichia coli/genetics , Escherichia coli/metabolism , Genome, Bacterial , Metabolic Networks and Pathways/genetics , Computational Biology , Genetic Variation
3.
FEMS Microbiol Rev ; 33(1): 164-90, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19067749

ABSTRACT

Genome-scale metabolic models bridge the gap between genome-derived biochemical information and metabolic phenotypes in a principled manner, providing a solid interpretative framework for experimental data related to metabolic states, and enabling simple in silico experiments with whole-cell metabolism. Models have been reconstructed for almost 20 bacterial species, so far mainly through expert curation efforts integrating information from the literature with genome annotation. A wide variety of computational methods exploiting metabolic models have been developed and applied to bacteria, yielding valuable insights into bacterial metabolism and evolution, and providing a sound basis for computer-assisted design in metabolic engineering. Recent advances in computational systems biology and high-throughput experimental technologies pave the way for the systematic reconstruction of metabolic models from genomes of new species, and a corresponding expansion of the scope of their applications. In this review, we provide an introduction to the key ideas of metabolic modeling, survey the methods and resources that enable model reconstruction and refinement, and chart applications to the investigation of global properties of metabolic systems, the interpretation of experimental results, and the re-engineering of their biochemical capabilities.


Subject(s)
Bacteria/metabolism , Genome, Bacterial , Metabolic Networks and Pathways , Models, Biological , Bacteria/genetics , Databases, Genetic , Systems Biology
4.
Science ; 372(6541): 520-524, 2021 Apr 30.
Article in English | MEDLINE | ID: mdl-33926956

ABSTRACT

Bacteriophage genomes harbor the broadest chemical diversity of nucleobases across all life forms. Certain DNA viruses that infect hosts as diverse as cyanobacteria, proteobacteria, and actinobacteria exhibit wholesale substitution of aminoadenine for adenine, thereby forming three hydrogen bonds with thymine and violating Watson-Crick pairing rules. Aminoadenine-encoded DNA polymerases, homologous to the Klenow fragment of bacterial DNA polymerase I that includes 3'-exonuclease but lacks 5'-exonuclease, were found to preferentially select for aminoadenine instead of adenine in deoxynucleoside triphosphate incorporation templated by thymine. Polymerase genes occur in synteny with genes for a biosynthesis enzyme that produces aminoadenine deoxynucleotides in a wide array of Siphoviridae bacteriophages. Congruent phylogenetic clustering of the polymerases and biosynthesis enzymes suggests that aminoadenine has propagated in DNA alongside adenine since archaic stages of evolution.


Subject(s)
2-Aminopurine/analogs & derivatives , DNA Replication , DNA, Viral/biosynthesis , DNA-Directed DNA Polymerase/chemistry , Polymerization , Siphoviridae/chemistry , Siphoviridae/enzymology , Viral Nonstructural Proteins/chemistry , 2-Aminopurine/chemistry , DNA-Directed DNA Polymerase/classification , DNA-Directed DNA Polymerase/genetics , Genome, Viral , Phylogeny , Siphoviridae/genetics , Viral Nonstructural Proteins/classification , Viral Nonstructural Proteins/genetics
5.
Algorithms Mol Biol ; 5: 20, 2010 Apr 22.
Article in English | MEDLINE | ID: mdl-20412574

ABSTRACT

A report of the meeting "Challenges in experimental data integration within genome-scale metabolic models", Institut Henri Poincaré, Paris, October 10-11, 2009, organized by the CNRS-MPG joint program in Systems Biology.

6.
Genome Res ; 16(1): 106-14, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16344568

ABSTRACT

Crossover (CO) is a key process for the accurate segregation of homologous chromosomes during the first meiotic division. In most eukaryotes, meiotic recombination is not homogeneous along the chromosomes, suggesting a tight control of the location of recombination events. We genotyped 71 single nucleotide polymorphisms (SNPs) covering the entire chromosome 4 of Arabidopsis thaliana on 702 F2 plants, representing 1404 meioses and allowing the detection of 1171 COs, to study CO localization in a higher plant. The genetic recombination rates varied along the chromosome from 0 cM/Mb near the centromere to 20 cM/Mb on the short arm next to the NOR region, with a chromosome average of 4.6 cM/Mb. Principal component analysis showed that CO rates negatively correlate with the G+C content (P = 3 × 10⁻⁴), in contrast to that reported in other eukaryotes. COs also significantly correlate with the density of single repeats and the CpG ratio, but not with genes, pseudogenes, transposable elements, or dispersed repeats. Chromosome 4 has, on average, 1.6 COs per meiosis, and these COs are subjected to interference. A detailed analysis of several regions having high CO rates revealed "hot spots" of meiotic recombination contained in small fragments of a few kilobases. Both the intensity and the density of these hot spots explain the variation of CO rates along the chromosome.


Subject(s)
Arabidopsis/genetics , Centromere/genetics , Chromosomes, Plant/genetics , Crossing Over, Genetic/genetics , Meiosis/genetics , Polymorphism, Single Nucleotide , Base Composition/genetics , Genetic Variation , Repetitive Sequences, Nucleic Acid/genetics
7.
Bioinformatics ; 21(11): 2783-4, 2005 Jun 01.
Article in English | MEDLINE | ID: mdl-15774554

ABSTRACT

SUMMARY: The seq++ package offers a reference set of programs and an extensible library to biologists and developers working on sequence statistics. Its generality arises from the ability to handle sequences described with any alphabet (nucleotides, amino acids, codons and others). seq++ enables sequence modelling with various types of Markov models, including variable length Markov models and the newly developed parsimonious Markov models, all of them potentially phased. Simulation modules are supplied for Monte Carlo methods. Hence, this toolbox allows the study of any biological process which can be described by a series of states taken from a finite set.
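
As a rough illustration of the kind of modelling described, the sketch below fits a fixed-order Markov model to a sequence over any alphabet and then simulates from it. It does not use the seq++ library or reproduce its API; the function names are hypothetical, and variable length, parsimonious and phased models are not covered.

```python
import random
from collections import Counter, defaultdict

def fit_markov(sequence, order=1):
    """Estimate transition probabilities by counting each symbol after its context.

    `sequence` may be a string of nucleotides or a list of arbitrary symbols
    such as codons. `order` must be at least 1.
    """
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])
        counts[context][sequence[i]] += 1
    return {
        context: {symbol: n / sum(following.values()) for symbol, n in following.items()}
        for context, following in counts.items()
    }

def simulate(model, start, length, seed=0):
    """Draw a sequence of up to `length` symbols, beginning with the context `start`."""
    rng = random.Random(seed)
    order = len(start)
    out = list(start)
    while len(out) < length:
        context = tuple(out[-order:])
        if context not in model:  # context never observed during fitting: stop early
            break
        symbols, weights = zip(*model[context].items())
        out.append(rng.choices(symbols, weights=weights)[0])
    return out

# The same code handles nucleotides (a string) and codons (a list of strings).
nucleotide_model = fit_markov("ACACAC", order=1)
codon_model = fit_markov(["ATG", "GCC", "ATG", "GCC"], order=1)
print(simulate(nucleotide_model, ("A",), 5))
```

Representing a context as a tuple of symbols is what keeps the sketch independent of the alphabet.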


Subject(s)
Algorithms , Models, Genetic , Sequence Alignment/methods , Sequence Analysis/methods , Software , Markov Chains , Models, Statistical , Sequence Homology