1.
Mol Biol Evol ; 40(11)2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37948764

ABSTRACT

Performing phylogenetic analysis with genome sequences maximizes the information used to estimate phylogenies and the resolution of closely related taxa. The use of single-nucleotide polymorphisms (SNPs) permits estimating trees without genome alignments and permits the use of data sets of hundreds of microbial genomes. kSNP4 is a program that identifies SNPs without using a reference genome, estimates parsimony, maximum likelihood, and neighbor-joining trees, and is able to annotate the discovered SNPs. kSNP4 is a command-line program that does not require any additional programs or dependencies to install or use. kSNP4 does not require any programming experience or bioinformatics experience to install and use. It is suitable for use by students through senior investigators. It includes a detailed user guide that explains all of the many features of kSNP4. In this study, we provide a detailed step-by-step protocol for downloading, installing, and using kSNP4 to build phylogenetic trees from genome sequences.


Subjects
Computational Biology, Molecular Evolution, Humans, Phylogeny
2.
Mol Biol Evol ; 34(12): 3303-3309, 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-29029174

ABSTRACT

Growth rates are an important tool in microbiology because they provide high throughput fitness measurements. The release of GrowthRates, a program that uses the output of plate reader files to automatically calculate growth rates, has facilitated experimental procedures in many areas. However, many sources of variation within replicate growth rate data exist and can decrease data reliability. We have developed a new statistical package, CompareGrowthRates (CGR), to enhance the program GrowthRates and accurately measure variation in growth rate data sets. We define a metric, Variability-score (V-score), that can help determine if variation within a data set might result in false interpretations. CGR also uses the bootstrap method to determine the fraction of bootstrap replicates in which a strain will grow the fastest. We illustrate the usage of CGR with growth rate data sets similar to those in Mira, Meza, et al. (Adaptive landscapes of resistance genes change as antibiotic concentrations change. Mol Biol Evol. 32(10): 2707-2715). These statistical methods are compatible with the analytic methods described in Growth Rates Made Easy and can be used with any set of growth rate output from GrowthRates.


Subjects
Bacteria/growth & development, Microbial Colony Count/methods, Microbial Colony Count/statistics & numerical data, Biometry/methods, Microbial Viability/genetics, Reproducibility of Results, Software
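
The bootstrap comparison mentioned in the abstract above (the fraction of bootstrap replicates in which a strain grows fastest) can be illustrated with a minimal sketch. This is not the CGR implementation: the function name `fraction_fastest`, the resampling of replicate means, and the toy growth rates are all assumptions made for illustration.

```python
import random


def fraction_fastest(rates_by_strain, n_boot=1000, seed=0):
    """For each strain, return the fraction of bootstrap replicates in which
    its resampled mean growth rate is the highest of all strains."""
    rng = random.Random(seed)
    wins = {strain: 0 for strain in rates_by_strain}
    for _ in range(n_boot):
        means = {}
        for strain, rates in rates_by_strain.items():
            # Resample the replicate measurements with replacement.
            resample = [rng.choice(rates) for _ in rates]
            means[strain] = sum(resample) / len(resample)
        # Ties go to the first strain encountered (an arbitrary choice).
        fastest = max(means, key=means.get)
        wins[fastest] += 1
    return {strain: count / n_boot for strain, count in wins.items()}


# Hypothetical replicate growth rates (per minute) for two strains.
replicates = {
    "strainA": [0.031, 0.029, 0.033, 0.030],
    "strainB": [0.027, 0.028, 0.026, 0.029],
}
fractions = fraction_fastest(replicates)
print(fractions)
```

A fraction close to 1 for one strain suggests that the ranking is robust to replicate-to-replicate variation; fractions near 0.5 would suggest that the data cannot separate the strains.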
3.
J Clin Microbiol ; 55(7): 2143-2152, 2017 07.
Article in English | MEDLINE | ID: mdl-28446577

ABSTRACT

Strict infection control practices have been implemented for health care visits by cystic fibrosis (CF) patients in an attempt to prevent transmission of important pathogens. This study used whole-genome sequencing (WGS) to determine strain relatedness and assess population dynamics of Staphylococcus aureus isolates from a cohort of CF patients. A total of 311 S. aureus isolates were collected from respiratory cultures of 115 CF patients during a 22-month study period. Whole-genome sequencing was performed, and using single nucleotide polymorphism (SNP) analysis, phylogenetic trees were assembled to determine relatedness between isolates. Methicillin-resistant Staphylococcus aureus (MRSA) phenotypes were predicted using PPFS2 and compared to the observed phenotype. The accumulation of SNPs in multiple isolates obtained over time from the same patient was examined to determine if a genomic molecular clock could be calculated. Pairs of isolates with ≤71 SNP differences were considered to be the "same" strain. All of the "same" strain isolates were either from the same patient or from sibling pairs. There were 47 examples of patients being superinfected with an unrelated strain. The predicted MRSA phenotype was accurate in all but three isolates. Mutation rates could not be determined because the branching order in the phylogenetic tree was inconsistent with the order of isolation. The observation that transmissions were identified between sibling patients shows that WGS is an effective tool for determining transmission between patients. The observation that transmission only occurred between siblings suggests that Staphylococcus aureus acquisition in our CF population occurred outside the hospital environment and indicates that current infection prevention efforts appear effective.


Subjects
Cystic Fibrosis/complications, Genetic Variation, Staphylococcal Infections/microbiology, Staphylococcus aureus/classification, Staphylococcus aureus/genetics, Whole Genome Sequencing, Adolescent, Child, Preschool Child, Female, Humans, Infant, Newborn Infant, Male, Phylogeny, Single Nucleotide Polymorphism, Population Dynamics, Staphylococcus aureus/isolation & purification, Young Adult
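
The "same strain" rule stated in the abstract above (pairs of isolates with ≤71 SNP differences) can be sketched as a simple pairwise comparison. Only the threshold of 71 comes from the abstract; the representation of each isolate as an equal-length string of SNP alleles, the function names, and the toy profiles are assumptions for illustration, not the study's pipeline.

```python
from itertools import combinations

SAME_STRAIN_MAX_SNPS = 71  # threshold reported in the abstract


def snp_distance(a, b):
    """Count the positions at which two equal-length SNP profiles differ."""
    if len(a) != len(b):
        raise ValueError("SNP profiles must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def same_strain_pairs(profiles, max_snps=SAME_STRAIN_MAX_SNPS):
    """Return the isolate pairs whose SNP distance is at most max_snps."""
    pairs = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(sorted(profiles.items()), 2):
        if snp_distance(seq_a, seq_b) <= max_snps:
            pairs.append((name_a, name_b))
    return pairs


# Hypothetical SNP profiles, 200 sites each.
profiles = {
    "isolate1": "A" * 200,
    "isolate2": "A" * 190 + "C" * 10,   # 10 differences from isolate1
    "isolate3": "G" * 100 + "A" * 100,  # 100 differences from isolate1
}
pairs = same_strain_pairs(profiles)
print(pairs)
```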
4.
Bioinformatics ; 31(17): 2877-8, 2015 Sep 01.
Article in English | MEDLINE | ID: mdl-25913206

ABSTRACT

We announce the release of kSNP3.0, a program for SNP identification and phylogenetic analysis without genome alignment or the requirement for reference genomes. kSNP3.0 is a significantly improved version of kSNP v2. AVAILABILITY AND IMPLEMENTATION: kSNP3.0 is implemented as a package of stand-alone executables for Linux and Mac OS X under the open-source BSD license. The executable packages, source code and a full User Guide are freely available at https://sourceforge.net/projects/ksnp/files/ CONTACT: barryghall@gmail.com.


Subjects
Computational Biology/methods, Escherichia coli/genetics, Bacterial Genome, Phylogeny, Single Nucleotide Polymorphism/genetics, DNA Sequence Analysis/methods, Software, Nucleic Acid Databases, Escherichia coli/classification, Molecular Sequence Annotation
5.
Cladistics ; 32(1): 90-99, 2016 Feb.
Article in English | MEDLINE | ID: mdl-34732024

ABSTRACT

kSNP v2 is a powerful tool for single nucleotide polymorphism (SNP) identification from complete microbial genomes and for estimating phylogenetic trees from the identified SNPs. kSNP can analyse finished genomes, genome assemblies, raw reads or any combination of those and does not require either genome alignment or reference genomes. This study uses sequence evolution simulations to evaluate the topological accuracy of kSNP trees and to assess the effects of diversity and recombination on that accuracy. The accuracies of kSNP trees are strongly affected by increasing diversity, with parsimony accuracy > maximum-likelihood accuracy > neighbour-joining accuracy. Accuracy is also strongly influenced by recombination; as recombination increases accuracy decreases. Reliable trees are arbitrarily defined as those that have ≥ 90% topological accuracy. It is determined that the best predictor of topological accuracy is the ratio of r/m, a measure of the effect of recombination, to FCK (the fraction of core kmers), a measure of diversity. Tools are available to allow investigators to determine both r/m and FCK, and the relationship between topological accuracy and the ratio of r/m to FCK is determined. The practical implication of this study is that kSNP is an effective tool for estimating phylogenetic trees from microbial genome sequences provided that both recombination and sequence diversity are within acceptable ranges.

6.
Mol Biol Evol ; 31(1): 232-8, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24170494

ABSTRACT

In the 1960s-1980s, determination of bacterial growth rates was an important tool in microbial genetics, biochemistry, molecular biology, and microbial physiology. The exciting technical developments of the 1990s and the 2000s eclipsed that tool; as a result, many investigators today lack experience with growth rate measurements. Recently, investigators in a number of areas have started to use measurements of bacterial growth rates for a variety of purposes. Those measurements have been greatly facilitated by the availability of microwell plate readers that permit the simultaneous measurements on up to 384 different cultures. Only the exponential (logarithmic) portions of the resulting growth curves are useful for determining growth rates, and manual determination of that portion and calculation of growth rates can be tedious for high-throughput purposes. Here, we introduce the program GrowthRates that uses plate reader output files to automatically determine the exponential portion of the curve and to automatically calculate the growth rate, the maximum culture density, and the duration of the growth lag phase. GrowthRates is freely available for Macintosh, Windows, and Linux. We discuss the effects of culture volume, the classical bacterial growth curve, and the differences between determinations in rich media and minimal (mineral salts) media. This protocol covers calibration of the plate reader, growth of culture inocula for both rich and minimal media, and experimental setup. As a guide to reliability, we report typical day-to-day variation in growth rates and variation within experiments with respect to position of wells within the plates.


Subjects
Bacteria/growth & development, Software, Algorithms, Bacteriological Techniques, Culture Media/chemistry, Phenotype, Reproducibility of Results
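
The abstract above notes that only the exponential portion of a growth curve is useful for determining growth rates. As a minimal sketch, and assuming that portion has already been selected (GrowthRates finds it automatically, which is not reproduced here), the rate can be estimated as the least-squares slope of ln(OD) against time. The function names and the synthetic readings are illustrative assumptions.

```python
import math


def growth_rate(times, ods):
    """Least-squares slope of ln(OD) versus time over an exponential window."""
    if len(times) != len(ods) or len(times) < 2:
        raise ValueError("need at least two paired time/OD readings")
    logs = [math.log(od) for od in ods]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    return sxy / sxx


def doubling_time(rate):
    """Time needed for the culture density to double at the given rate."""
    return math.log(2) / rate


# Synthetic readings: OD grows exponentially at 0.03 per minute.
times = [0, 10, 20, 30, 40]
ods = [0.02 * math.exp(0.03 * t) for t in times]
rate = growth_rate(times, ods)
print(rate, doubling_time(rate))
```

Fitting on the log scale turns exponential growth into a straight line, so readings taken during the lag or stationary phases would bias the slope and should be excluded before the fit.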
7.
Mol Biol Evol ; 30(5): 1229-35, 2013 May.
Article in English | MEDLINE | ID: mdl-23486614

ABSTRACT

Phylogenetic analysis is sometimes regarded as being an intimidating, complex process that requires expertise and years of experience. In fact, it is a fairly straightforward process that can be learned quickly and applied effectively. This Protocol describes the several steps required to produce a phylogenetic tree from molecular data for novices. In the example illustrated here, the program MEGA is used to implement all those steps, thereby eliminating the need to learn several programs, and to deal with multiple file formats from one step to another (Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 28:2731-2739). The first step, identification of a set of homologous sequences and downloading those sequences, is implemented by MEGA's own browser built on top of the Google Chrome toolkit. For the second step, alignment of those sequences, MEGA offers two different algorithms: ClustalW and MUSCLE. For the third step, construction of a phylogenetic tree from the aligned sequences, MEGA offers many different methods. Here we illustrate the maximum likelihood method, beginning with MEGA's Models feature, which permits selecting the most suitable substitution model. Finally, MEGA provides a powerful and flexible interface for the final step, actually drawing the tree for publication. Here a step-by-step protocol is presented in sufficient detail to allow a novice to start with a sequence of interest and to build a publication-quality tree illustrating the evolution of an appropriate set of homologs of that sequence. MEGA is available for use on PCs and Macs from www.megasoftware.net.


Subjects
Molecular Evolution, Phylogeny, Software, Algorithms, Internet
8.
J Bacteriol ; 194(15): 3922-37, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22609915

ABSTRACT

Gardnerella vaginalis is associated with a spectrum of clinical conditions, suggesting high degrees of genetic heterogeneity among strains. Seventeen G. vaginalis isolates were subjected to a battery of comparative genomic analyses to determine their level of relatedness. For each measure, the degree of difference among the G. vaginalis strains was the highest observed among 23 pathogenic bacterial species for which at least eight genomes are available. Genome sizes ranged from 1.491 to 1.716 Mb; GC contents ranged from 41.18% to 43.40%; and the core genome, consisting of only 746 genes, makes up only 51.6% of each strain's genome on average and accounts for only 27% of the species supragenome. Neighbor-grouping analyses, using both distributed gene possession data and core gene allelic data, each identified two major sets of strains, each of which is composed of two groups. Each of the four groups has its own characteristic genome size, GC ratio, and greatly expanded core gene content, making the genomic diversity of each group within the range for other bacterial species. To test whether these 4 groups corresponded to genetically isolated clades, we inferred the phylogeny of each distributed gene that was present in at least two strains and absent in at least two strains; this analysis identified frequent homologous recombination within groups but not between groups or sets. G. vaginalis appears to include four nonrecombining groups/clades of organisms with distinct gene pools and genomic properties, which may confer distinct ecological properties. Consequently, it may be appropriate to treat these four groups as separate species.


Subjects
Bacterial Infections/microbiology, Bacterial DNA/genetics, Gardnerella vaginalis/classification, Gardnerella vaginalis/genetics, Bacterial Genome, Genetic Polymorphism, Base Composition, Cluster Analysis, Bacterial DNA/chemistry, Gardnerella vaginalis/isolation & purification, Bacterial Genes, Genotype, Humans, Molecular Sequence Data, Phylogeny, DNA Sequence Analysis
9.
PLoS One ; 17(10): e0276040, 2022.
Article in English | MEDLINE | ID: mdl-36228033

ABSTRACT

The spectrophotometer has been used for decades to measure the density of bacterial populations as turbidity, expressed as optical density (OD). However, OD alone is an unreliable metric and is proportional to cell titer only up to an OD of about 0.1. The relationship between OD and cell titer depends on the configuration of the spectrophotometer, the length of the light path through the culture, the size of the bacterial cells, and the cell culture density. We demonstrate the importance of plate reader calibration to identify the exact relationship between OD and cells/mL. We use four bacterial genera and two sizes of microtiter plates (96-well and 384-well) to show that cells/mL per unit OD depends heavily on the bacterial cell size and plate size. We applied our calibration curve to real growth curve data and conclude that cells/mL, rather than OD, is a metric that can be used to directly compare results across experiments, labs, instruments, and species.


Subjects
Bacteria, Spectrophotometry/methods
10.
BMC Genomics ; 12: 187, 2011 Apr 13.
Article in English | MEDLINE | ID: mdl-21489287

ABSTRACT

BACKGROUND: Staphylococcus aureus is associated with a spectrum of symbiotic relationships with its human host from carriage to sepsis and is frequently associated with nosocomial and community-acquired infections, thus the differential gene content among strains is of interest. RESULTS: We sequenced three clinical strains and combined these data with 13 publicly available human isolates and one bovine strain for comparative genomic analyses. All genomes were annotated using RAST, and then their gene similarities and differences were delineated. Gene clustering yielded 3,155 orthologous gene clusters, of which 2,266 were core, 755 were distributed, and 134 were unique. Individual genomes contained between 2,524 and 2,648 genes. Gene-content comparisons among all possible S. aureus strain pairs (n = 136) revealed a mean difference of 296 genes and a maximum difference of 476 genes. We developed a revised version of our finite supragenome model to estimate the size of the S. aureus supragenome (3,221 genes, with 2,245 core genes), and compared it with those of Haemophilus influenzae and Streptococcus pneumoniae. There was excellent agreement between RAST's annotations and our CDS clustering procedure providing for high fidelity metabolomic subsystem analyses to extend our comparative genomic characterization of these strains. CONCLUSIONS: Using a multi-species comparative supragenomic analysis enabled by an improved version of our finite supragenome model we provide data and an interpretation explaining the relatively larger core genome of S. aureus compared to other opportunistic nasopharyngeal pathogens. In addition, we provide independent validation for the efficiency and effectiveness of our orthologous gene clustering algorithm.


Subjects
Bacterial Genome, Haemophilus influenzae/genetics, Staphylococcus aureus/genetics, Streptococcus pneumoniae/genetics, Algorithms, Animals, Cattle, Bacterial Gene Expression Regulation, Haemophilus influenzae/isolation & purification, Humans, Genetic Models, Multigene Family, Open Reading Frames, Staphylococcal Infections/microbiology, Staphylococcus aureus/isolation & purification, Streptococcus pneumoniae/isolation & purification
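
The gene-content bookkeeping in the abstract above can be sketched in a few lines: 17 genomes give 17 × 16 / 2 = 136 strain pairs, and each gene cluster is core, distributed, or unique depending on how many genomes carry it. The sketch assumes that the "difference" between two strains is the number of clusters present in one but not the other (a symmetric difference), which the abstract does not spell out. The function names and the toy genomes are hypothetical.

```python
from itertools import combinations
from math import comb


def gene_content_differences(genomes):
    """Map each strain pair to the number of gene clusters found in only one
    of the two strains (assumed definition of gene-content difference)."""
    diffs = {}
    for (a, genes_a), (b, genes_b) in combinations(sorted(genomes.items()), 2):
        diffs[(a, b)] = len(genes_a ^ genes_b)
    return diffs


def classify_clusters(genomes):
    """Split clusters into core (in every genome), unique (in exactly one),
    and distributed (everything in between)."""
    counts = {}
    for genes in genomes.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    n = len(genomes)
    core = {g for g, c in counts.items() if c == n}
    unique = {g for g, c in counts.items() if c == 1}
    distributed = set(counts) - core - unique
    return core, distributed, unique


# Hypothetical gene-cluster content of three strains.
genomes = {
    "s1": {"g1", "g2", "g3"},
    "s2": {"g1", "g2", "g4"},
    "s3": {"g1", "g3", "g4", "g5"},
}
diffs = gene_content_differences(genomes)
core, distributed, unique = classify_clusters(genomes)
print(comb(17, 2), diffs, core, distributed, unique)
```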
11.
J Clin Microbiol ; 49(10): 3568-75, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21849692

ABSTRACT

Minimum spanning trees (MSTs) are frequently used in molecular epidemiology research to estimate relationships among individual strains or isolates. Nevertheless, there are significant caveats to MST algorithms that have been largely ignored in molecular epidemiology studies and that have the potential to confound or alter the interpretation of the results of those analyses. Specifically, (i) presenting a single, arbitrarily selected MST illustrates only one of potentially many equally optimal solutions, and (ii) statistical metrics are not used to assess the credibility of MST estimations. Here, we survey published MSTs previously used to infer microbial population structure in order to determine the effect of these factors. We propose a technique to estimate the number of alternative MSTs for a data set and find that multiple MSTs exist for each case in our survey. By implementing a bootstrapping metric to evaluate the reliability of alternative MST solutions, we discover that they encompass a wide range of credibility values. On the basis of these observations, we conclude that current approaches to studying population structure using MSTs are inadequate. We instead propose a systematic approach to MST estimation that bases analyses on the optimal computation of an input distance matrix, provides information about the number and configurations of alternative MSTs, and allows identification of the most credible MST or MSTs by using a bootstrapping metric. It is our hope this algorithm will become the new "gold standard" approach for analyzing MSTs for molecular epidemiology so that this generally useful computational approach can be used informatively and to its full potential.


Subjects
Molecular Typing/methods, Genetic Polymorphism, Cluster Analysis, Genotype, Humans, Molecular Epidemiology/methods
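
The first caveat raised in the abstract above, that a single MST is only one of potentially many equally optimal solutions, is easy to demonstrate on a toy graph. The sketch below counts alternative MSTs by brute-force enumeration, which is only feasible for a handful of nodes; it is not the estimation technique proposed in the paper. The function names and the example graph are assumptions for illustration.

```python
from itertools import combinations


def _find(parent, x):
    """Union-find lookup with path compression."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def spanning_trees(n, edges):
    """Yield every set of n - 1 edges that contains no cycle (and therefore
    connects all n nodes). Edges are (u, v, weight) tuples."""
    for subset in combinations(edges, n - 1):
        parent = list(range(n))
        acyclic = True
        for u, v, _ in subset:
            root_u, root_v = _find(parent, u), _find(parent, v)
            if root_u == root_v:
                acyclic = False
                break
            parent[root_u] = root_v
        if acyclic:
            yield subset


def minimum_spanning_trees(n, edges):
    """Return all spanning trees that share the minimum total weight."""
    trees = list(spanning_trees(n, edges))
    best = min(sum(w for _, _, w in tree) for tree in trees)
    return [tree for tree in trees if sum(w for _, _, w in tree) == best]


# Four isolates on a square: sides have distance 1, diagonals distance 2.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1), (0, 2, 2), (1, 3, 2)]
msts = minimum_spanning_trees(4, edges)
print(len(msts))
```

In this example every MST drops a different side of the square, so each one implies a different pair of isolates that are not directly linked, even though all of them are equally optimal.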
12.
Mol Biol Evol ; 26(11): 2487-97, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19625389

ABSTRACT

Glycosyl hydrolase Family 4 (GH4) is exceptional among the 114 families in this enzyme superfamily. Members of GH4 exhibit unusual cofactor requirements for activity, and an essential cysteine residue is present at the active site. Of greatest significance is the fact that members of GH4 employ a unique catalytic mechanism for cleavage of the glycosidic bond. By phylogenetic analysis, and from available substrate specificities, we have assigned a majority of the enzymes of GH4 to five subgroups. Our classification revealed an unexpected relationship between substrate specificity and the presence, in each subgroup, of a motif of four amino acids that includes the active-site Cys residue: alpha-glucosidase, CHE(I/V); alpha-galactosidase, CHSV; alpha-glucuronidase, CHGx; 6-phospho-alpha-glucosidase, CDMP; and 6-phospho-beta-glucosidase, CN(V/I)P. The question arises: Does the presence of a particular motif sufficiently predict the catalytic function of an unassigned GH4 protein? To test this hypothesis, we have purified and characterized the alpha-glucoside-specific GH4 enzyme (PalH) from the phytopathogen, Erwinia rhapontici. The CHEI motif in this protein has been changed by site-directed mutagenesis, and the effects upon substrate specificity have been determined. The change to CHSV caused the loss of all alpha-glucosidase activity, but the mutant protein exhibited none of the anticipated alpha-galactosidase activity. The Cys-containing motif may be suggestive of enzyme specificity, but phylogenetic placement is required for confidence in that specificity. The Acholeplasma laidlawii GH4 protein is phylogenetically a phospho-beta-glucosidase but has a unique SSSP motif. Lacking the initial Cys in that motif it cannot hydrolyze glycosides by the normal GH4 mechanism because the Cys is required to position the metal ion for hydrolysis, nor can it use the more common single or double-displacement mechanism of Koshland. 
Several considerations suggest that the protein has acquired a new function as the consequence of positive selection. This study emphasizes the importance of automatic annotation systems that by integrating phylogenetic analysis, functional motifs, and bioinformatics data, may lead to innovative experiments that further our understanding of biological systems.


Subjects
Erwinia/enzymology, Molecular Evolution, Glycoside Hydrolases/classification, Glycoside Hydrolases/genetics, Site-Directed Mutagenesis, Phylogeny, alpha-Galactosidase/classification, alpha-Galactosidase/genetics, alpha-Glucosidases/classification, alpha-Glucosidases/genetics
13.
Microbiology (Reading) ; 156(Pt 4): 1060-1068, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20019077

ABSTRACT

The most widely used DNA-based method for bacterial strain typing, multi-locus sequence typing (MLST), lacks sufficient resolution to distinguish among many bacterial strains within a species. Here, we show that strain typing based on the presence or absence of distributed genes is able to resolve all completely sequenced genomes of six bacterial species. This was accomplished by the development of a clustering method, neighbour grouping, which is completely consistent with the lower-resolution MLST method, but provides far greater resolving power. Because the presence/absence of distributed genes can be determined by low-cost microarray analyses, it offers a practical, high-resolution alternative to MLST that could provide valuable diagnostic and prognostic information for pathogenic bacterial species.


Subjects
Bacteria/classification, Bacteria/genetics, Bacterial Typing Techniques/methods, Bacterial Genome, DNA Sequence Analysis/methods, Molecular Sequence Data
14.
J Clin Microbiol ; 48(6): 1997-2008, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20351204

ABSTRACT

It has proven challenging to investigate the molecular epidemiology of Mycobacterium leprae, the causative agent of leprosy, due to difficulties with culturing of the organism and a lack of genetic heterogeneity between strains. Recently, a cost-effective panel of variable-number tandem-repeat (VNTR) markers has been developed. Use of this panel allows some of those limitations to be overcome and has allowed the genotyping of 475 M. leprae strains from six different countries. In the present report, we provide a comprehensive analysis of the relationships among the strains in order to investigate the patterns of transmission and migration of M. leprae. We find phylogenetic analysis to be inadequate and have developed an alternative method, structure-neighbor clustering, which assigns isolates with the most similar genotypes to the same groups and, subsequently, subgroups, without inferring how the strains descended from a common ancestor. We validate the approach by using simulated data and detecting expected epidemiological relationships from experimental data. Our results suggest that most M. leprae strains from a given country cluster together and that the occasional isolates assigned to different clusters are a consequence of migration. We found three genetically distinguishable populations among isolates from the Philippines, as well as evidence for the significant influx of strains to that nation from India. We also report that reference strain TN originated from the Philippines and not from India, as was previously believed. Lastly, analysis of isolates from the same families and villages suggests that most community infections originate from a common source or person-to-person transmission but that infection from independent sources does occur with measurable frequency.


Subjects
Bacterial Typing Techniques/methods, Leprosy/epidemiology, Leprosy/microbiology, Mycobacterium leprae/classification, Mycobacterium leprae/genetics, Genetic Polymorphism, Cluster Analysis, DNA Fingerprinting, Genotype, Humans, India/epidemiology, Leprosy/transmission, Minisatellite Repeats, Molecular Epidemiology, Mycobacterium leprae/isolation & purification, Philippines/epidemiology, Phylogeny
15.
Mol Biol Evol ; 25(4): 688-95, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18192698

ABSTRACT

Phylogenetic reconstruction based upon multiple alignments of molecular sequences is important to most branches of modern biology and is central to molecular evolution. Understanding the historical relationships among macromolecules depends upon computer programs that implement a variety of analytical methods. Because it is impossible to know those historical relationships with certainty, assessment of the accuracy of methods and the programs that implement them requires the use of programs that realistically simulate the evolution of DNA sequences. EvolveAGene 3 is a realistic coding sequence simulation program that separates mutation from selection and allows the user to set selection conditions, including variable regions of selection intensity within the sequence and variation in intensity of selection over branches. Variation includes base substitutions, insertions, and deletions. To the best of my knowledge, it is the only program available that simulates the evolution of intact coding sequences. Output includes the true tree and true alignments of the resulting coding sequence and corresponding protein sequences. A log file reports the frequencies of each kind of base substitution, the ratio of transition to transversion substitutions, the ratio of indel to base substitution mutations, and the numbers of silent and amino acid replacement mutations. The realism of the data sets has been assessed by comparing the d(N)/d(S) ratio, the ratio of transition to transversion substitutions, and the ratio of indel to base substitution mutations of the simulated data sets with those parameters of real data sets from the "gold standard" BaliBase collection of structural alignments. Results show that the data sets produced by EvolveAGene 3 are very similar to real data sets, and EvolveAGene 3 is therefore a realistic simulation program that can be used to evaluate a variety of programs and methods in molecular evolution.


Subjects
Computational Biology/methods, Computer Simulation, Molecular Evolution, Genetic Models, Software, Base Sequence, Nucleic Acid Databases, Escherichia coli/genetics, Sequence Alignment
16.
Mol Biol Evol ; 25(8): 1576-80, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18458029

ABSTRACT

Multiple sequence alignment is an essential tool in many areas of biological research, and the accuracy of an alignment can strongly affect the accuracy of a downstream application such as phylogenetic analysis, identification of functional motifs, or polymerase chain reaction primer design. The heads or tails (HoT) method (Landan G, Graur D. 2007. Heads or tails: a simple reliability check for multiple sequence alignments. Mol Biol Evol. 24:1380-1383.) assesses the consistency of an alignment by comparing the alignment of a set of sequences with the alignment of the same set of sequences written in reverse order. This study shows that HoT scores and the alignment accuracies are positively correlated, so alignments with higher HoT scores are preferable. However, HoT scores are overestimates of alignment accuracy in general, with the extent of overestimation depending on the method used for multiple sequence alignment.


Subjects
Sequence Alignment/methods, Software, Computer Simulation, Sequence Alignment/statistics & numerical data
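
The comparison at the heart of the HoT method described in the abstract above (the alignment of a set of sequences versus the alignment of the same sequences written in reverse order) can be sketched as follows. The sketch takes two alignments that have already been computed and scores them by the fraction of aligned residue pairs they share; this is one plausible agreement measure, and the published method's exact scoring may differ. The function names `residue_pairs` and `hot_pair_score` and the toy alignments are assumptions.

```python
def residue_pairs(alignment, reversed_input=False):
    """Return the set of aligned residue pairs in an alignment.

    alignment: list of equal-length gapped strings ('-' marks a gap).
    reversed_input: True if the sequences were reversed before alignment, in
    which case residue positions are mapped back to the original orientation.
    """
    lengths = [len(row.replace("-", "")) for row in alignment]
    counters = [0] * len(alignment)
    pairs = set()
    for col in range(len(alignment[0])):
        present = []
        for i, row in enumerate(alignment):
            if row[col] != "-":
                pos = counters[i]
                counters[i] += 1
                if reversed_input:
                    pos = lengths[i] - 1 - pos
                present.append((i, pos))
        # Every two residues sharing a column form one aligned pair.
        for a in range(len(present)):
            for b in range(a + 1, len(present)):
                pairs.add((present[a], present[b]))
    return pairs


def hot_pair_score(heads, tails):
    """Fraction of residue pairs in the forward alignment that are also
    recovered in the alignment of the reversed sequences."""
    head_pairs = residue_pairs(heads)
    tail_pairs = residue_pairs(tails, reversed_input=True)
    return len(head_pairs & tail_pairs) / len(head_pairs) if head_pairs else 1.0


# Sequences ACGT and AGT; the tails alignments use TGCA and TGA.
heads = ["ACGT", "A-GT"]
consistent_tails = ["TGCA", "TG-A"]
shifted_tails = ["TGCA", "T-GA"]
print(hot_pair_score(heads, consistent_tails), hot_pair_score(heads, shifted_tails))
```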
17.
PLoS Comput Biol ; 3(3): e51, 2007 Mar 16.
Article in English | MEDLINE | ID: mdl-17367204

ABSTRACT

Metrics of phylogenetic tree reliability, such as parametric bootstrap percentages or Bayesian posterior probabilities, represent internal measures of the topological reproducibility of a phylogenetic tree, while the recently introduced aLRT (approximate likelihood ratio test) assesses the likelihood that a branch exists on a maximum-likelihood tree. Although those values are often equated with phylogenetic tree accuracy, they do not necessarily estimate how well a reconstructed phylogeny represents cladistic relationships that actually exist in nature. The authors have therefore attempted to quantify how well bootstrap percentages, posterior probabilities, and aLRT measures reflect the probability that a deduced phylogenetic clade is present in a known phylogeny. The authors simulated the evolution of bacterial genes of varying lengths under biologically realistic conditions, and reconstructed those known phylogenies using both maximum likelihood and Bayesian methods. Then, they measured how frequently clades in the reconstructed trees exhibiting particular bootstrap percentages, aLRT values, or posterior probabilities were found in the true trees. The authors have observed that none of these values correlate with the probability that a given clade is present in the known phylogeny. The major conclusion is that none of the measures provide any information about the likelihood that an individual clade actually exists. It is also found that the mean of all clade support values on a tree closely reflects the average proportion of all clades that have been assigned correctly, and is thus a good representation of the overall accuracy of a phylogenetic tree.


Subjects
Algorithms, DNA Mutational Analysis/methods, Escherichia coli/genetics, Molecular Evolution, Genetic Models, Phylogeny, Computer Simulation, Statistical Data Interpretation, Genetic Variation/genetics, Statistical Models, Reproducibility of Results, Sensitivity and Specificity, Statistics as Topic
19.
Ann Epidemiol ; 16(3): 157-69, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16099674

ABSTRACT

Phylogenetics is a powerful tool for microbial epidemiology, but it is a tool that is often misused and misinterpreted by the field. Microbial epidemiologists are cautioned that in order to draw any inferences about the order of descent from a common ancestor it is necessary to correctly root a phylogenetic tree. Epidemiological samples of microbial populations typically include both ancestors and their descendants. In order to illustrate the relationships of those isolates, the phylogenetic method used must be able to detect zero-length branches. Unweighted Pair-Group Method (UPGMA) is the phylogenetic method that is most widely used in microbial epidemiology. Because UPGMA cannot detect zero length branches, and because it places the root of the tree based on a usually-false assumption, UPGMA is the worst possible choice among the several phylogenetic methods available. Because microbial epidemiology deals with relationships among strains within a species, rather than with relationships among species, recombination within those species can render phylogenetic trees meaningless and positively misleading. When there is evidence of significant recombination within the species of interest phylogenetic trees should not be used at all. Instead, alternative tools such as eBURST should be used to understand relationships among isolates.


Subjects
Communicable Diseases/epidemiology, Communicable Diseases/microbiology, Molecular Epidemiology/methods, Phylogeny, Base Sequence, Humans, Molecular Sequence Data, Genetic Recombination