Search | VHL Regional Portal

Fast genome-based delimitation of Enterobacterales species.

Hernández-Salmerón, Julie E; Irani, Tanya; Moreno-Hagelsieb, Gabriel.

PLoS One ; 18(9): e0291492, 2023.

Article in English | MEDLINE | ID: mdl-37708115

ABSTRACT

Average Nucleotide Identity (ANI) is becoming a standard measure for bacterial species delimitation. However, its calculation can take orders of magnitude longer than similarity estimates based on sampling of short nucleotides, compiled into so-called sketches. These estimates are widely used. However, their variable correlation with ANI has suggested that they might not be as accurate. For a where-the-rubber-meets-the-road assessment, we compared two sketching programs, Mash and Dashing, against ANI, in delimiting species among Enterobacterales genomes. Receiver Operating Characteristic (ROC) analysis found Area Under the Curve (AUC) values of 0.99, almost perfect species discrimination for all three measures. Subsampling to avoid over-represented species reduced these AUC values to 0.92, still highly accurate. Focused tests with ten genera, each represented by more than three species, also showed almost identical results for all methods. Shigella showed the lowest AUC values (0.68), followed by Citrobacter (0.80). All other genera, Dickeya, Enterobacter, Escherichia, Klebsiella, Pectobacterium, Proteus, Providencia and Yersinia, produced AUC values above 0.90. The species delimitation thresholds varied, with species distance ranges in a few genera overlapping the genus ranges of other genera. Mash was able to separate the E. coli + Shigella complex into 25 apparent phylogroups, four of them corresponding, roughly, to the four Shigella species represented in the data. Our results suggest that fast estimates of genome similarity are as good as ANI for species delimitation. Therefore, these estimates might suffice for covering the role of genomic similarity in bacterial taxonomy, and should increase confidence in their use for efficient bacterial identification and clustering, from epidemiological to genome-based detection of potential contaminants in farming and industry settings.
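As an illustration of the ROC analysis the abstract describes, the following minimal sketch computes an AUC for separating same-species from different-species genome pairs by their distance. It uses the rank-based (Mann-Whitney) formulation of AUC rather than tracing a curve. The function name `species_auc`, the input layout, and the toy distances are assumptions made for this example; they are not the authors' code or data.

```python
from itertools import combinations


def species_auc(distances, species):
    """Estimate the AUC for telling same-species pairs from different-species pairs.

    distances: dict mapping frozenset({genome_a, genome_b}) -> distance
    species:   dict mapping genome -> species label
    """
    same, different = [], []
    for a, b in combinations(sorted(species), 2):
        d = distances[frozenset((a, b))]
        (same if species[a] == species[b] else different).append(d)
    if not same or not different:
        raise ValueError("need at least one pair of each kind")
    # Rank-based AUC: the fraction of (same, different) pair combinations in
    # which the same-species pair has the smaller distance; ties count half.
    wins = 0.0
    for s in same:
        for d in different:
            if s < d:
                wins += 1.0
            elif s == d:
                wins += 0.5
    return wins / (len(same) * len(different))


# Toy data (hypothetical): two species, with one cross-species pair that is
# closer than one of the within-species pairs.
species = {"g1": "A", "g2": "A", "g3": "B", "g4": "B"}
distances = {
    frozenset(("g1", "g2")): 0.01,
    frozenset(("g3", "g4")): 0.02,
    frozenset(("g1", "g3")): 0.015,
    frozenset(("g1", "g4")): 0.20,
    frozenset(("g2", "g3")): 0.20,
    frozenset(("g2", "g4")): 0.20,
}
print(species_auc(distances, species))
```

An AUC of 1.0 would mean every same-species pair is closer than every different-species pair; overlapping distance ranges, as reported for Shigella, pull the value down.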

Subjects

Escherichia coli, Gammaproteobacteria, Animals, Dickeya, Genomics, Agriculture

FastANI, Mash and Dashing equally differentiate between Klebsiella species.

Hernández-Salmerón, Julie E; Moreno-Hagelsieb, Gabriel.

PeerJ ; 10: e13784, 2022.

Article in English | MEDLINE | ID: mdl-35891643

ABSTRACT

Bacteria of the genus Klebsiella are among the most important multi-drug resistant human pathogens, though they have been isolated from a variety of environments. The importance and ubiquity of these organisms call for quick and accurate methods for their classification. Average Nucleotide Identity (ANI) is becoming a standard for species delimitation based on whole genome sequence comparison. However, much faster genome comparison tools have been appearing in the literature. In this study we tested the quality of different approaches for genome-based species delineation against ANI. To this end, we compared 1,189 Klebsiella genomes using measures calculated with Mash, Dashing, and DNA compositional signatures, all of which run in a fraction of the time required to obtain ANI. Receiver Operating Characteristic (ROC) curve analyses showed equal quality in species discrimination for ANI, Mash and Dashing, with Area Under the Curve (AUC) values above 0.99, followed by DNA signatures (AUC: 0.96). Accordingly, groups obtained at optimized cutoffs largely agree with species designation, with ANI, Mash and Dashing producing 15 species-level groups. DNA signatures broke the dataset into more than 30 groups. Testing Mash to map species after adding draft genomes to the dataset also showed excellent results (AUC above 0.99), producing a total of 26 Klebsiella species-level groups. The ecological niches of Klebsiella strains were found to be related neither to species delimitation nor to protein functional content, suggesting that a single Klebsiella species can have a wide repertoire of ecological functions.
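To give a sense of why sketch-based measures are so much cheaper than ANI, the sketch below derives a Mash-style distance from the Jaccard index of two k-mer sets, using the formula D = -(1/k) ln(2j / (1 + j)) from the Mash publication (it is not stated in this abstract). Several simplifications are assumptions of this example: it compares exact k-mer sets instead of MinHash sketches, ignores reverse complements, uses a tiny k, and runs on made-up sequences.

```python
import math


def kmers(seq, k):
    """Return the set of all k-length substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(j, k):
    """Mash-style distance from a Jaccard index j and k-mer size k."""
    if j <= 0.0:
        return 1.0  # convention in this sketch: no shared k-mers -> distance 1
    return max(0.0, -math.log(2.0 * j / (1.0 + j)) / k)


# Toy sequences (hypothetical), differing at a single position.
k = 5
seq_a = "ACGTTGCAAGCTTAGGCTAACGT"
seq_b = "ACGTTGCAAGCTAAGGCTAACGT"
j = jaccard(kmers(seq_a, k), kmers(seq_b, k))
print(round(j, 3), round(mash_distance(j, k), 4))
```

Real tools avoid building full k-mer sets: they hash k-mers and keep only a small sample per genome, so comparing two genomes costs a comparison of two short sketches rather than an alignment.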

Subjects

Bacterial Genome, Klebsiella, Humans, Klebsiella/genetics, Bacterial Genome/genetics, Bacteria, DNA

Progress in quickly finding orthologs as reciprocal best hits: comparing blast, last, diamond and MMseqs2.

Hernández-Salmerón, Julie E; Moreno-Hagelsieb, Gabriel.

BMC Genomics ; 21(1): 741, 2020 Oct 24.

Article in English | MEDLINE | ID: mdl-33099302

ABSTRACT

BACKGROUND: Finding orthologs remains an important bottleneck in comparative genomics analyses. While the authors of software for the quick comparison of protein sequences evaluate the speed of their software and compare their results against the most usual software for the task, it is not common for them to evaluate their software for more particular uses, such as finding orthologs as reciprocal best hits (RBH). Here we compared RBH results obtained using software that runs faster than blastp, namely lastal, diamond, and MMseqs2. RESULTS: We found that lastal required the least time to produce results. However, it yielded fewer results than any other program when comparing the proteins encoded by evolutionarily distant genomes. The program producing the most similar number of RBH to blastp was diamond run with the "ultra-sensitive" option. However, this option was diamond's slowest, with the "very-sensitive" option offering the best balance between speed and RBH results. The speed-up was much more evident when dealing with eukaryotic genomes, which code for more numerous proteins. For example, lastal took a median of approx. 1.5% of the blastp time to run with bacterial proteomes and 0.6% with eukaryotic ones, while diamond with the very-sensitive option took 7.4% and 5.2%, respectively. Though estimated error rates were very similar among the RBH obtained with all programs, RBH obtained with MMseqs2 had the lowest error rates among the programs tested. CONCLUSIONS: The fast algorithms for pairwise protein comparison produced results very similar to blast in a fraction of the time, with diamond offering the best compromise in speed, sensitivity and quality, as long as a sensitivity option other than the default was chosen.
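The reciprocal best hit criterion itself is simple once the pairwise comparisons are done, which is why the choice of comparison program dominates the running time. The following minimal sketch assumes hits are already available as (query, subject, bit score) tuples; the function names and toy hits are hypothetical, and the authors' actual filtering (coverage, E-value, tie handling) is not described in the abstract and is not reproduced here.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples.
    On equal scores the first hit seen is kept (a simplification).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy hits (hypothetical protein identifiers and scores).
a_vs_b = [("a1", "b1", 200), ("a1", "b2", 150), ("a2", "b2", 90), ("a3", "b1", 120)]
b_vs_a = [("b1", "a1", 198), ("b2", "a1", 140), ("b2", "a2", 95)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here only a1 and b1 pick each other: a2's best hit is b2, but b2 scores higher against a1, so that pair is not reciprocal.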

Subjects

Diamond, Software, Algorithms, Amino Acid Sequence, Genome

Genome Comparison of Pseudomonas fluorescens UM270 with Related Fluorescent Strains Unveils Genes Involved in Rhizosphere Competence and Colonization.

Hernández-Salmerón, Julie E; Moreno-Hagelsieb, Gabriel; Santoyo, Gustavo.

J Genomics ; 5: 91-98, 2017.

Article in English | MEDLINE | ID: mdl-28943971

ABSTRACT

Pseudomonas fluorescens UM270 is a rhizosphere-colonizing bacterium that produces multiple diffusible and volatile compounds involved in plant growth-promoting activities. Strain UM270 exhibits excellent biocontrol capacities against diverse fungal pathogens. In a previous study, the general UM270 genome characteristics were published. Here, we report a deeper analysis of its gene content and compare it to other P. fluorescens strains to unveil the genetic elements that might explain UM270's great colonizing and plant growth-promoting capabilities. Our analyses found high variation in genome size and gene content among the eight Pseudomonas genomes analyzed (strains UM270, Pf0-1, A506, F113, SBW25, PICF-7, UK4 and UW4). A core genome of 3,039 coding DNA sequences (CDSs) was determined, with 599 CDSs present only in the UM270 genome. From these unique UM270 genes, a set of 192 CDSs was found to be involved in signaling, rhizosphere colonization and competence, highlighted as important traits to achieve an effective biocontrol and plant growth promotion.
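The core and strain-specific gene counts reported above come down to set operations once CDSs have been grouped into ortholog families. The sketch below assumes such a grouping already exists; the function name, the strain labels, and the family identifiers are hypothetical toy values, not the study's data, and how the authors assigned orthologs is not stated in the abstract.

```python
def core_and_unique(families_by_genome, focal):
    """Return (core families, families found only in the focal genome).

    families_by_genome: dict mapping genome name -> iterable of family IDs
    focal:              name of the genome whose unique families are wanted
    """
    if focal not in families_by_genome:
        raise KeyError(f"unknown genome: {focal}")
    sets = {genome: set(fams) for genome, fams in families_by_genome.items()}
    # Core genome: families present in every genome.
    core = set.intersection(*sets.values())
    # Unique families: present in the focal genome and in no other genome.
    others = set().union(*(s for genome, s in sets.items() if genome != focal))
    unique = sets[focal] - others
    return core, unique


# Toy input (hypothetical strains and ortholog family IDs).
families = {
    "focal_strain": {"f1", "f2", "f3", "f7", "f8"},
    "strain_b": {"f1", "f2", "f3", "f4"},
    "strain_c": {"f1", "f2", "f5", "f3"},
}
core, unique = core_and_unique(families, "focal_strain")
print(sorted(core), sorted(unique))
```

With real data, the sizes of these two sets correspond to the kind of figures the abstract gives for the core genome and for the CDSs present only in UM270.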

Draft Genome Sequence of the Biocontrol and Plant Growth-Promoting Rhizobacterium Pseudomonas fluorescens strain UM270.

Hernández-Salmerón, Julie E; Hernández-León, Rocio; Orozco-Mosqueda, Ma Del Carmen; Valencia-Cantero, Eduardo; Moreno-Hagelsieb, Gabriel; Santoyo, Gustavo.

Stand Genomic Sci ; 11: 5, 2016.

Article in English | MEDLINE | ID: mdl-26767092

ABSTRACT

The Pseudomonas fluorescens strain UM270 was isolated from the rhizosphere of wild Medicago spp. Previous work showed that this pseudomonad isolate was able to produce diverse diffusible and volatile compounds involved in plant protection and growth promotion. Here, we present the draft genome sequence of the rhizobacterium P. fluorescens strain UM270. The sequence covers 6,047,974 bp of a single chromosome, with 62.66% G+C content and no plasmids. Genome annotations predicted 5,509 genes, 5,396 coding genes, 59 RNA genes and 110 pseudogenes. Genome sequence analysis revealed the presence of genes involved in biological control and plant-growth promoting activities. We anticipate that the P. fluorescens strain UM270 genome will contribute insights about bacterial plant protection and beneficial properties through genomic comparisons among fluorescent pseudomonads.
