Mash-based analyses of Escherichia coli genomes reveal 14 distinct phylogroups.
Abram, Kaleb; Udaondo, Zulema; Bleker, Carissa; Wanchai, Visanu; Wassenaar, Trudy M; Robeson, Michael S; Ussery, David W.
Affiliation
  • Abram K; Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205, USA.
  • Udaondo Z; Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205, USA.
  • Bleker C; The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee, Knoxville, Tennessee, 37996, USA.
  • Wanchai V; Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee, 37996, USA.
  • Wassenaar TM; Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205, USA.
  • Robeson MS; Molecular Microbiology and Genomics Consultants, 55576, Zotzenheim, Germany.
  • Ussery DW; Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, 72205, USA.
Commun Biol; 4(1): 117, 2021 Jan 26.
Article in English | MEDLINE | ID: mdl-33500552
ABSTRACT
In this study, more than one hundred thousand Escherichia coli and Shigella genomes were examined and classified. This is, to our knowledge, the largest E. coli genome dataset analyzed to date. A Mash-based analysis of a cleaned set of 10,667 E. coli genomes from GenBank revealed 14 distinct phylogroups. A representative genome, or medoid, identified for each phylogroup was used as a proxy to classify 95,525 unassembled genomes from the Sequence Read Archive (SRA). We find that most of the sequenced E. coli genomes belong to four phylogroups (A, C, B1 and E2(O157)). Authenticity of the 14 phylogroups is supported by several different lines of evidence: phylogroup-specific core genes, a phylogenetic tree constructed with 2613 single copy core genes, and differences in the rates of gene gain/loss/duplication. The methodology used in this work is able to reproduce known phylogroups, as well as to identify previously uncharacterized phylogroups in E. coli species.
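The medoid-as-proxy idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes pairwise distances (such as those a Mash comparison would produce) are already available, uses one common definition of a medoid (the member with the smallest summed distance to the other members of its group), and assigns a query to the group with the nearest medoid. The genome names, group labels "X" and "Y", and all distance values are hypothetical toy data.

```python
from typing import Dict, Sequence, Tuple


def symmetric(pairs: Dict[Tuple[str, str], float]) -> Dict[str, Dict[str, float]]:
    """Expand {(a, b): d} into a lookup that works in both directions."""
    table: Dict[str, Dict[str, float]] = {}
    for (a, b), d in pairs.items():
        table.setdefault(a, {})[b] = d
        table.setdefault(b, {})[a] = d
    return table


def find_medoid(members: Sequence[str], dist: Dict[str, Dict[str, float]]) -> str:
    """Return the member with the smallest summed distance to the other members."""
    return min(
        members,
        key=lambda g: sum(dist[g][other] for other in members if other != g),
    )


def classify_by_medoid(query_dist: Dict[str, float], medoids: Dict[str, str]) -> str:
    """Return the group whose medoid is closest to the query genome."""
    return min(medoids, key=lambda group: query_dist[medoids[group]])


# Hypothetical within-group distances (toy values, not from the paper).
dist = symmetric({
    ("g1", "g2"): 0.010, ("g1", "g3"): 0.020, ("g2", "g3"): 0.012,
    ("g4", "g5"): 0.008, ("g4", "g6"): 0.015, ("g5", "g6"): 0.009,
})

# Hypothetical group memberships; "X" and "Y" are placeholder labels.
groups = {"X": ["g1", "g2", "g3"], "Y": ["g4", "g5", "g6"]}

# One representative genome per group.
medoids = {name: find_medoid(members, dist) for name, members in groups.items()}

# Hypothetical distances from an unclassified genome to each medoid.
query_to_medoids = {"g2": 0.030, "g5": 0.011}
assigned = classify_by_medoid(query_to_medoids, medoids)

print(medoids)
print(assigned)
```

Comparing each new genome against one representative per group, rather than against every reference genome, keeps the number of distance computations per query equal to the number of groups.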
Subject(s)

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Genome, Bacterial / Escherichia coli Language: English Journal: Commun Biol Year: 2021 Document type: Article Country of affiliation: United States