ABSTRACT
MOTIVATION: Kinases of the eukaryotic protein kinase superfamily are key regulators of most aspects of eukaryotic cellular behavior and have provided several drug targets, including kinases dysregulated in cancers. The rapid increase in the number of genomic sequences has created an acute need to identify and classify members of this important class of enzymes efficiently and accurately. RESULTS: Kinannote produces a draft kinome and comparative analyses for a predicted proteome using a single-line command, and it is currently the only tool that automatically classifies protein kinases using the controlled vocabulary of Hanks and Hunter [Hanks and Hunter (1995)]. Kinannote uses a hidden Markov model in combination with a position-specific scoring matrix to identify kinases, which are subsequently classified using a BLAST comparison with a local version of KinBase, the curated protein kinase dataset from www.kinase.com. Kinannote was tested on the predicted proteomes from four divergent species. The average sensitivity and precision for kinome retrieval from the test species are 94.4% and 96.8%, respectively. The ability of Kinannote to classify identified kinases was also evaluated, and the average sensitivity and precision for full classification of conserved kinases are 71.5% and 82.5%, respectively. Kinannote has had a significant impact on eukaryotic genome annotation, providing protein kinase annotations for 36 genomes made public by the Broad Institute in the period spanning 2009 to the present. AVAILABILITY: Kinannote is freely available at http://sourceforge.net/projects/kinannote.
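The sensitivity and precision figures reported in this abstract can be illustrated with a minimal sketch. It assumes the standard definitions (sensitivity = TP / (TP + FN), precision = TP / (TP + FP)); the function name and the protein identifiers are hypothetical and are not taken from Kinannote.

```python
def sensitivity_precision(predicted, reference):
    """Return (sensitivity, precision) for a predicted set versus a curated set.

    Both arguments are iterables of protein identifiers (hypothetical here).
    """
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)    # kinases correctly retrieved
    false_neg = len(reference - predicted)   # curated kinases that were missed
    false_pos = len(predicted - reference)   # non-kinases reported as kinases
    sensitivity = true_pos / (true_pos + false_neg) if reference else 0.0
    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    return sensitivity, precision


# Illustrative identifiers only; not real data.
curated = {"k1", "k2", "k3", "k4", "k5"}
called = {"k1", "k2", "k3", "x1"}
sens, prec = sensitivity_precision(called, curated)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

Averaging these two values over several test proteomes would give summary figures of the kind quoted in the abstract.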
Subject(s)
Eukaryotic Cells/enzymology , Protein Kinases/classification , Algorithms , Genome , Internet , Position-Specific Scoring Matrices , Protein Kinases/genetics , Protein Kinases/metabolism , Proteome/genetics , Software Design
ABSTRACT
BACKGROUND: As public health interventions drive parasite populations to elimination, genetic epidemiology models that incorporate population genomics can be powerful tools for evaluating the effectiveness of continued intervention. However, current genetic epidemiology models may not accurately simulate the population genetic profile of parasite populations, particularly with regard to polygenomic (multi-strain) infections. Current epidemiology models simulate polygenomic infections via superinfection (multiple mosquito bites), despite growing evidence that cotransmission (a single mosquito bite) may contribute to polygenomic infections. METHODS: Here, we quantified the relatedness of strains within 31 polygenomic infections collected from patients in Thiès, Senegal using a hidden Markov model to measure the proportion of the genome that is inferred to be identical by descent. RESULTS: We found that polygenomic infections can be composed of highly related parasites and that superinfection models drastically underestimate the relatedness of strains within polygenomic infections. CONCLUSIONS: Our findings suggest that cotransmission is a major contributor to polygenomic infections in Thiès, Senegal. The incorporation of cotransmission into existing genetic epidemiology models may enhance our ability to characterize and predict changes in population structure associated with reduced transmission intensities and the emergence of important phenotypes like drug resistance that threaten to undermine malaria elimination activities.
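The quantity measured in this abstract, the proportion of the genome inferred to be identical by descent (IBD), can be sketched as a post-processing step. This is not the authors' hidden Markov model; it assumes that some upstream method has already emitted IBD segments as half-open `(start, end)` coordinates on a single concatenated genome. The function name and the example coordinates are hypothetical.

```python
def ibd_fraction(segments, genome_length):
    """Fraction of the genome covered by IBD segments.

    segments: iterable of half-open (start, end) intervals, possibly overlapping.
    genome_length: total length of the genome in the same coordinate units.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent to the previous segment: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    covered = sum(end - start for start, end in merged)
    return covered / genome_length


# Illustrative coordinates only; not real data.
print(ibd_fraction([(0, 100), (50, 200), (300, 400)], 1000))
```

Merging the intervals first avoids counting overlapping segments twice, which would otherwise inflate the relatedness estimate.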
Subject(s)
Genome, Protozoan , Malaria, Falciparum/transmission , Models, Genetic , Plasmodium falciparum/genetics , Genetic Variation , Humans , Malaria, Falciparum/epidemiology , Malaria, Falciparum/genetics , Markov Chains , Senegal
ABSTRACT
UNLABELLED: The diverse Fusobacterium genus contains species implicated in multiple clinical pathologies, including periodontal disease, preterm birth, and colorectal cancer. The lack of genetic tools for manipulating these organisms leaves us with little understanding of the genes responsible for adherence to and invasion of host cells. Actively invading Fusobacterium species can enter host cells independently, whereas passively invading species need additional factors, such as compromise of mucosal integrity or coinfection with other microbes. We applied whole-genome sequencing and comparative analysis to study the evolution of active and passive invasion strategies and to infer factors associated with active forms of host cell invasion. The evolution of active invasion appears to have followed an adaptive radiation in which two of the three fusobacterial lineages acquired new genes and underwent expansions of ancestral genes that enable active forms of host cell invasion. Compared to passive invaders, active invaders have much larger genomes, encode FadA-related adhesins, and possess twice as many genes encoding membrane-related proteins, including a large expansion of surface-associated proteins containing the MORN2 domain of unknown function. We predict a role for proteins containing MORN2 domains in adhesion and active invasion. In the largest and most comprehensive comparison of sequenced Fusobacterium species to date, we have generated a testable model for the molecular pathogenesis of Fusobacterium infection and illuminate new therapeutic or diagnostic strategies. IMPORTANCE: Fusobacterium species have recently been implicated in a broad spectrum of human pathologies, including Crohn's disease, ulcerative colitis, preterm birth, and colorectal cancer. 
Largely due to the genetic intractability of member species, the mechanisms by which Fusobacterium causes these pathologies are not well understood, although adherence to and active invasion of host cells appear important. We examined whole-genome sequence data from a diverse set of Fusobacterium species to identify genetic determinants of active forms of host cell invasion. Our analyses revealed that actively invading Fusobacterium species have larger genomes than passively invading species and possess a specific complement of genes, including a class of genes of unknown function that we predict evolved to enable host cell adherence and invasion. This study provides an important framework for future studies on the role of Fusobacterium in pathologies such as colorectal cancer.
Subject(s)
Bacterial Adhesion , Endocytosis , Fusobacterium/physiology , Genes, Bacterial , Genome, Bacterial , Virulence Factors/genetics , Evolution, Molecular , Fusobacterium/genetics , Fusobacterium/growth & development , Sequence Analysis, DNA , Virulence
ABSTRACT
UNLABELLED: The large outbreak of diarrhea and hemolytic uremic syndrome (HUS) caused by Shiga toxin-producing Escherichia coli O104:H4 in Europe from May to July 2011 highlighted the potential of a rarely identified E. coli serogroup to cause severe disease. Prior to the outbreak, there were very few reports of disease caused by this pathogen and thus little known of its diversity and evolution. The identification of cases of HUS caused by E. coli O104:H4 in France and Turkey after the outbreak and with no clear epidemiological links raises questions about whether these sporadic cases are derived from the outbreak. Here, we report genome sequences of five independent isolates from these cases and results of a comparative analysis with historical and 2011 outbreak isolates. These analyses revealed that the five isolates are not derived from the outbreak strain; however, they are more closely related to the outbreak strain and each other than to isolates identified prior to the 2011 outbreak. Over the short time scale represented by these closely related organisms, the majority of genome variation is found within their mobile genetic elements: none of the nine O104:H4 isolates compared here contain the same set of plasmids, and their prophages and genomic islands also differ. Moreover, the presence of closely related HUS-associated E. coli O104:H4 isolates supports the contention that fully virulent O104:H4 isolates are widespread and emphasizes the possibility of future food-borne E. coli O104:H4 outbreaks. IMPORTANCE: In the summer of 2011, a large outbreak of bloody diarrhea with a high rate of severe complications took place in Europe, caused by a previously rarely seen Escherichia coli strain of serogroup O104:H4. Identification of subsequent infections caused by E. coli O104:H4 raised questions about whether these new cases represented ongoing transmission of the outbreak strain. 
In this study, we sequenced the genomes of isolates from five recent cases and compared them with historical isolates. The analyses reveal that, in the very short term, evolution of the bacterial genome takes place in parts of the genome that are exchanged among bacteria, and these regions contain genes involved in adaptation to local environments. We show that these recent isolates are not derived from the outbreak strain but are very closely related and share many of the same disease-causing genes, emphasizing the concern that these bacteria may cause future severe outbreaks.
Subject(s)
Biological Evolution , Escherichia coli Infections/microbiology , Genome, Bacterial , Shiga-Toxigenic Escherichia coli/genetics , Cluster Analysis , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Diarrhea/epidemiology , Diarrhea/microbiology , Escherichia coli Infections/epidemiology , Europe/epidemiology , Hemolytic-Uremic Syndrome/epidemiology , Hemolytic-Uremic Syndrome/microbiology , Humans , Interspersed Repetitive Sequences , Molecular Epidemiology , Molecular Sequence Data , Phylogeny , Sequence Analysis, DNA
ABSTRACT
UNLABELLED: Enterococcus faecium, natively a gut commensal organism, emerged as a leading cause of multidrug-resistant hospital-acquired infection in the 1980s. As the living record of its adaptation to changes in habitat, we sequenced the genomes of 51 strains, isolated from various ecological environments, to understand how E. faecium emerged as a leading hospital pathogen. Because of the scale and diversity of the sampled strains, we were able to resolve the lineage responsible for epidemic, multidrug-resistant human infection from other strains and to measure the evolutionary distances between groups. We found that the epidemic hospital-adapted lineage is rapidly evolving and emerged approximately 75 years ago, concomitant with the introduction of antibiotics, from a population that included the majority of animal strains, and not from human commensal lines. We further found that the lineage that included most strains of animal origin diverged from the main human commensal line approximately 3,000 years ago, a time that corresponds to increasing urbanization of humans, development of hygienic practices, and domestication of animals, which we speculate contributed to their ecological separation. Each bifurcation was accompanied by the acquisition of new metabolic capabilities and colonization traits on mobile elements and the loss of function and genome remodeling associated with mobile element insertion and movement. As a result, diversity within the species, in terms of sequence divergence as well as gene content, spans a range usually associated with speciation. IMPORTANCE: Enterococci, in particular vancomycin-resistant Enterococcus faecium, recently emerged as a leading cause of hospital-acquired infection worldwide. In this study, we examined genome sequence data to understand the bacterial adaptations that accompanied this transformation from microbes that existed for eons as members of host microbiota. 
We observed changes in the genomes that paralleled changes in human behavior. An initial bifurcation within the species appears to have occurred at a time that corresponds to the urbanization of humans and domestication of animals, and a more recent bifurcation parallels the introduction of antibiotics in medicine and agriculture. In response to the opportunity to fill niches associated with changes in human activity, a rapidly evolving lineage emerged, a lineage responsible for the vast majority of multidrug-resistant E. faecium infections.
Subject(s)
Cross Infection/epidemiology , DNA, Bacterial/genetics , Drug Resistance, Multiple, Bacterial , Enterococcus faecium/drug effects , Epidemics , Gram-Positive Bacterial Infections/epidemiology , Gram-Positive Bacterial Infections/veterinary , Animals , Cluster Analysis , Cross Infection/microbiology , DNA, Bacterial/chemistry , Enterococcus faecium/classification , Enterococcus faecium/genetics , Enterococcus faecium/isolation & purification , Evolution, Molecular , Genome, Bacterial , Gram-Positive Bacterial Infections/microbiology , Humans , Phylogeny , Sequence Analysis, DNA
ABSTRACT
Listeria monocytogenes, a foodborne bacterial pathogen, comprises four phylogenetic lineages that vary with regard to their serotypes and distribution among sources. In order to characterize lineage-specific genomic diversity within L. monocytogenes, we sequenced the genomes of eight strains from several lineages and serotypes, and characterized the accessory genome, which was hypothesized to contribute to phenotypic differences across lineages. The eight L. monocytogenes genomes sequenced range in size from 2.85 to 3.14 Mb, encode 2,822 to 3,187 genes, and include the first publicly available sequenced representatives of serotypes 1/2c, 3a and 4c. Mapping of the distribution of accessory genes revealed two distinct regions of the L. monocytogenes chromosome: an accessory-rich region in the first 65° adjacent to the origin of replication and a more stable region in the remaining 295°. This pattern of genome organization is distinct from that of the related bacteria Staphylococcus aureus and Bacillus cereus. The accessory genome of all lineages is enriched for cell surface-related genes, phosphotransferase systems, and transcriptional regulators, highlighting the selective pressures faced by contemporary strains from their hosts, other microbes, and their environment. Phylogenetic analysis of O-antigen genes and gene clusters predicts that serotype 4 was ancestral in L. monocytogenes and that serotype 1/2-associated gene clusters were putatively introduced through horizontal gene transfer in the ancestral population of L. monocytogenes lineages I and II.
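The division of the circular chromosome into a 65° accessory-rich region and a 295° stable region can be illustrated by converting a base-pair coordinate into an angle from the origin of replication. This is a sketch under stated assumptions: the origin is taken to be at coordinate 0, angles are measured in the direction of increasing coordinate, and the genome length of 3,000,000 bp is a round hypothetical value. None of these details come from the abstract.

```python
def replication_angle(position, genome_length, origin=0):
    """Angle in degrees (0 <= angle < 360) of a position on a circular chromosome."""
    offset = (position - origin) % genome_length  # wrap around the circle
    return 360.0 * offset / genome_length


def region(position, genome_length, origin=0, boundary_deg=65.0):
    """Label a position using the 65-degree boundary quoted in the abstract."""
    angle = replication_angle(position, genome_length, origin)
    return "accessory-rich" if angle < boundary_deg else "stable"


# Hypothetical genome length and gene coordinates.
GENOME_LENGTH = 3_000_000
for pos in (500_000, 1_500_000):
    print(pos, round(replication_angle(pos, GENOME_LENGTH), 1), region(pos, GENOME_LENGTH))
```

On a 3 Mb chromosome, 65° corresponds to roughly the first 540 kb under these assumptions.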
Subject(s)
Evolution, Molecular , Genome, Bacterial/genetics , Listeria monocytogenes/genetics , Conserved Sequence , Gene Transfer, Horizontal , Genomics , Listeria monocytogenes/physiology , Listeria monocytogenes/virology , O Antigens/genetics , Phosphotransferases/genetics , Phylogeny , Prophages/physiology
ABSTRACT
The enterococci are Gram-positive lactic acid bacteria that inhabit the gastrointestinal tracts of diverse hosts. However, Enterococcus faecium and E. faecalis have emerged as leading causes of multidrug-resistant hospital-acquired infections. The mechanism by which a well-adapted commensal evolved into a hospital pathogen is poorly understood. In this study, we examined high-quality draft genome data for evidence of key events in the evolution of the leading causes of enterococcal infections, including E. faecalis, E. faecium, E. casseliflavus, and E. gallinarum. We characterized two clades within what is currently classified as E. faecium and identified traits characteristic of each, including variation in operons for cell wall carbohydrate and putative capsule biosynthesis. We examined the extent of recombination between the two E. faecium clades and identified two strains with mosaic genomes. We determined the underlying genetics for the defining characteristics of the motile enterococci E. casseliflavus and E. gallinarum. Further, we identified species-specific traits that could be used to advance the detection of medically relevant enterococci and their identification to the species level.
Subject(s)
Enterococcus/genetics , Evolution, Molecular , Genome, Bacterial , Alleles , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Cell Wall/genetics , Cell Wall/metabolism , Enterococcus/classification , Enterococcus/pathogenicity , Genetic Loci , Genetic Variation , Gram-Positive Bacterial Infections/microbiology , Host-Pathogen Interactions , Phylogeny , Polysaccharides, Bacterial/genetics , Polysaccharides, Bacterial/metabolism , Species Specificity
ABSTRACT
UNLABELLED: Methicillin-resistant Staphylococcus aureus (MRSA) strains are leading causes of hospital-acquired infections in the United States, and clonal cluster 5 (CC5) is the predominant lineage responsible for these infections. Since 2002, there have been 12 cases of vancomycin-resistant S. aureus (VRSA) infection in the United States, all caused by CC5 strains. To understand this genetic background and what distinguishes it from other lineages, we generated and analyzed high-quality draft genome sequences for all available VRSA strains. Sequence comparisons show unambiguously that each strain independently acquired Tn1546 and that all VRSA strains last shared a common ancestor over 50 years ago, well before the occurrence of vancomycin resistance in this species. In contrast to existing hypotheses on what predisposes this lineage to acquire Tn1546, the barrier posed by restriction systems appears to be intact in most VRSA strains. However, VRSA (and other CC5) strains were found to possess a constellation of traits that appears to be optimized for proliferation in precisely the types of polymicrobic infection where transfer could occur. They lack a bacteriocin operon that would be predicted to limit the occurrence of non-CC5 strains in mixed infection and harbor a cluster of unique superantigens and lipoproteins to confound host immunity. A frameshift in dprA, which in other microbes influences uptake of foreign DNA, may also make this lineage conducive to foreign DNA acquisition. IMPORTANCE: Invasive methicillin-resistant Staphylococcus aureus (MRSA) infection now ranks among the leading causes of death in the United States. Vancomycin is a key last-line bactericidal drug for treating these infections. However, since 2002, vancomycin resistance has entered this species. Of the now 12 cases of vancomycin-resistant S. aureus (VRSA), each was believed to represent a new acquisition of the vancomycin resistance transposon Tn1546 from enterococcal donors. 
All acquisitions of Tn1546 so far have occurred in MRSA strains of the clonal cluster 5 genetic background, the most common hospital lineage causing hospital-acquired MRSA infection. To understand the nature of these strains, we determined and examined the nucleotide sequences of the genomes of all available VRSA. Genome comparison identified candidate features that position strains of this lineage well for acquiring resistance to antibiotics in mixed infection.
Subject(s)
Cross Infection/microbiology , Methicillin-Resistant Staphylococcus aureus/classification , Methicillin-Resistant Staphylococcus aureus/genetics , Staphylococcal Infections/microbiology , Staphylococcus aureus/classification , Staphylococcus aureus/genetics , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Cross Infection/epidemiology , Genomics , Humans , Methicillin Resistance , Methicillin-Resistant Staphylococcus aureus/drug effects , Methicillin-Resistant Staphylococcus aureus/isolation & purification , Molecular Sequence Data , Phylogeny , Sequence Alignment , Staphylococcal Infections/epidemiology , Staphylococcus aureus/drug effects , Staphylococcus aureus/isolation & purification , United States/epidemiology , Vancomycin Resistance
ABSTRACT
Segniliparus rugosus is one of two species in the genus Segniliparus, the sole genus in the family Segniliparaceae. A distinctive feature of this family is the presence of mycolic acids with extremely long carbon chains bound in the cell wall. S. rugosus is also medically important because it is an opportunistic pathogen associated with mammalian lung disease. It is the second species in the genus to have its genome sequenced. The 3,567,567-bp genome, with 3,516 protein-coding and 49 RNA genes, was sequenced as part of the NIH Roadmap for Medical Research, Human Microbiome Project.