ABSTRACT
SUMMARY: Bacteriophages (phages) are incredibly abundant and genetically diverse. The volume of phage genomics data is rapidly increasing, driven in part by the SEA-PHAGES program, which isolates, sequences and manually annotates hundreds of phage genomes each year. With an ever-expanding genomics dataset, there are many opportunities for generating new biological insights through comparative genomic and bioinformatic analyses. As a result, there is a growing need to be able to store, update, explore and analyze phage genomics data. The package pdm_utils provides a collection of tools for MySQL phage database management designed to meet specific needs in the SEA-PHAGES program and phage genomics generally. AVAILABILITY AND IMPLEMENTATION: https://pypi.org/project/pdm-utils/.
Subjects
Bacteriophages , Bacteriophages/genetics , Computational Biology , Viral Genome , Genomics , Phylogeny
ABSTRACT
Engaging undergraduate students in scientific research promises substantial benefits, but it is not accessible to all students and is rarely implemented early in college education, when it will have the greatest impact. An inclusive Research Education Community (iREC) provides a centralized scientific and administrative infrastructure enabling engagement of large numbers of students at different types of institutions. The Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science (SEA-PHAGES) is an iREC that promotes engagement and continued involvement in science among beginning undergraduate students. The SEA-PHAGES students show strong gains correlated with persistence relative to those in traditional laboratory courses regardless of academic, ethnic, gender, and socioeconomic profiles. This persistent involvement in science is reflected in key measures, including project ownership, scientific community values, science identity, and scientific networking.
Subjects
Biomedical Research/education , Undergraduate Medical Education/methods , Program Evaluation , Teaching , Biomedical Research/standards , Undergraduate Medical Education/standards , Female , Humans , Learning , Male , Universities/standards , Young Adult
ABSTRACT
UNLABELLED: Genomic analysis of a large set of phages infecting the common host Mycobacterium smegmatis mc²155 shows that they span considerable genetic diversity. There are more than 20 distinct types that lack nucleotide similarity with each other, and there is considerable diversity within most of the groups. Three newly isolated temperate mycobacteriophages, Bongo, PegLeg, and Rey, constitute a new group (cluster M), with the closely related phages Bongo and PegLeg forming subcluster M1 and the more distantly related Rey forming subcluster M2. The cluster M mycobacteriophages have siphoviral morphologies with unusually long tails, are homoimmune, and have larger than average genomes (80.2 to 83.7 kbp). They exhibit a variety of features not previously described in other mycobacteriophages, including noncanonical genome architectures and several unusual sets of conserved repeated sequences suggesting novel regulatory systems for both transcription and translation. In addition to containing transfer-messenger RNA and RtcB-like RNA ligase genes, their genomes encode 21 to 24 tRNA genes encompassing complete or nearly complete sets of isotypes. We predict that these tRNAs are used in late lytic growth, likely compensating for the degradation or inadequacy of host tRNAs. They may represent a complete set of tRNAs necessary for late lytic growth, especially when taken together with the apparent lack of codons in the same late genes that correspond to tRNAs that the genomes of the phages do not obviously encode. IMPORTANCE: The bacteriophage population is vast, dynamic, and old and plays a central role in bacterial pathogenicity. We know surprisingly little about the genetic diversity of the phage population, although metagenomic and phage genome sequencing indicates that it is great. Probing the depth of genetic diversity of phages of a common host, Mycobacterium smegmatis, provides a higher resolution of the phage population and how it has evolved. Three new phages constituting a new cluster M further expand the diversity of the mycobacteriophages and introduce novel features. As such, they provide insights into phage genome architecture, virion structure, and gene regulation at the transcriptional and translational levels.
Subjects
Multigene Family , Mycobacteriophages/classification , Mycobacteriophages/genetics , Mycobacterium smegmatis/virology , Transfer RNA/genetics , Viral RNA , Base Composition , Base Sequence , Codon , Conserved Sequence , Gene Order , Genome Size , Viral Genome , Inverted Repeat Sequences , Lysogeny/genetics , Mycobacteriophages/ultrastructure , Open Reading Frames , Phylogeny , Transfer RNA/chemistry , Repetitive Nucleic Acid Sequences , Sequence Alignment , Virion/genetics , Virion/ultrastructure , Virus Assembly/genetics
ABSTRACT
During infection, bacteriophages produce diverse gene products to overcome bacterial antiphage defenses, to outcompete other phages, and to take over cellular processes. Even in the best-studied model phages, the roles of most phage-encoded gene products are unknown, and the phage population represents a largely untapped reservoir of novel gene functions. Considering the sheer size of this population, experimental screening methods are needed to sort through the enormous collection of available sequences and identify gene products that can modulate bacterial behavior for downstream functional characterization. Here, we describe the construction of a plasmid-based overexpression library of 94 genes encoded by Hammy, a Cluster K mycobacteriophage closely related to those infecting clinically important mycobacteria. The arrayed library was systematically screened in a plate-based cytotoxicity assay, identifying a diverse set of 24 gene products (representing ~25% of the Hammy genome) capable of inhibiting growth of the host bacterium Mycobacterium smegmatis. Half of these are related to growth inhibitors previously identified in related phage Waterfoul, supporting their functional conservation; the other genes represent novel additions to the list of known antimycobacterial growth inhibitors. This work, conducted as part of the HHMI-supported Science Education Alliance Gene-function Exploration by a Network of Emerging Scientists (SEA-GENES) project, highlights the value of parallel, comprehensive overexpression screens in exploring genome-wide patterns of phage gene function and novel interactions between phages and their hosts.
Subjects
Bacteriophages , Mycobacteriophages , Mycobacterium , Mycobacterium smegmatis/genetics , Mycobacteriophages/genetics , Mycobacterium/genetics , Bacteriophages/genetics , Plasmids
ABSTRACT
The diversity and mosaic architecture of phage genomes present challenges for whole-genome phylogenies and comparative genomics. There are no universally conserved core genes, ~70% of phage genes are of unknown function, and phage genomes are replete with small (<500 bp) open reading frames. Assembling sequence-related genes into "phamilies" ("phams") based on amino acid sequence similarity simplifies comparative phage genomics and facilitates representations of phage genome mosaicism. With the rapid and substantial increase in the numbers of sequenced phage genomes, computationally efficient pham assembly is needed, together with strategies for including newly sequenced phage genomes. Here, we describe the Python package PhaMMseqs, which uses MMseqs2 for pham assembly, and we evaluate the key parameters for optimal pham assembly of sequence- and functionally related proteins. PhaMMseqs runs efficiently with only modest hardware requirements and integrates with the pdm_utils package for simple genome entry and export of datasets for evolutionary analyses and phage genome map construction.
Subjects
Bacteriophages , Viral Genome , Bacteriophages/genetics , Phylogeny , Genomics , Open Reading Frames/genetics
ABSTRACT
BACKGROUND: Bacteriophage genomes have mosaic architectures and are replete with small open reading frames of unknown function, presenting challenges in their annotation, comparative analysis, and representation. RESULTS: We describe here a bioinformatic tool, Phamerator, that assorts protein-coding genes into phamilies of related sequences using pairwise comparisons to generate a database of gene relationships. This database is used to generate genome maps of multiple phages that incorporate nucleotide and amino acid sequence relationships, as well as genes containing conserved domains. Phamerator also generates phamily circle representations of gene phamilies, facilitating analysis of the different evolutionary histories of individual genes that migrate through phage populations by horizontal genetic exchange. CONCLUSIONS: Phamerator represents a useful tool for comparative genomic analysis and comparative representations of bacteriophage genomes.
Subjects
Bacteriophages/genetics , Viral Genome , Genomics/methods , Software , Bacillus Phages/genetics , Open Reading Frames , Phylogeny , Streptomyces/virology
ABSTRACT
The diversity of bacteriophages is likely unparalleled in the biome due to the immense variety of hosts and the multitude of viruses that infect them. Recent efforts have led to description at the genomic level of numerous bacteriophages that infect the Actinobacteria, but relatively little is known about those infecting other prokaryotic phyla, such as the purple non-sulfur photosynthetic α-proteobacterium Rhodobacter capsulatus. This species is a common inhabitant of freshwater ecosystems and has been an important model system for the study of photosynthesis. Additionally, it is notable for its utilization of a unique form of horizontal gene transfer via a bacteriophage-like element known as the gene transfer agent (RcGTA). Only three bacteriophages of R. capsulatus had been sequenced prior to this report. Isolation and characterization at the genomic level of 26 new bacteriophages infecting this host advances the understanding of bacteriophage diversity and the origins of RcGTA. These newly discovered isolates can be grouped along with three that were previously sequenced to form six clusters with four remaining as single representatives. These bacteriophages share genes with RcGTA that seem to be related to host recognition. One isolate was found to cause lysis of a marine bacterium when exposed to high-titer lysate. Although some clusters are more highly represented in the sequenced genomes, it is evident that many more bacteriophage types that infect R. capsulatus are likely to be found in the future.
Subjects
Bacterial Proteins/genetics , Bacteriophages/genetics , Bacterial Gene Expression Regulation , Genetic Variation , Rhodobacter capsulatus/virology , Gene Transfer Techniques
ABSTRACT
The bacteriophage population is vast, dynamic, old, and genetically diverse. The genomics of phages that infect bacterial hosts in the phylum Actinobacteria show them to be not only diverse but also pervasively mosaic and replete with genes of unknown function. To further explore this broad group of bacteriophages, we describe here the isolation and genomic characterization of 116 phages that infect Microbacterium spp. Most of the phages are lytic and can be grouped into twelve clusters according to their overall relatedness; seven of the phages are singletons with no close relatives. Genome sizes vary from 17.3 kbp to 97.7 kbp, and their G+C content ranges from 51.4% to 71.4%, compared to ~67% for their Microbacterium hosts. The phages were isolated on five different Microbacterium species, but typically do not efficiently infect strains beyond the one on which they were isolated. These Microbacterium phages contain many novel features, including very large viral genes (13.5 kbp) and unusual fusions of structural proteins, including a fusion of VIP2 toxin and a MuF-like protein into a single gene. These phages and their genetic components, such as integration systems, recombineering tools, and phage-mediated delivery systems, will be useful resources for advancing Microbacterium genetics.
Subjects
Actinobacteria/virology , Bacteriophages/genetics , Genetic Variation , Viral Genome , Bacteriophages/classification , Bacteriophages/isolation & purification , Base Composition , Viral DNA/genetics , Viral Genes , Genomics , Phylogeny , Viral Fusion Proteins/genetics
ABSTRACT
The recognition of the vast numbers of bacteriophages in the biosphere has prompted a renewal of interest in understanding their morphological and genetic diversity, and elucidating the evolutionary mechanisms that give rise to them. We have approached these questions by isolating and characterizing a collection of mycobacteriophages that infect a common bacterial host, Mycobacterium smegmatis. Comparative genomic analysis of 50 mycobacteriophages shows that they are highly diverse, although not uniformly so, that they are pervasively mosaic with a multitude of single gene modules, and that this mosaicism is generated through illegitimate recombination.
Subjects
Molecular Evolution , Viral Genome , Mycobacteriophages/genetics , Mycobacterium smegmatis/virology , Amino Acid Sequence , Molecular Sequence Data , Mosaicism , Mycobacterium smegmatis/ultrastructure , Genetic Recombination
ABSTRACT
We report here the complete genome sequences of 44 phages infecting Arthrobacter sp. strain ATCC 21022. These phages have double-stranded DNA genomes with sizes ranging from 15,680 to 70,707 bp and G+C contents from 45.1% to 68.5%. All three tail types (belonging to the families Siphoviridae, Myoviridae, and Podoviridae) are represented.
ABSTRACT
Cluster BE1 Streptomyces bacteriophages belong to the Siphoviridae, with genome sizes over 130 kbp, and they contain direct terminal repeats of approximately 11 kbp. Eight newly isolated closely related cluster BE1 phages contain 43 to 48 tRNAs, one transfer-messenger RNA (tmRNA), and 216 to 236 predicted open reading frames (ORFs), but few of their genes are shared with other phages, including those infecting Streptomyces species.
ABSTRACT
Four bacteriophages infecting Mycobacterium smegmatis mc²155 (three belonging to subcluster P1 and one belonging to subcluster P2) were isolated from soil and sequenced. All four phages are similar in the left arm of their genomes, but the P2 phage differs in the right arm. All four genomes contain features of temperate phages.
ABSTRACT
Caterpillar, Nightmare, and Teacup are cluster AU siphoviral phages isolated from enriched soil on Arthrobacter sp. strain ATCC 21022. These genomes are 58 kbp long with an average G+C content of 50%. Sequence analysis predicts 86 to 92 protein-coding genes, including a large number of small proteins with predicted transmembrane domains.
ABSTRACT
We report here the genome sequences of six newly isolated bacteriophages infecting Arthrobacter sp. ATCC 21022. All six have myoviral morphologies and have double-stranded DNA genomes with circularly permuted ends. The six phages are closely related, with average nucleotide identities of 73.4 to 93.0% across genome lengths of 49,797 to 51,347 bp.
ABSTRACT
Twelve siphoviral phages isolated using Arthrobacter sp. strain ATCC 21022 were sequenced. The phages all have relatively small genomes, ranging from 15,319 to 15,556 bp. All 12 phages are closely related to previously described cluster AN Arthrobacter phages.
ABSTRACT
We report the complete genome sequences of 19 cluster CA bacteriophages isolated from environmental samples using Rhodococcus erythropolis as a host. All of the phages are Siphoviridae, have similar genome lengths (46,314 to 46,985 bp) and G+C contents (58.5 to 58.8%), and share nucleotide sequence similarity.
ABSTRACT
We report here the genome sequences of three newly isolated phages that infect Mycobacterium smegmatis mc²155. Phages Findley, Hurricane, and TBond007 were discovered in geographically distinct locations and are related to cluster K mycobacteriophages, with Findley being similar to subcluster K2 phages and Hurricane and TBond007 being similar to subcluster K3 phages.
ABSTRACT
We report the genome sequences of 14 cluster K mycobacteriophages isolated using Mycobacterium smegmatis mc²155 as host. Four are closely related to subcluster K1 phages, and 10 are members of subcluster K6. The phage genomes span considerable sequence diversity, including multiple types of integrases and integration sites.
ABSTRACT
Bacteriophages AlleyCat, Edugator, and Guillsminger were isolated on Mycobacterium smegmatis mc²155 from enriched soil samples. All are members of mycobacteriophage subcluster K5, with genomes of 62,112 to 63,344 bp. Each genome contains 92 to 99 predicted protein-coding genes and one tRNA. Guillsminger is the first mycobacteriophage to carry an IS1380 family transposon.
ABSTRACT
We report here the complete genome sequences of four subcluster L3 mycobacteriophages newly isolated from soil samples, using Mycobacterium smegmatis mc²155 as the host. Comparative genomic analyses with four previously described subcluster L3 phages reveal strong nucleotide similarity and gene conservation, with several large insertions/deletions near their right genome ends.