Results 1 - 20 of 24
1.
bioRxiv ; 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39185240

ABSTRACT

Nucleocytoplasmic Large DNA Viruses (NCLDVs, also called giant viruses) are widespread in marine systems and infect a broad range of microbial eukaryotes (protists). Recent biogeographic work has provided global snapshots of NCLDV diversity and community composition across the world's oceans, yet little information exists about the guiding 'rules' underpinning their community dynamics over time. We leveraged a five-year monthly metagenomic time-series to quantify the community composition of NCLDVs off the coast of Southern California and characterize these populations' temporal dynamics. NCLDVs were dominated by Algavirales (Phycodnaviruses, 59%) and Imitervirales (Mimiviruses, 36%). We identified clusters of NCLDVs with distinct classes of seasonal and non-seasonal temporal dynamics. Overall, NCLDV population abundances were often highly dynamic with a strong seasonal signal. The Imitervirales group had the highest relative abundance in the more oligotrophic late summer and fall, while Algavirales did so in winter. Generally, closely related strains had similar temporal dynamics, suggesting that evolutionary history is a key driver of the temporal niche of marine NCLDVs. However, a few closely related strains had drastically different seasonal dynamics, suggesting that while phylogenetic proximity often indicates ecological similarity, occasionally phenology can shift rapidly, possibly due to host-switching. Finally, we identified distinct functional content and possible host interactions of two major NCLDV orders, including connections of Imitervirales with primary producers like the diatom Chaetoceros and widespread marine grazers like Paraphysomonas and Spirotrichea ciliates. Together, our results reveal key insights into the season-specific effects of phylogenetically distinct giant virus communities on marine protist metabolism, biogeochemical fluxes and carbon cycling.

2.
Trends Genet ; 39(6): 433-435, 2023 06.
Article in English | MEDLINE | ID: mdl-37019751

ABSTRACT

Genomic islands are hotspots for horizontal gene transfer (HGT) in bacteria, but, for Prochlorococcus, an abundant marine cyanobacterium, how these islands form has puzzled scientists. With the discovery of tycheposons, a new family of transposons, Hackl et al. provide evidence for elegant new mechanisms of gene rearrangement and transfer among Prochlorococcus and bacteria more broadly.


Subject(s)
Bacteriophages , Cyanobacteria , Bacteriophages/genetics , Gene Transfer, Horizontal/genetics , Cyanobacteria/genetics , RNA, Transfer/genetics , Genomic Islands
3.
Viruses ; 15(2)2023 02 20.
Article in English | MEDLINE | ID: mdl-36851794

ABSTRACT

Cyanophages exert important top-down controls on their cyanobacteria hosts; however, concurrent analysis of both phage and host populations is needed to better assess phage-host interaction models. We analyzed picocyanobacteria Prochlorococcus and Synechococcus and T4-like cyanophage communities in Pacific Ocean surface waters using five years of monthly viral and cellular fraction metagenomes. Cyanophage communities contained thousands of mostly low-abundance (<2% relative abundance) species with varying temporal dynamics, categorized as seasonally recurring or non-seasonal and occurring persistently, occasionally, or sporadically (detected in ≥85%, 15-85%, or <15% of samples, respectively). Viromes contained mostly seasonal and persistent phages (~40% each), while cellular fraction metagenomes had mostly sporadic species (~50%), reflecting that these sample sets capture different steps of the infection cycle (virions from prior infections or phages within currently infected cells, respectively). Two groups of seasonal phages correlated to Synechococcus or Prochlorococcus were abundant in spring/summer or fall/winter, respectively. Cyanophages likely have a strong influence on the host community structure, as their communities explained up to 32% of host community variation. These results support the view that both seasonally recurrent and apparently stochastic processes, likely determined by host availability and different host-range strategies among phages, are critical to phage-host interactions and dynamics, consistent with both the Kill-the-Winner and the Bank models.


Subject(s)
Bacteriophages , Synechococcus , Bacteriophages/genetics , Host Specificity , Metagenome , Pacific Ocean , Seasons
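
The occurrence categories quoted in the abstract above (detected in ≥85%, 15-85%, or <15% of samples) can be illustrated with a short Python sketch. This is not the authors' code: the function names, the `detection_limit` parameter, and the choice to assign exactly 15% to the "occasional" class are assumptions made only for illustration.

```python
def detected_fraction(abundances, detection_limit=0.0):
    """Fraction of samples in which a phage is detected.

    `detection_limit` is a hypothetical cutoff; the abstract does not say
    how detection was defined.
    """
    samples = list(abundances)
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(1 for a in samples if a > detection_limit) / len(samples)


def occurrence_class(fraction):
    """Map a detection fraction to the categories named in the abstract."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if fraction >= 0.85:
        return "persistent"
    if fraction >= 0.15:  # assumption: the 15% boundary counts as occasional
        return "occasional"
    return "sporadic"
```

For example, a phage detected in 30 of 60 monthly samples would fall in the "occasional" class under these assumptions.
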
4.
Microbiol Resour Announc ; 11(8): e0015122, 2022 Aug 18.
Article in English | MEDLINE | ID: mdl-35862922

ABSTRACT

Marine Synechococcus spp. are unicellular cyanobacteria widely distributed in the world's oceans. We report the complete genome sequence of Synechococcus sp. strain NB0720_010, isolated from Narragansett Bay, Rhode Island. NB0720_010 has several large (>3,000-amino acid) protein-coding genes that may be important in its interactions with other cells, including grazers in estuarine habitats.

5.
PLOS Glob Public Health ; 2(11): e0001282, 2022.
Article in English | MEDLINE | ID: mdl-36962644

ABSTRACT

People of different racial/ethnic backgrounds, demographics, health, and socioeconomic characteristics have experienced disproportionate rates of infection and death due to COVID-19. This study tests if and how county-level rates of infection and death have changed in relation to societal county characteristics through time as the pandemic progressed. This longitudinal study sampled monthly county-level COVID-19 case and death data per 100,000 residents from April 2020 to March 2022, and studied the relationships of these variables with racial/ethnic, demographic, health, and socioeconomic characteristics for 3125 (97.0%) of U.S. counties, accounting for 96.4% of the U.S. population. The association of all county-level characteristics with COVID-19 case and death rates changed significantly through time, and showed different patterns. For example, counties with higher population proportions of Black, Native American, foreign-born non-citizen, elderly residents, households in poverty, or higher income inequality suffered disproportionately higher COVID-19 case and death rates at the beginning of the pandemic, followed by reversed, attenuated or fluctuating patterns, depending on the variable. Counties with higher White versus Black population proportions showed somewhat inverse patterns. Counties with higher female population proportions initially had lower case rates but higher death rates, and case and death rates became more coupled and fluctuated later in the pandemic. Counties with higher population densities had fluctuating case and death rates, with peaks coinciding with new variants of COVID-19. Counties with a greater proportion of university-educated residents had lower case and death rates throughout the pandemic, although the strength of this relationship fluctuated through time. This research clearly shows that how different segments of society are affected by a pandemic changes through time. 
Therefore, targeted policies and interventions that change as a pandemic unfolds are necessary to mitigate its disproportionate effects on vulnerable populations, particularly during the first six months of a pandemic.

6.
NAR Genom Bioinform ; 2(2): lqaa044, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32626849

ABSTRACT

Metagenomic sequencing has greatly enhanced the discovery of viral genomic sequences; however, it remains challenging to identify the host(s) of these new viruses. We developed VirHostMatcher-Net, a flexible, network-based, Markov random field framework for predicting virus-prokaryote interactions using multiple, integrated features: CRISPR sequences and alignment-free similarity measures ([Formula: see text] and WIsH). Evaluation of this method on a benchmark set of 1462 known virus-prokaryote pairs yielded host prediction accuracy of 59% and 86% at the genus and phylum levels, representing 16-27% and 6-10% improvement, respectively, over previous single-feature prediction approaches. We applied our host prediction tool to crAssphage, a human gut phage, and two metagenomic virus datasets: marine viruses and viral contigs recovered from globally distributed, diverse habitats. Host predictions were frequently consistent with those of previous studies, but more importantly, this new tool made many more confident predictions than previous tools, up to nearly 3-fold more (n > 27 000), greatly expanding the diversity of known virus-host interactions.

7.
Microbiol Resour Announc ; 9(30)2020 Jul 23.
Article in English | MEDLINE | ID: mdl-32703830

ABSTRACT

Synechococcus bacteria are unicellular cyanobacteria that contribute significantly to global marine primary production. We report the nearly complete genome sequence of Synechococcus sp. strain MIT S9220, which lacks the nitrate utilization genes present in most marine Synechococcus genomes. Assembly also produced the complete genome sequence of a cyanophage present in the MIT S9220 culture.

8.
Nat Microbiol ; 5(2): 265-271, 2020 02.
Article in English | MEDLINE | ID: mdl-31819214

ABSTRACT

Viruses that infect microorganisms dominate marine microbial communities numerically, with impacts ranging from host evolution to global biogeochemical cycles [1,2]. However, virus community dynamics, necessary for conceptual and mechanistic model development, remains difficult to assess. Here, we describe the long-term stability of a viral community by analysing the metagenomes of near-surface 0.02-0.2 µm samples from the San Pedro Ocean Time-series [3] that were sampled monthly over 5 years. Of 19,907 assembled viral contigs (>5 kb, mean 15 kb), 97% were found in each sample (by >98% ID metagenomic read recruitment) to have relative abundances that ranged over seven orders of magnitude, with limited temporal reordering of rank abundances along with little change in richness. Seasonal variations in viral community composition were superimposed on the overall stability; maximum community similarity occurred at 12-month intervals. Despite the stability of viral genotypic clusters that had 98% sequence identity, viral sequences showed transient variations in single-nucleotide polymorphisms (SNPs) and constant turnover of minor population variants, each rising and falling over a few months, reminiscent of Red Queen dynamics [4]. The rise and fall of variants within populations, interpreted through the perspective of known virus-host interactions [5], is consistent with the hypothesis that fluctuating selection acts on a microdiverse cloud of strains, and this succession is associated with ever-shifting virus-host defences and counterdefences. This results in long-term virus-host coexistence that is facilitated by perpetually changing minor variants.


Subject(s)
Aquatic Organisms/virology , Seawater/virology , Viruses/genetics , Water Microbiology , Aquatic Organisms/classification , Aquatic Organisms/genetics , DNA, Viral/genetics , Ecosystem , Genome, Viral , Host Microbial Interactions/genetics , Metagenome , Microbiota , Pacific Ocean , Polymorphism, Single Nucleotide , Species Specificity
9.
Environ Microbiol ; 22(5): 1801-1815, 2020 05.
Article in English | MEDLINE | ID: mdl-31840403

ABSTRACT

Phytoplankton are limited by iron (Fe) in ~40% of the world's oceans including high-nutrient low-chlorophyll (HNLC) regions. While low-Fe adaptation has been well-studied in large eukaryotic diatoms, less is known for small, prokaryotic marine picocyanobacteria. This study reveals key physiological and genomic differences underlying Fe adaptation in marine picocyanobacteria. HNLC ecotype CRD1 strains have greater physiological tolerance to low Fe congruent with their expanded repertoire of Fe transporter, storage and regulatory genes compared to other ecotypes. From metagenomic analysis, genes encoding ferritin, flavodoxin, Fe transporters and siderophore uptake genes were more abundant in low-Fe waters, mirroring paradigms of low-Fe adaptation in diatoms. Distinct Fe-related gene repertoires of HNLC ecotypes CRD1 and CRD2 also highlight how coexisting ecotypes have evolved independent approaches to life in low-Fe habitats. Synechococcus and Prochlorococcus HNLC ecotypes likewise exhibit independent, genome-wide reductions of predicted Fe-requiring genes. HNLC ecotype CRD1 interestingly was most similar to coastal ecotype I in Fe physiology and Fe-related gene content, suggesting populations from these different biomes experience similar Fe-selective conditions. This work supports an improved perspective that phytoplankton are shaped by more nuanced Fe niches in the oceans than previously implied from mostly binary comparisons of low- versus high-Fe habitats and populations.


Subject(s)
Genome, Bacterial/genetics , Mosaicism , Prochlorococcus/genetics , Prochlorococcus/physiology , Synechococcus/genetics , Synechococcus/physiology , Acclimatization/genetics , Adaptation, Physiological/genetics , Diatoms/genetics , Ecosystem , Ecotype , Iron/metabolism , Metagenomics , Oceans and Seas , Phytoplankton , Seawater/microbiology
10.
Quant Biol ; 8(1): 64-77, 2020 Mar.
Article in English | MEDLINE | ID: mdl-34084563

ABSTRACT

BACKGROUND: The recent development of metagenomic sequencing makes it possible to massively sequence microbial genomes including viral genomes without the need for laboratory culture. Existing reference-based and gene homology-based methods are not efficient in identifying unknown viruses or short viral sequences from metagenomic data. METHODS: Here we developed a reference-free and alignment-free machine learning method, DeepVirFinder, for identifying viral sequences in metagenomic data using deep learning. RESULTS: Trained based on sequences from viral RefSeq discovered before May 2015, and evaluated on those discovered after that date, DeepVirFinder outperformed the state-of-the-art method VirFinder at all contig lengths, achieving AUROC 0.93, 0.95, 0.97, and 0.98 for 300, 500, 1000, and 3000 bp sequences respectively. Enlarging the training data with additional millions of purified viral sequences from metavirome samples further improved the accuracy for identifying virus groups that are under-represented. Applying DeepVirFinder to real human gut metagenomic samples, we identified 51,138 viral sequences belonging to 175 bins in patients with colorectal carcinoma (CRC). Ten bins were found associated with the cancer status, suggesting viruses may play important roles in CRC. CONCLUSIONS: Powered by deep learning and high throughput sequencing metagenomic data, DeepVirFinder significantly improved the accuracy of viral identification and will assist the study of viruses in the era of metagenomics.
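
The AUROC values reported in the abstract above summarize how well a classifier's scores rank viral sequences above non-viral ones. As a minimal sketch of the metric itself (not of DeepVirFinder or its evaluation pipeline), AUROC can be computed as the fraction of positive/negative pairs that are ranked correctly, with ties counted as half. The function name and the 0/1 label encoding are assumptions for illustration.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: classifier scores, higher meaning "more likely viral".
    labels: 1 for viral (positive), 0 for non-viral (negative).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # correctly ranked pair
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))
```

The quadratic pairwise loop is chosen for clarity; for large benchmark sets a rank-based computation gives the same value more efficiently.
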

11.
PeerJ ; 7: e6902, 2019.
Article in English | MEDLINE | ID: mdl-31119088

ABSTRACT

BACKGROUND: Metagenomics has transformed our understanding of microbial diversity across ecosystems, with recent advances enabling de novo assembly of genomes from metagenomes. These metagenome-assembled genomes are critical to provide ecological, evolutionary, and metabolic context for all the microbes and viruses yet to be cultivated. Metagenomes can now be generated from nanogram to subnanogram amounts of DNA. However, these libraries require several rounds of PCR amplification before sequencing, and recent data suggest these typically yield smaller and more fragmented assemblies than regular metagenomes. METHODS: Here we evaluate de novo assembly methods of 169 PCR-amplified metagenomes, including 25 for which an unamplified counterpart is available, to optimize specific assembly approaches for PCR-amplified libraries. We first evaluated coverage bias by mapping reads from PCR-amplified metagenomes onto reference contigs obtained from unamplified metagenomes of the same samples. Then, we compared different assembly pipelines in terms of assembly size (number of bp in contigs ≥ 10 kb) and error rates to evaluate which are the best suited for PCR-amplified metagenomes. RESULTS: Read mapping analyses revealed that the depth of coverage within individual genomes is significantly more uneven in PCR-amplified datasets versus unamplified metagenomes, with regions of high depth of coverage enriched in short inserts. This enrichment scales with the number of PCR cycles performed, and is presumably due to preferential amplification of short inserts. Standard assembly pipelines are confounded by this type of coverage unevenness, so we evaluated other assembly options to mitigate these issues. 
We found that a pipeline combining read deduplication and an assembly algorithm originally designed to recover genomes from libraries generated after whole genome amplification (single-cell SPAdes) frequently improved assembly of contigs ≥10 kb by 10 to 100-fold for low input metagenomes. CONCLUSIONS: PCR-amplified metagenomes have enabled scientists to explore communities traditionally challenging to describe, including some with extremely low biomass or from which DNA is particularly difficult to extract. Here we show that a modified assembly pipeline can lead to an improved de novo genome assembly from PCR-amplified datasets, and enables a better genome recovery from low input metagenomes.

12.
Environ Microbiol ; 21(8): 2948-2963, 2019 08.
Article in English | MEDLINE | ID: mdl-31106939

ABSTRACT

Currently defined ecotypes in marine cyanobacteria Prochlorococcus and Synechococcus likely contain subpopulations that themselves are ecologically distinct. We developed and applied high-throughput sequencing for the 16S-23S rRNA internal transcribed spacer (ITS) to examine ecotype and fine-scale genotypic community dynamics for monthly surface water samples spanning 5 years at the San Pedro Ocean Time-series site. Ecotype-level structure displayed regular seasonal patterns including succession, consistent with strong forcing by seasonally varying abiotic parameters (e.g. temperature, nutrients, light). We identified tens to thousands of amplicon sequence variants (ASVs) within ecotypes, many of which exhibited distinct patterns over time, suggesting ecologically distinct populations within ecotypes. Community structure within some ecotypes exhibited regular, seasonal patterns, but not for others, indicating other more irregular processes such as phage interactions are important. Network analysis including T4-like phage genotypic data revealed distinct viral variants correlated with different groups of cyanobacterial ASVs including time-lagged predator-prey relationships. Variation partitioning analysis indicated that phage community structure more strongly explains cyanobacterial community structure at the ASV level than the abiotic environmental factors. These results support a hierarchical model whereby abiotic environmental factors more strongly shape niche partitioning at the broader ecotype level while phage interactions are more important in shaping community structure of fine-scale variants within ecotypes.


Subject(s)
Bacteriophages/physiology , Prochlorococcus/virology , Seawater/microbiology , Synechococcus/virology , Bacteriophages/genetics , Ecosystem , Ecotype , Phylogeny , Prochlorococcus/genetics , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 23S/genetics , Synechococcus/genetics , Water Microbiology
13.
Environ Microbiol ; 21(5): 1677-1686, 2019 05.
Article in English | MEDLINE | ID: mdl-30724442

ABSTRACT

Synechococcus, a genus of unicellular cyanobacteria found throughout the global surface ocean, is a large driver of Earth's carbon cycle. Developing a better understanding of its diversity and distributions is an ongoing effort in biological oceanography. Here, we introduce 12 new draft genomes of marine Synechococcus isolates spanning five clades and utilize ~100 environmental metagenomes largely sourced from the TARA Oceans project to assess the global distributions of the genomic lineages they and other reference genomes represent. We show that five newly provided clade-II isolates are by far the most representative of the recovered in situ populations (most 'abundant') and have biogeographic distributions distinct from previously available clade-II references. Additionally, these isolates form a subclade possessing the smallest genomes yet identified of the genus (2.14 ± 0.05 Mbp; mean ± 1SD) while concurrently hosting some of the highest GC contents (60.67 ± 0.16%). This is in direct opposition to the pattern in Synechococcus's nearest relative, Prochlorococcus - wherein decreasing genome size has coincided with a strong decrease in GC content - suggesting that this new subclade of Synechococcus has convergently undergone genomic reduction relative to the rest of the genus, but along a fundamentally different evolutionary trajectory.


Subject(s)
Evolution, Molecular , Genome, Bacterial , Seawater/microbiology , Synechococcus/genetics , Base Composition , Genomics , Metagenome , Oceans and Seas , Phylogeny , Prochlorococcus/genetics , Synechococcus/classification , Synechococcus/isolation & purification , Synechococcus/metabolism
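
The abstract above reports genome size and GC content as mean ± 1SD across isolates. A minimal Python sketch of how such a summary could be computed is given below. It is illustrative only: the function names are hypothetical, ambiguous bases are simply ignored, and the use of the sample standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, stdev


def gc_percent(sequence):
    """GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / unambiguous


def mean_sd(values):
    """Return (mean, sample standard deviation) for two or more values."""
    values = list(values)
    return mean(values), stdev(values)
```

Applied to a set of genomes, `mean_sd(gc_percent(g) for g in genomes)` would give a summary in the same form as the values quoted in the abstract.
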
14.
ISME J ; 13(3): 618-631, 2019 03.
Article in English | MEDLINE | ID: mdl-30315316

ABSTRACT

Much of the diversity of prokaryotic viruses has yet to be described. In particular, there are no viral isolates that infect abundant, globally significant marine archaea including the phylum Thaumarchaeota. This phylum oxidizes ammonia, fixes inorganic carbon, and thus contributes to globally significant nitrogen and carbon cycles in the oceans. Metagenomics provides an alternative to culture-dependent means for identifying and characterizing viral diversity. Some viruses carry auxiliary metabolic genes (AMGs) that are acquired via horizontal gene transfer from their host(s), allowing inference of what host a virus infects. Here we present the discovery of 15 new genomically and ecologically distinct Thaumarchaeota virus populations, identified as contigs that encode viral capsid and thaumarchaeal ammonia monooxygenase genes (amoC). These viruses exhibit depth and latitude partitioning and are distributed globally in various marine habitats including pelagic waters, estuarine habitats, and hydrothermal plume water and sediments. We found evidence of viral amoC expression and that viral amoC AMGs sometimes comprise up to half of total amoC DNA copies in cellular fraction metagenomes, highlighting the potential impact of these viruses on N cycling in the oceans. Phylogenetics suggest they are potentially tailed viruses and share a common ancestor with related marine Euryarchaeota viruses. This work significantly expands our view of viruses of globally important marine Thaumarchaeota.


Subject(s)
Archaea/virology , Metagenome , Oxidoreductases/genetics , Viruses/genetics , Ammonia/metabolism , Carbon Cycle , Gene Transfer, Horizontal , Marine Biology , Metagenomics , Nitrification , Nitrogen Cycle , Oceans and Seas , Phylogeny , Viral Proteins/genetics , Viruses/enzymology , Viruses/isolation & purification
15.
Microbiome ; 5(1): 69, 2017 07 06.
Article in English | MEDLINE | ID: mdl-28683828

ABSTRACT

BACKGROUND: Identifying viral sequences in mixed metagenomes containing both viral and host contigs is a critical first step in analyzing the viral component of samples. Current tools for distinguishing prokaryotic virus and host contigs primarily use gene-based similarity approaches. Such approaches can significantly limit results especially for short contigs that have few predicted proteins or lack proteins with similarity to previously known viruses. METHODS: We have developed VirFinder, the first k-mer frequency based, machine learning method for virus contig identification that entirely avoids gene-based similarity searches. VirFinder instead identifies viral sequences based on our empirical observation that viruses and hosts have discernibly different k-mer signatures. VirFinder's performance in correctly identifying viral sequences was tested by training its machine learning model on sequences from host and viral genomes sequenced before 1 January 2014 and evaluating on sequences obtained after 1 January 2014. RESULTS: VirFinder had significantly better rates of identifying true viral contigs (true positive rates (TPRs)) than VirSorter, the current state-of-the-art gene-based virus classification tool, when evaluated with either contigs subsampled from complete genomes or assembled from a simulated human gut metagenome. For example, for contigs subsampled from complete genomes, VirFinder had 78-, 2.4-, and 1.8-fold higher TPRs than VirSorter for 1, 3, and 5 kb contigs, respectively, at the same false positive rates as VirSorter (0, 0.003, and 0.006, respectively), thus VirFinder works considerably better for small contigs than VirSorter. VirFinder furthermore identified several recently sequenced virus genomes (after 1 January 2014) that VirSorter did not and that have no nucleotide similarity to previously sequenced viruses, demonstrating VirFinder's potential advantage in identifying novel viral sequences. 
Application of VirFinder to a set of human gut metagenomes from healthy and liver cirrhosis patients reveals higher viral diversity in healthy individuals than cirrhosis patients. We also identified contig bins containing crAssphage-like contigs with higher abundance in healthy patients and a putative Veillonella genus prophage associated with cirrhosis patients. CONCLUSIONS: This innovative k-mer based tool complements gene-based approaches and will significantly improve prokaryotic viral sequence identification, especially for metagenomic-based studies of viral ecology.


Subject(s)
DNA, Viral/genetics , Genome, Viral , Metagenomics/methods , Software , Computational Biology/methods , Gastrointestinal Microbiome , Humans , Liver Cirrhosis/virology , Machine Learning , Metagenome , Phylogeny , Sequence Analysis, DNA
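
The abstract above rests on the observation that viral and host sequences have different k-mer signatures. The sketch below shows one common way to turn a contig into a k-mer frequency vector that a classifier could take as input. It is not VirFinder's implementation: the function name, the default k, and the decision to skip k-mers containing ambiguous bases are assumptions for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Relative frequency of every possible DNA k-mer in a sequence.

    Returns a dict with 4**k entries; k-mers containing characters other
    than A, C, G, T are skipped (an illustrative choice).
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is too short or has no valid k-mers")
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: counts[kmer] / total for kmer in all_kmers}
```

Because the vector has a fixed length regardless of contig size, it can be computed for short contigs that carry few or no predicted genes, which is the case the abstract highlights.
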
16.
BMC Bioinformatics ; 18(Suppl 3): 60, 2017 Mar 14.
Article in English | MEDLINE | ID: mdl-28361670

ABSTRACT

BACKGROUND: The study of virus-host infectious association is important for understanding the functions and dynamics of microbial communities. Both cellular and fractionated viral metagenomic data generate a large number of viral contigs with missing host information. Although relatively simple methods based on the similarity between the word frequency vectors of viruses and bacterial hosts have been developed to study virus-host associations, the problem is significantly understudied. We hypothesize that machine learning methods based on word frequencies can be efficiently used to study virus-host infectious associations. METHODS: We investigate four different representations of word frequencies of viral sequences including the relative word frequency and three normalized word frequencies by subtracting the number of expected from the observed word counts. We also study five machine learning methods including logistic regression, support vector machine, random forest, Gaussian naive Bayes and Bernoulli naive Bayes for separating infectious from non-infectious viruses for nine bacterial host genera with at least 45 infecting viruses. Area under the receiver operating characteristic curve (AUC) is used to compare the performance of different machine learning methods and feature combinations. We then evaluate the performance of the best method for the identification of the hosts of contigs in metagenomic studies. We also develop a maximum likelihood method to estimate the fraction of true infectious viruses for a given host in viral tagging experiments. RESULTS: Based on nine bacterial host genera with at least 45 infectious viruses, we show that random forest together with the relative word frequency vector performs the best in identifying viruses infecting particular hosts. 
For all the nine host genera, the AUC is over 0.85 and for five of them, the AUC is higher than 0.98 when the word size is 6, indicating the high accuracy of using machine learning approaches for identifying viruses infecting particular hosts. We also show that our method can predict the hosts of viral contigs of length at least 1 kbp in metagenomic studies with high accuracy. The random forest together with word frequency vector outperforms currently available methods based on Manhattan and [Formula: see text] dissimilarity measures. Based on word frequencies, we estimate that about 95% of the identified T4-like viruses in a viral tagging experiment infect Synechococcus, while only about 29% of the identified non-T4-like viruses and 30% of the contigs in the study potentially infect Synechococcus. CONCLUSIONS: The random forest machine learning method together with the relative word frequencies as features of viruses can be used to predict viruses and viral contigs for specific bacterial hosts. The maximum likelihood approach can be used to estimate the fraction of true infectious associated viruses in viral tagging experiments.


Subject(s)
Bacteria/virology , DNA, Viral/isolation & purification , Genome, Viral , Host-Pathogen Interactions , Support Vector Machine , Viruses/genetics , Bayes Theorem , DNA, Viral/genetics , Likelihood Functions , Logistic Models , Metagenomics , Models, Theoretical , ROC Curve , Reproducibility of Results , Sequence Analysis, DNA , Viruses/metabolism
17.
Environ Microbiol ; 19(6): 2434-2452, 2017 06.
Article in English | MEDLINE | ID: mdl-28418097

ABSTRACT

Marine Thaumarchaeota are abundant ammonia-oxidizers but have few representative laboratory-cultured strains. We report the cultivation of Candidatus Nitrosomarinus catalina SPOT01, a novel strain that is less warm-temperature tolerant than other cultivated Thaumarchaeota. Metagenomic recruitment shows that strain SPOT01 comprises a major portion of Thaumarchaeota (4-54%) in temperate Pacific waters. Its complete 1.36 Mbp genome possesses several distinguishing features: putative phosphorothioation (PT) DNA modification genes; a region containing probable viral genes; and putative urea utilization genes. The PT modification genes and an adjacent putative restriction enzyme (RE) operon likely form a restriction modification (RM) system for defence from foreign DNA. PacBio sequencing showed >98% methylation at two motifs, and inferred PT guanine modification of 19% of possible TGCA sites. Metagenomic recruitment also reveals that the putative virus region, PT modification genes and RE genes are present in 18-26%, 9-14% and <1.5%, respectively, of natural populations at 150 m with ≥85% identity to strain SPOT01. The presence of multiple probable RM systems in a highly streamlined genome suggests a surprising importance for defence from foreign DNA for dilute populations that infrequently encounter viruses or other cells. This new strain provides new insights into the ecology, including viral interactions, of this important group of marine microbes.


Subject(s)
Archaea , DNA, Archaeal/genetics , Genome, Archaeal/genetics , Viruses/genetics , Aquatic Organisms/genetics , Archaea/classification , Archaea/genetics , Archaea/virology , Base Sequence , Metagenomics , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA
18.
Nucleic Acids Res ; 45(1): 39-53, 2017 01 09.
Article in English | MEDLINE | ID: mdl-27899557

ABSTRACT

Viruses and their host genomes often share similar oligonucleotide frequency (ONF) patterns, which can be used to predict the host of a given virus by finding the host with the greatest ONF similarity. We comprehensively compared 11 ONF metrics using several k-mer lengths for predicting host taxonomy from among ∼32 000 prokaryotic genomes for 1427 virus isolate genomes whose true hosts are known. The background-subtracting measure [Formula: see text] at k = 6 gave the highest host prediction accuracy (33%, genus level) with reasonable computational times. Requiring a maximum dissimilarity score for making predictions (thresholding) and taking the consensus of the 30 most similar hosts further improved accuracy. Using a previous dataset of 820 bacteriophage and 2699 bacterial genomes, [Formula: see text] host prediction accuracies with thresholding and consensus methods (genus-level: 64%) exceeded previous Euclidean distance ONF (32%) or homology-based (22-62%) methods. When applied to metagenomically-assembled marine SUP05 viruses and the human gut virus crAssphage, [Formula: see text]-based predictions overlapped (i.e. some same, some different) with the previously inferred hosts of these viruses. The extent of overlap improved when only using host genomes or metagenomic contigs from the same habitat or samples as the query viruses. The [Formula: see text] ONF method will greatly improve the characterization of novel, metagenomic viruses.


Subject(s)
Bacteria/genetics , Bacteriophages/genetics , Metagenomics , Oligonucleotides/chemistry , Phylogeny , Bacteria/classification , Bacteria/virology , Bacteriophages/classification , Base Sequence , Gastrointestinal Tract/metabolism , Gastrointestinal Tract/virology , Genome, Bacterial , Genome, Human , Genome, Viral , Humans , Oligonucleotides/genetics , Sequence Homology, Nucleic Acid
19.
ISME J ; 10(2): 333-45, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26208139

ABSTRACT

Marine picocyanobacteria, comprising the genera Synechococcus and Prochlorococcus, are the most abundant and widespread primary producers in the ocean. More than 20 genetically distinct clades of marine Synechococcus have been identified, but their physiology and biogeography are not as thoroughly characterized as those of Prochlorococcus. Using clade-specific qPCR primers, we measured the abundance of 10 Synechococcus clades at 92 locations in surface waters of the Atlantic and Pacific Oceans. We found that Synechococcus partition the ocean into four distinct regimes distinguished by temperature, macronutrients and iron availability. Clades I and IV were prevalent in colder, mesotrophic waters; clades II, III and X dominated in the warm, oligotrophic open ocean; clades CRD1 and CRD2 were restricted to sites with low iron availability; and clades XV and XVI were only found in transitional waters at the edges of the other biomes. Overall, clade II was the most ubiquitous clade investigated and was the dominant clade in the largest biome, the oligotrophic open ocean. Co-occurring clades that occupy the same regime belong to distinct evolutionary lineages within Synechococcus, indicating that multiple ecotypes have evolved independently to occupy similar niches and represent examples of parallel evolution. We speculate that parallel evolution of ecotypes may be a common feature of diverse marine microbial communities that contributes to functional redundancy and the potential for resiliency.


Subject(s)
Iron/metabolism , Seawater/chemistry , Seawater/microbiology , Synechococcus/isolation & purification , Synechococcus/metabolism , DNA Primers/genetics , Ecotype , Iron/analysis , Oceans and Seas , Pacific Ocean , Phylogeny , Prochlorococcus/genetics , Prochlorococcus/metabolism , Synechococcus/classification , Synechococcus/genetics , Temperature
20.
Front Microbiol ; 3: 213, 2012.
Article in English | MEDLINE | ID: mdl-22723796

ABSTRACT

Marine Synechococcus is a globally significant genus of cyanobacteria that comprises multiple genetic lineages or clades. These clades are thought to represent ecologically distinct units, or ecotypes. Because multiple clades often co-occur in the oceans, Synechococcus are ideal microbes to explore how closely related bacterial taxa within the same functional guild of organisms co-exist and partition marine habitats. Here we sequenced multiple gene loci from cultured strains to confirm the congruency of clade classifications between the 16S-23S rDNA internal transcribed spacer (ITS), 16S rDNA, narB, ntcA, and rpoC1 loci commonly used in Synechococcus diversity studies. We designed quantitative PCR (qPCR) assays that target the ITS for 10 Synechococcus clades, including four clades, XV, XVI, CRD1, and CRD2, not covered by previous assays employing other loci. Our new qPCR assays are very sensitive and specific, detecting down to tens of cells per ml. Application of these qPCR assays to field samples from the northwest Atlantic showed clear shifts in Synechococcus community composition across a coastal to open-ocean transect. Consistent with previous studies, clades I and IV dominated cold, coastal Synechococcus communities. Clades II and X were abundant at the two warmer, off-shore stations, and at all stations multiple Synechococcus clades co-occurred. The qPCR assays developed here provide valuable tools to further explore the dynamics of microbial community structure and the mechanisms of co-existence.
