Results 1 - 20 of 109
1.
NAR Genom Bioinform ; 6(2): lqae044, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38711860

ABSTRACT

Sequence classification facilitates a fundamental understanding of the structure of microbial communities. Binary metagenomic sequence classifiers are insufficient because environmental metagenomes are typically derived from multiple sequence sources. Here we introduce a deep-learning-based sequence classifier, DeepMicroClass, that classifies metagenomic contigs into five sequence classes, i.e. viruses infecting prokaryotic or eukaryotic hosts, eukaryotic or prokaryotic chromosomes, and prokaryotic plasmids. DeepMicroClass achieved high performance for all sequence classes at various tested sequence lengths ranging from 500 bp to 100 kbp. By benchmarking on a synthetic dataset with variable sequence class composition, we showed that DeepMicroClass obtained better performance for eukaryotic, plasmid and viral contig classification than other state-of-the-art predictors. DeepMicroClass achieved comparable performance on viral sequence classification with geNomad and VirSorter2 when benchmarked on the CAMI II marine dataset. Using a coastal daily time-series metagenomic dataset as a case study, we showed that microbial eukaryotes and prokaryotic viruses are integral to microbial communities. By analyzing monthly metagenomes collected at HOT and BATS, we found relatively higher viral read proportions in the subsurface layer in late summer, consistent with the seasonal viral infection patterns prevalent in these areas. We expect DeepMicroClass will promote metagenomic studies of under-appreciated sequence types.

2.
ISME Commun ; 3(1): 84, 2023 Aug 19.
Article in English | MEDLINE | ID: mdl-37598259

ABSTRACT

Research on marine microbial communities is growing, but studies are hard to compare because of variation in seawater sampling protocols. To help researchers compare studies that use different seawater sampling methodologies, and to help them design future sampling campaigns, we developed the EuroMarine Open Science Exploration initiative (EMOSE). Within the EMOSE framework, we sampled thousands of liters of seawater from a single station in the NW Mediterranean Sea (Service d'Observation du Laboratoire Arago [SOLA], Banyuls-sur-Mer), during one single day. The resulting dataset includes multiple seawater processing approaches, encompassing different filter types (cartridge membrane and flat membrane), three different size fractionations (>0.22 µm, 0.22-3 µm, 3-20 µm and >20 µm), and a number of different seawater volumes ranging from 1 L up to 1000 L. We show that the volume of seawater that is filtered does not have a significant effect on prokaryotic and protist diversity, independently of the sequencing strategy. However, there was a clear difference in alpha and beta diversity between size fractions and between these and "whole water" (with no pre-fractionation). Overall, we recommend care when merging data from datasets that use filters of different pore size, but we consider that the type of filter and volume should not act as confounding variables for the tested sequencing strategies. To the best of our knowledge, this is the first time a publicly available dataset effectively allows for the clarification of the impact of marine microbiome methodological options across a wide range of protocols, including large-scale variations in sampled volume.

3.
ISME Commun ; 3(1): 63, 2023 Jun 24.
Article in English | MEDLINE | ID: mdl-37355737

ABSTRACT

Biological nitrogen fixation, the conversion of N2 gas into a bioavailable form, is vital to sustaining marine primary production. Studies have shifted beyond traditionally studied tropical diazotrophs. Candidatus Atelocyanobacterium thalassa (or UCYN-A) has emerged as a focal point due to its streamlined metabolism, intimate partnership with a haptophyte host, and broad distribution. Here, we explore the environmental parameters that govern UCYN-A's presence at the San Pedro Ocean Time-series (SPOT), its host specificity, and statistically significant interactions with non-host eukaryotes from 2008-2018. 16S and 18S rRNA gene sequences were amplified by "universal primers" from monthly samples and resolved into Amplicon Sequence Variants, allowing us to observe multiple UCYN-A symbioses. UCYN-A1 relative abundances increased following the 2015-2016 El Niño event. This "open ocean ecotype" was present when coastal upwelling declined, and Ekman transport brought tropical waters into the region. Network analyses reveal all strains of UCYN-A co-occur with dinoflagellates including Lepidodinium, a potential predator, and parasitic Syndiniales. UCYN-A2 appeared to pair with multiple hosts and was not tightly coupled to its predominant host, while UCYN-A1 maintained a strong host-symbiont relationship. These biological relationships are particularly important to study in the context of climate change, which will alter UCYN-A distribution at regional and global scales.

4.
Viruses ; 15(2)2023 02 20.
Article in English | MEDLINE | ID: mdl-36851794

ABSTRACT

Cyanophages exert important top-down controls on their cyanobacteria hosts; however, concurrent analysis of both phage and host populations is needed to better assess phage-host interaction models. We analyzed picocyanobacteria Prochlorococcus and Synechococcus and T4-like cyanophage communities in Pacific Ocean surface waters using five years of monthly viral and cellular fraction metagenomes. Cyanophage communities contained thousands of mostly low-abundance (<2% relative abundance) species with varying temporal dynamics, categorized as seasonally recurring or non-seasonal and occurring persistently, occasionally, or sporadically (detected in ≥85%, 15-85%, or <15% of samples, respectively). Viromes contained mostly seasonal and persistent phages (~40% each), while cellular fraction metagenomes had mostly sporadic species (~50%), reflecting that these sample sets capture different steps of the infection cycle: virions from prior infections or phages within currently infected cells, respectively. Two groups of seasonal phages correlated to Synechococcus or Prochlorococcus were abundant in spring/summer or fall/winter, respectively. Cyanophages likely have a strong influence on the host community structure, as their communities explained up to 32% of host community variation. These results indicate that both seasonally recurrent and apparently stochastic processes, likely determined by host availability and different host-range strategies among phages, are critical to phage-host interactions and dynamics, consistent with both the Kill-the-Winner and the Bank models.
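The occurrence categories quoted in this abstract (detected in ≥85%, 15-85%, or <15% of samples) can be illustrated with a minimal sketch. The function name and the boolean-list input format are assumptions made for illustration, not the authors' pipeline; a sample at exactly 15% is treated as "occasional" because the abstract reserves "sporadic" for <15%.

```python
def classify_occurrence(detected_flags):
    """Classify one phage species by the share of samples in which it was detected.

    detected_flags: list of booleans, one per sample (hypothetical input format).
    Thresholds follow the abstract: >=85% persistent, 15-85% occasional, <15% sporadic.
    """
    total = len(detected_flags)
    if total == 0:
        raise ValueError("need at least one sample")
    detected = sum(1 for flag in detected_flags if flag)
    # Integer comparisons avoid floating-point edge cases at the thresholds.
    if detected * 100 >= 85 * total:
        return "persistent"
    if detected * 100 >= 15 * total:
        return "occasional"
    return "sporadic"
```

For a five-year monthly series (60 samples), a species detected in 51 or more samples would be labeled persistent under these thresholds.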


Subjects
Bacteriophages, Synechococcus, Bacteriophages/genetics, Host Specificity, Metagenome, Pacific Ocean, Seasons

5.
Nat Commun ; 14(1): 502, 2023 01 31.
Article in English | MEDLINE | ID: mdl-36720887

ABSTRACT

The introduction of high-throughput chromosome conformation capture (Hi-C) into metagenomics enables reconstructing high-quality metagenome-assembled genomes (MAGs) from microbial communities. Despite recent advances in recovering eukaryotic, bacterial, and archaeal genomes using Hi-C contact maps, few Hi-C-based methods are designed to retrieve viral genomes. Here we introduce ViralCC, a publicly available tool to recover complete viral genomes and detect virus-host pairs using Hi-C data. Compared to other Hi-C-based methods, ViralCC leverages the virus-host proximity structure as a complementary information source for the Hi-C interactions. Using mock and real metagenomic Hi-C datasets from several different microbial ecosystems, including the human gut, cow feces, and wastewater, we demonstrate that ViralCC outperforms existing Hi-C-based binning methods as well as state-of-the-art tools specifically dedicated to metagenomic viral binning. ViralCC can also reveal the taxonomic structure of viruses and virus-host pairs in microbial communities. When applied to a real wastewater metagenomic Hi-C dataset, ViralCC constructs a phage-host network, which is further validated using CRISPR spacer analyses. ViralCC is an open-source pipeline available at https://github.com/dyxstat/ViralCC .


Subjects
Bacteriophages, Microbiota, Animals, Cattle, Female, Humans, Metagenome/genetics, Metagenomics, Wastewater, Viral Genome/genetics, Bacteriophages/genetics

6.
Environ Microbiol ; 25(3): 689-704, 2023 03.
Article in English | MEDLINE | ID: mdl-36478085

ABSTRACT

Marine Group I (MGI) Thaumarchaeota were originally described as chemoautotrophic nitrifiers, but molecular and isotopic evidence suggests heterotrophic and/or mixotrophic capabilities. Here, we investigated the quantity and composition of organic matter assimilated by individual, uncultured MGI cells from the Pacific Ocean to constrain their potential for mixotrophy and heterotrophy. We observed that most MGI cells did not assimilate carbon from any organic substrate provided (glucose, pyruvate, oxaloacetate, protein, urea, and amino acids). The minority of MGI cells that did assimilate it did so exclusively from nitrogenous substrates (urea, 15% of MGI and amino acids, 36% of MGI), and only as an auxiliary carbon source (<20% of that subset's total cellular carbon was derived from those substrates). At the population level, MGI assimilation of organic carbon comprised just 0.5%-11% of total biomass carbon. We observed extensive assimilation of inorganic carbon and urea- and amino acid-derived nitrogen (equal to that from ammonium), consistent with metagenomic and metatranscriptomic analyses performed here and previously showing a widespread potential for MGI to perform autotrophy and transport and degrade organic nitrogen. Our results constrain the quantity and composition of organic matter used by MGI and suggest they use it primarily to meet nitrogen demands for anabolism and nitrification.


Subjects
Archaea, Carbon, Archaea/metabolism, Carbon/metabolism, Amino Acids/metabolism, Urea/metabolism, Nitrogen/metabolism

7.
Nat Commun ; 13(1): 7905, 2022 12 23.
Article in English | MEDLINE | ID: mdl-36550140

ABSTRACT

Free-living and particle-associated marine prokaryotes have physiological, genomic, and phylogenetic differences, yet factors influencing their temporal dynamics remain poorly constrained. In this study, we quantify the entire microbial community composition monthly over several years, including viruses, prokaryotes, phytoplankton, and total protists, from the San Pedro Ocean Time-series using ribosomal RNA sequencing and viral metagenomics. Canonical analyses show that in addition to physicochemical factors, the double-stranded DNA viral community is the strongest factor predicting free-living prokaryotes, explaining 28% of variability, whereas the phytoplankton (via chloroplast 16S rRNA) community is the strongest factor for particle-associated prokaryotes, explaining 31% of variability. Unexpectedly, the protist community explains little variability. Our findings suggest that biotic interactions are significant determinants of the temporal dynamics of prokaryotes, and the relative importance of specific interactions varies depending on lifestyles. Also, warming influenced the prokaryotic community, which largely remained oligotrophic summer-like throughout 2014-15, with cyanobacterial populations shifting from cold-water ecotypes to warm-water ecotypes.


Subjects
Cyanobacteria, Phytoplankton, Phytoplankton/genetics, 16S Ribosomal RNA/genetics, Phylogeny, Cyanobacteria/genetics, Eukaryota/genetics, Water, Seawater/microbiology

8.
mSystems ; 7(5): e0074522, 2022 10 26.
Article in English | MEDLINE | ID: mdl-36190138

ABSTRACT

Trait inference from mixed-species assemblages is a central problem in microbial ecology. Frequently, sequencing information from an environment is available, but phenotypic measurements from individual community members are not. With the increasing availability of molecular data for microbial communities, bioinformatic approaches that map metagenome to (meta)phenotype are needed. Recently, we developed a tool, gRodon, that enables the prediction of the maximum growth rate of an organism from genomic data on the basis of codon usage patterns. Our work and that of other groups suggest that such predictors can be applied to mixed-species communities in order to derive estimates of the average community-wide maximum growth rate. Here, we present an improved maximum growth rate predictor designed for metagenomes that corrects a persistent GC bias in the original gRodon model for metagenomic prediction. We benchmark this predictor with simulated metagenomic data sets to show that it has superior performance on mixed-species communities relative to earlier models. We go on to provide guidance on data preprocessing and show that calling genes from assembled contigs rather than directly from reads dramatically improves performance. Finally, we apply our predictor to large-scale metagenomic data sets from marine and human microbiomes to illustrate how community-wide growth prediction can be a powerful approach for hypothesis generation. Altogether, we provide an updated tool with clear guidelines for users about the uses and pitfalls of metagenomic prediction of the average community-wide maximal growth rate. IMPORTANCE: Microbes dominate nearly every known habitat, and therefore tools to survey the structure and function of natural microbial communities are much needed. Metagenomics, in which the DNA content of an entire community of organisms is sequenced all at once, allows us to probe the genetic diversity contained in a habitat. Yet, mapping metagenomic information to the actual traits of community members is a difficult and largely unsolved problem. Here, we present and validate a tool that allows users to predict the average maximum growth rate of a microbial community directly from metagenomic data. Maximum growth rate is a fundamental characteristic of microbial species that can give us a great deal of insight into their ecological role, and by applying our community-level predictor to large-scale metagenomic data sets from marine and human-associated microbiomes, we show how community-wide growth prediction can be a powerful approach for hypothesis generation.


Subjects
Metagenome, Microbiota, Humans, Metagenome/genetics, Benchmarking, Codon Usage, Microbiota/genetics

9.
Bioinformatics ; 38(Suppl 1): i45-i52, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758806

ABSTRACT

MOTIVATION: Phage-host associations play important roles in microbial communities. But in natural communities, where phages are discovered and characterized metagenomically rather than through culture-based lab studies, their hosts are generally not known. Several programs have been developed for predicting which phage infects which host based on various sequence similarity measures or machine learning approaches. These are often based on whole viral and host genomes, but in metagenomics-based studies, we rarely have whole genomes but rather must rely on contigs that are sometimes only hundreds of bp long. Therefore, we need programs that predict hosts of phage contigs on the basis of these short contigs. Although most existing programs can be applied to metagenomic datasets for these predictions, their accuracies are generally low. Here, we develop ContigNet, a convolutional neural network-based model capable of predicting phage-host matches based on relatively short contigs, and compare it to previously published VirHostMatcher (VHM) and WIsH. RESULTS: On the validation set, ContigNet achieves 72-85% area under the receiver operating characteristic curve (AUROC) scores, compared to the maximum of 68% by VHM or WIsH for contigs of lengths between 200 bp and 50 kbp. We also apply the model to the Metagenomic Gut Virus (MGV) catalogue, a dataset containing a wide range of draft genomes from metagenomic samples, and achieve 60-70% AUROC scores compared to 52% for VHM and WIsH. Surprisingly, ContigNet can also be used to predict plasmid-host contig associations with high accuracy, indicating a similar genetic exchange between mobile genetic elements and their hosts. AVAILABILITY AND IMPLEMENTATION: The source code of ContigNet and related datasets can be downloaded from https://github.com/tianqitang1/ContigNet.
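For readers less familiar with the AUROC figures reported above, the following minimal sketch computes AUROC as the probability that a true phage-host pair receives a higher score than a false pair, with ties counted as one half. The function name and the toy scores are assumptions for illustration; this is not ContigNet's evaluation code.

```python
def auroc(scores_true_pairs, scores_false_pairs):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Returns the fraction of (true, false) score pairs in which the true pair
    scores higher; ties contribute 0.5. Inputs are hypothetical model scores.
    """
    if not scores_true_pairs or not scores_false_pairs:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for pos in scores_true_pairs:
        for neg in scores_false_pairs:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_true_pairs) * len(scores_false_pairs))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the 52% reported for the baseline tools on the MGV catalogue indicates near-chance performance.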


Subjects
Bacteriophages, Bacteria/genetics, Bacteriophages/genetics, Metagenome, Metagenomics, Neural Networks (Computer)

10.
ISME Commun ; 2(1): 36, 2022 Apr 13.
Article in English | MEDLINE | ID: mdl-37938286

ABSTRACT

Community dynamics are central in microbial ecology, yet we lack studies comparing diversity patterns among marine protists and prokaryotes over depth and multiple years. Here, we characterized microbes at the San Pedro Ocean Time-series (2005-2018), using SSU rRNA gene sequencing from two size fractions (0.2-1 and 1-80 µm), with a universal primer set that amplifies from both prokaryotes and eukaryotes, allowing direct comparisons of diversity patterns in a single set of analyses. The 16S + 18S rRNA gene composition in the small size fraction was mostly prokaryotic (>92%) as expected, but the large size fraction unexpectedly contained 46-93% prokaryotic 16S rRNA genes. Prokaryotes and protists showed opposite vertical diversity patterns; prokaryotic diversity peaked at mid-depth, protistan diversity at the surface. Temporal beta-diversity patterns indicated prokaryote communities were much more stable than protist communities. Although the prokaryotic communities changed monthly, the average community stayed remarkably steady over 14 years, showing high resilience. Additionally, particle-associated prokaryotes were more diverse than smaller free-living ones, especially at deeper depths, driven unexpectedly by abundant and diverse SAR11 clade II. Eukaryotic diversity was strongly correlated with the diversity of particle-associated prokaryotes but not free-living ones, reflecting that physical associations result in the strongest interactions, including symbioses, parasitism, and decomposer relationships.

11.
Proc Biol Sci ; 288(1961): 20211555, 2021 10 27.
Article in English | MEDLINE | ID: mdl-34666523

ABSTRACT

Clustered regularly interspaced short palindromic repeat (CRISPR)-Cas adaptive immune systems enable bacteria and archaea to efficiently respond to viral pathogens by creating a genomic record of previous encounters. These systems are broadly distributed across prokaryotic taxa, yet are surprisingly absent in a majority of organisms, suggesting that the benefits of adaptive immunity frequently do not outweigh the costs. Here, combining experiments and models, we show that a delayed immune response that allows viruses to transiently redirect cellular resources to reproduction, which we call 'immune lag', is extremely costly during viral outbreaks, even to completely immune hosts. Critically, the costs of lag are only revealed by examining the early, transient dynamics of a host-virus system occurring immediately after viral challenge. Lag is a basic parameter of microbial defence, relevant to all intracellular, post-infection antiviral defence systems, that has to date been largely ignored by theoretical and experimental treatments of host-phage systems.


Subjects
Bacteriophages, Viruses, Archaea, Bacteria/genetics, CRISPR-Cas Systems, Disease Outbreaks

12.
mSystems ; 6(3): e0056521, 2021 Jun 29.
Article in English | MEDLINE | ID: mdl-34060911

ABSTRACT

Small subunit rRNA (SSU rRNA) amplicon sequencing can quantitatively and comprehensively profile natural microbiomes, representing a critically important tool for studying diverse global ecosystems. However, results will only be accurate if PCR primers perfectly match the rRNA of all organisms present. To evaluate how well marine microorganisms across all 3 domains are detected by this method, we compared commonly used primers with >300 million rRNA gene sequences retrieved from globally distributed marine metagenomes. The best-performing primers compared to 16S rRNA of bacteria and archaea were 515Y/926R and 515Y/806RB, which perfectly matched over 96% of all sequences. Considering cyanobacterial and chloroplast 16S rRNA, 515Y/926R had the highest coverage (99%), making this set ideal for quantifying marine primary producers. For eukaryotic 18S rRNA sequences, 515Y/926R also performed best (88%), followed by V4R/V4RB (18S rRNA specific; 82%), demonstrating that the 515Y/926R combination performs best overall for all 3 domains. Using Atlantic and Pacific Ocean samples, we demonstrate high correspondence between 515Y/926R amplicon abundances (generated for this study) and metagenomic 16S rRNA (median R² = 0.98, n = 272), indicating amplicons can produce equally accurate community composition data compared with shotgun metagenomics. Our analysis also revealed that expected performance of all primer sets could be improved with minor modifications, pointing toward a nearly completely universal primer set that could accurately quantify biogeochemically important taxa in ecosystems ranging from the deep sea to the surface. In addition, our reproducible bioinformatic workflow can guide microbiome researchers studying different ecosystems or human health to similarly improve existing primers and generate more accurate quantitative amplicon data. IMPORTANCE: PCR amplification and sequencing of marker genes is a low-cost technique for monitoring prokaryotic and eukaryotic microbial communities across space and time but will work optimally only if environmental organisms match PCR primer sequences exactly. In this study, we evaluated how well primers match globally distributed short-read oceanic metagenomes. Our results demonstrate that primer sets vary widely in performance, and that at least for marine systems, rRNA amplicon data from some primers lack significant biases compared to metagenomes. We also show that it is theoretically possible to create a nearly universal primer set for diverse saline environments by defining a specific mixture of a few dozen oligonucleotides, and present a software pipeline that can guide rational design of primers for any environment with available meta'omic data.
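The "perfect match" coverage statistic described in this abstract can be sketched as follows: a degenerate primer is expanded with the standard IUPAC nucleotide codes and compared position by position against each target site. The function names, the toy primer, and the toy sites are assumptions for illustration only; this is not the authors' published workflow, and it assumes the target sites have already been extracted and oriented to the primer.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def perfectly_matches(primer, site):
    """True if every base of `site` is allowed by the primer code at that position."""
    if len(primer) != len(site):
        return False
    return all(base in IUPAC[code]
               for code, base in zip(primer.upper(), site.upper()))

def perfect_match_fraction(primer, sites):
    """Fraction of target sites that the (possibly degenerate) primer matches exactly."""
    if not sites:
        raise ValueError("need at least one target site")
    return sum(perfectly_matches(primer, s) for s in sites) / len(sites)

# Toy example (hypothetical primer and sites, not real primer sequences).
toy_primer = "GTGYCAGCM"
toy_sites = ["GTGCCAGCA", "GTGTCAGCC", "GTGACAGCA"]
coverage = perfect_match_fraction(toy_primer, toy_sites)  # 2 of 3 sites match
```

An ambiguous base in the target (for example N) counts as a mismatch in this sketch, which is a conservative choice.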

13.
Environ Microbiol ; 23(6): 3240-3250, 2021 06.
Article in English | MEDLINE | ID: mdl-33938123

ABSTRACT

Universal primers for SSU rRNA genes allow profiling of natural communities by simultaneously amplifying templates from Bacteria, Archaea, and Eukaryota in a single PCR reaction. Despite the potential to show relative abundance for all rRNA genes, universal primers are rarely used, due to various concerns including amplicon length variation and its effect on bioinformatic pipelines. We thus developed 16S and 18S rRNA mock communities and a bioinformatic pipeline to validate this approach. Using these mocks, we show that universal primers (515Y/926R) outperformed eukaryote-specific V4 primers in observed versus expected abundance correlations (slope = 0.88 vs. 0.67-0.79), and mock community members with single mismatches to the primer were strongly underestimated (threefold to eightfold). Using field samples, both primers yielded similar 18S beta-diversity patterns (Mantel test, p < 0.001) but differences in relative proportions of many rarer taxa. To test for length biases, we mixed mock communities (16S + 18S) before PCR and found a twofold underestimation of 18S sequences due to sequencing bias. Correcting for the twofold underestimation, we estimate that, in Southern California field samples (1.2-80 µm), there were averages of 35% 18S, 28% chloroplast 16S, and 37% prokaryote 16S rRNA genes. These data demonstrate the potential for universal primers to generate comprehensive microbiome profiles.
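The twofold correction mentioned in this abstract amounts to scaling the observed 18S share and renormalizing. The sketch below shows that arithmetic; the function name and the input proportions are hypothetical and are not values from the study.

```python
def correct_18s_underestimation(observed, factor=2.0, key="18S"):
    """Scale the under-recovered category by `factor`, then renormalize to sum to 1.

    observed: dict mapping category name to observed proportion or read count.
    factor: assumed underestimation factor (twofold, per the abstract).
    """
    adjusted = dict(observed)
    adjusted[key] = observed[key] * factor
    total = sum(adjusted.values())
    return {name: value / total for name, value in adjusted.items()}

# Hypothetical observed proportions, for illustration only.
observed = {"18S": 0.20, "chloroplast 16S": 0.35, "prokaryote 16S": 0.45}
corrected = correct_18s_underestimation(observed)
```

With these made-up inputs, the 18S share rises from 20% to one third after correction, while the two 16S categories shrink proportionally.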


Subjects
High-Throughput Nucleotide Sequencing, Bias, Polymerase Chain Reaction, 16S Ribosomal RNA/genetics, 18S Ribosomal RNA/genetics, DNA Sequence Analysis

14.
mSystems ; 6(3)2021 May 11.
Article in English | MEDLINE | ID: mdl-33975968

ABSTRACT

Bacterial biodegradation is a significant contributor to remineralization of polycyclic aromatic hydrocarbons (PAHs), toxic and recalcitrant components of crude oil as well as by-products of partial combustion chronically introduced into seawater via atmospheric deposition. The Deepwater Horizon oil spill demonstrated the speed at which a seed PAH-degrading community maintained by chronic inputs responds to acute pollution. We investigated the diversity and functional potential of a similar seed community in the chronically polluted Port of Los Angeles (POLA), using stable isotope probing with naphthalene, deep-sequenced metagenomes, and carbon incorporation rate measurements at the port and in two sites in the San Pedro Channel. We demonstrate the ability of the community of degraders at the POLA to incorporate carbon from naphthalene, leading to a quick shift in microbial community composition to be dominated by the normally rare Colwellia and Cycloclasticus. We show that metagenome-assembled genomes (MAGs) belonged to these naphthalene degraders by matching their 16S-rRNA gene with experimental stable isotope probing data. Surprisingly, we did not find a full PAH degradation pathway in those genomes, even when combining genes from the entire microbial community, leading us to hypothesize that promiscuous dehydrogenases replace canonical naphthalene degradation enzymes in this site. We compared metabolic pathways identified in 29 genomes whose abundance increased in the presence of naphthalene to generate genomic-based recommendations for future optimization of PAH bioremediation at the POLA, e.g., ammonium as opposed to urea, heme or hemoproteins as an iron source, and polar amino acids. IMPORTANCE: Oil spills in the marine environment have a devastating effect on marine life and biogeochemical cycles through bioaccumulation of toxic hydrocarbons and oxygen depletion by hydrocarbon-degrading bacteria. Oil-degrading bacteria occur naturally in the ocean, especially where they are supported by chronic inputs of oil or other organic carbon sources, and have a significant role in degradation of oil spills. Polycyclic aromatic hydrocarbons are the most persistent and toxic component of crude oil. Therefore, the bacteria that can break those molecules down are of particular importance. We identified such bacteria at the Port of Los Angeles (POLA), one of the busiest ports worldwide, and characterized their metabolic capabilities. We propose chemical targets based on those analyses to stimulate the activity of these bacteria in case of an oil spill at the POLA.

15.
Proc Natl Acad Sci U S A ; 118(12)2021 03 23.
Article in English | MEDLINE | ID: mdl-33723043

ABSTRACT

Maximal growth rate is a basic parameter of microbial lifestyle that varies over several orders of magnitude, with doubling times ranging from a matter of minutes to multiple days. Growth rates are typically measured using laboratory culture experiments. Yet, we lack sufficient understanding of the physiology of most microbes to design appropriate culture conditions for them, severely limiting our ability to assess the global diversity of microbial growth rates. Genomic estimators of maximal growth rate provide a practical solution to survey the distribution of microbial growth potential, regardless of cultivation status. We developed an improved maximal growth rate estimator and predicted maximal growth rates from over 200,000 genomes, metagenome-assembled genomes, and single-cell amplified genomes to survey growth potential across the range of prokaryotic diversity; extensions allow estimates from 16S rRNA sequences alone as well as weighted community estimates from metagenomes. We compared the growth rates of cultivated and uncultivated organisms to illustrate how culture collections are strongly biased toward organisms capable of rapid growth. Finally, we found that organisms naturally group into two growth classes and observed a bias in growth predictions for extremely slow-growing organisms. These observations ultimately led us to suggest evolutionary definitions of oligotrophy and copiotrophy based on the selective regime an organism occupies. We found that these growth classes are associated with distinct selective regimes and genomic functional potentials.


Subjects
Codon Usage, Metagenome, Metagenomics, Microbiological Phenomena/genetics, Single-Cell Analysis, Genetic Databases, Molecular Evolution, Metagenomics/methods, Prokaryotic Cells/physiology, Single-Cell Analysis/methods

16.
ISME J ; 15(1): 183-195, 2021 01.
Article in English | MEDLINE | ID: mdl-32939027

ABSTRACT

Growth rates are central to understanding microbial interactions and community dynamics. Metagenomic growth estimators have been developed, specifically codon usage bias (CUB) for maximum growth rates and "peak-to-trough ratio" (PTR) for in situ rates. Both were originally tested with pure cultures, but natural populations are more heterogeneous, especially in individual cell histories pertinent to PTR. To test these methods, we compared predictors with observed growth rates of freshly collected marine prokaryotes in unamended seawater. We prefiltered and diluted samples to remove grazers and greatly reduce virus infection, so net growth approximated gross growth. We sampled over 44 h for abundances and metagenomes, generating 101 metagenome-assembled genomes (MAGs), including Actinobacteria, Verrucomicrobia, SAR406, MGII archaea, etc. We tracked each MAG population by cell-abundance-normalized read recruitment, finding growth rates of 0 to 5.99 per day, the first reported rates for several groups, and used these rates as benchmarks. PTR, calculated by three methods, rarely correlated to growth (r ≈ -0.26 to 0.08), except for rapidly growing γ-Proteobacteria (r ≈ 0.63 to 0.92), while CUB correlated moderately well to observed maximum growth rates (r = 0.57). This suggests that current PTR approaches poorly predict actual growth of most marine bacterial populations, but maximum growth rates can be approximated from genomic characteristics.
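A growth rate "per day" of the kind benchmarked in this abstract is conventionally the specific rate of an exponentially growing population, μ = ln(N_end / N_start) / Δt. The sketch below applies that textbook formula to two abundance estimates; the function names and the example numbers are assumptions for illustration and do not reproduce the study's read-recruitment normalization.

```python
import math

def net_growth_rate_per_day(n_start, n_end, hours):
    """Specific growth rate (per day) assuming exponential growth between two time points."""
    if n_start <= 0 or n_end <= 0 or hours <= 0:
        raise ValueError("abundances and elapsed time must be positive")
    return math.log(n_end / n_start) / (hours / 24.0)

def doubling_time_days(rate_per_day):
    """Doubling time implied by a specific growth rate; infinite if the rate is not positive."""
    if rate_per_day <= 0:
        return math.inf
    return math.log(2) / rate_per_day

# Hypothetical population that doubles in 24 h: rate = ln(2) per day.
example_rate = net_growth_rate_per_day(1.0e5, 2.0e5, 24)
```

On this scale, the upper bound of 5.99 per day reported above corresponds to a doubling time of roughly 0.12 days, or just under 3 hours.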


Subjects
Benchmarking, Metagenome, Archaea/genetics, Bacteria/genetics, Metagenomics

18.
NAR Genom Bioinform ; 2(2): lqaa044, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32626849

ABSTRACT

Metagenomic sequencing has greatly enhanced the discovery of viral genomic sequences; however, it remains challenging to identify the host(s) of these new viruses. We developed VirHostMatcher-Net, a flexible, network-based, Markov random field framework for predicting virus-prokaryote interactions using multiple, integrated features: CRISPR sequences and alignment-free similarity measures ([Formula: see text] and WIsH). Evaluation of this method on a benchmark set of 1462 known virus-prokaryote pairs yielded host prediction accuracy of 59% and 86% at the genus and phylum levels, representing 16-27% and 6-10% improvement, respectively, over previous single-feature prediction approaches. We applied our host prediction tool to crAssphage, a human gut phage, and two metagenomic virus datasets: marine viruses and viral contigs recovered from globally distributed, diverse habitats. Host predictions were frequently consistent with those of previous studies, but more importantly, this new tool made many more confident predictions than previous tools, up to nearly 3-fold more (n > 27 000), greatly expanding the diversity of known virus-host interactions.

19.
Glob Chang Biol ; 26(10): 5613-5629, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32715608

ABSTRACT

Western boundary currents (WBCs) redistribute heat and oligotrophic seawater from the tropics to temperate latitudes, with several displaying substantial climate change-driven intensification over the last century. Strengthening WBCs have been implicated in the poleward range expansion of marine macroflora and fauna; however, the impacts on the structure and function of temperate microbial communities are largely unknown. Here we show that the major subtropical WBC of the South Pacific Ocean, the East Australian Current (EAC), transports microbial assemblages that maintain tropical and oligotrophic (k-strategist) signatures, to seasonally displace more copiotrophic (r-strategist) temperate microbial populations within temperate latitudes of the Tasman Sea. We identified specific characteristics of EAC microbial assemblages compared with non-EAC assemblages, including strain transitions within the SAR11 clade, enrichment of Prochlorococcus, predicted smaller genome sizes and shifts in the importance of several functional genes, including those associated with cyanobacterial photosynthesis, secondary metabolism and fatty acid and lipid transport. At a temperate time-series site in the Tasman Sea, we observed significant reductions in standing stocks of total carbon and chlorophyll a, and a shift towards smaller phytoplankton and carnivorous copepods, associated with the seasonal impact of the EAC microbial assemblage. In light of the substantial shifts in microbial assemblage structure and function associated with the EAC, we conclude that climate-driven expansions of WBCs will expand the range of tropical oligotrophic microbes, and potentially profoundly impact the trophic status of temperate waters.


Subjects
Prochlorococcus, Seawater, Australia, Chlorophyll A, Pacific Ocean

20.
Quant Biol ; 8(1): 64-77, 2020 Mar.
Article in English | MEDLINE | ID: mdl-34084563

ABSTRACT

BACKGROUND: The recent development of metagenomic sequencing makes it possible to massively sequence microbial genomes including viral genomes without the need for laboratory culture. Existing reference-based and gene homology-based methods are not efficient in identifying unknown viruses or short viral sequences from metagenomic data. METHODS: Here we developed a reference-free and alignment-free machine learning method, DeepVirFinder, for identifying viral sequences in metagenomic data using deep learning. RESULTS: Trained based on sequences from viral RefSeq discovered before May 2015, and evaluated on those discovered after that date, DeepVirFinder outperformed the state-of-the-art method VirFinder at all contig lengths, achieving AUROC 0.93, 0.95, 0.97, and 0.98 for 300, 500, 1000, and 3000 bp sequences respectively. Enlarging the training data with additional millions of purified viral sequences from metavirome samples further improved the accuracy for identifying virus groups that are under-represented. Applying DeepVirFinder to real human gut metagenomic samples, we identified 51,138 viral sequences belonging to 175 bins in patients with colorectal carcinoma (CRC). Ten bins were found associated with the cancer status, suggesting viruses may play important roles in CRC. CONCLUSIONS: Powered by deep learning and high throughput sequencing metagenomic data, DeepVirFinder significantly improved the accuracy of viral identification and will assist the study of viruses in the era of metagenomics.
