Results 1 - 20 of 33
1.
Nat Commun ; 14(1): 6233, 2023 10 12.
Article in English | MEDLINE | ID: mdl-37828003

ABSTRACT

Despite being perennially frigid, polar oceans form an ecosystem hosting high and unique biodiversity. Various organisms show different adaptive strategies in this habitat, but how viruses adapt to this environment is largely unknown. Viruses of phyla Nucleocytoviricota and Mirusviricota are groups of eukaryote-infecting large and giant DNA viruses with genomes encoding a variety of functions. Here, by leveraging the Global Ocean Eukaryotic Viral database, we investigate the biogeography and functional repertoire of these viruses at a global scale. We first confirm the existence of an ecological barrier that clearly separates polar and nonpolar viral communities, and then demonstrate that temperature drives dramatic changes in the virus-host network at the polar-nonpolar boundary. Ancestral niche reconstruction suggests that adaptation of these viruses to polar conditions has occurred repeatedly over the course of evolution, with polar-adapted viruses in the modern ocean being scattered across their phylogeny. Numerous viral genes are specifically associated with polar adaptation, although most of their homologues are not identified as polar-adaptive genes in eukaryotes. These results suggest that giant viruses adapt to cold environments by changing their functional repertoire, and this viral evolutionary strategy is distinct from the polar adaptation strategy of their hosts.


Subject(s)
Giant Viruses , Viruses , Giant Viruses/genetics , Genome, Viral/genetics , Ecosystem , Oceans and Seas , Phylogeny , DNA Viruses/genetics , Genomics , Viruses/genetics , Eukaryota/genetics
2.
ISME Commun ; 3(1): 84, 2023 Aug 19.
Article in English | MEDLINE | ID: mdl-37598259

ABSTRACT

Research on marine microbial communities is growing, but studies are hard to compare because of variation in seawater sampling protocols. To help researchers in the inter-comparison of studies that use different seawater sampling methodologies, as well as to help them design future sampling campaigns, we developed the EuroMarine Open Science Exploration initiative (EMOSE). Within the EMOSE framework, we sampled thousands of liters of seawater from a single station in the NW Mediterranean Sea (Service d'Observation du Laboratoire Arago [SOLA], Banyuls-sur-Mer), during one single day. The resulting dataset includes multiple seawater processing approaches, encompassing different filter types (cartridge membrane and flat membrane), three different size fractionations (>0.22 µm, 0.22-3 µm, 3-20 µm and >20 µm), and a number of different seawater volumes ranging from 1 L up to 1000 L. We show that the volume of seawater that is filtered does not have a significant effect on prokaryotic and protist diversity, independently of the sequencing strategy. However, there was a clear difference in alpha and beta diversity between size fractions and between these and "whole water" (with no pre-fractionation). Overall, we recommend care when merging data from datasets that use filters of different pore size, but we consider that the type of filter and volume should not act as confounding variables for the tested sequencing strategies. To the best of our knowledge, this is the first time a publicly available dataset effectively allows for the clarification of the impact of marine microbiome methodological options across a wide range of protocols, including large-scale variations in sampled volume.

3.
bioRxiv ; 2023 May 26.
Article in English | MEDLINE | ID: mdl-37293035

ABSTRACT

A wide variety of human diseases are associated with loss of microbial diversity in the human gut, inspiring a great interest in the diagnostic or therapeutic potential of the microbiota. However, the ecological forces that drive diversity reduction in disease states remain unclear, rendering it difficult to ascertain the role of the microbiota in disease emergence or severity. One hypothesis to explain this phenomenon is that microbial diversity is diminished as disease states select for microbial populations that are more fit to survive environmental stress caused by inflammation or other host factors. Here, we tested this hypothesis on a large scale, by developing a software framework to quantify the enrichment of microbial metabolisms in complex metagenomes as a function of microbial diversity. We applied this framework to over 400 gut metagenomes from individuals who are healthy or diagnosed with inflammatory bowel disease (IBD). We found that high metabolic independence (HMI) is a distinguishing characteristic of microbial communities associated with individuals diagnosed with IBD. A classifier we trained using the normalized copy numbers of 33 HMI-associated metabolic modules not only distinguished states of health versus IBD, but also tracked the recovery of the gut microbiome following antibiotic treatment, suggesting that HMI is a hallmark of microbial communities in stressed gut environments.

4.
Nature ; 616(7958): 783-789, 2023 04.
Article in English | MEDLINE | ID: mdl-37076623

ABSTRACT

DNA viruses have a major influence on the ecology and evolution of cellular organisms [1-4], but their overall diversity and evolutionary trajectories remain elusive [5]. Here we carried out a phylogeny-guided genome-resolved metagenomic survey of the sunlit oceans and discovered plankton-infecting relatives of herpesviruses that form a putative new phylum dubbed Mirusviricota. The virion morphogenesis module of this large monophyletic clade is typical of viruses from the realm Duplodnaviria [6], with multiple components strongly indicating a common ancestry with animal-infecting Herpesvirales. Yet, a substantial fraction of mirusvirus genes, including hallmark transcription machinery genes missing in herpesviruses, are closely related homologues of giant eukaryotic DNA viruses from another viral realm, Varidnaviria. These remarkable chimaeric attributes connecting Mirusviricota to herpesviruses and giant eukaryotic viruses are supported by more than 100 environmental mirusvirus genomes, including a near-complete contiguous genome of 432 kilobases. Moreover, mirusviruses are among the most abundant and active eukaryotic viruses characterized in the sunlit oceans, encoding a diverse array of functions used during the infection of microbial eukaryotes from pole to pole. The prevalence, functional activity, diversification and atypical chimaeric attributes of mirusviruses point to a lasting role of Mirusviricota in the ecology of marine ecosystems and in the evolution of eukaryotic DNA viruses.


Subject(s)
Aquatic Organisms , Giant Viruses , Herpesviridae , Oceans and Seas , Phylogeny , Plankton , Animals , Ecosystem , Eukaryota/virology , Genome, Viral/genetics , Giant Viruses/classification , Giant Viruses/genetics , Herpesviridae/classification , Herpesviridae/genetics , Plankton/virology , Metagenomics , Metagenome , Sunlight , Transcription, Genetic/genetics , Aquatic Organisms/virology
5.
Nature ; 612(7939): 283-291, 2022 12.
Article in English | MEDLINE | ID: mdl-36477129

ABSTRACT

Late Pliocene and Early Pleistocene epochs 3.6 to 0.8 million years ago [1] had climates resembling those forecasted under future warming [2]. Palaeoclimatic records show strong polar amplification with mean annual temperatures of 11-19 °C above contemporary values [3,4]. The biological communities inhabiting the Arctic during this time remain poorly known because fossils are rare [5]. Here we report an ancient environmental DNA [6] (eDNA) record describing the rich plant and animal assemblages of the Kap København Formation in North Greenland, dated to around two million years ago. The record shows an open boreal forest ecosystem with mixed vegetation of poplar, birch and thuja trees, as well as a variety of Arctic and boreal shrubs and herbs, many of which had not previously been detected at the site from macrofossil and pollen records. The DNA record confirms the presence of hare and mitochondrial DNA from animals including mastodons, reindeer, rodents and geese, all ancestral to their present-day and late Pleistocene relatives. The presence of marine species including horseshoe crab and green algae supports a warmer climate than today. The reconstructed ecosystem has no modern analogue. The survival of such ancient eDNA probably relates to its binding to mineral surfaces. Our findings open new areas of genetic research, demonstrating that it is possible to track the ecology and evolution of biological communities from two million years ago using ancient eDNA.


Subject(s)
DNA, Environmental , Ecosystem , Ecology , Fossils , Greenland
7.
Nat Commun ; 13(1): 7135, 2022 11 21.
Article in English | MEDLINE | ID: mdl-36414628

ABSTRACT

The biotic crisis following the end-Cretaceous asteroid impact resulted in a dramatic renewal of pelagic biodiversity. Considering the severe and immediate effect of the asteroid impact on the pelagic environment, it is remarkable that some of the most affected pelagic groups, like the planktonic foraminifera, survived at all. Here we queried a surface ocean metabarcoding dataset to show that calcareous benthic foraminifera of the clade Globothalamea are able to disperse actively in the plankton, and we show using molecular clock phylogeny that the modern planktonic clades originated from different benthic ancestors that colonized the plankton after the end-Cretaceous crisis. We conclude that the diversity of planktonic foraminifera has been the result of a constant leakage of benthic foraminifera diversity into the plankton, continuously refueling the planktonic niche, and challenge the classical interpretation of the fossil record that suggests that Mesozoic planktonic foraminifera gave rise to the modern communities.


Subject(s)
Foraminifera , Foraminifera/genetics , Plankton/genetics , Extinction, Biological , Minor Planets , Fossils
9.
Elife ; 11, 2022 08 03.
Article in English | MEDLINE | ID: mdl-35920817

ABSTRACT

Biogeographical studies have traditionally focused on readily visible organisms, but recent technological advances are enabling analyses of the large-scale distribution of microscopic organisms, whose biogeographical patterns have long been debated. Here we assessed the global structure of plankton geography and its relation to the biological, chemical, and physical context of the ocean (the 'seascape') by analyzing metagenomes of plankton communities sampled across oceans during the Tara Oceans expedition, in light of environmental data and ocean current transport. Using a consistent approach across organismal sizes that provides unprecedented resolution to measure changes in genomic composition between communities, we report a pan-ocean, size-dependent plankton biogeography overlying regional heterogeneity. We found robust evidence for a basin-scale impact of transport by ocean currents on plankton biogeography, and on a characteristic timescale of community dynamics going beyond simple seasonality or life history transitions of plankton.


Oceans are brimming with life invisible to our eyes, a myriad of species of bacteria, viruses and other microscopic organisms essential for the health of the planet. These 'marine plankton' are unable to swim against currents and should therefore be constantly on the move, yet previous studies have suggested that distinct species of plankton may in fact inhabit different oceanic regions. However, proving this theory has been challenging; collecting plankton is logistically difficult, and it is often impossible to distinguish between species simply by examining them under a microscope. However, within the last decade, a research schooner called Tara has travelled the globe to gather thousands of plankton samples. At the same time, advances in genomics have made it possible to identify species based only on fragments of their DNA sequence. To understand the hidden geography of plankton communities in Earth's oceans, Richter et al. pored over DNA from the Tara Oceans expedition. This revealed that, despite being unable to resist the flow of water, various planktonic species which live close to the surface manage to occupy distinct, stable provinces shaped by currents. Different sizes of plankton are distributed in different sized provinces, with the smallest organisms tending to inhabit the smallest areas. Comparing DNA similarities and speeds of currents at the ocean surface revealed how these might stretch and mix plankton communities. Plankton play a critical role in the health of the ocean and the chemical cycles of planet Earth. These results could allow deeper investigation by marine modellers, ecologists, and evolutionary biologists. Meanwhile, work is already underway to investigate how climate change might impact this hidden geography.


Subject(s)
Ecosystem , Plankton , Genomics , Geography , Oceans and Seas , Plankton/genetics
11.
Elife ; 11, 2022 03 31.
Article in English | MEDLINE | ID: mdl-35356891

ABSTRACT

Genes of unknown function are among the biggest challenges in molecular biology, especially in microbial systems, where 40-60% of the predicted genes are unknown. Despite previous attempts, systematic approaches to include the unknown fraction into analytical workflows are still lacking. Here, we present a conceptual framework, its translation into the computational workflow AGNOSTOS and a demonstration on how we can bridge the known-unknown gap in genomes and metagenomes. By analyzing 415,971,742 genes predicted from 1749 metagenomes and 28,941 bacterial and archaeal genomes, we quantify the extent of the unknown fraction, its diversity, and its relevance across multiple organisms and environments. The unknown sequence space is exceptionally diverse, phylogenetically more conserved than the known fraction and predominantly taxonomically restricted at the species level. From the 71 M genes identified to be of unknown function, we compiled a collection of 283,874 lineage-specific genes of unknown function for Cand. Patescibacteria (also known as Candidate Phyla Radiation, CPR), which provides a significant resource to expand our understanding of their unusual biology. Finally, by identifying a target gene of unknown function for antibiotic resistance, we demonstrate how we can enable the generation of hypotheses that can be used to augment experimental data.


It is estimated that scientists do not know what half of microbial genes actually do. When these genes are discovered in microorganisms grown in the lab or found in environmental samples, it is not possible to identify what their roles are. Many of these genes are excluded from further analyses for these reasons, meaning that the study of microbial genes tends to be limited to genes that have already been described. These limitations hinder research into microbiology, because information from newly discovered genes cannot be integrated to better understand how these organisms work. Experiments to understand what role these genes have in the microorganisms are labor-intensive, so new analytical strategies are needed. To do this, Vanni et al. developed a new framework to categorize genes with unknown roles, and a computational workflow to integrate them into traditional analyses. When this approach was applied to over 400 million microbial genes (both with known and unknown roles), it showed that the share of genes with unknown functions is only about 30 per cent, smaller than previously thought. The analysis also showed that these genes are very diverse, revealing a huge space for future research and potential applications. Combining their approach with experimental data, Vanni et al. were able to identify a gene with a previously unknown purpose that could be involved in antibiotic resistance. This system could be useful for other scientists studying microorganisms to get a more complete view of microbial systems. In future, it may also be used to analyze the genetics of other organisms, such as plants and animals.


Subject(s)
Bacteria , Genome, Archaeal , Bacteria/genetics , Metagenome , Open Reading Frames
12.
Cell Genom ; 2(5): 100123, 2022 May 11.
Article in English | MEDLINE | ID: mdl-36778897

ABSTRACT

Marine planktonic eukaryotes play critical roles in global biogeochemical cycles and climate. However, their poor representation in culture collections limits our understanding of the evolutionary history and genomic underpinnings of planktonic ecosystems. Here, we used 280 billion Tara Oceans metagenomic reads from polar, temperate, and tropical sunlit oceans to reconstruct and manually curate more than 700 abundant and widespread eukaryotic environmental genomes ranging from 10 Mbp to 1.3 Gbp. This genomic resource covers a wide range of poorly characterized eukaryotic lineages that complement long-standing contributions from culture collections while better representing plankton in the upper layer of the oceans. We performed the first, to our knowledge, comprehensive genome-wide functional classification of abundant unicellular eukaryotic plankton, revealing four major groups connecting distantly related lineages. Neither trophic modes of plankton nor its vertical evolutionary history could completely explain the functional repertoire convergence of major eukaryotic lineages that coexisted within oceanic currents for millions of years.

13.
Microb Genom ; 7(12), 2021 12.
Article in English | MEDLINE | ID: mdl-34904945

ABSTRACT

Polyketide synthases (PKSs) and non-ribosomal peptide synthetases (NRPSs) are mega enzymes responsible for the biosynthesis of a large fraction of natural products (NPs). Molecular markers for biosynthetic genes, such as the ketosynthase (KS) domain of PKSs, have been used to assess the diversity and distribution of biosynthetic genes in complex microbial communities. More recently, metagenomic studies have complemented and enhanced this approach by allowing the recovery of complete biosynthetic gene clusters (BGCs) from environmental DNA. In this study, the distribution and diversity of biosynthetic genes and clusters from Arctic Ocean samples (NICE-2015 expedition), was assessed using PCR-based strategies coupled with high-throughput sequencing and metagenomic analysis. In total, 149 KS domain OTU sequences were recovered, 36 % of which could not be assigned to any known BGC. In addition, 74 bacterial metagenome-assembled genomes were recovered, from which 179 BGCs were extracted. A network analysis identified potential new NP families, including non-ribosomal peptides and polyketides. Complete or near-complete BGCs were recovered, which will enable future heterologous expression efforts to uncover the respective NPs. Our study represents the first report of biosynthetic diversity assessed for Arctic Ocean metagenomes and highlights the potential of Arctic Ocean planktonic microbiomes for the discovery of novel secondary metabolites. The strategy employed in this study will enable future bioprospection, by identifying promising samples for bacterial isolation efforts, while providing also full-length BGCs for heterologous expression.


Subject(s)
Bacteria/classification , Biosynthetic Pathways , Sequence Analysis, DNA/methods , Arctic Regions , Bacteria/genetics , Bacteria/isolation & purification , Bacteria/metabolism , Bacterial Proteins/genetics , High-Throughput Nucleotide Sequencing , Humans , Microbiota , Multigene Family , Oceans and Seas , Phylogeny , Secondary Metabolism , Water Microbiology
14.
Nature ; 600(7887): 86-92, 2021 12.
Article in English | MEDLINE | ID: mdl-34671161

ABSTRACT

During the last glacial-interglacial cycle, Arctic biotas experienced substantial climatic changes, yet the nature, extent and rate of their responses are not fully understood [1-8]. Here we report a large-scale environmental DNA metagenomic study of ancient plant and mammal communities, analysing 535 permafrost and lake sediment samples from across the Arctic spanning the past 50,000 years. Furthermore, we present 1,541 contemporary plant genome assemblies that were generated as reference sequences. Our study provides several insights into the long-term dynamics of the Arctic biota at the circumpolar and regional scales. Our key findings include: (1) a relatively homogeneous steppe-tundra flora dominated the Arctic during the Last Glacial Maximum, followed by regional divergence of vegetation during the Holocene epoch; (2) certain grazing animals consistently co-occurred in space and time; (3) humans appear to have been a minor factor in driving animal distributions; (4) higher effective precipitation, as well as an increase in the proportion of wetland plants, show negative effects on animal diversity; (5) the persistence of the steppe-tundra vegetation in northern Siberia enabled the late survival of several now-extinct megafauna species, including the woolly mammoth until 3.9 ± 0.2 thousand years ago (ka) and the woolly rhinoceros until 9.8 ± 0.2 ka; and (6) phylogenetic analysis of mammoth environmental DNA reveals a previously unsampled mitochondrial lineage. Our findings highlight the power of ancient environmental metagenomics analyses to advance understanding of population histories and long-term ecological dynamics.


Subject(s)
Biota , DNA, Ancient/analysis , DNA, Environmental/analysis , Metagenomics , Animals , Arctic Regions , Climate Change/history , Databases, Genetic , Datasets as Topic , Extinction, Biological , Geologic Sediments , Grassland , Greenland , Haplotypes/genetics , Herbivory/genetics , History, Ancient , Humans , Lakes , Mammoths , Mitochondria/genetics , Perissodactyla , Permafrost , Phylogeny , Plants/genetics , Population Dynamics , Rain , Siberia , Spatio-Temporal Analysis , Wetlands
15.
Sci Data ; 8(1): 31, 2021 01 26.
Article in English | MEDLINE | ID: mdl-33500403

ABSTRACT

Ancient DNA and RNA are valuable data sources for a wide range of disciplines. Within the field of ancient metagenomics, the number of published genetic datasets has risen dramatically in recent years, and tracking this data for reuse is particularly important for large-scale ecological and evolutionary studies of individual taxa and communities of both microbes and eukaryotes. AncientMetagenomeDir (archived at https://doi.org/10.5281/zenodo.3980833 ) is a collection of annotated metagenomic sample lists derived from published studies that provide basic, standardised metadata and accession numbers to allow rapid data retrieval from online repositories. These tables are community-curated and span multiple sub-disciplines to ensure adequate breadth and consensus in metadata definitions, as well as longevity of the database. Internal guidelines and automated checks facilitate compatibility with established sequence-read archives and term-ontologies, and ensure consistency and interoperability for future meta-analyses. This collection will also assist in standardising metadata reporting for future ancient metagenomic studies.


Subject(s)
Databases, Genetic , Metagenome , Metagenomics , Humans , Metadata , Publications
17.
Nat Microbiol ; 5(8): 1026-1039, 2020 08.
Article in English | MEDLINE | ID: mdl-32451471

ABSTRACT

Brown algae are important players in the global carbon cycle by fixing carbon dioxide into 1 Gt of biomass annually, yet the fate of fucoidan-their major cell wall polysaccharide-remains poorly understood. Microbial degradation of fucoidans is slower than that of other polysaccharides, suggesting that fucoidans are more recalcitrant and may sequester carbon in the ocean. This may be due to the complex, branched and highly sulfated structure of fucoidans, which also varies among species of brown algae. Here, we show that 'Lentimonas' sp. CC4, belonging to the Verrucomicrobia, acquired a remarkably complex machinery for the degradation of six different fucoidans. The strain accumulated 284 putative fucoidanases, including glycoside hydrolases, sulfatases and carbohydrate esterases, which are primarily located on a 0.89-megabase pair plasmid. Proteomics reveals that these enzymes assemble into substrate-specific pathways requiring about 100 enzymes per fucoidan from different species of brown algae. These enzymes depolymerize fucoidan into fucose, which is metabolized in a proteome-costly bacterial microcompartment that spatially constrains the metabolism of the toxic intermediate lactaldehyde. Marine metagenomes and microbial genomes show that Verrucomicrobia including 'Lentimonas' are abundant and highly specialized degraders of fucoidans and other complex polysaccharides. Overall, the complexity of the pathways underscores why fucoidans are probably recalcitrant and more slowly degraded, since only highly specialized organisms can effectively degrade them in the ocean.


Subject(s)
Phaeophyceae/metabolism , Polysaccharides/metabolism , Verrucomicrobia/enzymology , Verrucomicrobia/metabolism , Bacterial Proteins/metabolism , Cell Wall/metabolism , Esterases , Genes, Bacterial/genetics , Glycoside Hydrolases , Metabolic Networks and Pathways , Metagenome , Phylogeny , Proteome , Substrate Specificity , Sulfatases , Sulfates/metabolism , Transcriptome , United States , Verrucomicrobia/genetics , Verrucomicrobia/isolation & purification
18.
PeerJ ; 8: e8783, 2020.
Article in English | MEDLINE | ID: mdl-32231882

ABSTRACT

BACKGROUND: Microbial source tracking methods are used to determine the origin of contaminating bacteria and other microorganisms, particularly in contaminated water systems. The Bayesian SourceTracker approach uses deep-sequencing marker gene libraries (16S ribosomal RNA) to determine the proportional contributions of bacteria from many potential source environments to a given sink environment simultaneously. Since its development, SourceTracker has been applied to an extensive diversity of studies, from beach contamination to human behavior. METHODS: Here, we demonstrate a novel application of SourceTracker to work with metagenomic datasets and tested this approach using sink samples from a study of coastal marine environments. Source environment metagenomes were obtained from metagenomics studies of gut, freshwater, marine, sand and soil environments. As part of this effort, we implemented features for determining the stability of source proportion estimates, including precision visualizations for performance optimization, and performed domain-specific source-tracking analyses (i.e., Bacteria, Archaea, Eukaryota and viruses). We also applied SourceTracker to metagenomic libraries generated from samples collected from the International Space Station (ISS). RESULTS: SourceTracker proved highly effective at predicting the composition of known sources using shotgun metagenomic libraries. In addition, we showed that different taxonomic domains sometimes presented highly divergent pictures of environmental source origins for both the coastal marine and ISS samples. These findings indicated that applying SourceTracker to separate domains may provide a deeper understanding of the microbial origins of complex, mixed-source environments, and further suggested that certain domains may be preferable for tracking specific sources of contamination.

19.
Nat Chem Biol ; 16(1): 60-68, 2020 01.
Article in English | MEDLINE | ID: mdl-31768033

ABSTRACT

Genome mining has become a key technology to exploit natural product diversity. Although initially performed on a single-genome basis, the process is now being scaled up to mine entire genera, strain collections and microbiomes. However, no bioinformatic framework is currently available for effectively analyzing datasets of this size and complexity. In the present study, a streamlined computational workflow is provided, consisting of two new software tools: the 'biosynthetic gene similarity clustering and prospecting engine' (BiG-SCAPE), which facilitates fast and interactive sequence similarity network analysis of biosynthetic gene clusters and gene cluster families; and the 'core analysis of syntenic orthologues to prioritize natural product gene clusters' (CORASON), which elucidates phylogenetic relationships within and across these families. BiG-SCAPE is validated by correlating its output to metabolomic data across 363 actinobacterial strains and the discovery potential of CORASON is demonstrated by comprehensively mapping biosynthetic diversity across a range of detoxin/rimosamide-related gene cluster families, culminating in the characterization of seven detoxin analogues.


Subject(s)
Actinobacteria/genetics , Biosynthetic Pathways/genetics , Computational Biology/methods , Genome, Bacterial , Algorithms , Biological Products , Cluster Analysis , Data Mining/methods , Genomics , Metabolomics , Microbiota , Multigene Family , Phylogeny , Reproducibility of Results , Software
20.
BMC Bioinformatics ; 20(1): 453, 2019 Sep 05.
Article in English | MEDLINE | ID: mdl-31488068

ABSTRACT

BACKGROUND: Metagenomics caused a quantum leap in microbial ecology. However, the inherent size and complexity of metagenomic data limit its interpretation. The quantification of metagenomic traits in metagenomic analysis workflows has the potential to improve the exploitation of metagenomic data. Metagenomic traits are organisms' characteristics linked to their performance. They are measured at the genomic level taking a random sample of individuals in a community. As such, these traits provide valuable information to uncover microorganisms' ecological patterns. The Average Genome Size (AGS) and the 16S rRNA gene Average Copy Number (ACN) are two highly informative metagenomic traits that reflect microorganisms' ecological strategies as well as the environmental conditions they inhabit. RESULTS: Here, we present the ags.sh and acn.sh tools, which analytically derive the AGS and ACN metagenomic traits. These tools represent an advance on previous approaches to compute the AGS and ACN traits. Benchmarking shows that ags.sh is up to 11 times faster than state-of-the-art tools dedicated to the estimation of AGS. Both ags.sh and acn.sh show comparable or higher accuracy than existing tools used to estimate these traits. To exemplify the applicability of both tools, we analyzed the 139 prokaryotic metagenomes of TARA Oceans and revealed the ecological strategies associated with different water layers. CONCLUSION: We took advantage of recent advances in gene annotation to develop the ags.sh and acn.sh tools to combine easy tool usage with fast and accurate performance. Our tools compute the AGS and ACN metagenomic traits on unassembled metagenomes and allow researchers to improve their metagenomic data analysis to gain deeper insights into microorganisms' ecology. The ags.sh and acn.sh tools are publicly available using Docker container technology at https://github.com/pereiramemo/AGS-and-ACN-tools .


Subject(s)
Gene Dosage , Genome Size , Metagenome/genetics , Metagenomics/methods , RNA, Ribosomal, 16S/genetics , Benchmarking , DNA Copy Number Variations , Databases, Genetic , Oceans and Seas , Time Factors