1.
Nucleic Acids Res ; 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39011878

ABSTRACT

Genome search and/or classification typically involves finding the best-match database (reference) genomes and has become increasingly challenging due to the growing number of available database genomes and the fact that traditional methods do not scale well with large databases. By combining k-mer hashing-based probabilistic data structures (i.e. ProbMinHash, SuperMinHash, Densified MinHash and SetSketch) to estimate genomic distance, with a graph-based nearest neighbor search algorithm (Hierarchical Navigable Small World Graphs, or HNSW), we created a new data structure and developed an associated computer program, GSearch, that is orders of magnitude faster than alternative tools while maintaining high accuracy and low memory usage. For example, GSearch can search 8000 query genomes against all available microbial or viral genomes for their best matches (n = ∼318 000 or ∼3 000 000, respectively) within a few minutes on a personal laptop, using ∼6 GB of memory (2.5 GB via SetSketch). Notably, GSearch has an O(log(N)) time complexity and will scale well with billions of genomes based on a database splitting strategy. Further, GSearch implements a three-step search strategy depending on the degree of novelty of the query genomes to maximize specificity and sensitivity. Therefore, GSearch solves a major bottleneck of microbiome studies that require genome search and/or classification.
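
The idea of estimating genomic distance from hashed k-mers can be illustrated with a minimal sketch. It uses the classic MinHash scheme, not ProbMinHash, SuperMinHash, Densified MinHash or SetSketch, and it does not build an HNSW graph, so it should not be read as GSearch's implementation. The function names, the k-mer size and the number of hash functions are all assumptions chosen for illustration.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer, seed):
    """Deterministic 64-bit hash of a k-mer under a given seed (hypothetical helper)."""
    digest = hashlib.blake2b(
        kmer.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_sketch(seq, k=4, num_hashes=64):
    """Classic MinHash: keep the smallest hash value per seeded hash function."""
    ks = kmers(seq, k)
    if not ks:
        raise ValueError("sequence is shorter than k")
    return [min(_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def jaccard_estimate(sketch_a, sketch_b):
    """Fraction of sketch positions that agree; estimates k-mer Jaccard similarity."""
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches / len(sketch_a)


if __name__ == "__main__":
    a = minhash_sketch("ACGTACGTACGTTGCA")
    b = minhash_sketch("ACGTACGTACGTTGCC")
    print(jaccard_estimate(a, b))
```

Comparing fixed-size sketches rather than full k-mer sets is what keeps each pairwise comparison cheap; a nearest-neighbor index such as HNSW would then avoid comparing a query against every database sketch.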

2.
Phytopathology ; 113(8): 1387-1393, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37081724

ABSTRACT

Strains of Xanthomonas citri pv. malvacearum cause bacterial blight of cotton, a potentially serious threat to cotton production worldwide, including in sub-Saharan countries. Development of disease symptoms, such as water soaking, has been linked to the activity of a class of type 3 effectors, called transcription activator-like (TAL) effectors, which induce susceptibility genes in the host's cells. To gain further insight into the global diversity of the pathogen, to elucidate their repertoires of TAL effector genes, and to better understand the evolution of these genes in the cotton-pathogenic xanthomonads, we sequenced the genomes of three African strains of X. citri pv. malvacearum using nanopore technology. We show that the cotton-pathogenic pathovar of X. citri is a monophyletic lineage containing at least three distinct genetic subclades, which appear to be mirrored by their repertoires of TAL effectors. We observed an atypical level of TAL effector gene pseudogenization, which might be related to resistance genes that are deployed to control the disease. Our work thus contributes to a better understanding of the conservation and importance of TAL effectors in the interaction with the host plant, which can inform strategies for improving resistance against bacterial blight in cotton.

3.
Article in English | MEDLINE | ID: mdl-36125864

ABSTRACT

Thousands of new bacterial and archaeal species and higher-level taxa are discovered each year through the analysis of genomes and metagenomes. The Genome Taxonomy Database (GTDB) provides hierarchical sequence-based descriptions and classifications for new and as-yet-unnamed taxa. However, bacterial nomenclature, as currently configured, cannot keep up with the need for new well-formed names. Instead, microbiologists have been forced to use hard-to-remember alphanumeric placeholder labels. Here, we exploit an approach to the generation of well-formed arbitrary Latinate names at a scale sufficient to name tens of thousands of unnamed taxa within GTDB. These newly created names represent an important resource for the microbiology community, facilitating communication between bioinformaticians, microbiologists and taxonomists, while populating the emerging landscape of microbial taxonomic and functional discovery with accessible and memorable linguistic labels.


Subjects
Archaea, Fatty Acids, Archaea/genetics, Bacteria/genetics, Bacterial Typing Techniques, Base Composition, Bacterial DNA/genetics, Fatty Acids/chemistry, Phylogeny, 16S Ribosomal RNA/genetics, DNA Sequence Analysis
4.
Nature ; 536(7615): 179-83, 2016 08 11.
Article in English | MEDLINE | ID: mdl-27487207

ABSTRACT

Bacteria of the SAR11 clade constitute up to one half of all microbial cells in the oxygen-rich surface ocean. SAR11 bacteria are also abundant in oxygen minimum zones (OMZs), where oxygen falls below detection and anaerobic microbes have vital roles in converting bioavailable nitrogen to N2 gas. Anaerobic metabolism has not yet been observed in SAR11, and it remains unknown how these bacteria contribute to OMZ biogeochemical cycling. Here, genomic analysis of single cells from the world's largest OMZ revealed previously uncharacterized SAR11 lineages with adaptations for life without oxygen, including genes for respiratory nitrate reductases (Nar). SAR11 nar genes were experimentally verified to encode proteins catalysing the nitrite-producing first step of denitrification and constituted ~40% of OMZ nar transcripts, with transcription peaking in the anoxic zone of maximum nitrate reduction activity. These results link SAR11 to pathways of ocean nitrogen loss, redefining the ecological niche of Earth's most abundant organismal group.


Subjects
Alphaproteobacteria/classification, Alphaproteobacteria/metabolism, Aquatic Organisms/metabolism, Nitrogen/analysis, Oceans and Seas, Oxygen/analysis, Seawater/chemistry, Physiological Adaptation/genetics, Alphaproteobacteria/genetics, Alphaproteobacteria/isolation & purification, Anaerobiosis/genetics, Aquatic Organisms/enzymology, Aquatic Organisms/genetics, Aquatic Organisms/isolation & purification, Denitrification, Gene Expression Profiling, Bacterial Genes, Bacterial Genome/genetics, Nitrate Reductases/genetics, Nitrate Reductases/metabolism, Nitrates/metabolism, Nitrites/metabolism, Nitrogen/metabolism, Oxidation-Reduction, Oxygen/metabolism, Phylogeny, Single-Cell Analysis, Genetic Transcription
5.
Appl Environ Microbiol ; 87(6)2021 02 26.
Article in English | MEDLINE | ID: mdl-33452027

ABSTRACT

The recovery of metagenome-assembled genomes (MAGs) from metagenomic data has recently become a common task for microbial studies. The strengths and limitations of the underlying bioinformatics algorithms are well appreciated by now based on performance tests with mock data sets of known composition. However, these mock data sets do not capture the complexity and diversity often observed within natural populations, since their construction typically relies on only a single genome of a given organism. Further, it remains unclear if MAGs can recover population-variable genes (those shared by >10% but <90% of the members of the population) as efficiently as core genes (those shared by >90% of the members). To address these issues, we compared the gene variabilities of pathogenic Escherichia coli isolates from eight diarrheal samples, for which the isolate was the causative agent, against their corresponding MAGs recovered from the companion metagenomic data set. Our analysis revealed that MAGs with completeness estimates near 95% captured only 77% of the population core genes and 50% of the variable genes, on average. Further, about 5% of the genes of these MAGs were conservatively identified as missing in the isolate and were of different (non-Enterobacteriaceae) taxonomic origin, suggesting errors at the genome-binning step, even though contamination estimates based on commonly used pipelines were only 1.5%. Therefore, the quality of MAGs may often be worse than estimated, and we offer examples of how to recognize and improve such MAGs to sufficient quality by (for instance) employing only contigs longer than 1,000 bp for binning.

IMPORTANCE: Metagenome assembly and the recovery of metagenome-assembled genomes (MAGs) have recently become common tasks for microbiome studies across environmental and clinical settings. However, the extent to which MAGs can capture the genes of the population they represent remains speculative. Current approaches to evaluating MAG quality are limited to the recovery and copy number of universal housekeeping genes, which represent a small fraction of the total genome, leaving the majority of the genome essentially inaccessible. If MAG quality in reality is lower than these approaches would estimate, this could have dramatic consequences for all downstream analyses and interpretations. In this study, we evaluated this issue using an approach that employed comparisons of the gene contents of MAGs to the gene contents of isolate genomes derived from the same sample. Further, our samples originated from a diarrhea case-control study, and thus, our results are relevant for recovering the virulence factors of pathogens from metagenomic data sets.
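
The core/variable definitions in this abstract (core: shared by >90% of members; variable: shared by >10% but <90%) can be turned into a small worked example. The sketch below is illustrative only: the input layout (a gene-to-presence-calls mapping), the function names and the toy data are assumptions, and the authors' actual pipeline is not described in the abstract.

```python
def classify_genes(presence, core_cutoff=0.90, variable_min=0.10):
    """Label each gene by the fraction of population genomes carrying it.

    presence maps a gene name to a list of booleans, one per genome
    (hypothetical input format). Genes at exactly 10% or 90% prevalence fall
    outside both definitions quoted in the abstract and are labelled "other".
    """
    classes = {}
    for gene, calls in presence.items():
        freq = sum(calls) / len(calls)
        if freq > core_cutoff:
            classes[gene] = "core"
        elif variable_min < freq < core_cutoff:
            classes[gene] = "variable"
        else:
            classes[gene] = "other"
    return classes


def recovery(classes, mag_genes, label):
    """Fraction of genes with the given label that are present in the MAG."""
    targets = [g for g, c in classes.items() if c == label]
    if not targets:
        return float("nan")
    return sum(1 for g in targets if g in mag_genes) / len(targets)


if __name__ == "__main__":
    # Toy population of 10 isolate genomes (made-up data).
    presence = {
        "geneA": [True] * 10,               # 100% -> core
        "geneB": [True] * 5 + [False] * 5,  # 50%  -> variable
        "geneC": [True] + [False] * 9,      # 10%  -> other
    }
    classes = classify_genes(presence)
    mag = {"geneA"}
    print(classes, recovery(classes, mag, "core"), recovery(classes, mag, "variable"))
```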


Subjects
Escherichia coli/genetics, Feces/microbiology, Bacterial Genome, Escherichia coli/isolation & purification, Humans, Metagenome
6.
Environ Microbiol ; 22(8): 3394-3412, 2020 08.
Article in English | MEDLINE | ID: mdl-32495495

ABSTRACT

Recent advances in sequencing technology and bioinformatic pipelines have allowed unprecedented access to the genomes of yet-uncultivated microorganisms from diverse environments. However, the catalogue of freshwater genomes remains limited, and most genome recovery attempts in freshwater ecosystems have only targeted specific taxa. Here, we present a genome recovery pipeline incorporating iterative subtractive binning, and apply it to a time series of 100 metagenomic datasets from seven connected lakes and estuaries along the Chattahoochee River (Southeastern USA). Our set of metagenome-assembled genomes (MAGs) represents >400 yet-unnamed genomospecies, substantially increasing the number of high-quality MAGs from freshwater lakes. We propose names for two novel species: 'Candidatus Elulimicrobium humile' ('Ca. Elulimicrobiota', 'Patescibacteria') and 'Candidatus Aquidulcis frankliniae' ('Chloroflexi'). Collectively, our MAGs represented about half of the total microbial community at any sampling point. To evaluate the prevalence of these genomospecies in the chronoseries, we introduce methodologies to estimate relative abundance and habitat preference that control for uneven genome quality and sample representation. We demonstrate high degrees of habitat-specialization and endemicity for most genomospecies in the Chattahoochee lakes. Wider ecological ranges characterized smaller genomes with higher coding densities, indicating an overall advantage of smaller, more compact genomes for cosmopolitan distributions.


Subjects
Chloroflexi/classification, Chloroflexi/isolation & purification, Bacterial Genome/genetics, Lakes/microbiology, Chloroflexi/genetics, Genetic Databases, Metagenome/genetics, Metagenomics, Microbiota/genetics
7.
Environ Microbiol ; 22(6): 2094-2106, 2020 06.
Article in English | MEDLINE | ID: mdl-32114693

ABSTRACT

Microbial communities ultimately control the fate of petroleum hydrocarbons (PHCs) that enter the natural environment, but the interactions of microbes with PHCs and the environment are highly complex and poorly understood. Genome-resolved metagenomics can help unravel these complex interactions. However, the lack of a comprehensive database that integrates existing genomic/metagenomic data from oil environments with physicochemical parameters known to regulate the fate of PHCs currently limits data analysis and interpretations. Here, we curated a comprehensive, searchable database that documents microbial populations in natural oil ecosystems and oil spills, along with available underlying physicochemical data, geocoded via geographic information system to reveal their geographic distribution patterns. Analysis of the ~2000 metagenome-assembled genomes (MAGs) available in the database revealed strong ecological niche specialization within habitats. Over 95% of the recovered MAGs represented novel taxa underscoring the limited representation of cultured organisms from oil-contaminated and oil reservoir ecosystems. The majority of MAGs linked to oil-contaminated ecosystems were detectable in non-oiled samples from the Gulf of Mexico but not in comparable samples from elsewhere, indicating that the Gulf is primed for oil biodegradation. The repository should facilitate future work toward a predictive understanding of the microbial taxa and their activities that control the fate of oil spills.


Subjects
Environmental Biodegradation, Genetic Databases, Oil and Gas Fields/microbiology, Petroleum Pollution/analysis, Petroleum/microbiology, Gulf of Mexico, Hydrocarbons/metabolism, Metagenome/genetics, Metagenomics, Microbiota/genetics, Petroleum/metabolism
8.
Appl Environ Microbiol ; 86(6)2020 03 02.
Article in English | MEDLINE | ID: mdl-31924621

ABSTRACT

Little is known about the public health risks associated with natural creek sediments that are affected by runoff and fecal pollution from agricultural and livestock practices. For instance, the persistence of foodborne pathogens such as Shiga toxin-producing Escherichia coli (STEC) originating from these practices remains poorly quantified. Towards closing these knowledge gaps, the water-sediment interface of two creeks in the Salinas River Valley of California was sampled over a 9-month period using metagenomics and traditional culture-based tests for STEC. Our results revealed that these sediment communities are extremely diverse and have functional and taxonomic diversity comparable to that observed in soils. With our sequencing effort (∼4 Gbp per library), we were unable to detect any pathogenic E. coli in the metagenomes of 11 samples that had tested positive using culture-based methods, apparently due to relatively low abundance. Furthermore, there were no significant differences in the abundance of human- or cow-specific gut microbiome sequences in the downstream impacted sites compared to that in upstream more pristine (control) sites, indicating natural dilution of anthropogenic inputs. Notably, the high number of metagenomic reads carrying antibiotic resistance genes (ARGs) found in all samples was significantly higher than ARG reads in other available freshwater and soil metagenomes, suggesting that these communities may be natural reservoirs of ARGs. The work presented here should serve as a guide for sampling volumes, amount of sequencing to apply, and what bioinformatics analyses to perform when using metagenomics for public health risk studies of environmental samples such as sediments.

IMPORTANCE: Current agricultural and livestock practices contribute to fecal contamination in the environment and the spread of food- and waterborne disease and antibiotic resistance genes (ARGs). Traditionally, the level of pollution and risk to public health are assessed by culture-based tests for the intestinal bacterium Escherichia coli. However, the accuracy of these traditional methods (e.g., low accuracy in quantification, and false-positive signal when PCR based) and their suitability for sediments remain unclear. We collected sediments for a time series metagenomics study from one of the most highly productive agricultural regions in the United States in order to assess how agricultural runoff affects the native microbial communities and if the presence of Shiga toxin-producing Escherichia coli (STEC) in sediment samples can be detected directly by sequencing. Our study provided important information on the potential for using metagenomics as a tool for assessment of public health risk in natural environments.


Subjects
Geologic Sediments/microbiology, Metagenomics, Public Health/methods, Risk Assessment/methods, Shiga-Toxigenic Escherichia coli/isolation & purification, Agriculture, Animal Husbandry, Animals, California, Livestock, Rivers/microbiology, Water Pollution
9.
Nucleic Acids Res ; 46(W1): W282-W288, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29905870

ABSTRACT

The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of prokaryotic species and communities but it offers limited resolution at the species and finer levels, and cannot represent the whole-genome diversity and fluidity. To overcome these limitations, we introduced the Microbial Genomes Atlas (MiGA), a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity (ANI/AAI) concepts. MiGA integrates best practices in sequence quality trimming and assembly and allows input to be raw reads or assemblies from isolate genomes, single-cell sequences, and metagenome-assembled genomes (MAGs). Further, MiGA can take as input hundreds of closely related genomes of the same or closely related species (a so-called 'Clade Project') to assess their gene content diversity and evolutionary relationships, and calculate important clade properties such as the pangenome and core gene sets. Therefore, MiGA is expected to facilitate a range of genome-based taxonomic and diversity studies, and quality assessment across environmental and clinical settings. MiGA is available at http://microbial-genomes.org/.


Subjects
Genomics, Internet, 16S Ribosomal RNA/genetics, Software, Classification, Genetic Variation/genetics, Archaeal Genome/genetics, Bacterial Genome/genetics, Phylogeny
10.
Nucleic Acids Res ; 45(3): e14, 2017 02 17.
Article in English | MEDLINE | ID: mdl-28180325

ABSTRACT

Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds resulting in an unknown number of false positive and negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences and thus, differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, mediating oxidation of ammonia and the reduction of the potent greenhouse gas, N2O, to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted 'atypical' nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes and the abundance of bacterial amoA was quantified against the highly-related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes.
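
To make the contrast with a fixed e-value cutoff concrete, the sketch below picks a score threshold separately for each window from labeled simulated reads. It selects the cutoff that maximizes Youden's J (true-positive rate minus false-positive rate), which is one common way to choose a point on a ROC curve; the abstract does not say which criterion ROCker applies, so this choice, the function names and the input layout are all assumptions.

```python
def best_threshold(scores, labels):
    """Return (cutoff, J) where cutoff maximizes TPR - FPR for hits with score >= cutoff.

    scores: alignment scores of simulated reads mapped in one window.
    labels: True if the read truly derives from the target gene, else False.
    """
    pos = sum(1 for l in labels if l)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need both true and false examples in the window")
    best_j, best_t = -1.0, None
    for t in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and not l)
        j = tp / pos - fp / neg
        if j > best_j:
            best_j, best_t = j, t
    return best_t, best_j


def thresholds_per_window(windows):
    """Map each window id to its own cutoff (windows: id -> (scores, labels))."""
    return {w: best_threshold(s, l)[0] for w, (s, l) in windows.items()}


if __name__ == "__main__":
    # Made-up scores for two windows along a target protein.
    windows = {
        "1-20": ([10, 20, 30, 40, 50, 60], [False, False, False, True, True, True]),
        "21-40": ([15, 25, 35, 45], [False, True, True, True]),
    }
    print(thresholds_per_window(windows))
```

The point of the per-window mapping is that a conserved, non-discriminative domain can receive a stricter cutoff than a region unique to the target protein.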


Subjects
Metagenomics/statistics & numerical data, Aquatic Organisms/genetics, Computational Biology/methods, Genetic Databases/statistics & numerical data, Ecosystem, Microbial Consortia/genetics, Phylogeny, ROC Curve, Soil Microbiology
11.
Appl Environ Microbiol ; 84(6)2018 03 15.
Article in English | MEDLINE | ID: mdl-29305502

ABSTRACT

The most common practice in studying and cataloguing prokaryotic diversity involves the grouping of sequences into operational taxonomic units (OTUs) at the 97% 16S rRNA gene sequence identity level, often using partial gene sequences, such as PCR-generated amplicons. Due to the high sequence conservation of rRNA genes, organisms belonging to closely related yet distinct species may be grouped under the same OTU. However, it remains unclear how much diversity has been underestimated by this practice. To address this question, we compared the OTUs of genomes defined at the 97% or 98.5% 16S rRNA gene identity level against OTUs of the same genomes defined at the 95% whole-genome average nucleotide identity (ANI), which is a much more accurate proxy for species. Our results show that OTUs resulting from a 98.5% 16S rRNA gene identity cutoff are more accurate than 97% compared to 95% ANI (90.5% versus 89.9% accuracy) but indistinguishable from any other threshold in the 98.29 to 98.78% range. Even with the more stringent thresholds, however, the 16S rRNA gene-based approach commonly underestimates the number of OTUs by ∼12%, on average, compared to the ANI-based approach (∼14% underestimation when using the 97% identity threshold). More importantly, the degree of underestimation can become 50% or more for certain taxa, such as the genera Pseudomonas, Burkholderia, Escherichia, Campylobacter, and Citrobacter. These results provide a quantitative view of the degree of underestimation of extant prokaryotic diversity by 16S rRNA gene-defined OTUs and suggest that genomic resolution is often necessary.

IMPORTANCE: Species diversity is one of the most fundamental pieces of information for community ecology and conservational biology. Therefore, accurate proxies for what constitutes a species or unit of diversity are cornerstones of a large set of microbial ecology and diversity studies. The most common proxies currently used rely on the clustering of 16S rRNA gene sequences at some threshold of nucleotide identity, typically 97% or 98.5%. Here, we explore how well this strategy reflects the more accurate whole-genome-based proxies and determine the frequency with which the high conservation of 16S rRNA sequences masks substantial species-level diversity.
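
The underestimation described here can be reproduced on a toy example by clustering the same genomes twice, once with 16S rRNA identities at a 97% cutoff and once with ANI at a 95% cutoff, and comparing the cluster counts. The sketch uses single-linkage clustering through a union-find structure; the abstract does not state which clustering algorithm the authors used, so that choice, the function names and the identity values are assumptions made for illustration.

```python
def count_clusters(n, pairs, cutoff):
    """Count single-linkage clusters among n genomes.

    pairs maps an index pair (i, j) to a percent identity; pairs at or above
    the cutoff are merged. Pairs absent from the mapping are never merged.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), identity in pairs.items():
        if identity >= cutoff:
            parent[find(i)] = find(j)
    return len({find(x) for x in range(n)})


def underestimation(n_marker, n_genome):
    """Relative shortfall of marker-gene OTUs versus whole-genome OTUs."""
    return 1.0 - n_marker / n_genome


if __name__ == "__main__":
    # Made-up identities for four genomes.
    rrna_16s = {(0, 1): 99.5, (1, 2): 98.0, (2, 3): 90.0}
    ani = {(0, 1): 97.0, (1, 2): 88.0, (2, 3): 75.0}
    otus_16s = count_clusters(4, rrna_16s, 97.0)
    otus_ani = count_clusters(4, ani, 95.0)
    print(otus_16s, otus_ani, underestimation(otus_16s, otus_ani))
```

In this toy data, genomes 1 and 2 share 98% 16S identity but only 88% ANI, so the 16S cutoff merges two ANI-defined species into one OTU.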


Subjects
Bacteria/classification, Bacterial Genome, Microbiota, RNA Sequence Analysis/methods, Bacteria/genetics, 16S Ribosomal RNA/analysis
13.
Appl Environ Microbiol ; 83(8)2017 04 15.
Article in English | MEDLINE | ID: mdl-28258138

ABSTRACT

A single liter of water contains hundreds, if not thousands, of bacterial and archaeal species, each of which typically makes up a very small fraction of the total microbial community (<0.1%), the so-called "rare biosphere." How often, and via what mechanisms, e.g., clonal amplification versus horizontal gene transfer, the rare taxa and genes contribute to microbial community response to environmental perturbations represent important unanswered questions toward better understanding the value and modeling of microbial diversity. We tested whether rare species frequently responded to changing environmental conditions by establishing 20-liter planktonic mesocosms with water from Lake Lanier (Georgia, USA) and perturbing them with organic compounds that are rarely detected in the lake, including 2,4-dichlorophenoxyacetic acid (2,4-D), 4-nitrophenol (4-NP), and caffeine. The populations of the degraders of these compounds were initially below the detection limit of quantitative PCR (qPCR) or metagenomic sequencing methods, but they increased substantially in abundance after perturbation. Sequencing of several degraders (isolates) and time-series metagenomic data sets revealed distinct co-occurring alleles of degradation genes, frequently carried on transmissible plasmids, especially for the 2,4-D mesocosms, and distinct species dominating the post-enrichment microbial communities from each replicated mesocosm. This diversity of species and genes also underlies distinct degradation profiles among replicated mesocosms. Collectively, these results supported the hypothesis that the rare biosphere can serve as a genetic reservoir, which can be frequently missed by metagenomics but enables community response to changing environmental conditions caused by organic pollutants, and they provided insights into the size of the pool of rare genes and species.

IMPORTANCE: A single liter of water or gram of soil contains hundreds of low-abundance bacterial and archaeal species, the so-called rare biosphere. The value of this astonishing biodiversity for ecosystem functioning remains poorly understood, primarily due to the fact that microbial community analysis frequently focuses on abundant organisms. Using a combination of culture-dependent and culture-independent (metagenomics) techniques, we showed that rare taxa and genes commonly contribute to the microbial community response to organic pollutants. Our findings should have implications for future studies that aim to study the role of rare species in environmental processes, including environmental bioremediation efforts of oil spills or other contaminants.


Subjects
Biodiversity, Ecosystem, Fresh Water/microbiology, Microbial Consortia/physiology, Chemical Water Pollutants/metabolism, Chemical Water Pollutants/pharmacology, 2,4-Dichlorophenoxyacetic Acid/metabolism, 2,4-Dichlorophenoxyacetic Acid/pharmacology, Archaea/classification, Archaea/genetics, Archaea/metabolism, Bacteria/classification, Bacteria/genetics, Bacteria/metabolism, Environmental Biodegradation, Caffeine/metabolism, Caffeine/pharmacology, Georgia, Lakes/microbiology, Metagenomics, Microbial Consortia/drug effects, Microbial Consortia/genetics, Nitrophenols/metabolism, Nitrophenols/pharmacology, Phylogeny, 16S Ribosomal RNA, Real-Time Polymerase Chain Reaction, Chemical Water Pollutants/chemistry
14.
Appl Environ Microbiol ; 82(9): 2872-2883, 2016 May.
Article in English | MEDLINE | ID: mdl-26969701

ABSTRACT

Although the source of drinking water (DW) used in hospitals is commonly disinfected, biofilms forming on water pipelines are a refuge for bacteria, including possible pathogens that survive different disinfection strategies. These biofilm communities are only beginning to be explored by culture-independent techniques that circumvent the limitations of conventional monitoring efforts. Hence, theories regarding the frequency of opportunistic pathogens in DW biofilms and how biofilm members withstand high doses of disinfectants and/or chlorine residuals in the water supply remain speculative. The aim of this study was to characterize the composition of microbial communities growing on five hospital shower hoses using both 16S rRNA gene sequencing of bacterial isolates and whole-genome shotgun metagenome sequencing. The resulting data revealed a Mycobacterium-like population, closely related to Mycobacterium rhodesiae and Mycobacterium tusciae, to be the predominant taxon in all five samples, and its nearly complete draft genome sequence was recovered. In contrast, the fraction recovered by culture was mostly affiliated with Proteobacteria, including members of the genera Sphingomonas, Blastomonas, and Porphyrobacter. The biofilm community harbored genes related to disinfectant tolerance (2.34% of the total annotated proteins) and a lower abundance of virulence determinants related to colonization and evasion of the host immune system. Additionally, genes potentially conferring resistance to β-lactam, aminoglycoside, amphenicol, and quinolone antibiotics were detected. Collectively, our results underscore the need to understand the microbiome of DW biofilms using metagenomic approaches. This information might lead to more robust management practices that minimize the risks associated with exposure to opportunistic pathogens in hospitals.


Subjects
Bacterial Physiological Phenomena, Biofilms/growth & development, Cross Infection/genetics, Cross Infection/microbiology, Hospitals, Water Microbiology, Bacteria/classification, Bacteria/genetics, Bacteria/isolation & purification, Bacteria/pathogenicity, Chlorine, Culture Techniques, Bacterial DNA/analysis, Disinfectants/pharmacology, Disinfection, Bacterial Drug Resistance, Bacterial Genome, Metagenome, Microbiota/genetics, Mycobacterium/physiology, Ohio, Phylogeny, Proteobacteria/physiology, 16S Ribosomal RNA/genetics, Sphingomonadaceae/physiology, Water Supply
15.
Nucleic Acids Res ; 42(8): e73, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24589583

ABSTRACT

Determining the taxonomic affiliation of sequences assembled from metagenomes remains a major bottleneck that affects research across the fields of environmental, clinical and evolutionary microbiology. Here, we introduce MyTaxa, a homology-based bioinformatics framework to classify metagenomic and genomic sequences with unprecedented accuracy. The distinguishing aspect of MyTaxa is that it employs all genes present in an unknown sequence as classifiers, weighting each gene based on its (predetermined) classifying power at a given taxonomic level and frequency of horizontal gene transfer. MyTaxa also implements a novel classification scheme based on the genome-aggregate average amino acid identity concept to determine the degree of novelty of sequences representing uncharacterized taxa, i.e. whether they represent novel species, genera or phyla. Application of MyTaxa on in silico generated (mock) and real metagenomes of varied read length (100-2000 bp) revealed that it correctly classified at least 5% more sequences than any other tool. The analysis also showed that ∼10% of the assembled sequences from human gut metagenomes represent novel species with no sequenced representatives, several of which were highly abundant in situ such as members of the Prevotella genus. Thus, MyTaxa can find several important applications in microbial identification and diversity studies.
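
The idea of using every gene on a sequence as a weighted classifier can be sketched as a weighted vote. The abstract does not give MyTaxa's scoring formula, so the simple additive scheme below, the function name and the example weights are assumptions for illustration only; each weight stands in for a gene's predetermined classifying power at the taxonomic level of interest.

```python
from collections import defaultdict


def classify_sequence(gene_hits):
    """Pick the taxon with the largest summed gene weight.

    gene_hits: list of (taxon, weight) pairs, one per gene on the sequence,
    where taxon is the affiliation of that gene's best match.
    Returns (best_taxon, share_of_total_weight), or (None, 0.0) with no signal.
    """
    totals = defaultdict(float)
    for taxon, weight in gene_hits:
        totals[taxon] += weight
    total = sum(totals.values())
    if total == 0:
        return None, 0.0
    best = max(totals, key=totals.get)
    return best, totals[best] / total


if __name__ == "__main__":
    # Made-up hits for four genes on one contig.
    hits = [
        ("Prevotella", 0.9),
        ("Bacteroides", 0.3),
        ("Prevotella", 0.5),
        ("Bacteroides", 0.4),
    ]
    print(classify_sequence(hits))
```

A low winning share could be treated as a sign that the sequence is chimeric or represents a taxon with no close sequenced relative, though how MyTaxa handles such cases is not covered by the abstract.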


Subjects
Genomics/methods, Metagenomics/methods, Phylogeny, Algorithms, Classification/methods, Genes, Humans, Microbiota, Software
16.
Proc Natl Acad Sci U S A ; 110(7): 2575-80, 2013 Feb 12.
Article in English | MEDLINE | ID: mdl-23359712

ABSTRACT

The composition and prevalence of microorganisms in the middle-to-upper troposphere (8-15 km altitude) and their role in aerosol-cloud-precipitation interactions represent important, unresolved questions for biological and atmospheric science. In particular, airborne microorganisms above the oceans remain essentially uncharacterized, as most work to date is restricted to samples taken near the Earth's surface. Here we report on the microbiome of low- and high-altitude air masses sampled onboard the National Aeronautics and Space Administration DC-8 platform during the 2010 Genesis and Rapid Intensification Processes campaign in the Caribbean Sea. The samples were collected in cloudy and cloud-free air masses before, during, and after two major tropical hurricanes, Earl and Karl. Quantitative PCR and microscopy revealed that viable bacterial cells represented on average around 20% of the total particles in the 0.25- to 1-µm diameter range and were at least an order of magnitude more abundant than fungal cells, suggesting that bacteria represent an important and underestimated fraction of micrometer-sized atmospheric aerosols. The samples from the two hurricanes were characterized by significantly different bacterial communities, revealing that hurricanes aerosolize a large amount of new cells. Nonetheless, 17 bacterial taxa, including taxa that are known to use C1-C4 carbon compounds present in the atmosphere, were found in all samples, indicating that these organisms possess traits that allow survival in the troposphere. The findings presented here suggest that the microbiome is a dynamic and underappreciated aspect of the upper troposphere with potentially important impacts on the hydrological cycle, clouds, and climate.


Subjects
Air Microbiology, Atmosphere, Biodiversity, Cyclonic Storms, Metagenome/genetics, Altitude, Analysis of Variance, Caribbean Region, Phylogeography, DNA Sequence Analysis, Species Specificity
17.
Bioinformatics ; 30(5): 629-35, 2014 Mar 01.
Article in English | MEDLINE | ID: mdl-24123672

ABSTRACT

MOTIVATION: Determining the fraction of the diversity within a microbial community sampled and the amount of sequencing required to cover the total diversity represent challenging issues for metagenomics studies. Owing to these limitations, central ecological questions with respect to the global distribution of microbes and the functional diversity of their communities cannot be robustly assessed. RESULTS: We introduce Nonpareil, a method to estimate and project coverage in metagenomes. Nonpareil does not rely on high-quality assemblies, operational taxonomic unit calling or comprehensive reference databases; thus, it is broadly applicable to metagenomic studies. Application of Nonpareil on available metagenomic datasets provided estimates on the relative complexity of soil, freshwater and human microbiome communities, and suggested that ∼200 Gb of sequencing data are required for 95% abundance-weighted average coverage of the soil communities analyzed. AVAILABILITY AND IMPLEMENTATION: Nonpareil is available at https://github.com/lmrodriguezr/nonpareil/ under the Artistic License 2.0.


Subjects
Metagenomics/methods, Algorithms, Metagenome, Microbiota, Soil Microbiology
18.
Appl Environ Microbiol ; 80(5): 1777-86, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24375144

ABSTRACT

Soil microbial communities are extremely complex, being composed of thousands of low-abundance species (<0.1% of total). How such complex communities respond to natural or human-induced fluctuations, including major perturbations such as global climate change, remains poorly understood, severely limiting our predictive ability for soil ecosystem functioning and resilience. In this study, we compared 12 whole-community shotgun metagenomic data sets from a grassland soil in the Midwestern United States, half representing soil that had undergone infrared warming by 2°C for 10 years, which simulated the effects of climate change, and the other half representing the adjacent soil that received no warming and thus, served as controls. Our analyses revealed that the heated communities showed significant shifts in composition and predicted metabolism, and these shifts were community wide as opposed to being attributable to a few taxa. Key metabolic pathways related to carbon turnover, such as cellulose degradation (∼13%) and CO2 production (∼10%), and to nitrogen cycling, including denitrification (∼12%), were enriched under warming, which was consistent with independent physicochemical measurements. These community shifts were interlinked, in part, with higher primary productivity of the aboveground plant communities stimulated by warming, revealing that most of the additional, plant-derived soil carbon was likely respired by microbial activity. Warming also enriched for a higher abundance of sporulation genes and genomes with higher G+C content. Collectively, our results indicate that microbial communities of temperate grassland soils play important roles in mediating feedback responses to climate change and advance the understanding of the molecular mechanisms of community adaptation to environmental perturbations.


Subjects
Biota/radiation effects, Global Warming, Metagenomics, Soil Microbiology, Carbon/metabolism, Humans, Metabolic Networks and Pathways, Midwestern United States, Nitrogen/metabolism
19.
Syst Appl Microbiol ; 47(2-3): 126498, 2024 May.
Article in English | MEDLINE | ID: mdl-38442686

ABSTRACT

Codes of nomenclature that provide well-regulated and stable frameworks for the naming of taxa are a fundamental underpinning of biological research. These Codes themselves require systems that govern their administration, interpretation and emendment. Here we review the provisions that have been made for the governance of the recently introduced Code of Nomenclature of Prokaryotes Described from Sequence Data (SeqCode), which provides a nomenclatural framework for the valid publication of names of Archaea and Bacteria using isolate genome, metagenome-assembled genome or single-amplified genome sequences as type material. The administrative structures supporting the SeqCode are designed to be open and inclusive. Direction is provided by the SeqCode Community, which we encourage those with an interest in prokaryotic systematics to join.


Subjects
Archaea, Bacteria, Community Participation, Terminology as Topic, Archaea/classification, Archaea/genetics, Bacteria/genetics, Bacteria/classification, Classification/methods
20.
Int J Food Microbiol ; 410: 110488, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38035404

ABSTRACT

Metagenomics, i.e., shotgun sequencing of the total microbial community DNA from a sample, has become a mature technique but its application to pathogen detection in clinical, environmental, and food samples is far from common or standardized. In this review, we summarize ongoing developments in metagenomic sequence analysis that facilitate its wider application to pathogen detection. We examine theoretical frameworks for estimating the limit of detection for a particular level of sequencing effort, current approaches for achieving species and strain analytical resolution, and discuss some relevant modern tools for these tasks. While these recent advances are significant and establish metagenomics as a powerful tool to provide insights not easily attained by culture-based approaches, metagenomics is unlikely to emerge as a widespread, routine monitoring tool in the near future due to its inherently high detection limits, cost, and inability to easily distinguish between viable and non-viable cells. Instead, metagenomics seems best poised for applications involving special circumstances otherwise challenging for culture-based and molecular (e.g., PCR-based) approaches such as the de novo detection of novel pathogens, cases of co-infection by more than one pathogen, and situations where it is important to assess the genomic composition of the pathogenic population(s) and/or its impact on the indigenous microbiome.


Subjects
Metagenome, Microbiota, Microbiota/genetics, Metagenomics/methods, Computational Biology, Bacteria/genetics, High-Throughput Nucleotide Sequencing, DNA Sequence Analysis