ABSTRACT
The intestinal lining is protected by a mucous barrier composed predominantly of complex carbohydrates. Gut microbes employ diverse glycoside hydrolases (GHs) to liberate mucosal sugars as a nutrient source to facilitate host colonization. Intensive catabolism of mucosal glycans, however, may contribute to barrier erosion, pathogen encroachment, and inflammation. Sialic acid is an acidic sugar featured at terminal positions of host glycans. Characterized sialidases from the microbiome belong to the GH33 family, according to CAZy (Carbohydrate-Active enZYmes Database). In 2018, a functional metagenomics screen using thermal spring DNA uncovered the founding member of the GH156 sialidase family, the presence of which has yet to be reported in the context of the human microbiome. A subset of GH156 sequences from the CAZy database containing key sialidase residues was used to build a hidden Markov model. HMMsearch against public databases revealed ~10× more putative GH156 sialidases than currently cataloged by CAZy. Represented phyla include Bacteroidota, Verrucomicrobiota, and Firmicutes_A from human microbiomes, all of which play notable roles in carbohydrate fermentation. Analyses of metagenomic data sets revealed that GH156s are frequently encoded in metagenomes, with a greater variety and abundance of GH156 genes observed in traditional hunter-gatherer or agriculturalist societies than in industrialized societies, particularly relative to individuals with inflammatory bowel disease (IBD). Nineteen GH156s were recombinantly expressed and assayed for sialidase activity. The five GH156 sialidases identified here share limited sequence identity with each other or with the founding GH156 family member and are representative of a large subset of the family.
IMPORTANCE Sialic acids occupy terminal positions of human glycans where they act as receptors for microbes, toxins, and immune signaling molecules. Microbial enzymes that remove sialic acids, sialidases, are abundant in the human microbiome where they may help shape microbiota community structure or contribute to pathology. Furthermore, sialidases have proven to hold therapeutic potential for cancer therapy. Here, we examined the sequence space of a sialidase family of enzymes, GH156, previously unknown in the human gut environment. Our analyses suggest that human populations with disparate dietary practices harbor distinct varieties and abundances of GH156-encoding genes. Furthermore, we demonstrate the sialidase activity of 5 gut-derived GH156s. These results expand the diversity of sialidases that may contribute to host glycan degradation, and these sequences may have biotechnological or clinical utility.
Subject(s)
Metagenome , Neuraminidase , Humans , Glycoside Hydrolases/genetics , Neuraminidase/genetics , Neuraminidase/metabolism , Polysaccharides , Sialic Acids/metabolism , Gastrointestinal Microbiome
ABSTRACT
Amplicon sequencing (for example, of the 16S rRNA gene) identifies the presence and relative abundance of microbial community members. However, metagenomic sequencing is needed to identify the genetic content and functional potential of a community. Metagenomics is challenging in samples dominated by host DNA, such as those from the skin, tissue and respiratory tract. Here, we combine advances in amplicon and metagenomic sequencing with culture-enriched molecular profiling to study the human microbiota. Using the cystic fibrosis lung as an example, we cultured an average of 82.13% of the operational taxonomic units representing 99.3% of the relative abundance identified in direct sequencing of sputum samples; importantly, culture enrichment identified 63.3% more operational taxonomic units than direct sequencing. We developed the PLate Coverage Algorithm (PLCA) to determine a representative subset of culture plates on which to conduct culture-enriched metagenomics, resulting in the recovery of greater taxonomic diversity, including of low-abundance taxa, with better metagenome-assembled genomes, longer contigs and better functional annotations when compared to culture-independent methods. The PLCA is also applied as a proof of principle to a previously published gut microbiota dataset. Culture-enriched molecular profiling can be used to better understand the role of the human microbiota in health and disease.
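The abstract does not describe how the PLCA selects plates, so the following is only an illustrative sketch of the general idea of choosing a small subset of plates that together cover the taxa of interest. It assumes a greedy set-cover heuristic, which may differ from the published algorithm; the function name `greedy_plate_cover`, the plate labels and the OTU labels are all hypothetical.

```python
def greedy_plate_cover(plate_otus, target_otus):
    """Greedy set cover (an assumed stand-in, not the published PLCA).

    plate_otus: dict mapping a plate label to the set of OTUs detected on it.
    target_otus: iterable of OTUs that the chosen plates should cover.
    Returns (chosen plates in pick order, OTUs that no plate could cover).
    """
    uncovered = set(target_otus)
    chosen = []
    while uncovered:
        # Pick the plate adding the most still-uncovered OTUs.
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(plate_otus), key=lambda p: len(plate_otus[p] & uncovered))
        gained = plate_otus[best] & uncovered
        if not gained:
            break  # remaining OTUs were not recovered on any plate
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Hypothetical example data: four plates, five OTUs of interest.
plates = {
    "plate_A": {"otu1", "otu2", "otu3"},
    "plate_B": {"otu3", "otu4"},
    "plate_C": {"otu4", "otu5"},
    "plate_D": {"otu1"},
}
chosen, missing = greedy_plate_cover(plates, {"otu1", "otu2", "otu3", "otu4", "otu5"})
print(chosen, missing)
```

A greedy heuristic is used here only because it is short and easy to follow; it does not guarantee the smallest possible plate subset.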
Subject(s)
Cystic Fibrosis/microbiology , Lung/microbiology , Microbiota/genetics , Algorithms , High-Throughput Nucleotide Sequencing , Humans , Metagenome , Metagenomics/methods , Microbiological Techniques , Sequence Analysis, DNA
ABSTRACT
BACKGROUND: In studies evaluating the microbiome, numerous factors can contribute to technical variability. These factors include DNA extraction methodology, sequencing protocols, and data analysis strategies. We sought to evaluate the impact these factors have on the results obtained when the sequence data are independently generated and analyzed by different laboratories. METHODS: To evaluate the effect of technical variability, we used human intestinal biopsy samples resected from individuals diagnosed with an inflammatory bowel disease (IBD), including Crohn's disease (n = 12) and ulcerative colitis (n = 10), and those without IBD (n = 10). Matched samples from each participant were sent to three laboratories and studied using independent protocols for DNA extraction, library preparation, targeted-amplicon sequencing of a 16S rRNA gene hypervariable region, and processing of sequence data. We looked at two measures of interest (Bray-Curtis PERMANOVA R² values and log2 fold-change estimates of the 25 most-abundant taxa) to assess variation in the results produced by each laboratory, as well as the relative contribution to variation from the different extraction, sequencing, and analysis steps used to generate these measures. RESULTS: The R² values and estimated differential abundance associated with diagnosis were consistent across datasets that used different DNA extraction and sequencing protocols, and within datasets that pooled samples from multiple protocols; however, variability in bioinformatic processing of sequence data led to changes in R² values and inconsistencies in taxonomic assignment and abundance estimates. CONCLUSION: Although the contribution of DNA extraction and sequencing methods to variability was observable, we find that results can be robust to the various extraction and sequencing approaches used in our study. Differences in data processing methods have a larger impact on results, making comparison among studies less reliable and the combined analysis of bioinformatically processed samples nearly impossible. Our results highlight the importance of making raw sequence data available to facilitate combined and comparative analyses of published studies using common data processing protocols. Study methodologies should provide detailed data processing methods for validation, interpretability, reproducibility, and comparability.
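For readers unfamiliar with the first measure, the Bray-Curtis dissimilarity that underlies the PERMANOVA can be sketched as follows. This is a minimal illustration of the standard formula (sum of absolute abundance differences divided by the total abundance of both samples), not the authors' pipeline; the function name and the example counts are hypothetical, and the PERMANOVA itself is not implemented here.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors.

    u, v: sequences of non-negative taxon counts in the same taxon order.
    Returns a value between 0 (identical) and 1 (no shared taxa).
    """
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0  # convention chosen here for two empty samples
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Hypothetical counts for three taxa in two samples.
sample_1 = [10, 0, 5]
sample_2 = [5, 5, 5]
print(bray_curtis(sample_1, sample_2))
```

A PERMANOVA would then partition the variation in the full sample-by-sample dissimilarity matrix by a grouping factor such as diagnosis, which is where the reported R² values come from.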
ABSTRACT
We provide cytochrome c oxidase subunit 1 (COI) barcode sequences of fishes of the Nayband National Park, Persian Gulf, Iran. Industrial activities, ecological considerations and goals of The Fish Barcode of Life campaign make it crucial that fish species residing in the park be identified. To the best of our knowledge, this is the first report of barcoding data on fishes of the Persian Gulf. We examined 187 individuals representing 76 species, 56 genera and 32 families. The data flagged potentially cryptic species of Gerres filamentosus and Plectorhinchus schotaf. 16S rDNA data on these species are provided. Exclusion of these two potential cryptic species resulted in a mean COI intraspecific distance of 0.18%, and a mean inter- to intraspecific divergence ratio of 66.7. There was no overlap between maximum Kimura 2-parameter distances among conspecifics (1.66%) and minimum distance among congeneric species (6.19%). Barcodes shared among species were not observed. Neighbour-joining analysis showed that most species formed cohesive sequence units with little variation. Finally, the comparison of 16 selected species from this study with meta-data of conspecifics from Australia, India, China and South Africa revealed high interregion divergences and potential existence of six cryptic species. Pairwise interregional comparisons were more informative than global divergence assessments with regard to detection of cryptic variation. Our analysis exemplifies optimal use of the expanding barcode data now becoming available.
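The Kimura 2-parameter distance and the "no overlap" check mentioned above can be illustrated with a short sketch. It uses the standard K2P formula, d = -1/2 ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions; the function names are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and sites with gaps or ambiguity codes are simply skipped, which may not match the settings used in the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes (an assumption)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError for saturated pairs where the argument is <= 0.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def has_barcode_gap(max_intraspecific, min_congeneric):
    """True when the largest within-species distance is below the smallest
    distance between congeneric species."""
    return max_intraspecific < min_congeneric


# One transition (A<->G) in ten aligned sites.
print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
# Values reported in the abstract, in percent.
print(has_barcode_gap(1.66, 6.19))
```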