ABSTRACT
While healthy gut microbiomes are critical to human health, pertinent microbial processes remain largely undefined, partially due to differential bias among profiling techniques. By simultaneously integrating multiple profiling methods, multi-omic analysis can define generalizable microbial processes, and is especially useful in understanding complex conditions such as Autism. Challenges with integrating heterogeneous data produced by multiple profiling methods can be overcome using Latent Dirichlet Allocation (LDA), a promising natural language processing technique that identifies topics in heterogeneous documents. In this study, we apply LDA to multi-omic microbial data (16S rRNA amplicon, shotgun metagenomic, shotgun metatranscriptomic, and untargeted metabolomic profiling) from the stool of 81 children with and without Autism. We identify topics, or microbial processes, that summarize complex phenomena occurring within gut microbial communities. We then subset stool samples by topic distribution, and identify metabolites, specifically neurotransmitter precursors and fatty acid derivatives, that differ significantly between children with and without Autism. We identify clusters of topics, deemed "cross-omic topics", which we hypothesize are representative of generalizable microbial processes observable regardless of profiling method. Interpreting topics, we find each represents a particular diet, and we heuristically label each cross-omic topic as: healthy/general function, age-associated function, transcriptional regulation, and opportunistic pathogenesis.
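As a rough illustration of how LDA assigns topics to count data, the sketch below runs a small collapsed Gibbs sampler on toy "samples" whose "words" are integer feature ids (for example taxa, genes, or metabolites). It is a minimal sketch under stated assumptions, not the study's pipeline: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the number of topics, and the toy data are all hypothetical.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampler for LDA (illustrative, hypothetical helper).

    docs: list of samples, each a list of integer feature ids, one entry
          per observed count.
    Returns (theta, phi): per-sample topic mixtures and per-topic feature
    distributions; every row sums to 1.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]            # counts per sample/topic
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # counts per topic/feature
    topic_total = [0] * n_topics
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return theta, phi


# Toy data (assumed): two samples dominated by features 0-2, two by features 3-5.
docs = [
    [0, 1, 2] * 10,
    [0, 0, 1, 2] * 8,
    [3, 4, 5] * 10,
    [3, 4, 4, 5] * 8,
]
theta, phi = lda_gibbs(docs, n_topics=2, vocab_size=6)
```

Each row of `theta` is one sample's topic distribution, which is the kind of quantity the abstract describes using to subset stool samples; each row of `phi` describes which features define a topic.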
Subject(s)
Autistic Disorder, Gastrointestinal Microbiome, Microbiota, Child, Humans, Gastrointestinal Microbiome/genetics, Multiomics, 16S Ribosomal RNA/genetics, Microbiota/genetics
ABSTRACT
Trait inference from mixed-species assemblages is a central problem in microbial ecology. Frequently, sequencing information from an environment is available, but phenotypic measurements from individual community members are not. With the increasing availability of molecular data for microbial communities, bioinformatic approaches that map metagenome to (meta)phenotype are needed. Recently, we developed a tool, gRodon, that enables the prediction of the maximum growth rate of an organism from genomic data on the basis of codon usage patterns. Our work and that of other groups suggest that such predictors can be applied to mixed-species communities in order to derive estimates of the average community-wide maximum growth rate. Here, we present an improved maximum growth rate predictor designed for metagenomes that corrects a persistent GC bias in the original gRodon model for metagenomic prediction. We benchmark this predictor with simulated metagenomic data sets to show that it has superior performance on mixed-species communities relative to earlier models. We go on to provide guidance on data preprocessing and show that calling genes from assembled contigs rather than directly from reads dramatically improves performance. Finally, we apply our predictor to large-scale metagenomic data sets from marine and human microbiomes to illustrate how community-wide growth prediction can be a powerful approach for hypothesis generation. Altogether, we provide an updated tool with clear guidelines for users about the uses and pitfalls of metagenomic prediction of the average community-wide maximal growth rate.
IMPORTANCE Microbes dominate nearly every known habitat, and therefore tools to survey the structure and function of natural microbial communities are much needed. Metagenomics, in which the DNA content of an entire community of organisms is sequenced all at once, allows us to probe the genetic diversity contained in a habitat. Yet, mapping metagenomic information to the actual traits of community members is a difficult and largely unsolved problem. Here, we present and validate a tool that allows users to predict the average maximum growth rate of a microbial community directly from metagenomic data. Maximum growth rate is a fundamental characteristic of microbial species that can give us a great deal of insight into their ecological role, and by applying our community-level predictor to large-scale metagenomic data sets from marine and human-associated microbiomes, we show how community-wide growth prediction can be a powerful approach for hypothesis generation.
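To make the idea of a codon usage signal concrete, the sketch below compares codon frequencies in a set of presumed highly expressed genes against a background gene set, and also reports GC content, the property the abstract identifies as a source of bias. This is only an illustration under stated assumptions: it does not reproduce gRodon's statistic or its regression model, the divergence used here (total variation distance) is swapped in for simplicity, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def codon_frequencies(genes):
    """Relative frequency of each in-frame codon across a list of coding sequences."""
    counts = Counter()
    for seq in genes:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if set(codon) <= set("ACGT"):  # skip codons with ambiguous bases
                counts[codon] += 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def usage_divergence(highly_expressed, background):
    """Total variation distance between two codon frequency profiles (0 to 1)."""
    f_h = codon_frequencies(highly_expressed)
    f_b = codon_frequencies(background)
    codons = set(f_h) | set(f_b)
    return 0.5 * sum(abs(f_h.get(c, 0.0) - f_b.get(c, 0.0)) for c in codons)


def gc_content(genes):
    """Fraction of unambiguous bases that are G or C."""
    joined = "".join(genes).upper()
    acgt = sum(joined.count(b) for b in "ACGT")
    return (joined.count("G") + joined.count("C")) / acgt if acgt else 0.0


# Toy sequences (assumed): the "ribosomal" set reuses a narrow set of codons.
ribosomal = ["ATGAAAAAAGGT", "ATGAAAGGTGGT"]
background = ["ATGAAGAAGGGC", "ATGAAAGGAGGG"]
bias = usage_divergence(ribosomal, background)
gc = gc_content(ribosomal + background)
```

A larger divergence means the presumed highly expressed genes use codons differently from the rest of the gene set. In a metagenome both gene sets would be pooled across community members, so a summary of this kind depends on which genes were called, consistent with the abstract's point that preprocessing choices affect prediction.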
Subject(s)
Metagenome, Microbiota, Humans, Metagenome/genetics, Benchmarking, Codon Usage, Microbiota/genetics
ABSTRACT
Autism Spectrum Disorder (ASD) is a complex neurodevelopmental disorder influenced by both genetic and environmental factors. Recently, gut dysbiosis has emerged as a powerful contributor to ASD symptoms. In this study, we recruited over 100 age-matched sibling pairs (between 2 and 8 years old) in which one sibling had an ASD diagnosis and the other was typically developing (TD) (432 samples total). We collected stool samples over four weeks, tracked over 100 lifestyle and dietary variables, and surveyed behavior measures related to ASD symptoms. We identified 117 amplicon sequence variants (ASVs) that were significantly different in abundance between sibling pairs across all three timepoints, 11 of which were supported by at least two contrast methods. We additionally identified dietary and lifestyle variables that differ significantly between cohorts, and further linked those variables to the ASVs to which they are statistically related. Overall, dietary and lifestyle features were explanatory of ASD phenotype using logistic regression; however, global compositional microbiome features were not. Leveraging our longitudinal behavior questionnaires, we additionally identified 11 ASVs associated with changes in reported anxiety over time within and across all individuals. Lastly, we find that overall microbiome composition (beta-diversity) is associated with specific ASD-related behavioral characteristics.
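Beta-diversity summarizes how different two samples are in overall composition. The abstract does not say which metric was used; the sketch below assumes Bray-Curtis dissimilarity, one common choice, applied to hypothetical ASV count vectors. The function names and counts are illustrative only.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical, 1 = disjoint)."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def pairwise_dissimilarity(samples):
    """Symmetric matrix of Bray-Curtis dissimilarities for a list of samples."""
    n = len(samples)
    return [[bray_curtis(samples[i], samples[j]) for j in range(n)] for i in range(n)]


# Hypothetical ASV counts for three stool samples (columns are ASVs).
samples = [
    [10, 0, 5],
    [5, 5, 5],
    [0, 20, 0],
]
matrix = pairwise_dissimilarity(samples)
```

A matrix like this is the usual input to ordination or permutation-based tests that ask whether composition tracks a behavioral variable; those downstream steps are not shown here.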