Results 1 - 15 of 15
1.
PLoS Comput Biol ; 20(6): e1012196, 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38875277

ABSTRACT

Time series studies of microbiome interventions provide valuable data about microbial ecosystem structure. Unfortunately, existing models of microbial community dynamics have limited temporal memory and expressivity, relying on Markov or linearity assumptions. To address this, we introduce a new class of models based on transfer functions. These models learn impulse responses, capturing the potentially delayed effects of environmental changes on the microbial community. This allows us to simulate trajectories under hypothetical interventions and select significantly perturbed taxa with False Discovery Rate guarantees. Through simulations, we show that our approach effectively reduces forecasting errors compared to strong baselines and accurately pinpoints taxa of interest. Our case studies highlight the interpretability of the resulting differential response trajectories. An R package, mbtransfer, and notebooks to replicate the simulation and case studies are provided.
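To make the notion of an impulse response concrete, here is a minimal Python sketch of a linear distributed-lag model, in which each lag coefficient measures the delayed effect of an intervention on one taxon. This is an illustration only and not the mbtransfer method or its API; the function names, the synthetic intervention series, and the "true" response values are all assumptions made for the example.

```python
import numpy as np

def lagged_design(x, n_lags):
    """Build a matrix whose column k holds x delayed by k steps (zeros before t=0)."""
    T = len(x)
    X = np.zeros((T, n_lags))
    for k in range(n_lags):
        X[k:, k] = x[:T - k]
    return X

def fit_impulse_response(x, y, n_lags):
    """Least-squares estimate of h in y[t] = sum_k h[k] * x[t - k]."""
    X = lagged_design(x, n_lags)
    h, *_ = np.linalg.lstsq(X, y, rcond=None)
    return h

# Synthetic data (assumed for illustration): the effect peaks one step after the intervention.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=200).astype(float)   # hypothetical on/off intervention
true_h = np.array([0.0, 1.5, 0.7, 0.2])          # hypothetical delayed response
y = lagged_design(x, len(true_h)) @ true_h + 0.01 * rng.normal(size=200)
h_hat = fit_impulse_response(x, y, n_lags=4)     # estimated impulse response
```

Once the lag coefficients are estimated, a trajectory under a hypothetical intervention can be simulated by multiplying a new lagged design matrix by `h_hat`.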

2.
bioRxiv ; 2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38585817

ABSTRACT

Mediation analysis has emerged as a versatile tool for answering mechanistic questions in microbiome research because it provides a statistical framework for attributing treatment effects to alternative causal pathways. Using a series of linked regression models, this analysis quantifies how complementary data modalities relate to one another and respond to treatments. Despite these advances, the rigid modeling assumptions of existing software often result in users viewing mediation analysis as a black box, not something that can be inspected, critiqued, and refined. We designed the multimedia R package to make advanced mediation analysis techniques accessible to a wide audience, ensuring that all statistical components are easily interpretable and adaptable to specific problem contexts. The package provides a uniform interface to direct and indirect effect estimation, synthetic null hypothesis testing, and bootstrap confidence interval construction. We illustrate the package through two case studies. The first re-analyzes a study of the microbiome and metabolome of Inflammatory Bowel Disease patients, uncovering potential mechanistic interactions between the microbiome and disease-associated metabolites, not found in the original study. The second analyzes new data about the influence of mindfulness practice on the microbiome. The mediation analysis identifies a direct effect between a randomized mindfulness intervention and microbiome composition, highlighting shifts in taxa previously associated with depression that cannot be explained by diet or sleep behaviors alone. A gallery of examples and further documentation can be found at https://go.wisc.edu/830110.
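The "series of linked regression models" can be illustrated with the classical linear case: one regression of the mediator on the treatment and one of the outcome on both. The Python sketch below is a minimal example under that linear assumption and does not reproduce the multimedia package's interface; `ols`, `linear_mediation`, and the simulated coefficients are hypothetical names and values chosen for illustration.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def linear_mediation(treatment, mediator, outcome):
    """Product-of-coefficients estimates of the direct and indirect effects."""
    a = ols(treatment, mediator)[1]                       # treatment -> mediator
    _, direct, b = ols(np.column_stack([treatment, mediator]), outcome)
    return {"direct": direct, "indirect": a * b}          # indirect path = a * b

# Simulated data (assumed): true direct effect 0.5, true indirect effect 0.8 * 2.0 = 1.6.
rng = np.random.default_rng(1)
n = 2000
t = rng.integers(0, 2, size=n).astype(float)              # randomized treatment
m = 0.8 * t + rng.normal(scale=0.1, size=n)               # mediator
y = 0.5 * t + 2.0 * m + rng.normal(scale=0.1, size=n)     # outcome
effects = linear_mediation(t, m, y)
```

In this linear setting the total effect decomposes into the direct coefficient plus the product of the two path coefficients; more flexible models require a different estimator.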

3.
bioRxiv ; 2023 Nov 30.
Article in English | MEDLINE | ID: mdl-38077024

ABSTRACT

The R-Shiny package MolPad provides an interactive dashboard for understanding the dynamics of longitudinal molecular co-expression in microbiomics. The main idea is to first use a network to give an overview of the major patterns in the predictive relationships among features, and then to zoom into specific clusters of interest. It is designed with a focus-plus-context analysis strategy and automatically generates links to online curated annotations. The dashboard consists of a cluster-level network, a bar plot of taxonomic composition, a line plot of data modalities, and a table for each pathway. Further, the package includes functions that handle the data processing for creating the dashboard. This makes it beginner-friendly for users with less R programming experience. We illustrate these methods with a case study on a longitudinal metagenomics analysis of the cheese microbiome.

4.
Biostatistics ; 24(4): 1045-1065, 2023 Oct 18.
Article in English | MEDLINE | ID: mdl-35657012

ABSTRACT

Topic modeling is a popular method used to describe biological count data. With topic models, the user must specify the number of topics $K$. Since there is no definitive way to choose $K$ and since a true value might not exist, we develop a method, which we call topic alignment, to study the relationships across models with different $K$. In addition, we present three diagnostics based on the alignment. These techniques can show how many topics are consistently present across different models, if a topic is only transiently present, or if a topic splits into more topics when $K$ increases. This strategy gives more insight into the process of generating the data than choosing a single value of $K$ would. We design a visual representation of these cross-model relationships, show the effectiveness of these tools for interpreting the topics on simulated and real data, and release an accompanying R package, alto.

5.
Proc Natl Acad Sci U S A ; 119(42): e2212930119, 2022 Oct 18.
Article in English | MEDLINE | ID: mdl-36215464

ABSTRACT

Bacterial secondary metabolites are a major source of antibiotics and other bioactive compounds. In microbial communities, these molecules can mediate interspecies interactions and responses to environmental change. Despite the importance of secondary metabolites in human health and microbial ecology, little is known about their roles and regulation in the context of multispecies communities. In a simplified model of the rhizosphere composed of Bacillus cereus, Flavobacterium johnsoniae, and Pseudomonas koreensis, we show that the dynamics of secondary metabolism depend on community species composition and interspecies interactions. Comparative metatranscriptomics and metametabolomics reveal that the abundance of transcripts of biosynthetic gene clusters (BGCs) and metabolomic molecular features differ between monocultures or dual cultures and a tripartite community. In both two- and three-member cocultures, P. koreensis modified expression of BGCs for zwittermicin, petrobactin, and other secondary metabolites in B. cereus and F. johnsoniae, whereas the BGC transcriptional response to the community in P. koreensis itself was minimal. Pairwise and tripartite cocultures with P. koreensis displayed unique molecular features that appear to be derivatives of lokisin, suggesting metabolic handoffs between species. Deleting the BGC for koreenceine, another P. koreensis metabolite, altered transcript and metabolite profiles across the community, including substantial up-regulation of the petrobactin and bacillibactin BGCs in B. cereus, suggesting that koreenceine represses siderophore production. Results from this model community show that bacterial BGC expression and chemical output depend on the identity and biosynthetic capacity of coculture partners, suggesting community composition and microbiome interactions may shape the regulation of secondary metabolism in nature.


Subject(s)
Microbiota; Siderophores; Anti-Bacterial Agents; Benzamides; Humans; Secondary Metabolism; Siderophores/genetics; Siderophores/metabolism
8.
Front Genet ; 10: 627, 2019.
Article in English | MEDLINE | ID: mdl-31555316

ABSTRACT

The simultaneous study of multiple measurement types is a frequently encountered problem in practical data analysis. It is especially common in microbiome research, where several sources of data (for example, 16S rRNA, metagenomic, metabolomic, or transcriptomic data) can be collected on the same physical samples. There has been a proliferation of proposals for analyzing such multitable microbiome data, as is often the case when new data sources become more readily available, facilitating inquiry into new types of scientific questions. However, stepping back from the rush for new methods for multitable analysis in the microbiome literature, it is worthwhile to recognize the broader landscape of multitable methods, as they have been relevant in problem domains ranging across economics, robotics, genomics, chemometrics, and neuroscience. In different contexts, these techniques are called data integration, multi-omic, and multitask methods, for example. Of course, there is no unique optimal algorithm to use across domains: different instances of the multitable problem possess specific structure or variation that is worth incorporating in methodology. Our purpose here is not to develop new algorithms, but rather to 1) distill relevant themes across different analysis approaches and 2) provide concrete workflows for approaching analysis, as a function of ultimate analysis goals and data characteristics (heterogeneity, dimensionality, sparsity). Towards the second goal, we have made code for all analysis and figures available online at https://github.com/krisrs1128/multitable_review.

9.
Nat Commun ; 10(1): 2408, 2019 Jun 03.
Article in English | MEDLINE | ID: mdl-31160598

ABSTRACT

The gut microbiome has been linked to host obesity; however, sex-specific associations between microbiome and fat distribution are not well understood. Here we show sex-specific microbiome signatures contributing to obesity despite both sexes having similar gut microbiome characteristics, including overall abundance and diversity. Our comparisons of the taxa associated with the android fat ratio in men and women found that there is no widespread species-level overlap. We did observe overlap between the sexes at the genus and family levels in the gut microbiome, such as Holdemanella and Gemmiger; however, they had opposite correlations with fat distribution in men and women. Our findings support a role for fat distribution in sex-specific relationships with the composition of the microbiome. Our results suggest that studies of the gut microbiome and abdominal obesity-related disease outcomes should account for sex-specific differences.


Subject(s)
Body Fat Distribution; Gastrointestinal Microbiome/physiology; Absorptiometry, Photon; Adult; Aged; Female; Gastrointestinal Microbiome/genetics; Humans; Male; Middle Aged; RNA, Ribosomal, 16S/genetics; Sex Factors
10.
Biostatistics ; 20(4): 599-614, 2019 Oct 01.
Article in English | MEDLINE | ID: mdl-29868846

ABSTRACT

The human microbiome is a complex ecological system, and describing its structure and function under different environmental conditions is important from both basic scientific and medical perspectives. Viewed through a biostatistical lens, many microbiome analysis goals can be formulated as latent variable modeling problems. However, although probabilistic latent variable models are a cornerstone of modern unsupervised learning, they are rarely applied in the context of microbiome data analysis, in spite of the evolutionary, temporal, and count structure that could be directly incorporated through such models. We explore the application of probabilistic latent variable models to microbiome data, with a focus on Latent Dirichlet allocation, Non-negative matrix factorization, and Dynamic Unigram models. To develop guidelines for when different methods are appropriate, we perform a simulation study. We further illustrate and compare these techniques using the data of Dethlefsen and Relman (2011, Incomplete recovery and individualized responses of the human distal gut microbiota to repeated antibiotic perturbation. Proceedings of the National Academy of Sciences 108, 4554-4561), a study on the effects of antibiotics on bacterial community composition. Code and data for all simulations and case studies are available publicly.


Subject(s)
Biostatistics/methods; Microbiota; Models, Statistical; Humans
11.
J Comput Graph Stat ; 27(3): 553-563, 2018.
Article in English | MEDLINE | ID: mdl-30416327

ABSTRACT

We introduce methods for visualization of data structured along trees, especially hierarchically structured collections of time series. To this end, we identify questions that often emerge when working with hierarchical data and provide an R package to simplify their investigation. Our key contribution is the adaptation of the visualization principles of focus-plus-context and linking to the study of tree-structured data. Our motivating application is to the analysis of bacterial time series, where an evolutionary tree relating bacteria is available a priori. However, we have identified common problem types where, if a tree is not directly available, it can be constructed from data and then studied using our techniques. We perform detailed case studies to describe the alternative use cases, interpretations, and utility of the proposed visualization methods.

12.
PLoS Comput Biol ; 13(8): e1005706, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28821012

ABSTRACT

Our work focuses on the stability, resilience, and response to perturbation of the bacterial communities in the human gut. Informative flash flood-like disturbances that eliminate most gastrointestinal biomass can be induced using a clinically-relevant iso-osmotic agent. We designed and executed such a disturbance in human volunteers using a dense longitudinal sampling scheme extending before and after induced diarrhea. This experiment has enabled a careful multidomain analysis of a controlled perturbation of the human gut microbiota with a new level of resolution. These new longitudinal multidomain data were analyzed using recently developed statistical methods that demonstrate improvements over current practices. By imposing sparsity constraints we have enhanced the interpretability of the analyses and by employing a new adaptive generalized principal components analysis, incorporated modulated phylogenetic information and enhanced interpretation through scoring of the portions of the tree most influenced by the perturbation. Our analyses leverage the taxa-sample duality in the data to show how the gut microbiota recovers following this perturbation. Through a holistic approach that integrates phylogenetic, metagenomic and abundance information, we elucidate patterns of taxonomic and functional change that characterize the community recovery process across individuals. We provide complete code and illustrations of new sparse statistical methods for high-dimensional, longitudinal multidomain data that provide greater interpretability than existing methods.


Subject(s)
Gastrointestinal Microbiome/genetics; Metagenome/genetics; Metagenomics/methods; Models, Biological; Adult; DNA, Bacterial/analysis; DNA, Bacterial/genetics; Diarrhea; Female; Humans; Longitudinal Studies; Male; Middle Aged; Principal Component Analysis; RNA, Ribosomal, 16S/genetics; Young Adult
13.
F1000Res ; 5: 1492, 2016.
Article in English | MEDLINE | ID: mdl-27508062

ABSTRACT

High-throughput sequencing of PCR-amplified taxonomic markers (like the 16S rRNA gene) has enabled a new level of analysis of complex bacterial communities known as microbiomes. Many tools exist to quantify and compare abundance levels or OTU composition of communities in different conditions. The sequencing reads have to be denoised and assigned to the closest taxa from a reference database. Common approaches use a notion of 97% similarity and normalize the data by subsampling to equalize library sizes. In this paper, we show that statistical models allow more accurate abundance estimates. By providing a complete workflow in R, we enable the user to do sophisticated downstream statistical analyses, whether parametric or nonparametric. We provide examples of using the R packages dada2, phyloseq, DESeq2, ggplot2 and vegan to filter, visualize and test microbiome data. We also provide examples of supervised analyses using random forests and nonparametric testing using community networks and the ggnetwork package.
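For readers unfamiliar with the subsampling step that the abstract contrasts with model-based estimates, the following Python sketch shows one way to rarefy a count table to a common library size by drawing reads without replacement. It is an illustration only, not code from the paper's R workflow; the `rarefy` helper and the small count table are hypothetical.

```python
import numpy as np

def rarefy(counts, depth, rng):
    """Subsample each row (sample) of a count matrix to `depth` reads without replacement."""
    counts = np.asarray(counts, dtype=np.int64)
    if (counts.sum(axis=1) < depth).any():
        raise ValueError("every sample needs at least `depth` reads")
    # One multivariate hypergeometric draw per sample keeps exactly `depth` reads.
    return np.vstack([rng.multivariate_hypergeometric(row, depth) for row in counts])

rng = np.random.default_rng(2)
otu_table = np.array([[120, 30, 0, 50],     # hypothetical samples x taxa counts
                      [900, 10, 40, 50]])
rarefied = rarefy(otu_table, depth=200, rng=rng)
```

Note that the second sample loses 800 of its 1,000 reads here, which is the kind of discarded information that motivates the statistical models the abstract advocates.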

14.
J Virol ; 90(13): 6058-6070, 2016 Jul 01.
Article in English | MEDLINE | ID: mdl-27099321

ABSTRACT

HIV-1 protease (PR), reverse transcriptase (RT), and integrase (IN) variability presents a challenge to laboratories performing genotypic resistance testing. This challenge will grow with increased sequencing of samples enriched for proviral DNA such as dried blood spots and increased use of next-generation sequencing (NGS) to detect low-abundance HIV-1 variants. We analyzed PR and RT sequences from >100,000 individuals and IN sequences from >10,000 individuals to characterize variation at each amino acid position, identify mutations indicating APOBEC-mediated G-to-A editing, and identify mutations resulting from selective drug pressure. Forty-seven percent of PR, 37% of RT, and 34% of IN positions had one or more amino acid variants with a prevalence of ≥1%. Seventy percent of PR, 60% of RT, and 60% of IN positions had one or more variants with a prevalence of ≥0.1%. Overall 201 PR, 636 RT, and 346 IN variants had a prevalence of ≥0.1%. The median intersubtype prevalence ratios were 2.9-, 2.1-, and 1.9-fold for these PR, RT, and IN variants, respectively. Only 5.0% of PR, 3.7% of RT, and 2.0% of IN variants had a median intersubtype prevalence ratio of ≥10-fold. Variants at lower prevalences were more likely to differ biochemically and to be part of an electrophoretic mixture compared to high-prevalence variants. There were 209 mutations indicative of APOBEC-mediated G-to-A editing and 326 nonpolymorphic treatment-selected mutations. Identification of viruses with a high number of APOBEC-associated mutations will facilitate the quality control of dried blood spot sequencing. Identifying sequences with a high proportion of rare mutations will facilitate the quality control of NGS.

IMPORTANCE: Most antiretroviral drugs target three HIV-1 proteins: PR, RT, and IN. These proteins are highly variable: many different amino acids can be present at the same position in viruses from different individuals. Some of the amino acid variants cause drug resistance and occur mainly in individuals receiving antiretroviral drugs. Some variants result from a human cellular defense mechanism called APOBEC-mediated hypermutation. Many variants result from naturally occurring mutation. Some variants may represent technical artifacts. We studied PR and RT sequences from >100,000 individuals and IN sequences from >10,000 individuals to quantify variation at each amino acid position in these three HIV-1 proteins. We performed analyses to determine which amino acid variants resulted from antiretroviral drug selection pressure, APOBEC-mediated editing, and naturally occurring variation. Our results provide information essential to clinical, research, and public health laboratories performing genotypic resistance testing by sequencing HIV-1 PR, RT, and IN.


Subject(s)
APOBEC Deaminases/metabolism; Genetic Variation; HIV Integrase/genetics; HIV Protease/genetics; HIV Reverse Transcriptase/genetics; HIV-1/genetics; APOBEC Deaminases/genetics; Amino Acid Sequence; Anti-HIV Agents/therapeutic use; Drug Resistance, Viral/genetics; Genotype; HIV Infections/drug therapy; HIV Infections/virology; HIV Integrase/chemistry; HIV Protease/chemistry; HIV Reverse Transcriptase/chemistry; HIV-1/enzymology; High-Throughput Nucleotide Sequencing; Humans; Mutation; Reverse Transcriptase Inhibitors/therapeutic use
15.
J Stat Softw ; 59(13): 1-21, 2014.
Article in English | MEDLINE | ID: mdl-26917999

ABSTRACT

The R package structSSI provides an accessible implementation of two recently developed simultaneous and selective inference techniques: the group Benjamini-Hochberg and hierarchical false discovery rate procedures. Unlike many multiple testing schemes, these methods specifically incorporate existing information about the grouped or hierarchical dependence between hypotheses under consideration while controlling the false discovery rate. Doing so increases statistical power and interpretability. Furthermore, these procedures provide novel approaches to the central problem of encoding complex dependency between hypotheses. We briefly describe the group Benjamini-Hochberg and hierarchical false discovery rate procedures and then illustrate them using two examples, one a measure of ecological microbial abundances and the other a global temperature time series. For both procedures, we detail the steps associated with the analysis of these particular data sets, including establishing the dependence structures, performing the test, and interpreting the results. These steps are encapsulated by R functions, and we explain their applicability to general data sets.
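As background for the two procedures named above, the Python sketch below implements the ordinary (ungrouped) Benjamini-Hochberg step-up rule that they build on. It does not implement the group or hierarchical variants and is unrelated to the structSSI function interface; the function name and the example p-values are assumptions made for illustration.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean array marking hypotheses rejected by the BH step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)                            # indices of p-values, smallest first
    thresholds = alpha * np.arange(1, m + 1) / m     # rank-dependent cutoffs i * alpha / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])             # largest rank meeting its cutoff
        rejected[order[:k + 1]] = True               # reject every hypothesis up to that rank
    return rejected

# Hypothetical p-values, for illustration only.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(p_vals, alpha=0.05)
```

The step-up rule rejects every hypothesis ranked at or below the largest rank that meets its cutoff, even if some intermediate p-values exceed their own cutoffs.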
