1.
PLoS Comput Biol ; 20(3): e1011814, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38527092

ABSTRACT

As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular to the pathway-level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. PathIntegrate is available as an open-source Python package.
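
The molecular-to-pathway transformation described above can be illustrated with a minimal sketch. It is not the PathIntegrate package API: the function name `pathway_scores`, the input layout, and the choice of a mean z-score as the single-sample pathway score are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def pathway_scores(abundance, pathways):
    """Collapse molecule-level data into one score per pathway and sample.

    abundance: {molecule: [value for each sample]}
    pathways:  {pathway name: [member molecules]}
    The mean z-score used here is an illustrative stand-in for a
    single-sample pathway analysis method.
    """
    # Standardize each molecule across samples.
    z = {}
    for molecule, values in abundance.items():
        mu, sd = mean(values), pstdev(values)
        z[molecule] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]

    n_samples = len(next(iter(abundance.values())))
    scores = {}
    for name, members in pathways.items():
        present = [m for m in members if m in z]
        if not present:
            continue  # no measured molecule maps to this pathway
        scores[name] = [mean(z[m][i] for m in present) for i in range(n_samples)]
    return scores
```

The resulting pathway-by-sample scores could then be passed to any predictive model; pathways with no measured members are simply dropped.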


Subject(s)
Genomics , Multiomics , Genomics/methods
2.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36548341

ABSTRACT

MOTIVATION: Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. In order to perform subsequent analyses to investigate the association between the identified module and a variable of interest, a module summarization, that best explains the module's information and reduces dimensionality is often needed. Conventional approaches for obtaining network representation typically rely only on the profiles of the nodes within the network while disregarding the inherent network topological information. RESULTS: In this article, we propose NetSHy, a hybrid approach which is capable of reducing the dimension of a network while incorporating topological properties to aid the interpretation of the downstream analyses. In particular, NetSHy applies principal component analysis (PCA) on a combination of the node profiles and the well-known Laplacian matrix derived directly from the network similarity matrix to extract a summarization at a subject level. Simulation scenarios based on random and empirical networks at varying network sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly on node profiles, in terms of recovering the true correlation with a phenotype of interest and maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation. 
AVAILABILITY AND IMPLEMENTATION: R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Computer Simulation , Principal Component Analysis , Sample Size
3.
PLoS Comput Biol ; 19(1): e1010758, 2023 01.
Article in English | MEDLINE | ID: mdl-36607897

ABSTRACT

Inferring gene co-expression networks is a useful process for understanding gene regulation and pathway activity. The networks are usually undirected graphs where genes are represented as nodes and an edge represents a significant co-expression relationship. When expression data of multiple (p) genes in multiple (K) conditions (e.g., treatments, tissues, strains) are available, joint estimation of networks harnessing shared information across them can significantly increase the power of analysis. In addition, examining condition-specific patterns of co-expression can provide insights into the underlying cellular processes activated in a particular condition. Condition adaptive fused graphical lasso (CFGL) is an existing method that incorporates condition specificity in a fused graphical lasso (FGL) model for estimating multiple co-expression networks. However, with computational complexity of O(p^2 K log K), the current implementation of CFGL is prohibitively slow even for a moderate number of genes and can only be used for a maximum of three conditions. In this paper, we propose a faster alternative of CFGL named rapid condition adaptive fused graphical lasso (RCFGL). In RCFGL, we incorporate the condition specificity into another popular model for joint network estimation, known as fused multiple graphical lasso (FMGL). We use a more efficient algorithm in the iterative steps compared to CFGL, enabling faster computation with complexity of O(p^2 K) and making it easily generalizable for more than three conditions. We also present a novel screening rule to determine if the full network estimation problem can be broken down into estimation of smaller disjoint sub-networks, thereby reducing the complexity further. We demonstrate the computational advantage and superior performance of our method compared to two non-condition adaptive methods, FGL and FMGL, and one condition adaptive method, CFGL, in both simulation study and real data analysis. 
We used RCFGL to jointly estimate the gene co-expression networks in different brain regions (conditions) using a cohort of heterogeneous stock rats. We also provide an accommodating C and Python based package that implements RCFGL.


Subject(s)
Algorithms , Brain , Animals , Rats , Computer Simulation , Gene Regulatory Networks/genetics
4.
BMC Bioinformatics ; 24(1): 398, 2023 Oct 25.
Article in English | MEDLINE | ID: mdl-37880571

ABSTRACT

BACKGROUND: In this paper, we are interested in interactions between a high-dimensional -omics dataset and clinical covariates. The goal is to evaluate the relationship between a phenotype of interest and a high-dimensional omics pathway, where the effect of the omics data depends on subjects' clinical covariates (age, sex, smoking status, etc.). For instance, metabolic pathways can vary greatly between sexes which may also change the relationship between certain metabolic pathways and a clinical phenotype of interest. We propose partitioning the clinical covariate space and performing a kernel association test within those partitions. To illustrate this idea, we focus on hierarchical partitions of the clinical covariate space and kernel tests on metabolic pathways. RESULTS: We see that our proposed method outperforms competing methods in most simulation scenarios. It can identify different relationships among clinical groups with higher power in most scenarios while maintaining a proper Type I error rate. The simulation studies also show a robustness to the grouping structure within the clinical space. We also apply the method to the COPDGene study and find several clinically meaningful interactions between metabolic pathways, the clinical space, and lung function. CONCLUSION: TreeKernel provides a simple and interpretable process for testing for relationships between high-dimensional omics data and clinical outcomes in the presence of interactions within clinical cohorts. The method is broadly applicable to many studies.


Subject(s)
Pulmonary Disease, Chronic Obstructive , Humans , Phenotype , Computer Simulation
5.
BMC Bioinformatics ; 24(1): 86, 2023 Mar 07.
Article in English | MEDLINE | ID: mdl-36882691

ABSTRACT

BACKGROUND: We developed a novel approach to minimize batch effects when assigning samples to batches. Our algorithm selects a batch allocation, among all possible ways of assigning samples to batches, that minimizes differences in average propensity score between batches. This strategy was compared to randomization and stratified randomization in a case-control study (30 per group) with a covariate (case vs control, represented as β1, set to be null) and two biologically relevant confounding variables (age, represented as β2, and hemoglobin A1c (HbA1c), represented as β3). Gene expression values were obtained from a publicly available dataset of expression data obtained from pancreas islet cells. Batch effects were simulated as twice the median biological variation across the gene expression dataset and were added to the publicly available dataset to simulate a batch effect condition. Bias was calculated as the absolute difference between observed betas under the batch allocation strategies and the true beta (no batch effects). Bias was also evaluated after adjustment for batch effects using ComBat as well as a linear regression model. In order to understand performance of our optimal allocation strategy under the alternative hypothesis, we also evaluated bias at a single gene associated with both age and HbA1c levels in the 'true' dataset (CAPN13 gene). RESULTS: Pre-batch correction, under the null hypothesis (β1), maximum absolute bias and root mean square (RMS) of maximum absolute bias, were minimized using the optimal allocation strategy. Under the alternative hypothesis (β2 and β3 for the CAPN13 gene), maximum absolute bias and RMS of maximum absolute bias were also consistently lower using the optimal allocation strategy. ComBat and the regression batch adjustment methods performed well as the bias estimates moved towards the true values in all conditions under both the null and alternative hypotheses. 
Although the differences between methods were less pronounced following batch correction, estimates of bias (average and RMS) were consistently lower using the optimal allocation strategy under both the null and alternative hypotheses. CONCLUSIONS: Our algorithm provides an extremely flexible and effective method for assigning samples to batches by exploiting knowledge of covariates prior to sample allocation.
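
The allocation idea described above (search over ways of assigning samples to batches and keep the one with the most similar average propensity score) can be sketched for the simplest case of two equal-sized batches. This is an illustration, not the authors' implementation: the function name `best_two_batch_split` is hypothetical, the propensity scores are assumed to be precomputed, and exhaustive enumeration is only feasible for small sample sizes.

```python
from itertools import combinations
from statistics import mean


def best_two_batch_split(propensity):
    """Exhaustively find the equal-sized two-batch split whose mean
    propensity scores differ the least.

    propensity: list of precomputed propensity scores, one per sample.
    Returns (batch A indices, batch B indices, absolute mean difference).
    """
    n = len(propensity)
    if n < 2 or n % 2:
        raise ValueError("this sketch assumes an even number of samples")
    best = None
    for batch_a in combinations(range(n), n // 2):
        if 0 not in batch_a:
            continue  # sample 0 is pinned to batch A to skip mirrored splits
        chosen = set(batch_a)
        batch_b = [i for i in range(n) if i not in chosen]
        diff = abs(mean(propensity[i] for i in batch_a)
                   - mean(propensity[i] for i in batch_b))
        if best is None or diff < best[2]:
            best = (list(batch_a), batch_b, diff)
    return best
```

For realistic study sizes the number of splits grows combinatorially, so a random or heuristic search over candidate allocations would likely replace the full enumeration.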


Subject(s)
Algorithms , Health Status , Propensity Score , Case-Control Studies , Glycated Hemoglobin , Humans
6.
J Virol ; 96(17): e0097622, 2022 09 14.
Article in English | MEDLINE | ID: mdl-35938870

ABSTRACT

Humoral immune perturbations contribute to pathogenic outcomes in persons with HIV-1 infection (PWH). Gut barrier dysfunction in PWH is associated with microbial translocation and alterations in microbial communities (dysbiosis), and IgA, the most abundant immunoglobulin (Ig) isotype in the gut, is involved in gut homeostasis by interacting with the microbiome. We determined the impact of HIV-1 infection on the antibody repertoire in the gastrointestinal tract by comparing Ig gene utilization and somatic hypermutation (SHM) in colon biopsies from PWH (n = 19) versus age and sex-matched controls (n = 13). We correlated these Ig parameters with clinical, immunological, microbiome and virological data. Gene signatures of enhanced B cell activation were accompanied by skewed frequencies of multiple Ig Variable genes in PWH. PWH showed decreased frequencies of SHM in IgA and possibly IgG, with a substantial loss of highly mutated IgA sequences. The decline in IgA SHM in PWH correlated with gut CD4+ T cell loss and inversely correlated with mucosal inflammation and microbial translocation. Diminished gut IgA SHM in PWH was driven by transversion mutations at A or T deoxynucleotides, suggesting a defect not at the AID/APOBEC3 deamination step but at later stages of IgA SHM. These results expand our understanding of humoral immune perturbations in PWH that could have important implications in understanding mucosal immune defects in individuals with chronic HIV-1 infection. IMPORTANCE The gut is a major site of early HIV-1 replication and pathogenesis. Extensive CD4+ T cell depletion in this compartment results in a compromised epithelial barrier that facilitates the translocation of microbes into the underlying lamina propria and systemic circulation, resulting in chronic immune activation. To date, the consequences of microbial translocation on the mucosal humoral immune response (or vice versa) remains poorly integrated into the panoply of mucosal immune defects in PWH. 
We utilized next-generation sequencing approaches to profile the Ab repertoire and ascertain frequencies of somatic hypermutation in colon biopsies from antiretroviral therapy-naive PWH versus controls. Our findings identify perturbations in the Ab repertoire of PWH that could contribute to development or maintenance of dysbiosis. Moreover, IgA mutations significantly decreased in PWH and this was associated with adverse clinical outcomes. These data may provide insight into the mechanisms underlying impaired Ab-dependent gut homeostasis during chronic HIV-1 infection.


Subject(s)
Gastrointestinal Tract , HIV Infections , Immunoglobulin A , Somatic Hypermutation, Immunoglobulin , Dysbiosis , Gastrointestinal Tract/immunology , Gastrointestinal Tract/virology , HIV Infections/genetics , HIV Infections/immunology , HIV-1 , Humans , Immunity, Humoral , Immunoglobulin A/genetics
7.
Pediatr Diabetes ; 2023, 2023.
Article in English | MEDLINE | ID: mdl-38765731

ABSTRACT

Given the differential risk of type 1 diabetes (T1D) in offspring of affected fathers versus affected mothers and our observation that T1D cases have differential DNA methylation near the imprinted DLGAP2 gene compared to controls, we examined whether methylation near DLGAP2 mediates the association between T1D family history and T1D risk. In a nested case-control study of 87 T1D cases and 87 controls from the Diabetes Autoimmunity Study in the Young, we conducted causal mediation analyses at 12 DLGAP2 region CpGs to decompose the effect of family history on T1D risk into indirect and direct effects. These effects were estimated from two regression models adjusted for the human leukocyte antigen DR3/4 genotype: a linear regression of family history on methylation (mediator model) and a logistic regression of family history and methylation on T1D (outcome model). For 8 of the 12 CpGs, we identified a significant interaction between T1D family history and methylation on T1D risk. Accounting for this interaction, we found that the increased risk of T1D for children with affected mothers compared to those with no family history was mediated through differences in methylation at two CpGs (cg27351978, cg00565786) in the DLGAP2 region, as demonstrated by a significant pure natural indirect effect (odds ratio (OR) = 1.98, 95% confidence interval (CI): 1.06-3.71) and nonsignificant total natural direct effect (OR = 1.65, 95% CI: 0.16-16.62) (for cg00565786). In contrast, the increased risk of T1D for children with an affected father or sibling was not explained by DNA methylation changes at these CpGs. Results were similar for cg27351978 and robust in sensitivity analyses. Lastly, we found that DNA methylation in the DLGAP2 region was associated (P < 0.05) with gene expression of nearby protein-coding genes DLGAP2, ARHGEF10, ZNF596, and ERICH1. 
Results indicate that the maternal protective effect conferred through exposure to T1D in utero may operate through changes to DNA methylation that have functional downstream consequences.


Subject(s)
DNA Methylation , Diabetes Mellitus, Type 1 , Genetic Predisposition to Disease , Humans , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/epidemiology , Female , Male , Case-Control Studies , Child , Child, Preschool , Adolescent , GTPase-Activating Proteins/genetics , CpG Islands , Risk Factors , Nerve Tissue Proteins
8.
Environ Res ; 231(Pt 2): 116215, 2023 08 15.
Article in English | MEDLINE | ID: mdl-37224946

ABSTRACT

BACKGROUND: Per- and polyfluoroalkyl substances (PFAS) are ubiquitous, environmentally persistent chemicals, and prenatal exposures have been associated with adverse child health outcomes. Prenatal PFAS exposure may lead to epigenetic age acceleration (EAA), defined as the discrepancy between an individual's chronologic and epigenetic or biological age. OBJECTIVES: We estimated associations of maternal serum PFAS concentrations with EAA in umbilical cord blood DNA methylation using linear regression, and a multivariable exposure-response function of the PFAS mixture using Bayesian kernel machine regression. METHODS: Five PFAS were quantified in maternal serum (median: 27 weeks of gestation) among 577 mother-infant dyads from a prospective cohort. Cord blood DNA methylation data were assessed with the Illumina HumanMethylation450 array. EAA was calculated as the residuals from regressing gestational age on epigenetic age, calculated using a cord-blood specific epigenetic clock. Linear regression tested for associations between each maternal PFAS concentration with EAA. Bayesian kernel machine regression with hierarchical selection estimated an exposure-response function for the PFAS mixture. RESULTS: In single pollutant models we observed an inverse relationship between perfluorodecanoate (PFDA) and EAA (-0.148 weeks per log-unit increase, 95% CI: -0.283, -0.013). Mixture analysis with hierarchical selection between perfluoroalkyl carboxylates and sulfonates indicated the carboxylates had the highest group posterior inclusion probability (PIP), or relative importance. Within this group, PFDA had the highest conditional PIP. Univariate predictor-response functions indicated PFDA and perfluorononanoate were inversely associated with EAA, while perfluorohexane sulfonate had a positive association with EAA. 
CONCLUSIONS: Maternal mid-pregnancy serum concentrations of PFDA were negatively associated with EAA in cord blood, suggesting a pathway by which prenatal PFAS exposures may affect infant development. No significant associations were observed with other PFAS. Mixture models suggested opposite directions of association between perfluoroalkyl sulfonates and carboxylates. Future studies are needed to determine the importance of neonatal EAA for later child health outcomes.
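
The residual-based definition of EAA used above can be sketched with an ordinary least-squares fit. This is an illustration only: the function name `regression_residuals` is hypothetical, and treating epigenetic age as the response and gestational age as the predictor is an assumption about the direction of the regression, which the abstract words ambiguously.

```python
from statistics import mean


def regression_residuals(x, y):
    """Residuals of y after a simple least-squares fit on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical values in weeks; a positive residual means the epigenetic
# clock runs ahead of gestational age (acceleration) under this assumption.
gestational_age = [38.0, 39.0, 40.0, 41.0]
epigenetic_age = [38.5, 38.8, 40.4, 40.9]
eaa = regression_residuals(gestational_age, epigenetic_age)
```

Because the residuals come from a fit with an intercept, they sum to zero across the cohort, so EAA is a relative rather than absolute measure.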


Subject(s)
Alkanesulfonic Acids , Environmental Pollutants , Fluorocarbons , Prenatal Exposure Delayed Effects , Infant , Infant, Newborn , Pregnancy , Child , Female , Humans , Fetal Blood , Prenatal Exposure Delayed Effects/chemically induced , Prospective Studies , Bayes Theorem , Alkanesulfonates , Mothers , Carboxylic Acids , Epigenesis, Genetic
9.
Int J Mol Sci ; 24(9)2023 Apr 23.
Article in English | MEDLINE | ID: mdl-37175432

ABSTRACT

Intrauterine smoke (IUS) exposure during early childhood has been associated with a number of negative health consequences, including reduced lung function and asthma susceptibility. The biological mechanisms underlying these associations have not been established. MicroRNAs regulate the expression of numerous genes involved in lung development. Thus, investigation of the impact of IUS on miRNA expression during human lung development may elucidate the impact of IUS on post-natal respiratory outcomes. We sought to investigate the effect of IUS exposure on miRNA expression during early lung development. We hypothesized that miRNA-mRNA networks are dysregulated by IUS during human lung development and that these miRNAs may be associated with future risk of asthma and allergy. Human fetal lung samples from a prenatal tissue retrieval program were tested for differential miRNA expression with IUS exposure (measured using placental cotinine concentration). RNA was extracted and miRNA-sequencing was performed. We performed differential expression using IUS exposure, with covariate adjustment. We also considered the above model with an additional sex-by-IUS interaction term, allowing IUS effects to differ by male and female samples. Using paired gene expression profiles, we created sex-stratified miRNA-mRNA correlation networks predictive of IUS using DIABLO. We additionally evaluated whether miRNAs were associated with asthma and allergy outcomes in a cohort of childhood asthma. We profiled pseudoglandular lung miRNA in n = 298 samples, 139 (47%) of which had evidence of IUS exposure. Of 515 miRNAs, 25 were significantly associated with intrauterine smoke exposure (q-value < 0.10). The IUS associated miRNAs were correlated with well-known asthma genes (e.g., ORM1-Like Protein 3, ORMDL3) and enriched in disease-relevant pathways (oxidative stress). 
Eleven IUS-miRNAs were also correlated with clinical measures (e.g., Immunoglobulin E and lung function) in children with asthma, further supporting their likely disease relevance. Lastly, we found substantial differences in IUS effects by sex, finding 95 significant IUS-miRNAs in male samples, but only four miRNAs in female samples. The miRNA-mRNA correlation networks were predictive of IUS (AUC = 0.78 in males and 0.86 in females) and suggested that IUS-miRNAs are involved in regulation of disease-relevant genes (e.g., A disintegrin and metalloproteinase domain 19 (ADAM19), LBH regulator of WNT signaling (LBH)) and sex hormone signaling (Coactivator associated methyltransferase 1 (CARM1)). Our study demonstrated differential expression of miRNAs by IUS during early prenatal human lung development, which may be modified by sex. Based on their gene targets and correlation to clinical asthma and atopy outcomes, these IUS-miRNAs may be relevant for subsequent allergy and asthma risk. Our study provides insight into the impact of IUS in human fetal lung transcriptional networks and on the developmental origins of asthma and allergic disorders.


Subject(s)
Asthma , MicroRNAs , Child , Humans , Male , Female , Child, Preschool , Pregnancy , Smoke , Placenta/metabolism , Asthma/genetics , Lung/metabolism , MicroRNAs/genetics , MicroRNAs/metabolism , RNA, Messenger/genetics
10.
BMC Bioinformatics ; 23(1): 179, 2022 May 16.
Article in English | MEDLINE | ID: mdl-35578165

ABSTRACT

When analyzing large datasets from high-throughput technologies, researchers often encounter missing quantitative measurements, which are particularly frequent in metabolomics datasets. Metabolomics, the comprehensive profiling of metabolite abundances, are typically measured using mass spectrometry technologies that often introduce missingness via multiple mechanisms: (1) the metabolite signal may be smaller than the instrument limit of detection; (2) the conditions under which the data are collected and processed may lead to missing values; (3) missing values can be introduced randomly. Missingness resulting from mechanism (1) would be classified as Missing Not At Random (MNAR), that from mechanism (2) would be Missing At Random (MAR), and that from mechanism (3) would be classified as Missing Completely At Random (MCAR). Two common approaches for handling missing data are the following: (1) omit missing data from the analysis; (2) impute the missing values. Both approaches may introduce bias and reduce statistical power in downstream analyses such as testing metabolite associations with clinical variables. Further, standard imputation methods in metabolomics often ignore the mechanisms causing missingness and inaccurately estimate missing values within a data set. We propose a mechanism-aware imputation algorithm that leverages a two-step approach in imputing missing values. First, we use a random forest classifier to classify the missing mechanism for each missing value in the data set. Second, we impute each missing value using imputation algorithms that are specific to the predicted missingness mechanism (i.e., MAR/MCAR or MNAR). Using complete data, we conducted simulations, where we imposed different missingness patterns within the data and tested the performance of combinations of imputation algorithms. Our proposed algorithm provided imputations closer to the original data than those using only one imputation algorithm for all the missing values. 
Consequently, our two-step approach was able to reduce bias for improved downstream analyses.
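
The second step of the two-step approach above (impute each missing value with a method matched to its predicted mechanism) can be sketched as follows. This is a simplified illustration rather than the published algorithm: the function name `impute_by_mechanism` is hypothetical, the mechanism labels are assumed to be supplied already (the paper predicts them with a random forest classifier), and half-minimum and mean fills are simple stand-ins for the mechanism-specific imputation algorithms.

```python
from statistics import mean


def impute_by_mechanism(values, mechanisms):
    """Fill missing entries of one metabolite according to a mechanism label.

    values:     abundances with None marking a missing measurement
    mechanisms: parallel list; "MNAR" or "MAR/MCAR" where the value is
                missing, None where it was observed (labels assumed given)
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("at least one observed value is required")
    low_fill = min(observed) / 2   # stand-in for below-detection-limit values
    mid_fill = mean(observed)      # stand-in for values missing at random

    imputed = []
    for value, mechanism in zip(values, mechanisms):
        if value is not None:
            imputed.append(value)
        elif mechanism == "MNAR":
            imputed.append(low_fill)
        else:
            imputed.append(mid_fill)
    return imputed
```

Routing the fill by mechanism keeps values presumed to be below the detection limit low instead of pulling them toward the observed mean.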


Subject(s)
Algorithms , Metabolomics , Bias , Mass Spectrometry/methods , Metabolomics/methods
11.
Genet Epidemiol ; 45(6): 593-603, 2021 09.
Article in English | MEDLINE | ID: mdl-34130352

ABSTRACT

Omics studies frequently use samples collected during cohort studies. Conditioning on sample availability can cause selection bias if sample availability is nonrandom. Inverse probability weighting (IPW) is purported to reduce this bias. We evaluated IPW in an epigenome-wide analysis testing the association between DNA methylation (261,435 probes) and age in healthy adolescent subjects (n = 114). We simulated age and sex to be correlated with sample selection and then evaluated four conditions: complete population/no selection bias (all subjects), naïve selection bias (no adjustment), and IPW selection bias (selection bias with IPW adjustment). Assuming the complete population condition represented the "truth," we compared each condition to the complete population condition. Bias or difference in associations between age and methylation was reduced in the IPW condition versus the naïve condition. However, genomic inflation and type 1 error were higher in the IPW condition relative to the naïve condition. Postadjustment using bacon, type 1 error and inflation were similar across all conditions. Power was higher under the IPW condition compared with the naïve condition before and after inflation adjustment. IPW methods can reduce bias in genome-wide analyses. Genomic inflation is a potential concern that can be minimized using methods that adjust for inflation.
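
The weighting step behind IPW can be sketched as a weighted least-squares slope in which each selected subject is weighted by the inverse of their selection probability. This is a minimal illustration, not the analysis pipeline from the study: the function name `ipw_slope` is hypothetical, and the selection probabilities are assumed to have been estimated beforehand (for example from a model of sample availability).

```python
def ipw_slope(x, y, selection_prob):
    """Weighted least-squares slope of y on x with weights 1 / P(selected).

    x, y:           predictor and outcome for the selected subjects
    selection_prob: each subject's estimated probability of being selected
                    (assumed to be supplied; values must be > 0)
    """
    weights = [1.0 / p for p in selection_prob]
    total = sum(weights)
    mx = sum(w * xi for w, xi in zip(weights, x)) / total
    my = sum(w * yi for w, yi in zip(weights, y)) / total
    numerator = sum(w * (xi - mx) * (yi - my) for w, xi, yi in zip(weights, x, y))
    denominator = sum(w * (xi - mx) ** 2 for w, xi in zip(weights, x))
    return numerator / denominator
```

Subjects who were unlikely to be sampled receive larger weights, so they stand in for similar subjects who are missing from the analysis; with equal probabilities the estimate reduces to the ordinary least-squares slope.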


Subject(s)
Genome-Wide Association Study , Adolescent , Bias , Cohort Studies , Humans , Probability , Selection Bias
12.
PLoS Pathog ; 16(10): e1008986, 2020 10.
Article in English | MEDLINE | ID: mdl-33064743

ABSTRACT

The Type I Interferons (IFN-Is) are innate antiviral cytokines that include 12 different IFNα subtypes and IFNβ that signal through the IFN-I receptor (IFNAR), inducing hundreds of IFN-stimulated genes (ISGs) that comprise the 'interferome'. Quantitative differences in IFNAR binding correlate with antiviral activity, but whether IFN-Is exhibit qualitative differences remains controversial. Moreover, the IFN-I response is protective during acute HIV-1 infection, but likely pathogenic during the chronic stages. To gain a deeper understanding of the IFN-I response, we compared the interferomes of IFNα subtypes dominantly-expressed in HIV-1-exposed plasmacytoid dendritic cells (1, 2, 5, 8 and 14) and IFNβ in the earliest cellular targets of HIV-1 infection. Primary gut CD4 T cells from 3 donors were treated for 18 hours ex vivo with individual IFN-Is normalized for IFNAR signaling strength. Of 1,969 IFN-regulated genes, 246 'core ISGs' were induced by all IFN-Is tested. However, many IFN-regulated genes were not shared between the IFNα subtypes despite similar induction of canonical antiviral ISGs such as ISG15, RSAD2 and MX1, formally demonstrating qualitative differences between the IFNα subtypes. Notably, IFNβ induced a broader interferome than the individual IFNα subtypes. Since IFNβ, and not IFNα, is upregulated during chronic HIV-1 infection in the gut, we compared core ISGs and IFNβ-specific ISGs from colon pinch biopsies of HIV-1-uninfected (n = 13) versus age- and gender-matched, antiretroviral-therapy naïve persons with HIV-1 (PWH; n = 19). Core ISGs linked to inflammation, T cell activation and immune exhaustion were elevated in PWH, positively correlated with plasma lipopolysaccharide (LPS) levels and gut IFNβ levels, and negatively correlated with gut CD4 T cell frequencies. 
In sharp contrast, IFNβ-specific ISGs linked to protein translation and anti-inflammatory responses were significantly downregulated in PWH, negatively correlated with gut IFNβ and LPS, and positively correlated with plasma IL6 and gut CD4 T cell frequencies. Our findings reveal qualitative differences in interferome induction by diverse IFN-Is and suggest potential mechanisms for how IFNβ may drive HIV-1 pathogenesis in the gut.


Subject(s)
Antiviral Agents/pharmacology , Dendritic Cells/pathology , Gastrointestinal Tract/pathology , HIV Infections/pathology , HIV-1/drug effects , Interferon-alpha/pharmacology , Interferon-beta/pharmacology , Adult , Case-Control Studies , Dendritic Cells/drug effects , Female , Gastrointestinal Tract/drug effects , Gene Expression Profiling , HIV Infections/drug therapy , HIV Infections/virology , Humans , Interferon-alpha/classification , Male , Middle Aged , Young Adult
13.
PLoS Comput Biol ; 17(10): e1008986, 2021 10.
Article in English | MEDLINE | ID: mdl-34679079

ABSTRACT

High-throughput data such as metabolomics, genomics, transcriptomics, and proteomics have become familiar data types within the "-omics" family. For this work, we focus on subsets that interact with one another and represent these "pathways" as graphs. Observed pathways often have disjoint components, i.e., nodes or sets of nodes (metabolites, etc.) not connected to any other within the pathway, which notably lessens testing power. In this paper we propose the Pathway Integrated Regression-based Kernel Association Test (PaIRKAT), a new kernel machine regression method for incorporating known pathway information into the semi-parametric kernel regression framework. This work extends previous kernel machine approaches. This paper also contributes an application of a graph kernel regularization method for overcoming disconnected pathways. By incorporating a regularized or "smoothed" graph into a score test, PaIRKAT can provide more powerful tests for associations between biological pathways and phenotypes of interest and will be helpful in identifying novel pathways for targeted clinical research. We evaluate this method through several simulation studies and an application to real metabolomics data from the COPDGene study. Our simulation studies illustrate the robustness of this method to incorrect and incomplete pathway knowledge, and the real data analysis shows meaningful improvements of testing power in pathways. PaIRKAT was developed for application to metabolomic pathway data, but the techniques are easily generalizable to other data sources with a graph-like structure.


Subject(s)
Metabolome/genetics , Metabolomics/methods , Pulmonary Disease, Chronic Obstructive , Algorithms , Biomarkers/blood , Databases, Genetic , Humans , Phenotype , Pulmonary Disease, Chronic Obstructive/genetics , Pulmonary Disease, Chronic Obstructive/metabolism , Regression Analysis
14.
Environ Res ; 214(Pt 1): 113881, 2022 11.
Article in English | MEDLINE | ID: mdl-35835166

ABSTRACT

BACKGROUND: Prenatal exposure to ambient air pollution has been associated with adverse offspring health outcomes. Childhood health effects of prenatal exposures may be mediated through changes to DNA methylation detectable at birth. METHODS: Among 429 non-smoking women in a cohort study of mother-infant pairs in Colorado, USA, we estimated associations between prenatal exposure to ambient fine particulate matter (PM2.5) and ozone (O3), and epigenome-wide DNA methylation of umbilical cord blood cells at delivery (2010-2014). We calculated average PM2.5 and O3 in each trimester of pregnancy and the full pregnancy using inverse-distance-weighted interpolation. We fit linear regression models adjusted for potential confounders and cell proportions to estimate associations between air pollutants and methylation at each of 432,943 CpGs. Differentially methylated regions (DMRs) were identified using comb-p. Previously in this cohort, we reported positive associations between 3rd trimester O3 exposure and infant adiposity at 5 months of age. Here, we quantified the potential for mediation of that association by changes in DNA methylation in cord blood. RESULTS: We identified several DMRs for each pollutant and period of pregnancy. The greatest number of significant DMRs were associated with third trimester PM2.5 (21 DMRs). No single CpGs were associated with air pollutants at a false discovery rate <0.05. We found that up to 8% of the effect of 3rd trimester O3 on 5-month adiposity may be mediated by locus-specific methylation changes, but mediation estimates were not statistically significant. CONCLUSIONS: Differentially methylated regions in cord blood were identified in association with maternal exposure to PM2.5 and O3. Genes annotated to the significant sites played roles in cardiometabolic disease, immune function and inflammation, and neurologic disorders. We found limited evidence of mediation by DNA methylation of associations between third trimester O3 exposure and 5-month infant adiposity.
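
The inverse-distance-weighted interpolation named in the METHODS can be illustrated with a minimal sketch. It is not the authors' code: the function name `idw_estimate`, the distance exponent of 2, the flat (x, y) coordinates, and the monitor readings are all assumptions made for illustration.

```python
import math


def idw_estimate(target, monitors, power=2.0):
    """Inverse-distance-weighted average of monitor readings at `target`.

    target   -- (x, y) location of interest (hypothetical planar coordinates)
    monitors -- list of ((x, y), value) pairs
    power    -- distance exponent; 2 is a common default, assumed here
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (mx, my), value in monitors:
        distance = math.hypot(target[0] - mx, target[1] - my)
        if distance == 0.0:
            # The target sits exactly on a monitor: use its reading directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one monitor is required")
    return weighted_sum / weight_total


# Made-up readings from two monitors; the target lies midway between them,
# so both receive the same weight.
monitors = [((0.0, 0.0), 8.0), ((2.0, 0.0), 12.0)]
estimate = idw_estimate((1.0, 0.0), monitors)
print(estimate)
```

Nearer monitors dominate the estimate because each weight shrinks with distance raised to `power`; a trimester average would then be the mean of such estimates over the relevant dates.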


Subject(s)
Air Pollutants , Air Pollution , Prenatal Exposure Delayed Effects , Adiposity , Child , Cohort Studies , DNA Methylation , Female , Fetal Blood , Humans , Infant , Newborn Infant , Maternal Exposure , Obesity , Particulate Matter , Pregnancy
15.
BMC Bioinformatics ; 22(1): 423, 2021 Sep 07.
Article in English | MEDLINE | ID: mdl-34493210

ABSTRACT

BACKGROUND: Assessing the reproducibility of measurements is an important first step for improving the reliability of downstream analyses of high-throughput metabolomics experiments. We define a metabolite to be reproducible when it demonstrates consistency across replicate experiments. Similarly, metabolites which are not consistent across replicates can be labeled as irreproducible. In this work, we introduce and evaluate the use of (Ma)ximum (R)ank (R)eproducibility (MaRR) to examine reproducibility in mass spectrometry-based metabolomics experiments. We examine reproducibility across technical or biological samples in three different mass spectrometry metabolomics (MS-Metabolomics) data sets. RESULTS: We apply MaRR, a nonparametric approach that detects the change from reproducible to irreproducible signals using a maximal rank statistic. The advantage of using MaRR over model-based methods is that it does not make parametric assumptions on the underlying distributions or dependence structures of reproducible metabolites. Using three MS-Metabolomics data sets generated in the multi-center Genetic Epidemiology of Chronic Obstructive Pulmonary Disease (COPD) study, we applied the MaRR procedure after data processing to explore reproducibility across technical or biological samples. Under realistic settings of MS-Metabolomics data, the MaRR procedure effectively controls the False Discovery Rate (FDR) when there is a gradual reduction in correlation between replicate pairs for less highly ranked signals. Simulation studies also show that the MaRR procedure tends to have high power for detecting reproducible metabolites in most situations, except when the proportion of reproducible metabolites is small. Bias values for the simulations (i.e., the difference between the estimated and the true proportion of reproducible signals) are also close to zero. The results reported from the real data show a higher level of reproducibility for technical replicates compared to biological replicates across all three data sets. In summary, we demonstrate that the MaRR procedure can be adapted to various experimental designs, and that the nonparametric approach performs consistently well. CONCLUSIONS: This research was motivated by reproducibility, which has proven to be a major obstacle in the use of genomic findings to advance clinical practice. In this paper, we developed a data-driven approach to assess the reproducibility of MS-Metabolomics data sets. The methods described in this paper are implemented in the open-source R package marr, which is freely available from Bioconductor at http://bioconductor.org/packages/marr.
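
The maximal rank statistic mentioned in the RESULTS can be sketched as follows. This is only an illustration, assuming the statistic for a metabolite is the larger of its two ranks across a replicate pair (rank 1 being the strongest signal). It is written in Python rather than R, does not use the marr package, and omits the change-point estimation and FDR control the abstract describes. The function names and intensity values are hypothetical.

```python
def rank_descending(values):
    """Rank values so the largest gets rank 1 (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def max_rank_statistics(replicate_a, replicate_b):
    """Per-feature maximum of the two within-replicate ranks."""
    if len(replicate_a) != len(replicate_b):
        raise ValueError("replicates must measure the same features")
    ranks_a = rank_descending(replicate_a)
    ranks_b = rank_descending(replicate_b)
    return [max(a, b) for a, b in zip(ranks_a, ranks_b)]


# Made-up intensities for four metabolites measured in two replicates.
replicate_a = [9.1, 7.4, 3.2, 5.0]
replicate_b = [8.8, 7.9, 4.9, 2.1]
max_ranks = max_rank_statistics(replicate_a, replicate_b)
print(max_ranks)
```

A metabolite ranked highly in both replicates keeps a small maximum rank, while one that ranks poorly in either replicate is pushed toward the bottom, which is the intuition behind separating reproducible from irreproducible signals.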


Subject(s)
Metabolomics , Mass Spectrometry , Reproducibility of Results
16.
BMC Bioinformatics ; 22(1): 41, 2021 Feb 01.
Article in English | MEDLINE | ID: mdl-33526006

ABSTRACT

BACKGROUND: The drive to understand how microbial communities interact with their environments has inspired innovations across many fields. The data generated from sequence-based analyses of microbial communities typically are of high dimensionality and can involve multiple data tables consisting of taxonomic or functional gene/pathway counts. Merging multiple high dimensional tables with study-related metadata can be challenging. Existing microbiome pipelines available in R have created their own data structures to manage this problem. However, these data structures may be unfamiliar to analysts new to microbiome data or R and do not allow for deviations from internal workflows. Existing analysis tools also focus primarily on community-level analyses and exploratory visualizations, as opposed to analyses of individual taxa. RESULTS: We developed the R package "tidyMicro" to serve as a more complete microbiome analysis pipeline. This open source software provides all of the essential tools available in other popular packages (e.g., management of sequence count tables, standard exploratory visualizations, and diversity inference tools) supplemented with multiple options for regression modelling (e.g., negative binomial, beta binomial, and/or rank based testing) and novel visualizations to improve interpretability (e.g., Rocky Mountain plots, longitudinal ordination plots). This comprehensive pipeline for microbiome analysis also maintains data structures familiar to R users to improve analysts' control over workflow. A complete vignette is provided to aid new users in analysis workflow. CONCLUSIONS: tidyMicro provides a reliable alternative to popular microbiome analysis packages in R. We provide standard tools as well as novel extensions on standard analyses to improve the interpretability of results while maintaining object malleability to encourage open source collaboration. The simple examples and full workflow from the package are reproducible and applicable to external data sets.


Subject(s)
Data Analysis , Microbiota , Software , Workflow
17.
Diabetologia ; 64(8): 1785-1794, 2021 08.
Article in English | MEDLINE | ID: mdl-33893822

ABSTRACT

AIMS/HYPOTHESIS: Oxylipins are lipid mediators derived from polyunsaturated fatty acids. Some oxylipins are proinflammatory (e.g. those derived from arachidonic acid [ARA]), others are pro-resolving of inflammation (e.g. those derived from α-linolenic acid [ALA], docosahexaenoic acid [DHA] and eicosapentaenoic acid [EPA]) and others may be both (e.g. those derived from linoleic acid [LA]). The goal of this study was to examine whether oxylipins are associated with incident type 1 diabetes. METHODS: We conducted a nested case-control analysis in the Diabetes Autoimmunity Study in the Young (DAISY), a prospective cohort study of children at risk of type 1 diabetes. Plasma levels of 14 ARA-derived oxylipins, ten LA-derived oxylipins, six ALA-derived oxylipins, four DHA-derived oxylipins and two EPA-related oxylipins were measured by ultra-HPLC-MS/MS at multiple timepoints related to autoantibody seroconversion in 72 type 1 diabetes cases and 71 control participants, which were frequency matched on age at autoantibody seroconversion (of the case), ethnicity and sample availability. Linear mixed models were used to obtain an age-adjusted mean of each oxylipin prior to type 1 diabetes. Age-adjusted mean oxylipins were tested for association with type 1 diabetes using logistic regression, adjusting for the high risk HLA genotype HLA-DR3/4,DQB1*0302. We also performed principal component analysis of the oxylipins and tested principal components (PCs) for association with type 1 diabetes. Finally, to investigate potential critical timepoints, we examined the association of oxylipins measured before and after autoantibody seroconversion (of the cases) using PCs of the oxylipins at those visits. RESULTS: The ARA-related oxylipin 5-HETE was associated with increased type 1 diabetes risk. Five LA-related oxylipins, two ALA-related oxylipins and one DHA-related oxylipin were associated with decreased type 1 diabetes risk. A profile of elevated LA- and ALA-related oxylipins (PC1) was associated with decreased type 1 diabetes risk (OR 0.61; 95% CI 0.40, 0.94). A profile of elevated ARA-related oxylipins (PC2) was associated with increased diabetes risk (OR 1.53; 95% CI 1.03, 2.29). A critical timepoint analysis showed type 1 diabetes was associated with a high ARA-related oxylipin profile at post-autoantibody-seroconversion but not pre-seroconversion. CONCLUSIONS/INTERPRETATION: The protective association of higher LA- and ALA-related oxylipins demonstrates the importance of both inflammation promotion and resolution in type 1 diabetes. Proinflammatory ARA-related oxylipins may play an important role once the autoimmune process has begun.
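
Odds ratios and 95% confidence intervals like those reported above are conventionally obtained by exponentiating a logistic regression coefficient and its Wald limits. The sketch below shows that conversion only. The coefficient and standard error are hypothetical values chosen for illustration, not estimates from the study, and the function name `odds_ratio_with_ci` is an assumption.

```python
import math


def odds_ratio_with_ci(beta, standard_error, z=1.96):
    """Convert a log-odds coefficient into an odds ratio with a Wald interval.

    beta           -- logistic regression coefficient (log-odds scale)
    standard_error -- standard error of beta
    z              -- normal quantile; 1.96 gives an approximate 95% interval
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical coefficient for a one-unit increase in a principal component.
odds_ratio, lower, upper = odds_ratio_with_ci(beta=-0.5, standard_error=0.2)
print(f"OR {odds_ratio:.2f}; 95% CI {lower:.2f}, {upper:.2f}")
```

An interval that lies entirely below 1 corresponds to a decreased-risk association at the 5% level, and one entirely above 1 to an increased-risk association, which is how the PC1 and PC2 results above are read.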


Subject(s)
Autoimmunity/immunology , Type 1 Diabetes Mellitus/blood , Type 1 Diabetes Mellitus/immunology , Oxylipins/blood , Adolescent , Arachidonic Acid/blood , Autoantibodies/blood , Case-Control Studies , Child , Preschool Child , High Pressure Liquid Chromatography , Docosahexaenoic Acids/blood , Female , Follow-Up Studies , Glutamate Decarboxylase/immunology , HLA-DR3 Antigen/genetics , HLA-DR4 Antigen/genetics , Humans , Insulin/blood , Insulin/immunology , Linoleic Acid/blood , Male , Prospective Studies , Class 8 Receptor-Like Protein Tyrosine Phosphatases/immunology , Tandem Mass Spectrometry
18.
Am J Epidemiol ; 190(7): 1243-1252, 2021 07 01.
Article in English | MEDLINE | ID: mdl-33438003

ABSTRACT

Urbanization increases human mobility in ways that can alter the transmission of classically rural, vector-borne diseases like schistosomiasis. The impact of human mobility on individual-level Schistosoma risk is poorly characterized. Travel outside endemic areas may protect against infection by reducing exposure opportunities, whereas travel to other endemic regions may increase risk. Using detailed monthly travel- and water-contact surveys from 27 rural communities in Sichuan, China, in 2008, we aimed to describe human mobility and to identify mobility-related predictors of S. japonicum infection. Candidate predictors included timing, frequency, distance, duration, and purpose of recent travel as well as water-contact measures. Random forests machine learning was used to detect key predictors of individual infection status. Logistic regression was used to assess the strength and direction of associations. Key mobility-related predictors include frequent travel and travel during July, both associated with decreased probability of infection and less time engaged in risky water-contact behavior, suggesting travel may remove opportunities for schistosome exposure. The importance of July travel and July water contact suggests a high-risk window for cercarial exposure. The frequency and timing of human movement out of endemic areas should be considered when assessing potential drivers of rural infectious diseases.


Subject(s)
Endemic Diseases/statistics & numerical data , Population Dynamics/statistics & numerical data , Rural Population/statistics & numerical data , Schistosomiasis japonica/epidemiology , Travel/statistics & numerical data , Adult , China/epidemiology , Female , Humans , Logistic Models , Male , Middle Aged , Schistosomiasis japonica/etiology , Time Factors , Water Resources
19.
Anal Chem ; 93(4): 1912-1923, 2021 02 02.
Article in English | MEDLINE | ID: mdl-33467846

ABSTRACT

A growing number of software tools have been developed for metabolomics data processing and analysis. Many new tools are contributed by metabolomics practitioners who have limited prior experience with software development, and the tools are subsequently implemented by users with expertise that ranges from basic point-and-click data analysis to advanced coding. This Perspective is intended to introduce metabolomics software users and developers to important considerations that determine the overall impact of a publicly available tool within the scientific community. The recommendations reflect the collective experience of an NIH-sponsored Metabolomics Consortium working group that was formed with the goal of researching guidelines and best practices for metabolomics tool development. The recommendations are aimed at metabolomics researchers with little formal background in programming and are organized into three stages: (i) preparation, (ii) tool development, and (iii) distribution and maintenance.


Subject(s)
Cloud Computing , Metabolomics/methods , Software
20.
Biostatistics ; 21(3): 561-576, 2020 07 01.
Article in English | MEDLINE | ID: mdl-30590505

ABSTRACT

In this article, we develop a graphical modeling framework for the inference of networks across multiple sample groups and data types. In medical studies, this setting arises whenever a set of subjects, which may be heterogeneous due to differing disease stage or subtype, is profiled across multiple platforms, such as metabolomics, proteomics, or transcriptomics data. Our proposed Bayesian hierarchical model first links the network structures within each platform using a Markov random field prior to relate edge selection across sample groups, and then links the network similarity parameters across platforms. This enables joint estimation in a flexible manner, as we make no assumptions on the directionality of influence across the data types or the extent of network similarity across the sample groups and platforms. In addition, our model formulation allows the number of variables and number of subjects to differ across the data types, and only requires that we have data for the same set of groups. We illustrate the proposed approach through both simulation studies and an application to gene expression levels and metabolite abundances on subjects with varying severity levels of chronic obstructive pulmonary disease.
Keywords: Bayesian inference; Chronic obstructive pulmonary disease (COPD); Data integration; Gaussian graphical model; Markov random field prior; Spike and slab prior.


Subject(s)
Biomedical Research/methods , Biostatistics/methods , Statistical Data Interpretation , Statistical Models , Bayes Theorem , Computer Simulation , Datasets as Topic , Gene Expression/physiology , Humans , Markov Chains , Metabolome/physiology , Chronic Obstructive Pulmonary Disease/genetics , Chronic Obstructive Pulmonary Disease/metabolism , Severity of Illness Index