1.
PLoS Comput Biol ; 20(3): e1011814, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38527092

ABSTRACT

As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular to the pathway-level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. PathIntegrate is available as an open-source Python package.


Subjects
Genomics, Multiomics, Genomics/methods
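The molecular-to-pathway transformation described in the abstract above can be illustrated with a minimal sketch. This is not the PathIntegrate package API: the function names, the toy data, and the use of a mean z-score as the single-sample pathway score are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def zscore_columns(matrix):
    """Standardize each molecule (column) across samples (rows)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    z_cols = []
    for j in range(n_cols):
        col = [row[j] for row in matrix]
        mu, sd = mean(col), pstdev(col)
        # Constant columns carry no signal; map them to zero.
        z_cols.append([(v - mu) / sd if sd > 0 else 0.0 for v in col])
    return [[z_cols[j][i] for j in range(n_cols)] for i in range(n_rows)]

def pathway_scores(matrix, molecules, pathways):
    """Return {pathway: [score per sample]}.

    Assumption: the score is the mean z-score of the pathway's measured
    members, a simple stand-in for a single-sample pathway analysis method.
    """
    z = zscore_columns(matrix)
    index = {m: j for j, m in enumerate(molecules)}
    scores = {}
    for name, members in pathways.items():
        cols = [index[m] for m in members if m in index]
        if cols:  # skip pathways with no measured molecules
            scores[name] = [mean(row[j] for j in cols) for row in z]
    return scores

# Hypothetical toy data: 3 samples x 4 molecules.
molecules = ["m1", "m2", "m3", "m4"]
data = [
    [1.0, 2.0, 5.0, 0.0],
    [2.0, 4.0, 5.0, 1.0],
    [3.0, 6.0, 5.0, 2.0],
]
pathways = {
    "pathway_A": ["m1", "m2"],
    "pathway_B": ["m3", "m4"],
    "pathway_C": ["absent"],
}
scores = pathway_scores(data, molecules, pathways)
```

The resulting pathway-by-sample scores would then be passed to a predictive single-view or multi-view model, which this sketch does not attempt to reproduce.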
2.
medRxiv ; 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38464285

ABSTRACT

Background: Studies have identified individual blood biomarkers associated with chronic obstructive pulmonary disease (COPD) and related phenotypes. However, complex diseases such as COPD typically involve changes in multiple molecules with interconnections that may not be captured when considering single molecular features. Methods: Leveraging proteomic data from 3,173 COPDGene Non-Hispanic White (NHW) and African American (AA) participants, we applied sparse multiple canonical correlation network analysis (SmCCNet) to 4,776 proteins assayed on the SomaScan v4.0 platform to derive sparse networks of proteins associated with current vs. former smoking status, airflow obstruction, and emphysema quantitated from high-resolution computed tomography scans. We then used NetSHy, a dimension reduction technique leveraging network topology, to produce summary scores of each proteomic network, referred to as NetSHy scores. We next performed a genome-wide association study (GWAS) to identify variants associated with the NetSHy scores, or network quantitative trait loci (nQTLs). Finally, we evaluated the replicability of the networks in an independent cohort, SPIROMICS. Results: We identified networks of 13 to 104 proteins for each phenotype and exposure in NHW and AA, and the derived NetSHy scores were significantly associated with the variables of interest. Networks included known (sRAGE, ALPP, MIP1) and novel molecules (CA10, CPB1, HIS3, PXDN) and interactions involved in COPD pathogenesis. We observed 7 nQTL loci associated with NetSHy scores, 4 of which remained after conditional analysis. Networks for smoking status and emphysema, but not airflow obstruction, demonstrated a high degree of replicability across race groups and cohorts. Conclusions: In this work, we apply state-of-the-art molecular network generation and summarization approaches to proteomic data from COPDGene participants to uncover protein networks associated with COPD phenotypes. We further identify genetic associations with networks. This work discovers protein networks containing known and novel proteins and protein interactions associated with clinically relevant COPD phenotypes across race groups and cohorts.

3.
bioRxiv ; 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38328226

ABSTRACT

Multiple -omics (genomics, proteomics, etc.) profiles are commonly generated to gain insight into a disease or physiological system. Constructing multi-omics networks with respect to the trait(s) of interest provides an opportunity to understand relationships between molecular features but integration is challenging due to multiple data sets with high dimensionality. One approach is to use canonical correlation to integrate one or two omics types and a single trait of interest. However, these types of methods may be limited due to (1) not accounting for higher-order correlations existing among features, (2) computational inefficiency when extending to more than two omics data when using a penalty term-based sparsity method, and (3) lack of flexibility for focusing on specific correlations (e.g., omics-to-phenotype correlation versus omics-to-omics correlations). In this work, we have developed a novel multi-omics network analysis pipeline called Sparse Generalized Tensor Canonical Correlation Analysis Network Inference (SGTCCA-Net) that can effectively overcome these limitations. We also introduce an implementation to improve the summarization of networks for downstream analyses. Simulation and real-data experiments demonstrate the effectiveness of our novel method for inferring omics networks and features of interest.

4.
bioRxiv ; 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38260498

ABSTRACT

As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular to the pathway-level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. The PathIntegrate Python package is available at https://github.com/cwieder/PathIntegrate.

5.
Obesity (Silver Spring) ; 32(1): 187-199, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37869908

ABSTRACT

OBJECTIVE: Fetal exposures may impact offspring epigenetic signatures and adiposity. The authors hypothesized that maternal metabolic traits associate with cord blood DNA methylation, which, in turn, associates with child adiposity. METHODS: Fasting serum was obtained in 588 pregnant women (27-34 weeks' gestation), and insulin, glucose, high-density lipoprotein cholesterol, triglycerides, and free fatty acids were measured. Cord blood DNA methylation and child adiposity were measured at birth, 4-6 months, and 4-6 years. The association of maternal metabolic traits with DNA methylation (429,246 CpGs) for differentially methylated probes (DMPs) and regions (DMRs) was tested. The association of the first principal component of each DMR with child adiposity was tested, and mediation analysis was performed. RESULTS: Maternal triglycerides were associated with the most DMPs and DMRs of all traits tested (261 and 198, respectively, false discovery rate < 0.05). DMRs were near genes involved in immune function and lipid metabolism. Triglyceride-associated CpGs were associated with child adiposity at 4-6 months (32 CpGs) and 4-6 years (2 CpGs). One, near CD226, was observed at both timepoints, mediating 10% and 22% of the relationship between maternal triglycerides and child adiposity at 4-6 months and 4-6 years, respectively. CONCLUSIONS: DNA methylation may play a role in the association of maternal triglycerides and child adiposity.


Subjects
Adiposity, DNA Methylation, Newborn Infant, Child, Humans, Female, Pregnancy, Triglycerides, Adiposity/genetics, Lipid Metabolism/genetics, Fetal Blood/metabolism, Obesity/metabolism
6.
bioRxiv ; 2024 Apr 07.
Article in English | MEDLINE | ID: mdl-38045372

ABSTRACT

Summary: Sparse multiple canonical correlation network analysis (SmCCNet) is a machine learning technique for integrating omics data along with a variable of interest (e.g., phenotype of complex disease), and reconstructing multi-omics networks that are specific to this variable. We present the second-generation SmCCNet (SmCCNet 2.0) that adeptly integrates single or multiple omics data types along with a quantitative or binary phenotype of interest. In addition, this new package offers a streamlined setup process that can be configured manually or automatically, ensuring a flexible and user-friendly experience. Availability: This package is available in both CRAN: https://cran.r-project.org/web/packages/SmCCNet/index.html and Github: https://github.com/KechrisLab/SmCCNet under the MIT license. The network visualization tool is available at https://smccnet.shinyapps.io/smccnetnetwork/.

9.
BMC Bioinformatics ; 24(1): 398, 2023 Oct 25.
Article in English | MEDLINE | ID: mdl-37880571

ABSTRACT

BACKGROUND: In this paper, we are interested in interactions between a high-dimensional -omics dataset and clinical covariates. The goal is to evaluate the relationship between a phenotype of interest and a high-dimensional omics pathway, where the effect of the omics data depends on subjects' clinical covariates (age, sex, smoking status, etc.). For instance, metabolic pathways can vary greatly between sexes which may also change the relationship between certain metabolic pathways and a clinical phenotype of interest. We propose partitioning the clinical covariate space and performing a kernel association test within those partitions. To illustrate this idea, we focus on hierarchical partitions of the clinical covariate space and kernel tests on metabolic pathways. RESULTS: We see that our proposed method outperforms competing methods in most simulation scenarios. It can identify different relationships among clinical groups with higher power in most scenarios while maintaining a proper Type I error rate. The simulation studies also show a robustness to the grouping structure within the clinical space. We also apply the method to the COPDGene study and find several clinically meaningful interactions between metabolic pathways, the clinical space, and lung function. CONCLUSION: TreeKernel provides a simple and interpretable process for testing for relationships between high-dimensional omics data and clinical outcomes in the presence of interactions within clinical cohorts. The method is broadly applicable to many studies.


Subjects
Chronic Obstructive Pulmonary Disease, Humans, Phenotype, Computer Simulation
10.
Epigenetics ; 18(1): 2254971, 2023 12.
Article in English | MEDLINE | ID: mdl-37691382

ABSTRACT

Background: 'Epigenetic clocks' have been developed to accurately predict chronologic gestational age and have been associated with child health outcomes in prior work. Methods: We meta-analysed results from four prospective U.S. cohorts investigating the association between epigenetic age acceleration estimated using blood DNA methylation collected at birth and preschool age Childhood Behavior Checklist (CBCL) scores. Results: Epigenetic ageing was not significantly associated with CBCL total problem scores (β = 0.33, 95% CI: -0.95, 0.28) and DSM-oriented pervasive development problem scores (β = -0.23, 95% CI: -0.61, 0.15). No associations were observed for other DSM-oriented subscales. Conclusions: The meta-analysis results suggest that epigenetic gestational age acceleration is not associated with child emotional and behavioural functioning in the preschool age group. These findings may relate to our study population, which includes two cohorts enriched for ASD and one preterm birth cohort; future work should address the role of epigenetic age in child health in other study populations. Abbreviations: DNAm: DNA methylation; CBCL: Child Behavioral Checklist; ECHO: Environmental Influences on Child Health Outcomes; EARLI: Early Autism Risk Longitudinal Investigation; MARBLES: Markers of Autism Risk in Babies - Learning Early Signs; ELGAN: Extremely Low Gestational Age Newborns; ASD: autism spectrum disorder; BMI: body mass index; DSM: Diagnostic and Statistical Manual of Mental Disorders.


Subjects
Autism Spectrum Disorder, Premature Birth, Preschool Child, Humans, Newborn Infant, DNA Methylation, Genetic Epigenesis, Prospective Studies
11.
Sci Rep ; 13(1): 13862, 2023 08 24.
Article in English | MEDLINE | ID: mdl-37620507

ABSTRACT

Quantitative assessment of emphysema in CT scans has mostly focused on calculating the percentage of lung tissue that is deemed abnormal based on a density thresholding strategy. However, this overall measure of disease burden discards virtually all the spatial information encoded in the scan that is implicitly utilized in a visual assessment. This simplification is likely grouping heterogeneous disease patterns and is potentially obscuring clinical phenotypes and variable disease outcomes. To overcome this, several methods that attempt to quantify heterogeneity in emphysema distribution have been proposed. Here, we compare three of those: one based on estimating a power law for the size distribution of contiguous emphysema clusters, a second that looks at the number of emphysema-to-emphysema voxel adjacencies, and a third that applies a parametric spatial point process model to the emphysema voxel locations. This was done using data from 587 individuals from Phase 1 of COPDGene that had an inspiratory CT scan and plasma protein abundance measurements. The associations between these imaging metrics and visual assessment with clinical measures (FEV1, FEV1-FVC ratio, etc.) and plasma protein biomarker levels were evaluated using a variety of regression models. Our results showed that a selection of spatial measures had the ability to discern heterogeneous patterns among CTs that had similar emphysema burdens. The most informative quantitative measure, average cluster size from the point process model, showed much stronger associations with nearly every clinical outcome examined than existing CT-derived emphysema metrics and visual assessment. Moreover, approximately 75% more plasma biomarkers were found to be associated with an emphysema heterogeneity phenotype when accounting for spatial clustering measures than when they were excluded.


Subjects
Emphysema, Pulmonary Emphysema, Humans, Pulmonary Emphysema/diagnostic imaging, Emphysema/diagnostic imaging, Benchmarking, Lung/diagnostic imaging, Cluster Analysis
12.
Obesity (Silver Spring) ; 31(8): 2090-2102, 2023 08.
Article in English | MEDLINE | ID: mdl-37475691

ABSTRACT

OBJECTIVE: Fat content of adipocytes derived from infant umbilical cord mesenchymal stem cells (MSCs) predicts adiposity in children through 4 to 6 years of age. This study tested the hypothesis that MSCs from infants born to mothers with obesity (Ob-MSCs) exhibit adipocyte hypertrophy and perturbations in genes regulating adipogenesis compared with MSCs from infants of mothers with normal weight (NW-MSCs). METHODS: Adipogenesis was induced in MSCs embedded in three-dimensional hydrogel structures, and cell size and number were measured by three-dimensional imaging. Proliferation and protein markers of proliferation and adipogenesis in undifferentiated and adipocyte differentiating cells were measured. RNA sequencing was performed to determine pathways linked to adipogenesis phenotype. RESULTS: In undifferentiated MSCs, greater zinc finger protein (Zfp)423 protein content was observed in Ob- versus NW-MSCs. Adipocytes from Ob-MSCs were larger but fewer than adipocytes from NW-MSCs. RNA sequencing analysis showed that Zfp423 protein correlated with mRNA expression of genes enriched for cell cycle, MSC lineage specification, inflammation, and metabolism pathways. MSC proliferation was not different before differentiation but declined faster in Ob-MSCs upon adipogenic induction. CONCLUSIONS: Ob-MSCs have an intrinsic propensity for adipocyte hypertrophy and reduced hyperplasia during adipogenesis in vitro, perhaps linked to greater Zfp423 content and changes in cell cycle pathway gene expression.


Subjects
Mesenchymal Stem Cells, Mothers, Female, Humans, Obesity/genetics, Obesity/metabolism, Cell Differentiation/genetics, Adipogenesis/genetics, Mesenchymal Stem Cells/metabolism, Transcription Factors/metabolism, Adipocytes/metabolism, Hypertrophy/metabolism
13.
Sci Rep ; 13(1): 9254, 2023 06 07.
Article in English | MEDLINE | ID: mdl-37286633

ABSTRACT

Privacy protection is a core principle of genomic but not proteomic research. We identified independent single nucleotide polymorphism (SNP) protein quantitative trait loci (pQTL) from COPDGene and Jackson Heart Study (JHS), calculated continuous protein level genotype probabilities, and then applied a naïve Bayesian approach to link SomaScan 1.3K proteomes to genomes for 2812 independent subjects from COPDGene, JHS, SubPopulations and InteRmediate Outcome Measures In COPD Study (SPIROMICS) and Multi-Ethnic Study of Atherosclerosis (MESA). We correctly linked 90-95% of proteomes to their correct genome and for 95-99% we identify the 1% most likely links. The linking accuracy in subjects with African ancestry was lower (~60%) unless training included diverse subjects. With larger profiling (SomaScan 5K) in the Atherosclerosis Risk in Communities (ARIC) study, correct identification was >99% even in mixed ancestry populations. We also linked proteomes-to-proteomes and used the proteome only to determine features such as sex, ancestry, and first-degree relatives. When serial proteomes are available, the linking algorithm can be used to identify and correct mislabeled samples. This work also demonstrates the importance of including diverse populations in omics research and that large proteomic datasets (>1000 proteins) can be accurately linked to a specific genome through pQTL knowledge and should not be considered unidentifiable.


Subjects
Atherosclerosis, Proteome, Humans, Proteome/genetics, Bayes Theorem, Privacy, Genome-Wide Association Study, Atherosclerosis/genetics, Single Nucleotide Polymorphism
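The naïve Bayesian linking idea in the abstract above can be sketched as follows. This is an illustrative toy, not the authors' implementation: the Gaussian model of protein level given genotype, the uniform prior over genomes, and all names and values (`link_proteome`, `rs_a`, `P1`, and so on) are assumptions.

```python
import math

def gaussian_logpdf(x, mu, sd):
    """Log density of a normal distribution with mean mu and std dev sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)

def link_proteome(protein_levels, genomes, pqtl_model):
    """Return (best_genome_id, log_likelihood_per_genome).

    protein_levels: {protein: measured level}
    genomes:        {genome_id: {snp: genotype coded 0/1/2}}
    pqtl_model:     {protein: (snp, {genotype: (mean, sd)})}

    Naive Bayes assumption: pQTL proteins are treated as independent given
    the genome, so per-protein log-likelihoods are summed.
    """
    loglik = {}
    for gid, genotypes in genomes.items():
        total = 0.0
        for protein, (snp, params) in pqtl_model.items():
            if protein in protein_levels and snp in genotypes:
                mu, sd = params[genotypes[snp]]
                total += gaussian_logpdf(protein_levels[protein], mu, sd)
        loglik[gid] = total
    # With a uniform prior, the maximum-likelihood genome is also the
    # maximum a posteriori link.
    best = max(loglik, key=loglik.get)
    return best, loglik

# Hypothetical pQTL model: expected protein level for each genotype.
pqtl_model = {
    "P1": ("rs_a", {0: (0.0, 1.0), 1: (1.0, 1.0), 2: (2.0, 1.0)}),
    "P2": ("rs_b", {0: (5.0, 1.0), 1: (4.0, 1.0), 2: (3.0, 1.0)}),
}
genomes = {
    "G1": {"rs_a": 0, "rs_b": 2},
    "G2": {"rs_a": 2, "rs_b": 0},
}
proteome = {"P1": 1.9, "P2": 5.2}
best, loglik = link_proteome(proteome, genomes, pqtl_model)
```

With only two pQTLs the evidence is weak; the abstract's point is that hundreds to thousands of proteins make the summed evidence strong enough to single out one genome.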
14.
Int J Mol Sci ; 24(9)2023 Apr 23.
Article in English | MEDLINE | ID: mdl-37175432

ABSTRACT

Intrauterine smoke (IUS) exposure during early childhood has been associated with a number of negative health consequences, including reduced lung function and asthma susceptibility. The biological mechanisms underlying these associations have not been established. MicroRNAs regulate the expression of numerous genes involved in lung development. Thus, investigation of the impact of IUS on miRNA expression during human lung development may elucidate the impact of IUS on post-natal respiratory outcomes. We sought to investigate the effect of IUS exposure on miRNA expression during early lung development. We hypothesized that miRNA-mRNA networks are dysregulated by IUS during human lung development and that these miRNAs may be associated with future risk of asthma and allergy. Human fetal lung samples from a prenatal tissue retrieval program were tested for differential miRNA expression with IUS exposure (measured using placental cotinine concentration). RNA was extracted and miRNA-sequencing was performed. We performed differential expression using IUS exposure, with covariate adjustment. We also considered the above model with an additional sex-by-IUS interaction term, allowing IUS effects to differ by male and female samples. Using paired gene expression profiles, we created sex-stratified miRNA-mRNA correlation networks predictive of IUS using DIABLO. We additionally evaluated whether miRNAs were associated with asthma and allergy outcomes in a cohort of childhood asthma. We profiled pseudoglandular lung miRNA in n = 298 samples, 139 (47%) of which had evidence of IUS exposure. Of 515 miRNAs, 25 were significantly associated with intrauterine smoke exposure (q-value < 0.10). The IUS associated miRNAs were correlated with well-known asthma genes (e.g., ORM1-Like Protein 3, ORMDL3) and enriched in disease-relevant pathways (oxidative stress). Eleven IUS-miRNAs were also correlated with clinical measures (e.g., Immunoglobulin E and lung function) in children with asthma, further supporting their likely disease relevance. Lastly, we found substantial differences in IUS effects by sex, finding 95 significant IUS-miRNAs in male samples, but only four miRNAs in female samples. The miRNA-mRNA correlation networks were predictive of IUS (AUC = 0.78 in males and 0.86 in females) and suggested that IUS-miRNAs are involved in regulation of disease-relevant genes (e.g., A disintegrin and metalloproteinase domain 19 (ADAM19), LBH regulator of WNT signaling (LBH)) and sex hormone signaling (Coactivator associated methyltransferase 1 (CARM1)). Our study demonstrated differential expression of miRNAs by IUS during early prenatal human lung development, which may be modified by sex. Based on their gene targets and correlation to clinical asthma and atopy outcomes, these IUS-miRNAs may be relevant for subsequent allergy and asthma risk. Our study provides insight into the impact of IUS in human fetal lung transcriptional networks and on the developmental origins of asthma and allergic disorders.


Subjects
Asthma, MicroRNAs, Child, Humans, Male, Female, Preschool Child, Pregnancy, Smoke, Placenta/metabolism, Asthma/genetics, Lung/metabolism, MicroRNAs/genetics, MicroRNAs/metabolism, Messenger RNA/genetics
15.
Environ Res ; 231(Pt 2): 116215, 2023 08 15.
Article in English | MEDLINE | ID: mdl-37224946

ABSTRACT

BACKGROUND: Per- and polyfluoroalkyl substances (PFAS) are ubiquitous, environmentally persistent chemicals, and prenatal exposures have been associated with adverse child health outcomes. Prenatal PFAS exposure may lead to epigenetic age acceleration (EAA), defined as the discrepancy between an individual's chronologic and epigenetic or biological age. OBJECTIVES: We estimated associations of maternal serum PFAS concentrations with EAA in umbilical cord blood DNA methylation using linear regression, and a multivariable exposure-response function of the PFAS mixture using Bayesian kernel machine regression. METHODS: Five PFAS were quantified in maternal serum (median: 27 weeks of gestation) among 577 mother-infant dyads from a prospective cohort. Cord blood DNA methylation data were assessed with the Illumina HumanMethylation450 array. EAA was calculated as the residuals from regressing gestational age on epigenetic age, calculated using a cord-blood specific epigenetic clock. Linear regression tested for associations between each maternal PFAS concentration with EAA. Bayesian kernel machine regression with hierarchical selection estimated an exposure-response function for the PFAS mixture. RESULTS: In single pollutant models we observed an inverse relationship between perfluorodecanoate (PFDA) and EAA (-0.148 weeks per log-unit increase, 95% CI: -0.283, -0.013). Mixture analysis with hierarchical selection between perfluoroalkyl carboxylates and sulfonates indicated the carboxylates had the highest group posterior inclusion probability (PIP), or relative importance. Within this group, PFDA had the highest conditional PIP. Univariate predictor-response functions indicated PFDA and perfluorononanoate were inversely associated with EAA, while perfluorohexane sulfonate had a positive association with EAA. 
CONCLUSIONS: Maternal mid-pregnancy serum concentrations of PFDA were negatively associated with EAA in cord blood, suggesting a pathway by which prenatal PFAS exposures may affect infant development. No significant associations were observed with other PFAS. Mixture models suggested opposite directions of association between perfluoroalkyl sulfonates and carboxylates. Future studies are needed to determine the importance of neonatal EAA for later child health outcomes.


Subjects
Alkanesulfonic Acids, Environmental Pollutants, Fluorocarbons, Prenatal Exposure Delayed Effects, Infant, Newborn Infant, Pregnancy, Child, Female, Humans, Fetal Blood, Prenatal Exposure Delayed Effects/chemically induced, Prospective Studies, Bayes Theorem, Alkanesulfonates, Mothers, Carboxylic Acids, Genetic Epigenesis
16.
Front Nutr ; 10: 1040993, 2023.
Article in English | MEDLINE | ID: mdl-37057071

ABSTRACT

Background: Oxylipins are inflammatory biomarkers derived from omega-3 and -6 fatty acids implicated in inflammatory diseases but have not been studied in a genome-wide association study (GWAS). The aim of this study was to identify genetic loci associated with oxylipins and oxylipin profiles to identify biologic pathways and therapeutic targets for oxylipins. Methods: We conducted a GWAS of plasma oxylipins in 316 participants in the Diabetes Autoimmunity Study in the Young (DAISY). DNA samples were genotyped using the TEDDY-T1D Exome array, and additional variants were imputed using the Trans-Omics for Precision Medicine (TOPMed) multi-ancestry reference panel. Principal components analysis of 36 plasma oxylipins was used to capture oxylipin profiles. PC1 represented linoleic acid (LA)- and alpha-linolenic acid (ALA)-related oxylipins, and PC2 represented arachidonic acid (ARA)-related oxylipins. Oxylipin PC1, PC2, and the top five loading oxylipins from each PC were used as outcomes in the GWAS (genome-wide significance: p < 5×10⁻⁸). Results: The SNP rs143070873 was associated with (p < 5×10⁻⁸) the LA-related oxylipin 9-HODE, and rs6444933 (downstream of CLDN11) was associated with the LA-related oxylipin 13 S-HODE. A locus between MIR1302-7 and LOC100131146, rs10118380 and an intronic variant in TRPM3 were associated with the ARA-related oxylipin 11-HETE. These loci are involved in inflammatory signaling cascades and interact with PLA2, an initial step to oxylipin biosynthesis. Conclusion: Genetic loci involved in inflammation and oxylipin metabolism are associated with oxylipin levels.

17.
PLoS One ; 18(4): e0284563, 2023.
Article in English | MEDLINE | ID: mdl-37083575

ABSTRACT

Network approaches have successfully been used to help reveal complex mechanisms of diseases including Chronic Obstructive Pulmonary Disease (COPD). However, despite recent advances, we remain limited in our ability to incorporate protein-protein interaction (PPI) network information with omics data for disease prediction. New deep learning methods including convolution Graph Neural Network (ConvGNN) have shown great potential for disease classification using transcriptomics data and known PPI networks from existing databases. In this study, we first reconstructed the COPD-associated PPI network through the AhGlasso (Augmented High-Dimensional Graphical Lasso Method) algorithm based on one independent transcriptomics dataset including COPD cases and controls. Then we extended the existing ConvGNN methods to successfully integrate COPD-associated PPI, proteomics, and transcriptomics data and developed a prediction model for COPD classification. This approach improves accuracy over several conventional classification methods and neural networks that do not incorporate network information. We also demonstrated that the updated COPD-associated network developed using AhGlasso further improves prediction accuracy. Although deep neural networks often achieve superior statistical power in classification compared to other methods, it can be very difficult to explain how the model, especially graph neural network(s), makes decisions on the given features and identifies the features that contribute the most to prediction generally and individually. To better explain how the spectral-based Graph Neural Network model(s) works, we applied one unified explainable machine learning method, SHapley Additive exPlanations (SHAP), and identified CXCL11, IL-2, CD48, KIR3DL2, TLR2, BMP10 and several other relevant COPD genes in subnetworks of the ConvGNN model for COPD prediction.
Finally, Gene Ontology (GO) enrichment analysis identified glycosaminoglycan, heparin signaling, and carbohydrate derivative signaling pathways significantly enriched in the top important gene/proteins for COPD classifications.


Subjects
Deep Learning, Chronic Obstructive Pulmonary Disease, Humans, Multiomics, Computer Neural Networks, Algorithms, Chronic Obstructive Pulmonary Disease/genetics, Bone Morphogenetic Proteins
18.
JAMA Netw Open ; 6(4): e237030, 2023 04 03.
Article in English | MEDLINE | ID: mdl-37014638

ABSTRACT

Importance: The in utero metabolic milieu is associated with offspring adiposity. Standard definitions of maternal obesity (according to prepregnancy body mass index [BMI]) and gestational diabetes (GDM) may not be adequate to capture subtle yet important differences in the intrauterine environment that could be involved in programming. Objectives: To identify maternal metabolic subgroups during pregnancy and to examine associations of subgroup classification with adiposity traits in their children. Design, Setting, and Participants: This cohort study included mother-offspring pairs in the Healthy Start prebirth cohort (enrollment: 2010-2014) recruited from University of Colorado Hospital obstetrics clinics in Aurora, Colorado. Follow-up of women and children is ongoing. Data were analyzed from March to December 2022. Exposures: Metabolic subtypes of pregnant women ascertained by applying k-means clustering on 7 biomarkers and 2 biomarker indices measured at approximately 17 gestational weeks: glucose, insulin, Homeostatic Model Assessment for Insulin Resistance, total cholesterol, high-density lipoprotein cholesterol (HDL-C), triglycerides, free fatty acids (FFA), HDL-C:triglycerides ratio, and tumor necrosis factor α. Main Outcomes and Measures: Offspring birthweight z score and neonatal fat mass percentage (FM%). In childhood at approximately 5 years of age, offspring BMI percentile, FM%, BMI in the 95th percentile or higher, and FM% in the 95th percentile or higher. Results: A total of 1325 pregnant women (mean [SD] age, 27.8 [6.2] years; 322 [24.3%] Hispanic, 207 [15.6%] non-Hispanic Black, and 713 [53.8%] non-Hispanic White), and 727 offspring with anthropometric data measured in childhood (mean [SD] age 4.81 [0.72] years, 48% female) were included.
We identified the following 5 maternal metabolic subgroups: reference (438 participants), high HDL-C (355 participants), dyslipidemic-high triglycerides (182 participants), dyslipidemic-high FFA (234 participants), and insulin resistant (IR)-hyperglycemic (116 participants). Compared with the reference subgroup, women in the IR-hyperglycemic and dyslipidemic-high FFA subgroups had offspring with 4.27% (95% CI, 1.94-6.59) and 1.96% (95% CI, 0.45-3.47) greater FM% during childhood, respectively. There was a higher risk of high FM% among offspring of the IR-hyperglycemic (relative risk, 8.7; 95% CI, 2.7-27.8) and dyslipidemic-high FFA (relative risk, 3.4; 95% CI, 1.0-11.3) subgroups; this risk was of greater magnitude compared with prepregnancy obesity alone, GDM alone, or both conditions. Conclusions and Relevance: In this cohort study, an unsupervised clustering approach revealed distinct metabolic subgroups of pregnant women. These subgroups exhibited differences in risk of offspring adiposity in early childhood. Such approaches have the potential to refine understanding of the in utero metabolic milieu, with utility for capturing variation in sociocultural, anthropometric, and biochemical risk factors for offspring adiposity.


Subjects
Diabetes, Gestational , Pediatric Obesity , Infant, Newborn , Female , Child , Child, Preschool , Humans , Pregnancy , Adult , Male , Pediatric Obesity/epidemiology , Cohort Studies , Pregnant Women , Blood Glucose/metabolism , Diabetes, Gestational/epidemiology , Insulin , Triglycerides , Cholesterol
19.
BMC Bioinformatics ; 24(1): 86, 2023 Mar 07.
Article in English | MEDLINE | ID: mdl-36882691

ABSTRACT

BACKGROUND: We developed a novel approach to minimize batch effects when assigning samples to batches. Our algorithm selects a batch allocation, among all possible ways of assigning samples to batches, that minimizes differences in average propensity score between batches. This strategy was compared to randomization and stratified randomization in a case-control study (30 per group) with a covariate (case vs control, represented as β1, set to be null) and two biologically relevant confounding variables (age, represented as β2, and hemoglobin A1c (HbA1c), represented as β3). Gene expression values were obtained from a publicly available dataset of expression data obtained from pancreas islet cells. Batch effects were simulated as twice the median biological variation across the gene expression dataset and were added to the publicly available dataset to simulate a batch effect condition. Bias was calculated as the absolute difference between observed betas under the batch allocation strategies and the true beta (no batch effects). Bias was also evaluated after adjustment for batch effects using ComBat as well as a linear regression model. To understand the performance of our optimal allocation strategy under the alternative hypothesis, we also evaluated bias at a single gene associated with both age and HbA1c levels in the 'true' dataset (CAPN13 gene). RESULTS: Pre-batch correction, under the null hypothesis (β1), maximum absolute bias and root mean square (RMS) of maximum absolute bias were minimized using the optimal allocation strategy. Under the alternative hypothesis (β2 and β3 for the CAPN13 gene), maximum absolute bias and RMS of maximum absolute bias were also consistently lower using the optimal allocation strategy. ComBat and the regression batch adjustment methods performed well as the bias estimates moved towards the true values in all conditions under both the null and alternative hypotheses.
Although the differences between methods were less pronounced following batch correction, estimates of bias (average and RMS) were consistently lower using the optimal allocation strategy under both the null and alternative hypotheses. CONCLUSIONS: Our algorithm provides an extremely flexible and effective method for assigning samples to batches by exploiting knowledge of covariates prior to sample allocation.
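
The allocation idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the propensity scores have already been estimated for each sample, it measures "differences in average propensity score between batches" as the gap between the largest and smallest batch mean (one possible reading), and it enumerates allocations exhaustively, which is only feasible for very small sample sizes. The function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def batch_spread(scores, batches):
    """Gap between the largest and smallest batch-mean propensity score.
    (Assumed objective; the abstract does not give the exact criterion.)"""
    means = [mean([scores[i] for i in batch]) for batch in batches]
    return max(means) - min(means)


def enumerate_allocations(indices, batch_size):
    """Yield every partition of `indices` into batches of `batch_size`."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    # Pin the first remaining sample to the current batch so the same
    # partition is not produced again with its batches in another order.
    for mates in combinations(rest, batch_size - 1):
        batch = (first,) + mates
        remaining = [i for i in rest if i not in mates]
        for tail in enumerate_allocations(remaining, batch_size):
            yield [batch] + tail


def optimal_allocation(scores, batch_size):
    """Return the allocation whose batch means are closest together."""
    if len(scores) % batch_size != 0:
        raise ValueError("sample count must be a multiple of batch_size")
    indices = list(range(len(scores)))
    return min(
        enumerate_allocations(indices, batch_size),
        key=lambda batches: batch_spread(scores, batches),
    )


# Hypothetical propensity scores for 8 samples, split into 2 batches of 4.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
best = optimal_allocation(scores, batch_size=4)
print(best, batch_spread(scores, best))
```

Because the number of possible partitions grows very quickly with sample size, a study of realistic size would need a heuristic or sampled search rather than this full enumeration.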


Subjects
Algorithms , Health Status , Propensity Score , Case-Control Studies , Glycated Hemoglobin , Humans
20.
Nutrients ; 15(4)2023 Feb 14.
Article in English | MEDLINE | ID: mdl-36839302

ABSTRACT

Oxylipins, pro-inflammatory and pro-resolving lipid mediators, are associated with the risk of type 1 diabetes (T1D) and may be influenced by diet. This study aimed to develop nutrient patterns related to oxylipin profiles and test their associations with the risk of T1D among youth. The nutrient patterns were developed with a reduced rank regression in a nested case-control study (n = 335) within the Diabetes Autoimmunity Study in the Young (DAISY), a longitudinal cohort of children at risk of T1D. The oxylipin profiles (adjusted for genetic predictors) were the response variables. The nutrient patterns were tested in the case-control study (n = 69 T1D cases, 69 controls), then validated in the DAISY cohort using a joint Cox proportional hazards model (n = 1933, including 81 T1D cases). The first nutrient pattern (NP1) was characterized by low beta-cryptoxanthin, flavanone, vitamin C, total sugars and iron, and high lycopene, anthocyanidins, linoleic acid and sodium. After adjusting for T1D family history, the HLA genotype, sex and race/ethnicity, NP1 was associated with a lower risk of T1D in the nested case-control study (OR: 0.44, p = 0.0126). NP1 was not associated with the risk of T1D (HR: 0.54, p = 0.1829) in the full DAISY cohort. Future studies are needed to confirm the nested case-control findings and investigate the modifiable factors for oxylipins.


Subjects
Diabetes Mellitus, Type 1 , Islets of Langerhans , Child , Adolescent , Humans , Oxylipins , Autoimmunity , Risk Factors , Case-Control Studies , Nutrients