Results 1 - 20 of 35
1.
PLoS Comput Biol ; 20(3): e1011814, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38527092

ABSTRACT

As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular to the pathway-level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. PathIntegrate is available as an open-source Python package.


Subjects
Genomics, Multiomics, Genomics/methods
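The molecular-to-pathway transformation described in this abstract can be illustrated with a minimal sketch. It is not the PathIntegrate API: the function names, the toy data, and the choice of a mean z-score as the single-sample pathway score are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def zscore_columns(matrix):
    """Standardize each molecule (column) across samples."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    z_cols = []
    for j in range(n_cols):
        col = [row[j] for row in matrix]
        mu, sd = mean(col), pstdev(col)
        # Constant columns carry no signal; map them to zero.
        z_cols.append([(v - mu) / sd if sd > 0 else 0.0 for v in col])
    return [[z_cols[j][i] for j in range(n_cols)] for i in range(n_rows)]

def pathway_scores(matrix, molecules, pathways):
    """Return {pathway: [score per sample]}, scoring each pathway as the
    mean z-score of its measured member molecules (an assumed scoring rule)."""
    index = {m: j for j, m in enumerate(molecules)}
    z = zscore_columns(matrix)
    scores = {}
    for name, members in pathways.items():
        cols = [index[m] for m in members if m in index]
        if cols:  # skip pathways with no measured members
            scores[name] = [mean(row[j] for j in cols) for row in z]
    return scores

# Hypothetical toy data: 3 samples x 4 molecules.
molecules = ["m1", "m2", "m3", "m4"]
samples = [
    [1.0, 2.0, 5.0, 0.1],
    [2.0, 3.0, 4.0, 0.2],
    [3.0, 4.0, 3.0, 0.3],
]
pathways = {"pathway_A": ["m1", "m2"], "pathway_B": ["m3", "m4", "m_missing"]}
scores = pathway_scores(samples, molecules, pathways)
print(scores)
```

The resulting pathway-level matrix has one column per pathway rather than per molecule, and it is this matrix that a downstream predictive model would consume.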
2.
BMC Bioinformatics ; 25(1): 276, 2024 Aug 24.
Article in English | MEDLINE | ID: mdl-39179997

ABSTRACT

Sparse multiple canonical correlation network analysis (SmCCNet) is a machine learning technique for integrating omics data along with a variable of interest (e.g., phenotype of complex disease), and reconstructing multi-omics networks that are specific to this variable. We present the second-generation SmCCNet (SmCCNet 2.0) that adeptly integrates single or multiple omics data types along with a quantitative or binary phenotype of interest. In addition, this new package offers a streamlined setup process that can be configured manually or automatically, ensuring a flexible and user-friendly experience. AVAILABILITY : This package is available in both CRAN: https://cran.r-project.org/web/packages/SmCCNet/index.html and Github: https://github.com/KechrisLab/SmCCNet under the MIT license. The network visualization tool is available at https://smccnet.shinyapps.io/smccnetnetwork/ .


Subjects
Machine Learning, Software, Genomics/methods, Gene Regulatory Networks, Computational Biology/methods, Humans, Multiomics
3.
BMC Genomics ; 25(1): 825, 2024 Sep 02.
Article in English | MEDLINE | ID: mdl-39223457

ABSTRACT

BACKGROUND: Studies have identified individual blood biomarkers associated with chronic obstructive pulmonary disease (COPD) and related phenotypes. However, complex diseases such as COPD typically involve changes in multiple molecules with interconnections that may not be captured when considering single molecular features. METHODS: Leveraging proteomic data from 3,173 COPDGene Non-Hispanic White (NHW) and African American (AA) participants, we applied sparse multiple canonical correlation network analysis (SmCCNet) to 4,776 proteins assayed on the SomaScan v4.0 platform to derive sparse networks of proteins associated with current vs. former smoking status, airflow obstruction, and emphysema quantitated from high-resolution computed tomography scans. We then used NetSHy, a dimension reduction technique leveraging network topology, to produce summary scores of each proteomic network, referred to as NetSHy scores. We next performed a genome-wide association study (GWAS) to identify variants associated with the NetSHy scores, or network quantitative trait loci (nQTLs). Finally, we evaluated the replicability of the networks in an independent cohort, SPIROMICS. RESULTS: We identified networks of 13 to 104 proteins for each phenotype and exposure in NHW and AA, and the derived NetSHy scores were significantly associated with the variables of interest. Networks included known (sRAGE, ALPP, MIP1) and novel molecules (CA10, CPB1, HIS3, PXDN) and interactions involved in COPD pathogenesis. We observed 7 nQTL loci associated with NetSHy scores, 4 of which remained after conditional analysis. Networks for smoking status and emphysema, but not airflow obstruction, demonstrated a high degree of replicability across race groups and cohorts. CONCLUSIONS: In this work, we apply state-of-the-art molecular network generation and summarization approaches to proteomic data from COPDGene participants to uncover protein networks associated with COPD phenotypes. We further identify genetic associations with networks. This work discovers protein networks containing known and novel proteins and protein interactions associated with clinically relevant COPD phenotypes across race groups and cohorts.


Subjects
Genome-Wide Association Study, Proteomics, Chronic Obstructive Pulmonary Disease, Smoking, Humans, Chronic Obstructive Pulmonary Disease/genetics, Smoking/genetics, Male, Female, Middle Aged, Aged, Quantitative Trait Loci, Phenotype, Single Nucleotide Polymorphism, Genetic Variation
4.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36548341

ABSTRACT

MOTIVATION: Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. In order to perform subsequent analyses to investigate the association between the identified module and a variable of interest, a module summarization that best explains the module's information and reduces dimensionality is often needed. Conventional approaches for obtaining network representation typically rely only on the profiles of the nodes within the network while disregarding the inherent network topological information. RESULTS: In this article, we propose NetSHy, a hybrid approach which is capable of reducing the dimension of a network while incorporating topological properties to aid the interpretation of the downstream analyses. In particular, NetSHy applies principal component analysis (PCA) on a combination of the node profiles and the well-known Laplacian matrix derived directly from the network similarity matrix to extract a summarization at a subject level. Simulation scenarios based on random and empirical networks at varying network sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly on node profiles, in terms of recovering the true correlation with a phenotype of interest and maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation.
AVAILABILITY AND IMPLEMENTATION: R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome-Wide Association Study, Single Nucleotide Polymorphism, Computer Simulation, Principal Component Analysis, Sample Size
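The idea of running PCA on node profiles combined with a graph Laplacian can be sketched as follows. This is not the authors' R implementation. The combination is assumed here to be the matrix product X·L of the subject-by-node profile matrix X and the unnormalized Laplacian L = D − A, and the first principal component is obtained by plain power iteration. All names and the toy data are hypothetical.

```python
import math

def laplacian(adj):
    """Unnormalized graph Laplacian L = D - A for a symmetric weight matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def center_columns(x):
    """Subtract each column mean so PCA works on centered data."""
    n = len(x)
    means = [sum(row[j] for row in x) / n for j in range(len(x[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in x]

def first_pc_scores(x, iters=500):
    """Subject-level scores on the first principal component (power iteration).
    The start vector is deliberately non-uniform: rows of X·L sum to zero,
    so a uniform start would be annihilated in the first step."""
    xc = center_columns(x)
    p = len(xc[0])
    v = [float(j + 1) for j in range(p)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iters):
        s = [sum(r[j] * v[j] for j in range(p)) for r in xc]             # X v
        w = [sum(xc[i][j] * s[i] for i in range(len(xc))) for j in range(p)]  # X^T X v
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0:
            break
        v = [c / norm for c in w]
    return [sum(r[j] * v[j] for j in range(p)) for r in xc]

# Hypothetical toy data: 4 subjects x 3 network nodes, plus edge weights.
profiles = [[1.0, 2.0, 0.5], [2.0, 1.0, 1.5], [3.0, 4.0, 2.5], [4.0, 3.0, 3.0]]
adj = [[0.0, 0.8, 0.1], [0.8, 0.0, 0.3], [0.1, 0.3, 0.0]]

lap = laplacian(adj)
combined = matmul(profiles, lap)    # assumed combination: X · L
scores = first_pc_scores(combined)  # one summary score per subject
print(scores)
```

Each subject ends up with a single score that reflects both its node profile and the network's edge structure, which is the kind of summary that could then be related to a phenotype.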
5.
Mamm Genome ; 27(9-10): 469-84, 2016 10.
Article in English | MEDLINE | ID: mdl-27401171

ABSTRACT

Gene co-expression analysis has proven to be a powerful tool for ascertaining the organization of gene products into networks that are important for organ function. An organ, such as the liver, engages in a multitude of functions important for the survival of humans, rats, and other animals; these liver functions include energy metabolism, metabolism of xenobiotics, immune system function, and hormonal homeostasis. With the availability of organ-specific transcriptomes, we can now examine the role of RNA transcripts (both protein-coding and non-coding) in these functions. A systems genetic approach for identifying and characterizing liver gene networks within a recombinant inbred panel of rats was used to identify genetically regulated transcriptional networks (modules). For these modules, biological consensus was found between functional enrichment analysis and publicly available phenotypic quantitative trait loci (QTL). In particular, the biological function of two liver modules could be linked to immune response. The eigengene QTLs for these co-expression modules were located at genomic regions coincident with highly significant phenotypic QTLs; these phenotypes were related to rheumatoid arthritis, food preference, and basal corticosterone levels in rats. Our analysis illustrates that genetically and biologically driven RNA-based networks, such as the ones identified as part of this research, provide insight into the genetic influences on organ functions. These networks can pinpoint phenotypes that manifest through the interaction of many organs/tissues and can identify unannotated or under-annotated RNA transcripts that play a role in these phenotypes.


Subjects
Liver/metabolism, RNA/metabolism, Animals, Female, Gene Ontology, Immune System/metabolism, Linkage Disequilibrium, Liver/immunology, Lod Score, Male, Quantitative Trait Loci, RNA/genetics, Inbred SHR Rats, RNA Sequence Analysis, Transcriptome
6.
Nucleic Acids Res ; 42(17): e133, 2014.
Article in English | MEDLINE | ID: mdl-25063298

ABSTRACT

microRNAs (miRNAs) regulate expression by promoting degradation or repressing translation of target transcripts. miRNA target sites have been catalogued in databases based on experimental validation and computational prediction using various algorithms. Several online resources provide collections of multiple databases but need to be imported into other software, such as R, for processing, tabulation, graphing and computation. Currently available miRNA target site packages in R are limited in the number of databases, types of databases and flexibility. We present multiMiR, a new miRNA-target interaction R package and database, which includes several novel features not available in existing R packages: (i) compilation of nearly 50 million records in human and mouse from 14 different databases, more than any other collection; (ii) expansion of databases to those based on disease annotation and drug microRNA response, in addition to many experimental and computational databases; and (iii) user-defined cutoffs for predicted binding strength to provide the most confident selection. Case studies are reported on various biomedical applications including mouse models of alcohol consumption, studies of chronic obstructive pulmonary disease in human subjects, and human cell line models of bladder cancer metastasis. We also demonstrate how multiMiR was used to generate testable hypotheses that were pursued experimentally.


Subjects
3' Untranslated Regions, Nucleic Acid Databases, MicroRNAs/metabolism, Software, Alcohol Drinking/genetics, Animals, Antineoplastic Agents/pharmacology, Tumor Cell Line, Humans, Mice, Neoplasm Metastasis, Chronic Obstructive Pulmonary Disease/genetics, Urinary Bladder Neoplasms/genetics, Urinary Bladder Neoplasms/pathology
7.
Epigenomics ; : 1-16, 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39263873

ABSTRACT

Aim: Assess if cord blood differentially methylated regions (DMRs) representing human metastable epialleles (MEs) associate with offspring adiposity in 588 maternal-infant dyads from the Colorado Health Start Study. Materials & methods: DNA methylation was assessed via the Illumina 450K array (~439,500 CpG sites). Offspring adiposity was obtained via air displacement plethysmography. Linear regression modeled the association of DMRs potentially representing MEs with adiposity. Results & conclusion: We identified two potential MEs: ZFP57, which associated with infant adiposity change, and B4GALNT4, which associated with infancy and childhood adiposity change. Nine DMRs annotating to genes that annotated to MEs associated with change in offspring adiposity (false discovery rate <0.05). Methylation of approximately 80% of DMRs identified associated with decreased change in adiposity.


8.
bioRxiv ; 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38260498

ABSTRACT

As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular to the pathway-level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. The PathIntegrate Python package is available at https://github.com/cwieder/PathIntegrate.

9.
bioRxiv ; 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38328226

ABSTRACT

Multiple -omics (genomics, proteomics, etc.) profiles are commonly generated to gain insight into a disease or physiological system. Constructing multi-omics networks with respect to the trait(s) of interest provides an opportunity to understand relationships between molecular features but integration is challenging due to multiple data sets with high dimensionality. One approach is to use canonical correlation to integrate one or two omics types and a single trait of interest. However, these types of methods may be limited due to (1) not accounting for higher-order correlations existing among features, (2) computational inefficiency when extending to more than two omics data when using a penalty term-based sparsity method, and (3) lack of flexibility for focusing on specific correlations (e.g., omics-to-phenotype correlation versus omics-to-omics correlations). In this work, we have developed a novel multi-omics network analysis pipeline called Sparse Generalized Tensor Canonical Correlation Analysis Network Inference (SGTCCA-Net) that can effectively overcome these limitations. We also introduce an implementation to improve the summarization of networks for downstream analyses. Simulation and real-data experiments demonstrate the effectiveness of our novel method for inferring omics networks and features of interest.

10.
medRxiv ; 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38464285

ABSTRACT

Background: Studies have identified individual blood biomarkers associated with chronic obstructive pulmonary disease (COPD) and related phenotypes. However, complex diseases such as COPD typically involve changes in multiple molecules with interconnections that may not be captured when considering single molecular features. Methods: Leveraging proteomic data from 3,173 COPDGene Non-Hispanic White (NHW) and African American (AA) participants, we applied sparse multiple canonical correlation network analysis (SmCCNet) to 4,776 proteins assayed on the SomaScan v4.0 platform to derive sparse networks of proteins associated with current vs. former smoking status, airflow obstruction, and emphysema quantitated from high-resolution computed tomography scans. We then used NetSHy, a dimension reduction technique leveraging network topology, to produce summary scores of each proteomic network, referred to as NetSHy scores. We next performed a genome-wide association study (GWAS) to identify variants associated with the NetSHy scores, or network quantitative trait loci (nQTLs). Finally, we evaluated the replicability of the networks in an independent cohort, SPIROMICS. Results: We identified networks of 13 to 104 proteins for each phenotype and exposure in NHW and AA, and the derived NetSHy scores were significantly associated with the variables of interest. Networks included known (sRAGE, ALPP, MIP1) and novel molecules (CA10, CPB1, HIS3, PXDN) and interactions involved in COPD pathogenesis. We observed 7 nQTL loci associated with NetSHy scores, 4 of which remained after conditional analysis. Networks for smoking status and emphysema, but not airflow obstruction, demonstrated a high degree of replicability across race groups and cohorts. Conclusions: In this work, we apply state-of-the-art molecular network generation and summarization approaches to proteomic data from COPDGene participants to uncover protein networks associated with COPD phenotypes. We further identify genetic associations with networks. This work discovers protein networks containing known and novel proteins and protein interactions associated with clinically relevant COPD phenotypes across race groups and cohorts.

11.
Am J Respir Cell Mol Biol ; 49(2): 316-23, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23590301

ABSTRACT

Although most cases of chronic obstructive pulmonary disease (COPD) occur in smokers, only a fraction of smokers develop the disease. We hypothesized distinct molecular signatures for COPD and emphysema in the peripheral blood mononuclear cells (PBMCs) of current and former smokers. To test this hypothesis, we identified and validated PBMC gene expression profiles in smokers with and without COPD. We generated expression data on 136 subjects from the COPDGene study, using Affymetrix U133 2.0 microarrays (Affymetrix, Santa Clara, CA). Multiple linear regression with adjustment for covariates (gender, age, body mass index, family history, smoking status, and pack-years) was used to identify candidate genes, and ingenuity pathway analysis was used to identify candidate pathways. Candidate genes were validated in 149 subjects according to multiplex quantitative real-time polymerase chain reaction, which included 75 subjects not previously profiled. Pathways that were differentially expressed in subjects with COPD and emphysema included those that play a role in the immune system, inflammatory responses, and sphingolipid (ceramide) metabolism. Twenty-six of the 46 candidate genes (e.g., FOXP1, TCF7, and ASAH1) were validated in the independent cohort. Plasma metabolomics was used to identify a novel glycoceramide (galabiosylceramide) as a biomarker of emphysema, supporting the genomic association between acid ceramidase (ASAH1) and emphysema. COPD is a systemic disease whose gene expression signatures in PBMCs could serve as novel diagnostic or therapeutic targets.


Subjects
Gangliosides/blood, Gene Expression Regulation, Mononuclear Leukocytes/metabolism, Chronic Obstructive Pulmonary Disease/blood, Aged, Aged 80 and over, Biomarkers/blood, Female, Humans, Male, Metabolomics/methods, Middle Aged, Chronic Obstructive Pulmonary Disease/diagnosis, Pulmonary Emphysema/blood, Pulmonary Emphysema/diagnosis, Real-Time Polymerase Chain Reaction
12.
Bioinformatics ; 28(22): 2986-8, 2012 Nov 15.
Article in English | MEDLINE | ID: mdl-22954632

ABSTRACT

SUMMARY: comb-p is a command-line tool and a python library that manipulates BED files of possibly irregularly spaced P-values and (1) calculates auto-correlation, (2) combines adjacent P-values, (3) performs false discovery adjustment, (4) finds regions of enrichment (i.e. series of adjacent low P-values) and (5) assigns significance to those regions. In addition, tools are provided for visualization and assessment. We provide validation and example uses on bisulfite-seq with P-values from Fisher's exact test, tiled methylation probes using a linear model and Dam-ID for chromatin binding using moderated t-statistics. Because the library accepts input in a simple, standardized format and is unaffected by the origin of the P-values, it can be used for a wide variety of applications. AVAILABILITY: comb-p is maintained under the BSD license. The documentation and implementation are available at https://github.com/brentp/combined-pvalues. CONTACT: bpederse@gmail.com


Subjects
Genomics/methods, Software, DNA Probes/analysis, DNA Probes/genetics, Genome-Wide Association Study, Humans, Programming Languages, DNA Sequence Analysis
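Two of the steps listed in this abstract, combining P-values and false discovery adjustment, can be sketched with the Python standard library. This is not the comb-p code or its interface. The sketch uses a plain Stouffer-Liptak combination that assumes independent P-values, whereas the abstract describes an auto-correlation step that is omitted here, followed by a Benjamini-Hochberg adjustment. Function names and inputs are hypothetical.

```python
from statistics import NormalDist

_NORM = NormalDist()   # standard normal
_EPS = 1e-12           # keeps inv_cdf away from 0 and 1

def stouffer_liptak(pvalues, weights=None):
    """Combine P-values through weighted z-scores (independence assumed)."""
    if weights is None:
        weights = [1.0] * len(pvalues)
    z = [_NORM.inv_cdf(1.0 - min(max(p, _EPS), 1.0 - _EPS)) for p in pvalues]
    num = sum(w * zi for w, zi in zip(weights, z))
    den = sum(w * w for w in weights) ** 0.5
    return 1.0 - _NORM.cdf(num / den)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

combined = stouffer_liptak([0.01, 0.01])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(combined, adjusted)
```

Adjacent genomic P-values are usually correlated, so the independence assumption makes this combination anti-conservative. It shows the mechanics only.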
13.
Stat Med ; 32(23): 4057-70, 2013 Oct 15.
Article in English | MEDLINE | ID: mdl-23703923

ABSTRACT

Although genome-wide expression data sets from multiple species are now more commonly generated, there have been few studies on how to best integrate this type of correlated data into models. Starting with a single-species, linear regression model that predicts transcription factor binding sites as a case study, we investigated how best to take into account the correlated expression data when extending this model to multiple species. Using a multivariate regression model, we accounted for the phylogenetic relationships among the species in two ways: (i) a repeated-measures model, where the error term is constrained; and (ii) a Bayesian hierarchical model, where the prior distributions of the regression coefficients are constrained. We show that both multiple-species models improve predictive performance over the single-species model. When compared with each other, the repeated-measures model outperformed the Bayesian model. We suggest a possible explanation for the better performance of the model with the constrained error term.


Subjects
Bayes Theorem, Statistical Data Interpretation, Gene Expression Profiling/methods, Linear Models, Multivariate Analysis, Phylogeny, Animals, Heat-Shock Proteins/genetics, Humans, Saccharomyces/genetics
14.
Obesity (Silver Spring) ; 31(8): 2090-2102, 2023 08.
Article in English | MEDLINE | ID: mdl-37475691

ABSTRACT

OBJECTIVE: Fat content of adipocytes derived from infant umbilical cord mesenchymal stem cells (MSCs) predicts adiposity in children through 4 to 6 years of age. This study tested the hypothesis that MSCs from infants born to mothers with obesity (Ob-MSCs) exhibit adipocyte hypertrophy and perturbations in genes regulating adipogenesis compared with MSCs from infants of mothers with normal weight (NW-MSCs). METHODS: Adipogenesis was induced in MSCs embedded in three-dimensional hydrogel structures, and cell size and number were measured by three-dimensional imaging. Proliferation and protein markers of proliferation and adipogenesis in undifferentiated and adipocyte differentiating cells were measured. RNA sequencing was performed to determine pathways linked to adipogenesis phenotype. RESULTS: In undifferentiated MSCs, greater zinc finger protein (Zfp)423 protein content was observed in Ob- versus NW-MSCs. Adipocytes from Ob-MSCs were larger but fewer than adipocytes from NW-MSCs. RNA sequencing analysis showed that Zfp423 protein correlated with mRNA expression of genes enriched for cell cycle, MSC lineage specification, inflammation, and metabolism pathways. MSC proliferation was not different before differentiation but declined faster in Ob-MSCs upon adipogenic induction. CONCLUSIONS: Ob-MSCs have an intrinsic propensity for adipocyte hypertrophy and reduced hyperplasia during adipogenesis in vitro, perhaps linked to greater Zfp423 content and changes in cell cycle pathway gene expression.


Subjects
Mesenchymal Stem Cells, Mothers, Female, Humans, Obesity/genetics, Obesity/metabolism, Cell Differentiation/genetics, Adipogenesis/genetics, Mesenchymal Stem Cells/metabolism, Transcription Factors/metabolism, Adipocytes/metabolism, Hypertrophy/metabolism
15.
Sci Rep ; 13(1): 9254, 2023 06 07.
Article in English | MEDLINE | ID: mdl-37286633

ABSTRACT

Privacy protection is a core principle of genomic but not proteomic research. We identified independent single nucleotide polymorphism (SNP) quantitative trait loci (pQTL) from COPDGene and Jackson Heart Study (JHS), calculated continuous protein level genotype probabilities, and then applied a naïve Bayesian approach to link SomaScan 1.3K proteomes to genomes for 2812 independent subjects from COPDGene, JHS, SubPopulations and InteRmediate Outcome Measures In COPD Study (SPIROMICS) and Multi-Ethnic Study of Atherosclerosis (MESA). We correctly linked 90-95% of proteomes to their correct genome and for 95-99% we identify the 1% most likely links. The linking accuracy in subjects with African ancestry was lower (~60%) unless training included diverse subjects. With larger profiling (SomaScan 5K) in the Atherosclerosis Risk in Communities (ARIC) study, correct identification was >99% even in mixed ancestry populations. We also linked proteomes-to-proteomes and used the proteome only to determine features such as sex, ancestry, and first-degree relatives. When serial proteomes are available, the linking algorithm can be used to identify and correct mislabeled samples. This work also demonstrates the importance of including diverse populations in omics research and that large proteomic datasets (>1000 proteins) can be accurately linked to a specific genome through pQTL knowledge and should not be considered unidentifiable.


Subjects
Atherosclerosis, Proteome, Humans, Proteome/genetics, Bayes Theorem, Privacy, Genome-Wide Association Study, Atherosclerosis/genetics, Single Nucleotide Polymorphism
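One possible reading of the naïve Bayesian linking step is sketched below. It is an illustration under stated assumptions, not the authors' algorithm. Each pQTL protein level is assumed to be Gaussian given the genotype (0, 1 or 2 copies of an allele), proteins are treated as independent, and every candidate genome gets a uniform prior. The means, standard deviations, and identifiers are invented toy values.

```python
import math

def gaussian_logpdf(x, mu, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)

def link_proteome(proteome, genomes, pqtl_means, pqtl_sd):
    """Score every candidate genome by the summed log-likelihood of the
    observed protein levels given that genome's pQTL genotypes.
    Returns (best genome id, {genome id: log-likelihood})."""
    loglik = {}
    for gid, genotypes in genomes.items():
        loglik[gid] = sum(
            gaussian_logpdf(y, pqtl_means[j][g], pqtl_sd[j])
            for j, (y, g) in enumerate(zip(proteome, genotypes))
        )
    best = max(loglik, key=loglik.get)
    return best, loglik

# Hypothetical pQTL model: expected protein level for genotype 0, 1, 2.
pqtl_means = [[0.0, 1.0, 2.0], [5.0, 4.0, 3.0], [-1.0, 0.0, 1.0]]
pqtl_sd = [0.5, 0.5, 0.5]

# Hypothetical candidate genomes: genotype at each pQTL SNP.
genomes = {"G1": [0, 2, 1], "G2": [2, 0, 1], "G3": [1, 1, 0]}

proteome = [1.9, 5.2, 0.1]  # observed protein levels for one sample
best, loglik = link_proteome(proteome, genomes, pqtl_means, pqtl_sd)
print(best, loglik)
```

With only three proteins the separation between candidates is modest. The log-likelihood is a sum over proteins, which is consistent with the abstract's observation that larger panels link more accurately.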
16.
Alcohol Clin Exp Res ; 36(9): 1519-29, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22530671

ABSTRACT

BACKGROUND: Prenatal alcohol exposure can result in fetal alcohol spectrum disorders (FASD). Not all women who consume alcohol during pregnancy have children with FASD and studies have shown that genetic factors can play a role in ethanol teratogenesis. We examined gene expression in embryos and placentae from C57BL/6J (B6) and DBA/2J (D2) mice following prenatal alcohol exposure. B6 fetuses are susceptible to morphological malformations following prenatal alcohol exposure while D2 are relatively resistant. METHODS: Male and female B6 and D2 mice were mated for 2 hours in the morning, producing 4 embryonic genotypes: true-bred B6B6 and D2D2, and reciprocal B6D2 and D2B6. On gestational day 9, dams were intubated with 5.8 g/kg ethanol, an isocaloric amount of maltose dextrin, or nothing. Four hours later, dams were sacrificed and embryos and placentae were harvested. RNA was extracted, labeled and hybridized to Affymetrix Mouse Genome 430 v2 microarray chips. Data were normalized, subjected to analysis of variance and tested for enrichment of gene ontology molecular function and biological process using the Database for Annotation, Visualization and Integrated Discovery (DAVID). RESULTS: Several gene classes were differentially expressed in B6 and D2 regardless of treatment, including genes involved in polysaccharide binding and mitosis. Prenatal alcohol exposure altered expression of a subset of genes, including genes involved in methylation, chromatin remodeling, protein synthesis, and mRNA splicing. Very few genes were differentially expressed between maltose-exposed tissues and tissues that received nothing, so we combined these groups for comparisons with ethanol. While we observed many expression changes specific to B6 following prenatal alcohol exposure, none were specific for D2. Gene classes up- or down-regulated in B6 following prenatal alcohol exposure included genes involved in mRNA splicing, transcription, and translation. 
CONCLUSIONS: Our study identified several classes of genes with altered expression following prenatal alcohol exposure, including many specific for B6, a strain susceptible to ethanol teratogenesis. Lack of strain specific effects in D2 suggests there are few gene expression changes that confer resistance. Future studies will begin to analyze functional significance of the expression changes.


Subjects
Gene Expression/drug effects, Prenatal Exposure Delayed Effects/genetics, Analysis of Variance, Animals, Central Nervous System Depressants/toxicity, Ethanol/toxicity, Female, Fetal Alcohol Spectrum Disorders/genetics, Developmental Gene Expression Regulation/drug effects, Mice, Inbred C57BL Mice, Inbred DBA Mice, Microarray Analysis, Placenta/metabolism, Polymerase Chain Reaction, Pregnancy, RNA/biosynthesis, RNA/genetics, Species Specificity, Teratogens/toxicity
17.
Article in English | MEDLINE | ID: mdl-36776768

ABSTRACT

The study of complex behavior of biological systems has become increasingly dependent on evolutionary network modeling. In particular, multi-omics networks capture interactions between biomolecules such as proteins and metabolites, providing a basis for predicting relationships between such biomolecules and various phenotypic traits of complex diseases. In this paper, we introduce an integrative framework that, given a multi-omics network representing a cohort of subjects, learns expressive representations for network nodes and combines the learned node representations with the biological profiles of individual subjects for an enriched representation of the subjects. With extensive empirical evaluation using real-world multi-omics networks, we show that our proposed framework significantly outperforms existing and baseline methods in terms of subject representation accuracy, particularly when the multi-omics network representing the cohort is sparse and structured and, therefore, more informative.

18.
Front Big Data ; 5: 894632, 2022.
Article in English | MEDLINE | ID: mdl-35811829

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is one of the leading causes of death in the United States. COPD represents one of many areas of research where identifying complex pathways and networks of interacting biomarkers is an important avenue toward studying disease progression and potentially discovering cures. Recently, sparse multiple canonical correlation network analysis (SmCCNet) was developed to identify complex relationships between omics associated with a disease phenotype, such as lung function. SmCCNet uses two omics datasets and an associated output phenotype to generate a multi-omics graph, which can then be used to explore relationships between omics in the context of a disease. Detecting significant subgraphs within this multi-omics network, i.e., subgraphs which exhibit high correlation to a disease phenotype and high inter-connectivity, can help clinicians identify complex biological relationships involved in disease progression. The current approach to identifying significant subgraphs relies on hierarchical clustering, which can be used to inform clinicians about important pathways involved in the disease or phenotype of interest. The reliance on a hierarchical clustering approach can hinder subgraph quality by biasing toward finding more compact subgraphs and removing larger significant subgraphs. This study aims to introduce new significant subgraph detection techniques. In particular, we introduce two subgraph detection methods, dubbed Correlated PageRank and Correlated Louvain, by extending the Personalized PageRank Clustering and Louvain algorithms, as well as a hybrid approach combining the two proposed methods, and compare them to the hierarchical method currently in use. The proposed methods show significant improvement in the quality of the subgraphs produced when compared to the current state of the art.
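The Personalized PageRank building block named in this abstract can be sketched as below. This is plain personalized PageRank on a weighted graph, not the authors' Correlated PageRank. Using each node's absolute correlation with the phenotype as the personalization vector is an assumption about how correlation might enter, and the graph and correlation values are invented.

```python
def personalized_pagerank(weights, personalization, damping=0.85, iters=100):
    """Power-iteration personalized PageRank on a weighted adjacency matrix.
    `personalization` holds non-negative restart weights, one per node."""
    n = len(weights)
    total = sum(personalization)
    pers = [p / total for p in personalization]
    out = [sum(row) for row in weights]  # weighted out-degree per node
    rank = pers[:]
    for _ in range(iters):
        new = [(1 - damping) * pers[j] for j in range(n)]
        for i in range(n):
            if out[i] == 0:
                # Dangling node: send its mass back via the restart vector.
                for j in range(n):
                    new[j] += damping * rank[i] * pers[j]
            else:
                for j in range(n):
                    new[j] += damping * rank[i] * weights[i][j] / out[i]
        rank = new
    return rank

# Hypothetical multi-omics graph with 4 nodes (symmetric edge weights).
weights = [
    [0.0, 0.9, 0.2, 0.0],
    [0.9, 0.0, 0.4, 0.1],
    [0.2, 0.4, 0.0, 0.3],
    [0.0, 0.1, 0.3, 0.0],
]
# Assumed restart weights: |correlation| of each node with the phenotype.
abs_corr = [0.7, 0.5, 0.2, 0.1]

rank = personalized_pagerank(weights, abs_corr)
print(rank)
```

A subgraph could then be read off by taking the highest-ranked nodes, although the cut-off rule and any clustering step are left open here.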

19.
Behav Genet ; 41(4): 625-8, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21184165

ABSTRACT

Our laboratory has developed an online interactive resource called PhenoGen ( http://phenogen.ucdenver.edu ) which provides an archive of brain and other organ gene expression data from a panel of 20 common inbred mouse strains, and three recombinant inbred (RI) panels (two mouse and one rat). DNA microarray data can also be uploaded to the site where numerous analytical tools can be implemented. An important advantage to the archived data is that each array represents data from a single animal and each strain was sampled 4-7 times, providing an estimate of genetic variance (heritability) of individual transcript levels. These panels also allow genetic mapping of expression QTLs. Overlap of eQTLs with phenotypic QTLs provides a powerful approach to candidate gene identification. These methods are briefly described here and we encourage the use of our site for both scientific discovery and as a teaching tool in quantitative genetics.


Subjects
Genome, Oligonucleotide Array Sequence Analysis/methods, Animals, Chromosome Mapping, Genetic Crosses, Gene Expression Profiling, Behavioral Genetics, Internet, Mice, Genetic Models, Phenotype, Single Nucleotide Polymorphism, Quantitative Trait Loci, Messenger RNA/metabolism, Software
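The abstract notes that sampling each strain 4-7 times allows an estimate of the genetic variance of a transcript. One common estimator, sketched below, is the intraclass correlation from a one-way ANOVA with strain as the grouping factor. This is an illustration, not the PhenoGen implementation, and the expression values are invented.

```python
def broad_sense_heritability(groups):
    """Between-strain share of variance from a one-way ANOVA.
    `groups` is a list of replicate measurements per strain; unequal
    replicate counts are handled through the usual n0 correction."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Effective replicate number for unbalanced designs.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    var_between = max((ms_between - ms_within) / n0, 0.0)
    denom = var_between + ms_within
    return var_between / denom if denom > 0 else 0.0

# Hypothetical transcript levels: 3 strains with 4, 5 and 7 animals.
strains = [
    [8.1, 8.3, 7.9, 8.2],
    [9.0, 9.2, 8.8, 9.1, 9.3],
    [7.0, 7.4, 7.1, 6.9, 7.2, 7.3, 7.0],
]
h2 = broad_sense_heritability(strains)
print(round(h2, 3))
```

Values near 1 indicate that most of the variation in a transcript lies between strains rather than between animals of the same strain.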
20.
Stat Appl Genet Mol Biol ; 9: Article29, 2010.
Article in English | MEDLINE | ID: mdl-20812907

ABSTRACT

High density tiling arrays are an effective strategy for genome-wide identification of transcription factor binding regions. Sliding window methods that calculate moving averages of log ratios or t-statistics have been useful for the analysis of tiling array data. Here, we present a method that generalizes the moving average approach to evaluate sliding windows of p-values by using combined p-value statistics. In particular, the combined p-value framework can be useful in situations when taking averages of the corresponding test-statistic for the hypothesis may not be appropriate or when it is difficult to assess the significance of these averages. We exhibit the strengths of the combined p-values methods on Drosophila tiling array data and assess their ability to predict genomic regions enriched for transcription factor binding. The predictions are evaluated based on their proximity to target genes and their enrichment of known transcription factor binding sites. We also present an application for the generalization of the moving average based on integrating two different tiling array experiments.


Subjects
Drosophila/genetics, Gene Expression Profiling/methods, Oligonucleotide Array Sequence Analysis/methods, Animals, Binding Sites, Statistical Data Interpretation, Transcription Factors/genetics
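The sliding-window idea in this abstract can be illustrated with a minimal sketch. Fisher's method is used here as one example of a combined p-value statistic. The abstract does not say which statistics the authors use, so this choice, the window width, and the toy p-values are assumptions. The sketch also assumes independent, strictly positive p-values.

```python
import math

def chi2_sf_even_df(x, k):
    """Survival function of a chi-square with 2k degrees of freedom,
    using the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

def fisher_combine(pvalues):
    """Fisher's method: -2 * sum(log p) follows chi-square with 2k df."""
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return chi2_sf_even_df(stat, len(pvalues))

def sliding_fisher(pvalues, width):
    """Combined p-value for every window of `width` adjacent probes."""
    return [fisher_combine(pvalues[i:i + width])
            for i in range(len(pvalues) - width + 1)]

# Hypothetical probe-level p-values along a tiled region.
pvals = [0.8, 0.6, 0.01, 0.02, 0.03, 0.7, 0.9]
windows = sliding_fisher(pvals, width=3)
print(windows)
```

The window covering the run of small p-values gets the smallest combined value, which is how a region of enrichment would stand out. Neighbouring probes on a real tiling array are typically correlated, so the independence assumption would need revisiting in practice.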