Results 1 - 20 of 295
1.
Genome Res ; 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39134413

ABSTRACT

Gene regulatory networks (GRNs) are effective tools for inferring complex interactions between molecules that regulate biological processes and hence can provide insights into drivers of biological systems. Inferring coexpression networks is a critical element of GRN inference, as the correlation between expression patterns may indicate that genes are coregulated by common factors. However, methods that estimate coexpression networks generally derive an aggregate network representing the mean regulatory properties of the population and so fail to fully capture population heterogeneity. BONOBO (Bayesian Optimized Networks Obtained By assimilating Omics data) is a scalable Bayesian model for deriving individual sample-specific coexpression matrices that recognizes variations in molecular interactions across individuals. For each sample, BONOBO assumes a Gaussian distribution on the log-transformed centered gene expression and a conjugate prior distribution on the sample-specific coexpression matrix constructed from all other samples in the data. Combining the sample-specific gene coexpression with the prior distribution, BONOBO yields a closed-form solution for the posterior distribution of the sample-specific coexpression matrices, thus allowing the analysis of large datasets. We demonstrate BONOBO's utility in several contexts, including analyzing gene regulation in yeast transcription factor knockout studies, the prognostic significance of miRNA-mRNA interaction in human breast cancer subtypes, and sex differences in gene regulation within human thyroid tissue. We find that BONOBO outperforms other methods that have been used for sample-specific coexpression network inference and provides insight into individual differences in the drivers of biological processes.
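
As a rough illustration of why a conjugate prior yields a closed-form posterior in a setting like the one this abstract describes, consider the textbook inverse-Wishart update for a zero-mean Gaussian observation. The symbols below ($\Psi$ for the prior scale matrix built from the other samples, $\nu$ for the prior degrees of freedom, $p$ for the number of genes) are assumptions made for this sketch and do not come from the abstract; BONOBO's actual parameterization may differ.

```latex
x_q \mid \Sigma_q \sim \mathcal{N}_p(0, \Sigma_q), \qquad
\Sigma_q \sim \mathcal{IW}(\Psi, \nu)
\;\Longrightarrow\;
\Sigma_q \mid x_q \sim \mathcal{IW}\!\left(\Psi + x_q x_q^{\top},\; \nu + 1\right),
\qquad
\mathbb{E}\left[\Sigma_q \mid x_q\right] = \frac{\Psi + x_q x_q^{\top}}{\nu - p}, \quad \nu > p .
```

Because the posterior stays in the same family as the prior, no sampling or iterative optimization is needed for each sample, which is consistent with the scalability the abstract claims.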

2.
Nature ; 591(7851): 665-670, 2021 03.
Article in English | MEDLINE | ID: mdl-33536619

ABSTRACT

Strong connections exist between R-loops (three-stranded structures harbouring an RNA:DNA hybrid and a displaced single-strand DNA), genome instability and human disease [1-5]. Indeed, R-loops are favoured in relevant genomic regions as regulators of certain physiological processes through which homeostasis is typically maintained. For example, transcription termination pause sites regulated by R-loops can induce the synthesis of antisense transcripts that enable the formation of local, RNA interference (RNAi)-driven heterochromatin [6]. Pause sites are also protected against endogenous single-stranded DNA breaks by BRCA1 [7]. Hypotheses about how DNA repair is enacted at pause sites include a role for RNA, which is emerging as a normal, albeit unexplained, regulator of genome integrity [8]. Here we report that a species of single-stranded, DNA-damage-associated small RNA (sdRNA) is generated by a BRCA1-RNAi protein complex. sdRNAs promote DNA repair driven by the PALB2-RAD52 complex at transcriptional termination pause sites that form R-loops and are rich in single-stranded DNA breaks. sdRNA repair operates in both quiescent (G0) and proliferating cells. Thus, sdRNA repair can occur in intact tissue and/or stem cells, and may contribute to tumour suppression mediated by BRCA1.


Subject(s)
BRCA1 Protein/metabolism; DNA Repair; Fanconi Anemia Complementation Group N Protein/metabolism; RNA Interference; Rad52 DNA Repair and Recombination Protein/metabolism; Argonaute Proteins/metabolism; Cell Cycle Proteins/metabolism; DNA Damage; Eukaryotic Initiation Factors/metabolism; HeLa Cells; Humans; RNA, Small Interfering/genetics; RNA, Small Interfering/metabolism; Resting Phase, Cell Cycle; Ribonuclease III/metabolism
3.
Genome Res ; 32(3): 524-533, 2022 03.
Article in English | MEDLINE | ID: mdl-35193937

ABSTRACT

Understanding how each person's unique genotype influences their individual patterns of gene regulation has the potential to improve our understanding of human health and development, and to refine genotype-specific disease risk assessments and treatments. However, the effects of genetic variants are not typically considered when constructing gene regulatory networks, despite the fact that many disease-associated genetic variants are thought to have regulatory effects, including the disruption of transcription factor (TF) binding. We developed EGRET (Estimating the Genetic Regulatory Effect on TFs), which infers a genotype-specific gene regulatory network for each individual in a study population. EGRET begins by constructing a genotype-informed TF-gene prior network derived using TF motif predictions, expression quantitative trait locus (eQTL) data, individual genotypes, and the predicted effects of genetic variants on TF binding. It then uses a technique known as message passing to integrate this prior network with gene expression and TF protein-protein interaction data to produce a refined, genotype-specific regulatory network. We used EGRET to infer gene regulatory networks for two blood-derived cell lines and identified genotype-associated, cell line-specific regulatory differences that we subsequently validated using allele-specific expression, chromatin accessibility QTLs, and differential ChIP-seq TF binding. We also inferred EGRET networks for three cell types from each of 119 individuals and identified cell type-specific regulatory differences associated with diseases related to those cell types. EGRET is, to our knowledge, the first method that infers networks reflective of individual genetic variation in a way that provides insight into the genetic regulatory associations driving complex phenotypes.


Subject(s)
Gene Regulatory Networks; Transcription Factors; Chromatin; Chromatin Immunoprecipitation; Genotype; Humans; Transcription Factors/genetics; Transcription Factors/metabolism
4.
Nucleic Acids Res ; 51(3): e15, 2023 02 22.
Article in English | MEDLINE | ID: mdl-36533448

ABSTRACT

The increasing quantity of multi-omic data, such as methylomic and transcriptomic profiles collected on the same specimen or even on the same cell, provides a unique opportunity to explore the complex interactions that define cell phenotype and govern cellular responses to perturbations. We propose a network approach based on Gaussian Graphical Models (GGMs) that facilitates the joint analysis of paired omics data. This method, called DRAGON (Determining Regulatory Associations using Graphical models on multi-Omic Networks), calibrates its parameters to achieve an optimal trade-off between the network's complexity and estimation accuracy, while explicitly accounting for the characteristics of each of the assessed omics 'layers.' In simulation studies, we show that DRAGON adapts to edge density and feature size differences between omics layers, improving model inference and edge recovery compared to state-of-the-art methods. We further demonstrate in an analysis of joint transcriptome-methylome data from TCGA breast cancer specimens that DRAGON can identify key molecular mechanisms such as gene regulation via promoter methylation. In particular, we identify Transcription Factor AP-2 Beta (TFAP2B) as a potential multi-omic biomarker for basal-type breast cancer. DRAGON is available as open-source code in Python through the Network Zoo package (netZooPy v0.8; netzoo.github.io).


Subject(s)
Multiomics; Neoplasms; Humans; Software; Computer Simulation; Transcriptome; Neoplasms/genetics; Gene Regulatory Networks
5.
Article in English | MEDLINE | ID: mdl-39102858

ABSTRACT

Compared to men, women often develop COPD at an earlier age with worse respiratory symptoms despite lower smoking exposure. However, most preventive and therapeutic strategies ignore biological sex differences in COPD. Our goal was to better understand sex-specific gene regulatory processes in lung tissue and the molecular basis for sex differences in COPD onset and severity. We analyzed lung tissue gene expression and DNA methylation data from 747 individuals in the Lung Tissue Research Consortium (LTRC), and 85 individuals in an independent dataset. We identified sex differences in COPD-associated gene regulation using gene regulatory networks. We used linear regression to test for sex-biased associations of methylation with lung function, emphysema, smoking, and age. Analyzing gene regulatory networks in the control group, we found that genes involved in the extracellular matrix (ECM) have higher transcription factor targeting in females than in males. However, this pattern is reversed in COPD, with males showing stronger regulatory targeting of ECM-related genes than females. Smoking exposure, age, lung function, and emphysema were all associated with sex-specific differential methylation of ECM-related genes. We identified sex-based gene regulatory patterns of ECM-related genes associated with lung function and emphysema. Multiple factors including epigenetics, smoking, aging, and cell heterogeneity influence sex-specific gene regulation in COPD. Our findings underscore the importance of considering sex as a key factor in disease susceptibility and severity.

6.
Nucleic Acids Res ; 50(D1): D610-D621, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34508353

ABSTRACT

Gene regulation plays a fundamental role in shaping tissue identity, function, and response to perturbation. Regulatory processes are controlled by complex networks of interacting elements, including transcription factors, miRNAs and their target genes. The structure of these networks helps to determine phenotypes and can ultimately influence the development of disease or response to therapy. We developed GRAND (https://grand.networkmedicine.org) as a database for computationally inferred, context-specific gene regulatory network models that can be compared between biological states, or used to predict which drugs produce changes in regulatory network structure. The database includes 12 468 genome-scale networks covering 36 human tissues, 28 cancers, 1378 unperturbed cell lines, as well as 173 013 TF and gene targeting scores for 2858 small molecule-induced cell line perturbations paired with phenotypic information. GRAND allows the networks to be queried using phenotypic information and visualized using a variety of interactive tools. In addition, it includes a web application that matches disease states to potentially therapeutic small molecule drugs using regulatory network properties.


Subject(s)
Databases, Genetic; Databases, Pharmaceutical; Gene Regulatory Networks/genetics; Software; Gene Expression Regulation/genetics; Genome, Human/genetics; Humans; MicroRNAs/classification; MicroRNAs/genetics; Transcription Factors/classification; Transcription Factors/genetics
7.
Cancer Causes Control ; 33(8): 1107-1120, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35759080

ABSTRACT

Cancer heterogeneities hold the key to a deeper understanding of cancer etiology and progression and the discovery of more precise cancer therapy. Modern pathological and molecular technologies offer a powerful set of tools to profile tumor heterogeneities at multiple levels in large patient populations, from DNA to RNA, protein and epigenetics, and from tumor tissues to tumor microenvironment and liquid biopsy. When coupled with well-validated epidemiologic methodology and well-characterized epidemiologic resources, the rich pathological and molecular tumor information provides new research opportunities at an unprecedented breadth and depth. This is the research space where Molecular Pathological Epidemiology (MPE) emerged over a decade ago and has been thriving since then. As a truly multidisciplinary field, MPE embraces collaborations from diverse fields including epidemiology, pathology, immunology, genetics, biostatistics, bioinformatics, and data science. Since first convened in 2013, the International MPE Meeting series has grown into a dynamic and dedicated platform for experts from these disciplines to communicate novel findings, discuss new research opportunities and challenges, build professional networks, and educate the next-generation scientists. Herein, we share the proceedings of the Fifth International MPE meeting, held virtually online, on May 24 and 25, 2021. The meeting consisted of 21 presentations organized into three main themes: recent integrative MPE studies, novel cancer profiling technologies, and new statistical and data science approaches. Looking forward to the near future, the meeting attendees anticipated continuous expansion and fruition of MPE research in many research fronts, particularly immune-epidemiology, mutational signatures, liquid biopsy, and health disparities.


Subject(s)
Neoplasms; Pathology, Molecular; Humans; Mutation; Neoplasms/epidemiology; Neoplasms/genetics; Neoplasms/therapy; Pathology, Molecular/methods; Tumor Microenvironment
8.
Respir Res ; 23(1): 157, 2022 Jun 17.
Article in English | MEDLINE | ID: mdl-35715807

ABSTRACT

BACKGROUND: Interstitial lung abnormalities (ILA) are radiologic findings that may progress to idiopathic pulmonary fibrosis (IPF). Blood gene expression profiles can predict IPF mortality, but whether these same genes associate with ILA and ILA outcomes is unknown. This study evaluated whether a previously described blood gene expression profile associated with IPF mortality is associated with ILA and all-cause mortality. METHODS: In COPDGene and ECLIPSE study participants with visual scoring of ILA and gene expression data, we evaluated the association of a previously described IPF mortality score with ILA and mortality. We also trained a new ILA score, derived using genes from the IPF score, in a subset of COPDGene. We tested the association with ILA and mortality on the remainder of COPDGene and ECLIPSE. RESULTS: In 1469 COPDGene (training n = 734; testing n = 735) and 571 ECLIPSE participants, the IPF score was not associated with ILA or mortality. However, an ILA score derived from IPF score genes was associated with ILA (meta-analysis of test datasets OR 1.4 [95% CI: 1.2-1.6]) and mortality (HR 1.25 [95% CI: 1.12-1.41]). Six of the 11 genes in the ILA score had discordant directions of effects compared to the IPF score. The ILA score partially mediated the effects of age on mortality (11.8% proportion mediated). CONCLUSIONS: An ILA gene expression score, derived from IPF mortality-associated genes, identified genes with concordant and discordant effects on IPF mortality and ILA. These results suggest shared and unique biologic processes among those with ILA, IPF, aging, and death.


Subject(s)
Idiopathic Pulmonary Fibrosis; Lung Diseases, Interstitial; Cohort Studies; Humans; Idiopathic Pulmonary Fibrosis/diagnosis; Idiopathic Pulmonary Fibrosis/genetics; Lung; Lung Diseases, Interstitial/diagnosis; Lung Diseases, Interstitial/genetics; Tomography, X-Ray Computed; Transcriptome/genetics
9.
Bioinformatics ; 36(18): 4765-4773, 2020 09 15.
Article in English | MEDLINE | ID: mdl-32860050

ABSTRACT

MOTIVATION: Conventional methods to analyze genomic data do not make use of the interplay between multiple factors, such as between microRNAs (miRNAs) and the messenger RNA (mRNA) transcripts they regulate, and thereby often fail to identify the cellular processes that are unique to specific tissues. We developed PUMA (PANDA Using MicroRNA Associations), a computational tool that uses message passing to integrate a prior network of miRNA target predictions with target gene co-expression information to model genome-wide gene regulation by miRNAs. We applied PUMA to 38 tissues from the Genotype-Tissue Expression project, integrating RNA-Seq data with two different miRNA target prediction priors, built on predictions from TargetScan and miRanda, respectively. We found that while target predictions obtained from these two different resources are considerably different, PUMA captures similar tissue-specific miRNA-target regulatory interactions in the different network models. Furthermore, the tissue-specific functions of miRNAs we identified based on regulatory profiles (available at: https://kuijjer.shinyapps.io/puma_gtex/) are highly similar between networks modeled on the two target prediction resources. This indicates that PUMA consistently captures important tissue-specific miRNA regulatory processes. In addition, using PUMA we identified miRNAs regulating important tissue-specific processes that, when mutated, may result in disease development in the same tissue. AVAILABILITY AND IMPLEMENTATION: PUMA is available in C++, MATLAB and Python on GitHub (https://github.com/kuijjerlab and https://netzoo.github.io/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
MicroRNAs; Apoptosis Regulatory Proteins/genetics; Computational Biology; Gene Expression Regulation; Gene Regulatory Networks; MicroRNAs/genetics; RNA, Messenger; RNA-Seq
11.
Am J Respir Crit Care Med ; 201(9): 1099-1109, 2020 05 01.
Article in English | MEDLINE | ID: mdl-31995399

ABSTRACT

Rationale: Smoking results in at least a decade lower life expectancy. Mortality among current smokers is two to three times as high as among never smokers. DNA methylation is an epigenetic modification of the human genome that has been associated with both cigarette smoking and mortality. Objectives: We sought to identify DNA methylation marks in blood that are predictive of mortality in a subset of the COPDGene (Genetic Epidemiology of COPD) study, representing 101 deaths among 667 current and former smokers. Methods: We assayed genome-wide DNA methylation in non-Hispanic white smokers with and without chronic obstructive pulmonary disease (COPD) using blood samples from the COPDGene enrollment visit. We tested whether DNA methylation was associated with mortality in models adjusted for COPD status, age, sex, current smoking status, and pack-years of cigarette smoking. Replication was performed in a subset of 231 individuals from the ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints) study. Measurements and Main Results: We identified seven CpG sites associated with mortality (false discovery rate < 20%) that replicated in the ECLIPSE cohort (P < 0.05). None of these marks were associated with longitudinal lung function decline in survivors, smoking history, or current smoking status. However, differential methylation of two replicated PIK3CD (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit delta) sites was associated with lung function at enrollment (P < 0.05). We also observed associations between DNA methylation and gene expression for the PIK3CD sites. Conclusions: This study is the first to identify variable DNA methylation associated with all-cause mortality in smokers with and without COPD. Evaluating predictive epigenomic marks of smokers in peripheral blood may allow for targeted risk stratification and aid in delivery of future tailored therapeutic interventions.


Subject(s)
Biomarkers, Tumor/blood; DNA Methylation; Predictive Value of Tests; Pulmonary Disease, Chronic Obstructive/genetics; Pulmonary Disease, Chronic Obstructive/mortality; Smoking/genetics; Smoking/mortality; Adult; Aged; Aged, 80 and over; Cohort Studies; Epigenesis, Genetic; Female; Humans; Male; Middle Aged
12.
Br J Cancer ; 122(4): 569-577, 2020 02.
Article in English | MEDLINE | ID: mdl-31806877

ABSTRACT

BACKGROUND: Genome-wide association studies (GWASes) have identified many noncoding germline single-nucleotide polymorphisms (SNPs) that are associated with an increased risk of developing cancer. However, how these SNPs affect cancer risk is still largely unknown. METHODS: We used a systems biology approach to analyse the regulatory role of cancer-risk SNPs in thirteen tissues. By using data from the Genotype-Tissue Expression (GTEx) project, we performed an expression quantitative trait locus (eQTL) analysis. We represented both significant cis- and trans-eQTLs as edges in tissue-specific eQTL bipartite networks. RESULTS: Each tissue-specific eQTL network is organised into communities that group sets of SNPs and functionally related genes. When mapping cancer-risk SNPs to these networks, we find that in each tissue, these SNPs are significantly overrepresented in communities enriched for immune response processes, as well as tissue-specific functions. Moreover, cancer-risk SNPs are more likely to be 'cores' of their communities, influencing the expression of many genes within the same biological processes. Finally, cancer-risk SNPs preferentially target oncogenes and tumour-suppressor genes, suggesting that they may alter the expression of these key cancer genes. CONCLUSIONS: This approach provides a new way of understanding genetic effects on cancer risk and provides a biological context for interpreting the results of GWAS cancer studies.


Subject(s)
Genes, Tumor Suppressor; Genetic Predisposition to Disease/genetics; Neoplasms/genetics; Neoplasms/immunology; Oncogenes/genetics; Polymorphism, Single Nucleotide; Humans; Quantitative Trait Loci
14.
Bioinformatics ; 35(22): 4568-4576, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31062858

ABSTRACT

MOTIVATION: Cancer genomics studies frequently aim to identify genes that are differentially expressed between clinically distinct patient subgroups, generally by testing single genes one at a time. However, the results of any individual transcriptomic study are often not fully reproducible. A particular challenge impeding statistical analysis is the difficulty of distinguishing between differential expression comprising part of the genomic disease etiology and that induced by downstream effects. More robust analytical approaches that are well-powered to detect potentially causative genes, are less prone to discovering spurious associations, and can deliver reproducible findings across different studies are needed. RESULTS: We propose a set-based procedure for testing of differential expression and show that this set-based approach can produce more robust results by aggregating information across multiple, correlated genomic markers. Specifically, we adapt the Generalized Berk-Jones statistic to test for the transcription factors that may contribute to the progression of estrogen receptor positive breast cancer. We demonstrate the ability of our method to produce reproducible findings by applying the same analysis to 21 publicly available datasets, producing a similar list of significant transcription factors across most studies. Our Generalized Berk-Jones approach produces results that show improved consistency over three set-based testing algorithms: Generalized Higher Criticism, Gene Set Analysis and Gene Set Enrichment Analysis. AVAILABILITY AND IMPLEMENTATION: Data are in the MetaGxBreast R package. Code is available at github.com/ryanrsun/gaynor_sun_GBJ_breast_cancer. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Expression Profiling; Algorithms; Breast Neoplasms; Genome; Humans; Transcriptome
15.
Anal Biochem ; 601: 113768, 2020 07 15.
Article in English | MEDLINE | ID: mdl-32416095

ABSTRACT

Understanding reverse transcriptase (RT) activity is critical for designing fast one-step RT-PCRs. We report a stopped-flow assay that monitors SYBR Green I fluorescence to investigate RT activity in PCR conditions. We studied the influence of PCR conditions on RT activity and assessed the accuracy of cDNA synthesis predictions for one-step RT-PCR. Nucleotide incorporation increased from 26 to 89 s⁻¹ between 1.5 and 6 mM MgCl2 but was largely unaffected by changes in KCl. Conversely, increasing KCl from 15 to 75 mM increased apparent rate constants for RT-oligonucleotide binding (0.010-0.026 nM⁻¹ s⁻¹) and unbinding (0.2-1.5 s⁻¹). All rate constants increased between 22 and 42 °C. When evaluated by PCR quantification cycle, cDNA predictions differed from experiments using RNase H+ RT (average 1.7 cycles) and RNase H- RT (average 4.5 cycles). Decreasing H+ RT concentrations 10- to 10⁴-fold from manufacturer recommendations improved cDNA predictions (average 0.8 cycles) and increased RT-PCR assay efficiency. RT activity assays and models can be used to aid assay design and improve the speed of RT-PCRs. RT type and concentration must be selected to promote rapid cDNA synthesis but minimize nonspecific amplification. We demonstrate 2-min one-step RT-PCR of a Zika virus target using reduced RT concentrations and extreme PCR.


Subject(s)
RNA-Directed DNA Polymerase/genetics; RNA-Directed DNA Polymerase/metabolism; Reverse Transcriptase Polymerase Chain Reaction; Benzothiazoles; Diamines; Fluorescence; Humans; Kinetics; Organic Chemicals/chemistry; Quinolines
16.
Proc Natl Acad Sci U S A ; 114(37): E7841-E7850, 2017 09 12.
Article in English | MEDLINE | ID: mdl-28851834

ABSTRACT

Characterizing the collective regulatory impact of genetic variants on complex phenotypes is a major challenge in developing a genotype to phenotype map. Using expression quantitative trait locus (eQTL) analyses, we constructed bipartite networks in which edges represent significant associations between genetic variants and gene expression levels and found that the network structure informs regulatory function. We show, in 13 tissues, that these eQTL networks are organized into dense, highly modular communities grouping genes often involved in coherent biological processes. We find communities representing shared processes across tissues, as well as communities associated with tissue-specific processes that coalesce around variants in tissue-specific active chromatin regions. Node centrality is also highly informative, with the global and community hubs differing in regulatory potential and likelihood of being disease associated.


Subject(s)
Genome-Wide Association Study/methods; Organ Specificity/genetics; Quantitative Trait Loci/genetics; Gene Expression/genetics; Gene Expression Regulation/genetics; Gene Regulatory Networks/genetics; Genetic Predisposition to Disease/genetics; Genetic Variation; Genotype; Humans; Phenotype; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/physiology; Transcriptome/genetics
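
The bipartite eQTL network described in the abstract of record 16 (SNPs on one side, genes on the other, one edge per significant association) can be sketched as follows. This is only an illustration: the function names, the example SNP and gene identifiers, and the 0.05 significance cutoff are assumptions introduced here and are not taken from the paper.

```python
from collections import defaultdict


def build_eqtl_network(associations, threshold=0.05):
    """Build a bipartite SNP-gene network from (snp, gene, q_value) tuples.

    An edge is kept only when q_value <= threshold (hypothetical cutoff).
    Returns two adjacency maps: SNP -> genes and gene -> SNPs.
    """
    snp_to_genes = defaultdict(set)
    gene_to_snps = defaultdict(set)
    for snp, gene, q_value in associations:
        if q_value <= threshold:
            snp_to_genes[snp].add(gene)
            gene_to_snps[gene].add(snp)
    return dict(snp_to_genes), dict(gene_to_snps)


def degree(adjacency):
    """Number of neighbours per node, a simple measure of node centrality."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


# Made-up example associations (identifiers are placeholders).
example = [
    ("rs1", "GENE_A", 0.01),
    ("rs1", "GENE_B", 0.20),  # not significant, so no edge
    ("rs2", "GENE_A", 0.03),
    ("rs1", "GENE_C", 0.04),
]
snp_adj, gene_adj = build_eqtl_network(example)
snp_degree = degree(snp_adj)
gene_degree = degree(gene_adj)
```

Degree is used here as the simplest stand-in for the hub analysis the abstract mentions; community detection on the same adjacency maps would be a separate step.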
17.
Biostatistics ; 19(2): 185-198, 2018 04 01.
Article in English | MEDLINE | ID: mdl-29036413

ABSTRACT

Between-sample normalization is a critical step in genomic data analysis to remove systematic bias and unwanted technical variation in high-throughput data. Global normalization methods are based on the assumption that observed variability in global properties is due to technical reasons and is unrelated to the biology of interest. For example, some methods correct for differences in sequencing read counts by scaling features to have similar median values across samples, but these fail to reduce other forms of unwanted technical variation. Methods such as quantile normalization transform the statistical distributions across samples to be the same and assume global differences in the distribution are induced by only technical variation. However, it remains unclear how to proceed with normalization if these assumptions are violated, for example, if there are global differences in the statistical distributions between biological conditions or groups, and external information, such as negative or control features, is not available. Here, we introduce a generalization of quantile normalization, referred to as smooth quantile normalization (qsmooth), which is based on the assumption that the statistical distribution of each sample should be the same (or have the same distributional shape) within biological groups or conditions, but allowing that they may differ between groups. We illustrate the advantages of our method on several high-throughput datasets with global differences in distributions corresponding to different biological conditions. We also perform a Monte Carlo simulation study to illustrate the bias-variance tradeoff and root mean squared error of qsmooth compared to other global normalization methods. A software implementation is available from https://github.com/stephaniehicks/qsmooth.


Subject(s)
Biostatistics/methods; Data Interpretation, Statistical; Genomics/statistics & numerical data; High-Throughput Nucleotide Sequencing/statistics & numerical data; Models, Statistical; Humans
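
For readers unfamiliar with the baseline that the qsmooth abstract in record 17 generalizes, the following is a minimal sketch of ordinary quantile normalization, not of qsmooth itself. It assumes each sample is a list of values for the same features in the same order, and it ignores tie handling; the function name is a placeholder chosen for this example.

```python
def quantile_normalize(samples):
    """Force every sample to share the same empirical distribution.

    samples: list of equal-length lists (one list per sample).
    The reference distribution is the mean, across samples, of the values
    at each rank. Ties are broken by position (a simplification).
    """
    n_features = len(samples[0])
    sorted_samples = [sorted(sample) for sample in samples]
    # Reference quantiles: average of the r-th smallest value over samples.
    reference = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_features)
    ]
    normalized = []
    for sample in samples:
        # Feature indices ordered from smallest to largest value.
        order = sorted(range(n_features), key=lambda i: sample[i])
        out = [0.0] * n_features
        for rank, feature_index in enumerate(order):
            out[feature_index] = reference[rank]
        normalized.append(out)
    return normalized


result = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After this transformation every sample has identical sorted values, which is exactly the assumption the abstract says can fail when biological groups genuinely differ in their global distributions.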
18.
Cancer Causes Control ; 30(8): 799-811, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31069578

ABSTRACT

An important premise of epidemiology is that individuals with the same disease share similar underlying etiologies and clinical outcomes. In the past few decades, our knowledge of disease pathogenesis has improved, and disease classification systems have evolved to the point where no complex disease processes are considered homogenous. As a result, pathology and epidemiology have been integrated into the single, unified field of molecular pathological epidemiology (MPE). Advancing integrative molecular and population-level health sciences and addressing the unique research challenges specific to the field of MPE necessitates assembling experts in diverse fields, including epidemiology, pathology, biostatistics, computational biology, bioinformatics, genomics, immunology, and nutritional and environmental sciences. Integrating these seemingly divergent fields can lead to a greater understanding of pathogenic processes. The International MPE Meeting Series fosters discussion that addresses the specific research questions and challenges in this emerging field. The purpose of the meeting series is to: discuss novel methods to integrate pathology and epidemiology; discuss studies that provide pathogenic insights into population impact; and educate next-generation scientists. Herein, we share the proceedings of the Fourth International MPE Meeting, held in Boston, MA, USA, on 30 May-1 June, 2018. Major themes of this meeting included 'integrated genetic and molecular pathologic epidemiology', 'immunology-MPE', and 'novel disease phenotyping'. The key priority areas for future research identified by meeting attendees included integration of tumor immunology and cancer disparities into epidemiologic studies, further collaboration between computational and population-level scientists to gain new insight on exposure-disease associations, and future pooling projects of studies with comparable data.


Subject(s)
Epidemiology; Pathology, Molecular; Humans; Neoplasms/epidemiology; Neoplasms/genetics; Neoplasms/immunology; Neoplasms/pathology
19.
BMC Cancer ; 19(1): 1003, 2019 Oct 25.
Article in English | MEDLINE | ID: mdl-31653243

ABSTRACT

BACKGROUND: In biomedical research, network inference algorithms are typically used to infer complex association patterns between biological entities, such as between genes or proteins, using data from a population. This resulting aggregate network, in essence, averages over the networks of those individuals in the population. LIONESS (Linear Interpolation to Obtain Network Estimates for Single Samples) is a method that can be used together with a network inference algorithm to extract networks for individual samples in a population. The method's key characteristic is that, by modeling networks for individual samples in a data set, it can capture network heterogeneity in a population. LIONESS was originally made available as a function within the PANDA (Passing Attributes between Networks for Data Assimilation) regulatory network reconstruction framework. However, the LIONESS algorithm is generalizable and can be used to model single sample networks based on a wide range of network inference algorithms. RESULTS: In this software article, we describe lionessR, an R implementation of LIONESS that can be applied to any network inference method in R that outputs a complete, weighted adjacency matrix. As an example, we provide a vignette of an application of lionessR to model single sample networks based on correlated gene expression in a bone cancer dataset. We show how the tool can be used to identify differential patterns of correlation between two groups of patients. CONCLUSIONS: We developed lionessR, an open source R package to model single sample networks. We show how lionessR can be used to inform us on potential precision medicine applications in cancer. The lionessR package is a user-friendly tool to perform such analyses. The package, which includes a vignette describing the application, is freely available at: https://github.com/kuijjerlab/lionessR and at: http://bioconductor.org/packages/lionessR .


Subject(s)
Algorithms , Computational Biology/methods , Computer Simulation , Precision Medicine/methods , Software , Biopsy , Bone Neoplasms/genetics , Bone Neoplasms/pathology , Gene Regulatory Networks , Humans , Neoplasms/therapy , Osteosarcoma/genetics , Osteosarcoma/pathology , Survival Analysis , Transcriptome
20.
Hum Genomics ; 12(1): 1, 2018 Jan 15.
Article in English | MEDLINE | ID: mdl-29335020

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) significantly associated with chronic obstructive pulmonary disease (COPD). However, many genetic variants show suggestive evidence for association but do not meet the strict threshold for genome-wide significance. Integrative analysis of multiple omics datasets has the potential to identify novel genes involved in disease pathogenesis by leveraging these variants in a functional, regulatory context. RESULTS: We performed expression quantitative trait locus (eQTL) analysis using genome-wide SNP genotyping and gene expression profiling of lung tissue samples from 86 COPD cases and 31 controls, testing for SNPs associated with gene expression levels. These results were integrated with a prior COPD GWAS using an ensemble statistical and network methods approach to identify relevant genes and observe them in the context of overall genetic control of gene expression to highlight co-regulated genes and disease pathways. We identified 250,312 unique SNPs and 4997 genes in the cis(local)-eQTL analysis (5% false discovery rate). The top gene from the integrative analysis was MAPT, a gene recently identified in an independent GWAS of lung function. The genes HNRNPAB and PCBP2 with RNA binding activity and the gene ACVR1B were identified in network communities with validated disease relevance. CONCLUSIONS: The integration of lung tissue gene expression with genome-wide SNP genotyping and subsequent intersection with prior GWAS and omics studies highlighted candidate genes within COPD loci and in communities harboring known COPD genes. This integration also identified novel disease genes in sub-threshold regions that would otherwise have been missed through GWAS.


Subject(s)
Genetic Predisposition to Disease , Genome, Human/genetics , Genome-Wide Association Study , Pulmonary Disease, Chronic Obstructive/genetics , Activin Receptors, Type I/genetics , Adult , Aged , Female , Gene Expression Regulation , Genomics , Heterogeneous-Nuclear Ribonucleoprotein Group A-B/genetics , Humans , Lung/metabolism , Male , Middle Aged , Polymorphism, Single Nucleotide/genetics , Pulmonary Disease, Chronic Obstructive/pathology , Quantitative Trait Loci/genetics , RNA-Binding Proteins/genetics , tau Proteins/genetics