1.
Genes Dev ; 34(15-16): 1039-1050, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32561546

ABSTRACT

The FoxA transcription factors are critical for liver development through their pioneering activity, which initiates a highly complex regulatory network thought to become progressively resistant to the loss of any individual hepatic transcription factor via mutual redundancy. To investigate the dispensability of FoxA factors for maintaining this regulatory network, we ablated all FoxA genes in the adult mouse liver. Remarkably, loss of FoxA caused rapid and massive reduction in the expression of critical liver genes. Activity of these genes was reduced back to the low levels of the fetal prehepatic endoderm stage, leading to necrosis and lethality within days. Mechanistically, we found FoxA proteins to be required for maintaining enhancer activity, chromatin accessibility, nucleosome positioning, and binding of HNF4α. Thus, the FoxA factors act continuously, guarding hepatic enhancer activity throughout adult life.


Subjects
Forkhead Transcription Factors/physiology , Gene Regulatory Networks , Liver/metabolism , Animals , Binding Sites , Chromatin/metabolism , Genetic Enhancer Elements , Forkhead Transcription Factors/genetics , Forkhead Transcription Factors/metabolism , Gene Expression Regulation , Gene Knockdown Techniques , Hepatocyte Nuclear Factor 3-alpha/genetics , Hepatocyte Nuclear Factor 3-beta/genetics , Hepatocyte Nuclear Factor 3-gamma/genetics , Hepatocyte Nuclear Factor 4/metabolism , Liver/pathology , Liver Failure/etiology , Liver Failure/pathology , Male , Mice , Nucleosomes
2.
Diabetologia ; 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39240351

ABSTRACT

AIMS/HYPOTHESIS: Genome-wide association studies (GWAS) have identified hundreds of type 2 diabetes loci, with the vast majority of signals located in non-coding regions; as a consequence, it remains largely unclear which 'effector' genes these variants influence. Determining these effector genes has been hampered by the relatively challenging cellular settings in which they are hypothesised to confer their effects. METHODS: To implicate such effector genes, we elected to generate and integrate high-resolution promoter-focused Capture-C, assay for transposase-accessible chromatin with sequencing (ATAC-seq) and RNA-seq datasets to characterise chromatin and expression profiles in multiple cell lines relevant to type 2 diabetes for subsequent functional follow-up analyses: EndoC-BH1 (pancreatic beta cell), HepG2 (hepatocyte) and Simpson-Golabi-Behmel syndrome (SGBS; adipocyte). RESULTS: The subsequent variant-to-gene analysis implicated 810 candidate effector genes at 370 type 2 diabetes risk loci. Using partitioned linkage disequilibrium score regression, we observed enrichment for type 2 diabetes and fasting glucose GWAS loci in promoter-connected putative cis-regulatory elements in EndoC-BH1 cells as well as fasting insulin GWAS loci in SGBS cells. Moreover, as a proof of principle, when we knocked down expression of the SMCO4 gene in EndoC-BH1 cells, we observed a statistically significant increase in insulin secretion. CONCLUSIONS/INTERPRETATION: These results provide a resource for comparing tissue-specific data in tractable cellular models as opposed to relatively challenging primary cell settings. DATA AVAILABILITY: Raw and processed next-generation sequencing data for EndoC-BH1, HepG2, SGBS_undiff and SGBS_diff cells are deposited in GEO under the Superseries accession GSE262484. Promoter-focused Capture-C data are deposited under accession GSE262496. Hi-C data are deposited under accession GSE262481. 
Bulk ATAC-seq data are deposited under accession GSE262479. Bulk RNA-seq data are deposited under accession GSE262480.

3.
Development ; 148(6)2021 03 21.
Article in English | MEDLINE | ID: mdl-33653874

ABSTRACT

To gain a deeper understanding of pancreatic β-cell development, we used iterative weighted gene correlation network analysis to calculate a gene co-expression network (GCN) from 11 temporally and genetically defined murine cell populations. The GCN, which contained 91 distinct modules, was then used to gain three new biological insights. First, we found that the clustered protocadherin genes are differentially expressed during pancreas development. Pcdhγ genes are preferentially expressed in pancreatic endoderm, Pcdhβ genes in nascent islets, and Pcdhα genes in mature β-cells. Second, after extracting sub-networks of transcriptional regulators for each developmental stage, we identified 81 zinc finger protein (ZFP) genes that are preferentially expressed during endocrine specification and β-cell maturation. Third, we used the GCN to select three ZFPs for further analysis by CRISPR mutagenesis of mice. Zfp800 null mice exhibited early postnatal lethality, and at E18.5 their pancreata exhibited a reduced number of pancreatic endocrine cells, alterations in exocrine cell morphology, and marked changes in expression of genes involved in protein translation, hormone secretion and developmental pathways in the pancreas. Together, our results suggest that developmentally oriented GCNs have utility for gaining new insights into gene regulation during organogenesis.


Subjects
Cell Differentiation/genetics , Homeodomain Proteins/genetics , Organogenesis/genetics , Pancreas/growth & development , Animals , Cadherins/genetics , Cell Lineage/genetics , Developmental Gene Expression Regulation/genetics , Insulin/metabolism , Islets of Langerhans/cytology , Islets of Langerhans/metabolism , Mice , Pancreas/metabolism
4.
Hum Genet ; 141(9): 1529-1544, 2022 Sep.
Article in English | MEDLINE | ID: mdl-34713318

ABSTRACT

The genetic analysis of complex traits has been dominated by parametric statistical methods due to their theoretical properties, ease of use, computational efficiency, and intuitive interpretation. However, there are likely to be patterns arising from complex genetic architectures which are more easily detected and modeled using machine learning methods. Unfortunately, selecting the right machine learning algorithm and tuning its hyperparameters can be daunting for experts and non-experts alike. The goal of automated machine learning (AutoML) is to let a computer algorithm identify the right algorithms and hyperparameters thus taking the guesswork out of the optimization process. We review the promises and challenges of AutoML for the genetic analysis of complex traits and give an overview of several approaches and some example applications to omics data. It is our hope that this review will motivate studies to develop and evaluate novel AutoML methods and software in the genetics and genomics space. The promise of AutoML is to enable anyone, regardless of training or expertise, to apply machine learning as part of their genetic analysis strategy.


Subjects
Machine Learning , Multifactorial Inheritance , Algorithms , Genomics/methods , Humans , Software
5.
Am J Hum Genet ; 105(1): 89-107, 2019 07 03.
Article in English | MEDLINE | ID: mdl-31204013

ABSTRACT

Deciphering the impact of genetic variation on gene regulation is fundamental to understanding common, complex human diseases. Although histone modifications are important markers of gene regulatory elements of the genome, any specific histone modification has not been assayed in more than a few individuals in the human liver. As a result, the effects of genetic variation on histone modification states in the liver are poorly understood. Here, we generate the most comprehensive genome-wide dataset of two epigenetic marks, H3K4me3 and H3K27ac, and annotate thousands of putative regulatory elements in the human liver. We integrate these findings with genome-wide gene expression data collected from the same human liver tissues and high-resolution promoter-focused chromatin interaction maps collected from human liver-derived HepG2 cells. We demonstrate widespread functional consequences of natural genetic variation on putative regulatory element activity and gene expression levels. Leveraging these extensive datasets, we fine-map a total of 74 GWAS loci that have been associated with at least one complex phenotype. Our results reveal a repertoire of genes and regulatory mechanisms governing complex disease development and further the basic understanding of genetic and epigenetic regulation of gene expression in the human liver tissue.


Subjects
Chromatin/genetics , Chromosome Mapping/methods , Genetic Epigenesis , Liver/pathology , Multifactorial Inheritance/genetics , Single Nucleotide Polymorphism , Quantitative Trait Loci , Adolescent , Adult , Aged , Child , Chromatin/metabolism , Female , Genetic Association Studies , Hep G2 Cells , Histones/genetics , Humans , Liver/metabolism , Male , Middle Aged , Phenotype , Genetic Promoter Regions , Prospective Studies , Nucleic Acid Regulatory Sequences , Young Adult
6.
Genet Epidemiol ; 44(1): 52-66, 2020 01.
Article in English | MEDLINE | ID: mdl-31583758

ABSTRACT

Genetic interactions have been recognized as a potentially important contributor to the heritability of complex diseases. Nevertheless, due to small effect sizes and stringent multiple-testing correction, identifying genetic interactions in complex diseases is particularly challenging. To address the above challenges, many genomic research initiatives collaborate to form large-scale consortia and develop open access to enable sharing of genome-wide association study (GWAS) data. Despite the perceived benefits of data sharing from large consortia, a number of practical issues have arisen, such as privacy concerns on individual genomic information and heterogeneous data sources from distributed GWAS databases. In the context of large consortia, we demonstrate that the heterogeneously appearing marginal effects over distributed GWAS databases can offer new insights into genetic interactions for which conventional methods have had limited success. In this paper, we develop a novel two-stage testing procedure, named phylogenY-based effect-size tests for interactions using first 2 moments (YETI2), to detect genetic interactions through both pooled marginal effects, in terms of averaging site-specific marginal effects, and heterogeneity in marginal effects across sites, using a meta-analytic framework. YETI2 can not only be applied to large consortia without shared personal information but also can be used to leverage underlying heterogeneity in marginal effects to prioritize potential genetic interactions. We investigate the performance of YETI2 through simulation studies and apply YETI2 to bladder cancer data from dbGaP.
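
The two quantities the abstract combines, a pooled marginal effect and the heterogeneity of marginal effects across sites, can be illustrated with a standard fixed-effect meta-analysis. The sketch below is not the YETI2 procedure itself: it assumes inverse-variance weighting and Cochran's Q as stand-ins, and the per-site estimates are invented for illustration.

```python
def pooled_effect_and_heterogeneity(effects, std_errors):
    """Return the inverse-variance pooled effect and Cochran's Q across sites."""
    weights = [1.0 / (se ** 2) for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / total_weight
    # Q measures how far the site-specific effects scatter around the pooled one.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    return pooled, q


# Hypothetical per-site marginal effects for one SNP (not real data).
site_effects = [0.30, -0.25, 0.28, -0.31]
site_std_errors = [0.10, 0.10, 0.10, 0.10]

pooled, q = pooled_effect_and_heterogeneity(site_effects, site_std_errors)
print(f"pooled effect = {pooled:.3f}, Cochran's Q = {q:.2f}")
```

In this made-up case the pooled effect is close to zero while Q is large, which is the kind of pattern a test based only on averaged marginal effects would miss.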


Subjects
Genetic Epistasis/genetics , Genome-Wide Association Study/methods , Urinary Bladder Neoplasms/genetics , Humans , Information Dissemination , Genetic Models , Single Nucleotide Polymorphism/genetics
7.
BMC Bioinformatics ; 21(1): 430, 2020 Oct 01.
Article in English | MEDLINE | ID: mdl-32998684

ABSTRACT

BACKGROUND: A typical task in bioinformatics consists of identifying which features are associated with a target outcome of interest and building a predictive model. Automated machine learning (AutoML) systems such as the Tree-based Pipeline Optimization Tool (TPOT) constitute an appealing approach to this end. However, in biomedical data, there are often baseline characteristics of the subjects in a study or batch effects that need to be adjusted for in order to better isolate the effects of the features of interest on the target. Thus, the ability to perform covariate adjustments becomes particularly important for applications of AutoML to biomedical big data analysis. RESULTS: We developed an approach to adjust for covariates affecting features and/or target in TPOT. Our approach is based on regressing out the covariates in a manner that avoids 'leakage' during the cross-validation training procedure. We describe applications of this approach to toxicogenomics and schizophrenia gene expression data sets. The TPOT extensions discussed in this work are available at https://github.com/EpistasisLab/tpot/tree/v0.11.1-resAdj . CONCLUSIONS: In this work, we address an important need in the context of AutoML, which is particularly crucial for applications to bioinformatics and medical informatics, namely covariate adjustments. To this end we present a substantial extension of TPOT, a genetic programming based AutoML approach. We show the utility of this extension by applications to large toxicogenomics and differential gene expression data. The method is generally applicable in many other scenarios from the biomedical field.
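
The idea of regressing out a covariate without leakage can be sketched in plain Python: the adjustment model is fitted on the training fold only and then applied unchanged to the held-out fold. This is a minimal illustration under simplifying assumptions (one covariate, one feature, ordinary least squares, interleaved folds); it does not use or reproduce the TPOT extension, and all function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residualize(covariate, feature, intercept, slope):
    """Subtract the covariate's fitted contribution from the feature."""
    return [f - (intercept + slope * c) for c, f in zip(covariate, feature)]


def adjusted_folds(covariate, feature, n_folds):
    """Yield (train_residuals, test_residuals) per fold.

    The adjustment is estimated from training samples only, so no
    information from the held-out samples leaks into it.
    """
    n = len(feature)
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        test = [i for i in range(n) if i % n_folds == k]
        intercept, slope = fit_line([covariate[i] for i in train],
                                    [feature[i] for i in train])
        yield (
            residualize([covariate[i] for i in train],
                        [feature[i] for i in train], intercept, slope),
            residualize([covariate[i] for i in test],
                        [feature[i] for i in test], intercept, slope),
        )


# Hypothetical data: the feature is driven entirely by the covariate.
covariate = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
feature = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
folds = list(adjusted_folds(covariate, feature, n_folds=3))
```

Fitting the adjustment on all samples before splitting would let the held-out fold influence the coefficients, which is the leakage the abstract refers to.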


Subjects
Big Data , Data Analysis , Machine Learning , Algorithms , Automation , Humans
8.
Genet Epidemiol ; 43(6): 717-726, 2019 09.
Article in English | MEDLINE | ID: mdl-31145509

ABSTRACT

A typical task arising from main effect analyses in a Genome Wide Association Study (GWAS) is to identify single nucleotide polymorphisms (SNPs), in linkage disequilibrium with the observed signals, that are likely causal variants and the affected genes. The affected genes may not be those closest to associating SNPs. Functional genomics data from relevant tissues are believed to be helpful in selecting likely causal SNPs and interpreting implicated biological mechanisms, ultimately facilitating prevention and treatment in the case of a disease trait. These data are typically used post GWAS analyses to fine-map the statistically significant signals identified agnostically by testing all SNPs and applying a multiple testing correction. The number of tested SNPs is typically in the millions, so the multiple testing burden is high. Motivated by this, in this study we investigated an alternative workflow, which consists in utilizing the available functional genomics data as a first step to reduce the number of SNPs tested for association. We analyzed GWAS on electrocardiographic QRS duration using these two workflows. The alternative workflow identified more SNPs, including some residing in loci not discovered with the typical workflow. Moreover, the latter are corroborated by other reports on QRS duration. This indicates the potential value of incorporating functional genomics information at the onset in GWAS analyses.
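
The effect of pre-filtering on the multiple testing burden can be shown with a Bonferroni correction. The sketch below is illustrative only: the SNP counts, identifiers and p-values are invented, and Bonferroni is assumed as the correction because the abstract does not name the one used.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def significant(p_values, threshold):
    """Return the sorted identifiers whose p-value is below the threshold."""
    return sorted(snp for snp, p in p_values.items() if p < threshold)


# Hypothetical numbers of tested SNPs in each workflow.
typical_threshold = bonferroni_threshold(0.05, 1_000_000)   # all SNPs tested
filtered_threshold = bonferroni_threshold(0.05, 50_000)     # annotated SNPs only

# Hypothetical association p-values and functional annotation.
p_values = {"snp_a": 2e-9, "snp_b": 3e-7, "snp_c": 1e-4}
annotated = {"snp_a", "snp_b"}

typical_hits = significant(p_values, typical_threshold)
filtered_hits = significant(
    {snp: p for snp, p in p_values.items() if snp in annotated},
    filtered_threshold,
)
print(typical_hits, filtered_hits)
```

The trade-off is that a truly associated SNP lacking functional annotation is never tested in the filtered workflow.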


Subjects
Cardiomyopathies/genetics , Gene Expression Regulation , Genome-Wide Association Study , Single Nucleotide Polymorphism , Transcriptome , Humans , Linkage Disequilibrium , Phenotype , Genetic Promoter Regions , Workflow
9.
Am J Hum Genet ; 101(5): 643-663, 2017 Nov 02.
Article in English | MEDLINE | ID: mdl-29056226

ABSTRACT

Neurodegenerative diseases pose an extraordinary threat to the world's aging population, yet no disease-modifying therapies are available. Although genome-wide association studies (GWASs) have identified hundreds of risk loci for neurodegeneration, the mechanisms by which these loci influence disease risk are largely unknown. Here, we investigated the association between common genetic variants at the 7p21 locus and risk of the neurodegenerative disease frontotemporal lobar degeneration. We showed that variants associated with disease risk correlate with increased expression of the 7p21 gene TMEM106B and no other genes; co-localization analyses implicated a common causal variant underlying both association with disease and association with TMEM106B expression in lymphoblastoid cell lines and human brain. Furthermore, increases in the amount of TMEM106B resulted in increases in abnormal lysosomal phenotypes and cell toxicity in both immortalized cell lines and neurons. We then combined fine-mapping, bioinformatics, and bench-based approaches to functionally characterize all candidate causal variants at this locus. This approach identified a noncoding variant, rs1990620, that differentially recruits CTCF in lymphoblastoid cell lines and human brain to influence CTCF-mediated long-range chromatin-looping interactions between multiple cis-regulatory elements, including the TMEM106B promoter. Our findings thus provide an in-depth analysis of the 7p21 locus linked by GWASs to frontotemporal lobar degeneration, nominating a causal variant and causal mechanism for allele-specific expression and disease association at this locus. Finally, we show that genetic variants associated with risk of neurodegenerative diseases beyond frontotemporal lobar degeneration are enriched in CTCF-binding sites found in brain-relevant tissues, implicating CTCF-mediated gene regulation in risk of neurodegeneration more generally.


Subjects
Dementia/genetics , Gene Expression Regulation/genetics , Gene Expression/genetics , Membrane Proteins/genetics , Nerve Tissue Proteins/genetics , Single Nucleotide Polymorphism/genetics , Alleles , Brain/pathology , CCCTC-Binding Factor , Tumor Cell Line , Chromatin , Frontotemporal Lobar Degeneration/genetics , Genome-Wide Association Study , Genotype , HeLa Cells , Humans , Neurons/pathology , Phenotype , Genetic Promoter Regions/genetics , Repressor Proteins/genetics , Risk
10.
Artif Life ; 26(1): 23-37, 2020.
Article in English | MEDLINE | ID: mdl-32027528

ABSTRACT

Susceptibility to common human diseases such as cancer is influenced by many genetic and environmental factors that work together in a complex manner. The state of the art is to perform a genome-wide association study (GWAS) that measures millions of single-nucleotide polymorphisms (SNPs) throughout the genome followed by a one-SNP-at-a-time statistical analysis to detect univariate associations. This approach has identified thousands of genetic risk factors for hundreds of diseases. However, the genetic risk factors detected have very small effect sizes and collectively explain very little of the overall heritability of the disease. Nonetheless, it is assumed that the genetic component of risk is due to many independent risk factors that contribute additively. The fact that many genetic risk factors with small effects can be detected is taken as evidence to support this notion. It is our working hypothesis that the genetic architecture of common diseases is partly driven by non-additive interactions. To test this hypothesis, we developed a heuristic simulation-based method for conducting experiments about the complexity of genetic architecture. We show that a genetic architecture driven by complex interactions is highly consistent with the magnitude and distribution of univariate effects seen in real data. We compare our results with measures of univariate and interaction effects from two large-scale GWASs of sporadic breast cancer and find evidence to support our hypothesis that is consistent with the results of our computational experiment.
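
Why a one-factor-at-a-time analysis can miss non-additive interactions is easiest to see in a toy model. The sketch below is a textbook XOR-style illustration, not the authors' simulation method: it assumes two independent binary risk factors, each present in half of the population, and a hypothetical penetrance table.

```python
from itertools import product


def risk(a, b):
    """Hypothetical penetrance: risk is high only when exactly one factor is present."""
    return 0.9 if a != b else 0.1


def marginal_risk(locus, value):
    """Average risk with one factor fixed and the other at 50% frequency."""
    combos = [c for c in product((0, 1), repeat=2) if c[locus] == value]
    return sum(risk(a, b) for a, b in combos) / len(combos)


# Univariate view: each factor on its own shows no effect on risk.
marginal_difference = marginal_risk(0, 1) - marginal_risk(0, 0)

# Joint view: the same factor changes risk strongly once the other is fixed.
joint_difference = risk(1, 0) - risk(0, 0)

print(marginal_difference, joint_difference)
```

Real architectures are rarely this extreme, but the example shows how weak univariate signals can coexist with a strong interaction.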


Subjects
Computational Biology , Disease/genetics , Genome-Wide Association Study , Single Nucleotide Polymorphism , Computer Simulation , Humans