Results 1 - 20 of 21
1.
Cell ; 172(4): 650-665, 2018 02 08.
Article in English | MEDLINE | ID: mdl-29425488

ABSTRACT

Transcription factors (TFs) recognize specific DNA sequences to control chromatin and transcription, forming a complex system that guides expression of the genome. Despite keen interest in understanding how TFs control gene expression, it remains challenging to determine how the precise genomic binding sites of TFs are specified and how TF binding ultimately relates to regulation of transcription. This review considers how TFs are identified and functionally characterized, principally through the lens of a catalog of over 1,600 likely human TFs and binding motifs for two-thirds of them. Major classes of human TFs differ markedly in their evolutionary trajectories and expression patterns, underscoring distinct functions. TFs likewise underlie many different aspects of human physiology, disease, and variation, highlighting the importance of continued effort to understand TF-mediated gene regulation.


Subjects
Evolution, Molecular , Gene Expression Regulation , Response Elements , Transcription Factors , Amino Acid Motifs , Humans , Transcription Factors/chemistry , Transcription Factors/classification , Transcription Factors/genetics , Transcription Factors/metabolism
2.
Cell ; 158(6): 1431-1443, 2014 Sep 11.
Article in English | MEDLINE | ID: mdl-25215497

ABSTRACT

Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ∼1% of eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ∼34% of the ∼170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in chromatin immunoprecipitation sequencing (ChIP-seq) peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif "library" can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.


Subjects
Arabidopsis/genetics , Nucleotide Motifs , Sequence Analysis, DNA , Transcription Factors/metabolism , Arabidopsis/metabolism , Chromatin Immunoprecipitation , Humans , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Protein Binding , Quantitative Trait Loci
3.
Nature ; 616(7955): 123-131, 2023 04.
Article in English | MEDLINE | ID: mdl-36991119

ABSTRACT

The use of omic modalities to dissect the molecular underpinnings of common diseases and traits is becoming increasingly common. But multi-omic traits can be genetically predicted, which enables highly cost-effective and powerful analyses for studies that do not have multi-omics [1]. Here we examine a large cohort (the INTERVAL study [2]; n = 50,000 participants) with extensive multi-omic data for plasma proteomics (SomaScan, n = 3,175; Olink, n = 4,822), plasma metabolomics (Metabolon HD4, n = 8,153), serum metabolomics (Nightingale, n = 37,359) and whole-blood Illumina RNA sequencing (n = 4,136), and use machine learning to train genetic scores for 17,227 molecular traits, including 10,521 that reach Bonferroni-adjusted significance. We evaluate the performance of genetic scores through external validation across cohorts of individuals of European, Asian and African American ancestries. In addition, we show the utility of these multi-omic genetic scores by quantifying the genetic control of biological pathways and by generating a synthetic multi-omic dataset of the UK Biobank [3] to identify disease associations using a phenome-wide scan. We highlight a series of biological insights with regard to genetic mechanisms in metabolism and canonical pathway associations with disease; for example, JAK-STAT signalling and coronary atherosclerosis. Finally, we develop a portal ( https://www.omicspred.org/ ) to facilitate public access to all genetic scores and validation results, as well as to serve as a platform for future extensions and enhancements of multi-omic genetic scores.


Subjects
Coronary Artery Disease , Multiomics , Humans , Coronary Artery Disease/genetics , Coronary Artery Disease/metabolism , Metabolomics/methods , Phenotype , Proteomics/methods , Machine Learning , Black or African American/genetics , Asian/genetics , European People/genetics , United Kingdom , Datasets as Topic , Internet , Reproducibility of Results , Cohort Studies , Proteome/analysis , Proteome/metabolism , Metabolome , Plasma/metabolism , Databases, Factual
5.
Nature ; 591(7849): 211-219, 2021 03.
Article in English | MEDLINE | ID: mdl-33692554

ABSTRACT

Polygenic risk scores (PRSs), which often aggregate results from genome-wide association studies, can bridge the gap between initial discovery efforts and clinical applications for the estimation of disease risk using genetics. However, there is notable heterogeneity in the application and reporting of these risk scores, which hinders the translation of PRSs into clinical care. Here, in a collaboration between the Clinical Genome Resource (ClinGen) Complex Disease Working Group and the Polygenic Score (PGS) Catalog, we present the Polygenic Risk Score Reporting Standards (PRS-RS), in which we update the Genetic Risk Prediction Studies (GRIPS) Statement to reflect the present state of the field. Drawing on the input of experts in epidemiology, statistics, disease-specific applications, implementation and policy, this comprehensive reporting framework defines the minimal information that is needed to interpret and evaluate PRSs, especially with respect to downstream clinical applications. Items span detailed descriptions of study populations, statistical methods for the development and validation of PRSs and considerations for the potential limitations of these scores. In addition, we emphasize the need for data availability and transparency, and we encourage researchers to deposit and share PRSs through the PGS Catalog to facilitate reproducibility and comparative benchmarking. By providing these criteria in a structured format that builds on existing standards and ontologies, the use of this framework in publishing PRSs will facilitate translation into clinical care and progress towards defining best practice.


Subjects
Genetic Predisposition to Disease , Genetics, Medical/standards , Multifactorial Inheritance/genetics , Humans , Reproducibility of Results , Risk Assessment/standards
6.
Nucleic Acids Res ; 51(D1): D977-D985, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350656

ABSTRACT

The NHGRI-EBI GWAS Catalog (www.ebi.ac.uk/gwas) is a FAIR knowledgebase providing detailed, structured, standardised and interoperable genome-wide association study (GWAS) data to >200 000 users per year from academic research, healthcare and industry. The Catalog contains variant-trait associations and supporting metadata for >45 000 published GWAS across >5000 human traits, and >40 000 full P-value summary statistics datasets. Content is curated from publications or acquired via author submission of prepublication summary statistics through a new submission portal and validation tool. GWAS data volume has vastly increased in recent years. We have updated our software to meet this scaling challenge and to enable rapid release of submitted summary statistics. The scope of the repository has expanded to include additional data types of high interest to the community, including sequencing-based GWAS, gene-based analyses and copy number variation analyses. Community outreach has increased the number of shared datasets from under-represented traits, e.g. cancer, and we continue to contribute to awareness of the lack of population diversity in GWAS. Interoperability of the Catalog has been enhanced through links to other resources including the Polygenic Score Catalog and the International Mouse Phenotyping Consortium, refinements to GWAS trait annotation, and the development of a standard format for GWAS data.


Subjects
Genome-Wide Association Study , Knowledge Bases , Animals , Humans , Mice , DNA Copy Number Variations , National Human Genome Research Institute (U.S.) , Phenotype , Polymorphism, Single Nucleotide , Software , United States
7.
Hum Mol Genet ; 28(R2): R133-R142, 2019 11 21.
Article in English | MEDLINE | ID: mdl-31363735

ABSTRACT

Prediction of disease risk is an essential part of preventative medicine, often guiding clinical management. Risk prediction typically includes risk factors such as age, sex, family history of disease and lifestyle (e.g. smoking status); however, in recent years, there has been increasing interest to include genomic information into risk models. Polygenic risk scores (PRS) aggregate the effects of many genetic variants across the human genome into a single score and have recently been shown to have predictive value for multiple common diseases. In this review, we summarize the potential use cases for seven common diseases (breast cancer, prostate cancer, coronary artery disease, obesity, type 1 diabetes, type 2 diabetes and Alzheimer's disease) where PRS has or could have clinical utility. PRS analysis for these diseases frequently revolved around (i) risk prediction performance of a PRS alone and in combination with other non-genetic risk factors, (ii) estimation of lifetime risk trajectories, (iii) the independent information of PRS and family history of disease or monogenic mutations and (iv) estimation of the value of adding a PRS to specific clinical risk prediction scenarios. We summarize open questions regarding PRS usability, ancestry bias and transferability, emphasizing the need for the next wave of studies to focus on the implementation and health-economic value of PRS testing. In conclusion, it is becoming clear that PRS have value in disease risk prediction and there are multiple areas where this may have clinical utility.


Subjects
Genetic Predisposition to Disease , Multifactorial Inheritance , Alzheimer Disease/genetics , Breast Neoplasms/genetics , Coronary Artery Disease/genetics , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 2/genetics , Female , Genetic Predisposition to Disease/prevention & control , Genome-Wide Association Study , Humans , Male , Medical History Taking , Obesity/genetics , Prostatic Neoplasms/genetics , Reproducibility of Results , Risk Factors
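
The abstract above describes a polygenic risk score as an aggregate of the effects of many genetic variants into a single score. A minimal sketch of that idea, assuming the common weighted-sum form (effect-allele dosage times a per-variant weight), is given below. The variant IDs, weights, and the function name `polygenic_score` are hypothetical and chosen only for illustration; they do not come from the article.

```python
from typing import Dict


def polygenic_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages (0, 1 or 2 copies per variant).

    Variants that have a weight but no genotype are skipped, not imputed.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical variant IDs and effect sizes, for illustration only.
weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 1.0}
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_score(person, weights)  # 0.5*2 + (-0.25)*1 + 1.0*0
```

In practice the weights would be taken from a published score file, and missing genotypes are usually handled more carefully than by skipping them.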
8.
Bioinformatics ; 32(22): 3504-3506, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27466627

ABSTRACT

Measuring motif similarity is essential for identifying functionally related transcription factors (TFs) and RNA-binding proteins, and for annotating de novo motifs. Here, we describe Motif Similarity Based on Affinity of Targets (MoSBAT), an approach for measuring the similarity of motifs by computing their affinity profiles across a large number of random sequences. We show that MoSBAT successfully associates de novo ChIP-seq motifs with their respective TFs, accurately identifies motifs that are obtained from the same TF in different in vitro assays, and quantitatively reflects the similarity of in vitro binding preferences for pairs of TFs. AVAILABILITY AND IMPLEMENTATION: MoSBAT is available as a webserver at mosbat.ccbr.utoronto.ca, and for download at github.com/csglab/MoSBAT. CONTACT: t.hughes@utoronto.ca or hamed.najafabadi@mcgill.ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
RNA-Binding Proteins/genetics , Sequence Analysis, Protein/methods , Transcription Factors/genetics , Binding Sites , Protein Binding , Sequence Alignment
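
The abstract above describes comparing motifs through their affinity profiles across many random sequences. The sketch below illustrates that general idea only: it is not the MoSBAT implementation, and the affinity function (sum over windows of the product of per-position base probabilities), the Pearson correlation as similarity measure, and all function names are assumptions made for this example.

```python
import math
import random

BASES = "ACGT"


def pwm_from_consensus(consensus, p_match=0.85):
    """Toy position weight matrix: p_match on the consensus base, rest shared equally."""
    p_other = (1.0 - p_match) / 3
    return [{b: (p_match if b == c else p_other) for b in BASES} for c in consensus]


def affinity(pwm, seq):
    """Sum over all windows of the product of per-position base probabilities."""
    k = len(pwm)
    total = 0.0
    for i in range(len(seq) - k + 1):
        p = 1.0
        for j in range(k):
            p *= pwm[j][seq[i + j]]
        total += p
    return total


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("affinity profile has zero variance")
    return cov / math.sqrt(vx * vy)


def profile_similarity(pwm_a, pwm_b, n_seqs=500, seq_len=50, seed=0):
    """Correlate the affinity profiles of two motifs over the same random sequences."""
    rng = random.Random(seed)
    seqs = ["".join(rng.choice(BASES) for _ in range(seq_len)) for _ in range(n_seqs)]
    return pearson([affinity(pwm_a, s) for s in seqs],
                   [affinity(pwm_b, s) for s in seqs])


# Hypothetical toy motifs, for illustration only.
motif_a = pwm_from_consensus("TGACTCA")
motif_b = pwm_from_consensus("GGGCGGG")
self_similarity = profile_similarity(motif_a, motif_a)
cross_similarity = profile_similarity(motif_a, motif_b)
```

Because both motifs are scored on the same sequences, the comparison does not require aligning the motifs to each other, which is the appeal of a profile-based measure.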
9.
Genome Med ; 16(1): 33, 2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38373998

ABSTRACT

Polygenic scores (PGS) can be used for risk stratification by quantifying individuals' genetic predisposition to disease, and many potentially clinically useful applications have been proposed. Here, we review the latest potential benefits of PGS in the clinic and challenges to implementation. PGS could augment risk stratification through combined use with traditional risk factors (demographics, disease-specific risk factors, family history, etc.), to support diagnostic pathways, to predict groups with therapeutic benefits, and to increase the efficiency of clinical trials. However, there exist challenges to maximizing the clinical utility of PGS, including FAIR (Findable, Accessible, Interoperable, and Reusable) use and standardized sharing of the genomic data needed to develop and recalculate PGS, the equitable performance of PGS across populations and ancestries, the generation of robust and reproducible PGS calculations, and the responsible communication and interpretation of results. We outline how these challenges may be overcome analytically and with more diverse data as well as highlight sustained community efforts to achieve equitable, impactful, and responsible use of PGS in healthcare.


Subjects
Communication , Genetic Predisposition to Disease , Humans , Genomics , Multifactorial Inheritance , Risk Factors , Genome-Wide Association Study
10.
medRxiv ; 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38699308

ABSTRACT

Blood cell phenotypes are routinely tested in healthcare to inform clinical decisions. Genetic variants influencing mean blood cell phenotypes have been used to understand disease aetiology and improve prediction; however, additional information may be captured by genetic effects on observed variance. Here, we mapped variance quantitative trait loci (vQTL), i.e. genetic loci associated with trait variance, for 29 blood cell phenotypes from the UK Biobank (N~408,111). We discovered 176 independent blood cell vQTLs, of which 147 were not found by additive QTL mapping. vQTLs displayed on average 1.8-fold stronger negative selection than additive QTL, highlighting that selection acts to reduce extreme blood cell phenotypes. Variance polygenic scores (vPGSs) were constructed to stratify individuals in the INTERVAL cohort (N~40,466), where genetically less variable individuals (low vPGS) had higher conventional PGS accuracy (by ~19%) than genetically more variable individuals. Genetic prediction of blood cell traits improved by ~10% on average when combining PGS with vPGS. Using Mendelian randomisation and vPGS association analyses, we found that alcohol consumption significantly increased blood cell trait variances, highlighting the utility of blood cell vQTLs and vPGSs to provide novel insight into phenotype aetiology as well as improve prediction.
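
The abstract above defines a vQTL as a locus associated with trait variance instead of the trait mean. The abstract does not say which statistical test was used, so the sketch below substitutes a simple Brown-Forsythe-style stand-in: regress each individual's absolute deviation from their genotype-group median on genotype dosage. The function name `abs_dev_slope` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def abs_dev_slope(genotypes, phenotypes):
    """Least-squares slope of |phenotype - group median| on genotype dosage.

    A slope near zero suggests no variance effect; a non-zero slope suggests
    that trait spread changes with the number of effect alleles.
    """
    groups = defaultdict(list)
    for g, y in zip(genotypes, phenotypes):
        groups[g].append(y)
    med = {g: median(ys) for g, ys in groups.items()}
    dev = [abs(y - med[g]) for g, y in zip(genotypes, phenotypes)]

    n = len(dev)
    mean_g = sum(genotypes) / n
    mean_d = sum(dev) / n
    sxy = sum((g - mean_g) * (d - mean_d) for g, d in zip(genotypes, dev))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    return sxy / sxx


# Toy data, for illustration only: dosage 0, 1, 2 with four people each.
dosage = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
spread_grows = [9, 10, 10, 11, 8, 10, 10, 12, 7, 10, 10, 13]   # same mean, wider spread
mean_shifts = [9, 10, 10, 11, 10, 11, 11, 12, 11, 12, 12, 13]  # shifted mean, same spread

variance_effect = abs_dev_slope(dosage, spread_grows)
mean_only_effect = abs_dev_slope(dosage, mean_shifts)
```

The second toy trait shows why deviations are taken from the group median: a pure mean shift across genotypes leaves the slope at zero.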

11.
medRxiv ; 2024 May 31.
Article in English | MEDLINE | ID: mdl-38853961

ABSTRACT

Polygenic scores (PGS) have transformed human genetic research and have multiple potential clinical applications, including risk stratification for disease prevention and prediction of treatment response. Here, we present a series of recent enhancements to the PGS Catalog (www.PGSCatalog.org), the largest findable, accessible, interoperable, and reusable (FAIR) repository of PGS. These include expansions in data content and ancestral diversity as well as the addition of new features. We further present the PGS Catalog Calculator (pgsc_calc, https://github.com/PGScatalog/pgsc_calc), an open-source, scalable and portable pipeline to reproducibly calculate PGS that securely democratizes equitable PGS applications by implementing genetic ancestry estimation and score normalization using reference data. With the PGS Catalog & calculator users can now quantify an individual's genetic predisposition for hundreds of common diseases and clinically relevant traits. Taken together, these updates and tools facilitate the next generation of PGS, thus lowering barriers to the clinical studies necessary to identify where PGS may be integrated into clinical practice.
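
The abstract above mentions score normalization using reference data. It does not describe the method, so the sketch below shows only the simplest generic form: a z-score of an individual's raw score against the mean and standard deviation of scores in a reference panel. It is not the pgsc_calc procedure and does not call pgsc_calc; the function name `normalize_score` and the reference values are hypothetical.

```python
from statistics import mean, stdev


def normalize_score(raw, reference_scores):
    """Express a raw score in standard deviations from the reference-panel mean."""
    mu = mean(reference_scores)
    sd = stdev(reference_scores)  # sample standard deviation
    if sd == 0:
        raise ValueError("reference scores have zero variance")
    return (raw - mu) / sd


# Hypothetical raw scores from a reference panel, for illustration only.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]

z_average = normalize_score(3.0, reference)
z_high = normalize_score(5.0, reference)
```

A normalized score is only comparable across individuals when the reference panel is appropriate for the individual's genetic ancestry, which is why the abstract pairs normalization with ancestry estimation.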

12.
J Community Genet ; 14(5): 453-458, 2023 Oct.
Article in English | MEDLINE | ID: mdl-36763324

ABSTRACT

The aim of this patient and public involvement and engagement (PPIE) work was to explore improvised theatre as a tool for facilitating bi-directional dialogue between researchers and patients/members of the public on the topic of polygenic risk scores (PRS) use within primary or secondary care. PRS are a tool to quantify genetic risk for a heritable disease or trait and may be used to predict future health outcomes. In the United Kingdom (UK), they are often cited as a next-in-line public health tool to be implemented, and their use in consumer genetic testing as well as patient-facing settings is increasing. Despite their potential clinical utility, broader themes about how they might influence an individual's perception of disease risk and decision-making are an active area of research; however, this has mostly been in the setting of return of results to patients. We worked with a youth theatre group and patients involved in a PPIE group to develop two short plays about public perceptions of genetic risk information that could be captured by PRS. These plays were shared in a workshop with patients/members of the public to facilitate discussions about PRS and their perceived benefits, concerns and emotional reactions. Discussions with both performers and patients/public raised three key questions: (1) can the data be trusted?; (2) does knowing genetic risk actually help the patient?; and (3) what makes a life worthwhile? Creating and watching fictional narratives helped all participants explore the potential use of PRS in a clinical setting, informing future research considerations and improving communication between the researchers and lay members of the PPIE group.

13.
Sci Data ; 10(1): 64, 2023 01 31.
Article in English | MEDLINE | ID: mdl-36720882

ABSTRACT

Metabolic biomarker data quantified by nuclear magnetic resonance (NMR) spectroscopy in approximately 121,000 UK Biobank participants has recently been released as a community resource, comprising absolute concentrations and ratios of 249 circulating metabolites, lipids, and lipoprotein sub-fractions. Here we identify and characterise additional sources of unwanted technical variation influencing individual biomarkers in the data available to download from UK Biobank. These included sample preparation time, shipping plate well, spectrometer batch effects, drift over time within spectrometer, and outlier shipping plates. We developed a procedure for removing this unwanted technical variation, and demonstrate that it increases signal for genetic and epidemiological studies of the NMR metabolic biomarker data in UK Biobank. We subsequently developed an R package, ukbnmr, which we make available to the wider research community to enhance the utility of the UK Biobank NMR metabolic biomarker data and to facilitate rapid analysis.


Subjects
Biological Specimen Banks , Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging/methods , Magnetic Resonance Imaging/standards , Magnetic Resonance Spectroscopy , Quality Control , United Kingdom
14.
Nat Commun ; 13(1): 7356, 2022 11 29.
Article in English | MEDLINE | ID: mdl-36446790

ABSTRACT

Understanding how genetic variants influence disease risk and complex traits (variant-to-function) is one of the major challenges in human genetics. Here we present a model-driven framework to leverage human genome-scale metabolic networks to define how genetic variants affect biochemical reaction fluxes across major human tissues, including skeletal muscle, adipose, liver, brain and heart. As proof of concept, we build personalised organ-specific metabolic flux models for 524,615 individuals of the INTERVAL and UK Biobank cohorts and perform a fluxome-wide association study (FWAS) to identify 4312 associations between personalised flux values and the concentration of metabolites in blood. Furthermore, we apply FWAS to identify 92 metabolic fluxes associated with the risk of developing coronary artery disease, many of which are linked to processes previously described to play a role in the disease. Our work demonstrates that genetically personalised metabolic models can elucidate the downstream effects of genetic variants on biochemical reactions involved in common human diseases.


Subjects
Adipose Tissue , Coronary Artery Disease , Humans , Brain , Genome, Human , Heart
15.
Nat Commun ; 13(1): 4664, 2022 08 09.
Article in English | MEDLINE | ID: mdl-35945198

ABSTRACT

Individuals with South Asian ancestry have a higher risk of heart disease than other groups but have been largely excluded from genetic research. Using data from 22,000 British Pakistani and Bangladeshi individuals with linked electronic health records from the Genes & Health cohort, we conducted genome-wide association studies of coronary artery disease and its key risk factors. Using power-adjusted transferability ratios, we found evidence for transferability for the majority of cardiometabolic loci powered to replicate. The performance of polygenic scores was high for lipids and blood pressure, but lower for BMI and coronary artery disease. Adding a polygenic score for coronary artery disease to clinical risk factors showed significant improvement in reclassification. In Mendelian randomisation using transferable loci as instruments, our findings were consistent with results in European-ancestry individuals. Taken together, trait-specific transferability of trait loci between populations is an important consideration with implications for risk prediction and causal inference.


Subjects
Coronary Artery Disease , Genome-Wide Association Study , Asian People/genetics , Coronary Artery Disease/epidemiology , Coronary Artery Disease/genetics , Genetic Loci , Humans , Pakistan , Polymorphism, Single Nucleotide
16.
Nat Metab ; 3(11): 1476-1483, 2021 11.
Article in English | MEDLINE | ID: mdl-34750571

ABSTRACT

Cardiometabolic diseases are frequently polygenic in architecture, comprising a large number of risk alleles with small effects spread across the genome [1-3]. Polygenic scores (PGS) aggregate these into a metric representing an individual's genetic predisposition to disease. PGS have shown promise for early risk prediction [4-7] and there is an open question as to whether PGS can also be used to understand disease biology [8]. Here, we demonstrate that cardiometabolic disease PGS can be used to elucidate the proteins underlying disease pathogenesis. In 3,087 healthy individuals, we found that PGS for coronary artery disease, type 2 diabetes, chronic kidney disease and ischaemic stroke are associated with the levels of 49 plasma proteins. Associations were polygenic in architecture, largely independent of cis and trans protein quantitative trait loci and present for proteins without quantitative trait loci. Over a follow-up of 7.7 years, 28 of these proteins associated with future myocardial infarction or type 2 diabetes events, 16 of which were mediators between polygenic risk and incident disease. Twelve of these were druggable targets with therapeutic potential. Our results demonstrate the potential for PGS to uncover causal disease biology and targets with therapeutic potential, including those that may be missed by approaches utilizing information at a single locus.


Subjects
Blood Proteins , Heart Diseases/etiology , Heart Diseases/metabolism , Metabolic Diseases/etiology , Metabolic Diseases/metabolism , Multifactorial Inheritance , Proteome , Adult , Biomarkers , Disease Management , Disease Susceptibility , England/epidemiology , Female , Genetic Predisposition to Disease , Heart Diseases/diagnosis , Heart Diseases/epidemiology , Humans , Male , Metabolic Diseases/diagnosis , Metabolic Diseases/epidemiology , Middle Aged , Public Health Surveillance , Young Adult
17.
Nat Genet ; 51(6): 981-989, 2019 06.
Article in English | MEDLINE | ID: mdl-31133749

ABSTRACT

Transcription factor (TF) binding specificities (motifs) are essential for the analysis of gene regulation. Accurate prediction of TF motifs is critical, because it is infeasible to assay all TFs in all sequenced eukaryotic genomes. There is ongoing controversy regarding the degree of motif diversification among related species that is, in part, because of uncertainty in motif prediction methods. Here we describe similarity regression, a significantly improved method for predicting motifs, which we use to update and expand the Cis-BP database. Similarity regression inherently quantifies TF motif evolution, and shows that previous claims of near-complete conservation of motifs between human and Drosophila are inflated, with nearly half of the motifs in each species absent from the other, largely due to extensive divergence in C2H2 zinc finger proteins. We conclude that diversification in DNA-binding motifs is pervasive, and present a new tool and updated resource to study TF diversity and gene regulation across eukaryotes.


Subjects
Base Sequence , Binding Sites , Evolution, Molecular , Transcription Factors/metabolism , Animals , Computational Biology/methods , Conserved Sequence , Databases, Genetic , Gene Expression Regulation , Humans , Nucleotide Motifs , Protein Binding
18.
G3 (Bethesda) ; 8(1): 219-229, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29146583

ABSTRACT

KRAB C2H2 zinc finger proteins (KZNFs) are the largest and most diverse family of human transcription factors, likely due to diversifying selection driven by novel endogenous retroelements (EREs), but the vast majority lack binding motifs or functional data. Two recent studies analyzed a majority of the human KZNFs using either ChIP-seq (60 proteins) or ChIP-exo (221 proteins) in the same cell type (HEK293). The ChIP-exo paper did not describe binding motifs, however. Thirty-nine proteins are represented in both studies, enabling the systematic comparison of the data sets presented here. Typically, only a minority of peaks overlap, but the two studies nonetheless display significant similarity in ERE binding for 32/39, and yield highly similar DNA binding motifs for 23 and related motifs for 34 (MoSBAT similarity score >0.5 and >0.2, respectively). Thus, there is overall (albeit imperfect) agreement between the two studies. For the 242 proteins represented in at least one study, we selected a highest-confidence motif for each protein, utilizing several motif-derivation approaches, and evaluating motifs within and across data sets. Peaks for the majority (158) are enriched (96% with AUC >0.6 predicting peak vs. nonpeak) for a motif that is supported by the C2H2 "recognition code," consistent with intrinsic sequence specificity driving DNA binding in cells. An additional 63 yield motifs enriched in peaks, but not supported by the recognition code, which could reflect indirect binding. Altogether, these analyses validate both data sets, and provide a reference motif set with associated quality metrics.


Subjects
CYS2-HIS2 Zinc Fingers , Repressor Proteins/genetics , Retroelements , Base Sequence , Binding Sites , Chromatin Immunoprecipitation , Gene Expression , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans , Multigene Family , Protein Binding , Protein Isoforms/genetics , Protein Isoforms/metabolism , Repressor Proteins/metabolism
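
The abstract above reports motif enrichment as an AUC for predicting peak versus nonpeak sequences. A minimal sketch of how such an AUC can be computed from motif scores is given below, using the pairwise (Mann-Whitney) definition: the fraction of peak/nonpeak pairs in which the peak sequence scores higher, with ties counted as half. The function name `auc` and the toy scores are hypothetical, and the article's own scoring pipeline is not reproduced here.

```python
def auc(peak_scores, nonpeak_scores):
    """Probability that a random peak outscores a random nonpeak (ties count 0.5)."""
    wins = 0.0
    for p in peak_scores:
        for n in nonpeak_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(peak_scores) * len(nonpeak_scores))


# Hypothetical best-match motif scores per sequence, for illustration only.
peaks = [0.9, 0.8, 0.4]
nonpeaks = [0.7, 0.3, 0.2]

enrichment = auc(peaks, nonpeaks)  # 8 of the 9 pairs favour the peak sequence
```

An AUC of 0.5 means the motif does not separate the two sets at all, so the abstract's threshold of 0.6 marks a modest but non-random enrichment.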
20.
Elife ; 4, 2015 Apr 23.
Article in English | MEDLINE | ID: mdl-25905672

ABSTRACT

Caenorhabditis elegans is a powerful model for studying gene regulation, as it has a compact genome and a wealth of genomic tools. However, identification of regulatory elements has been limited, as DNA-binding motifs are known for only 71 of the estimated 763 sequence-specific transcription factors (TFs). To address this problem, we performed protein binding microarray experiments on representatives of canonical TF families in C. elegans, obtaining motifs for 129 TFs. Additionally, we predict motifs for many TFs that have DNA-binding domains similar to those already characterized, increasing coverage of binding specificities to 292 C. elegans TFs (∼40%). These data highlight the diversification of binding motifs for the nuclear hormone receptor and C2H2 zinc finger families and reveal unexpected diversity of motifs for T-box and DM families. Motif enrichment in promoters of functionally related genes is consistent with known biology and also identifies putative regulatory roles for unstudied TFs.


Subjects
Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans/genetics , DNA, Helminth/genetics , Transcription Factors/genetics , Zinc Fingers/genetics , Amino Acid Sequence , Animals , Base Sequence , Binding Sites , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/chemistry , Caenorhabditis elegans Proteins/metabolism , DNA, Helminth/chemistry , DNA, Helminth/metabolism , Gene Expression Regulation , Gene Regulatory Networks , Molecular Sequence Data , Promoter Regions, Genetic , Protein Binding , Protein Interaction Domains and Motifs , Receptors, Cytoplasmic and Nuclear , Transcription Factors/chemistry , Transcription Factors/metabolism