Results 1 - 20 of 75
1.
Annu Rev Cell Dev Biol ; 35: 357-379, 2019 10 06.
Article in English | MEDLINE | ID: mdl-31283382

ABSTRACT

Eukaryotic transcription factors (TFs) from the same structural family tend to bind similar DNA sequences, despite the ability of these TFs to execute distinct functions in vivo. The cell partly resolves this specificity paradox through combinatorial strategies and the use of low-affinity binding sites, which are better able to distinguish between similar TFs. However, because these sites have low affinity, it is challenging to understand how TFs recognize them in vivo. Here, we summarize recent findings and technological advancements that allow for the quantification and mechanistic interpretation of TF recognition across a wide range of affinities. We propose a model that integrates insights from the fields of genetics and cell biology to provide further conceptual understanding of TF binding specificity. We argue that in eukaryotes, target specificity is driven by an inhomogeneous 3D nuclear distribution of TFs and by variation in DNA binding affinity such that locally elevated TF concentration allows low-affinity binding sites to be functional.


Subjects
Eukaryota/metabolism, Nucleic Acid Regulatory Sequences, Transcription Factors/metabolism, Animals, Binding Sites, Gene Expression Regulation, Humans
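
The closing argument of the abstract above, that a locally elevated TF concentration can make low-affinity sites functional, can be illustrated with a simple equilibrium occupancy calculation. The sketch below assumes one-site, two-state binding, where occupancy = [TF] / ([TF] + Kd). The function name and all numbers are hypothetical and are not taken from the review.

```python
def occupancy(tf_conc: float, kd: float) -> float:
    """Equilibrium fractional occupancy of a single site (simple two-state binding)."""
    return tf_conc / (tf_conc + kd)


# Hypothetical values in arbitrary concentration units (assumptions, not data).
high_affinity_kd = 10.0
low_affinity_kd = 100.0  # a 10-fold weaker site
bulk_conc = 10.0         # assumed nucleus-wide average TF concentration
local_conc = 100.0       # assumed 10-fold local enrichment of the TF

occ_high_bulk = occupancy(bulk_conc, high_affinity_kd)
occ_low_bulk = occupancy(bulk_conc, low_affinity_kd)
occ_low_local = occupancy(local_conc, low_affinity_kd)

print(f"high-affinity site, bulk concentration:  {occ_high_bulk:.2f}")
print(f"low-affinity site, bulk concentration:   {occ_low_bulk:.2f}")
print(f"low-affinity site, local concentration:  {occ_low_local:.2f}")
```

Under these assumed numbers, a 10-fold weaker site at a 10-fold higher local concentration reaches the same occupancy as the strong site at bulk concentration, which is the intuition behind the proposed model.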
2.
Cell ; 161(2): 307-18, 2015 Apr 09.
Article in English | MEDLINE | ID: mdl-25843630

ABSTRACT

Protein-DNA binding is mediated by the recognition of the chemical signatures of the DNA bases and the 3D shape of the DNA molecule. Because DNA shape is a consequence of sequence, it is difficult to dissociate these modes of recognition. Here, we tease them apart in the context of Hox-DNA binding by mutating residues that, in a co-crystal structure, only recognize DNA shape. Complexes made with these mutants lose the preference to bind sequences with specific DNA shape features. Introducing shape-recognizing residues from one Hox protein to another swapped binding specificities in vitro and gene regulation in vivo. Statistical machine learning revealed that the accuracy of binding specificity predictions improves by adding shape features to a model that only depends on sequence, and feature selection identified shape features important for recognition. Thus, shape readout is a direct and independent component of binding site selection by Hox proteins.


Subjects
DNA/chemistry, DNA/metabolism, Drosophila Proteins/chemistry, Drosophila Proteins/metabolism, Drosophila melanogaster/metabolism, Transcription Factors/chemistry, Transcription Factors/metabolism, Amino Acid Sequence, Animals, X-Ray Crystallography, Homeodomain Proteins/chemistry, Homeodomain Proteins/metabolism, Molecular Sequence Data, Nucleic Acid Conformation, Protein Binding, Sequence Alignment
3.
Cell ; 154(3): 676-690, 2013 Aug 01.
Article in English | MEDLINE | ID: mdl-23911329

ABSTRACT

Reduced insulin/IGF-1-like signaling (IIS) extends C. elegans lifespan by upregulating stress response (class I) and downregulating other (class II) genes through a mechanism that depends on the conserved transcription factor DAF-16/FOXO. By integrating genome-wide mRNA expression responsiveness to DAF-16 with genome-wide in vivo binding data for a compendium of transcription factors, we discovered that PQM-1 is the elusive transcriptional activator that directly controls development (class II) genes by binding to the DAF-16-associated element (DAE). DAF-16 directly regulates class I genes only, through the DAF-16-binding element (DBE). Loss of PQM-1 suppresses daf-2 longevity and further slows development. Surprisingly, the nuclear localization of PQM-1 and DAF-16 is controlled by IIS in opposite ways and was also found to be mutually antagonistic. We observe progressive loss of nuclear PQM-1 with age, explaining declining expression of PQM-1 targets. Together, our data suggest an elegant mechanism for balancing stress response and development.


Subjects
Caenorhabditis elegans Proteins/metabolism, Caenorhabditis elegans/growth & development, Caenorhabditis elegans/metabolism, Developmental Gene Expression Regulation, Longevity, Trans-Activators/metabolism, Animals, Forkhead Transcription Factors, Insulin Receptor/metabolism, Nucleic Acid Regulatory Sequences, Transcription Factors/metabolism, Transcriptional Activation
4.
Mol Cell ; 78(1): 152-167.e11, 2020 04 02.
Article in English | MEDLINE | ID: mdl-32053778

ABSTRACT

Eukaryotic transcription factors (TFs) form complexes with various partner proteins to recognize their genomic target sites. Yet, how the DNA sequence determines which TF complex forms at any given site is poorly understood. Here, we demonstrate that high-throughput in vitro DNA binding assays coupled with unbiased computational analysis provide unprecedented insight into how different DNA sequences select distinct compositions and configurations of homeodomain TF complexes. Using inferred knowledge about minor groove width readout, we design targeted protein mutations that destabilize homeodomain binding both in vitro and in vivo in a complex-specific manner. By performing parallel systematic evolution of ligands by exponential enrichment sequencing (SELEX-seq), chromatin immunoprecipitation sequencing (ChIP-seq), RNA sequencing (RNA-seq), and Hi-C assays, we not only classify the majority of in vivo binding events in terms of complex composition but also infer complex-specific functions by perturbing the gene regulatory network controlled by a single complex.


Subjects
DNA/chemistry, Drosophila Proteins/metabolism, Gene Expression Regulation, Homeodomain Proteins/metabolism, Transcription Factors/metabolism, Animals, Base Sequence, Binding Sites, DNA/metabolism, Drosophila Proteins/chemistry, Drosophila Proteins/genetics, Drosophila melanogaster/genetics, Drosophila melanogaster/metabolism, Homeodomain Proteins/chemistry, Homeodomain Proteins/genetics, Mutation, Nucleic Acid Conformation, Protein Binding, Transcription Factors/chemistry, Transcription Factors/genetics
5.
Cell ; 147(6): 1270-82, 2011 Dec 09.
Article in English | MEDLINE | ID: mdl-22153072

ABSTRACT

Members of transcription factor families typically have similar DNA binding specificities yet execute unique functions in vivo. Transcription factors often bind DNA as multiprotein complexes, raising the possibility that complex formation might modify their DNA binding specificities. To test this hypothesis, we developed an experimental and computational platform, SELEX-seq, that can be used to determine the relative affinities to any DNA sequence for any transcription factor complex. Applying this method to all eight Drosophila Hox proteins, we show that they obtain novel recognition properties when they bind DNA with the dimeric cofactor Extradenticle-Homothorax (Exd). Exd-Hox specificities group into three main classes that obey Hox gene collinearity rules, and DNA structure predictions suggest that anterior and posterior Hox proteins prefer DNA sequences with distinct minor groove topographies. Together, these data suggest that emergent DNA recognition properties revealed by interactions with cofactors contribute to transcription factor specificities in vivo.


Subjects
DNA/metabolism, Drosophila Proteins/metabolism, Drosophila/metabolism, Homeodomain Proteins/metabolism, Protein Multimerization, Transcription Factors/metabolism, Amino Acid Sequence, Animals, Drosophila Proteins/chemistry, Genetic Techniques, Homeodomain Proteins/chemistry, Molecular Sequence Data, Protein Tertiary Structure, Transcription Factors/chemistry
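
The abstract above states that SELEX-seq yields relative affinities for DNA sequences but does not describe the computation. One common way to approximate relative affinity from selection data is to compare k-mer frequencies before and after selection. The sketch below is a simplified illustration under that assumption and is not the published pipeline: it ignores reverse complements and any modeling of the initial library, and all function names and reads are hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurrence across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def relative_affinities(initial_reads, selected_reads, k, rounds=1):
    """Estimate relative affinity as per-round k-mer enrichment, scaled so the best k-mer is 1."""
    c0 = kmer_counts(initial_reads, k)
    cr = kmer_counts(selected_reads, k)
    n0, nr = sum(c0.values()), sum(cr.values())
    enrichment = {}
    for kmer, count in cr.items():
        if c0[kmer] == 0:
            continue  # no estimate possible without an initial-library count
        ratio = (count / nr) / (c0[kmer] / n0)
        enrichment[kmer] = ratio ** (1.0 / rounds)
    if not enrichment:
        raise ValueError("no k-mer is present in both libraries")
    best = max(enrichment.values())
    return {kmer: value / best for kmer, value in enrichment.items()}


# Toy reads, made up for illustration only.
initial = ["AAAT", "TTTA", "AAAT", "TTTA"]
selected = ["AAAT", "AAAT", "AAAT", "TTTA"]
affinities = relative_affinities(initial, selected, k=4, rounds=1)
print(affinities)
```

Taking the `rounds`-th root reflects the assumption that each selection round multiplies a sequence's frequency by a factor proportional to its affinity.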
6.
Cell ; 143(2): 212-24, 2010 Oct 15.
Article in English | MEDLINE | ID: mdl-20888037

ABSTRACT

Chromatin is important for the regulation of transcription and other functions, yet the diversity of chromatin composition and the distribution along chromosomes are still poorly characterized. By integrative analysis of genome-wide binding maps of 53 broadly selected chromatin components in Drosophila cells, we show that the genome is segmented into five principal chromatin types that are defined by unique yet overlapping combinations of proteins and form domains that can extend over > 100 kb. We identify a repressive chromatin type that covers about half of the genome and lacks classic heterochromatin markers. Furthermore, transcriptionally active euchromatin consists of two types that differ in molecular organization and H3K36 methylation and regulate distinct classes of genes. Finally, we provide evidence that the different chromatin types help to target DNA-binding factors to specific genomic regions. These results provide a global view of chromatin diversity and domain organization in a metazoan cell.


Subjects
Chromatin/classification, DNA-Binding Proteins/analysis, Drosophila Proteins/analysis, Drosophila melanogaster/genetics, Animals, Cell Line, Chromatin/metabolism, DNA-Binding Proteins/metabolism, Drosophila Proteins/metabolism, Drosophila melanogaster/metabolism, Euchromatin/metabolism, Heterochromatin/metabolism, Histones/metabolism, Principal Component Analysis
7.
Nucleic Acids Res ; 51(11): 5499-5511, 2023 06 23.
Article in English | MEDLINE | ID: mdl-37013986

ABSTRACT

Classic promoter mutagenesis strategies can be used to study how proximal promoter regions regulate the expression of particular genes of interest. This is a laborious process, in which the smallest sub-region of the promoter still capable of recapitulating expression in an ectopic setting is first identified, followed by targeted mutation of putative transcription factor binding sites. Massively parallel reporter assays such as survey of regulatory elements (SuRE) provide an alternative way to study millions of promoter fragments in parallel. Here we show how a generalized linear model (GLM) can be used to transform genome-scale SuRE data into a high-resolution genomic track that quantifies the contribution of local sequence to promoter activity. This coefficient track helps identify regulatory elements and can be used to predict promoter activity of any sub-region in the genome. It thus allows in silico dissection of any promoter in the human genome to be performed. We developed a web application, available at cissector.nki.nl, that lets researchers easily perform this analysis as a starting point for their research into any promoter of interest.


Subjects
Genetic Promoter Regions, Software, Humans, Binding Sites, Human Genome/genetics, Protein Binding, Nucleic Acid Regulatory Sequences
8.
Nucleic Acids Res ; 51(18): 9690-9702, 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37650627

ABSTRACT

TP53 is a transcription factor that controls multiple cellular processes, including cell cycle arrest, DNA repair and apoptosis. The relation between TP53 binding site architecture and transcriptional output is still not fully understood. Here, we systematically examined in three different cell lines the effects of binding site affinity and copy number on TP53-dependent transcriptional output, and also probed the impact of spacer length and sequence between adjacent binding sites, and of core promoter identity. Paradoxically, we found that high-affinity TP53 binding sites are less potent than medium-affinity sites. TP53 achieves supra-additive transcriptional activation through optimally spaced adjacent binding sites, suggesting a cooperative mechanism. Optimally spaced adjacent binding sites have a ∼10-bp periodicity, suggesting a role for spatial orientation along the DNA double helix. We leveraged these insights to construct a log-linear model that explains activity from sequence features, and to identify new highly active and sensitive TP53 reporters.
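
The abstract above mentions a log-linear model that explains reporter activity from sequence features. In a log-linear model, the logarithm of activity is a weighted sum of features, so each feature acts as a multiplicative factor on activity. The sketch below only illustrates that general form; the feature names and weights are hypothetical and are not the fitted values from the study.

```python
import math


def log_linear_activity(features, weights, intercept=0.0):
    """Predict activity as exp(intercept + sum of weight * feature value)."""
    log_activity = intercept + sum(weights[name] * value for name, value in features.items())
    return math.exp(log_activity)


# Hypothetical weights (assumptions for illustration, not fitted parameters).
weights = {"n_sites": 0.4, "optimal_spacing": 0.9}

base = log_linear_activity({"n_sites": 2, "optimal_spacing": 0}, weights)
spaced = log_linear_activity({"n_sites": 2, "optimal_spacing": 1}, weights)

# The fold change depends only on the weight of the feature that changed.
fold_change = spaced / base
print(f"fold change from optimal spacing: {fold_change:.2f}")
```

Because the model is additive on the log scale, the same fold change applies whatever the number of sites, which is what makes such a model easy to interpret feature by feature.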

9.
PLoS Genet ; 18(1): e1009719, 2022 01.
Article in English | MEDLINE | ID: mdl-35100260

ABSTRACT

Tens of thousands of genetic variants associated with gene expression (cis-eQTLs) have been discovered in the human population. These eQTLs are active in various tissues and contexts, but the molecular mechanisms of eQTL variability are poorly understood, hindering our understanding of genetic regulation across biological contexts. Since many eQTLs are believed to act by altering transcription factor (TF) binding affinity, we hypothesized that analyzing eQTL effect size as a function of TF level may allow discovery of mechanisms of eQTL variability. Using GTEx Consortium eQTL data from 49 tissues, we analyzed the interaction between eQTL effect size and TF level across tissues and across individuals within specific tissues and generated a list of 10,098 TF-eQTL interactions across 2,136 genes that are supported by at least two lines of evidence. These TF-eQTLs were enriched for various TF binding measures, supporting with orthogonal evidence that these eQTLs are regulated by the implicated TFs. We also found that our TF-eQTLs tend to overlap genes with gene-by-environment regulatory effects and to colocalize with GWAS loci, implying that our approach can help to elucidate mechanisms of context-specificity and trait associations. Finally, we highlight an interesting example of IKZF1 TF regulation of an APBB1IP gene eQTL that colocalizes with a GWAS signal for blood cell traits. Together, our findings provide candidate TF mechanisms for a large number of eQTLs and offer a generalizable approach for researchers to discover TF regulators of genetic variant effects in additional QTL datasets.


Subjects
Quantitative Trait Loci, Transcription Factors/physiology, Alleles, Binding Sites, Gene Knockdown Techniques, Gene-Environment Interaction, Genome-Wide Association Study, Humans, Interferon Regulatory Factor-1/genetics, Genetic Models, Phenotype, Transcription Factors/metabolism
10.
Nucleic Acids Res ; 48(9): 5037-5053, 2020 05 21.
Article in English | MEDLINE | ID: mdl-32315032

ABSTRACT

CRISPR RNA-guided endonucleases (RGEs) cut or direct activities to specific genomic loci, yet each has off-target activities that are often unpredictable. We developed a pair of simple in vitro assays to systematically measure the DNA-binding specificity (Spec-seq), catalytic activity specificity (SEAM-seq) and cleavage efficiency of RGEs. By separately quantifying binding and cleavage specificity, Spec/SEAM-seq provides detailed mechanistic insight into off-target activity. Feature-based models generated from Spec/SEAM-seq data for SpCas9 were consistent with previous reports of its in vitro and in vivo specificity, validating the approach. Spec/SEAM-seq is also useful for profiling less-well characterized RGEs. Application to an engineered SpCas9, HiFi-SpCas9, indicated that its enhanced target discrimination can be attributed to cleavage rather than binding specificity. The ortholog ScCas9, on the other hand, derives specificity from binding to an extended PAM. The decreased off-target activity of AsCas12a (Cpf1) appears to be primarily driven by DNA-binding specificity. Finally, we performed the first characterization of CasX specificity, revealing an all-or-nothing mechanism where mismatches can be bound, but not cleaved. Together, these applications establish Spec/SEAM-seq as an accessible method to rapidly and reliably evaluate the specificity of RGEs, Cas::gRNA pairs, and gain insight into the mechanism and thermodynamics of target discrimination.


Subjects
CRISPR-Associated Proteins/metabolism, Endodeoxyribonucleases/metabolism, Acidaminococcus/enzymology, Base Pair Mismatch, Base Pairing, CRISPR-Associated Proteins/genetics, DNA/chemistry, DNA/metabolism, DNA Cleavage, Deltaproteobacteria/enzymology, Endodeoxyribonucleases/genetics, Mutation, Nanog Homeobox Protein/genetics, Protein Binding, RNA/chemistry, SELEX Aptamer Technique, DNA Sequence Analysis, Substrate Specificity
11.
Genome Res ; 28(1): 111-121, 2018 01.
Article in English | MEDLINE | ID: mdl-29196557

ABSTRACT

The DNA-binding interfaces of the androgen (AR) and glucocorticoid (GR) receptors are virtually identical, yet these transcription factors share only about a third of their genomic binding sites and regulate similarly distinct sets of target genes. To address this paradox, we determined the intrinsic specificities of the AR and GR DNA-binding domains using a refined version of SELEX-seq. We developed an algorithm, SelexGLM, that quantifies binding specificity over a large (31-bp) binding site by iteratively fitting a feature-based generalized linear model to SELEX probe counts. This analysis revealed that the DNA-binding preferences of AR and GR homodimers differ significantly, both within and outside the 15-bp core binding site. The relative preference between the two factors can be tuned over a wide range by changing the DNA sequence, with AR more sensitive to sequence changes than GR. The specificity of AR extends to the regions flanking the core 15-bp site, where isothermal calorimetry measurements reveal that affinity is augmented by enthalpy-driven readout of poly(A) sequences associated with narrowed minor groove width. We conclude that the increased specificity of AR is correlated with more enthalpy-driven binding than GR. The binding models help explain differences in AR and GR genomic binding and provide a biophysical rationale for how promiscuous binding by GR allows functional substitution for AR in some castration-resistant prostate cancers.


Subjects
Androgen Receptor Antagonists, Neoplasm Proteins, Castration-Resistant Prostatic Neoplasms, Androgen Receptors/metabolism, Glucocorticoid Receptors, SELEX Aptamer Technique/methods, Androgen Receptor Antagonists/chemical synthesis, Androgen Receptor Antagonists/chemistry, Androgen Receptor Antagonists/pharmacology, Nucleotide Aptamers/chemical synthesis, Nucleotide Aptamers/chemistry, Nucleotide Aptamers/pharmacology, Tumor Cell Line, Humans, Male, Neoplasm Proteins/antagonists & inhibitors, Neoplasm Proteins/metabolism, Glucocorticoid Receptors/antagonists & inhibitors, Glucocorticoid Receptors/metabolism
12.
Proc Natl Acad Sci U S A ; 115(16): E3692-E3701, 2018 04 17.
Article in English | MEDLINE | ID: mdl-29610332

ABSTRACT

Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes.


Subjects
DNA Footprinting/methods, DNA-Binding Proteins/metabolism, DNA/metabolism, Animals, Binding Sites, Datasets as Topic, Drosophila Proteins/metabolism, Electrophoretic Mobility Shift Assay, Genetic Enhancer Elements, Gene Library, Homeodomain Proteins/metabolism, Humans, Molecular Models, Protein Binding, Protein Conformation, Recombinant Proteins/metabolism, Transcription Factors/metabolism, Tumor Suppressor Protein p53/metabolism
13.
Mol Cell ; 48(5): 799-810, 2012 Dec 14.
Article in English | MEDLINE | ID: mdl-23102701

ABSTRACT

The p53 tumor suppressor utilizes multiple mechanisms to selectively regulate its myriad target genes, which in turn mediate diverse cellular processes. Here, using conventional and single-molecule mRNA analyses, we demonstrate that the nucleoporin Nup98 is required for full expression of p21, a key effector of the p53 pathway, but not several other p53 target genes. Nup98 regulates p21 mRNA levels by a posttranscriptional mechanism in which a complex containing Nup98 and the p21 mRNA 3'UTR protects p21 mRNA from degradation by the exosome. An in silico approach revealed another p53 target (14-3-3σ) to be similarly regulated by Nup98. The expression of Nup98 is reduced in murine and human hepatocellular carcinomas (HCCs) and correlates with p21 expression in HCC patients. Our study elucidates a previously unrecognized function of wild-type Nup98 in regulating select p53 target genes that is distinct from the well-characterized oncogenic properties of Nup98 fusion proteins.


Subjects
Hepatocellular Carcinoma/metabolism, Liver Neoplasms/metabolism, Nuclear Pore Complex Proteins/metabolism, Post-Transcriptional RNA Processing, Messenger RNA/metabolism, Tumor Suppressor Protein p53/metabolism, 14-3-3 Proteins/genetics, 14-3-3 Proteins/metabolism, 3' Untranslated Regions, ATP-Binding Cassette Transporter Subfamily B/genetics, ATP-Binding Cassette Transporter Subfamily B/metabolism, Animals, Phytogenic Antineoplastic Agents/pharmacology, Apoptosis/drug effects, Binding Sites, Camptothecin/pharmacology, Hepatocellular Carcinoma/genetics, Hepatocellular Carcinoma/pathology, Cellular Senescence, Cyclin-Dependent Kinase Inhibitor p21/genetics, Cyclin-Dependent Kinase Inhibitor p21/metabolism, Exosomes/metabolism, Neoplastic Gene Expression Regulation, Hep G2 Cells, Humans, Liver Neoplasms/genetics, Liver Neoplasms/pathology, Male, Mice, Knockout Mice, Nuclear Pore Complex Proteins/genetics, RNA Interference, RNA Stability, Time Factors, Transfection, Tumor Suppressor Protein p53/genetics, ATP-Binding Cassette Transporter Subfamily B Member 4
14.
Mol Syst Biol ; 14(2): e7902, 2018 02 22.
Article in English | MEDLINE | ID: mdl-29472273

ABSTRACT

Transcription factors (TFs) interpret DNA sequence by probing the chemical and structural properties of the nucleotide polymer. DNA shape is thought to enable a parsimonious representation of dependencies between nucleotide positions. Here, we propose a unified mathematical representation of the DNA sequence dependence of shape and TF binding, respectively, which simplifies and enhances analysis of shape readout. First, we demonstrate that linear models based on mononucleotide features alone account for 60-70% of the variance in minor groove width, roll, helix twist, and propeller twist. This explains why simple scoring matrices that ignore all dependencies between nucleotide positions can partially account for DNA shape readout by a TF. Adding dinucleotide features as sequence-to-shape predictors to our model, we can almost perfectly explain the shape parameters. Building on this observation, we developed a post hoc analysis method that can be used to analyze any mechanism-agnostic protein-DNA binding model in terms of shape readout. Our insights provide an alternative strategy for using DNA shape information to enhance our understanding of how cis-regulatory codes are interpreted by the cellular machinery.


Subjects
Computational Biology/methods, DNA/chemistry, Transcription Factors/metabolism, Binding Sites, DNA/metabolism, Molecular Models, Theoretical Models, Nucleic Acid Conformation
15.
Proc Natl Acad Sci U S A ; 113(13): E1835-43, 2016 Mar 29.
Article in English | MEDLINE | ID: mdl-26966232

ABSTRACT

Regulation of gene expression by transcription factors (TFs) is highly dependent on genetic background and interactions with cofactors. Identifying specific context factors is a major challenge that requires new approaches. Here we show that exploiting natural variation is a potent strategy for probing functional interactions within gene regulatory networks. We developed an algorithm to identify genetic polymorphisms that modulate the regulatory connectivity between specific transcription factors and their target genes in vivo. As a proof of principle, we mapped connectivity quantitative trait loci (cQTLs) using parallel genotype and gene expression data for segregants from a cross between two strains of the yeast Saccharomyces cerevisiae. We identified a nonsynonymous mutation in the DIG2 gene as a cQTL for the transcription factor Ste12p and confirmed this prediction empirically. We also identified three polymorphisms in TAF13 as putative modulators of regulation by Gcn4p. Our method has potential for revealing how genetic differences among individuals influence gene regulatory networks in any organism for which gene expression and genotype data are available along with information on binding preferences for transcription factors.


Subjects
Gene Regulatory Networks, Quantitative Trait Loci, Saccharomyces cerevisiae Proteins/genetics, Transcription Factors/genetics, Algorithms, Basic-Leucine Zipper Transcription Factors/genetics, Fungal Gene Expression Regulation, Gene Ontology, Fungal Mating-Type Genes/genetics, Genetic Models, Mutation, Genetic Promoter Regions, Reproducibility of Results, Saccharomyces cerevisiae/genetics
16.
Proc Natl Acad Sci U S A ; 112(15): 4654-9, 2015 Apr 14.
Article in English | MEDLINE | ID: mdl-25775564

ABSTRACT

DNA binding specificities of transcription factors (TFs) are a key component of gene regulatory processes. Underlying mechanisms that explain the highly specific binding of TFs to their genomic target sites are poorly understood. A better understanding of TF-DNA binding requires the ability to quantitatively model TF binding to accessible DNA as its basic step, before additional in vivo components can be considered. Traditionally, these models were built based on nucleotide sequence. Here, we integrated 3D DNA shape information derived with a high-throughput approach into the modeling of TF binding specificities. Using support vector regression, we trained quantitative models of TF binding specificity based on protein binding microarray (PBM) data for 68 mammalian TFs. The evaluation of our models included cross-validation on specific PBM array designs, testing across different PBM array designs, and using PBM-trained models to predict relative binding affinities derived from in vitro selection combined with deep sequencing (SELEX-seq). Our results showed that shape-augmented models compared favorably to sequence-based models. Although both k-mer and DNA shape features can encode interdependencies between nucleotide positions of the binding site, using DNA shape features reduced the dimensionality of the feature space. In addition, analyzing the feature weights of DNA shape-augmented models uncovered TF family-specific structural readout mechanisms that were not revealed by the DNA sequence. As such, this work combines knowledge from structural biology and genomics, and suggests a new path toward understanding TF binding and genome function.


Subjects
DNA/chemistry, DNA/metabolism, Nucleic Acid Conformation, Transcription Factors/metabolism, Algorithms, Animals, Base Sequence, Binding Sites/genetics, Computational Biology/methods, DNA/genetics, High-Throughput Nucleotide Sequencing, Humans, Kinetics, Mice, Genetic Models, Protein Array Analysis, Protein Binding, Transcription Factors/genetics
17.
Nucleic Acids Res ; 43(21): e142, 2015 Dec 02.
Article in English | MEDLINE | ID: mdl-26184874

ABSTRACT

Insight into the three-dimensional architecture of RNA is essential for understanding its cellular functions. However, even the classic transfer RNA structure contains features that are overlooked by existing bioinformatics tools. Here we present DSSR (Dissecting the Spatial Structure of RNA), an integrated and automated tool for analyzing and annotating RNA tertiary structures. The software identifies canonical and noncanonical base pairs, including those with modified nucleotides, in any tautomeric or protonation state. DSSR detects higher-order coplanar base associations, termed multiplets. It finds arrays of stacked pairs, classifies them by base-pair identity and backbone connectivity, and distinguishes a stem of covalently connected canonical pairs from a helix of stacked pairs of arbitrary type/linkage. DSSR identifies coaxial stacking of multiple stems within a single helix and lists isolated canonical pairs that lie outside of a stem. The program characterizes 'closed' loops of various types (hairpin, bulge, internal, and junction loops) and pseudoknots of arbitrary complexity. Notably, DSSR employs isolated pairs and the ends of stems, whether pseudoknotted or not, to define junction loops. This new, inclusive definition provides a novel perspective on the spatial organization of RNA. Tests on all nucleic acid structures in the Protein Data Bank confirm the efficiency and robustness of the software, and applications to representative RNA molecules illustrate its unique features. DSSR and related materials are freely available at http://x3dna.org/.


Subjects
RNA/chemistry, Software, CRISPR-Associated Proteins/chemistry, DNA/chemistry, Protein Databases, Molecular Sequence Annotation, Nucleic Acid Conformation, Catalytic RNA/chemistry, Fungal RNA/chemistry, Phenylalanine Transfer RNA/chemistry, Viral RNA/chemistry, Riboswitch
18.
Proc Natl Acad Sci U S A ; 111(15): 5747-52, 2014 Apr 15.
Article in English | MEDLINE | ID: mdl-24706889

ABSTRACT

Retroviral insertional mutagenesis is a powerful tool for identifying putative cancer genes in mice. To uncover the regulatory mechanisms by which common insertion loci affect downstream processes, we supplemented genotyping data with genome-wide mRNA expression profiling data for 97 tumors induced by retroviral insertional mutagenesis. We developed locus expression signature analysis, an algorithm to construct and interpret the differential gene expression signature associated with each common insertion locus. Comparing locus expression signatures to promoter affinity profiles allowed us to build a detailed map of transcription factors whose protein-level regulatory activity is modulated by a particular locus. We also predicted a large set of drugs that might mitigate the effect of the insertion on tumorigenesis. Taken together, our results demonstrate the potential of a locus-specific signature approach for identifying mammalian regulatory mechanisms in a cancer context.


Subjects
Carcinogenesis/metabolism, Computational Biology/methods, DNA Damage, Neoplastic Gene Expression Regulation/genetics, Gene Regulatory Networks/genetics, Genetic Variation, Neoplasms/genetics, Analysis of Variance, Animals, Carcinogenesis/genetics, Cluster Analysis, Enzyme Inhibitors/pharmacology, Gene Expression Profiling, Neoplastic Gene Expression Regulation/drug effects, Gene Ontology, High-Throughput Screening Assays/methods, Mice, Phosphoinositide-3 Kinase Inhibitors
19.
Proc Natl Acad Sci U S A ; 110(16): 6376-81, 2013 Apr 16.
Article in English | MEDLINE | ID: mdl-23576721

ABSTRACT

DNA binding proteins find their cognate sequences within genomic DNA through recognition of specific chemical and structural features. Here we demonstrate that high-resolution DNase I cleavage profiles can provide detailed information about the shape and chemical modification status of genomic DNA. Analyzing millions of DNA backbone hydrolysis events on naked genomic DNA, we show that the intrinsic rate of cleavage by DNase I closely tracks the width of the minor groove. Integration of these DNase I cleavage data with bisulfite sequencing data for the same cell type's genome reveals that cleavage directly adjacent to cytosine-phosphate-guanine (CpG) dinucleotides is enhanced at least eightfold by cytosine methylation. This phenomenon we show to be attributable to methylation-induced narrowing of the minor groove. Furthermore, we demonstrate that it enables simultaneous mapping of DNase I hypersensitivity and regional DNA methylation levels using dense in vivo cleavage data. Taken together, our results suggest a general mechanism by which CpG methylation can modulate protein-DNA interaction strength via the remodeling of DNA shape.


Subjects
DNA Methylation/genetics, DNA/chemistry, Deoxyribonuclease I, Genomics/methods, Molecular Models, Nucleic Acid Conformation, Cultured Cells, CpG Islands/genetics, DNA/metabolism, Deoxyribonuclease I/metabolism, Humans, Genetic Models, DNA Sequence Analysis
20.
Proc Natl Acad Sci U S A ; 109(16): 6030-5, 2012 Apr 17.
Article in English | MEDLINE | ID: mdl-22460799

ABSTRACT

TLS/FUS (TLS) is a multifunctional protein implicated in a wide range of cellular processes, including transcription and mRNA processing, as well as in both cancer and neurological disease. However, little is currently known about TLS target genes and how they are recognized. Here, we used ChIP and promoter microarrays to identify genes potentially regulated by TLS. Among these genes, we detected a number that correlate with previously known functions of TLS, and confirmed TLS occupancy at several of them by ChIP. We also detected changes in mRNA levels of these target genes in cells where TLS levels were altered, indicative of both activation and repression. Next, we used data from the microarray and computational methods to determine whether specific sequences were enriched in DNA fragments bound by TLS. This analysis suggested the existence of TLS response elements, and we show that purified TLS indeed binds these sequences with specificity in vitro. Remarkably, however, TLS binds only single-strand versions of the sequences. Taken together, our results indicate that TLS regulates expression of specific target genes, likely via recognition of specific single-stranded DNA sequences located within their promoter regions.


Subjects
Single-Stranded DNA/genetics, Neoplastic Gene Expression Regulation, RNA-Binding Protein FUS/metabolism, Response Elements/genetics, Base Sequence, Competitive Binding, Western Blotting, Tumor Cell Line, Chromatin Immunoprecipitation, Gene Expression Profiling, Humans, Oligonucleotide Array Sequence Analysis, Genetic Promoter Regions/genetics, Protein Binding, Protein Interaction Domains and Motifs/genetics, RNA-Binding Protein FUS/genetics