Results 1 - 20 of 74
1.
Annu Rev Cell Dev Biol ; 35: 357-379, 2019 Oct 06.
Article in English | MEDLINE | ID: mdl-31283382

ABSTRACT

Eukaryotic transcription factors (TFs) from the same structural family tend to bind similar DNA sequences, despite the ability of these TFs to execute distinct functions in vivo. The cell partly resolves this specificity paradox through combinatorial strategies and the use of low-affinity binding sites, which are better able to distinguish between similar TFs. However, because these sites have low affinity, it is challenging to understand how TFs recognize them in vivo. Here, we summarize recent findings and technological advancements that allow for the quantification and mechanistic interpretation of TF recognition across a wide range of affinities. We propose a model that integrates insights from the fields of genetics and cell biology to provide further conceptual understanding of TF binding specificity. We argue that in eukaryotes, target specificity is driven by an inhomogeneous 3D nuclear distribution of TFs and by variation in DNA binding affinity such that locally elevated TF concentration allows low-affinity binding sites to be functional.


Subject(s)
Eukaryota/metabolism; Regulatory Sequences, Nucleic Acid; Transcription Factors/metabolism; Animals; Binding Sites; Gene Expression Regulation; Humans
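
A minimal sketch of the occupancy argument in the abstract above, assuming simple 1:1 equilibrium binding (occupancy = [TF] / ([TF] + Kd)). The Kd values and concentrations are hypothetical, chosen only to illustrate how a locally elevated TF concentration could make a low-affinity site functional; they are not taken from the review.

```python
def occupancy(tf_conc_nm: float, kd_nm: float) -> float:
    """Equilibrium fractional occupancy of one site under simple 1:1 binding."""
    return tf_conc_nm / (tf_conc_nm + kd_nm)


# Hypothetical dissociation constants, for illustration only.
HIGH_AFFINITY_KD_NM = 10.0
LOW_AFFINITY_KD_NM = 1000.0

# Hypothetical concentrations: a nuclear average and a 100-fold local enrichment.
for label, conc_nm in [("nuclear average", 20.0), ("local hub", 2000.0)]:
    high = occupancy(conc_nm, HIGH_AFFINITY_KD_NM)
    low = occupancy(conc_nm, LOW_AFFINITY_KD_NM)
    print(f"{label:>15}: high-affinity {high:.2f}, low-affinity {low:.2f}")
```

Under these assumed numbers, the low-affinity site is almost unoccupied at the average concentration, and at the elevated local concentration it reaches the occupancy that the high-affinity site has at the nuclear average.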
2.
Cell ; 161(2): 307-18, 2015 Apr 09.
Article in English | MEDLINE | ID: mdl-25843630

ABSTRACT

Protein-DNA binding is mediated by the recognition of the chemical signatures of the DNA bases and the 3D shape of the DNA molecule. Because DNA shape is a consequence of sequence, it is difficult to dissociate these modes of recognition. Here, we tease them apart in the context of Hox-DNA binding by mutating residues that, in a co-crystal structure, only recognize DNA shape. Complexes made with these mutants lose the preference to bind sequences with specific DNA shape features. Introducing shape-recognizing residues from one Hox protein to another swapped binding specificities in vitro and gene regulation in vivo. Statistical machine learning revealed that the accuracy of binding specificity predictions improves by adding shape features to a model that only depends on sequence, and feature selection identified shape features important for recognition. Thus, shape readout is a direct and independent component of binding site selection by Hox proteins.


Subject(s)
DNA/chemistry; DNA/metabolism; Drosophila Proteins/chemistry; Drosophila Proteins/metabolism; Drosophila melanogaster/metabolism; Transcription Factors/chemistry; Transcription Factors/metabolism; Amino Acid Sequence; Animals; Crystallography, X-Ray; Homeodomain Proteins/chemistry; Homeodomain Proteins/metabolism; Molecular Sequence Data; Nucleic Acid Conformation; Protein Binding; Sequence Alignment
3.
Cell ; 154(3): 676-690, 2013 Aug 01.
Article in English | MEDLINE | ID: mdl-23911329

ABSTRACT

Reduced insulin/IGF-1-like signaling (IIS) extends C. elegans lifespan by upregulating stress response (class I) and downregulating other (class II) genes through a mechanism that depends on the conserved transcription factor DAF-16/FOXO. By integrating genome-wide mRNA expression responsiveness to DAF-16 with genome-wide in vivo binding data for a compendium of transcription factors, we discovered that PQM-1 is the elusive transcriptional activator that directly controls development (class II) genes by binding to the DAF-16-associated element (DAE). DAF-16 directly regulates class I genes only, through the DAF-16-binding element (DBE). Loss of PQM-1 suppresses daf-2 longevity and further slows development. Surprisingly, the nuclear localization of PQM-1 and DAF-16 is controlled by IIS in opposite ways and was also found to be mutually antagonistic. We observe progressive loss of nuclear PQM-1 with age, explaining declining expression of PQM-1 targets. Together, our data suggest an elegant mechanism for balancing stress response and development.


Subject(s)
Caenorhabditis elegans Proteins/metabolism; Caenorhabditis elegans/growth & development; Caenorhabditis elegans/metabolism; Gene Expression Regulation, Developmental; Longevity; Trans-Activators/metabolism; Animals; Forkhead Transcription Factors; Receptor, Insulin/metabolism; Regulatory Sequences, Nucleic Acid; Transcription Factors/metabolism; Transcriptional Activation
4.
Mol Cell ; 78(1): 152-167.e11, 2020 Apr 02.
Article in English | MEDLINE | ID: mdl-32053778

ABSTRACT

Eukaryotic transcription factors (TFs) form complexes with various partner proteins to recognize their genomic target sites. Yet, how the DNA sequence determines which TF complex forms at any given site is poorly understood. Here, we demonstrate that high-throughput in vitro DNA binding assays coupled with unbiased computational analysis provide unprecedented insight into how different DNA sequences select distinct compositions and configurations of homeodomain TF complexes. Using inferred knowledge about minor groove width readout, we design targeted protein mutations that destabilize homeodomain binding both in vitro and in vivo in a complex-specific manner. By performing parallel systematic evolution of ligands by exponential enrichment sequencing (SELEX-seq), chromatin immunoprecipitation sequencing (ChIP-seq), RNA sequencing (RNA-seq), and Hi-C assays, we not only classify the majority of in vivo binding events in terms of complex composition but also infer complex-specific functions by perturbing the gene regulatory network controlled by a single complex.


Subject(s)
DNA/chemistry; Drosophila Proteins/metabolism; Gene Expression Regulation; Homeodomain Proteins/metabolism; Transcription Factors/metabolism; Animals; Base Sequence; Binding Sites; DNA/metabolism; Drosophila Proteins/chemistry; Drosophila Proteins/genetics; Drosophila melanogaster/genetics; Drosophila melanogaster/metabolism; Homeodomain Proteins/chemistry; Homeodomain Proteins/genetics; Mutation; Nucleic Acid Conformation; Protein Binding; Transcription Factors/chemistry; Transcription Factors/genetics
5.
Cell ; 147(6): 1270-82, 2011 Dec 09.
Article in English | MEDLINE | ID: mdl-22153072

ABSTRACT

Members of transcription factor families typically have similar DNA binding specificities yet execute unique functions in vivo. Transcription factors often bind DNA as multiprotein complexes, raising the possibility that complex formation might modify their DNA binding specificities. To test this hypothesis, we developed an experimental and computational platform, SELEX-seq, that can be used to determine the relative affinities to any DNA sequence for any transcription factor complex. Applying this method to all eight Drosophila Hox proteins, we show that they obtain novel recognition properties when they bind DNA with the dimeric cofactor Extradenticle-Homothorax (Exd). Exd-Hox specificities group into three main classes that obey Hox gene collinearity rules, and DNA structure predictions suggest that anterior and posterior Hox proteins prefer DNA sequences with distinct minor groove topographies. Together, these data suggest that emergent DNA recognition properties revealed by interactions with cofactors contribute to transcription factor specificities in vivo.


Subject(s)
DNA/metabolism; Drosophila Proteins/metabolism; Drosophila/metabolism; Homeodomain Proteins/metabolism; Protein Multimerization; Transcription Factors/metabolism; Amino Acid Sequence; Animals; Drosophila Proteins/chemistry; Genetic Techniques; Homeodomain Proteins/chemistry; Molecular Sequence Data; Protein Structure, Tertiary; Transcription Factors/chemistry
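
The idea of reading relative affinities out of SELEX-seq data can be sketched roughly as follows. This is a simplified illustration and not the authors' pipeline: it assumes that each selection round enriches a k-mer in proportion to its affinity, so the enrichment after R rounds is raised to the power 1/R, and it uses a plain pseudocount for k-mers missing from the initial library. The function names, the pseudocount, and the toy reads are all hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer window across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def relative_affinities(round0_reads, selected_reads, k, n_rounds, pseudocount=1.0):
    """Estimate relative affinity per k-mer as (enrichment)**(1/n_rounds), scaled to max 1."""
    c0 = kmer_counts(round0_reads, k)
    cr = kmer_counts(selected_reads, k)
    total0 = sum(c0.values())
    total_r = sum(cr.values())
    enrichment = {}
    for kmer, count in cr.items():
        f0 = (c0[kmer] + pseudocount) / total0  # pseudocount avoids division by zero
        fr = count / total_r
        enrichment[kmer] = (fr / f0) ** (1.0 / n_rounds)
    top = max(enrichment.values())
    return {kmer: value / top for kmer, value in enrichment.items()}


# Toy reads, invented for illustration only.
round0 = ["ACGTAC", "TTGACA", "GGCATG", "CATTGA"]
selected = ["TGATTG", "ATGATT", "TGATGG", "CCTGAT"]
affinities = relative_affinities(round0, selected, k=4, n_rounds=1)
for kmer, value in sorted(affinities.items(), key=lambda kv: -kv[1])[:3]:
    print(kmer, round(value, 2))
```

Real data would need far deeper libraries and a more careful estimate of initial-library frequencies than a pseudocount provides.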
6.
Cell ; 143(2): 212-24, 2010 Oct 15.
Article in English | MEDLINE | ID: mdl-20888037

ABSTRACT

Chromatin is important for the regulation of transcription and other functions, yet the diversity of chromatin composition and the distribution along chromosomes are still poorly characterized. By integrative analysis of genome-wide binding maps of 53 broadly selected chromatin components in Drosophila cells, we show that the genome is segmented into five principal chromatin types that are defined by unique yet overlapping combinations of proteins and form domains that can extend over > 100 kb. We identify a repressive chromatin type that covers about half of the genome and lacks classic heterochromatin markers. Furthermore, transcriptionally active euchromatin consists of two types that differ in molecular organization and H3K36 methylation and regulate distinct classes of genes. Finally, we provide evidence that the different chromatin types help to target DNA-binding factors to specific genomic regions. These results provide a global view of chromatin diversity and domain organization in a metazoan cell.


Subject(s)
Chromatin/classification; DNA-Binding Proteins/analysis; Drosophila Proteins/analysis; Drosophila melanogaster/genetics; Animals; Cell Line; Chromatin/metabolism; DNA-Binding Proteins/metabolism; Drosophila Proteins/metabolism; Drosophila melanogaster/metabolism; Euchromatin/metabolism; Heterochromatin/metabolism; Histones/metabolism; Principal Component Analysis
7.
Nucleic Acids Res ; 51(18): 9690-9702, 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37650627

ABSTRACT

TP53 is a transcription factor that controls multiple cellular processes, including cell cycle arrest, DNA repair and apoptosis. The relation between TP53 binding site architecture and transcriptional output is still not fully understood. Here, we systematically examined in three different cell lines the effects of binding site affinity and copy number on TP53-dependent transcriptional output, and also probed the impact of spacer length and sequence between adjacent binding sites, and of core promoter identity. Paradoxically, we found that high-affinity TP53 binding sites are less potent than medium-affinity sites. TP53 achieves supra-additive transcriptional activation through optimally spaced adjacent binding sites, suggesting a cooperative mechanism. Optimally spaced adjacent binding sites have a ∼10-bp periodicity, suggesting a role for spatial orientation along the DNA double helix. We leveraged these insights to construct a log-linear model that explains activity from sequence features, and to identify new highly active and sensitive TP53 reporters.
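
The link between a roughly 10-bp periodicity and orientation along the double helix can be illustrated with a small geometric sketch. It assumes about 10.5 bp per helical turn (a common textbook approximation for B-form DNA) and takes the center-to-center distance between two sites as input. The function names and the 45-degree tolerance are hypothetical and are not part of the study's model.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-form DNA (textbook approximation)


def helical_phase_deg(center_to_center_bp: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Rotation (0-360 degrees) between two site centers around the helix axis."""
    return (center_to_center_bp / bp_per_turn * 360.0) % 360.0


def same_face(center_to_center_bp: float, tolerance_deg: float = 45.0) -> bool:
    """True if the two sites lie on roughly the same face of the helix."""
    phase = helical_phase_deg(center_to_center_bp)
    return min(phase, 360.0 - phase) <= tolerance_deg


# Distances that differ by about one helical turn return to the same face.
for spacing in (21, 26, 31, 37, 42):
    print(spacing, round(helical_phase_deg(spacing), 1), same_face(spacing))
```

Converting a spacer length into a center-to-center distance depends on the length of the binding site, which this sketch leaves to the caller.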

8.
Nucleic Acids Res ; 51(11): 5499-5511, 2023 Jun 23.
Article in English | MEDLINE | ID: mdl-37013986

ABSTRACT

Classic promoter mutagenesis strategies can be used to study how proximal promoter regions regulate the expression of particular genes of interest. This is a laborious process, in which the smallest sub-region of the promoter still capable of recapitulating expression in an ectopic setting is first identified, followed by targeted mutation of putative transcription factor binding sites. Massively parallel reporter assays such as survey of regulatory elements (SuRE) provide an alternative way to study millions of promoter fragments in parallel. Here we show how a generalized linear model (GLM) can be used to transform genome-scale SuRE data into a high-resolution genomic track that quantifies the contribution of local sequence to promoter activity. This coefficient track helps identify regulatory elements and can be used to predict promoter activity of any sub-region in the genome. It thus allows in silico dissection of any promoter in the human genome to be performed. We developed a web application, available at cissector.nki.nl, that lets researchers easily perform this analysis as a starting point for their research into any promoter of interest.


Subject(s)
Promoter Regions, Genetic; Software; Humans; Binding Sites; Genome, Human/genetics; Protein Binding; Regulatory Sequences, Nucleic Acid
9.
PLoS Genet ; 18(1): e1009719, 2022 Jan.
Article in English | MEDLINE | ID: mdl-35100260

ABSTRACT

Tens of thousands of genetic variants associated with gene expression (cis-eQTLs) have been discovered in the human population. These eQTLs are active in various tissues and contexts, but the molecular mechanisms of eQTL variability are poorly understood, hindering our understanding of genetic regulation across biological contexts. Since many eQTLs are believed to act by altering transcription factor (TF) binding affinity, we hypothesized that analyzing eQTL effect size as a function of TF level may allow discovery of mechanisms of eQTL variability. Using GTEx Consortium eQTL data from 49 tissues, we analyzed the interaction between eQTL effect size and TF level across tissues and across individuals within specific tissues and generated a list of 10,098 TF-eQTL interactions across 2,136 genes that are supported by at least two lines of evidence. These TF-eQTLs were enriched for various TF binding measures, supporting with orthogonal evidence that these eQTLs are regulated by the implicated TFs. We also found that our TF-eQTLs tend to overlap genes with gene-by-environment regulatory effects and to colocalize with GWAS loci, implying that our approach can help to elucidate mechanisms of context-specificity and trait associations. Finally, we highlight an interesting example of IKZF1 TF regulation of an APBB1IP gene eQTL that colocalizes with a GWAS signal for blood cell traits. Together, our findings provide candidate TF mechanisms for a large number of eQTLs and offer a generalizable approach for researchers to discover TF regulators of genetic variant effects in additional QTL datasets.


Subject(s)
Quantitative Trait Loci; Transcription Factors/physiology; Alleles; Binding Sites; Gene Knockdown Techniques; Gene-Environment Interaction; Genome-Wide Association Study; Humans; Interferon Regulatory Factor-1/genetics; Models, Genetic; Phenotype; Transcription Factors/metabolism
10.
Nucleic Acids Res ; 48(9): 5037-5053, 2020 May 21.
Article in English | MEDLINE | ID: mdl-32315032

ABSTRACT

CRISPR RNA-guided endonucleases (RGEs) cut or direct activities to specific genomic loci, yet each has off-target activities that are often unpredictable. We developed a pair of simple in vitro assays to systematically measure the DNA-binding specificity (Spec-seq), catalytic activity specificity (SEAM-seq) and cleavage efficiency of RGEs. By separately quantifying binding and cleavage specificity, Spec/SEAM-seq provides detailed mechanistic insight into off-target activity. Feature-based models generated from Spec/SEAM-seq data for SpCas9 were consistent with previous reports of its in vitro and in vivo specificity, validating the approach. Spec/SEAM-seq is also useful for profiling less-well characterized RGEs. Application to an engineered SpCas9, HiFi-SpCas9, indicated that its enhanced target discrimination can be attributed to cleavage rather than binding specificity. The ortholog ScCas9, on the other hand, derives specificity from binding to an extended PAM. The decreased off-target activity of AsCas12a (Cpf1) appears to be primarily driven by DNA-binding specificity. Finally, we performed the first characterization of CasX specificity, revealing an all-or-nothing mechanism where mismatches can be bound, but not cleaved. Together, these applications establish Spec/SEAM-seq as an accessible method to rapidly and reliably evaluate the specificity of RGEs, Cas::gRNA pairs, and gain insight into the mechanism and thermodynamics of target discrimination.


Subject(s)
CRISPR-Associated Proteins/metabolism; Endodeoxyribonucleases/metabolism; Acidaminococcus/enzymology; Base Pair Mismatch; Base Pairing; CRISPR-Associated Proteins/genetics; DNA/chemistry; DNA/metabolism; DNA Cleavage; Deltaproteobacteria/enzymology; Endodeoxyribonucleases/genetics; Mutation; Nanog Homeobox Protein/genetics; Protein Binding; RNA/chemistry; SELEX Aptamer Technique; Sequence Analysis, DNA; Substrate Specificity
11.
Genome Res ; 28(1): 111-121, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29196557

ABSTRACT

The DNA-binding interfaces of the androgen (AR) and glucocorticoid (GR) receptors are virtually identical, yet these transcription factors share only about a third of their genomic binding sites and regulate similarly distinct sets of target genes. To address this paradox, we determined the intrinsic specificities of the AR and GR DNA-binding domains using a refined version of SELEX-seq. We developed an algorithm, SelexGLM, that quantifies binding specificity over a large (31-bp) binding site by iteratively fitting a feature-based generalized linear model to SELEX probe counts. This analysis revealed that the DNA-binding preferences of AR and GR homodimers differ significantly, both within and outside the 15-bp core binding site. The relative preference between the two factors can be tuned over a wide range by changing the DNA sequence, with AR more sensitive to sequence changes than GR. The specificity of AR extends to the regions flanking the core 15-bp site, where isothermal calorimetry measurements reveal that affinity is augmented by enthalpy-driven readout of poly(A) sequences associated with narrowed minor groove width. We conclude that the increased specificity of AR is correlated with more enthalpy-driven binding than GR. The binding models help explain differences in AR and GR genomic binding and provide a biophysical rationale for how promiscuous binding by GR allows functional substitution for AR in some castration-resistant prostate cancers.


Subject(s)
Androgen Receptor Antagonists; Neoplasm Proteins; Prostatic Neoplasms, Castration-Resistant; Receptors, Androgen/metabolism; Receptors, Glucocorticoid; SELEX Aptamer Technique/methods; Androgen Receptor Antagonists/chemical synthesis; Androgen Receptor Antagonists/chemistry; Androgen Receptor Antagonists/pharmacology; Aptamers, Nucleotide/chemical synthesis; Aptamers, Nucleotide/chemistry; Aptamers, Nucleotide/pharmacology; Cell Line, Tumor; Humans; Male; Neoplasm Proteins/antagonists & inhibitors; Neoplasm Proteins/metabolism; Receptors, Glucocorticoid/antagonists & inhibitors; Receptors, Glucocorticoid/metabolism
12.
Proc Natl Acad Sci U S A ; 115(16): E3692-E3701, 2018 Apr 17.
Article in English | MEDLINE | ID: mdl-29610332

ABSTRACT

Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes.


Subject(s)
DNA Footprinting/methods; DNA-Binding Proteins/metabolism; DNA/metabolism; Animals; Binding Sites; Datasets as Topic; Drosophila Proteins/metabolism; Electrophoretic Mobility Shift Assay; Enhancer Elements, Genetic; Gene Library; Homeodomain Proteins/metabolism; Humans; Models, Molecular; Protein Binding; Protein Conformation; Recombinant Proteins/metabolism; Transcription Factors/metabolism; Tumor Suppressor Protein p53/metabolism
13.
Mol Cell ; 48(5): 799-810, 2012 Dec 14.
Article in English | MEDLINE | ID: mdl-23102701

ABSTRACT

The p53 tumor suppressor utilizes multiple mechanisms to selectively regulate its myriad target genes, which in turn mediate diverse cellular processes. Here, using conventional and single-molecule mRNA analyses, we demonstrate that the nucleoporin Nup98 is required for full expression of p21, a key effector of the p53 pathway, but not several other p53 target genes. Nup98 regulates p21 mRNA levels by a posttranscriptional mechanism in which a complex containing Nup98 and the p21 mRNA 3'UTR protects p21 mRNA from degradation by the exosome. An in silico approach revealed another p53 target (14-3-3σ) to be similarly regulated by Nup98. The expression of Nup98 is reduced in murine and human hepatocellular carcinomas (HCCs) and correlates with p21 expression in HCC patients. Our study elucidates a previously unrecognized function of wild-type Nup98 in regulating select p53 target genes that is distinct from the well-characterized oncogenic properties of Nup98 fusion proteins.


Subject(s)
Carcinoma, Hepatocellular/metabolism; Liver Neoplasms/metabolism; Nuclear Pore Complex Proteins/metabolism; RNA Processing, Post-Transcriptional; RNA, Messenger/metabolism; Tumor Suppressor Protein p53/metabolism; 14-3-3 Proteins/genetics; 14-3-3 Proteins/metabolism; 3' Untranslated Regions; ATP Binding Cassette Transporter, Subfamily B/genetics; ATP Binding Cassette Transporter, Subfamily B/metabolism; Animals; Antineoplastic Agents, Phytogenic/pharmacology; Apoptosis/drug effects; Binding Sites; Camptothecin/pharmacology; Carcinoma, Hepatocellular/genetics; Carcinoma, Hepatocellular/pathology; Cellular Senescence; Cyclin-Dependent Kinase Inhibitor p21/genetics; Cyclin-Dependent Kinase Inhibitor p21/metabolism; Exosomes/metabolism; Gene Expression Regulation, Neoplastic; Hep G2 Cells; Humans; Liver Neoplasms/genetics; Liver Neoplasms/pathology; Male; Mice; Mice, Knockout; Nuclear Pore Complex Proteins/genetics; RNA Interference; RNA Stability; Time Factors; Transfection; Tumor Suppressor Protein p53/genetics; ATP Binding Cassette Transporter, Subfamily B, Member 4
14.
Mol Syst Biol ; 14(2): e7902, 2018 Feb 22.
Article in English | MEDLINE | ID: mdl-29472273

ABSTRACT

Transcription factors (TFs) interpret DNA sequence by probing the chemical and structural properties of the nucleotide polymer. DNA shape is thought to enable a parsimonious representation of dependencies between nucleotide positions. Here, we propose a unified mathematical representation of the DNA sequence dependence of shape and TF binding, respectively, which simplifies and enhances analysis of shape readout. First, we demonstrate that linear models based on mononucleotide features alone account for 60-70% of the variance in minor groove width, roll, helix twist, and propeller twist. This explains why simple scoring matrices that ignore all dependencies between nucleotide positions can partially account for DNA shape readout by a TF. Adding dinucleotide features as sequence-to-shape predictors to our model, we can almost perfectly explain the shape parameters. Building on this observation, we developed a post hoc analysis method that can be used to analyze any mechanism-agnostic protein-DNA binding model in terms of shape readout. Our insights provide an alternative strategy for using DNA shape information to enhance our understanding of how cis-regulatory codes are interpreted by the cellular machinery.


Subject(s)
Computational Biology/methods; DNA/chemistry; Transcription Factors/metabolism; Binding Sites; DNA/metabolism; Models, Molecular; Models, Theoretical; Nucleic Acid Conformation
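
The statement that a mononucleotide-only linear model explains part of the variance in a shape parameter can be illustrated with a toy calculation. The sketch below fits an additive model to a synthetic "shape" value defined over all pentamers and reports the fraction of variance explained. The `toy_shape` function and its coefficients are invented for illustration, so the resulting R^2 does not correspond to the 60-70% reported in the abstract.

```python
from itertools import product
from statistics import mean

BASES = "ACGT"
K = 5


def toy_shape(seq):
    """Synthetic 'shape' value: additive base terms plus a neighbor-dependent term."""
    base_term = {"A": -0.4, "T": -0.4, "C": 0.3, "G": 0.5}  # hypothetical values
    additive = sum(base_term[b] for b in seq)
    # Dinucleotide dependence: consecutive A/T steps shift the value further.
    at_steps = sum(1 for a, b in zip(seq, seq[1:]) if a in "AT" and b in "AT")
    return additive - 0.3 * at_steps


kmers = ["".join(p) for p in product(BASES, repeat=K)]
y = {s: toy_shape(s) for s in kmers}
grand_mean = mean(y.values())

# With all 4**K k-mers present the design is balanced, so the least-squares
# additive (mononucleotide) fit reduces to per-position averages.
effect = {
    (i, b): mean(y[s] for s in kmers if s[i] == b) - grand_mean
    for i in range(K)
    for b in BASES
}


def predict(seq):
    """Additive prediction: grand mean plus one effect per position."""
    return grand_mean + sum(effect[(i, b)] for i, b in enumerate(seq))


ss_res = sum((y[s] - predict(s)) ** 2 for s in kmers)
ss_tot = sum((y[s] - grand_mean) ** 2 for s in kmers)
r_squared = 1.0 - ss_res / ss_tot
print(f"Mononucleotide-only R^2 on the toy data: {r_squared:.3f}")
```

The residual variance here comes entirely from the neighbor-dependent term, which is the part that dinucleotide features would be needed to capture.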
15.
Proc Natl Acad Sci U S A ; 113(13): E1835-43, 2016 Mar 29.
Article in English | MEDLINE | ID: mdl-26966232

ABSTRACT

Regulation of gene expression by transcription factors (TFs) is highly dependent on genetic background and interactions with cofactors. Identifying specific context factors is a major challenge that requires new approaches. Here we show that exploiting natural variation is a potent strategy for probing functional interactions within gene regulatory networks. We developed an algorithm to identify genetic polymorphisms that modulate the regulatory connectivity between specific transcription factors and their target genes in vivo. As a proof of principle, we mapped connectivity quantitative trait loci (cQTLs) using parallel genotype and gene expression data for segregants from a cross between two strains of the yeast Saccharomyces cerevisiae. We identified a nonsynonymous mutation in the DIG2 gene as a cQTL for the transcription factor Ste12p and confirmed this prediction empirically. We also identified three polymorphisms in TAF13 as putative modulators of regulation by Gcn4p. Our method has potential for revealing how genetic differences among individuals influence gene regulatory networks in any organism for which gene expression and genotype data are available along with information on binding preferences for transcription factors.


Subject(s)
Gene Regulatory Networks; Quantitative Trait Loci; Saccharomyces cerevisiae Proteins/genetics; Transcription Factors/genetics; Algorithms; Basic-Leucine Zipper Transcription Factors/genetics; Gene Expression Regulation, Fungal; Gene Ontology; Genes, Mating Type, Fungal/genetics; Models, Genetic; Mutation; Promoter Regions, Genetic; Reproducibility of Results; Saccharomyces cerevisiae/genetics
16.
Proc Natl Acad Sci U S A ; 112(15): 4654-9, 2015 Apr 14.
Article in English | MEDLINE | ID: mdl-25775564

ABSTRACT

DNA binding specificities of transcription factors (TFs) are a key component of gene regulatory processes. Underlying mechanisms that explain the highly specific binding of TFs to their genomic target sites are poorly understood. A better understanding of TF-DNA binding requires the ability to quantitatively model TF binding to accessible DNA as its basic step, before additional in vivo components can be considered. Traditionally, these models were built based on nucleotide sequence. Here, we integrated 3D DNA shape information derived with a high-throughput approach into the modeling of TF binding specificities. Using support vector regression, we trained quantitative models of TF binding specificity based on protein binding microarray (PBM) data for 68 mammalian TFs. The evaluation of our models included cross-validation on specific PBM array designs, testing across different PBM array designs, and using PBM-trained models to predict relative binding affinities derived from in vitro selection combined with deep sequencing (SELEX-seq). Our results showed that shape-augmented models compared favorably to sequence-based models. Although both k-mer and DNA shape features can encode interdependencies between nucleotide positions of the binding site, using DNA shape features reduced the dimensionality of the feature space. In addition, analyzing the feature weights of DNA shape-augmented models uncovered TF family-specific structural readout mechanisms that were not revealed by the DNA sequence. As such, this work combines knowledge from structural biology and genomics, and suggests a new path toward understanding TF binding and genome function.


Subject(s)
DNA/chemistry; DNA/metabolism; Nucleic Acid Conformation; Transcription Factors/metabolism; Algorithms; Animals; Base Sequence; Binding Sites/genetics; Computational Biology/methods; DNA/genetics; High-Throughput Nucleotide Sequencing; Humans; Kinetics; Mice; Models, Genetic; Protein Array Analysis; Protein Binding; Transcription Factors/genetics
17.
Nucleic Acids Res ; 43(21): e142, 2015 Dec 02.
Article in English | MEDLINE | ID: mdl-26184874

ABSTRACT

Insight into the three-dimensional architecture of RNA is essential for understanding its cellular functions. However, even the classic transfer RNA structure contains features that are overlooked by existing bioinformatics tools. Here we present DSSR (Dissecting the Spatial Structure of RNA), an integrated and automated tool for analyzing and annotating RNA tertiary structures. The software identifies canonical and noncanonical base pairs, including those with modified nucleotides, in any tautomeric or protonation state. DSSR detects higher-order coplanar base associations, termed multiplets. It finds arrays of stacked pairs, classifies them by base-pair identity and backbone connectivity, and distinguishes a stem of covalently connected canonical pairs from a helix of stacked pairs of arbitrary type/linkage. DSSR identifies coaxial stacking of multiple stems within a single helix and lists isolated canonical pairs that lie outside of a stem. The program characterizes 'closed' loops of various types (hairpin, bulge, internal, and junction loops) and pseudoknots of arbitrary complexity. Notably, DSSR employs isolated pairs and the ends of stems, whether pseudoknotted or not, to define junction loops. This new, inclusive definition provides a novel perspective on the spatial organization of RNA. Tests on all nucleic acid structures in the Protein Data Bank confirm the efficiency and robustness of the software, and applications to representative RNA molecules illustrate its unique features. DSSR and related materials are freely available at http://x3dna.org/.


Subject(s)
RNA/chemistry; Software; CRISPR-Associated Proteins/chemistry; DNA/chemistry; Databases, Protein; Molecular Sequence Annotation; Nucleic Acid Conformation; RNA, Catalytic/chemistry; RNA, Fungal/chemistry; RNA, Transfer, Phe/chemistry; RNA, Viral/chemistry; Riboswitch
18.
Proc Natl Acad Sci U S A ; 111(15): 5747-52, 2014 Apr 15.
Article in English | MEDLINE | ID: mdl-24706889

ABSTRACT

Retroviral insertional mutagenesis is a powerful tool for identifying putative cancer genes in mice. To uncover the regulatory mechanisms by which common insertion loci affect downstream processes, we supplemented genotyping data with genome-wide mRNA expression profiling data for 97 tumors induced by retroviral insertional mutagenesis. We developed locus expression signature analysis, an algorithm to construct and interpret the differential gene expression signature associated with each common insertion locus. Comparing locus expression signatures to promoter affinity profiles allowed us to build a detailed map of transcription factors whose protein-level regulatory activity is modulated by a particular locus. We also predicted a large set of drugs that might mitigate the effect of the insertion on tumorigenesis. Taken together, our results demonstrate the potential of a locus-specific signature approach for identifying mammalian regulatory mechanisms in a cancer context.


Subject(s)
Carcinogenesis/metabolism; Computational Biology/methods; DNA Damage; Gene Expression Regulation, Neoplastic/genetics; Gene Regulatory Networks/genetics; Genetic Variation; Neoplasms/genetics; Analysis of Variance; Animals; Carcinogenesis/genetics; Cluster Analysis; Enzyme Inhibitors/pharmacology; Gene Expression Profiling; Gene Expression Regulation, Neoplastic/drug effects; Gene Ontology; High-Throughput Screening Assays/methods; Mice; Phosphoinositide-3 Kinase Inhibitors
19.
Proc Natl Acad Sci U S A ; 110(16): 6376-81, 2013 Apr 16.
Article in English | MEDLINE | ID: mdl-23576721

ABSTRACT

DNA binding proteins find their cognate sequences within genomic DNA through recognition of specific chemical and structural features. Here we demonstrate that high-resolution DNase I cleavage profiles can provide detailed information about the shape and chemical modification status of genomic DNA. Analyzing millions of DNA backbone hydrolysis events on naked genomic DNA, we show that the intrinsic rate of cleavage by DNase I closely tracks the width of the minor groove. Integration of these DNase I cleavage data with bisulfite sequencing data for the same cell type's genome reveals that cleavage directly adjacent to cytosine-phosphate-guanine (CpG) dinucleotides is enhanced at least eightfold by cytosine methylation. This phenomenon we show to be attributable to methylation-induced narrowing of the minor groove. Furthermore, we demonstrate that it enables simultaneous mapping of DNase I hypersensitivity and regional DNA methylation levels using dense in vivo cleavage data. Taken together, our results suggest a general mechanism by which CpG methylation can modulate protein-DNA interaction strength via the remodeling of DNA shape.


Subject(s)
DNA Methylation/genetics; DNA/chemistry; Deoxyribonuclease I; Genomics/methods; Models, Molecular; Nucleic Acid Conformation; Cells, Cultured; CpG Islands/genetics; DNA/metabolism; Deoxyribonuclease I/metabolism; Humans; Models, Genetic; Sequence Analysis, DNA
20.
Proc Natl Acad Sci U S A ; 109(16): 6030-5, 2012 Apr 17.
Article in English | MEDLINE | ID: mdl-22460799

ABSTRACT

TLS/FUS (TLS) is a multifunctional protein implicated in a wide range of cellular processes, including transcription and mRNA processing, as well as in both cancer and neurological disease. However, little is currently known about TLS target genes and how they are recognized. Here, we used ChIP and promoter microarrays to identify genes potentially regulated by TLS. Among these genes, we detected a number that correlate with previously known functions of TLS, and confirmed TLS occupancy at several of them by ChIP. We also detected changes in mRNA levels of these target genes in cells where TLS levels were altered, indicative of both activation and repression. Next, we used data from the microarray and computational methods to determine whether specific sequences were enriched in DNA fragments bound by TLS. This analysis suggested the existence of TLS response elements, and we show that purified TLS indeed binds these sequences with specificity in vitro. Remarkably, however, TLS binds only single-strand versions of the sequences. Taken together, our results indicate that TLS regulates expression of specific target genes, likely via recognition of specific single-stranded DNA sequences located within their promoter regions.


Subject(s)
DNA, Single-Stranded/genetics; Gene Expression Regulation, Neoplastic; RNA-Binding Protein FUS/metabolism; Response Elements/genetics; Base Sequence; Binding, Competitive; Blotting, Western; Cell Line, Tumor; Chromatin Immunoprecipitation; Gene Expression Profiling; Humans; Oligonucleotide Array Sequence Analysis; Promoter Regions, Genetic/genetics; Protein Binding; Protein Interaction Domains and Motifs/genetics; RNA-Binding Protein FUS/genetics