Results 1 - 20 of 93
1.
Cell ; 161(2): 307-18, 2015 Apr 09.
Article in English | MEDLINE | ID: mdl-25843630

ABSTRACT

Protein-DNA binding is mediated by the recognition of the chemical signatures of the DNA bases and the 3D shape of the DNA molecule. Because DNA shape is a consequence of sequence, it is difficult to dissociate these modes of recognition. Here, we tease them apart in the context of Hox-DNA binding by mutating residues that, in a co-crystal structure, only recognize DNA shape. Complexes made with these mutants lose the preference to bind sequences with specific DNA shape features. Introducing shape-recognizing residues from one Hox protein to another swapped binding specificities in vitro and gene regulation in vivo. Statistical machine learning revealed that the accuracy of binding specificity predictions improves by adding shape features to a model that only depends on sequence, and feature selection identified shape features important for recognition. Thus, shape readout is a direct and independent component of binding site selection by Hox proteins.


Subject(s)
DNA/chemistry , DNA/metabolism , Drosophila Proteins/chemistry , Drosophila Proteins/metabolism , Drosophila melanogaster/metabolism , Transcription Factors/chemistry , Transcription Factors/metabolism , Amino Acid Sequence , Animals , Crystallography, X-Ray , Homeodomain Proteins/chemistry , Homeodomain Proteins/metabolism , Molecular Sequence Data , Nucleic Acid Conformation , Protein Binding , Sequence Alignment
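
The abstract of entry 1 reports that adding shape features to a sequence-only model improves binding specificity predictions. A minimal sketch of what such a combined feature vector could look like follows. The one-hot encoding, the function names, and the numeric shape values are illustrative assumptions, not the authors' published pipeline.

```python
from typing import List, Sequence

BASES = "ACGT"

def one_hot(seq: str) -> List[float]:
    """Encode each base as four 0/1 values (sequence-only features)."""
    features: List[float] = []
    for base in seq.upper():
        # Bases outside ACGT (e.g. N) become an all-zero block.
        features.extend(1.0 if base == b else 0.0 for b in BASES)
    return features

def sequence_plus_shape(seq: str, shape: Sequence[float]) -> List[float]:
    """Append per-position shape values to the one-hot sequence features."""
    return one_hot(seq) + list(shape)

# Hypothetical shape values (e.g. minor groove width in angstrom), invented for illustration.
toy_shape = [4.2, 3.9, 4.0, 4.4]
features = sequence_plus_shape("ACGT", toy_shape)
```

A regression or classification model trained on `one_hot(seq)` alone versus `sequence_plus_shape(seq, shape)` would then allow the kind of comparison the abstract describes.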
2.
Nat Methods ; 21(9): 1674-1683, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39103447

ABSTRACT

Predicting protein-DNA binding specificity is a challenging yet essential task for understanding gene regulation. Protein-DNA complexes usually exhibit binding to a selected DNA target site, whereas a protein binds, with varying degrees of binding specificity, to a wide range of DNA sequences. This information is not directly accessible in a single structure. Here, to access this information, we present Deep Predictor of Binding Specificity (DeepPBS), a geometric deep-learning model designed to predict binding specificity from protein-DNA structure. DeepPBS can be applied to experimental or predicted structures. Interpretable protein heavy atom importance scores for interface residues can be extracted. When aggregated at the protein residue level, these scores are validated through mutagenesis experiments. Applied to designed proteins targeting specific DNA sequences, DeepPBS was demonstrated to predict experimentally measured binding specificity. DeepPBS offers a foundation for machine-aided studies that advance our understanding of molecular interactions and guide experimental designs and synthetic biology.


Subject(s)
DNA-Binding Proteins , DNA , Deep Learning , Protein Binding , DNA/metabolism , DNA/chemistry , DNA-Binding Proteins/metabolism , DNA-Binding Proteins/chemistry , Binding Sites , Computational Biology/methods , Models, Molecular
3.
Cell ; 147(6): 1270-82, 2011 Dec 09.
Article in English | MEDLINE | ID: mdl-22153072

ABSTRACT

Members of transcription factor families typically have similar DNA binding specificities yet execute unique functions in vivo. Transcription factors often bind DNA as multiprotein complexes, raising the possibility that complex formation might modify their DNA binding specificities. To test this hypothesis, we developed an experimental and computational platform, SELEX-seq, that can be used to determine the relative affinities to any DNA sequence for any transcription factor complex. Applying this method to all eight Drosophila Hox proteins, we show that they obtain novel recognition properties when they bind DNA with the dimeric cofactor Extradenticle-Homothorax (Exd). Exd-Hox specificities group into three main classes that obey Hox gene collinearity rules and DNA structure predictions suggest that anterior and posterior Hox proteins prefer DNA sequences with distinct minor groove topographies. Together, these data suggest that emergent DNA recognition properties revealed by interactions with cofactors contribute to transcription factor specificities in vivo.


Subject(s)
DNA/metabolism , Drosophila Proteins/metabolism , Drosophila/metabolism , Homeodomain Proteins/metabolism , Protein Multimerization , Transcription Factors/metabolism , Amino Acid Sequence , Animals , Drosophila Proteins/chemistry , Genetic Techniques , Homeodomain Proteins/chemistry , Molecular Sequence Data , Protein Structure, Tertiary , Transcription Factors/chemistry
4.
Nucleic Acids Res ; 52(W1): W7-W12, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38801070

ABSTRACT

Sequence-dependent DNA shape plays an important role in understanding protein-DNA binding mechanisms. High-throughput prediction of DNA shape features has become a valuable tool in the field of protein-DNA recognition, transcription factor-DNA binding specificity, and gene regulation. However, our widely used webserver, DNAshape, relies on statistically summarized pentamer query tables to query DNA shape features. These query tables do not consider flanking regions longer than two base pairs, and acquiring a query table for hexamers or higher-order k-mers is currently still unrealistic due to limitations in achieving sufficient statistical coverage in molecular simulations or structural biology experiments. A recent deep-learning method, Deep DNAshape, can predict DNA shape features at the core of a DNA fragment considering flanking regions of up to seven base pairs, trained on limited simulation data. However, Deep DNAshape is rather complicated to install, and it must run locally compared to the pentamer-based DNAshape webserver, creating a barrier for users. Here, we present the Deep DNAshape webserver, which has the benefits of both methods while being accurate, fast, and accessible to all users. Additional improvements of the webserver include the detection of user input in real time, the ability of interactive visualization tools and different modes of analyses. URL: https://deepdnashape.usc.edu.


Subject(s)
DNA , Internet , Nucleic Acid Conformation , Software , DNA/chemistry , Deep Learning
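
The pentamer query-table lookup mentioned in the abstract of entry 4 can be sketched as a sliding 5-bp window whose table value is assigned to the central base pair. The table below is a hypothetical toy with invented values, and `predict_shape` is an illustrative name, not part of the DNAshape software.

```python
from typing import Dict, List, Optional

# Hypothetical toy table: pentamer -> shape value at the central base pair.
# The numbers are invented placeholders, not DNAshape data.
TOY_PENTAMER_TABLE: Dict[str, float] = {
    "AAAAT": 3.6,
    "AAATT": 2.9,
    "AATTC": 3.1,
    "ATTCG": 4.5,
}

def predict_shape(seq: str, table: Dict[str, float]) -> List[Optional[float]]:
    """Slide a 5-bp window and assign the table value to each window's center."""
    seq = seq.upper()
    profile: List[Optional[float]] = [None] * len(seq)
    for i in range(len(seq) - 4):
        pentamer = seq[i:i + 5]
        # Only the two flanking base pairs on each side influence the value.
        profile[i + 2] = table.get(pentamer)
    return profile

profile = predict_shape("AAAATTCG", TOY_PENTAMER_TABLE)
```

The first and last two positions stay undefined because no full pentamer is centered on them, and any base pair more than two positions from the center cannot affect the result, which is the limitation the abstract attributes to pentamer tables.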
5.
Nucleic Acids Res ; 52(W1): W354-W361, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38630617

ABSTRACT

Analyzing and visualizing the tertiary structure and complex interactions of RNA is essential for being able to mechanistically decipher their molecular functions in vivo. Secondary structure visualization software can portray many aspects of RNA; however, these layouts are often unable to preserve topological correspondence since they do not consider tertiary interactions between different regions of an RNA molecule. Likewise, quaternary interactions between two or more interacting RNA molecules are not considered in secondary structure visualization tools. The RNAscape webserver produces visualizations that can preserve topological correspondence while remaining both visually intuitive and structurally insightful. RNAscape achieves this by designing a mathematical structural mapping algorithm which prioritizes the helical segments, reflecting their tertiary organization. Non-helical segments are mapped in a way that minimizes structural clutter. RNAscape runs a plotting script that is designed to generate publication-quality images. RNAscape natively supports non-standard nucleotides, multiple base-pairing annotation styles and requires no programming experience. RNAscape can also be used to analyze RNA/DNA hybrid structures and DNA topologies, including G-quadruplexes. Users can upload their own three-dimensional structures or enter a Protein Data Bank (PDB) ID of an existing structure. The RNAscape webserver allows users to customize visualizations through various settings as desired. URL: https://rnascape.usc.edu/.


Subject(s)
Algorithms , Nucleic Acid Conformation , RNA , Software , RNA/chemistry , Computer Graphics , Models, Molecular , Internet
6.
Nucleic Acids Res ; 52(17): 10161-10179, 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-38966997

ABSTRACT

Development of the malaria parasite, Plasmodium falciparum, is regulated by a limited number of sequence-specific transcription factors (TFs). However, the mechanisms by which these TFs recognize genome-wide binding sites are largely unknown. To address TF specificity, we investigated the binding of two TF subsets that either bind CACACA or GTGCAC DNA sequence motifs and further characterized two additional ApiAP2 TFs, PfAP2-G and PfAP2-EXP, which bind unique DNA motifs (GTAC and TGCATGCA). We also interrogated the impact of DNA sequence and chromatin context on P. falciparum TF binding by integrating high-throughput in vitro and in vivo binding assays, DNA shape predictions, epigenetic post-translational modifications, and chromatin accessibility. We found that DNA sequence context minimally impacts binding site selection for paralogous CACACA-binding TFs, while chromatin accessibility, epigenetic patterns, co-factor recruitment, and dimerization correlate with differential binding. In contrast, GTGCAC-binding TFs prefer different DNA sequence context in addition to chromatin dynamics. Finally, we determined that TFs that preferentially bind divergent DNA motifs may bind overlapping genomic regions due to low-affinity binding to other sequence motifs. Our results demonstrate that TF binding site selection relies on a combination of DNA sequence and chromatin features, thereby contributing to the complexity of P. falciparum gene regulatory mechanisms.


Subject(s)
Chromatin , Nucleotide Motifs , Plasmodium falciparum , Protein Binding , Protozoan Proteins , Transcription Factors , Plasmodium falciparum/genetics , Plasmodium falciparum/metabolism , Chromatin/metabolism , Chromatin/genetics , Transcription Factors/metabolism , Transcription Factors/genetics , Binding Sites , Humans , Protozoan Proteins/metabolism , Protozoan Proteins/genetics , Protozoan Proteins/chemistry , Malaria, Falciparum/parasitology , Base Sequence , DNA/metabolism , DNA/chemistry , Epigenesis, Genetic , DNA, Protozoan/metabolism , DNA, Protozoan/genetics
7.
Proc Natl Acad Sci U S A ; 120(4): e2205796120, 2023 01 24.
Article in English | MEDLINE | ID: mdl-36656856

ABSTRACT

DNA-binding proteins play important roles in various cellular processes, but the mechanisms by which proteins recognize genomic target sites remain incompletely understood. Functional groups at the edges of the base pairs (bp) exposed in the DNA grooves represent physicochemical signatures. As these signatures enable proteins to form specific contacts between protein residues and bp, their study can provide mechanistic insights into protein-DNA binding. Existing experimental methods, such as X-ray crystallography, can reveal such mechanisms based on physicochemical interactions between proteins and their DNA target sites. However, the low throughput of structural biology methods limits mechanistic insights for selection of many genomic sites. High-throughput binding assays enable prediction of potential target sites by determining relative binding affinities of a protein to massive numbers of DNA sequences. Many currently available computational methods are based on the sequence of standard Watson-Crick bp. They assume that the contribution to overall binding affinity is independent for each base pair, or alternatively include dinucleotides or short k-mers. These methods cannot directly expand to physicochemical contacts, and they are not suitable to apply to DNA modifications or non-Watson-Crick bp. These variations include DNA methylation, and synthetic or mismatched bp. The proposed method, DeepRec, can predict relative binding affinities as a function of physicochemical signatures and the effect of DNA methylation or other chemical modifications on binding. Sequence-based modeling methods are in comparison a coarse-grain description and cannot achieve such insights. Our chemistry-based modeling framework provides a path towards understanding genome function at a mechanistic level.


Subject(s)
DNA-Binding Proteins , DNA , Base Pairing , DNA/metabolism , Protein Binding , DNA-Binding Proteins/metabolism , Binding Sites
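
The abstract of entry 7 contrasts DeepRec with sequence-based methods that treat each base pair as contributing independently to binding affinity. A minimal sketch of that independence assumption is an additive per-position score. The matrix values and the name `additive_score` are hypothetical and chosen only for illustration; this is not the DeepRec method.

```python
from typing import Dict, List

# Hypothetical per-position penalties (arbitrary units) for a 3-bp site.
TOY_MATRIX: List[Dict[str, float]] = [
    {"A": 0.0, "C": 1.2, "G": 0.8, "T": 1.5},
    {"A": 1.0, "C": 0.0, "G": 1.4, "T": 0.9},
    {"A": 0.7, "C": 1.1, "G": 0.0, "T": 1.3},
]

def additive_score(site: str, matrix: List[Dict[str, float]]) -> float:
    """Sum independent per-position contributions for one candidate site."""
    if len(site) != len(matrix):
        raise ValueError("site length must match the number of matrix columns")
    return sum(column[base] for column, base in zip(matrix, site.upper()))

best = additive_score("ACG", TOY_MATRIX)    # all preferred bases
double = additive_score("TTG", TOY_MATRIX)  # two substitutions
```

Under this model the effect of a double substitution is always the sum of the two single-substitution effects. Because the alphabet is restricted to A, C, G, and T, the model has no way to represent methylated or mismatched base pairs, which is the gap the abstract describes.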
8.
Annu Rev Biochem ; 79: 233-69, 2010.
Article in English | MEDLINE | ID: mdl-20334529

ABSTRACT

Specific interactions between proteins and DNA are fundamental to many biological processes. In this review, we provide a revised view of protein-DNA interactions that emphasizes the importance of the three-dimensional structures of both macromolecules. We divide protein-DNA interactions into two categories: those when the protein recognizes the unique chemical signatures of the DNA bases (base readout) and those when the protein recognizes a sequence-dependent DNA shape (shape readout). We further divide base readout into those interactions that occur in the major groove from those that occur in the minor groove. Analogously, the readout of the DNA shape is subdivided into global shape recognition (for example, when the DNA helix exhibits an overall bend) and local shape recognition (for example, when a base pair step is kinked or a region of the minor groove is narrow). Based on the >1500 structures of protein-DNA complexes now available in the Protein Data Bank, we argue that individual DNA-binding proteins combine multiple readout mechanisms to achieve DNA-binding specificity. Specificity that distinguishes between families frequently involves base readout in the major groove, whereas shape readout is often exploited for higher resolution specificity, to distinguish between members within the same DNA-binding protein family.


Subject(s)
DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , DNA/chemistry , DNA/metabolism , Base Sequence , Crystallography, X-Ray , Cyclic AMP Receptor Protein/chemistry , Cyclic AMP Receptor Protein/metabolism , Nucleic Acid Conformation , Repressor Proteins/chemistry , Repressor Proteins/metabolism , Viral Regulatory and Accessory Proteins/chemistry , Viral Regulatory and Accessory Proteins/metabolism
10.
Nucleic Acids Res ; 51(11): 5621-5633, 2023 06 23.
Article in English | MEDLINE | ID: mdl-37177995

ABSTRACT

Quantifying the nucleotide preferences of DNA binding proteins is essential to understanding how transcription factors (TFs) interact with their targets in the genome. High-throughput in vitro binding assays have been used to identify the inherent DNA binding preferences of TFs in a controlled environment isolated from confounding factors such as genome accessibility, DNA methylation, and TF binding cooperativity. Unfortunately, many of the most common approaches for measuring binding preferences are not sensitive enough for the study of moderate-to-low affinity binding sites, and are unable to detect small-scale differences between closely related homologs. The Forkhead box (FOX) family of TFs is known to play a crucial role in regulating a variety of key processes from proliferation and development to tumor suppression and aging. By using the high-sequencing depth SELEX-seq approach to study all four FOX homologs in Saccharomyces cerevisiae, we have been able to precisely quantify the contribution and importance of nucleotide positions all along an extended binding site. Essential to this process was the alignment of our SELEX-seq reads to a set of candidate core sequences determined using a recently developed tool for the alignment of enriched k-mers and a newly developed approach for the reprioritization of candidate cores.


Subject(s)
Forkhead Transcription Factors , Saccharomyces cerevisiae Proteins , Binding Sites , DNA/genetics , DNA/metabolism , DNA-Binding Proteins/metabolism , Nucleotides/metabolism , Protein Binding , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Forkhead Transcription Factors/metabolism , Saccharomyces cerevisiae Proteins/metabolism
11.
Biophys J ; 123(2): 248-259, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38130056

ABSTRACT

DNA recognition and targeting by transcription factors (TFs) through specific binding are fundamental in biological processes. Furthermore, the histidine protonation state at the TF-DNA binding interface can significantly influence the binding mechanism of TF-DNA complexes. Nevertheless, the role of histidine in TF-DNA complexes remains underexplored. Here, we employed all-atom molecular dynamics simulations using AlphaFold2-modeled complexes based on previously solved co-crystal structures to probe the role of the His-12 residue in the Extradenticle (Exd)-Sex combs reduced (Scr)-DNA complex when binding to Scr and Ultrabithorax (Ubx) target sites. Our results demonstrate that the protonation state of histidine notably affected the DNA minor-groove width profile and binding free energy. Examining flanking sequences of various binding affinities derived from SELEX-seq experiments, we analyzed the relationship between binding affinity and specificity. We uncovered how histidine protonation leads to increased binding affinity but can lower specificity. Our findings provide new mechanistic insights into the role of histidine in modulating TF-DNA binding.


Subject(s)
Drosophila Proteins , Homeodomain Proteins , Animals , Homeodomain Proteins/genetics , Histidine , Drosophila Proteins/metabolism , Drosophila melanogaster/metabolism , DNA/chemistry , Binding Sites , Transcription Factors/metabolism
12.
J Chem Inf Model ; 64(16): 6450-6463, 2024 Aug 26.
Article in English | MEDLINE | ID: mdl-39058534

ABSTRACT

Recently, the remarkable growth of available crystal structure data and libraries of commercially available or readily synthesizable molecules have unlocked previously inaccessible regions of chemical space for drug development. Paired with improvements in virtual ligand screening methods, these expanded libraries are having a notable impact on early drug design efforts. Yet screening-based methods still face scalability limits, due to computational constraints and the sheer scale of drug-like space. Machine learning approaches are overcoming these limitations by learning the fundamental intra- and intermolecular relationships in drug-target systems from existing data. Here, we introduce DrugHIVE, a deep hierarchical variational autoencoder that outperforms state-of-the-art autoregressive and diffusion-based methods in both speed and performance on common generative benchmarks. DrugHIVE's hierarchical design enables improved control over molecular generation. Its capabilities include dramatically increasing virtual screening efficiency and accelerating a wide range of common drug design tasks, including de novo generation, molecular optimization, scaffold hopping, linker design, and high-throughput pattern replacement. Our highly scalable method can even be applied to receptors with high-confidence AlphaFold-predicted structures, extending the ability to generate high-quality drug-like molecules to a majority of the unsolved human proteome.


Subject(s)
Drug Design , Ligands , Models, Molecular , Deep Learning , Humans
13.
Biochemistry ; 62(17): 2541-2548, 2023 09 05.
Article in English | MEDLINE | ID: mdl-37552860

ABSTRACT

CRISPR-Cas9 has been adapted as a readily programmable genome manipulation agent, and continuing technological advances rely on an in-depth mechanistic understanding of Cas9 target discrimination. Cas9 interrogates a target by unwinding the DNA duplex to form an R-loop, where the RNA guide hybridizes with one of the DNA strands. It has been shown that RNA guides shorter than the normal length of 20-nucleotide (-nt) support Cas9 cleavage activity by enabling partial unwinding beyond the RNA/DNA hybrid. To investigate whether DNA segment beyond the RNA/DNA hybrid can impact Cas9 target discrimination with truncated guides, Cas9 double-stranded DNA cleavage rates (kcat) were measured with 16-nt guides on targets with varying sequences at +17 to +20 positions distal to the protospacer-adjacent-motif (PAM). The data reveal a log-linear inverse correlation between kcat and the PAM+(17-20) DNA duplex dissociation free energy ($\Delta G^{0}_{\mathrm{NN}(17\text{-}20)}$), with sequences having smaller $\Delta G^{0}_{\mathrm{NN}(17\text{-}20)}$ showing faster cleavage and a higher degree of unwinding. The results indicate that, with a 16-nt guide, "peripheral" DNA sequences beyond the RNA/DNA hybrid contribute to target discrimination by tuning the cleavage reaction transition state through the modulation of PAM-distal unwinding. The finding provides mechanistic insights for the further development of strategies that use RNA guide truncation to enhance Cas9 specificity.


Subject(s)
CRISPR-Cas Systems , RNA , RNA/genetics , Nucleotides , DNA/genetics , Gene Editing/methods
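
The log-linear inverse correlation reported in the abstract of entry 13 can be written schematically as follows. The intercept $a$, the slope $b$, and the choice of the natural logarithm are assumptions made for illustration; the abstract does not give fitted values or specify the logarithm base.

```latex
\ln k_{\mathrm{cat}} = a - b\,\Delta G^{0}_{\mathrm{NN}(17\text{-}20)}, \qquad b > 0
```

With $b > 0$, a smaller duplex dissociation free energy at PAM+(17-20) corresponds to a larger cleavage rate, consistent with the trend the abstract describes.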
14.
Bioinformatics ; 38(22): 5121-5123, 2022 11 15.
Article in English | MEDLINE | ID: mdl-36179084

ABSTRACT

SUMMARY: Several high-throughput protein-DNA binding methods currently available produce highly reproducible measurements of binding affinity at the level of the k-mer. However, understanding where a k-mer is positioned along a binding site sequence depends on alignment. Here, we present Top-Down Crawl (TDC), an ultra-rapid tool designed for the alignment of k-mer level data in a rank-dependent and position weight matrix (PWM)-independent manner. As the framework only depends on the rank of the input, the method can accept input from many types of experiments (protein binding microarray, SELEX-seq, SMiLE-seq, etc.) without the need for specialized parameterization. Measuring the performance of the alignment using multiple linear regression with 5-fold cross-validation, we find TDC to perform as well as or better than computationally expensive PWM-based methods. AVAILABILITY AND IMPLEMENTATION: TDC can be run online at https://topdowncrawl.usc.edu or locally as a python package available through pip at https://pypi.org/project/TopDownCrawl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Position-Specific Scoring Matrices , Binding Sites , Sequence Analysis, DNA/methods , Protein Binding
15.
Proc Natl Acad Sci U S A ; 117(25): 14322-14330, 2020 06 23.
Article in English | MEDLINE | ID: mdl-32518115

ABSTRACT

Phosphorothioate (PT) DNA modifications-in which a nonbonding phosphate oxygen is replaced with sulfur-represent a widespread, horizontally transferred epigenetic system in prokaryotes and have a highly unusual property of occupying only a small fraction of available consensus sequences in a genome. Using Salmonella enterica as a model, we asked a question of fundamental importance: How do the PT-modifying DndA-E proteins select their GPSAAC/GPSTTC targets? Here, we applied innovative analytical, sequencing, and computational tools to discover a novel behavior for DNA-binding proteins: The Dnd proteins are "parked" at the G6mATC Dam methyltransferase consensus sequence instead of the expected GAAC/GTTC motif, with removal of the 6mA permitting extensive PT modification of GATC sites. This shift in modification sites further revealed a surprising constancy in the density of PT modifications across the genome. Computational analysis showed that GAAC, GTTC, and GATC share common features of DNA shape, which suggests that PT epigenetics are regulated in a density-dependent manner partly by DNA shape-driven target selection in the genome.


Subject(s)
Bacteria/genetics , Bacteria/metabolism , DNA, Bacterial/metabolism , Epigenesis, Genetic/physiology , Epigenomics , Phosphates/metabolism , 2-Aminopurine , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Base Sequence , Binding Sites , Consensus Sequence , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , DNA-Binding Proteins/metabolism , Escherichia coli/metabolism , Genome, Bacterial , Salmonella enterica/genetics
16.
Nature ; 540(7633): 428-432, 2016 12 15.
Article in English | MEDLINE | ID: mdl-27919074

ABSTRACT

The functionality of stem cells declines during ageing, and this decline contributes to ageing-associated impairments in tissue regeneration and function. Alterations in developmental pathways have been associated with declines in stem-cell function during ageing, but the nature of this process remains poorly understood. Hox genes are key regulators of stem cells and tissue patterning during embryogenesis with an unknown role in ageing. Here we show that the epigenetic stress response in muscle stem cells (also known as satellite cells) differs between aged and young mice. The alteration includes aberrant global and site-specific induction of active chromatin marks in activated satellite cells from aged mice, resulting in the specific induction of Hoxa9 but not other Hox genes. Hoxa9 in turn activates several developmental pathways and represents a decisive factor that separates satellite cell gene expression in aged mice from that in young mice. The activated pathways include most of the currently known inhibitors of satellite cell function in ageing muscle, including Wnt, TGFβ, JAK/STAT and senescence signalling. Inhibition of aberrant chromatin activation or deletion of Hoxa9 improves satellite cell function and muscle regeneration in aged mice, whereas overexpression of Hoxa9 mimics ageing-associated defects in satellite cells from young mice, which can be rescued by the inhibition of Hoxa9-targeted developmental pathways. Together, these data delineate an altered epigenetic stress response in activated satellite cells from aged mice, which limits satellite cell function and muscle regeneration by Hoxa9-dependent activation of developmental pathways.


Subject(s)
Cellular Senescence , Epistasis, Genetic , Growth and Development/genetics , Homeodomain Proteins/metabolism , Satellite Cells, Skeletal Muscle/cytology , Satellite Cells, Skeletal Muscle/metabolism , Stress, Physiological/genetics , Aging , Animals , Cellular Senescence/genetics , Chromatin/genetics , Chromatin/metabolism , Female , Homeodomain Proteins/biosynthesis , Homeodomain Proteins/genetics , Male , Mice , Muscle, Skeletal/cytology , Muscle, Skeletal/metabolism , Regeneration/genetics
17.
Mol Cell ; 54(5): 844-857, 2014 Jun 05.
Article in English | MEDLINE | ID: mdl-24813947

ABSTRACT

Transcription factors (TFs) preferentially bind sites contained in regions of computationally predicted high nucleosomal occupancy, suggesting that nucleosomes are gatekeepers of TF binding sites. However, because of their complexity mammalian genomes contain millions of randomly occurring, unbound TF consensus binding sites. We hypothesized that the information controlling nucleosome assembly may coincide with the information that enables TFs to bind cis-regulatory elements while ignoring randomly occurring sites. Hence, nucleosomes would selectively mask genomic sites that can be contacted by TFs and thus be potentially functional. The hematopoietic pioneer TF Pu.1 maintained nucleosome depletion at macrophage-specific enhancers that displayed a broad range of nucleosome occupancy in other cell types and in reconstituted chromatin. We identified a minimal set of DNA sequence and shape features that accurately predicted both Pu.1 binding and nucleosome occupancy genome-wide. These data reveal a basic organizational principle of mammalian cis-regulatory elements whereby TF recruitment and nucleosome deposition are controlled by overlapping DNA sequence features.


Subject(s)
Enhancer Elements, Genetic , Nucleosomes/genetics , Proto-Oncogene Proteins/metabolism , Trans-Activators/metabolism , Animals , Base Sequence , Binding Sites , Cells, Cultured , Consensus Sequence , Gene Expression Regulation , Gene Knockdown Techniques , Humans , Mice , Models, Genetic , Nucleosomes/metabolism , Proto-Oncogene Proteins/genetics , RNA, Small Interfering/genetics , Sequence Analysis, DNA , Support Vector Machine , Trans-Activators/genetics
18.
Nucleic Acids Res ; 48(D1): D246-D255, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31665425

ABSTRACT

TFBSshape (https://tfbsshape.usc.edu) is a motif database for analyzing structural profiles of transcription factor binding sites (TFBSs). The main rationale for this database is to be able to derive mechanistic insights in protein-DNA readout modes from sequencing data without available structures. We extended the quantity and dimensionality of TFBSshape, from mostly in vitro to in vivo binding and from unmethylated to methylated DNA. This new release of TFBSshape improves its functionality and launches a responsive and user-friendly web interface for easy access to the data. The current expansion includes new entries from the most recent collections of transcription factors (TFs) from the JASPAR and UniPROBE databases, methylated TFBSs derived from in vitro high-throughput EpiSELEX-seq binding assays and in vivo methylated TFBSs from the MeDReaders database. TFBSshape content has increased to 2428 structural profiles for 1900 TFs from 39 different species. The structural profiles for each TFBS entry now include 13 shape features and minor groove electrostatic potential for standard DNA and four shape features for methylated DNA. We improved the flexibility and accuracy for the shape-based alignment of TFBSs and designed new tools to compare methylated and unmethylated structural profiles of TFs and methods to derive DNA shape-preserving nucleotide mutations in TFBSs.


Subject(s)
DNA/chemistry , Databases, Genetic , Transcription Factors/metabolism , Binding Sites , DNA/metabolism , DNA Methylation , Mutation , Nucleotide Motifs , Protein Binding , Sequence Analysis, DNA
19.
Nucleic Acids Res ; 48(D1): D277-D287, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31612957

ABSTRACT

DNAproDB (https://dnaprodb.usc.edu) is a web-based database and structural analysis tool that offers a combination of data visualization, data processing and search functionality that improves the speed and ease with which researchers can analyze, access and visualize structural data of DNA-protein complexes. In this paper, we report significant improvements made to DNAproDB since its initial release. DNAproDB now supports any DNA secondary structure from typical B-form DNA to single-stranded DNA to G-quadruplexes. We have updated the structure of our data files to support complex DNA conformations, multiple DNA-protein complexes within a DNAproDB entry and model indexing for analysis of ensemble data. Support for chemically modified residues and nucleotides has been significantly improved along with the addition of new structural features, improved structural moiety assignment and use of more sequence-based annotations. We have redesigned our report pages and search forms to support these enhancements, and the DNAproDB website has been improved to be more responsive and user-friendly. DNAproDB is now integrated with the Nucleic Acid Database, and we have increased our coverage of available Protein Data Bank entries. Our database now contains 95% of all available DNA-protein complexes, making our tools for analysis of these structures accessible to a broad community.


Subject(s)
DNA-Binding Proteins/chemistry , DNA/chemistry , Databases, Genetic , Internet , Nucleic Acid Conformation , Protein Conformation , Software , User-Computer Interface
20.
Nucleic Acids Res ; 48(15): 8529-8544, 2020 09 04.
Article in English | MEDLINE | ID: mdl-32738045

ABSTRACT

Myocyte enhancer factor-2B (MEF2B) has the unique capability of binding to its DNA target sites with a degenerate motif, while still functioning as a gene-specific transcriptional regulator. Identifying its DNA targets is crucial given regulatory roles exerted by members of the MEF2 family and MEF2B's involvement in B-cell lymphoma. Analyzing structural data and SELEX-seq experimental results, we deduced the DNA sequence and shape determinants of MEF2B target sites on a high-throughput basis in vitro for wild-type and mutant proteins. Quantitative modeling of MEF2B binding affinities and computational simulations exposed the DNA readout mechanisms of MEF2B. The resulting binding signature of MEF2B revealed distinct intricacies of DNA recognition compared to other transcription factors. MEF2B uses base readout at its half-sites combined with shape readout at the center of its degenerate motif, where A-tract polarity dictates nuances of binding. The predominant role of shape readout at the center of the core motif, with most contacts formed in the minor groove, differs from previously observed protein-DNA readout modes. MEF2B, therefore, represents a unique protein for studies of the role of DNA shape in achieving binding specificity. MEF2B-DNA recognition mechanisms are likely representative for other members of the MEF2 family.


Subject(s)
DNA-Binding Proteins/ultrastructure , DNA/ultrastructure , Multiprotein Complexes/ultrastructure , Amino Acid Sequence/genetics , Binding Sites/genetics , DNA/genetics , DNA-Binding Proteins/chemistry , Humans , Lymphoma, B-Cell/genetics , Lymphoma, B-Cell/pathology , MADS Domain Proteins/genetics , MADS Domain Proteins/ultrastructure , MEF2 Transcription Factors/chemistry , MEF2 Transcription Factors/ultrastructure , Multiprotein Complexes/genetics , Nucleic Acid Conformation , Nucleotide Motifs/genetics , Protein Binding/genetics