1.
Nat Rev Genet ; 21(11): 699-714, 2020 11.
Article in English | MEDLINE | ID: mdl-32665585

ABSTRACT

Despite enormous progress in understanding the fundamentals of bacterial gene regulation, our knowledge remains limited when compared with the number of bacterial genomes and regulatory systems to be discovered. Derived from a small number of initial studies, classic definitions for concepts of gene regulation have evolved as the number of characterized promoters has increased. Together with discoveries made using new technologies, this knowledge has led to revised generalizations and principles. In this Expert Recommendation, we suggest precise, updated definitions that support a logical, consistent conceptual framework of bacterial gene regulation, focusing on transcription initiation. The resulting concepts can be formalized by ontologies for computational modelling, laying the foundation for improved bioinformatics tools, knowledge-based resources and scientific communication. Thus, this work will help researchers construct better predictive models, with different formalisms, that will be useful in engineering, synthetic biology, microbiology and genetics.


Subject(s)
Bacteria/genetics; Gene Expression Regulation, Bacterial; Transcription Initiation, Genetic; Operon; Promoter Regions, Genetic; Regulon; Transcription Factors/physiology
2.
Nucleic Acids Res ; 51(11): 5364-5376, 2023 06 23.
Article in English | MEDLINE | ID: mdl-36951113

ABSTRACT

The human genome contains about 800 C2H2 zinc finger proteins (ZFPs), and most of them are composed of long arrays of zinc fingers. The standard ZFP recognition model asserts that longer finger arrays should recognize longer DNA-binding sites. However, recent experimental efforts to identify in vivo ZFP binding sites contradict this assumption, with many ZFPs exhibiting short motifs. Here we use ZFY, CTCF, ZIM3, and ZNF343 as examples to address three closely related questions: What impedes current motif discovery methods? What are the functions of those seemingly unused fingers? And how can we improve motif discovery algorithms based on the biophysical properties of long ZFPs? Using ZFY, we employed a variety of methods and found evidence for 'dependent recognition', where downstream fingers can recognize some previously undiscovered motifs only in the presence of an intact core site. For CTCF, high-throughput measurements revealed that its upstream specificity profile depends on the strength of its core. Moreover, the binding strength of the upstream site modulates CTCF's sensitivity to different epigenetic modifications within the core, providing new insight into how the previously identified intellectual disability-causing and cancer-related mutant R567W disrupts upstream recognition and deregulates the epigenetic control by CTCF. Our results establish that, because of irregular motif structures, variable spacing and dependent recognition between sub-motifs, the specificities of long ZFPs are significantly underestimated. We therefore developed an algorithm, ModeMap, to infer the motifs and recognition models of ZIM3 and ZNF343, which facilitates high-confidence identification of specific binding sites, including repeat-derived elements. With revised concepts, techniques, and algorithms, we can discover the overlooked specificities and functions of those 'extra' fingers, and therefore decipher their broader roles in human biology and disease.


Subject(s)
DNA; Transcription Factors; Zinc Fingers; Humans; Binding Sites; Transcription Factors/chemistry; Transcription Factors/metabolism; Algorithms; Nucleotide Motifs; Amino Acid Motifs; DNA/chemistry; DNA/metabolism
3.
Bioinformatics ; 39(6)2023 06 01.
Article in English | MEDLINE | ID: mdl-37294804

ABSTRACT

MOTIVATION: Motifs play a crucial role in computational biology, as they provide valuable information about the binding specificity of proteins. However, conventional motif discovery methods typically rely on simple combinatoric or probabilistic approaches, which can be biased by heuristics such as substring-masking for multiple motif discovery. In recent years, deep neural networks have become increasingly popular for motif discovery, as they are capable of capturing complex patterns in data. Nonetheless, inferring motifs from neural networks remains a challenging problem, from both a modeling and a computational standpoint, despite the success of these networks in supervised learning tasks. RESULTS: We present a principled representation learning approach based on a hierarchical sparse representation for motif discovery. Our method effectively discovers gapped, long, or overlapping motifs that we show to commonly exist in next-generation sequencing datasets, in addition to the short and enriched primary binding sites. Our model is fully interpretable, fast, and capable of capturing motifs in a large number of DNA strings. A key concept that emerged from our approach, enumerating at the image level, effectively overcomes the k-mers paradigm, so that modest computational resources suffice to capture the long and varied but conserved patterns in addition to the primary binding sites. AVAILABILITY AND IMPLEMENTATION: Our method is available as a Julia package under the MIT license at https://github.com/kchu25/MOTIFs.jl, and the results on experimental data can be found at https://zenodo.org/record/7783033.


Subject(s)
Proteins; Software; Proteins/chemistry; Binding Sites; Neural Networks, Computer; DNA
4.
Immunity ; 40(6): 896-909, 2014 Jun 19.
Article in English | MEDLINE | ID: mdl-24882217

ABSTRACT

Animal host defense against infection requires the expression of defense genes at the right place and the right time. Understanding such tight control of host defense requires the elucidation of the transcription factors involved. By using an unbiased approach in the model Caenorhabditis elegans, we discovered that HLH-30 (known as TFEB in mammals) is a key transcription factor for host defense. HLH-30 was activated shortly after Staphylococcus aureus infection, and drove the expression of close to 80% of the host response, including antimicrobial and autophagy genes that were essential for host tolerance of infection. TFEB was also rapidly activated in murine macrophages upon S. aureus infection and was required for proper transcriptional induction of several proinflammatory cytokines and chemokines. Thus, our data suggest that TFEB is a previously unappreciated, evolutionarily ancient transcription factor in the host response to infection.


Subject(s)
Basic Helix-Loop-Helix Leucine Zipper Transcription Factors/immunology; Basic Helix-Loop-Helix Transcription Factors/immunology; Caenorhabditis elegans Proteins/immunology; Caenorhabditis elegans/immunology; Caenorhabditis elegans/microbiology; Staphylococcal Infections/immunology; Animals; Autophagy/genetics; Autophagy/immunology; Basic Helix-Loop-Helix Leucine Zipper Transcription Factors/genetics; Basic Helix-Loop-Helix Transcription Factors/genetics; Caenorhabditis elegans Proteins/genetics; Enterococcus faecalis/immunology; Immunity, Innate; Macrophages/immunology; Mice; Pseudomonas Infections/immunology; Pseudomonas aeruginosa/immunology; RNA Interference; RNA, Small Interfering; Salmonella Infections/immunology; Salmonella enterica/immunology; Signal Transduction/immunology; Staphylococcus aureus/immunology; Transcriptional Activation/genetics; Transcriptional Activation/immunology
5.
Cell ; 133(7): 1277-89, 2008 Jun 27.
Article in English | MEDLINE | ID: mdl-18585360

ABSTRACT

We describe the comprehensive characterization of homeodomain DNA-binding specificities from a metazoan genome. The analysis of all 84 independent homeodomains from D. melanogaster reveals the breadth of DNA sequences that can be specified by this recognition motif. The majority of these factors can be organized into 11 different specificity groups, where the preferred recognition sequence between these groups can differ at up to four of the six core recognition positions. Analysis of the recognition motifs within these groups led to a catalog of common specificity determinants that may cooperate or compete to define the binding site preference. With these recognition principles, a homeodomain can be reengineered to create factors where its specificity is altered at the majority of recognition positions. This resource also allows prediction of homeodomain specificities from other organisms, which is demonstrated by the prediction and analysis of human homeodomain specificities.


Subject(s)
DNA/metabolism; Drosophila Proteins/chemistry; Drosophila melanogaster/chemistry; Homeodomain Proteins/chemistry; Amino Acid Sequence; Animals; Bacteria/chemistry; Bacteria/genetics; Base Sequence; DNA/chemistry; Drosophila Proteins/genetics; Drosophila melanogaster/genetics; Genome, Insect; Homeodomain Proteins/genetics; Humans; Models, Molecular; Phylogeny; Protein Engineering; Protein Structure, Tertiary; Two-Hybrid System Techniques
6.
Mol Biol Evol ; 38(7): 2854-2868, 2021 06 25.
Article in English | MEDLINE | ID: mdl-33720298

ABSTRACT

Transcription factor-driven cell fate engineering in pluripotency induction, transdifferentiation, and forward reprogramming requires efficiency, speed, and maturity for widespread adoption and clinical translation. Here, we used Oct4, Sox2, Klf4, and c-Myc driven pluripotency reprogramming to evaluate methods for enhancing and tailoring cell fate transitions, through directed evolution with iterative screening of pooled mutant libraries and phenotypic selection. We identified an artificially evolved and enhanced POU factor (ePOU) that substantially outperforms wild-type Oct4 in terms of reprogramming speed and efficiency. In contrast to Oct4, not only can ePOU induce pluripotency with Sox2 alone, but it can also do so in the absence of Sox2 in a three-factor ePOU/Klf4/c-Myc cocktail. Biochemical assays combined with genome-wide analyses showed that ePOU possesses a new preference to dimerize on palindromic DNA elements. Yet, the moderate capacity of Oct4 to function as a pioneer factor, its preference to bind octamer DNA and its capability to dimerize with Sox2 and Sox17 proteins remain unchanged in ePOU. Compared with Oct4, ePOU is thermodynamically stabilized and persists longer in reprogramming cells. In consequence, ePOU: 1) differentially activates several genes hitherto not implicated in reprogramming, 2) reveals an unappreciated role of thyrotropin-releasing hormone signaling, and 3) binds a distinct class of retrotransposons. Collectively, these features enable ePOU to accelerate the establishment of the pluripotency network. This demonstrates that the phenotypic selection of novel factor variants from mammalian cells with desired properties is key to advancing cell fate conversions with artificially evolved biomolecules.


Subject(s)
Cellular Reprogramming Techniques; Directed Molecular Evolution; POU Domain Factors/genetics; Animals; Kruppel-Like Factor 4; Mice; Protein Engineering
7.
Nucleic Acids Res ; 45(14): 8199-8207, 2017 Aug 21.
Article in English | MEDLINE | ID: mdl-28510715

ABSTRACT

The quantitative specificity of the STAT1 transcription factor was determined by measuring the relative affinity to hundreds of variants of the consensus binding site including variations in the length of the site. The known consensus sequence is observed to have the highest affinity, with all variants decreasing binding affinity considerably. There is very little loss of binding affinity when the CpG within the consensus binding site is methylated. Additionally, the specificity of mutant proteins, with variants of amino acids that interact with the DNA, was determined and nearly all of them are observed to lose specificity across the entire binding site. The change of Asn at position 460 to His, which corresponds to the natural amino acid at the homologous position in STAT6, does not change the specificity nor does it change the length preference to match that of STAT6. These results provide the first quantitative analysis of changes in binding affinity for the STAT1 protein, and several variants of it, to hundreds of different binding sites including different spacer lengths, and the effect of CpG methylation.


Subject(s)
CpG Islands/genetics; DNA/genetics; Genetic Variation; STAT1 Transcription Factor/genetics; Algorithms; Amino Acid Sequence; Base Sequence; Binding Sites/genetics; Binding, Competitive; DNA/metabolism; DNA Methylation; Electrophoresis, Polyacrylamide Gel; Kinetics; Mutation, Missense; Protein Binding; STAT1 Transcription Factor/metabolism; STAT6 Transcription Factor/genetics; STAT6 Transcription Factor/metabolism; Sequence Homology, Amino Acid
8.
Nucleic Acids Res ; 45(2): 832-845, 2017 01 25.
Article in English | MEDLINE | ID: mdl-27915232

ABSTRACT

Cooperative binding of transcription factors is known to be important in the regulation of gene expression programs conferring cellular identities. However, current methods to measure cooperativity parameters have been laborious and therefore limited to studying only a few sequence variants at a time. We developed Coop-seq (cooperativity by sequencing) that is capable of efficiently and accurately determining the cooperativity parameters for hundreds of different DNA sequences in a single experiment. We apply Coop-seq to 12 dimer pairs from the Sox and POU families of transcription factors using 324 unique sequences with changed half-site orientation, altered spacing and discrete randomization within the binding elements. The study reveals specific dimerization profiles of different Sox factors with Oct4. By contrast, Oct4 and the three neural class III POU factors Brn2, Brn4 and Oct6 assemble with Sox2 in a surprisingly indistinguishable manner. Two novel half-site configurations can support functional Sox/Oct dimerization in addition to known composite motifs. Moreover, Coop-seq uncovers a nucleotide switch within the POU half-site when spacing is altered, which is mirrored in genomic loci bound by Sox2/Oct4 complexes.


Subject(s)
POU Domain Factors/metabolism; SOX Transcription Factors/metabolism; Animals; DNA/chemistry; DNA/metabolism; Mice; Models, Molecular; Octamer Transcription Factor-3/chemistry; Octamer Transcription Factor-3/metabolism; POU Domain Factors/chemistry; Protein Binding; Protein Conformation; Protein Multimerization; SOX Transcription Factors/chemistry; SOXB1 Transcription Factors/chemistry; SOXB1 Transcription Factors/metabolism
9.
BMC Bioinformatics ; 19(1): 86, 2018 03 06.
Article in English | MEDLINE | ID: mdl-29510689

ABSTRACT

BACKGROUND: Transcription factor (TF) binding site specificity is commonly represented by some form of matrix model in which the positions in the binding site are assumed to contribute independently to the site's activity. The independence assumption is known to be an approximation, often a good one but sometimes poor. Alternative approaches have been developed that use k-mers (DNA "words" of length k) to account for the non-independence, and more recently DNA structural parameters have been incorporated into the models. ChIP-seq data are often used to assess the discriminatory power of motifs and to compare different models. However, to measure the improvement due to using more complex models, one must compare to optimized matrix models. RESULTS: We describe a program, "Discriminative Additive Model Optimization" (DAMO), that uses positive and negative examples, as in ChIP-seq data, and finds the additive position weight matrix (PWM) that maximizes the Area Under the Receiver Operating Characteristic Curve (AUROC). We compare to a recent study where structural parameters, serving as features in a gradient boosting classifier algorithm, are shown to improve the AUROC over JASPAR position frequency matrices (PFMs). In agreement with the previous results, we find that adding structural parameters gives the largest improvement, but most of the gain can be obtained by an optimized PWM and nearly all of the gain can be obtained with a di-nucleotide extension to the PWM. CONCLUSION: To appropriately compare different models for TF binding sites, optimized models must be used. PWMs and their extensions are good representations of binding specificity for most TFs, and more complex models, including the incorporation of DNA shape features and gradient boosting classifiers, provide only moderate improvements for a few TFs.


Subject(s)
Algorithms; DNA/chemistry; Models, Molecular; Nucleotide Motifs/genetics; Position-Specific Scoring Matrices; Area Under Curve; Binding Sites; Databases, Nucleic Acid; Humans; Protein Binding
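
The quantities named in the abstract of entry 9 (an additive PWM score and the AUROC over positive and negative examples) can be sketched in a few lines of Python. This is not DAMO and performs no optimization; the matrix weights, the example sequences, and the function names `pwm_score`, `best_window_score`, and `auroc` are all hypothetical, chosen only to illustrate what an AUROC-based comparison measures.

```python
from typing import Dict, List

# Hypothetical 3-position additive PWM (illustrative weights, not from any database).
PWM: List[Dict[str, float]] = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]

def pwm_score(site: str, pwm: List[Dict[str, float]]) -> float:
    """Additive score: every position contributes independently."""
    return sum(column[base] for column, base in zip(pwm, site))

def best_window_score(seq: str, pwm: List[Dict[str, float]]) -> float:
    """Score a longer sequence by its best-scoring window of PWM width."""
    width = len(pwm)
    return max(pwm_score(seq[i:i + width], pwm)
               for i in range(len(seq) - width + 1))

def auroc(pos: List[float], neg: List[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical "bound" and "unbound" sequences standing in for ChIP-seq examples.
positives = ["TTACGTT", "ACGAAAA", "GGGACGC"]
negatives = ["TTTTTTT", "GGCCGGC", "ACTACTA"]

pos_scores = [best_window_score(s, PWM) for s in positives]
neg_scores = [best_window_score(s, PWM) for s in negatives]
print(auroc(pos_scores, neg_scores))
```

Because the AUROC depends only on how the two sets of scores are ranked against each other, any change to the PWM weights that preserves that ranking leaves it unchanged.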
10.
BMC Mol Biol ; 19(1): 5, 2018 03 27.
Article in English | MEDLINE | ID: mdl-29587652

ABSTRACT

BACKGROUND: BATF family transcription factors (BATF, BATF2 and BATF3) form hetero-trimers with JUNB and either IRF4 or IRF8 to regulate cell fate in T cells and dendritic cells in vivo. While each combination of the hetero-trimer has a distinct role, some degree of cross-compensation was observed. The basis for the differential actions of IRF4 and IRF8 with BATF factors and JUNB is still unknown. We propose that the differences in function between these hetero-trimers may be caused by differences in their DNA binding preferences. While all three BATF family transcription factors have similar binding preferences when binding as a hetero-dimer with JUNB, the cooperative binding of IRF4 or IRF8 to the hetero-dimer/DNA complex could change the preferences. We used Spec-seq, which allows for the efficient and accurate determination of relative affinity to a large collection of sequences in parallel, to find differences between cooperative DNA binding of IRF4, IRF8 and BATF family members. RESULTS: We found that without IRF binding, all three hetero-dimer pairs exhibit nearly the same binding preferences to both expected wildtype binding sites TRE (TGA(C/G)TCA) and CRE (TGACGTCA). IRF4 and IRF8 show very similar DNA binding preferences when binding with any of the three hetero-dimers. No major change of binding preferences was found in the half-sites between different hetero-trimers. IRF proteins bind with substantially lower affinity with either a single nucleotide spacer between IRF and BATF binding site or with an alternative mode of binding in the opposite orientation. In addition, the preference to CRE binding site was reduced with either IRF binding in all BATF-JUNB combinations. CONCLUSIONS: The specificities of BATF, BATF2 and BATF3 are all very similar as are their interactions with IRF4 and IRF8. IRF proteins binding adjacent to BATF sites increases affinity substantially compared to sequences with spacings between the sites, indicating cooperative binding through protein-protein interactions. The preference for the type of BATF binding site, TRE or CRE, is also altered when IRF proteins bind. These in vitro preferences aid in the understanding of in vivo binding activities.


Subject(s)
Basic-Leucine Zipper Transcription Factors/metabolism; Interferon Regulatory Factors/genetics; Sequence Analysis, DNA/methods; Transcription Factors/genetics; Animals; Basic-Leucine Zipper Transcription Factors/chemistry; Basic-Leucine Zipper Transcription Factors/genetics; Binding Sites; Humans; Interferon Regulatory Factors/chemistry; Interferon Regulatory Factors/metabolism; Mice; Protein Multimerization; Repressor Proteins/chemistry; Repressor Proteins/genetics; Repressor Proteins/metabolism; Transcription Factors/chemistry; Transcription Factors/metabolism; Tumor Suppressor Proteins/chemistry; Tumor Suppressor Proteins/genetics; Tumor Suppressor Proteins/metabolism
11.
Bioinformatics ; 33(15): 2288-2295, 2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28379348

ABSTRACT

MOTIVATION: Characterizing the binding specificities of transcription factors (TFs) is crucial to the study of gene expression regulation. Recently developed high-throughput experimental methods, including protein binding microarrays (PBM) and high-throughput SELEX (HT-SELEX), have enabled rapid measurements of the specificities for hundreds of TFs. However, few studies have developed efficient algorithms for estimating binding motifs based on HT-SELEX data. Also the simple method of constructing a position weight matrix (PWM) by comparing the frequency of the preferred sequence with single-nucleotide variants has the risk of generating motifs with higher information content than the true binding specificity. RESULTS: We developed an algorithm called BEESEM that builds on a comprehensive biophysical model of protein-DNA interactions, which is trained using the expectation maximization method. BEESEM is capable of selecting the optimal motif length and calculating the confidence intervals of estimated parameters. By comparing BEESEM with the published motifs estimated using the same HT-SELEX data, we demonstrate that BEESEM provides significant improvements. We also evaluate several motif discovery algorithms on independent PBM and ChIP-seq data. BEESEM provides significantly better fits to in vitro data, but its performance is similar to some other methods on in vivo data under the criterion of the area under the receiver operating characteristic curve (AUROC). This highlights the limitations of the purely rank-based AUROC criterion. Using quantitative binding data to assess models, however, demonstrates that BEESEM improves on prior models. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at http://stormo.wustl.edu/resources.html . CONTACT: stormo@wustl.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Chromatin Immunoprecipitation/methods; DNA/metabolism; Protein Array Analysis/methods; Software; Thermodynamics; Transcription Factors/metabolism; Algorithms; Animals; Binding Sites; DNA/chemistry; Humans; Mice; Position-Specific Scoring Matrices; Protein Binding; Sequence Analysis, DNA/methods; Transcription Factors/chemistry
12.
PLoS Comput Biol ; 13(7): e1005638, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28686588

ABSTRACT

The specificities of transcription factors are most commonly represented with probabilistic models. These models provide a probability for each base occurring at each position within the binding site and the positions are assumed to contribute independently. The model is simple and intuitive and is the basis for many motif discovery algorithms. However, the model also has inherent limitations that prevent it from accurately representing true binding probabilities, especially for the highest affinity sites under conditions of high protein concentration. The limitations are not due to the assumption of independence between positions but rather are caused by the non-linear relationship between binding affinity and binding probability and the fact that independent normalization at each position skews the site probabilities. Generally probabilistic models are reasonably good approximations, but new high-throughput methods allow for biophysical models with increased accuracy that should be used whenever possible.


Subject(s)
DNA/chemistry; DNA/metabolism; Models, Statistical; Transcription Factors/chemistry; Transcription Factors/metabolism; Computational Biology; Computer Simulation; Software
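
The non-linear relationship between binding affinity and binding probability mentioned in the abstract of entry 12 can be illustrated with the standard two-state occupancy function, P = 1 / (1 + exp(E - mu)). The sketch below is an assumption-laden illustration, not the paper's model: the energies (in units of kT), the chemical potential `mu` (standing in for log protein concentration), and the function names `occupancy` and `discrimination` are hypothetical.

```python
import math

def occupancy(energy: float, mu: float) -> float:
    """Two-state binding probability for a site with the given binding energy.

    energy and mu are in units of kT; lower energy means higher affinity.
    """
    return 1.0 / (1.0 + math.exp(energy - mu))

# Hypothetical sites: the consensus and a variant that is 2 kT weaker.
E_CONSENSUS = 0.0
E_VARIANT = 2.0

def discrimination(mu: float) -> float:
    """Ratio of binding probabilities, consensus over variant."""
    return occupancy(E_CONSENSUS, mu) / occupancy(E_VARIANT, mu)

for mu in (-5.0, 0.0, 5.0):
    print(f"mu={mu:+.1f}  P(consensus)={occupancy(E_CONSENSUS, mu):.4f}  "
          f"P(variant)={occupancy(E_VARIANT, mu):.4f}  "
          f"ratio={discrimination(mu):.3f}")
```

At low `mu` the probability ratio approaches the affinity ratio exp(2), while at high `mu` both sites saturate and the ratio approaches 1. Under these assumptions, that is why binding probabilities stop tracking affinities for the strongest sites when protein concentration is high.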
13.
J Biol Chem ; 290(32): 19756-69, 2015 Aug 07.
Article in English | MEDLINE | ID: mdl-26088140

ABSTRACT

Combinatorial gene regulation through feed-forward loops (FFLs) can bestow specificity and temporal control to client gene expression; however, characteristics of binding sites that mediate these effects are not established. We previously showed that the glucocorticoid receptor (GR) and KLF15 form coherent FFLs that cooperatively induce targets such as the amino acid-metabolizing enzymes AASS and PRODH and incoherent FFLs exemplified by repression of MT2A by KLF15. Here, we demonstrate that GR and KLF15 physically interact and identify low affinity GR binding sites within glucocorticoid response elements (GREs) for PRODH and AASS that contribute to combinatorial regulation with KLF15. We used deep sequencing and electrophoretic mobility shift assays to derive in vitro GR binding affinities across sequence space. We applied these data to show that AASS GRE activity correlated (r² = 0.73) with predicted GR binding affinities across a 50-fold affinity range in transfection assays; however, the slope of the linear relationship more than doubled when KLF15 was expressed. Whereas activity of the MT2A GRE was even more strongly (r² = 0.89) correlated with GR binding site affinity, the slope of the linear relationship was sharply reduced by KLF15, consistent with incoherent FFL logic. Thus, GRE architecture and co-regulator expression together determine the functional parameters that relate GR binding site affinity to hormone-induced transcriptional responses. Utilization of specific affinity response functions and GR binding sites by FFLs may contribute to the diversity of gene expression patterns within GR-regulated transcriptomes.


Subject(s)
Kruppel-Like Transcription Factors/metabolism; Nuclear Proteins/metabolism; Proline Oxidase/metabolism; Receptors, Glucocorticoid/metabolism; Response Elements; Saccharopine Dehydrogenases/metabolism; Transcription, Genetic; Animals; Base Sequence; Binding Sites; Bronchi/cytology; Bronchi/drug effects; Bronchi/metabolism; Cell Line; Dexamethasone/pharmacology; Electrophoretic Mobility Shift Assay; Epithelial Cells/cytology; Epithelial Cells/drug effects; Epithelial Cells/metabolism; Fibroblasts/cytology; Fibroblasts/drug effects; Fibroblasts/metabolism; Gene Expression Regulation; High-Throughput Nucleotide Sequencing; Humans; Kruppel-Like Transcription Factors/chemistry; Kruppel-Like Transcription Factors/genetics; Mice; Molecular Sequence Data; Nuclear Proteins/chemistry; Nuclear Proteins/genetics; Proline Oxidase/chemistry; Proline Oxidase/genetics; Promoter Regions, Genetic; Protein Binding; Receptors, Glucocorticoid/chemistry; Receptors, Glucocorticoid/genetics; Saccharopine Dehydrogenases/chemistry; Saccharopine Dehydrogenases/genetics; Signal Transduction
14.
Genome Res ; 23(6): 928-40, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23471540

ABSTRACT

Cys2-His2 zinc finger proteins (ZFPs) are the largest group of transcription factors in higher metazoans. A complete characterization of these ZFPs and their associated target sequences is pivotal to fully annotate transcriptional regulatory networks in metazoan genomes. As a first step in this process, we have characterized the DNA-binding specificities of 129 zinc finger sets from Drosophila using a bacterial one-hybrid system. This data set contains the DNA-binding specificities for at least one encoded ZFP from 70 unique genes and 23 alternate splice isoforms representing the largest set of characterized ZFPs from any organism described to date. These recognition motifs can be used to predict genomic binding sites for these factors within the fruit fly genome. Subsets of fingers from these ZFPs were characterized to define their orientation and register on their recognition sequences, thereby allowing us to define the recognition diversity within this finger set. We find that the characterized fingers can specify 47 of the 64 possible DNA triplets. To confirm the utility of our finger recognition models, we employed subsets of Drosophila fingers in combination with an existing archive of artificial zinc finger modules to create ZFPs with novel DNA-binding specificity. These hybrids of natural and artificial fingers can be used to create functional zinc finger nucleases for editing vertebrate genomes.


Subject(s)
Binding Sites; Drosophila Proteins/genetics; Drosophila/genetics; Nucleotide Motifs; Zinc Fingers/genetics; Alternative Splicing; Animals; Base Sequence; Cluster Analysis; Computational Biology/methods; Drosophila Proteins/chemistry; Drosophila Proteins/classification; Models, Molecular; Phylogeny; Position-Specific Scoring Matrices; Protein Binding; Protein Conformation
15.
Nat Rev Genet ; 11(11): 751-60, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20877328

ABSTRACT

Proteins, such as many transcription factors, that bind to specific DNA sequences are essential for the proper regulation of gene expression. Identifying the specific sequences that each factor binds can help to elucidate regulatory networks within cells and how genetic variation can cause disruption of normal gene expression, which is often associated with disease. Traditional methods for determining the specificity of DNA-binding proteins are slow and laborious, but several new high-throughput methods can provide comprehensive binding information much more rapidly. Combined with in vivo determinations of transcription factor binding locations, this information provides more detailed views of the regulatory circuitry of cells and the effects of variation on gene expression.


Subject(s)
DNA-Binding Proteins/metabolism; DNA/metabolism; Animals; Base Sequence; Binding Sites/genetics; Clinical Laboratory Techniques; DNA-Binding Proteins/analysis; Humans; Models, Biological; Protein Binding; Substrate Specificity; Transcription Factors/metabolism
16.
Nucleic Acids Res ; 42(8): 4800-12, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24523353

ABSTRACT

Cys2-His2 zinc finger proteins (ZFPs) are the largest family of transcription factors in higher metazoans. They also represent the most diverse family with regard to the composition of their recognition sequences. Although there are a number of ZFPs with characterized DNA-binding preferences, the specificity of the vast majority of ZFPs is unknown and cannot be directly inferred by homology due to the diversity of recognition residues present within individual fingers. Given the large number of unique zinc fingers and assemblies present across eukaryotes, a comprehensive predictive recognition model that could accurately estimate the DNA-binding specificity of any ZFP based on its amino acid sequence would have great utility. Toward this goal, we have used the DNA-binding specificities of 678 two-finger modules from both natural and artificial sources to construct a random forest-based predictive model for ZFP recognition. We find that our recognition model outperforms previously described determinant-based recognition models for ZFPs, and can successfully estimate the specificity of naturally occurring ZFPs with previously defined specificities.


Subject(s)
DNA-Binding Proteins/metabolism; Regulatory Elements, Transcriptional; Transcription Factors/metabolism; Zinc Fingers; Artificial Intelligence; Binding Sites; DNA/chemistry; DNA-Binding Proteins/chemistry; Models, Biological; Nucleotide Motifs; Transcription Factors/chemistry
17.
Genome Res ; 22(10): 1889-98, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22539651

ABSTRACT

The recognition potential of most families of DNA-binding domains (DBDs) remains relatively unexplored. Homeodomains (HDs), like many other families of DBDs, display limited diversity in their preferred recognition sequences. To explore the recognition potential of HDs, we utilized a bacterial selection system to isolate HD variants, from a randomized library, that are compatible with each of the 64 possible 3' triplet sites (i.e., TAANNN). The majority of these selections yielded sets of HDs with overrepresented residues at specific recognition positions, implying the selection of specific binders. The DNA-binding specificity of 151 representative HD variants was subsequently characterized, identifying HDs that preferentially recognize 44 of these target sites. Many of these variants contain novel combinations of specificity determinants that are uncommon or absent in extant HDs. These novel determinants, when grafted into different HD backbones, produce a corresponding alteration in specificity. This information was used to create more explicit HD recognition models, which can inform the prediction of transcriptional regulatory networks for extant HDs or the engineering of HDs with novel DNA-recognition potential. The diversity of recovered HD recognition sequences raises important questions about the fitness barrier that restricts the evolution of alternate recognition modalities in natural systems.
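As a small illustration of the target-site space described in this abstract, the sketch below enumerates the 64 possible TAANNN sites (a fixed TAA core followed by every 3' triplet) and computes the fraction covered by the 44 sites the abstract reports as preferentially recognized. The function name `triplet_sites` is hypothetical and is not taken from the paper.

```python
from itertools import product

def triplet_sites(core="TAA"):
    """Return every site made of the fixed core followed by one of the 4^3 triplets."""
    return [core + "".join(triplet) for triplet in product("ACGT", repeat=3)]

sites = triplet_sites()
n_sites = len(sites)            # 4 ** 3 possible 3' triplets
recognized = 44                 # figure quoted in the abstract
coverage = recognized / n_sites # fraction of the site space covered

print(n_sites, sites[0], sites[-1], coverage)
```

The list is ordered alphabetically by triplet, so it starts at TAAAAA and ends at TAATTT.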


Subject(s)
DNA/chemistry , Homeodomain Proteins/chemistry , Animals , Base Sequence , Binding Sites , DNA/metabolism , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Humans , Molecular Docking Simulation , Protein Binding
18.
Nat Methods ; 9(6): 588-90, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22543349

ABSTRACT

The widespread use of zinc-finger nucleases (ZFNs) for genome engineering is hampered by the fact that only a subset of sequences can be efficiently recognized using published finger archives. We describe a set of validated two-finger modules that complement existing finger archives and expand the range of ZFN-accessible sequences threefold. Using this archive, we introduced lesions at 9 of 11 target sites in the zebrafish genome.


Subject(s)
Gene Targeting/methods , Zinc Fingers/genetics , Animals , Catalytic Domain , DNA Breaks, Double-Stranded , Endodeoxyribonucleases/genetics , Endodeoxyribonucleases/metabolism , Zebrafish
19.
Bioinformatics ; 30(7): 941-8, 2014 Apr 01.
Article in English | MEDLINE | ID: mdl-24369152

ABSTRACT

MOTIVATION: Generating accurate transcription factor (TF) binding site motifs from next-generation sequencing data, especially ChIP-seq, is challenging. The challenge arises because a typical experiment reports a large number of sequences bound by a TF, and each sequence is relatively long. Most traditional motif finders are slow in handling such enormous amounts of data. To overcome this limitation, tools have been developed that trade accuracy for speed by using heuristic discrete search strategies or limited optimization of identified seed motifs. However, such strategies may not fully use the information in input sequences to generate motifs. Such motifs often form good seeds and can be further improved with appropriate scoring functions and rapid optimization. RESULTS: We report a tool named discriminative motif optimizer (DiMO). DiMO takes a seed motif along with a positive and a negative database and improves the motif based on a discriminative strategy. We use area under receiver-operating characteristic curve (AUC) as a measure of discriminating power of motifs and a strategy based on perceptron training that maximizes AUC rapidly in a discriminative manner. Using DiMO, on a large test set of 87 TFs from human, Drosophila and yeast, we show that it is possible to significantly improve motifs identified by nine motif finders. The motifs are generated/optimized using training sets and evaluated on test sets. The AUC is improved for almost 90% of the TFs on test sets and the magnitude of increase is up to 39%. AVAILABILITY AND IMPLEMENTATION: DiMO is available at http://stormo.wustl.edu/DiMO
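To make the AUC criterion mentioned in this abstract concrete, here is a minimal sketch of how a motif's discriminating power could be measured on a positive and a negative sequence set. It is not DiMO's code: the function names, the uniform background, the pseudocount, the forward-strand-only scan and the toy sequences are all assumptions made for illustration, and the perceptron optimization step is not shown.

```python
from math import log2

def pwm_from_counts(counts, pseudocount=1.0):
    """Turn per-position base counts into log-odds weights (uniform background assumed)."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: log2(((column.get(base, 0) + pseudocount) / total) / 0.25)
            for base in "ACGT"
        })
    return pwm

def best_site_score(pwm, seq):
    """Score a sequence (upper-case ACGT only) by its best window, forward strand only."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )

def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical seed motif (TGAC) and toy sequence sets, for illustration only.
seed_counts = [{"T": 8}, {"G": 8}, {"A": 8}, {"C": 8}]
positives = ["AATGACGT", "CCTGACAA"]
negatives = ["AAAAAAAA", "CCGGCCGG"]

pwm = pwm_from_counts(seed_counts)
pos_scores = [best_site_score(pwm, s) for s in positives]
neg_scores = [best_site_score(pwm, s) for s in negatives]
motif_auc = auc(pos_scores, neg_scores)
print(motif_auc)
```

A discriminative optimizer in this spirit would adjust the weights in `pwm` and keep changes that raise `motif_auc` on training data, then report the AUC on held-out test sets.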


Subject(s)
Artificial Intelligence , Transcription Factors/chemistry , Algorithms , Animals , Base Sequence , Binding Sites , DNA/chemistry , DNA/metabolism , Drosophila melanogaster , Humans , Protein Binding , Saccharomyces cerevisiae , Software , Transcription Factors/metabolism
20.
Nature ; 460(7253): 405-9, 2009 Jul 16.
Article in English | MEDLINE | ID: mdl-19578362

ABSTRACT

Activator protein 1 (AP-1, also known as JUN) transcription factors are dimers of JUN, FOS, MAF and activating transcription factor (ATF) family proteins characterized by basic region and leucine zipper domains. Many AP-1 proteins contain defined transcriptional activation domains, but BATF and the closely related BATF3 (refs 2, 3) contain only a basic region and leucine zipper, and are considered to be inhibitors of AP-1 activity. Here we show that Batf is required for the differentiation of IL17-producing T helper (TH17) cells. TH17 cells comprise a CD4+ T-cell subset that coordinates inflammatory responses in host defence but is pathogenic in autoimmunity. Batf-/- mice have normal TH1 and TH2 differentiation, but show a defect in TH17 differentiation, and are resistant to experimental autoimmune encephalomyelitis. Batf-/- T cells fail to induce known factors required for TH17 differentiation, such as RORγt (encoded by Rorc) and the cytokine IL21 (refs 14-17). Neither the addition of IL21 nor the overexpression of RORγt fully restores IL17 production in Batf-/- T cells. The Il17 promoter is BATF-responsive, and after TH17 differentiation, BATF binds to conserved intergenic elements in the Il17a-Il17f locus and to the Il17, Il21 and Il22 (ref. 18) promoters. These results demonstrate that the AP-1 protein BATF has a critical role in TH17 differentiation.


Subject(s)
Basic-Leucine Zipper Transcription Factors/metabolism , Cell Differentiation , Interleukin-17/metabolism , T-Lymphocytes, Helper-Inducer/cytology , T-Lymphocytes, Helper-Inducer/metabolism , Transcription Factor AP-1/metabolism , Animals , Basic-Leucine Zipper Transcription Factors/deficiency , Basic-Leucine Zipper Transcription Factors/genetics , Encephalomyelitis, Autoimmune, Experimental/genetics , Female , Gene Expression Regulation , Genetic Predisposition to Disease , Interleukin-17/biosynthesis , Interleukin-17/genetics , Interleukins/genetics , Interleukins/metabolism , Interleukins/pharmacology , Lymph Nodes/metabolism , Male , Mice , Nuclear Receptor Subfamily 1, Group F, Member 3 , Promoter Regions, Genetic/genetics , Receptors, Retinoic Acid/genetics , Receptors, Retinoic Acid/metabolism , Receptors, Thyroid Hormone/genetics , Receptors, Thyroid Hormone/metabolism , Transcription Factor AP-1/deficiency , Transcription Factor AP-1/genetics