Results 1 - 20 of 56
1.
bioRxiv ; 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38585983

ABSTRACT

Cone-Rod Homeobox, encoded by CRX, is a transcription factor (TF) essential for the terminal differentiation and maintenance of mammalian photoreceptors. Structurally, CRX comprises an ordered DNA-binding homeodomain and an intrinsically disordered transcriptional effector domain. Although a handful of human variants in CRX have been shown to cause several different degenerative retinopathies with varying cone and rod predominance, as with most human disease genes the vast majority of observed CRX genetic variants are uncharacterized variants of uncertain significance (VUS). We performed a deep mutational scan (DMS) of nearly all possible single amino acid substitution variants in CRX, using an engineered cell-based transcriptional reporter assay. We measured the ability of each CRX missense variant to transactivate a synthetic fluorescent reporter construct in a pooled fluorescence-activated cell sorting assay and compared the activation strength of each variant to that of wild-type CRX to compute an activity score, identifying thousands of variants with altered transcriptional activity. We calculated a statistical confidence for each activity score derived from multiple independent measurements of each variant marked by unique sequence barcodes, curating a high-confidence list of nearly 2,000 variants with significantly altered transcriptional activity compared to wild-type CRX. We evaluated the performance of the DMS assay as a clinical variant classification tool using gold-standard classified human variants from ClinVar, and determined that activity scores could be used to identify pathogenic variants with high specificity. That this performance could be achieved using a synthetic reporter assay in a foreign cell type, even for a highly cell type-specific TF like CRX, suggests that this approach shows promise for DMS of other TFs that function in cell types that are not easily accessible. 
Per-position average activity scores closely aligned to a predicted structure of the ordered homeodomain and demonstrated position-specific residue requirements. The intrinsically disordered transcriptional effector domain, by contrast, displayed a qualitatively different pattern of substitution effects, following compositional constraints without specific residue position requirements in the peptide chain. The observed compositional constraints of the effector domain were consistent with the acidic exposure model of transcriptional activation. Together, the results of the CRX DMS identify molecular features of the CRX effector domain and demonstrate clinical utility for variant classification.
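
The scoring idea in this abstract (each variant compared with wild-type CRX, with confidence drawn from independently barcoded replicates) can be illustrated with a minimal sketch. The abstract does not give the formula, so the log2 ratio to the wild-type mean, the standard error across barcodes, and the function name `activity_score` are all assumptions made for illustration, not the authors' pipeline.

```python
import math
import statistics


def activity_score(variant_barcodes, wildtype_barcodes):
    """Return (score, standard_error) for one variant.

    Assumption: each list holds one positive reporter-activity value
    per barcode. The score is the mean log2 ratio to the wild-type mean.
    """
    wt_mean = statistics.mean(wildtype_barcodes)
    log_ratios = [math.log2(value / wt_mean) for value in variant_barcodes]
    score = statistics.mean(log_ratios)
    if len(log_ratios) > 1:
        sem = statistics.stdev(log_ratios) / math.sqrt(len(log_ratios))
    else:
        sem = float("nan")  # a single barcode gives no spread estimate
    return score, sem


# Hypothetical values: a variant with roughly half of wild-type activity.
score, sem = activity_score([0.5, 0.45, 0.55], [1.0, 1.1, 0.9])
```

Under this convention a score near 0 means wild-type-like activity and a negative score means reduced transactivation; the standard error across barcodes is one simple stand-in for the statistical confidence the abstract mentions.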

2.
Genome Res ; 34(2): 243-255, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38355306

ABSTRACT

Dozens of variants in the gene for the homeodomain transcription factor (TF) cone-rod homeobox (CRX) are linked with human blinding diseases that vary in their severity and age of onset. How different variants in this single TF alter its function in ways that lead to a range of phenotypes is unclear. We characterized the effects of human disease-causing variants on CRX cis-regulatory function by deploying massively parallel reporter assays (MPRAs) in mouse retina explants carrying knock-ins of two variants, one in the DNA-binding domain (p.R90W) and the other in the transcriptional effector domain (p.E168d2). The degree of reporter gene dysregulation in these mutant Crx retinas corresponds with their phenotypic severity. The two variants affect similar sets of enhancers, and p.E168d2 has distinct effects on silencers. Cis-regulatory elements (CREs) near cone photoreceptor genes are enriched for silencers that are derepressed in the presence of p.E168d2. Chromatin environments of CRX-bound loci are partially predictive of episomal MPRA activity, and distal elements whose accessibility increases later in retinal development are enriched for CREs with silencer activity. We identified a set of potentially pleiotropic regulatory elements that convert from silencers to enhancers in retinas that lack a functional CRX effector domain. Our findings show that phenotypically distinct variants in different domains of CRX have partially overlapping effects on its cis-regulatory function, leading to misregulation of similar sets of enhancers while having a qualitatively different impact on silencers.


Subjects
Homeodomain Proteins , Trans-Activators , Animals , Humans , Mice , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Regulatory Sequences, Nucleic Acid , Retina/metabolism , Retinal Cone Photoreceptor Cells/metabolism , Trans-Activators/genetics , Trans-Activators/metabolism , Transcription Factors/genetics
3.
PLoS Comput Biol ; 20(1): e1011802, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38227575

ABSTRACT

The effects of transcription factor binding sites (TFBSs) on the activity of a cis-regulatory element (CRE) depend on the local sequence context. In rod photoreceptors, binding sites for the transcription factor (TF) Cone-rod homeobox (CRX) occur in both enhancers and silencers, but the sequence context that determines whether CRX binding sites contribute to activation or repression of transcription is not understood. To investigate the context-dependent activity of CRX sites, we fit neural network-based models to the activities of synthetic CREs composed of photoreceptor TFBSs. The models revealed that CRX binding sites consistently make positive, independent contributions to CRE activity, while negative homotypic interactions between sites cause CREs composed of multiple CRX sites to function as silencers. The effects of negative homotypic interactions can be overcome by the presence of other TFBSs that either interact cooperatively with CRX sites or make independent positive contributions to activity. The context-dependent activity of CRX sites is thus determined by the balance between positive heterotypic interactions, independent contributions of TFBSs, and negative homotypic interactions. Our findings explain observed patterns of activity among genomic CRX-bound enhancers and silencers, and suggest that enhancers may require diverse TFBSs to overcome negative homotypic interactions between TFBSs.


Subjects
Trans-Activators , Transcription Factors , Transcription Factors/metabolism , Trans-Activators/metabolism , Homeodomain Proteins/genetics , Gene Expression Regulation , Binding Sites/genetics , Retina
4.
Commun Biol ; 6(1): 1151, 2023 11 13.
Article in English | MEDLINE | ID: mdl-37953348

ABSTRACT

The function of regulatory elements is highly dependent on the cellular context, and thus for understanding the function of elements associated with psychiatric diseases these would ideally be studied in neurons in a living brain. Massively Parallel Reporter Assays (MPRAs) are molecular genetic tools that enable functional screening of hundreds of predefined sequences in a single experiment. These assays have not yet been adapted to query specific cell types in vivo in a complex tissue like the mouse brain. Here, using a test-case 3'UTR MPRA library with genomic elements containing variants from autism patients, we developed a method to achieve reproducible measurements of element effects in vivo in a cell type-specific manner, using excitatory cortical neurons and striatal medium spiny neurons as test cases. This targeted technique should enable robust, functional annotation of genetic elements in the cellular contexts most relevant to psychiatric disease.


Subjects
Oligonucleotide Array Sequence Analysis , Regulatory Sequences, Nucleic Acid , Animals , Humans , Mice , Oligonucleotide Array Sequence Analysis/methods , 3' Untranslated Regions , Cerebral Cortex , Medium Spiny Neurons
5.
bioRxiv ; 2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37662358

ABSTRACT

Cis-regulatory elements (CREs) direct gene expression in health and disease, and models that can accurately predict their activities from DNA sequences are crucial for biomedicine. Deep learning represents one emerging strategy to model the regulatory grammar that relates CRE sequence to function. However, these models require training data on a scale that exceeds the number of CREs in the genome. We address this problem using active machine learning to iteratively train models on multiple rounds of synthetic DNA sequences assayed in live mammalian retinas. During each round of training the model actively selects sequence perturbations to assay, thereby efficiently generating informative training data. We iteratively trained a model that predicts the activities of sequences containing binding motifs for the photoreceptor transcription factor Cone-rod homeobox (CRX) using an order of magnitude less training data than current approaches. The model's internal confidence estimates of its predictions are reliable guides for designing sequences with high activity. The model correctly identified critical sequence differences between active and inactive sequences with nearly identical transcription factor binding sites, and revealed order and spacing preferences for combinations of motifs. Our results establish active learning as an effective method to train accurate deep learning models of cis-regulatory function after exhausting naturally occurring training examples in the genome.

6.
bioRxiv ; 2023 Dec 02.
Article in English | MEDLINE | ID: mdl-37292699

ABSTRACT

Dozens of variants in the photoreceptor-specific transcription factor (TF) CRX are linked with human blinding diseases that vary in their severity and age of onset. It is unclear how different variants in this single TF alter its function in ways that lead to a range of phenotypes. We examined the effects of human disease-causing variants on CRX cis-regulatory function by deploying massively parallel reporter assays (MPRAs) in live mouse retinas carrying knock-ins of two variants, one in the DNA binding domain (p.R90W) and the other in the transcriptional effector domain (p.E168d2). The degree of reporter gene dysregulation caused by the variants corresponds with their phenotypic severity. The two variants affect similar sets of enhancers, while p.E168d2 has stronger effects on silencers. Cis-regulatory elements (CREs) near cone photoreceptor genes are enriched for silencers that are de-repressed in the presence of p.E168d2. Chromatin environments of CRX-bound loci were partially predictive of episomal MPRA activity, and silencers were notably enriched among distal elements whose accessibility increases later in retinal development. We identified a set of potentially pleiotropic regulatory elements that convert from silencers to enhancers in retinas that lack a functional CRX effector domain. Our findings show that phenotypically distinct variants in different domains of CRX have partially overlapping effects on its cis-regulatory function, leading to misregulation of similar sets of enhancers, while having a qualitatively different impact on silencers.

7.
Genetics ; 224(3), 2023 Jul 06.
Article in English | MEDLINE | ID: mdl-37226217

ABSTRACT

Stochastic differences among clonal cells can initiate cell fate decisions in development or cause cell-to-cell differences in the responses to drugs or extracellular ligands. One hypothesis is that some of this phenotypic variability is caused by stochastic fluctuations in the activities of transcription factors (TFs). We tested this hypothesis in NIH3T3-CG cells using the response to Hedgehog signaling as a model cellular response. Here, we present evidence for the existence of distinct fast- and slow-responding substates in NIH3T3-CG cells. These two substates have distinct expression profiles, and fluctuations in the Prrx1 TF underlie some of the differences in expression and responsiveness between fast and slow cells. Our results show that fluctuations in TFs can contribute to cell-to-cell differences in Hedgehog signaling.


Subjects
Hedgehog Proteins , Transcription Factors , Animals , Mice , Transcription Factors/genetics , Transcription Factors/metabolism , Hedgehog Proteins/genetics , Hedgehog Proteins/metabolism , NIH 3T3 Cells , Gene Expression Regulation , Signal Transduction
8.
Science ; 380(6641): eabn7113, 2023 04 14.
Article in English | MEDLINE | ID: mdl-37053313

ABSTRACT

Postzygotic mutations (PZMs) begin to accrue in the human genome immediately after fertilization, but how and when PZMs affect development and lifetime health remain unclear. To study the origins and functional consequences of PZMs, we generated a multitissue atlas of PZMs spanning 54 tissue and cell types from 948 donors. Nearly half the variation in mutation burden among tissue samples can be explained by measured technical and biological effects, and 9% can be attributed to donor-specific effects. Through phylogenetic reconstruction of PZMs, we found that their type and predicted functional impact vary during prenatal development, across tissues, and through the germ cell life cycle. Thus, methods for interpreting effects across the body and the life span are needed to fully understand the consequences of genetic variants.


Subjects
DNA Mutational Analysis , Longevity , Zygote , Female , Humans , Longevity/genetics , Mutation , Phylogeny , RNA-Seq
9.
Nat Genet ; 55(2): 346-354, 2023 02.
Article in English | MEDLINE | ID: mdl-36635387

ABSTRACT

Massively parallel reporter gene assays are key tools in regulatory genomics but cannot be used to identify cell-type-specific regulatory elements without performing assays serially across different cell types. To address this problem, we developed a single-cell massively parallel reporter assay (scMPRA) to measure the activity of libraries of cis-regulatory sequences (CRSs) across multiple cell types simultaneously. We assayed a library of core promoters in a mixture of HEK293 and K562 cells and showed that scMPRA is a reproducible, highly parallel, single-cell reporter gene assay that detects cell-type-specific cis-regulatory activity. We then measured a library of promoter variants across multiple cell types in live mouse retinas and showed that subtle genetic variants can produce cell-type-specific effects on cis-regulatory activity. We anticipate that scMPRA will be widely applicable for studying the role of CRSs across diverse cell types.


Subjects
Genes, Reporter , HEK293 Cells , Animals , Humans , Mice , Gene Library , Genes, Reporter/genetics , Promoter Regions, Genetic , Retina/metabolism
10.
Genome Res ; 32(10): 1840-1851, 2022 10.
Article in English | MEDLINE | ID: mdl-36192170

ABSTRACT

Many transposable elements (TEs) contain transcription factor binding sites and are implicated as potential regulatory elements. However, TEs are rarely functionally tested for regulatory activity, which in turn limits our understanding of how TE regulatory activity has evolved. We systematically tested the human LTR18A subfamily for regulatory activity using a massively parallel reporter assay (MPRA) and found AP-1- and CEBP-related binding motifs as drivers of enhancer activity. Functional analysis of evolutionarily reconstructed ancestral sequences revealed that LTR18A elements have generally lost regulatory activity over time through sequence changes, with the largest effects occurring owing to mutations in the AP-1 and CEBP motifs. We observed that the two motifs are conserved at higher rates than expected based on neutral evolution. Finally, we identified LTR18A elements as potential enhancers in the human genome, primarily in epithelial cells. Together, our results provide a model for the origin, evolution, and co-option of TE-derived regulatory elements.


Subjects
Regulatory Sequences, Nucleic Acid , Transcription Factor AP-1 , Humans , Transcription Factor AP-1/genetics , DNA Transposable Elements/genetics , Genome, Human , Terminal Repeat Sequences/genetics , Evolution, Molecular , Enhancer Elements, Genetic
11.
Genome Biol ; 23(1): 221, 2022 10 17.
Article in English | MEDLINE | ID: mdl-36253868

ABSTRACT

BACKGROUND: We and others have suggested that pioneer activity - a transcription factor's (TF's) ability to bind and open inaccessible loci - is not a qualitative trait limited to a select class of pioneer TFs. We hypothesize that most TFs display pioneering activity that depends on the TF concentration and the motif content at their target loci. RESULTS: Here, we present a quantitative in vivo measure of pioneer activity that captures the relative difference in a TF's ability to bind accessible versus inaccessible DNA. The metric is based on experiments that use CUT&Tag to measure the binding of doxycycline-inducible TFs. For each location across the genome, we determine the concentration of doxycycline required for a TF to reach half-maximal occupancy; lower concentrations reflect higher affinity. We propose that the relative difference in a TF's affinity between ATAC-seq labeled accessible and inaccessible binding sites is a measure of its pioneer activity. We estimate binding affinities at tens of thousands of genomic loci for the endodermal TFs FOXA1 and HNF4A and show that HNF4A has stronger pioneer activity than FOXA1. We show that both FOXA1 and HNF4A display higher binding affinity at inaccessible sites with more copies of their respective motifs. The quantitative analysis of binding suggests different modes of binding for FOXA1, including an anti-cooperative mode of binding at certain accessible loci. CONCLUSIONS: Our results suggest that relative binding affinities are reasonable measures of pioneer activity and support the model wherein most TFs have some degree of context-dependent pioneer activity.
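
The metric described in this abstract (the doxycycline concentration at half-maximal occupancy, compared between accessible and inaccessible sites) can be sketched as follows. This is only an illustration under stated assumptions: the log-linear interpolation, the log2 ratio, and the names `half_max_dose` and `relative_affinity_gap` are hypothetical and are not taken from the paper.

```python
import math


def half_max_dose(doses, occupancy):
    """Interpolate, in log-dose space, the dose giving half-maximal occupancy.

    Assumes doses are positive and increasing, and that occupancy
    (for example a CUT&Tag signal at one locus) rises with dose.
    """
    half = max(occupancy) / 2.0
    points = list(zip(doses, occupancy))
    for (d0, s0), (d1, s1) in zip(points, points[1:]):
        if s0 <= half <= s1 and s1 > s0:
            frac = (half - s0) / (s1 - s0)
            log_dose = math.log(d0) + frac * (math.log(d1) - math.log(d0))
            return math.exp(log_dose)
    return None  # occupancy never crosses half-maximum in the tested range


def relative_affinity_gap(dose_accessible, dose_inaccessible):
    """log2 ratio of half-maximal doses (inaccessible over accessible).

    Hypothetical convention: values near 0 mean the TF binds inaccessible
    sites about as readily as accessible ones.
    """
    return math.log2(dose_inaccessible / dose_accessible)
```

For example, `relative_affinity_gap(10.0, 40.0)` compares a locus class that needs four times more doxycycline when inaccessible; on this reading, a TF with a smaller gap would show stronger pioneer activity.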


Subjects
DNA , Doxycycline , Binding Sites , DNA/metabolism , Genome , Genomics
12.
Cell Syst ; 13(4): 334-345.e5, 2022 04 20.
Article in English | MEDLINE | ID: mdl-35120642

ABSTRACT

Acidic activation domains are intrinsically disordered regions of the transcription factors that bind coactivators. The intrinsic disorder and low evolutionary conservation of activation domains have made it difficult to identify the sequence features that control activity. To address this problem, we designed thousands of variants in seven acidic activation domains and measured their activities with a high-throughput assay in human cell culture. We found that strong activation domain activity requires a balance between the number of acidic residues and aromatic and leucine residues. These findings motivated a predictor of acidic activation domains that scans the human proteome for clusters of aromatic and leucine residues embedded in regions of high acidity. This predictor identifies known activation domains and accurately predicts previously unidentified ones. Our results support a flexible acidic exposure model of activation domains in which the acidic residues solubilize hydrophobic motifs so that they can interact with coactivators. A record of this paper's transparent peer review process is included in the supplemental information.
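
The predictor described in this abstract scans protein sequences for clusters of aromatic and leucine residues embedded in acidic regions. A minimal sliding-window sketch of that idea is shown here; the window length and both thresholds are illustrative placeholders rather than the published parameters, and the function name is hypothetical.

```python
def scan_acidic_hydrophobic_windows(seq, window=40, max_net_charge=-8, min_wfyl=6):
    """Return (start, net_charge, wfyl_count) for each qualifying window.

    Assumptions: net charge is counted as (K + R) - (D + E), and the
    hydrophobic signal is the number of W, F, Y and L residues.
    The default thresholds are placeholders chosen for illustration.
    """
    hits = []
    for start in range(len(seq) - window + 1):
        tile = seq[start:start + window]
        net_charge = sum(tile.count(aa) for aa in "KR") - sum(tile.count(aa) for aa in "DE")
        wfyl = sum(tile.count(aa) for aa in "WFYL")
        if net_charge <= max_net_charge and wfyl >= min_wfyl:
            hits.append((start, net_charge, wfyl))
    return hits


# Toy sequence with a short window so the example is easy to check by hand.
hits = scan_acidic_hydrophobic_windows("DEFLDEWLDE", window=10, max_net_charge=-4, min_wfyl=3)
```

Requiring both conditions in the same window mirrors the balance the abstract describes: acidity alone or hydrophobic residues alone would not flag a region.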


Subjects
DNA-Binding Proteins , Transcription Factors , Amino Acid Sequence , DNA-Binding Proteins/genetics , Humans , Leucine/metabolism , Transcription Factors/metabolism , Transcriptional Activation
13.
Elife ; 11, 2022 01 05.
Article in English | MEDLINE | ID: mdl-34984978

ABSTRACT

The pioneer factor hypothesis (PFH) states that pioneer factors (PFs) are a subclass of transcription factors (TFs) that bind to and open inaccessible sites and then recruit non-pioneer factors (non-PFs) that activate batteries of silent genes. The PFH predicts that ectopic gene activation requires the sequential activity of qualitatively different TFs. We tested the PFH by expressing the endodermal PF FOXA1 and non-PF HNF4A in K562 lymphoblast cells. While co-expression of FOXA1 and HNF4A activated a burst of endoderm-specific gene expression, we found no evidence for a functional distinction between these two TFs. When expressed independently, both TFs bound and opened inaccessible sites, activated endodermal genes, and 'pioneered' for each other, although FOXA1 required fewer copies of its motif for binding. A subset of targets required both TFs, but the predominant mode of action at these targets did not conform to the sequential activity predicted by the PFH. From these results, we hypothesize an alternative to the PFH where 'pioneer activity' depends not on categorically different TFs but rather on the affinity of interaction between TF and DNA.


Cells only use a fraction of their genetic information to make the proteins they need. The rest is carefully packaged away and tightly bundled in structures called nucleosomes. This physically shields the DNA from being accessed by transcription factors, the molecular actors that can read genes and kickstart the protein production process. Effectively, the genetic sequences inside nucleosomes are being silenced. However, during development, transcription factors must overcome this nucleosome barrier and activate silent genes to program cells. The pioneer factor hypothesis describes how this may be possible: first, 'pioneer' transcription factors can bind to and 'open up' nucleosomes to make target genes accessible. Then, non-pioneer factors can access the genetic sequence and recruit cofactors that begin copying the now-exposed genetic information. The widely accepted theory is based on studies of two proteins (FOXA1, an archetypal pioneer factor, and HNF4A, a non-pioneer factor), but the predictions of the pioneer factor hypothesis have yet to be explicitly tested. To do so, Hansen et al. expressed FOXA1 and HNF4A, separately and together, in cells which do not usually make these proteins. They then assessed how the proteins could bind to DNA and impact gene accessibility and transcription. The experiments demonstrate that FOXA1 and HNF4A do not necessarily follow the two-step activation predicted by the pioneer factor hypothesis. When expressed independently, both transcription factors bound and opened inaccessible sites, activated target genes, and 'pioneered' for each other. Similar patterns were observed across the genome. The only notable distinction between the two factors was that FOXA1, the archetypal pioneering factor, required fewer copies of its target sequence to bind DNA than HNF4A. These findings led Hansen et al. to propose an alternative theory to the pioneer factor hypothesis which eliminates the categorical distinction between pioneer and non-pioneer factors. Overall, this work has implications for how biologists understand the way that transcription factors activate silent genes during development.


Subjects
Ectopic Gene Expression , Hepatocyte Nuclear Factor 3-alpha/genetics , Hepatocyte Nuclear Factor 4/genetics , Liver/metabolism , Hepatocyte Nuclear Factor 3-alpha/metabolism , Hepatocyte Nuclear Factor 4/metabolism , Humans , K562 Cells
14.
Genome Res ; 32(1): 85-96, 2022 01.
Article in English | MEDLINE | ID: mdl-34961747

ABSTRACT

A classical model of gene regulation is that enhancers provide specificity whereas core promoters provide a modular site for the assembly of the basal transcriptional machinery. However, examples of core promoter specificity have led to an alternate hypothesis in which specificity is achieved by core promoters with different sequence motifs that respond differently to genomic environments containing different enhancers and chromatin landscapes. To distinguish between these models, we measured the activities of hundreds of diverse core promoters in four different genomic locations and, in a complementary experiment, six different core promoters at thousands of locations across the genome. Although genomic locations had large effects on expression, the intrinsic activities of different classes of promoters were preserved across genomic locations, suggesting that core promoters are modular regulatory elements whose activities are independently scaled up or down by different genomic locations. This scaling of promoter activities is nonlinear and depends on the genomic location and the strength of the core promoter. Our results support the classical model of regulation in which diverse core promoter motifs set the intrinsic strengths of core promoters, which are then amplified or dampened by the activities of their genomic environments.


Subjects
Chromatin , Genomics , Chromatin/genetics , Gene Expression Regulation , Promoter Regions, Genetic
15.
Cancer Res Commun ; 1(3): 148-163, 2021 12.
Article in English | MEDLINE | ID: mdl-34957471

ABSTRACT

In cancer, missense mutations in the DNA-binding domain of TP53 are common. They abrogate canonical p53 activity and frequently confer gain-of-oncogenic function (GOF) through localization of transcriptionally active mutant p53 to non-canonical genes. We found that several recurring p53 mutations exhibit a sex difference in frequency in patients with glioblastoma (GBM). In vitro and in vivo analysis of three mutations, p53R172H, p53Y202C, and p53Y217C, revealed unique interactions between cellular sex and p53 GOF mutations that determined each mutation's ability to transform male versus female primary mouse astrocytes. These phenotypic differences were correlated with sex- and p53 mutation-specific patterns of genomic localization to the transcriptional start sites of upregulated genes belonging to core cancer pathways. The promoter regions of these genes exhibited a sex difference in enrichment for different transcription factor DNA-binding motifs. Together, our data establish a novel mechanism for sex-specific mutant p53 GOF activity in GBM with implications for all cancer.


Subjects
Glioblastoma , Tumor Suppressor Protein p53 , Animals , Mice , Female , Male , Tumor Suppressor Protein p53/genetics , Gain of Function Mutation , Neoplasm Recurrence, Local , Mutation , Glioblastoma/genetics , DNA
16.
Elife ; 10, 2021 09 06.
Article in English | MEDLINE | ID: mdl-34486522

ABSTRACT

Enhancers and silencers often depend on the same transcription factors (TFs) and are conflated in genomic assays of TF binding or chromatin state. To identify sequence features that distinguish enhancers and silencers, we assayed massively parallel reporter libraries of genomic sequences targeted by the photoreceptor TF cone-rod homeobox (CRX) in mouse retinas. Both enhancers and silencers contain more TF motifs than inactive sequences, but relative to silencers, enhancers contain motifs from a more diverse collection of TFs. We developed a measure of information content that describes the number and diversity of motifs in a sequence and found that, while both enhancers and silencers depend on CRX motifs, enhancers have higher information content. The ability of information content to distinguish enhancers and silencers targeted by the same TF illustrates how motif context determines the activity of cis-regulatory sequences.
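
The abstract describes a measure that captures both the number and the diversity of motifs in a sequence but does not state its formula. As a rough, hypothetical stand-in (not the authors' definition), the sketch below reports the motif count together with the Shannon entropy of the TF identities; the function name and the input format are assumptions.

```python
import math
from collections import Counter


def motif_count_and_diversity(motif_hits):
    """Return (number_of_motifs, shannon_entropy_in_bits).

    motif_hits is a list of TF names, one entry per motif occurrence
    found in a sequence (assumed to come from a separate motif scan).
    """
    counts = Counter(motif_hits)
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0
    entropy = sum(-(n / total) * math.log2(n / total) for n in counts.values())
    return total, entropy


# Hypothetical sequences: same motif count, different diversity.
homotypic = motif_count_and_diversity(["CRX", "CRX", "CRX", "CRX"])
mixed = motif_count_and_diversity(["CRX", "NRL", "CRX", "NRL"])
```

Both toy sequences contain four motifs, but only the mixed one has non-zero entropy, which is the kind of contrast the abstract draws between silencers and enhancers targeted by the same TF.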


Different cell types are established by activating and repressing the activity of specific sets of genes, a process controlled by proteins called transcription factors. Transcription factors work by recognizing and binding short stretches of DNA in parts of the genome called cis-regulatory sequences. A cis-regulatory sequence that increases the activity of a gene when bound by transcription factors is called an enhancer, while a sequence that causes a decrease in gene activity is called a silencer. To establish a cell type, a particular transcription factor will act on both enhancers and silencers that control the activity of different genes. For example, the transcription factor cone-rod homeobox (CRX) is critical for specifying different types of cells in the retina, and it acts on both enhancers and silencers. In rod photoreceptors, CRX activates rod genes by binding their enhancers, while repressing cone photoreceptor genes by binding their silencers. However, CRX always recognizes and binds to the same DNA sequence, known as its binding site, making it unclear why some cis-regulatory sequences bound to CRX act as silencers, while others act as enhancers. Friedman et al. sought to understand how enhancers and silencers, both bound by CRX, can have different effects on the genes they control. Since both enhancers and silencers contain CRX binding sites, the difference between the two must lie in the sequence of the DNA surrounding these binding sites. Using retinas that have been explanted from mice and kept alive in the laboratory, Friedman et al. tested the activity of thousands of CRX-binding sequences from the mouse genome. This showed that both enhancers and silencers have more copies of CRX-binding sites than sequences of the genome that are inactive. Additionally, the results revealed that enhancers have a diverse collection of binding sites for other transcription factors, while silencers do not. Friedman et al. developed a new metric they called information content, which captures the diverse combinations of different transcription factor binding sites that cis-regulatory sequences can have. Using this metric, Friedman et al. showed that it is possible to distinguish enhancers from silencers based on their information content. It is critical to understand how the DNA sequences of cis-regulatory regions determine their activity, because mutations in these regions of the genome can cause disease. However, since every person has thousands of benign mutations in cis-regulatory sequences, it is a challenge to identify specific disease-causing mutations, which are relatively rare. One long-term goal of models of enhancers and silencers, such as Friedman et al.'s information content model, is to understand how mutations can affect cis-regulatory sequences, and, in some cases, lead to disease.


Subjects
Photoreceptor Cells/physiology , Transcription Factors/metabolism , Animals , Binding Sites , Female , Male , Mice , Protein Binding , Retina/cytology , Retina/physiology , Transcription Factors/genetics
17.
Cell Rep ; 33(12): 108531, 2020 12 22.
Article in English | MEDLINE | ID: mdl-33357440

ABSTRACT

CELF6 is a CELF-RNA-binding protein, and thus part of a protein family with roles in human disease; however, its mRNA targets in the brain are largely unknown. Using cross-linking immunoprecipitation and sequencing (CLIP-seq), we define its CNS targets, which are enriched for 3' UTRs in synaptic protein-coding genes. Using a massively parallel reporter assay framework, we test the consequence of CELF6 expression on target sequences, with and without mutating putative binding motifs. Where CELF6 exerts an effect on sequences, it is largely to decrease RNA abundance, which is reversed by mutating UGU-rich motifs. This is also the case for CELF3-5, with a protein-dependent effect on magnitude. Finally, we demonstrate that targets are derepressed in CELF6-mutant mice, and at least two key CNS proteins, FOS and FGF13, show altered protein expression levels and localization. Our work finds, in addition to previously identified roles in splicing, that CELF6 is associated with repression of its CNS targets via the 3' UTR in vivo.


Subjects
CELF Proteins/metabolism , RNA, Messenger/metabolism , Synapses/metabolism , 3' Untranslated Regions , Animals , Brain/metabolism , CELF Proteins/genetics , Cell Line, Tumor , Female , Humans , Male , Mice , RNA, Messenger/genetics , Ribosomes/genetics , Ribosomes/metabolism
18.
Elife ; 9, 2020 02 11.
Article in English | MEDLINE | ID: mdl-32043966

ABSTRACT

In embryonic stem cells (ESCs), a core transcription factor (TF) network establishes the gene expression program necessary for pluripotency. To address how interactions between four key TFs contribute to cis-regulation in mouse ESCs, we assayed two massively parallel reporter assay (MPRA) libraries composed of binding sites for SOX2, POU5F1 (OCT4), KLF4, and ESRRB. Comparisons between synthetic cis-regulatory elements and genomic sequences with comparable binding site configurations revealed some aspects of a regulatory grammar. The expression of synthetic elements is influenced by both the number and arrangement of binding sites. This grammar plays only a small role for genomic sequences, as the relative activities of genomic sequences are best explained by the predicted occupancy of binding sites, regardless of binding site identity and positioning. Our results suggest that the effects of transcription factor binding sites (TFBS) are influenced by the order and orientation of sites, but that in the genome the overall occupancy of TFs is the primary determinant of activity.


Transcription factors are proteins that flip genetic switches; their role is to control when and where genes are active. They do this by binding to short stretches of DNA called cis-regulatory sequences. Each sequence can have several binding sites for different transcription factors, but it is largely unclear whether the transcription factors binding to the same regulatory sequence actually work together. It is possible that each transcription factor may work independently and there only needs to be a critical mass of transcription factors bound to throw the genetic switch. If this is the case, the most important features of a cis-regulatory sequence should be the number of binding sites it contains, and how tightly the transcription factors bind to those sites. The more transcription factors and the more strongly they bind, the more active the gene should be. An alternative option is that certain transcription factors may work better together, enhancing each other's effects such that the total effect is more than the sum of its parts. If this is true, the order, orientation and spacing of the binding sites within a sequence should matter more than the number. One way to distinguish between these possibilities is to study mouse embryonic stem cells, which have a core set of four transcription factors. Looking directly at a real genome, however, can be confusing and it is difficult to measure the effects of different cis-regulatory sequences because genes differ in so many other ways. To tackle this problem, King et al. created a synthetic set of cis-regulatory sequences based on the four core transcription factors found in mouse stem cells. The synthetic set had every combination of two, three or four of the binding sites, with each site either facing forwards or backwards along the DNA strand. King et al. attached each of the synthetic cis-regulatory sequences to a reporter gene to find out how well each sequence performed.
This revealed that the cis-regulatory sequences with the most binding sites and the tightest binding affinities work best, suggesting that transcription factors mainly work independently. There was evidence of some interaction between some transcription factors, because, of the synthetic sequences with four binding sites, some worked better than others, and there were patterns in the most effective binding site combinations. However, these effects were small and when King et al. went on to test sequences from the real mouse genome, the most important factor by far was the number of binding sites. Synthetic libraries of DNA sequences allow researchers to examine gene regulation more clearly than is possible in real genomes. Yet this approach does have its limitations and it is impossible to capture every type of cis-regulatory sequence in one library. The next step to extend this work is to combine the two approaches, taking sequences from the real genome and manipulating them one by one. This could help to unravel the rules that govern how cis-regulatory sequences work in real cells.
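
The synthetic library design described above can be illustrated with a short enumeration. This is only a sketch under stated assumptions, not the authors' design code: it assumes that each factor's site appears at most once per element and that the order of sites matters (the abstract reports effects of order and orientation). The function name `enumerate_elements` and the `+`/`-` strand labels are hypothetical.

```python
from itertools import permutations, product

# The four pluripotency factors named in the abstract.
FACTORS = ("SOX2", "POU5F1", "KLF4", "ESRRB")

def enumerate_elements(factors=FACTORS, sizes=(2, 3, 4)):
    """List synthetic elements as tuples of (factor, strand) pairs.

    Assumption: each factor is used at most once per element,
    and site order matters, so ordered arrangements are enumerated.
    """
    elements = []
    for k in sizes:
        for order in permutations(factors, k):        # ordered choice of k sites
            for strands in product("+-", repeat=k):   # each site forward or reverse
                elements.append(tuple(zip(order, strands)))
    return elements

library = enumerate_elements()
print(len(library))  # 48 + 192 + 384 = 624 under these assumptions
```

Under these assumptions the count grows quickly with the number of sites, which is one reason a pooled, massively parallel readout suits this kind of design.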


Subjects
Embryonic Stem Cells/metabolism , Transcriptional Regulatory Elements , Transcription Factors/metabolism , Animals , Kruppel-Like Factor 4 , Mice
19.
Nat Biotechnol ; 2018 Nov 19.
Article in English | MEDLINE | ID: mdl-30451991

ABSTRACT

A gene's position in the genome can profoundly affect its expression because regional differences in chromatin modulate the activity of locally acting cis-regulatory sequences (CRSs). Here we study how CRSs and regional chromatin act in concert on a genome-wide scale. We present a massively parallel reporter gene assay that measures the activities of hundreds of different CRSs, each integrated at many specific genomic locations. Although genome location strongly affected CRS activity, the relative strengths of CRSs were maintained at all chromosomal locations. The intrinsic activities of CRSs also correlated with their activities in plasmid-based assays. We explain our data with a quantitative model in which expression levels are set by independent contributions from local CRSs and the regional chromatin environment, rather than by more complex sequence- or protein-specific interactions between these two factors. The methods we present will help investigators determine when regulatory information is integrated in a modular fashion and when regulatory sequences interact in more complex ways.
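
The idea of independent contributions from a CRS and its genomic location can be sketched as an additive model in log space. The abstract does not give the authors' exact model or fitting procedure, so everything here is an assumption for illustration: the function names `fit_independent_model` and `predict` are hypothetical, the matrix is assumed complete (every CRS measured at every location), and the toy numbers are invented.

```python
import math

def fit_independent_model(expr):
    """Fit log2(expr[i][j]) ~ grand + crs[i] + loc[j] by row and column means.

    expr[i][j] is the (positive) expression of CRS i at location j.
    For a complete matrix, these means are the least-squares solution.
    """
    logs = [[math.log2(v) for v in row] for row in expr]
    n, m = len(logs), len(logs[0])
    grand = sum(sum(row) for row in logs) / (n * m)
    crs = [sum(row) / m - grand for row in logs]
    loc = [sum(logs[i][j] for i in range(n)) / n - grand for j in range(m)]
    return grand, crs, loc

def predict(grand, crs, loc, i, j):
    """Predicted expression of CRS i at location j under independence."""
    return 2 ** (grand + crs[i] + loc[j])

# Invented toy data: three CRS strengths times two location effects.
toy = [[c * l for l in (1.0, 8.0)] for c in (1.0, 2.0, 4.0)]
grand, crs, loc = fit_independent_model(toy)
```

In this toy case the data are exactly multiplicative, so the fitted model reproduces every entry and the ranking of CRSs is the same at both locations. With real measurements, the residuals of such a fit would indicate how much CRS-by-location interaction remains.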

20.
Cell Syst ; 6(4): 444-455.e6, 2018 Apr 25.
Article in English | MEDLINE | ID: mdl-29525204

ABSTRACT

Transcriptional activation domains are essential for gene regulation, but their intrinsic disorder and low primary sequence conservation have made it difficult to identify the amino acid composition features that underlie their activity. Here, we describe a rational mutagenesis scheme that deconvolves the function of four activation domain sequence features (acidity, hydrophobicity, intrinsic disorder, and short linear motifs) by quantifying the activity of thousands of variants in vivo and simulating their conformational ensembles using an all-atom Monte Carlo approach. Our results with a canonical activation domain from the Saccharomyces cerevisiae transcription factor Gcn4 reconcile existing observations into a unified model of its function: the intrinsic disorder and acidic residues keep two hydrophobic motifs from driving collapse. Instead, the most-active variants keep their aromatic residues exposed to the solvent. Our results illustrate how the function of intrinsically disordered proteins can be revealed by high-throughput rational mutagenesis.


Subjects
Basic-Leucine Zipper Transcription Factors/chemistry , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae/genetics , Transcription Factors/chemistry , Basic-Leucine Zipper Transcription Factors/physiology , Catalytic Domain , Gene Expression Regulation , Hydrogen-Ion Concentration , Molecular Models , Monte Carlo Method , Site-Directed Mutagenesis , Protein Domains , Saccharomyces cerevisiae Proteins/physiology , Protein Sequence Analysis , Transcription Factors/physiology