Results 1 - 20 of 23
1.
Nat Commun ; 14(1): 2709, 2023 05 15.
Article in English | MEDLINE | ID: mdl-37188663

ABSTRACT

Narcolepsy type 1 (NT1) is caused by a loss of hypocretin/orexin transmission. Risk factors include pandemic 2009 H1N1 influenza A infection and immunization with Pandemrix®. Here, we dissect disease mechanisms and interactions with environmental triggers in a multi-ethnic sample of 6,073 cases and 84,856 controls. We fine-mapped GWAS signals within HLA (DQ0602, DQB1*03:01 and DPB1*04:02) and discovered seven novel associations (CD207, NAB1, IKZF4-ERBB3, CTSC, DENND1B, SIRPG, PRF1). Significant signals at TRA and DQB1*06:02 loci were found in 245 vaccination-related cases, who also shared polygenic risk. T cell receptor associations in NT1 modulated TRAJ*24, TRAJ*28 and TRBV*4-2 chain-usage. Partitioned heritability and immune cell enrichment analyses found genetic signals to be driven by dendritic and helper T cells. Lastly, comorbidity analysis using data from FinnGen suggests shared effects between NT1 and other autoimmune diseases. NT1 genetic variants shape autoimmunity and response to environmental triggers, including influenza A infection and immunization with Pandemrix®.


Subjects
Autoimmune Diseases; Influenza A Virus, H1N1 Subtype; Influenza Vaccines; Influenza, Human; Narcolepsy; Humans; Autoimmunity/genetics; Influenza, Human/epidemiology; Influenza, Human/genetics; Influenza A Virus, H1N1 Subtype/genetics; Autoimmune Diseases/epidemiology; Autoimmune Diseases/genetics; Influenza Vaccines/adverse effects; Narcolepsy/chemically induced; Narcolepsy/genetics
2.
Genome Res ; 29(2): 171-183, 2019 02.
Article in English | MEDLINE | ID: mdl-30622120

ABSTRACT

Despite much research, our understanding of the architecture and cis-regulatory elements of human promoters is still lacking. Here, we devised a high-throughput assay to quantify the activity of approximately 15,000 fully designed sequences that we integrated and expressed from a fixed location within the human genome. We used this method to investigate thousands of native promoters and preinitiation complex (PIC) binding regions followed by in-depth characterization of the sequence motifs underlying promoter activity, including core promoter elements and TF binding sites. We find that core promoters drive transcription mostly unidirectionally and that sequences originating from promoters exhibit stronger activity than those originating from enhancers. By testing multiple synthetic configurations of core promoter elements, we dissect the motifs that positively and negatively regulate transcription as well as the effect of their combinations and distances, including a 10-bp periodicity in the optimal distance between the TATA and the initiator. By comprehensively screening 133 TF binding sites, we find that in contrast to core promoters, TF binding sites maintain similar activity levels in both orientations, supporting a model by which divergent transcription is driven by two distinct unidirectional core promoters sharing bidirectional TF binding sites. Finally, we find a striking agreement between the effect of binding site multiplicity of individual TFs in our assay and their tendency to appear in homotypic clusters throughout the genome. Overall, our study systematically assays the elements that drive expression in core and proximal promoter regions and sheds light on organization principles of regulatory regions in the human genome.


Subjects
Promoter Regions, Genetic; Transcription, Genetic; Binding Sites; Enhancer Elements, Genetic; Gene Expression Regulation; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Nucleosomes/chemistry; Sequence Analysis, DNA; TATA Box; Transcription Factors/metabolism
3.
Cell ; 175(2): 544-557.e16, 2018 10 04.
Article in English | MEDLINE | ID: mdl-30245013

ABSTRACT

A major challenge in genetics is to identify genetic variants driving natural phenotypic variation. However, current methods of genetic mapping have limited resolution. To address this challenge, we developed a CRISPR-Cas9-based high-throughput genome editing approach that can introduce thousands of specific genetic variants in a single experiment. This enabled us to study the fitness consequences of 16,006 natural genetic variants in yeast. We identified 572 variants with significant fitness differences in glucose media; these are highly enriched in promoters, particularly in transcription factor binding sites, while only 19.2% affect amino acid sequences. Strikingly, nearby variants nearly always favor the same parent's alleles, suggesting that lineage-specific selection is often driven by multiple clustered variants. In sum, our genome editing approach reveals the genetic architecture of fitness variation at single-base resolution and could be adapted to measure the effects of genome-wide genetic variation in any screen for cell survival or cell-sortable markers.


Subjects
Gene Editing/methods; High-Throughput Nucleotide Sequencing/methods; Saccharomyces cerevisiae/genetics; CRISPR-Cas Systems; Chromosome Mapping; Clustered Regularly Interspaced Short Palindromic Repeats/genetics; Genetic Variation/genetics; Genetic Vectors; Genome; Yeasts/genetics
4.
PLoS Comput Biol ; 13(8): e1005629, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28771616

ABSTRACT

Quantification of cell-free DNA (cfDNA) in circulating blood derived from a transplanted organ is a powerful approach to monitoring post-transplant injury. Genome transplant dynamics (GTD) quantifies donor-derived cfDNA (dd-cfDNA) by taking advantage of single-nucleotide polymorphisms (SNPs) distributed across the genome to discriminate donor and recipient DNA molecules. In its current implementation, GTD requires genotyping of both the transplant recipient and donor. However, in practice, donor genotype information is often unavailable. Here, we address this issue by developing an algorithm that estimates dd-cfDNA levels in the absence of a donor genotype. Our algorithm predicts heart and lung allograft rejection with an accuracy that is similar to conventional GTD. We furthermore refined the algorithm to handle closely related recipients and donors, a scenario that is common in bone marrow and kidney transplantation. We show that it is possible to estimate dd-cfDNA in bone marrow transplant patients that are unrelated or that are siblings of the donors, using a hidden Markov model (HMM) of identity-by-descent (IBD) states along the genome. Last, we demonstrate that comparing dd-cfDNA to the proportion of donor DNA in white blood cells can differentiate between relapse and the onset of graft-versus-host disease (GVHD). These methods alleviate some of the barriers to the implementation of GTD, which will further widen its clinical application.
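
As a rough illustration of the SNP-based idea described in this abstract, the sketch below estimates the donor-derived fraction in the conventional setting where both genotypes are known. It is not the authors' algorithm (which works without a donor genotype); the function name `donor_fraction` and the input layout are hypothetical, and it assumes only SNPs where recipient and donor are homozygous for different alleles, so every read carrying the donor allele comes from donor DNA.

```python
def donor_fraction(snp_counts):
    """Estimate the donor-derived cfDNA fraction.

    snp_counts: iterable of (donor_allele_reads, total_reads) pairs, one per
    informative SNP. Assumption: the recipient is homozygous for one allele
    and the donor homozygous for the other at every SNP supplied.
    """
    donor_reads = 0
    total_reads = 0
    for donor_allele_reads, reads in snp_counts:
        donor_reads += donor_allele_reads
        total_reads += reads
    if total_reads == 0:
        raise ValueError("no reads at informative SNPs")
    # Pool reads across SNPs rather than averaging per-SNP ratios,
    # so that deeply covered sites carry proportionally more weight.
    return donor_reads / total_reads


if __name__ == "__main__":
    # Two made-up SNPs with 100 reads each.
    print(donor_fraction([(2, 100), (3, 100)]))
```

Sequencing error and heterozygous donor sites are ignored here; a practical estimator would need to model both.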


Subjects
DNA/analysis; Genotyping Techniques/methods; Transplantation; Bone Marrow/chemistry; DNA/classification; DNA/genetics; Female; Genotype; Graft Rejection/prevention & control; Humans; Male; Models, Statistical; Sequence Analysis, DNA; Tissue Donors; Transplants/chemistry
6.
Genome Res ; 27(1): 87-94, 2017 01.
Article in English | MEDLINE | ID: mdl-27965290

ABSTRACT

Transcription factors (TFs) are key mediators that propagate extracellular and intracellular signals through to changes in gene expression profiles. However, the rules by which promoters decode the amount of active TF into target gene expression are not well understood. To determine the mapping between promoter DNA sequence, TF concentration, and gene expression output, we have conducted in budding yeast a large-scale measurement of the activity of thousands of designed promoters at six different levels of TF. We observe that maximum promoter activity is determined by TF concentration and not by the number of binding sites. Surprisingly, the addition of an activator site often reduces expression. A thermodynamic model that incorporates competition between neighboring binding sites for a local pool of TF molecules explains this behavior and accurately predicts both absolute expression and the amount by which addition of a site increases or reduces expression. Taken together, our findings support a model in which neighboring binding sites interact competitively when TF is limiting but otherwise act additively.
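
For orientation, the simplest thermodynamic description of a single binding site can be sketched as below. This is only the non-competitive baseline, assuming one site with an effective dissociation constant `k_d` (a hypothetical parameter); the model in the abstract additionally lets neighboring sites compete for a local pool of TF molecules, which this sketch does not attempt to reproduce.

```python
def site_occupancy(tf_conc, k_d):
    """Equilibrium probability that a single site is bound.

    tf_conc: free TF concentration (arbitrary units, same units as k_d).
    k_d: effective dissociation constant of the site (assumed value).
    """
    if tf_conc < 0 or k_d <= 0:
        raise ValueError("tf_conc must be >= 0 and k_d must be > 0")
    # Standard single-site binding isotherm: c / (c + K_d).
    return tf_conc / (tf_conc + k_d)


if __name__ == "__main__":
    # Six illustrative TF levels, echoing the six levels in the abstract;
    # the numeric values themselves are invented.
    for level in (0.0, 0.25, 0.5, 1.0, 2.0, 4.0):
        print(level, round(site_occupancy(level, k_d=1.0), 3))
```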


Subjects
DNA-Binding Proteins/genetics; Gene Expression Regulation/genetics; Promoter Regions, Genetic; Transcription Factors/genetics; Base Sequence; Binding Sites; Chromatin Immunoprecipitation; Gene Regulatory Networks/genetics; Saccharomyces cerevisiae/genetics
7.
Nat Genet ; 48(9): 995-1002, 2016 09.
Article in English | MEDLINE | ID: mdl-27479906

ABSTRACT

In each individual, a highly diverse T cell receptor (TCR) repertoire interacts with peptides presented by major histocompatibility complex (MHC) molecules. Despite extensive research, it remains controversial whether germline-encoded TCR-MHC contacts promote TCR-MHC specificity and, if so, whether differences exist in TCR V gene compatibilities with different MHC alleles. We applied expression quantitative trait locus (eQTL) mapping to test for associations between genetic variation and TCR V gene usage in a large human cohort. We report strong trans associations between variation in the MHC locus and TCR V gene usage. Fine-mapping of the association signals identifies specific amino acids from MHC genes that bias V gene usage, many of which contact or are spatially proximal to the TCR or peptide in the TCR-peptide-MHC complex. Hence, these MHC variants, several of which are linked to autoimmune diseases, can directly affect TCR-MHC interaction. These results provide the first examples of trans-QTL effects mediated by protein-protein interactions and are consistent with intrinsic TCR-MHC specificity.


Subjects
Genetic Variation/genetics; Immunoglobulin Variable Region/metabolism; Major Histocompatibility Complex/physiology; Quantitative Trait Loci; Receptors, Antigen, T-Cell/metabolism; Cohort Studies; High-Throughput Nucleotide Sequencing; Humans; Immunoglobulin Variable Region/genetics; Receptors, Antigen, T-Cell/genetics
9.
PLoS Genet ; 11(4): e1005147, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25875337

ABSTRACT

The 3' end genomic region encodes a wide range of regulatory processes, including mRNA stability, 3' end processing and translation. Here, we systematically investigate the sequence determinants of 3' end mediated expression control by measuring the effect of 13,000 designed 3' end sequence variants on constitutive expression levels in yeast. By including a high resolution scanning mutagenesis of more than 200 native 3' end sequences in this designed set, we found that most mutations had only a mild effect on expression, and that the vast majority (~90%) of mutations with a strong effect localized to a single positive TA-rich element, similar to a previously described 3' end processing efficiency element, and resulted in an up to ten-fold decrease in expression. Measurements of 3' UTR lengths revealed that these mutations result in mRNAs with aberrantly long 3' UTRs, confirming the role of this element in 3' end processing. Interestingly, we found that other sequence elements that were previously described in the literature to be part of the polyadenylation signal had a minor effect on expression. We further characterize the sequence specificities of the TA-rich element using additional synthetic 3' end sequences and show that its activity is sensitive to single base pair mutations and strongly depends on the A/T content of the surrounding sequences. Finally, using a computational model, we show that the strength of this element in native 3' end sequences can explain some of their measured expression variability (R = 0.41). Together, our results emphasize the importance of efficient 3' end processing for endogenous protein levels and contribute to an improved understanding of the sequence elements involved in this process.


Subjects
3' Untranslated Regions; Gene Expression Regulation, Fungal; Yeasts/genetics; Genome, Fungal; RNA, Messenger/genetics; RNA, Messenger/metabolism; Yeasts/metabolism
10.
Genome Res ; 25(7): 1018-29, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25762553

ABSTRACT

Binding of transcription factors (TFs) to regulatory sequences is a pivotal step in the control of gene expression. Despite many advances in the characterization of sequence motifs recognized by TFs, our ability to quantitatively predict TF binding to different regulatory sequences is still limited. Here, we present a novel experimental assay termed BunDLE-seq that provides quantitative measurements of TF binding to thousands of fully designed sequences of 200 bp in length within a single experiment. Applying this binding assay to two yeast TFs, we demonstrate that sequences outside the core TF binding site profoundly affect TF binding. We show that TF-specific models based on the sequence or DNA shape of the regions flanking the core binding site are highly predictive of the measured differential TF binding. We further characterize the dependence of TF binding, accounting for measurements of single and co-occurring binding events, on the number and location of binding sites and on the TF concentration. Finally, by coupling our in vitro TF binding measurements, and another application of our method probing nucleosome formation, to in vivo expression measurements carried out with the same template sequences serving as promoters, we offer insights into mechanisms that may determine the different expression outcomes observed. Our assay thus paves the way to a more comprehensive understanding of TF binding to regulatory sequences and allows the characterization of TF binding determinants within and outside of core binding sites.


Subjects
Binding Sites; Transcription Factors/metabolism; Computational Biology/methods; Nucleosomes/metabolism; Poly A; Poly T; Protein Binding; Regulatory Sequences, Nucleic Acid; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Thermodynamics
11.
Genome Res ; 24(10): 1698-706, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25030889

ABSTRACT

Genetically identical cells exhibit large variability (noise) in gene expression, with important consequences for cellular function. Although the amount of noise decreases with and is thus partly determined by the mean expression level, the extent to which different promoter sequences can deviate away from this trend is not fully known. Here, we present a high-throughput method for measuring promoter-driven noise for thousands of designed synthetic promoters in parallel. We use it to investigate how promoters encode different noise levels and find that the noise levels of promoters with similar mean expression levels can vary more than one order of magnitude, with nucleosome-disfavoring sequences resulting in lower noise and more transcription factor binding sites resulting in higher noise. We propose a kinetic model of gene expression that takes into account the nonspecific DNA binding and one-dimensional sliding along the DNA, which occurs when transcription factors search for their target sites. We show that this assumption can improve the prediction of the mean-independent component of expression noise for our designed promoter sequences, suggesting that a transcription factor target search may affect gene expression noise. Consistent with our findings in designed promoters, we find that binding-site multiplicity in native promoters is associated with higher expression noise. Overall, our results demonstrate that small changes in promoter DNA sequence can tune noise levels in a manner that is predictable and partly decoupled from effects on the mean expression levels. These insights may assist in designing promoters with desired noise levels.
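
To make the notion of expression noise concrete, the sketch below computes it as the squared coefficient of variation (variance divided by squared mean) of single-cell measurements. That definition is a common convention and an assumption here, since the abstract does not state which measure was used; the function name `expression_noise` is hypothetical.

```python
from statistics import fmean, pvariance


def expression_noise(values):
    """Noise as squared coefficient of variation: variance / mean**2.

    values: per-cell expression measurements for one promoter.
    """
    mean = fmean(values)
    if mean <= 0:
        raise ValueError("mean expression must be positive")
    # Dividing by mean**2 makes the measure dimensionless, which is what
    # allows promoters with different mean levels to be compared.
    return pvariance(values) / mean ** 2


if __name__ == "__main__":
    # Two made-up promoters with the same mean but different spread.
    print(expression_noise([9.0, 10.0, 11.0]))
    print(expression_noise([5.0, 10.0, 15.0]))
```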


Subjects
Computational Biology/methods; DNA/metabolism; Gene Expression; Promoter Regions, Genetic; Saccharomyces cerevisiae/genetics; Binding Sites; Genes, Fungal; Linear Models; Molecular Sequence Data; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism; Transcription Factors/metabolism
12.
Genome Res ; 23(11): 1928-37, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23950146

ABSTRACT

The Gene Promoter Expression Prediction challenge consisted of predicting gene expression from promoter sequences in a previously unknown experimentally generated data set. The challenge was presented to the community in the framework of the sixth Dialogue for Reverse Engineering Assessments and Methods (DREAM6), a community effort to evaluate the status of systems biology modeling methodologies. Nucleotide-specific promoter activity was obtained by measuring fluorescence from promoter sequences fused upstream of a gene for yellow fluorescent protein and inserted in the same genomic site of the yeast Saccharomyces cerevisiae. Twenty-one teams submitted results predicting the expression levels of 53 different promoters from yeast ribosomal protein genes. Analysis of participant predictions shows that accurate values for low-expressed and mutated promoters were difficult to obtain, although in the latter case, only when the mutation induced a large change in promoter activity compared to the wild-type sequence. As in previous DREAM challenges, we found that aggregation of participant predictions provided robust results, but did not fare better than the three best algorithms. Finally, this study not only provides a benchmark for the assessment of methods predicting activity of a specific set of promoters from their sequence, but it also shows that the top performing algorithm, which used machine-learning approaches, can be improved by the addition of biological features such as transcription factor binding sites.


Subjects
Crowdsourcing; Gene Expression; Promoter Regions, Genetic; Ribosomal Proteins/genetics; Ribosomes/genetics; Saccharomyces cerevisiae/genetics; Algorithms; Binding Sites/genetics; Gene Expression Profiling; Gene Expression Regulation, Fungal; Gene Regulatory Networks; Genes, Fungal; Models, Genetic; Mutation; Regulatory Elements, Transcriptional; Ribosomes/metabolism; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism; Systems Biology
13.
Proc Natl Acad Sci U S A ; 110(30): E2792-801, 2013 Jul 23.
Article in English | MEDLINE | ID: mdl-23832786

ABSTRACT

The 5'-untranslated region (5'-UTR) of mRNAs contains elements that affect expression, yet the rules by which these regions exert their effect are poorly understood. Here, we studied the impact of 5'-UTR sequences on protein levels in yeast, by constructing a large-scale library of mutants that differ only in the 10 bp preceding the translational start site of a fluorescent reporter. Using a high-throughput sequencing strategy, we obtained highly accurate measurements of protein abundance for over 2,000 unique sequence variants. The resulting pool spanned an approximately sevenfold range of protein levels, demonstrating the powerful consequences of sequence manipulations of even 1-10 nucleotides immediately upstream of the start codon. We devised computational models that predicted over 70% of the measured expression variability in held-out sequence variants. Notably, a combined model of the most prominent features successfully explained protein abundance in an additional, independently constructed library, whose nucleotide composition differed greatly from the library used to parameterize the model. Our analysis reveals the dominant contribution of the start codon context at positions -3 to -1, mRNA secondary structure, and out-of-frame upstream AUGs (uAUGs) to phenotypic diversity, thereby advancing our understanding of how protein levels are modulated by 5'-UTR sequences, and paving the way toward predictably tuning protein expression through manipulations of 5'-UTRs.
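
Two of the sequence features named in this abstract, out-of-frame upstream AUGs and the start codon context at positions -3 to -1, can be extracted with a few lines of code. The sketch below is illustrative only and is not the authors' model; the function names are hypothetical, and it assumes the input is the 5'-UTR given 5' to 3', ending immediately before the main start codon.

```python
def out_of_frame_uaugs(utr):
    """Return offsets (negative, relative to the main start codon) of
    upstream AUGs that are out of frame with the main start codon.

    utr: 5'-UTR sequence (DNA or RNA letters), ending right before the
    main AUG. An offset of -5 means the uAUG starts 5 nt upstream.
    """
    seq = utr.upper().replace("U", "T")
    n = len(seq)
    offsets = []
    for i in range(n - 2):
        if seq[i:i + 3] == "ATG":
            distance = n - i  # nucleotides from uAUG start to main AUG start
            if distance % 3 != 0:  # not a multiple of 3: different frame
                offsets.append(-distance)
    return offsets


def start_context(utr):
    """Return the nucleotides at positions -3 to -1 before the start codon."""
    return utr.upper().replace("U", "T")[-3:]


if __name__ == "__main__":
    example_utr = "AAATGCC"  # made-up sequence
    print(out_of_frame_uaugs(example_utr), start_context(example_utr))
```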


Subjects
5' Untranslated Regions; Fungal Proteins/metabolism; Saccharomyces cerevisiae/metabolism; Base Sequence; Codon, Initiator; DNA Primers; Fungal Proteins/genetics; Nucleic Acid Conformation; RNA, Messenger/genetics; Saccharomyces cerevisiae/genetics
14.
PLoS Comput Biol ; 9(3): e1002934, 2013.
Article in English | MEDLINE | ID: mdl-23505350

ABSTRACT

A full understanding of gene regulation requires an understanding of the contributions that the various regulatory regions make to gene expression. Although it is well established that sequences downstream of the main promoter can affect expression, our understanding of the scale of this effect and how it is encoded in the DNA is limited. Here, to measure the effect of native S. cerevisiae 3' end sequences on expression, we constructed a library of 85 fluorescent reporter strains that differ only in their 3' end region. Notably, despite being driven by the same strong promoter, our library spans a continuous twelve-fold range of expression values. These measurements correlate with endogenous mRNA levels, suggesting that the 3' end contributes to constitutive differences in mRNA levels. We used deep sequencing to map the 3' UTR ends of our strains and show that determination of polyadenylation sites is intrinsic to the local 3' end sequence. Following polyadenylation mapping with sequence analysis, we found that increased A/T content upstream of the main polyadenylation site correlates with higher expression, both in the library and genome-wide, suggesting that native genes differ by the encoded efficiency of 3' end processing. Finally, we use single-cell fluorescence measurements at different promoter activation levels to show that 3' end sequences modulate protein expression dynamics differently than promoters, by predominantly affecting the size of protein production bursts as opposed to the frequency at which these bursts occur. Altogether, our results lead to a more complete understanding of gene regulation by demonstrating that 3' end regions have a unique and sequence-dependent effect on gene expression.
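
The A/T-content feature mentioned in this abstract is straightforward to compute. The sketch below measures the A/T fraction in a window immediately upstream of a polyadenylation site; the function name, the 0-based `site` coordinate, and the default window of 20 nt are all assumptions made for illustration, as the abstract does not give the window the authors used.

```python
def at_content_upstream(seq, site, window=20):
    """Fraction of A/T bases in the `window` nt immediately upstream of `site`.

    seq: DNA sequence of the 3' end region (5' to 3').
    site: 0-based index of the polyadenylation site within seq; the window
          covers seq[site - window:site] and is clipped at the sequence start.
    """
    if not 0 <= site <= len(seq):
        raise ValueError("site must lie within the sequence")
    region = seq[max(0, site - window):site].upper()
    if not region:
        raise ValueError("no sequence upstream of the site")
    return sum(base in "AT" for base in region) / len(region)


if __name__ == "__main__":
    # Made-up sequence: the 4 nt before index 8 are "ATAT".
    print(at_content_upstream("GGGGATATCC", site=8, window=4))
```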


Subjects
3' Untranslated Regions; Gene Expression Regulation, Fungal; RNA, Messenger/genetics; RNA, Messenger/metabolism; Base Composition; Computational Biology; Genes, Fungal; Genes, Reporter; Poly A/genetics; Poly A/metabolism; Promoter Regions, Genetic; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism
15.
Nat Genet ; 44(7): 743-50, 2012 May 27.
Article in English | MEDLINE | ID: mdl-22634752

ABSTRACT

Understanding how precise control of gene expression is specified within regulatory DNA sequences is a key challenge with far-reaching implications. Many studies have focused on the regulatory role of transcription factor-binding sites. Here, we explore the transcriptional effects of different elements, nucleosome-disfavoring sequences and, specifically, poly(dA:dT) tracts that are highly prevalent in eukaryotic promoters. By measuring promoter activity for a large-scale promoter library, designed with systematic manipulations to the properties and spatial arrangement of poly(dA:dT) tracts, we show that these tracts significantly and causally affect transcription. We show that manipulating these elements offers a general genetic mechanism, applicable to promoters regulated by different transcription factors, for tuning expression in a predictable manner, with resolution that can be even finer than that attained by altering transcription factor sites. Overall, our results advance the understanding of the regulatory code and suggest a potential mechanism by which promoters yielding prespecified expression patterns can be designed.


Subjects
Gene Expression Regulation, Fungal; Genes, Fungal; Nucleosomes/genetics; Yeasts/genetics; Base Sequence; Binding Sites; DNA, Fungal/genetics; Molecular Sequence Data; Promoter Regions, Genetic; Transcription Factors/genetics; Transcription, Genetic
16.
Nat Biotechnol ; 30(6): 521-30, 2012 May 20.
Article in English | MEDLINE | ID: mdl-22609971

ABSTRACT

Despite extensive research, our understanding of the rules according to which cis-regulatory sequences are converted into gene expression is limited. We devised a method for obtaining parallel, highly accurate gene expression measurements from thousands of designed promoters and applied it to measure the effect of systematic changes in the location, number, orientation, affinity and organization of transcription-factor binding sites and nucleosome-disfavoring sequences. Our analyses reveal a clear relationship between expression and binding-site multiplicity, as well as dependencies of expression on the distance between transcription-factor binding sites and gene starts which are transcription-factor specific, including a striking ∼10-bp periodic relationship between gene expression and binding-site location. We show how this approach can measure transcription-factor sequence specificities and the sensitivity of transcription-factor sites to the surrounding sequence context, and compare the activity of 75 yeast transcription factors. Our method can be used to study both cis and trans effects of genotype on transcriptional, post-transcriptional and translational control.


Subjects
Computational Biology/methods; Gene Regulatory Networks; High-Throughput Nucleotide Sequencing/methods; Promoter Regions, Genetic; Transcription Factors/genetics; Binding Sites; Cluster Analysis; Gene Expression Regulation, Fungal; Genetic Engineering; Genome, Fungal; Genotype; Models, Genetic; Nucleosomes; Research Design; Yeasts/genetics
17.
Genome Res ; 21(12): 2114-28, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22009988

ABSTRACT

Coordinate regulation of ribosomal protein (RP) genes is key for controlling cell growth. In yeast, it is unclear how this regulation achieves the required equimolar amounts of the different RP components, given that some RP genes exist in duplicate copies, while others have only one copy. Here, we tested whether the solution to this challenge is partly encoded within the DNA sequence of the RP promoters, by fusing 110 different RP promoters to a fluorescent gene reporter, allowing us to robustly detect differences in their promoter activities that are as small as ~10%. We found that single-copy RP promoters have significantly higher activities, suggesting that proper RP stoichiometry is indeed partly encoded within the RP promoters. Notably, we also partially uncovered how this regulation is encoded by finding that RP promoters with higher activity have more nucleosome-disfavoring sequences and characteristic spatial organizations of these sequences and of binding sites for key RP regulators. Mutations in these elements result in a significant decrease of RP promoter activity. Thus, our results suggest that intrinsic (DNA-dependent) nucleosome organization may be a key mechanism by which genomes encode biologically meaningful promoter activities. Our approach can readily be applied to uncover how transcriptional programs of other promoters are encoded.


Subjects
Gene Dosage/physiology; Gene Expression Regulation, Fungal/physiology; Genome, Fungal/physiology; Ribosomal Proteins/biosynthesis; Saccharomyces cerevisiae Proteins/biosynthesis; Saccharomyces cerevisiae/metabolism; Nucleosomes/genetics; Nucleosomes/metabolism; Ribosomal Proteins/genetics; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae Proteins/genetics
18.
Subcell Biochem ; 52: 193-204, 2011.
Article in English | MEDLINE | ID: mdl-21557084

ABSTRACT

Binding of transcription factors to functional sites is a fundamental step in transcriptional regulation. In this chapter, we discuss how transcription factors are thought to achieve specificity to their functional targets, despite their typically low concentrations and degenerate binding specificities, and the fact that in large genomes their functional binding sites must compete with their widespread alternative binding sites. We highlight the importance of the chromatin structure context of the binding sites in this process, and its dependency on the genomic DNA sequence.


Subjects
Binding Sites; Transcription Factors; Base Sequence; Gene Expression Regulation; Genome; Genomics; Protein Binding; Transcription Factors/genetics
19.
PLoS Comput Biol ; 4(11): e1000216, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18989395

ABSTRACT

The detailed positions of nucleosomes profoundly impact gene regulation and are partly encoded by the genomic DNA sequence. However, less is known about the functional consequences of this encoding. Here, we address this question using a genome-wide map of approximately 380,000 yeast nucleosomes that we sequenced in their entirety. Utilizing the high resolution of our map, we refine our understanding of how nucleosome organizations are encoded by the DNA sequence and demonstrate that the genomic sequence is highly predictive of the in vivo nucleosome organization, even across new nucleosome-bound sequences that we isolated from fly and human. We find that Poly(dA:dT) tracts are an important component of these nucleosome positioning signals and that their nucleosome-disfavoring action results in large nucleosome depletion over them and over their flanking regions and enhances the accessibility of transcription factors to their cognate sites. Our results suggest that the yeast genome may utilize these nucleosome positioning signals to regulate gene expression with different transcriptional noise and activation kinetics and DNA replication with different origin efficiency. These distinct functions may be achieved by encoding both relatively closed (nucleosome-covered) chromatin organizations over some factor binding sites, where factors must compete with nucleosomes for DNA access, and relatively open (nucleosome-depleted) organizations over other factor sites, where factors bind without competition.
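
Locating candidate poly(dA:dT) tracts in a sequence is a simple scanning task. The sketch below reports perfect homopolymeric runs of A or T on the given strand (a run of T here corresponds to a run of A on the opposite strand). The function name and the minimum length of 5 are assumptions chosen for illustration; the abstract does not define a length threshold, and imperfect tracts are not handled.

```python
import re


def find_poly_da_dt_tracts(seq, min_len=5):
    """Return (start, end, base) for each run of A or T of at least min_len.

    Coordinates are 0-based and half-open, so seq[start:end] is the tract.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # Match a run of A's or a run of T's; mixed A/T stretches are not tracts.
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [(m.start(), m.end(), m.group()[0])
            for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Made-up sequence with one A tract and one T tract.
    print(find_poly_da_dt_tracts("GGAAAAAACCTTTTTG"))
```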


Subjects
DNA, Fungal/genetics; Locus Control Region; Nucleosomes/genetics; Saccharomyces cerevisiae/genetics; Transcription, Genetic/genetics; Animals; Base Sequence/genetics; Binding Sites/genetics; Chromatin Assembly and Disassembly/genetics; Drosophila melanogaster/genetics; Gene Expression Regulation, Fungal/genetics; HeLa Cells; Humans; Saccharomyces cerevisiae/metabolism; Transcription Factors/metabolism
20.
PLoS Comput Biol ; 4(8): e1000154, 2008 Aug 22.
Article in English | MEDLINE | ID: mdl-18725950

ABSTRACT

Transcription factor (TF) binding to its DNA target site is a fundamental regulatory interaction. The most common model used to represent TF binding specificities is a position specific scoring matrix (PSSM), which assumes independence between binding positions. However, in many cases, this simplifying assumption does not hold. Here, we present feature motif models (FMMs), a novel probabilistic method for modeling TF-DNA interactions, based on log-linear models. Our approach uses sequence features to represent TF binding specificities, where each feature may span multiple positions. We develop the mathematical formulation of our model and devise an algorithm for learning its structural features from binding site data. We also developed a discriminative motif finder, which discovers de novo FMMs that are enriched in target sets of sequences compared to background sets. We evaluate our approach on synthetic data and on the widely used TF chromatin immunoprecipitation (ChIP) dataset of Harbison et al. We then apply our algorithm to high-throughput TF ChIP data from mouse and human, reveal sequence features that are present in the binding specificities of mouse and human TFs, and show that FMMs explain TF binding significantly better than PSSMs. Our FMM learning and motif finder software are available at http://genie.weizmann.ac.il/.
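
The independence assumption of a PSSM, which this abstract contrasts with feature motif models, is visible in how a PSSM scores a sequence: the score is a plain sum of per-position terms. The sketch below builds a log-odds PSSM from aligned sites and scores a candidate. It illustrates the baseline PSSM only, not the authors' FMM software; the function names, the pseudocount of 1, and the uniform background of 0.25 are assumptions.

```python
import math

ALPHABET = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PSSM from equal-length aligned binding sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pssm = []
    for pos in range(length):
        column = [site[pos].upper() for site in sites]
        # Each position gets its own base distribution, estimated
        # without reference to any other position.
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total)
                            / background)
            for base in ALPHABET
        })
    return pssm


def score(pssm, seq):
    """Score a sequence as the sum of independent per-position log-odds."""
    if len(seq) != len(pssm):
        raise ValueError("sequence length must match the PSSM length")
    return sum(column[base] for column, base in zip(pssm, seq.upper()))


if __name__ == "__main__":
    example_sites = ["ACG", "ACG", "ACT"]  # made-up aligned sites
    model = build_pssm(example_sites)
    print(score(model, "ACG"), score(model, "TTT"))
```

Because `score` adds one term per position, it cannot express a preference for a particular pair of bases occurring together, which is the kind of dependency that multi-position features are meant to capture.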


Subjects
DNA-Binding Proteins/chemistry; DNA/chemistry; Software; Transcription Factors/chemistry; Transcription Factors/metabolism; Animals; Artificial Intelligence; Base Sequence; Binding Sites; CCCTC-Binding Factor; Chromatin Immunoprecipitation; DNA/metabolism; DNA-Binding Proteins/metabolism; Databases, Genetic; Humans; Mice; Models, Chemical; Models, Genetic; Repressor Proteins/chemistry; Repressor Proteins/metabolism; Sequence Analysis, DNA/methods