1.
Nat Methods ; 21(6): 1033-1043, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38684783

ABSTRACT

Signaling pathways that drive gene expression are typically depicted as having a dozen or so landmark phosphorylation and transcriptional events. In reality, thousands of dynamic post-translational modifications (PTMs) orchestrate nearly every cellular function, and we lack technologies to find causal links between these vast biochemical pathways and genetic circuits at scale. Here we describe the high-throughput, functional assessment of phosphorylation sites through the development of PTM-centric base editing coupled to phenotypic screens, directed by temporally resolved phosphoproteomics. Using T cell activation as a model, we observe hundreds of unstudied phosphorylation sites that modulate NFAT transcriptional activity. We identify the phosphorylation-mediated nuclear localization of PHLPP1, which promotes NFAT but inhibits NFκB activity. We also find that specific phosphosite mutants can alter gene expression in subtle yet distinct patterns, demonstrating the potential for fine-tuning transcriptional responses. Overall, base editor screening of PTM sites provides a powerful platform to dissect PTM function within signaling pathways.


Subjects
Protein Processing, Post-Translational; Phosphorylation; Humans; NFATC Transcription Factors/metabolism; NFATC Transcription Factors/genetics; Signal Transduction; HEK293 Cells; Proteomics/methods; High-Throughput Screening Assays/methods; T-Lymphocytes/metabolism; Jurkat Cells; NF-kappa B/metabolism
2.
Nat Struct Mol Biol ; 31(3): 559-567, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38448573

ABSTRACT

Genomes encode for genes and non-coding DNA, both capable of transcriptional activity. However, unlike canonical genes, many transcripts from non-coding DNA have limited evidence of conservation or function. Here, to determine how much biological noise is expected from non-genic sequences, we quantify the regulatory activity of evolutionarily naive DNA using RNA-seq in yeast and computational predictions in humans. In yeast, more than 99% of naive DNA bases were transcribed. Unlike the evolved transcriptome, naive transcripts frequently overlapped with opposite sense transcripts, suggesting selection favored coherent gene structures in the yeast genome. In humans, regulation-associated chromatin activity is predicted to be common in naive dinucleotide-content-matched randomized DNA. Here, naive and evolved DNA have similar co-occurrence and cell-type specificity of chromatin marks, challenging these as indicators of selection. However, in both yeast and humans, extreme high activities were rare in naive DNA, suggesting they result from selection. Overall, basal regulatory activity seems to be the default, which selection can hone to evolve a function or, if detrimental, repress.


Subjects
Saccharomyces cerevisiae; Transcriptome; Humans; Saccharomyces cerevisiae/genetics; Genome; DNA; Chromatin
4.
Nature ; 625(7993): 41-50, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38093018

ABSTRACT

Gene expression is regulated by transcription factors that work together to read cis-regulatory DNA sequences. The 'cis-regulatory code' - how cells interpret DNA sequences to determine when, where and how much genes should be expressed - has proven to be exceedingly complex. Recently, advances in the scale and resolution of functional genomics assays and machine learning have enabled substantial progress towards deciphering this code. However, the cis-regulatory code will probably never be solved if models are trained only on genomic sequences; regions of homology can easily lead to overestimation of predictive performance, and our genome is too short and has insufficient sequence diversity to learn all relevant parameters. Fortunately, randomly synthesized DNA sequences enable testing a far larger sequence space than exists in our genomes, and designed DNA sequences enable targeted queries to maximally improve the models. As the same biochemical principles are used to interpret DNA regardless of its source, models trained on these synthetic data can predict genomic activity, often better than genome-trained models. Here we provide an outlook on the field, and propose a roadmap towards solving the cis-regulatory code by a combination of machine learning and massively parallel assays using synthetic DNA.


Subjects
Genomics; Machine Learning; Models, Genetic; Regulatory Sequences, Nucleic Acid; DNA/chemical synthesis; DNA/genetics; DNA/metabolism; Regulatory Sequences, Nucleic Acid/genetics; Transcription Factors/metabolism
5.
Nat Genet ; 54(5): 603-612, 2022 05.
Article in English | MEDLINE | ID: mdl-35513721

ABSTRACT

Genome-wide association studies (GWASs) have uncovered hundreds of autoimmune disease-associated loci; however, the causal genetic variants within each locus are mostly unknown. Here, we perform high-throughput allele-specific reporter assays to prioritize disease-associated variants for five autoimmune diseases. By examining variants that both promote allele-specific reporter expression and are located in accessible chromatin, we identify 60 putatively causal variants that enrich for statistically fine-mapped variants by up to 57.8-fold. We introduced the risk allele of a prioritized variant (rs72928038) into a human T cell line and deleted the orthologous sequence in mice, both resulting in reduced BACH2 expression. Naive CD8 T cells from mice containing the deletion had reduced expression of genes that suppress activation and maintain stemness and, upon acute viral infection, displayed greater propensity to become effector T cells. Our results represent an example of an effective approach for prioritizing variants and studying their physiologically relevant effects.


Subjects
Autoimmune Diseases; Genome-Wide Association Study; Alleles; Animals; Autoimmune Diseases/genetics; Genetic Predisposition to Disease; Genome-Wide Association Study/methods; Mice; Polymorphism, Single Nucleotide/genetics; Regulatory Sequences, Nucleic Acid; T-Lymphocytes
6.
Nature ; 603(7901): 455-463, 2022 03.
Article in English | MEDLINE | ID: mdl-35264797

ABSTRACT

Mutations in non-coding regulatory DNA sequences can alter gene expression, organismal phenotype and fitness. Constructing complete fitness landscapes, in which DNA sequences are mapped to fitness, is a long-standing goal in biology, but has remained elusive because it is challenging to generalize reliably to vast sequence spaces. Here we build sequence-to-expression models that capture fitness landscapes and use them to decipher principles of regulatory evolution. Using millions of randomly sampled promoter DNA sequences and their measured expression levels in the yeast Saccharomyces cerevisiae, we learn deep neural network models that generalize with excellent prediction performance, and enable sequence design for expression engineering. Using our models, we study expression divergence under genetic drift and strong-selection weak-mutation regimes to find that regulatory evolution is rapid and subject to diminishing returns epistasis; that conflicting expression objectives in different environments constrain expression adaptation; and that stabilizing selection on gene expression leads to the moderation of regulatory complexity. We present an approach for using such models to detect signatures of selection on expression from natural variation in regulatory sequences and use it to discover an instance of convergent regulatory evolution. We assess mutational robustness, finding that regulatory mutation effect sizes follow a power law, characterize regulatory evolvability, visualize promoter fitness landscapes, discover evolvability archetypes and illustrate the mutational robustness of natural regulatory sequence populations. Our work provides a general framework for designing regulatory sequences and addressing fundamental questions in regulatory evolution.


Subjects
Genetic Drift; Models, Genetic; Biological Evolution; DNA; Evolution, Molecular; Gene Expression Regulation; Mutation/genetics; Phenotype; Saccharomyces cerevisiae/genetics
7.
Hum Mol Genet ; 31(12): 1946-1961, 2022 06 22.
Article in English | MEDLINE | ID: mdl-34970970

ABSTRACT

BACKGROUND: FCGR2A binds antibody-antigen complexes to regulate the abundance of circulating and deposited complexes along with downstream immune and autoimmune responses. Although the abundance of FCGR2A may be critical in immune-mediated diseases, little is known about whether its surface expression is regulated through cis genomic elements and non-coding variants. In the current study, we aimed to characterize the regulation of FCGR2A expression, the impact of genetic variation and its association with autoimmune disease. METHODS: We applied CRISPR-based interference and editing to scrutinize 1.7 Mb of open chromatin surrounding the FCGR2A gene to identify regulatory elements. Relevant transcription factors (TFs) binding to these regions were defined through public databases. Genetic variants affecting regulation were identified using luciferase reporter assays and were verified in a cohort of 1996 genotyped healthy individuals using flow cytometry. RESULTS: We identified a complex proximal region and five distal enhancers regulating FCGR2A. The proximal region split into subregions upstream and downstream of the transcription start site, was enriched in binding of inflammation-regulated TFs, and harbored a variant associated with FCGR2A expression in primary myeloid cells. One distal enhancer region was occupied by CCCTC-binding factor (CTCF), whose binding site was disrupted by a rare genetic variant, altering gene expression. CONCLUSIONS: The FCGR2A gene is regulated by multiple proximal and distal genomic regions, with links to autoimmune disease. These findings may open up novel therapeutic avenues where fine-tuning of FCGR2A levels may constitute a part of treatment strategies for immune-mediated diseases.


Subjects
Autoimmune Diseases; Enhancer Elements, Genetic; Receptors, IgG; Autoimmune Diseases/genetics; Binding Sites; Genomics; Genotype; Humans; Receptors, IgG/genetics
8.
Nat Commun ; 12(1): 1611, 2021 03 12.
Article in English | MEDLINE | ID: mdl-33712590

ABSTRACT

Genome-wide association studies of Systemic Lupus Erythematosus (SLE) nominate 3073 genetic variants at 91 risk loci. To systematically screen these variants for allelic transcriptional enhancer activity, we construct a massively parallel reporter assay (MPRA) library comprising 12,396 DNA oligonucleotides containing the genomic context around every allele of each SLE variant. Transfection into the Epstein-Barr virus-transformed B cell line GM12878 reveals 482 variants with enhancer activity, with 51 variants showing genotype-dependent (allelic) enhancer activity at 27 risk loci. Comparison of MPRA results in GM12878 and Jurkat T cell lines highlights shared and unique allelic transcriptional regulatory mechanisms at SLE risk loci. In-depth analysis of allelic transcription factor (TF) binding at and around allelic variants identifies one class of TFs whose DNA-binding motif tends to be directly altered by the risk variant and a second class of TFs that bind allelically without direct alteration of their motif by the variant. Collectively, our approach provides a blueprint for the discovery of allelic gene regulation at risk loci for any disease and offers insight into the transcriptional regulatory mechanisms underlying SLE.


Subjects
Alleles; Genetic Predisposition to Disease/genetics; Lupus Erythematosus, Systemic/genetics; B-Lymphocytes; Cell Line; Chromatin; Gene Expression Regulation; Genome-Wide Association Study; Genotype; Herpesvirus 4, Human; Humans; Quantitative Trait Loci; Synaptogyrins/genetics; T-Lymphocytes
9.
Nat Biotechnol ; 38(10): 1211, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32792646

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

10.
Genome Biol ; 21(1): 134, 2020 06 03.
Article in English | MEDLINE | ID: mdl-32493396

ABSTRACT

Improved methods are needed to model CRISPR screen data for interrogation of genetic elements that alter reporter gene expression readout. We create MAUDE (Mean Alterations Using Discrete Expression) for quantifying the impact of guide RNAs on a target gene's expression in a pooled, sorting-based expression screen. MAUDE quantifies guide-level effects by modeling the distribution of cells across sorting expression bins. It then combines guides to estimate the statistical significance and effect size of targeted genetic elements. We demonstrate that MAUDE outperforms previous approaches and provide experimental design guidelines to best leverage MAUDE, which is available on https://github.com/Carldeboer/MAUDE.
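
The idea of estimating a guide's effect from how its cells distribute across sorting bins can be illustrated with a small sketch. This is not the MAUDE package or its API. It assumes, purely for illustration, that expression in unperturbed cells follows a standard normal distribution, that a guide only shifts the mean, and that bin boundaries are given as z-scores. The names `bin_probabilities` and `estimate_mean_shift` are hypothetical.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bin_probabilities(mu, edges):
    """Probability of landing in each bin when expression ~ Normal(mu, 1).

    `edges` is a sorted list of z-score boundaries that starts at -inf
    and ends at +inf, so consecutive pairs define the sorting bins.
    """
    return [normal_cdf(hi - mu) - normal_cdf(lo - mu)
            for lo, hi in zip(edges, edges[1:])]


def estimate_mean_shift(counts, edges, grid_step=0.01, max_abs=3.0):
    """Grid-search maximum-likelihood estimate of the mean shift `mu`.

    `counts` holds the number of cells (or reads) observed per bin for
    one guide. The multinomial log-likelihood is evaluated on a grid.
    """
    best_mu, best_ll = 0.0, -math.inf
    steps = int(round(max_abs / grid_step))
    for i in range(-steps, steps + 1):
        mu = i * grid_step
        probs = bin_probabilities(mu, edges)
        # Clamp probabilities to avoid log(0) for extreme shifts.
        ll = sum(c * math.log(max(p, 1e-300)) for c, p in zip(counts, probs))
        if ll > best_ll:
            best_mu, best_ll = mu, ll
    return best_mu


# Four bins split at hypothetical z-score boundaries.
edges = [-math.inf, -0.67, 0.0, 0.67, math.inf]
shift_up = estimate_mean_shift([5, 10, 25, 60], edges)     # enriched in high bins
shift_none = estimate_mean_shift([25, 25, 25, 25], edges)  # evenly spread
```

A guide whose cells pile up in the high-expression bins gets a positive estimated shift, while an evenly spread guide gets an estimate near zero. Combining guides into element-level significance, as the abstract describes, is not covered by this sketch.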


Subjects
Clustered Regularly Interspaced Short Palindromic Repeats; Gene Expression; Genetic Techniques; RNA, Guide, Kinetoplastida; Software; Algorithms; CRISPR-Cas Systems; Models, Genetic
11.
Nat Commun ; 11(1): 1237, 2020 03 06.
Article in English | MEDLINE | ID: mdl-32144282

ABSTRACT

Genome-wide association studies have associated thousands of genetic variants with complex traits and diseases, but pinpointing the causal variant(s) among those in tight linkage disequilibrium with each associated variant remains a major challenge. Here, we use seven experimental assays to characterize all common variants at the multiple disease-associated TNFAIP3 locus in five disease-relevant immune cell lines, based on a set of features related to regulatory potential. Trait/disease-associated variants are enriched among SNPs prioritized based on either: (1) residing within CRISPRi-sensitive regulatory regions, or (2) localizing in a chromatin accessible region while displaying allele-specific reporter activity. Of the 15 trait/disease-associated haplotypes at TNFAIP3, 9 have at least one variant meeting one or both of these criteria, 5 of which are further supported by genetic fine-mapping. Our work provides a comprehensive strategy to characterize genetic variation at important disease-associated loci, and aids in the effort to identify trait causal genetic variants.


Subjects
Autoimmune Diseases/genetics; Genetic Loci/genetics; Genome-Wide Association Study/methods; Multifactorial Inheritance/genetics; Tumor Necrosis Factor alpha-Induced Protein 3/genetics; Cell Line, Tumor; Genetic Predisposition to Disease; Genetic Variation/immunology; Haplotypes/genetics; Haplotypes/immunology; Humans; Linkage Disequilibrium; Multifactorial Inheritance/immunology; Proof of Concept Study
12.
Nat Biotechnol ; 38(1): 56-65, 2020 01.
Article in English | MEDLINE | ID: mdl-31792407

ABSTRACT

How transcription factors (TFs) interpret cis-regulatory DNA sequence to control gene expression remains unclear, largely because past studies using native and engineered sequences had insufficient scale. Here, we measure the expression output of >100 million synthetic yeast promoter sequences that are fully random. These sequences yield diverse, reproducible expression levels that can be explained by their chance inclusion of functional TF binding sites. We use machine learning to build interpretable models of transcriptional regulation that predict ~94% of the expression driven from independent test promoters and ~89% of the expression driven from native yeast promoter fragments. These models allow us to characterize each TF's specificity, activity and interactions with chromatin. TF activity depends on binding-site strand, position, DNA helical face and chromatin context. Notably, expression level is influenced by weak regulatory interactions, which confound designed-sequence studies. Our analyses show that massive-throughput assays of fully random DNA can provide the big data necessary to develop complex, predictive models of gene regulation.


Subjects
Eukaryota/genetics; Gene Expression Regulation; Logic; Promoter Regions, Genetic; Binding Sites; DNA/metabolism; Genes, Reporter; Models, Genetic; Saccharomyces cerevisiae/genetics; Transcription Factors/metabolism
14.
Cell Rep ; 25(11): 2992-3005.e5, 2018 12 11.
Article in English | MEDLINE | ID: mdl-30540934

ABSTRACT

Long-term hematopoietic stem cells (LT-HSCs) maintain hematopoietic output throughout an animal's lifespan. However, with age, the balance is disrupted, and LT-HSCs produce a myeloid-biased output, resulting in poor immune responses to infectious challenge and the development of myeloid leukemias. Here, we show that young and aged LT-HSCs respond differently to inflammatory stress, such that aged LT-HSCs produce a cell-intrinsic, myeloid-biased expression program. Using single-cell RNA sequencing (scRNA-seq), we identify a myeloid-biased subset within the LT-HSC population (mLT-HSCs) that is prevalent among aged LT-HSCs. We identify CD61 as a marker of mLT-HSCs and show that CD61-high LT-HSCs are uniquely primed to respond to acute inflammatory challenge. We predict that several transcription factors regulate the mLT-HSCs gene program and show that Klf5, Ikzf1, and Stat3 play an important role in age-related inflammatory myeloid bias. We have therefore identified and isolated an LT-HSC subset that regulates myeloid versus lymphoid balance under inflammatory challenge and with age.


Subjects
Aging/pathology; Hematopoietic Stem Cells/metabolism; Inflammation/pathology; Animals; Biomarkers/metabolism; Inflammation/genetics; Ligands; Mice, Inbred C57BL; Models, Biological; Myeloid Cells/metabolism; Toll-Like Receptors/metabolism; Transcription, Genetic
15.
Cell ; 175(4): 998-1013.e20, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30388456

ABSTRACT

Treatment of cancer has been revolutionized by immune checkpoint blockade therapies. Despite the high rate of response in advanced melanoma, the majority of patients succumb to disease. To identify factors associated with success or failure of checkpoint therapy, we profiled transcriptomes of 16,291 individual immune cells from 48 tumor samples of melanoma patients treated with checkpoint inhibitors. Two distinct states of CD8+ T cells were defined by clustering and associated with patient tumor regression or progression. A single transcription factor, TCF7, was visualized within CD8+ T cells in fixed tumor samples and predicted positive clinical outcome in an independent cohort of checkpoint-treated patients. We delineated the epigenetic landscape and clonality of these T cell states and demonstrated enhanced antitumor immunity by targeting novel combinations of factors in exhausted cells. Our study of immune cell transcriptomes from tumors demonstrates a strategy for identifying predictors, mechanisms, and targets for enhancing checkpoint immunotherapy.


Subjects
CD8-Positive T-Lymphocytes/immunology; Immunotherapy/methods; Melanoma/immunology; Transcriptome; Animals; Antibodies, Monoclonal, Humanized/immunology; Antibodies, Monoclonal, Humanized/pharmacology; Antigens, CD/immunology; Antineoplastic Agents, Immunological/immunology; Antineoplastic Agents, Immunological/pharmacology; Apyrase/antagonists & inhibitors; Apyrase/immunology; Cell Line, Tumor; Humans; Leukocyte Common Antigens/antagonists & inhibitors; Leukocyte Common Antigens/immunology; Melanoma/therapy; Mice; Mice, Inbred BALB C; Mice, Inbred C57BL; T Cell Transcription Factor 1/metabolism
16.
BMC Bioinformatics ; 19(1): 253, 2018 07 03.
Article in English | MEDLINE | ID: mdl-29970004

ABSTRACT

BACKGROUND: Variation in chromatin organization across single cells can help shed important light on the mechanisms controlling gene expression, but scale, noise, and sparsity pose significant challenges for interpretation of single cell chromatin data. Here, we develop BROCKMAN (Brockman Representation Of Chromatin by K-mers in Mark-Associated Nucleotides), an approach to infer variation in transcription factor (TF) activity across samples through unsupervised analysis of the variation in DNA sequences associated with an epigenomic mark. RESULTS: BROCKMAN represents each sample as a vector of epigenomic-mark-associated DNA word frequencies, and decomposes the resulting matrix to find hidden structure in the data, followed by unsupervised grouping of samples and identification of the TFs that distinguish groups. Applied to single cell ATAC-seq, BROCKMAN readily distinguished cell types, treatments, batch effects, experimental artifacts, and cycling cells. We show that each variable component in the k-mer landscape reflects a set of co-varying TFs, which are often known to physically interact. For example, in K562 cells, AP-1 TFs were a central determinant of variability in chromatin accessibility through their variable expression levels and diverse interactions with other TFs. We provide a theoretical basis for why cooperative TF binding - and any associated epigenomic mark - is inherently more variable than non-cooperative binding. CONCLUSIONS: BROCKMAN and related approaches will help gain a mechanistic understanding of the trans determinants of chromatin variability between cells, treatments, and individuals.
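
The first step described above, representing a sample as a vector of DNA word frequencies, can be sketched as follows. This is a simplified illustration and not the BROCKMAN implementation: it assumes plain contiguous k-mers over the alphabet ACGT, skips words containing other characters, and the function name `kmer_frequencies` is hypothetical. The decomposition and grouping steps mentioned in the abstract are not shown.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequences, k=2):
    """Return a k-mer frequency vector for one sample.

    `sequences` is a list of DNA strings (for example, the sequences
    under a sample's mark-associated regions). The vector has one entry
    per possible k-mer, ordered lexicographically over "ACGT", and the
    entries sum to 1 when at least one valid k-mer is found.
    """
    alphabet = "ACGT"
    all_kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Ignore words containing ambiguous bases such as N.
            if all(base in alphabet for base in kmer):
                counts[kmer] += 1
    total = sum(counts.values())
    return [counts[km] / total if total else 0.0 for km in all_kmers]


# One row per sample; stacking the rows gives the sample-by-k-mer matrix.
sample_a = kmer_frequencies(["ACGT", "acgt"], k=2)
sample_b = kmer_frequencies(["AAAA"], k=2)
```

Stacking one such vector per sample yields the matrix that would then be decomposed to look for hidden structure.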


Subjects
Epigenomics/methods; Transcription Factors/metabolism; Binding Sites; Humans
17.
Cell ; 165(2): 303-16, 2016 Apr 07.
Article in English | MEDLINE | ID: mdl-27058663

ABSTRACT

Leukemia stem cells (LSCs) have the capacity to self-renew and propagate disease upon serial transplantation in animal models, and elimination of this cell population is required for curative therapies. Here, we describe a series of pooled, in vivo RNAi screens to identify essential transcription factors (TFs) in a murine model of acute myeloid leukemia (AML) with genetically and phenotypically defined LSCs. These screens reveal the heterodimeric, circadian rhythm TFs Clock and Bmal1 as genes required for the growth of AML cells in vitro and in vivo. Disruption of canonical circadian pathway components produces anti-leukemic effects, including impaired proliferation, enhanced myeloid differentiation, and depletion of LSCs. We find that both normal and malignant hematopoietic cells harbor an intact clock with robust circadian oscillations, and genetic knockout models reveal a leukemia-specific dependence on the pathway. Our findings establish a role for the core circadian clock genes in AML.


Subjects
ARNTL Transcription Factors/genetics; CLOCK Proteins/genetics; Leukemia, Myeloid, Acute/genetics; Leukemia, Myeloid, Acute/pathology; Neoplastic Stem Cells/pathology; Animals; Circadian Rhythm; Disease Models, Animal; Gene Knockout Techniques; Hematopoiesis; Humans; Leukemia, Myeloid, Acute/metabolism; Mice; Mice, Inbred C57BL; Neoplastic Stem Cells/metabolism; RNA Interference; RNA, Small Interfering/metabolism
18.
Sci Rep ; 6: 21849, 2016 Feb 22.
Article in English | MEDLINE | ID: mdl-26898953

ABSTRACT

Linkage mapping studies in model organisms have typically focused their efforts on polymorphisms within coding regions, ignoring those within regulatory regions that may contribute to gene expression variation. In this context, differences in transcript abundance are frequently proposed as a source of phenotypic diversity between individuals; however, until now, little molecular evidence has been provided. Here, we examined Allele Specific Expression (ASE) in six F1 hybrids from Saccharomyces cerevisiae derived from crosses between representative strains of the four main lineages described in yeast. ASE varied between crosses with levels ranging between 28% and 60%. Part of the variation in expression levels could be explained by differences in transcription factors binding to polymorphic cis-regulations and by differences in trans-activation depending on the allelic form of the TF. Analysis of highly expressed alleles on each background suggested ASN1 as a candidate transcript underlying nitrogen consumption differences between two strains. Further promoter allele swap analysis under fermentation conditions confirmed that coding and non-coding regions explained aspartic and glutamic acid consumption differences, likely due to a polymorphism affecting Uga3 binding. Together, we provide a new catalogue of variants to bridge the gap between genotype and phenotype.


Subjects
Aspartate-Ammonia Ligase/genetics; Gene Expression Regulation, Fungal; Genome, Fungal; Promoter Regions, Genetic; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae/genetics; Transcription Factors/genetics; Alleles; Aspartate-Ammonia Ligase/metabolism; Base Sequence; Chimera; Crosses, Genetic; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Genetic Association Studies; Genetic Variation; Inheritance Patterns; Nitrogen/metabolism; Open Reading Frames; Quantitative Trait Loci; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/metabolism; Transcription Factors/metabolism
19.
PLoS One ; 9(10): e110479, 2014.
Article in English | MEDLINE | ID: mdl-25353956

ABSTRACT

Nucleosomes regulate many DNA-dependent processes by controlling the accessibility of DNA, and DNA sequences such as the poly-dA:dT element are known to affect nucleosome binding. We demonstrate that poly-dA:dT tracts form an asymmetric barrier to nucleosome movement in vivo, mediated by ATP-dependent chromatin remodelers. We theorize that nucleosome transit over poly-A elements is more energetically favourable in one direction, leading to an asymmetric arrangement of nucleosomes around these sequences. We demonstrate that different arrangements of poly-A and poly-T tracts result in very different outcomes for nucleosome occupancy in yeast, mouse, and human, and show that yeast takes advantage of this phenomenon in its promoter architecture.


Subjects
DNA/genetics; Nucleosomes/genetics; Poly dA-dT/genetics; Adenosine Triphosphate/metabolism; Animals; Humans; Mice; Promoter Regions, Genetic; Yeasts/genetics
20.
Genome Res ; 24(1): 154-66, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24170600

ABSTRACT

Identifying genes in the genomic context is central to a cell's ability to interpret the genome. Yet, in general, the signals used to define eukaryotic genes are poorly described. Here, we derived simple classifiers that identify where transcription will initiate and terminate using nucleic acid sequence features detectable by the yeast cell, which we integrate into a Unified Model (UM) that models transcription as a whole. The cis-elements that denote where transcription initiates function primarily through nucleosome depletion, and, using a synthetic promoter system, we show that most of these elements are sufficient to initiate transcription in vivo. Hrp1 binding sites are the major characteristic of terminators; these binding sites are often clustered in terminator regions and can terminate transcription bidirectionally. The UM predicts global transcript structure by modeling transcription of the genome using a hidden Markov model whose emissions are the outputs of the initiation and termination classifiers. We validated the novel predictions of the UM with available RNA-seq data and tested it further by directly comparing the transcript structure predicted by the model to the transcription generated by the cell for synthetic DNA segments of random design. We show that the UM identifies transcription start sites more accurately than the initiation classifier alone, indicating that the relative arrangement of promoter and terminator elements influences their function. Our model presents a concrete description of how the cell defines transcript units, explains the existence of nongenic transcripts, and provides insight into genome evolution.


Subjects
DNA, Fungal/genetics; Models, Genetic; Saccharomyces cerevisiae/genetics; Transcription Initiation Site; Transcription, Genetic; Binding Sites; Computer Simulation; Genes, Fungal; Genome, Fungal; Nucleosomes/genetics; Promoter Regions, Genetic; Reproducibility of Results; Saccharomyces cerevisiae/metabolism