ABSTRACT
Nucleosome positioning varies between cell types. By deep sequencing cell-free DNA (cfDNA), isolated from circulating blood plasma, we generated maps of genome-wide in vivo nucleosome occupancy and found that short cfDNA fragments harbor footprints of transcription factors. The cfDNA nucleosome occupancies correlate well with the nuclear architecture, gene structure, and expression observed in cells, suggesting that they could inform the cell type of origin. Nucleosome spacing inferred from cfDNA in healthy individuals correlates most strongly with epigenetic features of lymphoid and myeloid cells, consistent with hematopoietic cell death as the normal source of cfDNA. We build on this observation to show how nucleosome footprints can be used to infer cell types contributing to cfDNA in pathological states such as cancer. Since this strategy does not rely on genetic differences to distinguish between contributing tissues, it may enable the noninvasive monitoring of a much broader set of clinical conditions than currently possible.
Subject(s)
DNA/chemistry , Nucleosomes/chemistry , Organ Specificity , CCCTC-Binding Factor , Cell Line , Chromatin Assembly and Disassembly , DNA/metabolism , DNA Footprinting , Human Genome , Genome-Wide Association Study , Humans , Neoplasms/genetics , Repressor Proteins/metabolism , DNA Sequence Analysis
ABSTRACT
Gene activation requires the cooperative activity of multiple transcription factors at cis-regulatory elements (CREs). Yet, most transcription factors have short residence time, questioning the requirement of their physical co-occupancy on DNA to achieve cooperativity. Here, we present a DNA footprinting method that detects individual molecular interactions of transcription factors and nucleosomes with DNA in vivo. We apply this strategy to quantify the simultaneous binding of multiple transcription factors on single DNA molecules at mouse CREs. Analysis of the binary occupancy patterns at thousands of motif combinations reveals that high DNA co-occupancy occurs for most types of transcription factors, in the absence of direct physical interaction, at sites of competition with nucleosomes. Perturbation of pairwise interactions demonstrates the function of molecular co-occupancy in binding cooperativity. Our results reveal the interactions regulating CREs at molecular resolution and identify DNA co-occupancy as a widespread cooperativity mechanism used by transcription factors to remodel chromatin.
Subject(s)
DNA Footprinting/methods , DNA/genetics , Nucleosomes/chemistry , Transcriptional Regulatory Elements , Transcription Factors/genetics , Animals , Binding Sites , DNA/chemistry , DNA/metabolism , Male , Mice , Mouse Embryonic Stem Cells/cytology , Mouse Embryonic Stem Cells/metabolism , Nucleosomes/metabolism , Protein Binding , Transcription Factors/chemistry , Transcription Factors/metabolism , Genetic Transcription
ABSTRACT
Translation regulation occurs largely during the initiation phase. Here, we develop selective 40S footprinting to visualize initiating 40S ribosomes on endogenous mRNAs in vivo. This reveals the positions on mRNAs where initiation factors join the ribosome to act and where they leave. We discover that in most human cells, most scanning ribosomes remain attached to the 5' cap. Consequently, only one ribosome scans a 5' UTR at a time, and 5' UTR length affects translation efficiency. We discover that eukaryotic initiation factor 3B (eIF3B), eIF4G1, and eIF4E remain bound to 80S ribosomes as they begin translating, with a decay half-length of ~12 codons. Hence, ribosomes retain these initiation factors while translating short upstream open reading frames (uORFs), providing an explanation for how ribosomes can reinitiate translation after uORFs in humans. This method will be of use for studying translation initiation mechanisms in vivo.
Subject(s)
5' Untranslated Regions , DNA Footprinting/methods , Translational Peptide Chain Initiation , Eukaryotic Small Ribosome Subunits/metabolism , Animals , Initiator Codon , Eukaryotic Initiation Factor-3/genetics , Eukaryotic Initiation Factor-3/metabolism , Eukaryotic Initiation Factor-4E/genetics , Eukaryotic Initiation Factor-4E/metabolism , Eukaryotic Initiation Factor-4G/genetics , Eukaryotic Initiation Factor-4G/metabolism , HeLa Cells , Humans , Mice , NIH 3T3 Cells , Open Reading Frames , Messenger RNA/genetics , Methionine Transfer RNA/genetics , Ribosome Subunits/genetics , Ribosome Subunits/metabolism , Eukaryotic Small Ribosome Subunits/genetics
ABSTRACT
The ribosome-associated protein quality control (RQC) system that resolves stalled translation events is activated when ribosomes collide and form disome, trisome, or higher-order complexes. However, it is unclear whether this system distinguishes collision complexes formed on defective mRNAs from those with functional roles on endogenous transcripts. Here, we performed disome and trisome footprint profiling in yeast and found collisions were enriched on diverse sequence motifs known to slow translation. When 60S recycling was inhibited, disomes accumulated at stop codons and could move into the 3' UTR to reinitiate translation. The ubiquitin ligase and RQC factor Hel2/ZNF598 generally recognized collisions but did not induce degradation of endogenous transcripts. However, loss of Hel2 triggered the integrated stress response, via phosphorylation of eIF2α, thus linking these pathways. Our results suggest that Hel2 has a role in sensing ribosome collisions on endogenous mRNAs, and such events may be important for cellular homeostasis.
Subject(s)
DNA Footprinting/methods , Fungal Genome , Ribosomes/genetics , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/genetics , Ubiquitin-Protein Ligases/metabolism , 3' Untranslated Regions , Anisomycin/pharmacology , Terminator Codon , Eukaryotic Initiation Factor-2/genetics , Eukaryotic Initiation Factor-2/metabolism , Mutation , Phosphorylation , RNA Stability , Eukaryotic Large Ribosome Subunits/genetics , Eukaryotic Large Ribosome Subunits/metabolism , Ribosomes/metabolism , Saccharomyces cerevisiae/drug effects , Saccharomyces cerevisiae Proteins/genetics , Ubiquitin-Protein Ligases/genetics
ABSTRACT
Here, we present a method for enrichment of double-stranded cfDNA with an average length of ~40 bp from cfDNA for high-throughput DNA sequencing. This class of cfDNA is enriched at gene promoters and binding sites of transcription factors or structural DNA-binding proteins, so that a genome-wide DNA footprint is directly captured from liquid biopsies. In short double-stranded cfDNA from healthy individuals, we find significant enrichment of 203 transcription factor motifs. Additionally, short double-stranded cfDNA signals at specific genomic regions correlate negatively with DNA methylation, positively with H3K4me3 histone modifications and gene transcription. The diagnostic potential of short double-stranded cell-free DNA (cfDNA) in blood plasma has not yet been recognized. When comparing short double-stranded cfDNA from patient samples of pancreatic ductal adenocarcinoma with colorectal carcinoma or septic with postoperative controls, we identify 136 and 241 differentially enriched loci, respectively. Using these differentially enriched loci, the disease types can be clearly distinguished by principal component analysis, demonstrating the diagnostic potential of short double-stranded cfDNA signals as a new class of biomarkers for liquid biopsies.
Subject(s)
Cell-Free Nucleic Acids , DNA Footprinting , Humans , Cell-Free Nucleic Acids/blood , Cell-Free Nucleic Acids/genetics , DNA Footprinting/methods , DNA Methylation , Colorectal Neoplasms/genetics , Colorectal Neoplasms/blood , High-Throughput Nucleotide Sequencing/methods , Transcription Factors/genetics , Transcription Factors/metabolism , Histones/metabolism , Liquid Biopsy/methods , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/blood , Genetic Promoter Regions , Binding Sites
ABSTRACT
The combinatorial cross-regulation of hundreds of sequence-specific transcription factors (TFs) defines a regulatory network that underlies cellular identity and function. Here we use genome-wide maps of in vivo DNaseI footprints to assemble an extensive core human regulatory network comprising connections among 475 sequence-specific TFs and to analyze the dynamics of these connections across 41 diverse cell and tissue types. We find that human TF networks are highly cell selective and are driven by cohorts of factors that include regulators with previously unrecognized roles in control of cellular identity. Moreover, we identify many widely expressed factors that impact transcriptional regulatory networks in a cell-selective manner. Strikingly, in spite of their inherent diversity, all cell-type regulatory networks independently converge on a common architecture that closely resembles the topology of living neuronal networks. Together, our results provide an extensive description of the circuitry, dynamics, and organizing principles of the human TF regulatory network.
Subject(s)
Gene Regulatory Networks , Transcription Factors/metabolism , Animals , DNA Footprinting , Deoxyribonuclease I/metabolism , Gene Expression Regulation , Genome-Wide Association Study , Humans , Organ Specificity
ABSTRACT
The human mitochondrial genome comprises a distinct genetic system transcribed as precursor polycistronic transcripts that are subsequently cleaved to generate individual mRNAs, tRNAs, and rRNAs. Here, we provide a comprehensive analysis of the human mitochondrial transcriptome across multiple cell lines and tissues. Using directional deep sequencing and parallel analysis of RNA ends, we demonstrate wide variation in mitochondrial transcript abundance and precisely resolve transcript processing and maturation events. We identify previously undescribed transcripts, including small RNAs, and observe the enrichment of several nuclear RNAs in mitochondria. Using high-throughput in vivo DNaseI footprinting, we establish the global profile of DNA-binding protein occupancy across the mitochondrial genome at single-nucleotide resolution, revealing regulatory features at mitochondrial transcription initiation sites and functional insights into disease-associated variants. This integrated analysis of the mitochondrial transcriptome reveals unexpected complexity in the regulation, expression, and processing of mitochondrial RNA and provides a resource for future studies of mitochondrial function (accessed at http://mitochondria.matticklab.com).
Subject(s)
Gene Expression Profiling , Mitochondria/genetics , RNA/analysis , Cell Nucleus/metabolism , DNA Footprinting , DNA-Binding Proteins/analysis , Deoxyribonuclease I/metabolism , Gene Expression Regulation , Mitochondrial Genome , High-Throughput Nucleotide Sequencing , Humans , Locus Control Region , Mitochondrial Proteins/analysis , Nucleic Acid Conformation , RNA/metabolism , Mitochondrial RNA , RNA Sequence Analysis
ABSTRACT
Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits [1], but it remains challenging to distinguish variants that affect regulatory function [2]. Genomic DNase I footprinting enables the quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin [3-6]. However, only a small fraction of such sites have been precisely resolved on the human genome sequence [6]. Here, to enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate about 4.5 million compact genomic elements that encode transcription factor occupancy at nucleotide resolution. We map the fine-scale structure within about 1.6 million DNase I-hypersensitive sites and show that the overwhelming majority are populated by well-spaced sites of single transcription factor-DNA interaction. Cell-context-dependent cis-regulation is chiefly executed by wholesale modulation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We also show that the enrichment of genetic variants associated with diseases or phenotypic traits in regulatory regions [1,7] is almost entirely attributable to variants within footprints, and that functional variants that affect transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find increased density of human genetic variation within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.
Subject(s)
DNA Footprinting/standards , Human Genome/genetics , Transcription Factors/metabolism , Consensus Sequence , DNA/genetics , DNA/metabolism , Deoxyribonuclease I/metabolism , Population Genetics , Genome-Wide Association Study , Humans , Molecular Models , Single Nucleotide Polymorphism , Nucleic Acid Regulatory Sequences/genetics
ABSTRACT
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE [1] and Roadmap Epigenomics [2] data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
Subject(s)
DNA/genetics , Genetic Databases , Genome/genetics , Genomics , Molecular Sequence Annotation , Registries , Nucleic Acid Regulatory Sequences/genetics , Animals , Chromatin/genetics , Chromatin/metabolism , DNA/chemistry , DNA Footprinting , DNA Methylation/genetics , DNA Replication Timing , Deoxyribonuclease I/metabolism , Human Genome , Histones/metabolism , Humans , Mice , Transgenic Mice , RNA-Binding Proteins/genetics , Genetic Transcription/genetics , Transposases/metabolism
ABSTRACT
Elucidating the transcriptional regulatory networks that underlie growth and development requires robust ways to define the complete set of transcription factor (TF) binding sites. Although TF-binding sites are known to be generally located within accessible chromatin regions (ACRs), pinpointing these DNA regulatory elements globally remains challenging. Current approaches primarily identify binding sites for a single TF (e.g. ChIP-seq), or globally detect ACRs but lack the resolution to consistently define TF-binding sites (e.g. DNase-seq, ATAC-seq). To address this challenge, we developed MNase-defined cistrome-Occupancy Analysis (MOA-seq), a high-resolution (< 30 bp), high-throughput, and genome-wide strategy to globally identify putative TF-binding sites within ACRs. We used MOA-seq on developing maize ears as a proof of concept and were able to define a cistrome of 145,000 MOA footprints (MFs). While a substantial majority (76%) of the known ATAC-seq ACRs intersected with the MFs, only a minority of MFs overlapped with the ATAC peaks, indicating that the majority of MFs were novel and not detected by ATAC-seq. MFs were associated with promoters and significantly enriched for TF-binding and long-range chromatin interaction sites, including for the well-characterized FASCIATED EAR4, KNOTTED1, and TEOSINTE BRANCHED1. Importantly, the MOA-seq strategy improved the spatial resolution of TF-binding prediction and allowed us to identify 215 motif families collectively distributed over more than 100,000 non-overlapping, putatively-occupied binding sites across the genome. Our study presents a simple, efficient, and high-resolution approach to identify putative TF footprints and binding motifs genome-wide, to ultimately define a native cistrome atlas.
Subject(s)
DNA Footprinting/methods , Genetic Promoter Regions , Transcription Factors/metabolism , Zea mays/genetics , Binding Sites , Chromatin Immunoprecipitation Sequencing , High-Throughput Nucleotide Sequencing , Plant Proteins/genetics , Plant Proteins/metabolism , Transcriptional Regulatory Elements , Whole Genome Sequencing
ABSTRACT
The ferric uptake regulator (Fur) is a global regulator that influences the expression of virulence genes in Klebsiella pneumoniae. Bioinformatics analysis suggests that Fur may be involved in iron acquisition via the regulatory box identified upstream of the yersiniabactin receptor gene fyuA. To observe the impact of fyuA on the virulence of K. pneumoniae, a fyuA knockout strain and a complementation strain were constructed and subjected to a series of phenotypic experiments, including chrome azurol S (CAS) detection, crystal violet staining, and a wax moth virulence experiment. To examine the regulatory relationship between Fur and fyuA, a green fluorescent protein (GFP) reporter gene fusion assay, real-time quantitative reverse transcription polymerase chain reaction (RT-qPCR), an electrophoretic mobility shift assay (EMSA), and a DNase I footprinting assay were used. CAS detection revealed that fyuA could affect the generation of iron carriers in K. pneumoniae. Crystal violet staining showed that fyuA could positively influence biofilm formation. The wax moth virulence experiment indicated that deletion of fyuA could weaken bacterial virulence. The GFP reporter gene fusion experiment and RT-qPCR analysis revealed that Fur negatively regulated the expression of fyuA in an iron-sufficient environment. EMSA demonstrated that Fur could directly bind to the promoter region of fyuA, and the DNase I footprinting assay further identified the specific binding site sequences. The study showed that Fur negatively regulated the transcriptional expression of fyuA by binding upstream of the gene promoter region, and thereby affected the virulence of K. pneumoniae.
Subject(s)
Bacterial Proteins , Biofilms , Bacterial Gene Expression Regulation , Iron , Klebsiella pneumoniae , Moths , Genetic Promoter Regions , Repressor Proteins , Klebsiella pneumoniae/genetics , Klebsiella pneumoniae/metabolism , Klebsiella pneumoniae/pathogenicity , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Virulence/genetics , Repressor Proteins/genetics , Repressor Proteins/metabolism , Animals , Moths/microbiology , Biofilms/growth & development , Iron/metabolism , Klebsiella Infections/microbiology , Genetic Transcription , DNA Footprinting , Phenols , Thiazoles
ABSTRACT
Transcription is tightly regulated by cis-regulatory DNA elements where transcription factors (TFs) can bind. Thus, identification of TF binding sites (TFBSs) is key to understanding gene expression and whole regulatory networks within a cell. The standard approaches used for TFBS prediction, such as position weight matrices (PWMs) and chromatin immunoprecipitation followed by sequencing (ChIP-seq), are widely used but have their drawbacks, including high false-positive rates and limited antibody availability, respectively. Several computational footprinting algorithms have been developed to detect TFBSs by investigating chromatin accessibility patterns; however, these also have limitations. We have developed a footprinting method to predict TF footprints in active chromatin elements (TRACE) to improve the prediction of TFBS footprints. TRACE incorporates DNase-seq data and PWMs within a multivariate hidden Markov model (HMM) to detect footprint-like regions with matching motifs. TRACE is an unsupervised method that accurately annotates binding sites for specific TFs automatically with no requirement for pregenerated candidate binding sites or ChIP-seq training data. Compared with published footprinting algorithms, TRACE has the best overall performance with the distinct advantage of targeting multiple motifs in a single model.
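As a rough illustration of the PWM component mentioned in this abstract (not the TRACE model itself, which combines PWM scores with DNase-seq signal in a hidden Markov model), the sketch below scores every window of a sequence against a log-odds matrix. The probability matrix, the uniform background of 0.25, and the function names are all hypothetical and chosen only for the example.

```python
import math

# Hypothetical position probability matrix for a 4-bp motif with consensus ACGT.
# Real motifs would come from a motif database; these numbers are made up.
PPM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]


def to_log_odds(ppm, background=0.25):
    """Convert base probabilities to log2 odds against a uniform background (assumed)."""
    return [{base: math.log2(p / background) for base, p in col.items()} for col in ppm]


def scan(seq, pwm):
    """Return (start, score) for every window of motif width along seq."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][seq[i + j]] for j in range(width)))
        for i in range(len(seq) - width + 1)
    ]


def best_hit(seq, pwm):
    """Return the (start, score) pair with the highest log-odds score."""
    return max(scan(seq, pwm), key=lambda hit: hit[1])


PWM = to_log_odds(PPM)
start, score = best_hit("TTACGTTT", PWM)
print(start, round(score, 2))
```

A plain threshold on such scores is what tends to produce the high false-positive rates the abstract mentions, which is why footprinting methods add accessibility data on top.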
Subject(s)
Chromatin/metabolism , DNA Footprinting/methods , DNA Sequence Analysis , Transcription Factors/metabolism , Binding Sites , Cell Line , Deoxyribonucleases , Humans , K562 Cells , Markov Chains , Nucleotide Motifs
ABSTRACT
CRISPR technologies comprising a Cas nuclease and a guide RNA (gRNA) can utilize multiple gRNAs to enact multi-site editing or regulation in the same cell. Nature devised a highly compact means of encoding gRNAs in the form of CRISPR arrays composed of conserved repeats separated by targeting spacers. However, the capacity to acquire new spacers keeps the arrays longer than necessary for CRISPR technologies. Here, we show that CRISPR arrays utilized by the Cas9 nuclease can be shortened without compromising, and sometimes even while enhancing, targeting activity. Using multiplexed gene repression in E. coli, we found that each region could be systematically shortened to varying degrees before severely compromising targeting activity. Surprisingly, shortening some spacers yielded enhanced targeting activity, which was linked to folding of the transcribed array prior to processing. Overall, shortened CRISPR-Cas9 arrays can facilitate multiplexed editing and gene regulation from a smaller DNA footprint across many bacterial applications of CRISPR technologies.
Subject(s)
CRISPR-Cas Systems , Clustered Regularly Interspaced Short Palindromic Repeats , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , DNA Footprinting , Escherichia coli/genetics , Gene Targeting , Bacteria/genetics , Endonucleases
ABSTRACT
In Staphylococcus aureus, most multiresistance plasmids lack conjugation or mobilization genes for horizontal transfer. However, most are mobilizable due to carriage of origin-of-transfer (oriT) sequences mimicking those of conjugative plasmids related to pWBG749. pWBG749-family plasmids have diverged to carry five distinct oriT subtypes and non-conjugative plasmids have been identified that contain mimics of each. The relaxasome accessory factor SmpO, encoded by each conjugative plasmid, determines specificity for its cognate oriT. Here we characterized the binding of SmpO proteins to each oriT. SmpO proteins predominantly formed tetramers in solution and bound 5'-GNNNNC-3' sites within each oriT. Four of the five SmpO proteins specifically bound their cognate oriT. An F7K substitution in pWBG749 SmpO switched oriT-binding specificity in vitro. In vivo, the F7K substitution reduced but did not abolish self-transfer of pWBG749. Notably, the substitution broadened the oriT subtypes that were mobilized. Thus, this substitution represents a potential evolutionary intermediate with promiscuous DNA-binding specificity that could facilitate a switch between oriT specificities. Phylogenetic analysis suggests pWBG749-family plasmids have switched oriT specificity more than once during evolution. We hypothesize the convergent evolution of oriT specificity in distinct branches of the pWBG749-family phylogeny reflects indirect selection pressure to mobilize plasmids carrying non-cognate oriT-mimics.
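To make the 5'-GNNNNC-3' site pattern concrete, the following minimal sketch lists every occurrence of that pattern in a DNA string, including overlapping ones. The function name and the example sequences are hypothetical; the abstract does not describe how the authors searched for sites. Because the reverse complement of GNNNNC is again GNNNNC, scanning one strand is enough to find candidate sites on both.

```python
import re

# G, any four bases, then C. The lookahead lets overlapping sites be reported.
SITE = re.compile(r"(?=(G[ACGT]{4}C))")


def find_gnnnnc_sites(seq):
    """Return (start, site) for every 5'-GNNNNC-3' occurrence in seq (0-based start)."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in SITE.finditer(seq)]


print(find_gnnnnc_sites("aaGTTTTCaa"))
print(find_gnnnnc_sites("GAAAGCCCCC"))
```

A pattern this degenerate matches very often by chance, so such hits are only candidates and would need binding data to support them.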
Subject(s)
Plasmids/genetics , Staphylococcus aureus/genetics , Amino Acid Substitution , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Binding Sites , Genetic Conjugation , DNA Footprinting , Molecular Evolution , Phylogeny , Plasmids/classification
ABSTRACT
MOTIVATION: High-throughput chromatin immunoprecipitation (ChIP) sequencing-based assays capture genomic regions associated with the profiled transcription factor (TF). ChIP-exo is a modified protocol, which uses lambda exonuclease to digest DNA close to the TF-DNA complex, in order to improve on the positional resolution of the TF-DNA contact. Because the digestion occurs in the 5'-3' orientation, the protocol produces directional footprints close to the complex, on both sides of the double stranded DNA. Like all ChIP-based methods, ChIP-exo reports a mixture of different regions associated with the TF: those bound directly to the TF as well as via intermediaries. However, the distribution of footprints is likely to be indicative of the complex forming at the DNA. RESULTS: We present ExoDiversity, which uses a model-based framework to learn a joint distribution over footprints and motifs, thus resolving the mixture of ChIP-exo footprints into diverse binding modes. It uses no prior motif or TF information and automatically learns the number of different modes from the data. We show its application on a wide range of TFs and organisms/cell-types. Because its goal is to explain the complete set of reported regions, it is able to identify co-factor TF motifs that appear in a small fraction of the dataset. Further, ExoDiversity discovers small nucleotide variations within and outside canonical motifs, which co-occur with variations in footprints, suggesting that the TF-DNA structural configuration at those regions is likely to be different. Finally, we show that detected modes have specific DNA shape features and conservation signals, giving insights into the structure and function of the putative TF-DNA complexes. AVAILABILITY AND IMPLEMENTATION: The code for ExoDiversity is available at https://github.com/NarlikarLab/exoDIVERSITY. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
DNA , Exonucleases , Binding Sites , Chromatin Immunoprecipitation , DNA/metabolism , DNA Footprinting , Protein Binding , DNA Sequence Analysis
ABSTRACT
Genomic footprinting has emerged as an unbiased discovery method for transcription factor (TF) occupancy at cognate DNA in vivo. A basic premise of footprinting is that sequence-specific TF-DNA interactions are associated with localized resistance to nucleases, leaving observable signatures of cleavage within accessible chromatin. This phenomenon is interpreted to imply protection of the critical nucleotides by the stably bound protein factor. However, this model conflicts with previous reports of many TFs exchanging with specific binding sites in living cells on a timescale of seconds. We show that TFs with short DNA residence times have no footprints at bound motif elements. Moreover, the nuclease cleavage profile within a footprint originates from the DNA sequence in the factor-binding site, rather than from the protein occupying specific nucleotides. These findings suggest a revised understanding of TF footprinting and reveal limitations in comprehensive reconstruction of the TF regulatory network using this approach.
Subject(s)
Base Sequence , DNA Footprinting , DNA/metabolism , DNA Sequence Analysis , Transcription Factors/metabolism , Binding Sites/genetics , DNA/chemistry , DNA Cleavage , Deoxyribonuclease I/chemistry , Endodeoxyribonucleases/chemistry , Genomics , Humans , Protein Binding/genetics , Tertiary Protein Structure , ROC Curve , Transcription Factors/chemistry
ABSTRACT
Human mitochondrial DNA (mtDNA) is believed to lack chromatin and histones. Instead, it is coated solely by the transcription factor TFAM. We asked whether mtDNA packaging is more regulated than once thought. To address this, we analyzed DNase-seq experiments in 324 human cell types and found, for the first time, a pattern of 29 mtDNA genomic footprinting (mt-DGF) sites shared by ~90% of the samples. Their syntenic conservation in mouse DNase-seq experiments reflects selective constraints. Colocalization with known mtDNA regulatory elements, with G-quadruplex structures, in TFAM-poor sites (in HeLa cells), and with transcription pausing sites suggests a functional regulatory role for such mt-DGFs. An altered mt-DGF pattern in interleukin 3-treated CD34+ cells, certain tissue differences, and a significant prevalence change in fetal versus nonfetal samples offer first clues to their physiological importance. Taken together, human mtDNA has a conserved protein-DNA organization, which is likely involved in mtDNA regulation.
Subject(s)
Chromatin/genetics , Mitochondrial DNA/genetics , DNA-Binding Proteins/genetics , Human Genome , Mitochondrial Proteins/genetics , Transcription Factors/genetics , Animals , Cell Line , DNA Footprinting/methods , Deoxyribonucleases/genetics , G-Quadruplexes , Gene Expression Regulation , HeLa Cells , Humans , Mice , Mitochondria/genetics
ABSTRACT
Deoxyribonuclease I (DNase I)-hypersensitive site sequencing (DNase-seq) has been widely used to determine chromatin accessibility and its underlying regulatory lexicon. However, exploring DNase-seq data requires sophisticated downstream bioinformatics analyses. In this study, we first review computational methods for all of the major steps in DNase-seq data analysis, including experimental design, quality control, read alignment, peak calling, annotation of cis-regulatory elements, genomic footprinting and visualization. The challenges associated with each step are highlighted. Next, we provide a practical guideline and a computational pipeline for DNase-seq data analysis by integrating some of these tools. We also discuss the competing techniques and the potential applications of this pipeline for the analysis of analogous experimental data. Finally, we discuss the integration of DNase-seq with other functional genomics techniques.
Subject(s)
Computational Biology/methods , Data Management/methods , Deoxyribonuclease I/metabolism , DNA Sequence Analysis/methods , DNA Footprinting , Quality Control
ABSTRACT
CUT&RUN is a powerful tool to study protein-DNA interactions in vivo. DNA fragments cleaved by the targeted micrococcal nuclease identify the footprints of DNA-binding proteins on the chromatin. We performed CUT&RUN on human lung carcinoma cell line A549 maintained in a multi-well cell culture plate to profile RNA polymerase II. Long (> 270 bp) DNA fragments released by CUT&RUN corresponded to the bimodal peak around the transcription start sites, as previously seen with chromatin immunoprecipitation. However, we found that short (< 120 bp) fragments identify a well-defined peak localised at the transcription start sites. This distinct DNA footprint of short fragments, which constituted only about 5% of the total reads, suggests the transient positioning of RNA polymerase II before promoter-proximal pausing, which has not been detected in the physiological settings by standard chromatin immunoprecipitation. We showed that the positioning of the large-size-class DNA footprints around the short-fragment peak was associated with the directionality of transcription, demonstrating the biological significance of distinct CUT&RUN footprints of RNA polymerase II.
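The size classes used in this abstract (short fragments below 120 bp, long fragments above 270 bp) can be sketched as a simple binning step. This is only an illustration under assumptions: the function names, the "intermediate" label for everything in between, and the example lengths are hypothetical, and the abstract does not say how the authors' pipeline handled fragments between the two cutoffs.

```python
from collections import Counter

SHORT_MAX = 120  # fragments strictly shorter than this are "short" (< 120 bp)
LONG_MIN = 270   # fragments strictly longer than this are "long" (> 270 bp)


def size_class(length):
    """Assign a fragment length (bp) to a size class using the abstract's cutoffs."""
    if length < SHORT_MAX:
        return "short"
    if length > LONG_MIN:
        return "long"
    return "intermediate"  # label assumed here; not named in the abstract


def class_fractions(lengths):
    """Return the fraction of fragments falling in each size class."""
    counts = Counter(size_class(n) for n in lengths)
    total = len(lengths)
    return {name: counts[name] / total for name in ("short", "intermediate", "long")}


# Made-up fragment lengths, as might be derived from paired-end alignments.
example_lengths = [50, 119, 120, 200, 270, 271, 400]
print(class_fractions(example_lengths))
```

In practice the lengths would come from the insert sizes of properly paired reads, and each class would then be piled up separately around transcription start sites.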
Subject(s)
Binding Sites , Computational Biology , DNA Footprinting , RNA Polymerase II/metabolism , Software , Transcription Initiation Site , Chromatin/genetics , Chromatin Immunoprecipitation , Computational Biology/methods , DNA Footprinting/methods , DNA-Binding Proteins , Humans , Genetic Promoter Regions , Genetic Transcription
ABSTRACT
SNAIL1 is a key regulator of epithelial-mesenchymal transition (EMT) and its expression is associated with tumor progression and poor clinical prognosis of cancer patients. Compared to the studies of SNAIL1 stability and its transcriptional regulation, very limited knowledge is available regarding effective approaches to directly target SNAIL1. In this study, we revealed the potential regulation of SNAIL1 gene expression by G-quadruplex structures in its promoter. We first revealed that the negative strand of the SNAIL1 promoter contained a multi-G-tract region with high potential of forming G-quadruplex structures. In circular dichroism studies, the oligonucleotide based on this region showed characteristic molar ellipticity at specific wavelengths of G-quadruplex structures. We also utilized native polyacrylamide gel electrophoresis, gel-shift assays, immunofluorescent staining, dimethyl sulfate footprinting and chromatin immunoprecipitation studies to verify the G-quadruplex structures formed by the oligonucleotide. In reporter assays, disruption of G-quadruplex potential increased SNAIL1 promoter-mediated transcription, suggesting that G-quadruplexes played a negative role in SNAIL1 expression. In a DNA synthesis study, we detected G-quadruplex-mediated retardation in the SNAIL1 promoter replication. Consistently, we discovered that the G-quadruplex region of the SNAIL1 promoter is highly enriched for mutations, implicating the clinical relevance of G-quadruplexes to the altered SNAIL1 expression in cancer cells.