Results 1 - 20 of 50
1.
Genome Res ; 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38858087

ABSTRACT

Multiomics requires concerted recording of independent information, ideally from a single experiment. In this study, we introduce RIMS-seq2, a high-throughput technique to simultaneously sequence genomes and overlay methylation information while requiring only a small modification of the experimental protocol for high-throughput DNA sequencing to include a controlled deamination step. Importantly, the rate of deamination of 5-methylcytosine is negligible and thus does not interfere with standard DNA sequencing and data processing. Thus, RIMS-seq2 libraries from whole- or targeted-genome sequencing show the same germline variation calling accuracy and sensitivity as standard DNA-seq. Additionally, regional methylation levels provide an accurate map of the human methylome.

2.
Genome Res ; 32(1): 162-174, 2022 01.
Article in English | MEDLINE | ID: mdl-34815308

ABSTRACT

Determination of eukaryotic transcription start sites (TSSs) has been based on methods that require the cap structure at the 5' end of transcripts derived from Pol II RNA polymerase. Consequently, these methods do not reveal TSSs derived from the other RNA polymerases that also play critical roles in various cell functions. To address this limitation, we developed ReCappable-seq, which comprehensively identifies TSS for both Pol II and non-Pol II transcripts at single-nucleotide resolution. The method relies on specific enzymatic exchange of 5' m7G caps and 5' triphosphates with a selectable tag. When applied to human transcriptomes, ReCappable-seq identifies Pol II TSSs that are in agreement with orthogonal methods such as CAGE. Additionally, ReCappable-seq reveals a rich landscape of TSSs associated with Pol III transcripts that have not previously been amenable to study at genome-wide scale. Novel TSS from non-Pol II transcription can be located in the nuclear and mitochondrial genomes. ReCappable-seq interrogates the regulatory landscape of coding and noncoding RNA concurrently and enables the classification of epigenetic profiles associated with Pol II and non-Pol II TSS.


Subject(s)
DNA-Directed RNA Polymerases , RNA Polymerase II , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , RNA, Untranslated , Transcription Initiation Site , Transcriptome
3.
Genome Res ; 32(11-12): 2079-2091, 2022.
Article in English | MEDLINE | ID: mdl-36332968

ABSTRACT

Covalent modifications of genomic DNA are crucial for most organisms to survive. Amplicon-based high-throughput sequencing technologies erase all DNA modifications to retain only sequence information for the four canonical nucleobases, necessitating specialized technologies for ascertaining epigenetic information. To also capture base modification information, we developed Methyl-SNP-seq, a technology that takes advantage of the complementarity of the double helix to extract the methylation and original sequence information from a single DNA molecule. More specifically, Methyl-SNP-seq uses bisulfite conversion of one of the strands to identify cytosine methylation while retaining the original four-bases sequence information on the other strand. As both strands are locked together to link the dual readouts on a single paired-end read, Methyl-SNP-seq allows detecting the methylation status of any DNA even without a reference genome. Because one of the strands retains the original four nucleotide composition, Methyl-SNP-seq can also be used in conjunction with standard sequence-specific probes for targeted enrichment and amplification. We show the usefulness of this technology in a broad spectrum of applications ranging from allele-specific methylation analysis in humans to identification of methyltransferase specificity in complex bacterial communities.


Subject(s)
DNA Methylation , Epigenome , Humans , Sequence Analysis, DNA , DNA/genetics , Alleles , High-Throughput Nucleotide Sequencing , Sulfites/chemistry
4.
PLoS Genet ; 18(9): e1010389, 2022 09.
Article in English | MEDLINE | ID: mdl-36121836

ABSTRACT

Phosphorothioation (PT), in which a non-bridging oxygen is replaced by a sulfur, is one of the rare modifications discovered in bacteria and archaea that occurs on the sugar-phosphate backbone as opposed to the nucleobase moiety of DNA. While PT modification is widespread in the prokaryotic kingdom, how PT modifications are distributed in the genomes and their exact roles in the cell remain to be defined. In this study, we developed a simple and convenient technique called EcoWI-seq based on a modification-dependent restriction endonuclease to identify genomic positions of PT modifications. EcoWI-seq shows performance similar to that of other PT modification detection techniques and, additionally, is easily scalable while requiring little starting material. As a proof of principle, we applied EcoWI-seq to map the PT modifications at base resolution in the genomes of both the Salmonella enterica cerro 87 and E. coli expressing the dnd+ gene cluster. Specifically, we address whether the partial establishment of modified PT positions is a stochastic or deterministic process. EcoWI-seq reveals a systematic usage of the same subset of target sites in clones for which the PT modification has been independently established.


Subject(s)
Escherichia coli , Salmonella enterica , DNA/genetics , DNA Restriction Enzymes , DNA, Bacterial/genetics , Escherichia coli/genetics , High-Throughput Nucleotide Sequencing , Oxygen , Phosphates , Salmonella enterica/genetics , Sugars , Sulfur
5.
PLoS Genet ; 18(4): e1009943, 2022 04.
Article in English | MEDLINE | ID: mdl-35377874

ABSTRACT

Understanding mechanisms that shape horizontal exchange in prokaryotes is a key problem in biology. A major limit on DNA entry is imposed by restriction-modification (RM) processes that depend on the pattern of DNA modification at host-specified sites. In classical RM, endonucleolytic DNA cleavage follows detection of unprotected sites on entering DNA. Recent investigation has uncovered BREX (BacteRiophage EXclusion) systems. These RM-like activities employ host protection by DNA modification, but immediate replication arrest occurs without evidence of nuclease action on unmodified phage DNA. Here we show that the historical stySA RM locus of Salmonella enterica sv Typhimurium is a variant BREX system. A laboratory strain disabled for both the restriction and methylation activity of StySA nevertheless has wild type sequence in pglX, the modification gene homolog. Instead, flanking genes pglZ and brxC each carry multiple mutations (µ) in their C-terminal domains. We further investigate this system in situ, replacing the mutated pglZµ and brxCµ genes with the WT counterpart. PglZ-WT supports methylation in the presence of either BrxCµ or BrxC-WT but not in the presence of a deletion/insertion allele, ΔbrxC::cat. Restriction requires both BrxC-WT and PglZ-WT, implicating the BrxC C-terminus specifically in restriction activity. These results suggest that while BrxC, PglZ and PglX are principal components of the BREX modification activity, BrxL is required for restriction only. Furthermore, we show that a partial disruption of brxL disrupts transcription globally.


Subject(s)
Bacteriophages , Bacteriophages/genetics , Bacteriophages/metabolism , DNA, Viral , Methylation , Salmonella typhimurium/genetics , Salmonella typhimurium/metabolism
6.
Genome Res ; 31(2): 291-300, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33468551

ABSTRACT

The predominant methodology for DNA methylation analysis relies on the chemical deamination by sodium bisulfite of unmodified cytosine to uracil to permit the differential readout of methylated cytosines. Bisulfite treatment damages the DNA, leading to fragmentation and loss of long-range methylation information. To overcome this limitation of bisulfite-treated DNA, we applied a new enzymatic deamination approach, termed enzymatic methyl-seq (EM-seq), to long-range sequencing technologies. Our methodology, named long-read enzymatic modification sequencing (LR-EM-seq), preserves the integrity of DNA, allowing long-range methylation profiling of 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) over multikilobase lengths of genomic DNA. When applied to known differentially methylated regions (DMRs), LR-EM-seq achieves phasing of >5 kb, resulting in broader and better defined DMRs compared with those previously reported. This result showed the importance of phasing methylation for biologically relevant questions and the applicability of LR-EM-seq for long-range epigenetic analysis at single-molecule and single-nucleotide resolution.

7.
RNA ; 28(2): 162-176, 2022 02.
Article in English | MEDLINE | ID: mdl-34728536

ABSTRACT

Nanopore sequencing devices read individual RNA strands directly. This facilitates identification of exon linkages and nucleotide modifications; however, using conventional direct RNA nanopore sequencing, the 5' and 3' ends of poly(A) RNA cannot be identified unambiguously. This is due in part to RNA degradation in vivo and in vitro that can obscure transcription start and end sites. In this study, we aimed to identify individual full-length human RNA isoforms among ∼4 million nanopore poly(A)-selected RNA reads. First, to identify RNA strands bearing 5' m7G caps, we exchanged the biological cap for a modified cap attached to a 45-nt oligomer. This oligomer adaptation method improved 5' end sequencing and ensured correct identification of the 5' m7G capped ends. Second, among these 5'-capped nanopore reads, we screened for features consistent with a 3' polyadenylation site. Combining these two steps, we identified 294,107 individual high-confidence full-length RNA scaffolds from human GM12878 cells, most of which (257,721) aligned to protein-coding genes. Of these, 4876 scaffolds indicated unannotated isoforms that were often internal to longer, previously identified RNA isoforms. Orthogonal data for m7G caps and open chromatin, such as CAGE and DNase-HS seq, confirmed the validity of these high-confidence RNA scaffolds.


Subject(s)
RNA Isoforms/chemistry , RNA, Messenger/chemistry , Cell Line, Tumor , Humans , Nanopore Sequencing/methods , RNA 3' Polyadenylation Signals , RNA Isoforms/genetics , RNA, Messenger/genetics , Transcriptome
8.
Nucleic Acids Res ; 50(6): 3475-3489, 2022 04 08.
Article in English | MEDLINE | ID: mdl-35244721

ABSTRACT

The SARS-CoV-2 virus has a complex transcriptome characterised by multiple, nested subgenomic RNAs used to express structural and accessory proteins. Long-read sequencing technologies such as nanopore direct RNA sequencing can recover full-length transcripts, greatly simplifying the assembly of structurally complex RNAs. However, these techniques do not detect the 5' cap, thus preventing reliable identification and quantification of full-length, coding transcript models. Here we used Nanopore ReCappable Sequencing (NRCeq), a new technique that can identify capped full-length RNAs, to assemble a complete annotation of SARS-CoV-2 sgRNAs and annotate the location of capping sites across the viral genome. We obtained robust estimates of sgRNA expression across cell lines and viral isolates and identified novel canonical and non-canonical sgRNAs, including one that uses a previously un-annotated leader-to-body junction site. The data generated in this work constitute a useful resource for the scientific community and provide important insights into the mechanisms that regulate the transcription of SARS-CoV-2 sgRNAs.


Subject(s)
COVID-19 , Nanopores , RNA, Guide, Kinetoplastida/chemistry , COVID-19/genetics , Genome, Viral/genetics , Humans , RNA Caps , RNA, Viral/genetics , RNA, Viral/metabolism , SARS-CoV-2/genetics
9.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33957668

ABSTRACT

Alternative transcription units (ATUs) are dynamically encoded under different conditions and display overlapping patterns (sharing one or more genes) under a specific condition in bacterial genomes. Genome-scale identification of ATUs is essential for studying the emergence of human diseases caused by bacterial organisms. However, it is unrealistic to identify all ATUs using experimental techniques because of the complexity and dynamic nature of ATUs. Here, we present the first-of-its-kind computational framework, named SeqATU, for genome-scale ATU prediction based on next-generation RNA-Seq data. The framework utilizes a convex quadratic programming model to seek an optimum expression combination of all of the to-be-identified ATUs. The predicted ATUs in Escherichia coli reached a precision of 0.77/0.74 and a recall of 0.75/0.76 in the two RNA-Sequencing datasets compared with the benchmarked ATUs from third-generation RNA-Seq data. In addition, the proportion of 5'- or 3'-end genes of the predicted ATUs, having documented transcription factor binding sites and transcription termination sites, was three times greater than that of no 5'- or 3'-end genes. We further evaluated the predicted ATUs by Gene Ontology and Kyoto Encyclopedia of Genes and Genomes functional enrichment analyses. The results suggested that gene pairs frequently encoded in the same ATUs are more functionally related than those that can belong to two distinct ATUs. Overall, these results demonstrated the high reliability of predicted ATUs. We expect that the new insights derived by SeqATU will not only improve the understanding of the transcription mechanism of bacteria but also guide the reconstruction of a genome-scale transcriptional regulatory network.


Subject(s)
Computational Biology/methods , Genome-Wide Association Study/methods , RNA Isoforms , Transcription, Genetic , Algorithms , Bacteria/genetics , Databases, Genetic , Escherichia coli/genetics , Genome, Bacterial , Genomics/methods , Humans , RNA, Messenger/genetics , RNA-Seq , Single-Cell Analysis/methods , Terminator Regions, Genetic , Transcription Initiation Site
10.
Nucleic Acids Res ; 49(19): e113, 2021 11 08.
Article in English | MEDLINE | ID: mdl-34417598

ABSTRACT

DNA methylation is widespread amongst eukaryotes and prokaryotes to modulate gene expression and confer viral resistance. 5-Methylcytosine (m5C) methylation has been described in genomes of a large fraction of bacterial species as part of restriction-modification systems, each composed of a methyltransferase and cognate restriction enzyme. Methylases are site-specific and target sequences vary across organisms. High-throughput methods, such as bisulfite-sequencing, can identify m5C at base resolution but require specialized library preparations, and single molecule, real-time (SMRT) sequencing usually misses m5C. Here, we present a new method called RIMS-seq (rapid identification of methylase specificity) to simultaneously sequence bacterial genomes and determine m5C methylase specificities using a simple experimental protocol that closely resembles the DNA-seq protocol for Illumina. Importantly, the resulting sequencing quality is identical to DNA-seq, enabling RIMS-seq to substitute standard sequencing of bacterial genomes. Applied to bacteria and synthetic mixed communities, RIMS-seq reveals new methylase specificities, supporting routine study of m5C methylation while sequencing new genomes.


Subject(s)
5-Methylcytosine/metabolism , DNA Modification Methylases/metabolism , DNA Restriction Enzymes/metabolism , Escherichia coli K12/genetics , Genome, Bacterial , High-Throughput Nucleotide Sequencing/methods , Acinetobacter calcoaceticus/enzymology , Acinetobacter calcoaceticus/genetics , Aeromonas hydrophila/enzymology , Aeromonas hydrophila/genetics , Bacillus amyloliquefaciens/enzymology , Bacillus amyloliquefaciens/genetics , Base Sequence , Clostridium acetobutylicum/enzymology , Clostridium acetobutylicum/genetics , DNA Methylation , DNA Modification Methylases/genetics , DNA Restriction Enzymes/genetics , Escherichia coli K12/enzymology , Gene Expression Regulation, Bacterial , Haemophilus/enzymology , Haemophilus/genetics , Haemophilus influenzae/enzymology , Haemophilus influenzae/genetics , Humans , Microbiota/genetics , Sequence Analysis, DNA , Skin/microbiology
11.
PLoS Biol ; 17(4): e3000185, 2019 04.
Article in English | MEDLINE | ID: mdl-30947255

ABSTRACT

Dmrt1 is a highly conserved transcription factor, which is critically involved in regulation of gonad development of vertebrates. In medaka, a duplicate of dmrt1, acting as master sex-determining gene, has a tightly controlled temporal and spatial gonadal expression pattern. In addition to transcriptional regulation, a sequence motif in the 3' UTR (D3U-box) mediates transcript stability of dmrt1 mRNAs from medaka and other vertebrates. We show here that in medaka, two RNA-binding proteins with antagonizing properties target this D3U-box, promoting either RNA stabilization in germ cells or degradation in the soma. The D3U-box is also conserved in other germ-cell transcripts, making them responsive to the same RNA binding proteins. The evolutionary conservation of the D3U-box motif within dmrt1 genes of metazoans, together with preserved expression patterns of the targeting RNA binding proteins in subsets of germ cells, suggests that this new mechanism for controlling RNA stability is not restricted to fishes but might also apply to other vertebrates.


Subject(s)
Gene Expression Regulation, Developmental/genetics , Oryzias/genetics , Sex Determination Processes/genetics , 3' Untranslated Regions/genetics , Animals , Biological Evolution , Female , Fish Proteins/genetics , Germ Cells/metabolism , Male , RNA Recognition Motif Proteins/metabolism , RNA Stability/genetics , RNA, Messenger/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism , Vertebrates/metabolism
12.
Genes Dev ; 27(16): 1769-86, 2013 Aug 15.
Article in English | MEDLINE | ID: mdl-23964093

ABSTRACT

The majority of neural stem cells (NSCs) in the adult brain are quiescent, and this fraction increases with aging. Although signaling pathways that promote NSC quiescence have been identified, the transcriptional mechanisms involved are mostly unknown, largely due to lack of a cell culture model. In this study, we first demonstrate that NSC cultures (NS cells) exposed to BMP4 acquire cellular and transcriptional characteristics of quiescent cells. We then use epigenomic profiling to identify enhancers associated with the quiescent NS cell state. Motif enrichment analysis of these enhancers predicts a major role for the nuclear factor one (NFI) family in the gene regulatory network controlling NS cell quiescence. Interestingly, we found that the family member NFIX is robustly induced when NS cells enter quiescence. Using genome-wide location analysis and overexpression and silencing experiments, we demonstrate that NFIX has a major role in the induction of quiescence in cultured NSCs. Transcript profiling of NS cells overexpressing or silenced for Nfix and the phenotypic analysis of the hippocampus of Nfix mutant mice suggest that NFIX controls the quiescent state by regulating the interactions of NSCs with their microenvironment.


Subject(s)
Epigenesis, Genetic , NFI Transcription Factors/metabolism , Neural Stem Cells/cytology , Neural Stem Cells/metabolism , Animals , Bone Morphogenetic Protein 4/pharmacology , Cell Proliferation/drug effects , Cells, Cultured , Enhancer Elements, Genetic , Gene Expression Profiling , Gene Expression Regulation, Developmental/drug effects , HEK293 Cells , Humans , Mice , NFI Transcription Factors/genetics , Neural Stem Cells/drug effects , Protein Binding
13.
Genes Dev ; 25(9): 930-45, 2011 May 01.
Article in English | MEDLINE | ID: mdl-21536733

ABSTRACT

Proneural genes such as Ascl1 are known to promote cell cycle exit and neuronal differentiation when expressed in neural progenitor cells. The mechanisms by which proneural genes activate neurogenesis (and, in particular, the genes that they regulate), however, are mostly unknown. We performed a genome-wide characterization of the transcriptional targets of Ascl1 in the embryonic brain and in neural stem cell cultures by location analysis and expression profiling of embryos overexpressing or mutant for Ascl1. The wide range of molecular and cellular functions represented among these targets suggests that Ascl1 directly controls the specification of neural progenitors as well as the later steps of neuronal differentiation and neurite outgrowth. Surprisingly, Ascl1 also regulates the expression of a large number of genes involved in cell cycle progression, including canonical cell cycle regulators and oncogenic transcription factors. Mutational analysis in the embryonic brain and manipulation of Ascl1 activity in neural stem cell cultures revealed that Ascl1 is indeed required for normal proliferation of neural progenitors. This study identified a novel and unexpected activity of the proneural gene Ascl1, and revealed a direct molecular link between the phase of expansion of neural progenitors and the subsequent phases of cell cycle exit and neuronal differentiation.


Subject(s)
Basic Helix-Loop-Helix Transcription Factors/metabolism , Gene Expression Regulation, Developmental , Neural Stem Cells/cytology , Neural Stem Cells/metabolism , Neurogenesis , Telencephalon/cytology , Telencephalon/embryology , Animals , Basic Helix-Loop-Helix Transcription Factors/genetics , Cell Differentiation , Cell Line , Cell Proliferation , Cells, Cultured , Female , Gene Expression Profiling , Gene Knockdown Techniques , Genome-Wide Association Study , Mice , Pregnancy
14.
Genome Res ; 25(1): 41-56, 2015 01.
Article in English | MEDLINE | ID: mdl-25294244

ABSTRACT

The gene regulatory network (GRN) that supports neural stem cell (NS cell) self-renewal has so far been poorly characterized. Knowledge of the central transcription factors (TFs), the noncoding gene regulatory regions that they bind to, and the genes whose expression they modulate will be crucial in unlocking the full therapeutic potential of these cells. Here, we use DNase-seq in combination with analysis of histone modifications to identify multiple classes of epigenetically and functionally distinct cis-regulatory elements (CREs). Through motif analysis and ChIP-seq, we identify several of the crucial TF regulators of NS cells. At the core of the network are TFs of the basic helix-loop-helix (bHLH), nuclear factor I (NFI), SOX, and FOX families, with CREs often densely bound by several of these different TFs. We use machine learning to highlight several crucial regulatory features of the network that underpin NS cell self-renewal and multipotency. We validate our predictions by functional analysis of the bHLH TF OLIG2. This TF makes an important contribution to NS cell self-renewal by concurrently activating pro-proliferation genes and preventing the untimely activation of genes promoting neuronal differentiation and stem cell quiescence.


Subject(s)
Basic Helix-Loop-Helix Transcription Factors/metabolism , Gene Expression Regulation, Developmental , Gene Regulatory Networks , Nerve Tissue Proteins/metabolism , Neural Stem Cells/cytology , Animals , Basic Helix-Loop-Helix Transcription Factors/genetics , Cell Differentiation , Cells, Cultured , Cluster Analysis , Epigenomics , Logistic Models , Mice , Microarray Analysis , Models, Theoretical , NFI Transcription Factors/genetics , NFI Transcription Factors/metabolism , Nerve Tissue Proteins/genetics , Oligodendrocyte Transcription Factor 2 , Regulatory Sequences, Nucleic Acid , SOX Transcription Factors/genetics , SOX Transcription Factors/metabolism , Sequence Analysis, DNA
15.
Genome Res ; 24(3): 390-400, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24398455

ABSTRACT

Long-range regulatory interactions play an important role in shaping gene-expression programs. However, the genomic features that organize these activities are still poorly characterized. We conducted a large operational analysis to chart the distribution of gene regulatory activities along the mouse genome, using hundreds of insertions of a regulatory sensor. We found that enhancers distribute their activities along broad regions and not in a gene-centric manner, defining large regulatory domains. Remarkably, these domains correlate strongly with the recently described TADs, which partition the genome into distinct self-interacting blocks. Different features, including specific repeats and CTCF-binding sites, correlate with the transition zones separating regulatory domains, and may help to further organize promiscuously distributed regulatory influences within large domains. These findings support a model of genomic organization where TADs confine regulatory activities to specific but large regulatory domains, contributing to the establishment of specific gene expression profiles.


Subject(s)
Binding Sites , Enhancer Elements, Genetic , Animals , CCCTC-Binding Factor , Cell Cycle Proteins/metabolism , Chromosomal Proteins, Non-Histone/metabolism , Embryo, Mammalian , Genome , Mice , Regulatory Sequences, Nucleic Acid , Repetitive Sequences, Nucleic Acid , Repressor Proteins/metabolism , Cohesins
16.
BMC Genomics ; 17: 199, 2016 Mar 08.
Article in English | MEDLINE | ID: mdl-26951544

ABSTRACT

BACKGROUND: The initiating nucleotide found at the 5' end of primary transcripts has a distinctive triphosphorylated end that distinguishes these transcripts from all other RNA species. Recognizing this distinction is key to deconvoluting the primary transcriptome from the plethora of processed transcripts that confound analysis of the transcriptome. The currently available methods do not use targeted enrichment for the 5' end of primary transcripts, but rather attempt to deplete non-targeted RNA. RESULTS: We developed a method, Cappable-seq, for directly enriching for the 5' end of primary transcripts and enabling determination of transcription start sites at single base resolution. This is achieved by enzymatically modifying the 5' triphosphorylated end of RNA with a selectable tag. We first applied Cappable-seq to E. coli, achieving up to 50-fold enrichment of primary transcripts and identifying an unprecedented 16539 transcription start sites (TSS) genome-wide at single base resolution. We also applied Cappable-seq to a mouse cecum sample and identified TSS in a microbiome. CONCLUSIONS: Cappable-seq allows for the first time the capture of the 5' end of primary transcripts. This enables a unique robust TSS determination in bacteria and microbiomes. In addition to and beyond TSS determination, Cappable-seq depletes ribosomal RNA and reduces the complexity of the transcriptome to a single quantifiable tag per transcript, enabling digital profiling of gene expression in any microbiome.


Subject(s)
Escherichia coli/genetics , Gastrointestinal Microbiome/genetics , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Transcription Initiation Site , Animals , Female , Mice , Mice, Inbred C57BL , Promoter Regions, Genetic , RNA, Bacterial/genetics , Transcriptome
19.
PLoS Biol ; 9(11): e1001188, 2011 Nov.
Article in English | MEDLINE | ID: mdl-22069375

ABSTRACT

Evolutionary innovation relies partially on changes in gene regulation. While a growing body of evidence demonstrates that such innovation is generated by functional changes or translocation of regulatory elements via mobile genetic elements, the de novo generation of enhancers from non-regulatory/non-mobile sequences has, to our knowledge, not previously been demonstrated. Here we show evidence for the de novo genesis of enhancers in vertebrates. For this, we took advantage of the massive gene loss following the last whole genome duplication in teleosts to systematically identify regions that have lost their coding capacity but retain sequence conservation with mammals. We found that these regions show enhancer activity while the orthologous coding regions have no regulatory activity. These results demonstrate that these enhancers have been de novo generated in fish. By revealing that minor changes in non-regulatory sequences are sufficient to generate new enhancers, our study highlights an important playground for creating new regulatory variability and evolutionary innovation.


Subject(s)
Enhancer Elements, Genetic , Evolution, Molecular , Smegmamorpha/genetics , Animals , Cloning, Molecular , Computational Biology , Conserved Sequence , Exons , Gene Duplication , Genes, Reporter , Genetic Loci , Genetic Variation , Humans , In Situ Hybridization/methods , Mammals , Mice , Sequence Alignment , Smegmamorpha/metabolism , Synteny , Transcription Factors/genetics , Transcription Factors/metabolism
20.
Front Microbiol ; 15: 1286822, 2024.
Article in English | MEDLINE | ID: mdl-38655080

ABSTRACT

Winged helix (wH) domains, also termed winged helix-turn-helix (wHTH) domains, are widespread in all kingdoms of life and have diverse roles. In the context of DNA binding and DNA modification sensing, some eukaryotic wH domains are known as sensors of non-methylated CpG. In contrast, the prokaryotic wH domains in DpnI and HhiV4I act as sensors of adenine methylation in the 6mApT (N6-methyladenine, 6mA, or N6mA) context. DNA-binding modes and interactions with the probed dinucleotide are vastly different in the two cases. Here, we show that the role of the wH domain as a sensor of adenine methylation is widespread in prokaryotes. We present previously uncharacterized examples of PD-(D/E)XK-wH (FcyTI, Psp4BI), PUA-wH-HNH (HtuIII), wH-GIY-YIG (Ahi29725I, Apa233I), and PLD-wH (Aba4572I, CbaI) fusion endonucleases that sense adenine methylation in the Dam+ Gm6ATC sequence contexts. Representatives of the wH domain endonuclease fusion families with the exception of the PLD-wH family could be purified, and an in vitro preference for adenine methylation in the Dam context could be demonstrated. Like most other modification-dependent restriction endonucleases (MDREs, also called type IV restriction systems), the new fusion endonucleases except those in the PD-(D/E)XK-wH family cleave close to but outside the recognition sequence. Taken together, our data illustrate the widespread combinatorial use of prokaryotic wH domains as adenine methylation readers. Other potential 6mA sensors in modified DNA are also discussed.
