Results 1 - 20 of 51
1.
bioRxiv ; 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39071437

ABSTRACT

Methylation patterns in bacteria can be used to study Restriction-Modification (RM) or other defense systems with novel properties. While m4C and m6A methylation is well characterized mainly through PacBio sequencing, the landscape of m5C methylation is under-characterized. To bridge this gap, we performed RIMS-seq2 on microbiomes composed of resolved assemblies of distinct genomes through proximity ligation. This high-throughput approach enables the identification of m5C methylated motifs and links them to cognate methyltransferases directly on native microbiomes without the need to isolate bacterial strains. Methylation patterns can also be identified on viral DNA and compared to host DNA, strengthening evidence for virus-host interaction. Applied to three different microbiomes, the method unveils over 1900 motifs that were deposited in REBASE. The motifs include a novel 8-base recognition site (CATm5CGATG) that was experimentally validated by characterizing its cognate methyltransferase. Our findings suggest that microbiomes harbor arrays of untapped m5C methyltransferase specificities, providing insights into bacterial biology and biotechnological applications.

2.
Genome Res ; 34(6): 904-913, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38858087

ABSTRACT

Multiomics require concerted recording of independent information, ideally from a single experiment. In this study, we introduce RIMS-seq2, a high-throughput technique to simultaneously sequence genomes and overlay methylation information while requiring only a small modification of the experimental protocol for high-throughput DNA sequencing to include a controlled deamination step. Importantly, the rate of deamination of 5-methylcytosine is negligible and thus does not interfere with standard DNA sequencing and data processing. Thus, RIMS-seq2 libraries from whole- or targeted-genome sequencing show the same germline variation calling accuracy and sensitivity compared with standard DNA-seq. Additionally, regional methylation levels provide an accurate map of the human methylome.


Subject(s)
DNA Methylation , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Deamination , High-Throughput Nucleotide Sequencing/methods , Epigenome , Cytosine/metabolism , 5-Methylcytosine/metabolism , Sequence Analysis, DNA/methods
3.
Front Microbiol ; 15: 1286822, 2024.
Article in English | MEDLINE | ID: mdl-38655080

ABSTRACT

Winged helix (wH) domains, also termed winged helix-turn-helix (wHTH) domains, are widespread in all kingdoms of life and have diverse roles. In the context of DNA binding and DNA modification sensing, some eukaryotic wH domains are known as sensors of non-methylated CpG. In contrast, the prokaryotic wH domains in DpnI and HhiV4I act as sensors of adenine methylation in the 6mApT (N6-methyladenine, 6mA, or N6mA) context. DNA-binding modes and interactions with the probed dinucleotide are vastly different in the two cases. Here, we show that the role of the wH domain as a sensor of adenine methylation is widespread in prokaryotes. We present previously uncharacterized examples of PD-(D/E)XK-wH (FcyTI, Psp4BI), PUA-wH-HNH (HtuIII), wH-GIY-YIG (Ahi29725I, Apa233I), and PLD-wH (Aba4572I, CbaI) fusion endonucleases that sense adenine methylation in the Dam+ Gm6ATC sequence contexts. Representatives of the wH domain endonuclease fusion families with the exception of the PLD-wH family could be purified, and an in vitro preference for adenine methylation in the Dam context could be demonstrated. Like most other modification-dependent restriction endonucleases (MDREs, also called type IV restriction systems), the new fusion endonucleases except those in the PD-(D/E)XK-wH family cleave close to but outside the recognition sequence. Taken together, our data illustrate the widespread combinatorial use of prokaryotic wH domains as adenine methylation readers. Other potential 6mA sensors in modified DNA are also discussed.

4.
Genome Biol Evol ; 2023 May 08.
Article in English | MEDLINE | ID: mdl-37154102

ABSTRACT

The intracellular endosymbiotic proteobacteria Wolbachia have evolved across the phyla nematoda and arthropoda. In Wolbachia phylogeny, supergroup F is the only clade known so far with members from both arthropod and filarial nematode hosts and therefore can provide unique insights into their evolution and biology. In this study, 4 new supergroup F Wolbachia genomes have been assembled using a metagenomic assembly and binning approach, wMoz and wMpe from the human filarial parasites Mansonella ozzardi and Mansonella perstans, and wOcae and wMoviF from the blue mason bee Osmia caerulescens and the sheep ked Melophagus ovinus respectively. A comprehensive phylogenomic analysis revealed two distinct lineages of filarial Wolbachia in supergroup F, indicating multiple horizontal transfer events between arthropod and nematode hosts. The analysis also reveals that the evolution of Wolbachia-filaria symbioses is accompanied by a convergent pseudogenization and loss of the bacterioferritin gene, a phenomenon found to be shared by all filarial Wolbachia, even those outside supergroup F. These observations indicate that differences in heme metabolism might be a key feature distinguishing filarial and arthropod Wolbachia. The new genomes provide a valuable resource for further studies on symbiosis, evolution, and the discovery of new antibiotics to treat mansonellosis.

5.
ACS Synth Biol ; 11(12): 4077-4088, 2022 12 16.
Article in English | MEDLINE | ID: mdl-36427328

ABSTRACT

Control of gene expression is fundamental to cell engineering. Here we demonstrate a set of approaches to tune gene expression in Clostridia using the model Clostridium phytofermentans. Initially, we develop a simple benchtop electroporation method that we use to identify a set of replicating plasmids and resistance markers that can be cotransformed into C. phytofermentans. We define a series of promoters spanning a >100-fold expression range by testing a promoter library driving the expression of a luminescent reporter. By insertion of tet operator sites upstream of the reporter, its expression can be quantitatively altered using the Tet repressor and anhydrotetracycline (aTc). We integrate these methods into an aTc-regulated dCas12a system with which we show in vivo CRISPRi-mediated repression of reporter and fermentation genes in C. phytofermentans. Together, these approaches advance genetic transformation and experimental control of gene expression in Clostridia.


Subject(s)
Clostridiales , Clostridium , Clostridiales/genetics , Promoter Regions, Genetic/genetics , Clostridium/genetics , Clostridium/metabolism , Gene Expression
6.
Genome Res ; 32(11-12): 2079-2091, 2022.
Article in English | MEDLINE | ID: mdl-36332968

ABSTRACT

Covalent modifications of genomic DNA are crucial for most organisms to survive. Amplicon-based high-throughput sequencing technologies erase all DNA modifications to retain only sequence information for the four canonical nucleobases, necessitating specialized technologies for ascertaining epigenetic information. To also capture base modification information, we developed Methyl-SNP-seq, a technology that takes advantage of the complementarity of the double helix to extract the methylation and original sequence information from a single DNA molecule. More specifically, Methyl-SNP-seq uses bisulfite conversion of one of the strands to identify cytosine methylation while retaining the original four-bases sequence information on the other strand. As both strands are locked together to link the dual readouts on a single paired-end read, Methyl-SNP-seq allows detecting the methylation status of any DNA even without a reference genome. Because one of the strands retains the original four nucleotide composition, Methyl-SNP-seq can also be used in conjunction with standard sequence-specific probes for targeted enrichment and amplification. We show the usefulness of this technology in a broad spectrum of applications ranging from allele-specific methylation analysis in humans to identification of methyltransferase specificity in complex bacterial communities.
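
As a rough illustration of the dual readout described in this abstract, the sketch below compares an unconverted read with its bisulfite-converted counterpart and labels each cytosine. It assumes that both reads have already been oriented to the same strand and aligned base for base; the function name `call_methylation` and the toy sequences are hypothetical and are not taken from the Methyl-SNP-seq software.

```python
def call_methylation(original: str, converted: str) -> list[tuple[int, str]]:
    """Label each cytosine of `original` using the bisulfite-converted copy.

    Assumption (not from the abstract): the two reads are the same length
    and already share coordinates, so position i in one matches position i
    in the other.
    """
    if len(original) != len(converted):
        raise ValueError("reads must have the same length")

    calls = []
    for pos, (base, conv) in enumerate(zip(original.upper(), converted.upper())):
        if base != "C":
            continue  # only cytosines carry the methylation readout
        if conv == "C":
            # Methylated cytosine resists bisulfite deamination.
            calls.append((pos, "methylated"))
        elif conv == "T":
            # Unmethylated cytosine is deaminated to uracil and read as T.
            calls.append((pos, "unmethylated"))
        else:
            # Any other base suggests a sequencing error or a mismatch.
            calls.append((pos, "ambiguous"))
    return calls


if __name__ == "__main__":
    # Toy example: the C at index 1 is protected, the C at index 4 is converted.
    print(call_methylation("ACGTCA", "ACGTTA"))
```

Because the unconverted strand supplies the four-base sequence, this kind of comparison needs no reference genome, which is the property the abstract emphasizes.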


Subject(s)
DNA Methylation , Epigenome , Humans , Sequence Analysis, DNA , DNA/genetics , Alleles , High-Throughput Nucleotide Sequencing , Sulfites/chemistry
7.
PLoS One ; 17(10): e0276315, 2022.
Article in English | MEDLINE | ID: mdl-36251663

ABSTRACT

The luciferin sulfokinase (coelenterazine sulfotransferase) of Renilla was previously reported to activate the storage form, luciferyl sulfate (coelenterazine sulfate) to luciferin (coelenterazine), the substrate for the luciferase bioluminescence reaction. The gene coding for the coelenterazine sulfotransferase has not been identified. Here we used a combined proteomic/transcriptomic approach to identify and clone the sulfotransferase cDNA. Multiple isoforms of coelenterazine sulfotransferase were identified from the anthozoan Renilla muelleri by intersecting its transcriptome with the LC-MS/MS derived peptide sequences of coelenterazine sulfotransferase purified from Renilla. Two of the isoforms were expressed in E. coli, purified, and partially characterized. The encoded enzymes display sulfotransferase activity that is comparable to that of the native sulfotransferase isolated from Renilla reniformis that was reported in 1970. The bioluminescent assay for sensitive detection of 3'-phosphoadenosine 5'-phosphate (PAP) using the recombinant sulfotransferase is demonstrated.


Subject(s)
Escherichia coli , Proteomics , Animals , Arylsulfotransferase , Chromatography, Liquid , DNA, Complementary , Escherichia coli/genetics , Imidazoles , Luciferases/genetics , Luminescent Measurements , Pyrazines , Renilla/genetics , Sulfates , Sulfotransferases/genetics , Tandem Mass Spectrometry
8.
Bio Protoc ; 12(17)2022 Sep 05.
Article in English | MEDLINE | ID: mdl-36245800

ABSTRACT

Nucleic acids in living organisms are more complex than the simple combinations of the four canonical nucleotides. Recent advances in biomedical research have led to the discovery of numerous naturally occurring nucleotide modifications and enzymes responsible for the synthesis of such modifications. In turn, these enzymes can be leveraged towards toolkits for DNA and RNA manipulation for epigenetic sequencing or other biotechnological applications. Here, we present the protocol to obtain purified 5-hydroxymethylcytosine carbamoyltransferase enzymes and the associated assays to convert 5-hydroxymethylcytosine to 5-carbamoyloxymethylcytosine in vitro. We include detailed assays using DNA, RNA, and single nucleotide/deoxynucleotide as substrates. These assays can be combined with downstream applications for genetic/epigenetic regulatory mechanism studies and next-generation sequencing purposes.

9.
PLoS Genet ; 18(9): e1010389, 2022 09.
Article in English | MEDLINE | ID: mdl-36121836

ABSTRACT

Phosphorothioation (PT), in which a non-bridging oxygen is replaced by a sulfur, is one of the rare modifications discovered in bacteria and archaea that occurs on the sugar-phosphate backbone as opposed to the nucleobase moiety of DNA. While PT modification is widespread in the prokaryotic kingdom, how PT modifications are distributed in the genomes and their exact roles in the cell remain to be defined. In this study, we developed a simple and convenient technique called EcoWI-seq based on a modification-dependent restriction endonuclease to identify genomic positions of PT modifications. EcoWI-seq performs similarly to other PT modification detection techniques and, additionally, is easily scalable while requiring little starting material. As a proof of principle, we applied EcoWI-seq to map the PT modifications at base resolution in the genomes of both the Salmonella enterica cerro 87 and E. coli expressing the dnd+ gene cluster. Specifically, we address whether the partial establishment of modified PT positions is a stochastic or deterministic process. EcoWI-seq reveals a systematic usage of the same subset of target sites in clones for which the PT modification has been independently established.


Subject(s)
Escherichia coli , Salmonella enterica , DNA/genetics , DNA Restriction Enzymes , DNA, Bacterial/genetics , Escherichia coli/genetics , High-Throughput Nucleotide Sequencing , Oxygen , Phosphates , Salmonella enterica/genetics , Sugars , Sulfur
10.
Front Immunol ; 13: 909904, 2022.
Article in English | MEDLINE | ID: mdl-35844560

ABSTRACT

As the goal of a bacterium is to become bacteria, evolution has imposed continued selections for gene expression. The intracellular pathogen Mycobacterium tuberculosis, the causative agent of tuberculosis, has adopted a fine-tuned response to survive its host's methods to aggressively eradicate invaders. The development of microarrays and later RNA sequencing has led to a better understanding of biological processes controlling the relationship between host and pathogens. In this study, RNA-seq was performed to detail the transcriptomes of M. tuberculosis grown in various conditions related to stresses endured by M. tuberculosis during host infection and to delineate a general stress response occurring during persisting macrophage stresses. M. tuberculosis was subjected to long-term growth, nutrient starvation, hypoxic and acidic environments. The commonalities between these stresses point to M. tuberculosis maneuvering to exploit propionate metabolism for lipid synthesis or to withstand propionate toxicity whilst in the intracellular environment. While nearly all stresses led to a general shutdown of most biological processes, up-regulation of pathways involved in the synthesis of amino acids, cofactors, and lipids was observed only in hypoxic M. tuberculosis. These data reveal genes and gene cohorts that are specifically or exclusively induced during all of these persisting stresses. Such knowledge could be used to design novel drug targets or to define possible M. tuberculosis vulnerabilities for vaccine development. Furthermore, the disruption of specific functions from this gene set will enhance our understanding of the evolutionary forces that have caused the tubercle bacillus to be a highly successful pathogen.


Subject(s)
Mycobacterium tuberculosis , Tuberculosis, Lymph Node , Humans , Macrophages/microbiology , Mycobacterium tuberculosis/physiology , Propionates/metabolism , Transcriptome
11.
PLoS Genet ; 18(4): e1009943, 2022 04.
Article in English | MEDLINE | ID: mdl-35377874

ABSTRACT

Understanding mechanisms that shape horizontal exchange in prokaryotes is a key problem in biology. A major limit on DNA entry is imposed by restriction-modification (RM) processes that depend on the pattern of DNA modification at host-specified sites. In classical RM, endonucleolytic DNA cleavage follows detection of unprotected sites on entering DNA. Recent investigation has uncovered BREX (BacteRiophage EXclusion) systems. These RM-like activities employ host protection by DNA modification, but immediate replication arrest occurs without evidence of nuclease action on unmodified phage DNA. Here we show that the historical stySA RM locus of Salmonella enterica sv Typhimurium is a variant BREX system. A laboratory strain disabled for both the restriction and methylation activity of StySA nevertheless has wild type sequence in pglX, the modification gene homolog. Instead, flanking genes pglZ and brxC each carry multiple mutations (µ) in their C-terminal domains. We further investigate this system in situ, replacing the mutated pglZµ and brxCµ genes with the WT counterpart. PglZ-WT supports methylation in the presence of either BrxCµ or BrxC-WT but not in the presence of a deletion/insertion allele, ΔbrxC::cat. Restriction requires both BrxC-WT and PglZ-WT, implicating the BrxC C-terminus specifically in restriction activity. These results suggest that while BrxC, PglZ and PglX are principal components of the BREX modification activity, BrxL is required for restriction only. Furthermore, we show that a partial disruption of brxL disrupts transcription globally.


Subject(s)
Bacteriophages , Bacteriophages/genetics , Bacteriophages/metabolism , DNA, Viral , Methylation , Salmonella typhimurium/genetics , Salmonella typhimurium/metabolism
12.
Nucleic Acids Res ; 50(6): 3475-3489, 2022 04 08.
Article in English | MEDLINE | ID: mdl-35244721

ABSTRACT

The SARS-CoV-2 virus has a complex transcriptome characterised by multiple, nested subgenomic RNAs used to express structural and accessory proteins. Long-read sequencing technologies such as nanopore direct RNA sequencing can recover full-length transcripts, greatly simplifying the assembly of structurally complex RNAs. However, these techniques do not detect the 5' cap, thus preventing reliable identification and quantification of full-length, coding transcript models. Here we used Nanopore ReCappable Sequencing (NRCeq), a new technique that can identify capped full-length RNAs, to assemble a complete annotation of SARS-CoV-2 sgRNAs and annotate the location of capping sites across the viral genome. We obtained robust estimates of sgRNA expression across cell lines and viral isolates and identified novel canonical and non-canonical sgRNAs, including one that uses a previously un-annotated leader-to-body junction site. The data generated in this work constitute a useful resource for the scientific community and provide important insights into the mechanisms that regulate the transcription of SARS-CoV-2 sgRNAs.


Subject(s)
COVID-19 , Nanopores , RNA, Guide, Kinetoplastida/chemistry , COVID-19/genetics , Genome, Viral/genetics , Humans , RNA Caps , RNA, Viral/genetics , RNA, Viral/metabolism , SARS-CoV-2/genetics
13.
Genome Res ; 32(1): 162-174, 2022 01.
Article in English | MEDLINE | ID: mdl-34815308

ABSTRACT

Determination of eukaryotic transcription start sites (TSSs) has been based on methods that require the cap structure at the 5' end of transcripts derived from Pol II RNA polymerase. Consequently, these methods do not reveal TSSs derived from the other RNA polymerases that also play critical roles in various cell functions. To address this limitation, we developed ReCappable-seq, which comprehensively identifies TSS for both Pol II and non-Pol II transcripts at single-nucleotide resolution. The method relies on specific enzymatic exchange of 5' m7G caps and 5' triphosphates with a selectable tag. When applied to human transcriptomes, ReCappable-seq identifies Pol II TSSs that are in agreement with orthogonal methods such as CAGE. Additionally, ReCappable-seq reveals a rich landscape of TSSs associated with Pol III transcripts that have not previously been amenable to study at genome-wide scale. Novel TSS from non-Pol II transcription can be located in the nuclear and mitochondrial genomes. ReCappable-seq interrogates the regulatory landscape of coding and noncoding RNA concurrently and enables the classification of epigenetic profiles associated with Pol II and non-Pol II TSS.


Subject(s)
DNA-Directed RNA Polymerases , RNA Polymerase II , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , RNA, Untranslated , Transcription Initiation Site , Transcriptome
14.
RNA ; 28(2): 162-176, 2022 02.
Article in English | MEDLINE | ID: mdl-34728536

ABSTRACT

Nanopore sequencing devices read individual RNA strands directly. This facilitates identification of exon linkages and nucleotide modifications; however, using conventional direct RNA nanopore sequencing, the 5' and 3' ends of poly(A) RNA cannot be identified unambiguously. This is due in part to RNA degradation in vivo and in vitro that can obscure transcription start and end sites. In this study, we aimed to identify individual full-length human RNA isoforms among ∼4 million nanopore poly(A)-selected RNA reads. First, to identify RNA strands bearing 5' m7G caps, we exchanged the biological cap for a modified cap attached to a 45-nt oligomer. This oligomer adaptation method improved 5' end sequencing and ensured correct identification of the 5' m7G capped ends. Second, among these 5'-capped nanopore reads, we screened for features consistent with a 3' polyadenylation site. Combining these two steps, we identified 294,107 individual high-confidence full-length RNA scaffolds from human GM12878 cells, most of which (257,721) aligned to protein-coding genes. Of these, 4876 scaffolds indicated unannotated isoforms that were often internal to longer, previously identified RNA isoforms. Orthogonal data for m7G caps and open chromatin, such as CAGE and DNase-HS seq, confirmed the validity of these high-confidence RNA scaffolds.


Subject(s)
RNA Isoforms/chemistry , RNA, Messenger/chemistry , Cell Line, Tumor , Humans , Nanopore Sequencing/methods , RNA 3' Polyadenylation Signals , RNA Isoforms/genetics , RNA, Messenger/genetics , Transcriptome
15.
Elife ; 10, 2021 11 08.
Article in English | MEDLINE | ID: mdl-34747693

ABSTRACT

Shotgun metagenomic sequencing is a powerful approach to study microbiomes in an unbiased manner and of increasing relevance for identifying novel enzymatic functions. However, the potential of metagenomics to relate microbiome composition to function has thus far been underutilized. Here, we introduce the Metagenomics Genome-Phenome Association (MetaGPA) study framework, which allows linking genetic information in metagenomes with a dedicated functional phenotype. We applied MetaGPA to identify enzymes associated with cytosine modifications in environmental samples. From the 2365 genes that met our significance criteria, we confirmed known pathways for cytosine modifications and proposed novel cytosine-modifying mechanisms. Specifically, we characterized and identified a novel nucleic acid-modifying enzyme, 5-hydroxymethylcytosine carbamoyltransferase, that catalyzes the formation of a previously unknown cytosine modification, 5-carbamoyloxymethylcytosine, in DNA and RNA. Our work introduces MetaGPA as a novel and versatile tool for advancing functional metagenomics.


Many industrial processes, such as starch processing and oil refinement, use chemicals that cause harm to the environment. These can often be switched to more sustainable biological processes that are powered by proteins called enzymes. Enzymes are micro-factories that speed up biochemical reactions in most living things. Communities of microorganisms (also known as microbiomes) are an amazing but often untapped resource for discovering enzymes that can be harnessed for industrial purposes. To gain a better picture of the microbes present within a population, researchers often extract and sequence the genetic material of all microorganisms in an environmental sample, also known as the metagenome. While current methods for analyzing the metagenome are good at identifying new species, they often provide limited information about the microorganism's functional role within the community. This makes it difficult to find new enzymes that may be useful for industry. Here, Yang, Lin et al. have developed a new technique called Metagenomics Genome-Phenome Association, or MetaGPA for short. The method works in a similar way to genome-wide association studies (GWAS) which are used to identify genes involved in human disease. However, instead of disease associated genes in humans, MetaGPA finds microbial genes that are associated with a biological process useful for biotechnology. Like GWAS, the new approach created by Yang, Lin et al. compares two groups: the first contains microorganisms that carry out a specific process, and the second contains all organisms in the microbiome. The metagenome of each group is extracted and a computational pipeline is then applied to identify genes, including those coding for enzymes, that are found more often in the group performing the desired task. To test the technique, Yang, Lin et al. used MetaGPA to find new enzymes involved in DNA modification.
Microbiome samples were collected from coastal water and sewage, and the computational pipeline was applied to discover genes that are associated with this process. Further analysis revealed that one of the identified genes codes for an enzyme that introduces a previously unknown change to DNA. MetaGPA could be applied to other processes and microbiomes, and, if successful, may help researchers to identify more diverse enzymes than is currently available. This could scale up the discovery of new enzymes that can be used to power industrial reactions.
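
The two-group comparison described in this summary can be illustrated with a simple enrichment test. The sketch below is only an assumption about how such an association might be scored: neither the abstract nor the summary names the statistical test MetaGPA uses, so a one-sided Fisher's exact test stands in here, and the function name `enrichment_p_value` and the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(hits_selected: int, total_selected: int,
                       hits_control: int, total_control: int) -> float:
    """One-sided Fisher's exact test for over-representation.

    Returns the probability of seeing at least `hits_selected` gene-carrying
    sequences in the selected group if the gene were distributed at random
    between the selected and control groups (hypergeometric tail).
    Stand-in statistic; not confirmed to be the test used by MetaGPA.
    """
    total = total_selected + total_control
    hits = hits_selected + hits_control
    denominator = comb(total, total_selected)
    upper = min(hits, total_selected)
    tail = sum(
        comb(hits, k) * comb(total - hits, total_selected - k)
        for k in range(hits_selected, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Hypothetical counts: a gene seen in 3 of 3 selected contigs
    # and in 0 of 3 control contigs.
    print(enrichment_p_value(3, 3, 0, 3))
```

In practice a test like this would be run once per gene or protein domain, so some correction for multiple testing would also be needed before applying a significance threshold.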


Subject(s)
Cytosine/metabolism , DNA, Bacterial/metabolism , Escherichia coli K12/genetics , Genome, Bacterial , Microbiota/genetics , RNA, Bacterial/metabolism
16.
Nucleic Acids Res ; 49(19): e113, 2021 11 08.
Article in English | MEDLINE | ID: mdl-34417598

ABSTRACT

DNA methylation is widespread amongst eukaryotes and prokaryotes to modulate gene expression and confer viral resistance. 5-Methylcytosine (m5C) methylation has been described in genomes of a large fraction of bacterial species as part of restriction-modification systems, each composed of a methyltransferase and cognate restriction enzyme. Methylases are site-specific and target sequences vary across organisms. High-throughput methods such as bisulfite sequencing can identify m5C at base resolution but require specialized library preparations, and single-molecule, real-time (SMRT) sequencing usually misses m5C. Here, we present a new method called RIMS-seq (rapid identification of methylase specificity) to simultaneously sequence bacterial genomes and determine m5C methylase specificities using a simple experimental protocol that closely resembles the DNA-seq protocol for Illumina. Importantly, the resulting sequencing quality is identical to DNA-seq, enabling RIMS-seq to substitute standard sequencing of bacterial genomes. Applied to bacteria and synthetic mixed communities, RIMS-seq reveals new methylase specificities, supporting routine study of m5C methylation while sequencing new genomes.


Subject(s)
5-Methylcytosine/metabolism , DNA Modification Methylases/metabolism , DNA Restriction Enzymes/metabolism , Escherichia coli K12/genetics , Genome, Bacterial , High-Throughput Nucleotide Sequencing/methods , Acinetobacter calcoaceticus/enzymology , Acinetobacter calcoaceticus/genetics , Aeromonas hydrophila/enzymology , Aeromonas hydrophila/genetics , Bacillus amyloliquefaciens/enzymology , Bacillus amyloliquefaciens/genetics , Base Sequence , Clostridium acetobutylicum/enzymology , Clostridium acetobutylicum/genetics , DNA Methylation , DNA Modification Methylases/genetics , DNA Restriction Enzymes/genetics , Escherichia coli K12/enzymology , Gene Expression Regulation, Bacterial , Haemophilus/enzymology , Haemophilus/genetics , Haemophilus influenzae/enzymology , Haemophilus influenzae/genetics , Humans , Microbiota/genetics , Sequence Analysis, DNA , Skin/microbiology
17.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33957668

ABSTRACT

Alternative transcription units (ATUs) are dynamically encoded under different conditions and display overlapping patterns (sharing one or more genes) under a specific condition in bacterial genomes. Genome-scale identification of ATUs is essential for studying the emergence of human diseases caused by bacterial organisms. However, it is unrealistic to identify all ATUs using experimental techniques because of the complexity and dynamic nature of ATUs. Here, we present the first-of-its-kind computational framework, named SeqATU, for genome-scale ATU prediction based on next-generation RNA-Seq data. The framework utilizes a convex quadratic programming model to seek an optimum expression combination of all of the to-be-identified ATUs. The predicted ATUs in Escherichia coli reached a precision of 0.77/0.74 and a recall of 0.75/0.76 in the two RNA-Sequencing datasets compared with the benchmarked ATUs from third-generation RNA-Seq data. In addition, the proportion of 5'- or 3'-end genes of the predicted ATUs, having documented transcription factor binding sites and transcription termination sites, was three times greater than that of no 5'- or 3'-end genes. We further evaluated the predicted ATUs by Gene Ontology and Kyoto Encyclopedia of Genes and Genomes functional enrichment analyses. The results suggested that gene pairs frequently encoded in the same ATUs are more functionally related than those that can belong to two distinct ATUs. Overall, these results demonstrated the high reliability of predicted ATUs. We expect that the new insights derived by SeqATU will not only improve the understanding of the transcription mechanism of bacteria but also guide the reconstruction of a genome-scale transcriptional regulatory network.
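
For readers unfamiliar with the precision and recall figures quoted in this abstract, the sketch below shows how such scores can be computed for a set of predicted transcription units against a benchmark set. It assumes that a prediction counts as correct only when it matches a benchmark unit exactly, gene for gene; the abstract does not state the matching criterion, and the function name `precision_recall` and the gene labels are hypothetical.

```python
def precision_recall(predicted: set[tuple[str, ...]],
                     benchmark: set[tuple[str, ...]]) -> tuple[float, float]:
    """Score predicted transcription units against a benchmark set.

    Each unit is a tuple of gene names in genomic order. Exact-match
    scoring is an assumption made for this illustration.
    """
    true_positives = len(predicted & benchmark)
    # Precision: fraction of predictions that appear in the benchmark.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of benchmark units that were predicted.
    recall = true_positives / len(benchmark) if benchmark else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical units: three of four predictions match the benchmark.
    predicted = {("geneA", "geneB"), ("geneC",), ("geneD", "geneE"), ("geneF",)}
    benchmark = {("geneA", "geneB"), ("geneC",), ("geneD", "geneE"), ("geneG",)}
    print(precision_recall(predicted, benchmark))
```

A looser criterion, such as counting partially overlapping units as matches, would raise both scores, so the matching rule should always be reported alongside the numbers.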


Subject(s)
Computational Biology/methods , Genome-Wide Association Study/methods , RNA Isoforms , Transcription, Genetic , Algorithms , Bacteria/genetics , Databases, Genetic , Escherichia coli/genetics , Genome, Bacterial , Genomics/methods , Humans , RNA, Messenger/genetics , RNA-Seq , Single-Cell Analysis/methods , Terminator Regions, Genetic , Transcription Initiation Site
18.
Genome Res ; 31(2): 291-300, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33468551

ABSTRACT

The predominant methodology for DNA methylation analysis relies on the chemical deamination by sodium bisulfite of unmodified cytosine to uracil to permit the differential readout of methylated cytosines. Bisulfite treatment damages the DNA, leading to fragmentation and loss of long-range methylation information. To overcome this limitation of bisulfite-treated DNA, we applied a new enzymatic deamination approach, termed enzymatic methyl-seq (EM-seq), to long-range sequencing technologies. Our methodology, named long-read enzymatic modification sequencing (LR-EM-seq), preserves the integrity of DNA, allowing long-range methylation profiling of 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) over multikilobase length of genomic DNA. When applied to known differentially methylated regions (DMRs), LR-EM-seq achieves phasing of >5 kb, resulting in broader and better defined DMRs than those previously reported. This result showed the importance of phasing methylation for biologically relevant questions and the applicability of LR-EM-seq for long-range epigenetic analysis at single-molecule and single-nucleotide resolution.

19.
Front Mol Biosci ; 8: 734154, 2021.
Article in English | MEDLINE | ID: mdl-34988112

ABSTRACT

Transposable elements (TE) are mobile genetic elements, present in all domains of life. They commonly encode a single transposase enzyme, that performs the excision and reintegration reactions, and these enzymes have been used in mutagenesis and creation of next-generation sequencing libraries. All transposases have some bias in the DNA sequence they bind to when reintegrating the TE DNA. We sought to identify a transposase that showed minimal sequence bias and could be produced recombinantly, using information from the literature and a novel bioinformatic analysis, resulting in the selection of the hATx-6 transposase from Hydra vulgaris (aka Hydra magnipapillata) for further study. This transposase was tested and shown to be active both in vitro and in vivo, and we were able to demonstrate very low sequence bias in its integration preference. This transposase could be an excellent candidate for use in biotechnology, such as the creation of next-generation sequencing libraries.

20.
DNA Repair (Amst) ; 80: 36-44, 2019 08.
Article in English | MEDLINE | ID: mdl-31247470

ABSTRACT

RAre DAmage and Repair sequencing (RADAR-seq) is a highly adaptable sequencing method that enables the identification and detection of rare DNA damage events for a wide variety of DNA lesions at single-molecule resolution on a genome-wide scale. In RADAR-seq, DNA lesions are replaced with a patch of modified bases that can be directly detected by Pacific Biosciences Single Molecule Real-Time (SMRT) sequencing. RADAR-seq enables dynamic detection over a wide range of DNA damage frequencies, including low physiological levels. Furthermore, without the need for DNA amplification and enrichment steps, RADAR-seq provides sequencing coverage of damaged and undamaged DNA across an entire genome. Here, we use RADAR-seq to measure the frequency and map the location of ribonucleotides in wild-type and RNaseH2-deficient E. coli and Thermococcus kodakarensis strains. Additionally, by tracking ribonucleotides incorporated during in vivo lagging strand DNA synthesis, we determined the replication initiation point in E. coli, and its relation to the origin of replication (oriC). RADAR-seq was also used to map cyclobutane pyrimidine dimers (CPDs) in Escherichia coli (E. coli) genomic DNA exposed to UV-radiation. On a broader scale, RADAR-seq can be applied to understand formation and repair of DNA damage, the correlation between DNA damage and disease initiation and progression, and complex biological pathways, including DNA replication.


Subject(s)
DNA Damage , DNA Repair , Genome, Archaeal , Genome, Bacterial , Mutagenicity Tests/methods , Sequence Analysis, DNA/methods , DNA Replication , DNA, Archaeal , DNA, Bacterial/radiation effects , Escherichia coli/genetics , Escherichia coli/radiation effects , High-Throughput Nucleotide Sequencing/methods , Pyrimidine Dimers , Ribonucleotides , Thermococcus/genetics , Ultraviolet Rays
...