ABSTRACT
Asgard archaea are of great interest as the progenitors of Eukaryotes, but little is known about the mobile genetic elements (MGEs) that may shape their ongoing evolution. Here, we describe MGEs that replicate in Atabeyarchaeia, a wetland Asgard archaeal lineage represented by two complete genomes. We used soil depth-resolved population metagenomic data sets to track 18 MGEs for which genome structures were defined and precise chromosome integration sites could be identified for confident host linkage. Additionally, we identified a complete 20.67 kbp circular plasmid and two family-level groups of viruses linked to Atabeyarchaeia via CRISPR spacer targeting. Closely related 40 kbp viruses possess a hypervariable genomic region encoding combinations of specific genes for small cysteine-rich proteins structurally similar to restriction-homing endonucleases. One 10.9 kbp integrative conjugative element (ICE) integrates into the Atabeyarchaeum deiterrae-1 chromosome and has a 2.5 kbp circularizable element integrated within it. The 10.9 kbp ICE encodes an expressed Type IIG restriction-modification system with a sequence specificity matching an active methylation motif identified by Pacific Biosciences (PacBio) high-accuracy long-read (HiFi) metagenomic sequencing. Restriction-modification of Atabeyarchaeia differs from that of another coexisting Asgard archaeal lineage, Freyarchaeia, which has few identified MGEs but possesses diverse defense mechanisms, including DISARM and Hachiman, not found in Atabeyarchaeia. Overall, defense systems and methylation mechanisms of Asgard archaea likely modulate their interactions with MGEs, and integration/excision and copy number variation of MGEs in turn enable host genetic versatility.
Subjects
Archaea, Archaeal Genome, Interspersed Repetitive Sequences, Archaea/genetics, Plasmids/genetics, Phylogeny, Metagenomics/methods
ABSTRACT
The amount of bacterial and archaeal genome sequence and methylome data has greatly increased over the last decade, enabling new insights into the functional roles of DNA methylation in these organisms. Methyltransferases (MTases), the enzymes responsible for DNA methylation, are exchanged between prokaryotes through horizontal gene transfer and can function either as part of restriction-modification systems or in apparent isolation as single (orphan) genes. The patterns of DNA methylation they confer on the host chromosome can have significant effects on gene expression, DNA replication, and other cellular processes. Some processes require very stable patterns of methylation, resulting in conservation of persistent MTases in a particular lineage. Other processes require patterns that are more dynamic yet more predictable than what is afforded by horizontal gene transfer and gene loss, resulting in phase-variable or recombination-driven MTase alleles. In this review, we discuss what is currently known about the functions of DNA methylation in prokaryotes in light of these evolutionary patterns.
Subjects
DNA Methylation, Epigenomics, DNA Restriction-Modification Enzymes/genetics, DNA Restriction-Modification Enzymes/metabolism, Methyltransferases/genetics, Methyltransferases/metabolism, Prokaryotic Cells/metabolism
ABSTRACT
REBASE is a comprehensive and extensively curated database of information about the components of restriction-modification (RM) systems. It is fully referenced and provides information about the recognition and cleavage sites for both restriction enzymes and DNA methyltransferases together with their commercial availability, methylation sensitivity, crystal and sequence data. All completely sequenced genomes and select shotgun sequences are analyzed for RM system components. When PacBio sequence data is available, the recognition sequences of many DNA methyltransferases (MTases) can be determined. This has led to an explosive growth in the number of well-characterized MTases in REBASE. The contents of REBASE may be browsed on the web at rebase.neb.com, and selected compilations can be downloaded by FTP (ftp.neb.com). Monthly updates are also available via email.
Subjects
DNA Methylation, DNA Modification Methylases, Factual Databases, DNA Restriction Enzymes/metabolism, DNA Modification Methylases/metabolism, DNA/genetics, DNA Restriction-Modification Enzymes/genetics
ABSTRACT
How do we scale biological science to the demand of next generation biology and medicine to keep track of the facts, predictions, and hypotheses? These days, enormous amounts of DNA sequence and other omics data are generated. Since these data contain the blueprint for life, it is imperative that we interpret it accurately. The abundance of DNA is only one part of the challenge. Artificial Intelligence (AI) and network methods routinely build on large screens, single cell technologies, proteomics, and other modalities to infer or predict biological functions and phenotypes associated with proteins, pathways, and organisms. As a first step, how do we systematically trace the provenance of knowledge from experimental ground truth to gene function predictions and annotations? Here, we review the main challenges in tracking the evolution of biological knowledge and propose several specific solutions to provenance and computational tracing of evidence in functional linkage networks.
Subjects
Big Data, Gene Regulatory Networks, Genomics/statistics & numerical data, Algorithms, Artificial Intelligence, Computational Biology, Genetic Linkage, High-Throughput Nucleotide Sequencing, Humans, Genetic Models, Proteomics/statistics & numerical data, Synthetic Biology, Systems Biology
ABSTRACT
DNA methylation is widespread amongst eukaryotes and prokaryotes to modulate gene expression and confer viral resistance. 5-Methylcytosine (m5C) methylation has been described in genomes of a large fraction of bacterial species as part of restriction-modification systems, each composed of a methyltransferase and cognate restriction enzyme. Methylases are site-specific and target sequences vary across organisms. High-throughput methods, such as bisulfite-sequencing, can identify m5C at base resolution but require specialized library preparations, and single molecule, real-time (SMRT) sequencing usually misses m5C. Here, we present a new method called RIMS-seq (rapid identification of methylase specificity) to simultaneously sequence bacterial genomes and determine m5C methylase specificities using a simple experimental protocol that closely resembles the DNA-seq protocol for Illumina. Importantly, the resulting sequencing quality is identical to DNA-seq, enabling RIMS-seq to substitute standard sequencing of bacterial genomes. Applied to bacteria and synthetic mixed communities, RIMS-seq reveals new methylase specificities, supporting routine study of m5C methylation while sequencing new genomes.
Subjects
5-Methylcytosine/metabolism, DNA Modification Methylases/metabolism, DNA Restriction Enzymes/metabolism, Escherichia coli K12/genetics, Bacterial Genome, High-Throughput Nucleotide Sequencing/methods, Acinetobacter calcoaceticus/enzymology, Acinetobacter calcoaceticus/genetics, Aeromonas hydrophila/enzymology, Aeromonas hydrophila/genetics, Bacillus amyloliquefaciens/enzymology, Bacillus amyloliquefaciens/genetics, Base Sequence, Clostridium acetobutylicum/enzymology, Clostridium acetobutylicum/genetics, DNA Methylation, DNA Modification Methylases/genetics, DNA Restriction Enzymes/genetics, Escherichia coli K12/enzymology, Bacterial Gene Expression Regulation, Haemophilus/enzymology, Haemophilus/genetics, Haemophilus influenzae/enzymology, Haemophilus influenzae/genetics, Humans, Microbiota/genetics, DNA Sequence Analysis, Skin/microbiology
ABSTRACT
Analysis of genomic DNA from pathogenic strains of Burkholderia cenocepacia J2315 and Escherichia coli O104:H4 revealed the presence of two unusual MTase genes. Both are plasmid-borne ORFs, carried by pBCA072 for B. cenocepacia J2315 and pESBL for E. coli O104:H4. Pacific Biosciences SMRT sequencing was used to investigate DNA methyltransferases M.BceJIII and M.EcoGIX, using artificial constructs. Mating properties of engineered pESBL derivatives were also investigated. Both MTases yield promiscuous m6A modification of single strands, in the context SAY (where S = C or G and Y = C or T). Strikingly, this methylation is asymmetric in vivo, detected almost exclusively on one DNA strand, and is incomplete: typically, around 40% of susceptible motifs are modified. Genetic and biochemical studies suggest that enzyme action depends on replication mode: DNA Polymerase I (PolI)-dependent ColE1 and p15A origins support asymmetric modification, while the PolI-independent pSC101 origin does not. An MTase-PolI complex may enable discrimination of PolI-dependent and independent plasmid origins. M.EcoGIX helps to establish pESBL in new hosts by blocking the action of restriction enzymes, in an orientation-dependent fashion. Expression and action appear to occur on the entering single strand in the recipient, early in conjugal transfer, until lagging-strand replication creates the double-stranded form.
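As an illustration of the degenerate SAY context described above (S = C or G, Y = C or T), the following minimal sketch lists candidate adenine positions on a single strand. The function name and the example sequence are hypothetical and are not taken from the study; the sketch only enumerates susceptible motifs and says nothing about which of them are actually modified.

```python
import re

# IUPAC codes from the abstract: S = C or G, Y = C or T.
# A lookahead is used so that overlapping motifs (e.g. GACAC) are all reported.
SAY = re.compile(r"(?=([CG]A[CT]))")

def say_sites(seq: str) -> list[int]:
    """Return 0-based positions of the central A of every SAY motif on this strand."""
    return [m.start() + 1 for m in SAY.finditer(seq.upper())]

# Hypothetical example sequence (illustration only).
print(say_sites("TTGACGGATCCATAA"))
```

Because the reported methylation is strand-asymmetric, each strand would be scanned separately, for example by passing the reverse complement to the same function.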
Subjects
DNA Methylation/genetics, DNA Polymerase I/genetics, Single-Stranded DNA/genetics, Methyltransferases/genetics, Bacterial Proteins/genetics, Burkholderia cenocepacia/genetics, DNA Replication/genetics, Escherichia coli O104/genetics, Escherichia coli Proteins/genetics, Bacterial Genome/genetics, Plasmids/genetics, Ribosomal Proteins/genetics
ABSTRACT
HhaI, a Type II restriction endonuclease, recognizes the symmetric sequence 5'-GCG↓C-3' in duplex DNA and cleaves ('↓') to produce fragments with 2-base, 3'-overhangs. We determined the structure of HhaI in complex with cognate DNA at an ultra-high atomic resolution of 1.0 Å. Most restriction enzymes act as dimers with two catalytic sites, and cleave the two strands of duplex DNA simultaneously, in a single binding event. HhaI, in contrast, acts as a monomer with only one catalytic site, and cleaves the DNA strands sequentially, one after the other. HhaI comprises three domains, each consisting of a mixed five-stranded β sheet with a defined function. The first domain contains the catalytic site; the second contains residues for sequence recognition; and the third contributes to non-specific DNA binding. The active site belongs to the 'PD-D/EXK' superfamily of nucleases and contains the motif SD-X11-EAK. The first two domains are similar in structure to two other monomeric restriction enzymes, HinP1I (G↓CGC) and MspI (C↓CGG), which produce fragments with 5'-overhangs. The third domain, present only in HhaI, shifts the positions of the recognition residues relative to the catalytic site, enabling this enzyme to cleave the recognition sequence at a different position. The structure of M.HhaI, the biological methyltransferase partner of HhaI, was determined earlier. Together, these two structures represent the first natural pair of restriction-modification enzymes to be characterized in atomic detail.
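The difference between the HhaI (GCG↓C) and HinP1I (G↓CGC) cut positions can be illustrated with a small in silico digest of one strand. This is a minimal sketch: the function name, the `cut_offset` parameter, and the example sequence are hypothetical, and only the top strand is modeled.

```python
def digest_top_strand(seq: str, site: str = "GCGC", cut_offset: int = 3) -> list[str]:
    """Cut the top strand at every occurrence of `site`.

    cut_offset is the number of bases of the site left on the 5' fragment:
    3 for HhaI (GCG^C), 1 for HinP1I (G^CGC).
    """
    seq = seq.upper()
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)  # step by one base to catch overlapping sites
    fragments.append(seq[start:])
    return fragments

# Hypothetical 8-base example containing one GCGC site.
print(digest_top_strand("AAGCGCTT"))                # HhaI-like cut
print(digest_top_strand("AAGCGCTT", cut_offset=1))  # HinP1I-like cut
```

Because GCGC is palindromic, the bottom strand is cut at the same offset when read 5' to 3'. An offset past the center of the site leaves 3'-overhangs, and an offset before the center leaves 5'-overhangs.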
Subjects
DNA/ultrastructure, Type II Site-Specific Deoxyribonucleases/ultrastructure, Nucleic Acid Conformation, Protein Conformation, Catalytic Domain, X-Ray Crystallography, DNA/chemistry, DNA/genetics, DNA Restriction Enzymes/chemistry, DNA Restriction Enzymes/genetics, DNA Restriction Enzymes/ultrastructure, DNA-Binding Proteins/chemistry, DNA-Binding Proteins/genetics, DNA-Binding Proteins/ultrastructure, Type II Site-Specific Deoxyribonucleases/chemistry, Type II Site-Specific Deoxyribonucleases/genetics, Haemophilus/chemistry, Haemophilus/enzymology, Protein Binding/genetics
ABSTRACT
The genomes of gut Bacteroidales contain numerous invertible regions, many of which contain promoters that dictate phase-variable synthesis of surface molecules such as polysaccharides, fimbriae, and outer surface proteins. Here, we characterize a different type of phase-variable system of Bacteroides fragilis, a Type I restriction-modification (R-M) system. We show that reversible DNA inversions within this R-M locus lead to the generation of eight specificity proteins with distinct recognition sites. In vitro grown bacteria have a different proportion of specificity gene combinations at the expression locus than bacteria isolated from the mammalian gut. By creating mutants, each able to produce only one specificity protein from this region, we identified the R-M recognition sites of four of these S-proteins using SMRT sequencing. Transcriptome analysis revealed that the locked specificity mutants, whether grown in vitro or isolated from the mammalian gut, have distinct transcriptional profiles, likely creating different phenotypes, one of which was confirmed. Genomic analyses of diverse strains of Bacteroidetes from both host-associated and environmental sources reveal the ubiquity of phase-variable R-M systems in this phylum.
Subjects
Bacterial Proteins/metabolism, Bacteroides fragilis/enzymology, DNA Restriction-Modification Enzymes/metabolism, Gastrointestinal Microbiome, Animals, Bacterial Proteins/genetics, DNA Restriction-Modification Enzymes/genetics, Humans, Mice, Mutation, Transcriptome
ABSTRACT
Type I restriction-modification (R-M) systems consist of a DNA endonuclease (HsdR, HsdM and HsdS subunits) and methyltransferase (HsdM and HsdS subunits). The hsdS sequences flanked by inverted repeats (referred to as epigenetic invertons) in certain Type I R-M systems undergo invertase-catalyzed inversions. Previous studies in Streptococcus pneumoniae have shown that hsdS inversions within clonal populations produce subpopulations with profound differences in the methylome, cellular physiology and virulence. In this study, we bioinformatically identified six major clades of tyrosine and serine family invertase homologs from 16 bacterial phyla, which potentially catalyze hsdS inversions in the epigenetic invertons. In particular, the epigenetic invertons are highly enriched in host-associated bacteria. We further verified hsdS inversions in the Type I R-M systems of four representative host-associated bacteria and found that each of the resultant hsdS allelic variants specifies methylation of a unique DNA sequence. In addition, transcriptome analysis revealed that hsdS allelic variations in Enterococcus faecalis exert a significant impact on gene expression. These findings indicate that epigenetic switches driven by invertases in the epigenetic invertons operate broadly in host-associated bacteria and may contribute to bacterial host adaptation and virulence beyond the role of the Type I R-M systems against phage infection.
Subjects
Bacterial Proteins/genetics, DNA Restriction-Modification Enzymes/genetics, Genetic Epigenesis, Bacterial Gene Expression Regulation, Bacteroides fragilis/genetics, DNA Methylation, Bacterial DNA/chemistry, Enterococcus faecalis/genetics, Inverted Repeat Sequences, Streptococcus agalactiae/genetics, Treponema denticola/genetics
ABSTRACT
We describe the cloning, expression and characterization of the first truly non-specific adenine DNA methyltransferase, M.EcoGII. It is encoded in the genome of the pathogenic strain Escherichia coli O104:H4 C227-11, where it appears to reside on a cryptic prophage, but is not expressed. However, when the gene encoding M.EcoGII is expressed in vivo, using a high copy pRRS plasmid vector and a methylation-deficient E. coli host, extensive in vivo adenine methylation activity is revealed. M.EcoGII methylates adenine residues in any DNA sequence context, and this activity extends to dA and rA bases in either strand of a DNA:RNA-hybrid oligonucleotide duplex and to rA bases in RNAs prepared by in vitro transcription. Using oligonucleotide and bacteriophage M13mp18 virion DNA substrates, we find that M.EcoGII also methylates single-stranded DNA in vitro and that this activity is only slightly less robust than that observed using equivalent double-stranded DNAs. In vitro assays, using purified recombinant M.EcoGII enzyme, demonstrate that up to 99% of dA bases in duplex DNA substrates can be methylated, thereby rendering them insensitive to cleavage by multiple restriction endonucleases. These properties suggest that the enzyme could also be used for high resolution mapping of protein binding sites in DNA and RNA substrates.
Subjects
DNA Restriction Enzymes/metabolism, Escherichia coli/genetics, Prophages/enzymology, Site-Specific DNA-Methyltransferase (Adenine-Specific)/metabolism, Adenine/metabolism, Base Sequence, DNA Methylation, DNA Restriction Enzymes/genetics, Single-Stranded DNA/genetics, Single-Stranded DNA/metabolism, Polyacrylamide Gel Electrophoresis, Escherichia coli/virology, Prophages/genetics, Protein Binding, Double-Stranded RNA/genetics, Double-Stranded RNA/metabolism, Site-Specific DNA-Methyltransferase (Adenine-Specific)/genetics, Substrate Specificity
ABSTRACT
The creation of restriction enzymes with programmable DNA-binding and -cleavage specificities has long been a goal of modern biology. The recently discovered Type IIL MmeI family of restriction-and-modification (RM) enzymes that possess a shared target recognition domain provides a framework for engineering such new specificities. However, a lack of structural information on Type IIL enzymes has limited the repertoire that can be rationally engineered. We report here a crystal structure of MmeI in complex with its DNA substrate and an S-adenosylmethionine analog (Sinefungin). The structure uncovers for the first time the interactions that underlie MmeI-DNA recognition and methylation (5'-TCCRAC-3'; R = purine) and provides a molecular basis for changing specificity at four of the six base pairs of the recognition sequence (5'-TCCRAC-3'). Surprisingly, the enzyme is resilient to specificity changes at the first position of the recognition sequence (5'-TCCRAC-3'). Collectively, the structure provides a basis for engineering further derivatives of MmeI and delineates which base pairs of the recognition sequence are more amenable to alterations than others.
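Because the MmeI recognition sequence 5'-TCCRAC-3' is asymmetric and contains a degenerate base (R = purine), locating sites means expanding the IUPAC code and scanning both strands. The sketch below shows one way to do this. The helper names and the example sequence are hypothetical, and the IUPAC table is deliberately partial.

```python
import re

# Partial IUPAC nucleotide table (enough for the motifs quoted in these abstracts).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_motif(seq: str, motif: str = "TCCRAC") -> list[tuple[int, str]]:
    """Return (start, strand) pairs; start is a 0-based forward-strand coordinate."""
    pattern = re.compile("(?=" + "".join(IUPAC[b] for b in motif) + ")")
    seq = seq.upper()
    n, k = len(seq), len(motif)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    # A hit at position p on the reverse complement spans forward bases n-p-k .. n-p-1.
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(revcomp(seq))]
    return sorted(hits)

# Hypothetical sequence with one site on each strand.
print(find_motif("AATCCGACTTGTTGGATT"))
```

Scanning the reverse complement instead of hand-writing the complementary pattern keeps the function usable for any asymmetric motif covered by the table.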
Subjects
DNA/chemistry, Type II Site-Specific Deoxyribonucleases/chemistry, Base Sequence, DNA Methylation, Hydrolysis, Molecular Sequence Data
ABSTRACT
Staphylococcus aureus displays a clonal population structure in which horizontal gene transfer between different lineages is extremely rare. This is due, in part, to the presence of a Type I DNA restriction-modification (RM) system given the generic name of Sau1, which maintains different patterns of methylation on specific target sequences on the genomes of different lineages. We have determined the target sequences recognized by the Sau1 Type I RM systems present in a wide range of the most prevalent S. aureus lineages and assigned the sequences recognized to particular target recognition domains within the RM enzymes. We used a range of biochemical assays on purified enzymes and single molecule real-time sequencing on genomic DNA to determine these target sequences and their patterns of methylation. Knowledge of the main target sequences for Sau1 will facilitate the synthesis of new vectors for transformation of the most prevalent lineages of this 'untransformable' bacterium.
Subjects
DNA Modification Methylases/chemistry, DNA Modification Methylases/metabolism, Type I Site-Specific Deoxyribonucleases/chemistry, Type I Site-Specific Deoxyribonucleases/metabolism, Staphylococcus aureus/enzymology, Amino Acid Sequence, DNA/chemistry, DNA/metabolism, Protein Domains, DNA Sequence Analysis, Staphylococcus aureus/genetics, Bacterial Transformation
ABSTRACT
Two restriction-modification systems have been previously discovered in Thermus aquaticus YT-1. TaqI is a 263-amino acid (aa) Type IIP restriction enzyme that recognizes and cleaves within the symmetric sequence 5'-TCGA-3'. TaqII, in contrast, is a 1105-aa Type IIC restriction-and-modification enzyme, one of a family of Thermus homologs. TaqII was originally reported to recognize two different asymmetric sequences: 5'-GACCGA-3' and 5'-CACCCA-3'. We previously cloned the taqIIRM gene, purified the recombinant protein from Escherichia coli, and showed that TaqII recognizes the 5'-GACCGA-3' sequence only. Here, we report the discovery, isolation, and characterization of TaqIII, the third R-M system from T. aquaticus YT-1. TaqIII is a 1101-aa Type IIC/IIL enzyme and recognizes the 5'-CACCCA-3' sequence previously attributed to TaqII. The cleavage site is 11/9 nucleotides downstream of the A residue. The enzyme exhibits striking biochemical similarity to TaqII. The 93% identity between their aa sequences suggests that they have a common evolutionary origin. The genes are located on two separate plasmids, and are probably paralogs or pseudoparalogs. Putative positions and aa that specify DNA recognition were identified and recognition motifs for 6 uncharacterized Thermus-family enzymes were predicted.
Subjects
Bacterial Proteins/genetics, Type II Site-Specific Deoxyribonucleases/genetics, Nucleotide Motifs, Plasmids/metabolism, Thermus/enzymology, Amino Acid Sequence, Bacterial Proteins/metabolism, Molecular Cloning, DNA Cleavage, Type II Site-Specific Deoxyribonucleases/metabolism, Escherichia coli/genetics, Escherichia coli/metabolism, Gene Expression, Isoenzymes/genetics, Isoenzymes/metabolism, Molecular Weight, Plasmids/chemistry, Recombinant Proteins/genetics, Recombinant Proteins/metabolism, Sequence Alignment, Amino Acid Sequence Homology, Substrate Specificity, Thermus/genetics
ABSTRACT
DNA methylation acts in concert with restriction enzymes to protect the integrity of prokaryotic genomes. Studies in a limited number of organisms suggest that methylation also contributes to prokaryotic genome regulation, but the prevalence and properties of such non-restriction-associated methylation systems remain poorly understood. Here, we used single molecule, real-time sequencing to map DNA modifications including m6A, m4C, and m5C across the genomes of 230 diverse bacterial and archaeal species. We observed DNA methylation in nearly all (93%) organisms examined, and identified a total of 834 distinct reproducibly methylated motifs. This data enabled annotation of the DNA binding specificities of 620 DNA Methyltransferases (MTases), doubling known specificities for previously hard to study Type I, IIG and III MTases, and revealing their extraordinary diversity. Strikingly, 48% of organisms harbor active Type II MTases with no apparent cognate restriction enzyme. These active 'orphan' MTases are present in diverse bacterial and archaeal phyla and show motif specificities and methylation patterns consistent with functions in gene regulation and DNA replication. Our results reveal the pervasive presence of DNA methylation throughout the prokaryotic kingdoms, as well as the diversity of sequence specificities and potential functions of DNA methylation systems.
Subjects
Epigenomics, Prokaryotic Cells/metabolism, Conserved Sequence, DNA Methylation/genetics, DNA Replication/genetics, DNA Restriction-Modification Enzymes/classification, DNA Restriction-Modification Enzymes/metabolism, Molecular Evolution, Gene Expression Regulation, Genome, Methyltransferases/metabolism, Molecular Sequence Annotation, Multigene Family, Nucleotide Motifs/genetics, Phylogeny, Substrate Specificity
ABSTRACT
A Gram-stain-positive, catalase-positive and pleomorphic rod organism was isolated from malted barley in Finland, classified initially by partial 16S rRNA gene sequencing and originally deposited in the VTT Culture Collection as a strain of Propionibacterium acidipropionici (currently Acidipropionibacterium acidipropionici). The subsequent comparison of the whole 16S rRNA gene with other representatives of the genus Acidipropionibacterium revealed that the strain belongs to a novel species, most closely related to Acidipropionibacterium microaerophilum and Acidipropionibacterium acidipropionici, with similarity values of 98.46 and 98.31 %, respectively. Whole genome sequencing using the PacBio RS II platform allowed further comparison of the genome with all of the other DNA sequences available for the type strains of the Acidipropionibacterium species. Those comparisons revealed the highest similarity of strain JS278T to A. acidipropionici, which was confirmed by the average nucleotide identity analysis. The genome of strain JS278T is intermediate in size compared to those of A. acidipropionici and Acidipropionibacterium jensenii, at 3 432 872 bp, and the G+C content is 68.4 mol%. The strain fermented a wide range of carbon sources, and produced propionic acid as the major fermentation product. Besides its poor ability to grow at 37 °C and positive catalase reaction, the observed phenotype was almost indistinguishable from those of A. acidipropionici and A. jensenii. Based on our findings, we conclude that the organism represents a novel member of the genus Acidipropionibacterium, for which we propose the name Acidipropionibacterium virtanenii sp. nov. The type strain is JS278T (=VTT E-113202T=DSM 106790T).
Subjects
Hordeum/microbiology, Phylogeny, Propionibacterium/classification, Bacterial Typing Techniques, Base Composition, Bacterial DNA/genetics, Fermentation, Finland, Propionibacterium/genetics, Propionibacterium/isolation & purification, 16S Ribosomal RNA/genetics, DNA Sequence Analysis
ABSTRACT
We identify a new subgroup of Type I Restriction-Modification enzymes that modify cytosine in one DNA strand and adenine in the opposite strand for host protection. Recognition specificity has been determined for ten systems using SMRT sequencing and each recognizes a novel DNA sequence motif. Previously characterized Type I systems use two identical copies of a single methyltransferase (MTase) subunit, with one bound at each half site of the specificity (S) subunit to form the MTase. The new m4C-producing Type I systems we describe have two separate yet highly similar MTase subunits that form a heterodimeric M1M2S MTase. The MTase subunits from these systems group into two families, one of which has NPPF in the highly conserved catalytic motif IV and modifies adenine to m6A, and one having an NPPY catalytic motif IV and modifying cytosine to m4C. The high degree of similarity among their cytosine-recognizing components (MTase and S) suggests they have recently evolved, most likely from the far more common m6A Type I systems. Type I enzymes that modify cytosine exclusively were formed by replacing the adenine target recognition domain (TRD) with a cytosine-recognizing TRD. These are the first examples of m4C modification in Type I RM systems.
Subjects
Cytosine/metabolism, DNA Restriction-Modification Enzymes/metabolism, DNA/metabolism, Adenine/metabolism, Amino Acid Sequence, Catalysis, Computational Biology/methods, DNA/chemistry, DNA Restriction-Modification Enzymes/chemistry, DNA Restriction-Modification Enzymes/genetics, Methylation, Methyltransferases/chemistry, Methyltransferases/metabolism, Mutation, Nucleotide Motifs, Protein Subunits/chemistry, Protein Subunits/metabolism, Substrate Specificity
ABSTRACT
The COMBREX database (COMBREX-DB; combrex.bu.edu) is an online repository of information related to (i) experimentally determined protein function, (ii) predicted protein function, and (iii) relationships among proteins of unknown function and various types of experimental data, including molecular function, protein structure, and associated phenotypes. The database was created as part of the novel COMBREX (COMputational BRidges to EXperiments) effort aimed at accelerating the rate of gene function validation. It currently holds information on ~3.3 million known and predicted proteins from over 1000 completely sequenced bacterial and archaeal genomes. The database also contains a prototype recommendation system for helping users identify those proteins whose experimental determination of function would be most informative for predicting function for other proteins within protein families. The emphasis on documenting experimental evidence for function predictions, and the prioritization of uncharacterized proteins for experimental testing, distinguish COMBREX from other publicly available microbial genomics resources. This article describes updates to COMBREX-DB since an initial description in the 2011 NAR Database Issue.
Subjects
Archaeal Proteins/physiology, Bacterial Proteins/physiology, Protein Databases, Archaeal Proteins/chemistry, Archaeal Proteins/genetics, Bacterial Proteins/chemistry, Bacterial Proteins/genetics, Molecular Sequence Annotation
ABSTRACT
Modified DNA bases in mammalian genomes, such as 5-methylcytosine ((5m)C) and its oxidized forms, are implicated in important epigenetic regulation processes. In human or mouse, successive enzymatic conversion of (5m)C to its oxidized forms is carried out by the ten-eleven translocation (TET) proteins. Previously we reported the structure of a TET-like (5m)C oxygenase (NgTET1) from Naegleria gruberi, a single-celled protist evolutionarily distant from vertebrates. Here we show that NgTET1 is a 5-methylpyrimidine oxygenase, with activity on both (5m)C (major activity) and thymidine (T) (minor activity) in all DNA forms tested, and provide unprecedented evidence for the formation of 5-formyluridine ((5f)U) and 5-carboxyuridine ((5ca)U) in vitro. Mutagenesis studies reveal a delicate balance between choice of (5m)C or T as the preferred substrate. Furthermore, our results suggest substrate preference by NgTET1 to (5m)CpG and TpG dinucleotide sites in DNA. Intriguingly, NgTET1 displays higher T-oxidation activity in vitro than mammalian TET1, supporting a closer evolutionary relationship between NgTET1 and the base J-binding proteins from trypanosomes. Finally, we demonstrate that NgTET1 can be readily used as a tool in (5m)C sequencing technologies such as single molecule, real-time sequencing to map (5m)C in bacterial genomes at base resolution.