Results 1 - 20 of 29
1.
Mol Cell ; 82(14): 2714-2726.e4, 2022 07 21.
Article in English | MEDLINE | ID: mdl-35649413

ABSTRACT

As part of the ongoing bacterial-phage arms race, CRISPR-Cas systems in bacteria clear invading phages whereas anti-CRISPR proteins (Acrs) in phages inhibit CRISPR defenses. Known Acrs have proven extremely diverse, complicating their identification. Here, we report a deep learning algorithm for Acr identification that revealed an Acr against type VI-B CRISPR-Cas systems. The algorithm predicted numerous putative Acrs spanning almost all CRISPR-Cas types and subtypes, including over 7,000 putative type IV and VI Acrs not predicted by other algorithms. By performing a cell-free screen for Acr hits against type VI-B systems, we identified a potent inhibitor of Cas13b nucleases we named AcrVIB1. AcrVIB1 blocks Cas13b-mediated defense against a targeted plasmid and lytic phage, and its inhibitory function principally occurs upstream of ribonucleoprotein complex formation. Overall, our work helps expand the known Acr universe, aiding our understanding of the bacteria-phage arms race and the use of Acrs to control CRISPR technologies.


Subjects
Bacteriophages, Deep Learning, Bacteria/genetics, Bacteria/metabolism, Bacteriophages/genetics, Bacteriophages/metabolism, CRISPR-Associated Protein 9/genetics, CRISPR-Associated Protein 9/metabolism, CRISPR-Cas Systems, Endonucleases/genetics, Endonucleases/metabolism
2.
Nucleic Acids Res ; 52(2): 769-783, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38015466

ABSTRACT

CRISPR-Cas systems store fragments of invader DNA as spacers to recognize and clear those same invaders in the future. Spacers can also be acquired from the host's genomic DNA, leading to lethal self-targeting. While self-targeting can be circumvented through different mechanisms, natural examples remain poorly explored. Here, we investigate extensive self-targeting by two CRISPR-Cas systems encoding 24 self-targeting spacers in the plant pathogen Xanthomonas albilineans. We show that the native I-C and I-F1 systems are actively expressed and that CRISPR RNAs are properly processed. When expressed in Escherichia coli, each Cascade complex binds its PAM-flanked DNA target to block transcription, while the addition of Cas3 paired with genome targeting induces cell killing. While exploring how X. albilineans survives self-targeting, we predicted putative anti-CRISPR proteins (Acrs) encoded within the bacterium's genome. Screening of identified candidates with cell-free transcription-translation systems and in E. coli revealed two Acrs, which we named AcrIC11 and AcrIF12Xal, that inhibit the activity of Cas3 but not Cascade of the respective system. While AcrIF12Xal is homologous to AcrIF12, AcrIC11 shares sequence and structural homology with the anti-restriction protein KlcA. These findings help explain tolerance of self-targeting through two CRISPR-Cas systems and expand the known suite of DNA degradation-inhibiting Acrs.


Subjects
CRISPR-Associated Proteins, Xanthomonas, CRISPR-Cas Systems, Escherichia coli/genetics, Escherichia coli/metabolism, Xanthomonas/genetics, DNA/genetics, CRISPR-Associated Proteins/metabolism
3.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35037022

ABSTRACT

Small proteins encoded by short open reading frames (ORFs) with 50 codons or fewer are emerging as an important class of cellular macromolecules in diverse organisms. However, they often evade detection by proteomics or in silico methods. Ribosome profiling (Ribo-seq) has revealed widespread translation in genomic regions previously thought to be non-coding, driving the development of ORF detection tools using Ribo-seq data. However, only a handful of tools have been designed for bacteria, and these have not yet been systematically compared. Here, we aimed to identify tools that use Ribo-seq data to correctly determine the translational status of annotated bacterial ORFs and also discover novel translated regions with high sensitivity. To this end, we generated a large set of annotated ORFs from four diverse bacterial organisms, manually labeled for their translation status based on Ribo-seq data, which are available for future benchmarking studies. This set was used to investigate the predictive performance of seven Ribo-seq-based ORF detection tools (REPARATION_blast, DeepRibo, Ribo-TISH, PRICE, smORFer, ribotricer and SPECtre), as well as IRSOM, which uses coding potential and RNA-seq coverage only. DeepRibo and REPARATION_blast robustly predicted translated ORFs, including sORFs, with no significant difference for ORFs in close proximity to other genes versus stand-alone genes. However, no tool predicted a set of novel, experimentally verified sORFs with high sensitivity. Start codon predictions with smORFer show the value of initiation site profiling data to further improve the sensitivity of ORF prediction tools in bacteria. Overall, we find that bacterial tools perform well for sORF detection, although there is potential for improving their performance, applicability, usability and reproducibility.


Subjects
Benchmarking, Ribosomes, Bacteria/genetics, Open Reading Frames, Reproducibility of Results, Ribosomes/genetics, Ribosomes/metabolism
4.
Int J Mol Sci ; 25(4)2024 Feb 09.
Article in English | MEDLINE | ID: mdl-38396779

ABSTRACT

Cancer is a leading cause of death globally. Most cancer cases are diagnosed only at a late stage because conventional methods are used, which reduces patients' chances of survival. Early detection followed by early diagnosis is therefore an important task in cancer research. Gene expression microarray technology has been applied to detect and diagnose most types of cancer in their early stages, with encouraging results. In this paper, we address the problem of classifying cancer from gene expression data while handling two common issues: class imbalance and the curse of dimensionality. Class imbalance is addressed with oversampling techniques, which add synthetic samples. The curse of dimensionality is addressed by applying chi-square (CHiS) and information gain (IG) feature selection techniques. After applying these techniques individually, we propose a method that selects the most significant genes by combining the two. We investigated the effect of these techniques individually and in combination. Four benchmark biomedical datasets (Leukemia-subtypes, Leukemia-ALLAML, Colon, and CuMiDa) were used. The experimental results reveal that oversampling improves the results in most cases. Additionally, the proposed feature selection technique outperforms the individual techniques in nearly all cases. This study also provides an empirical evaluation of several oversampling techniques combined with ensemble-based learning. SVM-SMOTE with the random forest classifier achieved the highest results, with a reported accuracy of 100%. The obtained results also surpass the findings in the existing literature.


Subjects
Leukemia, Neoplasms, Humans, Neoplasms/genetics, Leukemia/genetics, Gene Expression
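
Illustrative note on record 4: the abstract describes oversampling as adding synthetic samples to the minority class. The block below is a minimal sketch of plain SMOTE-style interpolation, not the SVM-SMOTE variant the abstract reports and not the authors' code. The function name `smote_like`, its parameters, and the toy data are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: math.dist(base, s),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class with two expression features per sample (made-up values).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
new_samples = smote_like(minority, n_new=4)
```

Each synthetic point lies on the segment between two real minority samples, so it stays within the range of the observed data. In practice, oversampling should be applied to the training split only, so that synthetic points do not leak into evaluation.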
5.
Bioinformatics ; 38(Suppl_2): ii42-ii48, 2022 09 16.
Article in English | MEDLINE | ID: mdl-36124799

ABSTRACT

MOTIVATION: The CRISPR-Cas9 system is a Type II CRISPR system that has rapidly become the most versatile and widespread tool for genome engineering. It consists of two components, the Cas9 effector protein, and a single guide RNA that combines the spacer (for identifying the target) with the tracrRNA, a trans-activating small RNA required for both crRNA maturation and interference. While there are well-established methods for screening Cas effector proteins and CRISPR arrays, the detection of tracrRNA remains the bottleneck in detecting Class 2 CRISPR systems. RESULTS: We introduce a new pipeline, CRISPRtracrRNA, for screening and evaluation of tracrRNA candidates in genomes. This pipeline combines evidence from different components of the Cas9-sgRNA complex. The core is a newly developed structural model via covariance models from a sequence-structure alignment of experimentally validated tracrRNAs. As additional evidence, we determine the terminator signal (required for the tracrRNA transcription) and the RNA-RNA interaction between the CRISPR array repeat and the 5'-part of the tracrRNA. Repeats are detected via an ML-based approach (CRISPRidentify). Providing further evidence, we detect the cassette containing the Cas9 (Type II CRISPR systems) and Cas12 (Type V CRISPR systems) effector protein. Our tool is the first for detecting tracrRNA for Type V systems. AVAILABILITY AND IMPLEMENTATION: The implementation of CRISPRtracrRNA is available on GitHub upon request for access permission (https://github.com/BackofenLab/CRISPRtracrRNA). Data generated in this study can be obtained upon request to the corresponding person: Rolf Backofen (backofen@informatik.uni-freiburg.de). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
CRISPR-Cas Systems, RNA, Humans, Genome, RNA/genetics, Sequence Alignment, Small Untranslated RNA/genetics
6.
Nucleic Acids Res ; 49(4): e20, 2021 02 26.
Article in English | MEDLINE | ID: mdl-33290505

ABSTRACT

CRISPR-Cas are adaptive immune systems that degrade foreign genetic elements in archaea and bacteria. In carrying out their immune functions, CRISPR-Cas systems heavily rely on RNA components. These CRISPR (cr) RNAs are repeat-spacer units that are produced by processing of pre-crRNA, the transcript of CRISPR arrays, and guide Cas protein(s) to the cognate invading nucleic acids, enabling their destruction. Several bioinformatics tools have been developed to detect CRISPR arrays based solely on DNA sequences, but all these tools employ the same strategy of looking for repetitive patterns, which might correspond to CRISPR array repeats. The identified patterns are evaluated using a fixed, built-in scoring function, and arrays exceeding a cut-off value are reported. Here, we instead introduce a data-driven approach that uses machine learning to detect and differentiate true CRISPR arrays from false ones based on several features. Our CRISPR detection tool, CRISPRidentify, performs three steps: detection, feature extraction and classification based on manually curated sets of positive and negative examples of CRISPR arrays. The identified CRISPR arrays are then reported to the user accompanied by detailed annotation. We demonstrate that our approach identifies not only previously detected CRISPR arrays, but also CRISPR array candidates not detected by other tools. Compared to other methods, our tool has a drastically reduced false positive rate. In contrast to the existing tools, our approach not only provides the user with the basic statistics on the identified CRISPR arrays but also produces a certainty score as a practical measure of the likelihood that a given genomic region is a CRISPR array.


Subjects
Clustered Regularly Interspaced Short Palindromic Repeats, Machine Learning, Software, Archaeal Genome, Bacterial Genome
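
Illustrative note on record 6: the abstract says that existing CRISPR array detectors look for repetitive patterns in the DNA sequence. The block below is a minimal sketch of that candidate-detection idea only, using exact k-mer matches at regular spacing. It is not CRISPRidentify's implementation, and it leaves out the feature extraction and machine-learning classification steps. The function name, thresholds, and toy sequence are all hypothetical.

```python
from collections import defaultdict


def find_repeat_candidates(seq, k=8, min_copies=3, min_gap=4, max_gap=20):
    """Return {kmer: positions} for k-mers whose successive occurrences are
    separated by spacer-sized gaps, as repeats in a CRISPR array would be."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    candidates = {}
    for kmer, pos in positions.items():
        if len(pos) < min_copies:
            continue
        # Gap = number of bases between the end of one copy and the next start.
        gaps = [b - a - k for a, b in zip(pos, pos[1:])]
        if all(min_gap <= g <= max_gap for g in gaps):
            candidates[kmer] = pos
    return candidates


# Toy array: four copies of one repeat separated by three distinct spacers.
repeat = "GTTCCAAT"
spacers = ["ACGTAC", "TTGACA", "CCATGG"]
seq = "AAAA" + "".join(repeat + s for s in spacers) + repeat + "TTTT"
hits = find_repeat_candidates(seq)
```

Real repeats are longer and tolerate mismatches, so a fixed scoring rule like this one produces both misses and false positives. That limitation is what the data-driven classification step described in the abstract is meant to address.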
7.
Nucleic Acids Res ; 49(W1): W125-W130, 2021 07 02.
Article in English | MEDLINE | ID: mdl-34133710

ABSTRACT

CRISPR-Cas systems are adaptive immune systems in prokaryotes, providing resistance against invading viruses and plasmids. The identification of CRISPR loci is currently a non-standardized, ambiguous process, requiring the manual combination of multiple tools, where existing tools detect only parts of the CRISPR-systems, and lack quality control, annotation and assessment capabilities of the detected CRISPR loci. Our CRISPRloci server provides the first resource for the prediction and assessment of all possible CRISPR loci. The server integrates a series of advanced Machine Learning tools within a seamless web interface featuring: (i) prediction of all CRISPR arrays in the correct orientation; (ii) definition of CRISPR leaders for each locus; and (iii) annotation of cas genes and their unambiguous classification. As a result, CRISPRloci is able to accurately determine the CRISPR array and associated information, such as: the Cas subtypes; cassette boundaries; accuracy of the repeat structure, orientation and leader sequence; virus-host interactions; self-targeting; as well as the annotation of cas genes, all of which have been missing from existing tools. This annotation is presented in an interactive interface, making it easy for scientists to gain an overview of the CRISPR system in their organism of interest. Predictions are also rendered in GFF format, enabling in-depth genome browser inspection. In summary, CRISPRloci constitutes a full suite for CRISPR-Cas system characterization that offers annotation quality previously available only after manual inspection.


Subjects
CRISPR-Cas Systems, Clustered Regularly Interspaced Short Palindromic Repeats, Molecular Sequence Annotation, Software, CRISPR-Associated Proteins/classification, CRISPR-Associated Proteins/genetics, Machine Learning
8.
Int J Mol Sci ; 24(7)2023 Mar 27.
Article in English | MEDLINE | ID: mdl-37047235

ABSTRACT

The CRISPR-Cas system has evolved into a cutting-edge technology that has transformed the biological sciences through precise genetic manipulation. The CRISPR/Cas9 nuclease is becoming a revolutionary method for editing any gene of any species with desirable outcomes. The swift advancement of CRISPR-Cas technology is reflected in an ever-expanding ecosystem of bioinformatics tools designed to make CRISPR/Cas9 experiments easier. To assist researchers with efficient guide RNA designs with fewer off-target effects, nuclease target site selection, and experimental validation, bioinformaticians have built and developed a comprehensive set of tools. In this article, we review the various computational tools available for the assessment of off-target effects, as well as the quantification of nuclease activity and specificity, including web-based search tools and experimental methods, and we describe how these tools can be optimized for gene knock-out (KO) and gene knock-in (KI) for model organisms. We also discuss future directions in precision genome editing and its applications, as well as challenges in target selection, particularly in predicting off-target effects.


Subjects
CRISPR-Cas Systems, Gene Editing, Computational Biology/methods, CRISPR-Cas Systems/genetics, Gene Editing/methods, Guide RNA (CRISPR-Cas Systems)
9.
Bioinformatics ; 37(10): 1352-1359, 2021 06 16.
Article in English | MEDLINE | ID: mdl-33226067

ABSTRACT

MOTIVATION: CRISPR-Cas are important systems found in most archaeal and many bacterial genomes, providing adaptive immunity against mobile genetic elements in prokaryotes. The CRISPR-Cas systems are encoded by a set of consecutive cas genes, here termed a cassette. The identification of cassette boundaries is key for finding cassettes in the CRISPR research field. This is often carried out by using Hidden Markov Models and manual annotation. In this article, we propose the first method able to automatically define the cassette boundaries. In addition, we present a Cas-type predictive model used by the method to assign each gene located in the region defined by a cassette's boundaries a Cas label from a set of pre-defined Cas types. Furthermore, the proposed method can detect potentially new cas genes and decompose a cassette into its modules. RESULTS: We evaluate the predictive performance of our proposed method on data collected from the two most recent CRISPR classification studies. In our experiments, we obtain an average similarity of 0.86 between the predicted and expected cassettes. In addition, we achieve F-scores above 0.9 for the classification of cas genes of known types and 0.73 for the unknown ones. Finally, we conduct two additional case studies, in which we investigate the occurrence of potentially new cas genes and the occurrence of module exchange between different genomes. AVAILABILITY AND IMPLEMENTATION: https://github.com/BackofenLab/Casboundary. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Archaea, CRISPR-Cas Systems, Archaea/genetics, CRISPR-Cas Systems/genetics, Clustered Regularly Interspaced Short Palindromic Repeats, Bacterial Genome
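
Illustrative note on record 9: the abstract reports F-scores for the classification of cas genes. For readers unfamiliar with the metric, the block below is a minimal sketch of how an F-score is computed from prediction counts. It assumes the balanced F1 variant, which the abstract does not specify, and the counts in the example are made up rather than taken from the study.

```python
def f_score(tp, fp, fn):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 genes labelled correctly, 10 false positives,
# 10 false negatives. Precision and recall are both 0.9, so F1 is about 0.9.
example = f_score(tp=90, fp=10, fn=10)
```

Because it is a harmonic mean, the score is pulled toward the lower of precision and recall, so a classifier cannot reach a high F-score by doing well on only one of the two.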
10.
J Biol Chem ; 295(39): 13502-13515, 2020 09 25.
Article in English | MEDLINE | ID: mdl-32723866

ABSTRACT

Haloferax volcanii is, to our knowledge, the only prokaryote known to tolerate CRISPR-Cas-mediated damage to its genome in the WT background; the resulting cleavage of the genome is repaired by homologous recombination restoring the WT version. In mutant Haloferax strains with enhanced self-targeting, cell fitness decreases and microhomology-mediated end joining becomes active, generating deletions in the targeted gene. Here we use self-targeting to investigate adaptation in H. volcanii CRISPR-Cas type I-B. We show that self-targeting and genome breakage events that are induced by self-targeting, such as those catalyzed by active transposases, can generate DNA fragments that are used by the CRISPR-Cas adaptation machinery for integration into the CRISPR loci. Low cellular concentrations of self-targeting crRNAs resulted in acquisition of large numbers of spacers originating from the entire genomic DNA. In contrast, high concentrations of self-targeting crRNAs resulted in lower acquisition that was mostly centered on the targeting site. Furthermore, we observed naïve spacer acquisition at a low level in WT Haloferax cells and with higher efficiency upon overexpression of the Cas proteins Cas1, Cas2, and Cas4. Taken together, these findings indicate that naïve adaptation is a regulated process in H. volcanii that operates at low basal levels and is induced by DNA breaks.


Subjects
Physiological Adaptation/genetics, CRISPR-Cas Systems/genetics, Haloferax volcanii/genetics, Archaeal DNA/genetics, Archaeal Genome/genetics, Haloferax volcanii/cytology, High-Throughput Nucleotide Sequencing
11.
Methods ; 172: 3-11, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31326596

ABSTRACT

Clustered regularly interspaced short palindromic repeats (CRISPR) and their associated proteins (Cas) are essential genetic elements in many archaeal and bacterial genomes, playing a key role in a prokaryote adaptive immune system against invasive foreign elements. In recent years, the CRISPR-Cas system has also been engineered to facilitate target gene editing in eukaryotic genomes. Bioinformatics played an essential role in the detection and analysis of CRISPR systems and here we review the bioinformatics-based efforts that pushed the field of CRISPR-Cas research further. We discuss the bioinformatics tools that have been published over the last few years and, finally, present the most popular tools for the design of CRISPR-Cas9 guides.


Subjects
CRISPR-Cas Systems/genetics, Computational Biology/methods, Gene Editing, Algorithms, Computational Biology/trends, Kinetoplastida Guide RNA/genetics
12.
Nucleic Acids Res ; 46(W1): W25-W29, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29788132

ABSTRACT

The Freiburg RNA tools webserver is a well established online resource for RNA-focused research. It provides a unified user interface and comprehensive result visualization for efficient command line tools. The webserver includes RNA-RNA interaction prediction (IntaRNA, CopraRNA, metaMIR), sRNA homology search (GLASSgo), sequence-structure alignments (LocARNA, MARNA, CARNA, ExpaRNA), CRISPR repeat classification (CRISPRmap), sequence design (antaRNA, INFO-RNA, SECISDesign), structure aberration evaluation of point mutations (RaSE), and RNA/protein-family models visualization (CMV), and other methods. Open education resources offer interactive visualizations of RNA structure and RNA-RNA interaction prediction as well as basic and advanced sequence alignment algorithms. The services are freely available at http://rna.informatik.uni-freiburg.de.


Subjects
Base Sequence/genetics, Internet, RNA/genetics, Software, Algorithms, Nucleic Acid Conformation, RNA/chemistry, Sequence Alignment/instrumentation, RNA Sequence Analysis/instrumentation, Structure-Activity Relationship
13.
RNA Biol ; 16(4): 492-503, 2019 04.
Article in English | MEDLINE | ID: mdl-30153081

ABSTRACT

The clustered regularly interspaced short palindromic repeat (CRISPR) system is a prokaryotic adaptive defense system against foreign nucleic acids. In the methanoarchaeon Methanosarcina mazei Gö1, two types of CRISPR-Cas systems are present (type I-B and type III-C). Both loci encode a Cas6 endonuclease, Cas6b-IB and Cas6b-IIIC, typically responsible for maturation of functional short CRISPR RNAs (crRNAs). To evaluate potential cross cleavage activity, we biochemically characterized both Cas6b proteins regarding their crRNA binding behavior and their ability to process pre-crRNA from the respective CRISPR array in vivo. Maturation of crRNA was studied in the respective single deletion mutants by northern blot and RNA-Seq analysis demonstrating that in vivo primarily Cas6b-IB is responsible for crRNA processing of both CRISPR arrays. Tentative protein level evidence for the translation of both Cas6b proteins under standard growth conditions was detected, arguing for different activities or a potential non-redundant role of Cas6b-IIIC within the cell. Conservation of both Cas6 endonucleases was observed in several other M. mazei isolates, though a wide variety was displayed. In general, repeat and leader sequence conservation revealed a close correlation in the M. mazei strains. The repeat sequences from both CRISPR arrays from M. mazei Gö1 contain the same sequence motif with differences only in two nucleotides. These data stand in contrast to all other analyzed M. mazei isolates, which have at least one additional CRISPR array with repeats belonging to another sequence motif. This conforms to the finding that Cas6b-IB is the crucial and functional endonuclease in M. mazei Gö1. Abbreviations: sRNA: small RNA; crRNA: CRISPR RNA; pre-crRNAs: Precursor CRISPR RNA; CRISPR: clustered regularly interspaced short palindromic repeats; Cas: CRISPR associated; nt: nucleotide; RNP: ribonucleoprotein; RBS: ribosome binding site.


Subjects
CRISPR-Associated Proteins/metabolism, CRISPR-Cas Systems/genetics, Methanosarcina/genetics, Post-Transcriptional RNA Processing/genetics, Archaeal RNA/genetics, Base Sequence, Conserved Sequence/genetics, Endonucleases/metabolism, Gene Expression Regulation, Mutation/genetics, Nucleotides/genetics, Repetitive Nucleic Acid Sequences/genetics
14.
RNA Biol ; 16(4): 530-542, 2019 04.
Article in English | MEDLINE | ID: mdl-29911924

ABSTRACT

A study was undertaken to identify conserved proteins that are encoded adjacent to cas gene cassettes of Type III CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats - CRISPR associated) interference modules. Type III modules have been shown to target and degrade dsDNA, ssDNA and ssRNA and are frequently intertwined with cofunctional accessory genes, including genes encoding CRISPR-associated Rossmann fold (CARF) domains. Using a comparative genomics approach, and defining a Type III association score accounting for coevolution and specificity of flanking genes, we identified and classified 39 new Type III associated gene families. Most archaeal and bacterial Type III modules were seen to be flanked by several accessory genes, around half of which did not encode CARF domains and remain of unknown function. Northern blotting and interference assays in Synechocystis confirmed that one particular non-CARF accessory protein family was involved in crRNA maturation. Non-CARF accessory genes were generally diverse, encoding nuclease, helicase, protease, ATPase, transporter and transmembrane domains with some encoding no known domains. We infer that additional families of non-CARF accessory proteins remain to be found. The method employed is scalable for potential application to metagenomic data once automated pipelines for annotation of CRISPR-Cas systems have been developed. All accessory genes found in this study are presented online in a readily accessible and searchable format for researchers to audit their model organism of choice: http://accessory.crispr.dk.


Subjects
Archaea/genetics, Bacteria/genetics, CRISPR-Associated Proteins/genetics, CRISPR-Cas Systems/genetics, Multigene Family, CRISPR-Associated Proteins/chemistry, Chromosome Mapping, Gene Deletion, Phylogeny, Protein Domains, Synechocystis/genetics
15.
RNA Biol ; 16(4): 518-529, 2019 04.
Article in English | MEDLINE | ID: mdl-29995583

ABSTRACT

Novel CRISPR-Cas systems possess substantial potential for genome editing and manipulation of gene expression. The types and numbers of CRISPR-Cas systems vary substantially between different organisms. Some filamentous cyanobacteria harbor > 40 different putative CRISPR repeat-spacer cassettes, while the number of cas gene instances is much lower. Here we addressed the types and diversity of CRISPR-Cas systems and of CRISPR-like repeat-spacer arrays in 171 publicly available genomes of multicellular cyanobacteria. The number of 1328 repeat-spacer arrays exceeded the total of 391 encoded Cas1 proteins suggesting a tendency for fragmentation or the involvement of alternative adaptation factors. The model cyanobacterium Anabaena sp. PCC 7120 contains only three cas1 genes but hosts three Class 1, possibly one Class 2 and five orphan repeat-spacer arrays, all of which exhibit crRNA-typical expression patterns suggesting active transcription, maturation and incorporation into CRISPR complexes. The CRISPR-Cas system within the element interrupting the Anabaena sp. PCC 7120 fdxN gene, as well as analogous arrangements in other strains, occupy the genetic elements that become excised during the differentiation-related programmed site-specific recombination. This fact indicates the propensity of these elements for the integration of CRISPR-cas systems and points to a previously not recognized connection. The gene all3613 resembling a possible Class 2 effector protein is linked to a short repeat-spacer array and a single tRNA gene, similar to its homologs in other cyanobacteria. The diversity and presence of numerous CRISPR-Cas systems in DNA elements that are programmed for homologous recombination make filamentous cyanobacteria a prolific resource for their study. 
Abbreviations: Cas: CRISPR associated sequences; CRISPR: Clustered Regularly Interspaced Short Palindromic Repeats; C2c: Class 2 candidate; SDR: small dispersed repeat; TSS: transcriptional start site; UTR: untranslated region.


Subjects
CRISPR-Cas Systems/genetics, Cyanobacteria/cytology, Cyanobacteria/genetics, Base Sequence, Cell Differentiation/genetics, Bacterial Gene Expression Regulation, Homologous Recombination/genetics, Phylogeny, Synteny/genetics
16.
RNA Biol ; 16(4): 469-480, 2019 04.
Article in English | MEDLINE | ID: mdl-29649958

ABSTRACT

Invading genetic elements pose a constant threat to prokaryotic survival, requiring an effective defence. Eleven years ago, the arsenal of known defence mechanisms was expanded by the discovery of the CRISPR-Cas system. Although CRISPR-Cas is present in the majority of archaea, research often focuses on bacterial models. Here, we provide a perspective based on insights gained studying CRISPR-Cas system I-B of the archaeon Haloferax volcanii. The system relies on more than 50 different crRNAs, whose stability and maintenance critically depend on the proteins Cas5 and Cas7, which bind the crRNA and form the Cascade complex. The interference machinery requires a seed sequence and can interact with multiple PAM sequences. H. volcanii stands out as the first example of an organism that can tolerate autoimmunity via the CRISPR-Cas system while maintaining a constitutively active system. In addition, the H. volcanii system was successfully developed into a tool for gene regulation.


Subjects
CRISPR-Cas Systems/genetics, Haloferax/genetics, Base Sequence, CRISPR-Associated Proteins/metabolism, Archaeal RNA/genetics, Genetic Transcription
17.
Nucleic Acids Res ; 45(2): 915-925, 2017 01 25.
Article in English | MEDLINE | ID: mdl-27599840

ABSTRACT

A hallmark of defense mechanisms based on clustered regularly interspaced short palindromic repeats (CRISPR) and associated sequences (Cas) are the crRNAs that guide these complexes in the destruction of invading DNA or RNA. Three separate CRISPR-Cas systems exist in the cyanobacterium Synechocystis sp. PCC 6803. Based on genetic and transcriptomic evidence, two associated endoribonucleases, Cas6-1 and Cas6-2a, were postulated to be involved in crRNA maturation from CRISPR1 or CRISPR2, respectively. Here, we report a promiscuity of both enzymes to process in vitro not only their cognate transcripts, but also the respective non-cognate precursors, whereas they are specific in vivo. Moreover, while most of the repeats serving as substrates were cleaved in vitro, some were not. RNA structure predictions suggested that the context sequence surrounding a repeat can interfere with its stable folding. Indeed, structure accuracy calculations of the hairpin motifs within the repeat sequences explained the majority of analyzed cleavage reactions, making this a good measure for predicting successful cleavage events. We conclude that the cleavage of CRISPR1 and CRISPR2 repeat instances requires a stable formation of the characteristic hairpin motif, which is similar between the two types of repeats. The influence of surrounding sequences might partially explain variations in crRNA abundances and should be considered when designing artificial CRISPR arrays.


Subjects
CRISPR-Cas Systems, RNA/genetics, Clustered Regularly Interspaced Short Palindromic Repeats, Endoribonucleases/metabolism, Inverted Repeat Sequences, Nucleic Acid Conformation, Oligoribonucleotides/chemistry, Oligoribonucleotides/genetics, RNA Cleavage, Ribonucleases/metabolism, Substrate Specificity
18.
Mol Microbiol ; 103(1): 151-164, 2017 01.
Article in English | MEDLINE | ID: mdl-27743417

ABSTRACT

Archaeal and eukaryotic organisms contain sets of C/D box s(no)RNAs with guide sequences that determine ribose 2'-O-methylation sites of target RNAs. The composition of these C/D box sRNA sets is highly variable between organisms and results in varying RNA modification patterns which are important for ribosomal RNA folding and stability. Little is known about the genomic organization of C/D box sRNA genes in archaea. Here, we aimed to obtain first insights into the biogenesis of these archaeal C/D box sRNAs and analyzed the genetic context of more than 300 archaeal sRNA genes. We found that the majority of these genes do not possess independent promoters but are rather located at positions that allow for co-transcription with neighboring genes and their start or stop codons were frequently incorporated into the conserved boxC and D motifs. The biogenesis of plasmid-encoded C/D box sRNA variants was analyzed in vivo in Sulfolobus acidocaldarius. It was found that C/D box sRNA maturation occurs independent of their genetic context and relies solely on the presence of intact RNA kink-turn structures. The observed plasticity of C/D box sRNA biogenesis is suggested to enable their accelerated evolution and, consequently, allow for adjustments of the RNA modification landscape.


Subjects
Archaea/genetics , Small Nuclear RNA/metabolism , Small Nucleolar RNA/metabolism , Archaea/metabolism , Archaeal Proteins/genetics , Archaeal Proteins/metabolism , Base Sequence/genetics , Archaeal Genes/genetics , Molecular Sequence Data , Nucleic Acid Conformation , Nucleotide Motifs/genetics , Genetic Promoter Regions/genetics , Ribosomal RNA/genetics , Small Nuclear RNA/genetics , Small Nucleolar RNA/genetics
19.
J Virol ; 91(22)2017 11 15.
Article in English | MEDLINE | ID: mdl-28878086

ABSTRACT

A novel archaeal lytic virus targeting species of the genus Methanosarcina was isolated using Methanosarcina mazei strain Gö1 as the host. Due to its spherical morphology, the virus was designated Methanosarcina spherical virus (MetSV). Molecular analysis demonstrated that MetSV contains double-stranded linear DNA with a genome size of 10,567 bp containing 22 open reading frames (ORFs), all oriented in the same direction. Functions were predicted for some of these ORFs, such as DNA polymerase, ATPase, and DNA-binding protein as well as envelope (structural) protein. MetSV-derived spacers in CRISPR loci were detected in several published Methanosarcina draft genomes using bioinformatic tools, revealing a potential protospacer-adjacent motif (PAM) (TTA/T). Transcription and expression of several predicted viral ORFs were validated by reverse transcription-PCR (RT-PCR), PAGE analysis, and liquid chromatography-mass spectrometry (LC-MS)-based proteomics. Analysis of core lipids by atmospheric pressure chemical ionization (APCI) mass spectrometry showed that MetSV and Methanosarcina mazei both contain archaeol and glycerol dialkyl glycerol tetraether without a cyclopentane moiety (GDGT-0). The MetSV host range is limited to Methanosarcina strains growing as single cells (M. mazei, Methanosarcina barkeri and Methanosarcina soligelidi). In contrast, strains growing as sarcina-like aggregates were apparently protected from infection. Heterogeneity related to morphology phases in M. mazei cultures allowed acquisition of resistance to MetSV after challenge by growing cultures as sarcina-like aggregates. CRISPR/Cas-mediated resistance was excluded since neither of the two CRISPR arrays showed MetSV-derived spacer acquisition. Based on these findings, we propose that changing the morphology from single cells to sarcina-like aggregates upon rearrangement of the envelope structure prevents infection and subsequent lysis by MetSV. IMPORTANCE: Methanoarchaea are among the most abundant organisms on the planet since they are present in high numbers in major anaerobic environments. They convert various carbon sources, e.g., acetate, methylamines, or methanol, to methane and carbon dioxide; thus, they have a significant impact on the emission of major greenhouse gases. Today, very little is known about viruses specifically infecting methanoarchaea that most probably impact the abundance of methanoarchaea in microbial consortia. Here, we characterize the first identified Methanosarcina-infecting virus (MetSV) and show a mechanism for acquiring resistance against MetSV. Based on our results, we propose that growth as sarcina-like aggregates prevents infection and subsequent lysis. These findings allow new insights into the virus-host relationship in methanogenic community structures, their dynamics, and their phase heterogeneity. Moreover, the availability of a specific virus provides new possibilities to deepen our knowledge of the defense mechanisms of potential hosts and offers tools for genetic manipulation.


Subjects
Archaeal Viruses/physiology , Methanosarcina/virology , Methanosarcina/genetics , Species Specificity
20.
Bioinformatics ; 32(17): i576-i585, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27587677

ABSTRACT

MOTIVATION: The CRISPR-Cas system is an adaptive immune system in many archaea and bacteria, which provides resistance against invading genetic elements. The first phase of CRISPR-Cas immunity is called adaptation, in which small DNA fragments are excised from genetic elements and are inserted into a CRISPR array, generally adjacent to its so-called leader sequence at one end of the array. It has been shown that transcription initiation and adaptation signals of the CRISPR array are located within the leader. However, apart from promoters, there is very little knowledge of sequence or structural motifs or their possible functions. Leader properties have mainly been characterized through transcriptional initiation data from single organisms, but large-scale characterization of leaders has remained challenging due to their low level of sequence conservation. RESULTS: We developed a method to successfully detect leader sequences by focusing on the consensus repeat of the adjacent CRISPR array and weak upstream conservation signals. We applied our tool to the analysis of a comprehensive genomic database and identified several characteristic properties of leader sequences specific to archaea and bacteria, ranging from distinctive sizes to preferential indel localization. CRISPRleader provides a full annotation of the CRISPR array, its strand orientation as well as conserved core leader boundaries that can be uploaded to any genome browser. In addition, it outputs reader-friendly HTML pages for conserved leader clusters from our database. AVAILABILITY AND IMPLEMENTATION: CRISPRleader and multiple sequence alignments for all 195 leader clusters are available at http://www.bioinf.uni-freiburg.de/Software/CRISPRleader/ CONTACT: costa@informatik.uni-freiburg.de or backofen@informatik.uni-freiburg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
CRISPR-Cas Systems , Clustered Regularly Interspaced Short Palindromic Repeats , Archaea , Base Sequence , Conserved Sequence , Genetic Loci , Molecular Sequence Annotation , Sequence Alignment