CRISPRstrand: predicting repeat orientations to determine the crRNA-encoding strand at CRISPR loci.
Alkhnbashi, Omer S; Costa, Fabrizio; Shah, Shiraz A; Garrett, Roger A; Saunders, Sita J; Backofen, Rolf.
Affiliation
  • Alkhnbashi OS; Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany, Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, Univers
  • Costa F; Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany, Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, Univers
  • Shah SA; Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany, Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, Univers
  • Garrett RA; Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany, Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, Univers
  • Saunders SJ; Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany, Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, Univers
  • Backofen R; Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany, Department of Biology, University of Copenhagen, Archaea Centre, Ole Maaloes Vej 5, DK2200 Copenhagen, Denmark and BIOSS Centre for Biological Signalling Studies, Cluster of Excellence, Univers
Bioinformatics; 30(17): i489-96, 2014 Sep 01.
Article in En | MEDLINE | ID: mdl-25161238
ABSTRACT
MOTIVATION:

The discovery of CRISPR-Cas systems almost 20 years ago rapidly changed our perception of the bacterial and archaeal immune systems. CRISPR loci consist of several repetitive DNA sequences called repeats, inter-spaced by stretches of variable length sequences called spacers. This CRISPR array is transcribed and processed into multiple mature RNA species (crRNAs). A single crRNA is integrated into an interference complex, together with CRISPR-associated (Cas) proteins, to bind and degrade invading nucleic acids. Although existing bioinformatics tools can recognize CRISPR loci by their characteristic repeat-spacer architecture, they generally output CRISPR arrays of ambiguous orientation and thus do not determine the strand from which crRNAs are processed. Knowledge of the correct orientation is crucial for many tasks, including the classification of CRISPR conservation, the detection of leader regions, the identification of target sites (protospacers) on invading genetic elements and the characterization of protospacer-adjacent motifs.
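To make the orientation ambiguity concrete: a repeat reported by a detection tool may lie on either strand, so the same repeat has two candidate readings, the sequence as given and its reverse complement. The Python sketch below only enumerates these two candidates. It is an illustration under stated assumptions, not the CRISPRstrand method; the function names and the repeat sequence are hypothetical and invented for this example.

```python
from typing import Tuple

# Translation table for DNA base complements (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_orientations(repeat: str) -> Tuple[str, str]:
    """Return both possible strand readings of a repeat.

    Hypothetical helper: a predictor would have to pick one of these
    two readings as the crRNA-encoding orientation.
    """
    return repeat, reverse_complement(repeat)


# Made-up repeat sequence, for illustration only.
repeat = "GTTTCAGACGAACCCTTGTGGGGTTGAAGC"
as_given, flipped = candidate_orientations(repeat)
print(as_given)
print(flipped)
```

Downstream analyses such as protospacer searches or leader detection would give different results depending on which of the two readings is taken as the crRNA-encoding one.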

RESULTS:

We present a fast and accurate tool to determine the crRNA-encoding strand at CRISPR loci by predicting the correct orientation of repeats based on an advanced machine learning approach. Both the repeat sequence and mutation information were encoded and processed by an efficient graph kernel to learn higher-order correlations. The model was trained and tested on curated data comprising >4500 CRISPRs and yielded an AUC ROC (area under the receiver operating characteristic curve) of 0.95. In addition, we show that accurate orientation information greatly improved detection of conserved repeat sequence families and structure motifs. We integrated CRISPRstrand predictions into our CRISPRmap web server of CRISPR conservation and updated the latter to version 2.0.
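For readers unfamiliar with the reported metric, the AUC ROC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The Python sketch below computes it directly from that definition. The function name, labels and scores are hypothetical toy values for illustration; they are not data or code from the study.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Compute AUC ROC by pairwise comparison of positives and negatives.

    labels: 1 for positive examples, 0 for negative examples.
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for this example.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(auc_roc(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch; a rank-based formulation would be preferable for large evaluation sets.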

AVAILABILITY:

CRISPRmap and CRISPRstrand are available at http://rna.informatik.uni-freiburg.de/CRISPRmap.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: RNA / Clustered Regularly Interspaced Short Palindromic Repeats Type of study: Prognostic_studies / Risk_factors_studies Language: En Journal: Bioinformatics Journal subject: INFORMATICA MEDICA Year: 2014 Document type: Article