Interpretable CRISPR/Cas9 off-target activities with mismatches and indels prediction using BERT.
Luo, Ye; Chen, Yaowen; Xie, HuanZeng; Zhu, Wentao; Zhang, Guishan.
Affiliation
  • Luo Y; College of Engineering, Shantou University, Shantou, 515063, China.
  • Chen Y; College of Engineering, Shantou University, Shantou, 515063, China.
  • Xie H; College of Engineering, Shantou University, Shantou, 515063, China.
  • Zhu W; College of Engineering, Shantou University, Shantou, 515063, China.
  • Zhang G; College of Engineering, Shantou University, Shantou, 515063, China. Electronic address: gszhang@stu.edu.cn.
Comput Biol Med; 169: 107932, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38199209
ABSTRACT
Off-target effects of CRISPR/Cas9 can lead to suboptimal genome editing outcomes. Numerous deep learning-based approaches have achieved excellent performance in off-target prediction; however, few can predict off-target activities involving both mismatches and indels between a single guide RNA (sgRNA) and target DNA sequence pair. In addition, data imbalance is a common pitfall in off-target prediction. Moreover, due to the complexity of genomic contexts, building an interpretable model remains challenging. To address these issues, we first developed a BERT-based model called CRISPR-BERT to improve the prediction of off-target activities with both mismatches and indels. Second, we proposed an adaptive batch-wise class balancing strategy to combat the noise that exists in imbalanced off-target data. Finally, we applied a visualization approach to investigate the generalizable nucleotide position-dependent patterns of sgRNA-DNA pairs for off-target activity. In a comprehensive comparison with existing methods on five mismatches-only datasets and two mismatches-and-indels datasets, CRISPR-BERT achieved the best performance in terms of AUROC and PRAUC. In addition, the visualization analysis demonstrated how implicit knowledge learned by CRISPR-BERT facilitates off-target prediction, which shows potential for model interpretability. Collectively, CRISPR-BERT provides an accurate and interpretable framework for off-target prediction and further contributes to sgRNA optimization in practical use for improved target specificity in CRISPR/Cas9 genome editing. The source code is available at https://github.com/BrokenStringx/CRISPR-BERT.
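The abstract does not spell out how the adaptive batch-wise class balancing works, so the following is only a generic illustration of the underlying idea: building each training batch with a fixed share of the rare positive (off-target) class. It is a minimal sketch, not the authors' method; the function name `balanced_batches`, the `pos_fraction` parameter, and the toy label list are all hypothetical.

```python
import random


def balanced_batches(labels, batch_size, pos_fraction=0.5, seed=0):
    """Yield batches of sample indices with a fixed share of positives.

    Hypothetical helper for illustration only (not from CRISPR-BERT).
    Positives (label 1) are the rare class and are drawn with replacement;
    negatives (label 0) are shuffled once and each is used at most once.
    """
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    # Keep at least one sample of each class in every batch.
    n_pos = min(batch_size - 1, max(1, round(batch_size * pos_fraction)))
    n_neg = batch_size - n_pos

    rng.shuffle(neg)
    # Walk through the negatives in chunks; drop the final incomplete chunk.
    for start in range(0, len(neg) - n_neg + 1, n_neg):
        batch = rng.choices(pos, k=n_pos) + neg[start:start + n_neg]
        rng.shuffle(batch)
        yield batch


# Toy imbalanced data: 5 positives and 95 negatives (assumed, not real data).
labels = [1] * 5 + [0] * 95
batches = list(balanced_batches(labels, batch_size=10))
print(len(batches), "batches of", len(batches[0]), "samples")
```

Oversampling the minority class within each batch is one common way to keep the gradient signal from being dominated by negatives; an adaptive variant would adjust the class share during training rather than fix it up front.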

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: CRISPR-Cas Systems / RNA, Guide, CRISPR-Cas Systems Study type: Prognostic_studies / Risk_factors_studies Language: En Journal: Comput Biol Med Publication year: 2024 Document type: Article
