Search | VHL Search Portal

The Gene Sculpt Suite: a set of tools for genome editing.

Mann, Carla M; Martínez-Gálvez, Gabriel; Welker, Jordan M; Wierson, Wesley A; Ata, Hirotaka; Almeida, Maira P; Clark, Karl J; Essner, Jeffrey J; McGrail, Maura; Ekker, Stephen C; Dobbs, Drena.

Nucleic Acids Res ; 47(W1): W175-W182, 2019 07 02.

Article in English | MEDLINE | ID: mdl-31127311

ABSTRACT

The discovery and development of DNA-editing nucleases (Zinc Finger Nucleases, TALENs, CRISPR/Cas systems) has given scientists the ability to precisely engineer or edit genomes as never before. Several different platforms, protocols and vectors for precision genome editing are now available, leading to the development of supporting web-based software. Here we present the Gene Sculpt Suite (GSS), which comprises three tools: (i) GTagHD, which automatically designs and generates oligonucleotides for use with the GeneWeld knock-in protocol; (ii) MEDJED, a machine learning method, which predicts the extent to which a double-stranded DNA break site will utilize the microhomology-mediated repair pathway; and (iii) MENTHU, a tool for identifying genomic locations likely to give rise to a single predominant microhomology-mediated end joining allele (PreMA) repair outcome. All tools in the GSS are freely available for download under the GPL v3.0 license and can be run locally on Windows, Mac and Linux systems capable of running R and/or Docker. The GSS is also freely available online at www.genesculpt.org.

Subject(s)

Databases, Genetic; Gene Editing; Genetic Engineering/methods; Software; Animals; CRISPR-Cas Systems/genetics; DNA Breaks, Double-Stranded; Humans; Transcription Activator-Like Effector Nucleases/genetics; Zinc Finger Nucleases/genetics

Robust activation of microhomology-mediated end joining for precision gene editing applications.

Ata, Hirotaka; Ekstrom, Thomas L; Martínez-Gálvez, Gabriel; Mann, Carla M; Dvornikov, Alexey V; Schaefbauer, Kyle J; Ma, Alvin C; Dobbs, Drena; Clark, Karl J; Ekker, Stephen C.

PLoS Genet ; 14(9): e1007652, 2018 09.

Article in English | MEDLINE | ID: mdl-30208061

ABSTRACT

One key problem in precision genome editing is the unpredictable plurality of sequence outcomes at the site of targeted DNA double-stranded breaks (DSBs). This is due to the typical activation of the versatile Non-homologous End Joining (NHEJ) pathway. Such unpredictability limits the utility of somatic gene editing for applications including gene therapy and functional genomics. For germline editing work, the accurate reproduction of identical alleles using NHEJ is a labor-intensive process. In this study, we propose Microhomology-mediated End Joining (MMEJ) as a viable solution for improving somatic sequence homogeneity in vivo, capable of generating a single predictable allele at high rates (56% to 86% of the entire mutant allele pool). Using a combined dataset from zebrafish (Danio rerio) in vivo and human HeLa cells in vitro, we identified specific contextual sequence determinants surrounding genomic DSBs for robust MMEJ pathway activation. We then applied our observation to prospectively design MMEJ-inducing sgRNAs against a variety of proof-of-principle genes and demonstrated high levels of mutant allele homogeneity. MMEJ-based DNA repair at these target loci successfully generated F0 mutant zebrafish embryos and larvae that faithfully recapitulated previously reported, recessive, loss-of-function phenotypes. We also tested the generalizability of our approach in cultured human cells. Finally, we provide a novel algorithm, MENTHU (http://genesculpt.org/menthu/), for improved and facile prediction of candidate MMEJ loci. We believe that this MMEJ-centric approach will have a broader impact on genome engineering and its applications. For example, whereas somatic mosaicism hinders efficient recreation of a knockout mutant allele at base pair resolution via the standard NHEJ-based approach, we demonstrate that F0 founders transmitted the identical MMEJ allele of interest at high rates. Most importantly, the ability to directly dictate the reading frame of an endogenous target will have important implications for gene therapy applications in human genetic diseases.
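
The idea of a microhomology flanking a DSB can be illustrated with a minimal Python sketch. This is not the MENTHU algorithm, whose scoring the abstract does not describe. It only lists fixed-length repeats that occur on both sides of a cut site and the deletion that joining them would produce. The function names, the repeat length `k`, the search `window`, and the example sequence are all assumptions made for illustration.

```python
def find_microhomologies(seq, cut, k=3, window=20):
    """List k-mers found both upstream and downstream of a cut site.

    Returns tuples (motif, left_start, right_start, deletion_len), where
    deletion_len is the number of bases lost if the two copies are joined
    so that only one copy of the motif remains.
    Illustrative only: k and window are arbitrary assumed defaults.
    """
    seq = seq.upper()
    start = max(0, cut - window)
    left = seq[start:cut]            # sequence upstream of the break
    right = seq[cut:cut + window]    # sequence downstream of the break
    hits = []
    for i in range(len(left) - k + 1):
        motif = left[i:i + k]
        j = right.find(motif)
        while j != -1:
            left_start = start + i
            right_start = cut + j
            hits.append((motif, left_start, right_start, right_start - left_start))
            j = right.find(motif, j + 1)
    return hits


def mmej_product(seq, left_start, right_start):
    """Sequence left after deleting everything from one repeat copy to the other."""
    return seq[:left_start] + seq[right_start:]


# Hypothetical 21 bp sequence with a cut between positions 10 and 11.
example = "AAGCTACCGGA" + "TTTCCGTTAA"
hits = find_microhomologies(example, cut=11)
```

Each hit's `deletion_len % 3` indicates whether the predicted deletion would preserve the reading frame, which is the property the abstract highlights for gene therapy applications.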

Subject(s)

DNA Breaks, Double-Stranded; DNA End-Joining Repair/genetics; Gene Editing/methods; Models, Genetic; Algorithms; Alleles; Animals; Feasibility Studies; Female; Genetic Diseases, Inborn/genetics; Genetic Diseases, Inborn/therapy; Genetic Therapy/methods; HeLa Cells; Humans; Male; Mutagenesis, Site-Directed; RNA, Guide, Kinetoplastida/genetics; RNA, Guide, Kinetoplastida/metabolism; Zebrafish

GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics.

Zvyagin, Maxim; Brace, Alexander; Hippe, Kyle; Deng, Yuntian; Zhang, Bin; Bohorquez, Cindy Orozco; Clyde, Austin; Kale, Bharat; Perez-Rivera, Danilo; Ma, Heng; Mann, Carla M; Irvin, Michael; Pauloski, J Gregory; Ward, Logan; Hayot-Sasson, Valerie; Emani, Murali; Foreman, Sam; Xie, Zhen; Lin, Diangen; Shukla, Maulik; Nie, Weili; Romero, Josh; Dallago, Christian; Vahdat, Arash; Xiao, Chaowei; Gibbs, Thomas; Foster, Ian; Davis, James J; Papka, Michael E; Brettin, Thomas; Stevens, Rick; Anandkumar, Anima; Vishwanath, Venkatram; Ramanathan, Arvind.

bioRxiv ; 2022 Nov 23.

Article in English | MEDLINE | ID: mdl-36451881

ABSTRACT

We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2 genomes. By pre-training on over 110 million prokaryotic gene sequences and fine-tuning a SARS-CoV-2-specific model on 1.5 million genomes, we show that GenSLMs can accurately and rapidly identify variants of concern. Thus, to our knowledge, GenSLMs represents one of the first whole genome scale foundation models which can generalize to other prediction tasks. We demonstrate scaling of GenSLMs on GPU-based supercomputers and AI-hardware accelerators utilizing 1.63 Zettaflops in training runs with a sustained performance of 121 PFLOPS in mixed precision and peak of 850 PFLOPS. We present initial scientific insights from examining GenSLMs in tracking evolutionary dynamics of SARS-CoV-2, paving the path to realizing this on large biological data.

GeneWeld: Efficient Targeted Integration Directed by Short Homology in Zebrafish.

Welker, Jordan M; Wierson, Wesley A; Almeida, Maira P; Mann, Carla M; Torrie, Melanie E; Ming, Zhitao; Ekker, Stephen C; Clark, Karl J; Dobbs, Drena L; Essner, Jeffrey J; McGrail, Maura.

Bio Protoc ; 11(14): e4100, 2021 Jul 20.

Article in English | MEDLINE | ID: mdl-34395736

ABSTRACT

Efficient precision genome engineering requires high frequency and specificity of integration at the genomic target site. Multiple design strategies for zebrafish gene targeting have previously been reported with widely varying frequencies for germline recovery of integration alleles. The GeneWeld protocol and pGTag (plasmids for Gene Tagging) vector series provide a set of resources to streamline precision gene targeting in zebrafish. Our approach uses short homology of 24-48 bp to drive targeted integration of DNA reporter cassettes by homology-mediated end joining (HMEJ) at a CRISPR/Cas-induced DNA double-strand break. The pGTag vectors contain reporters flanked by a universal CRISPR sgRNA sequence to liberate the targeting cassette in vivo and expose homology arms for homology-driven integration. Germline transmission rates for precision-targeted integration alleles range from 22% to 100%. Our system provides a streamlined, straightforward, and cost-effective approach for high-efficiency gene targeting applications in zebrafish. Graphic abstract: GeneWeld method for CRISPR/Cas9 targeted integration.

Efficient targeted integration directed by short homology in zebrafish and mammalian cells.

Wierson, Wesley A; Welker, Jordan M; Almeida, Maira P; Mann, Carla M; Webster, Dennis A; Torrie, Melanie E; Weiss, Trevor J; Kambakam, Sekhar; Vollbrecht, Macy K; Lan, Merrina; McKeighan, Kenna C; Levey, Jacklyn; Ming, Zhitao; Wehmeier, Alec; Mikelson, Christopher S; Haltom, Jeffrey A; Kwan, Kristen M; Chien, Chi-Bin; Balciunas, Darius; Ekker, Stephen C; Clark, Karl J; Webber, Beau R; Moriarity, Branden S; Solin, Stacy L; Carlson, Daniel F; Dobbs, Drena L; McGrail, Maura; Essner, Jeffrey.

Elife ; 9, 2020 05 15.

Article in English | MEDLINE | ID: mdl-32412410

ABSTRACT

Efficient precision genome engineering requires high frequency and specificity of integration at the genomic target site. Here, we describe a set of resources to streamline reporter gene knock-ins in zebrafish and demonstrate the broader utility of the method in mammalian cells. Our approach uses short homology of 24-48 bp to drive targeted integration of DNA reporter cassettes by homology-mediated end joining (HMEJ) at high frequency at a double-strand break in the targeted gene. Our vector series, pGTag (plasmids for Gene Tagging), contains reporters flanked by a universal CRISPR sgRNA sequence which enables in vivo liberation of the homology arms. We observed high rates of germline transmission (22-100%) for targeted knock-ins at eight zebrafish loci and efficient integration at safe harbor loci in porcine and human cells. Our system provides a straightforward and cost-effective approach for high-efficiency gene targeting applications in CRISPR- and TALEN-compatible systems.
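
As a rough illustration of what "short homology of 24-48 bp" means in sequence terms, the sketch below takes the bases immediately flanking a cut site as the two homology arms. It is not the GTagHD design procedure, which may apply additional rules not stated in these abstracts. The function name, the coordinate convention (cut falls between `cut - 1` and `cut`), and the default arm length are assumptions.

```python
def homology_arms(seq, cut, arm_len=24):
    """Return (upstream_arm, downstream_arm) flanking a cut site.

    arm_len is limited to the 24-48 bp range quoted in the abstract.
    The cut is assumed to fall between seq[cut - 1] and seq[cut].
    """
    if not 24 <= arm_len <= 48:
        raise ValueError("arm_len must be between 24 and 48 bp")
    if cut - arm_len < 0 or cut + arm_len > len(seq):
        raise ValueError("not enough flanking sequence for the requested arm length")
    upstream = seq[cut - arm_len:cut]       # arm ending at the break
    downstream = seq[cut:cut + arm_len]     # arm starting at the break
    return upstream, downstream


# Hypothetical 60 bp target with the break in the middle.
target = "A" * 30 + "C" * 30
up_arm, down_arm = homology_arms(target, cut=30, arm_len=24)
```

Validating the arm length up front keeps the sketch within the range the authors report and makes out-of-range requests fail loudly instead of returning truncated arms.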

Subject(s)

CRISPR-Associated Proteins/genetics; CRISPR-Cas Systems; Clustered Regularly Interspaced Short Palindromic Repeats; Gene Knock-In Techniques; Genes, Reporter; Green Fluorescent Proteins/genetics; Transcription Activator-Like Effector Nucleases/genetics; Zebrafish/genetics; Animals; Animals, Genetically Modified; CRISPR-Associated Proteins/metabolism; Fibroblasts/metabolism; Gene Expression Regulation; Green Fluorescent Proteins/metabolism; Humans; K562 Cells; Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics; Leukemia, Myelogenous, Chronic, BCR-ABL Positive/metabolism; RNA, Guide, Kinetoplastida/genetics; RNA, Guide, Kinetoplastida/metabolism; Recombinational DNA Repair; Sequence Homology, Nucleic Acid; Sus scrofa; Transcription Activator-Like Effector Nucleases/metabolism

Computational Prediction of RNA-Protein Interactions.

Mann, Carla M; Muppirala, Usha K; Dobbs, Drena.

Methods Mol Biol ; 1543: 169-185, 2017.

Article in English | MEDLINE | ID: mdl-28349426

ABSTRACT

Experimental methods for identifying protein(s) bound by a specific promoter-associated RNA (paRNA) of interest can be expensive, difficult, and time-consuming. This chapter describes a general computational framework for identifying potential binding partners in RNA-protein complexes or RNA-protein interaction networks. Protocols for using three web-based tools to predict RNA-protein interaction partners are outlined. Also, tables listing additional webservers and software tools for predicting RNA-protein interactions, as well as databases that contain valuable information about known RNA-protein complexes and recognition sites for RNA-binding proteins, are provided. Although only one of the tools described, lncPro, was designed expressly to identify proteins that bind long noncoding RNAs (including paRNAs), all three approaches can be applied to predict potential binding partners for both coding and noncoding RNAs (ncRNAs).

Subject(s)

Computational Biology/methods; RNA-Binding Proteins/chemistry; RNA-Binding Proteins/metabolism; RNA/chemistry; RNA/metabolism; Software; Binding Sites; Computer Simulation; Databases, Genetic; Protein Binding; RNA/genetics; Search Engine; Support Vector Machine; Web Browser

A motif-based method for predicting interfacial residues in both the RNA and protein components of protein-RNA complexes.

Muppirala, Usha; Lewis, Benjamin A; Mann, Carla M; Dobbs, Drena.

Pac Symp Biocomput ; 21: 445-455, 2016.

Article in English | MEDLINE | ID: mdl-26776208

ABSTRACT

Efforts to predict interfacial residues in protein-RNA complexes have largely focused on predicting RNA-binding residues in proteins. Computational prediction of protein-binding residues in RNA sequences, however, has received relatively little attention to date. Although the value of sequence motifs for classifying and annotating protein sequences is well established, sequence motifs have not been widely applied to predicting interfacial residues in macromolecular complexes. Here, we propose a novel sequence motif-based method for "partner-specific" interfacial residue prediction. Given a specific protein-RNA pair, the goal is to simultaneously predict RNA-binding residues in the protein sequence and protein-binding residues in the RNA sequence. In 5-fold cross validation experiments, our method, PS-PRIP, achieved 92% Specificity and 61% Sensitivity, with a Matthews correlation coefficient (MCC) of 0.58 in predicting RNA-binding sites in proteins. The method achieved 69% Specificity and 75% Sensitivity, but with a low MCC of 0.13 in predicting protein-binding sites in RNAs. Similar performance results were obtained when PS-PRIP was tested on two independent "blind" datasets of experimentally validated protein-RNA interactions, suggesting the method should be widely applicable and valuable for identifying potential interfacial residues in protein-RNA complexes for which structural information is not available. The PS-PRIP webserver and datasets are available at: http://pridb.gdcb.iastate.edu/PSPRIP/.
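
The three metrics quoted in this abstract can be computed from a confusion matrix as in the following minimal Python sketch. The formulas are the standard definitions of sensitivity, specificity, and MCC. The function name and the example counts are hypothetical and are not taken from the PS-PRIP evaluation.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    MCC is reported as 0.0 when its denominator is zero, a common convention.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical imbalanced case: 100 binding residues, 1000 non-binding residues.
sens, spec, mcc = binary_metrics(tp=75, fp=310, tn=690, fn=25)
```

With these made-up counts, sensitivity is 0.75 and specificity is 0.69, yet the MCC stays well below either value because false positives outnumber true positives. Class imbalance of this kind is one possible reason a method can show moderate sensitivity and specificity alongside a low MCC; the abstract itself does not state the cause.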

Subject(s)

RNA-Binding Proteins/chemistry; RNA-Binding Proteins/metabolism; RNA/chemistry; RNA/metabolism; Amino Acid Motifs; Amino Acid Sequence; Base Sequence; Binding Sites/genetics; Computational Biology/methods; Computational Biology/statistics & numerical data; Databases, Nucleic Acid/statistics & numerical data; Databases, Protein/statistics & numerical data; Escherichia coli Proteins/chemistry; Escherichia coli Proteins/genetics; Escherichia coli Proteins/metabolism; Models, Molecular; Protein Binding; RNA/genetics; RNA, Bacterial/chemistry; RNA, Bacterial/genetics; RNA, Bacterial/metabolism; RNA, Ribosomal, 16S/chemistry; RNA, Ribosomal, 16S/genetics; RNA, Ribosomal, 16S/metabolism; RNA-Binding Proteins/genetics; Ribosomal Proteins/chemistry; Ribosomal Proteins/genetics; Ribosomal Proteins/metabolism; Software
