Results 1 - 20 of 38
1.
Cell ; 173(3): 665-676.e14, 2018 04 19.
Article in English | MEDLINE | ID: mdl-29551272

ABSTRACT

Class 2 CRISPR-Cas systems endow microbes with diverse mechanisms for adaptive immunity. Here, we analyzed prokaryotic genome and metagenome sequences to identify an uncharacterized family of RNA-guided, RNA-targeting CRISPR systems that we classify as type VI-D. Biochemical characterization and protein engineering of seven distinct orthologs generated a ribonuclease effector derived from Ruminococcus flavefaciens XPD3002 (CasRx) with robust activity in human cells. CasRx-mediated knockdown exhibits high efficiency and specificity relative to RNA interference across diverse endogenous transcripts. As one of the most compact single-effector Cas enzymes, CasRx can also be flexibly packaged into adeno-associated virus. We target virally encoded, catalytically inactive CasRx to cis elements of pre-mRNA to manipulate alternative splicing, alleviating dysregulated tau isoform ratios in a neuronal model of frontotemporal dementia. Our results present CasRx as a programmable RNA-binding module for efficient targeting of cellular RNA, enabling a general platform for transcriptome engineering and future therapeutic development.


Subject(s)
CRISPR-Cas Systems; Computational Biology/methods; Genetic Engineering/methods; Protein Engineering/methods; RNA/analysis; Alternative Splicing; Animals; Bacterial Proteins/metabolism; Cell Differentiation; Escherichia coli/metabolism; Gene Expression Profiling; HEK293 Cells; Humans; Induced Pluripotent Stem Cells/cytology; Lentivirus/genetics; Mice; RNA Interference; RNA, Guide, Kinetoplastida/genetics; Ruminococcus; Sequence Analysis, RNA; Transcriptome
2.
Cell ; 175(1): 212-223.e17, 2018 09 20.
Article in English | MEDLINE | ID: mdl-30241607

ABSTRACT

CRISPR-Cas endonucleases directed against foreign nucleic acids mediate prokaryotic adaptive immunity and have been tailored for broad genetic engineering applications. Type VI-D CRISPR systems contain the smallest known family of single effector Cas enzymes, and their signature Cas13d ribonuclease employs guide RNAs to cleave matching target RNAs. To understand the molecular basis for Cas13d function and explain its compact molecular architecture, we resolved cryoelectron microscopy structures of Cas13d-guide RNA binary complex and Cas13d-guide-target RNA ternary complex to 3.4 and 3.3 Å resolution, respectively. Furthermore, a 6.5 Å reconstruction of apo Cas13d combined with hydrogen-deuterium exchange revealed conformational dynamics that have implications for RNA scanning. These structures, together with biochemical and cellular characterization, provide insights into its RNA-guided, RNA-targeting mechanism and delineate a blueprint for the rational design of improved transcriptome engineering technologies.


Subject(s)
CRISPR-Cas Systems/genetics; RNA, Guide, Kinetoplastida/physiology; Ribonucleases/physiology; CRISPR-Cas Systems/physiology; Clustered Regularly Interspaced Short Palindromic Repeats/genetics; Cryoelectron Microscopy/methods; Endonucleases/metabolism; HEK293 Cells; Humans; Molecular Conformation; RNA/genetics; RNA, Guide, Kinetoplastida/genetics; RNA, Guide, Kinetoplastida/ultrastructure; Ribonucleases/metabolism; Ribonucleases/ultrastructure
3.
Cell ; 164(5): 950-61, 2016 Feb 25.
Article in English | MEDLINE | ID: mdl-26875867

ABSTRACT

The RNA-guided endonuclease Cas9 cleaves double-stranded DNA targets complementary to the guide RNA and has been applied to programmable genome editing. Cas9-mediated cleavage requires a protospacer adjacent motif (PAM) juxtaposed with the DNA target sequence, thus constricting the range of targetable sites. Here, we report the 1.7 Å resolution crystal structures of Cas9 from Francisella novicida (FnCas9), one of the largest Cas9 orthologs, in complex with a guide RNA and its PAM-containing DNA targets. A structural comparison of FnCas9 with other Cas9 orthologs revealed striking conserved and divergent features among distantly related CRISPR-Cas9 systems. We found that FnCas9 recognizes the 5'-NGG-3' PAM, and used the structural information to create a variant that can recognize the more relaxed 5'-YG-3' PAM. Furthermore, we demonstrated that the FnCas9-ribonucleoprotein complex can be microinjected into mouse zygotes to edit endogenous sites with the 5'-YG-3' PAM, thus expanding the target space of the CRISPR-Cas9 toolbox.


Subject(s)
Bacterial Proteins/chemistry; CRISPR-Cas Systems; Endonucleases/chemistry; Francisella/enzymology; Genetic Engineering/methods; Animals; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Blastocyst/metabolism; CRISPR-Associated Protein 9; Crystallography, X-Ray; Embryo, Mammalian/metabolism; Endonucleases/genetics; Endonucleases/metabolism; Mice; Microinjections/methods; Models, Molecular; RNA, Guide, Kinetoplastida/genetics
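
Note on record 3: the abstract states that relaxing the PAM from 5'-NGG-3' to 5'-YG-3' expands the range of targetable sites. A minimal Python sketch of that idea follows. It is an illustration only, not the authors' method: the function name `count_sites`, the 20-nt protospacer length, and the restriction to a single strand are assumptions made here for the example.

```python
import re

# IUPAC codes needed for the two PAMs quoted in the abstract:
# N = any base, Y = pyrimidine (C or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "Y": "[CT]"}


def count_sites(seq, pam, protospacer_len=20):
    """Count positions where a protospacer is immediately followed by `pam`.

    Hypothetical helper: scans the given strand only and assumes the PAM
    lies directly 3' of a protospacer of `protospacer_len` bases.
    """
    pam_regex = "".join(IUPAC[base] for base in pam)
    # A lookahead is used so that overlapping candidate sites are all counted.
    pattern = re.compile(rf"(?=[ACGT]{{{protospacer_len}}}{pam_regex})")
    return sum(1 for _ in pattern.finditer(seq.upper()))


if __name__ == "__main__":
    example = "A" * 20 + "TGA"  # made-up sequence, not real genomic data
    print("NGG sites:", count_sites(example, "NGG"))
    print("YG sites: ", count_sites(example, "YG"))
```

In the made-up example the sequence ends in TGA, so it offers a YG site but no NGG site, which is the kind of additional target a relaxed PAM makes available.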
4.
Nature ; 630(8018): 984-993, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38926615

ABSTRACT

Genomic rearrangements, encompassing mutational changes in the genome such as insertions, deletions or inversions, are essential for genetic diversity. These rearrangements are typically orchestrated by enzymes that are involved in fundamental DNA repair processes, such as homologous recombination, or in the transposition of foreign genetic material by viruses and mobile genetic elements [1,2]. Here we report that IS110 insertion sequences, a family of minimal and autonomous mobile genetic elements, express a structured non-coding RNA that binds specifically to their encoded recombinase. This bridge RNA contains two internal loops encoding nucleotide stretches that base-pair with the target DNA and the donor DNA, which is the IS110 element itself. We demonstrate that the target-binding and donor-binding loops can be independently reprogrammed to direct sequence-specific recombination between two DNA molecules. This modularity enables the insertion of DNA into genomic target sites, as well as programmable DNA excision and inversion. The IS110 bridge recombination system expands the diversity of nucleic-acid-guided systems beyond CRISPR and RNA interference, offering a unified mechanism for the three fundamental DNA rearrangements (insertion, excision and inversion) that are required for genome design.


Subject(s)
DNA; RNA, Untranslated; Recombination, Genetic; Base Pairing; Base Sequence; DNA/genetics; DNA/metabolism; DNA Transposable Elements/genetics; Mutagenesis, Insertional/genetics; Recombinases/metabolism; Recombinases/genetics; Recombination, Genetic/genetics; RNA, Untranslated/genetics; RNA, Untranslated/metabolism
5.
Nature ; 630(8018): 994-1002, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38926616

ABSTRACT

Insertion sequence (IS) elements are the simplest autonomous transposable elements found in prokaryotic genomes [1]. We recently discovered that IS110 family elements encode a recombinase and a non-coding bridge RNA (bRNA) that confers modular specificity for target DNA and donor DNA through two programmable loops [2]. Here we report the cryo-electron microscopy structures of the IS110 recombinase in complex with its bRNA, target DNA and donor DNA in three different stages of the recombination reaction cycle. The IS110 synaptic complex comprises two recombinase dimers, one of which houses the target-binding loop of the bRNA and binds to target DNA, whereas the other coordinates the bRNA donor-binding loop and donor DNA. We uncovered the formation of a composite RuvC-Tnp active site that spans the two dimers, positioning the catalytic serine residues adjacent to the recombination sites in both target and donor DNA. A comparison of the three structures revealed that (1) the top strands of target and donor DNA are cleaved at the composite active sites to form covalent 5'-phosphoserine intermediates, (2) the cleaved DNA strands are exchanged and religated to create a Holliday junction intermediate, and (3) this intermediate is subsequently resolved by cleavage of the bottom strands. Overall, this study reveals the mechanism by which a bispecific RNA confers target and donor DNA specificity to IS110 recombinases for programmable DNA recombination.


Subject(s)
DNA; RNA, Untranslated; Recombination, Genetic; Catalytic Domain; Cryoelectron Microscopy; DNA/chemistry; DNA/metabolism; DNA/ultrastructure; DNA Transposable Elements/genetics; Models, Molecular; Nucleic Acid Conformation; Protein Multimerization; Recombinases/chemistry; Recombinases/genetics; Recombinases/metabolism; RNA, Untranslated/chemistry; RNA, Untranslated/genetics; RNA, Untranslated/metabolism; RNA, Untranslated/ultrastructure; Substrate Specificity
6.
Cell ; 157(6): 1262-1278, 2014 Jun 05.
Article in English | MEDLINE | ID: mdl-24906146

ABSTRACT

Recent advances in genome engineering technologies based on the CRISPR-associated RNA-guided endonuclease Cas9 are enabling the systematic interrogation of mammalian genome function. Analogous to the search function in modern word processors, Cas9 can be guided to specific locations within complex genomes by a short RNA search string. Using this system, DNA sequences within the endogenous genome and their functional outputs are now easily edited or modulated in virtually any organism of choice. Cas9-mediated genetic perturbation is simple and scalable, empowering researchers to elucidate the functional organization of the genome at the systems level and establish causal linkages between genetic variations and biological phenotypes. In this Review, we describe the development and applications of Cas9 for a variety of research or translational applications while highlighting challenges as well as future directions. Derived from a remarkable microbial defense system, Cas9 is driving innovative applications from basic biology to biotechnology and medicine.


Subject(s)
Bacteria/genetics; CRISPR-Cas Systems; Gene Targeting; Genetic Engineering; Animals; Bacteria/classification; Bacteria/immunology; Bacteria/virology; Eukaryotic Cells/metabolism; Genome; Humans; Streptococcus pyogenes/enzymology; Streptococcus pyogenes/genetics
7.
Cell ; 156(5): 935-49, 2014 Feb 27.
Article in English | MEDLINE | ID: mdl-24529477

ABSTRACT

The CRISPR-associated endonuclease Cas9 can be targeted to specific genomic loci by single guide RNAs (sgRNAs). Here, we report the crystal structure of Streptococcus pyogenes Cas9 in complex with sgRNA and its target DNA at 2.5 Å resolution. The structure revealed a bilobed architecture composed of target recognition and nuclease lobes, accommodating the sgRNA:DNA heteroduplex in a positively charged groove at their interface. Whereas the recognition lobe is essential for binding sgRNA and DNA, the nuclease lobe contains the HNH and RuvC nuclease domains, which are properly positioned for cleavage of the complementary and noncomplementary strands of the target DNA, respectively. The nuclease lobe also contains a carboxyl-terminal domain responsible for the interaction with the protospacer adjacent motif (PAM). This high-resolution structure and accompanying functional analyses have revealed the molecular mechanism of RNA-guided DNA targeting by Cas9, thus paving the way for the rational design of new, versatile genome-editing technologies.


Subject(s)
CRISPR-Associated Proteins/chemistry; Crystallography, X-Ray; Endonucleases/chemistry; RNA, Bacterial/chemistry; Streptococcus pyogenes/chemistry; Amino Acid Sequence; Bacteria/enzymology; CRISPR-Associated Proteins/metabolism; DNA, Bacterial/chemistry; DNA, Bacterial/metabolism; Endonucleases/metabolism; Models, Molecular; Molecular Sequence Data; Protein Structure, Tertiary; RNA, Bacterial/metabolism; Sequence Alignment; Streptococcus pyogenes/enzymology; Streptococcus pyogenes/metabolism; RNA, Small Untranslated
8.
Cell ; 154(6): 1380-9, 2013 Sep 12.
Article in English | MEDLINE | ID: mdl-23992846

ABSTRACT

Targeted genome editing technologies have enabled a broad range of research and medical applications. The Cas9 nuclease from the microbial CRISPR-Cas system is targeted to specific genomic loci by a 20 nt guide sequence, which can tolerate certain mismatches to the DNA target and thereby promote undesired off-target mutagenesis. Here, we describe an approach that combines a Cas9 nickase mutant with paired guide RNAs to introduce targeted double-strand breaks. Because individual nicks in the genome are repaired with high fidelity, simultaneous nicking via appropriately offset guide RNAs is required for double-stranded breaks and extends the number of specifically recognized bases for target cleavage. We demonstrate that paired nicking can reduce off-target activity by 50- to 1,500-fold in cell lines and can facilitate gene knockout in mouse zygotes without sacrificing on-target cleavage efficiency. This versatile strategy enables a wide variety of genome editing applications that require high specificity.


Subject(s)
DNA Breaks, Double-Stranded; Gene Targeting/methods; Genome; Animals; Base Sequence; Mice; Molecular Sequence Data; Streptococcus pyogenes/enzymology; Streptococcus pyogenes/genetics; Zygote/metabolism; RNA, Small Untranslated
9.
PLoS Biol ; 21(6): e3002097, 2023 06.
Article in English | MEDLINE | ID: mdl-37310920

ABSTRACT

Identifying host genes essential for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has the potential to reveal novel drug targets and further our understanding of Coronavirus Disease 2019 (COVID-19). We previously performed a genome-wide CRISPR/Cas9 screen to identify proviral host factors for highly pathogenic human coronaviruses. Few host factors were required by diverse coronaviruses across multiple cell types, but DYRK1A was one such exception. Although its role in coronavirus infection was previously undescribed, DYRK1A encodes Dual Specificity Tyrosine Phosphorylation Regulated Kinase 1A and is known to regulate cell proliferation and neuronal development. Here, we demonstrate that DYRK1A regulates ACE2 and DPP4 transcription independent of its catalytic kinase function to support SARS-CoV, SARS-CoV-2, and Middle East Respiratory Syndrome Coronavirus (MERS-CoV) entry. We show that DYRK1A promotes DNA accessibility at the ACE2 promoter and a putative distal enhancer, facilitating transcription and gene expression. Finally, we validate that the proviral activity of DYRK1A is conserved across species using cells of nonhuman primate and human origin. In summary, we report that DYRK1A is a novel regulator of ACE2 and DPP4 expression that may dictate susceptibility to multiple highly pathogenic human coronaviruses.


Subject(s)
COVID-19; Virus Internalization; Animals; Humans; Angiotensin-Converting Enzyme 2; COVID-19/genetics; COVID-19/metabolism; Dipeptidyl Peptidase 4; Middle East Respiratory Syndrome Coronavirus/genetics; SARS-CoV-2/genetics; Severe acute respiratory syndrome-related coronavirus/genetics; Dyrk Kinases
10.
PLoS Pathog ; 19(7): e1011351, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37410700

ABSTRACT

Identification of host determinants of coronavirus infection informs mechanisms of pathogenesis and may provide novel therapeutic targets. Here, we demonstrate that the histone demethylase KDM6A promotes infection of diverse coronaviruses, including SARS-CoV, SARS-CoV-2, MERS-CoV and mouse hepatitis virus (MHV), in a demethylase activity-independent manner. Mechanistic studies reveal that KDM6A promotes viral entry by regulating expression of multiple coronavirus receptors, including ACE2, DPP4 and Ceacam1. Importantly, the TPR domain of KDM6A is required for recruitment of the histone methyltransferase KMT2D and the histone acetyltransferase p300. Together, this KDM6A-KMT2D-p300 complex localizes to the proximal and distal enhancers of ACE2 and regulates receptor expression. Notably, small molecule inhibition of p300 catalytic activity abrogates ACE2 and DPP4 expression and confers resistance to all major SARS-CoV-2 variants and MERS-CoV in primary human airway and intestinal epithelial cells. These data highlight the role of KDM6A-KMT2D-p300 complex activities in conferring susceptibility to diverse coronaviruses and reveal a potential pan-coronavirus therapeutic target to combat current and emerging coronaviruses. One Sentence Summary: The KDM6A/KMT2D/EP300 axis promotes expression of multiple viral receptors and represents a potential drug target for diverse coronaviruses.


Subject(s)
COVID-19; Middle East Respiratory Syndrome Coronavirus; Animals; Humans; Mice; Angiotensin-Converting Enzyme 2/metabolism; Dipeptidyl Peptidase 4/metabolism; Histone Demethylases/metabolism; Middle East Respiratory Syndrome Coronavirus/metabolism; Receptors, Virus/genetics; Receptors, Virus/metabolism; SARS-CoV-2/metabolism
11.
Mol Cell ; 63(3): 355-70, 2016 08 04.
Article in English | MEDLINE | ID: mdl-27494557

ABSTRACT

Advances in the development of delivery, repair, and specificity strategies for the CRISPR-Cas9 genome engineering toolbox are helping researchers understand gene function with unprecedented precision and sensitivity. CRISPR-Cas9 also holds enormous therapeutic potential for the treatment of genetic disorders by directly correcting disease-causing mutations. Although the Cas9 protein has been shown to bind and cleave DNA at off-target sites, the field of Cas9 specificity is rapidly progressing, with marked improvements in guide RNA selection, protein and guide engineering, novel enzymes, and off-target detection methods. We review important challenges and breakthroughs in the field as a comprehensive practical guide to interested users of genome editing technologies, highlighting key tools and strategies for optimizing specificity. The genome editing community should now strive to standardize such methods for measuring and reporting off-target activity, while keeping in mind that the goal for specificity should be continued improvement and vigilance.


Subject(s)
CRISPR-Associated Proteins/metabolism; CRISPR-Cas Systems; DNA/metabolism; Endonucleases/metabolism; Gene Editing/methods; Gene Targeting/methods; Genomics/methods; Animals; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; CRISPR-Associated Proteins/genetics; Computational Biology; DNA/genetics; Endonucleases/genetics; Humans; Kinetics; Mutation; Protein Engineering; RNA, Guide, Kinetoplastida/genetics; RNA, Guide, Kinetoplastida/metabolism; Substrate Specificity
12.
Nat Chem Biol ; 17(9): 982-988, 2021 09.
Article in English | MEDLINE | ID: mdl-34354262

ABSTRACT

Direct, amplification-free detection of RNA has the potential to transform molecular diagnostics by enabling simple on-site analysis of human or environmental samples. CRISPR-Cas nucleases offer programmable RNA-guided RNA recognition that triggers cleavage and release of a fluorescent reporter molecule, but long reaction times hamper their detection sensitivity and speed. Here, we show that unrelated CRISPR nucleases can be deployed in tandem to provide both direct RNA sensing and rapid signal generation, thus enabling robust detection of ~30 molecules per µl of RNA in 20 min. Combining RNA-guided Cas13 and Csm6 with a chemically stabilized activator creates a one-step assay that can detect severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA extracted from respiratory swab samples with quantitative reverse transcriptase PCR (qRT-PCR)-derived cycle threshold (Ct) values up to 33, using a compact detector. This Fast Integrated Nuclease Detection In Tandem (FIND-IT) approach enables sensitive, direct RNA detection in a format that is amenable to point-of-care infection diagnosis as well as to a wide range of other diagnostic or research applications.


Subject(s)
COVID-19/genetics; CRISPR-Cas Systems/genetics; RNA, Viral/genetics; SARS-CoV-2/genetics; Humans; Reverse Transcriptase Polymerase Chain Reaction
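
Note on record 12: the abstract reports detection of about 30 molecules per µl of RNA. For readers who think in molar units, the conversion can be sketched in Python as below. The function name `copies_per_ul_to_molar` is an assumption introduced here for illustration; only the figure of 30 molecules per µl comes from the abstract.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def copies_per_ul_to_molar(copies_per_ul):
    """Convert a concentration in molecules per microlitre to mol/L.

    Hypothetical helper for illustration: 1 L = 1e6 µl, so the count is
    scaled to molecules per litre and then divided by Avogadro's number.
    """
    copies_per_litre = copies_per_ul * 1e6
    return copies_per_litre / AVOGADRO


if __name__ == "__main__":
    molar = copies_per_ul_to_molar(30)  # 30 molecules per µl, as in the abstract
    print(f"{molar:.2e} mol/L, i.e. about {molar * 1e18:.0f} aM")
```

Under this conversion, 30 molecules per µl corresponds to roughly 5 x 10^-17 mol/L, on the order of 50 attomolar.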
13.
Nature ; 517(7536): 583-8, 2015 Jan 29.
Article in English | MEDLINE | ID: mdl-25494202

ABSTRACT

Systematic interrogation of gene function requires the ability to perturb gene expression in a robust and generalizable manner. Here we describe structure-guided engineering of a CRISPR-Cas9 complex to mediate efficient transcriptional activation at endogenous genomic loci. We used these engineered Cas9 activation complexes to investigate single-guide RNA (sgRNA) targeting rules for effective transcriptional activation, to demonstrate multiplexed activation of ten genes simultaneously, and to upregulate long intergenic non-coding RNA (lincRNA) transcripts. We also synthesized a library consisting of 70,290 guides targeting all human RefSeq coding isoforms to screen for genes that, upon activation, confer resistance to a BRAF inhibitor. The top hits included genes previously shown to be able to confer resistance, and novel candidates were validated using individual sgRNA and complementary DNA overexpression. A gene expression signature based on the top screening hits correlated with markers of BRAF inhibitor resistance in cell lines and patient-derived samples. These results collectively demonstrate the potential of Cas9-based activators as a powerful genetic perturbation technology.


Subject(s)
CRISPR-Cas Systems/genetics; Genetic Engineering/methods; Genome, Human/genetics; Melanoma/genetics; Transcriptional Activation/genetics; CRISPR-Associated Proteins/genetics; CRISPR-Associated Proteins/metabolism; Cell Line, Tumor; Clustered Regularly Interspaced Short Palindromic Repeats/genetics; DNA, Complementary/biosynthesis; DNA, Complementary/genetics; Drug Resistance, Neoplasm/drug effects; Drug Resistance, Neoplasm/genetics; Gene Expression Regulation, Neoplastic/genetics; Gene Library; Genetic Loci/genetics; Genetic Testing; Humans; Indoles/pharmacology; Melanoma/drug therapy; Proto-Oncogene Proteins B-raf/antagonists & inhibitors; RNA, Untranslated/biosynthesis; RNA, Untranslated/genetics; RNA, Untranslated/metabolism; Reproducibility of Results; Sulfonamides/pharmacology; Up-Regulation/genetics
14.
Nature ; 500(7463): 472-476, 2013 Aug 22.
Article in English | MEDLINE | ID: mdl-23877069

ABSTRACT

The dynamic nature of gene expression enables cellular programming, homeostasis and environmental adaptation in living systems. Dissection of causal gene functions in cellular and organismal processes therefore necessitates approaches that enable spatially and temporally precise modulation of gene expression. Recently, a variety of microbial and plant-derived light-sensitive proteins have been engineered as optogenetic actuators, enabling high-precision spatiotemporal control of many cellular functions. However, versatile and robust technologies that enable optical modulation of transcription in the mammalian endogenous genome remain elusive. Here we describe the development of light-inducible transcriptional effectors (LITEs), an optogenetic two-hybrid system integrating the customizable TALE DNA-binding domain with the light-sensitive cryptochrome 2 protein and its interacting partner CIB1 from Arabidopsis thaliana. LITEs do not require additional exogenous chemical cofactors, are easily customized to target many endogenous genomic loci, and can be activated within minutes with reversibility. LITEs can be packaged into viral vectors and genetically targeted to probe specific cell populations. We have applied this system in primary mouse neurons, as well as in the brain of freely behaving mice in vivo to mediate reversible modulation of mammalian endogenous gene expression as well as targeted epigenetic chromatin modifications. The LITE system establishes a novel mode of optogenetic control of endogenous cellular processes and enables direct testing of the causal roles of genetic and epigenetic regulation in normal biological processes and disease states.


Subject(s)
Epigenesis, Genetic/genetics; Epigenesis, Genetic/radiation effects; Gene Expression Regulation/radiation effects; Light; Optogenetics/methods; Transcription, Genetic/radiation effects; Animals; Arabidopsis Proteins/metabolism; Basic Helix-Loop-Helix Transcription Factors/metabolism; Cells, Cultured; Chromatin/genetics; Chromatin/radiation effects; Cryptochromes/metabolism; Gene Expression Regulation/genetics; Genetic Vectors/genetics; Male; Mice; Mice, Inbred C57BL; Neurons/metabolism; Neurons/radiation effects; Time Factors; Transcription, Genetic/genetics; Two-Hybrid System Techniques; Wakefulness
16.
bioRxiv ; 2024 Jun 23.
Article in English | MEDLINE | ID: mdl-38948874

ABSTRACT

Gene therapies have the potential to treat disease by delivering therapeutic genetic cargo to disease-associated cells. One limitation to their widespread use is the lack of short regulatory sequences, or promoters, that differentially induce the expression of delivered genetic cargo in target cells, minimizing side effects in other cell types. Such cell-type-specific promoters are difficult to discover using existing methods, requiring either manual curation or access to large datasets of promoter-driven expression from both targeted and untargeted cells. Model-based optimization (MBO) has emerged as an effective method to design biological sequences in an automated manner, and has recently been used in promoter design methods. However, these methods have only been tested using large training datasets that are expensive to collect, and focus on designing promoters for markedly different cell types, overlooking the complexities associated with designing promoters for closely related cell types that share similar regulatory features. Therefore, we introduce a comprehensive framework for utilizing MBO to design promoters in a data-efficient manner, with an emphasis on discovering promoters for similar cell types. We use conservative objective models (COMs) for MBO and highlight practical considerations such as best practices for improving sequence diversity, getting estimates of model uncertainty, and choosing the optimal set of sequences for experimental validation. Using three relatively similar blood cancer cell lines (Jurkat, K562, and THP1), we show that our approach discovers many novel cell-type-specific promoters after experimentally validating the designed sequences. For K562 cells, in particular, we discover a promoter that has 75.85% higher cell-type-specificity than the best promoter from the initial dataset used to train our models.

17.
bioRxiv ; 2024 Jan 26.
Article in English | MEDLINE | ID: mdl-38328150

ABSTRACT

Genomic rearrangements, encompassing mutational changes in the genome such as insertions, deletions, or inversions, are essential for genetic diversity. These rearrangements are typically orchestrated by enzymes involved in fundamental DNA repair processes such as homologous recombination or in the transposition of foreign genetic material by viruses and mobile genetic elements (MGEs). We report that IS110 insertion sequences, a family of minimal and autonomous MGEs, express a structured non-coding RNA that binds specifically to their encoded recombinase. This bridge RNA contains two internal loops encoding nucleotide stretches that base-pair with the target DNA and donor DNA, which is the IS110 element itself. We demonstrate that the target-binding and donor-binding loops can be independently reprogrammed to direct sequence-specific recombination between two DNA molecules. This modularity enables DNA insertion into genomic target sites as well as programmable DNA excision and inversion. The IS110 bridge system expands the diversity of nucleic acid-guided systems beyond CRISPR and RNA interference, offering a unified mechanism for the three fundamental DNA rearrangements required for genome design.

18.
bioRxiv ; 2023 Feb 27.
Article in English | MEDLINE | ID: mdl-36909524

ABSTRACT

Advances in gene delivery technologies are enabling rapid progress in molecular medicine, but require precise expression of genetic cargo in desired cell types, which is predominantly achieved via a regulatory DNA sequence called a promoter; however, only a handful of cell type-specific promoters are known. Efficiently designing compact promoter sequences with a high density of regulatory information by leveraging machine learning models would therefore be broadly impactful for fundamental research and direct therapeutic applications. However, models of expression from such compact promoter sequences are lacking, despite the recent success of deep learning in modelling expression from endogenous regulatory sequences. Despite the lack of large datasets measuring promoter-driven expression in many cell types, data from a few well-studied cell types or from endogenous gene expression may provide relevant information for transfer learning, which has not yet been explored in this setting. Here, we evaluate a variety of pretraining tasks and transfer strategies for modelling cell type-specific expression from compact promoters and demonstrate the effectiveness of pretraining on existing promoter-driven expression datasets from other cell types. Our approach is broadly applicable for modelling promoter-driven expression in any data-limited cell type of interest, and will enable the use of model-based optimization techniques for promoter design for gene delivery applications. Our code and data are available at https://github.com/anikethjr/promoter_models.

19.
Nat Biotechnol ; 41(4): 488-499, 2023 04.
Article in English | MEDLINE | ID: mdl-36217031

ABSTRACT

Large serine recombinases (LSRs) are DNA integrases that facilitate the site-specific integration of mobile genetic elements into bacterial genomes. Only a few LSRs, such as Bxb1 and PhiC31, have been characterized to date, with limited efficiency as tools for DNA integration in human cells. In this study, we developed a computational approach to identify thousands of LSRs and their DNA attachment sites, expanding known LSR diversity by >100-fold and enabling the prediction of their insertion site specificities. We tested their recombination activity in human cells, classifying them as landing pad, genome-targeting or multi-targeting LSRs. Overall, we achieved up to seven-fold higher recombination than Bxb1 and genome integration efficiencies of 40-75% with cargo sizes over 7 kb. We also demonstrate virus-free, direct integration of plasmid or amplicon libraries for improved functional genomics applications. This systematic discovery of recombinases directly from microbial sequencing data provides a resource of over 60 LSRs experimentally characterized in human cells for large-payload genome insertion without exposed DNA double-stranded breaks.


Subject(s)
Genetic Engineering; Integrases; Humans; Genome, Human; Transfection; Genomic Library
20.
Cell Syst ; 14(12): 1087-1102.e13, 2023 12 20.
Article in English | MEDLINE | ID: mdl-38091991

ABSTRACT

Effective and precise mammalian transcriptome engineering technologies are needed to accelerate biological discovery and RNA therapeutics. Despite the promise of programmable CRISPR-Cas13 ribonucleases, their utility has been hampered by an incomplete understanding of guide RNA design rules and cellular toxicity resulting from off-target or collateral RNA cleavage. Here, we quantified the performance of over 127,000 RfxCas13d (CasRx) guide RNAs and systematically evaluated seven machine learning models to build a guide efficiency prediction algorithm orthogonally validated across multiple human cell types. Deep learning model interpretation revealed preferred sequence motifs and secondary features for highly efficient guides. We next identified and screened 46 novel Cas13d orthologs, finding that DjCas13d achieves low cellular toxicity and high specificity, even when targeting abundant transcripts in sensitive cell types, including stem cells and neurons. Our Cas13d guide efficiency model was successfully generalized to DjCas13d, illustrating the power of combining machine learning with ortholog discovery to advance RNA targeting in human cells.


Subject(s)
CRISPR-Cas Systems; Deep Learning; RNA; Humans; CRISPR-Cas Systems/genetics; RNA/genetics; RNA, Guide, CRISPR-Cas Systems; Transcriptome