1.
Cell; 173(3): 665-676.e14, 2018 Apr 19.
Article in English | MEDLINE | ID: mdl-29551272

ABSTRACT

Class 2 CRISPR-Cas systems endow microbes with diverse mechanisms for adaptive immunity. Here, we analyzed prokaryotic genome and metagenome sequences to identify an uncharacterized family of RNA-guided, RNA-targeting CRISPR systems that we classify as type VI-D. Biochemical characterization and protein engineering of seven distinct orthologs generated a ribonuclease effector derived from Ruminococcus flavefaciens XPD3002 (CasRx) with robust activity in human cells. CasRx-mediated knockdown exhibits high efficiency and specificity relative to RNA interference across diverse endogenous transcripts. As one of the most compact single-effector Cas enzymes, CasRx can also be flexibly packaged into adeno-associated virus. We target virally encoded, catalytically inactive CasRx to cis elements of pre-mRNA to manipulate alternative splicing, alleviating dysregulated tau isoform ratios in a neuronal model of frontotemporal dementia. Our results present CasRx as a programmable RNA-binding module for efficient targeting of cellular RNA, enabling a general platform for transcriptome engineering and future therapeutic development.


Subjects
CRISPR-Cas Systems, Computational Biology/methods, Genetic Engineering/methods, Protein Engineering/methods, RNA/analysis, Alternative Splicing, Animals, Bacterial Proteins/metabolism, Cell Differentiation, Escherichia coli/metabolism, Gene Expression Profiling, HEK293 Cells, Humans, Induced Pluripotent Stem Cells/cytology, Lentivirus/genetics, Mice, RNA Interference, Kinetoplastida Guide RNA/genetics, Ruminococcus, RNA Sequence Analysis, Transcriptome
2.
Cell; 175(1): 212-223.e17, 2018 Sep 20.
Article in English | MEDLINE | ID: mdl-30241607

ABSTRACT

CRISPR-Cas endonucleases directed against foreign nucleic acids mediate prokaryotic adaptive immunity and have been tailored for broad genetic engineering applications. Type VI-D CRISPR systems contain the smallest known family of single effector Cas enzymes, and their signature Cas13d ribonuclease employs guide RNAs to cleave matching target RNAs. To understand the molecular basis for Cas13d function and explain its compact molecular architecture, we resolved cryoelectron microscopy structures of Cas13d-guide RNA binary complex and Cas13d-guide-target RNA ternary complex to 3.4 and 3.3 Å resolution, respectively. Furthermore, a 6.5 Å reconstruction of apo Cas13d combined with hydrogen-deuterium exchange revealed conformational dynamics that have implications for RNA scanning. These structures, together with biochemical and cellular characterization, provide insights into its RNA-guided, RNA-targeting mechanism and delineate a blueprint for the rational design of improved transcriptome engineering technologies.


Subjects
CRISPR-Cas Systems/genetics, Kinetoplastida Guide RNA/physiology, Ribonucleases/physiology, CRISPR-Cas Systems/physiology, Clustered Regularly Interspaced Short Palindromic Repeats/genetics, Cryoelectron Microscopy/methods, Endonucleases/metabolism, HEK293 Cells, Humans, Molecular Conformation, RNA/genetics, Kinetoplastida Guide RNA/genetics, Kinetoplastida Guide RNA/ultrastructure, Ribonucleases/metabolism, Ribonucleases/ultrastructure
3.
Cell; 164(5): 950-61, 2016 Feb 25.
Article in English | MEDLINE | ID: mdl-26875867

ABSTRACT

The RNA-guided endonuclease Cas9 cleaves double-stranded DNA targets complementary to the guide RNA and has been applied to programmable genome editing. Cas9-mediated cleavage requires a protospacer adjacent motif (PAM) juxtaposed with the DNA target sequence, thus constricting the range of targetable sites. Here, we report the 1.7 Å resolution crystal structures of Cas9 from Francisella novicida (FnCas9), one of the largest Cas9 orthologs, in complex with a guide RNA and its PAM-containing DNA targets. A structural comparison of FnCas9 with other Cas9 orthologs revealed striking conserved and divergent features among distantly related CRISPR-Cas9 systems. We found that FnCas9 recognizes the 5'-NGG-3' PAM, and used the structural information to create a variant that can recognize the more relaxed 5'-YG-3' PAM. Furthermore, we demonstrated that the FnCas9-ribonucleoprotein complex can be microinjected into mouse zygotes to edit endogenous sites with the 5'-YG-3' PAM, thus expanding the target space of the CRISPR-Cas9 toolbox.


Subjects
Bacterial Proteins/chemistry, CRISPR-Cas Systems, Endonucleases/chemistry, Francisella/enzymology, Genetic Engineering/methods, Animals, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, Blastocyst/metabolism, CRISPR-Associated Protein 9, X-Ray Crystallography, Mammalian Embryo/metabolism, Endonucleases/genetics, Endonucleases/metabolism, Mice, Microinjections/methods, Molecular Models, Kinetoplastida Guide RNA/genetics
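
The PAM constraint described in the abstract of entry 3 can be illustrated with a short sketch. It is not taken from the paper: the function name `pam_sites`, the 20-nt protospacer length, and the toy sequence are assumptions made for illustration, and only one strand is scanned. It shows how relaxing the PAM from 5'-NGG-3' to 5'-YG-3' can change the set of candidate target sites.

```python
import re

# IUPAC codes used by the two PAM patterns: N = any base, Y = pyrimidine (C or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "Y": "[CT]"}


def pam_sites(seq, pam, protospacer_len=20):
    """Return 0-based start positions of protospacers on the given strand
    that are immediately followed (on the 3' side) by the PAM pattern."""
    seq = seq.upper()
    pam_re = re.compile("".join(IUPAC[base] for base in pam))
    hits = []
    for start in range(len(seq) - protospacer_len - len(pam) + 1):
        pam_start = start + protospacer_len
        # fullmatch restricted to the window where the PAM must sit
        if pam_re.fullmatch(seq, pam_start, pam_start + len(pam)):
            hits.append(start)
    return hits


# Hypothetical toy sequence: one site followed by TGG, one followed by CGA.
toy = "A" * 20 + "TGG" + "A" * 20 + "CGA"

print(pam_sites(toy, "NGG"))  # [0]
print(pam_sites(toy, "YG"))   # [0, 23]
```

On this toy sequence the NGG scan finds one candidate site and the YG scan finds two. The first site is reported by both scans only because its PAM begins with T, a pyrimidine; a site followed by AGG or GGG would match NGG but not YG.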
4.
Nature; 630(8018): 984-993, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38926615

ABSTRACT

Genomic rearrangements, encompassing mutational changes in the genome such as insertions, deletions or inversions, are essential for genetic diversity. These rearrangements are typically orchestrated by enzymes that are involved in fundamental DNA repair processes, such as homologous recombination, or in the transposition of foreign genetic material by viruses and mobile genetic elements [1,2]. Here we report that IS110 insertion sequences, a family of minimal and autonomous mobile genetic elements, express a structured non-coding RNA that binds specifically to their encoded recombinase. This bridge RNA contains two internal loops encoding nucleotide stretches that base-pair with the target DNA and the donor DNA, which is the IS110 element itself. We demonstrate that the target-binding and donor-binding loops can be independently reprogrammed to direct sequence-specific recombination between two DNA molecules. This modularity enables the insertion of DNA into genomic target sites, as well as programmable DNA excision and inversion. The IS110 bridge recombination system expands the diversity of nucleic-acid-guided systems beyond CRISPR and RNA interference, offering a unified mechanism for the three fundamental DNA rearrangements (insertion, excision and inversion) that are required for genome design.


Subjects
DNA, Untranslated RNA, Genetic Recombination, Base Pairing, Base Sequence, DNA/genetics, DNA/metabolism, DNA Transposable Elements/genetics, Insertional Mutagenesis/genetics, Recombinases/metabolism, Recombinases/genetics, Genetic Recombination/genetics, Untranslated RNA/genetics, Untranslated RNA/metabolism
5.
Nature; 630(8018): 994-1002, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38926616

ABSTRACT

Insertion sequence (IS) elements are the simplest autonomous transposable elements found in prokaryotic genomes [1]. We recently discovered that IS110 family elements encode a recombinase and a non-coding bridge RNA (bRNA) that confers modular specificity for target DNA and donor DNA through two programmable loops [2]. Here we report the cryo-electron microscopy structures of the IS110 recombinase in complex with its bRNA, target DNA and donor DNA in three different stages of the recombination reaction cycle. The IS110 synaptic complex comprises two recombinase dimers, one of which houses the target-binding loop of the bRNA and binds to target DNA, whereas the other coordinates the bRNA donor-binding loop and donor DNA. We uncovered the formation of a composite RuvC-Tnp active site that spans the two dimers, positioning the catalytic serine residues adjacent to the recombination sites in both target and donor DNA. A comparison of the three structures revealed that (1) the top strands of target and donor DNA are cleaved at the composite active sites to form covalent 5'-phosphoserine intermediates, (2) the cleaved DNA strands are exchanged and religated to create a Holliday junction intermediate, and (3) this intermediate is subsequently resolved by cleavage of the bottom strands. Overall, this study reveals the mechanism by which a bispecific RNA confers target and donor DNA specificity to IS110 recombinases for programmable DNA recombination.


Subjects
DNA, Untranslated RNA, Genetic Recombination, Catalytic Domain, Cryoelectron Microscopy, DNA/chemistry, DNA/metabolism, DNA/ultrastructure, DNA Transposable Elements/genetics, Molecular Models, Nucleic Acid Conformation, Protein Multimerization, Recombinases/chemistry, Recombinases/genetics, Recombinases/metabolism, Untranslated RNA/chemistry, Untranslated RNA/genetics, Untranslated RNA/metabolism, Untranslated RNA/ultrastructure, Substrate Specificity
6.
Cell; 157(6): 1262-1278, 2014 Jun 05.
Article in English | MEDLINE | ID: mdl-24906146

ABSTRACT

Recent advances in genome engineering technologies based on the CRISPR-associated RNA-guided endonuclease Cas9 are enabling the systematic interrogation of mammalian genome function. Analogous to the search function in modern word processors, Cas9 can be guided to specific locations within complex genomes by a short RNA search string. Using this system, DNA sequences within the endogenous genome and their functional outputs are now easily edited or modulated in virtually any organism of choice. Cas9-mediated genetic perturbation is simple and scalable, empowering researchers to elucidate the functional organization of the genome at the systems level and establish causal linkages between genetic variations and biological phenotypes. In this Review, we describe the development and applications of Cas9 for a variety of research or translational applications while highlighting challenges as well as future directions. Derived from a remarkable microbial defense system, Cas9 is driving innovative applications from basic biology to biotechnology and medicine.


Subjects
Bacteria/genetics, CRISPR-Cas Systems, Gene Targeting, Genetic Engineering, Animals, Bacteria/classification, Bacteria/immunology, Bacteria/virology, Eukaryotic Cells/metabolism, Genome, Humans, Streptococcus pyogenes/enzymology, Streptococcus pyogenes/genetics
7.
Cell; 156(5): 935-49, 2014 Feb 27.
Article in English | MEDLINE | ID: mdl-24529477

ABSTRACT

The CRISPR-associated endonuclease Cas9 can be targeted to specific genomic loci by single guide RNAs (sgRNAs). Here, we report the crystal structure of Streptococcus pyogenes Cas9 in complex with sgRNA and its target DNA at 2.5 Å resolution. The structure revealed a bilobed architecture composed of target recognition and nuclease lobes, accommodating the sgRNA:DNA heteroduplex in a positively charged groove at their interface. Whereas the recognition lobe is essential for binding sgRNA and DNA, the nuclease lobe contains the HNH and RuvC nuclease domains, which are properly positioned for cleavage of the complementary and noncomplementary strands of the target DNA, respectively. The nuclease lobe also contains a carboxyl-terminal domain responsible for the interaction with the protospacer adjacent motif (PAM). This high-resolution structure and accompanying functional analyses have revealed the molecular mechanism of RNA-guided DNA targeting by Cas9, thus paving the way for the rational design of new, versatile genome-editing technologies.


Subjects
CRISPR-Associated Proteins/chemistry, X-Ray Crystallography, Endonucleases/chemistry, Bacterial RNA/chemistry, Streptococcus pyogenes/chemistry, Amino Acid Sequence, Bacteria/enzymology, CRISPR-Associated Proteins/metabolism, Bacterial DNA/chemistry, Bacterial DNA/metabolism, Endonucleases/metabolism, Molecular Models, Molecular Sequence Data, Tertiary Protein Structure, Bacterial RNA/metabolism, Sequence Alignment, Streptococcus pyogenes/enzymology, Streptococcus pyogenes/metabolism, Small Untranslated RNA
8.
Cell; 154(6): 1380-9, 2013 Sep 12.
Article in English | MEDLINE | ID: mdl-23992846

ABSTRACT

Targeted genome editing technologies have enabled a broad range of research and medical applications. The Cas9 nuclease from the microbial CRISPR-Cas system is targeted to specific genomic loci by a 20 nt guide sequence, which can tolerate certain mismatches to the DNA target and thereby promote undesired off-target mutagenesis. Here, we describe an approach that combines a Cas9 nickase mutant with paired guide RNAs to introduce targeted double-strand breaks. Because individual nicks in the genome are repaired with high fidelity, simultaneous nicking via appropriately offset guide RNAs is required for double-stranded breaks and extends the number of specifically recognized bases for target cleavage. We demonstrate that paired nicking can reduce off-target activity by 50- to 1,500-fold in cell lines and facilitate gene knockout in mouse zygotes without sacrificing on-target cleavage efficiency. This versatile strategy enables a wide variety of genome editing applications that require high specificity.


Subjects
Double-Stranded DNA Breaks, Gene Targeting/methods, Genome, Animals, Base Sequence, Mice, Molecular Sequence Data, Streptococcus pyogenes/enzymology, Streptococcus pyogenes/genetics, Zygote/metabolism, Small Untranslated RNA
9.
PLoS Biol; 21(6): e3002097, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37310920

ABSTRACT

Identifying host genes essential for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has the potential to reveal novel drug targets and further our understanding of Coronavirus Disease 2019 (COVID-19). We previously performed a genome-wide CRISPR/Cas9 screen to identify proviral host factors for highly pathogenic human coronaviruses. Few host factors were required by diverse coronaviruses across multiple cell types, but DYRK1A was one such exception. Although its role in coronavirus infection was previously undescribed, DYRK1A encodes Dual Specificity Tyrosine Phosphorylation Regulated Kinase 1A and is known to regulate cell proliferation and neuronal development. Here, we demonstrate that DYRK1A regulates ACE2 and DPP4 transcription independent of its catalytic kinase function to support SARS-CoV, SARS-CoV-2, and Middle East Respiratory Syndrome Coronavirus (MERS-CoV) entry. We show that DYRK1A promotes DNA accessibility at the ACE2 promoter and a putative distal enhancer, facilitating transcription and gene expression. Finally, we validate that the proviral activity of DYRK1A is conserved across species using cells of nonhuman primate and human origin. In summary, we report that DYRK1A is a novel regulator of ACE2 and DPP4 expression that may dictate susceptibility to multiple highly pathogenic human coronaviruses.


Subjects
COVID-19, Virus Internalization, Animals, Humans, Angiotensin-Converting Enzyme 2, COVID-19/genetics, COVID-19/metabolism, Dipeptidyl Peptidase 4, Middle East Respiratory Syndrome Coronavirus/genetics, SARS-CoV-2/genetics, Severe Acute Respiratory Syndrome-Related Coronavirus/genetics, Dyrk Kinases
10.
PLoS Pathog; 19(7): e1011351, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37410700

ABSTRACT

Identification of host determinants of coronavirus infection informs mechanisms of pathogenesis and may provide novel therapeutic targets. Here, we demonstrate that the histone demethylase KDM6A promotes infection of diverse coronaviruses, including SARS-CoV, SARS-CoV-2, MERS-CoV and mouse hepatitis virus (MHV) in a demethylase activity-independent manner. Mechanistic studies reveal that KDM6A promotes viral entry by regulating expression of multiple coronavirus receptors, including ACE2, DPP4 and Ceacam1. Importantly, the TPR domain of KDM6A is required for recruitment of the histone methyltransferase KMT2D and histone acetyltransferase p300. Together this KDM6A-KMT2D-p300 complex localizes to the proximal and distal enhancers of ACE2 and regulates receptor expression. Notably, small molecule inhibition of p300 catalytic activity abrogates ACE2 and DPP4 expression and confers resistance to all major SARS-CoV-2 variants and MERS-CoV in primary human airway and intestinal epithelial cells. These data highlight the role of KDM6A-KMT2D-p300 complex activities in conferring susceptibility to diverse coronaviruses and reveal a potential pan-coronavirus therapeutic target to combat current and emerging coronaviruses. One Sentence Summary: The KDM6A/KMT2D/EP300 axis promotes expression of multiple viral receptors and represents a potential drug target for diverse coronaviruses.


Subjects
COVID-19, Middle East Respiratory Syndrome Coronavirus, Animals, Humans, Mice, Angiotensin-Converting Enzyme 2/metabolism, Dipeptidyl Peptidase 4/metabolism, Histone Demethylases/metabolism, Middle East Respiratory Syndrome Coronavirus/metabolism, Virus Receptors/genetics, Virus Receptors/metabolism, SARS-CoV-2/metabolism
11.
Mol Cell; 63(3): 355-70, 2016 Aug 04.
Article in English | MEDLINE | ID: mdl-27494557

ABSTRACT

Advances in the development of delivery, repair, and specificity strategies for the CRISPR-Cas9 genome engineering toolbox are helping researchers understand gene function with unprecedented precision and sensitivity. CRISPR-Cas9 also holds enormous therapeutic potential for the treatment of genetic disorders by directly correcting disease-causing mutations. Although the Cas9 protein has been shown to bind and cleave DNA at off-target sites, the field of Cas9 specificity is rapidly progressing, with marked improvements in guide RNA selection, protein and guide engineering, novel enzymes, and off-target detection methods. We review important challenges and breakthroughs in the field as a comprehensive practical guide to interested users of genome editing technologies, highlighting key tools and strategies for optimizing specificity. The genome editing community should now strive to standardize such methods for measuring and reporting off-target activity, while keeping in mind that the goal for specificity should be continued improvement and vigilance.


Subjects
CRISPR-Associated Proteins/metabolism, CRISPR-Cas Systems, DNA/metabolism, Endonucleases/metabolism, Gene Editing/methods, Gene Targeting/methods, Genomics/methods, Animals, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, CRISPR-Associated Proteins/genetics, Computational Biology, DNA/genetics, Endonucleases/genetics, Humans, Kinetics, Mutation, Protein Engineering, Kinetoplastida Guide RNA/genetics, Kinetoplastida Guide RNA/metabolism, Substrate Specificity
12.
Nat Chem Biol; 17(9): 982-988, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34354262

ABSTRACT

Direct, amplification-free detection of RNA has the potential to transform molecular diagnostics by enabling simple on-site analysis of human or environmental samples. CRISPR-Cas nucleases offer programmable RNA-guided RNA recognition that triggers cleavage and release of a fluorescent reporter molecule, but long reaction times hamper their detection sensitivity and speed. Here, we show that unrelated CRISPR nucleases can be deployed in tandem to provide both direct RNA sensing and rapid signal generation, thus enabling robust detection of ~30 molecules per µl of RNA in 20 min. Combining RNA-guided Cas13 and Csm6 with a chemically stabilized activator creates a one-step assay that can detect severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA extracted from respiratory swab samples with quantitative reverse transcriptase PCR (qRT-PCR)-derived cycle threshold (Ct) values up to 33, using a compact detector. This Fast Integrated Nuclease Detection In Tandem (FIND-IT) approach enables sensitive, direct RNA detection in a format that is amenable to point-of-care infection diagnosis as well as to a wide range of other diagnostic or research applications.


Subjects
COVID-19/genetics, CRISPR-Cas Systems/genetics, Viral RNA/genetics, SARS-CoV-2/genetics, Humans, Reverse Transcriptase Polymerase Chain Reaction
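
To put the detection limit quoted in the abstract of entry 12 (~30 molecules per µl) on a molar scale, the following back-of-the-envelope conversion may help. It is an illustrative calculation only, not a figure reported in the abstract, and the function name `molecules_per_ul_to_molar` is a hypothetical helper.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def molecules_per_ul_to_molar(molecules_per_ul):
    """Convert a count of molecules per microlitre to mol/L."""
    molecules_per_litre = molecules_per_ul * 1e6  # 1 L = 1e6 µl
    return molecules_per_litre / AVOGADRO


conc = molecules_per_ul_to_molar(30)
print(f"{conc:.2e} M")  # 4.98e-17 M
```

Under these assumptions, ~30 molecules per µl corresponds to roughly 5 x 10^-17 M, or about 50 attomolar.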
13.
Nature; 517(7536): 583-8, 2015 Jan 29.
Article in English | MEDLINE | ID: mdl-25494202

ABSTRACT

Systematic interrogation of gene function requires the ability to perturb gene expression in a robust and generalizable manner. Here we describe structure-guided engineering of a CRISPR-Cas9 complex to mediate efficient transcriptional activation at endogenous genomic loci. We used these engineered Cas9 activation complexes to investigate single-guide RNA (sgRNA) targeting rules for effective transcriptional activation, to demonstrate multiplexed activation of ten genes simultaneously, and to upregulate long intergenic non-coding RNA (lincRNA) transcripts. We also synthesized a library consisting of 70,290 guides targeting all human RefSeq coding isoforms to screen for genes that, upon activation, confer resistance to a BRAF inhibitor. The top hits included genes previously shown to be able to confer resistance, and novel candidates were validated using individual sgRNA and complementary DNA overexpression. A gene expression signature based on the top screening hits correlated with markers of BRAF inhibitor resistance in cell lines and patient-derived samples. These results collectively demonstrate the potential of Cas9-based activators as a powerful genetic perturbation technology.


Subjects
CRISPR-Cas Systems/genetics, Genetic Engineering/methods, Human Genome/genetics, Melanoma/genetics, Transcriptional Activation/genetics, CRISPR-Associated Proteins/genetics, CRISPR-Associated Proteins/metabolism, Tumor Cell Line, Clustered Regularly Interspaced Short Palindromic Repeats/genetics, Complementary DNA/biosynthesis, Complementary DNA/genetics, Antineoplastic Drug Resistance/drug effects, Antineoplastic Drug Resistance/genetics, Neoplastic Gene Expression Regulation/genetics, Gene Library, Genetic Loci/genetics, Genetic Testing, Humans, Indoles/pharmacology, Melanoma/drug therapy, Proto-Oncogene Proteins B-raf/antagonists & inhibitors, Untranslated RNA/biosynthesis, Untranslated RNA/genetics, Untranslated RNA/metabolism, Reproducibility of Results, Sulfonamides/pharmacology, Up-Regulation/genetics
14.
Nature; 500(7463): 472-476, 2013 Aug 22.
Article in English | MEDLINE | ID: mdl-23877069

ABSTRACT

The dynamic nature of gene expression enables cellular programming, homeostasis and environmental adaptation in living systems. Dissection of causal gene functions in cellular and organismal processes therefore necessitates approaches that enable spatially and temporally precise modulation of gene expression. Recently, a variety of microbial and plant-derived light-sensitive proteins have been engineered as optogenetic actuators, enabling high-precision spatiotemporal control of many cellular functions. However, versatile and robust technologies that enable optical modulation of transcription in the mammalian endogenous genome remain elusive. Here we describe the development of light-inducible transcriptional effectors (LITEs), an optogenetic two-hybrid system integrating the customizable TALE DNA-binding domain with the light-sensitive cryptochrome 2 protein and its interacting partner CIB1 from Arabidopsis thaliana. LITEs do not require additional exogenous chemical cofactors, are easily customized to target many endogenous genomic loci, and can be activated within minutes with reversibility. LITEs can be packaged into viral vectors and genetically targeted to probe specific cell populations. We have applied this system in primary mouse neurons, as well as in the brain of freely behaving mice in vivo to mediate reversible modulation of mammalian endogenous gene expression as well as targeted epigenetic chromatin modifications. The LITE system establishes a novel mode of optogenetic control of endogenous cellular processes and enables direct testing of the causal roles of genetic and epigenetic regulation in normal biological processes and disease states.


Subjects
Genetic Epigenesis/genetics, Genetic Epigenesis/radiation effects, Gene Expression Regulation/radiation effects, Light, Optogenetics/methods, Genetic Transcription/radiation effects, Animals, Arabidopsis Proteins/metabolism, Basic Helix-Loop-Helix Transcription Factors/metabolism, Cultured Cells, Chromatin/genetics, Chromatin/radiation effects, Cryptochromes/metabolism, Gene Expression Regulation/genetics, Genetic Vectors/genetics, Male, Mice, Inbred C57BL Mice, Neurons/metabolism, Neurons/radiation effects, Time Factors, Genetic Transcription/genetics, Two-Hybrid System Techniques, Wakefulness
16.
bioRxiv; 2024 Jun 23.
Article in English | MEDLINE | ID: mdl-38948874

ABSTRACT

Gene therapies have the potential to treat disease by delivering therapeutic genetic cargo to disease-associated cells. One limitation to their widespread use is the lack of short regulatory sequences, or promoters, that differentially induce the expression of delivered genetic cargo in target cells, minimizing side effects in other cell types. Such cell-type-specific promoters are difficult to discover using existing methods, requiring either manual curation or access to large datasets of promoter-driven expression from both targeted and untargeted cells. Model-based optimization (MBO) has emerged as an effective method to design biological sequences in an automated manner, and has recently been used in promoter design methods. However, these methods have only been tested using large training datasets that are expensive to collect, and focus on designing promoters for markedly different cell types, overlooking the complexities associated with designing promoters for closely related cell types that share similar regulatory features. Therefore, we introduce a comprehensive framework for utilizing MBO to design promoters in a data-efficient manner, with an emphasis on discovering promoters for similar cell types. We use conservative objective models (COMs) for MBO and highlight practical considerations such as best practices for improving sequence diversity, getting estimates of model uncertainty, and choosing the optimal set of sequences for experimental validation. Using three relatively similar blood cancer cell lines (Jurkat, K562, and THP1), we show that our approach discovers many novel cell-type-specific promoters after experimentally validating the designed sequences. For K562 cells, in particular, we discover a promoter that has 75.85% higher cell-type-specificity than the best promoter from the initial dataset used to train our models.

17.
bioRxiv; 2024 Jan 26.
Article in English | MEDLINE | ID: mdl-38328150

ABSTRACT

Genomic rearrangements, encompassing mutational changes in the genome such as insertions, deletions, or inversions, are essential for genetic diversity. These rearrangements are typically orchestrated by enzymes involved in fundamental DNA repair processes such as homologous recombination or in the transposition of foreign genetic material by viruses and mobile genetic elements (MGEs). We report that IS110 insertion sequences, a family of minimal and autonomous MGEs, express a structured non-coding RNA that binds specifically to their encoded recombinase. This bridge RNA contains two internal loops encoding nucleotide stretches that base-pair with the target DNA and donor DNA, which is the IS110 element itself. We demonstrate that the target-binding and donor-binding loops can be independently reprogrammed to direct sequence-specific recombination between two DNA molecules. This modularity enables DNA insertion into genomic target sites as well as programmable DNA excision and inversion. The IS110 bridge system expands the diversity of nucleic acid-guided systems beyond CRISPR and RNA interference, offering a unified mechanism for the three fundamental DNA rearrangements required for genome design.

18.
bioRxiv; 2023 Feb 27.
Article in English | MEDLINE | ID: mdl-36909524

ABSTRACT

Advances in gene delivery technologies are enabling rapid progress in molecular medicine, but require precise expression of genetic cargo in desired cell types, which is predominantly achieved via a regulatory DNA sequence called a promoter; however, only a handful of cell type-specific promoters are known. Efficiently designing compact promoter sequences with a high density of regulatory information by leveraging machine learning models would therefore be broadly impactful for fundamental research and direct therapeutic applications. However, models of expression from such compact promoter sequences are lacking, despite the recent success of deep learning in modelling expression from endogenous regulatory sequences. Despite the lack of large datasets measuring promoter-driven expression in many cell types, data from a few well-studied cell types or from endogenous gene expression may provide relevant information for transfer learning, which has not yet been explored in this setting. Here, we evaluate a variety of pretraining tasks and transfer strategies for modelling cell type-specific expression from compact promoters and demonstrate the effectiveness of pretraining on existing promoter-driven expression datasets from other cell types. Our approach is broadly applicable for modelling promoter-driven expression in any data-limited cell type of interest, and will enable the use of model-based optimization techniques for promoter design for gene delivery applications. Our code and data are available at https://github.com/anikethjr/promoter_models.

19.
Nat Biotechnol; 41(4): 488-499, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36217031

ABSTRACT

Large serine recombinases (LSRs) are DNA integrases that facilitate the site-specific integration of mobile genetic elements into bacterial genomes. Only a few LSRs, such as Bxb1 and PhiC31, have been characterized to date, with limited efficiency as tools for DNA integration in human cells. In this study, we developed a computational approach to identify thousands of LSRs and their DNA attachment sites, expanding known LSR diversity by >100-fold and enabling the prediction of their insertion site specificities. We tested their recombination activity in human cells, classifying them as landing pad, genome-targeting or multi-targeting LSRs. Overall, we achieved up to seven-fold higher recombination than Bxb1 and genome integration efficiencies of 40-75% with cargo sizes over 7 kb. We also demonstrate virus-free, direct integration of plasmid or amplicon libraries for improved functional genomics applications. This systematic discovery of recombinases directly from microbial sequencing data provides a resource of over 60 LSRs experimentally characterized in human cells for large-payload genome insertion without exposed DNA double-stranded breaks.


Subjects
Genetic Engineering, Integrases, Humans, Human Genome, Transfection, Genomic Library
20.
Cell Syst; 14(12): 1087-1102.e13, 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38091991

ABSTRACT

Effective and precise mammalian transcriptome engineering technologies are needed to accelerate biological discovery and RNA therapeutics. Despite the promise of programmable CRISPR-Cas13 ribonucleases, their utility has been hampered by an incomplete understanding of guide RNA design rules and cellular toxicity resulting from off-target or collateral RNA cleavage. Here, we quantified the performance of over 127,000 RfxCas13d (CasRx) guide RNAs and systematically evaluated seven machine learning models to build a guide efficiency prediction algorithm orthogonally validated across multiple human cell types. Deep learning model interpretation revealed preferred sequence motifs and secondary features for highly efficient guides. We next identified and screened 46 novel Cas13d orthologs, finding that DjCas13d achieves low cellular toxicity and high specificity, even when targeting abundant transcripts in sensitive cell types, including stem cells and neurons. Our Cas13d guide efficiency model was successfully generalized to DjCas13d, illustrating the power of combining machine learning with ortholog discovery to advance RNA targeting in human cells.


Subjects
CRISPR-Cas Systems, Deep Learning, RNA, Humans, CRISPR-Cas Systems/genetics, RNA/genetics, CRISPR-Cas Systems Guide RNA, Transcriptome