1.
Cell ; 182(2): 463-480.e30, 2020 07 23.
Article in English | MEDLINE | ID: mdl-32533916

ABSTRACT

Although base editors are widely used to install targeted point mutations, the factors that determine base editing outcomes are not well understood. We characterized sequence-activity relationships of 11 cytosine and adenine base editors (CBEs and ABEs) on 38,538 genomically integrated targets in mammalian cells and used the resulting outcomes to train BE-Hive, a machine learning model that accurately predicts base editing genotypic outcomes (R ≈ 0.9) and efficiency (R ≈ 0.7). We corrected 3,388 disease-associated SNVs with ≥90% precision, including 675 alleles with bystander nucleotides that BE-Hive correctly predicted would not be edited. We discovered determinants of previously unpredictable C-to-G or C-to-A editing and used these discoveries to correct coding sequences of 174 pathogenic transversion SNVs with ≥90% precision. Finally, we used insights from BE-Hive to engineer novel CBE variants that modulate editing outcomes. These discoveries illuminate base editing, enable editing at previously intractable targets, and provide new base editors with improved editing capabilities.


Subject(s)
Gene Editing/methods , Machine Learning , Animals , Gene Library , Humans , Mice , Mouse Embryonic Stem Cells/cytology , Mouse Embryonic Stem Cells/metabolism , Point Mutation , RNA, Guide, Kinetoplastida/metabolism
2.
Nature ; 567(7746): E1-E2, 2019 03.
Article in English | MEDLINE | ID: mdl-30765887

ABSTRACT

In this Article, a data processing error affected Fig. 3e and Extended Data Table 2; these errors have been corrected online.

3.
Nature ; 563(7733): 646-651, 2018 11.
Article in English | MEDLINE | ID: mdl-30405244

ABSTRACT

Following Cas9 cleavage, DNA repair without a donor template is generally considered stochastic, heterogeneous and impractical beyond gene disruption. Here, we show that template-free Cas9 editing is predictable and capable of precise repair to a predicted genotype, enabling correction of disease-associated mutations in humans. We constructed a library of 2,000 Cas9 guide RNAs paired with DNA target sites and trained inDelphi, a machine learning model that predicts genotypes and frequencies of 1- to 60-base-pair deletions and 1-base-pair insertions with high accuracy (r = 0.87) in five human and mouse cell lines. inDelphi predicts that 5-11% of Cas9 guide RNAs targeting the human genome are 'precise-50', yielding a single genotype comprising greater than or equal to 50% of all major editing products. We experimentally confirmed precise-50 insertions and deletions in 195 human disease-relevant alleles, including correction in primary patient-derived fibroblasts of pathogenic alleles to wild-type genotype for Hermansky-Pudlak syndrome and Menkes disease. This study establishes an approach for precise, template-free genome editing.


Subject(s)
CRISPR-Cas Systems/genetics , Gene Editing/methods , Gene Editing/standards , Hermanski-Pudlak Syndrome/genetics , Machine Learning , Menkes Kinky Hair Syndrome/genetics , Templates, Genetic , Alleles , Base Sequence , CRISPR-Associated Protein 9/metabolism , DNA Repair/genetics , Fibroblasts/metabolism , Fibroblasts/pathology , HCT116 Cells , HEK293 Cells , Hermanski-Pudlak Syndrome/pathology , Humans , K562 Cells , Menkes Kinky Hair Syndrome/pathology , Reproducibility of Results , Substrate Specificity
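Note: the 'precise-50' criterion in the abstract above (a single genotype making up at least 50% of all major editing products) can be illustrated with a minimal sketch. This is not inDelphi's code or API; the function name `is_precise`, the input format, and the example counts are hypothetical and chosen only for illustration.

```python
def is_precise(outcome_counts, threshold=0.5):
    """Return True if one genotype makes up at least `threshold`
    of all listed editing products.

    outcome_counts: dict mapping a genotype label to its count or
    frequency among major editing products (hypothetical format).
    """
    total = sum(outcome_counts.values())
    if total <= 0:
        raise ValueError("outcome_counts must contain a positive total")
    top_share = max(outcome_counts.values()) / total
    return top_share >= threshold


# Made-up example: one outcome dominates, so the guide would count as precise-50.
example = {"1-bp insertion": 60, "3-bp deletion": 25, "7-bp deletion": 15}
print(is_precise(example))
```

Passing a different `threshold` lets the same check express stricter cutoffs.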
4.
Nat Chem Biol ; 17(11): 1188-1198, 2021 11.
Article in English | MEDLINE | ID: mdl-34635842

ABSTRACT

Directed evolution can generate proteins with tailor-made activities. However, full-length genotypes, their frequencies and fitnesses are difficult to measure for evolving gene-length biomolecules using most high-throughput DNA sequencing methods, as short read lengths can lose mutation linkages in haplotypes. Here we present Evoracle, a machine learning method that accurately reconstructs full-length genotypes (R² = 0.94) and fitness using short-read data from directed evolution experiments, with substantial improvements over related methods. We validate Evoracle on phage-assisted continuous evolution (PACE) and phage-assisted non-continuous evolution (PANCE) of adenine base editors and OrthoRep evolution of drug-resistant enzymes. Evoracle retains strong performance (R² = 0.86) on data with complete linkage loss between neighboring nucleotides and large measurement noise, such as pooled Sanger sequencing data (~US$10 per timepoint), and broadens the accessibility of training machine learning models on gene variant fitnesses. Evoracle can also identify high-fitness variants, including low-frequency 'rising stars', well before they are identifiable from consensus mutations.


Subject(s)
Adenosine Deaminase/genetics , Escherichia coli Proteins/genetics , High-Throughput Nucleotide Sequencing , Genetic Variation/genetics , Machine Learning
5.
PLoS Comput Biol ; 17(1): e1008605, 2021 01.
Article in English | MEDLINE | ID: mdl-33417623

ABSTRACT

Restoring gene function by the induced skipping of deleterious exons has been shown to be effective for treating genetic disorders. However, many of the clinically successful therapies for exon skipping are transient oligonucleotide-based treatments that require frequent dosing. CRISPR-Cas9 based genome editing that causes exon skipping is a promising therapeutic modality that may offer permanent alleviation of genetic disease. We show that machine learning can select Cas9 guide RNAs that disrupt splice acceptors and cause the skipping of targeted exons. We experimentally measured the exon skipping frequencies of a diverse genome-integrated library of 791 splice sequences targeted by 1,063 guide RNAs in mouse embryonic stem cells. We found that our method, SkipGuide, is able to identify effective guide RNAs with a precision of 0.68 (50% threshold predicted exon skipping frequency) and 0.93 (70% threshold predicted exon skipping frequency). We anticipate that SkipGuide will be useful for selecting guide RNA candidates for evaluation of CRISPR-Cas9-mediated exon skipping therapy.


Subject(s)
CRISPR-Cas Systems/genetics , Gene Editing/methods , Genetic Therapy/methods , Machine Learning , RNA, Guide, Kinetoplastida/genetics , Animals , Cells, Cultured , Embryonic Stem Cells , Exons , Gene Library , Humans , Mice
6.
PLoS Comput Biol ; 17(3): e1008789, 2021 03.
Article in English | MEDLINE | ID: mdl-33711017

ABSTRACT

We introduce poly-adenine CRISPR gRNA-based single-cell RNA-sequencing (pAC-Seq), a method that enables the direct observation of guide RNAs (gRNAs) in scRNA-seq. We use pAC-Seq to assess the phenotypic consequences of CRISPR/Cas9 based alterations of gene cis-regulatory regions. We show that pAC-Seq is able to detect cis-regulatory-induced alteration of target gene expression even when biallelic loss of target gene expression occurs in only ~5% of cells. This low rate of biallelic loss significantly increases the number of cells required to detect the consequences of changes to the regulatory genome, but can be ameliorated by transcript-targeted sequencing. Based on our experimental results we model the power to detect regulatory genome induced transcriptomic effects based on the rate of mono/biallelic loss, baseline gene expression, and the number of cells per target gRNA.


Subject(s)
CRISPR-Cas Systems/genetics , Regulatory Elements, Transcriptional/genetics , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Transcriptome/genetics , Algorithms , Animals , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Computational Biology , Databases, Factual , Humans , Mice , RNA, Guide, Kinetoplastida/genetics
7.
Proc Natl Acad Sci U S A ; 113(52): E8396-E8405, 2016 12 27.
Article in English | MEDLINE | ID: mdl-27956617

ABSTRACT

The recent breakthroughs in assembling long error-prone reads were based on the overlap-layout-consensus (OLC) approach and did not utilize the strengths of the alternative de Bruijn graph approach to genome assembly. Moreover, these studies often assume that applications of the de Bruijn graph approach are limited to short and accurate reads and that the OLC approach is the only practical paradigm for assembling long error-prone reads. We show how to generalize de Bruijn graphs for assembling long error-prone reads and describe the ABruijn assembler, which combines the de Bruijn graph and the OLC approaches and results in accurate genome reconstructions.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Algorithms , Benchmarking , Escherichia coli/genetics , Genomics , Reproducibility of Results , Software , Xanthomonas/genetics
8.
Science ; 380(6642): eadg6518, 2023 04 21.
Article in English | MEDLINE | ID: mdl-36996170

ABSTRACT

Spinal muscular atrophy (SMA), the leading genetic cause of infant mortality, arises from survival motor neuron (SMN) protein insufficiency resulting from SMN1 loss. Approved therapies circumvent endogenous SMN regulation and require repeated dosing or may wane. We describe genome editing of SMN2, an insufficient copy of SMN1 harboring a C6>T mutation, to permanently restore SMN protein levels and rescue SMA phenotypes. We used nucleases or base editors to modify five SMN2 regulatory regions. Base editing converted SMN2 T6>C, restoring SMN protein levels to wild type. Adeno-associated virus serotype 9-mediated base editor delivery in Δ7SMA mice yielded 87% average T6>C conversion, improved motor function, and extended average life span, which was enhanced by one-time base editor and nusinersen coadministration (111 versus 17 days untreated). These findings demonstrate the potential of a one-time base editing treatment for SMA.


Subject(s)
Gene Editing , Muscular Atrophy, Spinal , Survival of Motor Neuron 1 Protein , Survival of Motor Neuron 2 Protein , Animals , Mice , Fibroblasts/metabolism , Motor Neurons/metabolism , Muscular Atrophy, Spinal/genetics , Muscular Atrophy, Spinal/therapy , Survival of Motor Neuron 1 Protein/genetics , Survival of Motor Neuron 2 Protein/genetics
9.
Nat Commun ; 13(1): 4541, 2022 08 04.
Article in English | MEDLINE | ID: mdl-35927274

ABSTRACT

In vitro selection queries large combinatorial libraries for sequence-defined polymers with target binding and reaction catalysis activity. While the total sequence space of these libraries can extend beyond 10²² sequences, practical considerations limit starting sequences to ≤~10¹⁵ distinct molecules. Selection-induced sequence convergence and limited sequencing depth further constrain experimentally observable sequence space. To address these limitations, we integrate experimental and machine learning approaches to explore regions of sequence space unrelated to experimentally derived variants. We perform in vitro selections to discover highly side-chain-functionalized nucleic acid polymers (HFNAPs) with potent affinities for a target small molecule (daunomycin KD = 5-65 nM). We then use the selection data to train a conditional variational autoencoder (CVAE) machine learning model to generate diverse and unique HFNAP sequences with high daunomycin affinities (KD = 9-26 nM), even though they are unrelated in sequence to experimental polymers. Coupling in vitro selection with a machine learning model thus enables direct generation of active variants, demonstrating a new approach to the discovery of functional biopolymers.


Subject(s)
Nucleic Acids , Biopolymers , Daunorubicin , Machine Learning , Polymers/chemistry
10.
Nat Commun ; 13(1): 3512, 2022 06 18.
Article in English | MEDLINE | ID: mdl-35717416

ABSTRACT

Prime editing enables search-and-replace genome editing but is limited by low editing efficiency. We present a high-throughput approach, the Peptide Self-Editing sequencing assay (PepSEq), to measure how fusion of 12,000 85-amino acid peptides influences prime editing efficiency. We show that peptide fusion can enhance prime editing, prime-enhancing peptides combine productively, and a top dual peptide-prime editor increases prime editing significantly in multiple cell lines across dozens of target sites. Top prime-enhancing peptides function by increasing translation efficiency and serve as broadly useful tools to improve prime editing efficiency.


Subject(s)
CRISPR-Cas Systems , Gene Editing , Cell Line , Gene Fusion , Peptides/genetics
11.
Nat Commun ; 12(1): 5111, 2021 08 25.
Article in English | MEDLINE | ID: mdl-34433825

ABSTRACT

Mutational outcomes following CRISPR-Cas9-nuclease cutting in mammalian cells have recently been shown to be predictable and, in certain cases, skewed toward single genotypes. However, the ability to control these outcomes remains limited, especially for 1-bp insertions, a common and therapeutically relevant class of repair outcomes. Here, through a small molecule screen, we identify the ATM kinase inhibitor KU-60019 as a compound capable of reproducibly increasing the fraction of 1-bp insertions relative to other Cas9 repair outcomes. Small molecule or genetic ATM inhibition increases 1-bp insertion outcome fraction across three human and mouse cell lines, two Cas9 species, and dozens of target sites, although concomitantly reducing the fraction of edited alleles. Notably, KU-60019 increases the relative frequency of 1-bp insertions to over 80% of edited alleles at several native human genomic loci and improves the efficiency of correction for pathogenic 1-bp deletion variants. The ability to increase 1-bp insertion frequency adds another dimension to precise template-free Cas9-nuclease genome editing.


Subject(s)
Ataxia Telangiectasia Mutated Proteins/antagonists & inhibitors , Ataxia Telangiectasia Mutated Proteins/metabolism , CRISPR-Cas Systems/drug effects , Morpholines/pharmacology , Mutagenesis, Insertional/drug effects , Protein Kinase Inhibitors/pharmacology , Thioxanthenes/pharmacology , Animals , Ataxia Telangiectasia Mutated Proteins/genetics , Cell Line , Gene Editing , Humans , Sequence Deletion/drug effects
12.
Nat Commun ; 12(1): 1034, 2021 02 15.
Article in English | MEDLINE | ID: mdl-33589617

ABSTRACT

Prime editing (PE) is a versatile genome editing technology, but design of the required guide RNAs is more complex than for standard CRISPR-based nucleases or base editors. Here we describe PrimeDesign, a user-friendly, end-to-end web application and command-line tool for the design of PE experiments. PrimeDesign can be used for single and combination editing applications, as well as genome-wide and saturation mutagenesis screens. Using PrimeDesign, we construct PrimeVar, a comprehensive and searchable database that includes candidate prime editing guide RNA (pegRNA) and nicking sgRNA (ngRNA) combinations for installing or correcting >68,500 pathogenic human genetic variants from the ClinVar database. Finally, we use PrimeDesign to design pegRNAs/ngRNAs to install a variety of human pathogenic variants in human cells.


Subject(s)
CRISPR-Cas Systems , Gene Editing/methods , Genome, Human , RNA, Guide, Kinetoplastida/genetics , Base Pairing , Base Sequence , CRISPR-Associated Protein 9/genetics , CRISPR-Associated Protein 9/metabolism , Clustered Regularly Interspaced Short Palindromic Repeats , Databases, Genetic , Fabry Disease/genetics , Fabry Disease/metabolism , Fabry Disease/pathology , Green Fluorescent Proteins/genetics , Green Fluorescent Proteins/metabolism , HEK293 Cells , Hemophilia A/genetics , Hemophilia A/metabolism , Hemophilia A/pathology , Humans , Models, Biological , Muscular Dystrophy, Duchenne/genetics , Muscular Dystrophy, Duchenne/metabolism , Muscular Dystrophy, Duchenne/pathology , Mutation , Nucleic Acid Conformation , Plasmids/chemistry , Plasmids/metabolism , RNA, Guide, Kinetoplastida/metabolism , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/metabolism
13.
Nat Biotechnol ; 39(11): 1414-1425, 2021 11.
Article in English | MEDLINE | ID: mdl-34183861

ABSTRACT

Programmable C•G-to-G•C base editors (CGBEs) have broad scientific and therapeutic potential, but their editing outcomes have proved difficult to predict and their editing efficiency and product purity are often low. We describe a suite of engineered CGBEs paired with machine learning models to enable efficient, high-purity C•G-to-G•C base editing. We performed a CRISPR interference (CRISPRi) screen targeting DNA repair genes to identify factors that affect C•G-to-G•C editing outcomes and used these insights to develop CGBEs with diverse editing profiles. We characterized ten promising CGBEs on a library of 10,638 genomically integrated target sites in mammalian cells and trained machine learning models that accurately predict the purity and yield of editing outcomes (R = 0.90) using these data. These CGBEs enable correction to the wild-type coding sequence of 546 disease-related transversion single-nucleotide variants (SNVs) with >90% precision (mean 96%) and up to 70% efficiency (mean 14%). Computational prediction of optimal CGBE-single-guide RNA pairs enables high-purity transversion base editing at over fourfold more target sites than achieved using any single CGBE variant.


Subject(s)
Clustered Regularly Interspaced Short Palindromic Repeats , Gene Editing , Animals , CRISPR-Cas Systems/genetics , Machine Learning , Mammals/genetics , RNA, Guide, Kinetoplastida/genetics
14.
Cell Rep ; 33(8): 108426, 2020 11 24.
Article in English | MEDLINE | ID: mdl-33238122

ABSTRACT

Gene expression is controlled by the collective binding of transcription factors to cis-regulatory regions. Deciphering gene-centered regulatory networks is vital to understanding and controlling gene misexpression in human disease; however, systematic approaches to uncovering regulatory networks have been lacking. Here we present high-throughput interrogation of gene-centered activation networks (HIGAN), a pipeline that employs a suite of multifaceted genomic approaches to connect upstream signaling inputs, trans-acting TFs, and cis-regulatory elements. We apply HIGAN to understand the aberrant activation of the cytidine deaminase APOBEC3B, an intrinsic source of cancer hypermutation. We reveal that nuclear factor κB (NF-κB) and AP-1 pathways are the most salient trans-acting inputs, with minor roles for other inflammatory pathways. We identify a cis-regulatory architecture dominated by a major intronic enhancer that requires coordinated NF-κB and AP-1 activity with secondary inputs from distal regulatory regions. Our data demonstrate how integration of cis and trans genomic screening platforms provides a paradigm for building gene-centered regulatory networks.


Subject(s)
Gene Expression/genetics , Gene Regulatory Networks/genetics , Oncogenes/immunology , Humans , Signal Transduction
15.
Nat Biotechnol ; 38(4): 471-481, 2020 04.
Article in English | MEDLINE | ID: mdl-32042170

ABSTRACT

The targeting scope of Streptococcus pyogenes Cas9 (SpCas9) and its engineered variants is largely restricted to protospacer-adjacent motif (PAM) sequences containing G bases. Here we report the evolution of three new SpCas9 variants that collectively recognize NRNH PAMs (where R is A or G and H is A, C or T) using phage-assisted non-continuous evolution, three new phage-assisted continuous evolution strategies for DNA binding and a secondary selection for DNA cleavage. The targeting capabilities of these evolved variants and SpCas9-NG were characterized in HEK293T cells using a library of 11,776 genomically integrated protospacer-sgRNA pairs containing all possible NNNN PAMs. The evolved variants mediated indel formation and base editing in human cells and enabled A•T-to-G•C base editing of a sickle cell anemia mutation using a previously inaccessible CACC PAM. These new evolved SpCas9 variants, together with previously reported variants, in principle enable targeting of most NR PAM sequences and substantially reduce the fraction of genomic sites that are inaccessible by Cas9-based methods.


Subject(s)
CRISPR-Associated Protein 9/genetics , CRISPR-Associated Protein 9/metabolism , CRISPR-Cas Systems/genetics , DNA/genetics , DNA/metabolism , DNA Cleavage , Directed Molecular Evolution , Gene Editing , Genetic Variation , Genome, Human/genetics , HEK293 Cells , Humans , Mutation , Nucleotide Motifs , Streptococcus pyogenes/enzymology , Streptococcus pyogenes/genetics , Substrate Specificity
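Note: the NRNH PAM notation in the abstract above uses IUPAC ambiguity codes (N is any base, R is A or G, H is A, C or T). The sketch below shows one way to test a 4-nt sequence against such a pattern. It is an illustration only, not code from the cited study; the names `IUPAC` and `matches_pam` are hypothetical, and only the codes needed here are included.

```python
from itertools import product

# Subset of IUPAC nucleotide codes needed for the NRNH pattern.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "H": "ACT",   # not G
    "N": "ACGT",  # any base
}


def matches_pam(seq, pattern="NRNH"):
    """Return True if `seq` is compatible with the IUPAC `pattern`."""
    seq = seq.upper()
    if len(seq) != len(pattern):
        return False
    return all(base in IUPAC[code] for base, code in zip(seq, pattern))


# The CACC PAM mentioned in the abstract fits NRNH (C, A, C, C).
print(matches_pam("CACC"))

# Count how many of the 256 possible NNNN PAMs fit NRNH.
all_pams = ("".join(p) for p in product("ACGT", repeat=4))
print(sum(matches_pam(p) for p in all_pams))
```

The count follows directly from the pattern: 4 × 2 × 4 × 3 = 96 of the 256 possible 4-nt PAMs.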