ABSTRACT
Combinatorial CRISPR technologies have emerged as a transformative approach to systematically probe genetic interactions and dependencies of redundant gene pairs. However, the performance of different functional genomic tools for multiplexing sgRNAs varies widely. Here, we generate and benchmark ten distinct pooled combinatorial CRISPR libraries targeting paralog pairs to optimize digenic knockout screens. Libraries composed of dual Streptococcus pyogenes Cas9 (spCas9), orthogonal spCas9 and Staphylococcus aureus Cas9 (saCas9), and enhanced Cas12a from Acidaminococcus were evaluated. We demonstrate that a combination of alternative tracrRNA sequences from spCas9 consistently shows superior effect size and positional balance between the sgRNAs, making it a robust combinatorial approach to profile genetic interactions of multiple genes.
Subjects
Acidaminococcus, CRISPR-Cas Systems, Acidaminococcus/genetics, CRISPR-Cas Systems/genetics, Kinetoplastid Guide RNA/genetics, Staphylococcus aureus/genetics, Streptococcus pyogenes/genetics
ABSTRACT
Although single-gene perturbation screens have revealed a number of new targets, vulnerabilities specific to frequently altered drivers have not been uncovered. An important question is whether the compensatory relationship between functionally redundant genes masks potential therapeutic targets in single-gene perturbation studies. To identify digenic dependencies, we developed a CRISPR paralog targeting library to investigate the viability effects of disrupting 3,284 genes, 5,065 paralog pairs and 815 paralog families. We identified that dual inactivation of DUSP4 and DUSP6 selectively impairs growth in NRAS and BRAF mutant cells through the hyperactivation of MAPK signaling. Furthermore, cells resistant to MAPK pathway therapeutics become cross-sensitized to DUSP4 and DUSP6 perturbations such that the mechanisms of resistance to the inhibitors reinforce this mechanism of vulnerability. Together, multigene perturbation technologies unveil previously unrecognized digenic vulnerabilities that may be leveraged as new therapeutic targets in cancer.
Subjects
Dual Specificity Phosphatase 6/genetics, Dual-Specificity Phosphatases/genetics, MAP Kinase Signaling System, Mitogen-Activated Protein Kinase Phosphatases/genetics, Neoplasms/genetics, Tumor Cell Line, Clustered Regularly Interspaced Short Palindromic Repeats, Enzyme Activation, GTP Phosphohydrolases/genetics, Gene Knockout Techniques, Humans, Experimental Melanoma/genetics, Experimental Melanoma/therapy, Membrane Proteins/genetics, Neoplasms/enzymology, Neoplasms/metabolism, Neoplasms/therapy, Proto-Oncogene Proteins B-raf/genetics
ABSTRACT
Systems for CRISPR-based combinatorial perturbation of two or more genes are emerging as powerful tools for uncovering genetic interactions. However, systematic identification of these relationships is complicated by sample, reagent, and biological variability. We develop a variational Bayes approach (GEMINI) that jointly analyzes all samples and reagents to identify genetic interactions in pairwise knockout screens. The improved accuracy and scalability of GEMINI enable the systematic analysis of combinatorial CRISPR knockout screens, regardless of design and dimension. GEMINI is available as an open source R package on GitHub at https://github.com/sellerslab/gemini.
Subjects
Clustered Regularly Interspaced Short Palindromic Repeats, Genetic Techniques, Software, Bayes Theorem, Genetic Epistasis
ABSTRACT
Large panels of comprehensively characterized human cancer models, including the Cancer Cell Line Encyclopedia (CCLE), have provided a rigorous framework with which to study genetic variants, candidate targets, and small-molecule and biological therapeutics and to identify new marker-driven cancer dependencies. To improve our understanding of the molecular features that contribute to cancer phenotypes, including drug responses, here we have expanded the characterizations of cancer cell lines to include genetic, RNA splicing, DNA methylation, histone H3 modification, microRNA expression and reverse-phase protein array data for 1,072 cell lines from individuals of various lineages and ethnicities. Integration of these data with functional characterizations such as drug-sensitivity, short hairpin RNA knockdown and CRISPR-Cas9 knockout data reveals potential targets for cancer drugs and associated biomarkers. Together, this dataset and an accompanying public data portal provide a resource for the acceleration of cancer research using model cancer cell lines.
Subjects
Tumor Cell Line, Neoplasms/genetics, Neoplasms/pathology, Antineoplastic Agents/pharmacology, Tumor Biomarkers, DNA Methylation, Antineoplastic Drug Resistance, Ethnicity/genetics, Gene Editing, Histones/metabolism, Humans, MicroRNAs/genetics, Molecular Targeted Therapy, Neoplasms/metabolism, Protein Array Analysis, RNA Splicing
ABSTRACT
When different types of functional genomics data are generated on single cells from different samples of cells from the same heterogeneous population, the clustering of cells in the different samples should be coupled. We formulate this "coupled clustering" problem as an optimization problem and propose the method of coupled nonnegative matrix factorizations (coupled NMF) for its solution. The method is illustrated by the integrative analysis of single-cell RNA-sequencing (RNA-seq) and single-cell ATAC-sequencing (ATAC-seq) data.
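As a rough illustration of the building block this abstract names, the sketch below runs plain nonnegative matrix factorization on a single features-by-cells matrix and assigns each cell to its dominant factor. It is not the authors' coupled formulation: the abstract does not give the coupling term, so none is included, and the function names `nmf` and `cluster_cells`, the multiplicative update rule, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np


def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix X (features x cells) as W @ H.

    Uses the standard multiplicative updates for the squared
    Frobenius loss. This is single-matrix NMF only; no coupling
    between data types is modeled here (illustrative assumption).
    """
    rng = np.random.default_rng(seed)
    n_features, n_cells = X.shape
    # Strictly positive random initialization.
    W = rng.random((n_features, k)) + eps
    H = rng.random((k, n_cells)) + eps
    for _ in range(n_iter):
        # Update cell loadings, then feature loadings.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def cluster_cells(H):
    """Assign each cell (column of H) to its largest factor."""
    return np.argmax(H, axis=0)


if __name__ == "__main__":
    # Toy data: two groups of three cells with disjoint active features.
    X = np.zeros((4, 6))
    X[:2, :3] = 1.0
    X[2:, 3:] = 1.0
    W, H = nmf(X, k=2, n_iter=500)
    print(cluster_cells(H))
```

In a coupled setting one would factor the two data matrices jointly with an extra term linking their cluster structures; that term is specific to the published method and is deliberately left out of this sketch.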
Subjects
Genetic Databases, Genetic Models, RNA Sequence Analysis/methods, Animals, Humans
ABSTRACT
Characterizing epigenetic heterogeneity at the cellular level is a critical problem in the modern genomics era. Assays such as single cell ATAC-seq (scATAC-seq) offer an opportunity to interrogate cellular level epigenetic heterogeneity through patterns of variability in open chromatin. However, these assays exhibit technical variability that complicates clear classification and cell type identification in heterogeneous populations. We present scABC, an R package for the unsupervised clustering of single-cell epigenetic data, to classify scATAC-seq data and discover regions of open chromatin specific to cell identity.
Subjects
Epigenomics/methods, Statistical Models, Single-Cell Analysis/methods, Algorithms, Animals, Cell Line, Chromatin/genetics, Cluster Analysis, Mice, Mouse Embryonic Stem Cells, DNA Sequence Analysis/methods, Software
ABSTRACT
The reconstruction of gene regulatory networks from gene expression data has been the subject of intense research activity. A variety of models and methods have been developed to address different aspects of this important problem. However, these techniques are narrowly focused on particular biological and experimental platforms, and require experimental data that are typically unavailable and difficult to ascertain. The more recent availability of higher-throughput sequencing platforms, combined with more precise modes of genetic perturbation, presents an opportunity to formulate more robust and comprehensive approaches to gene network inference. Here, we propose a step-wise framework for identifying gene-gene regulatory interactions that expand from a known point of genetic or chemical perturbation using time series gene expression data. This novel approach sequentially identifies non-steady state genes post-perturbation and incorporates them into a growing series of low-complexity optimization problems. The governing ordinary differential equations of this model are rooted in the biophysics of stochastic molecular events that underlie gene regulation, delineating roles for both protein and RNA-mediated gene regulation. We show the successful application of our core algorithms for network inference using simulated and real datasets.
Subjects
Gene Expression Profiling/methods, Gene Regulatory Networks/genetics, Systems Biology/methods, Genetic Databases, RNA Sequence Analysis, Time Factors
ABSTRACT
Transcription factors (TFs) play crucial roles in regulating gene expression through interactions with specific DNA sequences. Recently, the sequence motifs of almost 400 human TFs have been identified using high-throughput SELEX sequencing. However, there remain a large number of TFs (~800) with no high-throughput-derived binding motifs. Computational methods capable of associating known motifs with such TFs will avoid tremendous experimental efforts and enable deeper understanding of transcriptional regulatory functions. We present a method to associate known motifs with TFs (MATLAB code is available in Supplementary Materials). Our method is based on a probabilistic framework that not only exploits DNA-binding domains and specificities, but also integrates open chromatin, gene expression and genomic data to accurately infer monomeric and homodimeric binding motifs. Our analysis resulted in the assignment of motifs to 200 TFs with no SELEX-derived motifs, roughly a 50% increase compared to the existing coverage.
Subjects
Algorithms, Chromatin/chemistry, DNA/chemistry, Gene Expression Regulation, Statistical Models, Transcription Factors/genetics, Binding Sites, Chromatin/metabolism, DNA/genetics, DNA/metabolism, Human Genome, Humans, Nucleotide Motifs, Protein Binding, SELEX Aptamer Technique, Transcription Factors/metabolism
ABSTRACT
Dimension reduction methods are commonly applied to high-throughput biological datasets. However, the results can be hindered by confounding factors, either biological or technical in origin. In this study, we extend principal component analysis (PCA) to propose AC-PCA for simultaneous dimension reduction and adjustment for confounding (AC) variation. We show that AC-PCA can adjust for (i) variations across individual donors present in a human brain exon array dataset and (ii) variations of different species in a model organism ENCODE RNA sequencing dataset. Our approach is able to recover the anatomical structure of neocortical regions and to capture the shared variation among species during embryonic development. For gene selection purposes, we extend AC-PCA with sparsity constraints and propose and implement an efficient algorithm. The methods developed in this paper can also be applied to more general settings. The R package and MATLAB source code are available at https://github.com/linzx06/AC-PCA.
Subjects
Brain/metabolism, High-Throughput Nucleotide Sequencing, Principal Component Analysis, RNA Sequence Analysis, Algorithms, Brain Mapping, Computer Simulation, Statistical Data Interpretation, Exons, Humans, Statistical Models, Software, Transcriptome
ABSTRACT
We are developing a novel intradural spinal cord (SC) stimulator designed to improve the treatment of intractable pain and the sequelae of SC injury. In vivo ovine models of neuropathic pain and moderate SC injury are being implemented for pre-clinical evaluations of this device, to be carried out via gait analysis before and after induction of the relevant condition. We extend previous studies on other quadrupeds to extract the three-dimensional kinematics of the limbs over the gait cycle of sheep walking on a treadmill. Quantitative measures of thoracic and pelvic limb movements were obtained from 17 animals. We calculated the total-error values to define the analytical performance of our motion capture system for these kinematic variables. The post- vs. pre-injury time delay between contralateral thoracic and pelvic-limb steps for normal and SC-injured sheep increased by ~24 s over 100 steps. The pelvic limb hoof velocity during swing phase decreased, while the range of pelvic hoof elevation and the distance between lateral pelvic hoof placements increased after SC injury. The kinematic measures in a single SC-injured sheep can be objectively defined as changed from the corresponding pre-injury values, implying utility of this method to assess new neuromodulation strategies for specific deficits exhibited by an individual.