ABSTRACT
BACKGROUND: CRISPR-Cas9 dropout screens are formidable tools for investigating biology with unprecedented precision and scale. However, biases in data lead to potential confounding effects on interpretation and compromise overall quality. The activity of Cas9 is influenced by structural features of the target site, including copy number amplifications (CN bias). More worryingly, proximal targeted loci tend to generate similar gene-independent responses to CRISPR-Cas9 targeting (proximity bias), possibly due to Cas9-induced whole chromosome-arm truncations or other genomic structural features and different chromatin accessibility levels. RESULTS: We benchmarked eight computational methods, rigorously evaluating their ability to reduce both CN and proximity bias in the two largest publicly available cell-line-based CRISPR-Cas9 screens to date. We also evaluated the capability of each method to preserve data quality and heterogeneity by assessing the extent to which the processed data allows accurate detection of true positive essential genes, established oncogenetic addictions, and known/novel biomarkers of cancer dependency. Our analysis sheds light on the ability of each method to correct biases under different scenarios. AC-Chronos outperforms other methods in correcting both CN and proximity biases when jointly processing multiple screens of models with available CN information, whereas CRISPRcleanR is the top performing method for individual screens or when CN information is not available. In addition, Chronos and AC-Chronos yield a final dataset better able to recapitulate known sets of essential and non-essential genes. CONCLUSIONS: Overall, our investigation provides guidance for the selection of the most appropriate bias-correction method, based on its strengths, weaknesses and experimental settings.
Subjects
Benchmarking, CRISPR-Cas Systems, Humans, Computational Biology/methods, Bias
ABSTRACT
Reducing disparities is vital for equitable access to precision treatments in cancer. Socioenvironmental factors are a major driver of disparities, but differences in genetic variation likely also contribute. The impact of genetic ancestry on prioritization of cancer targets in drug discovery pipelines has not been systematically explored due to the absence of pre-clinical data at the appropriate scale. Here, we analyze data from 611 genome-scale CRISPR/Cas9 viability experiments in human cell line models to identify ancestry-associated genetic dependencies essential for cell survival. Surprisingly, we find that most putative associations between ancestry and dependency arise from artifacts related to germline variants. Our analysis suggests that for 1.2-2.5% of guides, germline variants in sgRNA targeting sequences reduce cutting by the CRISPR/Cas9 nuclease, disproportionately affecting cell models derived from individuals of recent African descent. We propose three approaches to mitigate this experimental bias, enabling the scientific community to address these disparities.
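The artifact described above, germline variants falling inside sgRNA targeting sequences, can be screened for with a simple interval check. The following is a minimal sketch under stated assumptions, not the procedure used in the study: the `Guide` record, the `guides_with_variants` helper, the 0-based half-open coordinates, and all example data are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guide:
    """Hypothetical sgRNA record (illustrative, not a real library format)."""
    name: str
    chrom: str
    start: int  # 0-based, inclusive start of the targeting sequence
    end: int    # 0-based, exclusive end of the targeting sequence


def guides_with_variants(guides, variants):
    """Return the names of guides whose target interval contains a variant.

    `variants` is an iterable of (chrom, pos) pairs with 0-based positions.
    """
    # Group variant positions by chromosome for quick lookup.
    by_chrom = {}
    for chrom, pos in variants:
        by_chrom.setdefault(chrom, set()).add(pos)

    flagged = []
    for g in guides:
        positions = by_chrom.get(g.chrom, ())
        # A guide is flagged if any variant lies in [start, end).
        if any(g.start <= p < g.end for p in positions):
            flagged.append(g.name)
    return flagged


# Made-up example data.
guides = [
    Guide("sgA", "chr1", 100, 120),
    Guide("sgB", "chr1", 200, 220),
    Guide("sgC", "chr2", 100, 120),
]
variants = [("chr1", 119), ("chr1", 220), ("chr3", 105)]
flagged = guides_with_variants(guides, variants)
```

Here only `sgA` is flagged: position 119 lies inside its interval, whereas position 220 falls just past the exclusive end of `sgB`. Flagged guides could then be excluded or handled separately for the affected cell models, in the spirit of the mitigation approaches the abstract mentions.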
Subjects
CRISPR-Cas Systems, Germ-Line Mutation, Humans, Gene Editing/methods, CRISPR-Cas Guide RNA/genetics, Germ Cells/metabolism, Genetic Variation, Neoplasms/genetics, False Negative Reactions, Human Genome, Tumor Cell Line, Cell Line
ABSTRACT
CRISPR loss-of-function screens are powerful tools to interrogate biology but exhibit a number of biases and artifacts that can confound the results. Here, we introduce Chronos, an algorithm for inferring gene knockout fitness effects based on an explicit model of cell proliferation dynamics after CRISPR gene knockout. We test Chronos on two pan-cancer CRISPR datasets and one longitudinal CRISPR screen. Chronos generally outperforms competitors in separation of controls and strength of biomarker associations, particularly when longitudinal data are available. Additionally, Chronos exhibits the lowest copy number and screen quality bias of evaluated methods. Chronos is available at https://github.com/broadinstitute/chronos.
Subjects
CRISPR-Cas Systems, Computational Biology, Genome, Population Dynamics, Algorithms, Tumor Biomarkers/genetics, Clustered Regularly Interspaced Short Palindromic Repeats, Gene Knockout Techniques, Gene Library, Humans, Neoplasms/genetics
ABSTRACT
CRISPR-Cas9 viability screens are increasingly performed at a genome-wide scale across large panels of cell lines to identify new therapeutic targets for precision cancer therapy. Integrating the datasets resulting from these studies is necessary to adequately represent the heterogeneity of human cancers and to assemble a comprehensive map of cancer genetic vulnerabilities. Here, we integrated the two largest public independent CRISPR-Cas9 screens performed to date (at the Broad and Sanger institutes) by assessing, comparing, and selecting methods for correcting biases due to heterogeneous single-guide RNA efficiency, gene-independent responses to CRISPR-Cas9 targeting originating from copy number alterations, and experimental batch effects. Our integrated datasets recapitulate findings from the individual datasets, provide greater statistical power for cancer- and subtype-specific analyses, unveil additional biomarkers of gene dependency, and improve the detection of common essential genes. We provide the largest integrated resources of CRISPR-Cas9 screens to date and the basis for harmonizing existing and future functional genetics datasets.