ABSTRACT
Cell type-specific differential gene expression analyses based on single-cell transcriptome datasets are sensitive to the presence of cell-free mRNA in the droplets containing single cells. This so-called ambient RNA contamination may differ between samples obtained from patients and healthy controls. Current ambient RNA correction methods were not developed specifically for single-cell differential gene expression (sc-DGE) analyses and might therefore not sufficiently correct for ambient RNA-derived signals. Here, we show that ambient RNA levels are highly sample-specific. We found that without ambient RNA correction, sc-DGE analyses erroneously identify transcripts originating from ambient RNA as cell type-specific disease-associated genes. We therefore developed a computationally lean and intuitive correction method, Fast Correction for Ambient RNA (FastCAR), optimized for sc-DGE analysis of scRNA-Seq datasets generated by droplet-based methods, including the 10x Genomics Chromium platform. FastCAR uses the profile of transcripts observed in libraries that likely represent empty droplets to determine the level of ambient RNA in each individual sample, and then corrects the gene expression values for this ambient RNA. FastCAR can be applied as part of the data pre-processing and QC in sc-DGE workflows comparing scRNA-Seq data in a health versus disease experimental design. We compared FastCAR with two methods previously developed to remove ambient RNA, SoupX and CellBender. All three methods identified additional genes in sc-DGE analyses that were not identified in the absence of ambient RNA correction. However, we show that FastCAR performs better at correcting gene expression values attributed to ambient RNA, resulting in a lower frequency of false-positive observations. Moreover, the use of FastCAR in an sc-DGE workflow increases the cell-type specificity of sc-DGE analyses across disease conditions.
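As a rough illustration of the idea described in this abstract (an ambient profile estimated from likely-empty droplets, then subtracted per sample), a minimal Python sketch follows. It is not the FastCAR package itself: the function names, the UMI cutoff used to call a droplet empty, the detection-fraction threshold, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np

def estimate_ambient_profile(counts, empty_umi_cutoff=100, min_fraction=0.05):
    """Estimate a per-gene ambient count from likely-empty droplets.

    counts: genes x droplets matrix of UMI counts for ONE sample.
    Droplets with fewer than `empty_umi_cutoff` total UMIs are treated
    as empty (an assumed, user-chosen threshold).
    """
    counts = np.asarray(counts)
    total_umis = counts.sum(axis=0)
    empty = counts[:, total_umis < empty_umi_cutoff]
    if empty.shape[1] == 0:
        # No empty droplets found: nothing to correct for.
        return np.zeros(counts.shape[0], dtype=counts.dtype)
    # Fraction of empty droplets in which each gene is detected.
    frac_detected = (empty > 0).mean(axis=1)
    # Highest count of each gene seen in any empty droplet.
    max_in_empty = empty.max(axis=1)
    # Only genes that are widespread in empty droplets get a correction.
    return np.where(frac_detected > min_fraction, max_in_empty, 0)

def subtract_ambient(cell_counts, ambient_profile):
    """Subtract the ambient profile from every cell, flooring at zero."""
    corrected = np.asarray(cell_counts) - np.asarray(ambient_profile)[:, None]
    return np.clip(corrected, 0, None)

# Toy sample: 3 genes x 5 droplets; the last droplet is the only real cell.
toy_counts = np.array([
    [5, 1, 1, 0, 40],   # gene seen in most empty droplets
    [0, 0, 0, 0, 30],   # gene seen only in the cell
    [0, 0, 1, 0, 10],   # gene seen in a single empty droplet
])
profile = estimate_ambient_profile(toy_counts, empty_umi_cutoff=10, min_fraction=0.5)
corrected = subtract_ambient(toy_counts[:, [4]], profile)
```

Because the profile is estimated separately for each sample, the correction follows the sample-specific ambient RNA levels that the abstract reports; both thresholds would need to be chosen per dataset.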
Subjects
Gene Expression Profiling; RNA; Humans; RNA/metabolism; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Transcriptome; Research Design; Single-Cell Analysis/methods
ABSTRACT
The Human Cell Atlas (HCA) consortium aims to establish an atlas of all organs in the healthy human body at single-cell resolution to increase our understanding of the basic biological processes that govern development, physiology and anatomy, and to accelerate diagnosis and treatment of disease. The Lung Biological Network of the HCA aims to generate the Human Lung Cell Atlas as a reference for the cellular repertoire, molecular cell states and phenotypes, and cell-cell interactions that characterise normal homeostasis in healthy lung tissue. Such a reference atlas of the healthy human lung will facilitate mapping the changes in the cellular landscape in disease. The discovAIR project is one of six pilot actions for the HCA funded by the European Commission in the context of the H2020 framework programme. discovAIR aims to establish the first draft of an integrated Human Lung Cell Atlas, combining single-cell transcriptional and epigenetic profiling with spatially resolving techniques on matched tissue samples, and including a number of chronic and infectious diseases of the lung. The integrated Human Lung Cell Atlas will be available as a resource for the wider respiratory community, including basic and translational scientists, clinicians, and the private sector, as well as for patients with lung disease and the interested lay public. We anticipate that the Human Lung Cell Atlas will be the foundation for a more detailed understanding of the pathogenesis of lung diseases, guiding the design of novel diagnostics and preventive or curative interventions.
Subjects
Lung Diseases; Lung; Humans; Proteomics; Thorax
ABSTRACT
Childhood allergic diseases, including asthma, rhinitis and eczema, are prevalent conditions that share strong genetic and environmental components. Diagnosis relies on clinical history and measurements of allergen-specific IgE. We hypothesize that a multi-omics model could accurately diagnose childhood allergic disease. We show that nasal DNA methylation has the strongest predictive power to diagnose childhood allergy, surpassing blood DNA methylation, genetic risk scores, and environmental factors. DNA methylation at only three nasal CpG sites accurately classifies allergic disease in 16-year-old Dutch children, with an area under the curve (AUC) of 0.86. This is replicated in Puerto Rican children aged 9-20 years (AUC 0.82). DNA methylation at these CpGs additionally detects allergic multimorbidity and symptomatic IgE sensitization. In nasal single-cell RNA-sequencing data, these three CpGs are associated with an influx of T cells and macrophages that contribute to allergic inflammation. Our study suggests the potential of methylation-based allergy diagnosis.