Results 1 - 20 of 23
1.
Genes Dev ; 37(17-18): 818-828, 2023 09 01.
Article in English | MEDLINE | ID: mdl-37775182

ABSTRACT

Activating KRAS mutations (KRAS*) in pancreatic ductal adenocarcinoma (PDAC) drive anabolic metabolism and support tumor maintenance. KRAS* inhibitors show initial antitumor activity followed by recurrence due to cancer cell-intrinsic and immune-mediated paracrine mechanisms. Here, we explored the potential role of cancer-associated fibroblasts (CAFs) in enabling KRAS* bypass and identified CAF-derived NRG1 activation of cancer cell ERBB2 and ERBB3 receptor tyrosine kinases as a mechanism by which KRAS*-independent growth is supported. Genetic extinction or pharmacological inhibition of KRAS* resulted in up-regulation of ERBB2 and ERBB3 expression in human and murine models, which prompted cancer cell utilization of CAF-derived NRG1 as a survival factor. Genetic depletion or pharmacological inhibition of ERBB2/3 or NRG1 abolished KRAS* bypass and synergized with KRASG12D inhibitors in combination treatments in mouse and human PDAC models. Thus, we found that CAFs can contribute to KRAS* inhibitor therapy resistance via paracrine mechanisms, providing an actionable therapeutic strategy to improve the effectiveness of KRAS* inhibitors in PDAC patients.


Subjects
Pancreatic Ductal Carcinoma , Pancreatic Neoplasms , Humans , Animals , Mice , Proto-Oncogene Proteins p21(ras)/genetics , Proto-Oncogene Proteins p21(ras)/metabolism , Cell Proliferation , Pancreatic Neoplasms/metabolism , Pancreatic Ductal Carcinoma/genetics , Pancreatic Ductal Carcinoma/pathology , Neuregulin-1/genetics , Neuregulin-1/metabolism
2.
BMC Bioinformatics ; 23(1): 2, 2022 Jan 04.
Article in English | MEDLINE | ID: mdl-34983369

ABSTRACT

Cellular heterogeneity underlies cancer evolution and metastasis. Advances in single-cell technologies such as single-cell RNA sequencing and mass cytometry have enabled interrogation of cell type-specific expression profiles and abundance across heterogeneous cancer samples obtained from clinical trials and preclinical studies. However, challenges remain in determining sample sizes needed for ascertaining changes in cell type abundances in a controlled study. To address this statistical challenge, we have developed a new approach, named Sensei, to determine the number of samples and the number of cells that are required to ascertain such changes between two groups of samples in single-cell studies. Sensei expands the t-test and models the cell abundances using a beta-binomial distribution. We evaluate the mathematical accuracy of Sensei and provide practical guidelines on over 20 cell types in over 30 cancer types based on knowledge acquired from The Cancer Genome Atlas (TCGA) and prior single-cell studies. We provide a web application to enable user-friendly study design via https://kchen-lab.github.io/sensei/table_beta.html .


Subjects
Neoplasms , Software , Binomial Distribution , Humans , Neoplasms/genetics , Research Design , Sample Size
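
The sample-size idea in the abstract above (a t-test on cell-type abundances modeled with a beta-binomial distribution) can be illustrated with a minimal sketch. This is not the Sensei implementation: it assumes a normal approximation in place of the t distribution, equal group sizes, and a single overdispersion parameter `rho` shared by both groups. All function and parameter names are hypothetical.

```python
from statistics import NormalDist


def proportion_variance(mean, rho, n_cells):
    """Variance of the observed fraction X / n_cells when
    X ~ BetaBinomial(n_cells, a, b), with mean = a / (a + b)
    and overdispersion rho = 1 / (a + b + 1)."""
    return mean * (1 - mean) / n_cells * (1 + (n_cells - 1) * rho)


def approx_power(m, mean1, mean2, rho, n_cells, alpha=0.05):
    """Approximate two-sided power with m samples per group.
    Assumption: normal approximation, not the exact t distribution."""
    nd = NormalDist()
    total_var = (proportion_variance(mean1, rho, n_cells)
                 + proportion_variance(mean2, rho, n_cells))
    se = (total_var / m) ** 0.5
    return nd.cdf(abs(mean1 - mean2) / se - nd.inv_cdf(1 - alpha / 2))


def samples_per_group(mean1, mean2, rho, n_cells,
                      alpha=0.05, power=0.8, max_m=10_000):
    """Smallest number of samples per group reaching the target power,
    or None if max_m samples are not enough."""
    for m in range(2, max_m + 1):
        if approx_power(m, mean1, mean2, rho, n_cells, alpha) >= power:
            return m
    return None


# Hypothetical scenario: a cell type at 10% vs 15% abundance,
# 1000 cells per sample, overdispersion 0.05.
print(samples_per_group(0.10, 0.15, rho=0.05, n_cells=1000))
```

Under this model the between-sample term `rho` dominates once `n_cells` is large, so profiling more cells per sample helps far less than adding samples.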
3.
PLoS Comput Biol ; 15(10): e1007445, 2019 10.
Article in English | MEDLINE | ID: mdl-31658262

ABSTRACT

Single-cell RNA-sequencing data generated by a variety of technologies, such as Drop-seq and SMART-seq, can reveal simultaneously the mRNA transcript levels of thousands of genes in thousands of cells. It is often important to identify informative genes or cell-type-discriminative markers to reduce dimensionality and achieve informative cell typing results. We present SCMarker, an ab initio method that performs unsupervised marker selection by identifying genes that have subpopulation-discriminative expression levels and are co- or mutually-exclusively expressed with other genes. Consistent improvements in cell-type classification and biologically meaningful marker selection are achieved by applying SCMarker on various datasets in multiple tissue types, followed by a variety of clustering algorithms. The source code of SCMarker is publicly available at https://github.com/KChen-lab/SCMarker.


Subjects
Computational Biology/methods , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Algorithms , Base Sequence/genetics , Biomarkers , Cluster Analysis , Humans , RNA/genetics , RNA Sequence Analysis/methods , Software , Transcriptome/genetics
4.
Commun Biol ; 7(1): 326, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38486077

ABSTRACT

Clustering and visualization are essential parts of single-cell gene expression data analysis. The Euclidean distance used in most distance-based methods is not optimal. The batch effect, i.e., the variability among samples gathered from different times, tissues, and patients, introduces large between-group distance and obscures the true identities of cells. To solve this problem, we introduce Label-Aware Distance (LAD), a metric using temporal/spatial locality of the batch effect to control for such factors. We validate LAD on simulated data as well as apply it to a mouse retina development dataset and a lung dataset. We also found the utility of our approach in understanding the progression of the Coronavirus Disease 2019 (COVID-19). LAD provides better cell embedding than state-of-the-art batch correction methods on longitudinal datasets. It can be used in distance-based clustering and visualization methods to combine the power of multiple samples to help make biological findings.


Subjects
Cluster Analysis , Animals , Mice , Gene Expression
5.
Nat Commun ; 15(1): 109, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38168026

ABSTRACT

Host anti-viral factors are essential for controlling SARS-CoV-2 infection but remain largely unknown due to the biases of previous large-scale studies toward pro-viral host factors. To fill in this knowledge gap, we perform a genome-wide CRISPR dropout screen and integrate analyses of the multi-omics data of the CRISPR screen, genome-wide association studies, single-cell RNA-Seq, and host-virus proteins or protein/RNA interactome. This study uncovers many host factors that are currently underappreciated, including the components of V-ATPases, ESCRT, and N-glycosylation pathways that modulate viral entry and/or replication. The cohesin complex is also identified as an anti-viral pathway, suggesting an important role of three-dimensional chromatin organization in mediating host-viral interaction. Furthermore, we discover another anti-viral regulator KLF5, a transcriptional factor involved in sphingolipid metabolism, which is up-regulated, and harbors genetic variations linked to COVID-19 patients with severe symptoms. Anti-viral effects of three identified candidates (DAZAP2/VTA1/KLF5) are confirmed individually. Molecular characterization of DAZAP2/VTA1/KLF5-knockout cells highlights the involvement of genes related to the coagulation system in determining the severity of COVID-19. Together, our results provide further resources for understanding the host anti-viral network during SARS-CoV-2 infection and may help develop new countermeasure strategies.


Subjects
COVID-19 , Humans , SARS-CoV-2 , Genome-Wide Association Study , Multiomics , Antiviral Agents/pharmacology
6.
Res Sq ; 2023 Jul 26.
Article in English | MEDLINE | ID: mdl-37547002

ABSTRACT

Clustering and visualization are essential parts of single-cell gene expression data analysis. The Euclidean distance used in most distance-based methods is not optimal. The batch effect, i.e., the variability among samples gathered from different times, tissues, and patients, introduces large between-group distance and obscures the true identities of cells. To solve this problem, we introduce Batch-Corrected Distance (BCD), a metric using temporal/spatial locality of the batch effect to control for such factors. We validate BCD on simulated data and apply it to a mouse retina development dataset and a lung dataset. We also found the utility of our approach in understanding the progression of the Coronavirus Disease 2019 (COVID-19). BCD achieves more accurate clusters and better visualizations than state-of-the-art batch correction methods on longitudinal datasets. BCD can be directly integrated with most clustering and visualization methods to enable more scientific findings.

7.
bioRxiv ; 2023 Mar 13.
Article in English | MEDLINE | ID: mdl-36993682

ABSTRACT

Personalized immunotherapy holds the promise of revolutionizing cancer prevention and treatment. However, selecting HLA-bound peptide targets that are specific to patient tumors has been challenging due to a lack of patient-specific antigen presentation models. Here, we present epiNB, a white-box, positive-example-only, semi-supervised method based on a Naïve Bayes formulation, with information content-based feature selection, to achieve accurate modeling using Mass Spectrometry data eluted from mono-allelic cell lines and patient-derived cell lines. In addition to achieving state-of-the-art accuracy, epiNB yields novel insights into the structural properties, such as interactions of peptide positions, that appear important for modeling personalized, tumor-specific antigen presentation. epiNB uses substantially fewer parameters than neural networks, does not require hyperparameter tweaking and can efficiently train and run on our web portal (https://epinbweb.streamlit.app/) or a regular PC/laptop, making it easily applicable in translational settings.

8.
Commun Biol ; 6(1): 765, 2023 07 21.
Article in English | MEDLINE | ID: mdl-37479893

ABSTRACT

Acute myeloid leukemia (AML) is a heterogeneous disease characterized by a high rate of therapy resistance. Since the cell of origin can impact response to therapy, it is crucial to understand the lineage composition of AML cells at the time of therapy resistance. Here we leverage single-cell chromatin accessibility profiling of 22 AML bone marrow aspirates from eight patients at the time of therapy resistance and following subsequent therapy to characterize their lineage landscape. Our findings reveal a complex lineage architecture of therapy-resistant AML cells that are primed for stem and progenitor lineages and span quiescent, activated and late stem cell/progenitor states. Remarkably, therapy-resistant AML cells are also composed of cells primed for differentiated myeloid, erythroid and even lymphoid lineages. The heterogeneous lineage composition persists following subsequent therapy, with early progenitor-driven features marking unfavorable prognosis in The Cancer Genome Atlas AML cohort. Pseudotime analysis further confirms the vast degree of heterogeneity driven by the dynamic changes in chromatin accessibility. Our findings suggest that therapy-resistant AML cells are characterized not only by stem and progenitor states, but also by a continuum of differentiated cellular lineages. The heterogeneity in lineages likely contributes to their therapy resistance by harboring different degrees of lineage-specific susceptibilities to therapy.


Subjects
Chromatin , Acute Myeloid Leukemia , Humans , Chromatin/genetics , Acute Myeloid Leukemia/drug therapy , Acute Myeloid Leukemia/genetics , Cell Differentiation , Cell Division , Cell Lineage/genetics
9.
Sci Adv ; 9(30): eadd6997, 2023 07 28.
Article in English | MEDLINE | ID: mdl-37494448

ABSTRACT

Chimeric antigen receptor (CAR) engineering of natural killer (NK) cells is promising, with early-phase clinical studies showing encouraging responses. However, the transcriptional signatures that control the fate of CAR-NK cells after infusion and factors that influence tumor control remain poorly understood. We performed single-cell RNA sequencing and mass cytometry to study the heterogeneity of CAR-NK cells and their in vivo evolution after adoptive transfer, from the phase of tumor control to relapse. Using a preclinical model of noncurative lymphoma and samples from a responder and a nonresponder patient treated with CAR19/IL-15 NK cells, we observed the emergence of NK cell clusters with distinct patterns of activation, function, and metabolic signature associated with different phases of in vivo evolution and tumor control. Interaction with the highly metabolically active tumor resulted in loss of metabolic fitness in NK cells that could be partly overcome by incorporation of IL-15 in the CAR construct.


Subjects
Chimeric Antigen Receptors , Humans , Chimeric Antigen Receptors/genetics , Chimeric Antigen Receptors/metabolism , Interleukin-15/genetics , Interleukin-15/metabolism , Cytokines/metabolism , Tumor Cell Line , Natural Killer Cells , Cell- and Tissue-Based Therapy
10.
JCI Insight ; 7(7)2022 04 08.
Article in English | MEDLINE | ID: mdl-35230977

ABSTRACT

SARS-CoV-2 vaccines are the most effective approach for mitigating the COVID-19 pandemic. The high efficacy of SARS-CoV-2 vaccines in clinical trials indicates that vaccination invariably induces an adaptive immune response. However, the emergence of breakthrough infections in vaccinated individuals suggests that the breadth and magnitude of the vaccine-induced adaptive immune response may vary. We assessed vaccine-induced SARS-CoV-2 T cell response in 21 vaccinated individuals and found that SARS-CoV-2-specific T cells, which were mainly CD4+ T cells, were detected in all individuals but the response varied. We then investigated differentiation states and cytokine profiles to identify immune features associated with superior recall function and longevity. We found that SARS-CoV-2-specific CD4+ T cells were polyfunctional and produced high levels of IL-2, which could be associated with superior longevity. Based on the breadth and magnitude of vaccine-induced SARS-CoV-2 response, we identified 2 distinct response groups: individuals with high abundance versus low abundance of SARS-CoV-2-specific T cells. The fractions of TNF-α- and IL-2-producing SARS-CoV-2 T cells were the main determinants distinguishing high versus low responders. Last, we found that the majority of vaccine-induced SARS-CoV-2 T cells were reactive against non-mutated regions of mutant S-protein, suggesting that vaccine-induced SARS-CoV-2 T cells could provide continued protection against emerging variants of concern.


Subjects
COVID-19 Vaccines , COVID-19 , T-Lymphocytes , COVID-19/immunology , COVID-19/prevention & control , COVID-19 Vaccines/immunology , Humans , Cellular Immunity , Interleukin-2 , Pandemics , SARS-CoV-2 , T-Lymphocytes/virology
11.
Res Sq ; 2022 Aug 15.
Article in English | MEDLINE | ID: mdl-36032971

ABSTRACT

Host anti-viral factors are essential for controlling SARS-CoV-2 infection but remain largely unknown due to the biases of previous large-scale studies toward pro-viral host factors. To fill in this knowledge gap, we performed a genome-wide CRISPR dropout screen and integrated analyses of the multi-omics data of the CRISPR screen, genome-wide association studies, single-cell RNA-seq, and host-virus proteins or protein/RNA interactome. This study has uncovered many host factors that were missed by previous studies, including the components of V-ATPases, ESCRT, and N-glycosylation pathways that modulated viral entry and/or replication. The cohesin complex was also identified as a novel anti-viral pathway, suggesting an important role of three-dimensional chromatin organization in mediating host-viral interaction. Furthermore, we discovered an anti-viral regulator KLF5, a transcriptional factor involved in sphingolipid metabolism, which was up-regulated and harbored genetic variations linked to the COVID-19 patients with severe symptoms. Our results provide a resource for understanding the host anti-viral network during SARS-CoV-2 infection and may help develop new countermeasure strategies.

12.
Genome Biol ; 23(1): 112, 2022 05 09.
Article in English | MEDLINE | ID: mdl-35534898

ABSTRACT

Integration of single-cell multiomics profiles generated by different single-cell technologies from the same biological sample is still challenging. Previous approaches based on shared features have only provided approximate solutions. Here, we present a novel mathematical solution named bi-order canonical correlation analysis (bi-CCA), which extends the widely used CCA approach to iteratively align the rows and the columns between data matrices. Bi-CCA is generally applicable to combinations of any two single-cell modalities. Validations using co-assayed ground truth data and application to a CAR-NK study and a fetal muscle atlas demonstrate its capability in generating accurate multimodal co-embeddings and discovering cellular identity.

13.
NAR Cancer ; 4(4): zcac038, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36518525

ABSTRACT

Genetic screens are widely exploited to develop novel therapeutic approaches for cancer treatment. With recent advances in single-cell technology, single-cell CRISPR screen (scCRISPR) platforms provide opportunities for target validation and mechanistic studies in a high-throughput manner. Here, we aim to establish scCRISPR platforms which are suitable for immune-related screens involving multiple cell types. We integrated two scCRISPR platforms, namely Perturb-seq and CROP-seq, with both in vitro and in vivo immune screens. By leveraging previously generated resources, we optimized experimental conditions and data analysis pipelines to achieve better consistency between results from high-throughput and individual validations. Furthermore, we evaluated the performance of scCRISPR immune screens in determining underlying mechanisms of tumor intrinsic immune regulation. Our results showed that scCRISPR platforms can simultaneously characterize gene expression profiles and perturbation effects present in individual cells in different immune screen conditions. Results from scCRISPR immune screens also predict transcriptional phenotype associated with clinical responses to cancer immunotherapy. More importantly, scCRISPR screen platforms reveal the interactive relationship between targeting tumor intrinsic factors and T cell-mediated antitumor immune response which cannot be easily assessed by bulk RNA-seq. Collectively, scCRISPR immune screens provide scalable and reliable platforms to elucidate molecular determinants of tumor immune resistance.

14.
Nat Commun ; 13(1): 3652, 2022 06 25.
Article in English | MEDLINE | ID: mdl-35752636

ABSTRACT

Heterogeneity is a hallmark of cancer. The advent of single-cell technologies has helped uncover heterogeneity in a high-throughput manner in different cancers across varied contexts. Here we apply single-cell sequencing technologies to reveal inherent heterogeneity in assumptively monoclonal pancreatic cancer (PDAC) cell lines and patient-derived organoids (PDOs). Our findings reveal a high degree of both genomic and transcriptomic polyclonality in monolayer PDAC cell lines, custodial variation induced by growing apparently identical cell lines in different laboratories, and transcriptomic shifts in transitioning from 2D to 3D spheroid growth models. Our findings also call into question the validity of widely available immortalized, non-transformed pancreatic lines as contemporaneous "control" lines in experiments. We confirm these findings using a variety of independent assays, including but not limited to whole exome sequencing, single-cell copy number variation sequencing (scCNVseq), single-nuclei assay for transposase-accessible chromatin with sequencing, fluorescence in-situ hybridization, and single-cell RNA sequencing (scRNAseq). We map scRNA expression data to unique genomic clones identified by orthogonally-gathered scCNVseq data of these same PDAC cell lines. Further, while PDOs are known to reflect the cognate in vivo biology of the parental tumor, we identify transcriptomic shifts during ex vivo passage that might hamper their predictive abilities over time. The impact of these findings on rigor and reproducibility of experimental data generated using established preclinical PDAC models between and across laboratories is uncertain, but a matter of concern.


Subjects
Pancreatic Ductal Carcinoma , Pancreatic Neoplasms , Pancreatic Ductal Carcinoma/pathology , Tumor Cell Line , DNA Copy Number Variations/genetics , Humans , Pancreas/metabolism , Pancreatic Neoplasms/pathology , Reproducibility of Results
15.
IEEE/ACM Trans Comput Biol Bioinform ; 18(6): 2072-2079, 2021.
Article in English | MEDLINE | ID: mdl-34232885

ABSTRACT

Analyzing single-cell sequencing data from large cohorts is challenging. Discrepancies across experiments and differences among participants often lead to omissions and false discoveries in differentially expressed genes. We find that the Van Elteren test, a stratified version of the widely used Wilcoxon rank-sum test, elegantly mitigates the problem. We also modified the common language effect size to supplement this test, further improving its utility. On both simulated and real patient data we show the ability of the Van Elteren test to control for false positives and false negatives. A comprehensive assessment using receiver operating characteristic (ROC) curves shows that the Van Elteren test achieves higher sensitivity and specificity on simulated datasets, compared with nine state-of-the-art differential expression analysis methods. The effect size also estimates the differences between cell types more accurately.


Subjects
Computational Biology/methods , RNA-Seq/methods , Single-Cell Analysis/methods , Humans , Neoplasms/genetics , Neoplasms/metabolism , ROC Curve , Retina/cytology , Retina/metabolism , Nonparametric Statistics , Transcriptome/genetics
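
For readers unfamiliar with the Van Elteren test named in the abstract above, the following is a minimal textbook-style sketch, not the authors' code. It assumes each stratum (for example, one patient or batch) supplies the expression values of one gene in two cell groups, uses the standard weights 1 / (N_k + 1) with a tie-corrected variance, and relies on a normal approximation for the p-value. The modified common language effect size from the paper is not reproduced. All names are hypothetical.

```python
from collections import Counter
from statistics import NormalDist


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def van_elteren(strata):
    """strata: list of (x, y) pairs, one pair of samples per stratum.
    Returns (z, two-sided p). Positive z means x tends to exceed y."""
    num, var = 0.0, 0.0
    for x, y in strata:
        n, m = len(x), len(y)
        total = n + m
        pooled = list(x) + list(y)
        ranks = midranks(pooled)
        w = 1.0 / (total + 1)                      # stratum weight
        rank_sum = sum(ranks[:n])                  # Wilcoxon rank sum of x
        ties = sum(t ** 3 - t for t in Counter(pooled).values())
        v = n * m / 12.0 * ((total + 1) - ties / (total * (total - 1)))
        num += w * (rank_sum - n * (total + 1) / 2)
        var += w * w * v
    z = num / var ** 0.5
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


# Two hypothetical batches with very different baselines; within each
# batch, group x is shifted upward relative to group y.
batches = [
    ([5.1, 6.2, 7.3, 8.0], [1.0, 2.1, 3.3, 4.2]),
    ([15, 16, 17, 18, 19], [10, 11, 12, 13, 14]),
]
z, p = van_elteren(batches)
print(z, p)
```

Because ranks are computed within each stratum, the large baseline difference between the two batches never enters the comparison, which is the property the abstract relies on.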
16.
Sci Rep ; 11(1): 12388, 2021 06 11.
Article in English | MEDLINE | ID: mdl-34117319

ABSTRACT

Sample barcoding is essential in mass cytometry analysis, since it can eliminate potential procedural variations, enhance throughput, and allow simultaneous sample processing and acquisition. Sample pooling after prior surface staining termed live-cell barcoding is more desirable than intracellular barcoding, where samples are pooled after fixation and permeabilization, since it does not depend on fixation-sensitive antigenic epitopes. In live-cell barcoding, the general approach uses two tags per sample out of a pool of antibodies paired with five palladium (Pd) isotopes in order to preserve appreciable signal-to-noise ratios and achieve higher yields after sample deconvolution. The number of samples that can be pooled in an experiment using live-cell barcoding is limited, due to weak signal intensities associated with Pd isotopes and the relatively low number of available tags. Here, we describe a novel barcoding technique utilizing 10 different tags, seven cadmium (Cd) tags and three Pd tags, with superior signal intensities that do not impinge on lanthanide detection, which enables enhanced pooling of samples with multiple experimental conditions and markedly enhances sample throughput.


Subjects
Cell Separation/methods , Mononuclear Leukocytes/cytology , Mass Spectrometry/methods , Cultured Cells , Humans , Immunoassay/methods , Mononuclear Leukocytes/classification , Single-Cell Analysis/methods
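
The pooling limit described in the abstract above follows from simple combinatorics: if every sample carries exactly k of n available tags, at most C(n, k) samples can be told apart. The sketch below is an illustration under that assumption only. The abstract does not state how many of the 10 tags are used per sample, and the tag names, the top-k debarcoding rule, and the intensities are all hypothetical.

```python
from itertools import combinations
from math import comb


def barcode_scheme(tags, k):
    """Map every k-tag combination to a sample index."""
    return {frozenset(c): i for i, c in enumerate(combinations(tags, k))}


def debarcode(intensities, scheme, k):
    """Assign a cell to the sample whose barcode matches its k brightest
    tags. Returns None if that combination is not in the scheme."""
    top = sorted(intensities, key=intensities.get, reverse=True)[:k]
    return scheme.get(frozenset(top))


# Capacity of a 2-of-5 scheme versus k-of-10 schemes.
print(comb(5, 2), comb(10, 2), comb(10, 3))

# Hypothetical tag names standing in for seven Cd and three Pd tags.
tags = [f"Cd{i}" for i in range(1, 8)] + [f"Pd{i}" for i in range(1, 4)]
scheme = barcode_scheme(tags, 3)

# One simulated cell: three bright tags, the rest at background level.
cell = {t: 1.0 for t in tags}
cell.update({"Cd2": 250.0, "Cd5": 310.0, "Pd1": 190.0})
print(debarcode(cell, scheme, 3))
```

A real debarcoding step would also reject cells whose k-th and (k+1)-th brightest tags are too close in intensity; that filter is omitted here for brevity.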
17.
Nat Comput Sci ; 1(5): 374-384, 2021 May.
Article in English | MEDLINE | ID: mdl-36969355

ABSTRACT

A key challenge in studying organisms and diseases is to detect rare molecular programs and rare cell populations (RCPs) that drive development, differentiation, and transformation. Molecular features such as genes and proteins defining RCPs are often unknown and difficult to detect from unenriched single-cell data, using conventional dimensionality reduction and clustering-based approaches. Here, we propose an unsupervised approach, SCMER (Single-Cell Manifold presERving feature selection), which selects a compact set of molecular features with definitive meanings that preserve the manifold of the data. We applied SCMER in the context of hematopoiesis, lymphogenesis, tumorigenesis, and drug resistance and response. We found that SCMER can identify non-redundant features that sensitively delineate both common cell lineages and rare cellular states. SCMER can be used for discovering molecular features in a high dimensional dataset, designing targeted, cost-effective assays for clinical applications, and facilitating multi-modality integration.

18.
Genome Biol ; 22(1): 70, 2021 02 23.
Article in English | MEDLINE | ID: mdl-33622385

ABSTRACT

We present a Minimal Event Distance Aneuploidy Lineage Tree (MEDALT) algorithm that infers the evolution history of a cell population based on single-cell copy number (SCCN) profiles, and a statistical routine named lineage speciation analysis (LSA), which facilitates discovery of fitness-associated alterations and genes from SCCN lineage trees. MEDALT appears more accurate than phylogenetics approaches in reconstructing copy number lineage. From data from 20 triple-negative breast cancer patients, our approaches effectively prioritize genes that are essential for breast cancer cell fitness and predict patient survival, including those implicating convergent evolution. The source code of our study is available at https://github.com/KChen-lab/MEDALT .


Subjects
Aneuploidy , Computational Biology/methods , Gene Dosage , RNA-Seq , Single-Cell Analysis , Software , Algorithms , Molecular Evolution , Genetic Association Studies , Genetic Fitness , Genetic Predisposition to Disease , High-Throughput Nucleotide Sequencing , Humans , RNA-Seq/methods , Single-Cell Analysis/methods
19.
Front Oncol ; 11: 705627, 2021.
Article in English | MEDLINE | ID: mdl-34422660

ABSTRACT

Acute myeloid leukemia (AML) is a heterogeneous disease with variable responses to therapy. Cytogenetic and genomic features are used to classify AML patients into prognostic and treatment groups. However, these molecular characteristics harbor significant patient-to-patient variability and do not fully account for AML heterogeneity. RNA-based classifications have also been applied in AML as an alternative approach, but transcriptomic grouping is strongly associated with AML morphologic lineages. We used a training cohort of newly diagnosed AML patients and conducted unsupervised RNA-based classification after excluding lineage-associated genes. We identified three AML patient groups that have distinct biological pathways associated with outcomes. Enrichment of inflammatory pathways and downregulation of HOX pathways were associated with improved outcomes, and this was validated in 2 independent cohorts. We also identified a group of AML patients who harbored high metabolic and mTOR pathway activity, and this was associated with worse clinical outcomes. Using a comprehensive reverse phase protein array, we identified higher mTOR protein expression in the highly metabolic group. We also identified a positive correlation between degree of resistance to venetoclax and mTOR activation in myeloid and lymphoid cell lines. Our approach of integrating RNA, protein, and genomic data uncovered lineage-independent AML patient groups that share biologic mechanisms and can inform outcomes independent of commonly used clinical and demographic variables; these groups could be used to guide therapeutic strategies.

20.
bioRxiv ; 2020 Oct 09.
Article in English | MEDLINE | ID: mdl-33052339

ABSTRACT

Clustering and visualization are essential parts of single-cell gene expression data analysis. The Euclidean distance used in most distance-based methods is not optimal. The batch effect, i.e., the variability among samples gathered from different times, tissues, and patients, introduces large between-group distance and obscures the true identities of cells. To solve this problem, we introduce Batch-Corrected Distance (BCD), a metric using temporal/spatial locality of the batch effect to control for such factors. We validate BCD on simulated data and apply it to a mouse retina development dataset and a lung dataset. We also found the utility of our approach in understanding the progression of the Coronavirus Disease 2019 (COVID-19). BCD achieves more accurate clusters and better visualizations than state-of-the-art batch correction methods on longitudinal datasets. BCD can be directly integrated with most clustering and visualization methods to enable more scientific findings.
