Results 1 - 20 of 24
1.
Brief Bioinform ; 25(6)2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39356327

ABSTRACT

Single-cell cross-modal joint clustering has been extensively utilized to investigate the tumor microenvironment. Although numerous approaches have been suggested, accurate clustering remains the main challenge. First, the gene expression matrix frequently contains numerous missing values due to measurement limitations. The majority of existing clustering methods treat it as a typical multi-modal dataset without further processing. Few methods conduct recovery before clustering and do not sufficiently engage with the underlying research, leading to suboptimal outcomes. Additionally, the existing cross-modal information fusion strategy does not ensure consistency of representations across different modes, potentially leading to the integration of conflicting information, which could degrade performance. To address these challenges, we propose the 'Recover then Aggregate' strategy and introduce the Unified Cross-Modal Deep Clustering model. Specifically, we have developed a data augmentation technique based on neighborhood similarity, iteratively imposing rank constraints on the Laplacian matrix, thus updating the similarity matrix and recovering dropout events. Concurrently, we integrate cross-modal features and employ contrastive learning to align modality-specific representations with consistent ones, enhancing the effective integration of diverse modal information. Comprehensive experiments on five real-world multi-modal datasets have demonstrated this method's superior effectiveness in single-cell clustering tasks.


Subject(s)
Single-Cell Analysis , Cluster Analysis , Single-Cell Analysis/methods , Humans , Algorithms , Tumor Microenvironment , Computational Biology/methods
2.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38493338

ABSTRACT

In recent years, there has been a growing trend toward parallel clustering analysis of single-cell RNA-seq (scRNA) and single-cell Assay for Transposase-Accessible Chromatin (scATAC) data. However, prevailing methods often treat these two data modalities as equals, neglecting the fact that the scRNA modality holds significantly richer information than the scATAC modality. This disregard prevents the model from benefiting from the insights derived from multiple modalities, compromising overall clustering performance. To this end, we propose scEMC, an effective multi-modal clustering model for parallel scRNA and scATAC data. Concretely, we have devised a skip aggregation network to simultaneously learn global structural information among cells and integrate data from diverse modalities. To safeguard the quality of the integrated cell representation against the influence of sparse scATAC data, we connect the scRNA data with the aggregated representation via a skip connection. Moreover, to effectively fit the real distribution of cells, we introduce a Zero-Inflated Negative Binomial-based denoising autoencoder that accommodates corrupted data containing synthetic noise, together with a joint optimization module that employs multiple losses. Extensive experiments underscore the effectiveness of our model. This work contributes significantly to the ongoing exploration of cell subpopulations and tumor microenvironments, and the code of our work will be public at https://github.com/DayuHuu/scEMC.


Subject(s)
Chromatin , RNA, Small Cytoplasmic , Single-Cell Gene Expression Analysis , Cluster Analysis , Learning , RNA, Small Cytoplasmic/genetics , Transposases , Sequence Analysis, RNA , Gene Expression Profiling
3.
Mol Cell ; 72(6): 1035-1049.e5, 2018 12 20.
Article in English | MEDLINE | ID: mdl-30503769

ABSTRACT

Membrane-less organelles (MLOs) are liquid-like subcellular compartments that form through phase separation of proteins and RNA. While their biophysical properties are increasingly understood, their regulation and the consequences of perturbed MLO states for cell physiology are less clear. To study the regulatory networks, we targeted 1,354 human genes and screened for morphological changes of nucleoli, Cajal bodies, splicing speckles, PML nuclear bodies (PML-NBs), cytoplasmic processing bodies, and stress granules. By multivariate analysis of MLO features we identified hundreds of genes that control MLO homeostasis. We discovered regulatory crosstalk between MLOs, and mapped hierarchical interactions between aberrant MLO states and cellular properties. We provide evidence that perturbation of pre-mRNA splicing results in stress granule formation and reveal that PML-NB abundance influences DNA replication rates and that PML-NBs are in turn controlled by HIP kinases. Together, our comprehensive dataset is an unprecedented resource for deciphering the regulation and biological functions of MLOs.


Subject(s)
Organelles/genetics , Stress, Physiological/genetics , Systems Biology/methods , Transcriptome , DNA Replication , Gene Expression Profiling , Gene Expression Regulation , Gene Regulatory Networks , HeLa Cells , Humans , Organelles/metabolism , Phase Transition , RNA Interference , RNA Precursors/genetics , RNA, Messenger/genetics , Signal Transduction/genetics , Single-Cell Analysis
4.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37935617

ABSTRACT

Single-cell clustering is a critical step in biological downstream analysis. The clustering performance could be effectively improved by extracting cell-type-specific genes. The state-of-the-art feature selection methods usually calculate the importance of a single gene without considering the information contained in the gene expression distribution. Moreover, these methods ignore the intrinsic expression patterns of genes and heterogeneity within groups of different mean expression levels. In this work, we present a Feature sElection method based on gene Expression Decomposition (FEED) of scRNA-seq data, which selects informative genes to enhance clustering performance. First, the expression levels of genes are decomposed into multiple Gaussian components. Then, a novel gene correlation calculation method is proposed to measure the relationship between genes from the perspective of distribution. Finally, a permutation-based approach is proposed to determine the threshold of gene importance to obtain marker gene subsets. Compared with state-of-the-art feature selection methods, applying FEED on various scRNA-seq datasets including large datasets followed by different common clustering algorithms results in significant improvements in the accuracy of cell-type identification. The source codes for FEED are freely available at https://github.com/genemine/FEED.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Algorithms , Cluster Analysis , Gene Expression
5.
BMC Bioinformatics ; 25(Suppl 2): 292, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39237886

ABSTRACT

BACKGROUND: With the advance in single-cell RNA sequencing (scRNA-seq) technology, deriving inherent biological system information from expression profiles at a single-cell resolution has become possible. It has been known that network modeling by estimating the associations between genes could better reveal dynamic changes in biological systems. However, accurately constructing a single-cell network (SCN) to capture the network architecture of each cell and further explore cell-to-cell heterogeneity remains challenging. RESULTS: We introduce SINUM, a method for constructing the SIngle-cell Network Using Mutual information, which estimates mutual information between any two genes from scRNA-seq data to determine whether they are dependent or independent in a specific cell. Experiments on various scRNA-seq datasets with different cell numbers based on eight performance indexes (e.g., adjusted rand index and F-measure index) validated the accuracy and robustness of SINUM in cell type identification, superior to the state-of-the-art SCN inference method. Additionally, the SINUM SCNs exhibit high overlap with the human interactome and possess the scale-free property. CONCLUSIONS: SINUM presents a view of biological systems at the network level to detect cell-type marker genes/gene pairs and investigate time-dependent changes in gene associations during embryo development. Codes for SINUM are freely available at https://github.com/SysMednet/SINUM .


Subject(s)
Single-Cell Analysis , Single-Cell Analysis/methods , Humans , Sequence Analysis, RNA/methods , Gene Regulatory Networks , RNA-Seq/methods , Algorithms , Gene Expression Profiling/methods , Single-Cell Gene Expression Analysis
6.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36151725

ABSTRACT

Accurately identifying cell populations is paramount to the quality of downstream analyses and overall interpretations of single-cell RNA-seq (scRNA-seq) datasets but remains a challenge. The quality of single-cell clustering depends on the proximity metric used to generate cell-to-cell distances. Accordingly, proximity metrics have been benchmarked for scRNA-seq clustering, typically with results averaged across datasets to identify the highest-performing metric. However, the 'best-performing' metric varies between studies, with the performance differing significantly between datasets. This suggests that the unique structural properties of an scRNA-seq dataset, specific to the biological system under study, have a substantial impact on proximity metric performance. Previous benchmarking studies have not factored these structural properties into their evaluations. To address this gap, we developed a framework for the in-depth evaluation of the performance of 17 proximity metrics with respect to core structural properties of scRNA-seq data, including sparsity, dimensionality, cell-population distribution and rarity. We find that clustering performance can be improved substantially by the selection of an appropriate proximity metric and neighbourhood size for the structural properties of a dataset, in addition to performing suitable pre-processing and dimensionality reduction. Furthermore, popular metrics such as Euclidean and Manhattan distance performed poorly in comparison to several less commonly applied metrics, suggesting that the default metric for many scRNA-seq methods should be re-evaluated. Our findings highlight the critical nature of tailoring scRNA-seq analysis pipelines to the dataset under study and provide practical guidance for researchers looking to optimize cell-similarity search for the structural properties of their own data.


Subject(s)
Benchmarking , Single-Cell Analysis , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , RNA-Seq , Cluster Analysis , Algorithms
7.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34553226

ABSTRACT

The development of single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) technology has led to great opportunities for the identification of heterogeneous cell types in complex tissues. Clustering algorithms are of great importance to effectively identify different cell types. In addition, the definition of the distance between each two cells is a critical step for most clustering algorithms. In this study, we found that different distance measures have considerably different effects on clustering algorithms. Moreover, there is no specific distance measure that is applicable to all datasets. In this study, we introduce a new single-cell clustering method called SD-h, which generates an applicable distance measure for different kinds of datasets by optimally synthesizing commonly used distance measures. Then, hierarchical clustering is performed based on the new distance measure for more accurate cell-type clustering. SD-h was tested on nine frequently used scRNA-seq datasets and it showed great superiority over almost all the compared leading single-cell clustering algorithms.


Subject(s)
Algorithms , RNA , Cluster Analysis , Consensus , Sequence Analysis, RNA/methods
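
The idea of synthesizing several distance measures before hierarchical clustering can be sketched as follows. This is a minimal illustration and not the SD-h implementation: the choice of Euclidean and Pearson distances, the fixed weights, and every function name are assumptions made for the example, whereas SD-h is described as choosing the combination optimally.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_distance(a, b):
    """1 - Pearson correlation; returns 1.0 if either vector is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0
    return 1.0 - cov / math.sqrt(var_a * var_b)


def distance_matrix(cells, metric):
    """Full pairwise distance matrix for a list of cells."""
    n = len(cells)
    return [[metric(cells[i], cells[j]) for j in range(n)] for i in range(n)]


def rescale(matrix):
    """Divide by the largest entry so different metrics are comparable."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return matrix
    return [[value / peak for value in row] for row in matrix]


def combine(matrices, weights):
    """Weighted sum of distance matrices (weights are hand-picked here)."""
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for m, w in zip(matrices, weights)) for j in range(n)]
        for i in range(n)
    ]


def average_linkage(dist, k):
    """Naive agglomerative clustering with average linkage down to k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                mean = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or mean < best[0]:
                    best = (mean, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels
```

Calling `average_linkage(combine([rescale(d1), rescale(d2)], [0.5, 0.5]), k)` clusters cells on the blended distance; the naive linkage loop is only suitable for small toy inputs.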
8.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35151228

ABSTRACT

Identifying differential genes over conditions provides insights into the mechanisms of biological processes and disease progression. Here we present an approach, the Kullback-Leibler divergence-based differential distribution (klDD), which provides a flexible framework for quantifying changes in higher-order statistical information of genes including mean and variance/covariation. The method can well detect subtle differences in gene expression distributions in contrast to mean or variance shifts of the existing methods. In addition to effectively identifying informational genes in terms of differential distribution, klDD can be directly applied to cancer subtyping, single-cell clustering and disease early-warning detection, which were all validated by various benchmark datasets.


Subject(s)
Gene Expression Profiling , Transcriptome , Cluster Analysis , Disease Progression , Gene Expression Profiling/methods , Humans
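
A Gaussian special case illustrates why a divergence between distributions can detect changes that a comparison of means misses. The sketch assumes that a gene's expression in each condition is summarised by a univariate Gaussian; klDD is described as a more general framework that also covers covariation, and the function names here are hypothetical.

```python
import math


def fit_gaussian(values):
    """Return (mean, variance) of a sample; variance is floored to avoid log(0)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, max(variance, 1e-12)


def kl_gaussian(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(P || Q) for two univariate Gaussians."""
    return 0.5 * (
        math.log(var_q / var_p) + (var_p + (mean_p - mean_q) ** 2) / var_q - 1.0
    )


def symmetric_kl(sample_a, sample_b):
    """Symmetrised divergence KL(A || B) + KL(B || A) between two conditions."""
    mean_a, var_a = fit_gaussian(sample_a)
    mean_b, var_b = fit_gaussian(sample_b)
    return kl_gaussian(mean_a, var_a, mean_b, var_b) + kl_gaussian(
        mean_b, var_b, mean_a, var_a
    )
```

For two samples with identical means but different spread, such as `[4, 5, 6]` and `[1, 5, 9]`, a mean-shift test sees no difference while `symmetric_kl` is clearly positive.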
9.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35524494

ABSTRACT

Clustering analysis is widely used in single-cell ribonucleic acid (RNA)-sequencing (scRNA-seq) data to discover cell heterogeneity and cell states. While many clustering methods have been developed for scRNA-seq analysis, most of these methods require the number of clusters to be provided. However, it is not easy to know the exact number of cell types in advance, and experience-based estimates are not always reliable. Here, we have developed ADClust, an automatic deep embedding clustering method for scRNA-seq data, which can accurately cluster cells without requiring a predefined number of clusters. Specifically, ADClust first obtains a low-dimensional representation through a pre-trained autoencoder and uses the representations to cluster cells into initial micro-clusters. The micro-clusters are then compared with one another by a statistical test, and similar micro-clusters are merged into larger clusters. According to the clustering, cell representations are updated so that each cell is pulled toward the centers of its assigned cluster and of similar clusters, while cells are separated to keep distances between clusters. This is accomplished by jointly optimizing the carefully designed clustering and autoencoder loss functions. This merging process continues until convergence. ADClust was tested on 11 real scRNA-seq datasets and was shown to outperform existing methods in terms of both clustering performance and the accuracy of the determined number of clusters. More importantly, our model provides high speed and scalability for large datasets.


Subject(s)
RNA , Single-Cell Analysis , Algorithms , Cluster Analysis , Gene Expression Profiling/methods , RNA/genetics , RNA-Seq , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
10.
BMC Genomics ; 24(1): 725, 2023 Nov 30.
Article in English | MEDLINE | ID: mdl-38036964

ABSTRACT

In recent single-cell -omics studies, both the differential activity of transcription factors regulating cell fate determination and differential genome activation have been tested for utility as descriptors of cell types. Naturally, genome accessibility and gene expression are interlinked. To understand the variability in genomic feature activation in the GABAergic neurons of different spatial origins, we have mapped accessible chromatin regions and mRNA expression in single cells derived from the developing mouse central nervous system (CNS). We first defined a reference set of open chromatin regions for scATAC-seq read quantitation across samples, allowing comparison of chromatin accessibility between brain regions and cell types directly. Second, we integrated the scATAC-seq and scRNA-seq data to form a unified resource of transcriptome and chromatin accessibility landscape for the cell types in di- and telencephalon, midbrain and anterior hindbrain of E14.5 mouse embryo. Importantly, we implemented resolution optimization at the clustering, and automatized the cell typing step. We show high level of concordance between the cell clustering based on the chromatin accessibility and the transcriptome in analyzed neuronal lineages, indicating that both genome and transcriptome features can be used for cell type definition. Hierarchical clustering by the similarity in accessible chromatin reveals that the genomic feature activation correlates with neurotransmitter phenotype, selector gene expression, cell differentiation stage and neuromere origins.


Subject(s)
Chromatin , Transcription Factors , Animals , Mice , Chromatin/genetics , Cell Differentiation/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Genome , Brain/metabolism , Single-Cell Analysis
11.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33940590

ABSTRACT

Single-cell clustering is an important part of analyzing single-cell RNA-sequencing data. However, the accuracy and robustness of existing methods are disturbed by noise. One promising approach for addressing this challenge is integrating pathway information, which can alleviate noise and improve performance. In this work, we studied the impact on accuracy and robustness of existing single-cell clustering methods by integrating pathways. We collected 10 state-of-the-art single-cell clustering methods, 26 scRNA-seq datasets and four pathway databases, combined the AUCell method and the similarity network fusion to integrate pathway data and scRNA-seq data, and introduced three accuracy indicators, three noise generation strategies and robustness indicators. Experiments on this framework showed that integrating pathways can significantly improve the accuracy and robustness of most single-cell clustering methods.


Subject(s)
Algorithms , Databases, Nucleic Acid , Exome Sequencing , RNA-Seq , Single-Cell Analysis , Cluster Analysis
12.
BMC Bioinformatics ; 22(1): 578, 2021 Dec 02.
Article in English | MEDLINE | ID: mdl-34856921

ABSTRACT

BACKGROUND: Existing computational methods for studying miRNA regulation are mostly based on bulk miRNA and mRNA expression data. However, bulk data only allows the analysis of miRNA regulation across a group of cells, rather than the miRNA regulation unique to individual cells. Recent advances in single-cell miRNA-mRNA co-sequencing technology have opened a way to investigate miRNA regulation at the single-cell level. However, as single-cell miRNA-mRNA co-sequencing data is only just emerging and is available only at small scale, there is a strong need for novel methods that exploit existing single-cell data to study cell-specific miRNA regulation. RESULTS: In this work, we propose a new method, CSmiR (Cell-Specific miRNA regulation), which combines single-cell miRNA-mRNA co-sequencing data and putative miRNA-mRNA binding information to identify miRNA regulatory networks at the resolution of individual cells. We apply CSmiR to the miRNA-mRNA co-sequencing data of 19 K562 single cells to identify cell-specific miRNA-mRNA regulatory networks for understanding miRNA regulation in each K562 single cell. By analyzing the obtained cell-specific miRNA-mRNA regulatory networks, we observe that the miRNA regulation in each K562 single cell is unique. Moreover, we conduct a detailed analysis of the cell-specific miRNA regulation associated with the miR-17/92 family as a case study. The comparison results indicate that CSmiR is effective in predicting cell-specific miRNA targets. Finally, by exploring the cell-cell similarity matrix characterized by cell-specific miRNA regulation, CSmiR provides a novel strategy for clustering single cells and helps to understand cell-cell crosstalk. CONCLUSIONS: To the best of our knowledge, CSmiR is the first method to explore miRNA regulation at single-cell resolution, and we believe that it can be a useful method to enhance the understanding of cell-specific miRNA regulation.


Subject(s)
MicroRNAs , Cluster Analysis , Gene Expression Profiling , Gene Regulatory Networks , MicroRNAs/genetics , RNA, Messenger/genetics
13.
BMC Bioinformatics ; 21(1): 440, 2020 Oct 07.
Article in English | MEDLINE | ID: mdl-33028196

ABSTRACT

BACKGROUND: Advances in single-cell RNA-seq technology have led to great opportunities for the quantitative characterization of cell types, and many clustering algorithms have been developed based on single-cell gene expression. However, we found that different data preprocessing methods show quite different effects on clustering algorithms. Moreover, there is no specific preprocessing method that is applicable to all clustering algorithms, and even for the same clustering algorithm, the best preprocessing method depends on the input data. RESULTS: We designed a graph-based algorithm, SC3-e, specifically for discriminating the best data preprocessing method for SC3, which is currently the most widely used clustering algorithm for single cell clustering. When tested on eight frequently used single-cell RNA-seq data sets, SC3-e always accurately selects the best data preprocessing method for SC3 and therefore greatly enhances the clustering performance of SC3. CONCLUSION: The SC3-e algorithm is practically powerful for discriminating the best data preprocessing method, and therefore largely enhances the performance of cell-type clustering of SC3. It is expected to play a crucial role in the related studies of single-cell clustering, such as the studies of human complex diseases and discoveries of new cell types.


Subject(s)
RNA-Seq/methods , Algorithms , Cluster Analysis , Gene Expression , Humans , Sequence Analysis, RNA
14.
Methods Mol Biol ; 2812: 155-168, 2024.
Article in English | MEDLINE | ID: mdl-39068361

ABSTRACT

This chapter shows how to apply the Asymmetric Within-Sample Transformation to single-cell RNA-Seq data in combination with a prior dropout imputation. The asymmetric transformation is a special winsorization that flattens low-expressed intensities and preserves highly expressed gene levels. Before a standard hierarchical clustering algorithm, an intermediate step removes noninformative genes according to a threshold applied to a per-gene entropy estimate. Following the clustering, a time-intensive algorithm is shown to uncover the molecular features associated with each cluster. This step implements a resampling algorithm to generate a random baseline against which significantly up- and downregulated genes are measured. To this aim, we adopt a GLM as implemented in the DESeq2 package. We render the results graphically. While the tools are standard heat maps, we introduce some data scaling to clarify the reliability of the results.


Subject(s)
Algorithms , Single-Cell Analysis , Single-Cell Analysis/methods , Cluster Analysis , Humans , Gene Expression Profiling/methods , Software , Computational Biology/methods , RNA-Seq/methods
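
The two preprocessing ideas named in this abstract, flattening low intensities and filtering genes by a per-gene entropy estimate, can be illustrated with a small sketch. It is not the chapter's procedure: the floor value, the histogram-based entropy estimator, the bin count, the threshold, and the function names are all assumptions chosen for illustration.

```python
import math


def flatten_low(values, floor):
    """One-sided winsorization: raise values below `floor` to `floor`,
    leaving high expression levels untouched."""
    return [max(v, floor) for v in values]


def histogram_entropy(values, bins=4):
    """Shannon entropy (bits) of an equal-width histogram of one gene."""
    low, high = min(values), max(values)
    if high == low:
        return 0.0  # a constant gene carries no information
    counts = [0] * bins
    for v in values:
        index = min(int((v - low) / (high - low) * bins), bins - 1)
        counts[index] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def informative_genes(expression, threshold, bins=4):
    """Keep genes whose entropy estimate reaches the threshold.

    `expression` maps a gene name to its values across cells.
    """
    return [
        gene
        for gene, values in expression.items()
        if histogram_entropy(values, bins) >= threshold
    ]
```

A gene that is constant across cells has entropy 0 and is dropped by any positive threshold, while a gene spread evenly over the four bins reaches the maximum of 2 bits.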
15.
Methods Mol Biol ; 2757: 383-445, 2024.
Article in English | MEDLINE | ID: mdl-38668977

ABSTRACT

The emergence and development of single-cell RNA sequencing (scRNA-seq) techniques enable researchers to perform large-scale analysis of the transcriptomic profiling at cell-specific resolution. Unsupervised clustering of scRNA-seq data is central for most studies, which is essential to identify novel cell types and their gene expression logics. Although an increasing number of algorithms and tools are available for scRNA-seq analysis, a practical guide for users to navigate the landscape remains underrepresented. This chapter presents an overview of the scRNA-seq data analysis pipeline, quality control, batch effect correction, data standardization, cell clustering and visualization, cluster correlation analysis, and marker gene identification. Taking the two broadly used analysis packages, i.e., Scanpy and MetaCell, as examples, we provide a hands-on guideline and comparison regarding the best practices for the above essential analysis steps and data visualization. Additionally, we compare both packages and algorithms using a scRNA-seq dataset of the ctenophore Mnemiopsis leidyi, which is representative of one of the earliest animal lineages, critical to understanding the origin and evolution of animal novelties. This pipeline can also be helpful for analyses of other taxa, especially prebilaterian animals, where these tools are under development (e.g., placozoan and Porifera).


Subject(s)
Algorithms , Gene Expression Profiling , Single-Cell Analysis , Software , Single-Cell Analysis/methods , Animals , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Computational Biology/methods , Cluster Analysis , Transcriptome/genetics
16.
Brief Funct Genomics ; 22(4): 329-340, 2023 07 17.
Article in English | MEDLINE | ID: mdl-36848584

ABSTRACT

Single-cell clustering is the most significant part of single-cell RNA sequencing (scRNA-seq) data analysis. One main issue facing scRNA-seq data is noise and sparsity, which poses a great challenge for the advance of high-precision clustering algorithms. This study adopts cellular markers to identify differences between cells, which contributes to feature extraction of single cells. In this work, we propose a high-precision single-cell clustering algorithm, SCMcluster (single-cell cluster using marker genes). This algorithm integrates two cell marker databases (the CellMarker database and the PanglaoDB database) with scRNA-seq data for feature extraction and constructs an ensemble clustering model based on the consensus matrix. We test the efficiency of this algorithm and compare it with eight other popular clustering algorithms on two scRNA-seq datasets derived from human and mouse tissues, respectively. The experimental results show that SCMcluster outperforms the existing methods in both feature extraction and clustering performance. The source code of SCMcluster is available for free at https://github.com/HaoWuLab-Bioinformatics/SCMcluster.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Animals , Humans , Mice , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Algorithms , Cluster Analysis
17.
Front Genet ; 14: 1183099, 2023.
Article in English | MEDLINE | ID: mdl-37091787

ABSTRACT

Identifying different types of cells in scRNA-seq data is a critical task in single-cell data analysis. In this paper, we propose a method called ProgClust for the decomposition of cell populations and detection of rare cells. ProgClust represents the single-cell data with clustering trees where a progressive searching method is designed to select cell population-specific genes and cluster cells. The obtained trees reveal the structure of both abundant cell populations and rare cell populations. Additionally, it can automatically determine the number of clusters. Experimental results show that ProgClust outperforms the baseline method and is capable of accurately identifying both common and rare cells. Moreover, when applied to real unlabeled data, it reveals potential cell subpopulations which provides clues for further exploration. In summary, ProgClust shows potential in identifying subpopulations of complex single-cell data.

18.
Interdiscip Sci ; 14(2): 394-408, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35028910

ABSTRACT

Cell type determination based on transcriptome profiles is a key application of single-cell RNA sequencing (scRNA-seq). It is usually achieved through unsupervised clustering. Good feature selection is capable of improving the clustering accuracy and is a crucial component of single-cell clustering pipelines. However, most current single-cell feature selection methods are univariable filter methods ignoring gene dependency. Even the multivariable filter methods developed in recent years only consider "one-to-many" relationship between genes. In this paper, a novel single-cell feature selection method based on convex analysis of mixtures (FSCAM) is proposed, which takes into account "many-to-many" relationship. Compared to the previous "one-to-many" methods, FSCAM selects genes with a combination of relevancy, redundancy and completeness. Pertinent benchmarking is conducted on the real datasets to validate the superiority of FSCAM. Through plugging into the framework of partition around medoids (PAM) clustering, a single-cell clustering algorithm based on FSCAM method (SCC_FSCAM) is further developed. Comparing SCC_FSCAM with existing advanced clustering algorithms, the results show that our algorithm has advantages in both internal criteria (clustering number) and external criteria (adjusted Rand index) and has a good stability.


Subject(s)
Algorithms , Single-Cell Analysis , Cluster Analysis , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Transcriptome
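
The adjusted Rand index used as the external criterion in this abstract compares a predicted clustering with reference labels while correcting for chance agreement. A minimal standard-library sketch of the usual pair-counting formula is given here; the function name is an assumption, and this is not code from the FSCAM work.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, predicted):
    """Adjusted Rand index between two labelings of the same cells."""
    total_pairs = comb(len(truth), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: nothing to disagree about

    # Pairs of cells placed together in both labelings, in the reference
    # labeling, and in the predicted labeling respectively.
    together_both = sum(comb(c, 2) for c in Counter(zip(truth, predicted)).values())
    together_truth = sum(comb(c, 2) for c in Counter(truth).values())
    together_pred = sum(comb(c, 2) for c in Counter(predicted).values())

    expected = together_truth * together_pred / total_pairs
    maximum = (together_truth + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (together_both - expected) / (maximum - expected)
```

The index is 1 for identical partitions regardless of how the clusters are named, close to 0 for random assignments, and can be negative when agreement is worse than chance.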
19.
Cell Metab ; 34(8): 1214-1225.e6, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35858629

ABSTRACT

Cells often adopt different phenotypes, dictated by tissue-specific or local signals such as cell-cell and cell-matrix contacts or molecular micro-environment. This holds in extremis for macrophages with their high phenotypic plasticity. Their broad range of functions, some even opposing, reflects their heterogeneity, and a multitude of subsets has been described in different tissues and diseases. Such micro-environmental imprint cannot be adequately studied by single-cell applications, as cells are detached from their context, while histology-based assessment lacks the phenotypic depth due to limitations in marker combination. Here, we present a novel, integrative approach in which 15-color multispectral imaging allows comprehensive cell classification based on multi-marker expression patterns, followed by downstream analysis pipelines to link their phenotypes to contextual, micro-environmental cues, such as their cellular ("community") and metabolic ("local lipidome") niches in complex tissue. The power of this approach is illustrated for myeloid subsets and associated lipid signatures in murine atherosclerotic plaque.


Subject(s)
Atherosclerosis , Plaque, Atherosclerotic , Animals , Atherosclerosis/metabolism , Biomarkers/metabolism , Macrophages/metabolism , Mass Spectrometry , Mice , Phenotype
20.
Front Genet ; 12: 811043, 2021.
Article in English | MEDLINE | ID: mdl-35082838

ABSTRACT

Identifying the phenotypes and interactions of various cells is the primary objective in cellular heterogeneity dissection. A key step of this methodology is unsupervised clustering, which, however, often suffers from a high level of noise as well as redundant information. To overcome these limitations, we propose self-diffusion on local scaling affinity (LSSD) to enhance metric learning of cell similarities for dissecting cellular heterogeneity. Local scaling self-tunes the cell-to-cell distances that are used to construct the cell affinity. Our approach implements the self-diffusion process by propagating the affinity matrices to further improve the cell similarities for the downstream clustering analysis. To demonstrate its effectiveness and usefulness, we applied LSSD to two simulated and four real scRNA-seq datasets. Compared with other single-cell clustering methods, our approach demonstrates much better clustering performance, and the cell types identified in colorectal tumors show strong biological interpretability.
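
The local scaling step can be sketched under the assumption that it follows the common self-tuning form, in which each cell's bandwidth is its distance to its k-th nearest neighbour and the affinity is exp(-d_ij^2 / (sigma_i * sigma_j)). Whether LSSD uses exactly this form is not stated in the abstract; the function name and the default k are hypothetical, and the self-diffusion step is not shown.

```python
import math


def local_scaling_affinity(dist, k=2):
    """Self-tuning affinity from a symmetric distance matrix.

    Assumes at least two cells. sigma[i] is the distance from cell i to its
    k-th nearest neighbour (or to the farthest cell if fewer than k exist).
    """
    n = len(dist)
    sigma = []
    for i in range(n):
        neighbours = sorted(dist[i][j] for j in range(n) if j != i)
        sigma.append(neighbours[min(k, len(neighbours)) - 1])

    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-affinity
            scale = sigma[i] * sigma[j]
            if scale > 0:
                affinity[i][j] = math.exp(-dist[i][j] ** 2 / scale)
            else:
                # Duplicate cells give a zero bandwidth; treat coincident
                # cells as fully similar and all others as dissimilar.
                affinity[i][j] = 1.0 if dist[i][j] == 0 else 0.0
    return affinity
```

Because each pair is scaled by both cells' own neighbourhood sizes, cells in sparse regions are not penalised relative to cells in dense regions, which a single global bandwidth would do.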
