1.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38493339

ABSTRACT

Clustering cells based on single-cell multi-modal sequencing technologies provides an unprecedented opportunity to create high-resolution cell atlases, reveal critical cellular states and study health and disease. However, effectively integrating different sequencing data for cell clustering remains a challenging task. Motivated by the successful application of Louvain to scRNA-seq data, we propose a single-cell multi-modal Louvain clustering framework, called scMLC, to tackle this problem. scMLC builds multiplex single- and cross-modal cell-to-cell networks to capture modality-specific and consistent information between modalities and then adopts a robust multiplex community detection method to obtain reliable cell clusters. In comparison with 15 state-of-the-art clustering methods on seven real datasets simultaneously measuring gene expression and chromatin accessibility, scMLC achieves better accuracy and stability on most datasets. Synthetic results also indicate that the cell-network-based integration strategy for multi-omics data is superior to other strategies in terms of generalization. Moreover, scMLC is flexible and can be extended to single-cell sequencing data with more than two modalities.


Subject(s)
Chromatin , Multiomics , Cluster Analysis , Algorithms , Sequence Analysis, RNA
2.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36567258

ABSTRACT

Single-cell RNA-sequencing technology (scRNA-seq) brings research to single-cell resolution. However, a major drawback of scRNA-seq is large sparsity, i.e. expressed genes with no reads due to technical noise or limited sequencing depth during the scRNA-seq protocol. These are also called 'dropout' events, and they are likely to affect downstream analyses such as differential expression analysis, the clustering and visualization of cell subpopulations, cellular trajectory inference, etc. Therefore, there is a need to develop a method to identify and impute these dropout events. We propose Bubble, which first identifies dropout events among all zeros based on the expression rate and coefficient of variation of genes within each cell subpopulation, and then leverages an autoencoder constrained by bulk RNA-seq data to impute only those values. Unlike other deep learning-based imputation methods, Bubble fuses the matched bulk RNA-seq data as a constraint to reduce the introduction of false positive signals. Using simulated and several real scRNA-seq datasets, we demonstrate that Bubble enhances the recovery of missing values and of gene-to-gene and cell-to-cell correlations, and reduces the introduction of false positive signals. Regarding some crucial downstream analyses of scRNA-seq data, Bubble facilitates the identification of differentially expressed genes, improves the performance of clustering and visualization, and aids the construction of cellular trajectories. More importantly, Bubble provides fast and scalable imputation with minimal memory usage.


Subject(s)
Gene Expression Profiling , Single-Cell Gene Expression Analysis , RNA-Seq , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Software
3.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37715282

ABSTRACT

Gene regulatory networks play a crucial role in controlling the biological processes of living creatures. Deciphering complex gene regulatory networks from experimental data remains a major challenge in systems biology. Recent advances in single-cell RNA sequencing technology bring massive high-resolution data, enabling computational inference of cell-specific gene regulatory networks (GRNs). Many relevant algorithms have been developed to achieve this goal in the past years. However, GRN inference is still far from ideal due to the extra noise involved in pseudo-time information and the large number of dropouts in datasets. Here, we present a novel GRN inference method named Normi, which is based on non-redundant mutual information. Normi addresses these problems by applying a fixed-size sliding window over the entire trajectory and an average smoothing strategy to the gene expression of the cells in each window to obtain representative cells. To further alleviate the impact of dropouts, we utilize the mixed KSG estimator to quantify the high-order time-delayed mutual information among genes, then filter out the redundant edges by adopting the Max-Relevance and Min-Redundancy algorithm. Moreover, we determine the optimal time delay for each gene pair by distance correlation. Normi outperforms other state-of-the-art GRN inference methods on both simulated data and single-cell RNA sequencing (scRNA-seq) datasets, demonstrating its superiority in robustness. The performance of Normi on real scRNA-seq data further reveals its ability to identify key regulators and crucial biological processes.
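
The window-averaging step described in this abstract can be illustrated with a minimal sketch. It assumes cells are already ordered by pseudo-time and stored as plain lists of gene values; the function name `window_average` and the `step` parameter are hypothetical and are not taken from Normi's implementation.

```python
def window_average(expression, window_size, step=1):
    """Average gene expression over fixed-size sliding windows.

    expression: list of cells ordered by pseudo-time, each a list of gene values.
    Returns one averaged "representative cell" per window position.
    (Hypothetical helper for illustration; not Normi's actual code.)
    """
    if window_size < 1 or window_size > len(expression):
        raise ValueError("window_size must be between 1 and the number of cells")
    n_genes = len(expression[0])
    representatives = []
    # Slide the window along the trajectory, one start position per step.
    for start in range(0, len(expression) - window_size + 1, step):
        window = expression[start:start + window_size]
        representatives.append(
            [sum(cell[g] for cell in window) / window_size for g in range(n_genes)]
        )
    return representatives


# Four cells ordered by pseudo-time, two genes each.
cells = [[0.0, 2.0], [2.0, 2.0], [4.0, 0.0], [6.0, 0.0]]
smoothed = window_average(cells, window_size=2)
print(smoothed)  # [[1.0, 2.0], [3.0, 1.0], [5.0, 0.0]]
```

Averaging within a window trades temporal resolution for robustness: an isolated zero in one cell is diluted by its neighbours in pseudo-time, which is presumably why the abstract pairs this step with the dropout problem.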


Subject(s)
Algorithms , Gene Regulatory Networks
4.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38145950

ABSTRACT

Single-cell sequencing technology has provided unprecedented opportunities for comprehensively deciphering cell heterogeneity. Nevertheless, the high dimensionality and intricate nature of cell heterogeneity have presented substantial challenges to computational methods. Numerous novel clustering methods have been proposed to address this issue. However, none of these methods achieves consistently better performance under different biological scenarios. In this study, we developed CAKE, a novel and scalable self-supervised clustering method, which consists of a contrastive learning model with a mixture neighborhood augmentation for cell representation learning, and a self-Knowledge Distiller model for the refinement of clustering results. These designs provide more condensed and cluster-friendly cell representations and improve the clustering performance in terms of accuracy and robustness. Furthermore, in addition to accurately identifying the major cell types, CAKE could also find more biologically meaningful cell subgroups and rare cell types. Comprehensive experiments on real single-cell RNA sequencing datasets demonstrated the superiority of CAKE in visualization and clustering over other comparison methods, and indicated its extensive applicability in the field of cell heterogeneity analysis. Contact: Ruiqing Zheng (rqzheng@csu.edu.cn).


Subject(s)
Algorithms , Learning , Cluster Analysis , Sequence Analysis, RNA
5.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38189544

ABSTRACT

With the development of spatially resolved transcriptomics technologies, it is now possible to explore the gene expression profiles of single cells while preserving their spatial context. Spatial clustering plays a key role in spatial transcriptome data analysis. In the past 2 years, several graph neural network-based methods have emerged, which significantly improved the accuracy of spatial clustering. However, accurately identifying the boundaries of spatial domains remains a challenging task. In this article, we propose stAA, an adversarial variational graph autoencoder, to identify spatial domain. stAA generates cell embedding by leveraging gene expression and spatial information using graph neural networks and enforces the distribution of cell embeddings to a prior distribution through Wasserstein distance. The adversarial training process can make cell embeddings better capture spatial domain information and more robust. Moreover, stAA incorporates global graph information into cell embeddings using labels generated by pre-clustering. Our experimental results show that stAA outperforms the state-of-the-art methods and achieves better clustering results across different profiling platforms and various resolutions. We also conducted numerous biological analyses and found that stAA can identify fine-grained structures in tissues, recognize different functional subtypes within tumors and accurately identify developmental trajectories.


Subject(s)
Gene Expression Profiling , Transcriptome , Cluster Analysis , Neural Networks, Computer
6.
Bioinformatics ; 40(1)2024 01 02.
Article in English | MEDLINE | ID: mdl-38230824

ABSTRACT

MOTIVATION: Single-cell RNA sequencing has emerged as a powerful technology for studying gene expression at the individual cell level. Clustering individual cells into distinct subpopulations is fundamental in scRNA-seq data analysis, facilitating the identification of cell types and exploration of cellular heterogeneity. Despite the recent development of many deep learning-based single-cell clustering methods, few have effectively exploited the correlations among genes, resulting in suboptimal clustering outcomes. RESULTS: Here, we propose a novel masked autoencoder-based method, scMAE, for cell clustering. scMAE perturbs gene expression and employs a masked autoencoder to reconstruct the original data, learning robust and informative cell representations. The masked autoencoder introduces a masking predictor, which captures relationships among genes by predicting whether gene expression values are masked. By integrating this masking mechanism, scMAE effectively captures latent structures and dependencies in the data, enhancing clustering performance. We conducted extensive comparative experiments using various clustering evaluation metrics on 15 scRNA-seq datasets from different sequencing platforms. Experimental results indicate that scMAE outperforms other state-of-the-art methods on these datasets. In addition, scMAE accurately identifies rare cell types, which are challenging to detect due to their low abundance. Furthermore, biological analyses confirm the biological significance of the identified cell subpopulations. AVAILABILITY AND IMPLEMENTATION: The source code of scMAE is available at: https://zenodo.org/records/10465991.


Subject(s)
Gene Expression Profiling , Single-Cell Gene Expression Analysis , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Cluster Analysis , Algorithms
7.
Methods ; 222: 1-9, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38128706

ABSTRACT

The development of single-cell RNA sequencing (scRNA-seq) has provided new perspectives to study biological problems at the single-cell level. One of the key issues in scRNA-seq data analysis is to divide cells into several clusters for discovering the heterogeneity and diversity of cells. However, the existing scRNA-seq data are high-dimensional, sparse, and noisy, which challenges the existing single-cell clustering methods. In this study, we propose a joint learning framework (JLONMFSC) for clustering scRNA-seq data. In our method, the dimension of the original data is reduced to minimize the effect of noise. In addition, graph regularized matrix factorization is used to learn the local features. Further, Low-Rank Representation (LRR) subspace clustering is utilized to learn the global features. Finally, the joint learning of local features and global features is performed to obtain the clustering results. We compare the proposed algorithm with eight state-of-the-art algorithms for clustering performance on six datasets, and the experimental results demonstrate that JLONMFSC achieves better performance on all datasets. The code is available at https://github.com/lanbiolab/JLONMFSC.


Subject(s)
Gene Expression Profiling , Single-Cell Gene Expression Analysis , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Algorithms , Cluster Analysis
8.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35901449

ABSTRACT

Integration of single-cell transcriptome datasets from multiple sources plays an important role in investigating complex biological systems. The key to integration of transcriptome datasets is batch effect removal. Recent methods attempt to apply a contrastive learning strategy to correct batch effects. Despite their encouraging performance, the optimal contrastive learning framework for batch effect removal is still under exploration. We develop an improved contrastive learning-based batch correction framework, GLOBE. GLOBE defines adaptive translation transformations for each cell to guarantee the stability of approximating batch effects. To enhance the consistency of representations alignment, GLOBE utilizes a loss function that is both hardness-aware and consistency-aware to learn batch effect-invariant representations. Moreover, GLOBE computes batch-corrected gene matrix in a transparent approach to support diverse downstream analysis. Benchmarking results on a wide spectrum of datasets show that GLOBE outperforms other state-of-the-art methods in terms of robust batch mixing and superior conservation of biological signals. We further apply GLOBE to integrate two developing mouse neocortex datasets and show GLOBE succeeds in removing batch effects while preserving the contiguous structure of cells in raw data. Finally, a comprehensive study is conducted to validate the effectiveness of GLOBE.


Subject(s)
Benchmarking , Transcriptome , Animals , Mice
9.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34864877

ABSTRACT

Increasing evidence has shown that circRNA plays a significant role in the development of many diseases. In addition, many studies have shown that circRNA can be considered a potential biomarker for the clinical diagnosis and treatment of disease. Some computational methods have been proposed to predict circRNA-disease associations. However, the performance of these methods is limited by the sparsity of low-order interaction information. In this paper, we propose a new computational method (KGANCDA) to predict circRNA-disease associations based on a knowledge graph attention network. The circRNA-disease knowledge graphs are constructed by collecting multiple relationship data among circRNA, disease, miRNA and lncRNA. Then, the knowledge graph attention network is designed to obtain embeddings of each entity by distinguishing the importance of information from neighbors. Besides the low-order neighbor information, it can also capture high-order neighbor information from multisource associations, which alleviates the problem of data sparsity. Finally, a multilayer perceptron is applied to predict the affinity score of circRNA-disease associations based on the embeddings of circRNA and disease. The experimental results show that KGANCDA outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, the case study demonstrates that KGANCDA is an effective tool to predict potential circRNA-disease associations.


Subject(s)
MicroRNAs , RNA, Circular , Computational Biology/methods , MicroRNAs/genetics , Neural Networks, Computer , Pattern Recognition, Automated
10.
Bioinformatics ; 39(3)2023 03 01.
Article in English | MEDLINE | ID: mdl-36821425

ABSTRACT

MOTIVATION: Integration of growing single-cell RNA sequencing datasets helps better understand cellular identity and function. The major challenge for integration is removing batch effects while preserving biological heterogeneities. Advances in contrastive learning have inspired several contrastive learning-based batch correction methods. However, existing contrastive learning-based methods exhibit a noticeable ad hoc trade-off between batch mixing and preservation of cellular heterogeneities (mix-heterogeneity trade-off). Therefore, a deliberate mix-heterogeneity trade-off is expected to yield considerable improvements in scRNA-seq dataset integration. RESULTS: We develop a novel contrastive learning-based batch correction framework, CLAIRE, which achieves a superior mix-heterogeneity trade-off. The key contribution of CLAIRE is the proposal of two complementary strategies, a construction strategy and a refinement strategy, to improve the appropriateness of positive pairs. The construction strategy dynamically generates positive pairs by augmenting inter-batch mutual nearest neighbors (MNN) with intra-batch k-nearest neighbors (KNN), which improves the coverage of positive pairs for the whole distribution of shared cell types between batches. The refinement strategy aims to automatically reduce the potential false positive pairs from the construction strategy, and relies on the memory effect of deep neural networks. We demonstrate that CLAIRE possesses a superior mix-heterogeneity trade-off over existing contrastive learning-based methods. Benchmark results on six real datasets also show that CLAIRE achieves the best integration performance against eight state-of-the-art methods. Finally, comprehensive experiments are conducted to validate the effectiveness of CLAIRE. AVAILABILITY AND IMPLEMENTATION: The source code and data used in this study can be found at https://github.com/CSUBioGroup/CLAIRE-release.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Single-Cell Analysis , Software , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Neural Networks, Computer , Cluster Analysis
11.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37584660

ABSTRACT

MOTIVATION: scATAC-seq has enabled chromatin accessibility landscape profiling at the single-cell level, providing opportunities for determining cell-type-specific regulation codes. However, high dimension, extreme sparsity, and large scale of scATAC-seq data have posed great challenges to cell-type identification. Thus, there has been a growing interest in leveraging the well-annotated scRNA-seq data to help annotate scATAC-seq data. However, substantial computational obstacles remain to transfer information from scRNA-seq to scATAC-seq, especially for their heterogeneous features. RESULTS: We propose a new transfer learning method, scNCL, which utilizes prior knowledge and contrastive learning to tackle the problem of heterogeneous features. Briefly, scNCL transforms scATAC-seq features into gene activity matrix based on prior knowledge. Since feature transformation can cause information loss, scNCL introduces neighborhood contrastive learning to preserve the neighborhood structure of scATAC-seq cells in raw feature space. To learn transferable latent features, scNCL uses a feature projection loss and an alignment loss to harmonize embeddings between scRNA-seq and scATAC-seq. Experiments on various datasets demonstrated that scNCL not only realizes accurate and robust label transfer for common types, but also achieves reliable detection of novel types. scNCL is also computationally efficient and scalable to million-scale datasets. Moreover, we prove scNCL can help refine cell-type annotations in existing scATAC-seq atlases. AVAILABILITY AND IMPLEMENTATION: The source code and data used in this paper can be found in https://github.com/CSUBioGroup/scNCL-release.


Subject(s)
Gene Expression Profiling , Single-Cell Gene Expression Analysis , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Software , Chromatin , Sequence Analysis, RNA/methods
12.
Bioinformatics ; 39(39 Suppl 1): i368-i376, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387178

ABSTRACT

MOTIVATION: Single-cell RNA sequencing (scRNA-seq) offers a powerful tool to dissect the complexity of biological tissues through cell sub-population identification in combination with clustering approaches. Feature selection is a critical step for improving the accuracy and interpretability of single-cell clustering. Existing feature selection methods underutilize the discriminatory potential of genes across distinct cell types. We hypothesize that incorporating such information could further boost the performance of single cell clustering. RESULTS: We develop CellBRF, a feature selection method that considers genes' relevance to cell types for single-cell clustering. The key idea is to identify genes that are most important for discriminating cell types through random forests guided by predicted cell labels. Moreover, it proposes a class balancing strategy to mitigate the impact of unbalanced cell type distributions on feature importance evaluation. We benchmark CellBRF on 33 scRNA-seq datasets representing diverse biological scenarios and demonstrate that it substantially outperforms state-of-the-art feature selection methods in terms of clustering accuracy and cell neighborhood consistency. Furthermore, we demonstrate the outstanding performance of our selected features through three case studies on cell differentiation stage identification, non-malignant cell subtype identification, and rare cell identification. CellBRF provides a new and effective tool to boost single-cell clustering accuracy. AVAILABILITY AND IMPLEMENTATION: All source codes of CellBRF are freely available at https://github.com/xuyp-csu/CellBRF.


Subject(s)
Benchmarking , Random Forest , Cell Differentiation , Cluster Analysis
13.
Methods ; 216: 21-38, 2023 08.
Article in English | MEDLINE | ID: mdl-37315825

ABSTRACT

Single-cell RNA-sequencing (scRNA-seq) data suffer from a lot of zeros. Such dropout events impede the downstream data analyses. We propose BayesImpute to infer and impute dropouts from the scRNA-seq data. Using the expression rate and coefficient of variation of the genes within the cell subpopulation, BayesImpute first determines likely dropouts, and then constructs the posterior distribution for each gene and uses the posterior mean to impute dropout values. Some simulated and real experiments show that BayesImpute can effectively identify dropout events and reduce the introduction of false positive signals. Additionally, BayesImpute successfully recovers the true expression levels of missing values, restores the gene-to-gene and cell-to-cell correlation coefficient, and maintains the biological information in bulk RNA-seq data. Furthermore, BayesImpute boosts the clustering and visualization of cell subpopulations and improves the identification of differentially expressed genes. We further demonstrate that, in comparison to other statistical-based imputation methods, BayesImpute is scalable and fast with minimal memory usage.
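
As a rough illustration of the first step, the sketch below flags zeros as likely dropouts using a gene's expression rate and coefficient of variation (CV) within one cell subpopulation. The abstract does not state the decision rule, so the thresholds `min_rate` and `max_cv`, the choice to compute the CV over nonzero values, and the function name are all assumptions made for illustration, not BayesImpute's actual criteria.

```python
import statistics


def flag_likely_dropouts(cluster, min_rate=0.5, max_cv=1.0):
    """Flag zeros that look like dropouts within one cell subpopulation.

    cluster: list of cells (same subpopulation), each a list of gene values.
    A zero is flagged when the gene is expressed in at least `min_rate` of the
    cells and its nonzero values have a CV of at most `max_cv`.
    (Assumed rule and thresholds; the published method may differ.)
    """
    n_cells, n_genes = len(cluster), len(cluster[0])
    flags = [[False] * n_genes for _ in range(n_cells)]
    for g in range(n_genes):
        values = [cell[g] for cell in cluster]
        nonzero = [v for v in values if v > 0]
        if len(nonzero) < 2:
            continue  # too few observations to estimate a CV
        rate = len(nonzero) / n_cells
        cv = statistics.stdev(nonzero) / statistics.mean(nonzero)
        if rate >= min_rate and cv <= max_cv:
            for c in range(n_cells):
                if values[c] == 0:
                    flags[c][g] = True
    return flags


# Gene 0 is expressed consistently in 3 of 4 cells; gene 1 is almost never expressed.
cluster = [[5.0, 0.0], [6.0, 0.0], [0.0, 0.0], [5.5, 3.0]]
flags = flag_likely_dropouts(cluster)
print(flags)  # [[False, False], [False, False], [True, False], [False, False]]
```

Only the zero in gene 0 is flagged; the zeros in gene 1 are treated as plausible true zeros, so a later imputation step would leave them untouched.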


Subject(s)
Single-Cell Gene Expression Analysis , Software , Sequence Analysis, RNA/methods , Bayes Theorem , Single-Cell Analysis/methods , Probability , Gene Expression Profiling
14.
Methods ; 220: 90-97, 2023 12.
Article in English | MEDLINE | ID: mdl-37952704

ABSTRACT

For a given single-cell RNA-seq dataset, it is critical to pinpoint key cellular stages and quantify cells' differentiation potency along a differentiation pathway in a time-course manner. Currently, several methods based on the entropy of gene functions or PPI networks have been proposed to solve the problem. Nevertheless, these methods still suffer from inaccurate interactions and noise originating from the scRNA-seq profile. In this study, we proposed a cell potency inference method based on cell-specific network entropy, called SPIDE. SPIDE introduces a local weighted cell-specific network for each cell to maintain cell heterogeneity and calculates the entropy by incorporating gene expression with network structure. We compared three cell entropy estimation models on eight scRNA-seq datasets. The results show that SPIDE reaches conclusions consistent with real cell differentiation potency on most datasets. Moreover, SPIDE accurately recovers the continuous changes of potency during cell differentiation and significantly correlates with the stemness of tumor cells in colorectal cancer. To conclude, our study provides a universal and accurate framework for cell entropy estimation, which deepens our understanding of cell differentiation, the development of diseases and other related biological research.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Entropy , Cell Differentiation/genetics , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods
15.
Brief Bioinform ; 22(2): 963-975, 2021 03 22.
Article in English | MEDLINE | ID: mdl-33285566

ABSTRACT

The Novel Coronavirus Disease 2019 (COVID-19) has become an international public health emergency, which poses the most serious threat to human health around the world. Accumulating evidence has shown that the new coronavirus can infect not only human beings but also other species, which might result in cross-species infections. In this research, 1056 ACE2 protein sequences are collected from the NCBI database, and 173 species with >60% sequence identity compared with that of human beings are selected for further analysis. We find that 14 polar residues forming the binding interface of the ACE2/2019-nCoV-Spike complex play an important role in maintaining protein-protein stability. Among them, 8 polar residues at the same positions as those of human ACE2 are highly conserved, which ensures its basic binding affinity with the novel coronavirus. Five of the other 6 unconserved polar residues (positions in human ACE2: Q24, D30, K31, H34 and E35) are shown to have an effect on the binding patterns among species. We select 21 species in close contact with human beings, construct their ACE2 three-dimensional structures by homology modeling and calculate the binding free energies of their ACE2/2019-nCoV-Spike complexes. We find that the ACE2 proteins from all 21 species are capable of binding the novel coronavirus. Compared with human beings, 8 species (cow, deer, cynomys, chimpanzee, monkey, sheep, dolphin and whale) present almost the same binding abilities, and 3 species (bat, pig and dog) show significant improvements in binding affinities. We hope this research could provide significant help for future epidemic detection, drug and vaccine development and even global ecosystem protection.


Subject(s)
Angiotensin-Converting Enzyme 2/metabolism , SARS-CoV-2/metabolism , Animals , Humans , Protein Binding , Species Specificity , Spike Glycoprotein, Coronavirus/metabolism
16.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33866352

ABSTRACT

The prediction of genes related to diseases is important to the study of the diseases due to high cost and time consumption of biological experiments. Network propagation is a popular strategy for disease-gene prediction. However, existing methods focus on the stable solution of dynamics while ignoring the useful information hidden in the dynamical process, and it is still a challenge to make use of multiple types of physical/functional relationships between proteins/genes to effectively predict disease-related genes. Therefore, we proposed a framework of network impulsive dynamics on multiplex biological network (NIDM) to predict disease-related genes, along with four variants of NIDM models and four kinds of impulsive dynamical signatures (IDSs). NIDM is to identify disease-related genes by mining the dynamical responses of nodes to impulsive signals being exerted at specific nodes. By a series of experimental evaluations in various types of biological networks, we confirmed the advantage of multiplex network and the important roles of functional associations in disease-gene prediction, demonstrated superior performance of NIDM compared with four types of network-based algorithms and then gave the effective recommendations of NIDM models and IDS signatures. To facilitate the prioritization and analysis of (candidate) genes associated to specific diseases, we developed a user-friendly web server, which provides three kinds of filtering patterns for genes, network visualization, enrichment analysis and a wealth of external links (http://bioinformatics.csu.edu.cn/DGP/NID.jsp). NIDM is a protocol for disease-gene prediction integrating different types of biological networks, which may become a very useful computational tool for the study of disease-related genes.


Subject(s)
Algorithms , Computational Biology/methods , Gene Regulatory Networks , Genetic Association Studies/methods , Genetic Predisposition to Disease/genetics , Proteins/genetics , Humans , Protein Interaction Maps/genetics , Proteins/metabolism , Reproducibility of Results
17.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33300547

ABSTRACT

The rapid development of single-cell RNA sequencing (scRNA-Seq) technology provides strong technical support for accurately and efficiently analyzing single-cell gene expression data. However, the analysis of scRNA-Seq data is accompanied by many obstacles, including dropout events and the curse of dimensionality. Here, we propose scGMAI, a new single-cell Gaussian mixture clustering method based on autoencoder networks and fast independent component analysis (FastICA). Specifically, scGMAI utilizes autoencoder networks to reconstruct gene expression values from scRNA-Seq data, and FastICA is used to reduce the dimensions of the reconstructed data. The integration of these computational techniques in scGMAI outperforms existing tools, including Seurat, in clustering cells from 17 public scRNA-Seq datasets. In summary, scGMAI is an effective tool for accurately clustering and identifying cell types from scRNA-Seq data and shows great potential for application in scRNA-Seq data analysis. The source code is available at https://github.com/QUST-AIBBDRC/scGMAI/.


Subject(s)
Algorithms , RNA-Seq , Single-Cell Analysis , Software
18.
Methods ; 205: 114-122, 2022 09.
Article in English | MEDLINE | ID: mdl-35777719

ABSTRACT

The rapid development of single-cell sequencing technologies makes it possible to analyze cellular heterogeneity at the single-cell level. Cell clustering is one of the most fundamental and common steps in heterogeneity analysis. However, due to the high noise level, high dimensionality and high sparsity, accurate cell clustering is still challenging. Here, we present DeepCI, a new clustering approach for scRNA-seq data. Using two autoencoders to obtain the cell embedding and the gene embedding, DeepCI can simultaneously learn a low-dimensional cell representation and the clustering. In addition, the recovered gene expression matrix can be obtained by the matrix multiplication of the cell and gene embeddings. To evaluate the performance of DeepCI, we applied it to several real scRNA-seq datasets for clustering and visualization analysis. The experimental results show that DeepCI obtains better overall performance than several popular single-cell analysis methods. We also evaluated the imputation performance of DeepCI in a dedicated experiment. The corresponding results show that the imputed gene expression of known specific marker genes can greatly improve the accuracy of cell type classification.
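
The recovery step stated in this abstract, a matrix product of cell and gene embeddings, can be shown with a tiny worked example. The embedding values are made up, and the orientation (cells × latent dimensions times latent dimensions × genes) is an assumption, since the abstract does not specify the shapes.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Illustrative values only: 2 cells x 2 latent dimensions.
cell_embedding = [[1.0, 0.0],
                  [0.5, 0.5]]
# Assumed orientation: 2 latent dimensions x 3 genes.
gene_embedding = [[2.0, 0.0, 4.0],
                  [0.0, 6.0, 2.0]]

# Recovered expression matrix: 2 cells x 3 genes.
recovered = matmul(cell_embedding, gene_embedding)
print(recovered)  # [[2.0, 0.0, 4.0], [1.0, 3.0, 3.0]]
```

Because every entry of the product is a dot product of a cell vector and a gene vector, positions that were zero in the raw data can receive nonzero values, which is what makes the recovered matrix usable for imputation.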


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Cluster Analysis , RNA-Seq , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
19.
Surg Radiol Anat ; 45(3): 241-246, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36715709

ABSTRACT

OBJECTIVE: The purpose of this study was to investigate the morphological classification and clinical significance of the vertebral artery sulcus of the atlas based on CT three-dimensional reconstruction. METHODS: Three-dimensional reconstruction images of 300 adult atlases were collected. A total of 600 atlas vertebral artery sulci were selected in this study. The parameters required for placement of a C1 pedicle screw, including depth of grinding drilling (ao), width (cd), length (ab), height (H), lateral wall thickness (L1), inner wall thickness (L2), medial angle (∠α), and the cephalad angle to the transverse plane of the atlas pedicle (∠β), were measured. RESULTS: CT three-dimensional reconstruction images showed that there were five types of atlas vertebral artery sulci: no process type (n = 494, 82.33%), upper process type (n = 29, 4.83%), lower process type (n = 25, 4.17%), double process type (n = 19, 3.17%), and posterior ring type (n = 33, 5.50%). One-way ANOVA tests showed that the five groups differed significantly in the parameters ao, L2, H, ∠α and ∠β. One-way ANOVA with LSD post hoc tests showed that the parameter ao of the no process type group was less than that of the upper or lower process type group (P < 0.05), and ao of the lower process or posterior ring type group was less than that of the upper process type group (P < 0.05). The parameter ao of the male group was larger than that of the female group. CONCLUSION: The no process type of the atlas vertebral artery sulcus was the most common, and the medial angle and cephalad angle of the atlas pedicle in this type were the smallest. When pedicle screws are inserted, these two angles should not be too large. The ao of males was larger than that of females. All these findings should be considered to avoid deviation of the screw track.
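
The reported proportions can be checked against the stated total of 600 sulci. The short calculation below uses only the counts given in the abstract; the dictionary keys are shorthand labels.

```python
# Counts per sulcus type, as reported in the abstract.
counts = {
    "no process": 494,
    "upper process": 29,
    "lower process": 25,
    "double process": 19,
    "posterior ring": 33,
}

total = sum(counts.values())
# Percentage of all sulci in each type, rounded to two decimals.
percentages = {name: round(100 * n / total, 2) for name, n in counts.items()}

print(total)        # 600
print(percentages)  # {'no process': 82.33, 'upper process': 4.83, 'lower process': 4.17, 'double process': 3.17, 'posterior ring': 5.5}
```

The five counts sum to the 600 sulci (two per atlas in 300 atlases), and each rounded percentage matches the value reported.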


Subject(s)
Pedicle Screws , Vertebral Artery , Adult , Humans , Male , Female , Vertebral Artery/diagnostic imaging , Imaging, Three-Dimensional , Clinical Relevance , Tomography, X-Ray Computed
20.
Brief Bioinform ; 21(2): 566-583, 2020 03 23.
Article in English | MEDLINE | ID: mdl-30776072

ABSTRACT

Genes that are thought to be critical for the survival of organisms or cells are called essential genes. The prediction of essential genes and their products (essential proteins) is of great value in exploring the mechanisms of complex diseases, studying the minimal genome required for living cells and developing new drug targets. As laboratory methods are often complicated, costly and time-consuming, a great many computational methods have been proposed to identify essential genes/proteins at the network level, aided by the in-depth understanding of network biology and the rapid development of biotechnologies. By analyzing the topological characteristics of essential genes/proteins in protein-protein interaction networks (PINs), integrating biological information and considering the dynamic features of PINs, network-based methods have proved to be effective in the identification of essential genes/proteins. In this paper, we survey the advanced methods for network-based prediction of essential genes/proteins and present the challenges and directions for future research.


Subject(s)
Genes, Essential , Proteins/chemistry , Computational Biology/methods , Genome , Surveys and Questionnaires