Results 1 - 6 of 6
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701412

ABSTRACT

Trajectory inference is a crucial task in the downstream analysis of single-cell RNA sequencing, as it can reveal dynamic processes of biological development, including cell differentiation. Dimensionality reduction is an important step in the trajectory inference process. However, most existing trajectory methods rely on cell features derived from traditional dimensionality reduction methods, such as principal component analysis and uniform manifold approximation and projection. These methods are not specifically designed for trajectory inference and fail to fully leverage prior information from upstream analysis, limiting their performance. Here, we introduce scCRT, a novel dimensionality reduction model for trajectory inference. To exploit prior information and learn accurate cell representations, scCRT integrates two feature learning components: a cell-level pairwise module and a cluster-level contrastive module. The cell-level module learns accurate cell representations in a reduced-dimensionality space while maintaining the cell-cell positional relationships of the original space. The cluster-level contrastive module uses prior cell state information to aggregate similar cells, preventing excessive dispersion in the low-dimensional space. Experimental findings on 54 real and 81 synthetic datasets, 135 datasets in total, highlighted the superior performance of scCRT compared with commonly used trajectory inference methods. Additionally, an ablation study revealed that both the cell-level and cluster-level modules enhance the model's ability to learn accurate cell features, facilitating cell lineage inference. The source code of scCRT is available at https://github.com/yuchen21-web/scCRT-for-scRNA-seq.
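
As a rough illustration of the two modules this abstract describes, the sketch below (not the authors' implementation; the encoder, layer sizes, loss weighting, and all names are assumptions) pairs a cell-level term that preserves cell-cell distances after reduction with a cluster-level contrastive term that pulls cells sharing a prior cluster label together.

```python
# Minimal sketch, assuming a simple MLP encoder and unit loss weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, n_genes, n_latent=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_genes, 256), nn.ReLU(),
            nn.Linear(256, n_latent),
        )
    def forward(self, x):
        return self.net(x)

def pairwise_loss(x, z):
    """Cell-level term: match pairwise distances in latent space to input space."""
    dx = torch.cdist(x, x)
    dz = torch.cdist(z, z)
    return F.mse_loss(dz / dz.mean(), dx / dx.mean())

def cluster_contrastive_loss(z, labels, temperature=0.5):
    """Cluster-level term: supervised contrastive loss with prior cluster labels as positives."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature
    mask = (labels[:, None] == labels[None, :]).float()
    mask.fill_diagonal_(0)                                   # a cell is not its own positive
    logits = sim - 1e9 * torch.eye(len(z))                   # exclude self-similarity
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_count = mask.sum(1).clamp(min=1)
    return -(mask * log_prob).sum(1).div(pos_count).mean()

# Toy usage: 100 cells x 2000 genes with 4 prior cluster labels.
x = torch.randn(100, 2000)
labels = torch.randint(0, 4, (100,))
enc = Encoder(2000)
z = enc(x)
loss = pairwise_loss(x, z) + cluster_contrastive_loss(z, labels)
loss.backward()
```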


Subjects
Algorithms , Single-Cell Analysis , Single-Cell Analysis/methods , Humans , RNA-Seq/methods , Computational Biology/methods , Software , Sequence Analysis, RNA/methods , Animals , Single-Cell Gene Expression Analysis
2.
Comput Biol Med ; 164: 107263, 2023 09.
Article in English | MEDLINE | ID: mdl-37531858

ABSTRACT

BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) technology has revolutionized the study of cell heterogeneity and biological interpretation at the single-cell level. However, the dropout events commonly present in scRNA-seq data can markedly reduce the reliability of downstream analysis. Existing imputation methods often overlook the discrepancy between cell relationships established from dropout-contaminated data and the true relationships, so the cell representations they learn are untrustworthy and their performance is limited. METHOD: Here, we propose CL-Impute (Contrastive Learning-based Impute), a model for estimating missing genes without relying on preconstructed cell relationships. CL-Impute combines contrastive learning with a self-attention network: contrastive learning learns cell representations from each cell's own dropout-perturbed views, while the self-attention network captures cell relationships from a global perspective. RESULTS: Experimental results on four benchmark datasets, covering quantitative assessment, cell clustering, gene identification, and trajectory inference, demonstrate the superior performance of CL-Impute compared with existing state-of-the-art imputation methods. Furthermore, our experiments show that combining contrastive learning with masking-based cell augmentation enables the model to learn true latent features from noisy data with a high rate of dropout events, enhancing the reliability of imputed values. CONCLUSIONS: CL-Impute is a novel contrastive learning-based method for imputing scRNA-seq data with high dropout rates. The source code of CL-Impute is available at https://github.com/yuchen21-web/Imputation-for-scRNA-seq.
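
The following sketch (not the CL-Impute implementation; model sizes, masking rate, and all names are assumptions) illustrates the two ingredients the abstract describes: masking-based cell augmentation with a contrastive loss, and a self-attention step over cells whose attention weights can act as a learned cell-cell affinity for imputation.

```python
# Minimal sketch, assuming a two-layer encoder and multi-head self-attention over cells.
import torch
import torch.nn as nn
import torch.nn.functional as F

def mask_cells(x, rate=0.3):
    """Augment a cell x gene matrix by zeroing random entries, mimicking extra dropout."""
    return x * (torch.rand_like(x) > rate)

def info_nce(z1, z2, temperature=0.5):
    """Contrastive loss: two masked views of the same cell are positives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(len(z1))
    return F.cross_entropy(logits, targets)

n_cells, n_genes, d = 64, 500, 128
x = torch.relu(torch.randn(n_cells, n_genes))        # toy expression matrix
encoder = nn.Sequential(nn.Linear(n_genes, d), nn.ReLU(), nn.Linear(d, d))
attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)

z1, z2 = encoder(mask_cells(x)), encoder(mask_cells(x))
loss = info_nce(z1, z2)

# Self-attention over cells: the attention weights give a global cell-cell affinity
# that could be used to borrow expression from similar cells when imputing dropouts.
h, weights = attn(z1.unsqueeze(0), z1.unsqueeze(0), z1.unsqueeze(0))
imputed = weights.squeeze(0) @ x                     # attention-weighted pooling of expression
loss.backward()
```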


Subjects
Single-Cell Analysis , Single-Cell Gene Expression Analysis , Sequence Analysis, RNA/methods , Reproducibility of Results , Single-Cell Analysis/methods , Software , Gene Expression Profiling , Cluster Analysis
3.
IEEE/ACM Trans Comput Biol Bioinform ; 20(4): 2577-2586, 2023.
Article in English | MEDLINE | ID: mdl-37018664

ABSTRACT

Biomedical Named Entity Recognition (BioNER) aims to identify biomedical entities such as genes, proteins, diseases, and chemical compounds in text. However, owing to ethics and privacy concerns and the high specialization of biomedical data, BioNER suffers from a more severe shortage of high-quality labeled data than the general domain, especially at the token level. Facing extremely limited labeled biomedical data, this work studies gazetteer-based BioNER, which aims to build a BioNER system from scratch: entities must be identified in the given sentences with zero token-level annotations available for training. Previous works usually apply sequence labeling models to NER or BioNER and, in the absence of full annotations, obtain weakly labeled data from gazetteers. However, such labels are quite noisy, because every token needs a label while the entity coverage of gazetteers is limited. Here we propose to formulate BioNER as a textual entailment problem and solve it via Textual Entailment with Dynamic Contrastive learning (TEDC). TEDC not only alleviates the noisy labeling issue but also transfers knowledge from pre-trained textual entailment models. Additionally, the dynamic contrastive learning framework contrasts entities and non-entities within the same sentence, improving the model's discrimination ability. Experiments on two real-world biomedical datasets show that TEDC achieves state-of-the-art performance for gazetteer-based BioNER.
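
To make the entailment reformulation concrete, the sketch below (not TEDC itself; the checkpoint, hypothesis template, span enumeration, and threshold are illustrative assumptions) turns each candidate span and entity type into an entailment hypothesis and scores it against the sentence with an off-the-shelf NLI model.

```python
# Minimal sketch, assuming a generic MNLI checkpoint via the zero-shot pipeline.
from transformers import pipeline

nli = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
TYPES = ["gene", "protein", "disease", "chemical compound"]

def candidate_spans(tokens, max_len=3):
    """All contiguous spans up to max_len tokens; in practice a gazetteer would
    narrow these candidates down."""
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + 1 + max_len, len(tokens) + 1)):
            yield " ".join(tokens[i:j])

def entailment_ner(sentence, threshold=0.9):
    predictions = []
    for span in candidate_spans(sentence.split()):
        out = nli(sentence,
                  candidate_labels=TYPES,
                  hypothesis_template=span + " is a {} entity.",
                  multi_label=True)
        for etype, score in zip(out["labels"], out["scores"]):
            if score > threshold:
                predictions.append((span, etype, round(score, 3)))
    return predictions

print(entailment_ner("BRCA1 mutations increase the risk of breast cancer."))
```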


Subjects
Deep Learning , Proteins
4.
Front Genet ; 13: 884589, 2022.
Article in English | MEDLINE | ID: mdl-35571057

ABSTRACT

Parasites can cause enormous damage to their hosts. Studies have shown that antiparasitic peptides (APPs) can inhibit the growth and development of parasites and even kill them. Because traditional biological methods for determining the activity of antiparasitic peptides are time-consuming and costly, a method for large-scale prediction of APPs is urgently needed. We propose a computational approach called i2APP that efficiently identifies APPs using a two-step machine learning (ML) framework. First, to address the imbalance between positive and negative samples in the training set, random undersampling is used to generate a balanced training dataset. Then, physicochemical features and terminus-based features are extracted, and a first classification is performed with a Light Gradient Boosting Machine (LGBM) and a Support Vector Machine (SVM) to obtain 264-dimensional higher-level features. These features are ranked by the Maximal Information Coefficient (MIC), and the features with the largest MIC values are retained. Finally, an SVM is used for the second classification in the optimized feature space, completing the i2APP prediction model. On independent datasets, the accuracy and AUC of i2APP are 0.913 and 0.935, respectively, outperforming state-of-the-art methods. The key idea of the proposed method is that multi-level features are extracted from peptide sequences, and the higher-level features distinguish APPs from non-APPs well.
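
The sketch below is a loose analogue of the two-step pipeline, not i2APP itself: sequence feature extraction is stubbed with random data, mutual information stands in for the MIC criterion, and the feature dimensions and model settings are assumptions.

```python
# Minimal sketch: undersample, build higher-level features with LGBM + SVM,
# select informative features, classify with a second SVM.
import numpy as np
from imblearn.under_sampling import RandomUnderSampler
from lightgbm import LGBMClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 400))            # stand-in for physicochemical + terminus features
y = np.r_[np.ones(100), np.zeros(500)]     # imbalanced APP / non-APP labels

# Step 1: balance the classes by random undersampling.
X_bal, y_bal = RandomUnderSampler(random_state=0).fit_resample(X, y)
X_tr, X_te, y_tr, y_te = train_test_split(X_bal, y_bal, test_size=0.2, random_state=0)

# Step 2: first-stage classifiers contribute higher-level probability features.
lgbm = LGBMClassifier(n_estimators=100).fit(X_tr, y_tr)
svm1 = SVC(probability=True).fit(X_tr, y_tr)
def higher_level(X):
    return np.column_stack([X, lgbm.predict_proba(X)[:, 1], svm1.predict_proba(X)[:, 1]])

# Step 3: feature selection (mutual information as a proxy for MIC), then second-stage SVM.
selector = SelectKBest(mutual_info_classif, k=64).fit(higher_level(X_tr), y_tr)
svm2 = SVC().fit(selector.transform(higher_level(X_tr)), y_tr)
print("accuracy:", svm2.score(selector.transform(higher_level(X_te)), y_te))
```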

5.
Article in English | MEDLINE | ID: mdl-35286269

ABSTRACT

Fully supervised semantic segmentation has performed well in many computer vision tasks. However, it is time-consuming because training a model requires a large number of pixel-level annotated samples. Few-shot segmentation has recently become a popular approach to addressing this problem, as it requires only a handful of annotated samples to generalize to new categories. However, fully utilizing the limited samples remains an open problem. Thus, in this article, a mutually supervised few-shot segmentation network is proposed. First, feature maps from intermediate convolution layers are fused to enrich the feature representation. Second, the support image and query image are combined into a bipartite graph, and a graph attention network is adopted to avoid losing spatial information and to increase the number of support-image pixels that guide the query image segmentation. Third, the attention map of the query image is used as prior information to enhance the support image segmentation, forming a mutually supervised regime. Finally, the attention maps of the intermediate layers are fused and sent to a graph reasoning layer to infer the pixel categories. Experiments are conducted on the PASCAL VOC-5i and FSS-1000 datasets, and the results demonstrate the effectiveness and superior performance of our method compared with baseline methods.
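
The sketch below illustrates the bipartite cross-attention idea in plain PyTorch rather than the paper's network: support foreground pixels and query pixels form the two sides of an affinity matrix, which read one way yields a query attention map and read the other way highlights support pixels, so each image can supervise the other. The backbone, image sizes, and names are assumptions.

```python
# Minimal sketch of bipartite support-query affinity and mutual attention maps.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())

support_img = torch.randn(1, 3, 64, 64)
support_mask = (torch.rand(1, 1, 64, 64) > 0.7).float()    # annotated foreground mask
query_img = torch.randn(1, 3, 64, 64)

fs = backbone(support_img)                                  # (1, 64, 64, 64) feature map
fq = backbone(query_img)

# Flatten pixels; keep only support foreground pixels as one side of the bipartite graph.
fs_flat = fs.flatten(2).squeeze(0).t()                      # (H*W, 64)
fq_flat = fq.flatten(2).squeeze(0).t()
fg = fs_flat[support_mask.flatten() > 0]                    # support foreground nodes

# Bipartite affinity between query pixels and support foreground pixels.
affinity = F.normalize(fq_flat, dim=1) @ F.normalize(fg, dim=1).t()

# Query attention map: how strongly each query pixel matches the support object.
query_attn = affinity.max(dim=1).values.view(1, 1, 64, 64)

# Mutual direction: attention over query pixels highlights support foreground pixels,
# which can serve as a prior to re-supervise support segmentation.
support_fg_attn = affinity.max(dim=0).values
print(query_attn.shape, support_fg_attn.shape)
```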

6.
Sensors (Basel) ; 17(9)2017 Sep 08.
Article in English | MEDLINE | ID: mdl-28885602

ABSTRACT

Cyber-physical systems (CPS) have received much attention from both academia and industry. An increasing number of functions in CPS are provided as services, which raises an urgent question: how to recommend suitable services from the huge number available in CPS. In traditional service recommendation, collaborative filtering (CF) has been studied in academia and used in industry. However, several shortcomings limit the application of CF-based methods in CPS. One is that, under high data sparsity, CF-based methods are likely to produce inaccurate predictions. In this paper, we find that mining latent similarity relations among users or services in CPS is helpful for improving prediction accuracy. In addition, most traditional CF-based methods use only service invocation records and ignore context information such as network location, which is a typical context in CPS. We therefore propose a novel service recommendation method for CPS that uses network location as context information and contains three prediction models based on random walks. We conduct extensive experiments on two real-world datasets; the results demonstrate the effectiveness of the proposed methods and verify that network location is indeed useful in QoS prediction.
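
As a rough illustration of this idea, and not the paper's three models, the sketch below mines latent user similarities with a random walk with restart over a user-user graph built from co-invoked services and shared network locations, then predicts a missing QoS value as a similarity-weighted average. The data, graph weights, and restart settings are toy assumptions.

```python
# Minimal sketch: location-aware random walk with restart for QoS prediction.
import numpy as np

rng = np.random.default_rng(1)
n_users, n_services = 6, 8
qos = rng.uniform(0.1, 2.0, (n_users, n_services))          # e.g. response times
observed = rng.random((n_users, n_services)) > 0.6          # sparse invocation records
location = np.array([0, 0, 1, 1, 2, 2])                     # network location id per user

# User-user adjacency: co-invocation counts, boosted when users share a network location.
obs = observed.astype(float)
A = obs @ obs.T
A += 2.0 * (location[:, None] == location[None, :])
np.fill_diagonal(A, 0)
P = A / A.sum(axis=1, keepdims=True)                        # row-stochastic transition matrix

def random_walk_with_restart(P, start, restart=0.15, iters=50):
    r = np.zeros(len(P)); r[start] = 1.0
    p = r.copy()
    for _ in range(iters):
        p = (1 - restart) * P.T @ p + restart * r
    return p                                                 # affinity of other users to `start`

def predict(user, service):
    sim = random_walk_with_restart(P, user)
    mask = observed[:, service] & (np.arange(n_users) != user)
    if not mask.any():
        return qos[observed[:, service], service].mean()     # fall back to the service average
    return np.average(qos[mask, service], weights=sim[mask] + 1e-9)

print(predict(user=0, service=3))
```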


Subjects
Computer Systems , Models, Statistical , Neural Networks, Computer , Algorithms