Search | Biblioteca Virtual em Saúde (Virtual Health Library)

scPLAN: a hierarchical computational framework for single transcriptomics data annotation, integration and cell-type label refinement.

Guo, Qirui; Yuan, Musu; Zhang, Lei; Deng, Minghua.

Brief Bioinform ; 25(4), 2024 May 23.

Article in English | MEDLINE | ID: mdl-38935069

ABSTRACT

MOTIVATION: In the past decade, single-cell RNA sequencing (scRNA-seq) has emerged as a pivotal method for transcriptomic profiling in biomedical research. Precise cell-type identification is crucial for subsequent analysis of single-cell data, and the integration and refinement of annotated data are essential for building comprehensive databases. However, prevailing annotation techniques often overlook the hierarchical organization of cell types, resulting in inconsistent annotations. Meanwhile, most existing integration approaches fail to integrate datasets with different annotation depths, and none of them can enhance the labels of outdated data with lower annotation resolutions using more finely annotated datasets or novel biological findings. RESULTS: Here, we introduce scPLAN, a hierarchical computational framework designed for scRNA-seq data analysis. scPLAN annotates unlabeled scRNA-seq data using a reference dataset structured along a hierarchical cell-type tree. It identifies potential novel cell types in a systematic, layer-by-layer manner. Additionally, scPLAN integrates annotated scRNA-seq datasets with varying levels of annotation depth, ensuring consistent refinement of cell-type labels across datasets with lower resolutions. Extensive annotation and novel cell detection experiments demonstrate the efficacy of scPLAN. Two case studies showcase how scPLAN integrates datasets with diverse cell-type label resolutions and refines their cell-type labels. AVAILABILITY: https://github.com/michaelGuo1204/scPLAN.

Subjects

Computational Biology , Gene Expression Profiling , Single-Cell Analysis , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Computational Biology/methods , Humans , Software , Transcriptome , Sequence Analysis, RNA/methods , RNA-Seq/methods , Molecular Sequence Annotation/methods
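
The layer-by-layer annotation along a cell-type tree described in the scPLAN abstract above can be illustrated with a minimal sketch. This is not scPLAN's algorithm or API: the tree, the per-type scores, the `annotate` function and the `threshold` parameter are all hypothetical, chosen only to show how a top-down walk can stop early and flag a possible novel cell type.

```python
from typing import Dict, List

# Hypothetical cell-type tree (parent -> children); names are illustrative only.
TREE: Dict[str, List[str]] = {
    "cell": ["immune", "epithelial"],
    "immune": ["T cell", "B cell"],
}


def annotate(
    scores: Dict[str, float],
    tree: Dict[str, List[str]] = TREE,
    root: str = "cell",
    threshold: float = 0.5,
) -> List[str]:
    """Walk the tree from the root one layer at a time.

    At each node the child with the highest score is chosen. If even the
    best child scores below `threshold`, the walk stops and the cell is
    flagged as a potential novel type under the current node.
    Missing scores are treated as 0.0 (an assumption of this sketch).
    """
    path = [root]
    node = root
    while node in tree:
        best = max(tree[node], key=lambda child: scores.get(child, 0.0))
        if scores.get(best, 0.0) < threshold:
            path.append("novel")
            break
        path.append(best)
        node = best
    return path


# A cell that is confidently immune but matches neither known immune subtype.
print(annotate({"immune": 0.9, "T cell": 0.3, "B cell": 0.2}))
```

Returning the whole path rather than only the leaf keeps the coarse label available, which is what a dataset annotated at a lower resolution would carry.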

A new algorithm to train hidden Markov models for biological sequences with partial labels.

Li, Jiefu; Lee, Jung-Youn; Liao, Li.

BMC Bioinformatics ; 22(1): 162, 2021 Mar 26.

Article in English | MEDLINE | ID: mdl-33771095

ABSTRACT

BACKGROUND: Hidden Markov models (HMMs) are a powerful tool for analyzing biological sequences in a wide variety of applications, from profiling functional protein families to identifying functional domains. HMMs are usually trained either by maximum likelihood using counting when sequences are labelled or by expectation maximization, such as the Baum-Welch algorithm, when sequences are unlabelled. However, there are increasingly situations where sequences are only partially labelled. In this paper, we designed a new training method based on the Baum-Welch algorithm to train HMMs for situations in which only partial labelling is available for certain biological problems. RESULTS: Compared with a similar, previously reported method designed for active learning in text mining, our method achieves significant improvements in model training, as demonstrated by higher accuracy when the trained models are tested for decoding with both synthetic data and real data. CONCLUSIONS: A novel training method is developed to improve the training of hidden Markov models by utilizing partially labelled data. The method will have an impact on detecting de novo motifs and signals in biological sequence data. In particular, the method will be deployed in active learning mode in ongoing research on detecting plasmodesmata targeting signals, and its performance will be assessed with validation from wet-lab experiments.

Subjects

Algorithms , Proteins , Computational Biology , Markov Chains , Proteins/genetics
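
To make the idea of partial labelling in the HMM abstract above concrete, here is a minimal sketch of a forward pass in which known state labels constrain the recursion. It is a generic textbook-style constraint, not the training method proposed in the paper; the function name `constrained_forward` and the toy parameters `START`, `TRANS` and `EMIT` are assumptions made for illustration.

```python
from typing import List, Optional, Sequence

# Toy two-state HMM with two observation symbols (hypothetical values).
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.2, 0.8]]   # TRANS[r][s] = P(state s | previous state r)
EMIT = [[0.7, 0.3], [0.1, 0.9]]    # EMIT[s][o]  = P(symbol o | state s)


def constrained_forward(
    obs: Sequence[int],
    labels: Sequence[Optional[int]],
    start: Sequence[float] = START,
    trans: Sequence[Sequence[float]] = TRANS,
    emit: Sequence[Sequence[float]] = EMIT,
) -> float:
    """Joint probability of the observations and the known labels.

    labels[t] is a state index where the position is labelled and None
    where it is not. Forward probabilities of states that contradict a
    known label are set to zero, so only consistent paths are summed.
    """
    n_states = len(start)

    def allowed(t: int, s: int) -> bool:
        return labels[t] is None or labels[t] == s

    alpha: List[float] = [
        start[s] * emit[s][obs[0]] if allowed(0, s) else 0.0
        for s in range(n_states)
    ]
    for t in range(1, len(obs)):
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
            if allowed(t, s) else 0.0
            for s in range(n_states)
        ]
    return sum(alpha)


# Second position is labelled as state 1; the first is left unlabelled.
print(constrained_forward(obs=[0, 1], labels=[None, 1]))
```

With every label set to `None` this reduces to the ordinary forward algorithm, and with every position labelled it returns the probability of a single path, which are the two standard training regimes the abstract contrasts.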

UFPS: A unified framework for partially annotated federated segmentation in heterogeneous data distribution.

Jiang, Le; Ma, Li Yan; Zeng, Tie Yong; Ying, Shi Hui.

Patterns (N Y) ; 5(2): 100917, 2024 Feb 09.

Article in English | MEDLINE | ID: mdl-38370123

ABSTRACT

Partially supervised segmentation is a label-saving method based on datasets in which only a fraction of the classes is labeled and the labeled class sets intersect. Its practical application in real-world medical scenarios is, however, hindered by privacy concerns and data heterogeneity. To address these issues without compromising privacy, federated partially supervised segmentation (FPSS) is formulated in this work. The primary challenges for FPSS are class heterogeneity and client drift. We propose a unified federated partially labeled segmentation (UFPS) framework to segment pixels within all classes for partially annotated datasets by training a comprehensive global model that avoids class collision. Our framework includes unified label learning (ULL) and sparse unified sharpness aware minimization (sUSAM) for class and feature space unification, respectively. Through empirical studies, we find that traditional methods in partially supervised segmentation and federated learning often struggle with class collision when combined. Our extensive experiments on real medical datasets demonstrate better deconflicting and generalization capabilities of UFPS.

Partial label learning: Taxonomy, analysis and outlook.

Tian, Yingjie; Yu, Xiaotong; Fu, Saiji.

Neural Netw ; 161: 708-734, 2023 Apr.

Article in English | MEDLINE | ID: mdl-36848826

ABSTRACT

Partial label learning (PLL) is an emerging framework in weakly supervised machine learning with broad application prospects. It handles the case in which each training example is associated with a candidate label set, exactly one label of which is the concealed ground-truth label. In this paper, we propose a novel taxonomy framework for PLL including four categories: disambiguation strategy, transformation strategy, theory-oriented strategy and extensions. We analyze and evaluate methods in each category and compile synthetic and real-world PLL datasets, all of which are hyperlinked to the source data. Future work on PLL is discussed in depth in this article based on the proposed taxonomy framework.

Subjects

Supervised Machine Learning
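
The candidate-label-set setting described in the PLL abstract above can be sketched as follows. This shows one simple disambiguation-style step under stated assumptions, not a method taken from the survey: the function `disambiguate`, the label names and the uniform fallback are hypothetical.

```python
from typing import Dict, Set


def disambiguate(probs: Dict[str, float], candidates: Set[str]) -> Dict[str, float]:
    """Restrict a model's class probabilities to the candidate label set.

    Labels outside the candidate set get weight 0.0 and the remaining
    mass is renormalised, giving soft targets over the candidates only.
    If the model puts no mass on any candidate, the weights fall back to
    uniform over the candidate set (an assumption of this sketch).
    """
    mass = sum(probs.get(label, 0.0) for label in candidates)
    if mass == 0.0:
        return {
            label: (1.0 / len(candidates) if label in candidates else 0.0)
            for label in probs
        }
    return {
        label: (p / mass if label in candidates else 0.0)
        for label, p in probs.items()
    }


# The true label is hidden somewhere in {"dog", "fox"}; "cat" is ruled out.
weights = disambiguate({"cat": 0.5, "dog": 0.3, "fox": 0.2}, {"dog", "fox"})
print(weights)
```

Alternating between fitting a classifier to such soft targets and recomputing them is one common way to iterate this step, although the choice of classifier and update rule is left open here.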
