Results 1 - 20 of 31
1.
Neural Netw ; 172: 106151, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38301339

ABSTRACT

Representation learning on temporal interaction graphs (TIG) aims to model complex networks with the dynamic evolution of interactions on a wide range of web and social graph applications. However, most existing works on TIG either (a) rely on node embeddings that are updated discretely, only when an interaction occurs, and therefore fail to capture the continuous evolution of the nodes' embedding trajectories, or (b) overlook the rich temporal patterns hidden in the ever-changing graph data, which presumably leads to sub-optimal models. In this paper, we propose ConTIG, a novel two-module framework for representation learning on TIG that captures the continuous dynamic evolution of node embedding trajectories. With two essential modules, our model exploits three factors in dynamic networks: the latest interaction, neighbor features, and inherent characteristics. In the first (update) module, we employ a continuous inference block to learn the nodes' state trajectories from time-adjacent interaction patterns using ordinary differential equations. In the second (transform) module, we introduce a self-attention mechanism to predict future node embeddings by aggregating historical temporal interaction information. Experimental results on four datasets demonstrate the superiority of ConTIG over a range of state-of-the-art baselines on temporal link prediction, temporal node recommendation, and dynamic node classification tasks, especially for long-interval interaction prediction.


Subjects
Machine Learning
2.
BMC Bioinformatics ; 24(1): 387, 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37821827

ABSTRACT

BACKGROUND: Metagenomic sequencing is an unbiased approach that can potentially detect all known and unidentified strains in pathogen detection. Recently, nanopore sequencing has emerged as a highly promising tool for rapid pathogen detection due to its fast turnaround time. However, identifying pathogens within a species is nontrivial for nanopore sequencing data due to the high sequencing error rate. RESULTS: We developed the core gene alleles metagenome strain identification (cgMSI) tool, which uses a two-stage maximum a posteriori probability estimation method to detect pathogens at the strain level from nanopore metagenomic sequencing data at low computational cost. The cgMSI tool can accurately identify strains and estimate relative abundance at 1× coverage. CONCLUSIONS: We developed cgMSI for nanopore metagenomic pathogen detection within species. cgMSI is available at https://github.com/ZHU-XU-xmu/cgMSI.


Subjects
Nanopore Sequencing; Nanopores; Metagenome; Alleles; Metagenomics/methods; High-Throughput Nucleotide Sequencing/methods
3.
Research (Wash D C) ; 6: 0179, 2023.
Article in English | MEDLINE | ID: mdl-37377457

ABSTRACT

Data-independent acquisition (DIA) technology for protein identification from mass spectrometry, and its related algorithms, are developing rapidly. The spectrum-centric analysis of DIA data without the use of a spectral library from data-dependent acquisition data represents a promising direction. In this paper, we propose an untargeted analysis method, Dear-DIA-XMBD, for direct analysis of DIA data. Dear-DIA-XMBD first integrates the deep variational autoencoder and triplet loss to learn the representations of the extracted fragment ion chromatograms, then uses the k-means clustering algorithm to aggregate fragments with similar representations into the same classes, and finally establishes the inverted index tables to determine the precursors of fragment clusters between precursors and peptides and between fragments and peptides. We show that Dear-DIA-XMBD achieves superior performance on highly complicated DIA data from different species obtained on different instrument platforms. Dear-DIA-XMBD is publicly available at https://github.com/jianweishuai/Dear-DIA-XMBD.
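
Illustrative note on the abstract above: the two generic building blocks it names, a triplet loss on learned representations and k-means clustering of those representations, can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the hinge form of the triplet loss, the Euclidean distance, the margin value, the toy 2-D "embeddings", and the helper names `triplet_loss` and `kmeans` are all hypothetical choices made for illustration; the variational autoencoder and the inverted index tables are not shown.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge-form triplet loss (assumed form, not taken from the paper).

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def kmeans(points, centroids, n_iter=10):
    """Plain Lloyd's algorithm starting from the given initial centroids."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: euclidean(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy stand-ins for learned chromatogram representations (hypothetical data).
embeddings = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (2.0, 2.1), (2.1, 2.0)]
labels, centers = kmeans(embeddings, centroids=[embeddings[0], embeddings[3]])
```

In this toy setting the first three points end up in one cluster and the last two in another; with real data the number of clusters and the initialization would need to be chosen with care.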

4.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37204192

ABSTRACT

Accurately predicting the antigen-binding specificity of adaptive immune receptors (AIRs), such as T-cell receptors (TCRs) and B-cell receptors (BCRs), is essential for discovering new immune therapies. However, the diversity of AIR chain sequences limits the accuracy of current prediction methods. This study introduces SC-AIR-BERT, a pre-trained model that learns comprehensive sequence representations of paired AIR chains to improve binding specificity prediction. SC-AIR-BERT first learns the 'language' of AIR sequences through self-supervised pre-training on a large cohort of paired AIR chains from multiple single-cell resources. The model is then fine-tuned with a multilayer perceptron head for binding specificity prediction, employing the K-mer strategy to enhance sequence representation learning. Extensive experiments demonstrate the superior AUC performance of SC-AIR-BERT compared with current methods for TCR- and BCR-binding specificity prediction.


Subjects
Receptors, Antigen, B-Cell; Receptors, Antigen, T-Cell; Humans; Receptors, Antigen, T-Cell/genetics; Receptors, Antigen, B-Cell/genetics; Neural Networks, Computer; Antibody Specificity
5.
IEEE Trans Med Imaging ; 42(8): 2462-2473, 2023 08.
Article in English | MEDLINE | ID: mdl-37028064

ABSTRACT

Cancer survival prediction requires exploiting related multimodal information (e.g., pathological, clinical and genomic features) and is even more challenging in clinical practice due to the incompleteness of patients' multimodal data. Furthermore, existing methods lack sufficient intra- and inter-modal interactions, and suffer from significant performance degradation caused by missing modalities. This manuscript proposes a novel hybrid graph convolutional network, entitled HGCN, which is equipped with an online masked autoencoder paradigm for robust multimodal cancer survival prediction. In particular, we pioneer modeling a patient's multimodal data as flexible and interpretable multimodal graphs with modality-specific preprocessing. HGCN integrates the advantages of graph convolutional networks (GCNs) and a hypergraph convolutional network (HCN) through node message passing and a hyperedge mixing mechanism to facilitate intra-modal and inter-modal interactions between multimodal graphs. With HGCN, the potential for multimodal data to create more reliable predictions of a patient's survival risk is dramatically increased compared to prior methods. Most importantly, to compensate for missing patient modalities in clinical scenarios, we incorporated an online masked autoencoder paradigm into HGCN, which can effectively capture intrinsic dependence between modalities and seamlessly generate missing hyperedges for model inference. Extensive experiments and analysis on six cancer cohorts from TCGA show that our method significantly outperforms state-of-the-art methods in both complete and missing modal settings. Our code is available at https://github.com/lin-lcx/HGCN.


Subjects
Genomics; Neoplasms; Humans; Neoplasms/diagnostic imaging
6.
IEEE Trans Med Imaging ; 42(5): 1337-1348, 2023 05.
Article in English | MEDLINE | ID: mdl-37015475

ABSTRACT

Multi-instance learning (MIL) is widely adopted for automatic whole slide image (WSI) analysis and it usually consists of two stages, i.e., instance feature extraction and feature aggregation. However, due to the "weak supervision" of slide-level labels, the feature aggregation stage would suffer from severe over-fitting in training an effective MIL model. In this case, mining more information from limited slide-level data is pivotal to WSI analysis. Different from previous works on improving instance feature extraction, this paper investigates how to exploit the latent relationship of different instances (patches) to combat overfitting in MIL for more generalizable WSI classification. In particular, we propose a novel Multi-instance Reinforcement Contrastive Learning framework (MuRCL) to deeply mine the inherent semantic relationships of different patches to advance WSI classification. Specifically, the proposed framework is first trained in a self-supervised manner and then fine-tuned with WSI slide-level labels. We formulate the first stage as a contrastive learning (CL) process, where positive/negative discriminative feature sets are constructed from the same patch-level feature bags of WSIs. To facilitate the CL training, we design a novel reinforcement learning-based agent to progressively update the selection of discriminative feature sets according to an online reward for slide-level feature aggregation. Then, we further update the model with labeled WSI data to regularize the learned features for the final WSI classification. Experimental results on three public WSI classification datasets (Camelyon16, TCGA-Lung and TCGA-Kidney) demonstrate that the proposed MuRCL outperforms state-of-the-art MIL models. In addition, MuRCL can achieve comparable performance to other state-of-the-art MIL models on the TCGA-Esca dataset.


Subjects
Image Processing, Computer-Assisted; Supervised Machine Learning; Humans; Datasets as Topic; Lung/diagnostic imaging; Kidney/diagnostic imaging
7.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-36946415

ABSTRACT

Colorectal cancer (CRC) is one of the most common gastrointestinal malignancies. There are few recurrence risk signatures for CRC patients. Single-cell RNA-sequencing (scRNA-seq) provides a high-resolution platform for prognostic signature detection. However, scRNA-seq is not practical in large cohorts due to its high cost, and most single-cell experiments lack clinical phenotype information. Few studies have used an external bulk transcriptome with survival time to guide the detection of key cell subtypes in scRNA-seq data. We propose scRank-XMBD, a computational framework to prioritize prognostic-associated cell subpopulations based on within-cell relative expression orderings of gene pairs from single-cell transcriptomes. scRank-XMBD achieves higher precision and concordance compared with five existing methods. Moreover, we developed single-cell gene pair signatures to predict recurrence risk for patients individually. Our work facilitates the application of the rank-based method in scRNA-seq data for prognostic biomarker discovery and precision oncology. scRank-XMBD is available at https://github.com/xmuyulab/scRank-XMBD. (XMBD: Xiamen Big Data, a biomedical open software initiative in the National Institute for Data Science in Health and Medicine, Xiamen University, China.)


Subjects
Colorectal Neoplasms; Transcriptome; Humans; Gene Expression Profiling/methods; Prognosis; Precision Medicine; Software; Colorectal Neoplasms/genetics; Sequence Analysis, RNA
8.
Cell ; 186(3): 591-606.e23, 2023 02 02.
Article in English | MEDLINE | ID: mdl-36669483

ABSTRACT

Dysregulation of the immune system is a cardinal feature of opioid addiction. Here, we characterize the landscape of peripheral immune cells from patients with opioid use disorder and from healthy controls. Opioid-associated blood exhibited an abnormal distribution of immune cells characterized by a significant expansion of fragile-like regulatory T cells (Tregs), which was positively correlated with the withdrawal score. Analogously, opioid-treated mice also showed enhanced Treg-derived interferon-γ (IFN-γ) expression. IFN-γ signaling reshaped synaptic morphology in nucleus accumbens (NAc) neurons, modulating subsequent withdrawal symptoms. We demonstrate that opioids increase the expression of neuron-derived C-C motif chemokine ligand 2 (Ccl2) and disrupt blood-brain barrier (BBB) integrity through the downregulation of astrocyte-derived fatty-acid-binding protein 7 (Fabp7), both of which triggered peripheral Treg infiltration into the NAc. Our study demonstrates that opioids drive the expansion of fragile-like Tregs and favor peripheral Treg diapedesis across the BBB, which leads to IFN-γ-mediated synaptic instability and subsequent withdrawal symptoms.


Subjects
Interferon-gamma; Opioid-Related Disorders; Substance Withdrawal Syndrome; T-Lymphocytes, Regulatory; Animals; Mice; Analgesics, Opioid/administration & dosage; Interferon-gamma/metabolism; Opioid-Related Disorders/metabolism; Opioid-Related Disorders/pathology
9.
Commun Med (Lond) ; 2: 131, 2022.
Article in English | MEDLINE | ID: mdl-36281356

ABSTRACT

Background: Single-cell technologies have enabled extensive analysis of complex immune composition, phenotype and interactions within tumors, which is crucial in understanding the mechanisms behind cancer progression and treatment resistance. Unfortunately, knowledge of cell phenotypes and their spatial interactions has so far had only limited impact on the pathological stratification of patients in the clinic. We explore the relationship between different tumor environments (TMEs) and response to immunotherapy by deciphering the composition and spatial relationships of different cell types. Methods: Here we used imaging mass cytometry to simultaneously quantify 35 proteins in a spatially resolved manner on tumor tissues from 26 melanoma patients receiving anti-programmed cell death-1 (anti-PD-1) therapy. Using unsupervised clustering, we profiled 662,266 single cells to identify lymphocytes, myeloid-derived monocytes, stromal and tumor cells, and characterized the TMEs of different melanomas. Results: Combined single-cell and spatial analysis reveals highly dynamic TMEs that are characterized by variable tumor and immune cell phenotypes and their spatial organizations in melanomas, and many of these multicellular features are associated with response to anti-PD-1 therapy. We further identify six distinct TME archetypes based on their multicellular compositions, and find that patients with different TME archetypes responded differently to anti-PD-1 therapy. Finally, we find that classifying patients based on the gene expression signature derived from TME archetypes predicts anti-PD-1 therapy response across multiple validation cohorts. Conclusions: Our results demonstrate the utility of multiplex proteomic imaging technologies in studying complex molecular events in a spatially resolved manner for the development of new strategies for patient stratification and treatment outcome prediction.

10.
STAR Protoc ; 3(3): 101587, 2022 09 16.
Article in English | MEDLINE | ID: mdl-35942344

ABSTRACT

Computational protocols for cell type deconvolution from bulk RNA-seq data have been used to understand cellular heterogeneity in disease-related samples, but their performance can be impacted by batch effects among datasets. Here, we present a DAISM-DNN protocol to achieve robust cell type proportion estimation on the target dataset. We describe the preparation of calibrated samples from human blood samples. We then detail steps to train a dataset-specific deep neural network (DNN) model and to estimate cell type proportions using the trained model. For complete details on the use and execution of this protocol, please refer to Lin et al. (2022).


Subjects
Neural Networks, Computer; Humans; RNA-Seq
11.
Patterns (N Y) ; 3(3): 100440, 2022 Mar 11.
Article in English | MEDLINE | ID: mdl-35510186

ABSTRACT

Understanding the immune cell abundance of cancer and other disease-related tissues has an important role in guiding disease treatments. Computational cell type proportion estimation methods have been previously developed to derive such information from bulk RNA sequencing data. Unfortunately, our results show that the performance of these methods can be seriously impaired by the mismatch between training data and real-world data. To tackle this issue, we propose the DAISM-DNN-XMBD pipeline (denoted as DAISM-DNN; XMBD: Xiamen Big Data, a biomedical open software initiative in the National Institute for Data Science in Health and Medicine, Xiamen University, China), which trains a deep neural network (DNN) with dataset-specific training data populated from a certain amount of calibrated samples using DAISM, a novel data augmentation method with an in silico mixing strategy. The evaluation results demonstrate that the DAISM-DNN pipeline outperforms other existing methods consistently and substantially for all the cell types under evaluation in real-world datasets.
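
Illustrative note on the abstract above: the idea of populating training data by mixing calibrated samples in silico can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the DAISM algorithm itself: the abstract does not give the mixing formula, so the convex-combination scheme, the random weights, the toy expression values, and the function name `mix_in_silico` are all hypothetical. It assumes one calibrated sample with known cell type proportions and one purified expression profile per cell type.

```python
import random


def mix_in_silico(calibrated_expr, calibrated_props, purified_exprs, rng):
    """Create one synthetic training sample (hypothetical mixing scheme).

    A calibrated sample with known proportions is blended with purified
    cell-type profiles using random weights that sum to 1. The label is the
    matching blend of the known proportions and the purified weights.
    """
    cell_types = list(purified_exprs)
    raw = [rng.random() for _ in range(len(cell_types) + 1)]
    total = sum(raw)
    w_cal = raw[0] / total
    w_pure = [w / total for w in raw[1:]]

    n_genes = len(calibrated_expr)
    expr = [
        w_cal * calibrated_expr[g]
        + sum(w * purified_exprs[ct][g] for w, ct in zip(w_pure, cell_types))
        for g in range(n_genes)
    ]
    props = {
        ct: w_cal * calibrated_props.get(ct, 0.0) + w
        for w, ct in zip(w_pure, cell_types)
    }
    return expr, props


# Toy data (hypothetical): three genes, two cell types.
calibrated_expr = [10.0, 4.0, 1.0]
calibrated_props = {"T cell": 0.7, "B cell": 0.3}
purified_exprs = {"T cell": [12.0, 2.0, 0.5], "B cell": [3.0, 9.0, 2.0]}

rng = random.Random(0)
mixed_expr, mixed_props = mix_in_silico(
    calibrated_expr, calibrated_props, purified_exprs, rng
)
```

Because the weights sum to 1 and the calibrated proportions sum to 1, the generated proportion labels also sum to 1, which is what a downstream regressor such as a DNN would be trained to predict.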

12.
J Biomed Inform ; 130: 104093, 2022 06.
Article in English | MEDLINE | ID: mdl-35537690

ABSTRACT

Random noise, sampling biases, and batch effects often confound true biological variation in single-cell RNA-sequencing (scRNA-seq) data. Adjusting for such biases is key to robust discoveries in downstream analyses, such as cell clustering, gene selection and data integration. Here we propose a model-based downsampling algorithm based on minimal unbiased representative points (MURP-XMBD). MURP-XMBD is designed to retrieve a set of representative points by reducing gene-wise random independent errors while retaining the covariance structure of biological origin, and hence provides an unbiased representation of the cell population. Subsequent validation using benchmark datasets shows that MURP-XMBD can improve the quality and accuracy of clustering algorithms, and thus facilitate the discovery of new cell types. In addition, MURP-XMBD improves the performance of dataset integration algorithms. In summary, MURP-XMBD serves as a useful noise-reduction method for single-cell sequencing analysis in biomedical studies.


Subjects
Single-Cell Analysis; Transcriptome; Algorithms; Cluster Analysis; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods
13.
Patterns (N Y) ; 3(5): 100509, 2022 May 13.
Article in English | MEDLINE | ID: mdl-35607625

ABSTRACT

There is an increasing risk of people using advanced artificial intelligence, particularly generative adversarial networks (GANs), to manipulate scientific images for publication. We demonstrate this possibility by using a GAN to fabricate several different types of biomedical images, and we discuss possible ways to detect and prevent such scientific misconduct in research communities.

14.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35368072

ABSTRACT

Liquid chromatography-mass spectrometry-based quantitative proteomics can measure the expression of thousands of proteins from biological samples and has been increasingly applied in cancer research. Identifying differentially expressed proteins (DEPs) between tumors and normal controls is commonly used to investigate carcinogenesis mechanisms. While differential expression analysis (DEA) at an individual level is desired to identify patient-specific molecular defects for better patient stratification, most statistical DEP analysis methods only identify deregulated proteins at the population level. To date, robust individualized DEA algorithms have been proposed for ribonucleic acid data, but their performance on proteomics data is underexplored. Herein, we performed a systematic evaluation of five individualized DEA algorithms for proteins on cancer proteomic datasets from seven cancer types. Results show that the within-sample relative expression orderings (REOs) of protein pairs in normal tissues were highly stable, providing the basis for individualized DEA for proteins using REOs. Moreover, individualized DEA algorithms achieve higher precision in detecting sample-specific deregulated proteins than population-level methods. To facilitate the utilization of individualized DEA algorithms in proteomics for prognostic biomarker discovery and personalized medicine, we provide IDEPA-XMBD (Individualized DEP Analysis; XMBD: Xiamen Big Data, a biomedical open software initiative in the National Institute for Data Science in Health and Medicine, Xiamen University, China) (https://github.com/xmuyulab/IDEPA-XMBD), a user-friendly and open-source Python toolkit that integrates individualized DEA algorithms for DEP-associated deregulation pattern recognition.


Subjects
Neoplasms; Proteome; Humans; Mass Spectrometry/methods; Neoplasms/genetics; Proteome/analysis; Proteomics/methods; Software
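
Illustrative note on the record above: the core idea of within-sample relative expression orderings (REOs) can be sketched as two steps, finding protein pairs whose ordering is stable across normal samples and then listing the stable pairs that are reversed in a single disease sample. This is a minimal sketch under stated assumptions and not the IDEPA-XMBD API: the function names `stable_pairs` and `reversed_pairs`, the 99% stability threshold, and the toy abundance values are hypothetical, and the statistical tests that real individualized DEA algorithms apply on top of the reversed pairs are not shown.

```python
from itertools import combinations


def stable_pairs(normal_samples, min_fraction=0.99):
    """Return ordered pairs (a, b) where a > b in at least `min_fraction`
    of the normal samples. Each sample is a dict: protein -> abundance."""
    proteins = sorted(normal_samples[0])
    n = len(normal_samples)
    pairs = set()
    for a, b in combinations(proteins, 2):
        higher = sum(1 for s in normal_samples if s[a] > s[b])
        lower = sum(1 for s in normal_samples if s[a] < s[b])
        if higher / n >= min_fraction:
            pairs.add((a, b))
        elif lower / n >= min_fraction:
            pairs.add((b, a))
    return pairs


def reversed_pairs(sample, pairs):
    """Stable pairs (a, b) whose ordering is flipped (a < b) in one sample."""
    return sorted((a, b) for a, b in pairs if sample[a] < sample[b])


# Toy abundances (hypothetical values).
normals = [
    {"P1": 9.0, "P2": 5.0, "P3": 1.0},
    {"P1": 8.5, "P2": 5.5, "P3": 2.0},
    {"P1": 9.2, "P2": 4.8, "P3": 1.5},
]
tumor = {"P1": 3.0, "P2": 5.1, "P3": 1.2}

background = stable_pairs(normals)
flipped = reversed_pairs(tumor, background)
```

Because only the ordering within one sample is used, the comparison does not depend on between-sample normalization, which is one reason rank-based approaches are attractive for individual-level analysis.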
16.
Front Genet ; 12: 721229, 2021.
Article in English | MEDLINE | ID: mdl-34603385

ABSTRACT

Highly multiplexed imaging technology is a powerful tool to facilitate understanding the composition and interactions of cells in tumor microenvironments at subcellular resolution, which is crucial for both basic research and clinical applications. Imaging mass cytometry (IMC), a recently introduced multiplex imaging method, can measure up to 100 markers simultaneously in one tissue section by using a high-resolution laser with a mass cytometer. However, due to its high resolution and large number of channels, how to process and interpret the image data from IMC remains a key challenge to its further applications. Accurate and reliable single cell segmentation is the first and a critical step in processing IMC image data. Unfortunately, existing segmentation pipelines either produce inaccurate cell segmentation results or require manual annotation, which is very time-consuming. Here, we developed Dice-XMBD, a Deep learnIng-based Cell sEgmentation algorithm for tissue multiplexed imaging data. In comparison with other state-of-the-art cell segmentation methods currently used for IMC images, Dice-XMBD generates more accurate single cell masks efficiently on IMC images produced with different nuclear, membrane, and cytoplasm markers. All code and datasets are available at https://github.com/xmuyulab/Dice-XMBD.

17.
Commun Biol ; 4(1): 1190, 2021 10 14.
Article in English | MEDLINE | ID: mdl-34650228

ABSTRACT

We developed DreamDIA-XMBD (denoted as DreamDIA), a software suite based on a deep representation model for data-independent acquisition (DIA) data analysis. DreamDIA adopts a data-driven strategy to capture comprehensive information from elution patterns of peptides in DIA data and achieves considerable improvements on both identification and quantification performance compared with other state-of-the-art methods such as OpenSWATH, Skyline and DIA-NN. Specifically, in contrast to existing methods, which use only 6 to 10 selected fragment ions from spectral libraries, DreamDIA extracts additional features from hundreds of theoretical elution profiles originating from different ions of each precursor using a deep representation network. To achieve higher coverage of target peptides without sacrificing specificity, the extracted features are further processed by nonlinear discriminative models under the framework of positive-unlabeled learning with decoy peptides as affirmative negative controls. DreamDIA is publicly available at https://github.com/xmuyulab/DreamDIA-XMBD for high-coverage, high-accuracy DIA data analysis.


Subjects
Peptides/analysis; Proteomics/methods; Software
19.
Cancer Res ; 81(16): 4205-4217, 2021 08 15.
Article in English | MEDLINE | ID: mdl-34215622

ABSTRACT

The somatic landscape of the cancer genome results from different mutational processes represented by distinct "mutational signatures." Although several mutagenic mechanisms are known to cause specific mutational signatures in cell lines, the variation of somatic mutational activities in patients, which is mostly attributed to somatic selection, is still poorly explained. Here, we introduce a quantitative trait, mutational propensity (MP), and describe an integrated method to infer genetic determinants of variations in the mutational processes in 3,566 cancers with specific underlying mechanisms. As a result, we report 2,314 candidate determinants with both significant germline and somatic effects on somatic selection of mutational processes, of which, 485 act via cancer gene expression and 1,427 act through the tumor-immune microenvironment. These data demonstrate that the genetic determinants of MPs provide complementary information to known cancer driver genes, clonal evolution, and clinical biomarkers. SIGNIFICANCE: The genetic determinants of the somatic mutational processes in cancer elucidate the biology underlying somatic selection and evolution of cancers and demonstrate complementary predictive power across cancer types.


Subjects
DNA Mutational Analysis; Genetic Predisposition to Disease; Mutation; Neoplasms/genetics; Clonal Evolution; Computational Biology; Genes, Neoplasm; Genetic Variation; Genome, Human; Genomics; Humans; Models, Genetic; Normal Distribution; Oncogenes; Phenotype; Proteomics; Regression Analysis; Tumor Microenvironment; User-Computer Interface
20.
Med Image Anal ; 72: 102092, 2021 08.
Article in English | MEDLINE | ID: mdl-34030101

ABSTRACT

Automatic surveillance of early neoplasia in Barrett's esophagus (BE) is of great significance for improving the survival rate of esophageal cancer. It remains, however, a challenging task due to (1) the large variation of early neoplasia, (2) the existence of hard mimics, (3) the complicated anatomical and lighting environment in endoscopic images, and (4) the intrinsic real-time requirement of this application. We propose a novel end-to-end network equipped with an attentive hierarchical aggregation module and a self-distillation mechanism to comprehensively address these challenges. The hierarchical aggregation module is proposed to capture the complementarity of adjacent layers and hence strengthen the representation capability of each aggregated feature. Meanwhile, an attention mask is developed to selectively integrate the logits of each feature, which not only improves the prediction accuracy but also enhances the prediction interpretability. Furthermore, an efficient self-distillation mechanism is implemented based on a teacher-student architecture, where the student aims at capturing abstract high-level features while the teacher is applied to bring more low-level semantic details to calibrate the classification results. The proposed techniques are effective yet lightweight, improving the classification performance without sacrificing time performance, and thus achieving real-time inference. We extensively evaluate the proposed method on the MICCAI EndoVis Challenge Dataset. Experimental results demonstrate that the proposed method can achieve competitive accuracy at a much faster speed than state-of-the-art methods.


Subjects
Barrett Esophagus; Esophageal Neoplasms; Attention; Barrett Esophagus/diagnostic imaging; Esophageal Neoplasms/diagnostic imaging; Humans