1.
Cancer Res ; 84(11): 1915-1928, 2024 Jun 04.
Article En | MEDLINE | ID: mdl-38536129

T cells recognize tumor antigens and initiate an anticancer immune response in the very early stages of tumor development, and the antigen specificity of T cells is determined by the T-cell receptor (TCR). Therefore, monitoring changes in the TCR repertoire in peripheral blood may offer a strategy to detect various cancers at a relatively early stage. Here, we developed the deep learning framework iCanTCR to identify patients with cancer based on the TCR repertoire. The iCanTCR framework uses TCRβ sequences from an individual as an input and outputs the predicted cancer probability. The model was trained on over 2,000 publicly available TCR repertoires from 11 types of cancer and healthy controls. Analysis of several additional publicly available datasets validated the ability of iCanTCR to distinguish patients with cancer from noncancer individuals and demonstrated the capability of iCanTCR for the accurate classification of multiple cancers. Importantly, iCanTCR precisely identified individuals with early-stage cancer with an AUC of 86%. Altogether, this work provides a liquid biopsy approach to capture immune signals from peripheral blood for noninvasive cancer diagnosis. SIGNIFICANCE: Development of a deep learning-based method for multicancer detection using the TCR repertoire in the peripheral blood establishes the potential of evaluating circulating immune signals for noninvasive early cancer detection.


Deep Learning , Early Detection of Cancer , Neoplasms , Receptors, Antigen, T-Cell , Humans , Neoplasms/immunology , Neoplasms/blood , Neoplasms/diagnosis , Receptors, Antigen, T-Cell/immunology , Early Detection of Cancer/methods , Biomarkers, Tumor/blood , Biomarkers, Tumor/immunology , T-Lymphocytes/immunology , T-Lymphocytes/metabolism
2.
Mol Ther Nucleic Acids ; 35(1): 102129, 2024 Mar 12.
Article En | MEDLINE | ID: mdl-38370981

Circulating tumor cells (CTCs) that undergo epithelial-to-mesenchymal transition (EMT) can provide valuable information regarding metastasis and potential therapies. However, current studies on the EMT overlook alternative splicing. Here, we used single-cell full-length transcriptome data and mRNA sequencing of CTCs to identify stage-specific alternative splicing of partial EMT and mesenchymal states during pancreatic cancer metastasis. We classified definitive tumor and normal epithelial cells via genetic aberrations and demonstrated dynamic changes in the epithelial-mesenchymal continuum in both epithelial cancer cells and CTCs. We provide the landscape of alternative splicing in CTCs at different stages of EMT, uncovering cell-type-specific splicing patterns and splicing events in cell surface proteins suitable for therapies. We show that MBNL1 governs cell fate through alternative splicing independently of changes in gene expression and affects the splicing pattern during EMT. We found a high frequency of events that contained multiple premature termination codons and were enriched with C and G nucleotides in close proximity, which influence the likelihood of stop codon readthrough and expand the range of potential therapeutic targets. Our study provides insights into the EMT transcriptome's dynamic changes and identifies potential diagnostic and therapeutic targets in pancreatic cancer.

3.
Bioinformatics ; 39(10)2023 10 03.
Article En | MEDLINE | ID: mdl-37740953

MOTIVATION: Cell-cell interactions (CCIs) play critical roles in many biological processes such as cellular differentiation, tissue homeostasis, and immune response. With the rapid development of high throughput single-cell RNA sequencing (scRNA-seq) technologies, it is of high importance to identify CCIs from the ever-increasing scRNA-seq data. However, limited by the algorithmic constraints, current computational methods based on statistical strategies ignore some key latent information contained in scRNA-seq data with high sparsity and heterogeneity. RESULTS: Here, we developed a deep learning framework named DeepCCI to identify meaningful CCIs from scRNA-seq data. Applications of DeepCCI to a wide range of publicly available datasets from diverse technologies and platforms demonstrate its ability to predict significant CCIs accurately and effectively. Powered by the flexible and easy-to-use software, DeepCCI can provide the one-stop solution to discover meaningful intercellular interactions and build CCI networks from scRNA-seq data. AVAILABILITY AND IMPLEMENTATION: The source code of DeepCCI is available online at https://github.com/JiangBioLab/DeepCCI.


Deep Learning , Gene Expression Profiling , Sequence Analysis, RNA , Single-Cell Analysis , Software , Cluster Analysis
4.
Mol Ther Nucleic Acids ; 32: 189-202, 2023 Jun 13.
Article En | MEDLINE | ID: mdl-37096165

Tumor-infiltrating T cells are essential players in tumor immunotherapy. Great progress has been achieved in the investigation of T cell heterogeneity. However, little is known about the shared characteristics of tumor-infiltrating T cells across cancers. In this study, we conduct a pan-cancer analysis of 349,799 T cells across 15 cancers. The results show that the same T cell types had similar expression patterns regulated by specific transcription factor (TF) regulons across cancers. Multiple T cell type transition paths were consistent across cancers. We found that TF regulons associated with CD8+ T cells transitioning to terminally differentiated effector memory (Temra) or exhausted (Tex) states were associated with patient clinical classification. We also observed universally activated cell-cell interaction pathways of tumor-infiltrating T cells in all cancers, some of which specifically mediated crosstalk in certain cell types. Moreover, consistent characteristics of TCRs with respect to variable and joining region genes were found across cancers. Overall, our study reveals common features of tumor-infiltrating T cells in different cancers and suggests future avenues for rational, targeted immunotherapies.

5.
Nucleic Acids Res ; 50(22): e131, 2022 12 09.
Article En | MEDLINE | ID: mdl-36250636

Recent advances in spatial transcriptomics (ST) have brought unprecedented opportunities to understand tissue organization and function in spatial context. However, it is still challenging to precisely dissect spatial domains with similar gene expression and histology in situ. Here, we present DeepST, an accurate and universal deep learning framework to identify spatial domains, which performs better than the existing state-of-the-art methods on benchmarking datasets of the human dorsolateral prefrontal cortex. In further testing on a breast cancer ST dataset, we showed that DeepST can dissect spatial domains in cancer tissue at a finer scale. Moreover, DeepST can achieve not only effective batch integration of ST data generated from multiple batches or different technologies, but also expandable capabilities for processing other spatial omics data. Together, our results demonstrate that DeepST has an exceptional capacity for identifying spatial domains, making it a desirable tool to gain novel insights from ST studies.


Deep Learning , Gene Expression Profiling , Humans , Benchmarking , Gene Expression Profiling/methods , Transcriptome
6.
Brief Bioinform ; 23(5)2022 09 20.
Article En | MEDLINE | ID: mdl-35870203

The rapid development of single-cell RNA sequencing (scRNA-seq) technology provides unprecedented opportunities for exploring biological phenomena at the single-cell level. The discovery of cell types is one of the major applications for researchers to explore the heterogeneity of cells. Some computational methods have been proposed to solve the problem of scRNA-seq data clustering. However, the unavoidable technical noise and notorious dropouts reduce the accuracy of clustering methods. Here, we propose the Cauchy-based bounded constraint low-rank representation (CBLRR), a low-rank representation-based method that introduces a Cauchy loss function (CLF) and bounded nuclear norm regulation, aiming to alleviate the above issue. Specifically, as an effective loss function, the CLF is proven to enhance the robustness of the identification of cell types. Then, we adopt the bounded constraint to keep the entry values of single-cell data within the restricted interval. Finally, the performance of CBLRR is evaluated on 15 scRNA-seq datasets and compared with other state-of-the-art methods. The experimental results demonstrate that CBLRR performs accurately and robustly on clustering scRNA-seq data. Furthermore, CBLRR is an effective tool to cluster cells, and provides great potential for downstream analysis of single-cell data. The source code of CBLRR is available online at https://github.com/Ginnay/CBLRR.


Single-Cell Analysis , Software , Algorithms , Cluster Analysis , Gene Expression Profiling/methods , RNA-Seq , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
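
The two ingredients named in the abstract above, a Cauchy loss and a bounded constraint on entry values, can be illustrated with a minimal sketch. This is not the CBLRR implementation: the abstract does not give the exact formulation, so the code assumes one common form of the Cauchy (Lorentzian) loss, (c^2 / 2) * log(1 + (r / c)^2), and uses simple clipping as a stand-in for the bounded constraint. The function names and the scale parameter `c` are hypothetical.

```python
import math


def cauchy_loss(residual, c=1.0):
    """One common form of the Cauchy loss: (c^2 / 2) * log(1 + (residual / c)^2).

    Assumed form for illustration; the scale c is a hypothetical parameter.
    """
    return 0.5 * c * c * math.log1p((residual / c) ** 2)


def squared_loss(residual):
    """Ordinary least-squares loss, shown only for comparison."""
    return 0.5 * residual * residual


def clip_to_interval(values, low, high):
    """Force every entry into [low, high]; a simple stand-in for a bounded constraint."""
    return [min(max(v, low), high) for v in values]


if __name__ == "__main__":
    # The Cauchy loss grows logarithmically, so large residuals
    # (for example, dropout-driven outliers) are penalized far less
    # than under the squared loss.
    for r in (0.1, 1.0, 10.0):
        print(r, squared_loss(r), cauchy_loss(r))
    print(clip_to_interval([-1, 0.5, 3], 0, 1))
```

For small residuals the two losses nearly coincide; they diverge only as the residual grows, which is the usual argument for the Cauchy loss being robust to noisy entries.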
7.
Brief Bioinform ; 22(6)2021 11 05.
Article En | MEDLINE | ID: mdl-34415016

Accurate prediction of immunogenic peptides recognized by the T cell receptor (TCR) can greatly benefit vaccine development and cancer immunotherapy. However, identifying immunogenic peptides accurately is still a huge challenge. Most of the antigen peptides predicted in silico without considering the TCR as a key factor fail to elicit immune responses in vivo. This inevitably leads to costly and time-consuming experimental validation of predicted antigens. Therefore, it is necessary to develop novel computational methods for precisely and effectively predicting immunogenic peptides recognized by the TCR. Here, we described DLpTCR, a multimodal ensemble deep learning framework for predicting the likelihood of interaction between single/paired chain(s) of TCR and peptide presented by major histocompatibility complex molecules. To investigate the generality and robustness of the proposed model, COVID-19 data and IEDB data were constructed for independent evaluation. The DLpTCR model exhibited high predictive power, with an area under the curve of up to 0.91 on COVID-19 data when predicting the interaction between peptide and a single TCR chain. Additionally, the DLpTCR model achieved an overall accuracy of 81.03% on IEDB data when predicting the interaction between peptide and paired TCR chains. The results demonstrate that DLpTCR has the ability to learn general interaction rules and generalize to antigen peptide recognition by TCR. A user-friendly webserver is available at http://jianglab.org.cn/DLpTCR/. Additionally, a stand-alone software package can be downloaded from https://github.com/jiangBiolab/DLpTCR.


COVID-19 Drug Treatment , Epitopes/immunology , Peptides/immunology , Receptors, Antigen, T-Cell/immunology , SARS-CoV-2/immunology , Amino Acid Sequence/genetics , COVID-19/genetics , COVID-19/immunology , COVID-19/virology , Computer Simulation , Deep Learning , Epitopes/genetics , Humans , Peptides/genetics , Peptides/therapeutic use , Protein Binding/genetics , Receptors, Antigen, T-Cell/genetics , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity , Software
8.
J Chem Inf Model ; 60(10): 4497-4505, 2020 10 26.
Article En | MEDLINE | ID: mdl-32804489

To save cost and reduce risk in drug research and development, there is a pressing demand for in silico methods that predict the drug sensitivity of cancer cells. With the exponentially increasing volume of multi-omics data derived from high-throughput techniques, machine learning-based methods have been applied to the prediction of drug sensitivities. However, these methods have drawbacks either in the interpretability of the mechanism of drug action or in their limited performance in modeling drug sensitivity. In this paper, we presented a pathway-guided deep neural network (DNN) model to predict drug sensitivity in cancer cells. Biological pathways describe a group of molecules in a cell that collaborate to control various biological functions such as cell proliferation and death; abnormal pathway function can therefore result in disease. To take advantage of the excellent predictive ability of DNNs and the biological knowledge of pathways, we reshaped the canonical DNN structure by incorporating a layer of pathway nodes and their connections to input gene nodes, which makes the DNN model more interpretable and predictive compared to a canonical DNN. We have conducted extensive performance evaluations on multiple independent drug sensitivity data sets and demonstrated that our model significantly outperformed the canonical DNN model and eight other classical regression models. Most importantly, we observed a remarkable activity decrease in disease-related pathway nodes during forward propagation upon inputs of drug targets, which implicitly corresponds to the inhibition effect of disease-related pathways induced by drug treatment on cancer cells. Our empirical experiments showed that our method achieves pharmacological interpretability and predictive ability in modeling drug sensitivity in cancer cells. The web server, the processed data sets, and source codes for reproducing our work are available at http://pathdnn.denglab.org.


Neural Networks, Computer , Pharmaceutical Preparations , Machine Learning
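
The idea in the abstract above, a layer of pathway nodes connected only to their member gene inputs, can be sketched as a masked linear layer. This is an illustrative sketch, not the authors' code: the gene and pathway names, the weights, and the choice of ReLU activation are all assumptions made for the example.

```python
def build_mask(genes, pathways):
    """Return mask[p][g] = 1.0 if gene g belongs to pathway p, else 0.0.

    `pathways` maps a pathway name to the set of its member genes
    (toy membership here, not taken from any real pathway database).
    """
    return [[1.0 if g in members else 0.0 for g in genes]
            for members in pathways.values()]


def pathway_layer(x, weights, mask, bias):
    """Forward pass of a masked layer: each pathway node sees only its member genes."""
    out = []
    for w_row, m_row, b in zip(weights, mask, bias):
        # Multiplying by the mask zeroes every gene-to-pathway connection
        # that is not supported by pathway membership.
        z = sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)) + b
        out.append(max(0.0, z))  # ReLU activation (an assumption)
    return out


if __name__ == "__main__":
    genes = ["g1", "g2", "g3"]                      # hypothetical gene inputs
    pathways = {"pA": {"g1"}, "pB": {"g2", "g3"}}   # hypothetical pathways
    mask = build_mask(genes, pathways)
    weights = [[1.0, 5.0, 5.0], [5.0, 1.0, -1.0]]   # arbitrary example weights
    print(pathway_layer([1.0, 2.0, 3.0], weights, mask, [0.0, 0.0]))
```

Because each pathway node aggregates only its own genes, its activation can be read as a pathway-level signal, which is the kind of interpretability the abstract describes.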
9.
Article En | MEDLINE | ID: mdl-32411695

The assignment of function to proteins at a large scale is essential for understanding the molecular mechanism of life. However, only a very small percentage of the more than 179 million proteins in UniProtKB have Gene Ontology (GO) annotations supported by experimental evidence. In this paper, we proposed an integrated deep-learning-based classification model, named SDN2GO, to predict protein functions. SDN2GO applies convolutional neural networks to learn and extract features from sequences, protein domains, and known PPI networks, and then utilizes a weight classifier to integrate these features and achieve accurate predictions of GO terms. We constructed the training set and the independent test set according to the time-delayed principle of the Critical Assessment of Function Annotation (CAFA) and compared our method with two highly competitive methods and the classic BLAST method on the independent test set. The results show that our method outperforms the others on each sub-ontology of GO. We also investigated the performance of using protein domain information. We drew on natural language processing (NLP) techniques to process domain information and pre-trained a deep learning sub-model to extract comprehensive features of domains. The experimental results demonstrate that the domain features we obtained greatly improve the performance of our model. Our deep learning models, together with the data pre-processing scripts, are publicly available as open source software at https://github.com/Charrick/SDN2GO.

10.
Int J Mol Sci ; 20(23)2019 Nov 30.
Article En | MEDLINE | ID: mdl-31801264

MicroRNAs (miRNAs) are a highly abundant collection of functional non-coding RNAs involved in cellular regulation and various complex human diseases. Although a large number of miRNAs have been identified, most of their physiological functions remain unknown. Computational methods play a vital role in exploring the potential functions of miRNAs. Here, we present DeepMiR2GO, a tool for integrating miRNAs, proteins and diseases, to predict the gene ontology (GO) functions based on multiple deep neuro-symbolic models. DeepMiR2GO starts by integrating the miRNA co-expression network, protein-protein interaction (PPI) network, disease phenotype similarity network, and interactions or associations among them into a global heterogeneous network. Then, it employs an efficient graph embedding strategy to learn potential network representations of the global heterogeneous network as the topological features. Finally, a deep multi-label classification network based on multiple neuro-symbolic models is built and used to annotate the GO terms of miRNAs. The predicted results demonstrate that DeepMiR2GO performs significantly better than other state-of-the-art approaches in terms of precision, recall, and maximum F-measure.


Cardiovascular Diseases/genetics , MicroRNAs/genetics , Neoplasms/genetics , Neural Networks, Computer , Schizophrenia/genetics , Cardiovascular Diseases/metabolism , Cardiovascular Diseases/pathology , Datasets as Topic , Gene Expression Regulation , Gene Ontology , Gene Regulatory Networks , Humans , MicroRNAs/classification , MicroRNAs/metabolism , Neoplasms/metabolism , Neoplasms/pathology , Schizophrenia/metabolism , Schizophrenia/pathology
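
The abstract above reports precision, recall, and maximum F-measure without defining them. A minimal sketch of a threshold-swept maximum F-measure follows, assuming the convention commonly used in function-prediction benchmarks: precision is averaged over items with at least one prediction at the threshold, recall over all items. Whether the paper uses exactly this variant is not stated in the abstract, and the identifiers and scores below are made-up toy values.

```python
def f_max(scores, truth, thresholds=None):
    """Maximum F-measure over score thresholds.

    scores: {item: {term: predicted score in [0, 1]}}
    truth:  {item: set of true terms}
    """
    if thresholds is None:
        thresholds = [t / 100 for t in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for item, true_terms in truth.items():
            predicted = {term for term, s in scores.get(item, {}).items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:  # precision is defined only when something is predicted
                precisions.append(hits / len(predicted))
            recalls.append(hits / len(true_terms) if true_terms else 0.0)
        if not precisions:
            continue
        p = sum(precisions) / len(precisions)
        r = sum(recalls) / len(recalls)
        if p + r > 0:
            best = max(best, 2 * p * r / (p + r))
    return best


if __name__ == "__main__":
    # Toy example: one item with true terms A and B, predictions for A and C.
    print(f_max({"m": {"A": 0.9, "C": 0.8}}, {"m": {"A", "B"}}))
```

Sweeping the threshold matters because a multi-label predictor outputs scores rather than hard labels, so a single fixed cutoff would make comparisons between methods depend on an arbitrary choice.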
...