Search | BVS (Virtual Health Library) Search Portal

DeepAntigen: a novel method for neoantigen prioritization via 3D genome and deep sparse learning.

Shi, Yi; Guo, Zehua; Su, Xianbin; Meng, Luming; Zhang, Mingxuan; Sun, Jing; Wu, Chao; Zheng, Minhua; Shang, Xueyin; Zou, Xin; Cheng, Wangqiu; Yu, Yaoliang; Cai, Yujia; Zhang, Chaoyi; Cai, Weidong; Da, Lin-Tai; He, Guang; Han, Ze-Guang.

Bioinformatics ; 36(19): 4894-4901, 2020 12 08.

Article in English | MEDLINE | ID: mdl-32592462

ABSTRACT

MOTIVATION: Cancer mutations can encode the seeds of their own destruction in the form of T-cell-recognizable immunogenic peptides, also known as neoantigens. It is computationally challenging, however, to accurately prioritize potential neoantigen candidates by their ability to activate the T-cell immune response, especially when somatic mutations are abundant. Although a few neoantigen prioritization methods have been proposed to address this issue, an advanced machine learning model specifically designed for this problem is still lacking. Moreover, none of the existing methods considers the original DNA loci of the neoantigens from the perspective of the 3D genome, which may provide key information for inferring neoantigen immunogenicity. RESULTS: In this study, we discovered that the DNA loci of immunopositive and immunonegative MHC-I neoantigens have distinct spatial distribution patterns across the genome. We therefore used 3D genome information along with an ensemble pMHC-I coding strategy and developed a group feature selection-based deep sparse neural network model (DNN-GFS) that is optimized for neoantigen prioritization. DNN-GFS demonstrated increased neoantigen prioritization power compared to existing sequence-based approaches. We also developed a webserver named deepAntigen (http://yishi.sjtu.edu.cn/deepAntigen) that implements DNN-GFS as well as other machine learning methods. We believe that this work provides a new perspective toward more accurate neoantigen prediction, which may eventually contribute to personalized cancer immunotherapy. AVAILABILITY AND IMPLEMENTATION: Data and implementation are available on the webserver: http://yishi.sjtu.edu.cn/deepAntigen. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Antigens, Neoplasm; Neoplasms; Antigens, Neoplasm/genetics; Genome; Humans; Immunotherapy; Neoplasms/genetics; T-Lymphocytes

MT-MAG: Accurate and interpretable machine learning for complete or partial taxonomic assignments of metagenome-assembled genomes.

Li, Wanxin; Kari, Lila; Yu, Yaoliang; Hug, Laura A.

PLoS One ; 18(8): e0283536, 2023.

Article in English | MEDLINE | ID: mdl-37594964

ABSTRACT

We propose MT-MAG, a novel machine learning-based software tool for the complete or partial hierarchically structured taxonomic classification of metagenome-assembled genomes (MAGs). MT-MAG is alignment-free, with k-mer frequencies being the only feature used to distinguish one DNA sequence from another (herein k = 7). MT-MAG is capable of classifying large and diverse metagenomic datasets: a total of 245.68 Gbp in the training sets and 9.6 Gbp in the test sets analyzed in this study. In addition to complete classifications, MT-MAG offers a "partial classification" option, whereby a classification at a higher taxonomic level is provided for MAGs that cannot be classified to the Species level. MT-MAG outputs complete or partial classification paths, together with interpretable numerical confidences for its classifications, at all taxonomic ranks. To assess the performance of MT-MAG, we define a "weighted classification accuracy," with a weighting scheme reflecting the fact that partial classifications at different ranks are not equally informative. For the two benchmarking datasets analyzed (genomes from human gut microbiome species, and bacterial and archaeal genomes assembled from cow rumen metagenomic sequences), MT-MAG achieves an average of 87.32% in weighted classification accuracy. At the Species level, MT-MAG outperforms DeepMicrobes, the only other comparable software tool, by an average of 34.79% in weighted classification accuracy. In addition, MT-MAG is able to completely classify an average of 67.70% of the sequences at the Species level, compared with DeepMicrobes, which classifies only 47.45%. Moreover, MT-MAG provides additional information for sequences that it could not classify at the Species level, resulting in the partial or complete classification of 95.13% of the genomes in the datasets analyzed. Lastly, unlike other taxonomic assignment tools (e.g., GTDB-Tk), MT-MAG is an alignment-free and genetic marker-free tool, able to provide additional bioinformatics analysis to confirm existing or tentative taxonomic assignments.
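
The k-mer frequency feature described in this abstract can be illustrated with a minimal Python sketch. It is not MT-MAG's implementation: the function name `kmer_frequencies`, the lexicographic ordering of the vector, and the choice to skip windows containing symbols other than A, C, G, T are all assumptions made here for illustration. Only the default k = 7 comes from the abstract.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=7):
    """Return the normalized k-mer frequency vector of a DNA sequence.

    Hypothetical helper, not part of MT-MAG. The vector has 4**k entries,
    ordered lexicographically over A, C, G, T. Windows containing any
    other symbol (for example N) are ignored.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate all possible k-mers so every sequence maps to the same axes.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    valid_total = sum(counts[kmer] for kmer in kmers)
    if valid_total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / valid_total for kmer in kmers]


# Usage: a short toy sequence with k = 2 gives a 16-dimensional vector.
toy_vector = kmer_frequencies("ACGTACGTAC", k=2)
# With the default k = 7 the vector has 4**7 = 16384 entries.
default_vector = kmer_frequencies("ACGTACGTAC")
```

Because every sequence is mapped to a vector of the same length regardless of its own length, vectors of this kind can be fed to a standard classifier without any alignment step.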

Subjects

Gastrointestinal Microbiome; Metagenome; Animals; Cattle; Female; Humans; Metagenome/genetics; Benchmarking; Computational Biology; Machine Learning

Network Comparison with Interpretable Contrastive Network Representation Learning.

Fujiwara, Takanori; Zhao, Jian; Chen, Francine; Yu, Yaoliang; Ma, Kwan-Liu.

J Data Sci Stat Vis ; 2(5), 2022 Sep 07.

Article in English | MEDLINE | ID: mdl-38318468

ABSTRACT

Identifying unique characteristics in a network through comparison with another network is an essential network analysis task. For example, with networks of protein interactions obtained from normal and cancer tissues, we can discover unique types of interactions in cancer tissues. This analysis task could be greatly assisted by contrastive learning, which is an emerging analysis approach to discover salient patterns in one dataset relative to another. However, existing contrastive learning methods cannot be directly applied to networks as they are designed only for high-dimensional data analysis. To address this problem, we introduce a new analysis approach called contrastive network representation learning (cNRL). By integrating two machine learning schemes, network representation learning and contrastive learning, cNRL enables embedding of network nodes into a low-dimensional representation that reveals the uniqueness of one network compared to another. Within this approach, we also design a method, named i-cNRL, which offers interpretability in the learned results, allowing for understanding which specific patterns are only found in one network. We demonstrate the effectiveness of i-cNRL for network comparison with multiple network models and real-world datasets. Furthermore, we compare i-cNRL and other potential cNRL algorithm designs through quantitative and qualitative evaluations.
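
The idea of contrastive learning, finding patterns that are salient in one dataset relative to another, can be sketched with contrastive PCA, a well-known contrastive method. The abstract does not state which contrastive method cNRL or i-cNRL uses, so this is an illustrative stand-in rather than the authors' algorithm. The function names, the restriction to 2-D points, and the toy data are assumptions chosen so the example needs only the standard library.

```python
import math


def covariance_2d(points):
    """Sample covariance entries (var_x, cov_xy, var_y) of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return var_x, cov_xy, var_y


def contrastive_direction(target, background, alpha=1.0):
    """Unit vector maximizing target variance minus alpha * background variance.

    This is the top eigenvector of C_target - alpha * C_background,
    computed in closed form for the 2-D case.
    """
    ta, tb, td = covariance_2d(target)
    ba, bb, bd = covariance_2d(background)
    a, b, d = ta - alpha * ba, tb - alpha * bb, td - alpha * bd
    # For a symmetric matrix [[a, b], [b, d]], the eigenvector of the
    # largest eigenvalue lies at angle 0.5 * atan2(2b, a - d).
    theta = 0.5 * math.atan2(2 * b, a - d)
    return math.cos(theta), math.sin(theta)


# Toy data: both datasets share large spread along x, but only the
# target has substantial spread along y.
background = [(-3.0, 0.0), (3.0, 0.0), (0.0, -0.5), (0.0, 0.5)]
target = [(-3.0, 0.0), (3.0, 0.0), (0.0, -2.0), (0.0, 2.0)]

plain = contrastive_direction(target, background, alpha=0.0)        # ordinary PCA
contrastive = contrastive_direction(target, background, alpha=1.0)  # contrastive
```

With `alpha=0.0` the direction follows the x axis, which dominates both datasets. With `alpha=1.0` the shared x variance cancels and the direction follows the y axis, the pattern unique to the target.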

A novel neoantigen discovery approach based on chromatin high order conformation.

Shi, Yi; Zhang, Mingxuan; Meng, Luming; Su, Xianbin; Shang, Xueying; Guo, Zehua; Li, Qingjiao; Lin, Mengna; Zou, Xin; Luo, Qing; Yu, Yaoliang; Wu, Yanting; Da, Lintai; Cai, Tom Weidong; He, Guang; Han, Ze-Guang.

BMC Med Genomics ; 13(Suppl 6): 62, 2020 08 27.

Article in English | MEDLINE | ID: mdl-32854726

ABSTRACT

BACKGROUND: High-throughput sequencing technology has yielded reliable and ultra-fast sequencing of DNA and RNA. For tumor cells of cancer patients, combining the results of DNA and RNA sequencing makes it possible to identify potential neoantigens that stimulate the T-cell immune response. However, when somatic mutations are abundant, it is computationally challenging to efficiently prioritize the identified neoantigen candidates by their ability to activate the T-cell immune response. METHODS: Numerous prioritization or prediction approaches have been proposed to address this issue, but none of them considers the original DNA loci of the neoantigens from the perspective of the 3D genome. Based on our previous discoveries, we propose to investigate the distribution of neoantigens with different immunogenicity in the 3D genome and to incorporate this information into neoantigen prediction. RESULTS: We retrospectively examined the DNA origins of immuno-positive and immuno-negative neoantigens in the context of the 3D genome and discovered that the DNA loci of immuno-positive and immuno-negative neoantigens have very different distribution patterns. Specifically, compared to the background 3D genome, the DNA loci of immuno-positive neoantigens tend to be located in specific regions of the 3D genome. We thus incorporated this information into neoantigen prediction and demonstrated the effectiveness of this approach. CONCLUSION: We believe that 3D genome information will help to increase the precision of neoantigen prioritization and discovery and eventually benefit precision and personalized medicine in cancer immunotherapy.

Subjects

Antigens, Neoplasm/chemistry; Chromatin/chemistry; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Precision Medicine; Protein Conformation
