Search | VHL Regional Portal

1.

KGE-UNIT: toward the unification of molecular interactions prediction based on knowledge graph and multi-task learning on drug discovery.

Zhang, Chengcheng; Zang, Tianyi; Zhao, Tianyi.

Brief Bioinform ; 25(2)2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38348746

ABSTRACT

The prediction of molecular interactions is vital for drug discovery. Existing methods often focus on individual prediction tasks and overlook the relationships between them. In addition, some tasks are limited by insufficient data, which restricts performance. To overcome these limitations, we propose KGE-UNIT, a unified framework that combines knowledge graph embedding (KGE) and multi-task learning to predict drug-target interactions (DTIs) and drug-drug interactions (DDIs) simultaneously and to enhance the performance of each task, even when data availability is limited. Via KGE, we extract heterogeneous features from the drug knowledge graph to enhance the structural features of drug and protein nodes, thereby improving feature quality. Additionally, employing multi-task learning, we introduce an innovative predictor comprising a task-aware convolutional neural network (CNN)-based encoder and a task-aware attention decoder, which better fuses multimodal features, captures the contextual interactions of molecular tasks and enhances task awareness, leading to improved performance. Experiments on two imbalanced datasets for DTIs and DDIs demonstrate the superiority of KGE-UNIT, which achieves high areas under the receiver operating characteristic curve (AUROCs) (0.942, 0.987) and areas under the precision-recall curve (AUPRs) (0.930, 0.980) for DTIs, and high AUROCs (0.975, 0.989) and AUPRs (0.966, 0.988) for DDIs. Notably, on the LUO dataset, where the data were more limited, KGE-UNIT exhibited a more pronounced improvement, with increases of 4.32% in AUROC and 3.56% in AUPR for DTIs and 6.56% in AUROC and 8.17% in AUPR for DDIs. The scalability of KGE-UNIT is demonstrated through its extension to protein-protein interaction prediction, and ablation studies and case studies further validate its effectiveness.
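The AUROC and AUPR values quoted in this abstract are ranking metrics. As a minimal illustrative sketch (not the authors' evaluation code; the function names `auroc` and `average_precision` and the toy labels and scores are assumptions made here), they can be computed from predicted scores roughly as follows. Average precision is used as a common estimator of the area under the precision-recall curve.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive.

    Tied scores are ranked in their sorted order (no special tie handling).
    """
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Toy data, purely hypothetical (1 = interaction, 0 = no interaction).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.6]
print(auroc(labels, scores), average_precision(labels, scores))
```

On imbalanced datasets such as those described above, the two metrics can diverge, which is one reason both are usually reported.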

Subjects

Learning; Pattern Recognition, Automated; Drug Discovery; Area Under Curve; Neural Networks, Computer; Drug Interactions

2.

Maternal GLP-1 receptor activation inhibits fetal growth.

Qiao, Liping; Lu, Cindy; Zang, Tianyi; Dzyuba, Brianna; Shao, Jianhua.

Am J Physiol Endocrinol Metab ; 326(3): E268-E276, 2024 Mar 01.

Article in English | MEDLINE | ID: mdl-38197791

ABSTRACT

Glucagon-like peptide 1 (GLP-1) regulates food intake, insulin production, and metabolism. Our recent study demonstrated that pancreatic α-cells-secreted (intraislet) GLP-1 effectively promotes maternal insulin secretion and metabolic adaptation during pregnancy. However, the role of circulating GLP-1 in maternal energy metabolism remains largely unknown. Our study aims to investigate systemic GLP-1 response to pregnancy and its regulatory effect on fetal growth. Using C57BL/6 mice, we observed a gradual decline in maternal blood GLP-1 concentrations. Subsequent administration of the GLP-1 receptor agonist semaglutide (Sem) to dams in late pregnancy revealed a modest decrease in maternal food intake during initial treatment. At the same time, no significant alterations were observed in maternal body weight or fat mass. Notably, Sem-treated dams exhibited a significant decrease in fetal body weight, which persisted even following the restoration of maternal blood glucose levels. Despite no observable change in placental weight, a marked reduction in the placenta labyrinth area from Sem-treated dams was evident. Our investigation further demonstrated a substantial decrease in the expression levels of various pivotal nutrient transporters within the placenta, including glucose transporter one and sodium-neutral amino acid transporter one, after Sem treatment. In addition, Sem injection led to a notable reduction in the capillary area, number, and surface densities within the labyrinth. These findings underscore the crucial role of modulating circulating GLP-1 levels in maternal adaptation, emphasizing the inhibitory effects of excessive GLP-1 receptor activation on both placental development and fetal growth.

NEW & NOTEWORTHY Our study reveals a progressive decline in maternal blood glucagon-like peptide 1 (GLP-1) concentration. GLP-1 receptor agonist injection in late pregnancy significantly reduced fetal body weight, even after restoration of maternal blood glucose concentration. GLP-1 receptor activation significantly reduced the placental labyrinth area, expression of some nutrient transporters, and capillary development. Our study indicates that reducing maternal blood GLP-1 levels is a physiological adaptation process that benefits placental development and fetal growth.

Subjects

Blood Glucose; Placenta; Animals; Female; Mice; Pregnancy; Blood Glucose/metabolism; Fetal Development; Fetal Weight; Glucagon-Like Peptide 1/metabolism; Glucagon-Like Peptide-1 Receptor/metabolism; Glucagon-Like Peptide-1 Receptor Agonists; Mice, Inbred C57BL; Placenta/metabolism

3.

SPDB: a comprehensive resource and knowledgebase for proteomic data at the single-cell resolution.

Wang, Fang; Liu, Chunpu; Li, Jiawei; Yang, Fan; Song, Jiangning; Zang, Tianyi; Yao, Jianhua; Wang, Guohua.

Nucleic Acids Res ; 52(D1): D562-D571, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37953313

ABSTRACT

Single-cell proteomics enables the direct quantification of protein abundance at single-cell resolution, providing valuable insights into cellular phenotypes beyond what can be inferred from transcriptome analysis alone. However, the lack of large-scale integrated databases hinders researchers from accessing and exploring single-cell proteomics, impeding the advancement of this field. To fill this gap, we present a comprehensive database, the Single-cell Proteomic DataBase (SPDB, https://scproteomicsdb.com/), for general single-cell proteomic data, including antibody-based and mass spectrometry-based single-cell proteomics. Equipped with standardized data processing and a user-friendly web interface, SPDB provides unified data formats for convenient interaction with downstream analysis, and offers not only dataset-level but also protein-level data search and exploration capabilities. To enable detailed exhibition of single-cell proteomic data, SPDB also provides a module for visualizing data from the perspectives of cell metadata or protein features. The current version of SPDB encompasses 133 antibody-based single-cell proteomic datasets involving more than 300 million cells and over 800 marker/surface proteins, and 10 mass spectrometry-based single-cell proteomic datasets involving more than 4000 cells and over 7000 proteins. Overall, SPDB is envisioned as a useful resource that will benefit the wider research community by providing detailed insights into proteomics from the single-cell perspective.

Subjects

Proteins; Proteomics; Antibodies; Knowledge Bases; Mass Spectrometry; Humans; Animals; Single-Cell Analysis

4.

Integration of multiple-omics data to reveal the shared genetic architecture of educational attainment, intelligence, cognitive performance, and Alzheimer's disease.

Wang, Fuxu; Wang, Haoyan; Yuan, Ye; Han, Bing; Qiu, Shizheng; Hu, Yang; Zang, Tianyi.

Front Genet ; 14: 1243879, 2023.

Article in English | MEDLINE | ID: mdl-37900179

ABSTRACT

Growing evidence suggests the effect of educational attainment (EA) on Alzheimer's disease (AD), but less is known about the shared genetic architecture between them. Here, leveraging genome-wide association studies (GWAS) for AD (N = 21,982/41,944), EA (N = 1,131,881), cognitive performance (N = 257,828), and intelligence (N = 78,308), we investigated their causal association with the linkage disequilibrium score (LDSC) and Mendelian randomization and their shared loci with the conjunctional false discovery rate (conjFDR), transcriptome-wide association studies (TWAS), and colocalization. We observed significant genetic correlations of EA (rg = -0.22, p = 5.07E-05), cognitive performance (rg = -0.27, p = 2.44E-05), and intelligence (rg = -0.30, p = 3.00E-04) with AD, and a causal relationship between EA and AD (OR = 0.74, 95% CI: 0.58-0.94, p = 0.013). We identified 13 shared loci at conjFDR <0.01, of which five were novel, and prioritized three causal genes. These findings inform early prevention strategies for AD.
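The Mendelian randomization estimate above is reported as an odds ratio with a 95% confidence interval and a p-value. As a rough consistency check (a sketch under a normal approximation on the log-odds scale, not the authors' analysis; the function name `z_and_p_from_or` is an assumption made here), the standard error and two-sided p-value can be approximately recovered from the interval:

```python
import math


def z_and_p_from_or(or_point, ci_low, ci_high, z_crit=1.959964):
    """Recover log-odds effect, standard error, z-score and two-sided p-value
    from an odds ratio and its 95% confidence interval (normal approximation)."""
    beta = math.log(or_point)                      # effect on the log-odds scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))         # two-sided normal tail area
    return beta, se, z, p


# Values quoted in the abstract: OR = 0.74, 95% CI 0.58-0.94.
beta, se, z, p = z_and_p_from_or(0.74, 0.58, 0.94)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.3f}")
```

The recovered p-value lands close to the reported 0.013; small differences are expected because the published odds ratio and interval are rounded.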

5.

MHCRoBERTa: pan-specific peptide-MHC class I binding prediction through transfer learning with label-agnostic protein sequences.

Wang, Fuxu; Wang, Haoyan; Wang, Lizhuang; Lu, Haoyu; Qiu, Shizheng; Zang, Tianyi; Zhang, Xinjun; Hu, Yang.

Brief Bioinform ; 23(3)2022 May 13.

Article in English | MEDLINE | ID: mdl-35443027

ABSTRACT

Predicting the binding of peptides to the major histocompatibility complex (MHC) plays a vital role in cancer immunotherapy. The success of AlphaFold in applying natural language processing (NLP) algorithms to protein secondary structure prediction has inspired us to explore NLP methods for predicting peptide-MHC class I binding. Motivated by this, we propose MHCRoBERTa, a method based on the RoBERTa pre-training approach, for predicting the binding affinity between MHC class I molecules and peptides. Analysis of the results on a benchmark dataset demonstrates that MHCRoBERTa outperforms other state-of-the-art prediction methods, with an increase in the Spearman rank correlation coefficient (SRCC). Notably, our model gave a significant improvement on IC50 value. Our method achieved an SRCC of 0.785 and an AUC of 0.817. Our SRCC is 14.3% higher than that of NetMHCpan3.0 (the second-highest SRCC among pan-specific methods) and 3% higher than that of MHCflurry (the second-highest SRCC among all methods). The AUC is also better than that of any other pan-specific method. Moreover, we visualize the multi-head self-attention for the token representation across layers and heads. Through the analysis of the representation of each layer and head, we can show whether the model has learned the syntax and semantics necessary to perform the prediction task well. All these results demonstrate that our model can accurately predict peptide-MHC class I binding affinity and that MHCRoBERTa is a powerful tool for screening potential neoantigens for cancer immunotherapy. MHCRoBERTa is available as open-source software on GitHub (https://github.com/FuxuWang/MHCRoBERTa).
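The Spearman rank correlation coefficient (SRCC) used as the headline metric here is the Pearson correlation of the ranks of two variables. A minimal sketch, assuming average ranks for ties (the helper names `average_ranks`, `pearson` and `spearman` and the toy affinity values are hypothetical and not taken from the MHCRoBERTa code base):

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """SRCC: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical measured vs. predicted binding affinities for four peptides.
measured = [0.10, 0.35, 0.60, 0.90]
predicted = [0.20, 0.25, 0.70, 0.65]
print(spearman(measured, predicted))
```

Because only ranks matter, SRCC rewards a model for ordering peptides correctly even when its predicted affinities are on a different scale from the measured ones.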

Subjects

Histocompatibility Antigens Class I; Peptides; Algorithms; Amino Acid Sequence; Histocompatibility Antigens Class I/metabolism; Machine Learning; Peptides/metabolism; Protein Binding

6.

Explore potential disease related metabolites based on latent factor model.

Wang, Yongtian; Juan, Liran; Peng, Jiajie; Wang, Tao; Zang, Tianyi; Wang, Yadong.

BMC Genomics ; 23(Suppl 1): 269, 2022 Apr 06.

Article in English | MEDLINE | ID: mdl-35387615

ABSTRACT

BACKGROUND: In biological systems, metabolomics not only contributes to the discovery of metabolic signatures for disease diagnosis but also helps to elucidate the underlying molecular disease-causing mechanisms. Therefore, identification of disease-related metabolites is of great significance for comprehensively understanding the pathogenesis of diseases and improving clinical medicine. RESULTS: In this paper, we propose a disease and literature driven metabolism prediction model (DLMPM) to identify potential associations between metabolites and diseases based on a latent factor model. We build a disease glossary with disease terms from different databases and an association matrix based on the mapping between diseases and metabolites. The similarity of diseases and metabolites is used to complete the association matrix. Finally, we predict potential associations between metabolites and diseases based on the matrix decomposition method. In total, 1,406 direct associations between diseases and metabolites are found, and 119,206 previously unknown associations between diseases and metabolites are predicted, with a coverage rate of 80.88%. Subsequently, we extract training sets and testing sets based on data increment from the database of disease-related metabolites and assess the performance of DLMPM on 19 diseases. DLMPM successfully predicts potential metabolic signatures for human diseases, with an average AUC value of 82.33%. CONCLUSION: In this paper, a computational model is proposed for exploring metabolite-disease pairs; validation shows that it performs well in predicting potential disease-related metabolites. The results show that DLMPM performs better in prioritizing candidate disease-related metabolites than previous methods and would help researchers to reveal more information about human diseases.

Subjects

Metabolomics; Publications; Computational Biology/methods; Databases, Factual; Humans; Metabolomics/methods

7.

CNN-DDI: a learning-based method for predicting drug-drug interactions using convolution neural networks.

Zhang, Chengcheng; Lu, Yao; Zang, Tianyi.

BMC Bioinformatics ; 23(Suppl 1): 88, 2022 Mar 07.

Article in English | MEDLINE | ID: mdl-35255808

ABSTRACT

BACKGROUND: Drug-drug interactions (DDIs) are the reactions between drugs. They are classified into three types: synergistic, antagonistic and no reaction. Predicting DDI-associated events is a rapidly developing technology that is attracting increasing attention and application in drug development and disease diagnosis. In this work, we study not only whether two drugs interact but also the specific interaction types, and we propose a learning-based method that uses convolutional neural networks to learn feature representations and predict DDIs. RESULTS: In this paper, we propose a novel algorithm using a CNN architecture, named CNN-DDI, to predict drug-drug interactions. First, we extract feature interactions from drug categories, targets, pathways and enzymes as feature vectors and employ the Jaccard similarity as the measure of drug similarity. Then, based on the feature representation, we build a new convolutional neural network as the DDI predictor. CONCLUSION: The experimental results indicate that drug categories are effective as a new feature type for the CNN-DDI method, and that using multiple features is more informative and more effective than using a single feature. We conclude that CNN-DDI outperforms other existing algorithms on the task of predicting DDIs.
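The Jaccard similarity mentioned in the RESULTS is the size of the intersection of two sets divided by the size of their union. The sketch below shows how a per-drug similarity vector might be built from one feature type; the drug names, target identifiers and the helper `similarity_profile` are hypothetical, and this is not the CNN-DDI implementation.

```python
def jaccard(a, b):
    """|A ∩ B| / |A ∪ B| for two collections of feature identifiers."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def similarity_profile(drug, all_drugs, features):
    """Similarity of one drug to every drug, usable as an input feature vector."""
    return [jaccard(features[drug], features[other]) for other in all_drugs]


# Hypothetical target annotations for three drugs.
targets = {
    "drug_A": {"T1", "T2", "T3"},
    "drug_B": {"T2", "T3", "T4"},
    "drug_C": {"T9"},
}
drugs = sorted(targets)
for d in drugs:
    print(d, similarity_profile(d, drugs, targets))
```

Repeating this for each feature type (categories, targets, pathways, enzymes) would give several similarity vectors per drug that a downstream network could consume.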

Subjects

Algorithms; Neural Networks, Computer; Drug Development; Drug Interactions; Research Design

8.

A multi-network integration approach for measuring disease similarity based on ncRNA regulation and heterogeneous information.

Zhang, Ningyi; Zang, Tianyi.

BMC Bioinformatics ; 23(Suppl 1): 89, 2022 Mar 07.

Article in English | MEDLINE | ID: mdl-35255810

ABSTRACT

BACKGROUND: Measuring similarity between complex diseases has significant implications for revealing the pathogenesis of diseases and for development in the biomedical domain. It is widely accepted that functional associations between disease-related genes and semantic associations can be used to calculate disease similarity. A growing number of studies have demonstrated the profound involvement of non-coding RNA (ncRNA) in the regulation of genome organization and gene expression. Thus, taking ncRNA into account can be useful in measuring disease similarity. However, existing methods ignore the regulatory functions of ncRNA in biological processes. In this study, we propose a novel deep-learning method to deduce disease similarity. RESULTS: We propose ImpAESim, a framework that integrates multiple network embeddings to learn compact feature representations and calculate disease similarity. We first use three different disease-related information networks to build a heterogeneous network. After a network diffusion process, random walk with restart (RWR), a compact feature learning model composed of a classic autoencoder (AE) and an improved AE model is used to extract constraints and low-dimensional feature representations. We finally obtain an accurate, low-dimensional feature representation of diseases and employ the cosine distance as the measure of disease similarity. CONCLUSION: ImpAESim focuses on extracting a low-dimensional vector representation of features based on ncRNA regulation and the gene-gene interaction network. Our method can significantly reduce the calculation bias that results from sparse disease associations derived from semantic associations.
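Two ingredients named in this abstract, the RWR diffusion step and a cosine-based comparison of the resulting vectors, can be sketched in a few lines. This is an assumption-laden toy, not the ImpAESim pipeline: the autoencoder stage is omitted, the four-node graph is hypothetical, and the restart probability of 0.5 is an arbitrary choice. Cosine distance is one minus the cosine similarity computed below.

```python
import math


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * e, where W is the column-normalised adjacency."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    e = [1.0 if i == seed else 0.0 for i in range(n)]
    p = e[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * e[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)


# Hypothetical undirected network: nodes 0, 1, 2 form a triangle; node 3 hangs off node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
profiles = [random_walk_with_restart(adj, s) for s in range(len(adj))]
print(cosine_similarity(profiles[0], profiles[1]),
      cosine_similarity(profiles[0], profiles[3]))
```

In this toy graph, nodes 0 and 1 share both neighbours, so their diffusion profiles are more similar to each other than either is to the profile of the peripheral node 3.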

Subjects

Gene Regulatory Networks; RNA, Long Noncoding; RNA, Long Noncoding/genetics; RNA, Untranslated/genetics

9.

The Essential Role of Pancreatic α-Cells in Maternal Metabolic Adaptation to Pregnancy.

Qiao, Liping; Saget, Sarah; Lu, Cindy; Zang, Tianyi; Dzyuba, Brianna; Hay, William W; Shao, Jianhua.

Diabetes ; 71(5): 978-988, 2022 May 01.

Article in English | MEDLINE | ID: mdl-35147704

ABSTRACT

Pancreatic α-cells are important in maintaining metabolic homeostasis, but their role in regulating maternal metabolic adaptations to pregnancy has not been studied. The objective of this study was to determine whether pancreatic α-cells respond to pregnancy and their contribution to maternal metabolic adaptation. With use of C57BL/6 mice, the findings of our study showed that pregnancy induced a significant increase of α-cell mass by promoting α-cell proliferation that was associated with a transitory increase of maternal serum glucagon concentration in early pregnancy. Maternal pancreatic GLP-1 content also was significantly increased during pregnancy. Using the inducible Cre/loxp technique, we ablated the α-cells (α-null) before and during pregnancy while maintaining enteroendocrine L-cells and serum GLP-1 in the normal range. In contrast to an improved glucose tolerance test (GTT) before pregnancy, significantly impaired GTT and remarkably higher serum glucose concentrations in the fed state were observed in α-null dams. Glucagon receptor antagonism treatment, however, did not affect measures of maternal glucose metabolism, indicating a dispensable role of glucagon receptor signaling in maternal glucose homeostasis. However, the GLP-1 receptor agonist improved insulin production and glucose metabolism of α-null dams. Furthermore, GLP-1 receptor antagonist Exendin (9-39) attenuated pregnancy-enhanced insulin secretion and GLP-1 restored glucose-induced insulin secretion of cultured islets from α-null dams. Together, these results demonstrate that α-cells play an essential role in controlling maternal metabolic adaptation to pregnancy by enhancing insulin secretion.

Subjects

Glucagon-Secreting Cells; Islets of Langerhans; Animals; Female; Glucagon/metabolism; Glucagon-Like Peptide 1/metabolism; Glucagon-Like Peptide-1 Receptor/genetics; Glucagon-Like Peptide-1 Receptor/metabolism; Glucagon-Secreting Cells/metabolism; Glucose/metabolism; Insulin/metabolism; Islets of Langerhans/metabolism; Mice; Mice, Inbred C57BL; Mice, Knockout; Pregnancy; Receptors, Glucagon/metabolism

10.

Psi-Caller: A Lightweight Short Read-Based Variant Caller With High Speed and Accuracy.

Liu, Yadong; Jiang, Tao; Gao, Yan; Liu, Bo; Zang, Tianyi; Wang, Yadong.

Front Cell Dev Biol ; 9: 731424, 2021.

Article in English | MEDLINE | ID: mdl-34485311

ABSTRACT

With the rapid development of short-read sequencing technologies, many population-scale resequencing studies have been carried out in recent years to study the associations between human genome variants and various phenotypes. Variant calling is one of the core bioinformatics tasks in such studies, used to comprehensively discover genomic variants in sequenced samples. Many efforts have been made to develop short read-based variant calling approaches; however, state-of-the-art tools are still computationally expensive. Meanwhile, cutting-edge genomics studies also have higher requirements on the yields of variant calling. Herein, we propose Partial-Order Alignment-based single nucleotide variant (SNV) and Indel caller (Psi-caller), a lightweight variant calling algorithm that simultaneously achieves high performance and yield. Psi-caller recognizes candidate variant sites and divides them into three categories according to the complexity and location of their signatures, and it employs a binomial model, partial-order alignment, and de Bruijn graph-based local assembly, respectively, to call and genotype SNVs/Indels in each category. Benchmarks on simulated and real short-read sequencing data sets demonstrate that Psi-caller is times faster than state-of-the-art tools with higher or equal sensitivity and accuracy. It has the potential to handle large-scale data sets well in cutting-edge genomics studies.
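The abstract names a binomial model as one of the genotyping methods but gives no details. As a generic illustration only (a textbook diploid genotype likelihood, not Psi-caller's actual model; the function names and the 1% error rate are assumptions made here), a binomial genotype call from allele counts might look like this:

```python
import math


def binom_pmf(k, n, p):
    """Probability of k successes in n independent trials with success rate p."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def genotype_likelihoods(alt_reads, depth, error_rate=0.01):
    """Likelihood of the observed alternate-allele read count under each diploid genotype."""
    alt_prob = {
        "0/0": error_rate,        # alternate reads arise only from sequencing errors
        "0/1": 0.5,               # half of the reads are expected to carry the variant
        "1/1": 1.0 - error_rate,  # reference reads arise only from sequencing errors
    }
    return {gt: binom_pmf(alt_reads, depth, p) for gt, p in alt_prob.items()}


def call_genotype(alt_reads, depth, error_rate=0.01):
    """Pick the genotype with the highest likelihood (no prior applied)."""
    likelihoods = genotype_likelihoods(alt_reads, depth, error_rate)
    return max(likelihoods, key=likelihoods.get)


print(call_genotype(0, 30), call_genotype(14, 30), call_genotype(29, 30))
# prints: 0/0 0/1 1/1
```

A model of this kind only needs per-site allele counts, which is why it is cheap compared with realignment or local assembly.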

11.

DeepGP: An Integrated Deep Learning Method for Endocrine Disease Gene Prediction Using Omics Data.

Zhang, Ningyi; Wang, Haoyan; Xu, Chen; Zhang, Liyuan; Zang, Tianyi.

Front Cell Dev Biol ; 9: 700061, 2021.

Article in English | MEDLINE | ID: mdl-34295899

ABSTRACT

Endocrinology is the study of hormones and their actions. Hormones are chemical messengers released into the blood that act through receptors to influence target cells. The capacity of the mammalian organism to function as a whole depends on two principal control mechanisms, the nervous system and the endocrine system. The endocrine system is essential in regulating growth and development, tissue function, metabolism, and reproductive processes. Diabetes mellitus, Graves' disease, polycystic ovary syndrome, and insulin-like growth factor I deficiency (IGFI deficiency) are classical endocrine diseases. Endocrine dysfunction is also an increasing factor of morbidity in cancer and other dangerous diseases in humans. Thus, it is essential to understand these diseases at the genetic level in order to recognize more pathogenic genes and better understand the pathologies of endocrine diseases. In this study, we proposed a deep learning method named DeepGP, based on a graph convolutional network and a convolutional neural network, for prioritizing susceptibility genes of five endocrine diseases. To test the performance of our method, we performed 10-fold cross-validation on an integrated reported dataset; DeepGP obtained an area under the curve of ~83% and an area under the precision-recall curve of ~65%. We found that type 1 diabetes mellitus (T1DM) and type 2 diabetes mellitus (T2DM) share most of their associated genes; therefore, more attention should be paid to the remaining genes related to T1DM and T2DM, respectively, which could help in understanding the pathogenesis and pathologies of these diseases.

12.

SKSV: ultrafast structural variation detection from circular consensus sequencing reads.

Liu, Yadong; Jiang, Tao; Su, Junhao; Liu, Bo; Zang, Tianyi; Wang, Yadong.

Bioinformatics ; 37(20): 3647-3649, 2021 Oct 25.

Article in English | MEDLINE | ID: mdl-33963826

ABSTRACT

SUMMARY: Circular consensus sequencing reads are promising for the comprehensive detection of structural variants (SVs). However, alignment-based SV calling pipelines are computationally intensive due to the generation of complete read alignments and their post-processing. Herein, we propose a SKeleton-based analysis toolkit for Structural Variation detection (SKSV). Benchmarks on real and simulated datasets demonstrate that SKSV is an order of magnitude faster than state-of-the-art SV calling approaches; moreover, it achieves higher F1 scores for various types of SVs. AVAILABILITY AND IMPLEMENTATION: SKSV is available from https://github.com/ydLiu-HIT/SKSV. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

13.

Enhancement and Imputation of Peak Signal Enables Accurate Cell-Type Classification in scATAC-seq.

Cui, Zhe; Cui, Ya; Gao, Yan; Jiang, Tao; Zang, Tianyi; Wang, Yadong.

Front Genet ; 12: 658352, 2021.

Article in English | MEDLINE | ID: mdl-33889181

ABSTRACT

Single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) has been widely used to profile genome-wide chromatin accessibility in thousands of individual cells. However, compared with single-cell RNA-seq, the peaks of scATAC-seq are much sparser due to the lower copy numbers (diploid in humans) and the inherent missing signals, which makes it more challenging to classify cell types based on specifically expressed genes or other canonical markers. Here, we present svmATAC, a support vector machine (SVM)-based method for accurately identifying cell types in scATAC-seq datasets by enhancing peak signal strength and imputing signals through patterns of co-accessibility. We applied svmATAC to several scATAC-seq datasets from human immune cells, human hematopoietic system cells, and peripheral blood mononuclear cells. The benchmark results showed that svmATAC is free of literature-based markers and robust across datasets from different libraries and platforms. The source code of svmATAC is available at https://github.com/mrcuizhe/svmATAC under the MIT license.

14.

interacCircos: an R package based on JavaScript libraries for the generation of interactive circos plots.

Cui, Zhe; Cui, Ya; Zang, Tianyi; Wang, Yadong.

Bioinformatics ; 37(20): 3642-3644, 2021 Oct 25.

Article in English | MEDLINE | ID: mdl-33830205

ABSTRACT

SUMMARY: JavaScript-based Circos libraries have been widely implemented to generate interactive Circos plots in web applications. However, these libraries require either local installation, which requires the compilation of extra libraries, or extra data processing procedures to prepare input and configuration for each track of plot, which limits the utility and capability of integration with powerful R packages. In this report, we present interacCircos, an R package for creating interactive Circos plots through the integration of JavaScript-based libraries. interacCircos can simply and flexibly implement 14 track-plot functions and 7 auxiliary functions for presenting large-scale genomic data in interactive Circos plots. AVAILABILITY AND IMPLEMENTATION: InteracCircos and its manual are freely available at https://github.com/mrcuizhe/interacCircos under the GPL license. The online documentation is available at https://mrcuizhe.github.io/interacCircos_documentation/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

15.

Prediction and collection of protein-metabolite interactions.

Zhao, Tianyi; Liu, Jinxin; Zeng, Xi; Wang, Wei; Li, Sheng; Zang, Tianyi; Peng, Jiajie; Yang, Yang.

Brief Bioinform ; 22(5)2021 Sep 02.

Article in English | MEDLINE | ID: mdl-33554247

ABSTRACT

Interactions between proteins and small molecule metabolites play vital roles in regulating protein functions and controlling various cellular processes. The activities of metabolic enzymes, transcription factors, transporters and membrane receptors can all be mediated through protein-metabolite interactions (PMIs). Compared with the rich knowledge of protein-protein interactions, little is known about PMIs. To the best of our knowledge, no existing database has been developed for collecting PMIs. The recent rapid development of large-scale mass spectrometry analysis of biomolecules has led to the discovery of large amounts of PMIs. Therefore, we developed the PMI-DB to provide a comprehensive and accurate resource of PMIs. A total of 49 785 entries were manually collected in the PMI-DB, corresponding to 23 small molecule metabolites, 9631 proteins and 4 species. Unlike other databases that only provide positive samples, the PMI-DB provides non-interaction between proteins and metabolites, which not only reduces the experimental cost for biological experimenters but also facilitates the construction of more accurate algorithms for researchers using machine learning. To show the convenience of the PMI-DB, we developed a deep learning-based method to predict PMIs in the PMI-DB and compared it with several methods. The experimental results show that the area under the curve and area under the precision-recall curve of our method are 0.88 and 0.95, respectively. Overall, the PMI-DB provides a user-friendly interface for browsing the biological functions of metabolites/proteins of interest, and experimental techniques for identifying PMIs in different species, which provides important support for furthering the understanding of cellular processes. The PMI-DB is freely accessible at http://easybioai.com/PMIDB.

Subjects

Deep Learning; Escherichia coli/metabolism; Metabolome; Protein Interaction Maps; Proteins/metabolism; Yeasts/metabolism; Animals; Chromatography, Liquid; Databases, Protein; Humans; Mass Spectrometry; Metabolomics; Mice; User-Computer Interface

16.

Identifying drug-target interactions based on graph convolutional network and deep neural network.

Zhao, Tianyi; Hu, Yang; Valsdottir, Linda R; Zang, Tianyi; Peng, Jiajie.

Brief Bioinform ; 22(2): 2141-2150, 2021 Mar 22.

Article in English | MEDLINE | ID: mdl-32367110

ABSTRACT

Identification of new drug-target interactions (DTIs) is an important but time-consuming and costly step in drug discovery. In recent years, to mitigate these drawbacks, researchers have sought to identify DTIs using computational approaches. However, most existing methods construct drug networks and target networks separately, and then predict novel DTIs based on known associations between the drugs and targets without accounting for associations between drug-protein pairs (DPPs). To incorporate the associations between DPPs into DTI modeling, we built a DPP network based on multiple drugs and proteins, in which DPPs are the nodes and the associations between DPPs are the edges. We then propose a novel learning-based framework, 'graph convolutional network (GCN)-DTI', for DTI identification. The model first uses a graph convolutional network to learn the features of each DPP. Second, using the feature representation as input, it uses a deep neural network to predict the final label. The results of our analysis show that the proposed framework outperforms some state-of-the-art approaches by a large margin.

Subjects

Deep Learning; Drug Delivery Systems; Neural Networks, Computer; Algorithms; Humans

17.

DRACP: a novel method for identification of anticancer peptides.

Zhao, Tianyi; Hu, Yang; Zang, Tianyi.

BMC Bioinformatics ; 21(Suppl 16): 559, 2020 Dec 16.

Article in English | MEDLINE | ID: mdl-33323099

ABSTRACT

BACKGROUND: Millions of people suffer from cancer, but accurate early diagnosis and effective treatment remain difficult. Common treatments include surgery, radiotherapy and chemotherapy. However, they are all very harmful to patients. Recently, anticancer peptides (ACPs) have been identified as a potential way to treat cancer. Since ACPs are natural biologics, they are safer than other methods. However, experimental identification of ACPs is expensive, so we propose a new machine learning method to identify them. RESULTS: First, we extracted features of ACPs from two aspects: sequence and the chemical characteristics of amino acids. For sequence, the average composition of the 20 amino acids was extracted. For chemical characteristics, we classified amino acids into six groups based on the patterns of hydrophobic and hydrophilic residues. Then, a deep belief network was used to encode the features of ACPs. Finally, we proposed Random Relevance Vector Machines to identify true ACPs. We call this method 'DRACP' and tested its performance on two independent datasets. Its AUC and AUPR are higher than 0.9 on both datasets. CONCLUSION: We developed a novel method named 'DRACP' and compared it with some traditional methods. The cross-validation results showed its effectiveness in identifying ACPs.
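The sequence feature described in the RESULTS, the composition of the 20 standard amino acids, can be sketched as a 20-dimensional frequency vector. The function name `amino_acid_composition` and the example peptide are hypothetical, and the six-group chemical encoding is not reproduced here because the abstract does not specify the grouping.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(peptide):
    """Fraction of each standard amino acid in the peptide (20-dimensional vector).

    Non-standard characters are ignored when counting.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    valid = 0
    for residue in peptide.upper():
        if residue in counts:
            counts[residue] += 1
            valid += 1
    if valid == 0:
        raise ValueError("peptide contains no standard amino acids")
    return [counts[aa] / valid for aa in AMINO_ACIDS]


# Hypothetical short peptide used only to show the output shape.
vector = amino_acid_composition("GLFKKLAK")
print(dict(zip(AMINO_ACIDS, vector)))
```

Because the vector has a fixed length regardless of peptide length, it can be fed directly to a downstream encoder or classifier.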

Subjects

Antineoplastic Agents/therapeutic use , Computational Biology/methods , Peptides/therapeutic use , Humans , Machine Learning , Neoplasms/drug therapy , Peptides/chemistry , ROC Curve , Support Vector Machine

18.

Identifying Protein Biomarkers in Blood for Alzheimer's Disease.

Zhao, Tianyi; Hu, Yang; Zang, Tianyi; Wang, Yadong.

Front Cell Dev Biol ; 8: 472, 2020.

Article in English | MEDLINE | ID: mdl-32626709

ABSTRACT

Background: At present, the main diagnostic methods for Alzheimer's disease (AD) are positron emission tomography (PET) scanning of the brain and analysis of cerebrospinal fluid (CSF) samples, but these methods are expensive and harmful to patients. Recently, more researchers have focused on diagnosing AD by detecting biomarkers in blood, which is cheaper and harmless. Therefore, identifying AD-related proteins in blood can help treatment and diagnosis. Methods: We hypothesized that similar diseases share similar proteins, that is, diseases with similar symptoms are caused by abnormalities of similar proteins. Assuming that the similarities between AD and other diseases follow a normal distribution, we developed an iterative method based on disease similarity (IBDS). We combined Elastic Network (EN) with Minimum angle regression (MAR) to find the optimal solution. Finally, we used case studies and Summary data-based Mendelian Randomization (SMR) to verify our method. Results: We selected 39 diseases that are highly related to AD. They correspond to 1,481 proteins. One hundred and eighty-four proteins are reported to be related to AD in UniProt, and our method raises this number to 284. The AUC of our method by cross-validation is 0.9251, which is much higher than that of previous methods. Conclusion: In this paper, we presented a novel method for prioritizing AD-related proteins. Among these 284 proteins, seven show tissue specificity in blood and could be used to diagnose AD in the future. Case studies and SMR were used to support the relationship between these 7 proteins and AD. Availability and Implementation: https://github.com/zty2009/Identifying-Protein-Biomarkers-in-Blood-for-Alzheimer-s-Disease.

19.

A Review of Drug Side Effect Identification Methods.

Deng, Shuai; Sun, Yige; Zhao, Tianyi; Hu, Yang; Zang, Tianyi.

Curr Pharm Des ; 26(26): 3096-3104, 2020.

Article in English | MEDLINE | ID: mdl-32532187

ABSTRACT

Drug side effects have become an important indicator for evaluating the safety of drugs. Two main factors contribute to the frequent occurrence of drug safety problems: on the one hand, the clinical understanding of drug side effects is insufficient, leading to frequent adverse drug reactions; on the other hand, owing to the long duration and complexity of clinical trials, side effects of approved drugs on the market cannot be reported in a timely manner. Therefore, many researchers have focused on developing methods to identify drug side effects. In this review, we summarize the methods of identifying drug side effects and the common databases in this field. We classify methods of identifying side effects into four categories: biological experimental, machine learning, text mining and network methods. We point out the key points of each kind of method and explain the advantages and disadvantages of each. Finally, we propose future research directions.

Subjects

Drug-Related Side Effects and Adverse Reactions , Data Mining , Databases, Factual , Drug-Related Side Effects and Adverse Reactions/epidemiology , Humans , Machine Learning

20.

MRTFB regulates the expression of NOMO1 in colon.

Zhao, Tianyi; Hu, Yang; Zang, Tianyi; Cheng, Liang.

Proc Natl Acad Sci U S A ; 117(14): 7568-7569, 2020 04 07.

Article in English | MEDLINE | ID: mdl-32184333

Subjects

Colonic Neoplasms , Colorectal Neoplasms , CD146 Antigen , Colon , Humans , Transcription Factors
