1.
IEEE J Biomed Health Inform ; 28(1): 569-579, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37991904

ABSTRACT

Adverse drug-drug interactions (DDIs) pose potential risks in polypharmacy due to unknown physicochemical incompatibilities between co-administered drugs. Recent studies have utilized multi-layer graph neural network architectures to model hierarchical molecular substructures of drugs, achieving excellent DDI prediction performance. While existing substructural frameworks effectively encode interactions from atom-level features, they overlook valuable chemical bond representations within molecular graphs. More critically, although DDI prediction tasks involve both known and novel drug combinations, previous methods lack tailored strategies for these distinct scenarios, and the resulting lack of adaptability impedes further improvements in model performance. To tackle these challenges, we propose PEB-DDI, a DDI prediction learning framework with enhanced substructure extraction. First, chemical bond information is integrated and updated synchronously with the atomic nodes. Then, different dual-view strategies are selected based on whether novel drugs are present in the prediction task. In particular, we construct a molecular fingerprint-molecular graph view for the transductive task and a bipartite graph-molecular graph view for the inductive task. Rigorous evaluations on benchmark datasets underscore PEB-DDI's superior performance. Notably, on DrugBank, it achieves an accuracy of 98.18% when predicting previously unknown interactions among approved drugs. Even when faced with novel drugs, PEB-DDI generalizes well, with an accuracy of 88.06%, which we attribute to the effective transfer of basic molecular structure learning.


Subjects
Neural Networks, Computer ; Humans ; Drug Interactions
2.
J Chem Inf Model ; 64(1): 238-249, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38103039

ABSTRACT

Drug repositioning plays a key role in disease treatment. As large-scale chemical data grow, many computational methods are used for drug-disease association prediction. However, most existing models neglect the positive influence of non-Euclidean data and multisource information, and how to set the feature diffusion distance remains a critical issue for graph neural networks. To solve these problems, we propose SiSGC, which uses biological knowledge as initial features and learns structural information from the constructed heterogeneous graph with adaptive selection of the information diffusion distance. The structural features are then fused with the denoised similarity information and fed to a CatBoost classifier to make predictions. Three different data sets are used to confirm the robustness and generalization of SiSGC under two splitting strategies. Experimental results demonstrate that the proposed model achieves superior performance compared with six leading methods and four variants. Our case study on breast neoplasms further indicates that SiSGC is trustworthy and robust yet simple. We also present four high-confidence candidate drugs for breast cancer treatment and give an explanation supporting their rationality. SiSGC can be used as a beneficial supplement for drug repositioning.


Subjects
Drug Repositioning ; Neural Networks, Computer
3.
J Theor Biol ; 571: 111538, 2023 08 21.
Article in English | MEDLINE | ID: mdl-37257720

ABSTRACT

The gut microbial community has been shown to play a significant role in various diseases, including colorectal cancer (CRC), which is a major public health concern worldwide. The accurate diagnosis and etiological analysis of CRC are crucial issues. Numerous methods have utilized gut microbiota to address these challenges; however, few have considered the complex interactions and individual heterogeneity of the gut microbiota, which are important issues in genetics and intestinal microbiology, particularly in high-dimensional cases. This paper presents a novel method called Binary matrix based on Logistic Regression (LRBmat) to address these concerns. The binary matrix in LRBmat can directly mitigate or eliminate the influence of heterogeneity while also capturing information on gut microbial interactions of any order. LRBmat is highly adaptable and can be combined with any machine learning method to enhance its capabilities. The proposed method was evaluated using real CRC data and demonstrated superior classification performance compared to state-of-the-art methods. Furthermore, the association rules extracted from the binary matrix of the real data align well with biological properties and existing literature, thereby aiding in the etiological analysis of CRC.


Subjects
Colorectal Neoplasms ; Gastrointestinal Microbiome ; Microbiota ; Humans ; Microbial Interactions
4.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35039838

ABSTRACT

Drug repositioning is an efficient and promising strategy for traditional drug discovery and development. Many research efforts focus on deep-learning approaches based on a heterogeneous network for modeling complex drug-disease associations. Like traditional latent factor models, which directly factorize drug-disease associations, these approaches assume that the neighbors are independent of each other in the network and thus tend to be ineffective at capturing localized information. In this study, we propose a novel neighborhood and neighborhood interaction-based neural collaborative filtering approach (called DRWBNCF) to infer novel potential drugs for diseases. Specifically, we first construct three networks: the known drug-disease association network and the drug-drug and disease-disease similarity networks (built using the nearest neighbors). To take advantage of localized information in the three networks, we then design an integration component by proposing a new weighted bilinear graph convolution operation that integrates the information of the known drug-disease association, the drug's and disease's neighborhood, and neighborhood interactions into a unified representation. Lastly, we introduce a prediction component, which uses a multi-layer perceptron optimized by the α-balanced focal loss function and graph regularization to model the complex drug-disease associations. Benchmarking comparisons on three datasets verified the effectiveness of DRWBNCF for drug repositioning. Importantly, the unknown drug-disease associations predicted by DRWBNCF were validated against clinical trials and three authoritative databases, and we list several new DRWBNCF-predicted potential drugs for breast cancer (e.g. valrubicin and teniposide) and small cell lung cancer (e.g. valrubicin and cytarabine).


Subjects
Algorithms ; Drug Repositioning ; Computational Biology ; Databases, Factual ; Drug Discovery ; Neural Networks, Computer
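
The abstract above names the α-balanced focal loss as the training objective. As a rough illustration of that loss in its standard binary form, FL = -α_t (1 - p_t)^γ log(p_t), here is a minimal sketch. The function name, the default values of alpha and gamma, and the clipping constant eps are assumptions made for illustration; they are not taken from the DRWBNCF paper or its code.

```python
import math


def alpha_balanced_focal_loss(probs, labels, alpha=0.25, gamma=2.0, eps=1e-7):
    """Mean alpha-balanced focal loss for binary labels.

    probs  : predicted probabilities of the positive class
    labels : 0/1 ground-truth labels
    alpha, gamma, eps : illustrative defaults (assumptions, not paper values)
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clip to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        # alpha_t weights positives by alpha and negatives by (1 - alpha).
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma down-weights examples that are already well classified.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)
```

With gamma set to 0 and alpha set to 0.5, the expression reduces to half the ordinary binary cross-entropy, which is a convenient sanity check.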
5.
ACS Omega ; 6(37): 23998-24008, 2021 Sep 21.
Article in English | MEDLINE | ID: mdl-34568678

ABSTRACT

Cancer is one of the most dangerous threats to human health. Accurate identification of anticancer peptides (ACPs) is valuable for the development and design of new anticancer agents. However, most machine-learning algorithms have limited ability to identify ACPs, and their accuracy is sensitive to the amount of labeled data. In this paper, we construct a new method, called ACP-ALPM, that combines active learning (AL) with a label propagation (LP) algorithm to solve this problem. First, we develop an efficient feature representation method based on various descriptor information and coding information of the peptide sequence. Then, an AL strategy is used to filter out the most informative data for model training, and a more powerful LP classifier is built through successive iterations. Finally, we evaluate the performance of ACP-ALPM and compare it with that of some state-of-the-art and classic methods; experimental results show that our method is significantly superior to them. In addition, an experimental comparison of random selection and AL on three public data sets shows that the AL strategy is more effective. Notably, a visualization experiment further verified that AL can utilize unlabeled data to improve the performance of the model. We hope that our method can be extended to other types of peptides and provide more inspiration for other similar work.

6.
BMC Bioinformatics ; 22(1): 216, 2021 Apr 26.
Article in English | MEDLINE | ID: mdl-33902446

ABSTRACT

BACKGROUND: Carbonylation is a non-enzymatic, irreversible protein post-translational modification in which the side chains of amino acid residues are attacked by reactive oxygen species and finally converted into carbonyl products. Studies have shown that protein carbonylation caused by reactive oxygen species is involved in the etiology and pathophysiological processes of aging, neurodegenerative diseases, inflammation, diabetes, amyotrophic lateral sclerosis, Huntington's disease, and tumors. Current experimental approaches used to identify carbonylation sites are expensive, time-consuming, and limited in protein processing abilities. Computational prediction of the carbonylation residue location in protein post-translational modifications enhances the functional characterization of proteins. RESULTS: In this study, an integrated classifier algorithm, CarSite-II, was developed to identify K, P, R, and T carbonylated sites. A resampling method combining K-means similarity-based undersampling with the synthetic minority oversampling technique (SMOTE-KSU) was incorporated to balance the proportions of K, P, R, and T carbonylated training samples. Next, the integrated classifier system Rotation Forest uses support vector machine sub-classifiers to divide three types of feature spaces into several subsets. By tenfold cross-validation, CarSite-II achieved Matthews correlation coefficient (MCC) values of 0.2287/0.3125/0.2787/0.2814, false positive rates of 0.2628/0.1084/0.1383/0.1313, and false negative rates of 0.2252/0.0205/0.0976/0.0608 for K/P/R/T carbonylation sites, respectively. On our independent test dataset, CarSite-II yielded MCC values of 0.6358/0.2910/0.4629/0.3685, false positive rates of 0.0165/0.0203/0.0188/0.0094, and false negative rates of 0.1026/0.1875/0.2037/0.3333 for K/P/R/T carbonylation sites. The results show that CarSite-II achieves remarkably better performance than all currently available prediction tools.
CONCLUSION: The results show that CarSite-II performs better than the five currently available programs and demonstrate the usefulness of the SMOTE-KSU resampling approach and the integration algorithm. For the convenience of experimental scientists, the CarSite-II web tool is available at http://47.100.136.41:8081/.


Subjects
Algorithms ; Proteins ; Protein Carbonylation ; Protein Processing, Post-Translational ; Proteins/metabolism ; Support Vector Machine
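
The abstract above reports Matthews correlation coefficient (MCC), false positive rate, and false negative rate. For readers who want to see how these three figures follow from a confusion matrix, here is a minimal sketch using the standard definitions. The function name is illustrative, and the counts in the usage note are hypothetical; neither comes from the CarSite-II paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return MCC, false positive rate and false negative rate.

    tp, fp, tn, fn are the four confusion-matrix counts.
    Undefined ratios (zero denominators) are reported as 0.0 by convention here.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # FPR: share of true negatives wrongly called positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # FNR: share of true positives that were missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return {"mcc": mcc, "fpr": fpr, "fnr": fnr}
```

For example, with hypothetical counts tp=40, fp=10, tn=90, fn=10, the function gives an MCC of 0.7, a false positive rate of 0.1, and a false negative rate of 0.2.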
7.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33313672

ABSTRACT

The peptide therapeutics market is providing new opportunities for the biotechnology and pharmaceutical industries. Therefore, identifying therapeutic peptides and exploring their properties are important. Although several studies have proposed different machine learning methods to predict whether peptides are therapeutic, most do not explain the decision factors of the model in detail. In this work, an Interpretable Therapeutic Peptide Prediction (ITP-Pred) model based on efficient feature fusion was developed. First, we proposed three kinds of feature descriptors based on sequence and physicochemical property encoding, namely amino acid composition (AAC), group AAC, and coding autocorrelation, and concatenated them to obtain the feature representation of a therapeutic peptide. Then, we input this representation into a CNN-bidirectional long short-term memory (CNN-BiLSTM) model to automatically learn to recognize therapeutic peptides. The results of cross-validation and independent validation experiments indicated that ITP-Pred has higher prediction performance on the benchmark dataset than the other methods compared. Finally, we analyzed the output of the model from two aspects, sequence order and physicochemical properties, mining important features as guidance for the design of better models that can complement existing methods.


Subjects
Machine Learning ; Models, Genetic ; Peptides/genetics ; Sequence Analysis, Protein ; Peptides/chemistry ; Peptides/therapeutic use
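
Of the three descriptors named in the abstract above, amino acid composition (AAC) is the simplest: the relative frequency of each of the 20 standard residues in a peptide. The sketch below shows that standard definition only. The function name, the residue ordering, and the choice to ignore non-standard characters are assumptions, not details of ITP-Pred.

```python
# The 20 standard amino acids in one-letter code (alphabetical order is an
# arbitrary but common convention).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.

    Characters outside the 20 standard residues are ignored (an assumption
    made for this sketch).
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    n = len(residues)
    return [residues.count(aa) / n for aa in AMINO_ACIDS]
```

For the toy peptide "AAKK", the entries for A and K are both 0.5 and every other entry is 0.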
8.
Mol Ther Nucleic Acids ; 16: 566-575, 2019 Jun 07.
Article in English | MEDLINE | ID: mdl-31077936

ABSTRACT

Identifying disease-related microRNAs (miRNAs) is an essential but challenging task in bioinformatics research. Much effort has been devoted to discovering the underlying associations between miRNAs and diseases. However, most studies mainly focus on designing advanced methods to improve prediction accuracy while neglecting to investigate the link predictability of the relationships between miRNAs and diseases. In this work, we construct a heterogeneous network by integrating neighborhood information in the neural network to predict potential associations between miRNAs and diseases, while also accounting for the imbalance of the datasets. We also employ a new computational method, a neural network model for miRNA-disease association prediction (NNMDA). This model predicts miRNA-disease associations by integrating multiple biological data resources. Comparison of our work with other algorithms reveals the reliable performance of NNMDA. Its average AUC was 0.937 over 15 diseases in 5-fold cross-validation, and its AUC was 0.8439 in leave-one-out cross-validation. The results indicate that NNMDA could be used in evaluating the accuracy of miRNA-disease associations. Moreover, NNMDA was applied to two common human diseases in two types of case studies. In the first type, 26 out of the top 30 predicted miRNAs for lung neoplasms were confirmed by experiments. In the second type, a case study for new diseases without any known related miRNAs, we selected breast neoplasms as the test example by hiding the association information between the miRNAs and this disease. The results verified 50 out of the top 50 predicted breast-neoplasm-related miRNAs.
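
The AUC values quoted in the abstract above are areas under the ROC curve. One standard way to compute that area is the rank (Mann-Whitney) formulation: the fraction of positive-negative pairs in which the positive example receives the higher score, with ties counted as one half. The sketch below implements that definition; the function name and the toy scores in the usage note are illustrative and are not data from the NNMDA study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : predicted association scores
    labels : 1 for a known association, 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

With toy scores [0.9, 0.8, 0.3, 0.1] and labels [1, 0, 1, 0], three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75. The pairwise loop is quadratic; it is written for clarity rather than speed.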

9.
PLoS Comput Biol ; 15(2): e1006772, 2019 02.
Article in English | MEDLINE | ID: mdl-30779739

ABSTRACT

Recent advances in next-generation sequencing and computational technologies have enabled routine analysis of large-scale single-cell ribonucleic acid sequencing (scRNA-seq) data. However, scRNA-seq technologies have suffered from several technical challenges, including low mean expression levels in most genes and higher frequencies of missing data than bulk population sequencing technologies. Identifying functional gene sets and their regulatory networks that link specific cell types to human diseases and therapeutics from scRNA-seq profiles is a daunting task. In this study, we developed a Component Overlapping Attribute Clustering (COAC) algorithm to perform localized (cell subpopulation) gene co-expression network analysis from large-scale scRNA-seq profiles. Gene subnetworks that represent specific gene co-expression patterns are inferred from the components of a decomposed matrix of scRNA-seq profiles. We showed that single-cell gene subnetworks identified by COAC from multiple time points within cell phases can be used for cell type identification with high accuracy (83%). In addition, COAC-inferred subnetworks from melanoma patients' scRNA-seq profiles are highly correlated with survival rate from The Cancer Genome Atlas (TCGA). Moreover, the localized gene subnetworks identified by COAC from individual patients' scRNA-seq data can be used as pharmacogenomics biomarkers to predict drug responses (the area under the receiver operating characteristic curve ranges from 0.728 to 0.783) in cancer cell lines from the Genomics of Drug Sensitivity in Cancer (GDSC) database. In summary, COAC offers a powerful tool to identify potential network-based diagnostic and pharmacogenomics biomarkers from large-scale scRNA-seq profiles. COAC is freely available at https://github.com/ChengF-Lab/COAC.


Subjects
Sequence Analysis, RNA/methods ; Single-Cell Analysis/methods ; Algorithms ; Base Sequence/genetics ; Cluster Analysis ; Data Analysis ; Gene Expression Profiling/methods ; Gene Expression Regulation ; Genome ; High-Throughput Nucleotide Sequencing ; RNA, Small Cytoplasmic/genetics ; ROC Curve ; Software
10.
Article in English | MEDLINE | ID: mdl-29990255

ABSTRACT

MicroRNAs (miRNAs) play critical roles in regulating gene expression at the post-transcriptional level. Numerous experimental studies indicate that alterations and dysregulations in miRNAs are associated with important complex diseases, especially cancers. Predicting potential miRNA-disease associations is beneficial not only for exploring the pathogenesis of diseases but also for understanding biological processes. In this work, we propose two methods that can effectively predict potential miRNA-disease associations using our reconstructed miRNA and disease similarity networks, which are based on the latest experimental data. We reconstruct a miRNA functional similarity network using the following biological information: miRNA family information, miRNA cluster information, experimentally validated miRNA-target associations, and disease-miRNA information. We also reconstruct a disease similarity network using disease functional information and disease semantic information. We present two methods, Katz with specific weights and Katz with machine learning, on the comprehensive heterogeneous network. These methods, which achieve AUC values of 0.897 and 0.919, respectively, outperform the existing methods. Comprehensive data networks and reasonable considerations underpin the high performance of our methods. Unlike several methods that cannot work in such situations, the proposed methods also predict associations for diseases without any known related miRNAs. A web service for the download and prediction of relationships between diseases and miRNAs is available at http://lab.malab.cn/soft/MDPredict/.


Subjects
MicroRNAs ; Neoplasms ; Systems Biology/methods ; Databases, Genetic ; Disease Progression ; Humans ; MicroRNAs/classification ; MicroRNAs/genetics ; MicroRNAs/metabolism ; Models, Statistical ; Neoplasms/diagnosis ; Neoplasms/genetics ; Neoplasms/metabolism ; ROC Curve
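
Both methods in the abstract above build on the Katz measure, which scores a pair of nodes by summing the walks between them, damping a walk of length k by beta^k. The sketch below computes the plain, truncated Katz score S = beta*A + beta^2*A^2 + ... + beta^K*A^K on a small adjacency matrix. It does not reproduce the authors' specific weights, their machine-learning variant, or their heterogeneous network; the function name, beta, and max_len are illustrative assumptions.

```python
def katz_scores(adj, beta=0.1, max_len=3):
    """Truncated Katz similarity for every node pair.

    adj     : square adjacency matrix as a list of lists
    beta    : damping factor applied per step (illustrative value)
    max_len : longest walk length K included in the sum (illustrative value)
    """
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # holds A^k, starting at k = 1
    weight = beta                    # holds beta^k
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        # Advance to A^(k+1) and beta^(k+1).
        power = [
            [sum(power[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        weight *= beta
    return scores
```

On the three-node path 0-1-2 with beta=0.5 and max_len=3, nodes 0 and 2 share no edge but are joined by one walk of length 2, so their score is 0.25; this is how a Katz-style method can rank unobserved pairs.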
11.
Bioinformatics ; 34(14): 2425-2432, 2018 07 15.
Article in English | MEDLINE | ID: mdl-29490018

ABSTRACT

Motivation: The identification of disease-related microRNAs (miRNAs) is an essential but challenging task in bioinformatics research. Similarity-based link prediction methods are often used to predict potential associations between miRNAs and diseases. In these methods, all unobserved associations are ranked by their similarity scores, and a higher score indicates a higher probability of existence. However, most previous studies mainly focus on designing advanced methods to improve prediction accuracy while neglecting to investigate the link predictability of the networks that represent miRNA-disease associations. In this work, we construct a bilayer network by integrating the miRNA-disease network, the miRNA similarity network, and the disease similarity network. We use structural consistency as an indicator to estimate the link predictability of the related networks. On the basis of this indicator, a derivative algorithm, called the structural perturbation method (SPM), is applied to predict potential associations between miRNAs and diseases. Results: The link predictability of the bilayer network is higher than that of the miRNA-disease network, indicating that predicting potential miRNA-disease associations on the bilayer network can achieve higher accuracy than prediction based merely on the miRNA-disease network. A comparison between SPM and other algorithms reveals the reliable performance of SPM, which performed well in 5-fold cross-validation. We test fifteen networks. The AUC values of SPM are higher than those of some well-known methods, indicating that SPM could serve as a useful computational method for improving the identification accuracy of miRNA-disease associations. Moreover, in a case study on breast neoplasm, 80% of the top-20 predicted miRNAs have been manually confirmed by previous experimental studies. Availability and implementation: https://github.com/lecea/SPM-code.git. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods ; Disease Susceptibility ; Genetic Association Studies/methods ; MicroRNAs/metabolism ; Software ; Algorithms ; Breast Neoplasms/genetics ; Breast Neoplasms/metabolism ; Breast Neoplasms/physiopathology ; Female ; Humans ; MicroRNAs/genetics ; MicroRNAs/physiology