Results 1 - 20 of 400
1.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38517693

ABSTRACT

Numerous investigations increasingly indicate the significance of microRNA (miRNA) in human diseases. Hence, unearthing associations between miRNAs and diseases can contribute to precise diagnosis and effective treatment of medical conditions. The detection of miRNA-disease associations (MDAs) via computational techniques utilizing biological information has emerged as a cost-effective and highly efficient approach. Here, we introduce a computational framework named ReHoGCNES, designed for prospective miRNA-disease association prediction (ReHoGCNES-MDA). This method constructs a homogeneous graph convolutional network with a regular graph structure (ReHoGCN) encompassing the disease similarity network, the miRNA similarity network and the known MDA network, and was tested on four experimental tasks. A random edge sampler strategy was utilized to expedite processing and reduce training complexity. Experimental results demonstrate that the proposed ReHoGCNES-MDA method outperforms both a homogeneous graph convolutional network and a heterogeneous graph convolutional network with non-regular graph structure in all four tasks, which implies that a steady degree distribution of the graph plays an important role in enhancing model performance. Besides, ReHoGCNES-MDA is superior to several machine learning algorithms and state-of-the-art methods on MDA prediction. Furthermore, three case studies were conducted to further demonstrate the predictive ability of ReHoGCNES. Consequently, 93.3% (breast neoplasms), 90% (prostate neoplasms) and 93.3% (prostate neoplasms) of the top 30 forecasted miRNAs were validated by public databases. Hence, ReHoGCNES-MDA might serve as a dependable and beneficial model for predicting possible MDAs.


Subjects
MicroRNAs , Prostatic Neoplasms , Humans , Male , Algorithms , Computational Biology/methods , Genetic Databases , MicroRNAs/genetics , Prospective Studies , Prostatic Neoplasms/genetics , Female
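
The random edge sampler mentioned in the ReHoGCNES-MDA abstract above can be pictured with a minimal Python sketch. The abstract does not describe the sampler in detail, so the function name `sample_edges`, the `keep_ratio` parameter and the toy edge list are illustrative assumptions rather than the authors' implementation: at each training step a uniform random subset of edges is drawn and the model would be trained on that smaller subgraph.

```python
import numpy as np

def sample_edges(edges, keep_ratio, rng):
    """Return a uniformly random subset of rows from an (E, 2) edge array.

    Hypothetical helper: keep_ratio is the fraction of edges kept per step.
    """
    edges = np.asarray(edges)
    n_keep = max(1, int(round(keep_ratio * len(edges))))
    # Sample edge indices without replacement so no edge is duplicated.
    idx = rng.choice(len(edges), size=n_keep, replace=False)
    return edges[np.sort(idx)]

# Toy graph with five nodes and six edges (made-up data).
toy_edges = np.array([[0, 1], [0, 2], [1, 2], [2, 3], [3, 4], [0, 4]])
rng = np.random.default_rng(0)
sub_edges = sample_edges(toy_edges, keep_ratio=0.5, rng=rng)
print(sub_edges.shape)
```

Sampling without replacement keeps each retained edge unique, and drawing a fresh subset at every step lets all edges contribute over the course of training while each step touches only part of the graph.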
2.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36526282

ABSTRACT

Identifying unknown protein functional modules, such as protein complexes and biological pathways, from protein-protein interaction (PPI) networks, provides biologists with an opportunity to efficiently understand cellular function and organization. Finding complex nonlinear relationships in underlying functional modules may involve a long-chain of PPI and pose great challenges in a PPI network with an unevenly sparse and dense node distribution. To overcome these challenges, we propose AdaPPI, an adaptive convolution graph network in PPI networks to predict protein functional modules. We first suggest an attributed graph node presentation algorithm. It can effectively integrate protein gene ontology attributes and network topology, and adaptively aggregates low- or high-order graph structural information according to the node distribution by considering graph node smoothness. Based on the obtained node representations, core cliques and expansion algorithms are applied to find functional modules in PPI networks. Comprehensive performance evaluations and case studies indicate that the framework significantly outperforms state-of-the-art methods. We also presented potential functional modules based on their confidence.


Subjects
Protein Interaction Mapping , Protein Interaction Maps , Protein Interaction Mapping/methods , Algorithms , Proteins/genetics , Proteins/metabolism
3.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36631407

ABSTRACT

Recently, peptide-based drugs have gained unprecedented interest in discovering and developing antifungal drugs due to their high efficacy, broad-spectrum activity, low toxicity and few side effects. However, it is time-consuming and expensive to identify antifungal peptides (AFPs) experimentally. Therefore, computational methods for accurately predicting AFPs are highly required. In this work, we develop AFP-MFL, a novel deep learning model that predicts AFPs only relying on peptide sequences without using any structural information. AFP-MFL first constructs comprehensive feature profiles of AFPs, including contextual semantic information derived from a pre-trained protein language model, evolutionary information, and physicochemical properties. Subsequently, the co-attention mechanism is utilized to integrate contextual semantic information with evolutionary information and physicochemical properties separately. Extensive experiments show that AFP-MFL outperforms state-of-the-art models on four independent test datasets. Furthermore, the SHAP method is employed to explore each feature contribution to the AFPs prediction. Finally, a user-friendly web server of the proposed AFP-MFL is developed and freely accessible at http://inner.wei-group.net/AFPMFL/, which can be considered as a powerful tool for the rapid screening and identification of novel AFPs.


Subjects
Antifungal Agents , alpha-Fetoproteins , Antifungal Agents/pharmacology , Algorithms , Peptides/chemistry , Computational Biology/methods
4.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36702755

ABSTRACT

Due to the high heterogeneity and complexity of cancers, patients with different cancer subtypes often have distinct groups of genomic and clinical characteristics. Therefore, the discovery and identification of cancer subtypes are crucial to cancer diagnosis, prognosis and treatment. Recent technological advances have accelerated the increasing availability of multi-omics data for cancer subtyping. To take advantage of the complementary information from multi-omics data, it is necessary to develop computational models that can represent and integrate different layers of data into a single framework. Here, we propose a decoupled contrastive clustering method (Subtype-DCC) based on multi-omics data integration for clustering to identify cancer subtypes. The idea of contrastive learning is introduced into deep clustering based on deep neural networks to learn clustering-friendly representations. Experimental results demonstrate the superior performance of the proposed Subtype-DCC model in identifying cancer subtypes over the currently available state-of-the-art clustering methods. The strength of Subtype-DCC is also supported by the survival and clinical analysis.


Subjects
Multiomics , Neoplasms , Humans , Algorithms , Genomics/methods , Neoplasms/genetics , Cluster Analysis , DCC Receptor
5.
PLoS Comput Biol ; 20(2): e1011935, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38416785

ABSTRACT

Spatial transcriptomic (ST) clustering employs spatial and transcription information to group spots spatially coherent and transcriptionally similar together into the same spatial domain. Graph convolution network (GCN) and graph attention network (GAT), fed with spatial coordinates derived adjacency and transcription profile derived feature matrix are often used to solve the problem. Our proposed method STGIC (spatial transcriptomic clustering with graph and image convolution) is designed for techniques with regular lattices on chips. It utilizes an adaptive graph convolution (AGC) to get high quality pseudo-labels and then resorts to dilated convolution framework (DCF) for virtual image converted from gene expression information and spatial coordinates of spots. The dilation rates and kernel sizes are set appropriately and updating of weight values in the kernels is made to be subject to the spatial distance from the position of corresponding elements to kernel centers so that feature extraction of each spot is better guided by spatial distance to neighbor spots. Self-supervision realized by Kullback-Leibler (KL) divergence, spatial continuity loss and cross entropy calculated among spots with high confidence pseudo-labels make up the training objective of DCF. STGIC attains state-of-the-art (SOTA) clustering performance on the benchmark dataset of 10x Visium human dorsolateral prefrontal cortex (DLPFC). Besides, it's capable of depicting fine structures of other tissues from other species as well as guiding the identification of marker genes. Also, STGIC is expandable to Stereo-seq data with high spatial resolution.


Subjects
Gene Expression Profiling , Transcriptome , Humans , Transcriptome/genetics , Benchmarking , Cluster Analysis , Entropy
6.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35180781

ABSTRACT

Although there are a large number of structural variations in the chromosomes of each individual, there is a lack of more accurate methods for identifying clinical pathogenic variants. Here, we proposed SVPath, a machine learning-based method to predict the pathogenicity of deletions, insertions and duplications structural variations that occur in exons. We constructed three types of annotation features for each structural variation event in the ClinVar database. First, we treated complex structural variations as multiple consecutive single nucleotide polymorphisms events, and annotated them with correlation scores based on single nucleic acid substitutions, such as the impact on protein function. Second, we determined which genes the variation occurred in, and constructed gene-based annotation features for each structural variation. Third, we also calculated related features based on the transcriptome, such as histone signal, the overlap ratio of variation and genomic element definitions, etc. Finally, we employed a gradient boosting decision tree machine learning method, and used the deletions, insertions and duplications in the ClinVar database to train a structural variation pathogenicity prediction model SVPath. These structural variations are clearly indicated as pathogenic or benign. Experimental results show that our SVPath has achieved excellent predictive performance and outperforms existing state-of-the-art tools. SVPath is very promising in evaluating the clinical pathogenicity of structural variants. SVPath can be used in clinical research to predict the clinical significance of unknown pathogenicity and new structural variation, so as to explore the relationship between diseases and structural variations in a computational way.


Subjects
Machine Learning , Single Nucleotide Polymorphism , Exons , Humans , Molecular Sequence Annotation , Virulence
7.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34671814

ABSTRACT

One of the main problems with the joint use of multiple drugs is that it may cause adverse drug interactions and side effects that damage the body. Therefore, it is important to predict potential drug interactions. However, most of the available prediction methods can only predict whether two drugs interact or not, whereas few methods can predict interaction events between two drugs. Accurately predicting interaction events of two drugs is more useful for researchers to study the mechanism of the interaction of two drugs. In the present study, we propose a novel method, MDF-SA-DDI, which predicts drug-drug interaction (DDI) events based on multi-source drug fusion, multi-source feature fusion and transformer self-attention mechanism. MDF-SA-DDI is mainly composed of two parts: multi-source drug fusion and multi-source feature fusion. First, we combine two drugs in four different ways and input the combined drug feature representation into four different drug fusion networks (Siamese network, convolutional neural network and two auto-encoders) to obtain the latent feature vectors of the drug pairs, in which the two auto-encoders have the same structure, and their main difference is the number of neurons in the input layer of the two auto-encoders. Then, we use transformer blocks that include self-attention mechanism to perform latent feature fusion. We conducted experiments on three different tasks with two datasets. On the small dataset, the area under the precision-recall-curve (AUPR) and F1 scores of our method on task 1 reached 0.9737 and 0.8878, respectively, which were better than the state-of-the-art method. On the large dataset, the AUPR and F1 scores of our method on task 1 reached 0.9773 and 0.9117, respectively. In task 2 and task 3 of two datasets, our method also achieved the same or better performance as the state-of-the-art method. More importantly, the case studies on five DDI events are conducted and achieved satisfactory performance. 
The source codes and data are available at https://github.com/ShenggengLin/MDF-SA-DDI.


Subjects
Drug-Related Side Effects and Adverse Reactions , Computer Neural Networks , Drug Interactions , Humans , Oligosaccharides , Software
8.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-36027578

ABSTRACT

Anatomical Therapeutic Chemical (ATC) classification of compounds/drugs plays an important role in drug development and basic research. However, previous methods depend on interactions extracted from the STITCH dataset, which may make them dependent on lab experiments. We present a pilot study to explore the possibility of conducting ATC prediction based solely on molecular structures. The motivation is to eliminate the reliance on costly lab experiments so that the characteristics of a drug can be pre-assessed for better decision-making and effort-saving before actual development. To this end, we construct a new benchmark consisting of 4545 compounds, which is larger in scale than the one used in the previous study. A light-weight prediction model is proposed. The model offers better explainability in the sense that it consists of a straightforward tokenization that extracts and embeds statistically and physicochemically meaningful tokens, and a deep network backed by a set of pyramid kernels to capture multi-resolution chemical structural characteristics. Its efficacy has been validated in experiments where it outperforms the state-of-the-art methods by 15.53% in accuracy and by 69.66% in terms of efficiency. We make the benchmark dataset, source code and web server open to ease the reproduction of this study.


Subjects
Benchmarking , Software , Pilot Projects
9.
Bioinformatics ; 39(12)2023 12 01.
Article in English | MEDLINE | ID: mdl-38015872

ABSTRACT

MOTIVATION: Identifying the functional sites of a protein, such as the binding sites of proteins, peptides, or other biological components, is crucial for understanding related biological processes and drug design. However, existing sequence-based methods have limited predictive accuracy, as they only consider sequence-adjacent contextual features and lack structural information. RESULTS: In this study, DeepProSite is presented as a new framework for identifying protein binding site that utilizes protein structure and sequence information. DeepProSite first generates protein structures from ESMFold and sequence representations from pretrained language models. It then uses Graph Transformer and formulates binding site predictions as graph node classifications. In predicting protein-protein/peptide binding sites, DeepProSite outperforms state-of-the-art sequence- and structure-based methods on most metrics. Moreover, DeepProSite maintains its performance when predicting unbound structures, in contrast to competing structure-based prediction methods. DeepProSite is also extended to the prediction of binding sites for nucleic acids and other ligands, verifying its generalization capability. Finally, an online server for predicting multiple types of residue is established as the implementation of the proposed DeepProSite. AVAILABILITY AND IMPLEMENTATION: The datasets and source codes can be accessed at https://github.com/WeiLab-Biology/DeepProSite. The proposed DeepProSite can be accessed at https://inner.wei-group.net/DeepProSite/.


Subjects
Peptides , Proteins , Protein Binding , Proteins/chemistry , Binding Sites , Software
10.
Microb Pathog ; 189: 106572, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38354987

ABSTRACT

The JCV (John Cunningham Virus) is known to cause progressive multifocal leukoencephalopathy, a condition that results in the formation of tumors. Symptoms of this condition include sensory defects, cognitive dysfunction, muscle weakness, homonosapobia, difficulties with coordination, and aphasia. To date, there is no specific and effective treatment to completely cure or prevent John Cunningham polyomavirus infections, and the best way to control the disease is vaccination. In this study, immunoinformatic tools were used to predict highly immunogenic and non-allergenic B cell, helper T cell (HTL), and cytotoxic T cell (CTL) epitopes from the capsid, major capsid, and T antigen proteins of JC virus in order to design highly efficient subunit vaccines. Specific immunogenic linkers were used to join the predicted epitopes, and the resulting constructs were subjected to 3D modeling using the Robetta server. MD simulation was used to confirm that the newly constructed vaccines are stable and fold properly. Additionally, the molecular docking approach revealed that the vaccines have a strong binding affinity with human TLR-7. The codon adaptation index (CAI) and GC content values verified that the constructed vaccines would be highly expressed in the E. coli pET28a (+) plasmid. The immune simulation analysis indicated that the human immune system would have a strong response to the vaccines, with a high titer of IgM and IgG antibodies being produced. In conclusion, this study provides a pre-clinical concept for constructing an effective, highly antigenic, non-allergenic, and thermostable vaccine to combat infection by the John Cunningham virus.


Subjects
JC Virus , Vaccines , Humans , Epitopes/genetics , Molecular Docking Simulation , Escherichia coli , Vaccinology , Subunit Vaccines/genetics , T-Lymphocyte Epitopes/genetics , Computational Biology , B-Lymphocyte Epitopes , Molecular Dynamics Simulation
11.
Methods ; 220: 1-10, 2023 12.
Article in English | MEDLINE | ID: mdl-37858611

ABSTRACT

The joint use of multiple drugs can result in adverse drug-drug interactions (DDIs) and side effects that harm the body. Accurate identification of DDIs is crucial for avoiding accidental drug side effects and understanding potential mechanisms underlying DDIs. Several computational methods have been proposed for multi-type DDI prediction, but most rely on the similarity profiles of drugs as the drug feature vectors, which may result in information leakage and overoptimistic performance when predicting interactions between new drugs. To address this issue, we propose a novel method, MATT-DDI, for predicting multi-type DDIs based on the original feature vectors of drugs and multiple attention mechanisms. MATT-DDI consists of three main modules: the top k most similar drug pair selection module, heterogeneous attention mechanism module and multi-type DDI prediction module. Firstly, based on the feature vector of the input drug pair (IDP), k drug pairs that are most similar to the input drug pair from the training dataset are selected according to cosine similarity between drug pairs. Then, the vectors of k selected drug pairs are averaged to obtain a new drug pair (NDP). Next, IDP and NDP are fed into heterogeneous attention modules, including scaled dot product attention and bilinear attention, to extract latent feature vectors. Finally, these latent feature vectors are taken as input of the classification module to predict DDI types. We evaluated MATT-DDI on three different tasks. The experimental results show that MATT-DDI provides better or comparable performance compared to several state-of-the-art methods, and its feasibility is supported by case studies. MATT-DDI is a robust model for predicting multi-type DDIs with excellent performance and no information leakage.


Subjects
Drug-Related Side Effects and Adverse Reactions , Humans , Drug Interactions
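
The first two steps described in the MATT-DDI abstract above (select the k training drug pairs most cosine-similar to the input drug pair, then average them into a new drug pair) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `nearest_pair_average`, the three-dimensional toy vectors and the choice k=2 are hypothetical, and the attention and classification modules are not shown.

```python
import numpy as np

def nearest_pair_average(idp, train_pairs, k):
    """Average the k training drug-pair vectors most cosine-similar to idp.

    idp: feature vector of the input drug pair (IDP).
    train_pairs: (n, d) array of training drug-pair vectors.
    Returns the averaged vector (NDP) and the indices of the chosen pairs.
    """
    idp = np.asarray(idp, dtype=float)
    train_pairs = np.asarray(train_pairs, dtype=float)
    # Cosine similarity between the input pair and every training pair.
    norms = np.linalg.norm(train_pairs, axis=1) * np.linalg.norm(idp)
    sims = train_pairs @ idp / np.clip(norms, 1e-12, None)
    # Indices of the k largest similarities, most similar first.
    top_k = np.argsort(-sims)[:k]
    return train_pairs[top_k].mean(axis=0), top_k

# Made-up training pairs and input pair, for illustration only.
train = np.array([[1.0, 0.0, 0.0],
                  [0.9, 0.1, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
idp = np.array([1.0, 0.05, 0.0])
ndp, chosen = nearest_pair_average(idp, train, k=2)
print(chosen, ndp)
```

Because the neighbours are drawn only from the training set, the averaged vector carries no information from test pairs, which is consistent with the abstract's emphasis on avoiding information leakage.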
12.
Biotechnol Appl Biochem ; 71(2): 402-413, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38287712

ABSTRACT

Malonyl-CoA serves as the main building block for the biosynthesis of many important polyketides, as well as fatty acid-derived compounds, such as biofuel. Escherichia coli, Corynebacterium glutamicum, and Saccharomyces cerevisiae have recently been engineered for the biosynthesis of such compounds. However, the developed processes and strains often have insufficient productivity. In the current study, we used an enzyme-engineering approach to improve the binding of acetyl-CoA with acetyl-CoA carboxylase (ACC). We generated different mutations and calculated their impact, which showed that three mutations, S343A, T347W, and S350W, significantly improve substrate binding. Molecular docking investigation revealed an altered binding network compared to the wild type. In the mutants, additional interactions stabilize the binding of the inner tail of acetyl-CoA. Using molecular simulation, the stability, compactness, hydrogen bonding, and protein motions were estimated, revealing dynamic properties exhibited by the mutants but not by the wild type. The findings were further validated by using the binding-free energy (BFE) method, which revealed these mutations as favorable substitutions. The total BFE was reported to be -52.66 ± 0.11 kcal/mol for the wild type, -55.87 ± 0.16 kcal/mol for the S343A mutant, -60.52 ± 0.25 kcal/mol for the T347W mutant, and -59.64 ± 0.25 kcal/mol for the S350W mutant. This shows that the binding of the substrate is increased due to the induced mutations and strongly corroborates the docking results. In sum, this study provides information regarding the essential hotspot residues for substrate binding and can be used for application in industrial processes.


Subjects
Acetyl-CoA Carboxylase , Streptomyces antibioticus , Acetyl-CoA Carboxylase/genetics , Acetyl-CoA Carboxylase/metabolism , Streptomyces antibioticus/metabolism , Acetyl Coenzyme A/genetics , Molecular Docking Simulation , Mutation , Saccharomyces cerevisiae/metabolism , Escherichia coli/metabolism
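
To make the binding-free energy comparison in the abstract above concrete, the short Python sketch below computes the difference between each mutant's total BFE and the wild-type value, using only the mean values quoted in the abstract. The dictionary layout and variable names are illustrative assumptions, and the reported uncertainties are not propagated here.

```python
# Mean total binding-free energies (kcal/mol) as quoted in the abstract.
bfe = {"wild type": -52.66, "S343A": -55.87, "T347W": -60.52, "S350W": -59.64}

# Difference of each mutant from the wild type; a negative value means
# a more favorable (stronger) predicted binding than the wild type.
ddg = {name: round(value - bfe["wild type"], 2)
       for name, value in bfe.items() if name != "wild type"}

for name, delta in ddg.items():
    print(f"{name}: {delta:+.2f} kcal/mol relative to wild type")
```

On these figures T347W shows the largest shift, followed by S350W and then S343A, which matches the ordering implied by the reported totals.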
13.
BMC Genomics ; 24(1): 661, 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37919660

ABSTRACT

Microproteins, prevalent across all kingdoms of life, play a crucial role in cell physiology and human health. Although global gene transcription is widely explored and abundantly available, our understanding of microprotein functions using transcriptome data is still limited. To mitigate this problem, we present a database, Mip-mining (https://weilab.sjtu.edu.cn/mipmining/), underpinned by high-quality RNA-sequencing data exclusively aimed at analyzing microprotein functions. Mip-mining hosts 336 sets of high-quality transcriptome data from 8626 samples and nine representative living organisms, including microorganisms, plants, animals, and humans. The database focuses specifically on a range of diseases and environmental stress conditions, taking into account chemical, physical, biological, and disease-related stresses. In addition, our platform enables customized analysis by inputting desired data sets with self-determined cutoff values. The practicality of Mip-mining is demonstrated by identifying essential microproteins in different species and revealing the importance of ATP15 in the acetic acid stress tolerance of budding yeast. We believe that Mip-mining will facilitate a greater understanding and application of microproteins in biotechnology. Moreover, it will be beneficial for designing therapeutic strategies under various biological conditions.


Subjects
Biotechnology , Transcriptome , Animals , Humans , RNA Sequence Analysis , Micropeptides
14.
Funct Integr Genomics ; 23(2): 94, 2023 Mar 21.
Article in English | MEDLINE | ID: mdl-36943579

ABSTRACT

Breast cancer is one of the leading causes of death in women worldwide. Initially, it develops in the epithelium of the ducts or lobules of the breast glandular tissues with limited growth and the potential to metastasize. It is a highly heterogeneous malignancy; however, the common molecular mechanisms could help identify new targeted drugs for treating its subtypes. This study uses computational drug repositioning approaches to explore fresh drug candidates for breast cancer treatment. We also implemented reversal gene expression and gene expression-based signatures to explore novel drug candidates computationally. The drug activity profiles and related gene expression changes were acquired from the DrugBank, PubChem, and LINCS databases, and then in silico drug screening, molecular dynamics (MD) simulation, replica exchange MD simulations, and simulated annealing molecular dynamics (SAMD) simulations were conducted to discover and verify the valid drug candidates. We have found that compounds like furosemide, gold, and dopamine showed significant outcomes. Furthermore, the expression of genes related to breast cancer was observed to be reversed by these shortlisted drugs. Therefore, we postulate that combining furosemide, gold, and dopamine would be a potential combination therapy measurement for breast cancer patients.


Subjects
Breast Neoplasms , Humans , Female , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Dopamine/therapeutic use , Furosemide/pharmacology , Furosemide/therapeutic use , Gold/therapeutic use , Transcriptome
15.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32520339

ABSTRACT

Long non-coding RNAs (lncRNAs) are the subject of intensive recent study due to their association with various human diseases. It is desirable to build artificial intelligence-based models for the prediction of diseases or tissues based on lncRNA data, which will be useful in disease diagnosis and therapy. The accuracy and robustness of existing models based on machine learning techniques leave room for further improvement. In this study, we propose a deep learning model, called Multi-Label Classifications with Deep Forest, termed MLCDForest, to address multi-label classification on tissue prediction for a given lncRNA, which can be regarded as an implementation of the deep forest model in multi-label classification. MLCDForest is a sequential multi-label-grained scanning method, which distinguishes it from the standard deep forest model. It is trained sequentially over the labels, with label correlation taken into account. A systematic comparison using the lncRNA-disease association datasets demonstrates that our method consistently shows superior performance over the state-of-the-art methods in disease prediction. By considering label correlation in the sequential multi-label-grained scanning, our model provides a powerful tool for multi-label classification and tissue prediction based on given lncRNAs.


Subjects
Computational Biology , Deep Learning , Disease/genetics , Genetic Models , Long Noncoding RNA/genetics , Humans
16.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32743640

ABSTRACT

BACKGROUND: The most frequently mutated gene pair in pancreatic adenocarcinoma (PAAD) is KRAS and TP53, and our goal is to illustrate the multiomics and molecular dynamics landscapes of KRAS/TP53 mutation and also to obtain prospective novel drugs for KRAS- and TP53-mutated PAAD patients. Moreover, we also attempted to discover the probable link between KRAS and TP53 on the basis of the abovementioned multiomics data. METHOD: We utilized TCGA and Cancer Cell Line Encyclopedia data for the analysis of KRAS/TP53 mutation in a multiomics manner. In addition, we performed molecular dynamics analysis of KRAS and TP53 to produce mechanistic descriptions of particular mutations and carcinogenesis. RESULT: We found a significant difference in the genomics, transcriptomics, methylomics, and molecular dynamics patterns of KRAS and TP53 mutation from the matching wild type in PAAD, and the prognosis of pancreatic cancer is directly linked with a particular mutation of KRAS and protein stability. Screened drugs are potentially effective in PAAD patients. CONCLUSIONS: KRAS and TP53 prognosis of PAAD is directly associated with a specific mutation of KRAS. Irinotecan and vandetanib are prospective drugs for PAAD patients with KRAS G12D mutation and TP53 mutation.


Subjects
Adenocarcinoma , Antineoplastic Combined Chemotherapy Protocols/administration & dosage , Mutation , Pancreatic Neoplasms , Proto-Oncogene Proteins p21(ras)/genetics , Tumor Suppressor Protein p53/genetics , Adenocarcinoma/drug therapy , Adenocarcinoma/genetics , Adenocarcinoma/mortality , Disease-Free Survival , Drug Synergism , Female , Humans , Irinotecan/administration & dosage , Irinotecan/agonists , Male , Pancreatic Neoplasms/drug therapy , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/mortality , Piperidines/administration & dosage , Piperidines/agonists , Quinazolines/administration & dosage , Quinazolines/agonists , Survival Rate
17.
Brief Bioinform ; 22(1): 451-462, 2021 01 18.
Article in English | MEDLINE | ID: mdl-31885041

ABSTRACT

Drug-target interactions (DTIs) play a crucial role in target-based drug discovery and development. Computational prediction of DTIs can effectively complement experimental wet-lab techniques for the identification of DTIs, which are typically time- and resource-consuming. However, the performances of the current DTI prediction approaches suffer from a problem of low precision and high false-positive rate. In this study, we aim to develop a novel DTI prediction method for improving the prediction performance based on a cascade deep forest (CDF) model, named DTI-CDF, with multiple similarity-based features between drugs and the similarity-based features between target proteins extracted from the heterogeneous graph, which contains known DTIs. In the experiments, we built five replicates of 10-fold cross-validation under three different experimental settings of data sets, namely, corresponding DTI values of certain drugs (SD), targets (ST), or drug-target pairs (SP) in the training sets are missed but existed in the test sets. The experimental results demonstrate that our proposed approach DTI-CDF achieves a significantly higher performance than that of the traditional ensemble learning-based methods such as random forest and XGBoost, deep neural network, and the state-of-the-art methods such as DDR. Furthermore, there are 1352 newly predicted DTIs which are proved to be correct by KEGG and DrugBank databases. The data sets and source code are freely available at https://github.com//a96123155/DTI-CDF.


Subjects
Drug Development/methods , Proteomics/methods , Software , Humans , Molecular Docking Simulation/methods , Protein Sequence Analysis/methods
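
The three evaluation settings named in the DTI-CDF abstract above (SD, ST and SP) differ in what is hidden from training: whole drugs, whole targets, or individual drug-target pairs. The Python sketch below illustrates that difference on a toy interaction matrix. It is a simplified single split rather than the five replicates of 10-fold cross-validation the abstract reports, and the function name `split_mask`, the matrix size and the 20% test fraction are assumptions made for illustration.

```python
import numpy as np

def split_mask(n_drugs, n_targets, setting, test_fraction, rng):
    """Return a boolean (n_drugs, n_targets) mask; True marks test pairs."""
    mask = np.zeros((n_drugs, n_targets), dtype=bool)
    if setting == "SP":
        # Hide individual drug-target pairs chosen uniformly at random.
        n_test = int(round(test_fraction * mask.size))
        flat = rng.choice(mask.size, size=n_test, replace=False)
        rows, cols = np.unravel_index(flat, mask.shape)
        mask[rows, cols] = True
    elif setting == "SD":
        # Hide every pair involving the held-out drugs (whole rows).
        n_test = max(1, int(round(test_fraction * n_drugs)))
        mask[rng.choice(n_drugs, size=n_test, replace=False), :] = True
    elif setting == "ST":
        # Hide every pair involving the held-out targets (whole columns).
        n_test = max(1, int(round(test_fraction * n_targets)))
        mask[:, rng.choice(n_targets, size=n_test, replace=False)] = True
    else:
        raise ValueError(f"unknown setting: {setting}")
    return mask

rng = np.random.default_rng(0)
masks = {s: split_mask(10, 5, s, 0.2, rng) for s in ("SP", "SD", "ST")}
for name, m in masks.items():
    print(name, int(m.sum()), "test pairs")
```

Under SD and ST the model is evaluated on drugs or targets it has never seen with any known interaction, which is generally a harder test than SP, where both the drug and the target usually appear elsewhere in the training data.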
18.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32964234

ABSTRACT

Identifying drug-target interactions (DTIs) is an important step for drug discovery and drug repositioning. To reduce the experimental cost, a large number of computational approaches have been proposed for this task. The machine learning-based models, especially binary classification models, have been developed to predict whether a drug-target pair interacts or not. However, there is still much room for improvement in the performance of current methods. Multi-label learning can overcome some difficulties caused by single-label learning in order to improve the predictive performance. The key challenge faced by multi-label learning is the exponential-sized output space, and considering label correlations can help to overcome this challenge. In this paper, we facilitate multi-label classification by introducing community detection methods for DTI prediction, named DTI-MLCD. Moreover, we updated the gold standard data set by adding 15,000 more positive DTI samples in comparison to the data set, which has widely been used by most of previously published DTI prediction methods since 2008. The proposed DTI-MLCD is applied to both data sets, demonstrating its superiority over other machine learning methods and several existing methods. The data sets and source code of this study are freely available at https://github.com/a96123155/DTI-MLCD.


Subjects
Algorithms , Computational Biology/methods , Machine Learning , Pharmaceutical Preparations/metabolism , Proteins/metabolism , Computer Simulation , Drug Discovery/methods , Drug Repositioning/methods , Internet , Molecular Targeted Therapy/methods , Pharmaceutical Preparations/administration & dosage , Pharmaceutical Preparations/chemistry , Protein Binding , Proteins/antagonists & inhibitors , Proteins/chemistry , Reproducibility of Results
19.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34009265

ABSTRACT

Accurate identification of miRNA-disease associations (MDAs) helps to understand the etiology and mechanisms of various diseases. However, experimental methods are costly and time-consuming, so computational methods for predicting MDAs are urgently needed. Based on graph theory, MDA prediction is regarded as a node classification task in the present study. To solve this task, we propose a novel method, MDA-GCNFTG, which predicts MDAs based on Graph Convolutional Networks (GCNs) via graph sampling through the Feature and Topology Graph to improve training efficiency and accuracy. This method models both the potential connections of the feature space and the structural relationships of MDA data. The nodes of the graphs are represented by the disease semantic similarity, miRNA functional similarity and Gaussian interaction profile kernel similarity. Moreover, for the first time, we considered six tasks simultaneously for the MDA prediction problem, which ensures that under both balanced and unbalanced sample distributions, MDA-GCNFTG can predict not only new MDAs but also new diseases without known related miRNAs and new miRNAs without known related diseases. The results of 5-fold cross-validation show that MDA-GCNFTG achieves satisfactory performance on all six tasks and is significantly superior to classic machine learning methods and state-of-the-art MDA prediction methods. Moreover, the effectiveness of GCNs via the graph sampling strategy and of the feature and topology graph in MDA-GCNFTG has also been demonstrated. More importantly, case studies on two diseases and three miRNAs were conducted and achieved satisfactory performance.
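The Gaussian interaction profile (GIP) kernel similarity mentioned in the abstract can be sketched as follows. The abstract does not give the formula, so this assumes the commonly used definition, K(i, j) = exp(-γ‖y_i − y_j‖²) with γ equal to a bandwidth parameter divided by the mean squared norm of the interaction profiles y_i. The function name `gip_kernel`, the parameter `gamma_prime`, and its default of 1.0 are illustrative assumptions, not details from MDA-GCNFTG.

```python
import numpy as np

def gip_kernel(adj, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `adj`.

    `adj` is a binary association matrix (e.g. miRNAs x diseases); each row
    is one entity's interaction profile. Hypothetical helper that assumes
    the commonly used GIP definition, not code from the paper.
    """
    adj = np.asarray(adj, dtype=float)
    sq_norms = (adj ** 2).sum(axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known associations")
    # Normalise the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances between profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (adj @ adj.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against rounding below zero
    return np.exp(-gamma * sq_dists)
```

Calling `gip_kernel(A)` gives a similarity matrix over the row entities and `gip_kernel(A.T)` gives one over the column entities; identical profiles receive similarity 1.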


Subjects
Biomarkers , Computational Biology/methods , Disease Susceptibility , Gene Expression Regulation , MicroRNAs/genetics , Software , Algorithms , Databases, Genetic , Humans , Reproducibility of Results , Workflow
20.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34169968

ABSTRACT

BACKGROUND: A growing body of research suggests that noncoding RNAs (ncRNAs), specifically circular RNAs (circRNAs) and microRNAs (miRNAs) in exosomes, play vital roles in respiratory disease. However, the detailed mechanisms remain unclear in mycobacterial infection. METHODS: To characterize the expression patterns and potential biological functions of circRNAs and miRNAs in tuberculosis, we performed massively parallel sequencing of exosomal ncRNAs from THP-1-derived macrophages infected with Mycobacterium tuberculosis H37Ra, Mycobacterium bovis BCG, or the control Streptococcus pneumoniae, and from uninfected normal cells. In addition, THP-1-derived macrophages were used to validate the differentially expressed miRNAs, and monocytes from PBMCs and clinical plasma samples were used to further validate differentially expressed miR-185-5p. RESULTS: Many exosomal circRNAs and miRNAs associated with tuberculosis infection were identified. Extensive enrichment analyses were performed to illustrate the major effects of altered ncRNA expression. Moreover, miRNA-mRNA and circRNA-miRNA networks were constructed to reveal their interrelationships. Further, significantly differentially expressed miRNAs among Exo-BCG, Exo-Ra and Exo-Control were evaluated, and their potential target mRNAs and functions were analyzed. Finally, miR-185-5p was selected as a promising potential biomarker for tuberculosis. CONCLUSION: Our findings offer a new perspective for exploring the biological functions of ncRNAs in mycobacterial infection and for screening novel potential biomarkers. In summary, exosomal ncRNAs might represent useful functional biomarkers in tuberculosis pathogenesis and diagnosis.


Subjects
Biomarkers , Exosomes , Gene Expression Profiling , MicroRNAs/genetics , Mycobacterium tuberculosis , RNA, Untranslated , Tuberculosis/genetics , Biological Transport , Cell Line , Exosomes/metabolism , Exosomes/ultrastructure , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing , Humans , Macrophages/immunology , Macrophages/metabolism , Macrophages/microbiology , RNA Transport , RNA, Circular , RNA, Messenger/genetics , ROC Curve , Tuberculosis/metabolism , Tuberculosis/microbiology