Search | VHL Regional Portal

Hierarchical multimodal self-attention-based graph neural network for DTI prediction.

Bian, Jilong; Lu, Hao; Dong, Guanghui; Wang, Guohua.

Brief Bioinform ; 25(4), 2024 May 23.

Article in English | MEDLINE | ID: mdl-38920341

ABSTRACT

Drug-target interactions (DTIs) are a key part of the drug development process, and their accurate and efficient prediction can significantly boost development efficiency and reduce development time. Recent years have witnessed the rapid advancement of deep learning, resulting in an abundance of deep learning-based models for DTI prediction. However, most of these models use a single representation of drugs and proteins, making it difficult to represent their characteristics comprehensively. Multimodal data fusion can effectively compensate for the limitations of single-modal data. However, existing multimodal models for DTI prediction do not take into account both intra- and inter-modal interactions simultaneously, which limits the representation capability of the fused features and reduces DTI prediction accuracy. A hierarchical multimodal self-attention-based graph neural network for DTI prediction, called HMSA-DTI, is proposed to address multimodal feature fusion. HMSA-DTI takes drug SMILES, drug molecular graphs, protein sequences and protein 2-mer sequences as inputs, and uses a hierarchical multimodal self-attention mechanism to achieve deep fusion of the multimodal features of drugs and proteins, enabling the capture of intra- and inter-modal interactions between drugs and proteins. HMSA-DTI shows significant advantages over baseline methods on multiple evaluation metrics across five benchmark datasets.
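One of the four inputs named in the abstract is the protein 2-mer sequence. A minimal sketch of how such a sequence might be derived from a raw protein string is shown below. The abstract does not say whether the windows overlap or how they are encoded afterwards, so the overlapping stride of 1 and the function name `protein_kmers` are assumptions made for illustration only.

```python
def protein_kmers(sequence: str, k: int = 2) -> list[str]:
    """Split a protein sequence into overlapping k-mers (k=2 gives 2-mers).

    Assumption: windows overlap with stride 1; the paper may tokenize differently.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.strip().upper()
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    # Toy sequence, not taken from any benchmark dataset.
    print(protein_kmers("MKTAYIA"))
```

The resulting tokens could then be mapped to integer indices and fed to an embedding layer alongside the single-residue sequence, which is one plausible way to obtain a second protein modality.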

Subject(s)

Deep Learning , Neural Networks, Computer , Proteins/chemistry , Proteins/metabolism , Humans , Algorithms , Computational Biology/methods

HMMF: a hybrid multi-modal fusion framework for predicting drug side effect frequencies.

Liu, Wuyong; Zhang, Jingyu; Qiao, Guanyu; Bian, Jilong; Dong, Benzhi; Li, Yang.

BMC Bioinformatics ; 25(1): 196, 2024 May 20.

Article in English | MEDLINE | ID: mdl-38769492

ABSTRACT

BACKGROUND: The identification of drug side effects plays a critical role in drug repositioning and drug screening. While clinical experiments yield accurate and reliable information about drug-related side effects, they are costly and time-consuming. Computational models have emerged as a promising alternative for predicting the frequency of drug side effects. However, earlier research has primarily centered on extracting and utilizing representations of drugs, such as molecular structures or interaction graphs, often neglecting the inherent biomedical semantics of drugs and side effects. RESULTS: To address this issue, we introduce a hybrid multi-modal fusion framework (HMMF) for predicting drug side effect frequencies. Given the wealth of biological and chemical semantic information related to drugs and side effects, incorporating multi-modal information offers additional, complementary semantics. HMMF uses several encoders to capture the molecular structures, biomedical textual representations, and attribute similarities of both drugs and side effects. It then models drug-side effect interactions using both coarse- and fine-grained fusion strategies, effectively integrating these multi-modal features. CONCLUSIONS: HMMF successfully detects previously unrecognized potential side effects, outperforms existing state-of-the-art methods across various evaluation metrics, including root mean squared error and area under the receiver operating characteristic curve, and performs well in cold-start scenarios.

Subject(s)

Drug-Related Side Effects and Adverse Reactions , Computational Biology/methods , Humans , Algorithms

ACP-ML: A sequence-based method for anticancer peptide prediction.

Bian, Jilong; Liu, Xuan; Dong, Guanghui; Hou, Chang; Huang, Shan; Zhang, Dandan.

Comput Biol Med ; 170: 108063, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38301519

ABSTRACT

Cancer is a serious malignant disease that is difficult to cure. Chemotherapy, a primary treatment for cancer, causes significant harm to normal cells in the body and is often accompanied by serious side effects. Recently, anti-cancer peptides (ACPs), a type of protein for treating cancers, have dominated research into new anti-tumor drugs because of their ability to specifically target and destroy cancer cells. Screening proteins with cancer-inhibiting properties from a large pool of proteins is key to the development of anti-tumor drugs. However, because of their complex structure, accurately identifying protein functions through biological experiments alone is expensive and inefficient. Therefore, we propose a new prediction model, ACP-ML, to effectively predict ACPs. For feature extraction, DPC, PseAAC, CTDC, CTDT and CS-Pse-PSSM features were used, and the optimal feature set was selected by comparing combinations of these features. Then, a two-step feature selection process using the MRMD and RFE algorithms was performed to determine the most crucial features from the optimal feature set for identifying ACPs. Furthermore, we assessed the classification accuracy of single learning models and of ensemble models based on different strategies through ten-fold cross-validation. Ultimately, a voting-based ensemble learning method was developed to predict ACPs. To validate its effectiveness, two independent test sets were used, achieving accuracies of 90.891% and 92.578%, respectively. Compared with existing anticancer peptide prediction algorithms, the proposed feature processing method is more effective, and the proposed ensemble model ACP-ML exhibits stronger generalization capability and higher accuracy.
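Of the descriptors listed in the abstract, DPC (dipeptide composition) is the simplest to illustrate. The sketch below uses the common definition, in which each of the 400 ordered amino-acid pairs is counted and divided by the number of adjacent pairs in the peptide. The abstract does not give ACP-ML's exact normalization or feature ordering, so both are assumptions here, and the name `dipeptide_composition` is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def dipeptide_composition(peptide: str) -> list[float]:
    """Return a 400-dimensional vector of dipeptide frequencies.

    Assumption: frequency = count / (len(peptide) - 1); pairs containing
    non-standard residues are ignored.
    """
    peptide = peptide.strip().upper()
    total_pairs = len(peptide) - 1
    if total_pairs <= 0:
        return [0.0] * len(DIPEPTIDES)
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total_pairs):
        pair = peptide[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    return [counts[d] / total_pairs for d in DIPEPTIDES]


if __name__ == "__main__":
    vector = dipeptide_composition("ACAC")  # toy peptide
    print(len(vector), vector[DIPEPTIDES.index("AC")])
```

Vectors like this one would be concatenated with the other descriptors before the feature selection and ensemble steps the abstract describes.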

Subject(s)

Antineoplastic Agents , Neoplasms , Humans , Computational Biology/methods , Peptides/chemistry , Proteins , Algorithms , Neoplasms/drug therapy , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use

DETIRE: a hybrid deep learning model for identifying viral sequences from metagenomes.

Miao, Yan; Bian, Jilong; Dong, Guanghui; Dai, Tianhong.

Front Microbiol ; 14: 1169791, 2023.

Article in English | MEDLINE | ID: mdl-37396369

ABSTRACT

A metagenome contains all DNA sequences from an environmental sample, including those of viruses, bacteria, archaea, and eukaryotes. Since viruses are hugely abundant and, as a major type of pathogen, have caused vast mortality and morbidity throughout human history, detecting viruses in metagenomes plays a crucial role in analyzing the viral component of samples and is the very first step in clinical diagnosis. However, detecting viral fragments directly from metagenomes remains difficult because of the huge number of short sequences. In this study, a hybrid Deep lEarning model for idenTifying vIral sequences fRom mEtagenomes (DETIRE) is proposed to solve the problem. First, a graph-based nucleotide sequence embedding strategy is used to enrich the representation of DNA sequences by training an embedding matrix. Then, spatial and sequential features are extracted by trained CNN and BiLSTM networks, respectively, to enrich the features of short sequences. Finally, the two sets of features are combined by weighting for the final decision. Trained on 220,000 sequences of 500 bp subsampled from the Virus and Host RefSeq genomes, DETIRE identifies more short viral sequences (<1,000 bp) than three recent methods: DeepVirFinder, PPR-Meta, and CHEER. DETIRE is freely available on GitHub (https://github.com/crazyinter/DETIRE).
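The abstract mentions training on 500 bp sequences subsampled from reference genomes. A minimal sketch of that kind of fixed-length subsampling follows. Uniform random start positions, the seed handling, and the function name `subsample_fragments` are all assumptions for illustration; the actual DETIRE preprocessing lives in the linked repository and may differ.

```python
import random


def subsample_fragments(genome: str, length: int = 500, n: int = 10,
                        seed: int = 0) -> list[str]:
    """Draw n fixed-length fragments from a genome string.

    Assumption: start positions are sampled uniformly, with replacement.
    """
    if length <= 0 or len(genome) < length:
        return []
    rng = random.Random(seed)  # local generator keeps runs reproducible
    last_start = len(genome) - length
    starts = [rng.randint(0, last_start) for _ in range(n)]
    return [genome[s:s + length] for s in starts]


if __name__ == "__main__":
    toy_genome = "ACGT" * 300  # 1,200 bp placeholder, not real data
    fragments = subsample_fragments(toy_genome, length=500, n=5)
    print(len(fragments), {len(f) for f in fragments})
```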

MCANet: shared-weight-based MultiheadCrossAttention network for drug-target interaction prediction.

Bian, Jilong; Zhang, Xi; Zhang, Xiying; Xu, Dali; Wang, Guohua.

Brief Bioinform ; 24(2), 2023 Mar 19.

Article in English | MEDLINE | ID: mdl-36892153

ABSTRACT

Accurate and effective drug-target interaction (DTI) prediction can greatly shorten the drug development lifecycle and reduce the cost of drug development. In the deep-learning-based paradigm for predicting DTI, robust drug and protein feature representations and their interaction features play a key role in improving the accuracy of DTI prediction. Additionally, class imbalance and overfitting in drug-target datasets can also affect prediction accuracy, and reducing the consumption of computational resources and speeding up training are also critical considerations. In this paper, we propose shared-weight-based MultiheadCrossAttention, a precise and concise attention mechanism that can establish the association between target and drug, making our models more accurate and faster. We then use the cross-attention mechanism to construct two models: MCANet and MCANet-B. In MCANet, the cross-attention mechanism is used to extract the interaction features between drugs and proteins to improve the feature representation ability of drugs and proteins, and the PolyLoss loss function is applied to alleviate overfitting and class imbalance in the drug-target dataset. In MCANet-B, the robustness of the model is improved by combining multiple MCANet models, and prediction accuracy increases further. We train and evaluate our proposed methods on six public drug-target datasets and achieve state-of-the-art results. In comparison with other baselines, MCANet saves considerable computational resources while keeping its accuracy in the leading position, whereas MCANet-B greatly improves prediction accuracy by combining multiple models while maintaining a balance between computational resource consumption and prediction accuracy.
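The abstract names PolyLoss as the loss function but gives no formula. As a rough illustration, the sketch below implements the Poly-1 form of PolyLoss for a binary interaction label, that is, cross-entropy plus `epsilon * (1 - p_t)`, where `p_t` is the predicted probability of the true class. Which PolyLoss variant MCANet uses and what value of `epsilon` it sets are not stated in the abstract, so both are assumptions, as is the function name.

```python
import math


def poly1_binary_loss(p_positive: float, label: int, epsilon: float = 1.0) -> float:
    """Poly-1 loss for one binary example.

    p_positive: predicted probability that the drug-target pair interacts.
    label: 1 for an interacting pair, 0 otherwise.
    epsilon: weight of the extra first-order term (assumed value, not from the paper).
    """
    # Probability assigned to the true class, clipped to avoid log(0).
    p_t = p_positive if label == 1 else 1.0 - p_positive
    p_t = min(max(p_t, 1e-12), 1.0)
    cross_entropy = -math.log(p_t)
    return cross_entropy + epsilon * (1.0 - p_t)


if __name__ == "__main__":
    # A confident correct prediction is penalized less than an uncertain one.
    print(poly1_binary_loss(0.9, 1), poly1_binary_loss(0.6, 1))
```

Setting `epsilon` to 0 recovers plain cross-entropy, which makes the extra term easy to ablate.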

Subject(s)

Drug Development , Drug Discovery , Drug Discovery/methods , Proteins/metabolism , Drug Delivery Systems , Protein Domains

Unsupervised construction of gene regulatory network based on single-cell multi-omics data of colorectal cancer.

Cui, Lingyu; Li, Hongfei; Bian, Jilong; Wang, Guohua; Liang, Yingjian.

Brief Bioinform ; 24(2), 2023 Mar 19.

Article in English | MEDLINE | ID: mdl-36723605

ABSTRACT

Identifying gene regulatory networks (GRNs) at single-cell resolution has long been a great challenge, and the advent of single-cell multi-omics data provides unprecedented opportunities to construct GRNs. Here, we propose a novel strategy that integrates omics datasets from single-cell ribonucleic acid sequencing and single-cell Assay for Transposase-Accessible Chromatin using sequencing, and uses an unsupervised learning neural network to divide the samples with high copy number variation scores, which are used to infer the GRN in each gene block. Accuracy validation of the proposed strategy shows that approximately 80% of transcription factors are directly associated with cancer, colorectal cancer, malignancy and disease according to TRRUST, and that most transcription factors are prone to produce multiple transcript variants and lead to tumorigenesis according to the RegNetwork database. The source code is available at: https://github.com/Cuily-v/Colorectal_cancer.

Subject(s)

Colorectal Neoplasms , Gene Regulatory Networks , Humans , Multiomics , DNA Copy Number Variations , Algorithms , Transcription Factors/genetics , Colorectal Neoplasms/genetics

Feature selection combined with top-down and bottom-up strategies for survival analysis: A case of prognostic prediction in glioblastoma.

Liu, Yanan; Zhao, Xudong; Bian, Jilong; Wang, Guohua.

Comput Biol Med ; 153: 106486, 2023 Feb.

Article in English | MEDLINE | ID: mdl-36603438

ABSTRACT

Over the last decades, molecular signatures have attracted extensive attention in cancer research. However, most of the reported biomarkers show a weak ability to distinguish the survival risks of patients. Regression analysis generally relies on univariate analysis, which makes the existing statistical methods ineffective. Furthermore, classifying patients into high- and low-risk groups involves too much human intervention. Finally, therapy given after conservative surgery makes survival analysis more complex. To solve these problems, we propose a feature selection method that combines top-down and bottom-up strategies. The top-down strategy randomly extracts some genes each time and selects candidate genes through cumulative voting. The bottom-up strategy fully enumerates the selected genes and uses a clustering algorithm to classify samples. We analyzed glioblastoma data from The Cancer Genome Atlas (TCGA) and obtained candidate signatures. Results on simulated data, as well as on an independent test set from the Chinese Glioma Genome Atlas (CGGA), verified the reliability of the method and the validity of the selected features.
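The top-down step described above (repeatedly drawing random gene subsets and accumulating votes) can be sketched as follows. The abstract does not specify how a subset is evaluated or how votes are awarded, so `score_subset` is a hypothetical callback supplied by the caller, and the subset size, number of rounds, and cut-off `top_k` are illustrative parameters rather than values from the paper.

```python
import random
from collections import Counter
from typing import Callable, Sequence


def cumulative_vote(genes: Sequence[str],
                    score_subset: Callable[[list[str]], list[str]],
                    subset_size: int, rounds: int, top_k: int,
                    seed: int = 0) -> list[str]:
    """Select candidate genes by cumulative voting over random subsets.

    score_subset (hypothetical) receives a random subset and returns the genes
    in it that it judges informative; each returned gene earns one vote.
    """
    rng = random.Random(seed)
    votes: Counter = Counter()
    for _ in range(rounds):
        subset = rng.sample(list(genes), subset_size)
        votes.update(score_subset(subset))
    # Genes with the most accumulated votes become the candidates.
    return [gene for gene, _ in votes.most_common(top_k)]


if __name__ == "__main__":
    toy_genes = [f"g{i}" for i in range(20)]   # placeholder gene names
    informative = {"g1", "g2"}                 # toy ground truth for the demo
    pick = lambda subset: [g for g in subset if g in informative]
    print(cumulative_vote(toy_genes, pick, subset_size=5, rounds=200, top_k=2))
```

In a real survival analysis the callback would wrap a statistical test or model fit on the expression data; the candidates returned here would then be passed to the exhaustive bottom-up enumeration.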

Subject(s)

Glioblastoma , Humans , Glioblastoma/genetics , Gene Expression Profiling/methods , Prognosis , Reproducibility of Results , Survival Analysis
