Search | BVS IEC

1.

MMSMAPlus: a multi-view multi-scale multi-attention embedding model for protein function prediction.

Wang, Zhongyu; Deng, Zhaohong; Zhang, Wei; Lou, Qiongdan; Choi, Kup-Sze; Wei, Zhisheng; Wang, Lei; Wu, Jing.

Brief Bioinform ; 24(4), 2023 Jul 20.

Article in English | MEDLINE | ID: mdl-37258453

ABSTRACT

Protein is the most important component in organisms and plays an indispensable role in life activities. In recent years, a large number of intelligent methods have been proposed to predict protein function. These methods obtain different types of protein information, including sequence, structure and interaction network. Among them, protein sequences have gained significant attention where methods are investigated to extract the information from different views of features. However, how to fully exploit the views for effective protein sequence analysis remains a challenge. In this regard, we propose a multi-view, multi-scale and multi-attention deep neural model (MMSMA) for protein function prediction. First, MMSMA extracts multi-view features from protein sequences, including one-hot encoding features, evolutionary information features, deep semantic features and overlapping property features based on physiochemistry. Second, a specific multi-scale multi-attention deep network model (MSMA) is built for each view to realize the deep feature learning and preliminary classification. In MSMA, both multi-scale local patterns and long-range dependence from protein sequences can be captured. Third, a multi-view adaptive decision mechanism is developed to make a comprehensive decision based on the classification results of all the views. To further improve the prediction performance, an extended version of MMSMA, MMSMAPlus, is proposed to integrate homology-based protein prediction under the framework of multi-view deep neural model. Experimental results show that the MMSMAPlus has promising performance and is significantly superior to the state-of-the-art methods. The source code can be found at https://github.com/wzy-2020/MMSMAPlus.
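
The one-hot view and the final decision step described in the abstract above can be illustrated with a minimal sketch. It assumes the 20 standard amino acids in a fixed alphabetical order and fuses per-view class scores with a plain weighted average. The function names, the weights, and the fusion rule are hypothetical and are not taken from the MMSMAPlus code.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed ordering)


def one_hot_encode(sequence):
    """Return one 20-dimensional indicator vector per residue.

    Residues outside the 20 standard ones (e.g. 'X') map to an all-zero vector.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1
        encoded.append(row)
    return encoded


def fuse_view_scores(view_scores, weights):
    """Weighted average of per-view class scores (hypothetical fusion rule)."""
    total = sum(weights)
    n_classes = len(view_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, view_scores)) / total
        for c in range(n_classes)
    ]
```

In the paper the view weights are learned adaptively; here they are passed in by hand only to show the shape of the computation.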

Subjects

Neural Networks, Computer ; Proteins ; Amino Acid Sequence ; Software ; Sequence Analysis, Protein

2.

MINDG: a drug-target interaction prediction method based on an integrated learning algorithm.

Yang, Hailong; Chen, Yue; Zuo, Yun; Deng, Zhaohong; Pan, Xiaoyong; Shen, Hong-Bin; Choi, Kup-Sze; Yu, Dong-Jun.

Bioinformatics ; 40(4), 2024 Mar 29.

Article in English | MEDLINE | ID: mdl-38483285

ABSTRACT

MOTIVATION: Drug-target interaction (DTI) prediction refers to the prediction of whether a given drug molecule will bind to a specific target and thus exert a targeted therapeutic effect. Although intelligent computational approaches for drug target prediction have received much attention and made many advances, DTI prediction remains a challenging task that requires further research. The main challenges are as follows: (i) most graph neural network-based methods consider only the information of first-order neighboring nodes (drug and target) in the graph, without learning deeper and richer structural features from higher-order neighboring nodes; (ii) existing methods do not consider the sequence and structural features of drugs and targets together, so they cannot combine the advantages of both kinds of features to improve interaction learning. RESULTS: To address these challenges, a Multi-view Integrated learning Network that integrates Deep learning and Graph Learning (MINDG) is proposed in this study. It consists of the following parts: (i) a mixed deep network is used to extract sequence features of drugs and targets, (ii) a higher-order graph attention convolutional network is proposed to better extract and capture structural features, and (iii) a multi-view adaptive integrated decision module is used to improve and complement the initial prediction results of the above two networks to enhance the prediction performance. We evaluate MINDG on two datasets and show that it improves DTI prediction performance compared with state-of-the-art baselines. AVAILABILITY AND IMPLEMENTATION: https://github.com/jnuaipr/MINDG.
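
To make the distinction between first-order and higher-order neighboring nodes concrete, the sketch below collects every node reachable within k hops of a start node using breadth-first search. The adjacency dictionary and the function name are illustrative assumptions. This is not the graph attention network used in MINDG, only the neighborhood notion it builds on.

```python
from collections import deque


def k_hop_neighbors(adjacency, start, k):
    """Return the set of nodes reachable from `start` in 1..k hops.

    `adjacency` maps each node to an iterable of its direct neighbors.
    The start node itself is excluded from the result.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return {node for node, d in distance.items() if d >= 1}
```

With k = 1 this yields only the direct drug-target partners; larger k adds the nodes that a first-order method would never see.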

Subjects

Algorithms ; Neural Networks, Computer

3.

Light3DHS: A lightweight 3D hippocampus segmentation method using multiscale convolution attention and vision transformer.

Xiao, Zhiyong; Zhang, Yuhong; Deng, Zhaohong; Liu, Fei.

Neuroimage ; 292: 120608, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38626817

ABSTRACT

The morphological analysis and volume measurement of the hippocampus are crucial to the study of many brain diseases. Therefore, an accurate hippocampal segmentation method is beneficial for the development of clinical research in brain diseases. U-Net and its variants have become prevalent in hippocampus segmentation of Magnetic Resonance Imaging (MRI) due to their effectiveness, and the architecture based on Transformer has also received some attention. However, some existing methods focus too much on the shape and volume of the hippocampus rather than its spatial information, and the extracted information is independent of each other, ignoring the correlation between local and global features. In addition, many methods cannot be effectively applied to practical medical image segmentation due to many parameters and high computational complexity. To this end, we combined the advantages of CNNs and ViTs (Vision Transformer) and proposed a simple and lightweight model: Light3DHS for the segmentation of the 3D hippocampus. In order to obtain richer local contextual features, the encoder first utilizes a multi-scale convolutional attention module (MCA) to learn the spatial information of the hippocampus. Considering the importance of local features and global semantics for 3D segmentation, we used a lightweight ViT to learn high-level features of scale invariance and further fuse local-to-global representation. To evaluate the effectiveness of encoder feature representation, we designed three decoders of different complexity to generate segmentation maps. Experiments on three common hippocampal datasets demonstrate that the network achieves more accurate hippocampus segmentation with fewer parameters. Light3DHS performs better than other state-of-the-art algorithms.

Subjects

Hippocampus ; Imaging, Three-Dimensional ; Magnetic Resonance Imaging ; Hippocampus/diagnostic imaging ; Humans ; Magnetic Resonance Imaging/methods ; Imaging, Three-Dimensional/methods ; Neural Networks, Computer ; Deep Learning ; Algorithms

4.

circRNA-binding protein site prediction based on multi-view deep learning, subspace learning and multi-view classifier.

Li, Hui; Deng, Zhaohong; Yang, Haitao; Pan, Xiaoyong; Wei, Zhisheng; Shen, Hong-Bin; Choi, Kup-Sze; Wang, Lei; Wang, Shitong; Wu, Jing.

Brief Bioinform ; 23(1), 2022 Jan 17.

Article in English | MEDLINE | ID: mdl-34571539

ABSTRACT

Circular RNAs (circRNAs) generally bind to RNA-binding proteins (RBPs) to play an important role in the regulation of autoimmune diseases. Thus, it is crucial to study the binding sites of RBPs on circRNAs. Although many methods, including traditional machine learning and deep learning, have been developed to predict the interactions between RNAs and RBPs, most of them focus on linear RNAs. At present, few studies have been done on the binding relationships between circRNAs and RBPs. Thus, in-depth research is urgently needed. In the existing circRNA-RBP binding site prediction methods, circRNA sequences are the main research subjects, but the relevant characteristics of circRNAs, such as the structure and composition information of circRNA sequences, have not been fully exploited. Some methods have extracted different views to construct recognition models, but how to efficiently use the multi-view data to construct recognition models is still not well studied. Considering the above problems, this paper proposes a multi-view classification method called DMSK based on multi-view deep learning, subspace learning and multi-view classifier for the identification of circRNA-RBP interaction sites. In the DMSK method, first, we converted circRNA sequences into pseudo-amino acid sequences and pseudo-dipeptide components for extracting high-dimensional sequence features and component features of circRNAs, respectively. Then, the structure prediction method RNAfold was used to predict the secondary structure of the RNA sequences, and the sequence embedding model was used to extract the context-dependent features. Next, we fed the raw features of the above four views to a hybrid network, which is composed of a convolutional neural network and a long short-term memory network, to obtain the deep features of circRNAs. Furthermore, we used view-weighted generalized canonical correlation analysis to extract the common features of the four views by subspace learning. 
Finally, the learned subspace common features and multi-view deep features were fed to train the downstream multi-view TSK fuzzy system to construct a fuzzy rule and fuzzy inference-based multi-view classifier. The trained classifier was used to predict the specific positions of the RBP binding sites on the circRNAs. The experiments show that the prediction performance of the proposed method DMSK has been improved compared with the existing methods. The code and dataset of this study are available at https://github.com/Rebecca3150/DMSK.

Subjects

Deep Learning ; RNA, Circular ; Binding Sites ; Carrier Proteins/metabolism ; Computational Biology/methods ; Humans

5.

MDGF-MCEC: a multi-view dual attention embedding model with cooperative ensemble learning for CircRNA-disease association prediction.

Wu, Qunzhuo; Deng, Zhaohong; Pan, Xiaoyong; Shen, Hong-Bin; Choi, Kup-Sze; Wang, Shitong; Wu, Jing; Yu, Dong-Jun.

Brief Bioinform ; 23(5), 2022 Sep 20.

Article in English | MEDLINE | ID: mdl-35907779

ABSTRACT

Circular RNA (circRNA) is closely involved in physiological and pathological processes of many diseases. Discovering the associations between circRNAs and diseases is of great significance. Due to the high-cost to verify the circRNA-disease associations by wet-lab experiments, computational approaches for predicting the associations become a promising research direction. In this paper, we propose a method, MDGF-MCEC, based on multi-view dual attention graph convolution network (GCN) with cooperative ensemble learning to predict circRNA-disease associations. First, MDGF-MCEC constructs two disease relation graphs and two circRNA relation graphs based on different similarities. Then, the relation graphs are fed into a multi-view GCN for representation learning. In order to learn high discriminative features, a dual-attention mechanism is introduced to adjust the contribution weights, at both channel level and spatial level, of different features. Based on the learned embedding features of diseases and circRNAs, nine different feature combinations between diseases and circRNAs are treated as new multi-view data. Finally, we construct a multi-view cooperative ensemble classifier to predict the associations between circRNAs and diseases. Experiments conducted on the CircR2Disease database demonstrate that the proposed MDGF-MCEC model achieves a high area under curve of 0.9744 and outperforms the state-of-the-art methods. Promising results are also obtained from experiments on the circ2Disease and circRNADisease databases. Furthermore, the predicted associated circRNAs for hepatocellular carcinoma and gastric cancer are supported by the literature. The code and dataset of this study are available at https://github.com/ABard0/MDGF-MCEC.

Subjects

RNA, Circular ; Stomach Neoplasms ; Humans ; Intercellular Signaling Peptides and Proteins ; Machine Learning ; Stomach Neoplasms/genetics

6.

MLNGCF: circRNA-disease associations prediction with multilayer attention neural graph-based collaborative filtering.

Wu, Qunzhuo; Deng, Zhaohong; Zhang, Wei; Pan, Xiaoyong; Choi, Kup-Sze; Zuo, Yun; Shen, Hong-Bin; Yu, Dong-Jun.

Bioinformatics ; 39(8), 2023 Aug 01.

Article in English | MEDLINE | ID: mdl-37561093

ABSTRACT

MOTIVATION: CircRNAs play a critical regulatory role in physiological processes, and the abnormal expression of circRNAs can mediate the processes of diseases. Therefore, exploring circRNA-disease associations is gradually becoming an important area of research. Due to the high cost of validating circRNA-disease associations using traditional wet-lab experiments, novel computational methods based on machine learning are gaining more and more attention in this field. However, current computational methods suffer from insufficient consideration of latent features in circRNA-disease interactions. RESULTS: In this study, a multilayer attention neural graph-based collaborative filtering (MLNGCF) method is proposed. MLNGCF first enhances multiple kinds of biological information with an autoencoder to form the initial features of circRNAs and diseases. Then, by constructing a central network of different diseases and circRNAs, a multilayer cooperative attention-based message propagation is performed on the central network to obtain the high-order features of circRNAs and diseases. A neural network-based collaborative filtering is constructed to predict the unknown circRNA-disease associations and update the model parameters. Experiments on the benchmark datasets demonstrate that MLNGCF outperforms state-of-the-art methods, and the prediction results are supported by the literature in the case studies. AVAILABILITY AND IMPLEMENTATION: The source codes and benchmark datasets of MLNGCF are available at https://github.com/ABard0/MLNGCF.

Subjects

Neural Networks, Computer ; RNA, Circular ; Machine Learning ; Software ; Computational Biology/methods

7.

Research progress on prediction of RNA-protein binding sites in the past five years.

Zuo, Yun; Chen, Huixian; Yang, Lele; Chen, Ruoyan; Zhang, Xiaoyao; Deng, Zhaohong.

Anal Biochem ; 691: 115535, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38643894

ABSTRACT

Accurately predicting RNA-protein binding sites is essential to gain a deeper comprehension of the protein-RNA interactions and their regulatory mechanisms, which are fundamental in gene expression and regulation. However, conventional biological approaches to detect these sites are often costly and time-consuming. In contrast, computational methods for predicting RNA protein binding sites are both cost-effective and expeditious. This review synthesizes already existing computational methods, summarizing commonly used databases for predicting RNA protein binding sites. In addition, applications and innovations of computational methods using traditional machine learning and deep learning for RNA protein binding site prediction during 2018-2023 are presented. These methods cover a wide range of aspects such as effective database utilization, feature selection and encoding, innovative classification algorithms, and evaluation strategies. Exploring the limitations of existing computational methods, this paper delves into the potential directions for future development. DeepRKE, RDense, and DeepDW all employ convolutional neural networks and long and short-term memory networks to construct prediction models, yet their algorithm design and feature encoding differ, resulting in diverse prediction performances.

Subjects

RNA-Binding Proteins ; RNA ; RNA-Binding Proteins/metabolism ; Binding Sites ; RNA/metabolism ; Computational Biology/methods ; Algorithms ; Machine Learning ; Deep Learning ; Humans ; Protein Binding ; Neural Networks, Computer

8.

Glypred: Lysine Glycation Site Prediction via CCU-LightGBM-BiLSTM Framework with Multi-Head Attention Mechanism.

Zuo, Yun; Zhang, Bangyi; Dong, Yinkang; He, Wenying; Bi, Yue; Liu, Xiangrong; Zeng, Xiangxiang; Deng, Zhaohong.

J Chem Inf Model ; 64(16): 6699-6711, 2024 Aug 26.

Article in English | MEDLINE | ID: mdl-39121059

ABSTRACT

Glycation, a type of posttranslational modification, preferentially occurs on lysine and arginine residues, impairing protein functionality and altering characteristics. This process is linked to diseases such as Alzheimer's, diabetes, and atherosclerosis. Traditional wet lab experiments are time-consuming, whereas machine learning has significantly streamlined the prediction of protein glycation sites. Despite promising results, challenges remain, including data imbalance, feature redundancy, and suboptimal classifier performance. This research introduces Glypred, a lysine glycation site prediction model combining ClusterCentroids Undersampling (CCU), LightGBM, and bidirectional long short-term memory network (BiLSTM) methodologies, with an additional multihead attention mechanism integrated into the BiLSTM. To achieve this, the study undertakes several key steps: selecting diverse feature types to capture comprehensive protein information, employing a cluster-based undersampling strategy to balance the data set, using LightGBM for feature selection to enhance model performance, and implementing a bidirectional LSTM network for accurate classification. Together, these approaches ensure that Glypred effectively identifies glycation sites with high accuracy and robustness. For feature encoding, five distinct feature types (AAC, KMER, DR, PWAA, and EBGW) were selected to capture a broad spectrum of protein sequence and biological information. These encoded features were integrated and validated to ensure comprehensive protein information acquisition. To address the issue of highly imbalanced positive and negative samples, various undersampling algorithms, including random undersampling, NearMiss, edited nearest neighbor rule, and CCU, were evaluated. CCU was ultimately chosen to remove redundant nonglycated training data, establishing a balanced data set that enhances the model's accuracy and robustness. 
For feature selection, the LightGBM ensemble learning algorithm was employed to reduce feature dimensionality by identifying the most significant features. This approach accelerates model training, enhances generalization capabilities, and ensures good transferability of the model. Finally, a bidirectional long short-term memory network was used as the classifier, with a network structure designed to capture glycation modification site features from both forward and backward directions. To prevent overfitting, appropriate regularization parameters and dropout rates were introduced, achieving efficient classification. Experimental results show that Glypred achieved optimal performance. This model provides new insights for bioinformatics and encourages the application of similar strategies in other fields. A lysine glycation site prediction software tool was also developed using the PyQt5 library, offering researchers an auxiliary screening tool to reduce workload and improve efficiency. The software and data sets are available on GitHub: https://github.com/ZBYnb/Glypred.
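
Two of the sequence encodings named in the abstract can be sketched in a few lines. The code below assumes that AAC stands for amino acid composition (the fraction of each of the 20 standard residues) and that KMER means counts of overlapping length-k substrings. Both function names and the residue ordering are illustrative assumptions, not Glypred's own implementation.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed ordering)


def aac(sequence):
    """Amino acid composition: fraction of the sequence made up by each residue.

    The denominator is the full sequence length, so non-standard characters
    lower the fractions rather than being ignored.
    """
    seq = sequence.upper()
    counts = Counter(seq)
    length = len(seq)
    return [counts[aa] / length if length else 0.0 for aa in AMINO_ACIDS]


def kmer_counts(sequence, k=2):
    """Count overlapping k-mers (length-k substrings) in the sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
```

Feature vectors like these would then be concatenated with the other encodings before undersampling and feature selection.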

Subjects

Lysine ; Glycosylation ; Lysine/chemistry ; Lysine/metabolism ; Proteins/chemistry ; Proteins/metabolism ; Machine Learning ; Computational Biology/methods ; Humans ; Neural Networks, Computer ; Databases, Protein

9.

RNA-binding protein recognition based on multi-view deep feature and multi-label learning.

Yang, Haitao; Deng, Zhaohong; Pan, Xiaoyong; Shen, Hong-Bin; Choi, Kup-Sze; Wang, Lei; Wang, Shitong; Wu, Jing.

Brief Bioinform ; 22(3), 2021 May 20.

Article in English | MEDLINE | ID: mdl-32808039

ABSTRACT

RNA-binding protein (RBP) is a class of proteins that bind to and accompany RNAs in regulating biological processes. An RBP may have multiple target RNAs, and its aberrant expression can cause multiple diseases. Methods have been designed to predict whether a specific RBP can bind to an RNA and the position of the binding site using binary classification model. However, most of the existing methods do not take into account the binding similarity and correlation between different RBPs. While methods employing multiple labels and Long Short Term Memory Network (LSTM) are proposed to consider binding similarity between different RBPs, the accuracy remains low due to insufficient feature learning and multi-label learning on RNA sequences. In response to this challenge, the concept of RNA-RBP Binding Network (RRBN) is proposed in this paper to provide theoretical support for multi-label learning to identify RBPs that can bind to RNAs. It is experimentally shown that the RRBN information can significantly improve the prediction of unknown RNA-RBP interactions. To further improve the prediction accuracy, we present the novel computational method iDeepMV which integrates multi-view deep learning technology under the multi-label learning framework. iDeepMV first extracts data from the views of amino acid sequence and dipeptide component based on the RNA sequences as the original view. Deep neural network models are then designed for the respective views to perform deep feature learning. The extracted deep features are fed into multi-label classifiers which are trained with the RNA-RBP interaction information for the three views. Finally, a voting mechanism is designed to make comprehensive decision on the results of the multi-label classifiers. Our experimental results show that the prediction performance of iDeepMV, which combines multi-view deep feature learning models with RNA-RBP interaction information, is significantly better than that of the state-of-the-art methods. 
iDeepMV is freely available at http://www.csbio.sjtu.edu.cn/bioinf/iDeepMV for academic use. The code is freely available at http://github.com/uchihayht/iDeepMV.
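
The final voting step over the per-view multi-label classifiers can be pictured with a simple majority rule. This is a minimal sketch assuming each view outputs a binary vector with one entry per RBP label; the abstract does not specify the exact voting scheme used by iDeepMV, so the rule and the function name here are hypothetical.

```python
def majority_vote(view_predictions):
    """Combine binary multi-label predictions from several views.

    `view_predictions` is a list with one 0/1 vector per view.
    A label is assigned when strictly more than half of the views predict it.
    """
    n_views = len(view_predictions)
    n_labels = len(view_predictions[0])
    return [
        1 if 2 * sum(view[label] for view in view_predictions) > n_views else 0
        for label in range(n_labels)
    ]
```

With three views, a label needs at least two votes to be kept.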

Subjects

Machine Learning ; RNA-Binding Proteins/metabolism ; Computational Biology/methods ; Neural Networks, Computer

10.

A pilot study of radiomics signature based on biparametric MRI for preoperative prediction of extrathyroidal extension in papillary thyroid carcinoma.

He, Junlin; Zhang, Heng; Wang, Xian; Sun, Zongqiong; Ge, Yuxi; Wang, Kang; Yu, Chunjing; Deng, Zhaohong; Feng, Jianxin; Xu, Xin; Hu, Shudong.

J Xray Sci Technol ; 29(1): 171-183, 2021.

Article in English | MEDLINE | ID: mdl-33325448

ABSTRACT

OBJECTIVE: To investigate the efficiency of a radiomics signature to preoperatively predict histological features of aggressive extrathyroidal extension (ETE) in papillary thyroid carcinoma (PTC) with biparametric magnetic resonance imaging findings. MATERIALS AND METHODS: Sixty PTC patients with preoperative MR including T2WI and T2WI-fat-suppression (T2WI-FS) were retrospectively analyzed. Among them, 35 had ETE and 25 did not. Pre-contrast T2WI and T2WI-FS images depicting the largest section of tumor were selected. Tumor regions were manually segmented using ITK-SNAP software and 107 radiomics features were computed from the segmented regions using the open Pyradiomics package. Then, a random forest model was built for classification, in which the datasets were partitioned randomly 10 times into training and testing sets with a ratio of 1:1. Furthermore, forward greedy feature selection based on feature importance was adopted to reduce model overfitting. Classification accuracy was estimated on the test set using area under ROC curve (AUC). RESULTS: The model using T2WI-FS image features yields much higher performance than the model using T2WI features (AUC = 0.906 vs. 0.760 using 107 features). Among the top 10 important features of T2WI and T2WI-FS, there are 5 common features. After feature selection, the models trained using the top 2 features of T2WI and the top 6 features of T2WI-FS achieve AUC 0.845 and 0.928, respectively. Combining features computed from T2WI and T2WI-FS, model performance decreases slightly (AUC = 0.882 based on all features and AUC = 0.913 based on top features after feature selection). Adjusting hyperparameters of the random forest model has negligible influence on the model performance, with mean AUC = 0.907 for T2WI-FS images. CONCLUSIONS: Radiomics features based on pre-contrast T2WI and T2WI-FS are helpful to predict aggressive ETE in PTC. 
Particularly, the model trained using the optimally selected T2WI-FS image features yields the best classification performance. The most important features relate to lesion size and the texture heterogeneity of the tumor region.
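
The AUC values reported above have a simple pairwise interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes AUC that way with plain Python. The function name and the toy labels are illustrative assumptions and are unrelated to the study's data or code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; ties count as one half. This equals the
    normalized Mann-Whitney U statistic.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

The quadratic pairwise loop is fine for a cohort of this size (60 patients); a rank-based formula would be preferable for large test sets.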

Subjects

Magnetic Resonance Imaging ; Thyroid Neoplasms ; Humans ; Pilot Projects ; ROC Curve ; Retrospective Studies ; Thyroid Cancer, Papillary/diagnostic imaging ; Thyroid Neoplasms/diagnostic imaging

11.

Preoperative assessment of extrathyroidal extension of papillary thyroid carcinomas by ultrasound and magnetic resonance imaging: a comparative study.

Hu, Shudong; Zhang, Heng; Sun, Zongqiong; Ge, Yuxi; Li, Jie; Yu, Chunjing; Deng, Zhaohong; Dou, Weiqiang; Wang, Xian.

Radiol Med ; 125(9): 870-876, 2020 Sep.

Article in English | MEDLINE | ID: mdl-32249390

ABSTRACT

PURPOSE: The purpose of this study was to assess and compare the diagnostic performances of preoperative ultrasonography (US) and magnetic resonance imaging (MRI) in predicting extrathyroidal extension (ETE) in patients with papillary thyroid carcinoma (PTC). MATERIALS AND METHODS: This retrospective study was approved by our institutional review board. Preoperative US and MRI were performed on 225 patients who underwent surgery for PTC between May 2014 and December 2018. The US and MRI features of ETE of each case were retrospectively and independently investigated by two radiologists. The diagnostic performances of US and MRI, including their sensitivity, specificity, positive predictive values (PPV), and negative predictive values (NPV) for ETE, and their accuracy in predicting ETE were analyzed. RESULTS: Higher sensitivity and NPV in predicting minimal ETE were observed in US (87.5% and 76.2%, respectively) compared with MRI (71.3% and 61.7%, respectively) (p = 0.006 and p = 0.046, respectively). Meanwhile, MRI (85.4%) showed higher sensitivity than US (66.7%) in assessing extensive ETE (p = 0.005). MRI also showed significantly higher specificity and PPV than US in assessing overall ETE (p = 0.025 and p = 0.025, respectively). CONCLUSION: Preoperative US should be used as the first line in predicting minimal ETE, and MRI should be added in extensive ETE assessment. Compared with US, MRI had higher specificity and PPV in detecting the overall ETE of PTC.
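
For readers less familiar with the four diagnostic measures compared above, the sketch below derives them from the cells of a 2x2 confusion table. The function name is an assumption, and any counts passed to it in examples are made up for illustration; they are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic measures from confusion-table counts.

    tp/fn: diseased cases called positive/negative by the test.
    tn/fp: non-diseased cases called negative/positive by the test.
    """
    return {
        "sensitivity": tp / (tp + fn),   # diseased cases detected
        "specificity": tn / (tn + fp),   # non-diseased cases correctly cleared
        "ppv": tp / (tp + fp),           # positive calls that are correct
        "npv": tn / (tn + fn),           # negative calls that are correct
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```

Sensitivity and specificity describe the test itself, whereas PPV and NPV also depend on how common ETE is in the cohort being examined.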

Subjects

Magnetic Resonance Imaging/methods ; Preoperative Care/methods ; Thyroid Cancer, Papillary/diagnostic imaging ; Thyroid Neoplasms/diagnostic imaging ; Ultrasonography/methods ; Adult ; Aged ; Female ; Humans ; Male ; Middle Aged ; Neoplasm Invasiveness ; Retrospective Studies ; Thyroid Cancer, Papillary/pathology ; Thyroid Neoplasms/pathology ; Young Adult

12.

Knowledge-leveraged transfer fuzzy C-Means for texture image segmentation with self-adaptive cluster prototype matching.

Qian, Pengjiang; Zhao, Kaifa; Jiang, Yizhang; Su, Kuan-Hao; Deng, Zhaohong; Wang, Shitong; Muzic, Raymond F.

Knowl Based Syst ; 130: 33-50, 2017 Aug 15.

Article in English | MEDLINE | ID: mdl-30050232

ABSTRACT

We study a novel fuzzy clustering method to improve the segmentation performance on the target texture image by leveraging the knowledge from a prior texture image. Two knowledge transfer mechanisms, i.e. knowledge-leveraged prototype transfer (KL-PT) and knowledge-leveraged prototype matching (KL-PM) are first introduced as the bases. Applying them, the knowledge-leveraged transfer fuzzy C-means (KL-TFCM) method and its three-stage-interlinked framework, including knowledge extraction, knowledge matching, and knowledge utilization, are developed. There are two specific versions: KL-TFCM-c and KL-TFCM-f, i.e. the so-called crisp and flexible forms, which use the strategies of maximum matching degree and weighted sum, respectively. The significance of our work is fourfold: 1) Owing to the adjustability of referable degree between the source and target domains, KL-PT is capable of appropriately learning the insightful knowledge, i.e. the cluster prototypes, from the source domain; 2) KL-PM is able to self-adaptively determine the reasonable pairwise relationships of cluster prototypes between the source and target domains, even if the numbers of clusters differ in the two domains; 3) The joint action of KL-PM and KL-PT can effectively resolve the data inconsistency and heterogeneity between the source and target domains, e.g. the data distribution diversity and cluster number difference. Thus, using the three-stage-based knowledge transfer, the beneficial knowledge from the source domain can be extensively, self-adaptively leveraged in the target domain. As evidence of this, both KL-TFCM-c and KL-TFCM-f surpass many existing clustering methods in texture image segmentation; and 4) In the case of different cluster numbers between the source and target domains, KL-TFCM-f proves higher clustering effectiveness and segmentation performance than does KL-TFCM-c.
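
The transfer methods described above extend classical fuzzy C-means, whose core is the membership update for one sample given its distances to the cluster prototypes. The sketch below shows only that classical update, with fuzzifier m greater than 1. It does not implement KL-PT, KL-PM, or KL-TFCM, and the function name is an assumption.

```python
def fcm_memberships(distances, m=2.0):
    """Classical fuzzy C-means membership update for a single sample.

    `distances[i]` is the distance from the sample to cluster prototype i.
    Returns memberships u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    which are non-negative and sum to 1.
    """
    zero = [i for i, d in enumerate(distances) if d == 0]
    if zero:
        # The sample coincides with one or more prototypes: share membership
        # equally among them and give the other clusters none.
        return [1.0 / len(zero) if i in zero else 0.0
                for i in range(len(distances))]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]
```

A sample twice as far from one prototype as from the other receives memberships of 0.8 and 0.2 when m = 2, which shows how soft the assignment is compared with hard k-means.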

13.

Classifier Model Based on Machine Learning Algorithms: Application to Differential Diagnosis of Suspicious Thyroid Nodules via Sonography.

Wu, Hongxun; Deng, Zhaohong; Zhang, Bingjie; Liu, Qianyun; Chen, Junyong.

AJR Am J Roentgenol ; 207(4): 859-864, 2016 Oct.

Article in English | MEDLINE | ID: mdl-27340876

ABSTRACT

OBJECTIVE: The purpose of this article is to construct classifier models using machine learning algorithms and to evaluate their diagnostic performances for differentiating malignant from benign thyroid nodules. MATERIALS AND METHODS: This study included 970 histopathologically proven thyroid nodules in 970 patients. Two radiologists retrospectively reviewed ultrasound images, and nodules were graded according to a five-tier sonographic scoring system. Statistically significant variables based on an experienced radiologist's observations were obtained with attribute optimization using fivefold cross-validation and applied as the input nodes to build models for predicting malignancy of nodules. The performances of the machine learning algorithms and radiologists were compared using ROC curve analysis. RESULTS: Diagnosis by the experienced radiologist achieved the highest predictive accuracy of 88.66% with a specificity of 85.33%, whereas the radial basis function (RBF)-neural network (NN) achieved the highest sensitivity of 92.31%. The AUC value for diagnosis by the experienced radiologist (AUC = 0.9135) was greater than those for diagnosis by the less experienced radiologist, the naïve Bayes classifier, the support vector machine, and the RBF-NN (AUC = 0.8492, 0.8811, 0.9033, and 0.9103, respectively; p < 0.05). CONCLUSION: The machine learning algorithms underperformed with respect to the experienced radiologist's readings used to construct them, and the RBF-NN outperformed the other machine learning algorithm models.

14.

Urinary liver-type fatty acid-binding protein predicts recovery from acute kidney injury.

Wang, Liang; Xue, Jing; Chen, Caimei; Zhang, Zhijian; Deng, Zhaohong; Sun, Zhuxing; Xing, Changying.

Clin Nephrol ; 84(5): 255-61, 2015 Nov.

Article in English | MEDLINE | ID: mdl-26396099

ABSTRACT

OBJECTIVES: Despite significant advances in the epidemiology of acute kidney injury (AKI), there is no reliable method to predict renal recovery. Using acute kidney injury network (AKIN) criteria, we tested whether higher urinary L-FABP (uL-FABP) concentrations in the patients with AKIN stage 3 (AKIN3) after nephrology consultation would predict failure to recover. METHODS: This is a prospective cohort study of 114 patients with AKIN3 at WuXi People's Hospital from August 2011 to July 2014. The levels of serum creatinine, urine creatinine, and uL-FABP were obtained at the time of nephrology consultation. RESULTS: Patients who recovered had lower uL-FABP than those who failed to recover at time of nephrology consultation (71.42 (11.1 - 118.3) vs. 335.18 (103.9 - 422.3) ng/mg × creatinine, p < 0.001). Urinary L-FABP predicted failure to recover with an area under the receiver operating characteristic curve of 0.906 (95% CI 0.837 - 0.953). A clinical model using age, APACHE II score and acute tubular necrosis severity scoring index (ATN-ISS) predicted failure to recover with an area under the curve of 0.825 (95% CI 0.743 - 0.890). When uL-FABP was compared to the clinical model, the reclassification of risk of renal recovery had significantly improved by 35.1%. CONCLUSION: Urinary L-FABP appears to be a useful biomarker to predict failure to recover during hospitalization in the cohort of patients with AKIN3.

Subjects

Acute Kidney Injury/urine , Biomarkers/urine , Fatty Acid-Binding Proteins/urine , Cohort Studies , Creatinine/blood , Creatinine/urine , Female , Humans , Kidney Function Tests , Kidney Tubular Necrosis, Acute/blood , Kidney Tubular Necrosis, Acute/urine , Male , Prospective Studies , ROC Curve

15.

CarSitePred: an integrated algorithm for identifying carbonylated sites based on KNDUA-LNDOT resampling technique.

Zuo, Yun; Zhang, Jingrun; He, Wenying; Liu, Xiangrong; Deng, Zhaohong.

J Biomol Struct Dyn ; : 1-13, 2024 Feb 09.

Article in English | MEDLINE | ID: mdl-38334134

ABSTRACT

Carbonylated sites are the determining factors for functional changes or deletions in carbonylated proteins, so identifying carbonylated sites is essential for understanding the process of protein carbonylation and exploring the pathogenesis of related diseases. The current wet experimental methods for predicting carbonylated modification sites are not only expensive and time-consuming, but also have limited protein processing capabilities and cannot meet the needs of researchers. The identification of carbonylated sites using computational methods not only improves the functional characterization of proteins, but also provides researchers with free tools for predicting carbonylated sites. Therefore, it is essential to establish a model using computational methods that can accurately predict protein carbonylated sites. In this study, a prediction model, CarSitePred, is proposed to identify carbonylation sites. In CarSitePred, location-specific amino acid hydrophobicity and hydrophilicity, one-to-one numerical conversion of amino acids, and AlexNet convolutional neural networks convert preprocessed carbonylated sequences into valid numerical features. The K-means Normal Distribution-based Undersampling Algorithm (KNDUA) and Localized Normal Distribution Oversampling Technology (LNDOT) were first proposed and employed to balance the K, P, R and T carbonylation training datasets. For the first time, carbonylation modification sites were transformed into images and directly input into an AlexNet convolutional neural network to extract features for fitting SVM classifiers. The 10-fold cross-validation and independent testing results show that CarSitePred achieves better prediction performance than the best currently available prediction models. Availability: https://github.com/zuoyun123/CarSitePred. Communicated by Ramaswamy H. Sarma.

16.

MVDINET: A Novel Multi-Level Enzyme Function Predictor With Multi-View Deep Interactive Learning.

Tang, Wenliang; Deng, Zhaohong; Zhou, Hanwen; Zhang, Wei; Hu, Fuping; Choi, Kup-Sze; Wang, Shitong.

IEEE/ACM Trans Comput Biol Bioinform ; 21(1): 84-94, 2024.

Article in English | MEDLINE | ID: mdl-38015669

ABSTRACT

As an extremely significant class of biocatalysts, enzymes play an important role in the process of biological reproduction and metabolism. Therefore, the prediction of enzyme function is of great significance in biomedical fields. Recently, computational methods for predicting enzyme function have been proposed, and they effectively reduce the cost of enzyme function prediction. However, existing methods still have deficiencies in effectively mining the discriminant information for enzyme function recognition. In this study, we present MVDINET, a novel method for multi-level enzyme function prediction. First, the initial multi-view feature data are extracted from the enzyme sequence. Then, these initial views are fed into various deep specific network modules to learn the depth-specificity information. Further, a deep view interaction network is designed to extract the interaction information. Finally, the specificity information and interaction information are fed into a multi-view adaptively weighted classification. We comprehensively evaluate MVDINET on benchmark datasets and demonstrate that MVDINET is superior to existing methods.

Subjects

Benchmarking , Simulation Training , Reproduction

17.

HGLA: Biomolecular Interaction Prediction based on Mixed High-Order Graph Convolution with Filter Network via LSTM and Channel Attention.

Zhang, Zhen; Deng, Zhaohong; Li, Ruibo; Zhang, Wei; Lou, Qiongdan; Choi, Kup-Sze; Wang, Shitong.

IEEE/ACM Trans Comput Biol Bioinform ; PP2024 Jul 26.

Article in English | MEDLINE | ID: mdl-39058607

ABSTRACT

Predicting biomolecular interactions is significant for understanding biological systems. Most existing methods for link prediction are based on graph convolution. Although graph convolution methods are advantageous in extracting structure information of biomolecular interactions, two key challenges still remain. One is how to consider both the immediate and high-order neighbors. Another is how to reduce noise when aggregating high-order neighbors. To address these challenges, we propose a novel method, called mixed high-order graph convolution with filter network via LSTM and channel attention (HGLA), to predict biomolecular interactions. Firstly, the basic and high-order features are extracted respectively through the traditional graph convolutional network (GCN) and the two-layer Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing (MixHop). Secondly, these features are mixed and input into the filter network composed of LayerNorm, SENet and LSTM to generate filtered features, which are concatenated and used for link prediction. The advantages of HGLA are: 1) HGLA processes high-order features separately, rather than simply concatenating them; 2) HGLA better balances the basic features and high-order features; 3) HGLA effectively filters the noise from high-order neighbors. It outperforms state-of-the-art networks on four benchmark datasets. The codes are available at https://github.com/zznb123/HGLA.
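The idea of combining immediate and high-order neighbors can be sketched in a few lines. The example below is a simplified illustration of MixHop-style neighborhood mixing, not the HGLA implementation: it assumes a small dense adjacency matrix given as nested lists, omits the learned weight matrices and nonlinearity, and uses hypothetical function names. Each node's output concatenates its own features with features propagated over 1 and 2 hops of the symmetrically normalized adjacency matrix with self-loops.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product for nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def mix_hops(adj, features, max_power=2):
    """Concatenate features propagated over 0, 1, ..., max_power hops."""
    a_norm = normalized_adjacency(adj)
    blocks = [features]          # power 0: the node's own features
    current = features
    for _ in range(max_power):
        current = matmul(a_norm, current)   # one further hop of propagation
        blocks.append(current)
    # Row-wise concatenation of all hop blocks.
    return [sum((block[i] for block in blocks), []) for i in range(len(features))]
```

Keeping the hop blocks as separate columns, rather than summing them, is what lets a downstream module treat basic and high-order features differently.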

18.

ILYCROsite: Identification of lysine crotonylation sites based on FCM-GRNN undersampling technique.

Zuo, Yun; Wan, Minquan; Shen, Yang; Wang, Xinheng; He, Wenying; Bi, Yue; Liu, Xiangrong; Deng, Zhaohong.

Comput Biol Chem ; 113: 108212, 2024 Sep 13.

Article in English | MEDLINE | ID: mdl-39277959

ABSTRACT

Protein lysine crotonylation is an important post-translational modification that regulates various cellular activities. For example, histone crotonylation affects chromatin structure and promotes histone replacement. Identification and understanding of lysine crotonylation sites is crucial in the field of protein research. However, due to the increasing number of non-histone crotonylation sites, existing classifiers based on traditional machine learning may encounter performance limitations. To address this problem, a novel deep learning-based model for identifying crotonylation sites is presented in this study, given the unique advantages of deep learning techniques for sequence data analysis. In this study, an MLP-Attention-based model was developed for the identification of crotonylation sites. Firstly, three feature extraction strategies, namely Amino Acid Composition, K-mer, and Distance-based residue feature extraction, were used to encode crotonylated and non-crotonylated sequences. Then, to balance the training dataset, the FCM-GRNN undersampling algorithm combining fuzzy clustering and generalized neural network approaches was introduced. Finally, to improve the effectiveness of crotonylation site identification, we explored various classification algorithms, and based on the relevant experimental performance comparisons, the multilayer perceptron (MLP) combined with the superimposed self-attention mechanism was selected to construct the prediction model ILYCROsite. The results obtained from independent testing and five-fold cross-validation demonstrated that the model proposed in this study, ILYCROsite, had excellent performance. Notably, on the independent test set, ILYCROsite achieves an AUC value of 87.93%, which is significantly better than the existing state-of-the-art models. In addition, SHAP (Shapley Additive exPlanations) values were used to analyze the importance of features and their impact on model predictions. Meanwhile, to make the prediction model constructed in this study easier for researchers to use, we developed a prediction program to identify the crotonylation sites in a given protein sequence. The data and code for this program are available at: https://github.com/wmqskr/ILYCROsite.
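Two of the sequence encodings named in this abstract, Amino Acid Composition and K-mer, can be sketched as follows. This is a generic illustration under stated assumptions: the 20-letter alphabet order, the function names and the normalization by window count are choices made here, not details taken from the ILYCROsite code.

```python
from collections import Counter
from itertools import product

# Assumed alphabet: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Fraction of each standard amino acid in the sequence (20 values)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    # Non-standard residues are not counted but still contribute to the length.
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def kmer_frequencies(seq, k=2):
    """Relative frequency of every possible k-mer (20**k values)."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(windows)
    all_kmers = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return [counts[m] / len(windows) for m in all_kmers]
```

For a peptide window around a candidate lysine, the two vectors would typically be concatenated into a single feature vector before classification.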

19.

EnsDeepDP: An Ensemble Deep Learning Approach for Disease Prediction Through Metagenomics.

Shen, Yang; Zhu, Jinlin; Deng, Zhaohong; Lu, Wenwei; Wang, Hongchao.

IEEE/ACM Trans Comput Biol Bioinform ; 20(2): 986-998, 2023.

Article in English | MEDLINE | ID: mdl-36001521

ABSTRACT

A growing number of studies show that the human microbiome plays a vital role in human health and can be a crucial factor in predicting certain human diseases. However, microbiome data are often characterized by the limited samples and high-dimensional features, which pose a great challenge for machine learning methods. Therefore, this paper proposes a novel ensemble deep learning disease prediction method that combines unsupervised and supervised learning paradigms. First, unsupervised deep learning methods are used to learn the potential representation of the sample. Afterwards, the disease scoring strategy is developed based on the deep representations as the informative features for ensemble analysis. To ensure the optimal ensemble, a score selection mechanism is constructed, and performance boosting features are engaged with the original sample. Finally, the composite features are trained with gradient boosting classifier for health status decision. For case study, the ensemble deep learning flowchart has been demonstrated on six public datasets extracted from the human microbiome profiling. The results show that compared with the existing algorithms, our framework achieves better performance on disease prediction.

Subjects

Deep Learning , Microbiota , Humans , Metagenomics , Algorithms , Machine Learning , Microbiota/genetics

20.

One-Step Multiview Fuzzy Clustering With Collaborative Learning Between Common and Specific Hidden Space Information.

Zhang, Wei; Deng, Zhaohong; Zhang, Te; Choi, Kup-Sze; Wang, Shitong.

IEEE Trans Neural Netw Learn Syst ; PP2023 May 22.

Article in English | MEDLINE | ID: mdl-37216234

ABSTRACT

Multiview data are widespread in real-world applications, and multiview clustering is a commonly used technique to effectively mine the data. Most of the existing algorithms perform multiview clustering by mining the commonly hidden space between views. Although this strategy is effective, there are two challenges that still need to be addressed to further improve the performance. First, how to design an efficient hidden space learning method so that the learned hidden spaces contain both shared and specific information of multiview data. Second, how to design an efficient mechanism to make the learned hidden space more suitable for the clustering task. In this study, a novel one-step multiview fuzzy clustering (OMFC-CS) method is proposed to address the two challenges by collaborative learning between the common and specific space information. To tackle the first challenge, we propose a mechanism to extract the common and specific information simultaneously based on matrix factorization. For the second challenge, we design a one-step learning framework to integrate the learning of common and specific spaces and the learning of fuzzy partitions. The integration is achieved in the framework by performing the two learning processes alternately and thereby yielding mutual benefit. Furthermore, the Shannon entropy strategy is introduced to obtain the optimal views weight assignment during clustering. The experimental results based on benchmark multiview datasets demonstrate that the proposed OMFC-CS outperforms many existing methods.
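The abstract does not give the OMFC-CS objective, so the following is only a sketch of one common way Shannon entropy is used to assign view weights, under the assumption that each view k has a scalar loss L_k. Minimizing the weighted loss plus an entropy term, sum_k w_k L_k + lam * sum_k w_k ln w_k subject to sum_k w_k = 1, has the closed-form solution w_k proportional to exp(-L_k / lam). The function name and the parameter `lam` are hypothetical.

```python
import math


def entropy_regularized_weights(view_losses, lam=1.0):
    """Closed-form view weights for an entropy-regularized weighting problem."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    # Subtracting the smallest loss leaves the weights unchanged
    # but avoids underflow when losses are large.
    base = min(view_losses)
    unnormalized = [math.exp(-(loss - base) / lam) for loss in view_losses]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]
```

A small `lam` concentrates weight on the best-fitting view, while a large `lam` pushes the weights toward uniform, which is the trade-off an entropy term is meant to control.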
