Results 1 - 20 of 44
1.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37258453

ABSTRACT

Protein is the most important component in organisms and plays an indispensable role in life activities. In recent years, a large number of intelligent methods have been proposed to predict protein function. These methods obtain different types of protein information, including sequence, structure and interaction network. Among them, protein sequences have gained significant attention, and methods have been investigated to extract information from different views of features. However, how to fully exploit the views for effective protein sequence analysis remains a challenge. In this regard, we propose a multi-view, multi-scale and multi-attention deep neural model (MMSMA) for protein function prediction. First, MMSMA extracts multi-view features from protein sequences, including one-hot encoding features, evolutionary information features, deep semantic features and overlapping property features based on physicochemistry. Second, a specific multi-scale multi-attention deep network model (MSMA) is built for each view to realize the deep feature learning and preliminary classification. In MSMA, both multi-scale local patterns and long-range dependence from protein sequences can be captured. Third, a multi-view adaptive decision mechanism is developed to make a comprehensive decision based on the classification results of all the views. To further improve the prediction performance, an extended version of MMSMA, MMSMAPlus, is proposed to integrate homology-based protein prediction under the framework of the multi-view deep neural model. Experimental results show that MMSMAPlus has promising performance and is significantly superior to the state-of-the-art methods. The source code can be found at https://github.com/wzy-2020/MMSMAPlus.


Subjects
Neural Networks, Computer , Proteins , Amino Acid Sequence , Software , Sequence Analysis, Protein
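The one-hot view and the final multi-view decision described in the abstract above can be illustrated with a small sketch. This is not the MMSMA code: the function names `one_hot_encode` and `fuse_views`, the handling of non-standard residues, and the fixed hand-set weights are all assumptions made for illustration, and a plain weighted average stands in for the paper's adaptive decision mechanism.

```python
# Illustrative sketch only; not the MMSMA implementation.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_encode(sequence):
    """Return one 20-dimensional indicator row per residue.

    Residues outside the standard alphabet (e.g. 'X') map to an all-zero row
    (an assumption; other conventions exist).
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1
        rows.append(row)
    return rows


def fuse_views(view_scores, view_weights):
    """Weighted average of per-view class scores.

    view_scores: one list of class scores per view, all of equal length.
    view_weights: one non-negative weight per view (hypothetical values;
    a trained model would learn these instead of fixing them by hand).
    """
    total = sum(view_weights)
    n_classes = len(view_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(view_weights, view_scores)) / total
        for c in range(n_classes)
    ]
```

Calling `fuse_views([[1.0, 0.0], [0.0, 1.0]], [3, 1])` weights the first view three times as heavily as the second, so the fused score leans toward the first view's prediction.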
2.
Bioinformatics ; 40(4)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38483285

ABSTRACT

MOTIVATION: Drug-target interaction (DTI) prediction refers to the prediction of whether a given drug molecule will bind to a specific target and thus exert a targeted therapeutic effect. Although intelligent computational approaches for drug target prediction have received much attention and made many advances, DTI prediction remains a challenging task that requires further research. The main challenges are as follows: (i) most graph neural network-based methods only consider the information of the first-order neighboring nodes (drug and target) in the graph, without learning deeper and richer structural features from the higher-order neighboring nodes; (ii) existing methods do not jointly consider the sequence and structural features of drugs and targets; each method works independently and cannot combine the advantages of sequence and structural features to improve the interactive learning effect. RESULTS: To address the above challenges, a Multi-view Integrated learning Network that integrates Deep learning and Graph Learning (MINDG) is proposed in this study, which consists of the following parts: (i) a mixed deep network is used to extract sequence features of drugs and targets, (ii) a higher-order graph attention convolutional network is proposed to better extract and capture structural features, and (iii) a multi-view adaptive integrated decision module is used to improve and complement the initial prediction results of the above two networks to enhance the prediction performance. We evaluate MINDG on two datasets and show that it improves DTI prediction performance compared with state-of-the-art baselines. AVAILABILITY AND IMPLEMENTATION: https://github.com/jnuaipr/MINDG.


Subjects
Algorithms , Neural Networks, Computer
3.
Neuroimage ; 292: 120608, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38626817

ABSTRACT

The morphological analysis and volume measurement of the hippocampus are crucial to the study of many brain diseases. Therefore, an accurate hippocampal segmentation method is beneficial for the development of clinical research in brain diseases. U-Net and its variants have become prevalent in hippocampus segmentation of Magnetic Resonance Imaging (MRI) due to their effectiveness, and the architecture based on Transformer has also received some attention. However, some existing methods focus too much on the shape and volume of the hippocampus rather than its spatial information, and the extracted information is independent of each other, ignoring the correlation between local and global features. In addition, many methods cannot be effectively applied to practical medical image segmentation due to many parameters and high computational complexity. To this end, we combined the advantages of CNNs and ViTs (Vision Transformer) and proposed a simple and lightweight model: Light3DHS for the segmentation of the 3D hippocampus. In order to obtain richer local contextual features, the encoder first utilizes a multi-scale convolutional attention module (MCA) to learn the spatial information of the hippocampus. Considering the importance of local features and global semantics for 3D segmentation, we used a lightweight ViT to learn high-level features of scale invariance and further fuse local-to-global representation. To evaluate the effectiveness of encoder feature representation, we designed three decoders of different complexity to generate segmentation maps. Experiments on three common hippocampal datasets demonstrate that the network achieves more accurate hippocampus segmentation with fewer parameters. Light3DHS performs better than other state-of-the-art algorithms.


Subjects
Hippocampus , Imaging, Three-Dimensional , Magnetic Resonance Imaging , Hippocampus/diagnostic imaging , Humans , Magnetic Resonance Imaging/methods , Imaging, Three-Dimensional/methods , Neural Networks, Computer , Deep Learning , Algorithms
4.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34571539

ABSTRACT

Circular RNAs (circRNAs) generally bind to RNA-binding proteins (RBPs) to play an important role in the regulation of autoimmune diseases. Thus, it is crucial to study the binding sites of RBPs on circRNAs. Although many methods, including traditional machine learning and deep learning, have been developed to predict the interactions between RNAs and RBPs, most of them focus on linear RNAs. At present, few studies have been done on the binding relationships between circRNAs and RBPs. Thus, in-depth research is urgently needed. In the existing circRNA-RBP binding site prediction methods, circRNA sequences are the main research subjects, but the relevant characteristics of circRNAs have not been fully exploited, such as the structure and composition information of circRNA sequences. Some methods have extracted different views to construct recognition models, but how to efficiently use the multi-view data to construct recognition models is still not well studied. Considering the above problems, this paper proposes a multi-view classification method called DMSK based on multi-view deep learning, subspace learning and multi-view classifier for the identification of circRNA-RBP interaction sites. In the DMSK method, first, we converted circRNA sequences into pseudo-amino acid sequences and pseudo-dipeptide components for extracting high-dimensional sequence features and component features of circRNAs, respectively. Then, the structure prediction method RNAfold was used to predict the secondary structure of the RNA sequences, and the sequence embedding model was used to extract the context-dependent features. Next, we fed the above four views' raw features to a hybrid network, which is composed of a convolutional neural network and a long short-term memory network, to obtain the deep features of circRNAs. Furthermore, we used view-weighted generalized canonical correlation analysis to extract four views' common features by subspace learning. Finally, the learned subspace common features and multi-view deep features were fed to train the downstream multi-view TSK fuzzy system to construct a fuzzy rule and fuzzy inference-based multi-view classifier. The trained classifier was used to predict the specific positions of the RBP binding sites on the circRNAs. The experiments show that the prediction performance of the proposed method DMSK has been improved compared with the existing methods. The code and dataset of this study are available at https://github.com/Rebecca3150/DMSK.


Subjects
Deep Learning , RNA, Circular , Binding Sites , Carrier Proteins/metabolism , Computational Biology/methods , Humans
5.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35907779

ABSTRACT

Circular RNA (circRNA) is closely involved in physiological and pathological processes of many diseases. Discovering the associations between circRNAs and diseases is of great significance. Due to the high cost of verifying circRNA-disease associations by wet-lab experiments, computational approaches for predicting the associations have become a promising research direction. In this paper, we propose a method, MDGF-MCEC, based on multi-view dual attention graph convolution network (GCN) with cooperative ensemble learning to predict circRNA-disease associations. First, MDGF-MCEC constructs two disease relation graphs and two circRNA relation graphs based on different similarities. Then, the relation graphs are fed into a multi-view GCN for representation learning. In order to learn highly discriminative features, a dual-attention mechanism is introduced to adjust the contribution weights, at both channel level and spatial level, of different features. Based on the learned embedding features of diseases and circRNAs, nine different feature combinations between diseases and circRNAs are treated as new multi-view data. Finally, we construct a multi-view cooperative ensemble classifier to predict the associations between circRNAs and diseases. Experiments conducted on the CircR2Disease database demonstrate that the proposed MDGF-MCEC model achieves a high area under the curve of 0.9744 and outperforms the state-of-the-art methods. Promising results are also obtained from experiments on the circ2Disease and circRNADisease databases. Furthermore, the predicted associated circRNAs for hepatocellular carcinoma and gastric cancer are supported by the literature. The code and dataset of this study are available at https://github.com/ABard0/MDGF-MCEC.


Subjects
RNA, Circular , Stomach Neoplasms , Humans , Intercellular Signaling Peptides and Proteins , Machine Learning , Stomach Neoplasms/genetics
6.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37561093

ABSTRACT

MOTIVATION: CircRNAs play a critical regulatory role in physiological processes, and the abnormal expression of circRNAs can mediate the processes of diseases. Therefore, exploring circRNA-disease associations is gradually becoming an important area of research. Due to the high cost of validating circRNA-disease associations using traditional wet-lab experiments, novel computational methods based on machine learning are gaining more and more attention in this field. However, current computational methods suffer from insufficient consideration of latent features in circRNA-disease interactions. RESULTS: In this study, a multilayer attention neural graph-based collaborative filtering (MLNGCF) is proposed. MLNGCF first enhances multiple biological information with an autoencoder as the initial features of circRNAs and diseases. Then, by constructing a central network of different diseases and circRNAs, a multilayer cooperative attention-based message propagation is performed on the central network to obtain the high-order features of circRNAs and diseases. A neural network-based collaborative filtering is constructed to predict the unknown circRNA-disease associations and update the model parameters. Experiments on the benchmark datasets demonstrate that MLNGCF outperforms state-of-the-art methods, and the prediction results are supported by the literature in the case studies. AVAILABILITY AND IMPLEMENTATION: The source codes and benchmark datasets of MLNGCF are available at https://github.com/ABard0/MLNGCF.


Subjects
Neural Networks, Computer , RNA, Circular , Machine Learning , Software , Computational Biology/methods
7.
Anal Biochem ; 691: 115535, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38643894

ABSTRACT

Accurately predicting RNA-protein binding sites is essential to gain a deeper comprehension of the protein-RNA interactions and their regulatory mechanisms, which are fundamental in gene expression and regulation. However, conventional biological approaches to detect these sites are often costly and time-consuming. In contrast, computational methods for predicting RNA-protein binding sites are both cost-effective and expeditious. This review synthesizes existing computational methods, summarizing commonly used databases for predicting RNA-protein binding sites. In addition, applications and innovations of computational methods using traditional machine learning and deep learning for RNA-protein binding site prediction during 2018-2023 are presented. These methods cover a wide range of aspects such as effective database utilization, feature selection and encoding, innovative classification algorithms, and evaluation strategies. Exploring the limitations of existing computational methods, this paper delves into the potential directions for future development. DeepRKE, RDense, and DeepDW all employ convolutional neural networks and long short-term memory networks to construct prediction models, yet their algorithm design and feature encoding differ, resulting in diverse prediction performances.


Subjects
RNA-Binding Proteins , RNA , RNA-Binding Proteins/metabolism , Binding Sites , RNA/metabolism , Computational Biology/methods , Algorithms , Machine Learning , Deep Learning , Humans , Protein Binding , Neural Networks, Computer
8.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32808039

ABSTRACT

RNA-binding proteins (RBPs) are a class of proteins that bind to and accompany RNAs in regulating biological processes. An RBP may have multiple target RNAs, and its aberrant expression can cause multiple diseases. Methods have been designed to predict whether a specific RBP can bind to an RNA and the position of the binding site using a binary classification model. However, most of the existing methods do not take into account the binding similarity and correlation between different RBPs. While methods employing multiple labels and Long Short Term Memory Network (LSTM) have been proposed to consider binding similarity between different RBPs, the accuracy remains low due to insufficient feature learning and multi-label learning on RNA sequences. In response to this challenge, the concept of RNA-RBP Binding Network (RRBN) is proposed in this paper to provide theoretical support for multi-label learning to identify RBPs that can bind to RNAs. It is experimentally shown that the RRBN information can significantly improve the prediction of unknown RNA-RBP interactions. To further improve the prediction accuracy, we present the novel computational method iDeepMV, which integrates multi-view deep learning technology under the multi-label learning framework. iDeepMV first extracts data from the views of amino acid sequence and dipeptide component based on the RNA sequences as the original view. Deep neural network models are then designed for the respective views to perform deep feature learning. The extracted deep features are fed into multi-label classifiers which are trained with the RNA-RBP interaction information for the three views. Finally, a voting mechanism is designed to make a comprehensive decision on the results of the multi-label classifiers. Our experimental results show that the prediction performance of iDeepMV, which combines multi-view deep feature learning models with RNA-RBP interaction information, is significantly better than that of the state-of-the-art methods. iDeepMV is freely available at http://www.csbio.sjtu.edu.cn/bioinf/iDeepMV for academic use. The code is freely available at http://github.com/uchihayht/iDeepMV.


Subjects
Machine Learning , RNA-Binding Proteins/metabolism , Computational Biology/methods , Neural Networks, Computer
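The abstract above ends with a voting step over the per-view multi-label classifiers but does not say which voting rule is used. The sketch below assumes a simple per-label majority vote; the function name `vote` and the `min_votes` parameter are hypothetical and are not taken from iDeepMV.

```python
def vote(predictions, min_votes=None):
    """Combine per-view binary multi-label predictions by voting.

    predictions: one list of 0/1 labels per view, all of equal length.
    min_votes: votes needed to switch a label on; defaults to a strict
    majority of the views (an assumed rule, not necessarily the paper's).
    """
    n_views = len(predictions)
    if min_votes is None:
        min_votes = n_views // 2 + 1
    n_labels = len(predictions[0])
    return [
        1 if sum(view[j] for view in predictions) >= min_votes else 0
        for j in range(n_labels)
    ]
```

With three views, a label is kept when at least two views predict it; lowering `min_votes` to 1 turns the rule into a union of the views' predictions.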
9.
J Xray Sci Technol ; 29(1): 171-183, 2021.
Article in English | MEDLINE | ID: mdl-33325448

ABSTRACT

OBJECTIVE: To investigate the efficiency of a radiomics signature to preoperatively predict histological features of aggressive extrathyroidal extension (ETE) in papillary thyroid carcinoma (PTC) with biparametric magnetic resonance imaging findings. MATERIALS AND METHODS: Sixty PTC patients with preoperative MR including T2WI and T2WI-fat-suppression (T2WI-FS) were retrospectively analyzed. Among them, 35 had ETE and 25 did not. Pre-contrast T2WI and T2WI-FS images depicting the largest section of tumor were selected. Tumor regions were manually segmented using ITK-SNAP software and 107 radiomics features were computed from the segmented regions using the open Pyradiomics package. Then, a random forest model was built for classification, in which the datasets were partitioned randomly 10 times for training and testing with a ratio of 1:1. Furthermore, forward greedy feature selection based on feature importance was adopted to reduce model overfitting. Classification accuracy was estimated on the test set using area under ROC curve (AUC). RESULTS: The model using T2WI-FS image features yields much higher performance than the model using T2WI features (AUC = 0.906 vs. 0.760 using 107 features). Among the top 10 important features of T2WI and T2WI-FS, there are 5 common features. After feature selection, the models trained using the top 2 features of T2WI and the top 6 features of T2WI-FS achieve AUC 0.845 and 0.928, respectively. Combining features computed from T2WI and T2WI-FS, model performance decreases slightly (AUC = 0.882 based on all features and AUC = 0.913 based on top features after feature selection). Adjusting the hyperparameters of the random forest model has negligible influence on the model performance, with mean AUC = 0.907 for T2WI-FS images. CONCLUSIONS: Radiomics features based on pre-contrast T2WI and T2WI-FS are helpful to predict aggressive ETE in PTC. Particularly, the model trained using the optimally selected T2WI-FS image features yields the best classification performance. The most important features relate to lesion size and the texture heterogeneity of the tumor region.


Subjects
Magnetic Resonance Imaging , Thyroid Neoplasms , Humans , Pilot Projects , ROC Curve , Retrospective Studies , Thyroid Cancer, Papillary/diagnostic imaging , Thyroid Neoplasms/diagnostic imaging
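The study above reports performance as area under the ROC curve (AUC). As a reminder of what that number measures, here is a minimal sketch that computes AUC as the fraction of positive/negative pairs ranked correctly, with ties counted as half. The function name `roc_auc` and the example scores are hypothetical; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    labels: 0/1 ground truth (1 = positive, e.g. ETE present).
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of tens of patients; rank-based formulas give the same value faster on large datasets.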
10.
Radiol Med ; 125(9): 870-876, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32249390

ABSTRACT

PURPOSE: The purpose of this study was to assess and compare the diagnostic performances of preoperative ultrasonography (US) and magnetic resonance imaging (MRI) in predicting extrathyroidal extension (ETE) in patients with papillary thyroid carcinoma (PTC). MATERIALS AND METHODS: This retrospective study was approved by our institutional review board. Preoperative US and MRI were performed on 225 patients who underwent surgery for PTC between May 2014 and December 2018. The US and MRI features of ETE of each case were retrospectively and independently investigated by two radiologists. The diagnostic performances of US and MRI, including their sensitivity, specificity, positive predictive values (PPV), and negative predictive values (NPV) for ETE, and their accuracy in predicting ETE were analyzed. RESULTS: Higher sensitivity and NPV in predicting minimal ETE were observed in US (87.5% and 76.2%, respectively) compared with MRI (71.3% and 61.7%, respectively) (p = 0.006 and p = 0.046, respectively). Meanwhile, MRI (85.4%) showed higher sensitivity than US (66.7%) in assessing extensive ETE (p = 0.005). MRI also showed significantly higher specificity and PPV than US in assessing overall ETE (p = 0.025 and p = 0.025, respectively). CONCLUSION: Preoperative US should be used as the first line in predicting minimal ETE, and MRI should be added in extensive ETE assessment. Compared with US, MRI had higher specificity and PPV in detecting the overall ETE of PTC.


Subjects
Magnetic Resonance Imaging/methods , Preoperative Care/methods , Thyroid Cancer, Papillary/diagnostic imaging , Thyroid Neoplasms/diagnostic imaging , Ultrasonography/methods , Adult , Aged , Female , Humans , Male , Middle Aged , Neoplasm Invasiveness , Retrospective Studies , Thyroid Cancer, Papillary/pathology , Thyroid Neoplasms/pathology , Young Adult
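The sensitivity, specificity, PPV and NPV compared in the abstract above all derive from the same 2x2 table of imaging result against surgical pathology. The sketch below shows the standard definitions; the function name `diagnostic_metrics` is hypothetical and any counts passed to it here are illustrative, not data from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic-accuracy measures from a 2x2 confusion table.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly detected
        "specificity": tn / (tn + fp),  # negatives correctly ruled out
        "ppv": tp / (tp + fp),          # positive calls that are correct
        "npv": tn / (tn + fn),          # negative calls that are correct
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }
```

Sensitivity and specificity describe the test itself, whereas PPV and NPV also depend on how common the condition is in the cohort, which is why the two pairs can move in different directions when modalities are compared.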
11.
Knowl Based Syst ; 130: 33-50, 2017 Aug 15.
Article in English | MEDLINE | ID: mdl-30050232

ABSTRACT

We study a novel fuzzy clustering method to improve the segmentation performance on the target texture image by leveraging the knowledge from a prior texture image. Two knowledge transfer mechanisms, i.e. knowledge-leveraged prototype transfer (KL-PT) and knowledge-leveraged prototype matching (KL-PM) are first introduced as the bases. Applying them, the knowledge-leveraged transfer fuzzy C-means (KL-TFCM) method and its three-stage-interlinked framework, including knowledge extraction, knowledge matching, and knowledge utilization, are developed. There are two specific versions: KL-TFCM-c and KL-TFCM-f, i.e. the so-called crisp and flexible forms, which use the strategies of maximum matching degree and weighted sum, respectively. The significance of our work is fourfold: 1) Owing to the adjustability of referable degree between the source and target domains, KL-PT is capable of appropriately learning the insightful knowledge, i.e. the cluster prototypes, from the source domain; 2) KL-PM is able to self-adaptively determine the reasonable pairwise relationships of cluster prototypes between the source and target domains, even if the numbers of clusters differ in the two domains; 3) The joint action of KL-PM and KL-PT can effectively resolve the data inconsistency and heterogeneity between the source and target domains, e.g. the data distribution diversity and cluster number difference. Thus, using the three-stage-based knowledge transfer, the beneficial knowledge from the source domain can be extensively, self-adaptively leveraged in the target domain. As evidence of this, both KL-TFCM-c and KL-TFCM-f surpass many existing clustering methods in texture image segmentation; and 4) In the case of different cluster numbers between the source and target domains, KL-TFCM-f proves higher clustering effectiveness and segmentation performance than does KL-TFCM-c.
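KL-TFCM builds on fuzzy C-means (FCM), in which each pixel or data point receives a graded membership in every cluster prototype. The sketch below shows only the classical FCM membership update, not the knowledge-leveraged transfer terms of the paper; the function name `fcm_memberships` and the default fuzzifier `m = 2.0` are assumptions for illustration.

```python
def fcm_memberships(point, prototypes, m=2.0):
    """Classical fuzzy C-means memberships of one point to each prototype.

    Uses u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the
    Euclidean distance from the point to prototype i and m > 1 is the
    fuzzifier. The returned memberships sum to 1.
    """
    dists = [
        sum((x - c) ** 2 for x, c in zip(point, proto)) ** 0.5
        for proto in prototypes
    ]
    # A point lying exactly on one or more prototypes belongs only to them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(dists))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]
```

A larger `m` flattens the memberships toward uniform, while values close to 1 approach a hard assignment to the nearest prototype.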

12.
AJR Am J Roentgenol ; 207(4): 859-864, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27340876

ABSTRACT

OBJECTIVE: The purpose of this article is to construct classifier models using machine learning algorithms and to evaluate their diagnostic performances for differentiating malignant from benign thyroid nodules. MATERIALS AND METHODS: This study included 970 histopathologically proven thyroid nodules in 970 patients. Two radiologists retrospectively reviewed ultrasound images, and nodules were graded according to a five-tier sonographic scoring system. Statistically significant variables based on an experienced radiologist's observations were obtained with attribute optimization using fivefold cross-validation and applied as the input nodes to build models for predicting malignancy of nodules. The performances of the machine learning algorithms and radiologists were compared using ROC curve analysis. RESULTS: Diagnosis by the experienced radiologist achieved the highest predictive accuracy of 88.66% with a specificity of 85.33%, whereas the radial basis function (RBF)-neural network (NN) achieved the highest sensitivity of 92.31%. The AUC value for diagnosis by the experienced radiologist (AUC = 0.9135) was greater than those for diagnosis by the less experienced radiologist, the naïve Bayes classifier, the support vector machine, and the RBF-NN (AUC = 0.8492, 0.8811, 0.9033, and 0.9103, respectively; p < 0.05). CONCLUSION: The machine learning algorithms underperformed with respect to the experienced radiologist's readings used to construct them, and the RBF-NN outperformed the other machine learning algorithm models.

13.
Clin Nephrol ; 84(5): 255-61, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26396099

ABSTRACT

OBJECTIVES: Despite significant advances in the epidemiology of acute kidney injury (AKI), there is no reliable method to predict renal recovery. Using acute kidney injury network (AKIN) criteria, we tested whether higher urinary L-FABP (uL-FABP) concentrations in the patients with AKIN stage 3 (AKIN3) after nephrology consultation would predict failure to recover. METHODS: This is a prospective cohort study of 114 patients with AKIN3 at WuXi People's Hospital from August 2011 to July 2014. The levels of serum creatinine, urine creatinine, and uL-FABP were obtained at the time of nephrology consultation. RESULTS: Patients who recovered had lower uL-FABP than those who failed to recover at time of nephrology consultation (71.42 (11.1 - 118.3) vs. 335.18 (103.9 - 422.3) ng/mg × creatinine, p < 0.001). Urinary L-FABP predicted failure to recover with an area under the receiver operating characteristic curve of 0.906 (95% CI 0.837 - 0.953). A clinical model using age, APACHE II score and acute tubular necrosis severity scoring index (ATN-ISS) predicted failure to recover with an area under the curve of 0.825 (95% CI 0.743 - 0.890). When uL-FABP was compared to the clinical model, the reclassification of risk of renal recovery had significantly improved by 35.1%. CONCLUSION: Urinary L-FABP appears to be a useful biomarker to predict failure to recover during hospitalization in the cohort of patients with AKIN3.


Subjects
Acute Kidney Injury/urine , Biomarkers/urine , Fatty Acid-Binding Proteins/urine , Cohort Studies , Creatinine/blood , Creatinine/urine , Female , Humans , Kidney Function Tests , Kidney Tubular Necrosis, Acute/blood , Kidney Tubular Necrosis, Acute/urine , Male , Prospective Studies , ROC Curve
14.
J Biomol Struct Dyn ; : 1-13, 2024 Feb 09.
Article in English | MEDLINE | ID: mdl-38334134

ABSTRACT

Carbonylated sites are the determining factors for functional changes or deletions in carbonylated proteins, so identifying carbonylated sites is essential for understanding the process of protein carbonylation and exploring the pathogenesis of related diseases. The current wet experimental methods for predicting carbonylated modification sites are not only expensive and time-consuming, but also have limited protein processing capabilities and cannot meet the needs of researchers. The identification of carbonylated sites using computational methods not only improves the functional characterization of proteins, but also provides researchers with free tools for predicting carbonylated sites. Therefore, it is essential to establish a model using computational methods that can accurately predict protein carbonylated sites. In this study, a prediction model, CarSitePred, is proposed to identify carbonylation sites. In CarSitePred, specific location amino acid hydrophobic hydrophilic, one-to-one numerical conversion of amino acids, and AlexNet convolutional neural networks convert preprocessed carbonylated sequences into valid numerical features. The K-means Normal Distribution-based Undersampling Algorithm (KNDUA) and Localized Normal Distribution Oversampling Technology (LNDOT) were first proposed and employed to balance the K, P, R and T carbonylation training dataset. For the first time, carbonylation modification sites were transformed into the form of images and directly inputted into the AlexNet convolutional neural network to extract features for fitting SVM classifiers. The 10-fold cross-validation and independent testing results show that CarSitePred achieves better prediction performance than the best currently available prediction models. Availability: https://github.com/zuoyun123/CarSitePred. Communicated by Ramaswamy H. Sarma.

15.
Article in English | MEDLINE | ID: mdl-38015669

ABSTRACT

As an extremely significant class of biocatalysts, enzymes play an important role in the process of biological reproduction and metabolism. Therefore, the prediction of enzyme function is of great significance in biomedical fields. Recently, computational methods for predicting enzyme function have been proposed, and they effectively reduce the cost of enzyme function prediction. However, existing methods still have deficiencies in effectively mining the discriminant information for enzyme function recognition. In this study, we present MVDINET, a novel method for multi-level enzyme function prediction. First, the initial multi-view feature data is extracted from the enzyme sequence. Then, the above initial views are fed into various deep specific network modules to learn the depth-specificity information. Further, a deep view interaction network is designed to extract the interaction information. Finally, the specificity information and interaction information are fed into a multi-view adaptively weighted classification. We comprehensively evaluate MVDINET on benchmark datasets and demonstrate that MVDINET is superior to existing methods.


Subjects
Benchmarking , Simulation Training , Reproduction
16.
Article in English | MEDLINE | ID: mdl-36001521

ABSTRACT

A growing number of studies show that the human microbiome plays a vital role in human health and can be a crucial factor in predicting certain human diseases. However, microbiome data are often characterized by the limited samples and high-dimensional features, which pose a great challenge for machine learning methods. Therefore, this paper proposes a novel ensemble deep learning disease prediction method that combines unsupervised and supervised learning paradigms. First, unsupervised deep learning methods are used to learn the potential representation of the sample. Afterwards, the disease scoring strategy is developed based on the deep representations as the informative features for ensemble analysis. To ensure the optimal ensemble, a score selection mechanism is constructed, and performance boosting features are engaged with the original sample. Finally, the composite features are trained with gradient boosting classifier for health status decision. For case study, the ensemble deep learning flowchart has been demonstrated on six public datasets extracted from the human microbiome profiling. The results show that compared with the existing algorithms, our framework achieves better performance on disease prediction.


Subjects
Deep Learning , Microbiota , Humans , Metagenomics , Algorithms , Machine Learning , Microbiota/genetics
17.
Article in English | MEDLINE | ID: mdl-37028382

ABSTRACT

Electroencephalogram (EEG) signals are an essential tool for the detection of epilepsy. Because of the complex time series and frequency features of EEG signals, traditional feature extraction methods have difficulty meeting the requirements of recognition performance. The tunable Q-factor wavelet transform (TQWT), which is a constant-Q transform that is easily invertible and modestly oversampled, has been successfully used for feature extraction of EEG signals. Because the constant-Q is set in advance and cannot be optimized, further applications of the TQWT are restricted. To solve this problem, the revised tunable Q-factor wavelet transform (RTQWT) is proposed in this paper. RTQWT is based on the weighted normalized entropy and overcomes the problems of a nontunable Q-factor and the lack of an optimized tunable criterion. In contrast to the continuous wavelet transform and the raw tunable Q-factor wavelet transform, the wavelet transform corresponding to the revised Q-factor, i.e., RTQWT, is sufficiently better adapted to the nonstationary nature of EEG signals. Therefore, the precise and specific characteristic subspaces obtained can improve the classification accuracy of EEG signals. The classification of the extracted features was performed using the decision tree, linear discriminant, naive Bayes, SVM and KNN classifiers. The performance of the new approach was tested by evaluating the accuracies of five time-frequency distributions: FT, EMD, DWT, CWT and TQWT. The experiments showed that the RTQWT proposed in this paper can be used to extract detailed features more effectively and improve the classification accuracy of EEG signals.

18.
Article in English | MEDLINE | ID: mdl-37216234

ABSTRACT

Multiview data are widespread in real-world applications, and multiview clustering is a commonly used technique for mining such data effectively. Most existing algorithms perform multiview clustering by mining the common hidden space shared by the views. Although this strategy is effective, two challenges still need to be addressed to further improve performance. The first is how to design an efficient hidden space learning method so that the learned hidden spaces contain both the shared and the view-specific information of multiview data. The second is how to design an efficient mechanism that makes the learned hidden space more suitable for the clustering task. In this study, a novel one-step multiview fuzzy clustering (OMFC-CS) method is proposed to address the two challenges through collaborative learning between the common and specific space information. To tackle the first challenge, we propose a mechanism based on matrix factorization that extracts the common and specific information simultaneously. For the second challenge, we design a one-step learning framework that integrates the learning of common and specific spaces with the learning of fuzzy partitions. The integration is achieved by performing the two learning processes alternately so that each benefits the other. Furthermore, a Shannon entropy strategy is introduced to obtain the optimal view weight assignment during clustering. Experimental results on benchmark multiview datasets demonstrate that the proposed OMFC-CS outperforms many existing methods.
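
The abstract above does not give the exact objective behind the Shannon entropy strategy. A common form of entropy-regularized view weighting, shown here as a sketch under that assumption, minimizes sum_k w_k * L_k + lam * sum_k w_k * ln(w_k) subject to sum_k w_k = 1, which has the closed-form solution w_k proportional to exp(-L_k / lam). The names `entropy_view_weights`, `view_losses` and `lam` are hypothetical, and the loss values are made up.

```python
import math

def entropy_view_weights(view_losses, lam=1.0):
    """Closed-form weights for the entropy-regularized objective (assumed form).

    Minimizes sum_k w_k * L_k + lam * sum_k w_k * ln(w_k) with sum_k w_k = 1.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    smallest = min(view_losses)  # shift losses for numerical stability
    scores = [math.exp(-(loss - smallest) / lam) for loss in view_losses]
    total = sum(scores)
    return [s / total for s in scores]

# Toy per-view clustering losses: a lower loss should receive a higher weight.
weights = entropy_view_weights([0.2, 0.5, 1.4], lam=0.5)
```

A larger `lam` pushes the weights toward uniform, while a smaller `lam` concentrates weight on the best-fitting view.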

19.
Comput Methods Programs Biomed ; 226: 107099, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36116398

ABSTRACT

BACKGROUND AND OBJECTIVE: Deep learning-based methods for fast target segmentation of magnetic resonance imaging (MRI) have become increasingly popular in recent years. The success of deep learning methods in medical image segmentation generally relies on a large amount of labeled data, and the time-consuming, labor-intensive process of data annotation is a major challenge. The aim of this work is to enhance the segmentation of MR images with a semi-supervised learning method that uses a small amount of labeled data and a large amount of unlabeled data. METHODS: To exploit the useful information in the unlabeled data, we designed a method in which a Dual-Teacher structure, consisting of a CNN and a transformer, simultaneously guides the Student segmentation model; together they form the main network. Both the Teacher A and Student models are CNNs, and the TA-S module they form is a mean teacher structure with added data noise. In the TB-S module, formed by the Student and Teacher B models, the CNN and transformer backbones capture the local and global information of the image, respectively, and create pseudo labels for each other to perform cross-supervision. The Dual-Teacher guides the Student through synchronous training, and the models rectify and exchange knowledge with each other through consistency regularization constraints, which makes better use of the valid information in the unlabeled data. In addition, the segmentation predictions of Teacher A and Student, and of Teacher A and Teacher B, are screened by uncertainty assessment during training to enhance the prediction accuracy and generalization of the model. The method trains the combined structure of the TA-S and TB-S modules simultaneously to jointly guide the optimization of the Student model and improve its segmentation ability.
RESULTS: We evaluated the proposed method on a publicly available MRI dataset from a cardiac segmentation competition organized by MICCAI in 2017. Compared with several existing state-of-the-art semi-supervised segmentation methods, the method achieves better segmentation results, with a Dice coefficient and Hausdorff distance (HD) of 0.878 and 4.9 mm using a training set containing only 10% labeled data, and 0.886 and 5.0 mm using 20% labeled data. CONCLUSION: This method fuses a CNN and a transformer to design a new Teacher-Student semi-supervised learning optimization strategy, which greatly improves the utilization of large numbers of unlabeled medical images and the effectiveness of the model's segmentation results.
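
Two standard ingredients named in the abstract above can be sketched briefly: the mean teacher update, in which teacher weights are an exponential moving average (EMA) of the student weights, and the Dice coefficient used for evaluation. This is a minimal illustration on flat Python lists, not the authors' implementation. The function names, the `alpha` value and the toy inputs are assumptions.

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher step: move each teacher weight toward the matching student weight."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]

def dice_coefficient(pred, target):
    """Dice overlap 2|A∩B| / (|A| + |B|) for two flat binary masks (0/1 values)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    size = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if size == 0 else 2.0 * intersection / size

# Toy values: two "weights" and two four-pixel masks.
teacher = ema_update([0.0, 1.0], [1.0, 1.0], alpha=0.9)
dice = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

In a real training loop the EMA step would run over every parameter tensor of the Teacher A network after each Student optimizer step.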


Subjects
Image Processing, Computer-Assisted; Neural Networks, Computer; Humans; Image Processing, Computer-Assisted/methods; Uncertainty; Supervised Machine Learning; Magnetic Resonance Imaging/methods
20.
ACS Synth Biol ; 11(8): 2726-2740, 2022 08 19.
Article in English | MEDLINE | ID: mdl-35877551

ABSTRACT

The ribosome binding site (RBS) is a crucial element regulating translation. However, RBS activity is hard to predict because it is strongly affected by possible local secondary structure, that is, by context dependence. Using the Flowseq technique, over 20 000 RBS variants were sorted and sequenced, and the translation of multiple genes under the same RBS was quantitatively characterized to evaluate the context dependence of each RBS variant in E. coli. Two regions of the RBS, (-7 to -2) and (-17 to -12), were predicted to have a higher probability of pairing with each other and thereby slowing translation initiation. Associations between phenotypes and the intrinsic factors suspected to affect translation efficiency and context dependence of the RBS, including nucleotide bias at each position, free energy, and conservation, were disentangled. The results showed that translation efficiency was influenced more significantly by conservation of the SD region (-16 to -8), while an AC-rich spacer region (-7 to -1) was associated with low context dependence. We confirmed these characteristics using a series of synthesized RBSs. The average correlation between multiple reporters was significantly higher for RBSs with an AC-rich spacer (0.714) than for those with a GU-rich spacer (0.286). Overall, we propose general design criteria to improve programmability and minimize context dependence of the RBS. The characteristics uncovered here can be adapted to other bacteria for fine-tuning target-gene expression.


Subjects
Escherichia coli; Ribosomes; Bacteria/genetics; Base Sequence; Binding Sites/genetics; Escherichia coli/genetics; Escherichia coli/metabolism; Protein Biosynthesis/genetics; Ribosomes/metabolism