Results 1 - 20 of 24
1.
Langmuir ; 40(20): 10726-10736, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38717961

ABSTRACT

In renewable energy applications, the oxygen reduction reaction (ORR) and the oxygen evolution reaction (OER) are two crucial reactions. Single-atom catalysts (SACs) based on metal-doped graphene have been widely employed due to their high activity and high atom utilization efficiency. However, the catalytic activity is significantly influenced by the choice of metal and its local coordination, making it challenging to screen candidates efficiently through either experiments or density functional theory (DFT) calculations. To address this issue, this study combined DFT calculations with machine learning (DFT-ML) to investigate rare earth-modified carbon-based (RENxC6-x) electrocatalysts. Based on computational data from 75 catalysts, we trained two ML models to capture the underlying relationships between physical properties and overpotential. Subsequently, the candidate catalysts were screened, leading to the discovery of four ORR catalysts, nine OER catalysts, and five bifunctional electrocatalysts, all of which were validated for their stability. Lastly, by integrating the ML models with the SHAP analysis framework, we revealed the influence of atomic radius, Pauling electronegativity, and other features on the catalytic activity. Additionally, we analyzed the physicochemical properties of potential catalysts through DFT calculations. The DFT-ML approach can guide the design and synthesis of candidate catalysts in subsequent studies.

2.
Comput Biol Med ; 173: 108390, 2024 May.
Article in English | MEDLINE | ID: mdl-38569234

ABSTRACT

Radiotherapy is one of the primary treatment methods for tumors, but the organ movement caused by respiration limits its accuracy. Recently, 3D imaging from a single X-ray projection has received extensive attention as a promising approach to address this issue. However, current methods can only reconstruct 3D images without directly locating the tumor and are only validated for fixed-angle imaging, which fails to fully meet the requirements of motion control in radiotherapy. In this study, a novel imaging method RT-SRTS is proposed which integrates 3D imaging and tumor segmentation into one network based on multi-task learning (MTL) and achieves real-time simultaneous 3D reconstruction and tumor segmentation from a single X-ray projection at any angle. Furthermore, the attention enhanced calibrator (AEC) and uncertain-region elaboration (URE) modules have been proposed to aid feature extraction and improve segmentation accuracy. The proposed method was evaluated on fifteen patient cases and compared with three state-of-the-art methods. It not only delivers superior 3D reconstruction but also demonstrates commendable tumor segmentation results. Simultaneous reconstruction and segmentation can be completed in approximately 70 ms, significantly faster than the required time threshold for real-time tumor tracking. The efficacies of both AEC and URE have also been validated in ablation studies. The code of work is available at https://github.com/ZywooSimple/RT-SRTS.


Subject(s)
Imaging, Three-Dimensional , Neoplasms , Humans , Imaging, Three-Dimensional/methods , X-Rays , Radiography , Neoplasms/diagnostic imaging , Respiration , Image Processing, Computer-Assisted/methods
3.
IEEE J Biomed Health Inform ; 28(4): 2314-2325, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38265897

ABSTRACT

In the biomedical literature, entities are often distributed across multiple sentences and exhibit complex interactions. As the volume of literature has increased dramatically, it has become impractical to manually extract and maintain biomedical knowledge, which would entail enormous costs. Fortunately, document-level relation extraction can capture associations between entities from complex text, helping researchers efficiently mine structured knowledge from the vast medical literature. However, effectively synthesizing rich global information from context while accurately capturing local dependencies between entities is still a great challenge. In this paper, we propose a Local to Global Graphical Reasoning framework (LoGo-GR) based on a novel Biased Graph Attention mechanism (B-GAT). It learns global context features from a mention-level interaction graph and local relation path dependencies from an entity-level path graph, and combines global and local reasoning to capture complex interactions between entities in document-level text. In particular, B-GAT integrates structural dependencies into the standard graph attention mechanism (GAT) as attention biases to adaptively guide information aggregation in graphical reasoning. We evaluate our method on three public biomedical document-level datasets: Drug-Mutation Interaction (DV), Chemical-induced Disease (CDR), and Gene-Disease Association (GDA). LoGo-GR performs strongly and stably compared to other state-of-the-art methods: it achieves state-of-the-art performance with 96.14%-97.39% F1 on the DV dataset, and 68.89% F1 and 84.22% F1 on the CDR and GDA datasets, respectively. In addition, LoGo-GR also performs well on the general-domain document-level relation extraction dataset DocRED, which indicates that it is an effective and robust document-level relation extraction framework.


Subject(s)
Data Mining , Humans , Data Mining/methods
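
The abstract of entry 3 describes B-GAT as adding structural dependencies to standard graph attention in the form of attention biases. A minimal sketch of that general idea follows; it is not the authors' implementation. The function name `biased_attention`, the scalar scores, and the bias values are all hypothetical, and the only assumption is that a bias is added to each raw attention score before the softmax over a node's neighbors.

```python
import math


def biased_attention(scores, bias, neighbors):
    """Softmax over neighbor scores after adding a structural bias term.

    scores, bias: dicts mapping neighbor id -> float (illustrative values only).
    neighbors: list of neighbor ids to normalise over.
    """
    logits = [scores[j] + bias[j] for j in neighbors]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return {j: e / total for j, e in zip(neighbors, exps)}


# Hypothetical node with three neighbors: equal raw scores, but neighbor 2
# receives a positive structural bias, so it gets a larger attention weight.
weights = biased_attention(
    scores={0: 1.0, 1: 1.0, 2: 1.0},
    bias={0: 0.0, 1: 0.0, 2: math.log(2.0)},
    neighbors=[0, 1, 2],
)
```

With equal raw scores, the bias alone decides how attention is shared, which is the sense in which a bias term can "guide" aggregation.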
4.
Sci Total Environ ; 914: 169801, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38184264

ABSTRACT

With the potential to cause millions of deaths, PM2.5 pollution has become a global concern. In Southeast Asia, the Mekong River Basin (MRB) is experiencing heavy PM2.5 pollution and the existing PM2.5 studies in the MRB are limited in terms of accuracy and spatiotemporal coverage. To achieve high-accuracy and long-term PM2.5 monitoring of the MRB, fused aerosol optical depth (AOD) data and multi-source auxiliary data are fed into a stacking model to estimate PM2.5 concentrations. The proposed stacking model takes advantage of convolutional neural network (CNN) and Light Gradient Boosting Machine (LightGBM) models and can well represent the spatiotemporal heterogeneity of the PM2.5-AOD relationship. In the cross-validation (CV), comparison with CNN and LightGBM models shows that the stacking model can better suppress overfitting, with a higher coefficient of determination (R2) of 0.92, a lower root mean square error (RMSE) of 5.58 µg/m3, and a lower mean absolute error (MAE) of 3.44 µg/m3. For the first time, the high-accuracy PM2.5 dataset reveals spatially and temporally continuous PM2.5 pollution and variations in the MRB from 2015 to 2022. Moreover, the spatiotemporal variations of annual and monthly PM2.5 pollution are also investigated at the regional and national scales. The dataset will contribute to the analysis of the causes of PM2.5 pollution and the development of mitigation policies in the MRB.
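
Entry 4 reports R2, RMSE, and MAE for its cross-validation. For readers unfamiliar with these metrics, the sketch below computes all three from their standard definitions. The sample values are invented for illustration and have no connection to the PM2.5 dataset; `regression_metrics` is a hypothetical helper name.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired observations and estimates."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    rmse = math.sqrt(ss_res / n)                      # root mean square error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                        # coefficient of determination
    return {"r2": r2, "rmse": rmse, "mae": mae}


# Made-up observed and estimated concentrations, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, estimated)
```

RMSE penalizes large errors more heavily than MAE, which is why an RMSE above the MAE (as in entry 4) is expected.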

5.
Phys Chem Chem Phys ; 26(3): 2284-2290, 2024 Jan 17.
Article in English | MEDLINE | ID: mdl-38165715

ABSTRACT

The oxygen reduction reaction (ORR) on the oxygen electrode plays a critical role in rechargeable metal-air batteries, and the development of electrochemical energy storage and conversion technologies for the ORR is of great significance. In this study, the catalytic performance of rare earth-doped graphene (EuNxC6-x-Gra) as an electrocatalyst for the ORR was investigated. The results showed that a majority of the catalysts exhibited good ORR catalytic activity under acidic conditions, with some approaching or even surpassing commercial Pt-based catalysts (ηORR = 0.45 V). Particularly, EuN2C4-2-Gra demonstrated an ηORR of 0.38 V. It has been observed that the f-band center of Eu atoms increases with an increasing number of N atoms, and the charge distribution exhibits a "U" shape. There is a decreasing trend from N0 to N3 and an increasing trend from N4 to N6. By incorporating the proportional relationship of the adsorption free energies of reaction intermediates (ΔG*ads), a volcano diagram was constructed to rapidly assess catalytic activity. Finally, an intrinsic characteristic descriptor φ was formulated to quantitatively describe the relationship between φ and ηORR, providing a new tool for predicting and designing catalysts. This will provide guidance for the development and design of high-performance rare earth single atom catalysts.

6.
Math Biosci Eng ; 20(11): 20188-20212, 2023 Nov 06.
Article in English | MEDLINE | ID: mdl-38052642

ABSTRACT

A membrane protein's functions are significantly associated with its type, so it is crucial to identify the types of membrane proteins. Conventional computational methods for identifying the species of membrane proteins tend to ignore two issues: High-order correlation among membrane proteins and the scenarios of multi-modal representations of membrane proteins, which leads to information loss. To tackle those two issues, we proposed a deep residual hypergraph neural network (DRHGNN), which enhances the hypergraph neural network (HGNN) with initial residual and identity mapping in this paper. We carried out extensive experiments on four benchmark datasets of membrane proteins. In the meantime, we compared the DRHGNN with recently developed advanced methods. Experimental results showed the better performance of DRHGNN on the membrane protein classification task on four datasets. Experiments also showed that DRHGNN can handle the over-smoothing issue with the increase of the number of model layers compared with HGNN. The code is available at https://github.com/yunfighting/Identification-of-Membrane-Protein-Types-via-deep-residual-hypergraph-neural-network.


Subject(s)
Membrane Proteins , Neural Networks, Computer
7.
Article in English | MEDLINE | ID: mdl-37610904

ABSTRACT

Predicting G protein-coupled receptor (GPCR)-ligand binding affinity plays a crucial role in drug development. However, determining GPCR-ligand binding affinities is time-consuming and resource-intensive. Although many studies have used data-driven methods to predict binding affinity, most of these methods required the protein 3D structure, which was often unknown. Moreover, some of these studies only considered the sequence characteristics of the protein, ignoring its secondary structure. The number of known GPCRs available for affinity prediction is only a few thousand, which is insufficient for deep learning training. Therefore, this study aimed to propose a deep transfer learning method called TrGPCR, which used dynamic transfer learning to solve the problem of insufficient GPCR data. We used the Binding Database (BindingDB) as the source domain and the GLASS (GPCR-Ligand Association) database as the target domain. We also introduced protein secondary structures, called pockets, as features to predict binding affinities. Compared with DeepDTA, our model improved by 5.2% on RMSE (root mean square error) and 4.5% on MAE (mean absolute error).

8.
Comput Biol Med ; 164: 107094, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37459792

ABSTRACT

In recent years, research in the field of bioinformatics has focused on predicting the raw sequences of proteins, and some scholars consider DNA-binding protein prediction as a classification task. Many statistical and machine learning-based methods have been widely used in DNA-binding proteins research. The aforementioned methods are indeed more efficient than those based on manual classification, but there is still room for improvement in terms of prediction accuracy and speed. In this study, researchers used Average Blocks, Discrete Cosine Transform, Discrete Wavelet Transform, Global encoding, Normalized Moreau-Broto Autocorrelation and Pseudo position-specific scoring matrix to extract evolutionary features. A dynamic deep network based on lifelong learning architecture was then proposed in order to fuse six features and thus allow for more efficient classification of DNA-binding proteins. The multi-feature fusion allows for a more accurate description of the desired protein information than single features. This model offers a fresh perspective on the dichotomous classification problem in bioinformatics and broadens the application field of lifelong learning. The researchers ran trials on three datasets and contrasted them with other classification techniques to show the model's effectiveness in this study. The findings demonstrated that the model used in this research was superior to other approaches in terms of single-sample specificity (81.0%, 83.0%) and single-sample sensitivity (82.4%, 90.7%), and achieves high accuracy on the benchmark dataset (88.4%, 80.0%, and 76.6%).


Subject(s)
DNA-Binding Proteins , Machine Learning , Protein Binding , DNA-Binding Proteins/metabolism , Computational Biology/methods , DNA
9.
J Biomed Inform ; 144: 104445, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37467835

ABSTRACT

In biomedical literature, cross-sentence texts often express rich knowledge, and extracting interaction relations between entities from cross-sentence texts is of great significance to biomedical research. However, compared with a single sentence, cross-sentence text has a longer sequence length, so research on cross-sentence information extraction should focus more on learning context-dependency structural information. It remains a challenge to handle global dependencies and structural information of long sequences effectively, and graph-oriented modeling methods have received increasing attention. In this paper, we propose a new graph attention network guided by syntactic dependency relationships (SR-GAT) for extracting biomedical relations from cross-sentence text. It allows each node to attend to other nodes in its neighborhood, regardless of the sequence length. The attention weight between nodes is given by a syntactic relation graph probability network (SR-GPR), which encodes the syntactic dependency between nodes and guides the graph attention mechanism to learn information about the dependency structure. The learned feature representation retains information about node-to-node syntactic dependencies and can further discover global dependencies effectively. Experimental results on a publicly available biomedical dataset demonstrate that our method achieves state-of-the-art performance while requiring significantly fewer computational resources. Specifically, in the "drug-mutation" relation extraction task, our method achieves an accuracy of 93.78% for binary classification and 92.14% for multi-class classification. In the "drug-gene-mutation" relation extraction task, it achieves an accuracy of 93.22% for binary classification and 92.28% for multi-class classification. Across all relation extraction tasks, our method improves accuracy by an average of 0.49% compared to the best existing model. Furthermore, our method achieved an accuracy of 69.5% in text classification, surpassing most existing models and demonstrating robust generalization across domains without additional fine-tuning.


Subject(s)
Biomedical Research , Language , Information Storage and Retrieval
10.
PLoS One ; 18(6): e0286770, 2023.
Article in English | MEDLINE | ID: mdl-37289704

ABSTRACT

A critical issue in intelligent building control is detecting energy consumption anomalies based on intelligent device status data. The building field is plagued by energy consumption anomalies caused by a number of factors, many of which are associated with one another in apparent temporal relationships. For the detection of abnormalities, most traditional detection methods rely solely on a single variable of energy consumption data and its time series changes. Therefore, they are unable to examine the correlation between the multiple characteristic factors that affect energy consumption anomalies and their relationship in time. The outcomes of anomaly detection are one-sided. To address the above problems, this paper proposes an anomaly detection method based on multivariate time series. Firstly, in order to extract the correlation between different feature variables affecting energy consumption, this paper introduces a graph convolutional network to build an anomaly detection framework. Secondly, as different feature variables have different influences on each other, the framework is enhanced by a graph attention mechanism so that time series features with higher influence on energy consumption are given more attention weights, resulting in better anomaly detection of building energy consumption. Finally, the effectiveness of this paper's method and existing methods for detecting energy consumption anomalies in smart buildings are compared using standard data sets. The experimental results show that the model has better detection accuracy.


Subject(s)
Intelligence , Physiological Phenomena , Time Factors , Physical Phenomena , Records
11.
Ecotoxicol Environ Saf ; 253: 114658, 2023 Mar 15.
Article in English | MEDLINE | ID: mdl-36796207

ABSTRACT

Pesticide residues have serious environmental impacts on rice-based ecosystems. In rice fields, Chironomus kiiensis and Chironomus javanus provide alternative food sources to predatory natural enemies of rice insect pests, especially when pests are low. Chlorantraniliprole is a substitute for older classes of insecticides and has been used extensively to control rice pests. To determine the ecological risks of chlorantraniliprole in rice fields, we evaluated its toxic effects on certain growth, biochemical and molecular parameters in these two chironomids. The toxicity tests were performed by exposing third-instar larvae to a range of concentrations of chlorantraniliprole. LC50 values at 24 h, 48 h, and 10 days showed that chlorantraniliprole was more toxic to C. javanus than to C. kiiensis. Chlorantraniliprole significantly prolonged the larval growth duration, inhibited pupation and emergence, and decreased egg numbers of C. kiiensis and C. javanus at sublethal dosages (LC10 = 1.50 mg/L and LC25 = 3.00 mg/L for C. kiiensis; LC10 = 0.25 mg/L and LC25 = 0.50 mg/L for C. javanus). Sublethal exposure to chlorantraniliprole significantly decreased the activity of the detoxification enzymes carboxylesterase (CarE) and glutathione S-transferases (GSTs) in both C. kiiensis and C. javanus. Sublethal exposure to chlorantraniliprole also markedly inhibited the activity of the antioxidant enzyme peroxidase (POD) in C. kiiensis and POD and catalase (CAT) in C. javanus. Expression levels of 12 genes revealed that detoxification and antioxidant abilities were affected by sublethal exposures to chlorantraniliprole. There were significant changes in the expression levels of seven genes (CarE6, CYP9AU1, CYP6FV2, GSTo1, GSTs1, GSTd2, and POD) in C. kiiensis and ten genes (CarE6, CYP9AU1, CYP6FV2, GSTo1, GSTs1, GSTd2, GSTu1, GSTu2, CAT, and POD) in C. javanus. 
These results provide a comprehensive overview of the differences in chlorantraniliprole toxicity to chironomids, indicating that C. javanus is more susceptible and suitable as an indicator for ecological risk assessment in rice ecosystems.


Subject(s)
Chironomidae , Insecticides , Animals , Antioxidants/pharmacology , Ecosystem , Larva , ortho-Aminobenzoates/toxicity , Insecticides/toxicity
12.
PeerJ Comput Sci ; 9: e1729, 2023.
Article in English | MEDLINE | ID: mdl-38192477

ABSTRACT

The rapid development of the internet has brought about a comprehensive transformation in human life. However, the challenges of cybersecurity are becoming increasingly severe, necessitating the implementation of effective security mechanisms. Cybersecurity situational awareness can effectively assess the network status, facilitating the formulation of better cybersecurity defense strategies. However, due to the low accuracy of existing situational assessment methods, situational assessment remains a challenge. In this study, a new situational assessment method, MSWOA-BiGRU, combining optimization algorithms and temporal neural networks, was proposed. Firstly, a scientific indicator system proposed in this research is used to calculate the values of each indicator. Then, the Analytic Hierarchy Process is used to derive the actual situation values, which serve as labels. Taking into account the temporal nature of network traffic, the BiGRU model is utilized for cybersecurity situational assessment. After integrating time-related features and network traffic characteristics, the situational assessment value is obtained. During the evaluation process, a whale optimization algorithm (MSWOA) improved with a mix of strategies proposed in this study was employed to optimize the model. The performance of the proposed MSWOA-BiGRU model was evaluated on publicly available real network security datasets. Experimental results indicate that compared to traditional optimization algorithms, the optimization performance of MSWOA has seen significant enhancement. Furthermore, MSWOA-BiGRU demonstrates superior performance in cybersecurity situational assessment compared to existing evaluation methods.

13.
Front Cell Dev Biol ; 10: 794413, 2022.
Article in English | MEDLINE | ID: mdl-35356288

ABSTRACT

Calculating and predicting drug-target interactions (DTIs) is a crucial step in novel drug discovery. Many models have improved DTI prediction performance by fusing heterogeneous information, such as drug chemical structure and target protein sequence. However, allocating the weights of heterogeneous information reasonably during fusion is a major challenge. In this paper, we propose a model based on the Q-learning algorithm and Neighborhood Regularized Logistic Matrix Factorization (QLNRLMF) to predict DTIs. First, we obtain three different drug-drug similarity matrices and three different target-target similarity matrices by applying different similarity calculation methods to heterogeneous data, including drug chemical structure, target protein sequence and drug-target interactions. Then, we initialize a set of weights for the drug-drug similarity matrices and for the target-target similarity matrices, and optimize them with the Q-learning algorithm. Once the optimal weights are obtained, a new drug-drug similarity matrix and a new target-target similarity matrix are obtained by linear combination. Finally, the drug-target interaction matrix, the new drug-drug similarity matrix and the new target-target similarity matrix are used as inputs to the Neighborhood Regularized Logistic Matrix Factorization (NRLMF) model to predict DTIs. Compared with six existing methods (NetLapRLS, BLM-NII, WNN-GIP, KBMF2K, CMF, and NRLMF), our proposed method achieved better performance on four benchmark datasets: enzymes (E), nuclear receptors (NR), ion channels (IC) and G protein-coupled receptors (GPCR).
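
Entry 13 describes forming a single similarity matrix as a weighted linear combination of three similarity matrices. The sketch below shows only that combination step, with fixed weights chosen by hand. The Q-learning search that the abstract uses to pick the weights is not reproduced here; the matrices, the weights, the helper name `combine_similarities`, and the choice to normalize the weights to sum to 1 are all assumptions made for illustration.

```python
def combine_similarities(matrices, weights):
    """Weighted element-wise sum of equally sized square matrices."""
    if len(matrices) != len(weights):
        raise ValueError("need one weight per matrix")
    total = sum(weights)
    norm = [w / total for w in weights]  # assumption: weights normalised to sum to 1
    size = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(norm, matrices)) for j in range(size)]
        for i in range(size)
    ]


# Three hypothetical 2x2 drug-drug similarity matrices from different sources.
s1 = [[1.0, 0.2], [0.2, 1.0]]
s2 = [[1.0, 0.6], [0.6, 1.0]]
s3 = [[1.0, 0.4], [0.4, 1.0]]
fused = combine_similarities([s1, s2, s3], weights=[0.5, 0.25, 0.25])
```

The same helper could be applied to the three target-target matrices with their own weight set.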

14.
Sustain Cities Soc ; 80: 103719, 2022 May.
Article in English | MEDLINE | ID: mdl-35127340

ABSTRACT

Gymnasiums, fitness rooms and alike places offer exercise services to citizens, which play positive roles in promoting health and enhancing human immunity. Due to the high metabolic rates during exercises, supplying sufficient ventilation in these places is essential and extremely important especially given the risk of infectious respiratory diseases like COVID-19. Traditional ventilation control methods rely on a single CO2 sensor (often placed at return air duct), which is often difficult to reflect the human metabolic rates accurately, and thus can hardly control the infection risk instantly. Thus, to ensure a safe and healthy environment in places with high metabolism, a real-time metabolism-based ventilation control method is proposed. A computer vision algorithm is developed to monitor human activities (regarding human motion amplitude and speed) and an artificial neural network is established for metabolic prediction. Case studies show that the proposed metabolism-based ventilation control method can reduce the infection probability down to 4.3-6.3% while saving 13% of energy in comparison with the strategy of fixed-fresh-air ventilation. In the development of healthy and sustainable society, gymnasiums and alike exercise places are essential and the proposed ventilation control method is a promising solution to decrease the risk of COVID-19 while preserving features of energy saving and carbon emission reduction.

15.
Biomed Res Int ; 2022: 9044793, 2022.
Article in English | MEDLINE | ID: mdl-35083336

ABSTRACT

DNA contains the genetic information for the synthesis of proteins and RNA, and it is an indispensable substance in living organisms. DNA-binding proteins can bind to DNA to form complexes, and they play an important role in the functions of a variety of biological molecules. With the continuous development of deep learning, applying it to DNA-binding protein prediction can improve the speed and accuracy of DNA-binding protein recognition. In this study, the features and structures of proteins were used to obtain their representations through graph convolutional networks. A protein prediction model based on a graph convolutional network and contact maps was proposed. Tested on the benchmark datasets PDB14189 and PDB2272, the method showed advantages on several evaluation metrics.


Subject(s)
DNA-Binding Proteins , Neural Networks, Computer
16.
IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3126-3134, 2022.
Article in English | MEDLINE | ID: mdl-34780331

ABSTRACT

G protein-coupled receptors (GPCRs) account for about 40% to 50% of drug targets. Many human diseases are related to G protein coupled receptors. Accurate prediction of GPCR interaction is not only essential to understand its structural role, but also helps design more effective drugs. At present, the prediction of GPCR interaction mainly uses machine learning methods. Machine learning methods generally require a large number of independent and identically distributed samples to achieve good results. However, the number of available GPCR samples that have been marked is scarce. Transfer learning has a strong advantage in dealing with such small sample problems. Therefore, this paper proposes a transfer learning method based on sample similarity, using XGBoost as a weak classifier and using the TrAdaBoost algorithm based on JS divergence for data weight initialization to transfer samples to construct a data set. After that, the deep neural network based on the attention mechanism is used for model training. The existing GPCR is used for prediction. In short-distance contact prediction, the accuracy of our method is 0.26 higher than similar methods.


Subject(s)
Algorithms , Receptors, G-Protein-Coupled , Humans , Receptors, G-Protein-Coupled/chemistry , Neural Networks, Computer , Machine Learning
17.
Biomolecules ; 11(12), 2021 Dec 05.
Article in English | MEDLINE | ID: mdl-34944479

ABSTRACT

Numerous studies have confirmed that microRNAs (miRNAs) play a crucial role in complex human diseases. Identifying the relationships between miRNAs and diseases is important for improving the treatment of complex diseases. However, traditional biological experiments have limitations, so computational methods for predicting unknown miRNA-disease associations are urgently needed. In this work, we use the Q-learning algorithm from reinforcement learning to propose the RFLMDA model, in which three submodels (CMF, NRLMF, and LapRLS) are fused via Q-learning to obtain the optimal weight S. The performance of RFLMDA was evaluated through five-fold cross-validation and local validation. The optimal weight obtained is S = (0.1735, 0.2913, 0.5352), and the AUC is 0.9416. Comparison experiments with other methods show that the RFLMDA model performs better. To further validate the predictive performance of RFLMDA, we use eight diseases for local verification and carry out case studies on three common human diseases. All of the top 50 miRNAs related to Colorectal Neoplasms and Breast Neoplasms have been confirmed. Among the top 50 miRNAs related to Colon Neoplasms, Gastric Neoplasms, Pancreatic Neoplasms, Kidney Neoplasms, Esophageal Neoplasms, and Lymphoma, we confirm 47, 41, 49, 46, 46 and 48 miRNAs, respectively.


Subject(s)
Computational Biology/methods , MicroRNAs/genetics , Neoplasms/genetics , Algorithms , Computer Simulation , Genetic Predisposition to Disease , Humans
18.
BMC Bioinformatics ; 22(Suppl 3): 431, 2021 Sep 08.
Article in English | MEDLINE | ID: mdl-34496763

ABSTRACT

BACKGROUND: RNA secondary structure prediction is an important research topic in bioinformatics. Predicting RNA secondary structure with pseudoknots has been proved to be an NP-hard problem. Because of constraints in their own model structure, traditional machine learning methods cannot effectively use sequence information from sequences of different lengths when predicting RNA secondary structure. In addition, the numbers of paired and unpaired bases in RNA sequences differ greatly, and this imbalance between positive and negative samples can easily trap the model in a local optimum. To solve these problems, this paper proposes a variable-length dynamic bidirectional Gated Recurrent Unit (VLDB GRU) model. The model can accept sequences of different lengths through the introduction of a flag vector. It can also make full use of the base information before and after the predicted base and avoids losing information through truncation. A weight vector that dynamically adjusts the loss function of each base during training addresses the sample imbalance problem. RESULTS: The proposed algorithm is compared with existing algorithms on five representative subsets of the RNA STRAND data set. The experimental results show that the accuracy and Matthews correlation coefficient of the method are improved by 4.7% and 11.4%, respectively. CONCLUSIONS: The flag vector allows the model to effectively use the information before and after each position in the sequence; the weight vector addresses the sample imbalance problem. Compared with the other algorithms, the proposed VLDB GRU algorithm gives the best prediction results.


Subject(s)
Neural Networks, Computer , RNA , Algorithms , Nucleic Acid Conformation , Protein Structure, Secondary , RNA/genetics
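
Entry 18 mentions a weight vector that adjusts the loss of each base to counter the imbalance between paired and unpaired bases. One common way to weight a per-position loss is sketched below: binary cross-entropy scaled by inverse class frequency. This is a generic illustration, not the paper's scheme. The abstract describes the weights as dynamically adjusted, whereas the weights here are static, and the labels, probabilities, and helper name are hypothetical.

```python
import math


def weighted_cross_entropy(probs, labels, weights):
    """Mean per-base binary cross-entropy, each term scaled by its weight."""
    eps = 1e-12  # clamp probabilities to avoid log(0)
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# Hypothetical 4-base sequence: 1 = paired, 0 = unpaired (labels are made up).
labels = [1, 0, 0, 0]
probs = [0.5, 0.5, 0.5, 0.5]  # an uninformative prediction for every base

# Assumption: inverse-frequency weights, so each class contributes equally.
n = len(labels)
n_pos = sum(labels)
n_neg = n - n_pos
weights = [n / (2 * n_pos) if y == 1 else n / (2 * n_neg) for y in labels]
loss = weighted_cross_entropy(probs, labels, weights)
```

With these weights the single minority-class base carries as much total weight as the three majority-class bases combined, so a model cannot lower the loss simply by favoring the majority class.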
19.
Front Genet ; 12: 834488, 2021.
Article in English | MEDLINE | ID: mdl-35371189

ABSTRACT

Membrane proteins are an essential part of the body's ability to maintain normal life activities. Further research into membrane proteins, which are present in all aspects of life science research, will help to advance the development of cells and drugs. The current methods for predicting proteins are usually based on machine learning, but further improvements in prediction effectiveness and accuracy are needed. In this paper, we propose a dynamic deep network architecture based on lifelong learning in order to use computers to classify membrane proteins more effectively. The model extends the application area of lifelong learning and provides new ideas for multiple classification problems in bioinformatics. To demonstrate the performance of our model, we conducted experiments on top of two datasets and compared them with other classification methods. The results show that our model achieves high accuracy (95.3 and 93.5%) on benchmark datasets and is more effective compared to other methods.

20.
IEEE/ACM Trans Comput Biol Bioinform ; 18(5): 1752-1762, 2021.
Article in English | MEDLINE | ID: mdl-32750885

ABSTRACT

Approximately 40-50 percent of all drug targets are G protein-coupled receptors (GPCRs). The three-dimensional structure of GPCRs is important for probing their biophysical and biochemical functions and their pharmaceutical applications. The lack of a reliable, high-quality energy function is one of the urgent problems in computational prediction of these three-dimensional structures. We proposed a GPCR-specific energy function composed of four novel empirical potential energy terms: a two-dimensional contact energy force field, a knowledge-based helix pair connection distance energy term, a knowledge-based helix pair angle restraint energy term and a disulfide bond energy term. To validate the energy function, we employed an ab initio GPCR three-dimensional structure predictor to test whether the energy function improved the accuracy of prediction. We evaluated 28 solved GPCRs and found that 21 (75 percent) targets were correctly folded (TM-score > 0.5). Also, the average TM-score using the energy function was 0.54, which is 134 percent higher than the TM-score of 0.23 for the MODELLER energy function and 170 percent higher than the TM-score of 0.20 for the Rosetta membrane energy function. The results confirmed that our empirical potential energy function for ab initio folding is competitive with state-of-the-art solutions for structural prediction of GPCRs.


Subject(s)
Protein Folding , Receptors, G-Protein-Coupled , Algorithms , Computational Biology , Models, Molecular , Protein Conformation , Receptors, G-Protein-Coupled/chemistry , Receptors, G-Protein-Coupled/metabolism , Thermodynamics
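
The percentages quoted in entry 20 can be checked with simple arithmetic. The short sketch below assumes "improved N percent" means relative improvement over the baseline TM-score; the helper name `percent_improvement` is hypothetical.

```python
def percent_improvement(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# TM-scores quoted in the abstract of entry 20.
vs_modeller = percent_improvement(0.54, 0.23)
vs_rosetta = percent_improvement(0.54, 0.20)

# Share of the 28 evaluated GPCRs reported as correctly folded.
folded_fraction = 21 / 28
```

Under that reading the figures are consistent with the abstract: roughly 134.8 percent over MODELLER, 170 percent over Rosetta, and 75 percent of targets folded.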