Results 1 - 20 of 33
1.
Bioinformatics ; 39(4)2023 04 03.
Article in English | MEDLINE | ID: mdl-37039829

ABSTRACT

MOTIVATION: Identifying B-cell epitopes is an essential step for guiding rational vaccine development and immunotherapies. Since experimental approaches are expensive and time-consuming, many computational methods have been designed to assist B-cell epitope prediction. However, existing sequence-based methods have limited performance since they only use contextual features of the sequential neighbors while neglecting structural information. RESULTS: Based on the recent breakthrough of AlphaFold2 in protein structure prediction, we propose GraphBepi, a novel graph-based model for accurate B-cell epitope prediction. For one protein, the predicted structure from AlphaFold2 is used to construct the protein graph, where the nodes/residues are encoded by ESM-2 learning representations. The graph is input into the edge-enhanced deep graph neural network (EGNN) to capture the spatial information in the predicted 3D structures. In parallel, a bidirectional long short-term memory neural network (BiLSTM) is employed to capture long-range dependencies in the sequence. The low-dimensional representations learned by EGNN and BiLSTM are then combined and fed into a multilayer perceptron for predicting B-cell epitopes. Through comprehensive tests on the curated epitope dataset, GraphBepi was shown to outperform the state-of-the-art methods by more than 5.5% and 44.0% in terms of AUC and AUPR, respectively. A web server is freely available at http://bio-web1.nscc-gz.cn/app/graphbepi. AVAILABILITY AND IMPLEMENTATION: The datasets, pre-computed features, source codes, and the trained model are available at https://github.com/biomed-AI/GraphBepi.


Subject(s)
Epitopes, B-Lymphocyte , Neural Networks, Computer , Epitopes, B-Lymphocyte/chemistry , Proteins/chemistry , Software , Language
2.
Molecules ; 29(4)2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38398585

ABSTRACT

The prediction of three-dimensional (3D) protein structure from amino acid sequences has stood as a significant challenge in computational and structural bioinformatics for decades. Recently, the widespread integration of artificial intelligence (AI) algorithms has substantially expedited advancements in protein structure prediction, yielding numerous significant milestones. In particular, the end-to-end deep learning method AlphaFold2 has facilitated the rise of structure prediction performance to new heights, regularly competitive with experimental structures in the 14th Critical Assessment of Protein Structure Prediction (CASP14). To provide a comprehensive understanding and guide future research in the field of protein structure prediction for researchers, this review describes various methodologies, assessments, and databases in protein structure prediction, including traditionally used protein structure prediction methods, such as template-based modeling (TBM) and template-free modeling (FM) approaches; recently developed deep learning-based methods, such as contact/distance-guided methods, end-to-end folding methods, and protein language model (PLM)-based methods; multi-domain protein structure prediction methods; the CASP experiments and related assessments; and the recently released AlphaFold Protein Structure Database (AlphaFold DB). We discuss their advantages, disadvantages, and application scopes, aiming to provide researchers with insights through which to understand the limitations, contexts, and effective selections of protein structure prediction methods in protein-related fields.


Subject(s)
Artificial Intelligence , Proteins , Protein Conformation , Models, Molecular , Proteins/chemistry , Algorithms , Computational Biology/methods , Databases, Protein , Software , Protein Folding
3.
Bioinformatics ; 38(1): 94-98, 2021 12 22.
Article in English | MEDLINE | ID: mdl-34450651

ABSTRACT

MOTIVATION: The solvent accessible surface is an essential structural property related to protein structure and function. Relative solvent accessible area (RSA) is a standard measure of the degree to which a residue is exposed on the protein surface or buried inside the protein. However, this computation fails when residue information is missing. RESULTS: In this article, we propose a novel method, EAGERER, for estimating RSA from the Cα atom distance matrix with deep learning. EAGERER achieves Pearson correlation coefficients of 0.921-0.928 on two independent test datasets. We empirically demonstrate that EAGERER yields better Pearson correlation coefficients than existing RSA estimators, such as coordination number, half sphere exposure and SphereCon. To the best of our knowledge, EAGERER is the first method to estimate the solvent accessible area from limited information with a deep learning model. It could be useful for protein structure and function prediction. AVAILABILITY AND IMPLEMENTATION: The method is freely available at https://github.com/cliffgao/EAGERER. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
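The Cα distance matrix named in this abstract, and the coordination number it lists as a baseline exposure estimator, can be sketched as follows. This is a minimal illustration and not the EAGERER implementation: the function names, the toy coordinates, and the cutoff radius are all assumptions made for the example.

```python
import math

def ca_distance_matrix(coords):
    """Pairwise Euclidean distances (in Å) between Cα coordinates."""
    n = len(coords)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dist[i][j] = dist[j][i] = d  # the matrix is symmetric
    return dist

def coordination_numbers(dist, cutoff=10.0):
    """For each residue, count the other Cα atoms within `cutoff` Å.

    The cutoff value is an illustrative assumption, not a value
    taken from the abstract.
    """
    return [
        sum(1 for j, d in enumerate(row) if j != i and d <= cutoff)
        for i, row in enumerate(dist)
    ]

# Toy input (hypothetical): four Cα atoms on a line, 3.8 Å apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
dist = ca_distance_matrix(coords)
cn = coordination_numbers(dist, cutoff=8.0)
print(cn)
```

A higher coordination number suggests a more buried residue, which is why it can serve as a crude stand-in for low RSA when only Cα positions are known.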


Subject(s)
Deep Learning , Membrane Proteins , Solvents/chemistry
4.
Bioinformatics ; 37(21): 3752-3759, 2021 11 05.
Article in English | MEDLINE | ID: mdl-34473228

ABSTRACT

MOTIVATION: Protein model quality assessment (QA) is an essential component in protein structure prediction, which aims to estimate the quality of a structure model and/or select the most accurate model out from a pool of structure models, without knowing the native structure. QA remains a challenging task in protein structure prediction. RESULTS: Based on the inter-residue distance predicted by the recent deep learning-based structure prediction algorithm trRosetta, we developed QDistance, a new approach to the estimation of both global and local qualities. QDistance works for both single- and multi-models inputs. We designed several distance-based features to assess the agreement between the predicted and model-derived inter-residue distances. Together with a few widely used features, they are fed into a simple yet powerful linear regression model to infer the global QA scores. The local QA scores for each structure model are predicted based on a comparative analysis with a set of selected reference models. For multi-models input, the reference models are selected from the input based on the predicted global QA scores. For single-model input, the reference models are predicted by trRosetta. With the informative distance-based features, QDistance can predict the global quality with satisfactory accuracy. Benchmark tests on the CASP13 and the CAMEO structure models suggested that QDistance was competitive with other methods. Blind tests in the CASP14 experiments showed that QDistance was robust and ranked among the top predictors. Especially, QDistance was the top 3 local QA method and made the most accurate local QA prediction for unreliable local region. Analysis showed that this superior performance can be attributed to the inclusion of the predicted inter-residue distance. AVAILABILITY AND IMPLEMENTATION: http://yanglab.nankai.edu.cn/QDistance. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology , Proteins , Computational Biology/methods , Proteins/chemistry , Algorithms
5.
Brief Bioinform ; 19(3): 482-494, 2018 05 01.
Article in English | MEDLINE | ID: mdl-28040746

ABSTRACT

Protein secondary structure prediction began in 1951 when Pauling and Corey predicted helical and sheet conformations for the protein polypeptide backbone even before the first protein structure was determined. Sixty-five years later, powerful new methods breathe new life into this field. The highest three-state accuracy without relying on structure templates is now at 82-84%, a number unthinkable just a few years ago. These improvements came from increasingly large databases of protein sequences and structures for training, the use of template secondary structure information and more powerful deep learning techniques. As we approach the theoretical limit of three-state prediction (88-90%), alternatives to secondary structure prediction (prediction of backbone torsion angles and Cα-atom-based angles and torsion angles) not only have more room for further improvement but also allow direct prediction of three-dimensional fragment structures with constantly improved accuracy. About 20% of all 40-residue fragments in a database of 1199 non-redundant proteins have <6 Å root-mean-squared distance from the native conformations by SPIDER2. More powerful deep learning methods with improved capability of capturing long-range interactions are beginning to emerge as the next generation of techniques for secondary structure prediction. The time has come to finish off the final stretch of the long march towards protein secondary structure prediction.
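The three-state accuracy quoted above is the fraction of residues whose predicted state matches the observed one. A minimal sketch, assuming the common one-letter encoding of helix (H), strand (E) and coil (C); the function name and the two example strings are hypothetical:

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted three-state label matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)

# Hypothetical 15-residue example: two of the fifteen states differ.
observed = "CCHHHHHCCEEEECC"
predicted = "CCHHHHCCCEEEHCC"
print(f"Q3 = {q3_accuracy(predicted, observed):.3f}")
```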


Subject(s)
Algorithms , Computational Biology/methods , Models, Theoretical , Neural Networks, Computer , Protein Structure, Secondary , Proteins/chemistry , Databases, Protein , Humans
6.
J Theor Biol ; 480: 274-283, 2019 11 07.
Article in English | MEDLINE | ID: mdl-31251944

ABSTRACT

Many computational methods have been proposed to predict essential proteins from protein-protein interaction (PPI) networks. However, it is still challenging to improve the prediction accuracy. In this study, we propose a new method, esPOS (essential proteins Predictor using Order Statistics) to predict essential proteins from PPI networks. Firstly, we refine the networks by using gene expression information and subcellular localization information. Secondly, we design some new features, which combine the protein predicted secondary structure with PPI network. We show that these new features are useful to predict essential proteins. Thirdly, we optimize these features by using a greedy method, and combine the optimized features by order statistic method. Our method achieves the prediction accuracy of 0.76-0.79 on two network datasets. The proposed method is available at https://sourceforge.net/projects/espos/.


Subject(s)
Algorithms , Computational Biology/methods , Protein Interaction Maps , Statistics as Topic , Databases, Protein , Predictive Value of Tests
7.
BMC Bioinformatics ; 19(1): 29, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29390958

ABSTRACT

BACKGROUND: Protein structure can be described by backbone torsion angles: rotational angles about the N-Cα bond (φ) and the Cα-C bond (ψ) or the angle between Cα(i-1)-Cα(i)-Cα(i+1) (θ) and the rotational angle about the Cα(i)-Cα(i+1) bond (τ). Thus, their accurate prediction is useful for structure prediction and model refinement. Early methods predicted torsion angles in a few discrete bins whereas most recent methods have focused on prediction of angles in real, continuous values. Real value prediction, however, is unable to provide information on the probabilities of predicted angles. RESULTS: Here, we propose to predict angles in fine grids of 5° by using deep learning neural networks. We found that this grid-based technique can yield 2-6% higher accuracy in predicting angles in the same 5° bin than the existing prediction techniques compared. We further demonstrate the usefulness of predicted probabilities at given angle bins in discrimination of intrinsically disordered regions and in selection of protein models. CONCLUSIONS: The proposed method may be useful for characterizing protein structure and disorder. The method is available at http://sparks-lab.org/server/SPIDER2/ as a part of SPIDER2 package.
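The 5° grid described above amounts to mapping each angle to one of 72 bins and reading an angle, together with its probability, back from a predicted distribution. The sketch below only illustrates that bookkeeping; the function names, the [-180°, 180°) convention, and the example distribution are assumptions, and the neural network itself is not shown.

```python
def angle_to_bin(angle_deg, bin_width=5.0):
    """Map an angle in degrees to a bin index in 0..(360/bin_width - 1)."""
    n_bins = int(360 / bin_width)
    wrapped = (angle_deg + 180.0) % 360.0  # shift to [0, 360)
    return int(wrapped // bin_width) % n_bins

def bin_center(index, bin_width=5.0):
    """Central angle (degrees) of a bin, in the [-180, 180) convention."""
    return -180.0 + (index + 0.5) * bin_width

def most_probable_angle(probs, bin_width=5.0):
    """Return (bin-center angle, probability) of the most likely bin."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return bin_center(best, bin_width), probs[best]

# Hypothetical predicted distribution over 72 bins for one residue.
probs = [0.0] * 72
probs[angle_to_bin(-57.0)] = 0.6
probs[angle_to_bin(-52.0)] = 0.4
print(most_probable_angle(probs))
```

Unlike a single real-valued output, the distribution carries a confidence for each candidate angle, which is the property the abstract uses for disorder discrimination and model selection.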


Subject(s)
Proteins/chemistry , User-Computer Interface , Area Under Curve , Neural Networks, Computer , Probability , Protein Domains , Protein Structure, Tertiary , Proteins/metabolism , ROC Curve
8.
Bioinformatics ; 32(24): 3768-3773, 2016 12 15.
Article in English | MEDLINE | ID: mdl-27551104

ABSTRACT

MOTIVATION: Backbone structures and solvent accessible surface area of proteins benefit from continuous real value prediction because it removes the arbitrariness of defining boundaries between different secondary-structure and solvent-accessibility states. However, the lack of confidence scores for predicted values has limited their applications. Here we investigated whether or not we can make a reasonable prediction of absolute errors for predicted backbone torsion angles, Cα-atom-based angles and torsion angles, solvent accessibility, contact numbers and half-sphere exposures by employing deep neural networks. RESULTS: We found that angle-based errors can be predicted most accurately with Spearman correlation coefficient (SPC) between predicted and actual errors at about 0.6. This is followed by solvent accessibility (SPC∼0.5). The errors on contact-based structural properties are most difficult to predict (SPC between 0.2 and 0.3). We showed that predicted errors are significantly better error indicators than the average errors based on secondary-structure and amino-acid residue types. We further demonstrated the usefulness of predicted errors in model quality assessment. These error or confidence indicators are expected to be useful for prediction, assessment, and refinement of protein structures. AVAILABILITY AND IMPLEMENTATION: The method is available at http://sparks-lab.org as a part of SPIDER2 package. CONTACT: yuedong.yang@griffith.edu.au or yaoqi.zhou@griffith.edu.au. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
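The Spearman correlation coefficient used above to compare predicted and actual errors is the Pearson correlation of the two rank vectors. A minimal pure-Python sketch follows; the helper names and the five example error values are hypothetical, and a library routine would normally be preferred in practice.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical predicted vs. actual absolute angle errors (degrees).
predicted_err = [5.0, 12.0, 8.0, 30.0, 20.0]
actual_err = [3.0, 15.0, 10.0, 18.0, 40.0]
print(round(spearman(predicted_err, actual_err), 3))
```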


Subject(s)
Neural Networks, Computer , Protein Structure, Secondary , Proteins/chemistry , Amino Acids , Computational Biology/methods , Solvents
9.
J Adv Res ; 2024 Jun 09.
Article in English | MEDLINE | ID: mdl-38862035

ABSTRACT

INTRODUCTION: Frailty Index (FI) is a common measure of frailty, which has been advocated as a routine clinical test by many guidelines. The genetic and phenotypic relationships of FI with cardiovascular indicators (CIs) and behavioral characteristics (BCs) are unclear, which has hampered the ability to monitor FI using easily collected data. OBJECTIVES: This study is designed to investigate the genetic and phenotypic associations of frailty with CIs and BCs, and further to construct a model to predict FI. METHOD: Genetic relationships of FI with 288 CIs and 90 BCs were assessed by the cross-trait LD score regression (LDSC) and Mendelian randomization (MR). The phenotypic data of these CIs and BCs were integrated with a machine-learning model to predict FI of individuals in UK-biobank. The relationships of the predicted FI with risks of type 2 diabetes (T2D) and neurodegenerative diseases were tested by the Kaplan-Meier estimator and Cox proportional hazards model. RESULTS: MR revealed putative causal effects of seven CIs and eight BCs on FI. These CIs and BCs were integrated to establish a model for predicting FI. The predicted FI is significantly correlated with the observed FI (Pearson correlation coefficient = 0.660, P-value = 4.96 × 10^-62). The prediction model indicated "usual walking pace" contributes the most to prediction. Patients who were predicted with high FI are at significantly higher risk of T2D (HR = 2.635, P < 2 × 10^-16) and neurodegenerative diseases (HR = 2.307, P = 1.62 × 10^-3) than other patients. CONCLUSION: This study supports associations of FI with CIs and BCs from genetic and phenotypic perspectives. The model that is developed by integrating easily collected CIs and BCs data in predicting FI has the potential to monitor disease risk.
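The Kaplan-Meier estimator mentioned in the methods multiplies, at each observed event time, the fraction of at-risk individuals who did not have the event. The sketch below is a generic illustration with hypothetical follow-up data; it is not the study's analysis, and the function name is an assumption.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times  : follow-up time for each individual
    events : 1 if the event was observed, 0 if the individual was censored
    Returns a list of (time, survival probability) at each event time.
    Individuals censored at an event time are counted as still at risk.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve

# Hypothetical follow-up times (years) and event indicators.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Comparing two such curves, for example for individuals with high and low predicted FI, is the usual first look before fitting a Cox proportional hazards model.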

10.
ACS Appl Mater Interfaces ; 15(6): 8783-8793, 2023 Feb 15.
Article in English | MEDLINE | ID: mdl-36723501

ABSTRACT

Wearable, noninvasive, and simultaneous sensing of subtle strains and eccrine molecules on the human body is essential for future health monitoring and personalized medicine. However, there is a huge chasm between biomechanics and bio/chemical molecule detection. Here, a wearable plasmonic bridge sensor with multiple abilities to monitor subtle strains and molecules is developed. Hollow Au-Ag nano-rambutans and carbon nanotubes (CNTs) are adsorbed conjointly in nonwoven fabrics (NWFs), where the gap between the conducting network of CNTs is bridged by the Au-Ag nano-rambutans during subtle strain sensing, and the detection sensitivity for stress is improved by at least 1 order of magnitude compared to that with CNTs only. To achieve accurate human action recognition, a machine learning algorithm (support vector machines) based on output biomechanics data is designed. The average accuracy of our plasmonic bridge sensor reaches 89.0% for human action recognition. Moreover, due to the hollow structure and high nanoroughness, the single Au-Ag nano-rambutan particle has a strong localized surface plasmon resonance effect and high surface-enhanced Raman scattering (SERS) activity. Based on the unique SERS spectra introduced by the hollow Au-Ag nano-rambutans adsorbed in the NWFs, noninvasive extraction and "fingerprint" recognition of bio/chemical molecules could be realized during wearable sensing. In sum, the NWFs/CNTs/Au-Ag sensor bridges the barrier between bodily strain detection and molecule recognition during wearable sensing. Such an integrated and multifunctional sensing strategy for universal biomechanics and bio/chemical molecules is of importance for assessing human health.


Subject(s)
Metal Nanoparticles , Nanotubes, Carbon , Humans , Biomechanical Phenomena , Gold/chemistry , Metal Nanoparticles/chemistry , Silver/chemistry , Spectrum Analysis, Raman
11.
Amino Acids ; 42(1): 271-83, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21082205

ABSTRACT

Proteins fold through a two-state (TS) process, with no visible intermediates, or a multi-state (MS) process, via at least one intermediate. We analyze sequence-derived factors that determine folding types by introducing a novel sequence-based folding type predictor called FOKIT. This method implements a logistic regression model with six input features which hybridize information concerning amino acid composition and predicted secondary structure and solvent accessibility. FOKIT provides predictions with average Matthews correlation coefficient (MCC) between 0.58 and 0.91 measured using out-of-sample tests on four benchmark datasets. These results are shown to be competitive with or better than the results of four modern predictors. We also show that FOKIT outperforms these methods when predicting chains that share low similarity with the chains used to build the model, which is an important advantage given the limited number of annotated chains. We demonstrate that inclusion of solvent accessibility helps in discrimination of the folding kinetic types and that three of the features constitute statistically significant markers that differentiate TS and MS folders. We found that increased content of exposed Trp and buried Leu is indicative of MS folding, which implies that the exposure/burial of certain hydrophobic residues may play an important role in the formation of the folding intermediates. Our conclusions are supported by two case studies.
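The Matthews correlation coefficient reported above summarizes a binary confusion matrix in a single value between -1 and 1. A minimal sketch follows; treating MS folders as the positive class and the example counts are assumptions made for illustration, not figures from the study.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom

# Hypothetical counts, with MS folders treated as the positive class.
print(round(mcc(tp=18, tn=20, fp=2, fn=4), 3))
```

Because MCC uses all four cells of the confusion matrix, it remains informative when the TS and MS classes are unevenly represented, which matters for small annotated datasets.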


Subject(s)
Proteins/analysis , Sequence Analysis, Protein , Databases, Protein , Kinetics , Logistic Models , Protein Folding , Protein Structure, Secondary , Solvents/chemistry
12.
IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3255-3262, 2022.
Article in English | MEDLINE | ID: mdl-34529570

ABSTRACT

One important task in single-cell analysis is to quantify the differentiation potential of single cells. Though various single-cell potency measures have been proposed, they are based on individual biological sources, thus not robust and reliable. It is still a challenge to combine multiple sources to generate a relatively reliable and robust measure to estimate differentiation. In this paper, we propose a New Centrality measure with Gene ontology information (NCG) to estimate single-cell potency. NCG is designed by combining network topology property with edge clustering coefficient, and gene function information using gene ontology function similarity scores. NCG distinguishes pluripotent cells from non-pluripotent cells with high accuracy, correctly ranks different cell types by their differentiation potency, tracks changes during the differentiation process, and constructs the lineage trajectory from human myoblasts into skeletal muscle cells. These indicate that NCG is a reliable and robust measure to estimate single-cell potency. NCG is anticipated to be a useful tool for identifying novel stem or progenitor cell phenotypes from single-cell RNA-Seq data. The source codes and datasets are available at https://github.com/Xinzhe-Ni/NCG.


Subject(s)
Algorithms , Software , Humans , Gene Ontology , Cell Differentiation/genetics , Single-Cell Analysis , Gene Expression Profiling , Sequence Analysis, RNA , Cluster Analysis
13.
Leuk Res ; 117: 106843, 2022 06.
Article in English | MEDLINE | ID: mdl-35512442

ABSTRACT

Little is known regarding whether the cell of origin differs among different leukemia types. To address this fundamental issue, we determined the cell of origin in five distinct types of acute leukemia induced by N-Myc overexpression in mice. CD150+CD48-CD41-CD34-c-Kit+Sca-1+Lin- (KSL) (HSC1) cells, CD150-CD48-CD41-CD34-KSL (HSC2) cells, CD150+CD41+CD34-KSL (HPC1) cells, CD150+CD41+CD34+KSL (HPC2) cells, and CD150-CD41-CD34+KSL (HPC3) cells were purified from the bone marrow of adult C57BL/6 mice, transduced with the N-Myc retrovirus vector, and transplanted into lethally irradiated mice. B-cell acute lymphoblastic leukemia (B-ALL), T-cell acute lymphoblastic leukemia (T-ALL), acute myeloid leukemia (AML), acute undifferentiated leukemia (AUL), and mixed phenotype acute leukemia (MPAL) developed from five populations. RNA sequencing data supported the phenotypical diagnoses of leukemia, except that AUL appeared transcriptionally close to T-ALL. Whole-genome sequencing revealed that retroviral integration sites were irrelevant to the leukemia types and that T-ALL and AML of MPAL shared the same integration site and many gene mutations, suggesting their common origin. Additionally, leukemic stem cells were identified in the KSL cell population, suggesting that the phenotypes of leukemic stem cells are irrelevant to leukemia types. This study provides experimental evidence for the similar and multiple cells of origin in acute leukemia.


Subject(s)
Leukemia, Myeloid, Acute , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma , Animals , Antigens, CD34 , Humans , Leukemia, Myeloid, Acute/genetics , Mice , Mice, Inbred C57BL , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics
14.
IEEE/ACM Trans Comput Biol Bioinform ; 18(5): 2017-2022, 2021.
Article in English | MEDLINE | ID: mdl-31794403

ABSTRACT

Structural flexibility plays an essential role in many biological processes. B-factor is an important indicator to measure the flexibility of protein or RNA structures. Many methods were developed to predict protein B-factors, but few studies have been done for RNA B-factor prediction. In this paper, we proposed a new method RNAbval to predict RNA B-factors using random forest. The method was developed using a comprehensive set of features, including the sequence profile and predicted solvent accessibility. RNAbval achieved an improvement of 9.2-20.5 percent over the state-of-the-art method on two benchmark test datasets. The proposed method is available at http://yanglab.nankai.edu.cn/RNAbval/.


Subject(s)
Computational Biology/methods , RNA , Sequence Analysis, RNA/methods , Crystallography, X-Ray , Machine Learning , Pliability , Proteins/chemistry , Proteins/metabolism , RNA/chemistry , RNA/genetics , RNA/metabolism , Solvents/chemistry , Solvents/metabolism
15.
Nat Commun ; 12(1): 4438, 2021 07 21.
Article in English | MEDLINE | ID: mdl-34290238

ABSTRACT

Identification of intrinsic disorder in proteins relies in large part on computational predictors, which demands that their accuracy should be high. Since intrinsic disorder carries out a broad range of cellular functions, it is desirable to couple the disorder and disorder function predictions. We report a computational tool, flDPnn, that provides accurate, fast and comprehensive disorder and disorder function predictions from protein sequences. The recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment and results on other test datasets demonstrate that flDPnn offers accurate predictions of disorder, fully disordered proteins and four common disorder functions. These predictions are substantially better than the results of the existing disorder predictors and methods that predict functions of disorder. Ablation tests reveal that the high predictive performance stems from innovative ways used in flDPnn to derive sequence profiles and encode inputs. flDPnn's webserver is available at http://biomine.cs.vcu.edu/servers/flDPnn/.


Subject(s)
Computational Biology/methods , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Machine Learning , Protein Binding , Sequence Analysis, Protein
16.
Mol Ther Nucleic Acids ; 24: 310-324, 2021 Jun 04.
Article in English | MEDLINE | ID: mdl-33850635

ABSTRACT

Hypoxia induces a series of cellular adaptive responses that enable promotion of inflammation and cancer development. Hypoxia-inducible factor-1α (HIF-1α) is involved in the hypoxia response and cancer promotion, and it accumulates in hypoxia and is degraded under normoxic conditions. Here we identify prostate cancer associated transcript-1 (PCAT-1) as a hypoxia-inducible long non-coding RNA (lncRNA) that regulates HIF-1α stability, crucial for cancer progression. Extensive analyses of clinical data indicate that PCAT-1 is elevated in breast cancer patients and is associated with pathological grade, tumor size, and poor clinical outcomes. Through gain- and loss-of-function experiments, we find that PCAT-1 promotes hypoxia-associated breast cancer progression including growth, migration, invasion, colony formation, and metabolic regulation. Mechanistically, PCAT-1 directly interacts with the receptor of activated protein C kinase-1 (RACK1) protein and prevents RACK1 from binding to HIF-1α, thus protecting HIF-1α from RACK1-induced oxygen-independent degradation. These findings provide new insight into lncRNA-mediated mechanisms for HIF-1α stability and suggest a novel role of PCAT-1 as a potential therapeutic target for breast cancer.

17.
Proteins ; 78(9): 2114-30, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20455267

ABSTRACT

Protein folding rates vary by several orders of magnitude and they depend on the topology of the fold and the size and composition of the sequence. Although recent works show that the rates can be predicted from the sequence, allowing for high-throughput annotations, they consider only the sequence and its predicted secondary structure. We propose a novel sequence-based predictor, PFR-AF, which utilizes solvent accessibility and residue flexibility predicted from the sequence, to improve predictions and provide insights into the folding process. The predictor includes three linear regressions for proteins with two-state, multistate, and unknown (mixed-state) folding kinetics. PFR-AF on average outperforms current methods when tested on three datasets. The proposed approach provides high-quality predictions in the absence of similarity between the predicted and the training sequences. The PFR-AF's predictions are characterized by high (between 0.71 and 0.95, depending on the dataset) correlation and the lowest (between 0.75 and 0.9) mean absolute errors with respect to the experimental rates, as measured using out-of-sample tests. Our models reveal that for the two-state chains inclusion of solvent-exposed Ala may accelerate the folding, while increased content of Ile may reduce the folding speed. We also demonstrate that increased flexibility of coils facilitates faster folding and that proteins with larger content of solvent-exposed strands may fold at a slower pace. The increased flexibility of the solvent-exposed residues is shown to elongate folding, which also holds, with a lower correlation, for buried residues. Two case studies are included to support our findings.


Subject(s)
Computational Biology/methods , Protein Folding , Proteins , Databases, Protein , Kinetics , Linear Models , Protein Conformation , Proteins/chemistry , Proteins/metabolism , Sequence Analysis, Protein , Solvents
18.
Biomolecules ; 10(6)2020 06 07.
Article in English | MEDLINE | ID: mdl-32517331

ABSTRACT

Computational prediction of ion channels facilitates the identification of putative ion channels from protein sequences. Several predictors of ion channels and their types were developed in the last quindecennial. While they offer reasonably accurate predictions, they also suffer a few shortcomings including lack of availability, parallel prediction mode, single-label prediction (inability to predict multiple channel subtypes), and incomplete scope (inability to predict subtypes of the voltage-gated channels). We developed a first-of-its-kind PSIONplusm method that performs sequential multi-label prediction of ion channels and their subtypes for both voltage-gated and ligand-gated channels. PSIONplusm sequentially combines the outputs produced by three support vector machine-based models from the PSIONplus predictor and is available as a webserver. Empirical tests show that PSIONplusm outperforms current methods for the multi-label prediction of the ion channel subtypes. This includes the existing single-label methods that are available to the users, a naïve multi-label predictor that combines results produced by multiple single-label methods, and methods that make predictions based on sequence alignment and domain annotations. We also found that the current methods (including PSIONplusm) fail to accurately predict a few of the least frequently occurring ion channel subtypes. Thus, new predictors should be developed when a larger quantity of annotated ion channels will be available to train predictive models.


Subject(s)
Algorithms , Computational Biology , Ion Channels/chemistry , Software
19.
Curr Drug Targets ; 20(5): 579-592, 2019.
Article in English | MEDLINE | ID: mdl-30360734

ABSTRACT

BACKGROUND: Ion channels are a large and growing protein family. Many of them are associated with diseases, and consequently, they are targets for over 700 drugs. Discovery of new ion channels is facilitated with computational methods that predict ion channels and their types from protein sequences. However, these methods were never comprehensively compared and evaluated. OBJECTIVE: We offer first-of-its-kind comprehensive survey of the sequence-based predictors of ion channels. We describe eight predictors that include five methods that predict ion channels, their types, and four classes of the voltage-gated channels. We also develop and use a new benchmark dataset to perform comparative empirical analysis of the three currently available predictors. RESULTS: While several methods that rely on different designs were published, only a few of them are currently available and offer a broad scope of predictions. Support and availability after publication should be required when new methods are considered for publication. Empirical analysis shows strong performance for the prediction of ion channels and modest performance for the prediction of ion channel types and voltage-gated channel classes. We identify a substantial weakness of current methods that cannot accurately predict ion channels that are categorized into multiple classes/types. CONCLUSION: Several predictors of ion channels are available to the end users. They offer practical levels of predictive quality. Methods that rely on a larger and more diverse set of predictive inputs (such as PSIONplus) are more accurate. New tools that address multi-label prediction of ion channels should be developed.


Subject(s)
Computational Biology/methods , Ion Channels/genetics , Amino Acid Sequence , Animals , Benchmarking , Humans , Ion Channels/classification , Ion Channels/metabolism
20.
Chemosphere ; 228: 398-411, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31048237

ABSTRACT

Endocrine-disrupting chemicals in the environment induce adverse effects on the development, reproduction and behavior of animals. We investigated the effects of fluorene-9-bisphenol (BHPF), a substitute for bisphenol A, on the courtship behavior and exploratory behavior of adult zebrafish. A customized apparatus was used to evaluate courtship behavior. The results showed that males spent less time with BHPF-treated and anti-oestrogenic fulvestrant (FULV)-treated females in the region of approaching (ROA). The courtship index between BHPF-exposed females and males decreased. The body orientation of BHPF- and FULV-exposed females to males decreased. Furthermore, BHPF exposure downregulated the expression of genes related to estrogen receptor and steroidogenesis and upregulated oxidative stress related genes. This indicated that BHPF exposure interfered with the preference of males and females in courtship, and induced detrimental effects on reproduction. BHPF treatment decreased locomotor activity and time spent in the top, increased freezing bouts, and induced anxiety/depression-like behavior. Tyrosine hydroxylase in the brain decreased under BHPF exposure. Here we showed the potential adverse effects of BHPF on reproduction and exploratory behaviors.


Subject(s)
Benzhydryl Compounds/adverse effects , Exploratory Behavior/drug effects , Fluorenes/chemistry , Phenols/adverse effects , Reproduction/drug effects , Animals , Benzhydryl Compounds/chemistry , Female , Phenols/chemistry , Zebrafish