1.
Am J Hum Genet ; 105(3): 509-525, 2019 Sep 05.
Article in English | MEDLINE | ID: mdl-31422817

ABSTRACT

The human RNA helicase DDX6 is an essential component of membrane-less organelles called processing bodies (PBs). PBs are involved in mRNA metabolic processes including translational repression via coordinated storage of mRNAs. Previous studies in human cell lines have implicated altered DDX6 in molecular and cellular dysfunction, but clinical consequences and pathogenesis in humans have yet to be described. Here, we report the identification of five rare de novo missense variants in DDX6 in probands presenting with intellectual disability, developmental delay, and similar dysmorphic features including telecanthus, epicanthus, arched eyebrows, and low-set ears. All five missense variants (p.His372Arg, p.Arg373Gln, p.Cys390Arg, p.Thr391Ile, and p.Thr391Pro) are located in two conserved motifs of the RecA-2 domain of DDX6 involved in RNA binding, helicase activity, and protein-partner binding. We use functional studies to demonstrate that the first variants identified (p.Arg373Gln and p.Cys390Arg) cause significant defects in PB assembly in primary fibroblast and model human cell lines. These variants' interactions with several protein partners were also disrupted in immunoprecipitation assays. Further investigation via complementation assays included the additional variants p.Thr391Ile and p.Thr391Pro, both of which, similarly to p.Arg373Gln and p.Cys390Arg, demonstrated significant defects in P-body assembly. Complementing these molecular findings, modeling of the variants on solved protein structures showed distinct spatial clustering near known protein binding regions. Collectively, our clinical and molecular data describe a neurodevelopmental syndrome associated with pathogenic missense variants in DDX6. Additionally, we suggest DDX6 join the DExD/H-box genes DDX3X and DHX30 in an emerging class of neurodevelopmental disorders involving RNA helicases.

2.
Bioinformatics ; 35(21): 4478-4479, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31086968

ABSTRACT

MOTIVATION: The correct classification of missense variants as benign or pathogenic remains challenging. Pathogenic variants are expected to have higher deleterious prediction scores than benign variants in the same gene. However, most existing variant annotation tools do not reference the score range of benign population variants at the gene level. RESULTS: We present a web application, Variant Score Ranker, which enables users to rapidly annotate variants and perform gene-specific variant score ranking at the population level. We also provide an intuitive example of how gene- and population-calibrated variant ranking scores can improve epilepsy variant prioritization. AVAILABILITY AND IMPLEMENTATION: http://vsranker.broadinstitute.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
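
The idea of ranking a variant against population variants in the same gene can be illustrated with a minimal sketch. This is not the Variant Score Ranker implementation; the function name `gene_percentile_rank` and all score values are hypothetical, and the percentile definition (fraction of population scores at or below the candidate) is an assumption made for illustration.

```python
from bisect import bisect_right


def gene_percentile_rank(score, population_scores):
    """Return the fraction of same-gene population scores that are <= score.

    Hypothetical helper: the abstract does not specify how ranks are computed.
    """
    if not population_scores:
        raise ValueError("need at least one population score for the gene")
    ordered = sorted(population_scores)
    # bisect_right counts how many sorted scores are less than or equal to `score`.
    return bisect_right(ordered, score) / len(ordered)


# Hypothetical deleteriousness scores of population variants in one gene.
population = [0.10, 0.15, 0.22, 0.30, 0.41, 0.47, 0.55, 0.63]
candidate = 0.58  # hypothetical score of a patient variant in the same gene

rank = gene_percentile_rank(candidate, population)
print(f"candidate scores at or above {rank:.1%} of population variants in this gene")
```

A raw score that looks moderate in isolation can still sit near the top of the range observed in the population for that gene, which is the calibration effect the abstract describes.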

3.
Pediatr Neurol ; 97: 18-25, 2019 Aug.
Article in English | MEDLINE | ID: mdl-30928302

ABSTRACT

Cyclin-dependent kinase-like 5 (CDKL5) deficiency disorder (CDD) is a developmental encephalopathy caused by pathogenic variants in the gene CDKL5. This unique disorder includes early infantile-onset refractory epilepsy, hypotonia, developmental intellectual and motor disabilities, and cortical visual impairment. We review the clinical presentations and genetic variations in CDD based on a systematic literature review and experience in the CDKL5 Centers of Excellence. We propose minimum diagnostic criteria. Pathogenic variants include deletions, truncations, splice variants, and missense variants. Pathogenic missense variants occur exclusively within the kinase domain or affect splice sites. The CDKL5 protein is widely expressed in the brain, predominantly in neurons, with roles in cell proliferation, neuronal migration, axonal outgrowth, dendritic morphogenesis, and synapse development. The molecular biology of CDD is revealing opportunities in precision therapy, with phase 2 and 3 clinical trials underway or planned to assess disease-specific and disease-modifying treatments.

4.
Bioinformatics ; 34(19): 3289-3299, 2018 Oct 01.
Article in English | MEDLINE | ID: mdl-29726965

ABSTRACT

Motivation: Machine learning plays a substantial role in bioscience owing to the explosive growth in sequence data and the challenging application of computational methods. Peptide-recognition domains (PRDs) are critical as they promote coupled binding with short peptide motifs of functional importance through transient interactions. It is challenging to build a reliable predictor of peptide-binding residues in proteins with diverse types of PRDs from protein sequence alone. At the same time, it is vital to keep pace with sequencing speed and to broaden the scope of study. Results: In this paper, we propose a machine-learning-based tool, named PBRpredict, to predict residues in peptide-binding domains from protein sequence alone. To develop a generic predictor, we train the models on peptide-binding residues of diverse types of domains. As inputs to the models, we use a high-dimensional feature set of chemical, structural and evolutionary information extracted from protein sequence. We carefully investigate six different state-of-the-art classification algorithms for this application. Finally, we use the stacked generalization approach to non-linearly combine a set of complementary base-level learners using a meta-level learner, which outperformed the winner-takes-all approach. The proposed predictor is found to be competitive based on statistical evaluation. Availability and implementation: PBRpredict-Suite software: http://cs.uno.edu/~tamjid/Software/PBRpredict/pbrpredict-suite.zip. Supplementary information: Supplementary data are available at Bioinformatics online.

5.
J Theor Biol ; 441: 44-57, 2018 Mar 14.
Article in English | MEDLINE | ID: mdl-29305182

ABSTRACT

Accessible surface area (ASA) of a protein residue is an effective feature for protein structure prediction, binding region identification, and fold recognition. Improving the prediction of ASA by applying effective feature variables is a challenging but explorable task, especially in the field of machine learning. Among the existing predictors of ASA, REGAd3p is a highly accurate ASA predictor based on regularized exact regression with a polynomial kernel of degree 3. In this work, we present a new predictor, RBSURFpred, which extends REGAd3p along several dimensions by incorporating 58 physicochemical, evolutionary and structural properties into 9-tuple peptides via Chou's general PseAAC, which allowed us to obtain higher accuracies in predicting both real-valued and binary ASA. We have compared RBSURFpred for both real and binary space predictions with state-of-the-art predictors, such as REGAd3p and SPIDER2. We have also carried out a rigorous analysis of the performance of RBSURFpred in terms of different amino acids and their properties, as well as biologically relevant case studies. The performance of RBSURFpred establishes it as a useful tool for the community.
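
As a rough illustration of what "9-tuple peptides" can mean in practice, the sketch below extracts one 9-residue window per position of a sequence. The abstract does not say how windows are positioned or padded, so the centered window, the padding symbol `X`, and the function name `sliding_windows` are all assumptions; the per-residue property encoding used by RBSURFpred is not reproduced here.

```python
def sliding_windows(sequence, size=9, pad="X"):
    """Return one window of `size` residues centered on each position.

    Assumption: windows are centered and terminal positions are padded
    with a placeholder symbol; the source does not specify either choice.
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd to have a central residue")
    half = size // 2
    padded = pad * half + sequence + pad * half
    # Window i starts at padded[i], so its center is the original residue i.
    return [padded[i:i + size] for i in range(len(sequence))]


# Hypothetical 10-residue sequence.
windows = sliding_windows("ACDEFGHIKL")
for position, window in enumerate(windows):
    print(position, window)
```

Each window would then be turned into a feature vector by looking up per-residue properties, which is where the 58 properties mentioned in the abstract would enter.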

6.
PLoS One ; 11(9): e0161452, 2016.
Article in English | MEDLINE | ID: mdl-27588752

ABSTRACT

A set of features computed from the primary amino acid sequence of proteins is crucial in the process of inducing a machine learning model that is capable of accurately predicting three-dimensional protein structures. Solutions for existing protein structure prediction problems need features that can capture the complexity of molecular-level interactions. With this in view, we propose a novel approach to estimate the position specific estimated energy (PSEE) of a residue using contact energy and predicted relative solvent accessibility (RSA). Furthermore, we demonstrate that PSEE can be reasonably estimated from sequence information alone. PSEE is useful in identifying the structured as well as the unstructured, or intrinsically disordered, regions of a protein by computing favorable and unfavorable energy respectively, characterized by an appropriate threshold. The most intriguing finding, verified empirically, is the indication that the PSEE feature can effectively classify disordered versus ordered residues and can segregate residues of different secondary structure types by computing the constituent energies. PSEE values for each amino acid strongly correlate with the hydrophobicity value of the corresponding amino acid. Further, PSEE can be used to detect the existence of critical binding regions that essentially undergo disorder-to-order transitions to perform crucial biological functions. As an application of the PSEE feature to disorder prediction, we have rigorously tested and found that a support vector machine model informed by a set of features including PSEE consistently outperforms a model with an identical set of features with PSEE removed. In addition, the new disorder predictor, DisPredict2, shows competitive performance in predicting protein disorder when compared with six existing disordered protein predictors.


Subjects
Molecular Models, Protein Conformation, Proteins/metabolism, Amino Acid Sequence, Binding Sites, Protein Databases, Protein Secondary Structure
7.
J Theor Biol ; 398: 112-21, 2016 Jun 07.
Article in English | MEDLINE | ID: mdl-27029514

ABSTRACT

Success in solving the protein folding and structure prediction problems in molecular and structural biology relies on an accurate energy function. With the rapid advancement of computational biology and bioinformatics, there is a growing need to solve unknown folds and structures faster, and thus an accurate energy function is indispensable. To address this need, we develop a new potential function, namely 3DIGARS3.0, which is a linearly weighted combination of 3DIGARS, mined accessible surface area (ASA) and ubiquitously computed Phi (uPhi) and Psi (uPsi) energies, with weights optimized by a Genetic Algorithm (GA). We use a dataset of 4332 protein structures to generate uPhi- and uPsi-based score libraries to be used within the core 3DIGARS method. The optimized weight of each component is obtained by applying Genetic Algorithm based optimization on three challenging decoy sets. The improved 3DIGARS3.0 significantly outperformed state-of-the-art methods on a set of independent test datasets.
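
A linearly weighted combination of energy components can be sketched as follows. The component names follow the abstract, but every weight and energy value below is hypothetical, the function name `combined_energy` is made up for illustration, and the Genetic Algorithm that would tune the weights is not shown.

```python
def combined_energy(components, weights):
    """Return the weighted sum of energy components for one structure."""
    if components.keys() != weights.keys():
        raise ValueError("components and weights must use the same names")
    return sum(weights[name] * components[name] for name in components)


# Hypothetical weights; in the abstract these are optimized by a GA.
weights = {"3DIGARS": 1.0, "ASA": 0.5, "uPhi": 0.25, "uPsi": 0.25}

# Hypothetical component energies for a native structure and two decoys.
structures = {
    "native": {"3DIGARS": -10.0, "ASA": -4.0, "uPhi": -2.0, "uPsi": -2.0},
    "decoy_1": {"3DIGARS": -8.0, "ASA": -2.0, "uPhi": -1.0, "uPsi": -3.0},
    "decoy_2": {"3DIGARS": -9.0, "ASA": -1.0, "uPhi": -4.0, "uPsi": 0.0},
}

scores = {name: combined_energy(parts, weights) for name, parts in structures.items()}
best = min(scores, key=scores.get)  # lower energy is treated as better
print(scores, best)
```

In a decoy-discrimination setting, a weight set is judged by how often the native structure receives the lowest combined energy, which gives an optimizer such as a GA a fitness value to work with.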


Subjects
Protein Conformation, Proteins/chemistry, Research Design, Protein Databases, Solvents, Thermodynamics
8.
Comput Biol Chem ; 61: 162-77, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26878130

ABSTRACT

Protein structure prediction is considered one of the most challenging and computationally intractable combinatorial problems. Thus, the efficient modeling of the convoluted search space, the clever use of energy functions and, more importantly, the use of effective sampling algorithms become crucial to address this problem. For protein structure modeling, an off-lattice model offers limited scope to exercise and evaluate algorithmic developments because of its astronomically large set of data points. In contrast, an on-lattice model widens that scope and permits the study of relatively larger proteins because of its finite set of data points. In this work, we took full advantage of an on-lattice model by using a face-centered-cube lattice, which has the highest packing density with the maximum degree of freedom. We proposed a genetic algorithm (GA) for conformational search based on a graded energy that strategically mixes the Miyazawa-Jernigan (MJ) energy with the hydrophobic-polar (HP) energy. In our application, we introduced a 2 × 2 HP-energy-guided macro-mutation operator within the GA to explore the best possible local changes exhaustively. Conversely, the 20 × 20 MJ energy model, which is the ultimate objective function that our GA minimizes, considers the interactions among the 20 different amino acids and allows searching for globally acceptable conformations. On a set of benchmark proteins, our proposed approach outperformed state-of-the-art approaches in terms of free energy levels and root-mean-square deviations.
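
The 2 × 2 HP energy on a face-centered-cubic (FCC) lattice can be sketched in a few lines. This sketch assumes the standard HP convention, in which each contact between two non-consecutive hydrophobic (H) residues contributes -1 and all other contacts contribute 0; the function names and the example conformation are hypothetical, and neither the MJ energy nor the GA from the abstract is shown.

```python
from itertools import combinations, product

# The 12 FCC neighbor offsets: exactly two coordinates are +1 or -1, one is 0.
FCC_NEIGHBORS = {
    v for v in product((-1, 0, 1), repeat=3) if sum(abs(c) for c in v) == 2
}


def are_neighbors(a, b):
    """Return True if lattice points a and b are adjacent on the FCC lattice."""
    return tuple(q - p for p, q in zip(a, b)) in FCC_NEIGHBORS


def hp_energy(sequence, coords):
    """Return the HP energy: -1 per contact between non-consecutive H residues."""
    if len(sequence) != len(coords):
        raise ValueError("one coordinate is required per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation must be self-avoiding")
    if not all(are_neighbors(coords[i], coords[i + 1]) for i in range(len(coords) - 1)):
        raise ValueError("consecutive residues must occupy neighboring lattice points")
    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i > 1 and sequence[i] == sequence[j] == "H" and are_neighbors(coords[i], coords[j]):
            energy -= 1
    return energy


# Hypothetical 4-residue chain; residues 0 and 3 are both H and in contact.
conformation = [(0, 0, 0), (1, 1, 0), (2, 0, 0), (1, 0, 1)]
print(hp_energy("HPHH", conformation))
```

Because the HP model only distinguishes two residue classes, it is cheap to evaluate for many local moves, which is consistent with its use in the abstract to guide the macro-mutation operator while the 20 × 20 MJ energy serves as the objective.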


Subjects
Algorithms, Proteins/chemistry, Theoretical Models, Protein Conformation
9.
J Theor Biol ; 389: 60-71, 2016 Jan 21.
Article in English | MEDLINE | ID: mdl-26549467

ABSTRACT

Secondary structure (SS) refers to the local spatial organization of the polypeptide backbone atoms of a protein. Accurate prediction of SS can provide crucial features for accurately forming the next higher level of 3D structure of a protein. SS has three major components: helix (H), beta (E) and coil (C). Most SS predictors show imbalanced accuracies: they report high performance in predicting H and C but low accuracy in predicting E. Because the E component is less frequent, a predictor may show very good overall performance by over-predicting H and C and under-predicting E, which can make such predictors biologically inapplicable. In this work we are motivated to develop a balanced SS predictor by incorporating 33 physicochemical properties into 15-tuple peptides via Chou's general PseAAC, which allowed us to obtain higher accuracies in predicting all three SS components. Our approach uses three different support vector machines for binary classification of the major classes and then forms an optimized multiclass predictor using a genetic algorithm (GA). The three trained binary SVMs are E versus non-E (i.e., E/¬E), C/¬C and H/¬H. This GA-optimized, combined three-class predictor, called cSVM, is further combined with SPINE X to form the proposed final balanced predictor, called MetaSSPred. This novel paradigm assists us in optimizing precision and recall. We prepared two independent test datasets (CB471 and N295) to compare the performance of our predictors with SPINE X. MetaSSPred significantly increases beta accuracy (QE) for both datasets. The QE scores of MetaSSPred on CB471 and N295 were 71.7% and 74.4% respectively. These scores are 20.9% and 19.0% improvements over the QE scores given by SPINE X alone on the CB471 and N295 datasets respectively. The standard deviations of the accuracies across the three SS classes of MetaSSPred on the CB471 and N295 datasets were 4.2% and 2.3% respectively. For SPINE X, on the other hand, these values are 12.9% and 10.9% respectively. These findings suggest that the proposed MetaSSPred is a well-balanced SS predictor compared to the state-of-the-art SPINE X predictor.
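
The imbalance described above can be made concrete with a small sketch that computes per-class accuracies (QH, QE, QC), the overall accuracy, and the spread across classes. The labels below are invented, the function names are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not state which variant was used.

```python
from statistics import pstdev


def per_class_accuracy(true_ss, pred_ss, classes="HEC"):
    """Return, for each class, the fraction of its residues predicted correctly."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("true and predicted strings must have equal length")
    accuracy = {}
    for label in classes:
        positions = [i for i, t in enumerate(true_ss) if t == label]
        if not positions:
            raise ValueError(f"no residues of class {label} in the true labels")
        accuracy[label] = sum(pred_ss[i] == label for i in positions) / len(positions)
    return accuracy


def overall_accuracy(true_ss, pred_ss):
    """Return the fraction of all residues predicted correctly."""
    return sum(t == p for t, p in zip(true_ss, pred_ss)) / len(true_ss)


# Invented example: E is rare and mostly mispredicted as C.
true_ss = "HHHHHHHH" + "EEEE" + "CCCCCCCC"
pred_ss = "HHHHHHHH" + "ECCC" + "CCCCCCCC"

acc = per_class_accuracy(true_ss, pred_ss)
q3 = overall_accuracy(true_ss, pred_ss)
spread = pstdev(list(acc.values()))  # assumption: population standard deviation
print(acc, q3, spread)
```

Here the overall accuracy looks high even though most E residues are missed, so the spread across the three per-class accuracies is a more informative measure of balance than the overall score alone.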


Subjects
Computational Biology/methods, Protein Secondary Structure, Proteins/chemistry, Algorithms, Protein Databases, HIV Protease/chemistry, Internet, Probability, Reproducibility of Results, Protein Sequence Analysis, Support Vector Machine
10.
PLoS One ; 10(10): e0141551, 2015.
Article in English | MEDLINE | ID: mdl-26517719

ABSTRACT

Intrinsically disordered proteins, or regions, perform important biological functions through their dynamic conformations during binding. Thus, accurate identification of these disordered regions has significant implications for proper annotation of function, induced fold prediction and drug design to combat critical diseases. We introduce DisPredict, a disorder predictor that employs a single support vector machine with an RBF kernel and novel features for reliable characterization of protein structure. DisPredict yields effective performance. In addition to 10-fold cross-validation, training and testing of DisPredict were conducted with independent test datasets. The results were consistent, with both training and test errors minimal. The use of multiple data sources makes the predictor generic. The datasets used in developing the model include disordered regions of various lengths, categorized as short and long, with different compositions and different types of disorder, ranging from fully to partially disordered regions as well as completely ordered regions. Through comparison with other state-of-the-art approaches and case studies, DisPredict is found to be a useful tool with competitive performance. DisPredict is available at https://github.com/tamjidul/DisPredict_v1.0.
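
For readers unfamiliar with the RBF kernel mentioned above, the sketch below evaluates its standard definition, K(x, y) = exp(-gamma * ||x - y||^2). The feature vectors and the value of `gamma` are hypothetical; this shows only the kernel itself, not DisPredict's features or its trained model.

```python
import math


def rbf_kernel(x, y, gamma):
    """Return exp(-gamma * squared Euclidean distance between x and y)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical 2-dimensional feature vectors and kernel width.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5))  # identical vectors
print(rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5))  # squared distance of 2
```

The kernel equals 1 for identical vectors and decays toward 0 as they move apart, with `gamma` controlling how quickly similarity falls off.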


Subjects
Algorithms, Intrinsically Disordered Proteins/chemistry, Support Vector Machine, Amino Acid Sequence, Area Under Curve, X-Ray Crystallography, Datasets as Topic, Probability, Protein Secondary Structure, ROC Curve
11.
J Theor Biol ; 380: 380-91, 2015 Sep 07.
Article in English | MEDLINE | ID: mdl-26092374

ABSTRACT

Accurate prediction of real-valued accessible surface area (ASA) from protein sequence alone has wide application in bioinformatics and computational biology. ASA has been helpful in understanding the 3-dimensional structure and function of a protein, acting as a high-impact feature in secondary structure prediction, disorder prediction, binding region identification and fold recognition applications. To enhance and support broad applications of ASA, we have attempted to improve the prediction accuracy of absolute accessible surface area by developing a new predictor paradigm, namely REGAd(3)p, for real-value prediction through classical Exact Regression with Regularization and a polynomial kernel of degree 3, which was further optimized using a Genetic Algorithm. Because ASA contributes to effective energy functions, we were motivated to improve the accuracy of predicted ASA for better energy function application. Our ASA prediction paradigm was trained and tested using a new benchmark dataset, proposed in this work, consisting of 1001 and 298 protein chains, respectively. We achieved a maximum Pearson Correlation Coefficient (PCC) of 0.76, a 1.45% improvement in PCC over the existing top-performing predictor, SPINE-X, in ASA prediction on the independent test set. Furthermore, we modeled the error between actual and predicted ASA in terms of energy and combined this energy linearly with the energy function 3DIGARS, which resulted in an effective energy function, namely 3DIGARS2.0, outperforming all the state-of-the-art energy functions. Based on the Rosetta and Tasser decoy sets, 3DIGARS2.0 resulted in 80.78%, 73.77%, 141.24%, 16.52%, and 32.32% improvements over DFIRE, RWplus, dDFIRE, GOAP and 3DIGARS respectively.
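
The Pearson Correlation Coefficient used above to score real-valued ASA predictions can be computed as in the following sketch. The function name `pearson` and the ASA values are hypothetical and are not taken from the paper's benchmark.

```python
import math


def pearson(x, y):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical actual and predicted ASA values (in square angstroms) for six residues.
actual_asa = [12.0, 85.5, 40.2, 110.3, 5.1, 63.7]
predicted_asa = [20.4, 70.1, 55.0, 95.8, 15.3, 50.2]

print(f"PCC = {pearson(actual_asa, predicted_asa):.3f}")
```

PCC measures how well predicted values track actual values linearly, independent of scale, so a predictor can score well on PCC while still being systematically offset in absolute ASA.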


Subjects
Theoretical Models, Surface Properties, Amino Acids/chemistry, Molecular Structure