1.
J Mol Model ; 24(4): 76, 2018 Mar 02.
Article in English | MEDLINE | ID: mdl-29500695

ABSTRACT

In this paper, diversity and conservation in the 'landscape' of random variation of protein tertiary structures are explored for quantitative feature-vector models of major types of functionally important 3D structural motifs. For this, I have deployed a recently developed nonparametric regression (NPR)-based multidimensional copula method of simulation. Apart from improved accuracy of multidimensional random sample generation, the simulation provides additional insight into diversity in the protein structural landscape in terms of random variation in the feature-vector. It shows the relative importance of several features, with biological implications, in conservation of motifs. Mapping of this landscape in distance-preserving 2D eigenspace also shows consistency in demarcation of different motif classes and preservation of their characteristic patterns in this 2D space.


Subject(s)
Amino Acid Motifs; Models, Molecular; Protein Structure, Tertiary; Proteins/chemistry; Algorithms; Amino Acid Sequence; Conserved Sequence; Evolution, Molecular; Genetic Variation; Proteins/genetics
2.
J Mol Model ; 20(1): 2077, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24464316

ABSTRACT

A quantitative feature-vector representation/model of tertiary structural motifs of proteins is presented. Multiclass logistic regression and a probabilistic neural network were employed to apply this representation to large data sets in order to classify them into major families of distinct motif types (including those of functional importance) with high statistical confidence. Scatter plots of random samples of these motifs were obtained through two-dimensional transformation of the feature vector by metric MDS (multidimensional scaling). The plots showed distinct clusters and shapes for different families and demonstrated the relevance and importance of the proposed quantitative feature-vector representation for characterizing protein tertiary structural motifs. The relative importance of the features was analyzed. The scope of the present work to investigate Nature's prioritization and optimization of functional motif structures is highlighted.


Subject(s)
Proteins/chemistry; Algorithms; Computer Simulation; Logistic Models; Models, Molecular; Neural Networks, Computer; Protein Structure, Secondary; Protein Structure, Tertiary
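
The two-dimensional transformation of feature vectors by metric MDS mentioned in this abstract can be illustrated with a minimal sketch. It uses classical (Torgerson) MDS, one common metric variant, which is not necessarily the procedure used in the paper. The function name `classical_mds_2d`, the power-iteration eigen-solver, and the assumption of a symmetric, Euclidean-like distance matrix are all illustrative choices, not details taken from the source.

```python
import math
import random


def classical_mds_2d(dist, iters=500, seed=0):
    """Embed n points in 2D from a symmetric n x n matrix of pairwise distances.

    Hypothetical helper: assumes the distances are (close to) Euclidean.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Double centering turns squared distances into a Gram (inner-product) matrix.
    gram = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0, 0.0] for _ in range(n)]
    for axis in range(2):
        # Power iteration approximates the leading eigenpair of the matrix.
        v = [rng.random() - 0.5 for _ in range(n)]
        eigval = 0.0
        for _ in range(iters):
            w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                eigval = 0.0  # no variance left along this axis
                break
            v = [x / norm for x in w]
            eigval = norm
        for i in range(n):
            coords[i][axis] = math.sqrt(eigval) * v[i]
        # Deflate so the next pass converges to the second eigenvector.
        for i in range(n):
            for j in range(n):
                gram[i][j] -= eigval * v[i] * v[j]
    return coords
```

For points that already lie in a plane, the embedding reproduces the input distances up to rotation and reflection; for higher-dimensional feature vectors it keeps only the two directions of largest variance.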
3.
J Mol Model ; 18(6): 2741-54, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22116606

ABSTRACT

Predictive classification of major structural families and fold types of proteins is investigated using logistic regression. Quantitative feature-vector representations of tertiary structures with only five to seven dimensions are found to be adequate. Results for a benchmark sample of non-homologous proteins from the SCOP database are presented. The importance of this work compared with homology modeling and the best-known quantitative approaches is highlighted.


Subject(s)
Computer Simulation; Principal Component Analysis; Proteins/chemistry; Algorithms; Data Interpretation, Statistical; Logistic Models; Models, Molecular; Protein Folding; Protein Structure, Secondary; Protein Structure, Tertiary
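
As a rough illustration of classifying low-dimensional feature vectors with logistic regression, the sketch below fits a multiclass (softmax) model by batch gradient descent. The function names, learning rate, epoch count, and toy data are assumptions made for this example; the abstract does not describe the fitting procedure or the features themselves.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_softmax(xs, ys, n_classes, lr=0.05, epochs=2000):
    """Fit weights (one row per class, bias stored last) by gradient descent."""
    dim = len(xs[0]) + 1
    w = [[0.0] * dim for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * dim for _ in range(n_classes)]
        for x, y in zip(xs, ys):
            xb = list(x) + [1.0]  # append constant 1 for the bias term
            p = softmax([sum(wk[d] * xb[d] for d in range(dim)) for wk in w])
            for k in range(n_classes):
                err = p[k] - (1.0 if k == y else 0.0)
                for d in range(dim):
                    grad[k][d] += err * xb[d]
        for k in range(n_classes):
            for d in range(dim):
                w[k][d] -= lr * grad[k][d] / len(xs)
    return w


def predict(w, x):
    """Return the index of the class with the highest score."""
    xb = list(x) + [1.0]
    scores = [sum(wk[d] * xb[d] for d in range(len(xb))) for wk in w]
    return max(range(len(w)), key=lambda k: scores[k])
```

A five- to seven-dimensional feature vector would be passed in exactly the same way as the two-dimensional toy points used to check the sketch.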
4.
Protein Pept Lett ; 17(10): 1198-206, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20450488

ABSTRACT

Characteristic peptides of protein segments sharing common secondary folds are obtained for the I-sites library using maximal position-specific probability scores. The secondary structures of these peptides are predicted using two best-known computational methods and validated with significant accuracy against the corresponding motifs. The characteristic peptides also match those computed using a Bayesian modeling approach with Markov chain Monte Carlo simulation. The percentage representation of the characteristic peptides in protein structural and functional families shows some interesting results with potential applications in protein structural genomics.


Subject(s)
Amino Acid Motifs; Models, Molecular; Peptides/chemistry; Amino Acid Sequence; Computer Simulation; Molecular Sequence Data; Peptides/genetics; Protein Structure, Secondary
5.
Protein Pept Lett ; 16(11): 1393-8, 2009.
Article in English | MEDLINE | ID: mdl-20001928

ABSTRACT

We have attempted to find common sequence patterns among protein antigens. For this, we applied the Gibbs multiple motif sampler to the set of all non-redundant antigenic sequences available in curated databanks. Several sequence motifs are obtained when the amino acids are represented according to their similarity clusters. Significantly high proportions of known B-cell epitope sites are found within or adjacent to these motifs, indicating the possibility of linear epitope signatures. These findings may have important applications in the synthesis of peptide vaccines. A predictive example is presented.


Subject(s)
Algorithms; Amino Acid Motifs; Computational Biology/methods; Epitopes, B-Lymphocyte/chemistry; Amino Acid Sequence; Antigens/chemistry; Antigens/genetics; Antigens/metabolism; Databases, Protein; Epitopes, B-Lymphocyte/genetics; Epitopes, B-Lymphocyte/metabolism; Molecular Sequence Data; Reproducibility of Results
7.
Protein Pept Lett ; 14(6): 518-27, 2007.
Article in English | MEDLINE | ID: mdl-17627590

ABSTRACT

The catalytic regions of protein kinases are known to have similar primary chains. However, it is not known whether there is a signature profile specific to a particular catalytic region, or whether such a profile, if any, is unique to a protein kinase family in a particular species or group of species. We have analyzed some of these aspects by statistical data mining of an authentic and exhaustive database of protein kinases. The results reveal interesting features and suggest new directions for their applications.


Subject(s)
Amino Acid Motifs; Catalytic Domain; Protein Kinases/chemistry; Protein Kinases/metabolism; Protein Subunits/chemistry; Amino Acid Sequence; Analysis of Variance; Computational Biology; Databases, Protein; Molecular Sequence Data; Protein Subunits/metabolism
8.
Protein Pept Lett ; 14(6): 536-42, 2007.
Article in English | MEDLINE | ID: mdl-17627593

ABSTRACT

The design and synthesis of peptide vaccines is of significant pharmaceutical importance. A knowledge-based statistical model is fitted here to predict the binding of an antigenic site of a protein, or a B-cell epitope, to a CDR (complementarity-determining region) of an immunoglobulin. Linear analogues of the 3D structure of the epitopes are computed using this model. An extension for predicting peptide epitopes from the protein sequence alone is also presented. Validation results show the promising potential of this approach in computer-aided peptide vaccine production. The computed binding probabilities also provide a pioneering approach for ab initio prediction of the 'potency' of protein or peptide vaccines modeled by this method.


Subject(s)
Computational Biology/methods; Epitope Mapping/methods; Models, Immunological; Models, Statistical; Vaccines, Subunit/immunology; Algorithms; Complementarity Determining Regions/immunology; Epitopes, B-Lymphocyte/immunology; Knowledge Bases; Logistic Models; Peptides/immunology; Protein Binding; Reproducibility of Results; Sequence Alignment; Vaccines, Subunit/chemistry
9.
J Mol Model ; 13(1): 275-82, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17028865

ABSTRACT

Identification of structural domains in uncharacterized protein sequences is important in the prediction of protein tertiary folds and functional sites, and hence in designing biologically active molecules. We present a new predictive computational method for classifying a protein as having a single domain, two continuous domains, or two discontinuous domains using Bayesian data mining. The algorithm requires only the primary sequence and computer-predicted secondary structure. It incorporates correlation patterns between certain 3-dimensional motifs and some local helical folds found conserved in the vicinity of protein domains with high statistical confidence. This computationally simple and fast method predicts the domain class with good accuracy: average accuracies are 83.3% for single-domain, 60% for two-continuous-domain, and 65.7% for two-discontinuous-domain proteins. Experiments on the large validation sample show its performance to be significantly better than that of DGS and DomSSEA. Computations of Bayesian probabilities reveal important features in terms of the correlation of certain conserved patterns of secondary folds and tertiary motifs, and give new insight. Applications for improving the accuracy of predicting domain boundary points relevant to protein structural and functional modeling are also highlighted.


Subject(s)
Proteomics/methods; Algorithms; Amino Acid Motifs; Bayes Theorem; Computational Biology; Databases, Protein; Models, Molecular; Molecular Conformation; Predictive Value of Tests; Protein Conformation; Protein Structure, Tertiary; Software
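
The kind of Bayesian probability computation referred to in this abstract can be sketched with Bayes' theorem over the three domain classes. This is only an illustration under a naive independence assumption between observed patterns. The class labels, the pattern name, and all probabilities in any call to it are hypothetical and are not taken from the paper.

```python
def class_posteriors(priors, likelihoods, observed):
    """Posterior P(class | observed patterns), assuming independent patterns.

    priors: {class: P(class)}
    likelihoods: {class: {pattern: P(pattern present | class)}}
    observed: list of pattern names seen in the protein
    """
    joint = {}
    for cls, prior in priors.items():
        p = prior
        for pattern in observed:
            p *= likelihoods[cls][pattern]
        joint[cls] = p
    total = sum(joint.values())
    # Normalise so the posteriors over all classes sum to 1.
    return {cls: p / total for cls, p in joint.items()}
```

The predicted domain class would then be the one with the largest posterior.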
10.
Protein Pept Lett ; 13(6): 587-93, 2006.
Article in English | MEDLINE | ID: mdl-16842114

ABSTRACT

Hypervariability of the complementarity-determining regions in the characteristic structure of immunoglobulins, and the distinct, cell-specific expression of the genes coding for this important class of proteins, pose intriguing problems in experimental and computational/informatics research that require a special approach different from those used for other proteins. We present here an average-linkage hierarchical clustering of the Homo sapiens VDJ genes and the immunoglobulin polypeptides generated by them, using a special kind of data structure and correlation matrices in place of microarray data. The results reveal interesting clues about the heterogeneity of exon-intron locations in these gene families and its possible role in the hypervariability of the immunoglobulins.


Subject(s)
Gene Rearrangement, B-Lymphocyte/genetics; Genes, Immunoglobulin/genetics; Immunoglobulins/genetics; Chromosome Mapping; Exons/genetics; Genome, Human; Humans; Immunoglobulins/chemistry; Introns/genetics; Multigene Family
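
Average-linkage hierarchical clustering driven by a correlation matrix can be sketched as follows. The conversion from correlation to distance (d = 1 - r), the function names, and the stopping rule (a fixed number of clusters) are assumptions made for this example; the abstract does not specify how its data structures or correlation matrices were built.

```python
def correlation_to_distance(corr):
    """Assumed conversion: highly correlated items are close (d = 1 - r)."""
    return [[1.0 - r for r in row] for row in corr]


def average_linkage(dist, n_clusters):
    """Agglomerative clustering; returns a sorted list of sorted index lists."""
    clusters = [[i] for i in range(len(dist))]

    def avg_dist(a, b):
        # Average linkage: mean of all pairwise distances between two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > max(n_clusters, 1):
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = avg_dist(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        merged = sorted(clusters[p] + clusters[q])
        # Replace the two closest clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return sorted(clusters)
```

Recording the merge order and merge distances instead of stopping at a fixed cluster count would give the full dendrogram.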
11.
J Mol Model ; 12(6): 943-52, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16649034

ABSTRACT

We have found certain conserved motifs and secondary structural patterns present in the vicinity of interior domain boundary points (dbps) by a data-driven approach without any a priori constraint on the type and number of such features, and without any requirement of sequence homology. We have used these motifs and patterns to rerank the solutions obtained by the well-known domain guess by size (DGS) algorithm. We predict, overall, five solutions. Our method [domain boundary prediction using conserved patterns (DPCP)] improves the average accuracy of the top five solutions of DGS from 71.74 to 82.88% for two-continuous-domain proteins, and from 21.38 to 80.56% for two-discontinuous-domain proteins. Considering only the top solution, the gains in accuracy are from 0 to 72.74% for two-continuous-domain proteins with chain lengths up to 300 residues, and from 0 to 62.85% for those with up to 400 residues. In the case of discontinuous domains, top_min solutions (the minimum number of solutions required for predicting all dbps of a protein) of DPCP improve the average accuracy of DGS prediction from 12.5 to 76.3% in proteins with chain lengths up to 300 residues, and from 13.33 to 70.84% for proteins with up to 400 residues. In our validation experiments, the performance of DPCP was also found to be superior to that of domain identification from secondary structure element alignment (DomSSEA), the best method reported so far for efficient prediction of domain boundaries using predicted secondary structure. The average accuracies of the topmost solution of DomSSEA are 61 and 52% for proteins with up to 300 and 400 residues, respectively, in the case of continuous domains; the corresponding accuracies for the discontinuous case are 28 and 21%.


Subject(s)
Conserved Sequence; Models, Molecular; Proteins/chemistry; Amino Acid Motifs; Protein Structure, Tertiary
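
The top-one and top-five accuracies quoted in this abstract can be read as top-k accuracies over ranked boundary predictions. The sketch below shows one way such a figure might be computed. The hit criterion (every true boundary within a fixed residue window of some top-k prediction), the default `tolerance` of 20 residues, and the function names are assumptions for illustration and are not stated in the source.

```python
def top_k_hit(predicted, true_boundaries, k, tolerance=20):
    """True if every true boundary lies within `tolerance` residues
    of at least one of the first k ranked predictions (assumed criterion)."""
    top = predicted[:k]
    return all(any(abs(p - t) <= tolerance for p in top)
               for t in true_boundaries)


def top_k_accuracy(cases, k, tolerance=20):
    """Percentage of proteins whose boundaries are recovered in the top k.

    cases: list of (ranked_predictions, true_boundaries) pairs.
    """
    hits = sum(top_k_hit(pred, true, k, tolerance) for pred, true in cases)
    return 100.0 * hits / len(cases)
```

Under this reading, reranking helps whenever it moves a near-correct boundary into the first k positions, even if the underlying set of candidate solutions is unchanged.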
12.
J Mol Model ; 11(6): 481-8, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16094534

ABSTRACT

PROPAINOR is a new algorithm developed for ab initio prediction of the 3D structures of proteins using knowledge-based nonparametric multivariate statistical methods. This algorithm is found to be most efficient in terms of computational simplicity and prediction accuracy for single-domain proteins as compared to other ab initio methods. In this paper, we have used the algorithm for the atomic structure prediction of a multi-domain (two-domain) calcium-binding protein, whose solution structure has been deposited in the PDB recently (PDB ID: 1JFK). We have studied the sensitivity of the predicted structure to NMR distance restraints with their incorporation as an additional input. Further, we have compared the predicted structures in both these cases with the NMR derived solution structure reported earlier. We have also validated the refined structure for proper stereochemistry and favorable packing environment with good results and elucidated the role of the central linker.


Subject(s)
Calcium-Binding Proteins/chemistry; Calcium-Binding Proteins/metabolism; Calcium/chemistry; Calcium/metabolism; Animals; Cations, Divalent/chemistry; Computational Biology; Databases, Protein; EF Hand Motifs; Entamoeba histolytica/chemistry; Entamoeba histolytica/metabolism; Internet; Magnetic Resonance Spectroscopy; Models, Molecular; Protein Structure, Tertiary
13.
J Altern Complement Med ; 10(5): 879-89, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15650478

ABSTRACT

OBJECTIVE: To compute quantitative estimates of the tridosha--the qualitative characterization that constitutes the core of diagnosis and treatment in Ayurveda--to provide a basis for biostatistical analysis of this ancient Indian science, which is a promising field of alternative medicine. SUBJECTS: The data sources were 280 persons from among the residents and visitors/training students at the Brahmvarchas Research Centre and Shantikuj, Hardwar, India. DESIGN/METHODOLOGY: A quantitative measure of the tridosha level (for vata, pitta, and kapha) is obtained by applying an algorithmic heuristic approach to the exhaustive list of qualitative features/factors that are commonly used by Ayurvedic doctors. A knowledge-based concept of worth coefficients and fuzzy multiattribute decision functions are used here for regression modeling. VALIDATION AND APPLICATIONS: Statistical validation on a large sample shows the accuracy of this study's estimates with statistical confidence level above 90%. The estimates are also suited for diagnostic and prognostic applications and systematic drug-response analysis of Ayurvedic (herbal and rasayanam) medicines. An application with regard to the former is elucidated, extensions of which might also be of use in investigating the role of nadis in Ayurvedic healing vis-a-vis acupuncture and acupressure techniques. The importance and scope of this novel approach are discussed. CONCLUSIONS: This pioneering study shows that the concept of tridosha has a sound empirical basis that could be used for the scientific establishment of Ayurveda in a new light.


Subject(s)
Biometry; Complementary Therapies/statistics & numerical data; Medicine, Ayurvedic; Algorithms; Humans; Qi; Regression Analysis
14.
Comput Biol Chem ; 27(3): 241-52, 2003 Jul.
Article in English | MEDLINE | ID: mdl-12927100

ABSTRACT

We have formulated the ab initio prediction of the 3D structure of proteins as a probabilistic programming problem in which the inter-residue 3D distances are treated as random variables. Lower and upper bounds for these random variables and the corresponding probabilities are estimated by nonparametric statistical methods and knowledge-based heuristics. In this paper we focus on the probabilistic computation of the 3D structure using these distance interval estimates. Validation of the predicted structures shows our method to be more accurate than other computational methods reported so far. Our method is also found to be computationally more efficient than other existing ab initio structure prediction methods. Moreover, we provide a reliability index for the predicted structures. Because of its computational simplicity and its applicability to any random sequence, our algorithm, called PROPAINOR (PROtein structure Prediction by AI and Nonparametric Regression), has significant scope in computational protein structural genomics.


Subject(s)
Algorithms; Proteins/chemistry; Proteomics/methods; Computational Biology/methods; Models, Theoretical; Molecular Structure; Protein Folding; Protein Structure, Secondary; Protein Structure, Tertiary; Reproducibility of Results; Stereoisomerism; Structure-Activity Relationship
15.
J Mol Model ; 8(2): 50-7, 2002 Feb.
Article in English | MEDLINE | ID: mdl-12032598

ABSTRACT

Human seminal plasma prostatic inhibin (HSPI) is a protein isolated from the human prostate gland. Despite its profound biomedical and biotechnological importance, the 3D structure of this protein of 94 amino acids remains undeciphered. The difficulties in extracting it in pure form and crystallizing it have restrained the determination of its structure experimentally. The homology-based computational methods are also not applicable, as HSPI lacks sufficient sequence homology with known structures in the protein data banks. We have predicted the structure of HSPI by a knowledge-based method using nonparametric multivariate statistical techniques. Stereochemical and other standard validation tests confirm this to be a well-refined structure. The biophysical properties exhibited by this structure are in good agreement with the NMR experimental observations. Docking and other computational studies on this structure provide significant explanation and insight into its binding activities and related biological and immunogenic functions and offer new directions for its potential applications.


Subject(s)
Models, Molecular; Prostatic Secretory Proteins/chemistry; Binding Sites; Binding Sites, Antibody; Follicle Stimulating Hormone/physiology; Humans; Immunoglobulin G/chemistry; Immunoglobulin G/immunology; Ligands; Prolactin/chemistry; Prolactin/metabolism; Prostatic Secretory Proteins/immunology; Prostatic Secretory Proteins/physiology; Protein Structure, Secondary; Sequence Analysis, Protein; Statistics, Nonparametric