Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 4 de 4
Filtrar
1.
Mol Cell Proteomics ; 11(7): M111.016808, 2012 Jul.
Artículo en Inglés | MEDLINE | ID: mdl-22415040

RESUMEN

Identification of protein structural neighbors to a query is fundamental in structure and function prediction. Here we present BS-align, a systematic method to retrieve backbone string neighbors from primary sequences as templates for protein modeling. The backbone conformation of a protein is represented by the backbone string, as defined in Ramachandran space. The backbone string of a query can be accurately predicted by two innovative technologies: a knowledge-driven sequence alignment and encoding of a backbone string element profile. Then, the predicted backbone string is employed to align against a backbone string database and retrieve a set of backbone string neighbors. The backbone string neighbors were shown to be close to native structures of query proteins. BS-align was successfully employed to predict models of 10 membrane proteins with lengths ranging between 229 and 595 residues, and whose high-resolution structural determinations were difficult to elucidate both by experiment and prediction. The obtained TM-scores and root mean square deviations of the models confirmed that the models based on the backbone string neighbors retrieved by the BS-align were very close to the native membrane structures although the query and the neighbor shared a very low sequence identity. The backbone string system represents a new road for the prediction of protein structure from sequence, and suggests that the similarity of the backbone string would be more informative than describing a protein as belonging to a fold.


Asunto(s)
Algoritmos , Biología Computacional/métodos , Proteínas de la Membrana/química , Secuencia de Aminoácidos , Bases de Datos de Proteínas , Humanos , Modelos Moleculares , Datos de Secuencia Molecular , Conformación Proteica , Proteus mirabilis , Alineación de Secuencia , Análisis de Secuencia de Proteína , Homología de Secuencia de Aminoácido , Homología Estructural de Proteína
2.
BMC Bioinformatics ; 11: 109, 2010 Feb 27.
Artículo en Inglés | MEDLINE | ID: mdl-20187963

RESUMEN

BACKGROUND: Recent advances in proteomics technologies such as SELDI-TOF mass spectrometry has shown promise in the detection of early stage cancers. However, dimensionality reduction and classification are considerable challenges in statistical machine learning. We therefore propose a novel approach for dimensionality reduction and tested it using published high-resolution SELDI-TOF data for ovarian cancer. RESULTS: We propose a method based on statistical moments to reduce feature dimensions. After refining and t-testing, SELDI-TOF data are divided into several intervals. Four statistical moments (mean, variance, skewness and kurtosis) are calculated for each interval and are used as representative variables. The high dimensionality of the data can thus be rapidly reduced. To improve efficiency and classification performance, the data are further used in kernel PLS models. The method achieved average sensitivity of 0.9950, specificity of 0.9916, accuracy of 0.9935 and a correlation coefficient of 0.9869 for 100 five-fold cross validations. Furthermore, only one control was misclassified in leave-one-out cross validation. CONCLUSION: The proposed method is suitable for analyzing high-throughput proteomics data.


Asunto(s)
Neoplasias Ováricas/clasificación , Proteómica/métodos , Espectrometría de Masa por Láser de Matriz Asistida de Ionización Desorción/métodos , Biomarcadores de Tumor/análisis , Femenino , Perfilación de la Expresión Génica , Humanos
3.
Biochimie ; 95(2): 354-8, 2013 Feb.
Artículo en Inglés | MEDLINE | ID: mdl-23116714

RESUMEN

Protein-DNA interactions are involved in many biological processes essential for gene expression and regulation. To understand the molecular mechanisms of protein-DNA recognition, it is crucial to analyze and identify DNA-binding residues of protein-DNA complexes. Here, we proposed a novel descriptor shape string and another two related features shape string PSSM and shape string pair composition to characterize DNA-binding residues. We employed the new features and the position-specific scoring matrix (PSSM) for modeling and prediction. The results of a benchmark dataset showed that our approach significantly improved the accuracy of the predictor. The overall accuracy of our approach reached 85.86% with 85.02% sensitivity and 86.02% specificity. The results also demonstrated that shape string is a powerful descriptor for the prediction of DNA-binding residues. The additional two related features enhanced the predictive value.


Asunto(s)
Algoritmos , ADN/química , Posición Específica de Matrices de Puntuación , Proteínas/química , Programas Informáticos , Secuencia de Aminoácidos , Sitios de Unión , Bases de Datos de Proteínas , Modelos Moleculares , Datos de Secuencia Molecular , Unión Proteica , Dominios y Motivos de Interacción de Proteínas , Sensibilidad y Especificidad
4.
Biochimie ; 94(3): 847-53, 2012 Mar.
Artículo en Inglés | MEDLINE | ID: mdl-22182488

RESUMEN

Mycobacterium, the most common disease-causing genus, infects billions of people and is notoriously difficult to treat. Understanding the subcellular localization of mycobacterial proteins can provide essential clues for protein function and drug discovery. In this article, we present a novel approach that focuses on local sequence information to identify localization motifs that are generated by a merging algorithm and are selected based on a binomially distributed model. These localization motifs are employed as features for identifying the subcellular localization of mycobacterial proteins. Our approach provides more accurate results than previous methods and was tested on an independent dataset recently obtained from an experimental study to provide a first and reasonably accurate prediction of subcellular localization. Our approach can also be used for large-scale prediction of new protein entries in the UniportKB database and of protein sequences obtained experimentally. In addition, our approach identified many local motifs involved with the subcellular localization that also interact with the environment. Thus, our method may have widespread applications both in the study of the functions of mycobacterial proteins and in the search for a potential vaccine target for designing drugs.


Asunto(s)
Proteínas Bacterianas/metabolismo , Biología Computacional/métodos , Mycobacterium/metabolismo , Algoritmos
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA