1.
Genome Res ; 21(11): 1981-94, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21824995

ABSTRACT

The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions.


Subjects
Protein Folding , Proteome/chemistry , Animals , Chorismate Mutase/chemistry , Deinococcus/metabolism , Deinococcus/radiation effects , Drosophila Proteins/chemistry , Genome , Glucosyltransferases/chemistry , Humans , Mice , Molecular Sequence Annotation , Nuclear Proteins/chemistry , Nuclear Proteins/classification , Plasmodium vivax/metabolism , Protein Conformation , Tertiary Protein Structure , Protozoan Proteins/chemistry , Quality Control , Reproducibility of Results , Transglutaminases/chemistry , User-Computer Interface
2.
Bioinformatics ; 26(5): 705-7, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20089515

ABSTRACT

MOTIVATION: Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used, resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, the NCBI Taxonomy database and Gene Ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. AVAILABILITY: The database can be accessed through http://proteinworlddb.org
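
For readers unfamiliar with the dynamic programming approach named in this abstract, the following is a minimal sketch of Smith-Waterman local alignment that returns a score and the alignment coordinates, the two quantities the abstract says are stored. It is an illustration only, not the project's implementation: the function name `smith_waterman` and the scoring scheme (fixed match and mismatch scores with a linear gap penalty) are assumptions made here, and protein comparisons in practice typically use a substitution matrix and affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment of strings a and b.

    Returns (score, (start_a, end_a), (start_b, end_b)) with 0-based,
    end-exclusive coordinates. The scoring parameters are illustrative
    assumptions, not values taken from the abstract.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, end_i, end_j = 0, 0, 0

    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            if H[i][j] > best:
                best, end_i, end_j = H[i][j], i, j

    # Trace back from the best cell until the score drops to zero
    i, j = end_i, end_j
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1

    return best, (i, end_i), (j, end_j)


if __name__ == "__main__":
    print(smith_waterman("TTACG", "ACGGG"))
```

The table has one cell per pair of residues, so time and memory grow with the product of the two sequence lengths. That quadratic cost per pair is why an all-against-all comparison of millions of sequences calls for distributed computing.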


Subjects
Protein Databases , Genomics/methods , Proteins/chemistry , Sequence Alignment/methods , Software , Algorithms , Genome , Phylogeny , Proteins/genetics
3.
IEEE Trans Inf Technol Biomed ; 13(4): 636-44, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19369162

ABSTRACT

Breast cancer accounts for about 30% of all cancers and 15% of cancer deaths in women. Advances in computer-assisted analysis hold promise for classifying subtypes of disease and improving prognostic accuracy. We introduce a grid-enabled decision support system for performing automatic analysis of imaged breast tissue microarrays. To date, we have processed more than 100,000 digitized specimens (1200 x 1200 pixels each) on IBM's World Community Grid (WCG). As a part of the Help Defeat Cancer (HDC) project, we have analyzed the data returned from WCG along with retrospective patient clinical profiles for a subset of 3744 breast tissue samples, and have reported the results in this paper. Texture-based features were extracted from the digitized specimens, and isometric feature mapping was applied to achieve nonlinear dimension reduction. Iterative prototyping and testing were performed to classify several major subtypes of breast cancer. Overall, the most reliable approach was gentle AdaBoost using an eight-node classification and regression tree as the weak learner. Using the proposed algorithm, a binary classification accuracy of 89% and a multiclass accuracy of 80% were achieved. Throughout the course of the experiments, only 30% of the dataset was used for training.
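
As a rough illustration of the boosting scheme named in this abstract, the sketch below implements gentle AdaBoost for binary labels in plain Python. It is not the authors' system: the abstract describes an eight-node classification and regression tree as the weak learner, whereas this sketch substitutes a one-split regression stump to stay short, and all function names (`fit_stump`, `gentle_adaboost`, `predict`) and the toy data are hypothetical.

```python
import math


def fit_stump(X, y, w):
    """Weighted least-squares regression stump.

    Returns (feature, threshold, left_value, right_value). Assumes at
    least one feature takes two distinct values in X.
    """
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        # Candidate thresholds are midpoints between adjacent values,
        # so both sides of every split are non-empty.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            wl = sum(wi for row, wi in zip(X, w) if row[f] <= t)
            wr = sum(wi for row, wi in zip(X, w) if row[f] > t)
            # Each side predicts the weighted mean of its labels
            left = sum(wi * yi for row, yi, wi in zip(X, y, w) if row[f] <= t) / wl
            right = sum(wi * yi for row, yi, wi in zip(X, y, w) if row[f] > t) / wr
            err = sum(
                wi * (yi - (left if row[f] <= t else right)) ** 2
                for row, yi, wi in zip(X, y, w)
            )
            if best is None or err < best[0]:
                best = (err, f, t, left, right)
    if best is None:
        raise ValueError("no feature has two distinct values")
    return best[1:]


def stump_predict(stump, row):
    f, t, left, right = stump
    return left if row[f] <= t else right


def gentle_adaboost(X, y, rounds=10):
    """Fit an additive model of stumps; labels in y must be -1 or +1."""
    n = len(X)
    w = [1.0 / n] * n
    stumps = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        stumps.append(stump)
        # Increase the weight of samples the new stump fits poorly
        w = [wi * math.exp(-yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return stumps


def predict(stumps, row):
    """Class label from the sign of the summed stump outputs."""
    score = sum(stump_predict(s, row) for s in stumps)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [3.0]]  # toy one-feature data (hypothetical)
    y = [-1, -1, 1, 1]
    model = gentle_adaboost(X, y, rounds=5)
    print([predict(model, row) for row in X])
```

Unlike discrete AdaBoost, each round here adds a real-valued regression fit instead of a weighted vote of hard class labels, which keeps individual updates bounded when the labels are -1 or +1.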


Subjects
Breast Neoplasms/pathology , Computer-Assisted Diagnosis/methods , Computer-Assisted Image Interpretation/methods , Tissue Array Analysis/methods , Algorithms , Breast Neoplasms/metabolism , Cluster Analysis , Female , Histocytochemistry , Humans , Retrospective Studies