Results 1 - 4 of 4
1.
Braz. arch. biol. technol ; 64: e21200118, 2021. tab, graf
Article in English | LILACS | ID: biblio-1339316

ABSTRACT

This paper develops a reduced distance matrix to improve computational performance when clustering protein interactions. The proposed matrix takes two alpha carbon atoms from a protein structure as centroids and stores the distances between these centroids and the other atoms of the same structure. Each row of the matrix represents a database record and each column is a distance value. Clusters were then computed from this matrix using K-Means clustering. The precision and performance of the technique were compared with aCSM, RID, and another distance matrix methodology that considers the distances between all atoms of each protein structure. The reduced distance matrix achieved high precision and the best computational performance of the methods compared.
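As a rough illustration of the idea in this abstract, the following minimal sketch builds one row of a reduced distance matrix for a single structure. The function names, the toy coordinates, and the way the two centroid atoms are chosen are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math


def reduced_distance_row(atoms, centroid_a, centroid_b):
    """Distances from two centroid atoms to every other atom.

    atoms: list of (x, y, z) coordinates for one structure.
    centroid_a, centroid_b: indices of the two alpha carbons used as
    centroids (how they are selected is an assumption here).
    """
    row = []
    for c in (centroid_a, centroid_b):
        for i, atom in enumerate(atoms):
            if i in (centroid_a, centroid_b):
                continue  # skip the centroid atoms themselves
            row.append(math.dist(atoms[c], atom))
    return row


def full_distance_count(n_atoms):
    """Number of pairwise distances in an all-atoms distance matrix."""
    return n_atoms * (n_atoms - 1) // 2


# Toy structure with four atoms; atoms 0 and 1 play the role of centroids.
atoms = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0), (1.0, 0.0, 0.0)]
row = reduced_distance_row(atoms, 0, 1)
```

In this sketch the row grows linearly with the number of atoms (two distances per non-centroid atom), whereas an all-atoms matrix grows quadratically, which is consistent with the performance gain the abstract reports.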


Subjects
Protein Interaction Maps , Carbon , Cluster Analysis , Computing Methodologies
2.
Biomed Res Int ; 2015: 394157, 2015.
Article in English | MEDLINE | ID: mdl-25811026

ABSTRACT

Repetitive element sequences are adjacent, repeating patterns, also called motifs, and can be of different lengths; repetitions can involve exact or approximate copies. They have been widely used as molecular markers in population biology. Given the sizes of sequenced genomes, various bioinformatics tools have been developed for the extraction of repetitive elements from DNA sequences. However, currently available tools do not provide options for identifying repetitive elements in the genome or proteome, displaying a user-friendly web interface, and performing exhaustive searches. ProGeRF is a web site for extracting repetitive regions from genome and proteome sequences. It was designed to be an efficient, fast, accurate, and above all user-friendly web tool that offers many ways to view and analyse the results. ProGeRF (Proteome and Genome Repeat Finder) is freely available as a stand-alone program, from which users can download the source code, and as a web tool. It was developed using a hash table approach to extract perfect and imperfect repetitive regions from a (multi)FASTA file in linear time.
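To make the hash table idea concrete, here is a minimal sketch that indexes every fixed-length substring (k-mer) of a sequence in a dictionary and reports those seen more than once. It is not ProGeRF's algorithm: the function name and the choice of k are hypothetical, and it only finds perfect repeats, not the imperfect ones the tool also handles.

```python
from collections import defaultdict


def repeated_kmers(sequence, k):
    """Map each k-mer that occurs more than once to its start positions.

    A single pass over the sequence fills a hash table, so the number of
    dictionary operations is linear in the sequence length for a fixed k.
    """
    positions = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        positions[sequence[i:i + k]].append(i)
    # Keep only the k-mers that actually repeat.
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


repeats = repeated_kmers("ACGACGACGTT", 3)
# "ACG" starts at positions 0, 3 and 6, i.e. three adjacent copies.
```

Adjacent start positions separated by exactly k, as for "ACG" here, indicate a tandem repeat; a fuller tool would merge such hits into regions and tolerate mismatches.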


Subjects
Algorithms , Genome , Proteome/chemistry , Nucleic Acid Repetitive Sequences , Software , Genetic Loci , Merozoite Surface Protein 1/chemistry , Nucleotides/genetics , Protozoan Proteins/chemistry
3.
Article in English | MEDLINE | ID: mdl-21519118

ABSTRACT

A large number of unclassified sequences are still found in public databases, which suggests that new investigations in the area are still needed. In this contribution, we present a methodology based on Artificial Neural Networks for protein functional classification. A new protein coding scheme, called here Extended-Sequence Coding by Sliding Windows, is presented with the goal of overcoming some of the difficulties of the well-known method Sequence Coding by Sliding Windows. The new protein coding scheme uses more than one sliding window length with a weight factor that is proportional to the window length, avoiding the ambiguity problem without ignoring the identity of small subsequences. Accuracy for Sequence Coding by Sliding Windows ranged from 60.1 to 77.7 percent for the first bacterium protein set and from 61.9 to 76.7 percent for the second one, whereas the accuracy for the proposed Extended-Sequence Coding by Sliding Windows scheme ranged from 70.7 to 97.1 percent for the first bacterium protein set and from 61.1 to 93.3 percent for the second one. Additionally, protein sequences classified inconsistently by the Artificial Neural Networks were analyzed by CD-Search, revealing that there is some disagreement in public repositories and calling attention to the relevant issue of error propagation in annotated databases due to incorrectly transferred annotations.
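The coding scheme can be pictured with a small sketch that counts subsequences under several window lengths and weights each count by the window length. The abstract gives no details on the feature layout, normalization, or the window lengths used, so the function name, the default lengths (1 and 2), and the use of the window length itself as the weight are all assumptions made for illustration.

```python
from collections import Counter


def extended_sliding_window_features(sequence, window_lengths=(1, 2)):
    """Weighted subsequence counts over several sliding window lengths.

    Each occurrence of a subsequence of length w adds w to its feature,
    i.e. the weight is proportional to the window length (assumed factor 1).
    """
    features = Counter()
    for w in window_lengths:
        for i in range(len(sequence) - w + 1):
            features[sequence[i:i + w]] += w
    return features


feats = extended_sliding_window_features("MKKL")
```

To feed a neural network with a fixed number of inputs, these counts would still have to be laid out over a fixed vocabulary of subsequences, for example all 20 single residues followed by all 400 residue pairs.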


Subjects
Computational Biology/methods , Proteins/chemistry , Proteins/classification , Protein Databases , Molecular Sequence Annotation/methods , Computer Neural Networks
4.
Genet. mol. biol ; 27(4): 673-678, Dec. 2004. ilus, tab, graf
Article in English | LILACS | ID: lil-391246

ABSTRACT

A new scheme is described for representing proteins of different lengths (in number of amino acids) so that they can be presented to Artificial Neural Networks (ANNs) with a fixed number of inputs for speel-out classification. K-Means clustering of the new vectors, with subsequent classification, was then possible after first applying the dimension reduction technique Principal Component Analysis. The new representation scheme was applied to a set of 112 antigen sequences from several parasitic helminths, selected from the National Center for Biotechnology Information and classified into four different groups. This bioinformatic tool permitted the establishment of a good correlation with domains that are already well characterized, regardless of the differences between the sequences that were confirmed by the PFAM database. Additionally, sequences were grouped according to their similarity, confirmed by hierarchical clustering using ClustalW.
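The clustering step mentioned in this abstract can be sketched with a plain K-Means loop over fixed-length vectors. This is a minimal illustration only: the Principal Component Analysis step is omitted, the initialization (first k points) and iteration count are arbitrary choices, and the toy 2-D points stand in for the protein vectors, which the abstract does not describe in detail.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with a naive initialization (first k points)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave the center unchanged if the cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy fixed-length vectors forming two well-separated groups.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0), (1.0, 0.0)]
labels, centers = kmeans(points, 2)
```

Because every vector must have the same number of components for the distance computation, a fixed-length representation of variable-length proteins is a prerequisite for this kind of clustering.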


Subjects
Animals , Helminth Antigens , Computational Biology , Artificial Intelligence , Cluster Analysis , Computer Neural Networks