Results 1 - 11 of 11
1.
Evol Bioinform Online ; 14: 1176934318777755, 2018.
Article in English | MEDLINE | ID: mdl-29977111

ABSTRACT

In this article, we propose a 3-dimensional graphical representation of protein sequences based on 10 physicochemical properties of the 20 amino acids and the BLOSUM62 matrix. It contains evolutionary information and provides intuitive visualization. To further analyze the similarity of proteins, we extract a specific vector from the graphical representation curve. The vector is used to calculate the similarity distance between 2 protein sequences. To demonstrate the effectiveness of our approach, we apply it to 3 real data sets. The results are consistent with the known evolutionary relationships and show that our method is effective in phylogenetic analysis.

2.
Comput Biol Med ; 57: 1-7, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25486446

ABSTRACT

BACKGROUND: The graphical mapping of a protein sequence is more difficult than the graphical mapping of a DNA sequence because of the twenty amino acids and their complicated physicochemical properties. Nevertheless, graphical mapping of protein sequences has attracted many researchers, who have developed mapping methods based on several physicochemical properties. In this article, a new mapping method for protein sequences is developed by considering additional physicochemical properties, which is a simple and effective approach. METHODS: Based on the 12 major physicochemical properties of amino acids and the PCA method, we propose a simple and intuitive 2D graphical mapping method for protein sequences. Next, we extract a 20D vector from the graphical mapping, which is used to characterize a protein sequence. RESULTS: The proposed graphical mapping has three important properties: it is one-to-one, it has no circuit, and it gives good visualization. This mapping contains more physicochemical information. Next, the proposed method is applied to two separate applications. The results illustrate the utility of the proposed method. DISCUSSION: To validate the proposed method, we first give a comparison of protein sequences, which consists of nine ND6 proteins. The similarity/dissimilarity matrix for the nine ND6 proteins correctly reveals their evolutionary relationship. Next, we give another application for the cluster analysis of HA genes of influenza A (H1N1) isolates. The results are consistent with the known evolution of the H1N1 virus. The separate applications further illustrate the utility of the proposed method.


Subjects
Computational Biology/methods; Proteins/chemistry; Sequence Analysis, Protein/methods; Amino Acid Sequence; Animals; Cluster Analysis; Humans; Mammals; Principal Component Analysis
3.
Comput Biol Med ; 42(10): 975-81, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22902300

ABSTRACT

According to the repetition structure patterns of single nucleotides, we propose a novel digital representation method to characterize primary DNA sequences. Based on this representation, we give a new RP-SP (repeat and space) vector to compute the distance between different sequences. The examination of similarities/dissimilarities among different sequences illustrates the utility of the proposed RP-SP vector distance. Then, we use the proposed RP-SP vector method to analyze two groups of genomes, 15 E. coli genomes and 31 mitochondrial genomes. For comparison, we also apply other alignment-free methods to the two groups of genomes. The results show that the proposed method can distinguish characteristics of different genomes and can be used to reconstruct the phylogenetic tree of different genomes.


Subjects
Base Sequence; Genomics/methods; Repetitive Sequences, Nucleic Acid; Sequence Analysis, DNA/methods; Animals; Databases, Genetic; Genome, Bacterial; Genome, Mitochondrial; Humans; Phylogeny; Sequence Alignment
4.
Comput Biol Med ; 42(5): 556-63, 2012 May.
Article in English | MEDLINE | ID: mdl-22325072

ABSTRACT

Based on the Huffman tree method, we propose a new 2D graphic representation of protein sequences. This representation can completely avoid loss of information in the transfer of data from a protein sequence to its graphic representation. The method consists of two parts. The first assigns 0-1 codes to the 20 amino acids by a Huffman tree built from amino acid frequencies, where the frequency of an amino acid is defined as its count in the analyzed protein sequences. The second constructs the 2D graphic representation of a protein sequence from the 0-1 codes. Then the applications of the method to ten ND5 genes and seven Escherichia coli strains are presented in detail. The results show that the proposed model may provide some new insights into the evolution patterns determined from protein sequences and complete genomes.


Subjects
Computer Graphics; Proteins/chemistry; Amino Acid Sequence; Computer-Aided Design; Molecular Sequence Data
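The coding step described in the abstract of entry 4 (0-1 codes for amino acids from a Huffman tree built on residue counts) can be illustrated with a minimal sketch. This is a generic Huffman construction, not the authors' implementation: the function names `huffman_codes`, `encode`, and `decode`, the tie-breaking rule, and the toy sequences are assumptions made for illustration, and the 2D plotting step of the paper is not reproduced.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(sequences):
    """Build 0-1 codes for the residues in `sequences` from their counts."""
    freq = Counter("".join(sequences))
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct residue still needs a one-bit code.
        return {next(iter(freq)): "0"}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(n, next(tie), {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; prefix their codes with 0 and 1.
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


def encode(sequence, codes):
    """Concatenate the 0-1 code of every residue."""
    return "".join(codes[residue] for residue in sequence)


def decode(bits, codes):
    """Invert `encode`; works because Huffman codes are prefix-free."""
    inverse = {c: s for s, c in codes.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)


# Toy (hypothetical) protein fragments, used only to exercise the functions.
toy = ["MKVLAAGIVGL", "MKTLLAAGLVA"]
codes = huffman_codes(toy)
bits = encode(toy[0], codes)
```

Because the codes are prefix-free, `decode(bits, codes)` recovers the original sequence, which is the sense in which such a coding loses no information.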
5.
J Comput Chem ; 32(15): 3233-40, 2011 Nov 30.
Article in English | MEDLINE | ID: mdl-21953557

ABSTRACT

On the basis of the Huffman coding method, we propose a new graphical representation of DNA sequences. The representation can avoid degeneracy and loss of information in the transfer of data from a DNA sequence to its graphical representation. Then a multicomponent vector from the representation is introduced to quantitatively characterize DNA sequences. The components of the vector are derived from the graphical representation of the DNA primary sequence. The examination of similarities and dissimilarities among the complete coding sequences of the β-globin gene of 11 species and six ND6 proteins shows the utility of the scheme.


Subjects
Computer Graphics; Sequence Analysis, DNA/methods; Base Sequence; DNA/analysis; DNA/genetics; Methods; Proteins/genetics
6.
J Theor Biol ; 280(1): 10-8, 2011 Jul 07.
Article in English | MEDLINE | ID: mdl-21496459

ABSTRACT

We introduce a weighted graph model to investigate the self-similarity characteristics of eubacteria genomes. Similarity comparisons of genomes usually aim to determine the evolutionary distance among different genomes. Few studies focus on the overall statistical characteristics of each gene compared with other genes in the same genome. In our model, each genome is represented by a weighted graph whose topology describes the similarity relationship among genes in the same genome. Based on weighted graph theory, we extract some quantified statistical variables from the topology and give the distribution of some variables derived from the largest social structure in the topology. The 23 eubacteria recently studied by Sorimachi and Okayasu are clearly classified into two different groups by their double logarithmic point-plots describing the similarity relationship among genes of the largest social structure in the genome. The results show that the proposed model may provide some new insights into the structures and evolution patterns determined from complete genomes.


Subjects
Bacteria/genetics; Genome, Bacterial/genetics; Models, Genetic
7.
J Theor Biol ; 272(1): 26-34, 2011 Mar 07.
Article in English | MEDLINE | ID: mdl-21163267

ABSTRACT

Graphical techniques have become powerful tools for the visualization and analysis of complicated biological systems. However, we cannot give such a graphical representation in a 2D/3D space when the dimensions of the represented data are more than three dimensions. The proposed method, a combination dimensionality reduction approach (CDR), consists of two parts: (i) principal component analysis (PCA) with a newly defined parameter ρ and (ii) locally linear embedding (LLE) with a proposed graphical selection for its optional parameter k. The CDR approach with ρ and k not only avoids loss of principal information, but also sufficiently well preserves the global high-dimensional structures in low-dimensional space such as 2D or 3D. The applications of the CDR on characteristic analysis at different codon positions in genome show that the method is a useful tool by which biologists could find useful biological knowledge.


Subjects
Bacteria/genetics; Codon/genetics; Genome, Bacterial/genetics; Multifactor Dimensionality Reduction/methods; Base Composition/genetics; Databases, Genetic; Nucleotides/genetics; Principal Component Analysis; Software; Staphylococcus aureus/genetics
8.
J Theor Biol ; 263(2): 227-36, 2010 Mar 21.
Article in English | MEDLINE | ID: mdl-20025888

ABSTRACT

We introduce a new approach to investigate the problem of DNA sequence alignment. The method consists of three parts: (i) a simple alignment algorithm, (ii) an extension algorithm for the largest common substring, and (iii) a graphical simple alignment tree (GSA tree). The approach first obtains a graphical representation of the scores of DNA sequences by the scoring equation $R_0 R - S_0 S - T_0 (a + bk)$. Then a GSA tree is constructed to facilitate solving the problem of global alignment of 2 DNA sequences. Finally, we give several practical examples to illustrate the utility and practicality of the approach.


Subjects
DNA/genetics; Sequence Alignment/methods; Algorithms; Base Sequence; DNA/chemistry
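The abstract of entry 8 mentions an extension algorithm for the largest common substring but gives no details. As a rough illustration of the underlying problem only, the sketch below finds a longest common substring of two DNA sequences with a standard dynamic-programming table. It is a textbook technique swapped in for illustration, not the authors' extension algorithm, and the function name `longest_common_substring` is an assumption.

```python
def longest_common_substring(a, b):
    """Return one longest substring shared by `a` and `b` (O(len(a) * len(b)))."""
    best_len, best_end = 0, 0
    # prev[j] holds the length of the common suffix of a[:i-1] and b[:j].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous base of both sequences.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Toy (hypothetical) sequences, used only to exercise the function.
shared = longest_common_substring("ACGTTGCA", "TTACGTTA")
```

A shared block found this way could serve as an anchor around which the remaining flanking regions are aligned; how the paper actually uses it is not stated in the abstract.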
9.
J Theor Biol ; 260(1): 104-9, 2009 Sep 07.
Article in English | MEDLINE | ID: mdl-19481099

ABSTRACT

We introduce a new approach to investigate the dual nucleotide compositions of 11 Gram-positive and 12 Gram-negative eubacteria recently studied by Sorimachi and Okayasu. The approach first obtains a set of 16-dimensional vectors of dual nucleotides by the PN-curve from the complete genome of an organism. Each vector of the set corresponds to a single gene of the genome. Then we reduce the 16-dimensional vector set to 2 dimensions by principal components analysis (PCA). The reduction avoids the possible loss of information from averaging all 16-dimensional vectors. Then we suggest a 2D graphical representation based on the 2-dimensional vectors to investigate the classification patterns among different organisms.


Subjects
DNA, Bacterial/genetics; Gram-Negative Bacteria/genetics; Gram-Positive Bacteria/genetics; Genome, Bacterial; Gram-Negative Bacteria/classification; Gram-Positive Bacteria/classification; Principal Component Analysis
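To make the idea of one 16-dimensional dual-nucleotide vector per gene in entry 9 concrete, the sketch below counts overlapping dinucleotide frequencies for a single gene. This is plain frequency counting, assumed here as a stand-in: the paper derives its vectors from a PN-curve, which the abstract does not define, and the names `DINUCLEOTIDES` and `dinucleotide_vector` are hypothetical. The PCA step is not shown.

```python
from itertools import product

# The 16 ordered pairs AA, AC, ..., TT fix the coordinate order of the vector.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_vector(gene):
    """Return the 16 overlapping dinucleotide frequencies of `gene`."""
    gene = gene.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(gene) - 1):
        pair = gene[i:i + 2]
        if pair in counts:  # skip pairs containing ambiguous bases such as N
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 16
    return [counts[d] / total for d in DINUCLEOTIDES]


# Toy (hypothetical) gene fragment: pairs are AC, CG, GT, TA, AC.
vector = dinucleotide_vector("ACGTAC")
```

Stacking one such vector per gene gives a genes-by-16 matrix, which is the kind of input a PCA projection to 2 dimensions would take.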
10.
Comput Biol Med ; 39(4): 388-91, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19261267

ABSTRACT

Based on a digital signal method, we propose a new representation of the DNA primary sequence. The representation can completely avoid loss of information in the transfer of data from a DNA sequence to its mathematical representation. We then suggest an approach to quantify similarities based on digital signal similarity theory. The examination of similarities/dissimilarities among the coding sequences of the first exon of the beta-globin gene of 11 species shows the utility of the scheme.


Subjects
Base Sequence; Computational Biology/methods; Sequence Analysis, DNA/methods; beta-Globins/genetics; Algorithms; Animals; Computer Simulation; DNA/genetics; Exons; Humans; Models, Theoretical; Nucleic Acid Conformation; Reproducibility of Results; Species Specificity; beta-Globins/chemistry
11.
J Theor Biol ; 249(4): 681-90, 2007 Dec 21.
Article in English | MEDLINE | ID: mdl-17931659

ABSTRACT

We introduce a 3D graphical representation of DNA sequences based on the pairs of dual nucleotides (DNs). Based on this representation, we consider some mathematical invariants and construct two 16-component vectors associated with these invariants. The vectors are used to characterize and compare the complete coding sequence part of beta globin gene of nine different species. The examination of similarities/dissimilarities illustrates the utility of the approach.


Subjects
DNA/genetics; Models, Genetic; Nucleic Acid Conformation; Animals; Base Sequence; Computational Biology/methods; Globins/genetics; Humans; Molecular Sequence Data; Species Specificity