1.
J Mol Evol ; 91(1): 93-131, 2023 02.
Article in English | MEDLINE | ID: mdl-36587178

ABSTRACT

The rapid growth of genome sequence data has become one of the emerging areas of study in bioinformatics. It has created a strong demand for advanced methods of inferring evolutionary relationships among species. Alignment-free methods have proved more efficient in time and space than existing alignment-based methods for sequence analysis. In this study, a new alignment-free genome sequence comparison technique is proposed based on the biochemical properties of nucleotides. Each genome sequence is distributed over four parameters to form a 21-dimensional numerical descriptor using the Positional Matrix. To substantiate the proposed method, phylogenetic trees are constructed on viral and mammalian datasets by applying the UPGMA/NJ clustering methods. Further, the results of this method are compared with those of the Feature Frequency Profiles method, the Positional Correlation Natural Vector method, the Graph-theoretic method, the Multiple Encoding Vector method, and the Fuzzy Integral Similarity method. In most cases, the present method is found to produce more accurate results than the prior methods. The execution time of the present method is also comparatively small.


Subject(s)
Algorithms; Genome; Animals; Phylogeny; Genome/genetics; Nucleotides/genetics; Computational Biology/methods; Sequence Analysis, DNA/methods; Mammals/genetics
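
Several records in this list build phylogenetic trees from a distance matrix with UPGMA. As a rough illustration of that clustering step only, here is a minimal pure-Python sketch. The function name `upgma`, the nested-tuple tree format, and the toy matrix are assumptions made for this example; it is not the implementation used in any of the listed articles, and it omits branch lengths.

```python
def upgma(labels, matrix):
    """Cluster taxa with UPGMA (average linkage) and return a nested tuple.

    labels: list of taxon names.
    matrix: symmetric list-of-lists of pairwise distances (hypothetical input).
    """
    # Each active cluster id maps to (subtree, number of leaves).
    clusters = {i: (labels[i], 1) for i in range(len(labels))}
    # Pairwise distances between active clusters, keyed as (smaller_id, larger_id).
    d = {(i, j): matrix[i][j]
         for i in range(len(labels)) for j in range(i + 1, len(labels))}
    next_id = len(labels)
    while len(clusters) > 1:
        a, b = min(d, key=d.get)          # closest pair of clusters
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        del d[(a, b)]
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            # Size-weighted average keeps distances equal to the mean over leaf pairs.
            d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Toy distances, invented for illustration only.
toy = [[0, 2, 6, 10],
       [2, 0, 6, 10],
       [6, 6, 0, 10],
       [10, 10, 10, 0]]
tree = upgma(["A", "B", "C", "D"], toy)
```

In practice the matrix would hold distances between the numerical descriptors of the genomes being compared.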
2.
Genomics ; 112(6): 4701-4714, 2020 11.
Article in English | MEDLINE | ID: mdl-32827671

ABSTRACT

Methods of finding sequence similarity play a significant role in computational biology. Owing to the rapid increase of genome sequences in public databases, inferring the evolutionary relationships of species has become more challenging. Traditional alignment-based methods are unsuitable because they are time-consuming. It is therefore necessary to find a faster method that applies to species phylogeny. In this paper, a new graph-theory-based alignment-free sequence comparison method is proposed. A complete bipartite graph is used to represent each genome sequence based on its nucleotide triplets. A vector descriptor is then formed from the weights of the edges of the graph. Finally, the phylogenetic tree is drawn using the UPGMA algorithm. The datasets used for comparison cover mammals, viruses, and bacteria. In most cases, the phylogeny obtained by the present method is found to be more satisfactory than that of earlier methods.


Subject(s)
Computational Biology; Sequence Analysis, DNA/methods; Algorithms; Animals; Bacteria/genetics; Mammals/genetics; Nucleotides/genetics; Phylogeny; Viruses/genetics
3.
Genomics ; 110(5): 263-273, 2018 09.
Article in English | MEDLINE | ID: mdl-29180261

ABSTRACT

Several proteins and genes are members of families that share a common evolutionary history. To outline evolutionary relationships and recognize conserved patterns, sequence comparison has emerged as a key process. The current work critically investigates the role of the k-mer in the composition vector method for comparing genome sequences. Generally, composition vector methods are applied with different choices of k. For some values of k the results are satisfactory, but for others they are not. In the proposed work, the standard composition vector method is carried out using a 3-mer string length. In addition, a special type of information-based similarity index is used as a distance measure. The work establishes that using 3-mers with the information-based similarity index provides satisfactory results in all cases, especially for comparison of whole genome sequences. These selections provide a kind of unified approach to the comparison of genome sequences.


Subject(s)
Algorithms; Genomics/methods; Sequence Alignment/methods; Animals; Humans; Sequence Alignment/standards
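
To make the idea of a fixed-length 3-mer vector concrete, the following sketch counts overlapping 3-mers and normalizes them into a 64-dimensional frequency vector. It is a plain frequency profile under stated assumptions, not the article's composition vector: the function names are hypothetical, no background correction is applied, and the Manhattan distance is only a stand-in for the information-based similarity index, whose definition is not given in the abstract.

```python
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGT"):
    """Return relative frequencies of all k-mers, in lexicographic order.

    Windows containing characters outside the alphabet (e.g. 'N') are skipped.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:
            counts[word] += 1
            total += 1
    # An empty or too-short sequence yields an all-zero vector.
    return [counts[w] / total if total else 0.0 for w in kmers]


def manhattan(u, v):
    """Placeholder distance; the article uses a different, information-based index."""
    return sum(abs(a - b) for a, b in zip(u, v))


# Sequences of unequal length still map to vectors of the same length (64).
v1 = kmer_frequencies("ACGTACGT")
v2 = kmer_frequencies("ACGTTTGACCA")
distance = manhattan(v1, v2)
```

Because every sequence maps to the same 64 coordinates, pairwise distances can be collected into a matrix and passed to any tree-building method.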
4.
J Biomol Struct Dyn ; : 1-13, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38698728

ABSTRACT

To unravel the intricate connection between protein function and protein structure, it is imperative to comprehensively evaluate protein secondary structure similarity from various perspectives. While numerous techniques have been suggested for comparing protein secondary structure elements (SSE), there remains a substantial need for alternative ways of comparing them. In this paper, Topology of Protein Structure (TOPS) representations of protein secondary structures are used to offer a new alignment-free method for evaluating similarities and dissimilarities of protein secondary structures. Initially, a two-dimensional numerical representation of the SSE is created, associating each point with a mass reflecting its frequency of occurrence. The means of the coordinate values are then determined by averaging weighted sums, and these mean values are subsequently used to calculate moments of inertia. Next, a four-component descriptor is generated from the eigenvalues of the matrix and the mean values of the represented coordinates. Thereafter, the Manhattan distance measure is used to obtain the distance matrix, which is finally used to obtain phylogenetic trees with the NJ method. The SSE considered in the proposed method comprise 36 elements from the Chew-Kedem database, giving five different taxa: globin, alpha-beta, tim-barrel, beta, and alpha. Phylogenetic trees were created for these SSE with several methods (Clustal-Omega, LZ-Complexity, SED, TOPS+, and TOC) to facilitate comparative analysis. The phylogenetic tree of the proposed method outperformed the results of the previous methods when applied to the same SSE. The method therefore effectively constructs phylogenetic trees for comparing protein secondary structures. Communicated by Ramaswamy H. Sarma.
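
The abstract above outlines a pipeline of weighted means, moments of inertia, eigenvalues, and Manhattan distance. The sketch below shows one plausible reading of those steps for weighted 2D points. The function names, the choice of the standard 2x2 inertia tensor about the weighted centroid, and the ordering of the four components are assumptions for illustration; the article's exact matrix and TOPS-based coordinates are not given in the abstract.

```python
import math


def inertia_descriptor(points, masses):
    """Return (smaller eigenvalue, larger eigenvalue, mean x, mean y).

    points: list of (x, y) coordinates; masses: matching list of weights.
    """
    total = sum(masses)
    # Mass-weighted means of the coordinates.
    mx = sum(m * x for (x, _), m in zip(points, masses)) / total
    my = sum(m * y for (_, y), m in zip(points, masses)) / total
    # Entries of the 2x2 inertia tensor about the weighted centroid.
    ixx = sum(m * (y - my) ** 2 for (_, y), m in zip(points, masses))
    iyy = sum(m * (x - mx) ** 2 for (x, _), m in zip(points, masses))
    ixy = -sum(m * (x - mx) * (y - my) for (x, y), m in zip(points, masses))
    # Closed-form eigenvalues of the symmetric matrix [[ixx, ixy], [ixy, iyy]].
    half_trace = (ixx + iyy) / 2
    radius = math.hypot((ixx - iyy) / 2, ixy)
    return (half_trace - radius, half_trace + radius, mx, my)


def manhattan(u, v):
    """Sum of absolute coordinate differences between two descriptors."""
    return sum(abs(a - b) for a, b in zip(u, v))


# Two toy point sets, invented for illustration only.
d1 = inertia_descriptor([(0, 0), (2, 0)], [1, 1])
d2 = inertia_descriptor([(0, 0), (2, 0), (1, 3)], [1, 1, 2])
dist = manhattan(d1, d2)
```

Computing `manhattan` for every pair of descriptors gives the distance matrix that a tree-building method such as NJ would take as input.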

5.
J Biomol Struct Dyn ; : 1-29, 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37885236

ABSTRACT

In the field of computational biology, genome sequence comparison among different species is essential and has applications in both research and applied science. Owing to their lengthy processing time and the large number of data sets, alignment-based approaches are unsuitable and ineffective. Alignment-free techniques have therefore gained popularity for obtaining proper sequence clustering and evolutionary relationships among species. In this paper, a complete bipartite graph based Positional difference and Frequency (PdF) vector descriptor is introduced. Two parameters, Positional difference and Frequency, are applied to the genome sequence to create a 16-dimensional vector descriptor using the di-nucleotide representation of the genome sequence. Subsequently, a distance matrix is calculated to construct phylogenetic trees for different data sets of mammals and viruses. The outcomes are compared with the phylogenetic trees of earlier methods, namely the FFP method, the ClustalW method, the MEV method, the PCNV method, and the FIS method. In most instances, the proposed method produces more precise outcomes than the preceding techniques and has potential for genome sequence comparison on data sets of both equal and unequal length. Communicated by Ramaswamy H. Sarma.

6.
Gene ; 730: 144257, 2020 Mar 10.
Article in English | MEDLINE | ID: mdl-31759983

ABSTRACT

Genetic sequence analysis, classification of genome sequences, and inference of evolutionary relationships between species from their biological sequences are emerging research domains in bioinformatics. Several methods have already been applied to DNA sequence comparison under tri-nucleotide representation. In this paper, a new form of tri-nucleotide representation is proposed for sequence comparison. The comparison does not depend on the alignment of the sequences. In this representation, the biochemical properties of the nucleotides are considered. The novelty of this method is that sequences of unequal lengths are represented by vectors of the same length, and each tri-nucleotide formed from the given sequence has its own unique representation. To validate the proposed method, it is tested on several data sets related to mammals, viruses, and bacteria. The results of this method are further compared with those obtained by the probabilistic method, the natural vector method, the Fourier power spectrum method, the multiple encoding vector method, and the feature frequency profiles method. Moreover, this method produces accurate phylogeny in all the cases. It is also shown that the time complexity of the present method is lower.


Subject(s)
Nucleotides/chemistry; Sequence Analysis, DNA/methods; Trinucleotide Repeats/genetics; Algorithms; Animals; Bacteria/genetics; Base Sequence; Chromosome Mapping/methods; Cluster Analysis; Computational Biology/methods; Genomics/methods; Humans; Mammals/genetics; Nucleotides/genetics; Phylogeny; Sequence Alignment; Viruses/genetics