Results 1 - 17 of 17
1.
Int J Mol Sci ; 25(11), 2024 May 25.
Article in English | MEDLINE | ID: mdl-38891951

ABSTRACT

In the face of the SARS-CoV-2 pandemic, characterized by the virus's rapid mutation rates, developing timely and targeted therapeutic and diagnostic interventions presents a significant challenge. This study utilizes bioinformatic analyses to pinpoint conserved genomic regions within SARS-CoV-2, offering a strategic advantage in the fight against this and future pathogens. Our approach has enabled the creation of a diagnostic assay that is not only rapid, reliable, and cost-effective but also possesses a remarkable capacity to detect a wide array of current and prospective variants with unmatched precision. The significance of our findings lies in the demonstration that focusing on these conserved genomic sequences can significantly enhance our preparedness for and response to emerging infectious diseases. By providing a blueprint for the development of versatile diagnostic tools and therapeutics, this research paves the way for a more effective global pandemic response strategy.
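
The idea of pinpointing conserved genomic regions can be illustrated with a minimal sketch. It assumes the genomes are already aligned into equal-length rows and treats a column as conserved only when every row carries the same unambiguous base. The function name `conserved_windows` and the `min_len` parameter are hypothetical, and the abstract does not describe the authors' actual pipeline, so this is one plausible reading rather than their method.

```python
from typing import List, Tuple


def conserved_windows(aligned: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges where all rows agree.

    `aligned` holds equal-length rows of a multiple alignment. Columns that
    contain a gap ('-') or an ambiguous base ('N') count as not conserved.
    """
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned rows must have the same length")

    windows = []
    run_start = None
    for col in range(width + 1):  # one extra step closes a run at the end
        conserved = False
        if col < width:
            column = {row[col].upper() for row in aligned}
            conserved = len(column) == 1 and not (column & {"-", "N"})
        if conserved and run_start is None:
            run_start = col
        elif not conserved and run_start is not None:
            if col - run_start >= min_len:
                windows.append((run_start, col))
            run_start = None
    return windows


# Toy alignment: columns 4 and 8 vary between rows.
rows = ["ACGTACGTAA", "ACGTTCGTAA", "ACGTACGTCA"]
print(conserved_windows(rows, 3))
```

Windows that survive across many variants would be candidate targets for primers or probes; a real analysis would also need to tolerate sequencing errors and rare variants.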


Subject(s)
COVID-19 , Computational Biology , Conserved Sequence , Viral Genome , SARS-CoV-2 , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification , COVID-19/virology , COVID-19/epidemiology , Humans , Computational Biology/methods , Pandemics
2.
BMC Genomics ; 12(1): 273, 2011 May 31.
Article in English | MEDLINE | ID: mdl-21627783

ABSTRACT

BACKGROUND: The periodical occurrence of dinucleotides with a period of 10.4 bases is now undeniably a hallmark of nucleosome positioning. Whereas many eukaryotic genomes contain visible and even strong signals for periodic distribution of dinucleotides, the human genome is rather featureless in this respect. The exact sequence features in the human genome that govern the nucleosome positioning remain largely unknown. RESULTS: When analyzing the human genome sequence with the positional autocorrelation method, we found that only the dinucleotide CG shows the 10.4 base periodicity, which is indicative of the presence of nucleosomes. There is a high occurrence of CG dinucleotides that are either 31 (10.4 × 3) or 62 (10.4 × 6) base pairs apart from one another - a sequence bias known to be characteristic of Alu sequences. In a similar analysis with repetitive sequences removed, peaks of repeating CG motifs can be seen at positions 10, 21 and 31, the nearest integers of multiples of 10.4. CONCLUSIONS: Although the CG dinucleotides are dominant, other elements of the standard nucleosome positioning pattern are present in the human genome as well. The positional autocorrelation analysis of the human genome demonstrates that the CG dinucleotide is, indeed, one visible element of the human nucleosome positioning pattern, which appears both in Alu sequences and in sequences without repeats. The dominant role that CG dinucleotides play in organizing human chromatin is to indicate the involvement of human nucleosomes in tuning the regulation of gene expression and chromatin structure, which is very likely due to cytosine-methylation/-demethylation in CG dinucleotides contained in the human nucleosomes. This is further confirmed by the positions of CG-periodical nucleosomes on Alu sequences. Alu repeats appear as monomers, dimers and trimers, harboring two to six nucleosomes in a run.
Considering the exceptional role CG dinucleotides play in the nucleosome positioning, we hypothesize that Alu-nucleosomes, especially those that form tightly positioned runs, could serve as "anchors" in organizing the chromatin in human cells.
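
The positional autocorrelation mentioned in the RESULTS can be sketched as a histogram of distances between occurrences of one dinucleotide. This is a simplified reading of the technique, not the authors' code: the function names and the `max_dist` cutoff are assumptions, and a real analysis would normalize the counts against a background expectation.

```python
from collections import Counter
from typing import List


def dinucleotide_autocorrelation(seq: str, dinuc: str = "CG",
                                 max_dist: int = 70) -> Counter:
    """Count pairs of `dinuc` occurrences at each distance 1..max_dist."""
    seq = seq.upper()
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    counts = Counter()
    for idx, start in enumerate(positions):
        for later in positions[idx + 1:]:
            dist = later - start
            if dist > max_dist:
                break  # positions are sorted, so the rest are farther still
            counts[dist] += 1
    return counts


def expected_period_peaks(n: int, period: float = 10.4) -> List[int]:
    """Nearest integers to the first n multiples of the helical period."""
    return [round(period * k) for k in range(1, n + 1)]


# Toy sequence with CG at positions 0, 10 and 21.
toy = "CG" + "A" * 8 + "CG" + "A" * 9 + "CG"
print(sorted(dinucleotide_autocorrelation(toy).items()))
print(expected_period_peaks(3))
```

On genomic input, peaks in the histogram near the distances returned by `expected_period_peaks` would correspond to the 10, 21 and 31 base positions the abstract reports.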


Subject(s)
Alu Elements/physiology , Dinucleoside Phosphates/physiology , Nucleosomes/physiology , Chromatin Assembly and Disassembly , Computational Biology , Humans , DNA Sequence Analysis
3.
BMC Genomics ; 12: 203, 2011 Apr 21.
Article in English | MEDLINE | ID: mdl-21510861

ABSTRACT

BACKGROUND: Significant differences in G+C content between different isochore types suggest that the nucleosome positioning patterns in DNA of the isochores should be different as well. RESULTS: Extraction of the patterns from the isochore DNA sequences by Shannon N-gram extension reveals that while the general motif YRRRRRYYYYYR is characteristic for all isochore types, the dominant positioning patterns of the isochores vary between TAAAAATTTTTA and CGGGGGCCCCCG due to the large differences in G+C composition. This is observed in human, mouse and chicken isochores, demonstrating that the variations of the positioning patterns are largely G+C dependent rather than species-specific. The species-specificity of nucleosome positioning patterns is revealed by dinucleotide periodicity analyses in isochore sequences. While human sequences show CG periodicity, chicken isochores display AG (CT) periodicity. Mouse isochores show only very weak CG periodicity. CONCLUSIONS: The nucleosome positioning pattern revealed by Shannon N-gram extension depends strongly on G+C content and differs between isochores. Species-specificity of the pattern is subtle. It is reflected in the choice of preferentially periodical dinucleotides.
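
The relation between the two extreme patterns and the general motif can be checked directly by collapsing bases to purine (R) and pyrimidine (Y). The sketch below is only a worked illustration of that mapping and of G+C content; it does not implement Shannon N-gram extension, and the helper names are hypothetical.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def to_ry(seq: str) -> str:
    """Collapse a DNA string to the purine (R) / pyrimidine (Y) alphabet."""
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return "".join(out)


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty DNA string."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


for pattern in ("TAAAAATTTTTA", "CGGGGGCCCCCG"):
    print(pattern, to_ry(pattern), gc_fraction(pattern))
```

Both patterns collapse to YRRRRRYYYYYR while sitting at opposite ends of the G+C scale, which is the sense in which the abstract calls the variation G+C dependent.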


Subject(s)
Isochores/chemistry , Nucleosomes/chemistry , Animals , Base Composition , Chickens , Humans , Mice , Nucleoside Diphosphate Sugars/chemistry , Oligonucleotides/chemistry , Sequence Alignment
4.
J Theor Biol ; 260(3): 438-44, 2009 Oct 07.
Article in English | MEDLINE | ID: mdl-19591846

ABSTRACT

A novel approach for evaluation of sequence relatedness via a network over the sequence space is presented. This relatedness is quantified by graph theoretical techniques. The graph is perceived as a flow network, and flow algorithms are applied. The number of independent pathways between nodes in the network is shown to reflect structural similarity of corresponding protein fragments. These results provide an appropriate parameter for quantitative estimation of such relatedness, as well as reliability of the prediction. They also demonstrate a new potential for sequence analysis and comparison by means of the flow network in the sequence space.
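
Counting independent pathways with flow algorithms can be sketched as a unit-capacity maximum flow. The example assumes that "independent" means edge-disjoint and uses a plain breadth-first augmenting-path search (Edmonds-Karp); the abstract does not say which flow algorithm or which notion of independence the authors used, and the function name is hypothetical.

```python
from collections import defaultdict, deque
from typing import Hashable, Iterable, Tuple


def count_edge_disjoint_paths(edges: Iterable[Tuple[Hashable, Hashable]],
                              source: Hashable, sink: Hashable) -> int:
    """Maximum number of edge-disjoint paths between two nodes.

    The graph is undirected and every edge has capacity 1; an edge listed
    twice is treated as two parallel edges.
    """
    if source == sink:
        raise ValueError("source and sink must differ")
    capacity = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        capacity[u][v] += 1
        capacity[v][u] += 1

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            node = queue.popleft()
            for nxt, cap in capacity[node].items():
                if cap > 0 and nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        if sink not in parent:
            return flow
        # Push one unit of flow back along the path that was found.
        node = sink
        while parent[node] is not None:
            prev = parent[node]
            capacity[prev][node] -= 1
            capacity[node][prev] += 1
            node = prev
        flow += 1


square = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D"), ("B", "C")]
print(count_edge_disjoint_paths(square, "A", "D"))
```

In the setting of the abstract, nodes would be sequence fragments and edges would join similar fragments, so a higher count between two fragments suggests more robust relatedness.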


Subject(s)
Amino Acid Sequence , Chemical Models , Proteins/chemistry , Algorithms , Animals , Computational Biology/methods , Protein Databases , Electric Conductivity , Peptide Fragments/chemistry , Protein Sequence Analysis/methods
5.
Gene ; 408(1-2): 64-71, 2008 Jan 31.
Article in English | MEDLINE | ID: mdl-18022768

ABSTRACT

The formatted protein sequence space is built from identical-size fragments of prokaryotic proteins (112 complete proteomes). Connecting sequence-wise similar fragments (points in the space) results in the formation of numerous networks that sometimes combine different types of proteins which nevertheless share fragments with similar or distantly related sequences. The networks are mapped onto individual protein sequences, revealing distinct regions (modules) associated with prominent networks with well-defined functional identities. The presence of multiple sites of sequence conservation (modules) in a given protein sequence suggests that the annotated protein function may be decomposed into "elementary" subfunctions of the respective modules. The modules correspond to previously discovered conserved closed loop structures and their sequence prototypes.


Subject(s)
Protein Sequence Analysis , Amino Acid Motifs , Amino Acid Sequence , Animals , Computational Biology , Conserved Sequence , Protein Databases , Humans , Molecular Sequence Data , Protein Conformation , Protein Folding , Proteins/chemistry
6.
J Biomol Struct Dyn ; 26(2): 215-22, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18597543

ABSTRACT

To establish the possible function of a newly discovered protein, alignment of its sequence with other known sequences is required. When the similarity is marginal, the function remains uncertain. A fundamentally new approach is suggested: to use networks in the protein sequence space. The functionality of the protein is firmly established via networks forming chains of consecutive pair-wise matching fragments. Distant relatives are thus treated as relatives even though, in some cases, there is no sequence match at all between the ends of the chain, while the entire chain belongs to the same functional and structural network.


Subject(s)
Amino Acid Sequence , Proteins/genetics , Sequence Alignment/methods , Algorithms , Molecular Models , Molecular Sequence Data , Protein Conformation , Software
7.
Proteins ; 67(2): 271-84, 2007 May 01.
Article in English | MEDLINE | ID: mdl-17286283

ABSTRACT

A new method is proposed to reveal apparent evolutionary relationships between protein fragments with similar 3D structures by finding "intermediate" sequences in the proteomic database. Instead of looking for homologies and intermediates for a whole protein domain, we build a chain of intermediate short sequences, which allows one to link similar structural modules of proteins belonging to the same or different families. Several such chains of intermediates can be combined into an evolutionary tree of structural protein modules. All calculations were made for protein fragments of 20 aa residues. Three evolutionary trees for different module structures are described. The aim of the paper is to introduce the new method and to demonstrate its potential for protein structural predictions. The approach also opens new perspectives for protein evolution studies.


Subject(s)
Molecular Evolution , Proteins/chemistry , Protein Structural Homology , Amino Acid Sequence , Protein Databases/trends , Genetic Models , Molecular Models , Proteomics/methods , Proteomics/trends
8.
J Comput Biol ; 14(8): 1044-57, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17985987

ABSTRACT

In our recent work, a new approach to establish sequence relatedness, by walking through the protein sequence space, was introduced. The sequence space is built from 20 amino acid long fragments of proteins from a very large collection of fully sequenced prokaryotic genomes. The fragments, points in the space, are connected if they are closely related (high sequence identity). The connected fragments form a variety of networks of sequence kinship. In this research the networks in the formatted sequence space and their topology are analyzed. For lower identity thresholds a huge network of complex structure is formed, involving up to 10% of the points of the space. When the threshold is increased, the major network splits into a set of smaller clusters with a wide diversity of sizes and topologies. Such "evolutionary networks" may serve as a powerful sequence annotation tool that allows one to reveal fine details in the evolutionary history of proteins.
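
The effect of the identity threshold on network structure can be sketched with a union-find over equal-length fragments. This is a toy illustration under stated assumptions: identity is taken as the fraction of matching positions, all pairs are compared (quadratic cost, impractical for a real sequence space), and the function names are hypothetical.

```python
from itertools import combinations
from typing import List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length fragments."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_sizes(fragments: List[str], threshold: float) -> List[int]:
    """Sizes of connected components, largest first, at one threshold."""
    parent = list(range(len(fragments)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Connect every pair of fragments that meets the identity threshold.
    for i, j in combinations(range(len(fragments)), 2):
        if identity(fragments[i], fragments[j]) >= threshold:
            parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(fragments)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)


toy = ["AAAA", "AAAT", "AATT", "GGGG"]
for threshold in (0.7, 0.9):
    print(threshold, cluster_sizes(toy, threshold))
```

Raising the threshold in the toy example breaks the three-fragment network into singletons, a small-scale analogue of the splitting the abstract describes.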


Subject(s)
Molecular Evolution , Proteins/genetics , Proteomics/statistics & numerical data , Algorithms , Amino Acid Sequence , Computational Biology , Protein Databases , Molecular Sequence Data , Peptide Fragments/genetics , Sequence Alignment/statistics & numerical data , Software
9.
J Comput Biol ; 21(2): 173-83, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24050498

ABSTRACT

Graph clustering becomes difficult as the graph size and complexity increase. In particular, in interaction graphs, the clusters are small and the data on the underlying interaction are not only complex, but also noisy due to the lack of information and experimental errors. The graphs representing such data consist of (possibly overlapping) clusters of non-uniform size with some false positive and false negative links. In this article, we propose a new approach, assuming that clusters in the graphs of protein-protein interaction (PPI) networks resemble corrupted cliques. Therefore, the problem can be reduced to looking for clusters only among nodes of approximately similar degrees. This idea was implemented using a soft version of the Farthest-Point-First (FPF) clustering algorithm with the Jaccard distance function modified to perform on slightly overlapping clusters. The StripClust program developed by us was tested on a synthetic network and on the yeast PPI network.
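
The two named ingredients, Jaccard distance and Farthest-Point-First clustering, can be sketched in their textbook form. This is not StripClust: the paper's soft variant and its modified distance for overlapping clusters are not reproduced, and comparing closed neighborhoods (a node plus its neighbors) is an assumption made here so that members of one clique look identical.

```python
from typing import Dict, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |a & b| / |a | b|, defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def farthest_point_first(neighbors: Dict[str, Set[str]], k: int) -> Dict[str, str]:
    """Map each node to the nearest of k centers picked by the FPF heuristic."""
    nodes = sorted(neighbors)  # fixed order keeps tie-breaking deterministic
    if not 1 <= k <= len(nodes):
        raise ValueError("k must be between 1 and the number of nodes")
    closed = {n: set(neighbors[n]) | {n} for n in nodes}

    first = nodes[0]
    centers = [first]
    nearest = {n: first for n in nodes}
    dist = {n: jaccard_distance(closed[n], closed[first]) for n in nodes}
    while len(centers) < k:
        # The next center is the node farthest from all current centers.
        new_center = max(nodes, key=lambda n: dist[n])
        centers.append(new_center)
        for n in nodes:
            d = jaccard_distance(closed[n], closed[new_center])
            if d < dist[n]:
                dist[n] = d
                nearest[n] = new_center
    return nearest


# Two disconnected triangles stand in for two small protein complexes.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"},
    "x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"},
}
print(farthest_point_first(graph, 2))
```

Restricting the search to nodes of similar degree, as the abstract proposes, would amount to running such a clustering separately on degree-based subsets of the graph.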


Subject(s)
Computer Graphics , Protein Interaction Mapping/methods , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Algorithms , Cluster Analysis
10.
Gene ; 528(2): 282-7, 2013 Oct 10.
Article in English | MEDLINE | ID: mdl-23872203

ABSTRACT

We have shown, in a previous paper, that tandem repeating sequences, especially triplet repeats, play a very important role in gene evolution. This result led to the formulation of the following hypothesis: most of the genomic sequences evolved through everlasting acts of tandem repeat expansions with subsequent accumulation of changes. To estimate how much of the observed sequence has a repeat origin, we describe the adaptation of a text segmentation algorithm, based on dynamic programming, to the mapping of the ancient expansion events. The algorithm maximizes the segmentation cost, calculated as the similarity of obtained fragments to the putative repeat sequence. In the first application of the algorithm to segmentations of genomic sequences, a significant difference between the natural sequences and the corresponding shuffled sequences is detected. The natural fragments are longer and more similar to the putative repeat sequences. As our analysis shows, the coding sequences allow for repeats only when the size of the repeated words is divisible by three. In contrast, in the non-coding sequences, all repeated word sizes are present. It was estimated that in the Escherichia coli K12 genome, about 35.5% of the sequence can be detectably traced to original simple repeat ancestors. The results shed light on the genomic sequence organization, and strongly confirm the hypothesis about the crucial role of triplet expansions in gene origin and evolution.


Subject(s)
Molecular Evolution , Trinucleotide Repeats , Algorithms , Base Sequence , Chromosome Mapping , Escherichia coli K12/genetics , Bacterial Genome , Fungal Genome , Genetic Models , Saccharomyces cerevisiae/genetics , DNA Sequence Analysis
11.
J Biomol Struct Dyn ; 30(2): 201-10, 2012.
Article in English | MEDLINE | ID: mdl-22702731

ABSTRACT

A novel concept on mechanisms of evolution of genes and genomes is suggested: the sequences evolve largely by local events of triplet expansion and subsequent mutational changes in the repeats. The immediate memory of the earlier expansion events still resides in the sequences, in the form of the frequently occurring segments of tandemly repeating codons. Other predicted fossils of the original repeats are: (I) the expanding triplets should be accompanied by their point mutation derivatives and (II) the remaining excess of codons formerly belonging to the tandem repeats should be reflected in overall codon usage biases. Both predictions are confirmed by analysis of the largest available database of non-redundant protein coding sequences, of total size ∼5 × 10^9 codons. One important conclusion also follows from the results. Life, which presumably started with replication of expanding triplets and their subsequent mutational changes, is continuing to emerge within the genes and genomes, in the form of new events of triplet expansion.
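
Segments of tandemly repeating codons can be located with a simple in-frame scan. The sketch assumes the input is a coding sequence that starts in frame and reports only exact repeats; the point-mutation derivatives the abstract mentions are not detected, and the function name and `min_repeats` parameter are hypothetical.

```python
from typing import List, Tuple


def tandem_codon_runs(cds: str, min_repeats: int = 3) -> List[Tuple[int, str, int]]:
    """Return (codon_index, codon, repeat_count) for each in-frame run.

    A run is a stretch of identical consecutive codons at least
    `min_repeats` long; a trailing partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    runs = []
    i = 0
    while i < len(codons):
        j = i
        while j < len(codons) and codons[j] == codons[i]:
            j += 1
        if j - i >= min_repeats:
            runs.append((i, codons[i], j - i))
        i = j
    return runs


toy_cds = "ATG" + "CAG" * 4 + "GCT" + "GAA" * 2 + "TAA"
print(tandem_codon_runs(toy_cds))
```

Tallying the codons found in such runs over a large set of coding sequences is one way the predicted link to codon usage bias could be explored.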


Subject(s)
Molecular Evolution , Genome , Trinucleotide Repeat Expansion , Base Sequence , Codon , Genetic Databases , Fossils
12.
Ann N Y Acad Sci ; 1267: 35-8, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22954214

ABSTRACT

If we define a genetic code as a widespread DNA sequence pattern that carries a message with an impact on biology, then there are multiple genetic codes. Sequences involved in these codes overlap and, thus, both interact with and constrain each other, such as for the triplet code, the intron-splicing code, the code for amphipathic alpha helices, and the chromatin code. Nucleosomes are preferentially located at the ends of exons, thus protecting splice junctions, with the N9 positions of guanines of the GT and AG junctions oriented toward the histones. Analysis of protein-coding sequences reveals numerous traces of tandem repeats, apparently formed by triplet expansion, which in effect is a genome inflation "code". Our data are consistent with the hypothesis that expansion of simple tandem repetition of certain aggressive triplets has been a characteristic of life from its emergence. Such expanding triplets appear to be the major factor underlying observed codon usage biases.


Subject(s)
Trinucleotide Repeat Expansion/genetics , Base Sequence , DNA/genetics , Genetic Code , Humans , Nucleic Acid Conformation , Nucleosomes/metabolism , RNA Splice Sites , Nucleic Acid Repetitive Sequences , DNA Sequence Analysis
13.
J Biomol Struct Dyn ; 30(2): 211-6, 2012.
Article in English | MEDLINE | ID: mdl-22702732

ABSTRACT

Apoptotic digestion of human lymphocyte chromatin results in the appearance of large amounts of nucleosome size DNA fragments. Sequencing of these fragments and analysis of the distribution of bases around the apoptotic nucleases' cutting sites revealed a rather strong consensus sequence, not observed earlier. The consensus TAAAgTAcTTTA is characterized by complementary symmetry, resembling prokaryotic restriction sites. This consensus also possesses three TA dinucleotide steps, separated by five bases (corresponding to a half-period of the DNA double helix), suggesting strong bending of the DNA at the cut sites which is perhaps required for cutting.
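
Both properties claimed for the consensus can be verified in a few lines: complementary symmetry means the sequence equals its own reverse complement, and the TA steps can be located by a direct scan. The helper names are illustrative, and lower-case letters in the consensus are treated simply as bases.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string, returned in upper case."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def dinucleotide_positions(seq: str, dinuc: str) -> List[int]:
    """0-based start positions of every occurrence of `dinuc`."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]


consensus = "TAAAgTAcTTTA"
print(reverse_complement(consensus) == consensus.upper())
print(dinucleotide_positions(consensus, "TA"))
```

The TA steps start at positions 0, 5 and 10, so consecutive steps lie five bases apart, matching the half-period spacing described in the abstract.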


Subject(s)
Apoptosis , Chromatin/metabolism , DNA/chemistry , Lymphocytes/metabolism , Base Sequence , DNA Restriction Enzymes/chemistry , DNA Restriction Enzymes/metabolism , Humans , Molecular Sequence Data , Nucleosomes/metabolism
14.
J Biomol Struct Dyn ; 29(3): 577-83, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22066542

ABSTRACT

This communication reports on the nucleosome positioning patterns (bendability matrices) for the human genome, derived from over 8 million nucleosome DNA sequences obtained from apoptotically digested lymphocytes. This digestion procedure is used here for the first time for the purpose of extraction and sequencing of the nucleosome DNA fragments. The dominant motifs suggested by the matrices of DNA bendability calculated for light and heavy isochores are significantly different. Both, however, are in full agreement with the linear description YRRRRRYYYYYR, and with earlier derivations by N-gram extensions. Thus, the choice of the nucleosome positioning patterns crucially depends on the G + C composition of the analyzed sequences.


Subject(s)
Apoptosis , Nucleosomes/chemistry , Nucleosomes/metabolism , Amino Acid Motifs , Base Composition , Base Sequence , Chromatin Assembly and Disassembly , DNA/chemistry , Humans , Nucleic Acid Conformation
15.
Curr Opin Struct Biol ; 19(3): 335-40, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19386484

ABSTRACT

Proteins in their evolution appear to follow several discrete stages, which is reflected in their modular organization. The sequences of the protein modules are highly variable while their functions and structures are rather conserved. The relatedness of the variable sequences is well represented by the networks in natural protein sequence space that also suggests evolutionary connections.


Subject(s)
Molecular Evolution , Proteins/chemistry , Amino Acid Sequence , Humans , Molecular Sequence Data , Tertiary Protein Structure , Proteins/genetics , Proteins/metabolism
16.
J Theor Biol ; 244(1): 77-80, 2007 Jan 07.
Article in English | MEDLINE | ID: mdl-16952377

ABSTRACT

Following the original idea of Maynard Smith on evolution of the protein sequence space, a novel tool is developed that allows the "space walk", from one sequence to its likely evolutionary relative and further on. At a given threshold of identity between consecutive steps, the walks of many steps are possible. The sequences at the ends of the walks may substantially differ from one another. In a sequence space of randomized (shuffled) sequences the walks are very short. The approach opens new perspectives for protein evolutionary studies and sequence annotation.
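
A "space walk" of this kind can be sketched as a breadth-first search in which each step must keep a minimum identity to the previous fragment. The sketch assumes equal-length fragments and position-wise identity; the actual tool's scoring and search strategy are not given in the abstract, and the function names are hypothetical.

```python
from collections import deque
from typing import List, Optional


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length fragments."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def space_walk(fragments: List[str], start: str, goal: str,
               threshold: float) -> Optional[List[str]]:
    """Shortest chain from start to goal, or None if no walk exists.

    Consecutive fragments in the chain must have identity >= threshold.
    """
    pool = sorted(set(fragments) | {start, goal})
    parent = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for candidate in pool:
            if candidate not in parent and identity(current, candidate) >= threshold:
                parent[candidate] = current
                queue.append(candidate)
    return None


pool = ["AAAA", "AAAT", "AATT", "ATTT", "TTTT"]
print(space_walk(pool, "AAAA", "TTTT", 0.75))
print(identity("AAAA", "TTTT"))
```

In the toy pool the walk links two fragments that share no positions at all, which mirrors the observation that the ends of a walk may differ substantially.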


Subject(s)
Molecular Evolution , Proteins/genetics , Protein Sequence Analysis/methods , ATP-Binding Cassette Transporters/genetics , Amino Acid Sequence , Molecular Sequence Data
17.
J Mol Evol ; 65(6): 640-50, 2007 Dec.
Article in English | MEDLINE | ID: mdl-18026890

ABSTRACT

Twenty-seven protein sequence elements, six to nine amino acids long, were extracted from 15 phylogenetically diverse complete prokaryotic proteomes. The elements are present in all of these proteomes, with at least one copy each (omnipresent elements), and have presumably been conserved since the last universal common ancestor (LUCA). All these omnipresent elements are identified in crystallized protein structures as parts of highly conserved closed loops, 25-30 residues long, thus representing the closed-loop modules discovered in 2000 by Berezovsky et al. The omnipresent peptides make up seven distinct groups, of which the largest groups, Aleph and Beth, contain 18 and four elements, respectively, which are related but different, while five other groups are represented by only one element each. The LUCA modules appear with one or several copies per protein molecule in a variety of combinations depending on the functional identity of the corresponding protein. The functional involvement of individual LUCA modules is outlined on the basis of known protein annotations. Analyses of all the related sequences in a large, formatted protein sequence space suggest that many, if not all, of the 27 omnipresent elements have a common sequence origin. This sequence space network analysis may lead to elucidation of the earliest stages of protein evolution.
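
The notion of an omnipresent element can be sketched as a set intersection over the short peptides of each proteome. The example assumes exact matching of fixed-length k-mers, which is a simplification: the abstract does not state whether mismatches were tolerated, and the function names and toy sequences are invented for illustration.

```python
from typing import Dict, List, Set


def kmers(protein: str, k: int) -> Set[str]:
    """All distinct substrings of length k in one protein sequence."""
    return {protein[i:i + k] for i in range(len(protein) - k + 1)}


def omnipresent_elements(proteomes: Dict[str, List[str]], k: int) -> Set[str]:
    """k-mers present at least once in every proteome (exact matches only)."""
    shared = None
    for proteins in proteomes.values():
        present = set()
        for protein in proteins:
            present |= kmers(protein.upper(), k)
        shared = present if shared is None else shared & present
    return shared or set()


# Invented toy proteomes; real input would be complete prokaryotic proteomes.
toy = {
    "species_1": ["MKGSGKSTLA", "MAAA"],
    "species_2": ["QQGSGKSTPP"],
    "species_3": ["GSGKSTW"],
}
print(omnipresent_elements(toy, 6))
```

Running the intersection for each length from six to nine residues would cover the element sizes the abstract mentions.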


Subject(s)
Molecular Evolution , Prokaryotic Cells/metabolism , Proteins/genetics , Amino Acid Motifs/genetics , Amino Acid Sequence , Conserved Sequence/genetics , Protein Databases , Genetic Models , Molecular Models , Molecular Sequence Data , Phylogeny , Proteins/chemistry , Proteins/classification , Protein Sequence Analysis