Results 1 - 6 of 6
1.
Bioinformatics ; 37(19): 3277-3284, 2021 Oct 11.
Article in English | MEDLINE | ID: mdl-33970217

ABSTRACT

MOTIVATION: The reconstruction of possible histories given a sample of genetic data in the presence of recombination and recurrent mutation is a challenging problem, but can provide key insights into the evolution of a population. We present KwARG, which implements a parsimony-based greedy heuristic algorithm for finding plausible genealogical histories (ancestral recombination graphs) that are minimal or near-minimal in the number of posited recombination and mutation events. RESULTS: Given an input dataset of aligned sequences, KwARG outputs a list of possible candidate solutions, each comprising a list of mutation and recombination events that could have generated the dataset; the relative proportion of recombinations and recurrent mutations in a solution can be controlled via specifying a set of 'cost' parameters. We demonstrate that the algorithm performs well when compared against existing methods. AVAILABILITY AND IMPLEMENTATION: The software is available at https://github.com/a-ignatieva/kwarg. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

2.
BMC Bioinformatics ; 13: 260, 2012 Oct 09.
Article in English | MEDLINE | ID: mdl-23043260

ABSTRACT

BACKGROUND: RNA secondary structure prediction, or folding, is a classic problem in bioinformatics: given a sequence of nucleotides, the aim is to predict the base pairs formed in its three-dimensional conformation. The inverse problem of designing a sequence folding into a particular target structure has only more recently received notable interest. With a growing appreciation and understanding of the functional and structural properties of RNA motifs, and a growing interest in utilising biomolecules in nano-scale designs, the interest in the inverse RNA folding problem is bound to increase. However, whereas the RNA folding problem from an algorithmic viewpoint has an elegant and efficient solution, the inverse RNA folding problem appears to be hard. RESULTS: In this paper we present a genetic algorithm approach to solve the inverse folding problem. The main aims of the development were to address the hitherto mostly ignored extension of the inverse folding problem, the multi-target inverse folding problem, while simultaneously designing a method with superior performance when measured on the quality of designed sequences. The genetic algorithm has been implemented as a Python program called Frnakenstein. It was benchmarked against four existing methods and several data sets totalling 769 real and predicted single-structure targets, and on 292 two-structure targets. It performed as well as or better at finding sequences which folded in silico into the target structure than all existing methods, without the heavy bias towards CG base pairs that was observed for all other top performing methods. On the two-structure targets it also performed well, generating a perfect design for about 80% of the targets. CONCLUSIONS: Our method illustrates that successful designs for the inverse RNA folding problem do not necessarily have to rely on heavy biases in base pair and unpaired base distributions. The design problem seems to become more difficult on larger structures when the target structures are real structures, while no deterioration was observed for predicted structures. Design for two-structure targets is considerably more difficult, but far from impossible, demonstrating the feasibility of automated design of artificial riboswitches. The Python implementation is available at http://www.stats.ox.ac.uk/research/genome/software/frnakenstein.
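
As an illustration of the quantities this abstract mentions, the following minimal Python sketch checks whether a candidate sequence can form every base pair of one or more target structures, and measures what fraction of the target's pairs are CG pairs. It is not taken from Frnakenstein: the dot-bracket input format, the set of allowed pairs (Watson-Crick plus GU wobble), and all function names are assumptions made for this example. Compatibility is only a necessary condition; deciding whether a sequence actually folds into the target requires a folding program, which is not shown here.

```python
from typing import List, Tuple

# Assumption: Watson-Crick pairs plus GU wobble pairs are allowed.
CANONICAL = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Return the sorted (i, j) base pairs of a dot-bracket string."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def is_compatible(seq: str, structure: str) -> bool:
    """True if every pair in the target structure is an allowed base pair."""
    if len(seq) != len(structure):
        return False
    return all((seq[i], seq[j]) in CANONICAL for i, j in parse_dot_bracket(structure))


def is_compatible_with_all(seq: str, structures: List[str]) -> bool:
    """Multi-target variant: the sequence must be compatible with each target."""
    return all(is_compatible(seq, s) for s in structures)


def gc_pair_fraction(seq: str, structure: str) -> float:
    """Fraction of the target's base pairs that are C-G or G-C in this sequence."""
    pairs = parse_dot_bracket(structure)
    if not pairs:
        return 0.0
    gc = sum(1 for i, j in pairs if {seq[i], seq[j]} == {"C", "G"})
    return gc / len(pairs)
```

A design with a `gc_pair_fraction` close to 1 on most targets would show the kind of CG bias the abstract describes for other methods; this statistic is an illustrative choice, not necessarily the one used in the paper.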


Subjects
Algorithms, Computational Biology/methods, RNA Folding/genetics, RNA/chemistry, RNA/genetics, Software, Base Pairing, Base Sequence, Computer Simulation, Riboswitch
3.
Mol Biol Evol ; 28(6): 1777-84, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21212152

ABSTRACT

One of the key objectives of comparative genomics is the characterization of the forces that shape genomes over the course of evolution. In recent decades, evidence has accumulated that, for vertebrate genomes, epigenetic modifications also have to be considered in this context. In particular, the elevated mutation frequency of 5-methylcytosine (5mC) is assumed to facilitate the depletion of CpG dinucleotides in species that exhibit global DNA methylation. For instance, the underrepresentation of CpG dinucleotides in many mammalian genomes is attributed to this effect, which is only neutralized in so-called CpG islands (CGIs) that are preferentially unmethylated and thus partially protected from rapid CpG decay. For primate-specific CpG-rich transposable elements from the Alu family, it is unclear whether their elevated CpG frequency is caused by their young age or by the absence of DNA methylation. As a consequence, these elements are often misclassified in CGI annotations. We present a method for the estimation of germ line methylation from pairwise ancestral-descendant alignments. The approach is validated in a simulation study and tested on DNA repeats from the AluSx family. We conclude that a predicted unmethylated state in the germ line is highly correlated with epigenetic activity of the respective genomic region. Thus, CpG-rich repeats can be used as in silico probes for the epigenetic potential of their genomic neighborhood.
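
The CpG depletion discussed in this abstract is commonly summarized by an observed/expected CpG ratio: the number of CG dinucleotides, multiplied by the sequence length, divided by the product of the C and G counts. The sketch below computes that ratio in Python. It is a general-purpose statistic shown for orientation only, not the germ line methylation estimator the paper presents, and the function name is an assumption made for this example.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Values well below 1 indicate CpG depletion; values near or above 1
    are typical of CpG-rich regions. Returns 0.0 when the sequence has
    no C or no G, since the expectation is then undefined.
    """
    s = seq.upper()
    n = len(s)
    c_count = s.count("C")
    g_count = s.count("G")
    cpg_count = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c_count == 0 or g_count == 0:
        return 0.0
    return cpg_count * n / (c_count * g_count)
```

For example, `cpg_obs_exp("CCGG")` is 1.0: one CG dinucleotide is observed in a sequence of length 4 with two Cs and two Gs.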


Subjects
DNA Methylation/genetics, Germ Cells/metabolism, Genetic Models, Alu Elements/genetics, Blood Cells/metabolism, Computer Simulation, CpG Islands/genetics, Epigenomics, Humans, Reproducibility of Results, Sensitivity and Specificity
4.
Mol Biol Evol ; 26(1): 209-16, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18948299

ABSTRACT

Noncoding RNAs (ncRNAs) are transcripts that do not code for protein but rather function as RNA in catalytic, regulatory, or structural roles in the cell. ncRNAs are involved in universally conserved biological processes, including protein synthesis and gene regulation, and have more specific roles, such as in X-chromosome inactivation in eutherian mammals. In this paper, we propose and investigate a hypothesis for patterns of sequence selection in structurally conserved ncRNAs. Previous attempts at defining RNA selection compared rates of evolution between paired and unpaired bases with largely inconclusive results. Our approach focuses only on paired bases in ncRNAs with conserved structure. By analogy to the different properties of codon positions based on the genetic code, we use a well-developed energy model for RNA structure to classify stem positions into structural classes and argue that they are under different selective constraints. We validate the hypothesis on several RNA families and use simulated data to verify the evolutionary origin of signals. Our class labeling is shown to be a better model of ncRNA evolution than the tradition of treating stem positions equally. As well as providing a better understanding of RNA evolution, the evolutionary footprint we identify can easily be incorporated into gene finders to improve their specificity.


Subjects
Molecular Evolution, Untranslated RNA/genetics, Animals, Base Sequence, Chickens, Dogs, Humans, Mice, Rabbits, Sequence Alignment
5.
Nucleic Acids Res ; 33(Web Server issue): W650-3, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15980555

ABSTRACT

Foldalign is a Sankoff-based algorithm for making structural alignments of RNA sequences. Here, we present a web server for making pairwise alignments between two RNA sequences, using the recently updated version of foldalign. The server can be used to scan two sequences for a common structural RNA motif of limited size, or the entire sequences can be aligned locally or globally. The web server offers a graphical interface, which makes it simple to make alignments and manually browse the results. The web server can be accessed at http://foldalign.kvl.dk.


Subjects
RNA/chemistry, Sequence Alignment/methods, RNA Sequence Analysis/methods, Software, Algorithms, Computer Graphics, Internet, Nucleic Acid Conformation, User-Computer Interface
6.
Bioinformatics ; 21(9): 1815-24, 2005 May 01.
Article in English | MEDLINE | ID: mdl-15657094

ABSTRACT

MOTIVATION: Searching for non-coding RNA (ncRNA) genes and structural RNA elements (eleRNA) is a major challenge in gene finding today, as these are often conserved in structure rather than in sequence. Even though the number of available methods is growing, it is still of interest to detect, by pairwise comparison, two genes with low sequence similarity, where the genes are part of a larger genomic region. RESULTS: Here we present such an approach for pairwise local alignment which is based on foldalign and the Sankoff algorithm for simultaneous structural alignment of multiple sequences. We include the ability to conduct mutual scans of two sequences of arbitrary length while searching for common local structural motifs of some maximum length. This drastically reduces the complexity of the algorithm. The scoring scheme includes structural parameters corresponding to those available for free energy as well as for substitution matrices similar to RIBOSUM. The new foldalign implementation is tested on a dataset where the ncRNAs and eleRNAs have sequence similarity <40% and where the ncRNAs and eleRNAs are energetically indistinguishable from the surrounding genomic sequence context. The method is tested in two ways: (1) its ability to find the common structure between the genes only and (2) its ability to locate ncRNAs and eleRNAs in a genomic context. In case (1), it makes sense to compare with methods like Dynalign, and the performances are very similar, but foldalign is substantially faster. The structure prediction performance for a family is typically around 0.7 using Matthews correlation coefficient. In case (2), the algorithm is successful at locating RNA families with an average sensitivity of 0.8 and a positive predictive value of 0.9 using a BLAST-like hit selection scheme. AVAILABILITY: The program is available online at http://foldalign.kvl.dk/
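
The three evaluation measures named in this abstract have standard definitions in terms of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The Python sketch below implements those textbook definitions. How the paper counted predicted base pairs or hits to obtain these counts is not stated in the abstract, so the function names and the convention of returning 0.0 for an undefined ratio are assumptions made for this example.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true items that were found: TP / (TP + FN)."""
    total = tp + fn
    return tp / total if total else 0.0


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of reported items that are true: TP / (TP + FP)."""
    total = tp + fp
    return tp / total if total else 0.0


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient, ranging from -1 to 1.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when any marginal total is zero (denominator undefined).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

With made-up counts, 8 true hits found and 2 missed gives a sensitivity of 0.8, and 9 correct hits among 10 reported gives a positive predictive value of 0.9; these numbers are chosen only to mirror the scale of the values quoted above and are not data from the paper.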


Subjects
Algorithms, Untranslated RNA/genetics, Sequence Alignment/methods, RNA Sequence Analysis/methods, Software, Conserved Sequence, Nucleic Acid Conformation, Nucleic Acid Sequence Homology