ABSTRACT
MOTIVATION: A biologically meaningful algorithmic study of genome rearrangement should take into account the size distribution of the rearranged genomic fragments. In particular, it is important to know the prevalence of short inversions in order to understand the patterns of gene order disruption observed in comparative genomics. RESULTS: We find a large excess of short inversions, especially those involving a single gene, in comparison with a random inversion model. This is demonstrated through comparison of four pairs of bacterial genomes, using a specially designed implementation of the Hannenhalli-Pevzner theory, and validated through experimentation on pairs of random genomes matched to the real pairs.
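To make the comparison against a random inversion model concrete, the following is a minimal sketch of how such a null model might be simulated on a signed gene order. It assumes that both inversion endpoints are drawn uniformly at random, which is only one possible reading of "random inversion model"; the abstract does not specify the model used. The names `invert` and `random_inversion_lengths` are hypothetical and introduced here for illustration.

```python
import random
from collections import Counter


def invert(genome, i, j):
    """Return a copy of a signed gene order with genes i..j (inclusive) inverted.

    An inversion reverses the order of the segment and flips the sign
    (strand) of every gene in it.
    """
    segment = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + segment + genome[j + 1:]


def random_inversion_lengths(n_genes, n_inversions, seed=0):
    """Apply random inversions to the identity genome and tally their lengths.

    Assumption: both endpoints are chosen uniformly at random.
    """
    rng = random.Random(seed)
    genome = list(range(1, n_genes + 1))
    lengths = Counter()
    for _ in range(n_inversions):
        i, j = sorted(rng.randrange(n_genes) for _ in range(2))
        genome = invert(genome, i, j)
        lengths[j - i + 1] += 1  # number of genes in the inverted segment
    return genome, lengths


if __name__ == "__main__":
    # A single-gene inversion only flips the sign of that gene.
    print(invert([1, 2, 3, 4, 5], 2, 2))
    _, lengths = random_inversion_lengths(n_genes=100, n_inversions=1000)
    print("single-gene inversions under the uniform model:", lengths[1])
```

Under this uniform model, single-gene inversions are rare for large genomes, which is the kind of baseline against which an excess of short inversions could be judged.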
Subjects
Algorithms , Chromosome Inversion , DNA Mutational Analysis/methods , Gene Expression Profiling/methods , Bacterial Genome , Sequence Alignment/methods , DNA Sequence Analysis/methods , Chromosome Mapping/methods , Genetic Variation , Automated Pattern Recognition , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
We propose a model of the doubling of a bacterial genome followed by gene order rearrangement to explain present-day patterns of duplicated genes. On the hypothesis that inversion (reversal) is the predominant mechanism of rearrangement, we ask how to reconstruct the ancestral genome at the moment of genome duplication. We present a polynomial-time algorithm for finding such a genome that minimizes (to within two reversals) the Hannenhalli-Pevzner formula for reversal distance from the modern genome. We illustrate by applying the algorithm to a set of duplicate genes in the Marchantia polymorpha mitochondrial genome.
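For orientation, the Hannenhalli-Pevzner formula gives the reversal distance of a signed permutation of n genes as n + 1 - c + h + f, where c is the number of cycles in the breakpoint graph, h the number of hurdles, and f is 1 for a fortress and 0 otherwise. The sketch below computes only the cycle term and the resulting lower bound n + 1 - c; hurdles and fortresses are deliberately omitted, and it is not the genome-halving algorithm of the abstract. The function names are hypothetical.

```python
def breakpoint_graph_cycles(perm):
    """Count alternating cycles in the breakpoint graph of a signed permutation.

    perm is a list containing each of 1..n exactly once, with a sign.
    Gene +x becomes the vertex pair (2x-1, 2x); gene -x becomes (2x, 2x-1).
    The sequence is framed by 0 on the left and 2n+1 on the right.
    """
    n = len(perm)
    ext = [0]
    for x in perm:
        ext += [2 * x - 1, 2 * x] if x > 0 else [-2 * x, -2 * x - 1]
    ext.append(2 * n + 1)

    # Black edges join vertices that are adjacent in the current order.
    black = {}
    for i in range(0, 2 * n + 2, 2):
        a, b = ext[i], ext[i + 1]
        black[a], black[b] = b, a

    # Grey edges join 2i and 2i+1, the adjacencies of the identity order.
    def grey(v):
        return v + 1 if v % 2 == 0 else v - 1

    seen, cycles = set(), 0
    for start in range(2 * n + 2):
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = black[v]
            seen.add(w)
            v = grey(w)
    return cycles


def reversal_distance_lower_bound(perm):
    """Return n + 1 - c, a lower bound on reversal distance (no hurdle terms)."""
    return len(perm) + 1 - breakpoint_graph_cycles(perm)


if __name__ == "__main__":
    print(reversal_distance_lower_bound([1, 2, 3]))
    print(reversal_distance_lower_bound([1, -2, 3]))
```

The bound is exact when the permutation has no hurdles; for an order such as +2 +1 it undercounts, because the hurdle term of the full formula is missing from this sketch.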
ABSTRACT
A common strategy characterises the various methods that have been independently defined to identify, almost unambiguously, different types of RNA molecules in DNA fragments. So far, the quality of RNA motif detection has been the primary motivation, which has effectively delayed the optimisation of programs. As an illustration of possible improvements, a modified version of tRNAscan is described. The previous algorithm was altered to run 500 times faster and to lower the rates of both false positives and false negatives. The newly sequenced genome of Saccharomyces cerevisiae is scanned both ways in less than three minutes, and the results match the annotations found in databanks with three exceptions, two of which are arguably not real tRNAs.