Results 1 - 20 of 104
1.
Cell ; 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39181133

ABSTRACT

Chromothripsis describes the catastrophic shattering of mis-segregated chromosomes trapped within micronuclei. Although micronuclei accumulate DNA double-strand breaks and replication defects throughout interphase, how chromosomes undergo shattering remains unresolved. Using CRISPR-Cas9 screens, we identify a non-canonical role of the Fanconi anemia (FA) pathway as a driver of chromothripsis. Inactivation of the FA pathway suppresses chromosome shattering during mitosis without impacting interphase-associated defects within micronuclei. Mono-ubiquitination of FANCI-FANCD2 by the FA core complex promotes its mitotic engagement with under-replicated micronuclear chromosomes. The structure-selective SLX4-XPF-ERCC1 endonuclease subsequently induces large-scale nucleolytic cleavage of persistent DNA replication intermediates, which stimulates POLD3-dependent mitotic DNA synthesis to prime shattered fragments for reassembly in the ensuing cell cycle. Notably, FA-pathway-induced chromothripsis generates complex genomic rearrangements and extrachromosomal DNA that confer acquired resistance to anti-cancer therapies. Our findings demonstrate how pathological activation of a central DNA repair mechanism paradoxically triggers cancer genome evolution through chromothripsis.

2.
Genes (Basel) ; 15(7)2024 Jun 27.
Article in English | MEDLINE | ID: mdl-39062626

ABSTRACT

The bacterium Deinococcus radiodurans is known to efficiently and accurately reassemble its genome after hundreds of DNA double-strand breaks (DSBs). Only at very large amounts of radiation-induced DSBs is this accuracy affected in the wild-type D. radiodurans, causing rearrangements in its genome structure. However, changes in its genome structure may also be possible during the propagation and storage of cell cultures. We investigate this possibility by listing structural differences between three completely sequenced genomes of D. radiodurans strains with a recent common ancestor: the type strain stored and sequenced in two different laboratories (of the ATCC 13939 lineage) and the first sequenced strain historically used as the reference (ATCC BAA-816). We detected a number of structural differences and found the most likely mechanisms behind them: (i) transposition/copy number change in mobile interspersed repeats (insertion sequences and small non-coding repeats), (ii) variable number of monomers within tandem repeats, (iii) deletions between long direct DNA repeats, and (iv) deletions between short (4-10 bp) direct DNA repeats. The most surprising finding was the deletions between short repeats, because they indicate the utilization of a less accurate DSB repair mechanism in conditions in which a more accurate one should be both available and preferred. The detected structural differences, as well as SNPs and short indels, while being important footprints of deinococcal DNA metabolism and repair, are also a valuable resource for researchers using these D. radiodurans strains.


Subject(s)
Deinococcus , Genome, Bacterial , Deinococcus/genetics , DNA Breaks, Double-Stranded , DNA Transposable Elements/genetics
3.
Methods Mol Biol ; 2802: 215-245, 2024.
Article in English | MEDLINE | ID: mdl-38819562

ABSTRACT

Genome rearrangements are mutations that change the gene content of a genome or the arrangement of the genes on a genome. Several years of research on genome rearrangements have established different algorithmic approaches for solving some fundamental problems in comparative genomics based on gene order information. This review summarizes the literature on genome rearrangement analysis along two lines of research. The first line considers rearrangement models that are particularly well suited for a theoretical analysis. These models use rearrangement operations that cut chromosomes into fragments and then join the fragments into new chromosomes. The second line works with rearrangement models that reflect several biologically motivated constraints, e.g., the constraint that gene clusters have to be preserved. In this chapter, the border between algorithmically "easy" and "hard" rearrangement problems is sketched and a brief review is given on the available software tools for genome rearrangement analysis.


Subject(s)
Algorithms , Gene Rearrangement , Genomics , Multigene Family , Software , Humans , Computational Biology/methods , Genome/genetics , Genomics/methods , Models, Genetic , Animals
4.
ACS Synth Biol ; 13(4): 1116-1127, 2024 04 19.
Article in English | MEDLINE | ID: mdl-38597458

ABSTRACT

Synthetic Sc2.0 yeast strains contain hundreds to thousands of loxPsym recombination sites that allow restructuring of the Saccharomyces cerevisiae genome by SCRaMbLE. Thus, a highly diverse yeast population can arise from a single genotype. The selection of genetically diverse candidates with rearranged synthetic chromosomes for downstream analysis requires an efficient and straightforward workflow. Here we present loxTags, a set of qPCR primers for genotyping across loxPsym sites to detect not only deletions but also inversions and translocations after SCRaMbLE. To cope with the large number of amplicons, we generated qTagGer, a qPCR genotyping primer prediction tool. Using loxTag-based genotyping and long-read sequencing, we show that light-inducible Cre recombinase L-SCRaMbLE can efficiently generate diverse recombination events when applied to Sc2.0 strains containing a linear or a circular version of synthetic chromosome III.


Subject(s)
Chromosomes , Saccharomyces cerevisiae , Saccharomyces cerevisiae/genetics , Genotype , Workflow , Gene Rearrangement , Genome, Fungal/genetics
5.
Cell Rep ; 43(4): 114001, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38547127

ABSTRACT

In the ciliate Paramecium, precise excision of numerous internal eliminated sequences (IESs) from the somatic genome is essential at each sexual cycle. DNA double-strand breaks (DSBs) introduced by the PiggyMac endonuclease are repaired in a highly concerted manner by the non-homologous end joining (NHEJ) pathway, illustrated by complete inhibition of DNA cleavage when Ku70/80 proteins are missing. We show that expression of a DNA-binding-deficient Ku70 mutant (Ku70-6E) permits DNA cleavage but leads to the accumulation of unrepaired DSBs. We uncoupled DNA cleavage and repair by co-expressing wild-type and mutant Ku70. High-throughput sequencing of the developing macronucleus genome in these conditions identifies the presence of extremities healed by de novo telomere addition and numerous translocations between IES-flanking sequences. Coupling the two steps of IES excision ensures that both extremities are held together throughout the process, suggesting that DSB repair proteins are essential for assembly of a synaptic precleavage complex.


Subject(s)
DNA Cleavage , Paramecium , Paramecium/genetics , Paramecium/metabolism , DNA Breaks, Double-Stranded , Genome, Protozoan , Ku Autoantigen/metabolism , Ku Autoantigen/genetics , DNA Repair , Protozoan Proteins/metabolism , Protozoan Proteins/genetics , DNA End-Joining Repair
6.
Front Genet ; 15: 1302554, 2024.
Article in English | MEDLINE | ID: mdl-38425715

ABSTRACT

Introduction: The Tibetan antelope (Pantholops hodgsonii) is a remarkable mammal thriving in the extreme Qinghai-Tibet Plateau conditions. Despite the availability of its genome sequence, limitations in the scaffold-level assembly have hindered a comprehensive understanding of its genomics. Moreover, comparative analyses with other Bovidae species are lacking, along with insights into genome rearrangements in the Tibetan antelope. Methods: Addressing these gaps, we present a multifaceted approach by refining the Tibetan antelope genome through linkage disequilibrium analysis with data from 15 newly sequenced samples. Results: The scaffold N50 of the refined reference is 3.2 Mbp, surpassing the previous version by 1.15-fold. Our annotation analysis resulted in 50,750 genes, encompassing 29,324 novel genes not previously studied. Comparative analyses reveal 182 unique rearrangements within the scaffolds, contributing to our understanding of evolutionary dynamics and species-specific adaptations. Furthermore, by conducting detailed genomic comparisons and reconstructing rearrangements, we have successfully pioneered the reconstruction of the X-chromosome in the Tibetan antelope. Discussion: This effort enhances our comprehension of the genomic landscape of this species.
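
Note on the scaffold N50 reported above: N50 is conventionally the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (standard definition)."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not data from the record above).
print(n50([2, 3, 4, 5, 6]))
```

Because N50 depends only on the length distribution, it describes contiguity and says nothing about whether the scaffolds are correctly assembled.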

7.
Metab Eng Commun ; 18: e00231, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38222043

ABSTRACT

Rhodococcus strains were designed as model biocatalysts (BCs) for the production of acrylic acid and mixtures of acrylic monomers consisting of acrylamide, acrylic acid, and N-alkylacrylamide (N-isopropylacrylamide). To obtain BC strains, we used, among other approaches, adaptive laboratory evolution (ALE), based on the use of the metabolic pathway of amide utilization. Whole genome sequencing of the strains obtained after ALE, as well as subsequent targeted gene disruption, identified candidate genes for three new amidases that are promising for the development of BCs for the production of acrylic acid from acrylamide. New BCs had two types of amidase activities, acrylamide-hydrolyzing and acrylamide-transferring, and by varying the ratio of these activities in BCs, it is possible to influence the ratio of monomers in the resulting mixtures. Based on these strains, a prototype of a new technological concept for the biocatalytic synthesis of acrylic monomers was developed for the production of water-soluble acrylic heteropolymers containing valuable N-alkylacrylamide units. In addition to the possibility of obtaining mixtures of different compositions, the advantages of the concept are a single starting reagent (acrylamide), more unification of processes (all processes are based on the same type of biocatalyst), and potentially greater safety for personnel and the environment compared to existing chemical technologies.

8.
Mol Cell ; 84(1): 55-69, 2024 Jan 04.
Article in English | MEDLINE | ID: mdl-38029753

ABSTRACT

Mitotic cell division is tightly monitored by checkpoints that safeguard the genome from instability. Failures in accurate chromosome segregation during mitosis can cause numerical aneuploidy, which was hypothesized by Theodor Boveri over a century ago to promote tumorigenesis. Recent interrogation of pan-cancer genomes has identified unexpected classes of chromosomal abnormalities, including complex rearrangements arising through chromothripsis. This process is driven by mitotic errors that generate abnormal nuclear structures that provoke extensive yet localized shattering of mis-segregated chromosomes. Here, we discuss emerging mechanisms underlying chromothripsis from micronuclei and chromatin bridges, as well as highlight how this mutational cascade converges on the DNA damage response. A fundamental understanding of these catastrophic processes will provide insight into how initial errors in mitosis can precipitate rapid cancer genome evolution.


Subject(s)
Chromothripsis , Neoplasms , Humans , Chromosome Aberrations , Mitosis/genetics , Genomic Instability , Neoplasms/genetics
9.
Cell Genom ; 3(11): 100437, 2023 Nov 08.
Article in English | MEDLINE | ID: mdl-38020969

ABSTRACT

Pioneering advances in genome engineering, and specifically in genome writing, have revolutionized the field of synthetic biology, propelling us toward the creation of synthetic genomes. The Sc2.0 project aims to build the first fully synthetic eukaryotic organism by assembling the genome of Saccharomyces cerevisiae. With the completion of synthetic chromosome VIII (synVIII) described here, this goal is within reach. In addition to writing the yeast genome, we sought to manipulate an essential functional element: the point centromere. By relocating the native centromere sequence to various positions along chromosome VIII, we discovered that the minimal 118-bp CEN8 sequence is insufficient for conferring chromosomal stability at ectopic locations. Expanding the transplanted sequence to include a small segment (∼500 bp) of the CDEIII-proximal pericentromere improved chromosome stability, demonstrating that minimal centromeres display context-dependent functionality.

10.
J Comput Biol ; 30(12): 1277-1288, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37883640

ABSTRACT

The transposition distance problem is a classical problem in genome rearrangements, which seeks to determine the minimum number of transpositions needed to transform a linear chromosome into another represented by the permutations π and σ, respectively. This article focuses on the equivalent problem of sorting by transpositions (SBT), where σ is the identity permutation ι. Specifically, we investigate palisades, a family of permutations that are "hard" to sort, as they require numerous transpositions above the celebrated lower bound devised by Bafna and Pevzner. By determining the transposition distance of palisades, we were able to provide the exact transposition diameter for 3-permutations (TD3), a special subset of the symmetric group Sn, essential for the study of approximate solutions for SBT using the simplification technique. The exact value for TD3 has remained unknown since Elias and Hartman showed an upper bound for it. Another consequence of determining the transposition distance of palisades is that, using as lower bound the one by Bafna and Pevzner, it is impossible to guarantee approximation ratios lower than 1.375 when approximating SBT. This finding has significant implications for the study of SBT, as this problem has been the subject of intense research efforts for the past 25 years.


Subject(s)
Algorithms , Genome , Gene Rearrangement , Models, Genetic
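
Note on the Bafna and Pevzner lower bound mentioned in record 10: in its usual statement, the transposition distance of a permutation of n elements is at least (n + 1 - c_odd) / 2, where c_odd is the number of alternating cycles with an odd number of black edges in the cycle graph of the permutation. The sketch below assumes that textbook formulation and the common convention of extending the permutation with 0 and n + 1; the function name is hypothetical and the code is not taken from the article.

```python
def odd_cycle_lower_bound(perm):
    """Lower bound (n + 1 - c_odd) / 2 on the transposition distance.

    perm is a permutation of 1..n given as a sequence of ints.
    """
    n = len(perm)
    ext = [0] + list(perm) + [n + 1]          # extended permutation
    pos = {value: index for index, value in enumerate(ext)}
    # Black edge i (1 <= i <= n + 1) runs from ext[i] to ext[i - 1];
    # gray edges run from value v to value v + 1.
    visited = [False] * (n + 2)
    odd_cycles = 0
    for start in range(1, n + 2):
        if visited[start]:
            continue
        black_edges = 0
        edge = start
        while not visited[edge]:
            visited[edge] = True
            black_edges += 1
            value = ext[edge - 1]             # where the black edge ends
            edge = pos[value + 1]             # follow gray edge, take next black edge
        if black_edges % 2 == 1:
            odd_cycles += 1
    return (n + 1 - odd_cycles) // 2


print(odd_cycle_lower_bound([3, 2, 1]))
```

The bound is not always tight, which is the gap the abstract refers to when it describes permutations that need transpositions above the lower bound.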
11.
Cell ; 186(20): 4404-4421.e20, 2023 09 28.
Article in English | MEDLINE | ID: mdl-37774679

ABSTRACT

Persistent DNA double-strand breaks (DSBs) in neurons are an early pathological hallmark of neurodegenerative diseases including Alzheimer's disease (AD), with the potential to disrupt genome integrity. We used single-nucleus RNA-seq in human postmortem prefrontal cortex samples and found that excitatory neurons in AD were enriched for somatic mosaic gene fusions. Gene fusions were particularly enriched in excitatory neurons with DNA damage repair and senescence gene signatures. In addition, somatic genome structural variations and gene fusions were enriched in neurons burdened with DSBs in the CK-p25 mouse model of neurodegeneration. Neurons enriched for DSBs also had elevated levels of cohesin along with progressive multiscale disruption of the 3D genome organization aligned with transcriptional changes in synaptic, neuronal development, and histone genes. Overall, this study demonstrates the disruption of genome stability and the 3D genome organization by DSBs in neurons as pathological steps in the progression of neurodegenerative diseases.


Subject(s)
DNA Breaks, Double-Stranded , Neurodegenerative Diseases , Animals , Humans , Mice , Alzheimer Disease/genetics , DNA , DNA Repair/genetics , Neurodegenerative Diseases/genetics , Neurons/physiology , Single-Cell Analysis , Sequence Analysis, RNA , Genomic Instability
12.
Bull Math Biol ; 85(11): 107, 2023 09 25.
Article in English | MEDLINE | ID: mdl-37749280

ABSTRACT

Early literature on genome rearrangement modelling views the problem of computing evolutionary distances as an inherently combinatorial one. In particular, attention is given to estimating distances using the minimum number of events required to transform one genome into another. In hindsight, this approach is analogous to early methods for inferring phylogenetic trees from DNA sequences such as maximum parsimony: both are motivated by the principle that the true distance minimises evolutionary change, and both are effective if this principle is a true reflection of reality. Recent literature considers genome rearrangement under statistical models, continuing this parallel with DNA-based methods, with the goal of using model-based methods (for example maximum likelihood techniques) to compute distance estimates that incorporate the large number of rearrangement paths that can transform one genome into another. Crucially, this approach requires one to decide upon a set of feasible rearrangement events and, in this paper, we focus on characterising well-motivated models for signed, uni-chromosomal circular genomes, where the number of regions remains fixed. Since rearrangements are often mathematically described using permutations, we isolate the sets of permutations representing rearrangements that are biologically reasonable in this context, for example inversions and transpositions. We provide precise mathematical expressions for these rearrangements, and then describe them in terms of the set of cuts made in the genome when they are applied. We directly compare cuts to breakpoints, and use this concept to count the distinct rearrangement actions which apply a given number of cuts. Finally, we provide some examples of rearrangement models, and include a discussion of some questions that arise when defining plausible models.


Subject(s)
Gene Rearrangement , Mathematical Concepts , Phylogeny , Models, Biological , Genome , Algorithms , Models, Genetic
13.
J Math Biol ; 87(2): 25, 2023 07 10.
Article in English | MEDLINE | ID: mdl-37423919

ABSTRACT

Genome rearrangements are evolutionary events that shuffle genomic architectures. The number of genome rearrangements that happened between two genomes is often used as the evolutionary distance between these species. This number is often estimated as the minimum number of genome rearrangements required to transform one genome into another, an estimate that is reliable only for closely related genomes. These estimations often underestimate the evolutionary distance for genomes that have substantially evolved from each other, and advanced statistical methods can be used to improve accuracy. Several statistical estimators have been developed, under various evolutionary models, of which the most complete one, INFER, takes into account different degrees of genome fragility. We present TruEst, an efficient tool that estimates the evolutionary distance between the genomes under the INFER model of genome rearrangements. We apply our method to both simulated and real data. It shows high accuracy on the simulated data. On the real datasets of mammal genomes, the method found several pairs of genomes for which the estimated distances are highly consistent with previous ancestral reconstruction studies.


Subject(s)
Biological Evolution , Evolution, Molecular , Animals , Genomics/methods , Genome , Gene Rearrangement , Mammals/genetics , Algorithms , Phylogeny , Models, Genetic
14.
J Bioinform Comput Biol ; 21(2): 2350009, 2023 04.
Article in English | MEDLINE | ID: mdl-37104034

ABSTRACT

Genome rearrangement events are widely used to estimate a minimum-size sequence of mutations capable of transforming a genome into another. The length of this sequence is called distance, and determining it is the main goal in genome rearrangement distance problems. Problems in the genome rearrangement field differ regarding the set of rearrangement events allowed and the genome representation. In this work, we consider the scenario where the genomes share the same set of genes, gene orientation is known or unknown, and intergenic regions (structures between a pair of genes and at the extremities of the genome) are taken into account. We use two models: the first allows only conservative events (reversals and moves), and the second includes non-conservative events (insertions and deletions) in the intergenic regions. We show that both models result in NP-hard problems whether gene orientation is known or unknown. When the information regarding the orientation of genes is available, we present for both models an approximation algorithm with a factor of 2. For the scenario where this information is unavailable, we propose a 4-approximation algorithm for both models.


Subject(s)
Gene Rearrangement , Models, Genetic , DNA, Intergenic/genetics , Genome , Mutation , Algorithms
15.
Chromosome Res ; 31(1): 2, 2023 01 20.
Article in English | MEDLINE | ID: mdl-36662301

ABSTRACT

Karyotypes are generally conserved between closely related species and large chromosome rearrangements typically have negative fitness consequences in heterozygotes, potentially driving speciation. In the order Lepidoptera, most investigated species have the ancestral karyotype and gene synteny is often conserved across deep divergence, although examples of extensive genome reshuffling have recently been demonstrated. The genus Leptidea has an unusual level of chromosome variation and rearranged sex chromosomes, but the extent of restructuring across the rest of the genome is so far unknown. To explore the genomes of the wood white (Leptidea) species complex, we generated eight genome assemblies using a combination of 10X linked reads and HiC data, and improved them using linkage maps for two populations of the common wood white (L. sinapis) with distinct karyotypes. Synteny analysis revealed an extensive amount of rearrangements, both compared to the ancestral karyotype and between the Leptidea species, where only one of the three Z chromosomes was conserved across all comparisons. Most restructuring was explained by fissions and fusions, while translocations appear relatively rare. We further detected several examples of segregating rearrangement polymorphisms supporting a highly dynamic genome evolution in this clade. Fusion breakpoints were enriched for LINEs and LTR elements, which suggests that ectopic recombination might be an important driver in the formation of new chromosomes. Our results show that chromosome count alone may conceal the extent of genome restructuring and we propose that the amount of genome evolution in Lepidoptera might still be underestimated due to lack of taxonomic sampling.


Subject(s)
Butterflies , Animals , Butterflies/genetics , Wood , Chromosome Mapping , Genome , Synteny , Sex Chromosomes , Evolution, Molecular
16.
Bioessays ; 44(10): e2100267, 2022 10.
Article in English | MEDLINE | ID: mdl-36050893

ABSTRACT

Knowledge of eukaryotic life cycles and associated genome dynamics stems largely from research on animals, plants, and a small number of "model" (i.e., easily cultivable) lineages. This skewed sampling results in an underappreciation of the variability among the many microeukaryotic lineages, which represent the bulk of eukaryotic biodiversity. The range of complex nuclear transformations that exists within lineages of microbial eukaryotes challenges the textbook understanding of genome and nuclear cycles. Here, we look in-depth at Foraminifera, an ancient (∼600 million-year-old) lineage widely studied as proxies in paleoceanography and environmental biomonitoring. We demonstrate that Foraminifera challenge the "rules" of life cycles developed largely from studies of plants and animals. To this end, we synthesize data on foraminiferal life cycles, focusing on extensive endoreplication within individuals (i.e., single cells), the unusual nuclear process called Zerfall, and the separation of germline and somatic function into distinct nuclei (i.e., heterokaryosis). These processes highlight complexities within lineages and expand our understanding of the dynamics of eukaryotic genomes.


Subject(s)
Foraminifera , Animals , Biodiversity , Eukaryota/genetics , Eukaryotic Cells , Foraminifera/genetics , Genome/genetics
17.
Genes (Basel) ; 13(7)2022 07 14.
Article in English | MEDLINE | ID: mdl-35886027

ABSTRACT

Eukaryotic DNA replication is regulated by conserved mechanisms that bring about a spatial and temporal organization in which distinct genomic domains are copied at characteristic times during S phase. Although this replication program has been closely linked with genome architecture, we still do not understand key aspects of how chromosomal context modulates the activity of replication origins. To address this question, we have exploited models that combine engineered genomic rearrangements with the unique replication programs of post-quiescence and pre-meiotic S phases. Our results demonstrate that large-scale inversions surprisingly do not affect cell proliferation and meiotic progression, despite inducing a restructuring of replication domains on each rearranged chromosome. Remarkably, these alterations in the organization of DNA replication are entirely due to changes in the positions of existing origins along the chromosome, as their efficiencies remain virtually unaffected genome wide. However, we identified striking alterations in origin firing proximal to the fusion points of each inversion, suggesting that the immediate chromosomal neighborhood of an origin is a crucial determinant of its activity. Interestingly, the impact of genome reorganization on replication initiation is highly comparable in the post-quiescent and pre-meiotic S phases, despite the differences in DNA metabolism in these two physiological states. Our findings therefore shed new light on how origin selection and the replication program are governed by chromosomal architecture.


Subject(s)
Genome, Fungal , Replication Origin , Chromosomes/genetics , DNA Replication/genetics , Replication Origin/genetics , S Phase
18.
Genome Biol Evol ; 14(4)2022 04 10.
Article in English | MEDLINE | ID: mdl-35420669

ABSTRACT

Members of the Peronosporaceae (Oomycota, Chromista), which currently consists of 25 genera and approximately 1,000 recognized species, are responsible for disease on a wide range of plant hosts. Molecular phylogenetic analyses over the last two decades have improved our understanding of evolutionary relationships within Peronosporaceae. To date, 16 numbered and three named clades have been recognized; it is clear from these studies that the current taxonomy does not reflect evolutionary relationships. Whole organelle genome sequences are an increasingly important source of phylogenetic information, and in this study, we present comparative and phylogenetic analyses of mitogenome sequences from 15 of the 19 currently recognized clades of Peronosporaceae, including 44 newly assembled sequences. Our analyses suggest strong conservation of mitogenome size and gene content across Peronosporaceae but, as previous studies have suggested, limited conservation of synteny. Specifically, we identified 28 distinct syntenies amongst the 71 examined isolates. Moreover, 19 of the isolates contained inverted or direct repeats, suggesting repeated sequences may be more common than previously thought. In terms of phylogenetic relationships, our analyses of 34 concatenated mitochondrial gene sequences resulted in a topology that was broadly consistent with previous studies. However, unlike previous studies, concatenated mitochondrial sequences provided strong support for higher-level relationships within the family.


Subject(s)
Genome, Mitochondrial , Oomycetes , Evolution, Molecular , Genes, Mitochondrial , Oomycetes/genetics , Phylogeny , Synteny
19.
Algorithms Mol Biol ; 17(1): 1, 2022 Jan 15.
Article in English | MEDLINE | ID: mdl-35033127

ABSTRACT

BACKGROUND: SORTING BY TRANSPOSITIONS (SBT) is a classical problem in genome rearrangements. In 2012, SBT was proven to be [Formula: see text]-hard and the best approximation algorithm with a 1.375 ratio was proposed in 2006 by Elias and Hartman (EH algorithm). Their algorithm employs simplification, a technique used to transform an input permutation [Formula: see text] into a simple permutation [Formula: see text], presumably easier to handle. The permutation [Formula: see text] is obtained by inserting new symbols into [Formula: see text] in a way that the lower bound of the transposition distance of [Formula: see text] is kept on [Formula: see text]. The simplification is guaranteed to keep the lower bound, not the transposition distance. A sequence of operations sorting [Formula: see text] can be mimicked to sort [Formula: see text]. RESULTS AND CONCLUSIONS: First, using an algebraic approach, we propose a new upper bound for the transposition distance, which holds for all [Formula: see text]. Next, motivated by a problem identified in the EH algorithm, which causes it, in scenarios involving how the input permutation is simplified, to require one extra transposition above the 1.375-approximation ratio, we propose a new approximation algorithm to solve SBT ensuring the 1.375-approximation ratio for all [Formula: see text]. We implemented our algorithm and EH's. Regarding the implementation of the EH algorithm, two other issues were identified and needed to be fixed. We tested both algorithms against all permutations of size n, [Formula: see text]. The results show that the EH algorithm exceeds the approximation ratio of 1.375 for permutations with a size greater than 7. The percentages of computed distances equal to the transposition distance, as computed by the implemented algorithms, are also compared with others available in the literature. Finally, we investigate the performance of both implementations on longer permutations of maximum length 500. From the experiments, we conclude that the maximum and average distances computed by our algorithm are a little better than the ones computed by the EH algorithm and the running times of both algorithms are similar, despite the time complexity of our algorithm being higher.
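
Note on exhaustive testing in record 19: checking an approximation algorithm against all small permutations requires exact transposition distances as a reference. One simple way to obtain them, sketched here as an assumption and not as the authors' implementation, is a breadth-first search over all transpositions; this is only feasible for small n because the search space grows factorially. The function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def apply_transposition(perm, i, j, k):
    """Exchange the adjacent blocks perm[i:j] and perm[j:k], with i < j < k."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def transposition_distance(perm):
    """Exact transposition distance to the identity by breadth-first search."""
    start = tuple(perm)
    target = tuple(sorted(start))
    if start == target:
        return 0
    n = len(start)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        current, dist = queue.popleft()
        # Every choice of three cut points i < j < k defines one transposition.
        for i, j, k in combinations(range(n + 1), 3):
            nxt = apply_transposition(current, i, j, k)
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("identity not reachable")  # cannot happen for a permutation


print(transposition_distance([3, 2, 1]))
```

Dividing the distance returned by an approximation algorithm by this exact value gives the empirical ratio that the abstract compares against 1.375.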

20.
J Comput Biol ; 29(3): 243-256, 2022 03.
Article in English | MEDLINE | ID: mdl-34724796

ABSTRACT

In the comparative genomics field, one way to infer the evolutionary distance between two organisms of related species is by finding the minimum number of large-scale mutations, called genome rearrangements, that transform one genome into the other. This number is referred to as the rearrangement distance. Since problems in this area emerged in the mid-1990s, several genome rearrangements have been proposed. Rearrangements that do not alter the genome content are called conservative, and in this group we have the following: the reversal, which inverts a segment of the genome; the transposition, which exchanges two consecutive segments; and the double cut and join, which cuts two different pairs of adjacent blocks and joins them differently. Seminal works compared genomes sharing the same set of conserved blocks, but researchers have since started looking at genomes with unequal gene content by allowing the use of nonconservative rearrangements such as insertion and deletion (jointly called indel). The transposition distance and the transposition and indel distance are both NP-hard. We investigate the transposition and indel distance and present a structure called labeled cycle graph, representing an instance of rearrangement distance problems for genomes with unequal gene content. This structure is used to devise a lower bound and a 2-approximation algorithm for the transposition and indel distance.


Subject(s)
Genome , INDEL Mutation , Algorithms , Gene Rearrangement , Genomics , Models, Genetic
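
Note on the rearrangement operations named in record 20: the reversal, transposition, insertion, and deletion can be illustrated on a genome modelled as a list of integers. The sketch below is an assumption made for illustration; in particular, flipping signs on reversal follows the common signed-genome convention and may differ from the model used in the article, and all function names are hypothetical. The double cut and join operation is omitted.

```python
def reversal(genome, i, j):
    """Invert genome[i:j]: reverse the order and flip each sign (signed model)."""
    return genome[:i] + [-g for g in reversed(genome[i:j])] + genome[j:]


def transposition(genome, i, j, k):
    """Exchange the consecutive segments genome[i:j] and genome[j:k]."""
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


def insertion(genome, i, segment):
    """Insert a segment of new blocks before position i (nonconservative)."""
    return genome[:i] + list(segment) + genome[i:]


def deletion(genome, i, j):
    """Delete the blocks genome[i:j] (nonconservative)."""
    return genome[:i] + genome[j:]


genome = [1, 2, 3, 4, 5]
print(reversal(genome, 1, 4))
print(transposition(genome, 0, 2, 5))
print(insertion(genome, 2, [9]))
print(deletion(genome, 1, 3))
```

The first two operations keep the set of blocks unchanged, which is what makes them conservative; the last two change the genome content and correspond to the indels discussed in the abstract.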