1.
Dalton Trans ; 53(22): 9516-9525, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38767874

ABSTRACT

A set of metallocene olefin polymerization catalysts bearing triptycene moieties in either position 4-5 (complexes Ty1-Ty5) or in position 5-6 (complexes Ty6-Ty8) of the basic dimethylsilyl-bridged bis(indenyl) system has been tested in propene polymerization and in ethene/1-hexene copolymerization. Comparison of the results with QSPR (quantitative structure-property relationship) predictions not parametrized for these exotic ligand variations demonstrates that trends can still be identified by extrapolation. Interestingly, Ty7, upon suitable activation, provides a highly isotactic polypropylene with an exceptional amount of 2,1 regio-errors (8%). The previously developed QSPR type models successfully predicted the low regioselectivity of this catalyst, despite the fact that the catalyst structure differs significantly from the benchmark set.

2.
J Math Biol ; 87(2): 25, 2023 07 10.
Article in English | MEDLINE | ID: mdl-37423919

ABSTRACT

Genome rearrangements are evolutionary events that shuffle genomic architectures. The number of genome rearrangements that happened between two genomes is often used as the evolutionary distance between these species. This number is often estimated as the minimum number of genome rearrangements required to transform one genome into another, an estimate that is reliable only for closely related genomes. Such estimates often underestimate the evolutionary distance for genomes that have substantially diverged from each other, and advanced statistical methods can be used to improve accuracy. Several statistical estimators have been developed under various evolutionary models, of which the most complete one, INFER, takes into account different degrees of genome fragility. We present TruEst, an efficient tool that estimates the evolutionary distance between genomes under the INFER model of genome rearrangements. We apply our method to both simulated and real data. It shows high accuracy on the simulated data. On real datasets of mammal genomes, the method found several pairs of genomes for which the estimated distances are highly consistent with previous ancestral reconstruction studies.


Subjects
Biological Evolution , Molecular Evolution , Animals , Genomics/methods , Genome , Gene Rearrangement , Mammals/genetics , Algorithms , Phylogeny , Genetic Models
3.
Gigascience ; 11, 2022 03 12.
Article in English | MEDLINE | ID: mdl-35277961

ABSTRACT

BACKGROUND: The barnacles are a group of >2,000 species that have fascinated biologists, including Darwin, for centuries. Their lifestyles are extremely diverse, from free-swimming larvae to sessile adults, and even root-like endoparasites. Barnacles also cause hundreds of millions of dollars of losses annually due to biofouling. However, genomic resources for crustaceans, and barnacles in particular, are lacking. RESULTS: Using 62× Pacific Biosciences coverage, 189× Illumina whole-genome sequencing coverage, 203× HiC coverage, and 69× CHi-C coverage, we produced a chromosome-level genome assembly of the gooseneck barnacle Pollicipes pollicipes. The P. pollicipes genome is 770 Mb long and its assembly is one of the most contiguous and complete crustacean genomes available, with a scaffold N50 of 47 Mb and 90.5% of the BUSCO Arthropoda gene set. Using the genome annotation produced here along with transcriptomes of 13 other barnacle species, we completed phylogenomic analyses on a nearly 2 million amino acid alignment. Contrary to previous studies, our phylogenies suggest that the Pollicipedomorpha is monophyletic and sister to the Balanomorpha, which alters our understanding of barnacle larval evolution and suggests homoplasy in a number of naupliar characters. We also compared transcriptomes of P. pollicipes nauplius larvae and adults and found that nearly one-half of the genes in the genome are differentially expressed, highlighting the vastly different transcriptomes of larvae and adult gooseneck barnacles. Annotation of the genes with KEGG and GO terms reveals that these stages exhibit many differences including cuticle binding, chitin binding, microtubule motor activity, and membrane adhesion. CONCLUSION: This study provides high-quality genomic resources for a key group of crustaceans. This is especially valuable given the roles P. pollicipes plays in European fisheries, as a sentinel species for coastal ecosystems, and as a model for studying barnacle adhesion as well as its key position in the barnacle tree of life. A combination of genomic, phylogenetic, and transcriptomic analyses here provides valuable insights into the evolution and development of barnacles.


Subjects
Thoracica , Animals , Chromosomes , Ecosystem , Phylogeny , Thoracica/genetics , Thoracica/metabolism , Transcriptome
4.
Bioinformatics ; 38(2): 357-363, 2022 01 03.
Article in English | MEDLINE | ID: mdl-34601581

ABSTRACT

MOTIVATION: High plasticity of bacterial genomes is provided by numerous mechanisms including horizontal gene transfer and recombination via numerous flanking repeats. Genome rearrangements such as inversions, deletions, insertions and duplications may independently occur in different strains, providing parallel adaptation or phenotypic diversity. Specifically, such rearrangements might be responsible for virulence, antibiotic resistance and antigenic variation. However, identification of such events requires laborious manual inspection and verification of phyletic pattern consistency. RESULTS: Here, we define the term 'parallel rearrangements' as events that occur independently in phylogenetically distant bacterial strains and present a formalization of the problem of parallel rearrangements calling. We implement an algorithmic solution for the identification of parallel rearrangements in bacterial populations as a tool PaReBrick. The tool takes a collection of strains represented as a sequence of oriented synteny blocks and a phylogenetic tree as input data. It identifies rearrangements, tests them for consistency with a tree, and sorts the events by their parallelism score. The tool provides diagrams of the neighbors for each block of interest, allowing the detection of horizontally transferred blocks or their extra copies and the inversions in which copied blocks are involved. We demonstrated PaReBrick's efficiency and accuracy and showed its potential to detect genome rearrangements responsible for pathogenicity and adaptation in bacterial genomes. AVAILABILITY AND IMPLEMENTATION: PaReBrick is written in Python and is available on GitHub: https://github.com/ctlab/parallel-rearrangements. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Antigenic Variation , Bacterial Genome , Phylogeny , Synteny , Software
5.
Front Microbiol ; 12: 628622, 2021.
Article in English | MEDLINE | ID: mdl-33912145

ABSTRACT

Shigella are pathogens originating within the Escherichia lineage but frequently classified as a separate genus. Shigella genomes contain numerous insertion sequences (ISs) that lead to pseudogenisation of affected genes and an increase in non-homologous recombination. Here, we study 414 genomes of E. coli and Shigella strains to assess the contribution of genomic rearrangements to Shigella evolution. We found that Shigella experienced exceptionally high rates of intragenomic rearrangements and had a decreased rate of homologous recombination compared to pathogenic and non-pathogenic E. coli. The high rearrangement rate resulted in independent disruption of syntenic regions and parallel rearrangements in different Shigella lineages. Specifically, we identified two types of chromosomally encoded E3 ubiquitin-protein ligases acquired independently by all Shigella strains that also showed a high level of sequence conservation in the promoter and further in the 5'-intergenic region. In the only available enteroinvasive E. coli (EIEC) strain, which is a pathogenic E. coli with a phenotype intermediate between Shigella and non-pathogenic E. coli, we found a rate of genome rearrangements comparable to those in other E. coli and no functional copies of the two Shigella-specific E3 ubiquitin ligases. These data indicate that the accumulation of ISs influenced many aspects of genome evolution and played an important role in the evolution of intracellular pathogens. Our research demonstrates the power of comparative genomics based on synteny block composition and the important role of non-coding regions in the evolution of genomic islands.

6.
Gigascience ; 10(3), 2021 03 15.
Article in English | MEDLINE | ID: mdl-33718948

ABSTRACT

BACKGROUND: Anopheles coluzzii and Anopheles arabiensis belong to the Anopheles gambiae complex and are among the major malaria vectors in sub-Saharan Africa. However, chromosome-level reference genome assemblies are still lacking for these medically important mosquito species. FINDINGS: In this study, we produced de novo chromosome-level genome assemblies for A. coluzzii and A. arabiensis using the long-read Oxford Nanopore sequencing technology and the Hi-C scaffolding approach. The total assembly sizes are 273.4 Mb for A. coluzzii and 256.8 Mb for A. arabiensis. Each assembly consists of 3 chromosome-scale scaffolds (X, 2, 3), a complete mitochondrion, and unordered contigs identified as autosomal pericentromeric DNA, X pericentromeric DNA, and Y sequences. Comparison of these assemblies with the existing assemblies for these species demonstrated that we obtained improved reference-quality genomes. The new assemblies allowed us to identify genomic coordinates for the breakpoint regions of fixed and polymorphic chromosomal inversions in A. coluzzii and A. arabiensis. CONCLUSION: The new chromosome-level assemblies will facilitate functional and population genomic studies in A. coluzzii and A. arabiensis. The presented assembly pipeline will accelerate progress toward creating high-quality genome references for other disease vectors.


Subjects
Anopheles , Malaria , Animals , Anopheles/genetics , Chromosomes/genetics , Genomics , Malaria/genetics , Mosquito Vectors/genetics
7.
BMC Bioinformatics ; 21(Suppl 6): 261, 2020 Nov 18.
Article in English | MEDLINE | ID: mdl-33203350

ABSTRACT

BACKGROUND: Integrative network methods are commonly used for interpretation of high-throughput experimental biological data: transcriptomics, proteomics, metabolomics and others. One of the common approaches is finding a connected subnetwork of a global interaction network that best encompasses significant individual changes in the data and represents a so-called active module. Usually methods implementing this approach find a single subnetwork and thus solve a hard classification problem for vertices. This subnetwork inherently contains erroneous vertices, while no instrument is provided to estimate the confidence level of any particular vertex inclusion. To address this issue, in the current study we consider the active module problem as a soft classification problem. RESULTS: We propose a method to estimate the probability that each vertex belongs to the active module, based on Markov chain Monte Carlo (MCMC) subnetwork sampling. As an example of the performance of our method on real data, we run it on two gene expression datasets. For the first, many-replicate expression dataset we show that the proposed approach is consistent with an existing resampling-based method. On the second dataset the jackknife resampling method is inapplicable due to the small number of biological replicates, but the MCMC method can be run and shows high classification performance. CONCLUSIONS: The proposed method makes it possible to estimate the probability that an individual vertex belongs to the active module as well as the false discovery rate (FDR) for a given set of vertices. Given the estimated probabilities, it becomes possible to provide a connected subgraph in a consistent manner for any given FDR level: no vertex can disappear when the FDR level is relaxed. We show, on both simulated and real datasets, that the proposed method has good computational performance and high classification accuracy.


Subjects
Algorithms , Computational Biology , Markov Chains , Monte Carlo Method , Bayes Theorem , Gene Expression , Probability
8.
Infect Genet Evol ; 82: 104277, 2020 08.
Article in English | MEDLINE | ID: mdl-32151775

ABSTRACT

Currently, the standard practice for assembling next-generation sequencing (NGS) reads of viral genomes is to summarize thousands of individual short reads into a single consensus sequence, thus confounding useful intra-host diversity information for molecular phylodynamic inference. It is hypothesized that a few viral strains may dominate the intra-host genetic diversity with a variety of lower frequency strains comprising the rest of the population. Several software tools currently exist to convert NGS sequence variants into haplotypes. Previous benchmarks of viral haplotype reconstruction programs used simulation scenarios that are useful from a mathematical perspective but do not reflect viral evolution and epidemiology. Here, we tested twelve NGS haplotype reconstruction methods using viral populations simulated under realistic evolutionary dynamics. We simulated coalescent-based populations that spanned known levels of viral genetic diversity, including mutation rates, sample size and effective population size, to test the limits of the haplotype reconstruction methods and to ensure coverage of predicted intra-host viral diversity levels (especially HIV-1). All twelve investigated haplotype callers showed variable performance and produced drastically different results that were mainly driven by differences in mutation rate and, to a lesser extent, in effective population size. Most methods were able to accurately reconstruct haplotypes when genetic diversity was low. However, under higher levels of diversity (e.g., those seen in intra-host HIV-1 infections), haplotype reconstruction quality was highly variable and, on average, poor. All haplotype reconstruction tools, except QuasiRecomb and ShoRAH, greatly underestimated intra-host diversity and the true number of haplotypes. PredictHaplo outperformed the other haplotype reconstruction tools, with the highest precision and recall and the lowest UniFrac distance values, followed by CliqueSNV, which, given more computational time, may have outperformed PredictHaplo. Here, we present an extensive comparison of available viral haplotype reconstruction tools and provide insights for future improvements in haplotype reconstruction tools using both short-read and long-read technologies.


Subjects
Computational Biology/methods , Viral Genome , Haplotypes , High-Throughput Nucleotide Sequencing/methods , Genetic Variation , HIV Infections/virology , HIV-1/genetics , Host-Pathogen Interactions/genetics , Humans , Mutation Rate , Population Density
9.
Bioinformatics ; 36(10): 2993-3003, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32058559

ABSTRACT

MOTIVATION: One of the key computational problems in comparative genomics is the reconstruction of genomes of ancestral species based on genomes of extant species. Since most dramatic changes in genomic architectures are caused by genome rearrangements, this problem is often posed as minimization of the number of genome rearrangements between extant and ancestral genomes. The basic case of three given genomes is known as the genome median problem. Whole-genome duplications (WGDs) represent yet another type of dramatic evolutionary event and inspire the reconstruction of preduplicated ancestral genomes, referred to as the genome halving problem. Generalization of WGDs to whole-genome multiplication events leads to the genome aliquoting problem. RESULTS: In this study, we propose polynomial-size integer linear programming (ILP) formulations for the aforementioned problems. We further obtain such formulations for the restricted and conserved versions of the median and halving problems, which have been recently introduced to improve biological relevance of the solutions. Extensive evaluation of solutions to the different ILP problems demonstrates their good accuracy. Furthermore, since the ILP formulations for the conserved versions have linear size, they provide a novel practical approach to ancestral genome reconstruction, which combines the advantages of homology- and rearrangements-based methods. AVAILABILITY AND IMPLEMENTATION: Code and data are available at https://github.com/AvdeevPavel/ILP-WGD-reconstructor. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome , Linear Programming , Algorithms , Biological Evolution , Molecular Evolution , Genomics , Phylogeny
10.
BMC Bioinformatics ; 20(Suppl 20): 641, 2019 Dec 17.
Article in English | MEDLINE | ID: mdl-31842730

ABSTRACT

BACKGROUND: Many cancer genomes are extensively rearranged with highly aberrant chromosomal karyotypes. Structural and copy number variations in cancer genomes can be determined via abnormal mapping of sequenced reads to the reference genome. Recently it became possible to reconcile both of these types of large-scale variations into a karyotype graph representation of the rearranged cancer genomes. Such a representation, however, does not directly describe the linear and/or circular structure of the underlying rearranged cancer chromosomes, thus limiting possible analysis of the somatic evolutionary process of cancer genomes as well as of the functional genomic changes brought about by large-scale genome rearrangements. RESULTS: Here we address the aforementioned limitation by introducing a novel methodological framework for recovering rearranged cancer chromosomes from karyotype graphs. For a cancer karyotype graph we formulate an Eulerian Decomposition Problem (EDP) of finding a collection of linear and/or circular rearranged cancer chromosomes that are determined by the graph. We derive and prove computational complexities for several variations of the EDP. We then demonstrate that Eulerian decomposition of the cancer karyotype graphs is not always unique, present the Consistent Contig Covering Problem (CCCP) of recovering unambiguous cancer contigs from the cancer karyotype graph, and describe a novel algorithm, CCR, capable of solving CCCP in polynomial time. We apply CCR to a prostate cancer dataset and demonstrate that it is capable of consistently recovering large cancer contigs even when the underlying cancer genomes are highly rearranged. CONCLUSIONS: CCR can recover rearranged cancer contigs from karyotype graphs, thereby addressing an existing limitation in inferring chromosomal structures of rearranged cancer genomes and advancing our understanding of both patient/cancer-specific and overall genetic instability in cancer.


Subjects
Chromosomes/genetics , Gene Rearrangement/genetics , Karyotype , Neoplasms/genetics , Algorithms , Base Sequence , Genome , Humans
11.
J Comput Biol ; 25(11): 1203-1219, 2018 11.
Article in English | MEDLINE | ID: mdl-30133318

ABSTRACT

Construction of phylogenetic trees and networks for extant species from their characters represents one of the key problems in phylogenomics. Since the solution to this problem is not always uniquely defined and multiple methods exist for tree/network construction, it is important to measure how well the constructed networks capture the given character relationship across the species. We propose a novel method for measuring the specificity of a given phylogenetic network in terms of the total number of distributions of homoplasy-free character states at the leaves that the network may impose. While for binary phylogenetic trees, this number has an exact formula and depends only on the number of leaves and character states but not on the tree topology, the situation is much more complicated for nonbinary trees or networks. Nevertheless, we develop an algorithm for combinatorial enumeration of such distributions, which is applicable for arbitrary trees and networks under some reasonable assumptions. We further extend our algorithm to a special class of characters that follow Dollo's law of irreversibility.


Subjects
Algorithms , Computational Biology/methods , Genetic Models , Neural Networks (Computer) , Phylogeny , Child , Color , Humans , Mathematical Concepts
12.
BMC Genomics ; 18(Suppl 4): 356, 2017 05 24.
Article in English | MEDLINE | ID: mdl-28589865

ABSTRACT

BACKGROUND: The ability to estimate the evolutionary distance between extant genomes plays a crucial role in many phylogenomic studies. Often such estimation is based on the parsimony assumption, implying that the distance between two genomes can be estimated as the rearrangement distance, equal to the minimal number of genome rearrangements required to transform one genome into the other. However, in reality the parsimony assumption may not always hold, emphasizing the need for estimation that does not rely on the rearrangement distance. The distance that accounts for the actual (rather than minimal) number of rearrangements between two genomes is often referred to as the true evolutionary distance. While there exists a method for true evolutionary distance estimation, it assumes that genomes can be broken by rearrangements equally likely at any position in the course of evolution. This assumption, known as the random breakage model, has recently been refuted in favor of the more rigorous fragile breakage model postulating that only certain "fragile" genomic regions are prone to rearrangements. RESULTS: We propose a new method for estimating the true evolutionary distance between two genomes under the fragile breakage model. We evaluate the proposed method on simulated genomes, where it shows high accuracy. We further apply the proposed method to the estimation of evolutionary distances within a set of five yeast genomes and a set of two fish genomes. CONCLUSIONS: The true evolutionary distances between the five yeast genomes estimated with the proposed method reveal that some pairs of yeast genomes violate the parsimony assumption. The proposed method further demonstrates that the rearrangement distance between the two fish genomes underestimates their evolutionary distance by about 20%. These results demonstrate how drastically the two distances can differ and justify the use of true evolutionary distance in phylogenomic studies.


Subjects
Molecular Evolution , Genetic Models , Genomics , Phylogeny
13.
J Comput Biol ; 24(2): 93-105, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28045556

ABSTRACT

Genome rearrangements can be modeled as k-breaks, which break a genome at k positions and glue the resulting fragments in a new order. In particular, reversals, translocations, fusions, and fissions are modeled as 2-breaks, and transpositions are modeled as 3-breaks. Although k-break rearrangements for k > 3 have not been observed in evolution, they are used in cancer genomics to model chromothripsis, a catastrophic event of multiple breakages happening simultaneously in a genome. It is known that the k-break distance between two genomes (i.e., the minimum number of k-breaks required to transform one genome into the other) can be computed in terms of cycle lengths in the breakpoint graph of these genomes. In this work, we address the combinatorial problem of enumerating genomes at a given k-break distance from a fixed unichromosomal genome. More generally, we enumerate genome pairs, whose breakpoint graph has a given distribution of cycle lengths. We further show how our enumeration can be used for uniform sampling of random genomes at a given k-break distance, and describe its connection to various combinatorial objects such as Bell polynomials.


Subjects
Algorithms , Chromosome Breakage , Chromothripsis , Gene Rearrangement , Genome , Genomics/methods , Animals , Computer Graphics , Molecular Evolution , Humans , Genetic Models
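
Illustrative note on entry 13: the abstract states that the k-break distance can be computed from cycle lengths in the breakpoint graph. For the simplest case only (k = 2, two circular unichromosomal genomes over the same genes), the well-known formula is distance = number of genes minus number of alternating cycles. The Python sketch below is a minimal illustration of that special case and is not code from the paper; the function names `adjacencies` and `two_break_distance` and the signed-integer genome encoding are assumptions made here for the example.

```python
def adjacencies(genome):
    """Map every gene extremity to its neighbour in a circular signed genome.

    A genome is a list of non-zero signed integers, e.g. [1, -2, 3].
    Each gene g has a tail (g, 't') and a head (g, 'h'); a gene read in the
    positive direction is entered at its tail and left at its head.
    """
    def leaving_end(g):
        return (abs(g), 'h' if g > 0 else 't')

    def entering_end(g):
        return (abs(g), 't' if g > 0 else 'h')

    adj = {}
    n = len(genome)
    for i in range(n):
        a = leaving_end(genome[i])
        b = entering_end(genome[(i + 1) % n])  # wrap around: circular genome
        adj[a] = b
        adj[b] = a
    return adj


def two_break_distance(p, q):
    """2-break distance between circular genomes p and q on the same genes.

    Counts alternating cycles in the breakpoint graph (edges of p and edges
    of q on the shared set of gene extremities) and returns genes - cycles.
    """
    adj_p, adj_q = adjacencies(p), adjacencies(q)
    seen = set()
    cycles = 0
    for start in adj_p:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = adj_p[v]      # follow an edge of genome p
            seen.add(w)
            v = adj_q[w]      # then an edge of genome q
    return len(p) - cycles


if __name__ == "__main__":
    print(two_break_distance([1, 2, 3], [1, -2, 3]))
```

In this encoding a single reversal of gene 2 merges two trivial cycles into one, so the two genomes in the usage line differ by one 2-break. Linear chromosomes and k > 2 need additional bookkeeping that this sketch deliberately leaves out.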
14.
BMC Bioinformatics ; 17(Suppl 14): 418, 2016 Nov 11.
Article in English | MEDLINE | ID: mdl-28185564

ABSTRACT

BACKGROUND: Genome median and genome halving are combinatorial optimization problems that aim at reconstruction of ancestral genomes by minimizing the number of evolutionary events between them and genomes of the extant species. While these problems have been widely studied in past decades, their solutions are often either not efficient or not biologically adequate. These shortcomings have been recently addressed by restricting the problems' solution space. RESULTS: We show that the restricted variants of genome median and halving problems are, in fact, closely related. We demonstrate that these problems have a neat topological interpretation in terms of embedded graphs and polygon gluings. We illustrate how such an interpretation can lead to solutions to these problems in particular cases. CONCLUSIONS: This study provides an unexpected link between comparative genomics and topology, and demonstrates advantages of solving genome median and halving problems within the topological framework.


Subjects
Genomics , Genetic Models , Genome
15.
PLoS One ; 10(6): e0129566, 2015.
Article in English | MEDLINE | ID: mdl-26075913

ABSTRACT

A high-throughput screen for compounds that induce TRAIL-mediated apoptosis identified ML100 as an active chemical probe, which potentiated TRAIL activity in prostate carcinoma PPC-1 and melanoma MDA-MB-435 cells. Follow-up in silico modeling and profiling in cell-based assays allowed us to identify NSC130362, a pharmacophore analog of ML100 that induced 65-95% cytotoxicity in cancer cells and did not affect the viability of human primary hepatocytes. In agreement with the activation of the apoptotic pathway, both ML100 and NSC130362 acted synergistically with TRAIL to induce caspase-3/7 activity in MDA-MB-435 cells. Subsequent affinity chromatography and inhibition studies convincingly demonstrated that glutathione reductase (GSR), a key component of the oxidative stress response, is a target of NSC130362. In accordance with the role of GSR in the TRAIL pathway, GSR gene silencing potentiated TRAIL activity in MDA-MB-435 cells but not in human hepatocytes. Inhibition of GSR activity resulted in the induction of oxidative stress, as was evidenced by an increase in intracellular reactive oxygen species (ROS) and peroxidation of the mitochondrial membrane after NSC130362 treatment in MDA-MB-435 cells but not in human hepatocytes. The antioxidant reduced glutathione (GSH) fully protected MDA-MB-435 cells from cell lysis induced by NSC130362 and TRAIL, thereby further confirming the interplay between GSR and TRAIL. As a consequence of activation of oxidative stress, combined treatment with different oxidative stress inducers and NSC130362 promoted cell death in a variety of cancer cells but not in hepatocytes, both in cell-based assays and in vivo in a mouse tumor xenograft model.


Subjects
Apoptosis/drug effects , Glutathione Reductase/metabolism , High-Throughput Screening Assays , Oxidative Stress , TNF-Related Apoptosis-Inducing Ligand/metabolism , TNF-Related Apoptosis-Inducing Ligand/pharmacology , Animals , Antineoplastic Agents/pharmacology , Tumor Cell Line , Dose-Response Relationship (Drug) , Doxorubicin/pharmacology , Drug Discovery , Glutathione/metabolism , Glutathione Reductase/antagonists & inhibitors , Humans , Mice , Reactive Oxygen Species , Small Molecule Libraries
16.
J Comput Biol ; 21(8): 622-31, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24650202

ABSTRACT

The cycle graph introduced by Bafna and Pevzner is an important tool for evaluating the distance between two genomes, that is, the minimal number of rearrangements needed to transform one genome into another. We interpret this distance in topological terms and relate it to the random matrix theory. Namely, the number of genomes at a given 2-break distance from a fixed one (the Hultman number) is represented by a coefficient in the genus expansion of a matrix integral over the space of complex matrices with the Gaussian measure. We study generating functions for the Hultman numbers and prove that the two-break distance distribution is asymptotically normal.


Subjects
Gene Rearrangement , Genome , Genomics/methods , Genetic Models , Algorithms , Animals , Molecular Evolution , Humans , Normal Distribution