ABSTRACT
We model interactions between cancer cells and viruses during oncolytic viral therapy. One of our primary goals is to identify parameter regions that yield treatment failure or success. We show that the tumor size under therapy at a particular time is less than the size without therapy. Our analysis demonstrates two thresholds for the horizontal transmission rate: a "failure threshold" below which treatment fails, and a "success threshold" above which infection prevalence reaches 100% and the tumor shrinks to its smallest size. Moreover, we explain how changes in the virulence of the virus alter the success threshold and the minimum tumor size. Our study suggests that the optimal virulence of an oncolytic virus depends on the timescale of virus dynamics. We identify a threshold for the virulence of the virus and show how this threshold depends on the timescale of virus dynamics. Our results suggest that when the timescale of virus dynamics is fast, administering a more virulent virus leads to a greater reduction in the tumor size. Conversely, when the viral timescale is slow, higher virulence can induce oscillations with high amplitude in the tumor size. Furthermore, we introduce the concept of a "Hopf bifurcation Island" in the parameter space, an idea that has applications far beyond the results of this paper and is applicable to many mathematical models. We elucidate what a Hopf bifurcation Island is, and we prove that small Islands can imply very slowly growing oscillatory solutions.
Subject(s)
Neoplasms , Oncolytic Virotherapy , Oncolytic Viruses , Oncolytic Virotherapy/methods , Humans , Neoplasms/therapy , Neoplasms/virology , Oncolytic Viruses/physiology , Models, Biological , Virulence , Mathematical Concepts
ABSTRACT
Steady states of dynamical systems, whether stable or unstable, are critical for understanding future evolution. Robust steady states, ones that persist under small changes in the model parameters, are desired when modelling ecological systems, where it is common for accurate and detailed information on functional form and parameters to be unavailable. Previous work by Jahedi et al. [Robustness of solutions of almost every system of equations, SIAM J. Appl. Math. 82(5) (2022), pp. 1791-1807; Structured systems of nonlinear equations, SIAM J. Appl. Math. 83(4) (2023), pp. 1696-1716.] has established criteria to imply the prevalence of robust steady states for systems with minimal predetermined structure, including conventional structured systems. We review that work and extend it by allowing symmetries in the system structure, which present added obstructions to robustness.
Subject(s)
Ecosystem , Models, Biological
ABSTRACT
As the coronavirus pandemic spreads across the globe, people are debating policies to mitigate its severity. Many complex, highly detailed models have been developed to help policy setters make better decisions. However, the basis of these models is unlikely to be understood by non-experts. We describe the advantages of simple models for COVID-19. We say a model is "simple" if its only parameter is the rate of contact between people in the population. This contact rate can vary over time, depending on choices by policy setters. Such models can be understood by a broad audience, and thus can be helpful in explaining policy decisions to the public. They can be used to evaluate the outcomes of different policies. However, simple models have a disadvantage when dealing with inhomogeneous populations. To augment the power of a simple model to evaluate complicated situations, we add what we call "satellite" equations that do not change the original model. For example, with the help of a satellite equation, an individual could estimate their chance of remaining uninfected through the end of an epidemic. Satellite equations can model the effects of the epidemic on high-risk individuals, death rates, and nursing homes and other isolated populations. To compare simple models with complex models, we introduce our "slightly complex" Model J. We find the conclusions of simple and complex models can be quite similar. However, for each added complexity, a modeler may have to choose additional parameter values describing who will infect whom under what conditions, choices for which there is often little rationale but that can have big impacts on predictions. Our simulations suggest that the added complexity offers little predictive advantage.
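The idea of a satellite equation can be illustrated with a minimal sketch. The abstract does not give the authors' equations, so everything below is an assumption made for illustration: a standard SIR model with a time-varying contact rate, a fixed recovery rate (`recovery_rate`), forward-Euler integration, and a hypothetical satellite variable `p`, the probability that one tracked individual is still uninfected. The factor `relative_contact` (also hypothetical) lets the tracked individual have more or fewer contacts than average. The satellite equation reads the infected fraction but never feeds back into the model.

```python
def simulate(contact_rate, days, i0=1e-4, recovery_rate=0.1,
             relative_contact=1.0, dt=0.1):
    """Integrate an SIR model plus one satellite equation (forward Euler).

    contact_rate     -- function t -> beta(t), the time-varying contact rate
    relative_contact -- contact level of the tracked individual relative
                        to the population average (illustrative assumption)
    Returns (s, i, p): susceptible fraction, infected fraction, and the
    probability p that the tracked individual is still uninfected.
    """
    s, i = 1.0 - i0, i0
    p = 1.0
    steps = int(round(days / dt))
    for k in range(steps):
        beta = contact_rate(k * dt)
        new_infections = beta * s * i
        ds = -new_infections
        di = new_infections - recovery_rate * i
        # Satellite equation: depends on i, but s and i do not depend on p.
        dp = -relative_contact * beta * i * p
        s += dt * ds
        i += dt * di
        p += dt * dp
    return s, i, p


if __name__ == "__main__":
    # Hypothetical policies: constant contact rate vs. halving it at day 30.
    _, _, p_free = simulate(lambda t: 0.3, days=200)
    _, _, p_lock = simulate(lambda t: 0.3 if t < 30 else 0.15, days=200)
    print("chance of staying uninfected, no intervention:", round(p_free, 3))
    print("chance of staying uninfected, with intervention:", round(p_lock, 3))
```

With `relative_contact=1.0` the satellite variable simply tracks the susceptible fraction relative to its initial value. Other values describe individuals whose exposure differs from the average, without any change to the underlying model.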
ABSTRACT
The dynamics on a chaotic attractor can be quite heterogeneous, being much more unstable in some regions than others. Some regions of a chaotic attractor can be expanding in more dimensions than other regions. Imagine a situation with two such regions, each containing trajectories that stay in the region for all time, while typical trajectories wander throughout the attractor. Furthermore, if arbitrarily close to each point of the attractor there are points on periodic orbits that have different unstable dimensions, then we say such an attractor is "hetero-chaotic" (i.e., it has heterogeneous chaos). This is hard to picture, but we believe that most physical systems possessing a high-dimensional attractor are of this type. We have created simplified models with that behavior to give insight into real high-dimensional phenomena.
ABSTRACT
A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceed those of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp). Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers, which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms.
Subject(s)
Genome, Plant , Photosynthesis/genetics , Pinaceae/genetics , Pinaceae/metabolism , Pseudotsuga/genetics , Pseudotsuga/metabolism , Whole Genome Sequencing , Adaptation, Biological/genetics , Computational Biology , Evolution, Molecular , Gene Duplication , Gene Regulatory Networks , Genomics , Molecular Sequence Annotation , Multigene Family , Phylogeny , Pinaceae/classification , Proteomics/methods , Pseudotsuga/classification , Repetitive Sequences, Nucleic Acid
ABSTRACT
The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8,206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25,361 bp, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107,821 bp, 61% larger than the previous assembly.
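The N50 statistic quoted above can be made concrete with a short sketch. It uses the common definition: sort the contig (or scaffold) lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative only and are not taken from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy values, total 54
    print(n50(toy_contigs))  # 10 + 9 + 8 = 27 reaches half of 54, so N50 is 8
```

Because the statistic is weighted by length, a handful of long contigs can raise N50 substantially even when millions of short contigs remain.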
Subject(s)
Contig Mapping , Genome, Plant , High-Throughput Nucleotide Sequencing , Pinus taeda/genetics , Sequence Analysis, DNA , Algorithms , Genomics
ABSTRACT
Long sequencing reads generated by single-molecule sequencing technology offer the possibility of dramatically improving the contiguity of genome assemblies. The biggest challenge today is that long reads have relatively high error rates, currently around 15%. The high error rates make it difficult to use this data alone, particularly with highly repetitive plant genomes. Errors in the raw data can lead to insertion or deletion errors (indels) in the consensus genome sequence, which in turn create significant problems for downstream analysis; for example, a single indel may shift the reading frame and incorrectly truncate a protein sequence. Here, we describe an algorithm that solves the high error rate problem by combining long, high-error reads with shorter but much more accurate Illumina sequencing reads, whose error rates average <1%. Our hybrid assembly algorithm combines these two types of reads to construct mega-reads, which are both long and accurate, and then assembles the mega-reads using the CABOG assembler, which was designed for long reads. We apply this technique to a large data set of Illumina and PacBio sequences from the species Aegilops tauschii, a large and extremely repetitive plant genome that has resisted previous attempts at assembly. We show that the resulting assembled contigs are far larger than in any previous assembly, with an N50 contig size of 486,807 nucleotides. We compare the contigs to independently produced optical maps to evaluate their large-scale accuracy, and to a set of high-quality bacterial artificial chromosome (BAC)-based assemblies to evaluate base-level accuracy.
Subject(s)
Contig Mapping/methods , Genome, Plant , Genomics/methods , Poaceae/genetics , Repetitive Sequences, Nucleic Acid , Sequence Analysis, DNA/methods , Software , Contig Mapping/standards , Genome Size , Genomics/standards , Sequence Analysis, DNA/standards
ABSTRACT
Transient chaos is a characteristic behaviour in nonlinear dynamics where trajectories in a certain region of phase space behave chaotically for a while, before escaping to an external attractor. In some situations the escapes are highly undesirable and must be avoided. In this paper, we apply a control method known as partial control that allows one to prevent the escapes of the trajectories to the external attractors, keeping the trajectories in the chaotic region forever. We also show, for the first time, the application of this method in three dimensions, which is the major step forward in this work. To illustrate how the method works, we have chosen the Lorenz system for a choice of parameters where transient chaos appears, as a paradigmatic example in nonlinear dynamics. We analyse three quite different ways to implement the method. First, we apply this method by building a one-dimensional map using the successive maxima of one of the variables. Next, we implement it by building a two-dimensional map through a Poincaré section. Finally, we build a three-dimensional map, which has the advantage of using a fixed time interval between applications of the control, which can be useful for practical applications. This article is part of the themed issue 'Horizons of cybernetical physics'.
ABSTRACT
Until very recently, complete characterization of the megagenomes of conifers has remained elusive. The diploid genome of sugar pine (Pinus lambertiana Dougl.) has a highly repetitive, 31 billion bp genome. It is the largest genome sequenced and assembled to date, and the first from the subgenus Strobus, or white pines, a group that is notable for having the largest genomes among the pines. The genome represents a unique opportunity to investigate genome "obesity" in conifers and white pines. Comparative analysis of P. lambertiana and P. taeda L. reveals new insights on the conservation, age, and diversity of the highly abundant transposable elements, the primary factor determining genome size. As for most North American white pines, the principal pathogen of P. lambertiana is white pine blister rust (Cronartium ribicola J.C. Fischer ex Raben.). Identification of candidate genes for resistance to this pathogen is of great ecological importance. The genome sequence afforded us the opportunity to make substantial progress on locating the major dominant gene for simple resistance hypersensitive response, Cr1. We describe new markers and gene annotation that are both tightly linked to Cr1 in a mapping population, and associated with Cr1 in unrelated sugar pine individuals sampled throughout the species' range, creating a solid foundation for future mapping. This genomic variation and annotated candidate genes characterized in our study of the Cr1 region are resources for future marker-assisted breeding efforts as well as for investigations of fundamental mechanisms of invasive disease and evolutionary response.
Subject(s)
Genome, Plant , Pinus/genetics , Basidiomycota/pathogenicity , DNA Transposable Elements , Genetic Variation , Genome Size , Pinus/immunology , Pinus/microbiology , Plant Immunity/genetics
ABSTRACT
The whole-genome duplication 80 million years ago of the common ancestor of salmonids (salmonid-specific fourth vertebrate whole-genome duplication, Ss4R) provides unique opportunities to learn about the evolutionary fate of a duplicated vertebrate genome in 70 extant lineages. Here we present a high-quality genome assembly for Atlantic salmon (Salmo salar), and show that large genomic reorganizations, coinciding with bursts of transposon-mediated repeat expansions, were crucial for the post-Ss4R rediploidization process. Comparisons of duplicate gene expression patterns across a wide range of tissues with orthologous genes from a pre-Ss4R outgroup unexpectedly demonstrate far more instances of neofunctionalization than subfunctionalization. Surprisingly, we find that genes that were retained as duplicates after the teleost-specific whole-genome duplication 320 million years ago were not more likely to be retained after the Ss4R, and that the duplicate retention was not influenced to a great extent by the nature of the predicted protein interactions of the gene products. Finally, we demonstrate that the Atlantic salmon assembly can serve as a reference sequence for the study of other salmonids for a range of purposes.
Subject(s)
Diploidy , Evolution, Molecular , Gene Duplication/genetics , Genes, Duplicate/genetics , Genome/genetics , Salmo salar/genetics , Animals , DNA Transposable Elements/genetics , Female , Genomics , Male , Models, Genetic , Mutagenesis/genetics , Phylogeny , Reference Standards , Salmo salar/classification , Sequence Homology
ABSTRACT
BACKGROUND: The diversity in eukaryotic life reflects a diversity in regulatory pathways. Nocedal and Johnson argue that the rewiring of gene regulatory networks is a major force for the diversity of life, and that changes in regulation can create new species. RESULTS: We have created a method (based on our new "ping-pong" algorithm) for detecting more complicated rewirings, where several transcription factors can substitute for one or more transcription factors in the regulation of a family of co-regulated genes. An example is illustrative. Hogues et al. have reported a rewiring in which RAP1 in Saccharomyces cerevisiae substitutes for TBF1/CBF1 in Candida albicans for ribosomal RP genes. There, one transcription factor substitutes for another on some collection of genes. Such a substitution is referred to as a "rewiring". We agree with this finding of rewiring as far as it goes, but the situation is more complicated. Many transcription factors can regulate a gene, and our algorithm finds that in this example a "team" (or collection) of three transcription factors including RAP1 substitutes for TBF1 for 19 genes. The switch occurs for a branch of the phylogenetic tree containing 10 species (including Saccharomyces cerevisiae), while the remaining 13 species (including Candida albicans) are regulated by TBF1. CONCLUSIONS: To gain insight into more general evolutionary mechanisms, we have created a mathematical algorithm that finds such general switching events and we prove that it converges. Of course, any such computational discovery should be validated in biological tests. For each branch of the phylogenetic tree and each gene module, our algorithm finds a sub-group of co-regulated genes and a team of transcription factors that substitutes for another team of transcription factors. In most cases the signal will be small, but in some cases we find a strong signal of switching. We report our findings for 23 Ascomycota fungi species.
Subject(s)
Algorithms , Evolution, Molecular , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Transcription Factors/genetics , Candida albicans/classification , Candida albicans/genetics , Candida albicans/metabolism , Gene Regulatory Networks , Phylogeny , Saccharomyces cerevisiae/classification , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Shelterin Complex , Telomere-Binding Proteins/genetics , Transcription Factors/metabolism , Transcription, Genetic
ABSTRACT
Nonlinear systems often give rise to fractal boundaries in phase space, hindering predictability. When a single boundary separates three or more different basins of attraction, we say that the set of basins has the Wada property and initial conditions near that boundary are even more unpredictable. Many physical systems of interest with this topological property appear in the literature. However, so far the only approach to study Wada basins has been restricted to two-dimensional phase spaces. Here we report a simple algorithm whose purpose is to look for the Wada property in a given dynamical system. Another benefit of this procedure is the possibility to classify and study intermediate situations known as partially Wada boundaries.
ABSTRACT
We investigate the geometry of the edge of chaos for a nine-dimensional sinusoidal shear flow model and show how the shape of the edge of chaos changes with increasing Reynolds number. Furthermore, we numerically compute the scaling of the minimum perturbation required to drive the laminar attracting state into the turbulent region. We find this minimum perturbation to scale with the Reynolds number as $Re^{-2}$.
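A scaling exponent of this kind is usually read off as the slope of a straight-line fit on log-log axes. The sketch below shows that generic procedure only; it is not the authors' computation. The function name `fit_power_law` is hypothetical, and the data are synthetic values generated from an exact power law to check the fit, not results from the shear flow model.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = prefactor * x**exponent on log-log axes.

    Returns (exponent, prefactor).
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor


if __name__ == "__main__":
    # Synthetic check data (NOT from the paper): amplitude = 3 * Re**-2.
    reynolds = [200.0, 400.0, 800.0, 1600.0]
    amplitude = [3.0 * re ** -2 for re in reynolds]
    exponent, prefactor = fit_power_law(reynolds, amplitude)
    print(exponent, prefactor)
```

In practice the threshold amplitudes would come from simulations at each Reynolds number, and the quality of the straight-line fit indicates how well a single power law describes them.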
ABSTRACT
MOTIVATION: Illumina Sequencing data can provide high coverage of a genome by relatively short (most often 100 bp to 150 bp) reads at a low cost. Even with low (advertised 1%) error rate, 100 × coverage Illumina data on average has an error in some read at every base in the genome. These errors make handling the data more complicated because they result in a large number of low-count erroneous k-mers in the reads. However, there is enough information in the reads to correct most of the sequencing errors, thus making subsequent use of the data (e.g. for mapping or assembly) easier. Here we use the term "error correction" to denote the reduction in errors due to both changes in individual bases and trimming of unusable sequence. We developed an error correction software called QuorUM. QuorUM is mainly aimed at error correcting Illumina reads for subsequent assembly. It is designed around the novel idea of minimizing the number of distinct erroneous k-mers in the output reads and preserving the most true k-mers, and we introduce a composite statistic π that measures how successful we are at achieving this dual goal. We evaluate the performance of QuorUM by correcting actual Illumina reads from genomes for which a reference assembly is available. RESULTS: We produce trimmed and error-corrected reads that result in assemblies with longer contigs and fewer errors. We compared QuorUM against several published error correctors and found that it is the best performer in most metrics we use. QuorUM is efficiently implemented making use of current multi-core computing architectures and it is suitable for large data sets (1 billion bases checked and corrected per day per core). We also demonstrate that a third-party assembler (SOAPdenovo) benefits significantly from using QuorUM error-corrected reads. QuorUM error corrected reads result in a factor of 1.1 to 4 improvement in N50 contig size compared to using the original reads with SOAPdenovo for the data sets investigated. 
AVAILABILITY: QuorUM is distributed as an independent software package and as a module of the MaSuRCA assembly software. Both are available under the GPL open source license at http://www.genome.umd.edu. CONTACT: gmarcais@umd.edu.
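The remark about errors follows from simple arithmetic: at 100× coverage each base is covered by about 100 reads, so a 1% per-base error rate gives roughly 100 × 0.01 = 1 erroneous read per genome position. Each such error creates k-mers that occur only once or a few times, while true k-mers occur many times. The sketch below illustrates that observation with a plain k-mer count. It is not QuorUM's algorithm, and the function names, the toy reads, and the threshold `min_count` are assumptions made for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        for start in range(len(read) - k + 1):
            counts[read[start:start + k]] += 1
    return counts


def low_count_kmers(counts, min_count):
    """Return the k-mers seen fewer than min_count times (likely errors)."""
    return {kmer for kmer, count in counts.items() if count < min_count}


if __name__ == "__main__":
    # Toy data: five identical error-free reads and one read with a
    # single substitution (T -> A at the fifth base).
    reads = ["ACGTTGCA"] * 5 + ["ACGTAGCA"]
    counts = count_kmers(reads, k=4)
    print(sorted(low_count_kmers(counts, min_count=2)))
    # The four k-mers overlapping the substitution occur once each:
    # ['AGCA', 'CGTA', 'GTAG', 'TAGC']
```

A single substitution affects up to k distinct k-mers, which is why even a modest error rate produces a large number of low-count k-mers in high-coverage data.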
Subject(s)
Computational Biology/methods , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Animals , Genome , Humans
ABSTRACT
BACKGROUND: The rhesus macaque (Macaca mulatta) is a key species for advancing biomedical research. Like all draft mammalian genomes, the draft rhesus assembly (rheMac2) has gaps, sequencing errors and misassemblies that have prevented automated annotation pipelines from functioning correctly. Another rhesus macaque assembly, CR_1.0, is also available but is substantially more fragmented than rheMac2 with smaller contigs and scaffolds. Annotations for these two assemblies are limited in completeness and accuracy. High quality assembly and annotation files are required for a wide range of studies including expression, genetic and evolutionary analyses. RESULTS: We report a new de novo assembly of the rhesus macaque genome (MacaM) that incorporates both the original Sanger sequences used to assemble rheMac2 and new Illumina sequences from the same animal. MacaM has a weighted average (N50) contig size of 64 kilobases, more than twice the size of the rheMac2 assembly and almost five times the size of the CR_1.0 assembly. The MacaM chromosome assembly incorporates information from previously unutilized mapping data and preliminary annotation of scaffolds. Independent assessment of the assemblies using Ion Torrent read alignments indicates that MacaM is more complete and accurate than rheMac2 and CR_1.0. We assembled messenger RNA sequences from several rhesus tissues into transcripts which allowed us to identify a total of 11,712 complete proteins representing 9,524 distinct genes. Using a combination of our assembled rhesus macaque transcripts and human transcripts, we annotated 18,757 transcripts and 16,050 genes with complete coding sequences in the MacaM assembly. Further, we demonstrate that the new annotations provide greatly improved accuracy as compared to the current annotations of rheMac2. Finally, we show that the MacaM genome provides an accurate resource for alignment of reads produced by RNA sequence expression studies. 
CONCLUSIONS: The MacaM assembly and annotation files provide a substantially more complete and accurate representation of the rhesus macaque genome than rheMac2 or CR_1.0 and will serve as an important resource for investigators conducting next-generation sequencing studies with nonhuman primates. REVIEWERS: This article was reviewed by Dr. Lutz Walter, Dr. Soojin Yi and Dr. Kateryna Makova.
Subject(s)
Genome , Macaca mulatta/genetics , Amino Acid Sequence , Animals , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Molecular Sequence Data , RNA, Messenger/metabolism , Sequence Alignment
ABSTRACT
The character of the time-asymptotic evolution of physical systems can have complex, singular behavior with variation of a system parameter, particularly when chaos is involved. A perturbation of the parameter by a small amount ε can convert an attractor from chaotic to nonchaotic or vice versa. We call a parameter value where this can happen ε uncertain. The probability that a random choice of the parameter is ε uncertain commonly scales like a power law in ε. Surprisingly, two seemingly similar ways of defining this scaling, both of physical interest, yield different numerical values for the scaling exponent. We show why this happens and present a quantitative analysis of this phenomenon.
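The notion of an ε-uncertain parameter can be explored numerically with a minimal sketch. The abstract does not name a specific system or give either of its two definitions, so the choices below are assumptions: the logistic map x → r·x·(1 − x) as a stand-in system, a positive estimated Lyapunov exponent as the test for chaos, and a simplified criterion that calls r ε-uncertain when r and r + ε are classified differently. Finite-length Lyapunov estimates are noisy near zero, so the output is only indicative.

```python
import math
import random


def lyapunov(r, n_transient=500, n_iter=2000, x0=0.3):
    """Estimate the Lyapunov exponent of the logistic map at parameter r."""
    x = x0
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        x = r * x * (1.0 - x)
        # The floor avoids log(0) if the orbit lands exactly on x = 0.5.
        total += math.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
    return total / n_iter


def is_chaotic(r):
    """Classify r as chaotic when the estimated exponent is positive."""
    return lyapunov(r) > 0.0


def uncertain_fraction(eps, samples=200, r_min=3.6, r_max=4.0, seed=0):
    """Fraction of random r for which r and r + eps are classified differently."""
    rng = random.Random(seed)
    flips = 0
    for _ in range(samples):
        r = rng.uniform(r_min, r_max - eps)
        r_shifted = min(r + eps, r_max)  # keep the map inside [0, 1]
        if is_chaotic(r) != is_chaotic(r_shifted):
            flips += 1
    return flips / samples


if __name__ == "__main__":
    for eps in (1e-2, 1e-3):
        print(eps, uncertain_fraction(eps))
```

Repeating the estimate over a range of ε values and fitting the slope on log-log axes would give one numerical estimate of a scaling exponent, under the simplified criterion used here.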
ABSTRACT
Conifers are the predominant gymnosperms. The size and complexity of their genomes have presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer "super-reads," rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp.
Subject(s)
Genome, Plant , Ovule/genetics , Pinus taeda/genetics , Genomics , Haploidy , Sequence Analysis, DNA , Transcriptome
ABSTRACT
The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (~20-40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with 13 other plant species resulted in 20,646 gene families, of which 1554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In-depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%.
Subject(s)
Genome, Plant , Molecular Sequence Annotation/methods , Pinus taeda/genetics , DNA, Plant/analysis , Evolution, Molecular , Genes, Plant , Multigene Family , Phylogeny , Sequence Alignment
ABSTRACT
BACKGROUND: The size and complexity of conifer genomes have, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. RESULTS: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next-generation sequence data generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. CONCLUSIONS: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrate a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied.