ABSTRACT
Forest trees growing at high altitude offer a convenient model for studying adaptation processes. They are subject to a whole range of adverse factors that are likely to cause local adaptation and related genetic changes. Siberian larch (Larix sibirica Ledeb.), whose distribution covers different altitudes, makes it possible to directly compare lowland with highland populations. This paper presents, for the first time, a study of the genetic differentiation of Siberian larch populations that is presumably associated with adaptation to the altitudinal gradient of climatic conditions. The study is based on a joint analysis of altitude and six other bioclimatic variables, together with a large number of genetic markers, single nucleotide polymorphisms (SNPs), obtained from double digest restriction-site-associated DNA sequencing (ddRADseq). In total, 25,143 SNPs were genotyped in 231 trees. In addition, a dataset of 761 supposedly selectively neutral SNPs was assembled by selecting SNPs located outside coding regions in the Siberian larch genome and mapped to different contigs. The analysis using four different methods (PCAdapt, LFMM, BayeScEnv and RDA) revealed 550 outlier SNPs, including 207 SNPs whose variation was significantly correlated with the variation of some environmental factors and presumably associated with local adaptation, including 67 SNPs that correlated with altitude based on either LFMM or BayeScEnv and 23 SNPs based on both of them. Twenty SNPs were found in the coding regions of genes, and 16 of them represented non-synonymous nucleotide substitutions. They are located in genes involved in the processes of macromolecular cell metabolism and organic biosynthesis associated with reproduction and development, as well as organismal response to stress.
Among these 20 SNPs, nine were possibly associated with altitude, but only one of them was identified as associated with altitude by all four methods used in the study: a non-synonymous SNP at position 28092 of scaffold_31130, within a gene encoding a cell membrane protein of uncertain function. Among the studied populations, at least two main groups (clusters), the Altai populations and all others, were significantly genetically different according to the admixture analysis based on any of the three SNP datasets: 761 supposedly selectively neutral SNPs, all 25,143 SNPs and 550 adaptive SNPs. In general, according to the AMOVA results, genetic differentiation between transects or regions or between population samples was relatively low, although statistically significant, based on 761 neutral SNPs (FST = 0.036) and all 25,143 SNPs (FST = 0.017). Meanwhile, the differentiation based on 550 adaptive SNPs was much higher (FST = 0.218). The data showed a relatively weak but highly significant linear correlation between genetic and geographic distances (r = 0.206, p = 0.001).
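To make the FST values quoted above easier to read, the following minimal sketch computes the textbook heterozygosity-based fixation index, FST = (HT - HS) / HT, for a single biallelic SNP. It is an illustration only: the abstract reports AMOVA-based estimates, which use a different estimator, and the function name and the allele frequencies below are hypothetical.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic SNP.

    freqs: allele frequency of the same allele in each subpopulation
    (equal subpopulation weights are assumed for simplicity).
    """
    if len(freqs) < 2:
        raise ValueError("need at least two subpopulations")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity of the pooled population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # the SNP is monomorphic overall, so no differentiation
    # Mean expected heterozygosity within subpopulations.
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


# Hypothetical frequencies for a lowland and a highland population.
print(fst_biallelic([0.4, 0.6]))
```

Values near 0 indicate nearly identical allele frequencies across populations, and values near 1 indicate populations fixed for different alleles, which is the sense in which 0.218 for the adaptive SNPs is "much higher" than 0.036 for the neutral ones.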
Subject(s)
Larix, Larix/genetics, Altitude, Single Nucleotide Polymorphism, Gene Flow, Physiological Adaptation/genetics, Trees, Population Genetics
ABSTRACT
Here, we examine in silico the infection dynamics and interactions of two Zika virus (ZIKV) genomes: the full-length ZIKV genome (wild type [WT]) and one of the naturally occurring defective viral genomes (DVGs). This DVG can replicate in the presence of the WT genome, appears under high-MOI (multiplicity of infection) passaging conditions, and carries a deletion encompassing part of the structural and NS1 protein-coding region. Ordinary differential equations (ODEs) were used to simulate the infection of cells by virus particles and the intracellular replication of the WT and DVG genomes that produce these particles. For each virus passage in Vero and C6/36 cell cultures, the rates of the simulated processes were fitted to two types of observations: virus titer data and the assembled haplotypes of the replicate passage samples. We studied the consistency of the model with the experimental data across all passages of infection in each cell type separately, as well as the sensitivity of the model's parameters. We also determined which simulated processes of virus evolution are the most important for the adaptation of the WT and DVG interplay in these two disparate cell culture environments. Our results demonstrate that in the majority of passages, the rates of DVG production are higher in C6/36 cells than in Vero cells, which might result in tolerance and therefore drive the persistence of the mosquito vector in the context of ZIKV infection. Additionally, the model simulations showed a slower accumulation of infected cells under higher activation of the DVG-associated processes, which indicates a potential role of DVGs in virus attenuation. IMPORTANCE One of the ideas for lessening Zika pathogenicity is the addition of its natural or engineered defective virus genomes (DVGs), which have no pathogenicity, to the infection pool: a DVG redirects the wild-type (WT)-associated virus development resources toward its own maturation.
The mathematical model presented here, attuned to the data from interplays between WT Zika viruses and their natural DVGs in mammalian and mosquito cells, provides evidence that the loss of uninfected cells is attenuated by the DVG development processes. This model enabled us to estimate the rates of virus development processes in the WT/DVG interplay, determine the key processes, and show that the key processes are faster in mosquito cells than in mammalian ones. In general, the presented model and its detailed study suggest the important virus development processes in which a therapeutically efficient DVG might compete with the WT; this may help in assembling engineered DVGs for ZIKV and other flaviviruses.
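The qualitative effect described above (slower accumulation of infected cells when DVG-associated processes are more active) can be illustrated with a deliberately small toy system. This is not the authors' model: the four state variables, the equations, the resource-competition term `1 / (1 + k * d)`, and every parameter value below are assumptions chosen only for illustration, and the integration uses a plain forward Euler step.

```python
def simulate(q, t_end=3.0, h=0.01, beta=1.0, p=5.0, c=1.0, k=5.0):
    """Toy WT/DVG dynamics integrated with forward Euler (all values hypothetical).

    u: uninfected cell fraction, i: infected cell fraction,
    v: WT particles, d: DVG particles.
    q: DVG production rate per infected cell (q = 0 means no DVG).
    """
    u, i, v, d = 1.0, 0.0, 0.01, 0.0
    steps = int(round(t_end / h))
    for _ in range(steps):
        infection = beta * u * v            # WT particles infect uninfected cells
        du = -infection
        di = infection
        dv = p * i / (1.0 + k * d) - c * v  # DVG competes for replication resources
        dd = q * i - c * d                  # DVG is produced only in WT-infected cells
        u, i, v, d = u + h * du, i + h * di, v + h * dv, d + h * dd
    return u, i, v, d


# Compare the infected fraction at t_end without and with DVG production.
print(simulate(q=0.0)[1], simulate(q=5.0)[1])
```

In this sketch, DVG particles only reduce WT output, so the infected fraction at a given time with `q > 0` cannot exceed the `q = 0` baseline. The published model fits its rates to titer and haplotype data, which this toy does not attempt.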
Subject(s)
Defective Viruses, Host Microbial Interactions, Zika Virus Infection/virology, Zika Virus, Aedes, Animals, Chlorocebus aethiops, Defective Viruses/growth & development, Defective Viruses/pathogenicity, Vero Cells, Virus Replication, Zika Virus/growth & development, Zika Virus/pathogenicity
ABSTRACT
BACKGROUND: Massive forest decline has been observed almost everywhere as a result of negative anthropogenic and climatic effects, which can interact with pests, fungi and other phytopathogens and aggravate their effects. Climatic changes can weaken trees and make fungi such as Armillaria more destructive. Armillaria borealis (Marxm. & Korhonen) is a fungus from the Physalacriaceae family (Basidiomycota) widely distributed in Eurasia, including Siberia and the Far East. Species from this genus cause the white root rot disease that weakens and often kills woody plants. However, little is known about the ecological behavior and genetics of A. borealis. According to field research data, A. borealis is less pathogenic than A. ostoyae, and its aggressive behavior is quite rare. A. borealis mainly behaves as a secondary pathogen, killing trees already weakened by other factors. However, a changing environment might have unpredictable effects on the behavior of the fungus. RESULTS: The first de novo genome assembly and annotation for A. borealis are presented in this study. The A. borealis genome assembly contained ~68 Mbp and was comparable with ~60 and ~79.5 Mbp for the A. ostoyae and A. mellea genomes, respectively. The N50 for contigs equaled 50,544 bp. Functional annotation analysis revealed 21,969 protein-coding genes and provided data for further comparative analysis. Repetitive sequences were also identified. The main focus for further study and comparative analysis will be on the enzymes and regulatory factors associated with pathogenicity. CONCLUSIONS: Pathogenic fungi such as Armillaria are currently one of the main problems in forest conservation. A comprehensive study of these species and their pathogenicity is of great importance and needs good genomic resources. The assembled genome of A. borealis presented in this study is of sufficiently good quality for further detailed comparative study of the composition of enzymes in other Armillaria species. There is also a fundamental problem with the identification and classification of species of the Armillaria genus, where the study of repetitive sequences in the genomes of basidiomycetes and their comparative analysis will help us resolve the taxonomy of these species more accurately and reveal their evolutionary relationships.
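For readers unfamiliar with the contig N50 quoted above, the following short sketch shows how the statistic is conventionally computed from a list of contig lengths. The function name and the toy lengths are illustrative and are not taken from the A. borealis assembly.

```python
def n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L of the contig at which, going from the longest
    contig downward, the running total first reaches at least half of
    the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy assembly: total length 300, half is 150; 100 + 80 = 180 >= 150.
print(n50([100, 80, 60, 40, 20]))  # prints 80
```

A larger N50 means that half of the assembled sequence sits in longer contigs, which is why it is commonly reported as a contiguity measure alongside total assembly size.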
Subject(s)
Armillaria, Basidiomycota, Armillaria/genetics, Plants, Siberia
ABSTRACT
BACKGROUND: Plant mitochondrial genomes (mitogenomes) can be structurally complex, and their size can vary from ~222 Kbp in Brassica napus to 11.3 Mbp in Silene conica. To date, in comparison with the number of plant species, only a few plant mitogenomes have been sequenced and released, particularly for conifers (the Pinaceae family). Conifers are an ancient group of land plants that includes about 600 species of great ecological and economic value. Among them, Siberian larch (Larix sibirica Ledeb.) represents one of the keystone species in Siberian boreal forests. Yet, despite its importance for evolutionary and population studies, the mitogenome of Siberian larch has not yet been assembled and studied. RESULTS: Two sources of DNA sequences were used to search for mitochondrial DNA (mtDNA) sequences: mtDNA-enriched samples and nucleotide reads generated in the de novo whole genome sequencing project. The assembly of the Siberian larch mitogenome contained nine contigs, with the shortest and the largest contigs being 24,767 bp and 4,008,762 bp, respectively. The total size of the genome was estimated at 11.7 Mbp. In total, 40 protein-coding, 34 tRNA, and 3 rRNA genes and numerous repetitive elements (REs) were annotated in this mitogenome. In total, 864 C-to-U RNA editing sites were found for 38 out of 40 protein-coding genes. The immense size of this genome, currently the largest reported, can be partly explained by variable numbers of mobile genetic elements and introns, but is unlikely to be explained by plasmid-related sequences. We found few plasmid-like insertions, representing only 0.11% of the entire Siberian larch mitogenome. CONCLUSIONS: Our study showed that the Siberian larch mitogenome is much larger than those of other gymnosperms studied so far, and in the same range as that of the annual flowering plant Silene conica (11.3 Mbp).
Similar to other species, the Siberian larch mitogenome contains relatively few genes, and despite its huge size, the repeated and low-complexity regions cover only 14.46% of the mitogenome sequence.
Subject(s)
Genome Size, Mitochondrial Genome, Larix/genetics, Contig Mapping, Molecular Sequence Annotation, Plant Proteins/genetics, Ribosomal RNA/genetics, Transfer RNA/genetics, Nucleic Acid Repetitive Sequences
ABSTRACT
BACKGROUND: De novo assembly of large genomes, such as those of conifers (~12-30 Gbp), which also consist of ~80% repetitive DNA, is a very complex and computationally intense endeavor. One of the main problems in assembling such genomes lies in the computing limitations of nucleotide sequence assembly programs (DNA assemblers). As a rule, modern assemblers are designed to assemble genomes no longer than the human genome (3.24 Gbp). Most assemblers cannot handle the amount of input sequence data required to provide the coverage needed for a high-quality assembly. RESULTS: This paper presents an original stepwise method of de novo assembly by parts (sets), which bypasses the limitations of modern assemblers associated with the huge amount of data being processed. The results of assembly experiments conducted using the model plant Arabidopsis thaliana, Prunus persica (peach) and the four most popular assemblers, ABySS, SOAPdenovo, SPAdes, and CLC Assembly Cell, showed the validity and effectiveness of the proposed stepwise assembly method. CONCLUSION: Using the new stepwise de novo assembly method presented in the paper, the genome of Siberian larch, Larix sibirica Ledeb. (12.34 Gbp), was completely assembled de novo by the CLC Assembly Cell assembler. It is the first genome assembly for a larch species, in addition to only five other conifer genomes sequenced and assembled for Picea abies, Picea glauca, Pinus taeda, Pinus lambertiana, and Pseudotsuga menziesii var. menziesii.
Subject(s)
Plant Genome, Larix/genetics, DNA Sequence Analysis/methods, Arabidopsis/genetics, Prunus/genetics, Time Factors
ABSTRACT
BACKGROUND: Species in the genus Armillaria (Fungi, Basidiomycota) are well known as saprophytes and pathogens of plants. Many of them cause white-rot root disease in diverse woody plants worldwide. Mitochondrial genomes (mitogenomes) are widely used in evolutionary and population studies, but despite the importance and wide distribution of Armillaria, complete mitogenomes have not previously been reported for this genus. Meanwhile, the well-supported phylogeny of Armillaria species provides an excellent framework in which to study variation in mitogenomes and how they have evolved over time. RESULTS: Here we completely sequenced, assembled, and annotated the circular mitogenomes of four species: A. borealis, A. gallica, A. sinapina, and A. solidipes (116,443, 98,896, 103,563, and 122,167 bp, respectively). The variation in mitogenome size can be explained by variable numbers of mobile genetic elements, introns, and plasmid-related sequences. Most Armillaria introns contained open reading frames (ORFs) that are related to homing endonucleases of the LAGLIDADG and GIY-YIG families. Insertions of mobile elements were also evident as fragments of plasmid-related sequences in Armillaria mitogenomes. We also found several truncated gene duplications in all four mitogenomes. CONCLUSIONS: Our study showed that fungal mitogenomes have a high degree of variation in size, gene content, and genomic organization even among closely related species of Armillaria. We suggest that mobile genetic elements invading introns and intergenic sequences in the Armillaria mitogenomes have played a significant role in shaping their genome structure. The mitogenome changes we describe here are consistent with widely accepted phylogenetic relationships among the four species.
Subject(s)
Armillaria/classification, Armillaria/genetics, Mitochondrial DNA/genetics, Mitochondrial Genome, Interspersed Repetitive Sequences, Mitochondrial Proteins/genetics, High-Throughput Nucleotide Sequencing, Phylogeny
ABSTRACT
BACKGROUND: The gray whale, Eschrichtius robustus (E. robustus), is the sole member of the family Eschrichtiidae, which is considered to be the most primitive in the class Cetacea. The gray whale is often described as a "living fossil". It is adapted to extreme marine conditions and has a high life expectancy (77 years). The assembly of a gray whale genome and transcriptome will enable further studies of whale evolution, longevity, and resistance to extreme environments. RESULTS: In this work, we report the first de novo assembly and primary analysis of the E. robustus genome and transcriptome based on kidney and liver samples. The presented draft genome assembly is 55% complete in terms of total genome length, but only 24% complete in terms of the BUSCO complete gene groups, although 10,895 genes were identified. Transcriptome annotation and comparison with other whale species revealed robust expression of DNA repair and hypoxia-response genes, which is expected for whales. CONCLUSIONS: This preliminary study of the gray whale genome and transcriptome provides new data to better understand whale evolution and the mechanisms of whale adaptation to hypoxic conditions.
Subject(s)
Genome, Transcriptome/genetics, Whales/genetics, Animals, Gene Expression Regulation, Gene Library, Molecular Sequence Annotation, Phylogeny
ABSTRACT
The recent release of the nuclear, chloroplast and mitochondrial genome assemblies of Siberian larch (Larix sibirica Ledeb.) greatly contributed to the development of genomic resources for the larch genus. Siberian larch is one of the most cold-resistant tree species in the only deciduous genus of Pinaceae, with seasonal senescence and a valuable rot-resistant timber widely used in construction. Here, we present an extensive repeatome analysis and the first annotation of the draft nuclear Siberian larch genome assembly. About 66% of the larch genome consists of highly repetitive elements (REs), with a likely wave of retrotransposon insertions into the larch genome estimated to have occurred 4-5 MYA. In total, 39,370 gene models were predicted, with 87% of them having homology to the Arabidopsis-annotated proteins and 78% having at least one GO term assignment. The current state of the genome annotations allows for the exploration of the gymnosperm and angiosperm species for relative gene abundance in different functional categories. Comparative analysis of functional gene categories across different angiosperm and gymnosperm species finds that the Siberian larch genome has an overabundance of genes associated with programmed cell death (PCD), autophagy, stress hormone biosynthesis and regulatory pathways; these genes may play important roles in seasonal senescence and stress response to extreme cold in larch. Despite being incomplete, the draft assemblies and annotations of the conifer genomes are at a point of development where they now represent a valuable source for further genomic, genetic and population studies.
ABSTRACT
Repetitive elements (REs) and transposons (TEs) can comprise up to 80% of some plant genomes and may be essential for regulating their evolution and adaptation. The "repeatome" information is often unavailable in assembled genomes because genomic areas of repeats are challenging to assemble and are often missing from the final assembly. However, raw genomic sequencing data contain rich information about REs and TEs. Here, raw genomic NGS reads of 10 gymnosperm species were studied for the content and abundance patterns of their "repeatome". We combined alignment to databases of repetitive elements with de novo assembly of highly repetitive sequences from genomic sequencing reads to characterize and calculate the abundance of known and putative repetitive elements in the genomes of 10 gymnosperms: Pinus taeda, Pinus sylvestris, Pinus sibirica, Picea glauca, Picea abies, Abies sibirica, Larix sibirica, Juniperus communis, Taxus baccata, and Gnetum gnemon. We found that genome abundances of known and newly discovered putative repeats are specific to phylogenetically close groups of species and match biological taxa. The grouping of species based on abundances of known repeats closely matches the grouping based on abundances of newly discovered putative repeats (kChains) and matches the known taxonomic relations.
ABSTRACT
Arthropod-borne viruses pose a major threat to global public health. Thus, innovative strategies for their control and prevention are urgently needed. Here, we exploit the natural capacity of viruses to generate defective viral genomes (DVGs) to their detriment. While DVGs have been described for most viruses, identifying which, if any, can be used as therapeutic agents remains a challenge. We present a combined experimental evolution and computational approach to triage DVG sequence space and pinpoint the fittest deletions, using Zika virus as an arbovirus model. This approach identifies fit DVGs that optimally interfere with wild-type virus infection. We show that the most fit DVGs conserve the open reading frame to maintain the translation of the remaining non-structural proteins, a characteristic that is fundamental across the flavivirus genus. Finally, we demonstrate that the high fitness DVG is antiviral in vivo both in the mammalian host and the mosquito vector, reducing transmission in the latter by up to 90%. Our approach establishes the method to interrogate the DVG fitness landscape, and enables the systematic identification of DVGs that show promise as human therapeutics and vector control strategies to mitigate arbovirus transmission and disease.
Subject(s)
Antiviral Agents/administration & dosage, Defective Viruses/genetics, Mosquito Vectors/drug effects, Zika Virus Infection/drug therapy, Zika Virus/genetics, Aedes/drug effects, Aedes/virology, Animals, Chlorocebus aethiops, Computational Biology, Directed Molecular Evolution, Animal Disease Models, Female, Genetic Fitness, Viral Genome/genetics, HEK293 Cells, Humans, Mice, Mosquito Control/methods, Mosquito Vectors/virology, Open Reading Frames/genetics, Viral RNA/genetics, Vero Cells, Zika Virus Infection/transmission, Zika Virus Infection/virology
ABSTRACT
Coast redwood is a very important endemic conifer timber species in Southern Oregon and Northern California in the USA. Because of its good wood properties and fast growth rate, it can also be considered a prospective timber species in other countries, such as Germany, whose climate is similar or is becoming similar as a result of global warming. In general, it is frost sensitive and suffers from freezing temperatures. To study genetic mechanisms of frost resistance in this species and to select the most frost-tolerant trees, we tested 17 clones in climate-controlled chamber experiments and generated two de novo assemblies of the coast redwood transcriptome from a pooled RNA sample using Trinity and CLC Genomics Workbench software, respectively. The hexaploid nature of the coast redwood genome makes it very challenging to successfully assemble and annotate the coast redwood transcriptome. The de novo transcriptome assemblies generated by Trinity and CLC, considering only reads with a minimum length of 180 bp and contigs no less than 200 bp long, resulted in 634,772 and 788,464 unigenes (unique contigs), respectively.