ABSTRACT
BACKGROUND: Rapid adaptation to new environments can facilitate species invasions and range expansions. Understanding the mechanisms of adaptation used by invasive disease vectors in new regions has key implications for mitigating the prevalence and spread of vector-borne disease, although these mechanisms remain relatively unexplored. RESULTS: Here, we integrate whole-genome sequencing data from 96 Aedes aegypti mosquitoes collected from various sites in southern and central California with 25 annual topo-climate variables to investigate genome-wide signals of local adaptation among populations. Patterns of population structure, as inferred using principal components and admixture analysis, were consistent with three genetic clusters. Using various landscape genomics approaches, which all remove the confounding effects of shared ancestry on correlations between genetic and environmental variation, we identified 112 genes showing strong signals of local environmental adaptation associated with one or more topo-climate factors. Some of these genes, such as those encoding heat-shock proteins, have known effects in climate adaptation, and these genomic regions show signatures of selective sweeps and recent positive selection. CONCLUSIONS: Our results provide a genome-wide perspective on the distribution of adaptive loci and lay the foundation for future work to understand how environmental adaptation in Ae. aegypti impacts the arboviral disease landscape and how such adaptation could help or hinder efforts at population control.
Subject(s)
Aedes , Animals , Aedes/genetics , Mosquito Vectors/genetics , Genomics , Adaptation, Physiological/genetics , California
ABSTRACT
The genomic architecture and molecular mechanisms controlling variation in quantitative disease resistance loci are not well understood in plant species and have been barely studied in long-generation trees. Quantitative trait loci mapping and genome-wide association studies were combined to test a large single nucleotide polymorphism (SNP) set for association with quantitative and qualitative white pine blister rust resistance in sugar pine. In the absence of a chromosome-scale reference genome, a high-density consensus linkage map was generated to obtain locations for associated SNPs. Newly discovered associations for white pine blister rust quantitative disease resistance included 453 SNPs involved in a wide range of biological functions, including genes associated with disease resistance and others involved in morphological and developmental processes. In addition, NBS-LRR pathogen recognition genes were found to be involved in quantitative disease resistance, suggesting that these newly reported genes are qualitative genes with partial resistance, are the result of defeated qualitative resistance due to avirulent races, or have epistatic effects on qualitative disease resistance genes. This study is a step forward in our understanding of the complex genomic architecture of quantitative disease resistance in long-generation trees, and constitutes the first step towards marker-assisted disease resistance breeding in white pine species.
Subject(s)
Basidiomycota/physiology , Disease Resistance/genetics , Pinus/genetics , Pinus/microbiology , Chromosome Mapping , Genes, Plant , Genetics, Population , Genome, Plant , Genome-Wide Association Study , Phenotype , Plant Diseases/microbiology , Polymorphism, Single Nucleotide , Quantitative Trait Loci
ABSTRACT
BACKGROUND: Both a source of diversity and the development of genomic tools, such as reference genomes and molecular markers, are equally important to enable faster progress in plant breeding. Pear (Pyrus spp.) lags far behind other fruit and nut crops in terms of employment of available genetic resources for new cultivar development. To address this gap, we designed a high-density, high-efficiency and robust single nucleotide polymorphism (SNP) array for pear, with the main objectives of conducting genetic diversity and genome-wide association studies. RESULTS: By applying a two-step design process, which consisted of the construction of a first 'draft' array for the screening of a small subset of samples, we were able to identify the most robust and informative SNPs to include in the Applied Biosystems™ Axiom™ Pear 70 K Genotyping Array, currently the densest SNP array for pear. Preliminary evaluation of this 70 K array in 1416 diverse pear accessions from the USDA National Clonal Germplasm Repository (NCGR) in Corvallis, OR identified 66,616 SNPs (93% of all the tiled SNPs) as high quality and polymorphic (PolyHighResolution). We further used the Axiom Pear 70 K Genotyping Array to construct high-density linkage maps in a bi-parental population, and to make a direct comparison with available genotyping-by-sequencing (GBS) data, which suggested that the SNP array is a more robust method of screening for SNPs than restriction enzyme reduced representation sequence-based genotyping. CONCLUSIONS: The Axiom Pear 70 K Genotyping Array, with its high efficiency in a widely diverse panel of Pyrus species and cultivars, represents a valuable resource for a multitude of molecular studies in pear. The characterization of the USDA-NCGR collection with this array will provide important information for pear geneticists and breeders, as well as for the optimization of conservation strategies for Pyrus.
Subject(s)
Chromosome Mapping/methods , Genetic Linkage , Genetic Markers , Genome, Plant , Polymorphism, Single Nucleotide , Pyrus/genetics , Seeds/genetics , Chromosomes, Plant , Genome-Wide Association Study , Genotyping Techniques
ABSTRACT
Over the last 20 years, global production of Persian walnut (Juglans regia L.) has grown enormously, likely reflecting increased consumption due to its numerous benefits to human health. However, advances in genome-wide association (GWA) studies and genomic selection (GS) for agronomically important traits in walnut remain limited due to the lack of powerful genomic tools. Here, we present the development and validation of a high-density 700K single nucleotide polymorphism (SNP) array in Persian walnut. Over 609K high-quality SNPs were carefully selected from a set of 9.6 million genome-wide variants, previously identified from the high-depth re-sequencing of 27 founders of the Walnut Improvement Program (WIP) of University of California, Davis. To validate the effectiveness of the array, we genotyped a collection of 1284 walnut trees, including 1167 progeny of 48 WIP families and 26 walnut cultivars. More than half of the SNPs (55.7%) fell in the highest quality class of 'Poly High Resolution' (PHR) polymorphisms, which were used to assess the WIP pedigree integrity. We identified 151 new parent-offspring relationships, all confirmed with the Mendelian inheritance test. In addition, we explored the genetic variability among cultivars of different origin, revealing how the varieties from Europe and California were differentiated from Asian accessions. Both the reconstruction of the WIP pedigree and population structure analysis confirmed the effectiveness of the Applied Biosystems™ Axiom™ J. regia 700K SNP array, which initiates a novel genomic and advanced phase in walnut genetics and breeding.
Subject(s)
Genomics , Genotyping Techniques , Juglans , Genome-Wide Association Study , Genomics/methods , Genotype , Genotyping Techniques/instrumentation , Humans , Juglans/genetics , Polymorphism, Single Nucleotide/genetics
ABSTRACT
Dissecting the genetic and genomic architecture of complex traits is essential to understand the forces maintaining the variation in phenotypic traits of ecological and economic importance. Whole-genome resequencing data were used to generate high-resolution polymorphic single nucleotide polymorphism (SNP) markers and genotype individuals from common gardens across the loblolly pine (Pinus taeda) natural range. Genome-wide associations were tested with a large phenotypic dataset comprising 409 variables including morphological traits (height, diameter, carbon isotope discrimination, pitch canker resistance), and molecular traits such as metabolites and expression of xylem development genes. Our study identified 2335 new SNP × trait associations for the species, with many SNPs located in physical clusters in the genome of the species, as well as the genomic location of hotspots for metabolic × genotype associations. We found a highly polygenic basis of quantitative inheritance, with significant differences in number, effect size, genomic location and frequency of alleles contributing to variation in phenotypes in the different traits. While mutation-selection balance might be shaping the genetic variation in metabolic traits, balancing selection is more likely to shape the variation in expression of xylem development genes. Our work contributes to the study of complex traits in nonmodel plant species by identifying associations at a whole-genome level.
Subject(s)
Multifactorial Inheritance , Pinus taeda/genetics , Polymorphism, Single Nucleotide , Gene Frequency , Genetics, Population , Genome-Wide Association Study , Genotype , Phenotype , Pinus taeda/physiology , United States , Whole Genome Sequencing , Xylem/genetics , Xylem/growth & development
ABSTRACT
The Persian walnut (Juglans regia L.), a diploid species native to the mountainous regions of Central Asia, is the major walnut species cultivated for nut production and is one of the most widespread tree nut species in the world. The high nutritional value of J. regia nuts is associated with a rich array of polyphenolic compounds, whose complete biosynthetic pathways are still unknown. A J. regia genome sequence was obtained from the cultivar 'Chandler' to discover target genes and additional unknown genes. The 667-Mbp genome was assembled using two different methods (SOAPdenovo2 and MaSuRCA), with an N50 scaffold size of 464 955 bp (based on a genome size of 606 Mbp), 221 640 contigs and a GC content of 37%. Annotation with MAKER-P and other genomic resources yielded 32 498 gene models. Previous studies in walnut relying on tissue-specific methods have only identified a single polyphenol oxidase (PPO) gene (JrPPO1). Enabled by the J. regia genome sequence, a second homolog of PPO (JrPPO2) was discovered. In addition, about 130 genes in the large gallate 1-β-glucosyltransferase (GGT) superfamily were detected. Specifically, two genes, JrGGT1 and JrGGT2, were significantly homologous to the GGT from Quercus robur (QrGGT), which is involved in the synthesis of 1-O-galloyl-β-D-glucose, a precursor for the synthesis of hydrolysable tannins. The reference genome for J. regia provides meaningful insight into the complex pathways required for the synthesis of polyphenols. The walnut genome sequence provides important tools and methods to accelerate breeding and to facilitate the genetic dissection of complex traits.
Subject(s)
Genome, Plant/genetics , Juglans/genetics , Plant Proteins/genetics , Polyphenols/metabolism , Catechol Oxidase/metabolism
ABSTRACT
Drosophila melanogaster has played a pivotal role in the development of modern population genetics. However, many basic questions regarding the demographic and adaptive history of this species remain unresolved. We report the genome sequencing of 139 wild-derived strains of D. melanogaster, representing 22 population samples from the sub-Saharan ancestral range of this species, along with one European population. Most genomes were sequenced above 25X depth from haploid embryos. Results indicated a pervasive influence of non-African admixture in many African populations, motivating the development and application of a novel admixture detection method. Admixture proportions varied among populations, with greater admixture in urban locations. Admixture levels also varied across the genome, with localized peaks and valleys suggestive of a non-neutral introgression process. Genomes from the same location differed starkly in ancestry, suggesting that isolation mechanisms may exist within African populations. After removing putatively admixed genomic segments, the greatest genetic diversity was observed in southern Africa (e.g. Zambia), while diversity in other populations was largely consistent with a geographic expansion from this potentially ancestral region. The European population showed different levels of diversity reduction on each chromosome arm, and some African populations displayed chromosome arm-specific diversity reductions. Inversions in the European sample were associated with strong elevations in diversity across chromosome arms. Genomic scans were conducted to identify loci that may represent targets of positive selection within an African population, between African populations, and between European and African populations. A disproportionate number of candidate selective sweep regions were located near genes with varied roles in gene regulation. 
Outliers for Europe-Africa F(ST) were found to be enriched in genomic regions of locally elevated cosmopolitan admixture, possibly reflecting a role for some of these loci in driving the introgression of non-African alleles into African populations.
Subject(s)
Drosophila melanogaster/genetics , Genetic Variation , Genome, Insect , Metagenomics , Adaptation, Physiological/genetics , Africa South of the Sahara , Alleles , Animals , Base Sequence , Europe , Evolution, Molecular , High-Throughput Nucleotide Sequencing , Selection, Genetic
ABSTRACT
The mosquito Anopheles gambiae s.s. is a primary malaria vector throughout sub-Saharan Africa including the islands of the Comoros archipelago (Anjouan, Grande Comore, Mayotte and Mohéli). These islands are located at the northern end of the Mozambique Channel in eastern Africa. Previous studies have shown a relatively high degree of genetic isolation between the Comoros islands and mainland populations of A. gambiae, but the origin of the island populations remains unclear. Here, we analyzed phylogenetic relationships among island and mainland populations using complete mitochondrial genome sequences of individual A. gambiae specimens. This work augments earlier studies based on analysis of the nuclear genome. We investigated the source population of A. gambiae for each island, estimated the number of introductions, when they occurred and explored evidence for contemporary gene flow between island and mainland populations. These studies are relevant to understanding historical patterns in the dispersal of this important malaria vector and provide information critical to assessing their potential for the exploration of genetic-based vector control methods to eliminate this disease. Phylogenetic analysis and haplotype networks were constructed from mitogenome sequences of 258 A. gambiae from the four islands. In addition, 112 individuals from seven countries across sub-Saharan Africa and Madagascar were included to identify potential source populations. Our results suggest that introduction events of A. gambiae into the Comoros archipelago were rare and recent events and support earlier claims that gene flow between the mainland and these islands is limited. This study is concordant with earlier work suggesting the suitability of these oceanic islands as appropriate sites for conducting field trial releases of genetically engineered mosquitoes (GEMs).
Subject(s)
Anopheles , Malaria , Humans , Animals , Anopheles/genetics , Phylogeny , Indian Ocean , Mosquito Vectors/genetics , Malaria/genetics , Malaria/prevention & control
ABSTRACT
Using high-depth whole genome sequencing of F0 mating pairs and multiple individual F1 offspring, we estimated the nuclear mutation rate per generation in the malaria vectors Anopheles coluzzii and Anopheles stephensi by detecting de novo genetic mutations. A purpose-built computer program was employed to filter actual mutations from a deep background of superficially similar artifacts resulting from read misalignment. Performance of filtering parameters was determined using software-simulated mutations, and the resulting estimate of false negative rate was used to correct final mutation rate estimates. Spontaneous mutation rates by base substitution were estimated at 1.00 × 10^-9 (95% confidence interval, 2.06 × 10^-10 to 2.91 × 10^-9) and 1.36 × 10^-9 (95% confidence interval, 4.42 × 10^-10 to 3.18 × 10^-9) per site per generation in A. coluzzii and A. stephensi respectively. Although similar studies have been performed on other insect species including dipterans, this is the first study to empirically measure mutation rates in the important genus Anopheles, and thus provides an estimate of µ that will be of utility for comparative evolutionary genomics, as well as for population genetic analysis of malaria vector mosquito species.
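The abstract above mentions correcting the mutation rate estimate for the false negative rate but does not give the estimator. A minimal sketch of one common parent-offspring estimator follows. It assumes a diploid genome, counts each offspring as two transmitted haploid genomes, and divides by the fraction of mutations that survive filtering. The function name, its parameters and the input numbers are hypothetical illustrations, not values or code from the study.

```python
def mutation_rate(n_mutations, callable_sites, n_offspring, false_negative_rate):
    """Per-site, per-generation mutation rate from parent-offspring sequencing.

    Assumed estimator (not taken from the abstract):
        mu = m / (2 * N * L * (1 - FNR))
    where m is the number of de novo mutations detected, N the number of
    offspring, L the number of callable sites per haploid genome, and FNR the
    fraction of true mutations lost during filtering.
    """
    if not 0 <= false_negative_rate < 1:
        raise ValueError("false_negative_rate must be in [0, 1)")
    if callable_sites <= 0 or n_offspring <= 0:
        raise ValueError("callable_sites and n_offspring must be positive")

    # Each diploid offspring carries two haploid genomes, one per parent.
    haploid_genomes = 2 * n_offspring
    detected_rate = n_mutations / (haploid_genomes * callable_sites)

    # Scale up to account for true mutations removed by the filters.
    return detected_rate / (1 - false_negative_rate)


# Made-up inputs, chosen only to show the arithmetic.
mu = mutation_rate(n_mutations=4, callable_sites=1e8,
                   n_offspring=10, false_negative_rate=0.2)
print(f"estimated mu = {mu:.2e} per site per generation")
```

The false negative rate enters only as a scaling factor, so an error in it propagates directly into the final estimate.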
Subject(s)
Anopheles/genetics , Mosquito Vectors/genetics , Animals , Female , Humans , Insect Proteins/genetics , Malaria/transmission , Male , Mutation Rate , Whole Genome Sequencing
ABSTRACT
Anopheles pretoriensis is widely distributed across Africa, including on oceanic islands such as Grande Comore in the Comoros. This species is known to be mostly zoophilic and is therefore considered to have low impact on the transmission of human malaria. However, A. pretoriensis has been found infected with Plasmodium, suggesting that it may be epidemiologically important. In the present study, we sequenced and assembled the complete mitogenome of A. pretoriensis and inferred its phylogenetic relationship among other species in the subgenus Cellia. We also investigated the genetic structure of A. pretoriensis populations on Grande Comore Island, and between this island population and sites in continental Africa, using partial sequence of the mitochondrial cytochrome c oxidase subunit I (COI) gene. Seven haplotypes were found on the island, one of which was ubiquitous. There was no clear divergence between island haplotypes and those found on the continent. The present work contributes knowledge on this understudied, yet abundant, Anopheles species.
ABSTRACT
Novel malaria control strategies using genetically engineered mosquitoes (GEMs) are on the horizon. Population modification is one approach wherein mosquitoes are engineered with genes rendering them refractory to the malaria parasite, Plasmodium falciparum, coupled with a low-threshold, Cas9-based gene drive. When released into a wild vector population, GEMs preferentially transmit these parasite-blocking genes to their offspring, ultimately modifying a vector population into a nonvector one. Deploying this technology awaits ecologically contained field trial evaluations. Here, we consider a process for site selection, the first critical step in designing a trial. Our goal is to identify a site that maximizes prospects for success, minimizes risk, and serves as a fair, valid, and convincing test of efficacy and impacts of a GEM product intended for large-scale deployment in Africa. We base site selection on geographic, geological, and biological, rather than social or legal, criteria. We recognize the latter as critically important but not as a first step in selecting a site. We propose physical islands as being the best candidates for a GEM field trial and present an evaluation of 22 African islands. We consider geographic and genetic isolation, biological complexity, island size, and topography and identify two island groups that satisfy key criteria for ideal GEM field trial sites.
ABSTRACT
Understanding the genomic and environmental basis of cold adaptation is key to understanding how plants survive and adapt to different environmental conditions across their natural range. Univariate and multivariate genome-wide association (GWAS) and genotype-environment association (GEA) analyses were used to test associations among genome-wide SNPs obtained from whole-genome resequencing, measures of growth, phenology, emergence, cold hardiness, and range-wide environmental variation in coastal Douglas-fir (Pseudotsuga menziesii). Results suggest a complex genomic architecture of cold adaptation, in which traits are either highly polygenic or controlled by both large and small effect genes. Newly discovered associations for cold adaptation in Douglas-fir included 130 genes involved in many important biological functions such as primary and secondary metabolism, growth and reproductive development, transcription regulation, stress and signaling, and DNA processes. These genes were related to growth, phenology and cold hardiness and strongly depend on variation in environmental variables such as degree days below 0 °C, precipitation, elevation and distance from the coast. This study is a step forward in our understanding of the complex interconnection between environment and genomics and their role in cold-associated trait variation in boreal tree species, providing a baseline for the species' predictions under climate change.
Subject(s)
Acclimatization/genetics , Genes, Plant , Polymorphism, Single Nucleotide , Pseudotsuga/genetics , Genome-Wide Association Study
ABSTRACT
We report the first complete mitogenome (Mt) sequence of Anopheles coustani, an understudied malaria vector in Africa. The sequence was extracted from one individual mosquito from São Tomé island. The length of the A. coustani Mt genome was 15,408 bp with 79.3% AT content. Phylogenetic analysis revealed that A. coustani is most closely related to A. sinensis (93.5% identity) and is 90.1% identical to A. gambiae complex members.
ABSTRACT
The majority of mammalian species are uniparental, with the mother solely providing care for young conspecifics, although fathering behaviours can emerge under certain circumstances. For example, a great deal of individual variation in response to young pups has been reported in multiple inbred strains of laboratory male mice. Furthermore, sexual experience and subsequent cohabitation with a female conspecific can induce caregiving responses in otherwise indifferent, fearful or aggressive males. Thus, a highly conserved parental neural circuit is likely present in both sexes; however, the extent to which infants are capable of activating this circuit may vary. In support of this idea, fearful or indifferent responses toward pups in female mice are linked to greater immediate early gene (IEG) expression in a fear/defensive circuit involving the anterior hypothalamus compared to that in an approach/attraction circuit involving the ventral tegmental area. However, experience with infants, particularly in combination with histone deacetylase inhibitor (HDACi) treatment, can reverse this pattern of pup-induced activation of fear/defence circuitry and promote approach behaviour. Thus, HDACi treatment may increase the transcription of primed/poised genes that play a role in the activation and selection of a maternal approach circuit in response to pup stimuli. In the present study, we investigated whether HDACi treatment would impact behavioural response selection and associated IEG expression changes in virgin male mice that are capable of ignoring, attacking or caring for pups. The results obtained indicate that systemic HDACi treatment induces spontaneous caregiving behaviour in non-aggressive male mice and alters the pattern of pup-induced IEG expression across a fear/defensive neural circuit.
Subject(s)
Aggression/physiology , Brain/metabolism , Histone Deacetylase Inhibitors/administration & dosage , Histone Deacetylases/physiology , Paternal Behavior/physiology , Animals , Basic Helix-Loop-Helix Transcription Factors/metabolism , Behavior, Animal/drug effects , Behavior, Animal/physiology , Brain/drug effects , Interpersonal Relations , Male , Mice, Inbred C57BL , Paternal Behavior/drug effects , Proto-Oncogene Proteins c-fos/metabolism , RNA, Messenger/metabolism
ABSTRACT
Genomic analysis in Juglans (walnuts) is expected to transform the breeding and agricultural production of both nuts and lumber. To that end, we report here the determination of reference sequences for six additional relatives of Juglans regia: Juglans sigillata (also from section Dioscaryon), Juglans nigra, Juglans microcarpa, Juglans hindsii (from section Rhysocaryon), Juglans cathayensis (from section Cardiocaryon), and the closely related Pterocarya stenoptera. While these are 'draft' genomes, ranging in size between 640 Mbp and 990 Mbp, their contiguities and accuracies can support powerful annotations of genomic variation that are often the foundation of new avenues of research and breeding. We annotated nucleotide divergence and synteny by creating complete pairwise alignments of each reference genome to the remaining six. In addition, we have re-sequenced a sample of accessions from four Juglans species (including regia). The variation discovered in these surveys comprises a critical resource for experimentation and breeding, as well as a solid complementary annotation. To demonstrate the potential of these resources, the structural and sequence variation in and around the polyphenol oxidase loci, PPO1 and PPO2, were investigated. As reported for other seed crops, variation in this gene is implicated in the domestication of walnuts. The apparently Juglandaceae-specific PPO1 duplicate shows accelerated divergence and an excess of amino acid replacement on the lineage leading to accessions of the domesticated nut crop species, Juglans regia and sigillata.
Subject(s)
Genetic Variation , Genome, Plant , Genomics , Juglans/classification , Juglans/genetics , Computational Biology/methods , Evolution, Molecular , Genome Size , Genomics/methods , Genotype , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Phylogeny , Polymorphism, Single Nucleotide
ABSTRACT
We investigate the utility and scalability of new read cloud technologies to improve the draft genome assemblies of the colossal, and largely repetitive, genomes of conifers. Synthetic long read technologies have existed in various forms as a means of reducing complexity and resolving repeats since the outset of genome assembly. Recently, technologies that combine subhaploid pools of high molecular weight DNA with barcoding on a massive scale have brought new efficiencies to sample preparation and data generation. When combined with inexpensive light shotgun sequencing, the resulting data can be used to scaffold large genomes. The protocol is efficient enough to consider routinely for even the largest genomes. Conifers represent the largest reference genome projects executed to date. The largest of these is that of the conifer Pinus lambertiana (sugar pine), with a genome size of 31 billion bp. In this paper, we report on the molecular and computational protocols for scaffolding the P. lambertiana genome using the library technology from 10× Genomics. At 247,000 bp, the NG50 of the existing reference sequence is the highest scaffold contiguity among the currently published conifer assemblies; this new assembly's NG50 is 1.94 million bp, an eightfold increase.
Subject(s)
Contig Mapping/methods , Genome, Plant , Pinus/genetics , Plant Extracts/genetics , Whole Genome Sequencing/methods , Algorithms , Balsams , Contig Mapping/standards , Reference Standards , Whole Genome Sequencing/standards
ABSTRACT
The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821, 61% larger than the previous assembly.
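The N50 statistic quoted above (described as a weighted average contig size) can be made concrete with a short sketch. The function below uses the usual definition: sort sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size. Passing an estimated genome size instead gives the NG50 variant. The function name and the toy lengths are illustrative assumptions, not data from the assembly.

```python
def n50(lengths, genome_size=None):
    """Return N50 of a list of contig or scaffold lengths.

    If genome_size is given, the half-way target is based on that estimate
    rather than on the assembly total, which yields NG50.
    Returns 0 when the assembly covers less than half of genome_size.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")

    total = genome_size if genome_size is not None else sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to the total to avoid float division.
        if 2 * running >= total:
            return length
    return 0


# Toy example (hypothetical lengths in bp).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))                  # 8: 10 + 9 + 8 = 27 is half of 54
print(n50(toy_contigs, genome_size=80))  # 6: 10 + 9 + 8 + 7 + 6 = 40
```

Because N50 depends on the assembly total while NG50 depends on an external genome size estimate, the two values should not be compared directly across assemblies.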
Subject(s)
Contig Mapping , Genome, Plant , High-Throughput Nucleotide Sequencing , Pinus taeda/genetics , Sequence Analysis, DNA , Algorithms , Genomics
ABSTRACT
A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceed those of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp). Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms.
Subject(s)
Genome, Plant , Photosynthesis/genetics , Pinaceae/genetics , Pinaceae/metabolism , Pseudotsuga/genetics , Pseudotsuga/metabolism , Whole Genome Sequencing , Adaptation, Biological/genetics , Computational Biology , Evolution, Molecular , Gene Duplication , Gene Regulatory Networks , Genomics , Molecular Sequence Annotation , Multigene Family , Phylogeny , Pinaceae/classification , Proteomics/methods , Pseudotsuga/classification , Repetitive Sequences, Nucleic Acid
ABSTRACT
Oak represents a valuable natural resource across Northern Hemisphere ecosystems, attracting a large research community studying its genetics, ecology, conservation, and management. Here we introduce a draft genome assembly of valley oak (Quercus lobata) using Illumina sequencing of adult leaf tissue of a tree found in an accessible, well-studied, natural southern California population. Our assembly includes a nuclear genome and a complete chloroplast genome, along with annotation of encoded genes. The assembly contains 94,394 scaffolds, totaling 1.17 Gb with 18,512 scaffolds of length 2 kb or longer, with a total length of 1.15 Gb, and a N50 scaffold size of 278,077 kb. The k-mer histograms indicate a diploid genome size of ~720-730 Mb, which is smaller than the total length due to high heterozygosity, estimated at 1.25%. A comparison with a recently published European oak (Q. robur) nuclear sequence indicates 93% similarity. The Q. lobata chloroplast genome has 99% identity with another North American oak, Q. rubra. Preliminary annotation yielded an estimate of 61,773 predicted protein-coding genes, of which 71% had similarity to known protein domains. We searched 956 Benchmarking Universal Single-Copy Orthologs, and found 863 complete orthologs, of which 450 were present in > 1 copy. We also examined an earlier version (v0.5) where duplicate haplotypes were removed to discover variants. These additional sources indicate that the predicted gene count in Version 1.0 is overestimated by 37-52%. Nonetheless, this first draft valley oak genome assembly represents a high-quality, well-annotated genome that provides a tool for forest restoration and management practices.