ABSTRACT
BACKGROUND: The olive tree (Olea europaea L. subsp. europaea, Oleaceae) has been the most emblematic perennial crop for Mediterranean countries since its domestication around 6000 years ago in the Levant. Two taxonomic varieties are currently recognized: cultivated (var. europaea) and wild (var. sylvestris) trees. However, it remains unclear whether olive cultivars derive from a single initial domestication event followed by secondary diversification, or whether cultivated lineages are the result of more than a single, independent primary domestication event. To shed light on the recent evolution and domestication of the olive tree, here we analyze a group of newly sequenced and available genomes using a phylogenomics and population genomics framework. RESULTS: We improved the assembly and annotation of the reference genome, newly sequenced the genomes of twelve individuals (ten var. europaea, one var. sylvestris, and one outgroup taxon, subsp. cuspidata), and assembled a dataset comprising whole genome data from 46 var. europaea and 10 var. sylvestris. Phylogenomic and population structure analyses support a continuous process of olive tree domestication, involving a major domestication event, followed by recurrent independent genetic admixture events with wild populations across the Mediterranean Basin. Cultivated olives exhibit only slightly lower levels of genetic diversity than wild forms, which can be partially explained by the occurrence of a mild population bottleneck 3000-14,000 years ago during the primary domestication period, followed by recurrent introgression from wild populations. Genes associated with stress response and developmental processes were positively selected in cultivars, but we did not find evidence that genes involved in fruit size or oil content were under positive selection. This suggests that complex selective processes other than directional selection of a few genes are in place.
CONCLUSIONS: Altogether, our results suggest that a primary domestication area in the eastern Mediterranean basin was followed by numerous secondary events across most countries of southern Europe and northern Africa, often involving genetic admixture with genetically rich wild populations, particularly from the western Mediterranean Basin.
Subjects
Domestication, Genetic Variation, Plant Genome, Olea/genetics, Phylogeny, Biological Evolution
ABSTRACT
Lecanosticta acicola is the causal agent of brown spot needle blight, which affects pine trees across the northern hemisphere. Based on marker genes and microsatellite data, two distinct lineages have been identified that were introduced into Europe on two separate occasions. Despite their overall distinct geographic distribution, they have been found to coexist in regions of northern Spain and France. Here, we present the first genome-wide study of Lecanosticta acicola, including assembly of the reference genome and a population genomics analysis of 70 natural isolates from northern Spain. We show that most of the isolates belong to the southern lineage but show signs of introgression with northern lineage isolates, indicating mating between the two lineages. We also identify phenotypic differences between the two lineages based on the activity profiles of 20 enzymes, with introgressed strains being more phenotypically similar to members of the southern lineage. In conclusion, we show ongoing genetic admixture between the two main lineages of L. acicola in a region of recent expansion.
IMPORTANCE: Lecanosticta acicola is a fungal pathogen causing severe defoliation, growth reduction, and even death in more than 70 conifer species. Despite the increasing incidence of this species, little is known about its population dynamics. Two divergent lineages have been described that have now been found together in regions of France and Spain, but it is unknown how these mixed populations evolve. Here we present the first reference genome for this important plant pathogenic fungus and use it to study the population genomics of 70 isolates from an affected forest in the north of Spain. We find signs of introgression between the two main lineages, indicating that active mating is occurring in this region, which could favor the appearance of novel traits in this species. We also study the phenotypic differences across this population based on enzymatic activities on 20 compounds.
Subjects
Ascomycota, Pinus, Humans, Genome-Wide Association Study, Pinus/genetics, Ascomycota/genetics, Genomics
ABSTRACT
The pearly razorfish (Xyrichtys novacula), commonly known as raor in the Balearic Islands, is a wrasse within the family Labridae. This fish species has particular biological and socio-cultural characteristics making it an ideal model organism in the fields of behavioural ecology, molecular ecology and conservation biology. In this study, we present the first annotated chromosome-level assembly for this species. Sequencing involved a combination of long reads with Oxford Nanopore Technologies, Illumina paired-end short reads (2 × 151 bp), Hi-C and RNA-seq from different tissues. The nuclear genome assembly has a scaffold N50 of 34.33 Mb, a total assembly span of 775.53 Mb and 99.63% of the sequence assembled into 24 superscaffolds, consistent with its known karyotype. Quality metrics revealed a consensus accuracy (QV) of 42.92 and gene completeness > 98%. The genome annotation resulted in 26,690 protein-coding genes and 12,737 non-coding transcripts. The coding regions encoded 39,613 unique protein products, 93% of them with assigned function. Overall, the publication of the X. novacula reference genome will broaden the scope and impact of genomic research conducted on this iconic and colourful species.
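For readers less familiar with the assembly metrics quoted above, the following is a minimal sketch of how a scaffold N50 and a Phred-scaled consensus QV are conventionally interpreted. It is not the authors' pipeline: the function names and the toy scaffold lengths are assumptions made for illustration, and only the QV value of 42.92 comes from the abstract.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total span."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to an error probability
    per base, using the standard relation QV = -10 * log10(p)."""
    return 10 ** (-qv / 10)


# Hypothetical toy scaffold lengths (arbitrary units), not real data.
toy_scaffolds = [40, 30, 20, 10]
print("toy N50:", n50(toy_scaffolds))

# QV reported in the abstract; the result is an expected error rate per base.
print("error rate at QV 42.92:", qv_to_error_rate(42.92))
```

Under the standard Phred relation, a QV of 42.92 corresponds to roughly one expected consensus error every 20,000 bases.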
Subjects
Genome, Perciformes, Animals, Molecular Sequence Annotation, Perciformes/genetics, Genomics/methods, Chromosomes, Phylogeny
ABSTRACT
Cephalopods are emerging animal models and include iconic species for studying the link between genomic innovations and physiological and behavioral complexities. Coleoid cephalopods possess the largest nervous system among invertebrates, both for cell counts and brain-to-body ratio. Octopus vulgaris has been at the center of a long-standing tradition of research into diverse aspects of cephalopod biology, including behavioral and neural plasticity, learning and memory recall, regeneration, and sophisticated cognition. However, no chromosome-scale genome assembly was available for O. vulgaris to aid in functional studies. To fill this gap, we sequenced and assembled a chromosome-scale genome of the common octopus, O. vulgaris. The final assembly spans 2.8 billion basepairs, 99.34% of which are in 30 chromosome-scale scaffolds. Hi-C heatmaps support a karyotype of 1n = 30 chromosomes. Comparisons with other octopus species' genomes show a conserved octopus karyotype and a pattern of local genome rearrangements between species. This new chromosome-scale genome of O. vulgaris will further facilitate research in all aspects of cephalopod biology, including various forms of plasticity and the neural machinery underlying sophisticated cognition, as well as an understanding of cephalopod evolution.
Subjects
Octopodiformes, Animals, Octopodiformes/genetics, Genome, Genomics, Nervous System, Chromosomes/genetics
ABSTRACT
The Mediterranean lizard Podarcis lilfordi is an emblematic species of the Balearic Islands. The extensive phenotypic diversity among extant isolated populations makes the species a great insular model system for eco-evolutionary studies, as well as a challenging target for conservation management plans. Here we report the first high-quality chromosome-level assembly and annotation of the P. lilfordi genome, along with its mitogenome, based on a mixed sequencing strategy (10X Genomics linked reads, Oxford Nanopore Technologies long reads and Hi-C scaffolding) coupled with extensive transcriptomic data (Illumina and PacBio). The genome assembly (1.5 Gb) is highly contiguous (N50 = 90 Mb) and complete, with 99% of the sequence assigned to candidate chromosomal sequences and >97% gene completeness. We annotated a total of 25,663 protein-coding genes translating into 38,615 proteins. Comparison to the genome of the related species Podarcis muralis revealed substantial similarity in genome size, annotation metrics, repeat content, and a strong collinearity, despite their evolutionary distance (~18-20 MYA). This genome expands the repertoire of available reptilian genomes and will facilitate the exploration of the molecular and evolutionary processes underlying the extraordinary phenotypic diversity of this insular species, while providing a critical resource for conservation genomics.
Subjects
Chromosomes, Lizards, Animals, Spain, Molecular Sequence Annotation, Genome, Lizards/genetics
ABSTRACT
In response to the threat of increasing antimicrobial resistance, we must increase the amount of available high-quality genomic data gathered on antibiotic-resistant bacteria. To this end, we developed an integrated pipeline for high-throughput long-read sequencing, assembly, annotation and analysis of bacterial isolates and used it to generate a large genomic data set of carbapenemase-producing Enterobacterales (CPE) isolates collected in Spain. The set of 461 isolates was sequenced with a combination of Illumina and Oxford Nanopore Technologies (ONT) DNA sequencing technologies in order to provide genomic context for chromosomal loci and, most importantly, structural resolution of plasmids, important determinants for transmission of antimicrobial resistance. We developed an informatics pipeline called Assembly and Annotation of Carbapenem-Resistant Enterobacteriaceae (AACRE) for the full assembly and annotation of the bacterial genomes and their complement of plasmids. To explore the resulting genomic data set, we developed a new database called inCREDBle that not only stores the genomic data, but also provides unique ways to filter and compare data, enabling comparative genomic analyses at the level of chromosomes, plasmids and individual genes. We identified a new sequence type, ST5000, and discovered a genomic locus unique to ST15 that may be linked to its increased spread in the population. In addition to our major objective of generating a large regional data set, we took the opportunity to compare the effects of sample quality and sequencing methods, including R9 versus R10 nanopore chemistry, on genome assembly and annotation quality. We conclude that converting short-read and hybrid microbial sequencing and assembly workflows to the latest nanopore chemistry will further reduce processing time and cost, truly enabling the routine monitoring of resistance transmission patterns at the resolution of complete chromosomes and plasmids.
Subjects
Carbapenem-Resistant Enterobacteriaceae, Carbapenems, Carbapenems/pharmacology, Carbapenem-Resistant Enterobacteriaceae/genetics, Workflow, Genomics/methods, Anti-Bacterial Agents/pharmacology
ABSTRACT
The emergence of 16S rRNA methyltransferases (RMTs) in Gram-negative pathogens bearing other clinically relevant resistance mechanisms, such as carbapenemase-producing Enterobacterales (CPE), is becoming an alarming concern. We investigated the prevalence, antimicrobial susceptibility, resistance mechanisms, molecular epidemiology and genetic support of RMTs in CPE isolates from Spain. This study included a collection of 468 CPE isolates recovered during 2018 from 32 participating Spanish hospitals. MICs were determined using the broth microdilution method, the agar dilution method (fosfomycin) or MIC gradient strips (plazomicin). All isolates were subjected to hybrid whole-genome sequencing (WGS). Sequence types (STs), core genome phylogenetic relatedness, horizontally acquired resistance mechanisms, plasmid analysis and the genetic environment of RMTs were determined in silico from WGS data in all RMT-positive isolates. Among the 468 CPE isolates evaluated, 24 isolates (5.1%) recovered from nine different hospitals spanning five Spanish regions showed resistance to all aminoglycosides and were positive for an RMT (21 RmtF, 2 ArmA and 1 RmtC). All RMT-producers showed high-level resistance to all aminoglycosides, including plazomicin, and in most cases exhibited an extensively drug-resistant susceptibility profile. The RMT-positive isolates showed low genetic diversity and were global clones of Klebsiella pneumoniae (ST147, ST101, ST395) and Enterobacter cloacae (ST93) bearing blaOXA-48, blaNDM-1 or blaVIM-1 carbapenemase genes. RMTs were harboured in five different multidrug resistance plasmids and linked to efficient mobile genetic elements. Our findings highlight that RMTs are emerging among clinical CPE isolates from Spain and their spread should be monitored to preserve the future clinical utility of aminoglycosides and plazomicin.
Subjects
Bacterial Drug Resistance/genetics, Enterobacteriaceae/genetics, Enterobacteriaceae/metabolism, Methyltransferases/genetics, Methyltransferases/metabolism, 16S Ribosomal RNA/genetics, beta-Lactamases/genetics, beta-Lactamases/metabolism, Genetic Variation, Genome-Wide Association Study, Genotype, Humans, Spain
ABSTRACT
BACKGROUND: The Mediterranean mussel Mytilus galloprovincialis is an ecologically and economically relevant edible marine bivalve, highly invasive and resilient to biotic and abiotic stressors causing recurrent massive mortalities in other bivalves. Although these traits have been recently linked with the maintenance of a high genetic variation within natural populations, the factors underlying the evolutionary success of this species remain unclear. RESULTS: Here, after the assembly of a 1.28-Gb reference genome and the resequencing of 14 individuals from two independent populations, we reveal a complex pan-genomic architecture in M. galloprovincialis, with a core set of 45,000 genes plus a strikingly high number of dispensable genes (20,000) subject to presence-absence variation, which may be entirely missing in several individuals. We show that dispensable genes are associated with hemizygous genomic regions affected by structural variants, which overall account for nearly 580 Mb of DNA sequence not included in the reference genome assembly. As such, this is the first study to report the widespread occurrence of gene presence-absence variation at a whole-genome scale in the animal kingdom. CONCLUSIONS: Dispensable genes usually belong to young and recently expanded gene families enriched in survival functions, which might be the key to explain the resilience and invasiveness of this species. This unique pan-genome architecture is characterized by dispensable genes in accessory genomic regions that exceed by orders of magnitude those observed in other metazoans, including humans, and closely mirror the open pan-genomes found in prokaryotes and in a few non-metazoan eukaryotes.
Subjects
Genome, Mytilus/genetics, Animals, Base Sequence, Biological Evolution, Female, Genomics, Humans, Innate Immunity, Male, Mytilus/anatomy & histology, Peptide Elongation Factor 1, Pore Forming Cytotoxic Proteins
ABSTRACT
The evolution of winged insects revolutionized terrestrial ecosystems and led to the largest animal radiation on Earth. However, we still have an incomplete picture of the genomic changes that underlay this diversification. Mayflies, as one of the sister groups of all other winged insects, are key to understanding this radiation. Here, we describe the genome of the mayfly Cloeon dipterum and its gene expression throughout its aquatic and aerial life cycle and specific organs. We discover an expansion of odorant-binding-protein genes, some expressed specifically in breathing gills of aquatic nymphs, suggesting a novel sensory role for this organ. In contrast, flying adults use an enlarged opsin set in a sexually dimorphic manner, with some expressed only in males. Finally, we identify a set of wing-associated genes deeply conserved in the pterygote insects and find transcriptomic similarities between gills and wings, suggesting a common genetic program. Globally, this comprehensive genomic and transcriptomic study uncovers the genetic basis of key evolutionary adaptations in mayflies and winged insects.
Subjects
Physiological Adaptation/genetics, Ephemeroptera/genetics, Molecular Evolution, Animal Wings, Animals, Ephemeroptera/classification, Ephemeroptera/growth & development, Female, Developmental Gene Expression Regulation, Insect Genes/genetics, Insect Genome/genetics, Gills, Insects/classification, Insects/genetics, Life Cycle Stages/genetics, Male, Phylogeny
ABSTRACT
U12-type introns are spliced by the U12-dependent spliceosome and are present in the genomes of many higher eukaryotic lineages including plants, chordates and some invertebrates. However, due to their relatively recent discovery and a systematic bias against recognition of non-canonical splice sites in general, the introns defined by U12-type splice sites are under-represented in genome annotations. Such under-representation compounds the already difficult problem of determining gene structures. It also impedes attempts to study these introns genome-wide or phylum-wide. The resource described here, the U12 Intron Database (U12DB), aims to catalog the U12-type introns of completely sequenced eukaryotic genomes in a framework that groups orthologous introns with each other. This will aid further investigations into the evolution and mechanism of U12-dependent splicing as well as assist ongoing genome annotation efforts. Public access to the U12DB is available at http://genome.imim.es/cgi-bin/u12db/u12db.cgi.
Subjects
Nucleic Acid Databases, Introns, RNA Splice Sites, Small Nuclear RNA/metabolism, Spliceosomes/metabolism, Animals, Humans, Internet, Phylogeny, RNA Splicing, User-Computer Interface
ABSTRACT
The Eastern woodchuck (Marmota monax) has been extensively used in research of chronic hepatitis B and liver cancer because its infection with the woodchuck hepatitis virus closely resembles a human hepatitis B virus infection. Development of novel immunotherapeutic approaches requires genetic information on immune pathway genes in this animal model. The woodchuck genome was assembled with a combination of high-coverage whole-genome shotgun sequencing of Illumina paired-end, mate-pair libraries and fosmid pool sequencing. The result is a 2.63 Gigabase (Gb) assembly with a contig N50 of 74.5 kilobases (kb), scaffold N50 of 892 kb, and genome completeness of 99.2%. RNA sequencing (RNA-seq) from seven different tissues aided in the annotation of 30,873 protein-coding genes, which in turn encode 41,826 unique protein products. More than 90% of the genes have been functionally annotated, with 82% of them containing open reading frames. This genome sequence and its annotation will enable further research in chronic hepatitis B and hepatocellular carcinoma and contribute to the understanding of immunological responses in the woodchuck.
Subjects
Genome, Chronic Hepatitis B/virology, Marmota/genetics, Marmota/virology, Animals, Base Sequence, Cluster Analysis, Animal Disease Models, Female, Gene Expression Regulation, Marmota/immunology, Molecular Sequence Annotation, Open Reading Frames/genetics, Phylogeny
ABSTRACT
Drosophila guanche is a member of the obscura group that originated in the Canary Islands archipelago upon its colonization by D. subobscura. It evolved into a new species in the laurisilva, a laurel forest present in wet regions that in the islands have only minor long-term weather fluctuations. Oceanic island endemic species such as D. guanche can become model species to investigate not only the relative role of drift and adaptation in speciation processes but also how population size affects nucleotide variation. Moreover, the previous identification of two satellite DNAs in D. guanche makes this species attractive for studying how centromeric DNA evolves. As a prerequisite for its establishment as a model species suitable to address all these questions, we generated a high-quality D. guanche genome sequence composed of 42 cytologically mapped scaffolds, which are assembled into six super-scaffolds (one per chromosome). The comparative analysis of the D. guanche proteome with that of twelve other Drosophila species identified 151 genes that were subject to adaptive evolution in the D. guanche lineage, with a subset of them being involved in flight and genome stability. For example, the Centromere Identifier (CID) protein, directly interacting with centromeric satellite DNA, shows signals of adaptation in this species. Both genomic analyses and FISH of the two satellites would support an ongoing replacement of centromeric satellite DNA in D. guanche.
Subjects
Physiological Adaptation/genetics, Drosophila/genetics, Molecular Evolution, Animal Flight/physiology, Insect Genes, Genomic Instability, Islands, Animals, Base Sequence, Chromosomes/genetics, DNA Transposable Elements/genetics, Molecular Sequence Annotation, Oceans and Seas, Phylogeny
ABSTRACT
BACKGROUND: Vertebrate odorant receptors comprise at least three types of G protein-coupled receptors (GPCRs): the OR, V1R, and V2R/V2R-like receptors, the latter group belonging to the C family of GPCRs. These receptor families are thought to receive chemosensory information from a wide spectrum of odorant and pheromonal cues that influence critical animal behaviors such as feeding, reproduction and other social interactions. RESULTS: Using genome database mining and other informatics approaches, we identified and characterized the repertoire of 54 intact "V2R-like" olfactory C family GPCRs in the zebrafish. Phylogenetic analysis - which also included a set of 34 C family GPCRs from fugu - places the fish olfactory receptors in three major groups, which are related to but clearly distinct from other C family GPCRs, including the calcium sensing receptor, metabotropic glutamate receptors, GABA-B receptor, T1R taste receptors, and the major group of V2R vomeronasal receptor families. Interestingly, an analysis of sequence conservation and selective pressure in the zebrafish receptors revealed the retention of a conserved sequence motif previously shown to be required for ligand binding in other amino acid receptors. CONCLUSION: Based on our findings, we propose that the repertoire of zebrafish olfactory C family GPCRs has evolved to allow the detection and discrimination of a spectrum of amino acid and/or amino acid-based compounds, which are potent olfactory cues in fish. Furthermore, as the major groups of fish receptors and mammalian V2R receptors appear to have diverged significantly from a common ancestral gene(s), these receptors likely mediate chemosensation of different classes of chemical structures by their respective organisms.
Subjects
Amino Acid Receptors/genetics, Odorant Receptors/genetics, Zebrafish Proteins/genetics, Zebrafish/genetics, Amino Acid Sequence, Animals, Chromosome Mapping, Fishes/genetics, Genomics/methods, Mice, Molecular Models, Molecular Sequence Data, Olfactory Receptor Neurons/physiology, Phylogeny, Tertiary Protein Structure, Amino Acid Receptors/chemistry, Amino Acid Receptors/classification, G-Protein-Coupled Receptors/classification, Odorant Receptors/chemistry, Odorant Receptors/classification, Sequence Alignment, Amino Acid Sequence Homology, Species Specificity, Takifugu/genetics, Zebrafish Proteins/chemistry, Zebrafish Proteins/classification
ABSTRACT
BACKGROUND: The Mediterranean olive tree (Olea europaea subsp. europaea) was one of the first trees to be domesticated and is currently of major agricultural importance in the Mediterranean region as the source of olive oil. The molecular bases underlying the phenotypic differences among domesticated cultivars, or between domesticated olive trees and their wild relatives, remain poorly understood. Both wild and cultivated olive trees have 46 chromosomes (2n). FINDINGS: A total of 543 Gb of raw DNA sequence from whole genome shotgun sequencing, and a fosmid library containing 155,000 clones from a 1,000+ year-old olive tree (cv. Farga) were generated by Illumina sequencing using different combinations of mate-pair and paired-end libraries. Assembly gave a final genome with a scaffold N50 of 443 kb, and a total length of 1.31 Gb, which represents 95% of the estimated genome length (1.38 Gb). In addition, the associated fungus Aureobasidium pullulans was partially sequenced. Genome annotation, assisted by RNA sequencing from leaf, root, and fruit tissues at various stages, resulted in 56,349 unique protein coding genes, suggesting recent genomic expansion. Genome completeness, as estimated using the CEGMA pipeline, reached 98.79%. CONCLUSIONS: The assembled draft genome of O. europaea will provide a valuable resource for the study of the evolution and domestication processes of this important tree, and allow determination of the genetic bases of key phenotypic traits. Moreover, it will enhance breeding programs and the formation of new varieties.
Subjects
Plant Genome, Olea/genetics, DNA Sequence Analysis/methods, Chromosome Mapping, Contig Mapping, Gene Library, Genome Size, Mediterranean Region, Molecular Sequence Annotation
ABSTRACT
BACKGROUND: Vertebrate odorant receptors comprise three types of G protein-coupled receptors: the OR, V1R and V2R receptors. The OR superfamily contains over 1,000 genes in some mammalian species, representing the largest gene superfamily in the mammalian genome. RESULTS: To facilitate an informed analysis of OR gene phylogeny, we identified the complete set of 143 OR genes in the zebrafish genome, as well as the OR repertoires in two pufferfish species, fugu (44 genes) and tetraodon (42 genes). Although the genomes analyzed here contain fewer genes than in mammalian species, the teleost OR genes can be grouped into a larger number of major clades, representing greater overall OR diversity in the fish. CONCLUSION: Based on the phylogeny of fish and mammalian repertoires, we propose a model for OR gene evolution in which different ancestral OR genes or gene families were selectively lost or expanded in different vertebrate lineages. In addition, our calculations of the ratios of non-synonymous to synonymous codon substitutions among more recently expanding OR subgroups in zebrafish implicate residues that may be involved in odorant binding.
Subjects
Odorant Receptors/genetics, Odorant Receptors/metabolism, Algorithms, Amino Acid Motifs, Animals, Cell Lineage, Chromosome Mapping, Codon, Computational Biology, Genetic Databases, Molecular Evolution, Fishes, Gene Expression Regulation, Humans, Biological Models, Multigene Family, Phylogeny, Secondary Protein Structure, Tertiary Protein Structure, DNA Sequence Analysis, Species Specificity, Takifugu, Zebrafish
ABSTRACT
As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100× shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy.
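As an illustration of what scoring pipelines against a benchmark mutation set can involve, the sketch below computes precision, recall and F1 for one call set, plus a Jaccard index as a simple measure of concordance between two pipelines. The abstract does not state which metrics the authors used, so the choice of metrics is an assumption, as are the function names and the made-up variant tuples.

```python
def score_calls(calls, benchmark):
    """Score a set of mutation calls against a benchmark (truth) set.

    Returns (precision, recall, F1). Variants are compared as exact
    (chromosome, position, ref, alt) tuples, a simplifying assumption.
    """
    calls, benchmark = set(calls), set(benchmark)
    true_pos = len(calls & benchmark)
    precision = true_pos / len(calls) if calls else 0.0
    recall = true_pos / len(benchmark) if benchmark else 0.0
    f1 = 2 * precision * recall / (precision + recall) if true_pos else 0.0
    return precision, recall, f1


def jaccard(calls_a, calls_b):
    """Concordance of two call sets: shared calls over all calls."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical variants for illustration only; not data from the study.
benchmark = {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"),
             ("chr2", 50, "C", "T"), ("chr3", 75, "T", "G")}
pipeline_a = {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"),
              ("chr2", 50, "C", "T"), ("chr4", 10, "A", "G")}
pipeline_b = {("chr1", 100, "A", "T"), ("chr5", 5, "G", "A")}

print("pipeline A:", score_calls(pipeline_a, benchmark))
print("pipeline B:", score_calls(pipeline_b, benchmark))
print("A vs B concordance:", jaccard(pipeline_a, pipeline_b))
```

Exact tuple matching is the simplest possible comparison; a real evaluation would typically normalise variant representation first.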