ABSTRACT
The whole-genome duplication 80 million years ago of the common ancestor of salmonids (salmonid-specific fourth vertebrate whole-genome duplication, Ss4R) provides unique opportunities to learn about the evolutionary fate of a duplicated vertebrate genome in 70 extant lineages. Here we present a high-quality genome assembly for Atlantic salmon (Salmo salar), and show that large genomic reorganizations, coinciding with bursts of transposon-mediated repeat expansions, were crucial for the post-Ss4R rediploidization process. Comparisons of duplicate gene expression patterns across a wide range of tissues with orthologous genes from a pre-Ss4R outgroup unexpectedly demonstrate far more instances of neofunctionalization than subfunctionalization. Surprisingly, we find that genes that were retained as duplicates after the teleost-specific whole-genome duplication 320 million years ago were not more likely to be retained after the Ss4R, and that the duplicate retention was not influenced to a great extent by the nature of the predicted protein interactions of the gene products. Finally, we demonstrate that the Atlantic salmon assembly can serve as a reference sequence for the study of other salmonids for a range of purposes.
Subject(s)
Diploidy , Molecular Evolution , Gene Duplication/genetics , Duplicate Genes/genetics , Genome/genetics , Salmo salar/genetics , Animals , DNA Transposable Elements/genetics , Female , Genomics , Male , Genetic Models , Mutagenesis/genetics , Phylogeny , Reference Standards , Salmo salar/classification , Sequence Homology
ABSTRACT
BACKGROUND: Graph-based reference genomes have become popular as they allow read mapping and follow-up analyses in settings where the exact haplotypes underlying a high-throughput sequencing experiment are not precisely known. Two recent papers show that mapping to graph-based reference genomes can improve accuracy as compared to methods using linear references. Both of these methods index the sequences for most paths up to a certain length in the graph in order to enable direct mapping of reads containing common variants. However, the combinatorial explosion of possible paths through nearby variants also leads to a huge search space and an increased chance of false positive alignments to highly variable regions. RESULTS: We here assess three prominent graph-based read mappers against a hybrid baseline approach that combines an initial path determination with a tuned linear read mapping method. We show, using a previously proposed benchmark, that this simple approach is able to improve overall accuracy of read-mapping to graph-based reference genomes. CONCLUSIONS: Our method is implemented in a tool, Two-step Graph Mapper, which is available at https://github.com/uio-bmi/two_step_graph_mapper along with data and scripts for reproducing the experiments. Our method highlights characteristics of the current generation of graph-based read mappers and shows potential for improvement for future graph-based read mappers.
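The "combinatorial explosion of possible paths through nearby variants" mentioned above can be made concrete with a toy graph. The sketch below is only an illustration and is not taken from Two-step Graph Mapper: the function names `count_paths` and `variant_chain` and the node labels are hypothetical. It builds a chain of biallelic variants, each modelled as a two-way bubble, and counts the distinct paths through it.

```python
from typing import Dict, List


def count_paths(graph: Dict[str, List[str]], source: str, sink: str) -> int:
    """Count distinct source-to-sink paths in a directed acyclic graph."""
    memo: Dict[str, int] = {}

    def visit(node: str) -> int:
        if node == sink:
            return 1
        if node not in memo:
            # A node's path count is the sum over its successors.
            memo[node] = sum(visit(nxt) for nxt in graph.get(node, []))
        return memo[node]

    return visit(source)


def variant_chain(n_variants: int) -> Dict[str, List[str]]:
    """Toy graph (hypothetical): n biallelic variants in a row, each a bubble."""
    graph: Dict[str, List[str]] = {}
    for i in range(n_variants):
        anchor, ref, alt, nxt = f"s{i}", f"ref{i}", f"alt{i}", f"s{i + 1}"
        graph[anchor] = [ref, alt]  # branch into reference or alternative allele
        graph[ref] = [nxt]          # both alleles rejoin at the next anchor
        graph[alt] = [nxt]
    return graph


for n in (1, 5, 10, 20):
    print(n, count_paths(variant_chain(n), "s0", f"s{n}"))
```

With n independent biallelic variants the count doubles for each variant (2^n), which is why indexing every path through a variant-dense window quickly becomes expensive.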
Subject(s)
Computational Biology/methods , Human Genome , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Alignment
ABSTRACT
Graph-based representations are considered to be the future for reference genomes, as they allow integrated representation of the steadily increasing data on individual variation. Currently available tools allow de novo assembly of graph-based reference genomes, alignment of new read sets to the graph representation as well as certain analyses like variant calling and haplotyping. We here present a first method for calling ChIP-Seq peaks on read data aligned to a graph-based reference genome. The method is a graph generalization of the peak caller MACS2, and is implemented in an open source tool, Graph Peak Caller. By using the existing tool vg to build a pan-genome of Arabidopsis thaliana, we validate our approach by showing that Graph Peak Caller with a pan-genome reference graph can trace variants within peaks that are not part of the linear reference genome, and find peaks that in general are more motif-enriched than those found by MACS2.
Subject(s)
Chromatin Immunoprecipitation/methods , Genomics/methods , DNA Sequence Analysis/methods , Algorithms , Arabidopsis/genetics , Genome/genetics , Protein Binding , Software , Transcription Factors
ABSTRACT
BACKGROUND: The ballan wrasse (Labrus bergylta) belongs to a large teleost family containing more than 600 species showing several unique evolutionary traits such as lack of stomach and hermaphroditism. Agastric fish are found throughout the teleost phylogeny, in quite diverse and unrelated lineages, indicating stomach loss has occurred independently multiple times in the course of evolution. By assembling the ballan wrasse genome and transcriptome we aimed to determine the genetic basis for its digestive system function and appetite regulation. Among other things, this knowledge will aid the formulation of aquaculture diets that meet the nutritional needs of agastric species. RESULTS: Long and short read sequencing technologies were combined to generate a ballan wrasse genome of 805 Mbp. Analysis of the genome and transcriptome assemblies confirmed the absence of genes that code for proteins involved in gastric function. The gene coding for the appetite-stimulating protein ghrelin was also absent in wrasse. Gene synteny mapping identified several appetite-controlling genes and their paralogs previously undescribed in fish. Transcriptome profiling along the length of the intestine found a declining expression gradient from the anterior to the posterior, and a distinct expression profile in the hind gut. CONCLUSIONS: We showed gene loss has occurred for all known genes related to stomach function in the ballan wrasse, while the remaining functions of the digestive tract appear intact. The results also show appetite control in ballan wrasse has undergone substantial changes. The loss of ghrelin suggests that other genes, such as motilin, may play a ghrelin-like role. The wrasse genome offers novel insight into the evolutionary traits of this large family.
As the stomach plays a major role in protein digestion, the lack of genes related to stomach digestion in wrasse suggests it requires formulated diets with higher levels of readily digestible protein than those for gastric species.
Subject(s)
Biological Evolution , Gene Expression Profiling , Perciformes/genetics , Stomach/physiology , Animals , Appetite , Digestion , Gastrointestinal Tract , Genome , Perciformes/physiology , Phylogeny
ABSTRACT
BACKGROUND: Increased availability of genome assemblies for non-model organisms has resulted in invaluable biological and genomic insight into numerous vertebrates, including teleosts. Sequencing of the Atlantic cod (Gadus morhua) genome and the genomes of many of its relatives (Gadiformes) demonstrated a shared loss of the major histocompatibility complex (MHC) II genes 100 million years ago. An improved version of the Atlantic cod genome assembly shows an extreme density of tandem repeats compared to other vertebrate genome assemblies. Highly contiguous assemblies are therefore needed to further investigate the unusual immune system of the Gadiformes, and whether the high density of tandem repeats found in Atlantic cod is a shared trait in this group. RESULTS: Here, we have sequenced and assembled the genome of haddock (Melanogrammus aeglefinus) - a relative of Atlantic cod - using a combination of PacBio and Illumina reads. Comparative analyses reveal that the haddock genome contains an even higher density of tandem repeats outside and within protein coding sequences than Atlantic cod. Further, both species show an elevated number of tandem repeats in genes mainly involved in signal transduction compared to other teleosts. A characterization of the immune gene repertoire demonstrates a substantial expansion of MHC I in Atlantic cod compared to haddock. In contrast, the Toll-like receptors show a similar pattern of gene losses and expansions. For the NOD-like receptors (NLRs), another gene family associated with the innate immune system, we find a large expansion common to all teleosts, with possible lineage-specific expansions in zebrafish, stickleback and the codfishes. CONCLUSIONS: The generation of a highly contiguous genome assembly of haddock revealed that the high density of short tandem repeats as well as expanded immune gene families is not unique to Atlantic cod - but possibly a feature common to all, or most, codfishes.
A shared expansion of NLR genes in teleosts suggests that the NLRs have a more substantial role in the innate immunity of teleosts than other vertebrates. Moreover, we find that high copy number genes combined with variable genome assembly qualities may impede complete characterization of these genes, i.e. the number of NLRs in different teleost species might be underestimated.
Subject(s)
Fish Proteins/genetics , Gadiformes/genetics , Genome , Innate Immunity/genetics , Microsatellite Repeats , Animals , Genetic Variation , Histocompatibility Antigens Class I/genetics , NLR Proteins/genetics , Population Density , Toll-Like Receptors/genetics
ABSTRACT
BACKGROUND: It has been proposed that future reference genomes should be graph structures in order to better represent the sequence diversity present in a species. However, there is currently no standard method to represent genomic intervals, such as the positions of genes or transcription factor binding sites, on graph-based reference genomes. RESULTS: We formalize offset-based coordinate systems on graph-based reference genomes and introduce methods for representing intervals on these reference structures. We show the advantage of our methods by representing genes on a graph-based representation of the newest assembly of the human genome (GRCh38) and its alternative loci for regions that are highly variable. CONCLUSION: More complex reference genomes, containing alternative loci, require methods to represent genomic data on these structures. Our proposed notation for genomic intervals makes it possible to fully utilize the alternative loci of the GRCh38 assembly and potential future graph-based reference genomes. We have made a Python package for representing such intervals on offset-based coordinate systems, available at https://github.com/uio-cels/offsetbasedgraph. An interactive web-tool using this Python package to visualize genes on a graph created from GRCh38 is available at https://github.com/uio-cels/genomicgraphcoords.
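One way to picture an interval on an offset-based coordinate system is as a start offset into a first block, an end offset into a last block, and the ordered list of blocks traversed in between. The sketch below follows that reading of the abstract only; it is not the API of the offsetbasedgraph package, and the class name `GraphInterval`, the block identifiers and the block lengths are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class GraphInterval:
    """Interval on a sequence graph (illustrative, hypothetical layout)."""
    start_offset: int   # 0-based offset into the first block, inclusive
    end_offset: int     # offset into the last block, exclusive
    blocks: List[str]   # ordered ids of the blocks the interval passes through

    def length(self, block_lengths: Dict[str, int]) -> int:
        """Number of base pairs covered along the chosen path."""
        if len(self.blocks) == 1:
            return self.end_offset - self.start_offset
        first = block_lengths[self.blocks[0]] - self.start_offset
        middle = sum(block_lengths[b] for b in self.blocks[1:-1])
        return first + middle + self.end_offset


# Hypothetical graph: a main path interrupted by one alternative locus.
block_lengths = {"main_1": 100, "alt_1": 30, "main_2": 50}

# A gene that starts on the main path, crosses the alternative locus,
# and ends back on the main path.
gene = GraphInterval(start_offset=90, end_offset=20,
                     blocks=["main_1", "alt_1", "main_2"])
print(gene.length(block_lengths))
```

Storing the block list explicitly is what distinguishes this from a linear (chromosome, start, end) triple: the same pair of offsets can describe different sequences depending on which alternative blocks the interval follows.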
Subject(s)
Computer Graphics , Human Genome , Genomics/methods , Algorithms , Genetic Loci , Humans , Internet , Messenger RNA/genetics , Messenger RNA/metabolism , DNA Sequence Analysis , Software
ABSTRACT
BACKGROUND: The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enable the generation of more contiguous genome assemblies. RESULTS: By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. CONCLUSIONS: The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates a considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.
Subject(s)
Gadus morhua/genetics , Genomics/methods , Tandem Repeat Sequences/genetics , Animals , Heterozygote , Molecular Sequence Annotation , Genetic Promoter Regions , DNA Sequence Analysis
ABSTRACT
Atlantic cod (Gadus morhua) is a large, cold-adapted teleost that sustains long-standing commercial fisheries and incipient aquaculture. Here we present the genome sequence of Atlantic cod, showing evidence for complex thermal adaptations in its haemoglobin gene cluster and an unusual immune architecture compared to other sequenced vertebrates. The genome assembly was obtained exclusively by 454 sequencing of shotgun and paired-end libraries, and automated annotation identified 22,154 genes. The major histocompatibility complex (MHC) II is a conserved feature of the adaptive immune system of jawed vertebrates, but we show that Atlantic cod has lost the genes for MHC II, CD4 and invariant chain (Ii) that are essential for the function of this pathway. Nevertheless, Atlantic cod is not exceptionally susceptible to disease under natural conditions. We find a highly expanded number of MHC I genes and a unique composition of its Toll-like receptor (TLR) families. This indicates how the Atlantic cod immune system has evolved compensatory mechanisms in both adaptive and innate immunity in the absence of MHC II. These observations affect fundamental assumptions about the evolution of the adaptive immune system and its components in vertebrates.
Subject(s)
Gadus morhua/genetics , Gadus morhua/immunology , Genome/genetics , Immune System/immunology , Immunity/genetics , Animals , Molecular Evolution , Genomics , Hemoglobins/genetics , Immunity/immunology , Major Histocompatibility Complex/genetics , Major Histocompatibility Complex/immunology , Male , Genetic Polymorphism/genetics , Synteny/genetics , Toll-Like Receptors/genetics
ABSTRACT
Speciation by hybridization is emerging as a significant contributor to biological diversification. Yet, little is known about the relative contributions of (i) evolutionary novelty and (ii) sorting of pre-existing parental incompatibilities to the build-up of reproductive isolation under this mode of speciation. Few studies have addressed empirically whether hybrid animal taxa are intrinsically isolated from their parents, and no study has so far investigated by which of the two aforementioned routes intrinsic barriers evolve. Here, we show that sorting of pre-existing parental incompatibilities contributes to intrinsic isolation of a hybrid animal taxon. Using a genomic cline framework, we demonstrate that the sex-linked and mitonuclear incompatibilities isolating the homoploid hybrid Italian sparrow at its two geographically separated hybrid-parent boundaries represent a subset of those contributing to reproductive isolation between its parent species, house and Spanish sparrows. Should such a sorting mechanism prove to be pervasive, the circumstances promoting homoploid hybrid speciation may be broader than currently thought, and indeed, there may be many cryptic hybrid taxa separated from their parent species by sorted, inherited incompatibilities.
Subject(s)
Genetic Speciation , Genetic Hybridization , Reproductive Isolation , Sparrows/genetics , Sympatry , Animals , Bayes Theorem , Female , Italy , Male , Single Nucleotide Polymorphism , DNA Sequence Analysis , Spain
ABSTRACT
Horizontal gene transfer is common in cyanobacteria, and transfer of large gene clusters may lead to acquisition of new functions and conceivably niche adaptation. In the present study, we demonstrate that horizontal gene transfer between closely related Planktothrix strains can explain the production of the same oligopeptide isoforms by strains of different colors. Comparison of the genomes of eight Planktothrix strains revealed that strains producing the same oligopeptide isoforms are closely related, regardless of color. We have investigated genes involved in the synthesis of the photosynthetic pigments phycocyanin and phycoerythrin, which are responsible for green and red appearance, respectively. Sequence comparisons suggest the transfer of a functional phycoerythrin gene cluster generating a red phenotype in a strain that is otherwise more closely related to green strains. Our data show that the insertion of a DNA fragment containing the 19.7-kb phycoerythrin gene cluster has been facilitated by homologous recombination, also replacing a region of the phycocyanin operon. These findings demonstrate that large DNA fragments spanning entire functional gene clusters can be effectively transferred between closely related cyanobacterial strains and result in a changed phenotype. Further, the results shed new light on the discussion of the role of horizontal gene transfer in the sporadic distribution of large gene clusters in cyanobacteria, as well as the appearance of red and green strains.
Subject(s)
Cyanobacteria/genetics , Horizontal Gene Transfer/genetics , Multigene Family/genetics , Phenotype , Phycoerythrin/genetics , Base Sequence , Cluster Analysis , Color , Homologous Recombination/genetics , Lakes/microbiology , Likelihood Functions , Genetic Models , Molecular Sequence Annotation , Molecular Sequence Data , Norway , Phylogeny , Sequence Alignment , DNA Sequence Analysis , Species Specificity
ABSTRACT
Repetitive DNA makes up a considerable fraction of most eukaryotic genomes. In fish, transposable element (TE) activity has coincided with rapid species diversification. Here, we annotated the repetitive content in 100 genome assemblies, covering the major branches of the diverse lineage of teleost fish. We investigated if TE content correlates with family level net diversification rates and found support for a weak negative correlation. Further, we demonstrated that TE proportion correlates with genome size, but not with the proportion of short tandem repeats (STRs), which implies independent evolutionary paths. Marine and freshwater fish had large differences in STR content, with the most extreme propagation detected in the genomes of codfish species and Atlantic herring. Such a high density of STRs is likely to increase the mutational load, which we propose could be counterbalanced by high fecundity as seen in codfishes and herring.
ABSTRACT
BACKGROUND: Interstitial Cystitis (IC) is a chronic inflammatory condition of the bladder with unknown etiology. The aim of this study was to characterize the microbial community present in the urine from IC female patients by 454 high throughput sequencing of the 16S variable regions V1V2 and V6. The taxonomical composition, richness and diversity of the IC microbiota were determined and compared to the microbial profile of asymptomatic healthy female (HF) urine. RESULTS: The composition and distribution of bacterial sequences differed between the urine microbiota of IC patients and HFs. Reduced sequence richness and diversity were found in IC patient urine, and a significant difference in the community structure of IC urine in relation to HF urine was observed. More than 90% of the IC sequence reads were identified as belonging to the bacterial genus Lactobacillus, a marked increase compared to 60% in HF urine. CONCLUSION: The 16S rDNA sequence data demonstrates a shift in the composition of the bacterial community in IC urine. The reduced microbial diversity and richness is accompanied by a higher abundance of the bacterial genus Lactobacillus, compared to HF urine. This study demonstrates that high throughput sequencing analysis of urine microbiota in IC patients is a powerful tool towards a better understanding of this enigmatic disease.
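The abstract compares "richness and diversity" between sample groups without naming the indices used. As a hedged illustration only, the sketch below treats richness as the number of taxa observed and uses the Shannon index as one common diversity measure; this choice is an assumption, and the taxon labels and read counts are invented toy values, not data from the study.

```python
import math
from typing import Dict


def richness(counts: Dict[str, int]) -> int:
    """Number of taxa with at least one read."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts: Dict[str, int]) -> float:
    """Shannon diversity H = -sum(p * ln p) over observed taxa."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Toy read counts (invented): one community dominated by a single taxon,
# one with reads spread over more taxa.
dominated = {"taxon_A": 92, "taxon_B": 5, "taxon_C": 3}
spread = {"taxon_A": 60, "taxon_B": 15, "taxon_C": 10, "taxon_D": 10, "taxon_E": 5}

for name, sample in (("dominated", dominated), ("spread", spread)):
    print(name, richness(sample), round(shannon(sample), 3))
```

A community dominated by one taxon scores lower on both measures than one whose reads are spread across more taxa, which is the kind of shift the abstract describes.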
Subject(s)
Biodiversity , Interstitial Cystitis/microbiology , Metagenome , Urine/microbiology , Adult , Aged , Cluster Analysis , Bacterial DNA/chemistry , Bacterial DNA/genetics , Ribosomal DNA/chemistry , Ribosomal DNA/genetics , Female , High-Throughput Nucleotide Sequencing , Humans , Middle Aged , Phylogeny , 16S Ribosomal RNA/genetics
ABSTRACT
The genome of Enterococcus faecalis 62, a commensal isolate from a healthy Norwegian infant, revealed multiple adaptive traits to the gastrointestinal tract (GIT) environment and the milk-containing diet of breast-fed infants. Adaptation to a commensal existence was emphasized by lactose and other carbohydrate metabolism genes within genomic islands, accompanied by the absence of virulence traits.
Subject(s)
Enterococcus faecalis/classification , Enterococcus faecalis/genetics , Bacterial Genome , Humans , Infant , Molecular Sequence Data , Norway
ABSTRACT
BACKGROUND: Urine within the urinary tract is commonly regarded as "sterile" in cultivation terms. Here, we present a comprehensive in-depth study of bacterial 16S rDNA sequences associated with urine from healthy females by means of culture-independent high-throughput sequencing techniques. RESULTS: Sequencing of the V1V2 and V6 regions of the 16S ribosomal RNA gene using the 454 GS FLX system was performed to characterize the possible bacterial composition in 8 culture-negative (<100,000 CFU/ml) healthy female urine specimens. Sequences were compared to 16S rRNA databases and showed significant diversity, with the predominant genera detected being Lactobacillus, Prevotella and Gardnerella. The bacterial profiles in the female urine samples studied were complex; considerable variation between individuals was observed and a common microbial signature was not evident. Notably, a significant amount of sequences belonging to bacteria with a known pathogenic potential was observed. The number of operational taxonomic units (OTUs) for individual samples varied substantially and was in the range of 20-500. CONCLUSIONS: Normal female urine displays a noticeable and variable bacterial 16S rDNA sequence richness, which includes fastidious and anaerobic bacteria previously shown to be associated with female urogenital pathology.
Subject(s)
Bacteria/classification , Genetic Variation , Metagenome , Urine/microbiology , Adult , Bacteria/genetics , Bacterial DNA/genetics , Nucleic Acid Databases , Female , High-Throughput Nucleotide Sequencing , Humans , 16S Ribosomal RNA/genetics , DNA Sequence Analysis
ABSTRACT
BACKGROUND: The vertebrate globin genes encoding the α- and β-subunits of the tetrameric hemoglobins are clustered at two unlinked loci. The highly conserved linear order of the genes flanking the hemoglobins provides a strong anchor for inferring common ancestry of the globin clusters. In fish, the number of α-β-linked globin genes varies considerably between different sublineages and seems to be related to prevailing physico-chemical conditions. Draft sequences of the Atlantic cod genome enabled us to determine the genomic organization of the globin repertoire in this marine species that copes with fluctuating environments of the temperate and Arctic regions. RESULTS: The Atlantic cod genome was shown to contain 14 globin genes, including nine hemoglobin genes organized in two unlinked clusters designated β5-α1-β1-α4 and β3-β4-α2-α3-β2. The diverged cod hemoglobin genes displayed different expression levels in adult fish, and tetrameric hemoglobins with or without a Root effect were predicted. The novel finding of maternally inherited hemoglobin mRNAs is consistent with a potential role played by fish hemoglobins in the non-specific immune response. In silico analysis of the six teleost genomes available showed that the two α-β globin clusters are flanked by paralogs of five duplicated genes, in agreement with the proposed teleost-specific duplication of the ancestral vertebrate globin cluster. Screening the genome of extant urochordate and cephalochordate species for conserved globin-flanking genes revealed linkage of RHBDF1, MPG and ARHGAP17 to globin genes in the tunicate Ciona intestinalis, while these genes together with LCMT are closely positioned in amphioxus (Branchiostoma floridae), but seem to be unlinked to the multiple globin genes identified in this species. CONCLUSION: The plasticity of Atlantic cod to variable environmental conditions probably involves the expression of multiple globins with potentially different properties.
The interspecific difference in number of fish hemoglobin genes contrasts with the highly conserved synteny of the flanking genes. The proximity of globin-flanking genes in the tunicate and amphioxus genomes resembles the RHBDF1-MPG-α-globin-ARHGAP17-LCMT linked genes in man and chicken. We hypothesize that the fusion of the three chordate linkage groups 3, 15 and 17 more than 800 MYA led to the ancestral vertebrate globin cluster during a geological period of increased atmospheric oxygen content.
Subject(s)
Gadus morhua/genetics , Globins/genetics , Vertebrates/genetics , Amino Acid Sequence , Animals , Globins/chemistry , Molecular Sequence Data , Polymerase Chain Reaction , Amino Acid Sequence Homology , alpha-Globins/chemistry , alpha-Globins/genetics , beta-Globins/chemistry , beta-Globins/genetics
ABSTRACT
BACKGROUND: Cardiomyopathy syndrome (CMS) is a severe disease affecting large farmed Atlantic salmon. Mortality often appears without prior clinical signs, typically shortly prior to slaughter. We recently reported the finding and the complete genomic sequence of a novel piscine reovirus (PRV), which is associated with another cardiac disease in Atlantic salmon; heart and skeletal muscle inflammation (HSMI). In the present work we have studied whether PRV or other infectious agents may be involved in the etiology of CMS. RESULTS: Using high throughput sequencing on heart samples from natural outbreaks of CMS and from fish experimentally challenged with material from fish diagnosed with CMS a high number of sequence reads identical to the PRV genome were identified. In addition, a sequence contig from a novel totivirus could also be constructed. Using RT-qPCR, levels of PRV in tissue samples were quantified and the totivirus was detected in all samples tested from CMS fish but not in controls. In situ hybridization supported this pattern indicating a possible association between CMS and the novel piscine totivirus. CONCLUSIONS: Although causality for CMS in Atlantic salmon could not be proven for either of the two viruses, our results are compatible with a hypothesis where, in the experimental challenge studied, PRV behaves as an opportunist whereas the totivirus might be more directly linked with the development of CMS.
Subject(s)
Cardiomyopathies/veterinary , Fish Diseases/virology , Reoviridae Infections/veterinary , Reoviridae/isolation & purification , Salmo salar/virology , Totivirus/isolation & purification , Animals , Cardiomyopathies/virology , Heart/virology , Histocytochemistry , In Situ Hybridization , Microscopy , Molecular Sequence Data , Myocardium/pathology , Viral RNA/genetics , Reoviridae/classification , Reoviridae/genetics , Reoviridae Infections/virology , DNA Sequence Analysis , Totivirus/classification , Totivirus/genetics
ABSTRACT
BACKGROUND: Cyanobacteria often produce several different oligopeptides, with unknown biological functions, by nonribosomal peptide synthetases (NRPS). Although some cyanobacterial NRPS gene cluster types are well described, the entire NRPS genomic content within a single cyanobacterial strain has never been investigated. Here we have combined a genome-wide analysis using massive parallel pyrosequencing ("454") and mass spectrometry screening of oligopeptides produced in the strain Planktothrix rubescens NIVA CYA 98 in order to identify all putative gene clusters for oligopeptides. RESULTS: Thirteen types of oligopeptides were uncovered by mass spectrometry (MS) analyses. Microcystin, cyanopeptolin and aeruginosin synthetases, highly similar to already characterized NRPS, were present in the genome. Two novel NRPS gene clusters were associated with production of anabaenopeptins and microginins, respectively. Sequence-depth of the genome and real-time PCR data revealed three copies of the microginin gene cluster. Since NRPS gene cluster candidates for microviridin and oscillatorin synthesis could not be found, putative (gene encoded) precursor peptide sequences to microviridin and oscillatorin were found in the genes mdnA and oscA, respectively. The genes flanking the microviridin and oscillatorin precursor genes encode putative modifying enzymes of the precursor oligopeptides. We therefore propose ribosomal pathways involving modifications and cyclisation for microviridin and oscillatorin. The microviridin, anabaenopeptin and cyanopeptolin gene clusters are situated in close proximity to each other, constituting an oligopeptide island. CONCLUSION: Altogether seven nonribosomal peptide synthetase (NRPS) gene clusters and two gene clusters putatively encoding ribosomal oligopeptide biosynthetic pathways were revealed. 
Our results demonstrate that whole genome shotgun sequencing combined with MS-directed determination of oligopeptides can successfully identify NRPS gene clusters and the corresponding oligopeptides. The analyses suggest independent evolution of all NRPS gene clusters as functional units. Our data indicate that the Planktothrix genome displays evolution of dual pathways (NRPS and ribosomal) for production of oligopeptides in order to maximize the diversity of oligopeptides with similar but functionally discrete bioactivities.
Subject(s)
Cyanobacteria/genetics , Genome-Wide Association Study , Multigene Family , Peptide Synthases/genetics , Computational Biology , Cyanobacteria/metabolism , Bacterial DNA/genetics , Molecular Evolution , Bacterial Genes , Bacterial Genome , Oligopeptides/biosynthesis , Phylogeny , DNA Sequence Analysis/methods
ABSTRACT
The Gasterosteidae fish family hosts several species that are important models for eco-evolutionary, genetic, and genomic research. In particular, a wealth of genetic and genomic data has been generated for the three-spined stickleback (Gasterosteus aculeatus), the "ecology's supermodel," whereas the genomic resources for the nine-spined stickleback (Pungitius pungitius) have remained relatively scarce. Here, we report a high-quality chromosome-level genome assembly of P. pungitius consisting of 5,303 contigs (N50 = 1.2 Mbp) with a total size of 521 Mbp. These contigs were mapped to 21 linkage groups using a high-density linkage map, yielding a final assembly with 98.5% BUSCO completeness. A total of 25,062 protein-coding genes were annotated, and about 23% of the assembly was found to consist of repetitive elements. A comprehensive analysis of repetitive elements uncovered centromere-specific tandem repeats and provided insights into the evolution of retrotransposons. A multigene phylogenetic analysis inferred a divergence time of about 26 million years ago (Ma) between nine- and three-spined sticklebacks, which is far older than the commonly assumed estimate of 13 Ma. Compared with the three-spined stickleback, we identified an additional duplication of several genes in the hemoglobin cluster. Sequencing data from populations adapted to different environments indicated potential copy number variations in hemoglobin genes. Furthermore, genome-wide synteny comparisons between three- and nine-spined sticklebacks identified chromosomal rearrangements underlying the karyotypic differences between the two species. The high-quality chromosome-scale assembly of the nine-spined stickleback genome obtained with long-read sequencing technology provides a crucial resource for comparative and population genomic investigations of stickleback fishes and teleosts.
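The contig N50 quoted above is a standard contiguity statistic: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly size. A minimal sketch follows; the function name `n50` and the contig lengths are toy values chosen for illustration, not figures from the assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total
        # assembly size defines the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable: running sum always reaches the total")


toy_contigs = [80, 70, 50, 40, 30, 20]  # toy lengths, arbitrary units
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, it says nothing about correctness; completeness measures such as the BUSCO score reported in the abstract address a different question.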
Subject(s)
Genome , Perciformes/genetics , Animals , DNA Transposable Elements , Molecular Evolution , Female , Fish Proteins/genetics , Hemoglobins/genetics , Male , Microsatellite Repeats , Molecular Sequence Annotation , Perciformes/classification , Phylogeny , Genetic Recombination
ABSTRACT
Several methods for typing of Legionella pneumophila exist, one of which is an 8-locus variable-number of tandem repeats analysis (MLVA). This method is based on separating and sizing amplified VNTR PCR products by agarose gel electrophoresis. In the present work, the existing L. pneumophila MLVA-8 assay is adapted to capillary electrophoresis. The assay was multiplexed by using multiple fluorescent labeling dyes and tested on a panel of L. pneumophila strains with known genotypes. The results from the capillary electrophoresis-based assay are shown to be equivalent to, and in a few cases more sensitive than, the gel-based genotyping assay. The assay presented here allows for a swift, automated and precise typing of L. pneumophila from patient or environmental samples and represents an improvement over the current gel-based method.
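In VNTR-based typing, the sized PCR product at each locus is usually converted to a repeat copy number by subtracting the non-repeat flanking sequence and dividing by the repeat unit length. The sketch below illustrates that arithmetic under stated assumptions: the function name `repeat_copies`, the locus names, and the flank and unit lengths are hypothetical and are not taken from the published MLVA-8 scheme.

```python
from typing import Dict, Tuple


def repeat_copies(fragment_bp: float, flank_bp: int, unit_bp: int) -> int:
    """Estimate repeat copy number from a sized PCR fragment."""
    if unit_bp <= 0:
        raise ValueError("repeat unit length must be positive")
    copies = (fragment_bp - flank_bp) / unit_bp
    if copies < 0:
        raise ValueError("fragment is shorter than the flanking sequence")
    # Sizing is never exact, so round to the nearest whole repeat.
    return round(copies)


# Hypothetical locus definitions: (flanking length, repeat unit length) in bp.
loci: Dict[str, Tuple[int, int]] = {"locus_1": (100, 45), "locus_2": (150, 24)}

# Hypothetical fragment sizes, e.g. as read from an electropherogram.
sizes = {"locus_1": 412.3, "locus_2": 269.1}

profile = {name: repeat_copies(sizes[name], flank, unit)
           for name, (flank, unit) in loci.items()}
print(profile)
```

The tighter size resolution of capillary electrophoresis matters most for short repeat units, where a sizing error of a few base pairs on a gel could shift the rounded copy number.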
Subject(s)
Bacterial Typing Techniques/methods , Capillary Electrophoresis/methods , Legionella pneumophila/classification , Legionella pneumophila/genetics , Minisatellite Repeats , Molecular Epidemiology/methods , Fluorescence , Genotype , Sensitivity and Specificity , Staining and Labeling/methods
ABSTRACT
Whole-genome duplication (WGD) has been a major evolutionary driver of increased genomic complexity in vertebrates. One such event occurred in the salmonid family ~80 Ma (Ss4R) giving rise to a plethora of structural and regulatory duplicate-driven divergence, making salmonids an exemplary system to investigate the evolutionary consequences of WGD. Here, we present a draft genome assembly of European grayling (Thymallus thymallus) and use this in a comparative framework to study evolution of gene regulation following WGD. Among the Ss4R duplicates identified in European grayling and Atlantic salmon (Salmo salar), one-third reflect nonneutral tissue expression evolution, with strong purifying selection, maintained over ~50 Myr. Of these, the majority reflect conserved tissue regulation under strong selective constraints related to brain and neural-related functions, as well as higher-order protein-protein interactions. A small subset of the duplicates have evolved tissue regulatory expression divergence in a common ancestor, which have been subsequently conserved in both lineages, suggestive of adaptive divergence following WGD. These candidates for adaptive tissue expression divergence have elevated rates of protein coding- and promoter-sequence evolution and are enriched for immune- and lipid metabolism ontology terms. Lastly, lineage-specific duplicate divergence points toward underlying differences in adaptive pressures on expression regulation in the nonanadromous grayling versus the anadromous Atlantic salmon. Our findings enhance our understanding of the role of WGD in genome evolution and highlight cases of regulatory divergence of Ss4R duplicates, possibly related to a niche shift in early salmonid evolution.