The genomic basis of evolutionary differentiation among honey bees.

Fouks, Bertrand; Brand, Philipp; Nguyen, Hung N; Herman, Jacob; Camara, Francisco; Ence, Daniel; Hagen, Darren E; Hoff, Katharina J; Nachtweide, Stefanie; Romoth, Lars; Walden, Kimberly K O; Guigo, Roderic; Stanke, Mario; Narzisi, Giuseppe; Yandell, Mark; Robertson, Hugh M; Koeniger, Nikolaus; Chantawannakul, Panuwan; Schatz, Michael C; Worley, Kim C; Robinson, Gene E; Elsik, Christine G; Rueppell, Olav.

Genome Res ; 31(7): 1203-1215, 2021 Jul.

Article in English | MEDLINE | ID: mdl-33947700

ABSTRACT

In contrast to the western honey bee, Apis mellifera, other honey bee species have been largely neglected despite their importance and diversity. The genetic basis of the evolutionary diversification of honey bees remains largely unknown. Here, we provide a genome-wide comparison of three honey bee species, each representing one of the three subgenera of honey bees, namely the dwarf (Apis florea), giant (A. dorsata), and cavity-nesting (A. mellifera) honey bees with bumblebees as an outgroup. Our analyses resolve the phylogeny of honey bees with the dwarf honey bees diverging first. We find that evolution of increased eusocial complexity in Apis proceeds via increases in the complexity of gene regulation, which is in agreement with previous studies. However, this process seems to be related to pathways other than transcriptional control. Positive selection patterns across Apis reveal a trade-off between maintaining genome stability and generating genetic diversity, with a rapidly evolving piRNA pathway leading to genomes depleted of transposable elements, and a rapidly evolving DNA repair pathway associated with high recombination rates in all Apis species. Diversification within Apis is accompanied by positive selection in several genes whose putative functions present candidate mechanisms for lineage-specific adaptations, such as migration, immunity, and nesting behavior.

Simultaneous gene finding in multiple genomes.

König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario.

Bioinformatics ; 32(22): 3388-3395, 2016 Nov 15.

Article in English | MEDLINE | ID: mdl-27466621

ABSTRACT

MOTIVATION: As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or, if not, where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP-hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. RESULTS: The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. AVAILABILITY AND IMPLEMENTATION: The method is implemented in C++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/ CONTACT: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genome , Sequence Alignment , Animals , Drosophila melanogaster , Exons , Humans , Mice

Comparative Genome Annotation.

Nachtweide, Stefanie; Romoth, Lars; Stanke, Mario.

Methods Mol Biol ; 2802: 165-187, 2024.

Article in English | MEDLINE | ID: mdl-38819560

ABSTRACT

Newly sequenced genomes are being added to the tree of life at an unprecedented fast pace. A large proportion of such new genomes are phylogenetically close to previously sequenced and annotated genomes. In other cases, whole clades of closely related species or strains ought to be annotated simultaneously. Often, in subsequent studies, differences between the closely related species or strains are in the focus of research when the shared gene structures prevail. We here review methods for comparative structural genome annotation. The reviewed methods include classical approaches such as the alignment of protein sequences or protein profiles against the genome and comparative gene prediction methods that exploit a genome alignment to annotate either a single target genome or all input genomes simultaneously. We discuss how the methods depend on the phylogenetic placement of genomes, give advice on the choice of methods, and examine the consistency between gene structure annotations in an example. Furthermore, we provide practical advice on genome annotation in general.

Subject(s)

Genomics , Molecular Sequence Annotation , Phylogeny , Molecular Sequence Annotation/methods , Genomics/methods , Computational Biology/methods , Genome/genetics , Sequence Alignment/methods , Software

Comparative Genome Annotation.

König, Stefanie; Romoth, Lars; Stanke, Mario.

Methods Mol Biol ; 1704: 189-212, 2018.

Article in English | MEDLINE | ID: mdl-29277866

ABSTRACT

Newly sequenced genomes are being added to the tree of life at an unprecedented fast pace. Increasingly, such new genomes are phylogenetically close to previously sequenced and annotated genomes. In other cases, whole clades of closely related species or strains ought to be annotated simultaneously. Often, in subsequent studies differences between the closely related species or strains are in the focus of research when the shared gene structures prevail. We here review methods for comparative structural genome annotation. The reviewed methods include classical approaches such as the alignment of protein sequences or protein profiles against the genome and comparative gene prediction methods that exploit a genome alignment to annotate a target genome. Newer approaches such as the simultaneous annotation of multiple genomes are also reviewed. We discuss how the methods depend on the phylogenetic placement of genomes, give advice on the choice of methods, and examine the consistency between gene structure annotations in an example. Further, we provide practical advice on genome annotation in general.

Subject(s)

Computational Biology , Genome , Molecular Sequence Annotation , Animals , Chromosome Mapping , Databases, Genetic , Humans , Phylogeny , Sequence Alignment , Sequence Analysis, DNA , Software

Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci.

Lilue, Jingtao; Doran, Anthony G; Fiddes, Ian T; Abrudan, Monica; Armstrong, Joel; Bennett, Ruth; Chow, William; Collins, Joanna; Collins, Stephan; Czechanski, Anne; Danecek, Petr; Diekhans, Mark; Dolle, Dirk-Dominik; Dunn, Matt; Durbin, Richard; Earl, Dent; Ferguson-Smith, Anne; Flicek, Paul; Flint, Jonathan; Frankish, Adam; Fu, Beiyuan; Gerstein, Mark; Gilbert, James; Goodstadt, Leo; Harrow, Jennifer; Howe, Kerstin; Ibarra-Soria, Ximena; Kolmogorov, Mikhail; Lelliott, Chris J; Logan, Darren W; Loveland, Jane; Mathews, Clayton E; Mott, Richard; Muir, Paul; Nachtweide, Stefanie; Navarro, Fabio C P; Odom, Duncan T; Park, Naomi; Pelan, Sarah; Pham, Son K; Quail, Mike; Reinholdt, Laura; Romoth, Lars; Shirley, Lesley; Sisu, Cristina; Sjoberg-Herrera, Marcela; Stanke, Mario; Steward, Charles; Thomas, Mark; Threadgold, Glen.

Nat Genet ; 50(11): 1574-1583, 2018 Nov.

Article in English | MEDLINE | ID: mdl-30275530

ABSTRACT

We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development.

Subject(s)

Chromosome Mapping , Genetic Loci , Genome , Haplotypes , Mice, Inbred Strains/genetics , Animals , Animals, Laboratory , Chromosome Mapping/veterinary , Haplotypes/genetics , Mice , Mice, Inbred BALB C/genetics , Mice, Inbred C3H/genetics , Mice, Inbred C57BL/genetics , Mice, Inbred CBA/genetics , Mice, Inbred DBA/genetics , Mice, Inbred NOD/genetics , Mice, Inbred Strains/classification , Molecular Sequence Annotation , Phylogeny , Polymorphism, Single Nucleotide , Species Specificity
