1.
Semin Cell Dev Biol ; 152-153: 4-15, 2024.
Article in English | MEDLINE | ID: mdl-36526530

ABSTRACT

The Hox gene cluster is an iconic example of evolutionary conservation between divergent animal lineages, providing evidence for ancient similarities in the genetic control of embryonic development. However, there are differences between taxa in gene order, gene number and genomic organisation implying conservation is not absolute. There are also examples of radical functional change of Hox genes; for example, the ftz, zen and bcd genes in insects play roles in segmentation, extraembryonic membrane formation and body polarity, rather than specification of anteroposterior position. There have been detailed descriptions of Hox genes and Hox gene clusters in several insect species, including important model systems, but a large-scale overview has been lacking. Here we extend these studies using the publicly-available complete genome sequences of 243 insect species from 13 orders. We show that the insect Hox cluster is characterised by large intergenic distances, consistently extreme in Odonata, Orthoptera, Hemiptera and Trichoptera, and always larger between the 'posterior' Hox genes. We find duplications of ftz and zen in many species and multiple independent cluster breaks, although certain modules of neighbouring genes are rarely broken apart suggesting some organisational constraints. As more high-quality genomes are obtained, a challenge will be to relate structural genomic changes to phenotypic change across insect phylogeny.

2.
Genome Res ; 33(1): 32-44, 2023 01.
Article in English | MEDLINE | ID: mdl-36617663

ABSTRACT

Homeobox genes encode transcription factors with essential roles in patterning and cell fate in developing animal embryos. Many homeobox genes, including Hox and NK genes, are arranged in gene clusters, a feature likely related to transcriptional control. Sparse taxon sampling and fragmentary genome assemblies mean that little is known about the dynamics of homeobox gene evolution across Lepidoptera or about how changes in homeobox gene number and organization relate to diversity in this large order of insects. Here we analyze an extensive data set of high-quality genomes to characterize the number and organization of all homeobox genes in 123 species of Lepidoptera from 23 taxonomic families. We find most Lepidoptera have around 100 homeobox loci, including an unusual Hox gene cluster in which the lab gene is repositioned and the ro gene is next to pb. A topologically associating domain spans much of the gene cluster, suggesting deep regulatory conservation of the Hox cluster arrangement in this insect order. Most Lepidoptera have four Shx genes, divergent zen-derived loci, but these loci underwent dramatic duplication in several lineages, with some moths having over 165 homeobox loci in the Hox gene cluster; this expansion is associated with local LINE element density. In contrast, the NK gene cluster content is more stable, although there are differences in organization compared with other insects, as well as major rearrangements within butterflies. Our analysis represents the first description of homeobox gene content across the order Lepidoptera, exemplifying the potential of newly generated genome assemblies for understanding genome and gene family evolution.


Subject(s)
Butterflies , Genes, Homeobox , Animals , Phylogeny , Multigene Family , Genomics , Evolution, Molecular
3.
Mol Biol Evol ; 40(11)2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37935057

ABSTRACT

Color vision in insects is determined by signaling cascades, central to which are opsin proteins, resulting in sensitivity to light at different wavelengths. In certain insect groups, lineage-specific evolution of opsin genes, in terms of copy number, shifts in expression patterns, and functional amino acid substitutions, has resulted in changes in color vision with subsequent behavioral and niche adaptations. Lepidoptera are a fascinating model to address whether evolutionary change in opsin content and sequence evolution are associated with changes in vision phenotype. Until recently, the lack of high-quality genome data representing broad sampling across the lepidopteran phylogeny has greatly limited our ability to accurately address this question. Here, we annotate opsin genes in 219 lepidopteran genomes representing 33 families, reconstruct their evolutionary history, and analyze shifts in selective pressures and expression between genes and species. We discover 44 duplication events in opsin genes across ∼300 million years of lepidopteran evolution. While many duplication events are species or family specific, we find retention of an ancient long-wavelength-sensitive (LW) opsin duplication derived by retrotransposition within the speciose superfamily Noctuoidea (in the families Nolidae, Erebidae, and Noctuidae). This conserved LW retrogene shows life stage-specific expression suggesting visual sensitivities or other sensory functions specific to the early larval stage. This study provides a comprehensive order-wide view of opsin evolution across Lepidoptera, showcasing high rates of opsin duplications and changes in expression patterns.


Subject(s)
Color Vision , Lepidoptera , Humans , Animals , Opsins/genetics , Gene Duplication , Lepidoptera/genetics , Evolution, Molecular , Rod Opsins/chemistry , Rod Opsins/genetics , Insecta/genetics , Phylogeny , Gene Expression
4.
J Mol Evol ; 92(2): 138-152, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38491221

ABSTRACT

The proportions of A:T and G:C nucleotide pairs are often unequal and can vary greatly between animal species and along chromosomes. The causes and consequences of this variation are incompletely understood. The recent release of high-quality genome sequences from the Darwin Tree of Life and other large-scale genome projects provides an opportunity for GC heterogeneity to be compared across a large number of insect species. Here we analyse GC content along chromosomes, and within protein-coding genes and codons, of 150 insect species from four holometabolous orders: Coleoptera, Diptera, Hymenoptera, and Lepidoptera. We find that protein-coding sequences have higher GC content than the genome average, and that Lepidoptera generally have higher GC content than the other three insect orders examined. GC content is higher in small chromosomes in most Lepidoptera species, but this pattern is less consistent in other orders. GC content also increases towards subtelomeric regions within protein-coding genes in Diptera, Coleoptera and Lepidoptera. Two species of Diptera, Bombylius major and B. discolor, have very atypical genomes with ubiquitous increase in AT content, especially at third codon positions. Despite dramatic AT-biased codon usage, we find no evidence that this has driven divergent protein evolution. We argue that the GC landscape of Lepidoptera, Diptera and Coleoptera genomes is influenced by GC-biased gene conversion, strongest in Lepidoptera, with some outlier taxa affected drastically by counteracting processes.
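
As an illustration of the quantities this abstract refers to, the following is a minimal sketch of how GC content, GC content at third codon positions (often called GC3), and GC content in windows along a sequence might be computed. It is not the authors' pipeline: the function names, the choice to ignore ambiguous bases such as N, and the use of non-overlapping windows are all assumptions made for this example.

```python
def gc_content(seq: str) -> float:
    """Fraction of G or C among unambiguous bases (A, C, G, T).

    Assumption for this sketch: ambiguous symbols such as N are ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def gc3_content(cds: str) -> float:
    """GC fraction at third codon positions of an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return gc_content(cds[2:usable:3])


def windowed_gc(seq: str, window: int) -> list[float]:
    """GC content in consecutive non-overlapping windows (hypothetical layout)."""
    return [gc_content(seq[i:i + window]) for i in range(0, len(seq), window)]
```

Comparing `gc3_content` with `gc_content` for the same coding sequence is one simple way to see whether compositional bias is concentrated at third codon positions, where many changes are synonymous.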


Subject(s)
Genome, Insect , Insecta , Animals , Base Composition , Phylogeny , Genome, Insect/genetics , Codon/genetics , Insecta/genetics , Evolution, Molecular
5.
Mol Biol Evol ; 39(5)2022 05 03.
Article in English | MEDLINE | ID: mdl-35512670

ABSTRACT

Eutherian Totipotent Cell Homeobox (ETCHbox) genes are mammalian-specific PRD-class homeobox genes with conserved expression in the preimplantation embryo but fast-evolving and highly divergent sequences. Here, we exploit an ectopic expression approach to examine the role of bovine ETCHbox genes and show that ARGFX and LEUTX homeodomain proteins upregulate genes normally expressed in the blastocyst; the identities of the regulated genes suggest that, in vivo, the ETCHbox genes play a role in coordinating the physical formation of the blastocyst structure. Both genes also downregulate genes expressed earlier during development and genes associated with an undifferentiated cell state, possibly via the JAK/STAT pathway. We find evidence that bovine ARGFX and LEUTX have overlapping functions, in contrast to their antagonistic roles in humans. Finally, we characterize a mutant bovine ARGFX allele which eliminates the homeodomain and show that homozygous mutants are viable. These data support the hypothesis of functional overlap between ETCHbox genes within a species, roles for ETCHbox genes in blastocyst formation and the change of their functions over evolutionary time.


Subject(s)
Genes, Homeobox , Janus Kinases , Animals , Blastocyst/metabolism , Cattle , Embryonic Development/genetics , Gene Expression Regulation, Developmental , Humans , Janus Kinases/genetics , Janus Kinases/metabolism , Mammals/genetics , STAT Transcription Factors/genetics , STAT Transcription Factors/metabolism , Signal Transduction
6.
Mol Biol Evol ; 37(8): 2197-2210, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32170949

ABSTRACT

Recombination increases the local GC-content in genomic regions through GC-biased gene conversion (gBGC). The recent discovery of a large genomic region with extreme GC-content in the fat sand rat Psammomys obesus provides a model to study the effects of gBGC on chromosome evolution. Here, we compare the GC-content and GC-to-AT substitution patterns across protein-coding genes of four gerbil species and two murine rodents (mouse and rat). We find that the known high-GC region is present in all the gerbils, and is characterized by high substitution rates for all mutational categories (AT-to-GC, GC-to-AT, and GC-conservative) both at synonymous and nonsynonymous sites. A higher AT-to-GC than GC-to-AT rate is consistent with the high GC-content. Additionally, we find more than 300 genes outside the known region with outlying values of AT-to-GC synonymous substitution rates in gerbils. Of these, over 30% are organized into at least 17 large clusters observable at the megabase-scale. The unusual GC-skewed substitution pattern suggests the evolution of genomic regions with very high recombination rates in the gerbil lineage, which can lead to a runaway increase in GC-content. Our results imply that rapid evolution of GC-content is possible in mammals, with gerbil species providing a powerful model to study the mechanisms of gBGC.
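
The three mutational categories named in this abstract (AT-to-GC, GC-to-AT, and GC-conservative) can be illustrated with a small counting sketch. This is only an assumption-laden toy: it takes a known ancestral sequence and an aligned derived sequence as input, whereas the study estimates substitution rates across species; the function name and category labels are hypothetical.

```python
from collections import Counter

STRONG = set("GC")  # bases forming G:C pairs
WEAK = set("AT")    # bases forming A:T pairs


def classify_substitutions(ancestral: str, derived: str) -> Counter:
    """Count differences between two aligned sequences by GC category.

    Assumption for this sketch: the ancestral state is known, and sites with
    gaps or ambiguous bases are skipped.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for a, d in zip(ancestral.upper(), derived.upper()):
        if a == d or a not in "ACGT" or d not in "ACGT":
            continue
        if a in WEAK and d in STRONG:
            counts["AT_to_GC"] += 1
        elif a in STRONG and d in WEAK:
            counts["GC_to_AT"] += 1
        else:
            # A<->T or G<->C: GC content is unchanged
            counts["GC_conservative"] += 1
    return counts
```

An excess of `AT_to_GC` over `GC_to_AT` counts is the kind of pattern the abstract describes as consistent with high GC content, although raw counts would need normalising by the number of A/T and G/C sites before being read as rates.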


Subject(s)
Base Composition , Evolution, Molecular , Gene Conversion , Genome , Gerbillinae/genetics , Animals , Multigene Family , Mutation
7.
J Mol Evol ; 89(6): 396-414, 2021 07.
Article in English | MEDLINE | ID: mdl-34097121

ABSTRACT

The majority of homeobox genes are highly conserved across animals, but the eutherian-specific ETCHbox genes, embryonically expressed and highly divergent duplicates of CRX, are a notable exception. Here we compare the ETCHbox genes of 34 mammalian species, uncovering dynamic patterns of gene loss and tandem duplication, including the presence of a large tandem array of LEUTX loci in the genome of the European rabbit (Oryctolagus cuniculus). Despite extensive gene gain and loss, all sampled species possess at least two ETCHbox genes, suggesting their collective role is indispensable. We find evidence for positive selection and show that TPRX1 and TPRX2 have been the subject of repeated gene conversion across the Boreoeutheria, homogenising their sequences and preventing divergence, especially in the homeobox region. Together, these results are consistent with a model where mammalian ETCHbox genes are dynamic in evolution due to functional overlap, yet have collective indispensable roles.


Subject(s)
Gene Conversion , Genes, Homeobox , Animals , Evolution, Molecular , Gene Duplication , Genes, Homeobox/genetics , Genome/genetics , Mammals/genetics , Phylogeny , Rabbits
8.
Nature ; 520(7548): 450-5, 2015 Apr 23.
Article in English | MEDLINE | ID: mdl-25903626

ABSTRACT

Over the past 200 years, almost every invertebrate phylum has been proposed as a starting point for evolving vertebrates. Most of these scenarios are outdated, but several are still seriously considered. The short-range transition from ancestral invertebrate chordates (similar to amphioxus and tunicates) to vertebrates is well accepted. However, longer-range transitions leading up to the invertebrate chordates themselves are more controversial. Opinion is divided between the annelid and the enteropneust scenarios, predicting, respectively, a complex or a simple ancestor for bilaterian animals. Deciding between these ideas will be facilitated by further comparative studies of multicellular animals, including enigmatic taxa such as xenacoelomorphs.


Subject(s)
Phylogeny , Vertebrates , Animals , Annelida/anatomy & histology , Annelida/classification , Invertebrates/anatomy & histology , Invertebrates/classification , Models, Biological , Research , Vertebrates/anatomy & histology , Vertebrates/classification
9.
BMC Biol ; 18(1): 68, 2020 06 16.
Article in English | MEDLINE | ID: mdl-32546156

ABSTRACT

BACKGROUND: The homeobox genes Pdx and Cdx are widespread across the animal kingdom and part of the small ParaHox gene cluster. Gene expression patterns suggest ancient roles for Pdx and Cdx in patterning the through-gut of bilaterian animals although functional data are available for few lineages. To examine evolutionary conservation of Pdx and Cdx gene functions, we focus on amphioxus, small marine animals that occupy a pivotal position in chordate evolution and in which ParaHox gene clustering was first reported. RESULTS: Using transcription activator-like effector nucleases (TALENs), we engineer frameshift mutations in the Pdx and Cdx genes of the amphioxus Branchiostoma floridae and establish mutant lines. Homozygous Pdx mutants have a defect in amphioxus endoderm, manifest as loss of a midgut region expressing endogenous GFP. The anus fails to open in homozygous Cdx mutants, which also have defects in posterior body extension and epidermal tail fin development. Treatment with an inverse agonist of retinoic acid (RA) signalling partially rescues the axial and tail fin phenotypes indicating they are caused by increased RA signalling. Gene expression analyses and luciferase assays suggest that posterior RA levels are kept low in wild type animals by a likely direct transcriptional regulation of a Cyp26 gene by Cdx. Transcriptome analysis reveals extensive gene expression changes in mutants, with a disproportionate effect of Pdx and Cdx on gut-enriched genes and a colinear-like effect of Cdx on Hox genes. CONCLUSIONS: These data reveal that amphioxus Pdx and Cdx have roles in specifying middle and posterior cell fates in the endoderm of the gut, roles that likely date to the origin of Bilateria. This conclusion is consistent with these two ParaHox genes playing a role in the origin of the bilaterian through-gut with a distinct anus, morphological innovations that contributed to ecological change in the Cambrian. 
In addition, we find that amphioxus Cdx promotes body axis extension through a molecular mechanism conserved with vertebrates. The axial extension role for Cdx dates back at least to the origin of Chordata and may have facilitated the evolution of the post-anal tail and active locomotion in chordates.


Subject(s)
Anal Canal/embryology , Gastrointestinal Tract/embryology , Homeodomain Proteins/genetics , Lancelets/embryology , Mutation , Tail/embryology , Transcription Factors/genetics , Animals , Embryo, Nonmammalian , Embryonic Development/genetics , Genes, Homeobox , Homeodomain Proteins/metabolism , Lancelets/genetics , Transcription Factors/metabolism
10.
BMC Evol Biol ; 20(1): 134, 2020 10 19.
Article in English | MEDLINE | ID: mdl-33076817

ABSTRACT

BACKGROUND: Two gerbil species, sand rat (Psammomys obesus) and Mongolian jird (Meriones unguiculatus), can become obese and show signs of metabolic dysregulation when maintained on standard laboratory diets. The genetic basis of this phenotype is unknown. Recently, genome sequencing has uncovered very unusual regions of high guanine and cytosine (GC) content scattered across the sand rat genome, most likely generated by extreme and localized biased gene conversion. A key pancreatic transcription factor PDX1 is encoded by a gene in the most extreme GC-rich region, is remarkably divergent and exhibits altered biochemical properties. Here, we ask if gerbils have proteins in addition to PDX1 that are aberrantly divergent in amino acid sequence, whether they have also become divergent due to GC-biased nucleotide changes, and whether these proteins could plausibly be connected to metabolic dysfunction exhibited by gerbils. RESULTS: We analyzed ~ 10,000 proteins with 1-to-1 orthologues in human and rodents and identified 50 proteins that accumulated unusually high levels of amino acid change in the sand rat and 41 in Mongolian jird. We show that more than half of the aberrantly divergent proteins are associated with GC biased nucleotide change and many are in previously defined high GC regions. We highlight four aberrantly divergent gerbil proteins, PDX1, INSR, MEDAG and SPP1, that may plausibly be associated with dietary metabolism. CONCLUSIONS: We show that through the course of gerbil evolution, many aberrantly divergent proteins have accumulated in the gerbil lineage, and GC-biased nucleotide substitution rather than positive selection is the likely cause of extreme divergence in more than half of these. Some proteins carry putatively deleterious changes that could be associated with metabolic and physiological phenotypes observed in some gerbil species. 
We propose that these animals provide a useful model to study the 'tug-of-war' between natural selection and the excessive accumulation of deleterious substitutions through biased gene conversion.


Subject(s)
Evolution, Molecular , Gene Conversion , Gerbillinae/genetics , Animals , Humans , Mice , Phenotype , Rats
11.
Mol Biol Evol ; 36(7): 1473-1480, 2019 07 01.
Article in English | MEDLINE | ID: mdl-30968125

ABSTRACT

Several processes can lead to strong GC skew in localized genomic regions. In most cases, GC skew should not affect conserved amino acids because natural selection will purge deleterious alleles. However, in the gerbil subfamily of rodents, several conserved genes have undergone radical alteration in association with strong GC skew. An extreme example concerns the highly conserved homeobox gene Pdx1, which is uniquely divergent and GC rich in the sand rat Psammomys obesus and close relatives. Here, we investigate the antagonistic interplay between very rare amino acid changes driven by GC skew and the force of natural selection. Using ectopic protein expression in cell culture, pulse-chase labeling, in vitro mutagenesis, and drug treatment, we compare properties of mouse and sand rat Pdx1 proteins. We find that amino acid change driven by GC skew resulted in altered protein stability, with a significantly longer protein half-life for sand rat Pdx1. Using a reversible inhibitor of the 26S proteasome, MG132, we find that sand rat and mouse Pdx1 are both degraded through the ubiquitin proteasome pathway. However, in vitro mutagenesis reveals this pathway operates through different amino acid residues. We propose that GC skew caused loss of a key ubiquitination site, conserved through vertebrate evolution, and that sand rat Pdx1 evolved or fixed a new ubiquitination site to compensate. Our results give molecular insight into the power of natural selection in the face of maladaptive changes driven by strong GC skew.


Subject(s)
Evolution, Molecular , Genes, Homeobox , Gerbillinae/genetics , Homeodomain Proteins/metabolism , Selection, Genetic , Trans-Activators/metabolism , Amino Acid Substitution , Animals , Base Composition , Gerbillinae/metabolism , Homeodomain Proteins/genetics , Proteasome Endopeptidase Complex/metabolism , Protein Stability , Trans-Activators/genetics , Ubiquitination
12.
Proc Natl Acad Sci U S A ; 114(29): 7677-7682, 2017 07 18.
Article in English | MEDLINE | ID: mdl-28674003

ABSTRACT

The sand rat Psammomys obesus is a gerbil species native to deserts of North Africa and the Middle East, and is constrained in its ecology because high carbohydrate diets induce obesity and type II diabetes that, in extreme cases, can lead to pancreatic failure and death. We report the sequencing of the sand rat genome and discovery of an unusual, extensive, and mutationally biased GC-rich genomic domain. This highly divergent genomic region encompasses several functionally essential genes, and spans the ParaHox cluster which includes the insulin-regulating homeobox gene Pdx1. The sequence of sand rat Pdx1 has been grossly affected by GC-biased mutation, leading to the highest divergence observed for this gene across the Bilateria. In addition to genomic insights into restricted caloric intake in a desert species, the discovery of a localized chromosomal region subject to elevated mutation suggests that mutational heterogeneity within genomes could influence the course of evolution.


Subject(s)
Gerbillinae/genetics , Homeodomain Proteins/genetics , Mutation , Sequence Analysis, DNA , Trans-Activators/genetics , Transcriptional Activation , Adaptation, Biological , Animals , Chromosome Mapping , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Ecosystem , Evolution, Molecular , Genes, Homeobox , Genome , Insulin/metabolism , Male , Multigene Family , Transcriptome
13.
Proc Biol Sci ; 286(1907): 20190830, 2019 07 24.
Article in English | MEDLINE | ID: mdl-31337308

ABSTRACT

ETCHbox genes are fast-evolving homeobox genes present only in eutherian (placental) mammals which originated by duplication and divergence from a conserved homeobox gene, Cone-rod homeobox (CRX). While expression and function of CRX are restricted to the retina in eutherian mammals, ETCHbox gene expression is specific to preimplantation embryos. This dramatic difference could reflect the acquisition of new functions by duplicated genes or subfunctionalization of pleiotropic roles between CRX and ETCHbox genes. To resolve between these hypotheses, we compared expression, sequence and inferred function between CRX of metatherian (marsupial) mammals and ETCHbox genes of eutherians. We find the metatherian CRX homeobox gene is expressed in early embryos and in eyes, unlike eutherian CRX, and distinct amino acid substitutions were fixed in the metatherian and eutherian evolutionary lineages consistent with altered transcription factor specificity. We find that metatherian CRX is capable of regulating embryonically expressed genes in cultured cells in a comparable way to eutherian ETCHbox. The data are consistent with CRX having a dual role in eyes and embryos of metatherians, providing an early embryonic function comparable to that of eutherian ETCHbox genes; we propose that subfunctionalization of pleiotropic functions occurred after gene duplication along the placental lineage, followed by functional elaboration.


Subject(s)
Evolution, Molecular , Genes, Homeobox , Homeodomain Proteins/genetics , Mammals/genetics , Trans-Activators/genetics , Amino Acid Sequence , Amino Acid Substitution , Animals , Homeodomain Proteins/chemistry , Homeodomain Proteins/metabolism , Mammals/metabolism , Retina/metabolism , Sequence Alignment , Species Specificity , Trans-Activators/chemistry , Trans-Activators/metabolism
14.
Nature ; 496(7443): 57-63, 2013 Apr 04.
Article in English | MEDLINE | ID: mdl-23485966

ABSTRACT

Tapeworms (Cestoda) cause neglected diseases that can be fatal and are difficult to treat, owing to inefficient drugs. Here we present an analysis of tapeworm genome sequences using the human-infective species Echinococcus multilocularis, E. granulosus, Taenia solium and the laboratory model Hymenolepis microstoma as examples. The 115- to 141-megabase genomes offer insights into the evolution of parasitism. Synteny is maintained with distantly related blood flukes but we find extreme losses of genes and pathways that are ubiquitous in other animals, including 34 homeobox families and several determinants of stem cell fate. Tapeworms have specialized detoxification pathways, metabolism that is finely tuned to rely on nutrients scavenged from their hosts, and species-specific expansions of non-canonical heat shock proteins and families of known antigens. We identify new potential drug targets, including some on which existing pharmaceuticals may act. The genomes provide a rich resource to underpin the development of urgently needed treatments and control.


Subject(s)
Adaptation, Physiological/genetics , Cestoda/genetics , Genome, Helminth/genetics , Parasites/genetics , Animals , Biological Evolution , Cestoda/drug effects , Cestoda/physiology , Cestode Infections/drug therapy , Cestode Infections/metabolism , Conserved Sequence/genetics , Echinococcus granulosus/genetics , Echinococcus multilocularis/drug effects , Echinococcus multilocularis/genetics , Echinococcus multilocularis/metabolism , Genes, Helminth/genetics , Genes, Homeobox/genetics , HSP70 Heat-Shock Proteins/genetics , Humans , Hymenolepis/genetics , Metabolic Networks and Pathways/genetics , Molecular Targeted Therapy , Parasites/drug effects , Parasites/physiology , Proteome/genetics , Stem Cells/cytology , Stem Cells/metabolism , Taenia solium/genetics
15.
Nature ; 490(7418): 49-54, 2012 Oct 04.
Article in English | MEDLINE | ID: mdl-22992520

ABSTRACT

The Pacific oyster Crassostrea gigas belongs to one of the most species-rich but genomically poorly explored phyla, the Mollusca. Here we report the sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy, along with transcriptomes of development and stress response and the proteome of the shell. The oyster genome is highly polymorphic and rich in repetitive sequences, with some transposable elements still actively shaping variation. Transcriptome studies reveal an extensive set of genes responding to environmental stress. The expansion of genes coding for heat shock protein 70 and inhibitors of apoptosis is probably central to the oyster's adaptation to sessile life in the highly stressful intertidal zone. Our analyses also show that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes. The oyster genome sequence fills a void in our understanding of the Lophotrochozoa.


Subject(s)
Adaptation, Physiological/genetics , Animal Shells/growth & development , Crassostrea/genetics , Genome/genetics , Stress, Physiological/physiology , Animal Shells/chemistry , Animals , Apoptosis Regulatory Proteins/genetics , DNA Transposable Elements/genetics , Evolution, Molecular , Female , Gene Expression Regulation, Developmental/genetics , Genes, Homeobox/genetics , Genomics , HSP70 Heat-Shock Proteins/genetics , Humans , Larva/genetics , Larva/growth & development , Mass Spectrometry , Molecular Sequence Annotation , Molecular Sequence Data , Polymorphism, Genetic/genetics , Repetitive Sequences, Nucleic Acid/genetics , Sequence Analysis, DNA , Stress, Physiological/genetics , Transcriptome/genetics
16.
Proc Biol Sci ; 284(1864)2017 Oct 11.
Article in English | MEDLINE | ID: mdl-28978728

ABSTRACT

Analysis of genome sequences within a phylogenetic context can give insight into the mode and tempo of gene and protein evolution, including inference of gene ages. This can reveal whether new genes arose on particular evolutionary lineages and were recruited for new functional roles. Here, we apply MCL clustering with all-versus-all reciprocal BLASTP to identify and phylogenetically date 'Homology Groups' among vertebrate proteins. Homology Groups include new genes and highly divergent duplicate genes. Focusing on the origin of the placental mammals within the Eutheria, we identify 357 novel Homology Groups that arose on the stem lineage of Placentalia, 87 of which are deduced to play core roles in mammalian biology as judged by extensive retention in evolution. We find the human homologues of novel eutherian genes are enriched for expression in preimplantation embryo, brain, and testes, and enriched for functions in keratinization, reproductive development, and the immune system.


Subject(s)
Eutheria/genetics , Evolution, Molecular , Genome , Animals , Phylogeny
17.
J Exp Zool B Mol Dev Evol ; 328(7): 638-644, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28229564

ABSTRACT

An ancient genome duplication (TGD or 3R) occurred in teleost fish after divergence from the lineage leading to gar. This genome duplication is shared by the three extant teleost lineages: Osteoglossomorpha (bony-tongues), Elopomorpha (eels and tarpons), and Clupeocephala (a large clade including salmon, carp, medaka, zebrafish, cichlids, pufferfish, stickleback, and ∼26,000 other species). After TGD, different clupeocephalan species retained different gene duplicates; this is seen clearly in Hox gene clusters but extends to all genes. Since divergent resolution of TGD paralogs is a potential driving force for speciation, it is possible this contributed to diversification of this clade. The extent to which divergent resolution of TGD paralogs occurred within Osteoglossomorpha has not been investigated in detail, and Hox cluster organization has been reported for just two species: Pantodon buchholzi (Pantodontidae) and Scleropages formosus (Osteoglossidae). We applied survey-scale genome sequencing and de novo assembly to three further osteoglossomorph taxa: Osteoglossum bicirrhosum (Osteoglossidae), Chitala ornata (Notopteridae), and Gnathonemus petersii (Mormyridae). We find that each retained more Hox genes than clupeocephalan taxa (excluding those that underwent additional genome duplication), but fewer than eels. Several Hox genes are missing in all teleosts, including duplicates of two Hox genes present in the slow evolving pre-TGD genome of the spotted gar. We find divergent resolution through individual gene losses, and whole cluster losses have been rampant across osteoglossomorphs, despite their extant species paucity. We suggest that reciprocal gene loss following TGD was probably insufficient to drive the exceptional diversification of teleosts.


Subject(s)
Fishes/genetics , Genes, Homeobox/genetics , Genetic Variation , Multigene Family , Animals , Fishes/classification , Gene Duplication , Gene Expression Regulation , Genetic Speciation , Genome , Species Specificity
18.
PLoS Genet ; 10(10): e1004698, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25340822

ABSTRACT

Gene duplications within the conserved Hox cluster are rare in animal evolution, but in Lepidoptera an array of divergent Hox-related genes (Shx genes) has been reported between pb and zen. Here, we use genome sequencing of five lepidopteran species (Polygonia c-album, Pararge aegeria, Callimorpha dominula, Cameraria ohridella, Hepialus sylvina) plus a caddisfly outgroup (Glyphotaelius pellucidus) to trace the evolution of the lepidopteran Shx genes. We demonstrate that Shx genes originated by tandem duplication of zen early in the evolution of the large clade Ditrysia; Shx are not found in a caddisfly and a member of the basally diverging Hepialidae (swift moths). Four distinct Shx genes were generated early in ditrysian evolution, and were stably retained in all descendent Lepidoptera except the silkmoth which has additional duplications. Despite extensive sequence divergence, molecular modelling indicates that all four Shx genes have the potential to encode stable homeodomains. The four Shx genes have distinct spatiotemporal expression patterns in early development of the Speckled Wood butterfly (Pararge aegeria), with ShxC demarcating the future sites of extraembryonic tissue formation via strikingly localised maternal RNA in the oocyte. All four genes are also expressed in presumptive serosal cells, prior to the onset of zen expression. Lepidopteran Shx genes represent an unusual example of Hox cluster expansion and integration of novel genes into ancient developmental regulatory networks.


Subject(s)
Evolution, Molecular , Gene Duplication , Gene Regulatory Networks , Homeodomain Proteins/genetics , Lepidoptera/genetics , Animals , Bombyx/genetics , Butterflies/genetics , Gene Expression Regulation, Developmental , Genome , High-Throughput Nucleotide Sequencing , Multigene Family , Phylogeny
19.
BMC Biol ; 14: 45, 2016 06 13.
Article in English | MEDLINE | ID: mdl-27296695

ABSTRACT

BACKGROUND: A central goal of evolutionary biology is to link genomic change to phenotypic evolution. The origin of new transcription factors is a special case of genomic evolution since it brings opportunities for novel regulatory interactions and potentially the emergence of new biological properties. RESULTS: We demonstrate that a group of four homeobox gene families (Argfx, Leutx, Dprx, Tprx), plus a gene newly described here (Pargfx), arose by tandem gene duplication from the retinal-expressed Crx gene, followed by asymmetric sequence evolution. We show these genes arose as part of repeated gene gain and loss events on a dynamic chromosomal region in the stem lineage of placental mammals, on the forerunner of human chromosome 19. The human orthologues of these genes are expressed specifically in early embryo totipotent cells, peaking from 8-cell to morula, prior to cell fate restrictions; cow orthologues have similar expression. To examine biological roles, we used ectopic gene expression in cultured human cells followed by high-throughput RNA-seq and uncovered extensive transcriptional remodelling driven by three of the genes. Comparison to transcriptional profiles of early human embryos suggests roles in activating and repressing a set of developmentally-important genes that spike at 8-cell to morula, rather than a general role in genome activation. CONCLUSIONS: We conclude that a dynamic chromosome region spawned a set of evolutionarily new homeobox genes, the ETCHbox genes, specifically in eutherian mammals. After these genes diverged from the parental Crx gene, we argue they were recruited for roles in the preimplantation embryo including activation of genes at the 8-cell stage and repression after morula. We propose these new homeobox gene roles permitted fine-tuning of cell fate decisions necessary for specification and function of embryonic and extra-embryonic tissues utilised in mammalian development and pregnancy.


Subject(s)
Evolution, Molecular , Genes, Homeobox , Mammals/genetics , Totipotent Stem Cells/metabolism , Animals , Base Sequence , Cell Nucleus/genetics , Embryo, Mammalian/metabolism , Embryonic Development/genetics , Gene Duplication , Gene Expression Regulation, Developmental , Genome , Mammals/embryology , Protein Domains , Totipotent Stem Cells/cytology , Transcription, Genetic
20.
BMC Dev Biol ; 16(1): 40, 2016 11 03.
Article in English | MEDLINE | ID: mdl-27809766

ABSTRACT

BACKGROUND: Homeobox genes encode a diverse set of transcription factors implicated in a vast range of biological processes including, but not limited to, embryonic cell fate specification and patterning. Although numerous studies report expression of particular sets of homeobox genes, a systematic analysis of the tissue specificity of homeobox genes is lacking. RESULTS: Here we analyse publicly-available transcriptome data from human and mouse developmental stages, and adult human tissues, to identify groups of homeobox genes with similar expression patterns. We calculate expression profiles for 242 human and 278 mouse homeobox loci across a combination of 59 human and 12 mouse adult tissues, early and late developmental stages. This revealed 20 human homeobox genes with widespread expression, primarily from the TALE, CERS and ZF classes. Most homeobox genes, however, have greater tissue-specificity, allowing us to compile homeobox gene expression lists for neural tissues, immune tissues, reproductive and developmental samples, and for numerous organ systems. In mouse development, we propose four distinct phases of homeobox gene expression: oocyte to zygote; 2-cell; 4-cell to blastocyst; early to mid post-implantation. The final phase change is marked by expression of ANTP class genes. We also use these data to compare expression specificity between evolutionarily-based gene classes, revealing that ANTP, PRD, LIM and POU homeobox gene classes have highest tissue specificity while HNF, TALE, CUT and CERS are most widely expressed. CONCLUSIONS: The homeobox genes comprise a large superclass and their expression patterns are correspondingly diverse, although in a broad sense related to an evolutionarily-based classification. The ubiquitous expression of some genes suggests roles in general cellular processes; in contrast, most human homeobox genes have greater tissue specificity and we compile useful homeobox datasets for particular tissues, organs and developmental stages. The identification of a set of eutherian-specific homeobox genes peaking from human 8-cell to morula stages suggests co-option of new genes to new developmental roles in evolution.


Subject(s)
Gene Expression , Homeodomain Proteins/genetics , Animals , Cell Differentiation , Databases, Genetic , Gene Expression Regulation, Developmental , Homeodomain Proteins/metabolism , Humans , Mice , Organ Specificity , Tissue Distribution