ABSTRACT
Repetitive elements (REs) are integral to the composition, structure, and function of eukaryotic genomes, yet remain understudied in most taxonomic groups. We investigated REs across 601 insect species and report wide variation in RE dynamics across groups. Analysis of associations between REs and protein-coding genes revealed dynamic evolution at the interface between REs and coding regions across insects, including notably elevated RE-gene associations in lineages with abundant long interspersed nuclear elements (LINEs). We leveraged this large, empirical data set to quantify impacts of long-read technology on RE detection and investigate fundamental challenges to RE annotation in diverse groups. In long-read assemblies, we detected ~36% more REs than short-read assemblies, with long terminal repeats (LTRs) showing 162% increased detection, whereas DNA transposons and LINEs showed less respective technology-related bias. In most insect lineages, 25%-85% of repetitive sequences were "unclassified" following automated annotation, compared with only ~13% in Drosophila species. Although the diversity of available insect genomes has rapidly expanded, we show the rate of community contributions to RE databases has not kept pace, preventing efficient annotation and high-resolution study of REs in most groups. We highlight the tremendous opportunity and need for the biodiversity genomics field to embrace REs and suggest collective steps for making progress toward this goal.
Subject(s)
Genomics, Nucleic Acid Repetitive Sequences, Insect Genome, Terminal Repeat Sequences, DNA Transposable Elements
ABSTRACT
Arthropod silk is vital to the evolutionary success of hundreds of thousands of species. The primary proteins in silks are often encoded by long, repetitive gene sequences. Until recently, sequencing and assembling these complex gene sequences has proven intractable given their repetitive structure. Here, using high-quality long-read sequencing, we show that there is extensive variation, both in terms of length and repeat motif order, between alleles of silk genes within individual arthropods. Further, this variation exists across two deep, independent origins of silk which diverged more than 500 Mya: the insect clade containing caddisflies and butterflies, and spiders. This remarkable convergence in previously overlooked patterns of allelic variation across multiple origins of silk suggests common mechanisms for the generation and maintenance of structural protein-coding genes. Future genomic efforts to connect genotypes to phenotypes should account for such allelic variation.
Subject(s)
Butterflies, Fibroins, Spiders, Animals, Silk/chemistry, Amino Acid Sequence, Fibroins/chemistry, Alleles, Insects/genetics, Butterflies/genetics, Genetic Variation, Spiders/genetics, Insect Proteins/genetics, Phylogeny
ABSTRACT
The remarkable radiation of South American (SA) canids produced 10 extant species distributed across diverse habitats, including disparate forms such as the short-legged, hypercarnivorous bush dog and the long-legged, largely frugivorous maned wolf. Despite considerable research spanning nearly two centuries, many aspects of their evolutionary history remain unknown. Here, we analyzed 31 whole genomes encompassing all extant SA canid species to assess phylogenetic relationships, interspecific hybridization, historical demography, current genetic diversity, and the molecular bases of adaptations in the bush dog and maned wolf. We found that SA canids originated from a single ancestor that colonized South America 3.9 to 3.5 Mya, followed by diversification east of the Andes and then a single colonization event and radiation of Lycalopex species west of the Andes. We detected extensive historical gene flow between recently diverged lineages and observed distinct patterns of genomic diversity and demographic history in SA canids, likely induced by past climatic cycles compounded by human-induced population declines. Genome-wide scans of selection showed that disparate limb proportions in the bush dog and maned wolf may derive from mutations in genes regulating chondrocyte proliferation and enlargement. Further, frugivory in the maned wolf may have been enabled by variants in genes associated with energy intake from short-chain fatty acids. In contrast, unique genetic variants detected in the bush dog may underlie interdigital webbing and dental adaptations for hypercarnivory. Our analyses shed light on the evolution of a unique carnivoran radiation and how it was shaped by South American topography and climate change.
Subject(s)
Physiological Adaptation, Canidae, Phylogeny, Physiological Adaptation/genetics, Animals, Canidae/classification, Canidae/genetics, Demography, Genetic Variation, Genomics, South America
ABSTRACT
Caddisflies (Trichoptera) are among the most diverse groups of freshwater animals with more than 16 000 described species. They play a fundamental role in freshwater ecology and environmental engineering in streams, rivers and lakes. Because of this, they are frequently used as indicator organisms in biomonitoring programmes. Despite their importance, key questions concerning the evolutionary history of caddisflies, such as the timing and origin of larval case making, remain unanswered owing to the lack of a well-resolved phylogeny. Here, we estimated a phylogenetic tree using a combination of transcriptomes and targeted enrichment data for 207 species, representing 48 of 52 extant families and 174 genera. We calibrated and dated the tree with 33 carefully selected fossils. The first caddisflies originated approximately 295 million years ago in the Permian, and major suborders began to diversify in the Triassic. Furthermore, we show that portable case making evolved in three separate lineages, and shifts in diversification occurred in concert with key evolutionary innovations beyond case making.
Subject(s)
Insects, Phylogeny, Insects/classification, Insects/genetics, Insects/physiology, Fresh Water, Transcriptome, Biodiversity, Fossils, Biological Evolution, Animals
ABSTRACT
Petaluridae (Odonata: Anisoptera) is a relict dragonfly family, having diverged from its sister family in the Jurassic, of eleven species that are notable among odonates (dragonflies and damselflies) for their exclusive use of fen and bog habitats, their burrowing behavior as nymphs, large body size as adults, and extended lifespans. To date, several nodes within this family remain unresolved, limiting the study of the evolution of this peculiar family. Using an anchored hybrid enrichment dataset of over 900 loci we reconstructed the species tree of Petaluridae. To estimate the temporal origin of the genera within this family, we used a set of well-vetted fossils and a relaxed molecular clock model in a divergence time estimation analysis. We estimate that Petaluridae originated in the early Cretaceous and confirm the existence of monophyletic Gondwanan and Laurasian clades within the family. Our relaxed molecular clock analysis estimated that these clades diverged from their MRCA approximately 160 mya. Extant lineages within this family were identified to have persisted from 6 (Uropetala) to 120 million years (Phenes). Our biogeographical analyses focusing on a set of key regions suggest that divergence within Petaluridae is largely correlated with continental drift, the exposure of land bridges, and the development of mountain ranges. Our results support the hypothesis that species within Petaluridae have persisted for tens of millions of years, with little fossil evidence to suggest widespread extinction in the family, despite optimal conditions for the fossilization of nymphs. Petaluridae appear to be a rare example of habitat specialists that have persisted for tens of millions of years.
Subject(s)
Fossils, Odonata, Phylogeny, Animals, Odonata/genetics, Odonata/classification, Biological Extinction, Genetic Models, Bayes Theorem, DNA Sequence Analysis, Molecular Evolution
ABSTRACT
Pteronarcys californica (Newport 1848) is commonly referred to as the giant salmonfly and is the largest species of stonefly (Insecta: Plecoptera) in the western United States. Historically, it was widespread and abundant in western rivers, but populations have experienced a substantial decline in the past few decades, becoming locally extirpated in numerous rivers in Utah, Colorado, and Montana. Although previous research has explored the ecological variables conducive to the survivability of populations of the giant salmonfly, a lack of genomic resources hampers exploration of how genetic variation is spread across extant populations. To accelerate research on this imperiled species, we present a de novo chromosomal-length genome assembly of P. californica generated from PacBio HiFi sequencing and Hi-C chromosome conformation capture. Our assembly includes 14 predicted pseudo chromosomes and 98.8% of Insecta universal core orthologs. At 2.40 gigabases, the P. californica assembly is the largest of available stonefly assemblies, highlighting at least 9.5-fold variation in assembly size across the order. Repetitive elements (REs) account for much of the genome size increase in P. californica relative to other stonefly species, with the content of Class I retroelements alone exceeding the entire assembly size of all but two other species studied. We also observed preliminary suborder-specific trends in genome size that merit testing with more robust taxon sampling.
ABSTRACT
The penstemons are ornamental annual flowering plants native to the Intermountain West and Rocky Mountains and commonly used for urban landscaping. Elite commercial penstemons are generally susceptible to abiotic stresses, including drought, root rot, cold, and high salinity. Firecracker penstemon (Penstemon eatonii), however, is much more tolerant to these stresses than most elite cultivars. Importantly, firecracker penstemon has been reported to hybridize with many other penstemons and therefore provides the opportunity to develop more tolerant elite cultivars through strategic crossing. To facilitate the study and utilization of firecracker penstemon, we sequenced and annotated the genome of a P. eatonii accession collected from Utah, USA. We also performed low-coverage, whole-genome sequencing of 26 additional accessions from three different varieties of P. eatonii. This chromosome-scale genome assembly is the most contiguous and complete Penstemon genome sequenced to date.
ABSTRACT
In less than 25 y, the field of animal genome science has transformed from a discipline seeking its first glimpses into genome sequences across the Tree of Life to a global enterprise with ambitions to sequence genomes for all of Earth's eukaryotic diversity [H. A. Lewin et al., Proc. Natl. Acad. Sci. U.S.A. 115, 4325-4333 (2018)]. As the field rapidly moves forward, it is important to take stock of the progress that has been made to best inform the discipline's future. In this Perspective, we provide a contemporary, quantitative overview of animal genome sequencing. We identified the best available genome assemblies in GenBank, the world's most extensive genetic database, for 3,278 unique animal species across 24 phyla. We assessed taxonomic representation, assembly quality, and annotation status for major clades. We show that while tremendous taxonomic progress has occurred, stark disparities in genomic representation exist, highlighted by a systemic overrepresentation of vertebrates and underrepresentation of arthropods. In terms of assembly quality, long-read sequencing has dramatically improved contiguity, whereas gene annotations are available for just 34.3% of taxa. Furthermore, we show that animal genome science has diversified in recent years with an ever-expanding pool of researchers participating. However, the field still appears to be dominated by institutions in the Global North, which have been listed as the submitting institution for 77% of all assemblies. We conclude by offering recommendations for improving genomic resource availability and research value while also broadening global representation.
Subject(s)
Arthropods/genetics, Genetic Databases, Genome/genetics, Genomics, Vertebrates/genetics, Animals, Chordata/genetics, High-Throughput Nucleotide Sequencing, Invertebrates/genetics, DNA Sequence Analysis
ABSTRACT
BACKGROUND: Generating the most contiguous, accurate genome assemblies given available sequencing technologies is a long-standing challenge in genome science. With the rise of long-read sequencing, assembly challenges have shifted from merely increasing contiguity to correctly assembling complex, repetitive regions of interest, ideally in a phased manner. At present, researchers largely choose between two types of long read data: longer, but less accurate sequences, or highly accurate, but shorter reads (i.e., >Q20 or 99% accurate). To better understand how these types of long-read data as well as scale of data (i.e., mean length and sequencing depth) influence genome assembly outcomes, we compared genome assemblies for a caddisfly, Hesperophylax magnus, generated with longer, but less accurate, Oxford Nanopore (ONT) R9.4.1 and highly accurate PacBio HiFi (HiFi) data. Next, we expanded this comparison to consider the influence of highly accurate long-read sequence data on genome assemblies across 6750 plant and animal genomes. For this broader comparison, we used HiFi data as a surrogate for highly accurate long-reads broadly as we could identify when they were used from GenBank metadata. RESULTS: HiFi reads outperformed ONT reads in all assembly metrics tested for the caddisfly data set and allowed for accurate assembly of the repetitive ~ 20 Kb H-fibroin gene. Across plants and animals, genome assemblies that incorporated HiFi reads were also more contiguous. For plants, the average HiFi assembly was 501% more contiguous (mean contig N50 = 20.5 Mb) than those generated with any other long-read data (mean contig N50 = 4.1 Mb). For animals, HiFi assemblies were 226% more contiguous (mean contig N50 = 20.9 Mb) versus other long-read assemblies (mean contig N50 = 9.3 Mb). In plants, we also found limited evidence that HiFi may offer a unique solution for overcoming genomic complexity that scales with assembly size. 
CONCLUSIONS: Highly accurate long-reads generated with HiFi or analogous technologies represent a key tool for maximizing genome assembly quality for a wide swath of plants and animals. This finding is particularly important when resources only allow for one type of sequencing data to be generated. Ultimately, to realize the promise of biodiversity genomics, we call for greater uptake of highly accurate long-reads in future studies.
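The equivalence quoted in the background above (Q20 corresponding to 99% per-base accuracy) follows from the standard Phred definition, Q = -10 log10(p_error). The following is a minimal sketch of that conversion in Python; the function names are illustrative and do not come from the study.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Per-base accuracy implied by a Phred quality score Q."""
    # Phred definition: Q = -10 * log10(p_error)  =>  p_error = 10 ** (-Q / 10)
    return 1.0 - 10.0 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Phred quality score implied by a per-base accuracy in [0, 1)."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    for q in (10, 20, 30):
        print(f"Q{q}: accuracy = {phred_to_accuracy(q):.4%}")
```

Under this definition each additional 10 Phred units corresponds to a tenfold lower error rate.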
Subject(s)
Biodiversity, Genomics, High-Throughput Nucleotide Sequencing, DNA Sequence Analysis, Genomics/methods, Genomics/standards, Genomics/trends, Insects/classification, Insects/genetics, Fibroins/genetics, Contig Mapping, Insect Genome/genetics, Animals, Nucleic Acid Databases, Reproducibility of Results, Meta-Analysis as Topic, Datasets as Topic, DNA Sequence Analysis/methods, DNA Sequence Analysis/standards, High-Throughput Nucleotide Sequencing/methods, High-Throughput Nucleotide Sequencing/standards, High-Throughput Nucleotide Sequencing/trends, Plants/genetics, Plant Genome/genetics
ABSTRACT
Using recently published chromosome-length genome assemblies of two damselfly species, Ischnura elegans and Platycnemis pennipes, and two dragonfly species, Pantala flavescens and Tanypteryx hageni, we demonstrate that the autosomes of Odonata have undergone few fission, fusion, or inversion events, despite 250 million years of separation. In the four genomes discussed here, our results show that all autosomes have a clear ortholog in the ancestral karyotype. Despite this clear chromosomal orthology, we demonstrate that different factors, including concentration of repeat dynamics, GC content, relative position on the chromosome, and the relative proportion of coding sequence all influence the density of syntenic blocks across chromosomes. However, these factors do not interact to influence synteny the same way in any two pairs of species, nor is any one factor retained in all four species. Furthermore, it was previously unknown whether the micro-chromosomes in Odonata are descended from one ancestral chromosome. Despite structural rearrangements, our evidence suggests that the micro-chromosomes in the sampled Odonata do indeed descend from an ancestral chromosome, and that the micro-chromosome in P. flavescens was lost through fusion with autosomes.
Subject(s)
Odonata, Animals, Odonata/genetics, Genome, Karyotype, Karyotyping, Synteny
ABSTRACT
Butterflies and moths (Lepidoptera) are one of the major superradiations of insects, comprising nearly 160,000 described extant species. As herbivores, pollinators, and prey, Lepidoptera play a fundamental role in almost every terrestrial ecosystem. Lepidoptera are also indicators of environmental change and serve as models for research on mimicry and genetics. They have been central to the development of coevolutionary hypotheses, such as butterflies with flowering plants and moths' evolutionary arms race with echolocating bats. However, these hypotheses have not been rigorously tested, because a robust lepidopteran phylogeny and timing of evolutionary novelties are lacking. To address these issues, we inferred a comprehensive phylogeny of Lepidoptera, using the largest dataset assembled for the order (2,098 orthologous protein-coding genes from transcriptomes of 186 species, representing nearly all superfamilies), and dated it with carefully evaluated synapomorphy-based fossils. The oldest members of the Lepidoptera crown group appeared in the Late Carboniferous (~300 Ma) and fed on nonvascular land plants. Lepidoptera evolved the tube-like proboscis in the Middle Triassic (~241 Ma), which allowed them to acquire nectar from flowering plants. This morphological innovation, along with other traits, likely promoted the extraordinary diversification of superfamily-level lepidopteran crown groups. The ancestor of butterflies was likely nocturnal, and our results indicate that butterflies became day-flying in the Late Cretaceous (~98 Ma). Moth hearing organs arose multiple times before the evolutionary arms race between moths and bats, perhaps initially detecting a wide range of sound frequencies before being co-opted to specifically detect bat sonar. Our study provides an essential framework for future comparative studies on butterfly and moth evolution.
Subject(s)
Butterflies/genetics, Molecular Evolution, Moths/genetics, Phylogeny, Animals, Butterflies/classification, Butterflies/physiology, Moths/classification, Moths/physiology
ABSTRACT
Polyneoptera represents one of the major lineages of winged insects, comprising around 40,000 extant species in 10 traditional orders, including grasshoppers, roaches, and stoneflies. Many important aspects of polyneopteran evolution, such as their phylogenetic relationships, changes in their external appearance, their habitat preferences, and social behavior, are unresolved and are a major enigma in entomology. These ambiguities also have direct consequences for our understanding of the evolution of winged insects in general; for example, with respect to the ancestral habitats of adults and juveniles. We addressed these issues with a large-scale phylogenomic analysis and used the reconstructed phylogenetic relationships to trace the evolution of 112 characters associated with the external appearance and the lifestyle of winged insects. Our inferences suggest that the last common ancestors of Polyneoptera and of the winged insects were terrestrial throughout their lives, implying that wings did not evolve in an aquatic environment. The appearance of the first polyneopteran insect was mainly characterized by ancestral traits such as long segmented abdominal appendages and biting mouthparts held below the head capsule. This ancestor lived in association with the ground, which led to various specializations including hardened forewings and unique tarsal attachment structures. However, within Polyneoptera, several groups switched separately to a life on plants. In contrast to a previous hypothesis, we found that social behavior was not part of the polyneopteran ground plan. In other traits, such as the biting mouthparts, Polyneoptera shows a high degree of evolutionary conservatism unique among the major lineages of winged insects.
Subject(s)
Biological Evolution, Insects/physiology, Neoptera/physiology, Animal Wings/physiology, Animals, Insects/genetics, Neoptera/genetics, Phylogeny
ABSTRACT
The divergence of sister orders Trichoptera (caddisflies) and Lepidoptera (moths and butterflies) from a silk-spinning ancestor occurred around 290 million years ago. Trichoptera larvae are mainly aquatic, and Lepidoptera larvae are almost entirely terrestrial: distinct habitats that required molecular adaptation of their silk for deployment in water and air, respectively. The major protein components of their silks are heavy chain and light chain fibroins. In an effort to identify molecular changes in L-fibroins that may have contributed to the divergent use of silk in water and air, we used the ColabFold implementation of AlphaFold2 to predict three-dimensional structures of L-fibroins from both orders. A comparison of the structures revealed that despite the ancient divergence, profoundly different habitats, and low sequence conservation, a novel 10-helix core structure was strongly conserved in L-fibroins from both orders. Previously known intra- and intermolecular disulfide linkages were accurately predicted. Structural variations outside of the core may represent molecular changes that contributed to the evolution of insect silks adapted to water or air. The distributions of electrostatic potential, for example, were not conserved and present distinct order-specific surfaces for potential interactions with or modulation by external factors. Additionally, the interactions of L-fibroins with the H-fibroin C-termini are different for these orders; lepidopteran L-fibroins have N-terminal insertions that are not present in trichopteran L-fibroins, which form an unstructured ribbon in isolation but become part of an intermolecular β-sheet when folded with their corresponding H-fibroin C-termini. The results are an example of protein structure prediction from deep sequence data of understudied proteins made possible by AlphaFold2.
Subject(s)
Bombyx, Butterflies, Fibroins, Lepidoptera, Amino Acid Sequence, Animals, Bombyx/metabolism, Butterflies/metabolism, Disulfides/metabolism, Fibroins/chemistry, Insects/metabolism, Lepidoptera/metabolism, Silk/metabolism, Water/metabolism
ABSTRACT
Dragonflies and damselflies are a charismatic, medium-sized insect order (~6300 species) with a unique potential to approach comparative research questions. Their taxonomy and many ecological traits for a large fraction of extant species are relatively well understood. However, until now, the lack of a large-scale phylogeny based on high throughput data with the potential to connect both perspectives has precluded comparative evolutionary questions for these insects. Here, we provide an ordinal hypothesis of classification based on anchored hybrid enrichment using a total of 136 species representing 46 of the 48 families or incertae sedis, and a total of 478 target loci. Our analyses recovered the monophyly for all three suborders: Anisoptera, Anisozygoptera and Zygoptera. Although the backbone of the topology was reinforced and showed the highest support values to date, our genomic data were unable to strongly resolve portions of the topology. In addition, a quartet sampling approach highlights the potential evolutionary scenarios that may have shaped evolutionary phylogeny (e.g., incomplete lineage sorting and introgression) of this taxon. Finally, in light of our phylogenomic reconstruction and previous morphological and molecular information, we propose an updated odonate classification and define five new families (Amanipodagrionidae fam. nov., Mesagrionidae fam. nov., Mesopodagrionidae fam. nov., Priscagrionidae fam. nov., Protolestidae fam. nov.) and reinstate another two (Rhipidolestidae stat. res., Tatocnemididae stat. res.). Additionally, we feature the problematic taxonomic groupings for examination in future studies to improve our current phylogenetic hypothesis.
Subject(s)
Genomics, Odonata/classification, Odonata/genetics, Phylogeny, Animals, Female, Male
ABSTRACT
PartitionFinder 2 is a program for automatically selecting best-fit partitioning schemes and models of evolution for phylogenetic analyses. PartitionFinder 2 is substantially faster and more efficient than version 1, and incorporates many new methods and features. These include the ability to analyze morphological datasets, new methods to analyze genome-scale datasets, new output formats to facilitate interoperability with downstream software, and many new models of molecular evolution. PartitionFinder 2 is freely available under an open source license and works on Windows, OSX, and Linux operating systems. It can be downloaded from www.robertlanfear.com/partitionfinder. The source code is available at https://github.com/brettc/partitionfinder.
Subject(s)
Molecular Evolution, Genetic Models, DNA Sequence Analysis/methods, Algorithms, Biological Evolution, Computer Simulation, Genome, Phylogeny, Software
ABSTRACT
BACKGROUND: Model selection is a vital part of most phylogenetic analyses, and accounting for the heterogeneity in evolutionary patterns across sites is particularly important. Mixture models and partitioning are commonly used to account for this variation, and partitioning is the most popular approach. Most current partitioning methods require some a priori partitioning scheme to be defined, typically guided by known structural features of the sequences, such as gene boundaries or codon positions. Recent evidence suggests that these a priori boundaries often fail to adequately account for variation in rates and patterns of evolution among sites. Furthermore, new phylogenomic datasets such as those assembled from ultra-conserved elements lack obvious structural features on which to define a priori partitioning schemes. The upshot is that, for many phylogenetic datasets, partitioned models of molecular evolution may be inadequate, thus limiting the accuracy of downstream phylogenetic analyses. RESULTS: We present a new algorithm that automatically selects a partitioning scheme via the iterative division of the alignment into subsets of similar sites based on their rates of evolution. We compare this method to existing approaches using a wide range of empirical datasets, and show that it consistently leads to large increases in the fit of partitioned models of molecular evolution when measured using AICc and BIC scores. In doing so, we demonstrate that some related approaches to solving this problem may have been associated with a small but important bias. CONCLUSIONS: Our method provides an alternative to traditional approaches to partitioning, such as dividing alignments by gene and codon position. Because our method is data-driven, it can be used to estimate partitioned models for all types of alignments, including those that are not amenable to traditional approaches to partitioning.
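The AICc and BIC scores used above to compare partitioned models have standard textbook definitions, sketched below in Python. This is not the algorithm presented in the abstract, only the scoring step. The example log-likelihoods, parameter counts, and the choice of alignment length as the sample size n are hypothetical values chosen for illustration.

```python
import math


def aic(ln_l: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * ln_l


def aicc(ln_l: float, k: int, n: int) -> float:
    """AIC with the small-sample correction term 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic(ln_l, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(ln_l: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * ln_l


if __name__ == "__main__":
    # Hypothetical candidates: (log-likelihood, number of free parameters).
    n_sites = 5000  # assumption: sample size taken as the number of alignment sites
    schemes = {"by_gene": (-41250.0, 60), "by_rate": (-40800.0, 95)}
    for name, (ln_l, k) in schemes.items():
        print(name, round(aicc(ln_l, k, n_sites), 1), round(bic(ln_l, k, n_sites), 1))
```

Lower scores indicate a better trade-off between fit and parameter count. BIC penalizes each extra parameter by ln(n), so it penalizes additional subsets more heavily than AICc once n exceeds about 7.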
Subject(s)
Algorithms, Molecular Evolution, Genetic Models, Phylogeny, Cluster Analysis, Codon
ABSTRACT
We present the first long-read de novo assembly and annotation of the luna moth (Actias luna) and provide the full characterization of heavy chain fibroin (h-fibroin), a long and highly repetitive gene (>20 kb) essential in silk fiber production. There are >160,000 described species of moths and butterflies (Lepidoptera), but only within the last 5 years have we begun to recover high-quality annotated whole genomes across the order that capture h-fibroin. Using PacBio HiFi reads, we produce the first high-quality long-read reference genome for this species. The assembled genome has a length of 532 Mb, a contig N50 of 16.8 Mb, an L50 of 14 contigs, and 99.4% completeness (BUSCO). Our annotation using Bombyx mori protein and A. luna RNAseq evidence captured a total of 20,866 genes at 98.9% completeness with 10,267 functionally annotated proteins and a full-length h-fibroin annotation of 2,679 amino acid residues.
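The contiguity statistics reported above (contig N50 and L50) can be computed from a list of contig lengths. The sketch below uses the common definition: N50 is the length of the contig at which the running total of lengths, sorted longest first, reaches half the assembly size, and L50 is the number of contigs needed to get there. The function name and the toy lengths are illustrative and are not data from the assembly.

```python
from typing import Iterable, Tuple


def n50_l50(contig_lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths."""
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first contig that pushes the running total to half the
        # assembly size defines N50; its rank in the sorted list is L50.
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: the running total always reaches half")


if __name__ == "__main__":
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]  # hypothetical contig lengths
    print(n50_l50(toy_lengths))
```

A higher N50 paired with a lower L50 means that fewer, longer contigs cover half of the assembly.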
Subject(s)
Fibroins, Insect Genome, Molecular Sequence Annotation, Moths, Animals, Moths/genetics, Fibroins/genetics, Silk/genetics, Insect Proteins/genetics, Bombyx/genetics, Nucleic Acid Repetitive Sequences
ABSTRACT
Ghost moths are an unusual family of primitive moths (Lepidoptera: Hepialidae) known for their large body size and crepuscular adult activity. These moths represent an ancient lineage, frequently have soil-dwelling larvae, and are adapted to high elevations, deserts, and other extreme environments. Despite being rather speciose with more than 700 species, there is a dearth of genomic resources for the family. Here, we present the first high-quality, publicly available hepialid genome, generated from an Andean species of ghost moth, Druceiella hillmani. Our genome assembly has a length of 2,586 Mbp with contig N50 of 28.1 Mb and L50 of 29, and BUSCO completeness of 97.1%, making it one of the largest genomes in the order Lepidoptera. Our assembly is a vital resource for future research on ghost moth genomics.
Subject(s)
Insect Genome, Moths, Animals, Moths/genetics
ABSTRACT
While most species of butterflies and moths (Lepidoptera) have entirely terrestrial life histories, ~0.5% of the described species are known to have an aquatic larval stage. Larvae of aquatic Lepidoptera are similar to caddisflies (Trichoptera) in that they use silk to anchor themselves to underwater substrates or to build protective cases. However, the physical properties and genetic elements of silks in aquatic Lepidoptera remain unstudied, as most research on lepidopteran silk has focused on the commercially important silkworm, Bombyx mori. Here, we provide high-quality PacBio HiFi genome assemblies of 2 distantly related aquatic Lepidoptera species [Elophila obliteralis (Pyraloidea: Crambidae) and Hyposmocoma kahamanoa (Gelechioidea: Cosmopterigidae)]. As a step toward understanding the evolution of underwater silk in aquatic Lepidoptera, we used the genome assemblies and compared them to published genetic data of aquatic and terrestrial Lepidoptera. Sequences of the primary silk protein, h-fibroin, in aquatic moths have conserved termini and share a basic motif structure with terrestrial Lepidoptera. However, these sequences were similar to aquatic Trichoptera in that the percentage of positively and negatively charged amino acids was much higher than in terrestrial Lepidoptera, indicating a possible adaptation of silks to aquatic environments.
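The comparison above rests on the percentage of positively and negatively charged amino acids in a protein sequence. The sketch below shows one simple way such percentages could be tallied in Python. The residue sets are an assumption (lysine and arginine as positive, aspartate and glutamate as negative, histidine excluded), the toy sequence is invented, and the abstract does not state how the authors performed the calculation.

```python
from typing import Tuple

# Assumption: residues treated as charged near neutral pH.
POSITIVE = frozenset("KR")
NEGATIVE = frozenset("DE")


def charged_percentages(
    sequence: str,
    positive: frozenset = POSITIVE,
    negative: frozenset = NEGATIVE,
) -> Tuple[float, float]:
    """Return (% positively charged, % negatively charged) residues."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    n_pos = sum(1 for aa in residues if aa in positive)
    n_neg = sum(1 for aa in residues if aa in negative)
    return 100.0 * n_pos / len(residues), 100.0 * n_neg / len(residues)


if __name__ == "__main__":
    toy_sequence = "GAGSKDGAGSRE"  # hypothetical peptide, not a real fibroin
    pos_pct, neg_pct = charged_percentages(toy_sequence)
    print(f"positive: {pos_pct:.1f}%  negative: {neg_pct:.1f}%")
```

Passing a different `positive` set, for example one that includes histidine, shows how sensitive the percentages are to that choice.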
Subject(s)
Lepidoptera, Phylogeny, Silk, Animals, Silk/genetics, Lepidoptera/genetics, Genomics/methods, Amino Acids/genetics, Insect Genome, Amino Acid Sequence
ABSTRACT
Insects have evolved complex and diverse visual systems in which light-sensing protein molecules called "opsins" couple with a chromophore to form photopigments. Insect photopigments group into three major gene families based on wavelength sensitivity: long wavelength (LW), short wavelength (SW), and ultraviolet wavelength (UV). In this study, we identified 123 opsin sequences from whole-genome assemblies across 25 caddisfly species (Insecta: Trichoptera). We discovered the LW opsins have the most diversity across species and form two separate clades in the opsin gene tree. Conversely, we observed a loss of the SW opsin in half of the trichopteran species in this study, which might be associated with the fact that caddisflies are active during low-light conditions. Lastly, we found a single copy of the UV opsin in all the species in this study, with one exception: Athripsodes cinereus has two copies of the UV opsin and resides within a clade of caddisflies with colorful wing patterns.