ABSTRACT
The remarkable radiation of South American (SA) canids produced 10 extant species distributed across diverse habitats, including disparate forms such as the short-legged, hypercarnivorous bush dog and the long-legged, largely frugivorous maned wolf. Despite considerable research spanning nearly two centuries, many aspects of their evolutionary history remain unknown. Here, we analyzed 31 whole genomes encompassing all extant SA canid species to assess phylogenetic relationships, interspecific hybridization, historical demography, current genetic diversity, and the molecular bases of adaptations in the bush dog and maned wolf. We found that SA canids originated from a single ancestor that colonized South America 3.9 to 3.5 Mya, followed by diversification east of the Andes and then a single colonization event and radiation of Lycalopex species west of the Andes. We detected extensive historical gene flow between recently diverged lineages and observed distinct patterns of genomic diversity and demographic history in SA canids, likely induced by past climatic cycles compounded by human-induced population declines. Genome-wide scans of selection showed that disparate limb proportions in the bush dog and maned wolf may derive from mutations in genes regulating chondrocyte proliferation and enlargement. Further, frugivory in the maned wolf may have been enabled by variants in genes associated with energy intake from short-chain fatty acids. In contrast, unique genetic variants detected in the bush dog may underlie interdigital webbing and dental adaptations for hypercarnivory. Our analyses shed light on the evolution of a unique carnivoran radiation and how it was shaped by South American topography and climate change.
Subject(s)
Physiological Adaptation , Canidae , Phylogeny , Physiological Adaptation/genetics , Animals , Canidae/classification , Canidae/genetics , Demography , Genetic Variation , Genomics , South America
ABSTRACT
BACKGROUND: Coevolution between modern aphids and their primary obligate bacterial endosymbiont, Buchnera aphidicola, has been previously reported at different classification levels based on molecular phylogenetic analyses. However, the Buchnera genome remains poorly understood within the Rhus gall aphids. RESULTS: We assembled the complete genome of the endosymbiont Buchnera in 16 aphid samples, representing 13 species in all six genera of Rhus gall aphids, using the shotgun genome skimming method. We compared the newly assembled genomes with those from GenBank to comprehensively investigate patterns of coevolution between the bacteria Buchnera and their aphid hosts. Buchnera genomes were mostly collinear, and the pan-genome contained 684 genes, of which 256 formed the core genome, with some lineages having large numbers of tandem gene duplications. There has been substantial gene loss in each Buchnera lineage. We also reconstructed the phylogenies of Buchnera and of their host aphids, using 72 complete genomes of Buchnera along with the complete mitochondrial genomes and three nuclear genes of 31 corresponding host aphid accessions. The cophylogenetic test demonstrated significant coevolution between these two partner groups at individual, species, generic, and tribal levels. CONCLUSIONS: Buchnera exhibits very high levels of genomic sequence divergence but relative stability in gene order. The relationship between the symbiont Buchnera and its aphid hosts shows a significant coevolutionary pattern and supports the complexity of the obligate symbiotic relationship.
Subject(s)
Aphids , Buchnera , Bacterial Genome , Genomics , Phylogeny , Symbiosis , Aphids/microbiology , Aphids/genetics , Animals , Buchnera/genetics , Buchnera/physiology , Symbiosis/genetics , Biological Coevolution
ABSTRACT
Reliable estimation of phylogeny is central to avoiding inaccuracy in downstream macroevolutionary inferences. However, limitations exist in the implementation of concatenated and summary coalescent approaches, and Bayesian and full coalescent inference methods may not yet be feasible for computation of phylogeny using complicated models and large data sets. Here, we explored methodological (e.g., optimality criteria, character sampling, model selection) and biological (e.g., heterotachy, branch length heterogeneity) sources of systematic error that can result in biased or incorrect parameter estimates when reconstructing phylogeny by using the gadiform fishes as a model clade. Gadiformes include some of the most economically important fishes in the world (e.g., Cods, Hakes, and Rattails). Despite many attempts, a robust higher-level phylogenetic framework was lacking due to limited character and taxonomic sampling, particularly from several species-poor families that have been recalcitrant to phylogenetic placement. We compiled the first phylogenomic data set, including 14,208 loci (>2.8 M bp) from 58 species representing all recognized gadiform families, to infer a time-calibrated phylogeny for the group. Data were generated with a gene-capture approach targeting coding DNA sequences from single-copy protein-coding genes. Species-tree and concatenated maximum-likelihood (ML) analyses resolved all family-level relationships within Gadiformes. While there were a few differences between topologies produced by the DNA and the amino acid data sets, most of the historically unresolved relationships among gadiform lineages were consistently well resolved with high support in our analyses regardless of the methodological and biological approaches used. However, at deeper levels, we observed inconsistency in branch support estimates between bootstrap and gene and site concordance factors (gCF, sCF). Despite numerous short internodes, all relationships received unequivocal bootstrap support while gCF and sCF had very little support, reflecting hidden conflict across loci. Most of the gene-tree and species-tree discordance in our study is a result of short divergence times, and consequent lack of informative characters at deep levels, rather than incomplete lineage sorting. We use this phylogeny to establish a new higher-level classification of Gadiformes as a way of clarifying the evolutionary diversification of the order. We recognize 17 families in five suborders: Bregmacerotoidei, Gadoidei, Ranicipitoidei, Merluccioidei, and Macrouroidei (including two subclades). A time-calibrated analysis using 15 fossil taxa suggests that Gadiformes evolved ~79.5 Ma in the late Cretaceous, but that most extant lineages diverged after the Cretaceous-Paleogene (K-Pg) mass extinction (66 Ma). Our results reiterate the importance of examining phylogenomic analyses for evidence of systematic error that can emerge as a result of unsuitable modeling of biological factors and/or methodological issues, even when data sets are large and yield high support for phylogenetic relationships. [Branch length heterogeneity; Codfishes; commercial fish species; Cretaceous-Paleogene (K-Pg); heterotachy; systematic error; target enrichment.]
Subject(s)
Gadiformes , Animals , Bayes Theorem , Biological Evolution , Fishes/genetics , Gadiformes/genetics , Humans , Phylogeny
ABSTRACT
The sunflower family, Asteraceae, comprises 10% of all flowering plant species and displays an incredible diversity of form. Asteraceae are clearly monophyletic, yet resolving phylogenetic relationships within the family has proven difficult, hindering our ability to understand its origin and diversification. Recent molecular clock dating has suggested a Cretaceous origin, but the lack of deep sampling of many genes and representative taxa from across the family has impeded the resolution of migration routes and diversifications that led to its global distribution and tremendous diversity. Here we use genomic data from 256 terminals to estimate evolutionary relationships, timing of diversification(s), and biogeographic patterns. Our study places the origin of Asteraceae at ~83 MYA in the late Cretaceous and reveals that the family underwent a series of explosive radiations during the Eocene which were accompanied by accelerations in diversification rates. The lineages that gave rise to nearly 95% of extant species originated and began diversifying during the middle Eocene, coincident with the ensuing marked cooling during this period. Phylogenetic and biogeographic analyses support a South American origin of the family with subsequent dispersals into North America and then to Asia and Africa, later followed by multiple worldwide dispersals in many directions. The rapid mid-Eocene diversification is aligned with the biogeographic range shift to Africa where many of the modern-day tribes appear to have originated. Our robust phylogeny provides a framework for future studies aimed at understanding the role of the macroevolutionary patterns and processes that generated the enormous species diversity of Asteraceae.
Subject(s)
Asteraceae/genetics , Biological Evolution , Plant Genome/genetics , Phylogeny , Africa , Asia , Asteraceae/classification , Magnoliopsida/genetics , North America , South America
ABSTRACT
BACKGROUND: Polydnaviruses (PDVs) are mutualistic endogenous viruses inoculated by some lineages of parasitoid wasps into their hosts, where they facilitate successful wasp development. PDVs include the ichnoviruses and bracoviruses that originate from independent viral acquisitions in ichneumonid and braconid wasps respectively. PDV genomes are fully incorporated into the wasp genomes and consist of (1) genes involved in viral particle production, which derive from the viral ancestor and are not encapsidated, and (2) proviral segments harboring virulence genes, which are packaged into the viral particle. To help elucidate the mechanisms that have facilitated viral domestication in ichneumonid wasps, we analyzed the structure of the viral insertions by sequencing the whole genome of two ichnovirus-carrying wasp species, Hyposoter didymator and Campoletis sonorensis. RESULTS: Assemblies with long scaffold sizes allowed us to unravel the organization of the endogenous ichnovirus and revealed considerable dispersion of the viral loci within the wasp genomes. Proviral segments contained species-specific sets of genes and occupied distinct genomic locations in the two ichneumonid wasps. In contrast, viral machinery genes were organized in clusters showing highly conserved gene content and order, with some loci located in collinear wasp genomic regions. This genomic architecture clearly differs from the organization of PDVs in braconid wasps, in which proviral segments are clustered and viral machinery elements are more dispersed. CONCLUSIONS: The contrasting structures of the two types of ichnovirus genomic elements are consistent with their different functions: proviral segments are vehicles for virulence proteins expected to adapt according to different host defense systems, whereas the genes involved in virus particle production in the wasp are likely more stable and may reflect ancestral viral architecture. The distinct genomic architectures seen in ichnoviruses versus bracoviruses reveal different evolutionary trajectories that have led to virus domestication in the two wasp lineages.
Subject(s)
Molecular Evolution , Viral Genome , Host Microbial Interactions , Polydnaviridae/genetics , Wasps/virology , Animals , Species Specificity , Whole Genome Sequencing
ABSTRACT
BACKGROUND: The greater bamboo lemur (Prolemur simus) is a member of the family Lemuridae that is unique in its dependency on bamboo as a primary food source. This Critically Endangered species lives in small forest patches in eastern Madagascar, occupying a fraction of its historical range. Here we sequence the genome of the greater bamboo lemur for the first time, and provide genome resources for future studies of this species that can be applied across its distribution. RESULTS: Following whole genome sequencing of five individuals we identified over 152,000 polymorphic single nucleotide variants (SNVs), and evaluated geographic structuring across nearly 19,000 SNVs. We characterized a stronger signal associated with a north-south divide than across elevations for our limited samples. We also evaluated the demographic history of this species, and inferred a dramatic population crash. This species had its largest effective population size (estimated at between ~900,000 and one million individuals) approximately 60,000-90,000 years before present (ybp), during a time in which global climate change affected terrestrial mammals worldwide. We also note that the single sample from the northern portion of the extant range had its largest effective population size around 35,000 ybp. CONCLUSIONS: From our whole genome sequencing we recovered an average genomic heterozygosity of 0.0037%, comparable to other lemurs. Our demographic history reconstructions recovered a probable climate-related decline (60,000-90,000 ybp), followed by a second population decrease following human colonization, which has reduced the species to a census size of approximately 1000 individuals. The historical distribution was likely a vast portion of Madagascar, minimally estimated at 44,259 km², while the contemporary distribution is only ~1700 km². The decline in effective population size of 89-99.9% corresponded to a vast range retraction. Conservation management of this species is crucial to retain genetic diversity across the remaining isolated populations.
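The scale of the range retraction can be checked directly from the two areas reported in the abstract. The following is only a minimal arithmetic sketch; the function name `fraction_lost` is a hypothetical helper and is not taken from the study.

```python
def fraction_lost(historical, contemporary):
    """Return the share of a historical quantity that is no longer present."""
    if historical <= 0:
        raise ValueError("historical value must be positive")
    return 1 - contemporary / historical


# Areas as reported in the abstract (km^2). The historical figure is described
# there as a minimum estimate, so the computed loss is a lower bound.
historical_km2 = 44_259
contemporary_km2 = 1_700

lost = fraction_lost(historical_km2, contemporary_km2)
print(f"Range lost: {lost:.1%}")  # prints "Range lost: 96.2%"
```

Because the historical area is a minimum, the true contraction would be at least this large under the abstract's own figures.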
Subject(s)
Conservation of Natural Resources , Endangered Species , Plant Genome/genetics , Lemuridae/genetics , Animals , Mitochondrial Genome/genetics , Genomics , Lemuridae/growth & development , Single Nucleotide Polymorphism , Population Dynamics , Sequence Analysis
ABSTRACT
The Rhus gall aphids are sometimes referred to as subtribe Melaphidina (Aphididae: Eriosomatinae: Fordini) and comprise a unique group that forms galls on the primary host plants, Rhus. We examined the evolutionary relationships within the Melaphidina aphids using sequences of the complete mitochondrial genome and with samples of 11 of the 12 recognized species representing all six genera. Bayesian, maximum likelihood and parsimony analyses of the mitochondrial genome data support five well-supported clades within Melaphidina: (1) Nurudea (except N. ibofushi), (2) Schlechtendalia-Nurudea ibofushi, (3) Meitanaphis-Kaburagia, (4) Floraphis, and (5) Melaphis. Nurudea shiraii and N. yanoniella are sister to each other, but N. ibofushi is nested within Schlechtendalia. The Nurudea shiraii-N. yanoniella clade is sister to the large clade of the remaining taxa of Melaphidina aphids. The Bayesian and maximum likelihood analyses support the North American Melaphis rhois as sister to the clade of Floraphis-Kaburagia-Meitanaphis-Schlechtendalia from eastern Asia, whereas the parsimony analysis suggests Melaphis sister to Floraphis with low support (bootstrap support 38%), and the amino acid data weakly place it sister to Schlechtendalia-Nurudea ibofushi. The Melaphis position needs to be further tested with nuclear data. Meitanaphis flavogallis is sister to Kaburagia species instead of grouping with Meitanaphis elongallis. Using the Bayesian method, the North American Melaphis was estimated to have diverged from its closest Asian relatives around 64.6 (95% HPD 59.4-69.8) Ma, which is in the early Paleocene near the Cretaceous and Paleogene boundary (K/Pg boundary). At the K/Pg boundary, mass extinctions caused many types of insect-plant associations to disappear, and these extinctions may explain some of the difficulties in the phylogenetic placement of Melaphis within the analyses.
Subject(s)
Aphids/classification , Aphids/genetics , Mitochondrial Genome/genetics , Phylogeny , Rhus/parasitology , Animals , Bayes Theorem , Cell Nucleus/genetics , Eastern Asia , North America
ABSTRACT
BACKGROUND: Phylogenetic hypotheses based on complete genome data are presented for the Gammaproteobacteria family Vibrionaceae. Two taxon samplings are presented: one including all those taxa for which the genome sequences are complete in terms of arrangement (chromosomal location of fragments; 19 taxa) and one for which the genome sequences contain multiple contigs (44 taxa). Analyses are presented under the Maximum Parsimony and Maximum Likelihood optimality criteria for total evidence datasets, the two chromosomes separately, and individual analyses of locally collinear blocks. Three of the genomes included in the 44-taxon dataset, those of Vibrio gazogenes, Salinivibrio costicola, and Aliivibrio logei, have been newly sequenced and their genome sequences are documented here. RESULTS: Phylogenetic results for the 19-taxon datasets show levels of incongruence among collinear subsets of the data similar to those in a previous study of 22 taxa from the sister family Shewanellaceae, while also echoing the strong phylogenetic performance of random subsets of data shown in that study. Phylogenetic results for both the 19-taxon and 44-taxon datasets corroborate previous hypotheses about the placement of Photobacterium and Aliivibrio within Vibrionaceae and also highlight problems with how Photobacterium is delimited and indicate that it likely should be dissolved into Vibrio to produce a phylogenetic taxonomy. The 19-taxon and 44-taxon trees based on the large chromosome are congruent for the majority of taxa that are present in both datasets. Analyses of the 44-taxon sampling based on the second, small chromosome are quite different from those based on the large chromosome, which is not surprising given the dramatically divergent nature of the small chromosome and the difficulty in postulating primary homologies. CONCLUSIONS: The phylogenetic analyses presented here represent the most comprehensive genome-level phylogenetic analyses in terms of taxa and data. Based on the availability of genome data for many bacterial species on GenBank, many other bacterial groups would also be amenable to similar genome-scale phylogenetic analyses even when present in multiple contigs. The result that collinear subsets of data are incongruent with the concatenated dataset and with each other, while random data subsets show very little incongruence, echoes the result of previous work on Shewanellaceae. The 44-taxon phylogenetic analysis presented here thus represents the future of phylogenomic analyses in scope and complexity.
Subject(s)
Bacterial Genome , Phylogeny , Vibrionaceae/classification , Vibrionaceae/genetics , Bacterial DNA/chemistry , Bacterial DNA/genetics , Molecular Sequence Data , DNA Sequence Analysis
ABSTRACT
Complete genome sequences from a genus of Gammaproteobacteria, Shewanella, are used to generate a genome-wide exploration of the gene-tree species-tree dichotomy. A number of datasets were constructed and analyses were attempted. Single genes were chosen from 243 regions of collinear gene homology (128 of these 243 chosen genes are from the core Shewanella genome and 162 of 243 have the complete taxon sampling) from a previous study (Dikow, 2011) and subjected to phylogenetic analysis both individually and concatenated. In addition, three of the 243 sets of collinear genes from the core Shewanella genome were also chosen (comprising 15, 17, and 23 genes each) to be analysed in detail, this time to maximize the expectation of gene concordance. Analysis of these 55 genes in maximum parsimony (MP) and maximum likelihood (ML) produced 164 unique topologies (out of 166 resulting topologies). No genes from within collinear regions were congruent with one another, and none of these 164 topologies matches the result from concatenation. This result is particularly striking given that we chose collinear sets of genes. Analyses in MP and ML of 243 genes distributed across the genome produced 567 unique topologies (out of 571 resulting topologies for those 162 genes with complete taxon sampling). These results are discussed in light of recent works focused on incongruence. The gene as a phylogenetic unit is also discussed. It is our conclusion that molecular systematics has been reliant on the gene as a unit without a critical eye on the distinction between gene homology and character homology.
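Counting "unique topologies" among a set of gene trees, as reported above, amounts to comparing trees by their branching pattern alone. The sketch below is one possible way to do that and is not the authors' procedure: it assumes every tree carries the same taxon set (as for the genes with complete taxon sampling), represents trees as nested tuples of taxon names, and treats two trees as identical when they share the same set of non-trivial bipartitions. All function names and the toy trees are hypothetical.

```python
def leaf_names(tree):
    """Collect leaf labels from a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaf_names(child)]


def bipartitions(tree):
    """Return the non-trivial splits of a tree, ignoring where it is rooted."""
    taxa = frozenset(leaf_names(tree))
    reference = min(taxa)  # each split is stored as the side without this taxon
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        # Keep only splits with at least two taxa on each side.
        if 2 <= len(below) <= len(taxa) - 2:
            found.add(taxa - below if reference in below else below)
        return below

    walk(tree)
    return frozenset(found)


def count_unique_topologies(trees):
    """Count distinct unrooted topologies among trees on the same taxa."""
    return len({bipartitions(tree) for tree in trees})


# Toy gene trees on four taxa (illustrative only, not data from the study).
toy_trees = [
    (("A", "B"), ("C", "D")),
    ("A", ("B", ("C", "D"))),  # same unrooted topology as the first tree
    (("A", "C"), ("B", "D")),
]
print(count_unique_topologies(toy_trees), "unique of", len(toy_trees))
```

The first two toy trees differ only in rooting, so they count as one topology; the third groups the taxa differently.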
ABSTRACT
Given the sharp increase in agricultural and infrastructure development and the paucity of widespread data available to support conservation management decisions, a more rapid and accurate tool for identifying fish fauna in the world's largest freshwater ecosystem, the Amazon, is needed. Current strategies for identification of freshwater fishes require high levels of training and taxonomic expertise for morphological identification or genetic testing for species recognition at a molecular level. To overcome these challenges, we built an image masking model (U-Net) and a convolutional neural net (CNN) to classify Amazonian fish in photographs. Fish used to generate training data were collected and photographed in tributaries in seasonally flooded forests of the upper Morona River valley in Loreto, Peru in 2018 and 2019. Species identifications in the training images (n = 3068) were verified by expert ichthyologists. These images were supplemented with photographs taken of additional Amazonian fish specimens housed in the ichthyological collection of the Smithsonian's National Museum of Natural History. We generated a CNN model that identified 33 genera of fishes with a mean accuracy of 97.9%. Wider availability of accurate freshwater fish image recognition tools, such as the one described here, will enable fishermen, local communities, and citizen scientists to more effectively participate in collecting and sharing data from their territories to inform policy and management decisions that impact them directly.
ABSTRACT
Whole genome sequencing for generating SNP data is increasingly used in population genetic studies. However, obtaining genomes for massive numbers of samples is still not within the budgets of many researchers. It is thus imperative to select an appropriate reference genome and sequencing depth to ensure the accuracy of the results for a specific research question, while balancing cost and feasibility. To evaluate the effect of the choice of the reference genome and sequencing depth on downstream analyses, we used five confamilial reference genomes of variable relatedness and three levels of sequencing depth (3.5×, 7.5× and 12×) in a population genomic study on two caddisfly species: Himalopsyche digitata and H. tibetana. Using these 30 datasets (five reference genomes × three depths × two target species), we estimated population genetic indices (inbreeding coefficient, nucleotide diversity, pairwise FST, and genome-wide distribution of FST) based on variants and population structure (PCA and admixture) based on genotype likelihood estimates. The results showed that both distantly related reference genomes and lower sequencing depth lead to degradation of resolution. In addition, choosing a more closely related reference genome may significantly remedy the defects caused by low depth. Therefore, we conclude that population genetic studies would benefit from closely related reference genomes, especially as the costs of obtaining a high-quality reference genome continue to decrease. However, to determine a cost-efficient strategy for a specific population genomic study, a trade-off between reference genome relatedness and sequencing depth can be considered.
ABSTRACT
Similar to other apex predator species, populations of mainland (Neofelis nebulosa) and Sunda (Neofelis diardi) clouded leopards are declining. Understanding their patterns of genetic variation can provide critical insights on past genetic erosion and a baseline for understanding their long-term conservation needs. As a step toward this goal, we present draft genome assemblies for the two clouded leopard species to quantify their phylogenetic divergence, genome-wide diversity, and historical population trends. We estimate that the two species diverged 5.1 Mya, much earlier than previous estimates of 1.41 Mya and 2.86 Mya, suggesting they separated when Sundaland was becoming increasingly isolated from mainland Southeast Asia. The Sunda clouded leopard displays a distinct and reduced effective population size trajectory, consistent with a lower genome-wide heterozygosity and SNP density, relative to the mainland clouded leopard. Our results provide new insights into the evolutionary history and genetic health of this unique lineage of felids.
ABSTRACT
Insect silk is a versatile biomaterial. Lepidoptera and Trichoptera display some of the most diverse uses of silk, with varying strength, adhesive qualities, and elastic properties. Silk fibroin genes are long (>20 Kbp), with many repetitive motifs that make them challenging to sequence. Most research thus far has focused on conserved N- and C-terminal regions of fibroin genes because a full comparison of repetitive regions across taxa has not been possible. Using the PacBio Sequel II system and SMRT sequencing, we generated high fidelity (HiFi) long-read genomic and transcriptomic sequences for the Indianmeal moth (Plodia interpunctella) and genomic sequences for the caddisfly Eubasilissa regina. Both genomes were highly contiguous (N50 = 9.7 Mbp/32.4 Mbp, L50 = 13/11) and complete (BUSCO complete = 99.3%/95.2%), with complete and contiguous recovery of silk heavy fibroin gene sequences. We show that HiFi long-read sequencing is helpful for understanding genes with long, repetitive regions.
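For readers less familiar with the contiguity statistics quoted above: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly, and L50 is the number of sequences in that set. A minimal sketch of the standard calculation follows; the function name `n50_l50` and the toy lengths are illustrative assumptions, not values from the study.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Hypothetical toy assembly: total length 300, so the halfway point is 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)  # prints "70 2": the two longest sequences reach 150
```

A higher N50 together with a lower L50 indicates a more contiguous assembly, which is how the paired values in the abstract are meant to be read.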
ABSTRACT
BACKGROUND: The explosion in availability of whole genome data provides the opportunity to build phylogenetic hypotheses based on these data as well as the ability to learn more about the genomes themselves. The biological history of genes and genomes can be investigated based on the taxonomic history provided by the phylogeny. A phylogenetic hypothesis based on complete genome data is presented for the genus Shewanella (Gammaproteobacteria: Alteromonadales: Shewanellaceae). Nineteen taxa from Shewanella (16 species and 3 additional strains of one species) as well as three outgroup species representing the genera Aeromonas (Gammaproteobacteria: Aeromonadales: Aeromonadaceae), Alteromonas (Gammaproteobacteria: Alteromonadales: Alteromonadaceae) and Colwellia (Gammaproteobacteria: Alteromonadales: Colwelliaceae) are included for a total of 22 taxa. RESULTS: Putatively homologous regions were found across unannotated genomes and tested with a phylogenetic analysis. Two genome-wide data-sets are considered, one including only those genomic regions for which all taxa are represented, which included 3,361,015 aligned nucleotide base-pairs (bp) and a second that additionally includes those regions present in only subsets of taxa, which totaled 12,456,624 aligned bp. Alignment columns in these large data-sets were then randomly sampled to create smaller data-sets. After the phylogenetic hypothesis was generated, genome annotations were projected onto the DNA sequence alignment to compare the historical hypothesis generated by the phylogeny with the functional hypothesis posited by annotation. CONCLUSIONS: Individual phylogenetic analyses of the 243 locally co-linear genome regions all failed to recover the genome topology, but the smaller data-sets that were random samplings of the large concatenated alignments all produced the genome topology. It is shown that there is not a single orthologous copy of 16S rRNA across the taxon sampling included in this study and that the relationships among the multiple copies are consistent with 16S rRNA undergoing concerted evolution. Unannotated whole genome data can provide excellent raw material for generating hypotheses of historical homology, which can be tested with phylogenetic analysis and compared with hypotheses of gene function.
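The random subsampling of alignment columns described above can be illustrated with a short sketch. This is an assumption-laden illustration rather than the authors' pipeline: the abstract does not say whether columns were drawn with or without replacement, so the sketch assumes sampling without replacement, and the function name `sample_columns`, the `seed` parameter, and the toy alignment are all hypothetical.

```python
import random


def sample_columns(alignment, n_columns, seed=None):
    """Draw a random subset of alignment columns (without replacement).

    `alignment` maps taxon names to aligned sequences of equal length.
    The sampled columns keep their original left-to-right order.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    (length,) = lengths
    rng = random.Random(seed)
    columns = sorted(rng.sample(range(length), n_columns))
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Toy alignment (illustrative only, not data from the study).
toy_alignment = {
    "taxon_1": "ACGTACGTAC",
    "taxon_2": "ACGAACGTTC",
    "taxon_3": "ATGTACCTAC",
}
subset = sample_columns(toy_alignment, n_columns=4, seed=1)
print(subset)
```

Sampling whole columns, rather than contiguous regions, keeps every taxon aligned while spreading the sampled sites across the genome, which is the contrast the abstract draws with the locally co-linear regions.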
Subject(s)
Bacterial Genome/genetics , Phylogeny , Nucleic Acid Sequence Homology , Shewanella/classification , Shewanella/genetics , Molecular Evolution , Bacterial Genes/genetics , Genomics , Bacterial RNA/genetics , 16S Ribosomal RNA/genetics
ABSTRACT
A phylogenetic hypothesis is presented for all 95 species of the family Vibrionaceae (Bacteria: Gammaproteobacteria) based on a combined analysis of eight molecular loci (16S rRNA, gyrB, recA, rpoA, gapA, mreB, topA, atpA) for up to 9337 nucleotide characters. Members of this taxon exhibit diverse life histories, including bioluminescence, pathogenicity to humans and marine organisms, symbiosis, quorum sensing and life in extreme environments, making a hypothesis of phylogenetic history important to studies addressing these traits from an evolutionary perspective. It is proposed that this phylogenetic set of relationships replace previous phenetic hypotheses and be used to construct a phylogenetic taxonomy. Recent taxonomic proposals, including the validity of four, instead of one, families representing the 95 species and historical notions of genera within the group are compared with the presented phylogenetic hypothesis. Character support is traced through the tree and is used to address these taxonomic proposals. Photobacterium is not a monophyletic group as it is currently delimited. Aliivibrio is found within Photobacterium, suggesting a new definition for Photobacterium that includes all species of Aliivibrio. Enterovibrio, Salinivibrio and Grimontia, previously thought to be distinct from and basal to Photobacterium and Vibrio, are found nested deeply within a large Vibrio clade. © The Willi Hennig Society 2010.
ABSTRACT
The bluntnose knifefish Brachyhypopomus occidentalis is a primary freshwater fish from north-western South America and Lower Central America. Like other Gymnotiformes, it has an electric organ that generates electric discharges used for both communication and electrolocation. We assembled a high-quality reference genome sequence of B. occidentalis by combining Oxford Nanopore and 10X Genomics linked-reads technologies. We also describe its demographic history in the context of the rise of the Isthmus of Panama. The size of the assembled genome is 540.3 Mb with an N50 scaffold length of 5.4 Mb, which includes 93.8% complete, 0.7% fragmented, and 5.5% missing vertebrate/Actinopterygii Benchmarking Universal Single-Copy Orthologs. Repetitive elements account for 11.04% of the genome, and 34,347 protein-coding genes were predicted, of which 23,935 have been functionally annotated. Demographic analysis suggests a rapid effective population expansion between 3 and 5 Myr ago, corresponding to the final closure of the Isthmus of Panama (2.8-3.5 Myr ago). This event was followed by a sudden and constant population decline during the last 1 Myr, likely associated with strong shifts in both precipitation and sea level during the Pleistocene glacial-interglacial cycles. The de novo genome assembly of B. occidentalis will provide novel insights into the molecular basis of both electric signal production and detection and will be fundamental for understanding the processes that have shaped the diversity of Neotropical freshwater environments.
Subject(s)
Electric Fish , Gymnotiformes , Animals , Electric Fish/genetics , Genome , Genomics , Gymnotiformes/genetics , Nucleic Acid Repetitive Sequences
ABSTRACT
Here, we present the initial comparison of the nuclear genomes of the North American raccoon (Procyon lotor) and the kinkajou (Potos flavus) based on draft assemblies. These two species encompass almost 21 Myr of evolutionary history within Procyonidae. Because assemblies greatly impact downstream results, such as gene prediction and annotation, we tested three de novo assembly strategies (implemented in ALLPATHS-LG, MaSuRCA, and Platanus), some of which are optimized for highly heterozygous genomes. We discovered significant variation in contig and scaffold N50 and L50 statistics and genome completeness depending on the de novo assembler used. We compared the performance of these three assembly algorithms in hopes that this study will aid others looking to improve the quality of existing draft genome assemblies even without additional sequence data. We also estimate the demographic histories of raccoons and kinkajous using the Pairwise Sequentially Markovian Coalescent and discuss the variation in population sizes with respect to climatic change during the Pleistocene, as well as aspects of their ecology and taxonomy. Our goal is to achieve a better understanding of the evolutionary history of procyonids and to create robust genomic resources for future studies regarding adaptive divergence and selection.
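The contig and scaffold N50 and L50 statistics compared across assemblers can be illustrated with a minimal sketch. It assumes the common definitions: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length, and L50 is the number of sequences in that set. The function name `n50_l50` and the toy lengths are hypothetical and are not taken from the study.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths.

    Assumes the common definition: walk the sequences from longest to
    shortest and stop once the running total reaches half of the
    assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # length is the N50, count is the L50
            return length, count


# Hypothetical toy lengths (arbitrary units), not data from the study.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

A more contiguous assembly of the same genome would give a larger N50 and a smaller L50, which is why the two statistics are usually reported together when comparing assemblers.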
Subject(s)
Demography , Procyonidae/genetics , Raccoons/genetics , Whole Genome Sequencing , Algorithms , Animals , Ecology , Genomics , Male , DNA Sequence Analysis
ABSTRACT
We provide a new, annotated genome assembly of Neomicropteryx cornuta, a species of the so-called mandibulate archaic moths (Lepidoptera: Micropterigidae). These moths belong to a lineage that is thought to have split from all other Lepidoptera more than 300 Ma and are consequently vital to understanding the early evolution of superorder Amphiesmenoptera, which contains the order Lepidoptera (butterflies and moths) and its sister order Trichoptera (caddisflies). Using PacBio HiFi sequencing reads, we assembled a highly contiguous genome with a contig N50 of nearly 17 Mb. The assembled genome length of 541,115,538 bp is about half the length of the largest published Amphiesmenoptera genome (Limnephilus lunatus, Trichoptera) and double the length of the smallest (Papilio polytes, Lepidoptera). We find high recovery of universal single-copy orthologs, with 98.1% of BUSCO genes present, and provide a genome annotation of 15,643 genes aided by resolved isoforms from PacBio IsoSeq data. This high-quality genome assembly provides an important resource for studying ecological and evolutionary transitions in the early evolution of Amphiesmenoptera.
Subject(s)
Butterflies , Moths , Animals , Butterflies/genetics , Genome , Insects/genetics , Moths/genetics , DNA Sequence Analysis
ABSTRACT
Trichoptera (caddisflies) play an essential role in freshwater ecosystems; for instance, larvae process organic material from the water and are food for a variety of predators. Knowledge of the genomic diversity of caddisflies can facilitate comparative and phylogenetic studies, thereby allowing scientists to better understand the evolutionary history of caddisflies. Although Trichoptera are the most diverse aquatic insect order, they remain poorly represented in terms of genomic resources. To date, all long-read-based genomes have been sequenced from individuals in the retreat-making suborder, Annulipalpia, leaving ~275 Ma of evolution without high-quality genomic resources. Here, we report the first long-read-based de novo genome assemblies of two tube case-making Trichoptera from the suborder Integripalpia, Agrypnia vestita Walker and Hesperophylax magnus Banks. We find that these tube case-making caddisflies have genome sizes that are at least 3-fold larger than those of currently sequenced annulipalpian genomes and that this pattern is at least partly driven by major expansion of repetitive elements. In H. magnus, long interspersed nuclear elements alone exceed the entire genome size of some annulipalpian counterparts, suggesting that caddisflies have high potential as a model for understanding genome size evolution in diverse insect lineages.
Subject(s)
Genomics , Holometabola/genetics , Insects/genetics , Nucleic Acid Repetitive Sequences , Animals , Biodiversity , Fresh Water , Genome Size , Holometabola/classification , Insects/classification , Larva , Molecular Sequence Annotation , Phylogeny
ABSTRACT
Prairie dogs (genus Cynomys) are a charismatic symbol of the American West. Their large social aggregations and complex vocalizations have been the subject of scientific and popular interest for decades. A large body of literature has documented their role as keystone species of western North America's grasslands: They generate habitat for other vertebrates, increase nutrient availability for plants, and act as a food source for mammalian, squamate, and avian predators. An additional keystone role lies in their extreme susceptibility to sylvatic plague (caused by Yersinia pestis), which results in periodic population extinctions, thereby generating spatiotemporal heterogeneity in both biotic communities and ecological processes. Here, we report the first Cynomys genome for a Gunnison's prairie dog (C. gunnisoni gunnisoni) from Telluride, Colorado (USA). The genome was constructed using a hybrid assembly of PacBio and Illumina reads and assembled with MaSuRCA and PBJelly, which resulted in a scaffold N50 of 824 kb. Total genome size was 2.67 Gb, with 32.46% of the bases occurring in repeat regions. We recovered 94.9% (91% complete) of the single copy orthologs using the mammalian Benchmarking Universal Single-Copy Orthologs database and detected 49,377 gene models (332,141 coding regions). Pairwise Sequentially Markovian Coalescent showed support for long-term stable population size followed by a steady decline beginning near the end of the Pleistocene, as well as a recent population reduction. The genome will aid in studies of mammalian evolution, disease resistance, and the genomic basis of life history traits in ground squirrels.