Results 1 - 20 of 81
1.
bioRxiv ; 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38585780

ABSTRACT

The evolutionary mechanisms that drive the emergence of genome architecture remain poorly understood but can now be assessed with unprecedented power due to the massive accumulation of genome assemblies spanning phylogenetic diversity1,2. Transposable elements (TEs) are a rich source of large-effect mutations since they directly and indirectly drive genomic structural variation and changes in gene expression3. Here, we demonstrate universal patterns of TE compartmentalization across eukaryotic genomes spanning ~1.7 billion years of evolution, in which TEs colocalize with gene families under strong predicted selective pressure for dynamic evolution and involved in specific functions. For non-pathogenic species these genes represent families involved in defense, sensory perception and environmental interaction, whereas for pathogenic species, TE-compartmentalized genes are highly enriched for pathogenic functions. Many TE-compartmentalized gene families display signatures of positive selection at the molecular level. Furthermore, TE-compartmentalized genes exhibit an excess of high-frequency alleles for polymorphic TE insertions in fruit fly populations. We postulate that these patterns reflect selection for adaptive TE insertions as well as TE-associated structural variants. This process may drive the emergence of a shared TE-compartmentalized genome architecture across diverse eukaryotic lineages.

2.
bioRxiv ; 2024 Apr 13.
Article in English | MEDLINE | ID: mdl-38645127

ABSTRACT

Host-microbe systems are evolutionary niches that produce coevolved biological interactions and are a key component of global health. However, these systems have historically been a difficult field of biological research due to their experimental intractability. Impactful advances in global health will be obtained by leveraging in silico screens to identify genes involved in mediating interspecific interactions. These predictions will advance our understanding of these systems and lay the groundwork for future in vitro and in vivo experiments and bioengineering projects. A driver of host-manipulation and intracellular survival utilized by host-associated microbes is molecular mimicry, a critical mechanism that can occur at any level from DNA to protein structures. We applied protein structure prediction and alignment tools to explore host-associated bacterial structural proteomes for examples of protein structure mimicry. By leveraging the Legionella pneumophila proteome and its many known structural mimics, we developed and validated a screen that can be applied to virtually any host-microbe system to uncover signals of protein mimicry. These mimics represent candidate proteins that mediate host interactions in microbial proteomes. We successfully applied this screen to other microbes with demonstrated effects on global health, Helicobacter pylori and Wolbachia, identifying protein mimic candidates in each proteome. We discuss the roles these candidates may play in important Wolbachia-induced phenotypes and show that Wolbachia infection can partially rescue the loss of one of these factors. This work demonstrates how a genome-wide screen for candidates of host-manipulation and intracellular survival offers an opportunity to identify functionally important genes in host-microbe systems.

3.
Mol Ecol ; : e17362, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38682494

ABSTRACT

The black abalone, Haliotis cracherodii, is a large, long-lived marine mollusc that inhabits rocky intertidal habitats along the coast of California and Mexico. In 1985, populations were impacted by a bacterial disease known as withering syndrome (WS) that wiped out >90% of individuals, leading to the closure of all U.S. black abalone fisheries since 1993. Current conservation strategies include restoring diminished populations by translocating healthy individuals. However, population collapse on this scale may have dramatically lowered genetic diversity and strengthened geographic differentiation, making translocation-based recovery contentious. Additionally, the current prevalence of WS remains unknown. To address these uncertainties, we sequenced and analysed the genomes of 133 black abalone individuals from across their present range. We observed no spatial genetic structure among black abalone, with the exception of a single chromosomal inversion that increases in frequency with latitude. Outside the inversion, genetic differentiation between sites is minimal and does not scale with either geographic distance or environmental dissimilarity. Genetic diversity appears uniformly high across the range. Demographic inference does indicate a severe population bottleneck beginning just 15 generations in the past, but this decline is short lived, with present-day size far exceeding the pre-bottleneck status quo. Finally, we find the bacterial agent of WS is equally present across the sampled range, but only in 10% of individuals. The lack of population genetic structure, uniform diversity and prevalence of WS bacteria indicates that translocation could be a valid and low-risk means of population restoration for black abalone species' recovery.

4.
Nat Microbiol ; 9(2): 550-560, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38316930

ABSTRACT

Pathogen lineage nomenclature systems are a key component of effective communication and collaboration for researchers and public health workers. Since February 2021, the Pango dynamic lineage nomenclature for SARS-CoV-2 has been sustained by crowdsourced lineage proposals as new isolates were sequenced. This approach is vulnerable to time-critical delays as well as regional and personal bias. Here we developed a simple heuristic approach for dividing phylogenetic trees into lineages, including the prioritization of key mutations or genes. Our implementation is efficient on extremely large phylogenetic trees consisting of millions of sequences and produces similar results to existing manually curated lineage designations when applied to SARS-CoV-2 and other viruses including chikungunya virus, Venezuelan equine encephalitis virus complex and Zika virus. This method offers a simple, automated and consistent approach to pathogen nomenclature that can assist researchers in developing and maintaining phylogeny-based classifications in the face of ever-increasing genomic datasets.


Subject(s)
Encephalitis Virus, Venezuelan Equine , Zika Virus Infection , Zika Virus , Animals , Horses/genetics , Phylogeny , Encephalitis Virus, Venezuelan Equine/genetics , Genomics , Base Sequence , Genome, Viral , SARS-CoV-2/genetics , Zika Virus/genetics
5.
bioRxiv ; 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38352393

ABSTRACT

The black abalone, Haliotis cracherodii, is a large, long-lived marine mollusc that inhabits rocky intertidal habitats along the coast of California and Mexico. In 1985, populations were impacted by a bacterial disease known as withering syndrome (WS) that wiped out >90% of individuals, leading to the species' designation as critically endangered. Current conservation strategies include restoring diminished populations by translocating healthy individuals. However, population collapse on this scale may have dramatically lowered genetic diversity and strengthened geographic differentiation, making translocation-based recovery contentious. Additionally, the current prevalence of WS is unknown. To address these uncertainties, we sequenced and analyzed the genomes of 133 black abalone individuals from across their present range. We observed no spatial genetic structure among black abalone, with the exception of a single chromosomal inversion that increases in frequency with latitude. Genetic divergence between sites is minimal, and does not scale with either geographic distance or environmental dissimilarity. Genetic diversity appears uniformly high across the range. Despite this, however, demographic inference confirms a severe population bottleneck beginning around the time of WS onset, highlighting the temporal offset that may occur between a population collapse and its potential impact on genetic diversity. Finally, we find the bacterial agent of WS is equally present across the sampled range, but only in 10% of individuals. The lack of genetic structure, uniform diversity, and prevalence of WS bacteria indicates that translocation could be a valid and low-risk means of population restoration for black abalone species' recovery.

6.
Nature ; 626(7997): 119-127, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38200310

ABSTRACT

The evolution of reproductive barriers is the first step in the formation of new species and can help us understand the diversification of life on Earth. These reproductive barriers often take the form of hybrid incompatibilities, in which alleles derived from two different species no longer interact properly in hybrids1-3. Theory predicts that hybrid incompatibilities may be more likely to arise at rapidly evolving genes4-6 and that incompatibilities involving multiple genes should be common7,8, but there has been sparse empirical data to evaluate these predictions. Here we describe a mitonuclear incompatibility involving three genes whose protein products are in physical contact within respiratory complex I of naturally hybridizing swordtail fish species. Individuals homozygous for mismatched protein combinations do not complete embryonic development or die as juveniles, whereas those heterozygous for the incompatibility have reduced complex I function and unbalanced representation of parental alleles in the mitochondrial proteome. We find that the effects of different genetic interactions on survival are non-additive, highlighting subtle complexity in the genetic architecture of hybrid incompatibilities. Finally, we document the evolutionary history of the genes involved, showing signals of accelerated evolution and evidence that an incompatibility has been transferred between species via hybridization.


Subject(s)
Cell Nucleus , Electron Transport Complex I , Fishes , Genes, Lethal , Genetic Speciation , Hybridization, Genetic , Mitochondrial Proteins , Animals , Alleles , Electron Transport Complex I/genetics , Fishes/classification , Fishes/embryology , Fishes/genetics , Fishes/growth & development , Homozygote , Genes, Lethal/genetics , Species Specificity , Embryonic Development/genetics , Mitochondrial Proteins/genetics , Cell Nucleus/genetics , Heterozygote , Evolution, Molecular
8.
PLoS Genet ; 19(11): e1011062, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38015992

ABSTRACT

Admixture, the exchange of genetic information between distinct source populations, is thought to be a major source of adaptive genetic variation. Unlike mutation events, which periodically generate single alleles, admixture can introduce many selected alleles simultaneously. As such, the effects of linkage between selected alleles may be especially pronounced in admixed populations. However, existing tools for identifying selected mutations within admixed populations only account for selection at a single site, overlooking phenomena such as linkage among proximal selected alleles. Here, we develop and extensively validate a method for identifying and quantifying the individual effects of multiple linked selected sites on a chromosome in admixed populations. Our approach numerically calculates the expected local ancestry landscape in an admixed population for a given multi-locus selection model, and then maximizes the likelihood of the model. After applying this method to admixed populations of Drosophila melanogaster and Passer italiae, we found that the impacts between linked sites may be an important contributor to natural selection in admixed populations. Furthermore, for the situations we considered, the selection coefficients and number of selected sites are overestimated in analyses that do not consider the effects of linkage among selected sites. Our results imply that linkage among selected sites may be an important evolutionary force in admixed populations. This tool provides a powerful generalized method to investigate these crucial phenomena in diverse populations.


Subject(s)
Drosophila melanogaster , Genetics, Population , Animals , Drosophila melanogaster/genetics , Selection, Genetic
9.
Front Public Health ; 11: 1249614, 2023.
Article in English | MEDLINE | ID: mdl-37937074

ABSTRACT

Introduction: The SARS-CoV-2 pandemic represented a formidable scientific and technological challenge to public health due to its rapid spread and evolution. To meet these challenges and to characterize the virus over time, the State of California established the California SARS-CoV-2 Whole Genome Sequencing (WGS) Initiative, or "California COVIDNet". This initiative constituted an unprecedented multi-sector collaborative effort to achieve large-scale genomic surveillance of SARS-CoV-2 across California to monitor the spread of variants within the state, to detect new and emerging variants, and to characterize outbreaks in congregate, workplace, and other settings. Methods: California COVIDNet consists of 50 laboratory partners that include public health laboratories, private clinical diagnostic laboratories, and academic sequencing facilities as well as expert advisors, scientists, consultants, and contractors. Data management, sample sourcing and processing, and computational infrastructure were major challenges that had to be resolved in the midst of the pandemic chaos in order to conduct SARS-CoV-2 genomic surveillance. Data management, storage, and analytics needs were addressed with both conventional database applications and newer cloud-based data solutions, which also fulfilled computational requirements. Results: Representative and randomly selected samples were sourced from state-sponsored community testing sites. Since March of 2021, California COVIDNet partners have contributed more than 450,000 SARS-CoV-2 genomes sequenced from remnant samples from both molecular and antigen tests. Combined with genomes from CDC-contracted WGS labs, there are currently nearly 800,000 genomes from all 61 local health jurisdictions (LHJs) in California in the COVIDNet sequence database. More than 5% of all reported positive tests in the state have been sequenced, with similar rates of sequencing across 5 major geographic regions in the state. 
Discussion: Implementation of California COVIDNet revealed challenges and limitations in the public health system. These were overcome by engaging in novel partnerships that established a successful genomic surveillance program which provided valuable data to inform the COVID-19 public health response in California. Significantly, California COVIDNet has provided a foundational data framework and computational infrastructure needed to respond to future public health crises.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , Genomics , California/epidemiology , Data Management
10.
mBio ; 14(5): e0188923, 2023 Oct 31.
Article in English | MEDLINE | ID: mdl-37830873

ABSTRACT

IMPORTANCE: Emerging infectious diseases require continuous pathogen monitoring. Rapid clinical diagnosis by nucleic acid amplification is limited to a small number of targets and may miss target detection due to new mutations in clinical isolates. Whole-genome sequencing (WGS) identifies genome-wide variations that may be used to determine a pathogen's drug resistance patterns and phylogenetically characterize isolates to track disease origin and transmission. WGS is typically performed using DNA isolated from cultured clinical isolates. Culturing clinical specimens increases turn-around time and may not be possible for fastidious bacteria. To overcome some of these limitations, direct sequencing of clinical specimens has been attempted using expensive capture probes to enrich the entire genomes of target pathogens. We present a method to produce a cost-effective, time-efficient, and large-scale synthesis of probes for whole-genome enrichment. We envision that our method can be used for direct clinical sequencing of a wide range of microbial pathogens for genomic epidemiology.


Subject(s)
Bacteria , Genomics , Nucleic Acid Hybridization , Whole Genome Sequencing/methods , Bacteria/genetics
11.
bioRxiv ; 2023 Aug 21.
Article in English | MEDLINE | ID: mdl-37662385

ABSTRACT

The sequencing of PCR amplicons is a core application of high-throughput sequencing technology. Using unique molecular identifiers (UMIs), individual amplified molecules can be sequenced to very high accuracy on an Illumina sequencer. However, Illumina sequencers have limited read length and are therefore restricted to sequencing amplicons shorter than 600bp unless using inefficient synthetic long-read approaches. Native long-read sequencers from Pacific Biosciences and Oxford Nanopore Technologies can, using consensus read approaches, match or exceed Illumina quality while achieving much longer read lengths. Using a circularization-based concatemeric consensus sequencing approach (R2C2) paired with UMIs (R2C2+UMI) we show that we can sequence ~550nt antibody heavy-chain (IGH) and ~1500nt 16S amplicons at accuracies up to and exceeding Q50 (<1 error in 100,000 sequenced bases), which exceeds accuracies of UMI-supported Illumina paired sequencing as well as synthetic long-read approaches.
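
The Q50 figure in this abstract is a Phred-scaled quality value, defined as Q = -10 log10(p), where p is the per-base error probability. Q50 therefore corresponds to p = 1e-5, or about one expected error per 100,000 bases. The following is a minimal sketch of that conversion; the function names are illustrative and are not taken from the R2C2 software.

```python
import math


def phred_to_error_rate(q: float) -> float:
    """Convert a Phred quality score Q to a per-base error probability."""
    return 10 ** (-q / 10)


def error_rate_to_phred(p: float) -> float:
    """Convert a per-base error probability to a Phred quality score."""
    return -10 * math.log10(p)


def expected_errors(q: float, n_bases: int) -> float:
    """Expected number of erroneous bases among n_bases at quality Q."""
    return phred_to_error_rate(q) * n_bases


if __name__ == "__main__":
    for q in (20, 30, 40, 50):
        print(f"Q{q}: error rate {phred_to_error_rate(q):.0e}")
```

On this scale, each additional 10 Q units is a tenfold drop in error rate, so Q50 is ten times more accurate than Q40.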

12.
Proc Natl Acad Sci U S A ; 120(33): e2301411120, 2023 08 15.
Article in English | MEDLINE | ID: mdl-37552755

ABSTRACT

The acquisition of novel sexually dimorphic traits poses an evolutionary puzzle: How do new traits arise and become sex-limited? Recently acquired color vision, sexually dimorphic in animals like primates and butterflies, presents a compelling model for understanding how traits become sex-biased. For example, some Heliconius butterflies uniquely possess UV (ultraviolet) color vision, which correlates with the expression of two differentially tuned UV-sensitive rhodopsins, UVRh1 and UVRh2. To discover how such traits become sexually dimorphic, we studied Heliconius charithonia, which exhibits female-specific UVRh1 expression. We demonstrate that females, but not males, discriminate different UV wavelengths. Through whole-genome shotgun sequencing and assembly of the H. charithonia genome, we discovered that UVRh1 is present on the W chromosome, making it obligately female-specific. By knocking out UVRh1, we show that UVRh1 protein expression is absent in mutant female eye tissue, as in wild-type male eyes. A PCR survey of UVRh1 sex-linkage across the genus shows that species with female-specific UVRh1 expression lack UVRh1 gDNA in males. Thus, acquisition of sex linkage is sufficient to achieve female-specific expression of UVRh1, though this does not preclude other mechanisms, like cis-regulatory evolution, from also contributing. Moreover, both this event, and mutations leading to differential UV opsin sensitivity, occurred early in the history of Heliconius. These results suggest a path for acquiring sexual dimorphism distinct from existing mechanistic models. We propose a model where gene traffic to heterosomes (the W or the Y) genetically partitions a trait by sex before a phenotype shifts (spectral tuning of UV sensitivity).


Subject(s)
Butterflies , Color Vision , Animals , Female , Color Vision/genetics , Butterflies/genetics , Butterflies/metabolism , Eye/metabolism , Opsins/genetics , Opsins/metabolism , Rhodopsin/metabolism
13.
mSystems ; 8(4): e0028423, 2023 08 31.
Article in English | MEDLINE | ID: mdl-37493648

ABSTRACT

The intra-host composition of horizontally transmitted microbial symbionts can vary across host populations due to interactive effects of host genetics, environmental, and geographic factors. While adaptation to local habitat conditions can drive geographic subdivision of symbiont strains, it is unknown how differences in ecological characteristics among host-symbiont associations influence the genomic structure of symbiont populations. To address this question, we sequenced metagenomes of different populations of the deep-sea mussel Bathymodiolus septemdierum, which are common at Western Pacific deep-sea hydrothermal vents and show characteristic patterns of niche partitioning with sympatric gastropod symbioses. Bathymodiolus septemdierum lives in close symbiotic relationship with sulfur-oxidizing chemosynthetic bacteria but supplements its symbiotrophic diet through filter-feeding, enabling it to occupy ecological niches with little exposure to geochemical reductants. Our analyses indicate that symbiont populations associated with B. septemdierum show structuring by geographic location, but that the dominant symbiont strain is uncorrelated with vent site. These patterns are in contrast to co-occurring Alviniconcha and Ifremeria gastropod symbioses that exhibit greater symbiont nutritional dependence and occupy habitats with higher spatial variability in environmental conditions. Our results suggest that relative habitat homogeneity combined with sufficient symbiont dispersal and genomic mixing might promote persistence of similar symbiont strains across geographic locations, while mixotrophy might decrease selective pressures on the host to affiliate with locally adapted symbiont strains. Overall, these data contribute to our understanding of the potential mechanisms influencing symbiont population structure across a spectrum of marine microbial symbioses that occupy contrasting ecological niches. 
IMPORTANCE Beneficial relationships between animals and microbial organisms (symbionts) are ubiquitous in nature. In the ocean, microbial symbionts are typically acquired from the environment and their composition across geographic locations is often shaped by adaptation to local habitat conditions. However, it is currently unknown how generalizable these patterns are across symbiotic systems that have contrasting ecological characteristics. To address this question, we compared symbiont population structure between deep-sea hydrothermal vent mussels and co-occurring but ecologically distinct snail species. Our analyses show that mussel symbiont populations are less partitioned by geography and do not demonstrate evidence for environmental adaptation. We posit that the mussel's mixotrophic feeding mode may lower its need to affiliate with locally adapted symbiont strains, while microhabitat stability and symbiont genomic mixing likely favor persistence of symbiont strains across geographic locations. Altogether, these findings further our understanding of the mechanisms shaping symbiont population structure in marine environmentally transmitted symbioses.


Subject(s)
Gastropoda , Hydrothermal Vents , Mytilidae , Animals , Hydrothermal Vents/microbiology , Mytilidae/genetics , Bacteria/genetics , Ecosystem , Geography , Gastropoda/microbiology
14.
PLoS One ; 18(7): e0288261, 2023.
Article in English | MEDLINE | ID: mdl-37432953

ABSTRACT

Bacterial symbionts that manipulate the reproduction of their hosts are important factors in invertebrate ecology and evolution, and are being leveraged for host biological control. Infection prevalence restricts which biological control strategies are possible and is thought to be strongly influenced by the density of symbiont infection within hosts, termed titer. Current methods to estimate infection prevalence and symbiont titers are low-throughput, biased towards sampling infected species, and rarely measure titer. Here we develop a data mining approach to estimate symbiont infection frequencies within host species and titers within host tissues. We applied this approach to screen ~32,000 publicly available sequence samples from the most common symbiont host taxa, discovering 2,083 arthropod and 119 nematode infected samples. From these data, we estimated that Wolbachia infects approximately 44% of all arthropod and 34% of all nematode species, while other reproductive manipulators only infect 1-8% of arthropod and nematode species. Although relative titers within hosts were highly variable within and between arthropod species, a combination of arthropod host species and Wolbachia strain explained approximately 36% of variation in Wolbachia titer across the dataset. To explore potential mechanisms for host control of symbiont titer, we leveraged population genomic data from the model system Drosophila melanogaster. In this host, we found a number of SNPs associated with titer in candidate genes potentially relevant to host interactions with Wolbachia. Our study demonstrates that data mining is a powerful tool to detect bacterial infections and quantify infection intensities, thus opening an array of previously inaccessible data for further analysis in host-symbiont evolution.


Subject(s)
Arthropods , Wolbachia , Animals , Drosophila melanogaster/genetics , Data Mining , Ecology , Reproduction , Wolbachia/genetics
15.
Syst Biol ; 72(5): 1039-1051, 2023 11 01.
Article in English | MEDLINE | ID: mdl-37232476

ABSTRACT

Phylogenetics has been foundational to SARS-CoV-2 research and public health policy, assisting in genomic surveillance, contact tracing, and assessing emergence and spread of new variants. However, phylogenetic analyses of SARS-CoV-2 have often relied on tools designed for de novo phylogenetic inference, in which all data are collected before any analysis is performed and the phylogeny is inferred once from scratch. SARS-CoV-2 data sets do not fit this mold. There are currently over 14 million sequenced SARS-CoV-2 genomes in online databases, with tens of thousands of new genomes added every day. Continuous data collection, combined with the public health relevance of SARS-CoV-2, invites an "online" approach to phylogenetics, in which new samples are added to existing phylogenetic trees every day. The extremely dense sampling of SARS-CoV-2 genomes also invites a comparison between likelihood and parsimony approaches to phylogenetic inference. Maximum likelihood (ML) and pseudo-ML methods may be more accurate when there are multiple changes at a single site on a single branch, but this accuracy comes at a large computational cost, and the dense sampling of SARS-CoV-2 genomes means that these instances will be extremely rare because each internal branch is expected to be extremely short. Therefore, it may be that approaches based on maximum parsimony (MP) are sufficiently accurate for reconstructing phylogenies of SARS-CoV-2, and their simplicity means that they can be applied to much larger data sets. Here, we evaluate the performance of de novo and online phylogenetic approaches, as well as ML, pseudo-ML, and MP frameworks for inferring large and dense SARS-CoV-2 phylogenies. Overall, we find that online phylogenetics produces similar phylogenetic trees to de novo analyses for SARS-CoV-2, and that MP optimization with UShER and matOptimize produces equivalent SARS-CoV-2 phylogenies to some of the most popular ML and pseudo-ML inference tools. 
MP optimization with UShER and matOptimize is thousands of times faster than presently available implementations of ML and online phylogenetics is faster than de novo inference. Our results therefore suggest that parsimony-based methods like UShER and matOptimize represent an accurate and more practical alternative to established ML implementations for large SARS-CoV-2 phylogenies and could be successfully applied to other similar data sets with particularly dense sampling and short branch lengths.
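
The maximum parsimony criterion discussed in this abstract scores a tree by the minimum number of mutations needed to explain the observed sequences. A minimal sketch of that idea is Fitch's small-parsimony algorithm for a single site on a fixed binary tree. This is a textbook illustration only, not the UShER or matOptimize implementation; the tree encoding (nested tuples), the leaf names, and the example states are all assumptions made here.

```python
def fitch_score(tree, leaf_states):
    """Fitch's algorithm for one site on a rooted binary tree.

    tree: a leaf name (str) or a 2-tuple of subtrees (hypothetical encoding).
    leaf_states: dict mapping each leaf name to its observed base.
    Returns (candidate state set at this node, minimum mutation count).
    """
    if isinstance(tree, str):
        # A leaf: its state is fixed and costs nothing.
        return {leaf_states[tree]}, 0

    left, right = tree
    left_set, left_cost = fitch_score(left, leaf_states)
    right_set, right_cost = fitch_score(right, leaf_states)

    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra mutation needed.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one mutation.
    return left_set | right_set, left_cost + right_cost + 1


if __name__ == "__main__":
    # Made-up four-sample example with a single variable site.
    tree = (("s1", "s2"), ("s3", "s4"))
    states = {"s1": "G", "s2": "G", "s3": "T", "s4": "G"}
    root_states, n_mutations = fitch_score(tree, states)
    print(root_states, n_mutations)
```

Summing this per-site score over all sites gives the parsimony score of a tree; with very short branches, as in densely sampled data, multiple changes at one site on one branch are rare, which is the situation in which the abstract argues parsimony is adequate.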


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Phylogeny , Probability , Genomics
16.
Microb Genom ; 9(5)2023 05.
Article in English | MEDLINE | ID: mdl-37185044

ABSTRACT

Exposure to different mutagens leaves distinct mutational patterns that can allow inference of pathogen replication niches. We therefore investigated whether SARS-CoV-2 mutational spectra might show lineage-specific differences, dependent on the dominant site(s) of replication and onwards transmission, and could therefore rapidly infer virulence of emergent variants of concern (VOCs). Through mutational spectrum analysis, we found a significant reduction in G>T mutations in the Omicron variant, which replicates in the upper respiratory tract (URT), compared to other lineages, which replicate in both the URT and lower respiratory tract (LRT). Mutational analysis of other viruses and bacteria indicates a robust, generalizable association of high G>T mutations with replication within the LRT. Monitoring G>T mutation rates over time, we found early separation of Omicron from Beta, Gamma and Delta, while mutational patterns in Alpha varied consistent with changes in transmission source as social restrictions were lifted. Mutational spectra may be a powerful tool to infer niches of established and emergent pathogens.


Subject(s)
COVID-19 , Humans , SARS-CoV-2/genetics , Mutation , Bacteria/genetics , Lung
17.
Nat Genet ; 55(5): 746-752, 2023 05.
Article in English | MEDLINE | ID: mdl-37038003

ABSTRACT

Phylogenetics has a crucial role in genomic epidemiology. Enabled by unparalleled volumes of genome sequence data generated to study and help contain the COVID-19 pandemic, phylogenetic analyses of SARS-CoV-2 genomes have shed light on the virus's origins, spread, and the emergence and reproductive success of new variants. However, most phylogenetic approaches, including maximum likelihood and Bayesian methods, cannot scale to the size of the datasets from the current pandemic. We present 'MAximum Parsimonious Likelihood Estimation' (MAPLE), an approach for likelihood-based phylogenetic analysis of epidemiological genomic datasets at unprecedented scales. MAPLE infers SARS-CoV-2 phylogenies more accurately than existing maximum likelihood approaches while running up to thousands of times faster, and requiring at least 100 times less memory on large datasets. This extends the reach of genomic epidemiology, allowing the continued use of accurate phylogenetic, phylogeographic and phylodynamic analyses on datasets of millions of genomes.


Subject(s)
COVID-19 , Humans , Phylogeny , COVID-19/epidemiology , COVID-19/genetics , SARS-CoV-2/genetics , Likelihood Functions , Pandemics , Bayes Theorem
18.
Curr Biol ; 33(1): 189-196.e4, 2023 01 09.
Article in English | MEDLINE | ID: mdl-36543167

ABSTRACT

Spliceosomal introns, which interrupt nuclear genes, are ubiquitous features of eukaryotic nuclear genes.1 Spliceosomal intron evolution is complex, with different lineages ranging from virtually zero to thousands of newly created introns.2,3,4,5 This punctate phylogenetic distribution could be explained if intron creation is driven by specialized transposable elements ("Introners"), with Introner-containing lineages undergoing frequent intron gain.6,7,8,9,10 Fragmentation of nuclear genes by spliceosomal introns reaches its apex in dinoflagellates, which have some twenty introns per gene11,12; however, little is known about dinoflagellate intron evolution. We reconstructed intron evolution in five dinoflagellate genomes, revealing a dynamic history of intron gain. We find evidence for historical creation of introns in all five species and identify recently active Introners in 4/5 studied species. In one species, Polarella glacialis, we find an unprecedented diversity of Introners, with recent Introner insertion leading to creation of some 12,253 introns, and with 15 separate families of Introners accounting for at least 100 introns each. These Introner families show diverse mechanisms of mobilization and intron creation. Comparison within and between Introner families provides evidence that biases in the so-called intron phase, intron position relative to codon periodicity, could be driven by Introner insertion site requirements.9,13,14 Finally, we report additional transformations of the spliceosomal system in dinoflagellates, including widespread loss of ancestral introns, and novelties of tolerated and favored donor sequence motifs. These results reveal unappreciated diversity of intron-creating elements and spliceosomal evolutionary capacity and highlight the complex evolutionary dependencies shaping genome structures.


Subject(s)
DNA Transposable Elements , Dinoflagellida , Introns/genetics , Phylogeny , DNA Transposable Elements/genetics , Dinoflagellida/genetics , Evolution, Molecular , Spliceosomes/genetics
19.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36453872

ABSTRACT

SUMMARY: Treenome Browser is a web browser tool to interactively visualize millions of genomes alongside huge phylogenetic trees. AVAILABILITY AND IMPLEMENTATION: Treenome Browser for SARS-CoV-2 can be accessed at cov2tree.org, or at taxonium.org for user-provided trees. Source code and documentation are available at github.com/theosanderson/taxonium and docs.taxonium.org/en/latest/treenome.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
COVID-19 , Genomics , Humans , Phylogeny , SARS-CoV-2/genetics , Genome , Software
20.
Proc Natl Acad Sci U S A ; 119(48): e2209766119, 2022 11 29.
Article in English | MEDLINE | ID: mdl-36417430

ABSTRACT

There is massive variation in intron numbers across eukaryotic genomes, yet the major drivers of intron content during evolution remain elusive. Rapid intron loss and gain in some lineages contrast with long-term evolutionary stasis in others. Episodic intron gain could be explained by recently discovered specialized transposons called Introners, but so far Introners are only known from a handful of species. Here, we performed a systematic search across 3,325 eukaryotic genomes and identified 27,563 Introner-derived introns in 175 genomes (5.2%). Species with Introners span remarkable phylogenetic diversity, from animals to basal protists, representing lineages whose last common ancestor dates to over 1.7 billion years ago. Aquatic organisms were 6.5 times more likely to contain Introners than terrestrial organisms. Introners exhibit mechanistic diversity but most are consistent with DNA transposition, indicating that Introners have evolved convergently hundreds of times from nonautonomous transposable elements. Transposable elements and aquatic taxa are associated with high rates of horizontal gene transfer, suggesting that this combination of factors may explain the punctuated and biased diversity of species containing Introners. More generally, our data suggest that Introners may explain the episodic nature of intron gain across the eukaryotic tree of life. These results illuminate the major source of ongoing intron creation in eukaryotic genomes.


Subject(s)
DNA Transposable Elements , Eukaryota , Animals , Introns/genetics , Eukaryota/genetics , DNA Transposable Elements/genetics , Phylogeny , Eukaryotic Cells