Results 1 - 20 of 72
1.
Mol Biol Evol ; 41(3), 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38442736

ABSTRACT

Transposable elements drive genome evolution in all branches of life. Transposable element insertions are often deleterious to their hosts and necessitate evolution of control mechanisms to limit their spread. The long terminal repeat retrotransposon Ty1 prime (Ty1'), a subfamily of the Ty1 family, is present in many Saccharomyces cerevisiae strains, but little is known about what controls its copy number. Here, we provide evidence that a novel gene from an exapted Ty1' sequence, domesticated restriction of Ty1' relic 2 (DRT2), encodes a restriction factor that inhibits Ty1' movement. DRT2 arose through domestication of a Ty1' GAG gene and contains the C-terminal domain of capsid, which in the related Ty1 canonical subfamily functions as a self-encoded restriction factor. Bioinformatic analysis reveals the widespread nature of DRT2, its evolutionary history, and pronounced structural variation at the Ty1' relic 2 locus. Ty1' retromobility analyses demonstrate DRT2 restriction factor functionality, and northern blot and RNA-seq analysis indicate that DRT2 is transcribed in multiple strains. Velocity cosedimentation profiles indicate an association between Drt2 and Ty1' virus-like particles or assembly complexes. Chimeric Ty1' elements containing DRT2 retain retromobility, suggesting an ancestral role of productive Gag C-terminal domain of capsid functionality is present in the sequence. Unlike Ty1 canonical, Ty1' retromobility increases with copy number, suggesting that C-terminal domain of capsid-based restriction is not limited to the Ty1 canonical subfamily self-encoded restriction factor and drove the endogenization of DRT2. The discovery of an exapted Ty1' restriction factor provides insight into the evolution of the Ty1 family, evolutionary hot-spots, and host-transposable element interactions.


Subjects
Retroelements, Saccharomyces cerevisiae, Saccharomyces cerevisiae/genetics, Domestication, DNA Transposable Elements
2.
Nucleic Acids Res ; 50(21): e124, 2022 Nov 28.
Article in English | MEDLINE | ID: mdl-36156149

ABSTRACT

Animal cell lines often undergo extreme genome restructuring events, including polyploidy and segmental aneuploidy that can impede de novo whole-genome assembly (WGA). In some species like Drosophila, cell lines also exhibit massive proliferation of transposable elements (TEs). To better understand the role of transposition during animal cell culture, we sequenced the genome of the tetraploid Drosophila S2R+ cell line using long-read and linked-read technologies. WGAs for S2R+ were highly fragmented and generated variable estimates of TE content across sequencing and assembly technologies. We therefore developed a novel WGA-independent bioinformatics method called TELR that identifies, locally assembles, and estimates allele frequency of TEs from long-read sequence data (https://github.com/bergmanlab/telr). Application of TELR to a ∼130x PacBio dataset for S2R+ revealed many haplotype-specific TE insertions that arose by transposition after initial cell line establishment and subsequent tetraploidization. Local assemblies from TELR also allowed phylogenetic analysis of paralogous TEs, which revealed that proliferation of TE families in vitro can be driven by single or multiple source lineages. Our work provides a model for the analysis of TEs in complex heterozygous or polyploid genomes that are recalcitrant to WGA and yields new insights into the mechanisms of genome evolution in animal cell culture.


Subjects
DNA Transposable Elements, Polyploidy, Animals, DNA Transposable Elements/genetics, Phylogeny, Drosophila/genetics, Cell Line
3.
PLoS Genet ; 16(2): e1008632, 2020 Feb.
Article in English | MEDLINE | ID: mdl-32084126

ABSTRACT

Transposable elements constitute a large fraction of most eukaryotic genomes. Insertion of mobile DNA sequences typically has deleterious effects on host fitness, and thus diverse mechanisms have evolved to control mobile element proliferation. Mobility of the Ty1 retrotransposon in Saccharomyces yeasts is regulated by copy number control (CNC) mediated by a self-encoded restriction factor derived from the Ty1 gag capsid gene that inhibits virus-like particle function. Here, we survey a panel of wild and human-associated strains of S. cerevisiae and S. paradoxus to investigate how genomic Ty1 content influences variation in Ty1 mobility. We observe high levels of mobility for a tester element with a gag sequence from the canonical Ty1 subfamily in permissive strains that either lack full-length Ty1 elements or only contain full-length copies of the Ty1' subfamily that have a divergent gag sequence. In contrast, low levels of canonical Ty1 mobility are observed in restrictive strains carrying full-length Ty1 elements containing a canonical gag sequence. Phylogenomic analysis of full-length Ty1 elements revealed that Ty1' is the ancestral subfamily present in wild strains of S. cerevisiae, and that canonical Ty1 in S. cerevisiae is a derived subfamily that acquired gag from S. paradoxus by horizontal transfer and recombination. Our results provide evidence that variation in the ability of S. cerevisiae and S. paradoxus strains to repress canonical Ty1 transposition via CNC is regulated by the genomic content of different Ty1 subfamilies, and that self-encoded forms of transposon control can spread across species boundaries by horizontal transfer.


Subjects
DNA Copy Number Variations, Horizontal Gene Transfer, Fungal Genome/genetics, Retroelements/genetics, Saccharomyces cerevisiae/genetics, Fungal DNA/genetics, Molecular Evolution, Sympatry/genetics
4.
PLoS Pathog ; 14(11): e1007445, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30422992

ABSTRACT

Wolbachia is an intracellular bacterium that infects a remarkable range of insect hosts. Insects such as mosquitos act as vectors for many devastating human viruses such as Dengue, West Nile, and Zika. Remarkably, Wolbachia infection provides insect hosts with resistance to many arboviruses, thereby rendering the insects ineffective as vectors. To utilize Wolbachia effectively as a tool against vector-borne viruses, a better understanding of the host-Wolbachia relationship is needed. To investigate Wolbachia-insect interactions we used the Wolbachia/Drosophila model that provides a genetically tractable system for studying host-pathogen interactions. We coupled genome-wide RNAi screening with a novel high-throughput fluorescence in situ hybridization (FISH) assay to detect changes in Wolbachia levels in a Wolbachia-infected Drosophila cell line JW18. 1117 genes altered Wolbachia levels when knocked down by RNAi, of which 329 genes increased and 788 genes decreased the level of Wolbachia. Validation of hits included in-depth secondary screening using in vitro RNAi, Drosophila mutants, and Wolbachia detection by DNA qPCR. A diverse set of host gene networks was identified to regulate Wolbachia levels and unexpectedly revealed that perturbations of host translation components such as the ribosome and translation initiation factors result in increased Wolbachia levels both in vitro using RNAi and in vivo using mutants and a chemical-based translation inhibition assay. This work provides evidence for Wolbachia-host translation interaction and strengthens our general understanding of the Wolbachia-host intracellular relationship.


Subjects
Drosophila melanogaster/genetics, Host Microbial Interactions/genetics, Wolbachia/genetics, Animals, Culicidae, Drosophila/genetics, Drosophila/microbiology, Drosophila melanogaster/microbiology, Genome, Host-Pathogen Interactions/genetics, Humans, Fluorescence In Situ Hybridization/methods, Mosquito Vectors, RNA Interference, Symbiosis, Viruses/genetics, Whole Genome Sequencing/methods
5.
Proc Natl Acad Sci U S A ; 113(10): E1352-61, 2016 Mar 08.
Article in English | MEDLINE | ID: mdl-26903656

ABSTRACT

Multiply inverted balancer chromosomes that suppress exchange with their homologs are an essential part of the Drosophila melanogaster genetic toolkit. Despite their widespread use, the organization of balancer chromosomes has not been characterized at the molecular level, and the degree of sequence variation among copies of balancer chromosomes is unknown. To map inversion breakpoints and study potential diversity in descendants of a structurally identical balancer chromosome, we sequenced a panel of laboratory stocks containing the most widely used X chromosome balancer, First Multiple 7 (FM7). We mapped the locations of FM7 breakpoints to precise euchromatic coordinates and identified the flanking sequence of breakpoints in heterochromatic regions. Analysis of SNP variation revealed megabase-scale blocks of sequence divergence among currently used FM7 stocks. We present evidence that this divergence arose through rare double-crossover events that replaced a female-sterile allele of the singed gene (sn(X2)) on FM7c with a sequence from balanced chromosomes. We propose that although double-crossover events are rare in individual crosses, many FM7c chromosomes in the Bloomington Drosophila Stock Center have lost sn(X2) by this mechanism on a historical timescale. Finally, we characterize the original allele of the Bar gene (B(1)) that is carried on FM7, and validate the hypothesis that the origin and subsequent reversion of the B(1) duplication are mediated by unequal exchange. Our results reject a simple nonrecombining, clonal mode for the laboratory evolution of balancer chromosomes and have implications for how balancer chromosomes should be used in the design and interpretation of genetic experiments in Drosophila.


Subjects
Chromosome Breakpoints, Drosophila melanogaster/genetics, Genetic Variation, Genetic Recombination, X Chromosome/genetics, Animals, Base Sequence, Chromosome Breakage, Chromosome Inversion, Chromosome Mapping, Genetic Crossing Over, Female, Heterochromatin/genetics, Male, Genetic Models, Molecular Sequence Data, Single Nucleotide Polymorphism, DNA Sequence Analysis/methods
6.
Nature ; 482(7384): 173-8, 2012 Feb 08.
Article in English | MEDLINE | ID: mdl-22318601

ABSTRACT

A major challenge of biology is understanding the relationship between molecular genetic variation and variation in quantitative traits, including fitness. This relationship determines our ability to predict phenotypes from genotypes and to understand how evolutionary forces shape variation within and between species. Previous efforts to dissect the genotype-phenotype map were based on incomplete genotypic information. Here, we describe the Drosophila melanogaster Genetic Reference Panel (DGRP), a community resource for analysis of population genomics and quantitative traits. The DGRP consists of fully sequenced inbred lines derived from a natural population. Population genomic analyses reveal reduced polymorphism in centromeric autosomal regions and the X chromosome, evidence for positive and negative selection, and rapid evolution of the X chromosome. Many variants in novel genes, most at low frequency, are associated with quantitative traits and explain a large fraction of the phenotypic variance. The DGRP facilitates genotype-phenotype mapping using the power of Drosophila genetics.


Subjects
Drosophila melanogaster/genetics, Genome-Wide Association Study, Genomics, Quantitative Trait Loci/genetics, Alleles, Animals, Centromere/genetics, Insect Chromosomes/genetics, Genotype, Phenotype, Single Nucleotide Polymorphism/genetics, Genetic Selection/genetics, Starvation/genetics, Telomere/genetics, X Chromosome/genetics
7.
Nucleic Acids Res ; 43(22): 10655-72, 2015 Dec 15.
Article in English | MEDLINE | ID: mdl-26578579

ABSTRACT

To understand how transposon landscapes (TLs) vary across animal genomes, we describe a new method called the Transposon Insertion and Depletion AnaLyzer (TIDAL) and a database of >300 TLs in Drosophila melanogaster (TIDAL-Fly). Our analysis reveals pervasive TL diversity across cell lines and fly strains, even for identically named sub-strains from different laboratories such as the ISO1 strain used for the reference genome sequence. On average, >500 novel insertions exist in every lab strain, inbred strains of the Drosophila Genetic Reference Panel (DGRP), and fly isolates in the Drosophila Genome Nexus (DGN). A minority (<25%) of transposon families comprise the majority (>70%) of TL diversity across fly strains. A sharp contrast between insertion and depletion patterns indicates that many transposons are unique to the ISO1 reference genome sequence. Although TL diversity from fly strains reaches asymptotic limits with increasing sequencing depth, rampant TL diversity causes unsaturated detection of TLs in pools of flies. Finally, we show novel transposon insertions negatively correlate with Piwi-interacting RNA (piRNA) levels for most transposon families, except for the highly-abundant roo retrotransposon. Our study provides a useful resource for Drosophila geneticists to understand how transposons create extensive genomic diversity in fly cell lines and strains.


Subjects
DNA Transposable Elements, Drosophila melanogaster/genetics, Genomics/methods, Retroelements, Animals, Cell Line, Nucleic Acid Databases, Genetic Variation, Insect Genome, Small Interfering RNA/metabolism
8.
Nature ; 468(7325): 811-4, 2010 Dec 09.
Article in English | MEDLINE | ID: mdl-21150996

ABSTRACT

The observation that animal morphology tends to be conserved during the embryonic phylotypic period (a period of maximal similarity between the species within each animal phylum) led to the proposition that embryogenesis diverges more extensively early and late than in the middle, known as the hourglass model. This pattern of conservation is thought to reflect a major constraint on the evolution of animal body plans. Despite a wealth of morphological data confirming that there is often remarkable divergence in the early and late embryos of species from the same phylum, it is not yet known to what extent gene expression evolution, which has a central role in the elaboration of different animal forms, underpins the morphological hourglass pattern. Here we address this question using species-specific microarrays designed from six sequenced Drosophila species separated by up to 40 million years. We quantify divergence at different times during embryogenesis, and show that expression is maximally conserved during the arthropod phylotypic period. By fitting different evolutionary models to each gene, we show that at each time point more than 80% of genes fit best to models incorporating stabilizing selection, and that for genes whose evolutionarily optimal expression level is the same across all species, selective constraint is maximized during the phylotypic period. The genes that conform most to the hourglass pattern are involved in key developmental processes. These results indicate that natural selection acts to conserve patterns of gene expression during mid-embryogenesis, and provide a genome-wide insight into the molecular basis of the hourglass pattern of developmental evolution.


Subjects
Drosophila/embryology, Drosophila/genetics, Developmental Gene Expression Regulation/genetics, Biological Models, Animals, Conserved Sequence/genetics, Drosophila/classification, Drosophila Proteins/genetics, Molecular Evolution, Insect Genes/genetics, Insect Genome/genetics, Oligonucleotide Array Sequence Analysis, Phylogeny, Genetic Selection, Species Specificity, Time Factors
9.
BMC Biol ; 13: 81, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26437768

ABSTRACT

BACKGROUND: The diversification of immune systems during evolution involves the expansion of particular gene families in given phyla. A better understanding of the metazoan immune system requires an analysis of the logic underlying such immune gene amplification. This analysis is now within reach due to the ease with which we can generate multiple mutations in an organism. In this paper, we analyze the contribution of the three Drosophila prophenoloxidases (PPOs) to host defense by generating single, double and triple mutants. PPOs are enzymes that catalyze the production of melanin at the site of infection and around parasites. They are the rate-limiting enzymes that contribute to the melanization reaction, a major immune mechanism of arthropods. The number of PPO-encoding genes is variable among insects, ranging from one in the bee to ten in the mosquito. RESULTS: By analyzing mutations alone and in combination, we ascribe a specific function to each of the three PPOs of Drosophila. Our study confirms that two PPOs produced by crystal cells, PPO1 and PPO2, contribute to the bulk of melanization in the hemolymph, upon septic or clean injury. In contrast, PPO3, a PPO restricted to the D. melanogaster group, is expressed in lamellocytes and contributes to melanization during the encapsulation process. Interestingly, another overlapping set of PPOs, PPO2 and PPO3, achieve melanization of the capsule upon parasitoid wasp infection. CONCLUSIONS: The use of single or combined mutations allowed us to show that each PPO mutant has a specific phenotype, and that knocking out two of three genes is required to abolish fully a particular function. Thus, Drosophila PPOs have partially overlapping functions to optimize melanization in at least two conditions: following injury or during encapsulation. Since PPO3 is restricted to the D. melanogaster group, this suggests that production of PPO by lamellocytes emerged as a recent defense mechanism against parasitoid wasps. 
We conclude that differences in spatial localization, immediate or late availability, and mode of activation underlie the functional diversification of the three Drosophila PPOs, with each of them having non-redundant but overlapping functions.


Subjects
Catechol Oxidase/genetics, Drosophila Proteins/genetics, Drosophila melanogaster/enzymology, Drosophila melanogaster/immunology, Enzyme Precursors/genetics, Innate Immunity, Animals, Catechol Oxidase/metabolism, Drosophila Proteins/metabolism, Drosophila melanogaster/genetics, Drosophila melanogaster/parasitology, Enzyme Precursors/metabolism, Female, Innate Immunity/genetics, Wasps/physiology
10.
Nature ; 458(7236): 337-41, 2009 Mar 19.
Article in English | MEDLINE | ID: mdl-19212322

ABSTRACT

Since the completion of the genome sequence of Saccharomyces cerevisiae in 1996 (refs 1, 2), there has been a large increase in complete genome sequences, accompanied by great advances in our understanding of genome evolution. Although little is known about the natural and life histories of yeasts in the wild, there are an increasing number of studies looking at ecological and geographic distributions, population structure and sexual versus asexual reproduction. Less well understood at the whole genome level are the evolutionary processes acting within populations and species that lead to adaptation to different environments, phenotypic differences and reproductive isolation. Here we present one- to fourfold or more coverage of the genome sequences of over seventy isolates of the baker's yeast S. cerevisiae and its closest relative, Saccharomyces paradoxus. We examine variation in gene content, single nucleotide polymorphisms, nucleotide insertions and deletions, copy numbers and transposable elements. We find that phenotypic variation broadly correlates with global genome-wide phylogenetic relationships. S. paradoxus populations are well delineated along geographic boundaries, whereas the variation among worldwide S. cerevisiae isolates shows less differentiation and is comparable to a single S. paradoxus population. Rather than one or two domestication events leading to the extant baker's yeasts, the population structure of S. cerevisiae consists of a few well-defined, geographically isolated lineages and many different mosaics of these lineages, supporting the idea that human influence provided the opportunity for cross-breeding and production of new combinations of pre-existing variations.


Subjects
Fungal Genome/genetics, Genomics, Saccharomyces cerevisiae/genetics, Saccharomyces/genetics, Population Genetics, Geography, INDEL Mutation/genetics, Phenotype, Phylogeny, Single Nucleotide Polymorphism/genetics, Saccharomyces/classification, Genetic Selection
11.
PLoS Genet ; 8(12): e1003129, 2012.
Article in English | MEDLINE | ID: mdl-23284297

ABSTRACT

Wolbachia are maternally inherited symbiotic bacteria, commonly found in arthropods, which are able to manipulate the reproduction of their host in order to maximise their transmission. The evolutionary history of endosymbionts like Wolbachia can be revealed by integrating information on infection status in natural populations with patterns of sequence variation in Wolbachia and host mitochondrial genomes. Here we use whole-genome resequencing data from 290 lines of Drosophila melanogaster from North America, Europe, and Africa to predict Wolbachia infection status, estimate relative cytoplasmic genome copy number, and reconstruct Wolbachia and mitochondrial genome sequences. Overall, 63% of Drosophila strains were predicted to be infected with Wolbachia by our in silico analysis pipeline, which shows 99% concordance with infection status determined by diagnostic PCR. Complete Wolbachia and mitochondrial genomes show congruent phylogenies, consistent with strict vertical transmission through the maternal cytoplasm and imperfect transmission of Wolbachia. Bayesian phylogenetic analysis reveals that the most recent common ancestor of all Wolbachia and mitochondrial genomes in D. melanogaster dates to around 8,000 years ago. We find evidence for a recent global replacement of ancestral Wolbachia and mtDNA lineages, but our data suggest that the derived wMel lineage arose several thousand years ago, not in the 20th century as previously proposed. Our data also provide evidence that this global replacement event is incomplete and is likely to be one of several similar incomplete replacement events that have occurred since the out-of-Africa migration that allowed D. melanogaster to colonize worldwide habitats. This study provides a complete genomic analysis of the evolutionary mode and temporal dynamics of the D. melanogaster-Wolbachia symbiosis, as well as important resources for further analyses of the impact of Wolbachia on host biology.


Subjects
Drosophila melanogaster, Metagenomics, Symbiosis, Wolbachia, Animals, Bayes Theorem, Drosophila melanogaster/genetics, Drosophila melanogaster/physiology, Molecular Evolution, Genetic Variation, Mitochondrial Genome, Haplotypes, Phylogeny, Wolbachia/genetics, Wolbachia/physiology
13.
Bioinformatics ; 28(16): 2154-61, 2012 Aug 15.
Article in English | MEDLINE | ID: mdl-22711795

ABSTRACT

MOTIVATION: Although the amount of data in biology is rapidly increasing, critical information for understanding biological events like phosphorylation or gene expression remains locked in the biomedical literature. Most current text mining (TM) approaches to extract information about biological events are focused on either limited-scale studies and/or abstracts, with data extracted lacking context and rarely available to support further research. RESULTS: Here we present BioContext, an integrated TM system which extracts, extends and integrates results from a number of tools performing entity recognition, biomolecular event extraction and contextualization. Application of our system to 10.9 million MEDLINE abstracts and 234 000 open-access full-text articles from PubMed Central yielded over 36 million mentions representing 11.4 million distinct events. Event participants included over 290 000 distinct genes/proteins that are mentioned more than 80 million times and linked where possible to Entrez Gene identifiers. Over a third of events contain contextual information such as the anatomical location of the event occurrence or whether the event is reported as negated or speculative. AVAILABILITY: The BioContext pipeline is available for download (under the BSD license) at http://www.biocontext.org, along with the extracted data which is also available for online browsing.


Subjects
Biochemical Phenomena, Computational Biology/methods, Data Mining, Software, MEDLINE, PubMed
14.
Nature ; 450(7167): 203-18, 2007 Nov 08.
Article in English | MEDLINE | ID: mdl-17994087

ABSTRACT

Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.


Subjects
Drosophila/classification, Drosophila/genetics, Molecular Evolution, Insect Genes/genetics, Insect Genome/genetics, Genomics, Phylogeny, Animals, Codon/genetics, DNA Transposable Elements/genetics, Drosophila/immunology, Drosophila/metabolism, Drosophila Proteins/genetics, Gene Order/genetics, Mitochondrial Genome/genetics, Immunity/genetics, Multigene Family/genetics, Untranslated RNA/genetics, Reproduction/genetics, Sequence Alignment, DNA Sequence Analysis, Synteny/genetics
15.
Nucleic Acids Res ; 39(Database issue): D118-23, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20965965

ABSTRACT

The REDfly database of Drosophila transcriptional cis-regulatory elements provides the broadest and most comprehensive available resource for experimentally validated cis-regulatory modules and transcription factor binding sites among the metazoa. The third major release of the database extends the utility of REDfly as a powerful tool for both computational and experimental studies of transcription regulation. REDfly v3.0 includes the introduction of new data classes to expand the types of regulatory elements annotated in the database along with a roughly 40% increase in the number of records. A completely redesigned interface improves access for casual and power users alike; among other features it now automatically provides graphical views of the genome, displays images of reporter gene expression and implements improved capabilities for database searching and results filtering. REDfly is freely accessible at http://redfly.ccr.buffalo.edu.


Subjects
Genetic Databases, Drosophila Proteins/metabolism, Drosophila melanogaster/genetics, Transcriptional Regulatory Elements, Transcription Factors/metabolism, Animals, Binding Sites, Software, Synteny, User-Computer Interface
16.
bioRxiv ; 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38187645

ABSTRACT

Horizontal transposon transfer (HTT) plays an important role in the evolution of eukaryotic genomes; however, the detailed evolutionary history and impact of most HTT events remain to be elucidated. To better understand the process of HTT in closely-related microbial eukaryotes, we studied Ty4 retrotransposon subfamily content and sequence evolution across the genus Saccharomyces using short- and long-read whole genome sequence data, including new PacBio genome assemblies for two S. mikatae strains. We find evidence for multiple independent HTT events introducing the Tsu4 subfamily into specific lineages of S. paradoxus, S. cerevisiae, S. eubayanus, S. kudriavzevii and the ancestor of the S. mikatae/S. jurei species pair. In both S. mikatae and S. kudriavzevii, we identified novel Ty4 clades that were independently generated through recombination between resident and horizontally-transferred subfamilies. Our results reveal that recurrent HTT and lineage-specific extinction events lead to a complex pattern of Ty4 subfamily content across the genus Saccharomyces. Moreover, our results demonstrate how HTT can lead to coexistence of related retrotransposon subfamilies in the same genome that can fuel evolution of new retrotransposon clades via recombination.

17.
bioRxiv ; 2023 Mar 21.
Article in English | MEDLINE | ID: mdl-36824955

ABSTRACT

BACKGROUND: Many computational methods have been developed to detect non-reference transposable element (TE) insertions using short-read whole genome sequencing data. The diversity and complexity of such methods often present challenges to new users seeking to reproducibly install, execute, or evaluate multiple TE insertion detectors. RESULTS: We previously developed the McClintock meta-pipeline to facilitate the installation, execution, and evaluation of six first-generation short-read TE detectors. Here, we report a completely re-implemented version of McClintock written in Python using Snakemake and Conda that improves its installation, error handling, speed, stability, and extensibility. McClintock 2 now includes 12 short-read TE detectors, auxiliary pre-processing and analysis modules, interactive HTML reports, and a simulation framework to reproducibly evaluate the accuracy of component TE detectors. When applied to the model microbial eukaryote Saccharomyces cerevisiae, we find substantial variation in the ability of McClintock 2 components to identify the precise locations of non-reference TE insertions, with RelocaTE2 showing the highest recall and precision in simulated data. We find that RelocaTE2, TEMP, TEMP2 and TEBreak provide a consistent and biologically meaningful view of non-reference TE insertions in a species-wide panel of ∼1000 yeast genomes, as evaluated by coverage-based abundance estimates and expected patterns of tRNA promoter targeting. Finally, we show that best-in-class predictors for yeast have sufficient resolution to reveal a dyad pattern of integration in nucleosome-bound regions upstream of yeast tRNA genes for Ty1, Ty2, and Ty4, allowing us to extend knowledge about fine-scale target preferences first revealed experimentally for Ty1 to natural insertions and related copia-superfamily retrotransposons in yeast.
CONCLUSION: McClintock (https://github.com/bergmanlab/mcclintock/) provides a user-friendly pipeline for the identification of TEs in short-read WGS data using multiple TE detectors, which should benefit researchers studying TE insertion variation in a wide range of different organisms. Application of the improved McClintock system to simulated and empirical yeast genome data reveals best-in-class methods and novel biological insights for one of the most widely-studied model eukaryotes and provides a paradigm for evaluating and selecting non-reference TE detectors for other species.

18.
Mob DNA ; 14(1): 8, 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37452430

ABSTRACT

BACKGROUND: Many computational methods have been developed to detect non-reference transposable element (TE) insertions using short-read whole genome sequencing data. The diversity and complexity of such methods often present challenges to new users seeking to reproducibly install, execute, or evaluate multiple TE insertion detectors. RESULTS: We previously developed the McClintock meta-pipeline to facilitate the installation, execution, and evaluation of six first-generation short-read TE detectors. Here, we report a completely re-implemented version of McClintock written in Python using Snakemake and Conda that improves its installation, error handling, speed, stability, and extensibility. McClintock 2 now includes 12 short-read TE detectors, auxiliary pre-processing and analysis modules, interactive HTML reports, and a simulation framework to reproducibly evaluate the accuracy of component TE detectors. When applied to the model microbial eukaryote Saccharomyces cerevisiae, we find substantial variation in the ability of McClintock 2 components to identify the precise locations of non-reference TE insertions, with RelocaTE2 showing the highest recall and precision in simulated data. We find that RelocaTE2, TEMP, TEMP2 and TEBreak provide consistent estimates of ∼50 non-reference TE insertions per strain and that Ty2 has the highest number of non-reference TE insertions in a species-wide panel of ∼1000 yeast genomes. Finally, we show that best-in-class predictors for yeast applied to resequencing data have sufficient resolution to reveal a dyad pattern of integration in nucleosome-bound regions upstream of yeast tRNA genes for Ty1, Ty2, and Ty4, allowing us to extend knowledge about fine-scale target preferences revealed previously for experimentally-induced Ty1 insertions to spontaneous insertions for other copia-superfamily retrotransposons in yeast.
CONCLUSION: McClintock ( https://github.com/bergmanlab/mcclintock/ ) provides a user-friendly pipeline for the identification of TEs in short-read WGS data using multiple TE detectors, which should benefit researchers studying TE insertion variation in a wide range of different organisms. Application of the improved McClintock system to simulated and empirical yeast genome data reveals best-in-class methods and novel biological insights for one of the most widely-studied model eukaryotes and provides a paradigm for evaluating and selecting non-reference TE detectors in other species.

19.
Mol Biol Evol ; 28(7): 1967-71, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21297157

ABSTRACT

The non-recombining Y chromosome is expected to degenerate over evolutionary time; however, gene gain is a common feature of Y chromosomes of mammals and Drosophila. Here, we report that a large palindrome containing interchromosomal segmental duplications is located in the vicinity of the first amplicon detected in the Y chromosome of D. melanogaster. The recent appearance of such amplicons suggests that duplications to the Y chromosome, followed by the amplification of the segmental duplications, are a mechanism for the continuing evolution of Drosophila Y chromosomes.


Subjects
Drosophila melanogaster/genetics , Gene Duplication , Insect Genes , Inverted Repeat Sequences , Y Chromosome , Animals , Molecular Evolution , Genetic Models , Molecular Sequence Data
20.
Bioinformatics ; 27(7): 980-6, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21325301

ABSTRACT

MOTIVATION: Increasing rates of publication and DNA sequencing make the problem of finding relevant articles for a particular gene or genomic region more challenging than ever. Existing text-mining approaches focus on finding gene names or identifiers in English text. These are often not unique and do not identify the exact genomic location of a study. RESULTS: Here, we report the results of a novel text-mining approach that extracts DNA sequences from biomedical articles and automatically maps them to genomic databases. We find that ∼20% of open access articles in PubMed Central (PMC) have extractable DNA sequences that can be accurately mapped to the correct gene (91%) and genome (96%). We illustrate the utility of data extracted by text2genome from more than 150 000 PMC articles for the interpretation of ChIP-seq data and the design of quantitative reverse transcriptase (RT)-PCR experiments. CONCLUSION: Our approach links articles to genes and organisms without relying on gene names or identifiers. It also produces genome annotation tracks of the biomedical literature, thereby allowing researchers to use the power of modern genome browsers to access and analyze publications in the context of genomic data. AVAILABILITY AND IMPLEMENTATION: Source code is available under a BSD license from http://sourceforge.net/projects/text2genome/ and results can be browsed and downloaded at http://text2genome.org.


Subjects
DNA/chemistry , Data Mining/methods , Genes , Genome , Molecular Sequence Annotation , PubMed , Base Sequence , Chromatin Immunoprecipitation , Nucleic Acid Databases , Reverse Transcriptase Polymerase Chain Reaction , DNA Sequence Analysis , Software