ABSTRACT
Long-read sequencing is driving rapid progress in genome assembly across all major groups of life, including species of the family Drosophilidae, a longtime model system for genetics, genomics, and evolution. We previously developed a cost-effective hybrid Oxford Nanopore (ONT) long-read and Illumina short-read sequencing approach and used it to assemble 101 drosophilid genomes from laboratory cultures, greatly increasing the number of genome assemblies for this taxonomic group. The next major challenge is to address the laboratory culture bias in taxon sampling by sequencing genomes of species that cannot easily be reared in the lab. Here, we build upon our previous methods to perform amplification-free ONT sequencing of single wild flies obtained either directly from the field or from ethanol-preserved specimens in museum collections, greatly improving the representation of lesser-studied drosophilid taxa in whole-genome data. Using Illumina NovaSeq X Plus and ONT P2 sequencers with R10.4.1 chemistry, we set a new benchmark for inexpensive hybrid genome assembly at US $150 per genome while assembling genomes from as little as 35 ng of genomic DNA from a single fly. We present 183 new genome assemblies for 179 species as a resource for drosophilid systematics, phylogenetics, and comparative genomics. Of these genomes, 62 are from pooled lab strains and 121 from single adult flies. Despite the sample limitations of working with small insects, most single-fly diploid assemblies are comparable in contiguity (>1 Mb contig N50), completeness (>98% complete dipteran BUSCOs), and accuracy (>QV40 genome-wide with ONT R10.4.1) to assemblies from inbred lines. We present a well-resolved multi-locus phylogeny for 360 drosophilid and 4 outgroup species encompassing all publicly available (as of August 2023) genomes for this group. Finally, we present a Progressive Cactus whole-genome, reference-free alignment built from a subset of 298 suitably high-quality drosophilid genomes. The new assemblies and alignment, along with updated laboratory protocols and computational pipelines, are released as an open resource and as a tool for studying evolution at the scale of an entire insect family.
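For readers unfamiliar with the quality thresholds quoted in this abstract, the sketch below shows how a contig N50 and a Phred-scaled QV are conventionally defined. It is an illustration only, not the authors' pipeline; the function names and the toy contig lengths are assumptions made for this example.

```python
def contig_n50(lengths):
    """Return the N50: the length of the contig at which the running total,
    taken over contigs sorted from longest to shortest, first reaches
    half of the total assembly length."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


# Toy contig lengths (hypothetical, not taken from any assembly in the paper).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(contig_n50(toy_lengths))
print(qv_to_error_rate(40))
```

Under the Phred definition, QV40 corresponds to roughly one error per 10,000 bases.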
Subjects
Drosophilidae; Genome, Insect; Genomics; Phylogeny; Animals; Drosophilidae/genetics; Drosophilidae/classification; Genomics/methods; Sequence Analysis, DNA/methods; High-Throughput Nucleotide Sequencing/methods
ABSTRACT
Deciphering the molecular basis of pluripotency is fundamental to our understanding of development and embryonic stem cell function. Here, we report that TAF3, a TBP-associated core promoter factor, is highly enriched in ES cells. In this context, TAF3 is required for endoderm lineage differentiation and prevents premature specification of neuroectoderm and mesoderm. In addition to its role in the core promoter recognition complex TFIID, genome-wide binding studies reveal that TAF3 localizes to a subset of chromosomal regions bound by CTCF/cohesin that are selectively associated with genes upregulated by TAF3. Notably, CTCF directly recruits TAF3 to promoter distal sites and TAF3-dependent DNA looping is observed between the promoter distal sites and core promoters occupied by TAF3/CTCF/cohesin. Together, our findings support a new role of TAF3 in mediating long-range chromatin regulatory interactions that safeguard the finely-balanced transcriptional programs underlying pluripotency.
Subjects
Embryonic Stem Cells/metabolism; Gene Expression Regulation, Developmental; Homeodomain Proteins/metabolism; Transcription Factor TFIID/metabolism; Animals; CCCTC-Binding Factor; Cell Cycle Proteins/metabolism; Cell Proliferation; Chromosomal Proteins, Non-Histone/metabolism; Embryonic Stem Cells/cytology; Endoderm/cytology; Humans; Mice; Promoter Regions, Genetic; Repressor Proteins/metabolism; TATA-Binding Protein Associated Factors; Teratoma/metabolism; Teratoma/pathology; Transcription, Genetic; Cohesins
ABSTRACT
Intrinsically disordered regions (IDRs) are segments of proteins without stable three-dimensional structures. As this flexibility allows them to interact with diverse binding partners, IDRs play key roles in cell signaling and gene expression. Despite the prevalence and importance of IDRs in eukaryotic proteomes and various biological processes, associating them with specific molecular functions remains a significant challenge due to their high rates of sequence evolution. However, by comparing the observed values of various IDR-associated properties against those generated under a simulated model of evolution, a recent study found most IDRs across the entire yeast proteome contain conserved features. Furthermore, it showed clusters of IDRs with common "evolutionary signatures," i.e., patterns of conserved features, were associated with specific biological functions. To determine if similar patterns of conservation are found in the IDRs of other systems, in this work we applied a series of phylogenetic models to over 7,500 orthologous IDRs identified in the Drosophila genome to dissect the forces driving their evolution. By comparing models of unconstrained and constrained continuous trait evolution using the Brownian motion and Ornstein-Uhlenbeck models, respectively, we identified signals of widespread constraint, indicating that conservation of distributed features is a mechanism of IDR evolution common to multiple biological systems. In contrast to the previous study in yeast, however, we observed limited evidence of IDR clusters with specific biological functions, which suggests a more complex relationship between evolutionary constraints and function in the IDRs of multicellular organisms.
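As a brief reminder of how the two trait-evolution models named in this abstract differ, their standard textbook forms can be written as below. The symbols are the conventional ones and are assumptions of this note, not notation taken from the paper: $X_t$ is the trait value, $W_t$ a standard Wiener process, $\sigma$ the rate parameter, $\theta$ the optimum, and $\alpha$ the strength of the pull toward it.

```latex
% Brownian motion: unconstrained drift of the trait value
dX_t = \sigma \, dW_t

% Ornstein-Uhlenbeck: the same diffusion plus a restoring force toward \theta
dX_t = \alpha \, (\theta - X_t) \, dt + \sigma \, dW_t, \qquad \alpha > 0
```

Brownian motion is recovered from the Ornstein-Uhlenbeck model as $\alpha \to 0$, which is why a better fit of the latter is usually read as evidence of constraint.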
Subjects
Drosophila Proteins; Intrinsically Disordered Proteins; Drosophila melanogaster/genetics; Intrinsically Disordered Proteins/chemistry; Intrinsically Disordered Proteins/genetics; Intrinsically Disordered Proteins/metabolism; Drosophila Proteins/chemistry; Drosophila Proteins/genetics; Drosophila Proteins/metabolism; Evolution, Molecular; Sequence Homology; Amino Acid Sequence
ABSTRACT
Morphogen gradients direct the spatial patterning of developing embryos; however, the mechanisms by which these gradients are interpreted remain elusive. Here we used lattice light-sheet microscopy to perform in vivo single-molecule imaging in early Drosophila melanogaster embryos of the transcription factor Bicoid that forms a gradient and initiates patterning along the anteroposterior axis. In contrast to canonical models, we observed that Bicoid binds to DNA with a rapid off rate throughout the embryo such that its average occupancy at target loci is on-rate-dependent. We further observed Bicoid forming transient "hubs" of locally high density that facilitate binding as factor levels drop, including in the posterior, where we observed Bicoid binding despite vanishingly low protein levels. We propose that localized modulation of transcription factor on rates via clustering provides a general mechanism to facilitate binding to low-affinity targets and that this may be a prevalent feature of other developmental transcription factors.
Subjects
Drosophila Proteins/metabolism; Drosophila melanogaster/embryology; Homeodomain Proteins/metabolism; Trans-Activators/metabolism; Animals; Body Patterning/physiology; Chromatin/metabolism; Drosophila Proteins/chemistry; Drosophila Proteins/ultrastructure; Drosophila melanogaster/metabolism; Embryo, Nonmammalian; Homeodomain Proteins/chemistry; Homeodomain Proteins/ultrastructure; Nuclear Proteins; Protein Binding; Single Molecule Imaging; Trans-Activators/chemistry; Trans-Activators/ultrastructure; Transcription Factors/metabolism
ABSTRACT
To fully understand animal transcription networks, it is essential to accurately measure the spatial and temporal expression patterns of transcription factors and their targets. We describe a registration technique that takes image-based data from hundreds of Drosophila blastoderm embryos, each costained for a reference gene and one of a set of genes of interest, and builds a model VirtualEmbryo. This model captures in a common framework the average expression patterns for many genes in spite of significant variation in morphology and expression between individual embryos. We establish the method's accuracy by showing that relationships between a pair of genes' expression inferred from the model are nearly identical to those measured in embryos costained for the pair. We present a VirtualEmbryo containing data for 95 genes at six time cohorts. We show that known gene-regulatory interactions can be automatically recovered from this data set and predict hundreds of new interactions.
Subjects
Drosophila melanogaster/genetics; Gene Regulatory Networks; Models, Genetic; Animals; Blastoderm; Drosophila melanogaster/metabolism; Embryo, Nonmammalian/metabolism; Gene Expression Regulation, Developmental; Genes, Insect
ABSTRACT
The Drosophila montium species group is a clade of 94 named species, closely related to the model species D. melanogaster. The montium species group is distributed over a broad geographic range throughout Asia, Africa, and Australasia. Species of this group possess a wide range of morphologies, mating behaviors, and endosymbiont associations, making this clade useful for comparative analyses. We use genomic data from 42 available species to estimate the phylogeny and relative divergence times within the montium species group, and its relative divergence time from D. melanogaster. To assess the robustness of our phylogenetic inferences, we use 3 non-overlapping sets of 20 single-copy coding sequences and analyze all 60 genes with both Bayesian and maximum likelihood methods. Our analyses support monophyly of the group. Apart from the uncertain placement of a single species, D. baimaii, our analyses also support the monophyly of all seven subgroups proposed within the montium group. Our phylograms and relative chronograms provide a highly resolved species tree, with discordance restricted to estimates of relatively short branches deep in the tree. In contrast, age estimates for the montium crown group, relative to its divergence from D. melanogaster, depend critically on prior assumptions concerning variation in rates of molecular evolution across branches, and hence have not been reliably determined. We discuss methodological issues that limit phylogenetic resolution - even when complete genome sequences are available - as well as the utility of the current phylogeny for understanding the evolutionary and biogeographic history of this clade.
Subjects
Drosophila/classification; Animals; Bayes Theorem; DNA/chemistry; DNA/isolation & purification; DNA/metabolism; Drosophila/genetics; Drosophila Proteins/classification; Drosophila Proteins/genetics; Drosophila melanogaster/classification; Drosophila melanogaster/genetics; Evolution, Molecular; Phylogeny; Sequence Analysis, DNA
ABSTRACT
As the Drosophila embryo transitions from the use of maternal RNAs to zygotic transcription, domains of open chromatin, with relatively low nucleosome density and specific histone marks, are established at promoters and enhancers involved in patterned embryonic transcription. However, it remains unclear how regions of activity are established during early embryogenesis, and whether they are the product of spatially restricted or ubiquitous processes. To shed light on this question, we probed chromatin accessibility across the anterior-posterior (A-P) axis of early Drosophila melanogaster embryos by applying a transposon-based assay for chromatin accessibility (ATAC-seq) to anterior and posterior halves of hand-dissected, cellular blastoderm embryos. We find that genome-wide chromatin accessibility is highly similar between the two halves, with regions that manifest significant accessibility in one half of the embryo almost always accessible in the other half, even for promoters that are active in exclusively one half of the embryo. These data support previous studies that show that chromatin accessibility is not a direct result of activity, and point to a role for ubiquitous factors or processes in establishing chromatin accessibility at promoters in the early embryo. However, in concordance with similar works, we observe at enhancers active exclusively in one half of the embryo a significant skew towards greater accessibility in the region of their activity, highlighting the role of patterning factors such as Bicoid in this process.
Subjects
Body Patterning/genetics; Chromatin/genetics; Drosophila melanogaster/genetics; Embryo, Nonmammalian/metabolism; Gene Expression Regulation, Developmental; Animals; Blastoderm/embryology; Blastoderm/metabolism; Drosophila Proteins/genetics; Drosophila melanogaster/embryology; Drosophila melanogaster/metabolism; Embryo, Nonmammalian/embryology; Enhancer Elements, Genetic/genetics; Homeodomain Proteins/genetics; Nucleosomes/genetics; Promoter Regions, Genetic/genetics; Trans-Activators/genetics
ABSTRACT
Changes in developmental gene regulatory networks enable evolved changes in morphology. These changes can be in cis regulatory elements that act in an allele-specific manner, or changes to the overall trans regulatory environment that interacts with cis regulatory sequences. Here we address several questions about the evolution of gene expression accompanying a convergently evolved constructive morphological trait, increases in tooth number in two independently derived freshwater populations of threespine stickleback fish (Gasterosteus aculeatus). Are convergently evolved cis and/or trans changes in gene expression associated with convergently evolved morphological evolution? Do cis or trans regulatory changes contribute more to gene expression changes accompanying an evolved morphological gain trait? Transcriptome data from dental tissue of ancestral low-toothed and two independently derived high-toothed stickleback populations revealed significantly shared gene expression changes that have convergently evolved in the two high-toothed populations. Comparing cis and trans regulatory changes using phased gene expression data from F1 hybrids, we found that trans regulatory changes were predominant and more likely to be shared among both high-toothed populations. In contrast, while cis regulatory changes have evolved in both high-toothed populations, overall these changes were distinct and not shared among high-toothed populations. Together these data suggest that a convergently evolved trait can occur through genetically distinct regulatory changes that converge on similar trans regulatory environments.
Subjects
Smegmamorpha/genetics; Alleles; Animals; Biological Evolution; Chromosome Mapping/methods; Evolution, Molecular; Gene Expression/genetics; Gene Expression Regulation, Developmental/genetics; Gene Frequency/genetics; Gene Regulatory Networks/genetics; Genotype; Phenotype; Quantitative Trait Loci; Tooth
ABSTRACT
During vertebrate neurulation, the embryonic ectoderm is patterned into lineage progenitors for neural plate, neural crest, placodes and epidermis. Here, we use Xenopus laevis embryos to analyze the spatial and temporal transcriptome of distinct ectodermal domains in the course of neurulation, during the establishment of cell lineages. In order to define the transcriptome of small groups of cells from a single germ layer and to retain spatial information, dorsal and ventral ectoderm was subdivided along the anterior-posterior and medial-lateral axes by microdissections. Principal component analysis on the transcriptomes of these ectoderm fragments primarily identifies embryonic axes and temporal dynamics. This provides a genetic code to define positional information of any ectoderm sample along the anterior-posterior and dorsal-ventral axes directly from its transcriptome. In parallel, we use nonnegative matrix factorization to predict enhanced gene expression maps onto early and mid-neurula embryos, and specific signatures for each ectoderm area. The clustering of spatial and temporal datasets allowed detection of multiple biologically relevant groups (e.g., Wnt signaling, neural crest development, sensory placode specification, ciliogenesis, germ layer specification). We provide an interactive network interface, EctoMap, for exploring synexpression relationships among genes expressed in the neurula, and suggest several strategies to use this comprehensive dataset to address questions in developmental biology as well as stem cell or cancer research.
Subjects
Ectoderm/embryology; Neural Crest/embryology; Neurons/cytology; Stem Cells/metabolism; Xenopus laevis/embryology; Algorithms; Animals; Cluster Analysis; Databases, Genetic; Ectoderm/metabolism; Gastrulation/genetics; Gene Expression Profiling; Gene Expression Regulation, Developmental; Gene Ontology; Gene Regulatory Networks; Humans; Internet; Microdissection; Neoplasms/genetics; Neural Crest/metabolism; Neurulation/genetics; Principal Component Analysis; Time Factors; Transcriptome/genetics; Wnt Proteins/metabolism; Xenopus laevis/genetics
ABSTRACT
The Drosophila athabasca species complex contains three recently diverged, prezygotically isolated semispecies (Western-Northern, Eastern-A, and Eastern-B) that are distributed across North America and share zones of sympatry. Inferences based on a handful of loci suggest that this complex might be an ideal system for studying the genetics of incipient speciation and the evolution of prezygotic isolating mechanisms, but patterns of differentiation have not been characterized systematically. Here, we assembled a draft genome for D. athabasca and analyzed whole-genome re-sequencing data for 28 individuals from across the species range to characterize genome-wide patterns of diversity and population differentiation among semispecies. Patterns of differentiation on the X chromosome vs. autosomes vary, with the X chromosome showing better phylogenetic resolution and increased levels of between-semispecies divergence. Despite low levels of overall differentiation and a lack of phylogenetic resolution of the autosomes for the most closely related semispecies, individuals do exhibit distinct genetic clustering. Demographic analyses provide some support for a model of isolation with migration within D. athabasca, with divergence times <20 kya. The young divergence times of the semispecies of D. athabasca, together with strong levels of sexual isolation, make them a promising system for studying the evolution of prezygotic isolation and speciation.
Subjects
Drosophila/genetics; Animals; Biological Evolution; Genetic Speciation; Genetic Variation/genetics; Genome; Genome, Insect/genetics; North America; Phylogeny; Reproductive Isolation; Sympatry/genetics; X Chromosome/genetics
ABSTRACT
Early embryogenesis is a unique developmental stage where genetic control of development is handed off from mother to zygote. Yet the contribution of this transition to the evolution of gene expression is poorly understood. Here we study two aspects of gene expression specific to early embryogenesis in Drosophila: sex-biased gene expression prior to the onset of canonical X chromosomal dosage compensation, and the contribution of maternally supplied mRNAs. We sequenced mRNAs from individual unfertilized eggs and precisely staged and sexed blastoderm embryos, and compared levels between D. melanogaster, D. yakuba, D. pseudoobscura and D. virilis. First, we find that mRNA content is highly conserved for a given stage and that studies relying on pooled embryos likely systematically overstate the degree of gene expression divergence. Unlike studies done on larvae and adults where most species show a larger proportion of genes with male-biased expression, we find that transcripts in Drosophila embryos are largely female-biased in all species, likely due to incomplete dosage compensation prior to the activation of the canonical dosage compensation mechanism. The divergence of sex-biased gene expression across species is observed to be often due to lineage-specific decrease of expression; the most drastic example of which is the overall reduction of male expression from the neo-X chromosome in D. pseudoobscura, leading to a pervasive female-bias on this chromosome. We see no evidence for a faster evolution of expression on the X chromosome in embryos (no "faster-X" effect), unlike in adults, and contrary to a previous study on pooled non-sexed embryos. Finally, we find that most genes are conserved in regard to their maternal or zygotic origin of transcription, and present evidence that differences in maternal contribution to the blastoderm transcript pool may be due to species-specific divergence of transcript degradation rates.
Subjects
Blastoderm/growth & development; Dosage Compensation, Genetic; Embryonic Development/genetics; Evolution, Molecular; RNA, Messenger/genetics; Animals; Drosophila melanogaster/genetics; Drosophila melanogaster/growth & development; Embryo, Nonmammalian; Female; Gene Expression Regulation, Developmental; Male; RNA, Messenger/biosynthesis; Sex Ratio; Species Specificity; X Chromosome/genetics
ABSTRACT
Accurate gene model annotation of reference genomes is critical for making them useful. The modENCODE project has improved the D. melanogaster genome annotation by using deep and diverse high-throughput data. Since transcriptional activity that has been evolutionarily conserved is likely to have an advantageous function, we have performed large-scale interspecific comparisons to increase confidence in predicted annotations. To support comparative genomics, we filled in divergence gaps in the Drosophila phylogeny by generating draft genomes for eight new species. For comparative transcriptome analysis, we generated mRNA expression profiles on 81 samples from multiple tissues and developmental stages of 15 Drosophila species, and we performed cap analysis of gene expression in D. melanogaster and D. pseudoobscura. We also describe conservation of four distinct core promoter structures composed of combinations of elements at three positions. Overall, each type of genomic feature shows a characteristic divergence rate relative to neutral models, highlighting the value of multispecies alignment in annotating a target genome that should prove useful in the annotation of other high priority genomes, especially human and other mammalian genomes that are rich in noncoding sequences. We report that the vast majority of elements in the annotation are evolutionarily conserved, indicating that the annotation will be an important springboard for functional genetic testing by the Drosophila community.
Subjects
Computational Biology/methods; Drosophila melanogaster/genetics; Gene Expression Profiling; Molecular Sequence Annotation; Transcriptome; Animals; Cluster Analysis; Drosophila melanogaster/classification; Evolution, Molecular; Exons; Female; Genome, Insect; Humans; Male; Nucleotide Motifs; Phylogeny; Position-Specific Scoring Matrices; Promoter Regions, Genetic; RNA Editing; RNA Splice Sites; RNA Splicing; Reproducibility of Results; Transcription Initiation Site
ABSTRACT
Heterogeneous nuclear ribonucleoproteins (hnRNPs) have been traditionally seen as proteins packaging RNA nonspecifically into ribonucleoprotein particles (RNPs), but evidence suggests specific cellular functions on discrete target pre-mRNAs. Here we report genome-wide analysis of alternative splicing patterns regulated by four Drosophila homologs of the mammalian hnRNP A/B family (hrp36, hrp38, hrp40, and hrp48). Analysis of the global RNA-binding distributions of each protein revealed both small and extensively bound regions on target transcripts. A significant subset of RNAs were bound and regulated by more than one hnRNP protein, revealing a combinatorial network of interactions. In vitro RNA-binding site selection experiments (SELEX) identified distinct binding motif specificities for each protein, which were overrepresented in their respective regulated and bound transcripts. These results indicate that individual heterogeneous ribonucleoproteins have specific affinities for overlapping, but distinct, populations of target pre-mRNAs controlling their patterns of RNA processing.
Subjects
Alternative Splicing/genetics; Drosophila/genetics; Genome, Insect; Heterogeneous-Nuclear Ribonucleoprotein Group A-B/genetics; Heterogeneous-Nuclear Ribonucleoprotein Group A-B/metabolism; RNA Precursors/metabolism; RNA, Messenger/metabolism; Animals; Base Sequence; Binding Sites; Cells, Cultured; Drosophila/metabolism; Molecular Sequence Data; RNA Precursors/genetics
ABSTRACT
Temperature affects both the timing and outcome of animal development, but the detailed effects of temperature on the progress of early development have been poorly characterized. To determine the impact of temperature on the order and timing of events during Drosophila melanogaster embryogenesis, we used time-lapse imaging to track the progress of embryos from shortly after egg laying through hatching at seven precisely maintained temperatures between 17.5 °C and 32.5 °C. We employed a combination of automated and manual annotation to determine when 36 milestones occurred in each embryo. D. melanogaster embryogenesis takes ~33 hours at 17.5 °C, and accelerates with increasing temperature to a low of 16 hours at 27.5 °C, above which embryogenesis slows slightly. Remarkably, while the total time of embryogenesis varies over twofold, the relative timing of events from cellularization through hatching is constant across temperatures. To further explore the relationship between temperature and embryogenesis, we expanded our analysis to cover ten additional Drosophila species of varying climatic origins. Six of these species, like D. melanogaster, are of tropical origin, and embryogenesis time at different temperatures was similar for them all. D. mojavensis, a sub-tropical fly, develops more slowly than the tropical species at lower temperatures, while D. virilis, a temperate fly, exhibits slower development at all temperatures. The alpine sister species D. persimilis and D. pseudoobscura develop as rapidly as tropical flies at cooler temperatures, but exhibit diminished acceleration above 22.5 °C and have drastically slowed development by 30 °C. Despite ranging from 13 hours for D. erecta at 30 °C to 46 hours for D. virilis at 17.5 °C, the relative timing of events from cellularization through hatching is constant across all species and temperatures examined here, suggesting the existence of a previously unrecognized timer controlling the progress of embryogenesis that has been tuned by natural selection as each species diverges.
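The claim that relative timing is constant can be made concrete with a small sketch. It rescales each embryo's milestone times so that cellularization maps to 0 and hatching to 1, which is one plausible way to compare embryos that develop at different absolute speeds. The function name, the milestone labels, and all timing values are hypothetical and are not data from the study.

```python
import math


def relative_timing(milestones, start="cellularization", end="hatching"):
    """Rescale milestone times (in hours) to fractions of the interval
    between the start and end milestones. Milestones falling outside
    that interval are dropped."""
    t0, t1 = milestones[start], milestones[end]
    span = t1 - t0
    if span <= 0:
        raise ValueError("end milestone must occur after start milestone")
    return {
        name: (t - t0) / span
        for name, t in milestones.items()
        if t0 <= t <= t1
    }


# Hypothetical embryos: the cool one develops exactly twice as slowly.
warm = {"cellularization": 2.0, "gastrulation": 3.0, "hatching": 16.0}
cool = {name: 2.0 * t for name, t in warm.items()}

rel_warm = relative_timing(warm)
rel_cool = relative_timing(cool)
same = all(math.isclose(rel_warm[k], rel_cool[k]) for k in rel_warm)
print(same)
```

If development were merely stretched or compressed uniformly by temperature, the rescaled values would match across conditions, as they do for these toy embryos.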
Subjects
Drosophila melanogaster/genetics; Embryonic Development/genetics; Selection, Genetic/genetics; Animals; Cold Temperature; Genetic Variation/genetics; Phylogeny; Species Specificity; Time-Lapse Imaging/methods
ABSTRACT
Sex chromosome dosage differences between females and males are a significant form of natural genetic variation in many species. Like many species with chromosomal sex determination, Drosophila females have two X chromosomes, while males have one X and one Y. Fusions of sex chromosomes with autosomes have occurred along the lineage leading to D. pseudoobscura and D. miranda. The resulting neo-sex chromosomes are gradually evolving the properties of sex chromosomes, and neo-X chromosomes are becoming targets for the molecular mechanisms that compensate for differences in X chromosome dose between sexes. We have previously shown that D. melanogaster possesses at least two dosage compensation mechanisms: the well-characterized MSL-mediated dosage compensation active in most somatic tissues, and another system active during early embryogenesis prior to the onset of MSL-mediated dosage compensation. To better understand the developmental constraints on sex chromosome gene expression and evolution, we sequenced mRNA from individual male and female embryos of D. pseudoobscura and D. miranda, from ~0.5 to 8 hours of development. Autosomal expression levels are highly conserved between these species. But, unlike D. melanogaster, we observe a general lack of dosage compensation in D. pseudoobscura and D. miranda prior to the onset of MSL-mediated dosage compensation. Thus, either there has been a lineage-specific gain or loss in early dosage compensation mechanism(s) or increasing X chromosome dose may strain dosage compensation systems and make them less effective. The extent of female bias on the X chromosomes decreases through developmental time with the establishment of MSL-mediated dosage compensation, but may do so more slowly in D. miranda than D. pseudoobscura. These results also prompt a number of questions about whether species with more sex-linked genes have more sex-specific phenotypes, and how much transcript level variance is tolerable during critical stages of development.
Subjects
Evolution, Molecular; Sex Characteristics; Sex Chromosomes/genetics; Sex Determination Processes; Animals; Dosage Compensation, Genetic; Drosophila melanogaster/genetics; Female; Gene Expression Regulation, Developmental; Humans; Male; Species Specificity
ABSTRACT
In many species, a dosage compensation complex (DCC) is targeted to X chromosomes of one sex to equalize levels of X-gene products between males (1X) and females (2X). Here we identify cis-acting regulatory elements that target the Caenorhabditis elegans X chromosome for repression by the DCC. The DCC binds to discrete, dispersed sites on X of two types. rex sites (recruitment elements on X) recruit the DCC in an autonomous, DNA sequence-dependent manner using a 12-base-pair (bp) consensus motif that is enriched on X. This motif is critical for DCC binding, is clustered in rex sites, and confers much of X-chromosome specificity. Motif variants enriched on X by 3.8-fold or more are highly predictive (95%) for rex sites. In contrast, dox sites (dependent on X) lack the X-enriched variants and cannot bind the DCC when detached from X. dox sites are more prevalent than rex sites and, unlike rex sites, reside preferentially in promoters of some expressed genes. These findings fulfill predictions for a targeting model in which the DCC binds to recruitment sites on X and disperses to discrete sites lacking autonomous recruitment ability. To relate DCC binding to function, we identified dosage-compensated and noncompensated genes on X. Unexpectedly, many genes of both types have bound DCC, but many do not, suggesting the DCC acts over long distances to repress X-gene expression. Remarkably, the DCC binds to autosomes, but at far fewer sites and rarely at consensus motifs. DCC disruption causes opposite effects on expression of X and autosomal genes. The DCC thus acts at a distance to impact expression throughout the genome.
Subjects
Adenosine Triphosphatases/metabolism; Caenorhabditis elegans/physiology; DNA-Binding Proteins/metabolism; Dosage Compensation, Genetic/physiology; Gene Expression Regulation, Developmental; Genome, Helminth/physiology; Multiprotein Complexes/metabolism; Animals; Caenorhabditis elegans/genetics; Caenorhabditis elegans/metabolism; Caenorhabditis elegans Proteins/metabolism; Consensus Sequence/genetics; Female; Genome, Helminth/genetics; Male; Protein Binding; Regulatory Elements, Transcriptional; X Chromosome/genetics; X Chromosome/metabolism
ABSTRACT
We create a new assembly of the Drosophila simulans genome using 142 million paired short-read sequences and previously published data for strain w(501). Our assembly represents a higher-quality genomic sequence with greater coverage, fewer misassemblies, and, by several indexes, fewer sequence errors. Evolutionary analysis of this genome reference sequence reveals interesting patterns of lineage-specific divergence that are different from those previously reported. Specifically, we find that Drosophila melanogaster evolves faster than D. simulans at all annotated classes of sites, including putatively neutrally evolving sites found in minimal introns. While this may be partly explained by a higher mutation rate in D. melanogaster, we also find significant heterogeneity in rates of evolution across classes of sites, consistent with historical differences in the effective population size for the two species. Also contrary to previous findings, we find that the X chromosome is evolving significantly faster than autosomes for nonsynonymous and most noncoding DNA sites and significantly slower for synonymous sites. The absence of an X/A difference for putatively neutral sites and the robustness of the pattern to Gene Ontology and sex-biased expression suggest that partly recessive beneficial mutations may comprise a substantial fraction of noncoding DNA divergence observed between species. Our results have more general implications for the interpretation of evolutionary analyses of genomes of different quality.
Subjects
Drosophila/genetics , Evolution, Molecular , Genome, Insect , Animals , Chromosomes, Insect/genetics , Contig Mapping , Introns , Mutation Rate , Phylogeny , Population/genetics , X Chromosome/genetics
ABSTRACT
The Drosophila embryo proceeds through thirteen mitotic divisions as a syncytium. Its nuclei distribute in the embryo's interior during the first six divisions, dividing synchronously with a cycle time of less than ten minutes. After seven divisions (nuclear cycle 8), the syncytial blastoderm forms as the nuclei approach the embryo surface and slow their cycle time; subsequent divisions proceed in waves that initiate at the poles. Because genetic studies have not identified zygotic mutants that affect the early divisions and because transcription has not been detected before cycle 8, the early, pre-blastoderm embryo has been considered to rely entirely on maternal contributions and to be transcriptionally silent. Our studies identified several abnormal phenotypes in live engrailed (en) mutant embryos prior to cycle 8, as well as a small group of genes that are transcribed in embryos prior to cycle 7. Nuclei in en embryos divide asynchronously, an abnormality that was detected as early as nuclear cycle 2-3. Anti-En antibody detected nuclear En protein in embryos at cycle 2, and expression of an En:GFP fusion protein encoded in the paternal genome was also detected in cycle 2 nuclei. These findings demonstrate that the Drosophila embryo is functionally competent for gene expression prior to the onset of its rapid nuclear divisions and that the embryo requires functions that are expressed in the zygote in order to faithfully prosecute its early, pre-cellularization mitotic cycles.
Subjects
Cell Division/genetics , Drosophila melanogaster/embryology , Homeodomain Proteins , Morphogenesis/genetics , Transcription Factors , Animals , Blastoderm/cytology , Cell Nucleus/genetics , Cell Nucleus/metabolism , Drosophila Proteins , Drosophila melanogaster/genetics , Embryo, Nonmammalian/cytology , Embryo, Nonmammalian/metabolism , Gene Expression Regulation, Developmental , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Mutation , Transcription Factors/genetics , Transcription Factors/metabolism , Zygote/cytology , Zygote/metabolism
ABSTRACT
To better characterize how variation in regulatory sequences drives divergence in gene expression, we undertook a systematic study of transcription factor binding and gene expression in blastoderm embryos of four species, which sample much of the diversity in the 40-million-year-old genus Drosophila: D. melanogaster, D. yakuba, D. pseudoobscura and D. virilis. We compared gene expression, measured by mRNA-seq, to the genome-wide binding, measured by ChIP-seq, of four transcription factors involved in early anterior-posterior patterning. We found that mRNA levels are much better conserved than individual transcription factor binding events, and that changes in a gene's expression were poorly explained by changes in adjacent transcription factor binding. However, highly bound sites, sites in regions bound by multiple factors and sites near genes are conserved more frequently than other binding events, suggesting that a considerable amount of transcription factor binding is weakly or non-functional and not subject to purifying selection.
Subjects
Drosophila melanogaster/embryology , Gene Expression Regulation, Developmental , Genetic Variation , Transcription Factors/genetics , Animals , Base Sequence , Binding Sites , Blastoderm/cytology , Blastoderm/growth & development , Blastoderm/metabolism , Conserved Sequence/genetics , Embryo, Nonmammalian , Enhancer Elements, Genetic , Protein Binding
ABSTRACT
Transcription factors have two functional constraints on their evolution: (1) their binding sites must have enough information to be distinguishable from all other sequences in the genome, and (2) they must bind these sites with an affinity that appropriately modulates the rate of transcription. Since both are determined by the biophysical properties of the DNA-binding domain, selection on one will ultimately affect the other. We were interested in understanding how plastic the informational and regulatory properties of a transcription factor are and how transcription factors evolve to balance these constraints. To study this, we developed an in vivo selection system in Escherichia coli to identify variants of the helix-turn-helix transcription factor MarA that bind different sets of binding sites with varying degrees of degeneracy. Unlike previous in vitro methods used to identify novel DNA binders and to probe the plasticity of the binding domain, our selections were done within the context of the initiation complex, selecting for both specific binding within the genome and for a physiologically significant strength of interaction to maintain function of the factor. Using MITOMI, quantitative PCR, and a binding site fitness assay, we characterized the binding, function, and fitness of some of these variants. We observed that a large range of binding preferences, information contents, and activities could be accessed with a few mutations, suggesting that transcriptional regulatory networks are highly adaptable and expandable.