ABSTRACT
A major challenge in genetics is to identify genetic variants driving natural phenotypic variation. However, current methods of genetic mapping have limited resolution. To address this challenge, we developed a CRISPR-Cas9-based high-throughput genome editing approach that can introduce thousands of specific genetic variants in a single experiment. This enabled us to study the fitness consequences of 16,006 natural genetic variants in yeast. We identified 572 variants with significant fitness differences in glucose media; these are highly enriched in promoters, particularly in transcription factor binding sites, while only 19.2% affect amino acid sequences. Strikingly, nearby variants nearly always favor the same parent's alleles, suggesting that lineage-specific selection is often driven by multiple clustered variants. In sum, our genome editing approach reveals the genetic architecture of fitness variation at single-base resolution and could be adapted to measure the effects of genome-wide genetic variation in any screen for cell survival or cell-sortable markers.
Subjects
Gene Editing/methods, High-Throughput Nucleotide Sequencing/methods, Saccharomyces cerevisiae/genetics, CRISPR-Cas Systems, Chromosome Mapping, Clustered Regularly Interspaced Short Palindromic Repeats/genetics, Genetic Variation/genetics, Genetic Vectors, Genome, Yeasts/genetics
ABSTRACT
Cis-regulatory elements such as transcription factor (TF) binding sites can be identified genome-wide, but it remains far more challenging to pinpoint genetic variants affecting TF binding. Here, we introduce a pooling-based approach to mapping quantitative trait loci (QTLs) for molecular-level traits. Applying this to five TFs and a histone modification, we mapped thousands of cis-acting QTLs, with over 25-fold lower cost compared to standard QTL mapping. We found that single genetic variants frequently affect binding of multiple TFs, and CTCF can recruit all five TFs to its binding sites. These QTLs often affect local chromatin and transcription but can also influence long-range chromosomal contacts, demonstrating a role for natural genetic variation in chromosomal architecture. Thousands of these QTLs have been implicated in genome-wide association studies, providing candidate molecular mechanisms for many disease risk loci and suggesting that TF binding variation may underlie a large fraction of human phenotypic variation.
Subjects
Chromatin Immunoprecipitation/methods, High-Throughput Nucleotide Sequencing/methods, Single Nucleotide Polymorphism, Quantitative Trait Loci, DNA Sequence Analysis/methods, Transcription Factors/metabolism, Genetic Predisposition to Disease, Histone Code, Humans
ABSTRACT
Among primates, humans display a unique trajectory of development that is responsible for the many traits specific to our species. However, the inaccessibility of primary human and chimpanzee tissues has limited our ability to study human evolution. Comparative in vitro approaches using primate-derived induced pluripotent stem cells have begun to reveal species differences on the cellular and molecular levels [1,2]. In particular, brain organoids have emerged as a promising platform to study primate neural development in vitro [3-5], although cross-species comparisons of organoids are complicated by differences in developmental timing and variability of differentiation [6,7]. Here we develop a new platform to address these limitations by fusing human and chimpanzee induced pluripotent stem cells to generate a panel of tetraploid hybrid stem cells. We applied this approach to study species divergence in cerebral cortical development by differentiating these cells into neural organoids. We found that hybrid organoids provide a controlled system for disentangling cis- and trans-acting gene-expression divergence across cell types and developmental stages, revealing a signature of selection on astrocyte-related genes. In addition, we identified an upregulation of the human somatostatin receptor 2 gene (SSTR2), which regulates neuronal calcium signalling and is associated with neuropsychiatric disorders [8,9]. We reveal a human-specific response to modulation of SSTR2 function in cortical neurons, underscoring the potential of this platform for elucidating the molecular basis of human evolution.
Subjects
Cell Fusion, Developmental Gene Expression Regulation, Hybrid Cells/cytology, Induced Pluripotent Stem Cells/cytology, Neurogenesis/genetics, Alleles, Animals, Astrocytes/cytology, Calcium Signaling, Cerebral Cortex/cytology, Female, Humans, Male, Neurons/cytology, Organoids/cytology, Pan troglodytes/genetics, Somatostatin Receptors/genetics, Reproducibility of Results, Genetic Transcription
ABSTRACT
Cis-regulatory changes are thought to play a major role in adaptation. Threespine sticklebacks have repeatedly colonized freshwater habitats in the Northern Hemisphere, where they have evolved a suite of phenotypes that distinguish them from marine populations, including changes in physiology, behavior, and morphology. To understand the role of gene regulatory evolution in adaptive divergence, here we investigate cis-regulatory changes in gene expression between marine and freshwater ecotypes through allele-specific expression (ASE) in F1 hybrids. Surveying seven ecologically relevant tissues, including three sampled across two developmental stages, we identified cis-regulatory divergence affecting a third of genes, nearly half of which were tissue-specific. Next, we compared allele-specific expression in dental tissues at two timepoints to characterize cis-regulatory changes during development between marine and freshwater fish. Applying a genome-wide test for selection on cis-regulatory changes, we find evidence for lineage-specific selection on several processes between ecotypes, including the Wnt signaling pathway in dental tissues. Finally, we show that genes with ASE, particularly those that are tissue-specific, are strongly enriched in genomic regions of repeated marine-freshwater divergence, supporting an important role for these cis-regulatory differences in parallel adaptive evolution of sticklebacks to freshwater habitats. Altogether, our results provide insight into the cis-regulatory landscape of divergence between stickleback ecotypes across tissues and during development, and support a fundamental role for tissue-specific cis-regulatory changes in rapid adaptation to new environments.
Subjects
Smegmamorpha, Animals, Smegmamorpha/genetics, Fresh Water, Physiological Adaptation/genetics, Genome, Acclimatization
ABSTRACT
Assembly-line polyketide synthases (PKSs) are large and complex enzymatic machineries with a multimodular architecture, typically encoded in bacterial genomes by biosynthetic gene clusters. Their modularity has led to an astounding diversity of biosynthesized molecules, many with medical relevance. Thus, understanding the mechanisms that drive PKS evolution is fundamental for both functional prediction of natural PKSs as well as for the engineering of novel PKSs. Here, we describe a repetitive genetic element in assembly-line PKS genes which appears to play a role in accelerating the diversification of closely related biosynthetic clusters. We named this element GRINS: genetic repeats of intense nucleotide skews. GRINS appear to recode PKS protein regions with a biased nucleotide composition and to promote gene conversion. GRINS are present in a large number of assembly-line PKS gene clusters and are particularly widespread in the actinobacterial genus Streptomyces. While the molecular mechanisms associated with GRINS appearance, dissemination, and maintenance are unknown, the presence of GRINS in a broad range of bacterial phyla and gene families indicates that these genetic elements could play a fundamental role in protein evolution.
Subjects
Genetic Variation, Polyketide Synthases/genetics, Repetitive Nucleic Acid Sequences/genetics, Base Sequence, Molecular Evolution, Gene Conversion, Bacterial Genome, Multigene Family, Nucleotides/genetics, Phylogeny, Polyketide Synthases/chemistry, Protein Domains, Streptomyces/enzymology, Streptomyces/genetics
ABSTRACT
Candida albicans is the most common cause of systemic fungal infections in humans and is considerably more virulent than its closest known relative, Candida dubliniensis. To investigate this difference, we constructed interspecies hybrids and quantified mRNA levels produced from each genome in the hybrid. This approach systematically identified expression differences in orthologous genes arising from cis-regulatory sequence changes that accumulated since the two species last shared a common ancestor, some 10 million years ago. We documented many orthologous gene-expression differences between the two species, and we pursued one striking observation: All 15 genes coding for the enzymes of glycolysis showed higher expression from the C. albicans genome than the C. dubliniensis genome in the interspecies hybrid. This pattern requires evolutionary changes to have occurred at each gene; the fact that they all act in the same direction strongly indicates lineage-specific natural selection as the underlying cause. To test whether these expression differences contribute to virulence, we created a C. dubliniensis strain in which all 15 glycolysis genes were expressed at modestly elevated levels and found that this strain had significantly increased virulence in the standard mouse model of systemic infection. These results indicate that small expression differences across a deeply conserved set of metabolism enzymes can play a significant role in the evolution of virulence in fungal pathogens.
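The directional argument in this abstract (all 15 glycolysis genes biased toward the same genome) can be illustrated with an exact two-sided binomial sign test. This is a minimal sketch, not the authors' analysis: the function names are hypothetical, and the null probability of 0.5 per gene and the independence of genes are assumptions made here for illustration, since the abstract does not state which background rate was used.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def sign_test_p(n_same, n_total, p_same=0.5):
    """Exact two-sided binomial p-value for a directional bias.

    n_same:  genes whose allele-specific expression favors one genome
    n_total: genes tested in the set
    p_same:  assumed null probability of favoring that genome (hypothetical)
    """
    pmf = [binom_pmf(k, n_total, p_same) for k in range(n_total + 1)]
    observed = pmf[n_same]
    # Sum every outcome that is at least as unlikely as the observed one.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


# 15 of 15 genes in the same direction, under the assumed 50/50 null.
print(sign_test_p(15, 15))  # about 6.1e-05
```

Under these assumptions, a 15-of-15 pattern is very unlikely to arise by chance, which is the intuition behind reading it as lineage-specific selection.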
Subjects
Biological Evolution, Candida/classification, Candida/genetics, Genetic Selection, Alleles, Candida/metabolism, Candida/pathogenicity, Candidiasis/microbiology, Computational Biology/methods, Gene Expression Profiling, Fungal Gene Expression Regulation, Gene Ontology, Fungal Genes, Genetic Hybridization, Virulence/genetics
ABSTRACT
RNA sequencing has been widely used as an essential tool to probe gene expression. While standard practices have been established to analyze RNA-seq data, it is still challenging to interpret and remove artifactual signals. Several biological and technical factors such as sex, age, batches, and sequencing technology have been found to bias these estimates. Probabilistic estimation of expression residuals (PEER), which infers broad variance components in gene expression measurements, has been used to account for some systematic effects, but it has remained challenging to interpret these PEER factors. Here we show that transcriptome diversity, a simple metric based on Shannon entropy, explains a large portion of variability in gene expression and is the strongest known factor encoded in PEER factors. We then show that transcriptome diversity has significant associations with multiple technical and biological variables across diverse organisms and datasets. In sum, transcriptome diversity provides a simple explanation for a major source of variation in both gene expression estimates and PEER covariates.
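As a rough illustration of a Shannon-entropy-based diversity metric, the sketch below computes the entropy of one sample's expression profile. The function name, the use of log base 2, and the choice of raw abundances as input are assumptions made here; the abstract does not specify the expression unit or whether the value is normalized.

```python
import math


def transcriptome_diversity(expression):
    """Shannon entropy (in bits) of one sample's expression profile.

    expression: non-negative abundances, one per gene (e.g. counts or TPM;
    the unit is an assumption of this sketch).
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("sample has no expressed genes")
    entropy = 0.0
    for value in expression:
        if value > 0:  # genes with zero expression contribute nothing
            p = value / total
            entropy -= p * math.log2(p)
    return entropy


# Four equally expressed genes give the maximum entropy for four genes.
print(transcriptome_diversity([5, 5, 5, 5]))  # 2.0
```

A sample dominated by a few highly expressed genes scores low, while a sample with expression spread evenly across many genes scores high.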
Subjects
RNA, Transcriptome, Gene Expression Profiling, High-Throughput Nucleotide Sequencing, RNA/genetics, RNA-Seq, RNA Sequence Analysis, Transcriptome/genetics, Exome Sequencing
ABSTRACT
Distinguishing which traits have evolved under natural selection, as opposed to neutral evolution, is a major goal of evolutionary biology. Several tests have been proposed to accomplish this, but these either rely on false assumptions or suffer from low power. Here, I introduce an approach to detecting selection that makes minimal assumptions and only requires phenotypic data from ~10 individuals. The test compares the phenotypic difference between two populations to what would be expected by chance under neutral evolution, which can be estimated from the phenotypic distribution of an F2 cross between those populations. Simulations show that the test is robust to variation in the number of loci affecting the trait, the distribution of locus effect sizes, heritability, dominance, and epistasis. Comparing its performance to the QTL sign test (an existing test of selection that requires both genotype and phenotype data), the new test achieves comparable power with 50- to 100-fold fewer individuals (and no genotype data). Applying the test to empirical data spanning over a century shows strong directional selection in many crops, as well as on naturally selected traits such as head shape in Hawaiian Drosophila and skin color in humans. Applied to gene expression data, the test reveals that the strength of stabilizing selection acting on mRNA levels in a species is strongly associated with that species' effective population size. In sum, this test is applicable to phenotypic data from almost any genetic cross, allowing selection to be detected more easily and powerfully than previously possible.
Subjects
Genetic Crosses, Genetic Models, Genetic Selection/genetics, Animals, Agricultural Crops/genetics, Drosophila/anatomy & histology, Drosophila/genetics, Molecular Evolution, Genetic Variation/genetics, Humans, Phenotype, Quantitative Trait Loci/genetics, Heritable Quantitative Trait, Saccharomyces cerevisiae/genetics, Skin Pigmentation/genetics
ABSTRACT
Interspecific hybrids have played a key role in research on gene expression regulation. A growing number of studies have measured genome-wide allele-specific expression in hybrids and observed that cis-regulatory changes often oppose trans-acting changes affecting the same genes, suggesting stabilizing selection for compensatory changes. However, the most common method for estimating these effects is biased, producing artifactual patterns of compensatory evolution. Here I introduce a simple modification leveraging biological replicates that ameliorates the bias.
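The bias described here can be seen in the usual decomposition, where cis is the log ratio of the two alleles in the hybrid and trans is the parental log ratio minus cis: any measurement error in cis enters trans with the opposite sign, which alone produces a negative cis-trans correlation. The sketch below shows one plausible reading of "leveraging biological replicates", in which the cis value reported and the cis value subtracted to obtain trans come from independent hybrid replicates. The abstract does not spell out the procedure, so the function and argument names, and the two-replicate layout, are assumptions.

```python
import math


def cis_trans_split(parent_a, parent_b, hybrid_rep1, hybrid_rep2):
    """Estimate cis and trans divergence for one gene on a log2 scale.

    parent_a, parent_b: expression in the two parental species
    hybrid_rep1, hybrid_rep2: (allele_a, allele_b) expression measured in
        two independent hybrid replicates (hypothetical input layout)
    """
    # cis effect: allelic ratio in the shared hybrid trans environment.
    cis = math.log2(hybrid_rep1[0] / hybrid_rep1[1])
    # Use a different replicate for the subtraction so the sampling error
    # in `cis` is not mirrored, with opposite sign, in `trans`.
    cis_independent = math.log2(hybrid_rep2[0] / hybrid_rep2[1])
    trans = math.log2(parent_a / parent_b) - cis_independent
    return cis, trans


# Parents differ 4-fold; hybrid alleles differ 2-fold in both replicates.
print(cis_trans_split(400, 100, (200, 100), (200, 100)))  # (1.0, 1.0)
```

With noisy data the two replicate estimates differ, and that difference is what keeps the errors in the cis and trans estimates from being artificially anti-correlated.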
Subjects
Molecular Evolution, Gene Expression Regulation/genetics, Genetic Variation, Animals, Gene Regulatory Networks/genetics, Genome
ABSTRACT
Spatial patterning of gene expression is a key process in development, yet how it evolves is still poorly understood. Both cis- and trans-acting changes could participate in complex interactions, so to isolate the cis-regulatory component of patterning evolution, we measured allele-specific spatial gene expression patterns in D. melanogaster × simulans hybrid embryos. RNA-seq of cryo-sectioned slices revealed 66 genes with strong spatially varying allele-specific expression. We found that hunchback, a major regulator of developmental patterning, had reduced expression of the D. simulans allele specifically in the anterior tip of hybrid embryos. Mathematical modeling of hunchback cis-regulation suggested a candidate transcription factor binding site variant, which we verified as causal using CRISPR-Cas9 genome editing. In sum, even comparing morphologically near-identical species we identified surprisingly extensive spatial variation in gene expression, suggesting not only that development is robust to many such changes, but also that natural selection may have ample raw material for evolving new body plans via changes in spatial patterning.
Subjects
Drosophila/genetics, Nonmammalian Embryo, Regulatory Nucleic Acid Sequences, Alleles, Animals, CRISPR-Cas Systems, Drosophila Proteins/genetics, Gene Editing, Gene Expression Profiling, Gene Expression Regulation, Gene Targeting, Genome-Wide Association Study, Single Nucleotide Polymorphism
ABSTRACT
Many behaviors are associated with heritable genetic variation [Kendler and Greenspan (2006) Am J Psychiatry 163:1683-1694]. Genetic mapping has revealed genomic regions or, in a few cases, specific genes explaining part of this variation [Bendesky and Bargmann (2011) Nat Rev Gen 12:809-820]. However, the genetic basis of behavioral evolution remains unclear. Here we investigate the evolution of an innate extended phenotype, bower building, among cichlid fishes of Lake Malawi. Males build bowers of two types, pits or castles, to attract females for mating. We performed comparative genome-wide analyses of 20 bower-building species and found that these phenotypes have evolved multiple times with thousands of genetic variants strongly associated with this behavior, suggesting a polygenic architecture. Remarkably, F1 hybrids of a pit-digging and a castle-building species perform sequential construction of first a pit and then a castle bower. Analysis of brain gene expression in these hybrids showed that genes near behavior-associated variants display behavior-dependent allele-specific expression with preferential expression of the pit-digging species allele during pit digging and of the castle-building species allele during castle building. These genes are highly enriched for functions related to neurodevelopment and neural plasticity. Our results suggest that natural behaviors are associated with complex genetic architectures that alter behavior via cis-regulatory differences whose effects on gene expression are specific to the behavior itself.
Subjects
Animal Behavior/physiology, Cichlids/genetics, Animals, Chromosome Mapping, Gene Expression, Gene Expression Regulation/genetics, Genetic Variation/genetics, Genome/genetics, Genome-Wide Association Study, Lakes, Malawi, Male
ABSTRACT
Genetic variants affecting gene-expression levels are a major source of phenotypic variation. The approximate locations of these variants can be mapped as expression quantitative trait loci (eQTLs); however, a major limitation of eQTLs is their low resolution, which precludes investigation of the causal variants and their molecular mechanisms. Here we report RNA-seq and full genome sequences for 85 diverse isolates of the yeast Saccharomyces cerevisiae (including wild, domesticated, and human clinical strains), which allowed us to perform eQTL mapping with 50-fold higher resolution than previously possible. In addition to variants in promoters, we uncovered an important role for variants in 3'UTRs, especially those affecting binding of the PUF family of RNA-binding proteins. The eQTLs are predominantly under negative selection, particularly those affecting essential genes and conserved genes. However, applying the sign test for lineage-specific selection revealed the polygenic up-regulation of dozens of biofilm suppressor genes in strains isolated from human patients, consistent with the key role of biofilms in fungal pathogenicity. In addition, a single variant in the promoter of a biofilm suppressor, NIT3, showed the strongest genome-wide association with clinical origin. Altogether, our results demonstrate the power of high-resolution eQTL mapping in understanding the molecular mechanisms of regulatory variation, as well as the natural selection acting on this variation that drives adaptation to environments, ranging from laboratories to vineyards to the human body.
Subjects
Fungal Chromosomes, Fungal Gene Expression Regulation, Genetic Variation, Quantitative Trait Loci, Saccharomyces cerevisiae/genetics, Biofilms, Chromosome Mapping, Genome-Wide Association Study, Humans, Mycoses/microbiology, Fungal RNA, Regulatory Nucleic Acid Sequences, Genetic Selection, RNA Sequence Analysis
ABSTRACT
Sun-exposure is a key environmental variable in the study of human evolution. Several skin-pigmentation genes serve as classical examples of positive selection, suggesting that sun-exposure has significantly shaped worldwide genomic variation. Here we investigate the interaction between genetic variation and sun-exposure, and how this impacts gene expression regulation. Using RNA-Seq data from 607 human skin samples, we identified thousands of transcripts that are differentially expressed between sun-exposed skin and non-sun-exposed skin. We then tested whether genetic variants may influence each individual's gene expression response to sun-exposure. Our analysis revealed 10 sun-exposure-dependent gene expression quantitative trait loci (se-eQTLs), including genes involved in skin pigmentation (SLC45A2) and epidermal differentiation (RASSF9). The allele frequencies of the RASSF9 se-eQTL across diverse populations correlate with the magnitude of solar radiation experienced by these populations, suggesting local adaptation to varying levels of sunlight. These results provide the first examples of sun-exposure-dependent regulatory variation and suggest that this variation has contributed to recent human adaptation.
Subjects
Neoplasm Antigens/genetics, Membrane Transport Proteins/genetics, Skin Diseases/genetics, Skin Pigmentation/genetics, Sunlight/adverse effects, Vesicular Transport Proteins/genetics, Neoplasm Antigens/biosynthesis, Cell Differentiation/genetics, Cell Differentiation/radiation effects, Epidermis/metabolism, Epidermis/radiation effects, Female, Gene Expression Regulation/radiation effects, Gene Frequency, Genetic Association Studies, High-Throughput Nucleotide Sequencing, Humans, Male, Membrane Transport Proteins/biosynthesis, Quantitative Trait Loci/genetics, Skin/physiopathology, Skin/radiation effects, Skin Diseases/etiology, Skin Diseases/physiopathology, Skin Pigmentation/radiation effects, Vesicular Transport Proteins/biosynthesis
ABSTRACT
DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. We found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.
Subjects
Chromosome Mapping, DNA Methylation, Single Nucleotide Polymorphism, Alleles, Cell Line, Computational Biology, Computer Simulation, Genetic Databases, Genetic Epigenesis, Gene Expression Regulation, Gene Library, Genetic Association Studies, Human Genome, Genotype, Humans, Phenotype, Quantitative Trait Loci, Reproducibility of Results, DNA Sequence Analysis
ABSTRACT
Although single genes underlying several evolutionary adaptations have been identified, the genetic basis of complex, polygenic adaptations has been far more challenging to pinpoint. Here we report that the budding yeast Saccharomyces paradoxus has recently evolved resistance to citrinin, a naturally occurring mycotoxin. Applying a genome-wide test for selection on cis-regulation, we identified five genes involved in the citrinin response that are constitutively up-regulated in S. paradoxus. Four of these genes are necessary for resistance, and are also sufficient to increase the resistance of a sensitive strain when over-expressed. Moreover, cis-regulatory divergence in the promoters of these genes contributes to resistance, while exacting a cost in the absence of citrinin. Our results demonstrate how the subtle effects of individual regulatory elements can be combined, via natural selection, into a complex adaptation. Our approach can be applied to dissect the genetic basis of polygenic adaptations in a wide range of species.
Subjects
Physiological Adaptation/genetics, Genetic Fitness, Genetic Promoter Regions, Saccharomyces/genetics, Antifungal Agents/toxicity, Citrinin/toxicity, Fungal Drug Resistance/genetics, Fungal Genes, Saccharomyces/drug effects, Saccharomyces/metabolism, Genetic Selection
ABSTRACT
In addition to coding for proteins, exons can also impact transcription by encoding regulatory elements such as enhancers. It has been debated whether such features confer heightened selective constraint, or evolve neutrally. We have addressed this question by developing a new approach to disentangle the sources of selection acting on exonic enhancers, in which we model the evolutionary rates of every possible substitution as a function of their effects on both protein sequence and enhancer activity. In three exonic enhancers, we found no significant association between evolutionary rates and effects on enhancer activity. This suggests that despite having biochemical activity, these exonic enhancers have no detectable selective constraint, and thus are unlikely to play a major role in protein evolution.
Subjects
Genetic Enhancer Elements, Molecular Evolution, Exons, Genetic Selection, Gene Expression Regulation, Phylogeny
ABSTRACT
Despite the greater functional importance of protein levels, our knowledge of gene expression evolution is based almost entirely on studies of mRNA levels. In contrast, our understanding of how translational regulation evolves has lagged far behind. Here we have applied ribosome profiling, which measures both global mRNA levels and their translation rates, to two species of Saccharomyces yeast and their interspecific hybrid in order to assess the relative contributions of changes in mRNA abundance and translation to regulatory evolution. We report that both cis- and trans-acting regulatory divergence in translation are abundant, affecting at least 35% of genes. The majority of translational divergence acts to buffer changes in mRNA abundance, suggesting a widespread role for stabilizing selection acting across regulatory levels. Nevertheless, we observe evidence of lineage-specific selection acting on several yeast functional modules, including instances of reinforcing selection acting at both levels of regulation. Finally, we also uncover multiple instances of stop-codon readthrough that are conserved between species. Our analysis reveals the underappreciated complexity of post-transcriptional regulatory divergence and indicates that partitioning the search for the locus of selection into the binary categories of "coding" versus "regulatory" may overlook a significant source of selection, acting at multiple regulatory levels along the path from genotype to phenotype.
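The distinction between buffering and reinforcing divergence can be sketched by comparing the sign of the mRNA-level difference with the sign of the translational-efficiency difference for each gene. This is an illustrative simplification rather than the study's statistical model: translational efficiency is taken here as ribosome-footprint abundance divided by mRNA abundance, inputs are assumed to be already normalized, and the function name and the `min_abs_log2` cutoff are hypothetical.

```python
import math


def translation_divergence(mrna_a, mrna_b, ribo_a, ribo_b, min_abs_log2=0.5):
    """Classify one gene's divergence between two species (or two alleles).

    mrna_*: normalized mRNA abundance; ribo_*: normalized footprint abundance.
    Returns (mRNA log2 divergence, translation log2 divergence, label).
    """
    mrna_div = math.log2(mrna_a / mrna_b)
    # Translational efficiency = footprints per unit of mRNA.
    te_div = math.log2((ribo_a / mrna_a) / (ribo_b / mrna_b))
    if abs(mrna_div) < min_abs_log2 or abs(te_div) < min_abs_log2:
        label = "no joint divergence"
    elif (mrna_div > 0) != (te_div > 0):
        label = "buffering"  # translation offsets the mRNA change
    else:
        label = "reinforcing"  # both levels shift the same way
    return mrna_div, te_div, label


# Twice the mRNA but equal footprints: translation cancels the mRNA change.
print(translation_divergence(200, 100, 100, 100))  # (1.0, -1.0, 'buffering')
```

In a real analysis, the hard cutoff would be replaced by a test that accounts for count noise and replicate variance.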
Subjects
Molecular Evolution, Fungal RNA/genetics, Messenger RNA/genetics, Ribosomes/genetics, Saccharomyces/classification, Saccharomyces/genetics, Codon, Gene Expression Profiling, Fungal Gene Expression Regulation, Fungal Genes, Fungal Genome, Genetic Models, Phylogeny, Transcriptional Regulatory Elements, Species Specificity
ABSTRACT
The recent advent of ribosome profiling (sequencing of short ribosome-bound fragments of mRNA) has offered an unprecedented opportunity to interrogate the sequence features responsible for modulating translational rates. Nevertheless, numerous analyses of the first riboprofiling data set have produced equivocal and often incompatible results. Here we analyze three independent yeast riboprofiling data sets, including two with much higher coverage than previously available, and find that all three show substantial technical sequence biases that confound interpretations of ribosomal occupancy. After accounting for these biases, we find no effect of previously implicated factors on ribosomal pausing. Rather, we find that incorporation of proline, whose unique side-chain stalls peptide synthesis in vitro, also slows the ribosome in vivo. We also reanalyze a method that implicated positively charged amino acids as the major determinant of ribosomal stalling and demonstrate that it produces false signals of stalling in low-coverage data. Our results suggest that any analysis of riboprofiling data should account for sequencing biases and sparse coverage. To this end, we establish a robust methodology that enables analysis of ribosome profiling data without prior assumptions regarding which positions spanned by the ribosome cause stalling.