ABSTRACT
Genetic variation segregating within a species reflects the combined activities of mutation, selection, and genetic drift. In the absence of selection, polymorphisms are expected to be a random subset of new mutations; thus, comparing the effects of polymorphisms and new mutations provides a test for selection. When evidence of selection exists, such comparisons can identify properties of mutations that are most likely to persist in natural populations. Here we investigate how mutation and selection have shaped variation in a cis-regulatory sequence controlling gene expression by empirically determining the effects of polymorphisms segregating in the TDH3 promoter among 85 strains of Saccharomyces cerevisiae and comparing their effects to a distribution of mutational effects defined by 236 point mutations in the same promoter. Surprisingly, we find that selection on expression noise (that is, variability in expression among genetically identical cells) appears to have had a greater impact on sequence variation in the TDH3 promoter than selection on mean expression level. This is not necessarily because variation in expression noise impacts fitness more than variation in mean expression level, but rather because of differences in the distributions of mutational effects for these two phenotypes. This study shows how systematically examining the effects of new mutations can enrich our understanding of evolutionary mechanisms. It also provides rare empirical evidence of selection acting on expression noise.
Subjects
Polymorphism, Genetic/genetics; Promoter Regions, Genetic/genetics; Saccharomyces cerevisiae/genetics; Selection, Genetic/genetics; Evolution, Molecular; Gene Expression Regulation, Fungal/genetics; Glyceraldehyde-3-Phosphate Dehydrogenase (Phosphorylating)/genetics; Mutation/genetics; Phenotype; Saccharomyces cerevisiae Proteins/genetics
ABSTRACT
Genetic variation within and between species can be shaped by population-level processes and mutation; however, the relative impact of "survival of the fittest" and "arrival of the fittest" on phenotypic evolution remains unclear. Assessing the influence of mutation on evolution requires understanding the relative rates of different types of mutations and their genetic properties, yet little is known about the functional consequences of new mutations. Here, we examine the spectrum of mutations affecting a focal gene in Saccharomyces cerevisiae by characterizing 231 novel haploid genotypes with altered activity of a fluorescent reporter gene. 7% of these genotypes had a nonsynonymous mutation in the coding sequence for the fluorescent protein and were classified as "coding" mutants; 2% had a change in the S. cerevisiae TDH3 promoter sequence controlling expression of the fluorescent protein and were classified as "cis-regulatory" mutants; 10% contained two copies of the reporter gene and were classified as "copy number" mutants; and the remaining 81% showed altered fluorescence without a change in the reporter gene itself and were classified as "trans-acting" mutants. As a group, coding mutants had the strongest effect on reporter gene activity and always decreased it. By contrast, 50%-95% of the mutants in each of the other three classes increased gene activity, with mutants affecting copy number and cis-regulatory sequences having larger median effects on gene activity than trans-acting mutants. When made heterozygous in diploid cells, coding, cis-regulatory, and copy number mutant genotypes all had significant effects on gene activity, whereas 88% of the trans-acting mutants appeared to be recessive. These differences in the frequency, effects, and dominance among functional classes of mutations might help explain why some types of mutations are found to be segregating within or fixed between species more often than others.
Subjects
DNA Copy Number Variations/genetics; Glyceraldehyde-3-Phosphate Dehydrogenase (Phosphorylating)/genetics; Mutation/genetics; Open Reading Frames/genetics; Regulatory Sequences, Nucleic Acid/genetics; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae/genetics; Evolution, Molecular; Genes, Dominant; Genes, Recessive; Genotype; Haploidy; Heterozygote; Mutation Rate; Promoter Regions, Genetic
ABSTRACT
Gene expression levels vary heritably, with approximately 25-35% of the loci affecting expression acting in cis. We characterized standing cis-regulatory variation among 16 wild-derived strains of Drosophila melanogaster. Our experiment's robust biological and technical replication enabled precise estimates of variation in allelic expression on a high-throughput SNP genotyping platform. We observed concordant, significant differential allelic expression (DAE) in 7 of 10 genes queried with multiple SNPs, and every member of a set of eight additional one-assay genes suggests significant DAE. Four of the high-confidence, multiple-assay genes harbor three or more statistically distinguishable allelic classes, often at intermediate frequency. Numerous intermediate-frequency, detectable regulatory polymorphisms cast doubt on a model in which cis-acting variation is a product of deleterious mutations of large effect. Comparing our data to predictions of population genetics theory using coalescent simulations, we estimate that a typical gene harbors 7-15 cis-regulatory sites (nucleotides) at which a selectively neutral mutation would elicit an observable expression phenotype. If standing cis-regulatory variation is actually slightly deleterious, the true mutational target size is larger.
Subjects
Drosophila melanogaster/genetics; Alleles; Animals; Animals, Wild/genetics; Female; Gene Expression Regulation; Genes, Insect; Genetic Complementation Test; Genetic Variation; Genetics, Population; Male; Mutation; Polymorphism, Single Nucleotide
ABSTRACT
Differences in gene expression are thought to be an important source of phenotypic diversity, so dissecting the genetic components of natural variation in gene expression is important for understanding the evolutionary mechanisms that lead to adaptation. Gene expression is a complex trait that, in diploid organisms, results from transcription of both maternal and paternal alleles. Directly measuring allelic expression rather than total gene expression offers greater insight into regulatory variation. The recent emergence of high-throughput sequencing offers an unprecedented opportunity to study allelic transcription at a genomic scale for virtually any species. By sequencing transcript pools derived from heterozygous individuals, estimates of allelic expression can be directly obtained. The statistical power of this approach is influenced by the number of transcripts sequenced and the ability to unambiguously assign individual sequence fragments to specific alleles on the basis of transcribed nucleotide polymorphisms. Here, using mathematical modelling and computer simulations, we determine the minimum sequencing depth required to accurately measure relative allelic expression and detect allelic imbalance via high-throughput sequencing under a variety of conditions. We conclude that, within a species, a minimum of 500-1000 sequencing reads per gene is needed to test for allelic imbalance, and consequently, at least 5 to 10 million reads are required for studying a genome expressing 10 000 genes. Finally, using 454 sequencing, we illustrate an application of allelic expression by testing for cis-regulatory divergence between closely related Drosophila species.
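The dependence of statistical power on sequencing depth described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes every read can be assigned to one of two alleles, that read counts are binomial, and that imbalance is tested with a normal approximation at a two-sided 5% level. The function name `imbalance_power` and the example allele fraction of 0.55 are hypothetical choices made for illustration only.

```python
import math
import random


def imbalance_power(depth, allele_fraction, z_crit=1.96, trials=1000, seed=0):
    """Estimate how often a 1:1 allelic ratio is rejected at a given read depth.

    All modelling choices here (binomial reads, z-test, thresholds) are
    illustrative assumptions, not taken from the abstract.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Number of reads assigned to allele A out of `depth` informative reads.
        reads_a = sum(rng.random() < allele_fraction for _ in range(depth))
        # Normal approximation to the binomial under the 1:1 null hypothesis.
        z = (reads_a - depth / 2) / math.sqrt(depth / 4)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    # Hypothetical scenario: allele A contributes 55% of transcripts.
    for depth in (50, 200, 1000):
        print(depth, imbalance_power(depth, allele_fraction=0.55))
```

Under these assumptions, the estimated power rises with depth for a fixed allele fraction, which is the qualitative relationship the abstract relies on; the specific thresholds it reports come from the authors' own modelling.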
Subjects
Alleles; Genomics/methods; Models, Statistical; Sequence Analysis, DNA/methods; Animals; Computer Simulation; Drosophila/genetics; Gene Expression; Gene Library; Polymorphism, Single Nucleotide
ABSTRACT
Currently, the relevance of common genetic variants (particularly those significantly associated with phenotypic variation in laboratory studies) to standing phenotypic variation in the wild is poorly understood. To address this, we quantified the relationship between achaete-scute complex (ASC) polymorphisms and Drosophila bristle number phenotypes in several new population samples. MC22 is a biallelic, nonrepetitive-length polymorphism 97 bp downstream of the scute transcript. It has been previously shown to be associated with sternopleural bristle number variation in both sexes in a set of isogenic lines. We replicated this association in a large cohort of wild-caught Drosophila melanogaster. We also detected a significant association at MC22 in an outbred population maintained under laboratory conditions for approximately 25 years, but the phenotypic effects in this sample were opposite from the direction estimated in the initial study. Finally, no significant associations were detected in a second large wild-caught cohort or in a set of 134 nearly isogenic lines. Our ability to repeat the initial association in wild samples suggests that it was not spurious. Nevertheless, inconsistent results from the other three panels suggest that the relationship between polymorphic genetic markers and loci contributing to continuous variation is not a simple one.
Subjects
Basic Helix-Loop-Helix Transcription Factors/genetics; DNA-Binding Proteins/genetics; Drosophila Proteins/genetics; Drosophila melanogaster/anatomy & histology; Drosophila melanogaster/genetics; Genes, Insect; Transcription Factors/genetics; Animal Structures/anatomy & histology; Animals; Animals, Wild/anatomy & histology; Animals, Wild/genetics; Base Sequence; California; DNA/genetics; Environment; Epistasis, Genetic; Female; Genetic Variation; Genetics, Population; Genotype; Linkage Disequilibrium; Male; Polymorphism, Genetic
ABSTRACT
BACKGROUND: Butterfly wing color patterns are a key model for integrating evolutionary developmental biology and the study of adaptive morphological evolution. Yet, despite the biological, economic and educational value of butterflies, they are still relatively under-represented in terms of available genomic resources. Here, we describe an Expressed Sequence Tag (EST) project for Bicyclus anynana that has identified the largest available collection to date of expressed genes for any butterfly. RESULTS: By targeting cDNAs from developing wings at the stages when pattern is specified, we biased gene discovery towards genes potentially involved in pattern formation. Assembly of 9,903 ESTs from a subtracted library allowed us to identify 4,251 genes, of which 2,461 were annotated based on BLAST analyses against relevant gene collections. Gene prediction software identified 2,202 peptides, of which 215 longer than 100 amino acids had no homology to any known proteins and, thus, potentially represent novel or highly diverged butterfly genes. We combined gene and Single Nucleotide Polymorphism (SNP) identification by constructing cDNA libraries from pools of outbred individuals, and by sequencing clones from the 3' end to maximize alignment depth. Alignments of multi-member contigs allowed us to identify over 14,000 putative SNPs, with 316 genes having at least one high-confidence double-hit SNP. We furthermore identified 320 microsatellites in transcribed genes that can potentially be used as genetic markers. CONCLUSION: Our project was designed to combine gene and sequence polymorphism discovery and has generated the largest gene collection available for any butterfly and many potential markers in expressed genes. These resources will be invaluable for exploring the potential of B. anynana in particular, and butterflies in general, as models in ecological, evolutionary, and developmental genetics.
Subjects
Butterflies/genetics; Expressed Sequence Tags; Wings, Animal/growth & development; Animals; Butterflies/anatomy & histology; Butterflies/growth & development; Databases, Genetic; Evolution, Molecular; Gene Library; Genes, Insect; Microsatellite Repeats; Polymorphism, Genetic; Sequence Analysis, DNA
ABSTRACT
Adiponectin is an abundant adipose tissue-derived protein with important metabolic effects. Plasma adiponectin levels are decreased in obese individuals, and low adiponectin levels predict insulin resistance and type 2 diabetes. Two variants in the adiponectin gene ACDC have been previously associated with plasma adiponectin levels, obesity, insulin resistance, and type 2 diabetes. To determine the role of genetic variation in ACDC in susceptibility to obesity and type 2 diabetes in Pima Indians, we screened the promoter, exons, and exon-intron boundaries of the gene to identify allelic variants. We identified 17 informative polymorphisms that comprised four common (minor allele frequency >15%) linkage disequilibrium clusters consisting of 1-4 variants each. We genotyped one representative polymorphism from each cluster in 1,338 individuals and assessed genotypic association with type 2 diabetes, BMI, serum lipid levels, serum adiponectin levels, and measures of insulin sensitivity and secretion. None of the ACDC variants were associated with type 2 diabetes, BMI, or measures of insulin sensitivity or secretion. One variant, single nucleotide polymorphism (SNP)-12823, was associated with serum adiponectin levels (P = 0.002), but this association explained only 2% of the variance of serum adiponectin levels. Our findings suggest that these common ACDC polymorphisms do not play a major role in susceptibility to obesity or type 2 diabetes in this population.
Subjects
Diabetes Mellitus, Type 2/genetics; Indians, North American/genetics; Intercellular Signaling Peptides and Proteins/genetics; Linkage Disequilibrium; Polymorphism, Genetic; Polymorphism, Single Nucleotide; Adiponectin; Arizona/epidemiology; Body Mass Index; Diabetes Mellitus, Type 2/blood; Diabetes Mellitus, Type 2/epidemiology; Diabetes Mellitus, Type 2/physiopathology; Genotype; Humans; Intercellular Signaling Peptides and Proteins/blood; Lipids/blood; Longitudinal Studies; Obesity/genetics; Pedigree; Prevalence
ABSTRACT
Genetic variants identified by mapping are biased toward large phenotypic effects because of methodologic challenges for detecting genetic variants with small phenotypic effects. Recently, bulk segregant analysis combined with next-generation sequencing (BSA-seq) was shown to be a powerful and cost-effective way to map small effect variants in natural populations. Here, we examine the power of BSA-seq for efficiently mapping small effect mutations isolated from a mutagenesis screen. Specifically, we determined the impact of segregant population size, intensity of phenotypic selection to collect segregants, number of mitotic generations between meiosis and sequencing, and average sequencing depth on power for mapping mutations with a range of effects on the phenotypic mean and standard deviation as well as relative fitness. We then used BSA-seq to map the mutations responsible for three ethyl methanesulfonate-induced mutant phenotypes in Saccharomyces cerevisiae. These mutants display small quantitative variation in the mean expression of a fluorescent reporter gene (-3%, +7%, and +10%). Using a genetic background with increased meiosis rate, a reliable mating type marker, and fluorescence-activated cell sorting to efficiently score large segregating populations and isolate cells with extreme phenotypes, we successfully mapped and functionally confirmed a single point mutation responsible for the mutant phenotype in all three cases. Our simulations and experimental data show that the effects of a causative site not only on the mean phenotype, but also on its standard deviation and relative fitness should be considered when mapping genetic variants in microorganisms such as yeast that require population growth steps for BSA-seq.
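The logic of bulk segregant analysis summarized in this abstract (select segregants with extreme phenotypes, then look for alleles enriched in the selected bulk) can be sketched as a toy simulation. This is a minimal sketch under stated assumptions, not the authors' simulation framework: haploid segregants inherit the mutation with probability 0.5, phenotypes are normally distributed, the phenotypic standard deviation of 0.05 is a hypothetical value, and effects on fitness or on phenotypic variance, which the abstract highlights, are ignored. The function name `bulk_allele_frequency` is introduced here for illustration.

```python
import random


def bulk_allele_frequency(n_segregants, effect, selected_fraction,
                          phenotype_sd=0.05, seed=0):
    """Frequency of the mutant allele among the highest-phenotype segregants.

    Assumes 1:1 segregation and Gaussian phenotypes around a wild-type mean
    of 1.0; `phenotype_sd` is a hypothetical value, not from the abstract.
    """
    rng = random.Random(seed)
    segregants = []
    for _ in range(n_segregants):
        carries_mutation = rng.random() < 0.5
        mean = 1.0 + (effect if carries_mutation else 0.0)
        segregants.append((rng.gauss(mean, phenotype_sd), carries_mutation))
    # Phenotypic selection: keep only the top fraction, as a sorter gate would.
    segregants.sort(key=lambda item: item[0], reverse=True)
    n_selected = max(1, int(n_segregants * selected_fraction))
    bulk = segregants[:n_selected]
    return sum(carrier for _, carrier in bulk) / n_selected


if __name__ == "__main__":
    # A +7% effect on mean expression is one of the effect sizes in the abstract.
    print(bulk_allele_frequency(20000, effect=0.07, selected_fraction=0.05))
    # With no effect, the bulk should stay near the 0.5 expected by chance.
    print(bulk_allele_frequency(20000, effect=0.0, selected_fraction=0.05))
```

In this simplified setting, a causal site shows up as an allele frequency well above 0.5 in the selected bulk, while unlinked sites stay near 0.5; how sharply the two separate depends on population size, selection intensity, and sequencing depth, the parameters the study varies.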
Subjects
Chromosome Mapping; Research Design; Saccharomyces cerevisiae/genetics; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Genes, Reporter; Genetic Variation; High-Throughput Nucleotide Sequencing; Luminescent Proteins/genetics; Luminescent Proteins/metabolism; Mutagenesis, Site-Directed; Sequence Analysis, DNA
ABSTRACT
Genome-wide association studies are providing exciting new insight into the genetics of complex disease, but oftentimes, the genomic regions associated with the trait of interest are large enough to contain several equally plausible candidate genes. Commonly, no obvious, putatively functional, polymorphisms are found to segregate. In most cases, therefore, functional evaluation of possible regulatory mechanisms is necessary to narrow the list of potential candidates. One approach to functional characterization of such variants is allelic expression (AE) profiling, which provides an assessment of transcriptional differences between two homologous transcripts. In AE, a heterozygous, transcribed single nucleotide polymorphism is used to quantify the relative transcript abundance between two gene copies. A ratio that differs significantly from 1:1 suggests that the sample may be heterozygous for a cis-acting regulatory allele. The pattern of observed cis-regulatory variation in the profiled candidate genes can thus narrow the list of candidates under an association signal substantially. In addition, AE is also an accessible and economical strategy, as it relies heavily upon standard techniques and equipment likely to be present in any disease-mapping laboratory.
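The 1:1 criterion described in this abstract can be made concrete with an exact test. The sketch below assumes the allele-specific measurements are available as integer counts for each of the two transcripts, which is an assumption; the abstract does not specify the assay or the statistical test. The function name `allelic_imbalance_pvalue` is hypothetical.

```python
from math import comb


def allelic_imbalance_pvalue(count_allele_1, count_allele_2):
    """Two-sided exact binomial p-value for departure from a 1:1 allelic ratio.

    Assumes allele-specific measurements are integer counts; this is an
    illustrative choice, not the method described in the abstract.
    """
    n = count_allele_1 + count_allele_2
    observed = comb(n, count_allele_1)
    # Under a 1:1 null every outcome carries the same 0.5**n factor, so
    # outcomes "at least as extreme" are those with a binomial coefficient
    # no larger than the observed one.
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2 ** n


if __name__ == "__main__":
    print(allelic_imbalance_pvalue(50, 50))  # balanced counts
    print(allelic_imbalance_pvalue(70, 30))  # skewed counts
```

A small p-value for a heterozygous individual is what the abstract interprets as a hint of a cis-acting regulatory allele; in practice, replicate measurements and a genomic DNA control for assay bias would also be needed.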
Subjects
Alleles; Gene Expression Profiling/methods; Genome-Wide Association Study; Genetic Predisposition to Disease/genetics; Humans; Polymorphism, Single Nucleotide/genetics; Regulatory Elements, Transcriptional/genetics
ABSTRACT
BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ~1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation.
Subjects
Butterflies/growth & development; Butterflies/genetics; Genes, Developmental/genetics; Genes, Insect/genetics; Molecular Sequence Annotation; Wings, Animal/growth & development; Wings, Animal/metabolism; Alcohol Dehydrogenase/genetics; Animals; Base Composition/genetics; Base Sequence; Bombyx/genetics; Chromosomes, Artificial, Bacterial/genetics; Computational Biology; Conserved Sequence/genetics; DNA Transposable Elements/genetics; DNA, Intergenic/genetics; Databases, Genetic; Expressed Sequence Tags; Gene Order/genetics; MicroRNAs/genetics; Molecular Sequence Data; Open Reading Frames/genetics; Phylogeny; Repetitive Sequences, Nucleic Acid/genetics; Reproducibility of Results; Sequence Homology, Nucleic Acid; Synteny/genetics
ABSTRACT
Positional cloning of genes underlying complex diseases, such as type 2 diabetes mellitus (T2DM), typically follows a two-tiered process in which a chromosomal region is first identified by genome-wide linkage scanning, followed by association analyses using densely spaced single nucleotide polymorphic markers to identify the causal variant(s). The success of genome-wide single nucleotide polymorphism (SNP) detection has resulted in a vast number of potential markers available for use in the construction of such dense SNP maps. However, the cost of genotyping large numbers of SNPs in appropriately sized samples is nearly prohibitive. We have explored pooled DNA genotyping as a means of identifying differences in allele frequency between pools of individuals with T2DM and unaffected controls by using Pyrosequencing technology. We found that allele frequencies in pooled DNA were strongly correlated with those in individuals (r=0.99, P<0.0001) across a wide range of allele frequencies (0.02-0.50). We further investigated the sensitivity of this method to detect allele frequency differences between contrived pools, also over a wide range of allele frequencies. We found that Pyrosequencing was able to detect an allele frequency difference of less than 2% between pools, indicating that this method may be sensitive enough for use in association studies involving complex diseases where a small difference in allele frequency between cases and controls is expected.
Subjects
DNA/genetics; DNA/isolation & purification; Gene Frequency/genetics; Polymorphism, Single Nucleotide/genetics; Sequence Analysis, DNA/methods; Base Sequence; Diabetes Mellitus/genetics; Female; Humans; Indians, North American/genetics; Male; Molecular Sequence Data; Polymerase Chain Reaction
ABSTRACT
Circulating levels of the cytokine interleukin 6 (IL-6) are elevated in obesity, correlate with body mass index (BMI), and predict the development of type 2 diabetes mellitus (T2DM). A promoter polymorphism in the IL6 gene is associated with obesity, altered levels of insulin sensitivity, and T2DM. IL-6 exerts its effects by binding to the IL-6 receptor (IL-6R) and levels of IL-6R have been correlated with BMI. It is possible that IL6R variants may also be related to obesity, but to our knowledge, no study has yet examined this relationship. The objective of this study was to examine the relationship between genetic variants in the IL6R gene and obesity in Pima Indians, a population prone to excess adiposity. We sequenced 6 kb of the IL6R gene, corresponding to all exons, exon-intron boundaries, and 2 kb of promoter in 30 Pima Indians. We identified six single nucleotide polymorphisms (SNPs) in the IL6R gene: a predicted Asp → Ala substitution at position 358, a variant in the 3'-untranslated region, and 4 intronic SNPs. All SNPs were in strong linkage disequilibrium (D' ≥ 0.90) and varied in minor allele frequency from 0.33 to 0.48. Association between IL6R genotype and BMI (kg/m²) was assessed in approximately 700 nondiabetic, full-heritage Pima Indians. For each SNP, individuals carrying the variant allele had a higher mean BMI compared to those with the wild-type allele (range: [37.3 ± 7.2 to 38.2 ± 7.0] vs. [35.5 ± 7.3 to 36.0 ± 7.5]; P = 0.02 to 0.004). Our findings suggest that genetic variants in the IL6R gene may play a role in susceptibility to obesity. Assessment of these SNPs in other populations will be useful to determine the magnitude of obesity risk.
Subjects
Genetic Variation; Indians, North American/genetics; Obesity/ethnology; Receptors, Interleukin-6/genetics; Arizona/epidemiology; Body Mass Index; DNA Primers; Gene Frequency; Genetic Testing; Genotype; Humans; Linkage Disequilibrium; Obesity/genetics; Polymorphism, Single Nucleotide/genetics; Sequence Analysis, DNA
ABSTRACT
Linkage analysis has identified a susceptibility locus for type 2 diabetes mellitus (T2DM) on chromosome 1q21-q23 in several populations. Results from recent prospective studies indicate that increased levels of C-reactive protein (CRP), a marker of immune system activation, are predictive of diabetes, independent of adiposity. Because CRP is located on 1q21, we considered it a potential positional candidate gene for T2DM. We therefore evaluated CRP and the nearby serum amyloid P-component, APCS, which is structurally similar to CRP, as candidate diabetes susceptibility genes. Approximately 10.9 kb of the CRP-APCS locus was screened for polymorphisms using denaturing high performance liquid chromatography and direct sequencing. We identified 27 informative polymorphisms, including 26 single nucleotide polymorphisms (SNPs) and 1 insertion/deletion, which were divided into 7 linkage disequilibrium clusters. We genotyped representative SNPs in approximately 1300 Pima samples and found a single variant in the CRP promoter (SNP 133552) that was associated with T2DM (P = 0.014), as well as a common haplotype (CGCG) that was associated with both T2DM (P = 0.029) and corrected insulin response, a surrogate measure of insulin secretion in non-diabetic subjects (P = 0.050). Linkage analyses that adjusted for the effect of these polymorphisms indicated that they do not in themselves account for the observed linkage with T2DM on chromosome 1q. However, these findings suggest that variation within the CRP locus may play a role in diabetes susceptibility in Pima Indians.
Subjects
C-Reactive Protein/genetics; Diabetes Mellitus, Type 2/genetics; Indians, North American/genetics; Polymorphism, Single Nucleotide; Promoter Regions, Genetic; Base Sequence; DNA Primers; Humans
ABSTRACT
Chronic low-grade activation of the immune system may play a role in the pathogenesis of type 2 diabetes mellitus (T2DM). Interleukin-6 (IL6), a powerful inducer of hepatic acute phase response, has been implicated in the etiology of insulin resistance and T2DM. Recently, an IL6 promoter polymorphism (G/C) at position -174 was found to be associated with measures of insulin sensitivity. Because we have previously found an association between high IL6 levels and insulin resistance in both Pima Indians (a population with high rates of insulin resistance and T2DM) and Caucasians, we aimed to assess whether the IL6 promoter polymorphism is associated with T2DM in these populations. We genotyped the IL6 (-174) G/C polymorphism using pyrosequencing in 463 Native Americans and by PCR-RFLP in 329 Spanish Caucasians. Among the Spanish Caucasian subjects, there was a significant difference in genotypic distribution between diabetic and non-diabetic subjects (P = 0.028); the GG genotype was more common in diabetic (0.40) than in non-diabetic (0.29) subjects. The G allele was much more frequent in the Native American sample, and among a sample of 143 cases and 145 controls, the GG genotype was significantly more common in diabetic subjects (P = 0.019). When this sample population was stratified according to ethnic heritage, all 211 subjects who were of full Pima Indian heritage had the GG genotype, whereas in the 77 American Indian subjects with non-Pima admixture, T2DM was associated with IL6 genotype (P = 0.001). These findings are consistent with a role for genetic determinants of inflammation in the development of T2DM in both Native Americans and Caucasians.