ABSTRACT
The founder population of Newfoundland and Labrador (NL) is a unique genetic resource, in part due to its geographic and cultural isolation; historical records describe a migration of European settlers, primarily from Ireland and England, to NL in the 18th and 19th centuries. Whilst its historical isolation and increased prevalence of certain monogenic disorders are well appreciated, details of the fine-scale genetic structure and ancestry of the population are lacking. Understanding the genetic origins and background of functional, disease-causing genetic variants would aid genetic mapping efforts in the Province. Here, we leverage dense genome-wide SNP data on 1,807 NL individuals to reveal fine-scale genetic structure in NL that is clustered around coastal communities and correlated with Christian denomination. We show that the majority of NL European ancestry can be traced back to the south-east and south-west of Ireland and England, respectively. We date a substantial population size bottleneck approximately 10-15 generations ago in NL, associated with increased haplotype sharing and autozygosity. Our results reveal insights into the population history of NL and demonstrate evidence of a population conducive to further genetic studies and biomarker discovery.
Subjects
Genetics, Population , White People , Humans , Newfoundland and Labrador , Ireland , Human Migration
ABSTRACT
The common-variant/common-disease model predicts that most risk alleles underlying complex health-related traits are common and, therefore, old and found in multiple populations, rather than being rare or population specific. Accordingly, there is widespread interest in assessing the population structure of common alleles. However, such assessments have been confounded by analysis of data sets with bias toward ascertainment of common alleles (e.g., HapMap and Perlegen) or in which a relatively small number of genes and/or populations were sampled. The aim of this study was to examine the structure of common variation ascertained in major U.S. populations, by resequencing the exons and flanking regions of 3,873 genes in 154 chromosomes from European, Latino/Hispanic, Asian, and African Americans generated by the Genaissance Resequencing Project. The frequency distributions of private and common single-nucleotide polymorphisms (SNPs) were measured, and the extent to which common SNPs were shared across populations was analyzed using several different estimators of population structure. Most SNPs that were common in one population were present in multiple populations, but SNPs common in one population were frequently not common in other populations. Moreover, SNPs that were common in two or more populations often differed significantly in frequency from one population to another, particularly in comparisons of African Americans versus other U.S. populations. These findings indicate that, even if the bulk of alleles underlying complex health-related traits are common SNPs, geographic ancestry might well be an important predictor of whether a person carries a risk allele.
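The abstract refers to "several different estimators of population structure" without naming them. As an illustration only, the sketch below computes Wright's F_ST for a single biallelic SNP from the allele frequencies in two populations. The choice of F_ST, the equal weighting of the two populations, and the function name `wright_fst` are assumptions made for this example, not details taken from the study.

```python
def wright_fst(p1, p2):
    """Wright's F_ST for one biallelic SNP in two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    Illustrative sketch; not the estimator used in the study.
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two subpopulations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        # The SNP is monomorphic in both populations: no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # A SNP common in one population (50%) but less common in another (10%).
    print(round(wright_fst(0.1, 0.5), 4))
```

A SNP can be present in both populations and still yield a non-zero F_ST when its frequencies differ, which is the pattern the abstract describes for SNPs shared across populations.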
Subjects
Genetic Diseases, Inborn/genetics , Genetic Variation , Genetics, Population , Polymorphism, Single Nucleotide , 3' Untranslated Regions , 5' Untranslated Regions , Cell Line, Transformed , DNA/blood , DNA/genetics , DNA/isolation & purification , Ethnicity/genetics , Genotype , Herpesvirus 4, Human/genetics , Humans , Risk Factors , United States
ABSTRACT
We have investigated the level of DNA-based variation (both SNPs and haplotypes) for several thousand human genes. In addition, we have characterized how this variation is distributed in a number of biologically and clinically important ways. First, we have determined how SNPs are distributed within human genes: where they occur relative to various functional regions; levels of variability of human SNPs; pattern of the molecular sequence of SNPs; and how these compare with the corresponding sequence of a chimpanzee. Second, we have determined how these aspects of SNP distribution vary among four human population samples. All genes were sequenced on DNA obtained from 82 unrelated individuals: 20 African-Americans, 20 East Asians, 21 European-Americans, 18 Hispanic-Latinos and three Native Americans. In particular, we looked at patterns of SNP and haplotype sharing among the four larger population samples. Third, we have determined the patterns of linkage disequilibrium among SNPs, which also determines the haplotype variability of each gene. These characteristics also vary substantially among populations. A deeper understanding of these aspects of human genetic variation will be of vital importance when trying to identify the genetic contribution to complex phenotypes such as aging.
Subjects
DNA/genetics , Genetic Variation , Aging/genetics , Animals , Evolution, Molecular , Genetics, Population , Haplotypes , Humans , Linkage Disequilibrium , Models, Genetic , Pan troglodytes/genetics , Phenotype , Polymorphism, Single Nucleotide
ABSTRACT
INTRODUCTION: Guanine nucleotide binding proteins (G-proteins) represent the targets for >50% of all therapeutics. There is substantial interindividual variation in response to agonists and antagonists directed to these receptors, which may, in part, be due to genetic polymorphisms. As a class, the sequence variability of G-protein-coupled receptor (GPCR) genes has not been characterized. STUDY DESIGN: This variability was investigated by sequencing promoter, 5'- and 3'-UTR, coding blocks, and intron-exon boundaries, of 64 GPCR genes in an ethnically diverse group of 82 individuals. RESULTS: Of the 675 single-nucleotide variations found, 61% occurred in ≥1% of the population sample and the nature of these 412 single nucleotide polymorphisms (SNPs) was assessed. 5'-UTR (p = 0.002) and coding (p = 0.006) SNPs were observed more often in GPCR genes, compared with 309 non-GPCR genes similarly interrogated. The prevalence of non-synonymous coding SNPs was unexpectedly high, with 65% of GPCR genes having at least one. Intron-containing genes had half as many non-synonymous coding SNPs compared with intronless genes (p = 0.0009), suggesting that when introns are not available coding regions provide sites for variation. A distinct relationship between the prevalence of non-synonymous SNPs and receptor structural domains was evident (p = 0.0006 by ANOVA), with variability being most prominent in the transmembrane spanning domains (38%) and the intracellular loops (24%). Phosphoregulatory domains, particularly the carboxy terminus, often the site for agonist-promoted phosphorylation by G-protein coupled receptor kinases, were the least polymorphic (8%). CONCLUSIONS: There is substantial genetic variability in potentially pharmacologically relevant coding and noncoding regions of GPCRs. Such variability should be considered in the development of new agents, or optimization of existing agents, targeted to these receptors.
Subjects
GTP-Binding Proteins/chemistry , GTP-Binding Proteins/genetics , Genetic Variation/genetics , Receptors, Cell Surface/chemistry , Receptors, Cell Surface/genetics , Analysis of Variance , Humans , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA
ABSTRACT
We derive and compare several estimates of the number of SNPs that would be required to form the basis of a complete haplotype survey of the human genome. Our estimates make use of reports published by Stephens et al. [1], Patil et al. [2] and Daly et al. [3]. The estimated number of SNPs required for a genome-wide haplotype survey ranges from 180K (based on a European sample of 16 chromosomes) to 600K (based on an ethnically diverse sample of 164 chromosomes). We discuss the implications of using cohorts of different size and ethnic composition and the usefulness of public SNP databases for this effort. Finally, we estimate the experimental effort and cost required to complete a genome-wide haplotype survey.
Subjects
Haplotypes/genetics , Polymorphism, Single Nucleotide/genetics , Algorithms , Chromosome Mapping , Gene Frequency , Genome , Genotype , Humans
ABSTRACT
We have studied the human genetic variability of single nucleotide polymorphisms (SNPs) and haplotypes in two pharmaceutically important classes of genes that might be expected to experience different evolutionary pressures: antigen presentation and processing (APP) and nuclear hormone receptor (NHR) genes. We compared the variation pattern in these two classes of genes with 5119 reference (REF) genes. We assessed this variability by sequencing and discovering SNPs in 5'-upstream, 5'-untranslated region (5'UTR), exon, intron, 3'UTR and 3'-downstream regions of all these genes in 79 unrelated humans from diverse ethnic backgrounds, one chimpanzee (Pan troglodytes) and a gorilla (Gorilla gorilla). SNP density and nucleotide diversity were higher in the APP genes than the REF genes. Relative to the REF genes, APP SNP density was significantly higher in the coding and 3'UTR regions. Higher variation in the coding region of the APP genes was due specifically to having more non-synonymous changes, which suggests that natural selection may be acting to promote change or diversity in these proteins. In contrast, the NHR genes showed lower SNP density and diversity relative to REF genes. The NHR genes consistently showed lower nucleotide diversity in all the genomic regions except in the 3'downstream region. SNP frequency data on the non-synonymous SNPs also suggested that the coding region in the NHR genes is conserved to a higher degree than the coding region in the REF genes. Significantly lower SNP density was observed in the 5'-upstream and 5'UTR regions of the NHR genes, perhaps reflecting selective conservation of these regions. Heterozygosity in the APP genes was significantly higher than in the NHR genes in each of the three species tested. Moreover, between species there were more fixed differences in the APP genes than in the NHR genes. Substantial variability exists in these two classes of genes. 
It is important to consider this interindividual variability pattern while developing drugs that act on such targets.
Subjects
Antigen Presentation/genetics , Polymorphism, Single Nucleotide , Receptors, Cytoplasmic and Nuclear/genetics , Alleles , Animals , Biological Evolution , Gene Frequency , Hominidae/genetics , Humans
ABSTRACT
The public SNP databases are an important resource for groups performing genetic association and linkage studies. Both academic and commercial groups are developing large numbers of genotyping assays for SNPs in candidate genes or spread across the genome. These databases now contain in excess of 6 million SNPs that have been generated using a large number of methods and cohorts. Today, however, only a small fraction of these SNPs are well characterized and validated. The latest release of dbSNP contains approximately 3.7 million non-redundant entries, only 0.5 million of which are validated, and 0.2 million of which have frequency information. Users of these databases have several common questions. How many of the SNPs are real? What is the frequency spectrum of the SNPs in these databases? What is the distribution picture of these SNPs across different ethnic and geographical populations? What fraction of the total number of SNPs is already captured by these databases? In order to address these questions, we compared the public SNPs against a well-characterized collection of gene-centric SNPs that we have developed. From this comparison, we find that > 50% of high frequency SNPs in the genome (> 20% minor allele frequency) have already been captured by these databases. The coverage drops dramatically below frequencies of 10%. At high frequencies, there is no sampling bias with respect to ethnicity or to regions of the genome. Finally, a relatively large fraction (> 40%) of SNPs in these databases were not seen in our study, which means that they are either of very low frequency, mismapped, or not polymorphic at all.
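The comparison described above reports the share of a reference SNP collection that the public databases already contain, broken down by minor allele frequency. A minimal sketch of that kind of calculation follows. The function name `coverage_above`, the data layout, and the toy SNP identifiers are hypothetical, and none of the numbers come from the study.

```python
def coverage_above(ref_maf, db_ids, min_maf):
    """Fraction of reference SNPs with MAF > min_maf that appear in a database.

    ref_maf: dict mapping SNP identifier -> minor allele frequency (0 to 0.5).
    db_ids:  set of SNP identifiers present in the public database.
    Illustrative sketch with hypothetical inputs.
    """
    eligible = [snp for snp, maf in ref_maf.items() if maf > min_maf]
    if not eligible:
        # No reference SNP exceeds the threshold, so coverage is undefined.
        return float("nan")
    captured = sum(1 for snp in eligible if snp in db_ids)
    return captured / len(eligible)


if __name__ == "__main__":
    # Toy reference collection and database contents (made up for the example).
    reference = {"snpA": 0.30, "snpB": 0.25, "snpC": 0.05, "snpD": 0.40}
    database = {"snpA", "snpC", "snpD"}
    print(coverage_above(reference, database, 0.20))  # high-frequency SNPs only
    print(coverage_above(reference, database, 0.0))   # all reference SNPs
```

Repeating the call at several thresholds gives a coverage-by-frequency profile of the sort the abstract summarizes.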
Subjects
Databases, Genetic/statistics & numerical data , Genome, Human , Polymorphism, Single Nucleotide/genetics , Databases, Genetic/trends , Humans
ABSTRACT
OBJECTIVES: The purpose of this study was to determine the prevalence and spectrum of nonsynonymous polymorphisms (amino acid variants) in the cardiac sodium channel among healthy subjects. BACKGROUND: Pathogenic mutations in the cardiac sodium channel gene, SCN5A, cause approximately 15 to 20% of Brugada syndrome (BrS1), 5 to 10% of long QT syndrome (LQT3), and 2 to 5% of sudden infant death syndrome. METHODS: Using single-stranded conformation polymorphism, denaturing high-performance liquid chromatography, and/or direct DNA sequencing, mutational analysis of the protein-encoding exons of SCN5A was performed on 829 unrelated, anonymous healthy subjects: 319 black, 295 white, 112 Asian, and 103 Hispanic. RESULTS: In addition to the four known common polymorphisms (R34C, H558R, S1103Y, and R1193Q), four relatively ethnic-specific polymorphisms were identified: R481W, S524Y, P1090L, and V1951L. Overall, 39 distinct missense variants (28 novel) were elucidated. Nineteen variants (49%) were found only in the black cohort. Only seven variants (18%) localized to transmembrane-spanning domains. Four variants (F1293S, R1512W, and V1951L cited previously as BrS1-causing mutations and S1787N previously published as a possible LQT3-causing mutation) were identified in this healthy cohort. CONCLUSIONS: This study provides the first comprehensive determination of the prevalence and spectrum of cardiac sodium channel variants in healthy subjects from four distinct ethnic groups. This compendium of SCN5A variants is critical for proper interpretation of SCN5A genetic testing and provides an essential hit list of targets for future functional studies to determine whether or not any of these variants mediate genetic susceptibility for arrhythmias in the setting of either drugs or disease.
Subjects
Gene Frequency , Polymorphism, Single-Stranded Conformational , Racial Groups/genetics , Sodium Channels/genetics , Bundle-Branch Block/genetics , Chromatography, High Pressure Liquid , DNA Mutational Analysis , Exons , Genetic Predisposition to Disease , Humans , Long QT Syndrome/genetics , Mutation, Missense , NAV1.5 Voltage-Gated Sodium Channel , Syndrome , Ventricular Fibrillation/genetics
ABSTRACT
The treatment of seriously mentally ill patients is complicated by variability in individual response to psychotropic drugs. Some patients remain treatment refractory even after two to three therapeutic modalities. Other patients experience adverse events that range from mild discomfort, to poor compliance, to life threatening. Genaissance Pharmaceuticals is actively engaged in a candidate gene-based haplotype (HAP Marker) approach to the pharmacogenetics of drug response and adverse events. In the present article, we review reasons why HAP Markers are more useful than single nucleotide polymorphisms (SNPs) for discovering genetic correlations to clinical response. In addition, we review our approach to HAP Marker discovery, which involves discovering SNPs in the functional regions of genes by sequencing, organizing these SNPs into HAP Markers for an index population of ethnically diverse individuals and calculating population frequencies for these HAP Markers. For clinical correlations, HAP Markers are defined and correlated to clinical data using the in-house DecoGen Informatics System. This approach has clear implications for the discovery of psychiatric disease-associated genes as well as for the development of safer, more efficacious psychiatric drugs.
Subjects
Haplotypes , Pharmacogenetics/methods , Psychotropic Drugs/pharmacology , Drug Industry/trends , Genetic Markers , Humans , Mental Disorders/drug therapy , Mental Disorders/genetics
ABSTRACT
We have surveyed and summarized several aspects of DNA variability among humans. The variation described is the result of mutation followed by a combination of drift, migration and selection bringing the frequencies high enough to be observed. This paper describes what we have learned about how DNA variability differs among genes and populations. We sequenced functional regions of a set of 3950 genes. DNA was sampled from 82 unrelated humans: 20 African-Americans, 20 East Asians, 21 Caucasians, 18 Hispanic-Latinos and 3 Native Americans. Different aspects of variability showed a great deal of concordance. In particular, we studied patterns of single nucleotide polymorphism (SNP) allele and haplotype sharing among the four large sample populations. We also examined how linkage disequilibrium (LD) between SNPs relates to physical distance in the different populations. It is clear from our findings that while many variants are common to all populations, many others have a more restricted distribution. Research that attempts to find genetic variants that explain phenotypic variants must be careful in its choice of study population.
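Linkage disequilibrium between two SNPs is commonly summarized by the coefficient D and the squared correlation r². The sketch below computes both from phased two-SNP haplotypes. The 0/1 allele coding, the function name `ld_stats`, and the toy haplotypes are assumptions for illustration; the abstract does not say which LD measure the authors used.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for two biallelic SNPs from phased haplotypes.

    haplotypes: non-empty list of (a, b) tuples, alleles coded 0 or 1 at each SNP.
    Illustrative sketch; the coding and the choice of r^2 are assumptions.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either SNP is monomorphic in the sample.
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


if __name__ == "__main__":
    # Two SNPs whose alleles always travel together on the same haplotype.
    linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
    print(ld_stats(linked))
```

Computing r² for SNP pairs at increasing physical separation, once per population sample, gives the LD-versus-distance comparison the abstract mentions.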
Subjects
DNA/genetics , Genome, Human , Haplotypes/genetics , Linkage Disequilibrium , Mutation/genetics , Polymorphism, Single Nucleotide/genetics , Alleles , DNA Mutational Analysis , Genetics, Population , Genotype , Humans , Racial Groups/genetics
ABSTRACT
In this article, we highlight some of the different types of natural selection, their effects on patterns of DNA variation, and some of the statistical tests that are commonly used to detect such effects. We also explain some of the relative strengths and weaknesses of different strategies that can be used to detect signatures of natural selection at individual loci. These strategies are illustrated by their application to empirical data from gene variants that are often associated with differences in disease susceptibility. We briefly outline some of the methods proposed to scan the genome for evidence of selection. Finally, we discuss some of the problems associated with identifying signatures of selection and with making inferences about the nature of the selective process.
Subjects
Genetic Techniques , Genetic Variation , Selection, Genetic/genetics , Genome, Human/genetics , Humans
ABSTRACT
Several studies of haplotype structures in the human genome in various populations have found that the human chromosomes are structured such that each chromosome can be divided into many blocks, within which there is limited haplotype diversity. In addition, only a few genetic markers in a putative block are needed to capture most of the diversity within a block. There has been no systematic empirical study of the effects of sample size and marker set on the identified block structures and representative marker sets, however. The purpose of this study was to conduct a detailed empirical study to examine such impacts. Towards this goal, we have analysed three representative autosomal regions from a large genome-wide study of haplotypes with samples consisting of African-Americans and samples consisting of Japanese and Chinese individuals. For both populations, we have found that the sample size and marker set have significant impact on the number of blocks and the total number of representative markers identified. The marker set in particular has very strong impacts, and our results indicate that the marker density in the original datasets may not be adequate to allow a meaningful characterisation of haplotype structures. In general, we conclude that we need a relatively large sample size and a very dense marker panel in the study of haplotype structures in human populations.