Results 1 - 20 of 124
1.
Nature ; 586(7831): 749-756, 2020 10.
Article in English | MEDLINE | ID: mdl-33087929

ABSTRACT

The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world (ref. 1). Here we describe the release of exome-sequence data for the first 49,960 study participants, revealing approximately 4 million coding variants (of which around 98.6% have a frequency of less than 1%). The data include 198,269 autosomal predicted loss-of-function (LOF) variants, a more than 14-fold increase compared to the imputed sequence. Nearly all genes (more than 97%) had at least one carrier with a LOF variant, and most genes (more than 69%) had at least ten carriers with a LOF variant. We illustrate the power of characterizing LOF variants in this population through association analyses across 1,730 phenotypes. In addition to replicating established associations, we found novel LOF variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical importance, and show that 2% of this population has a medically actionable variant. Furthermore, we characterize the penetrance of cancer in carriers of pathogenic BRCA1 and BRCA2 variants. Exome sequences from the first 49,960 participants highlight the promise of genome sequencing in large population-based studies and are now accessible to the scientific community.


Subject(s)
Databases, Genetic , Exome Sequencing , Exome/genetics , Loss of Function Mutation/genetics , Phenotype , Aged , Bone Density/genetics , Collagen Type VI/genetics , Demography , Female , Genes, BRCA1 , Genes, BRCA2 , Genotype , Humans , Ion Channels/genetics , Male , Middle Aged , Neoplasms/genetics , Penetrance , Peptide Fragments/genetics , United Kingdom , Varicose Veins/genetics , ras GTPase-Activating Proteins/genetics
2.
Hum Mol Genet ; 25(R2): R166-R172, 2016 Oct 01.
Article in English | MEDLINE | ID: mdl-27538422

ABSTRACT

The hope for precision medicine has long been on the drug discovery horizon, well before the Human Genome Project gave it promise at the turn of the 21st century. In oncology, the concept has finally been realized and is now firmly embedded in ongoing drug discovery programs, and with many recent therapies involving some level of patient/disease stratification, including some highly personalized treatments. In addition, several drugs for rare diseases have been recently approved or are in late-stage clinical development, and new delivery modalities in cell and gene therapy and oligonucleotide approaches are yielding exciting new medicines for rare diseases of unmet need. For common complex diseases, however, the GWAS-driven advances in annotation of the genetic architecture over the past decade have not led to a concomitant shift in refined treatments. Similarly, attempts to disentangle treatment responders from non-responders via genetic predictors in pharmacogenetics studies have not met their anticipated success. It is possible that common diseases are simply lagging behind due to the inherent time lag with drug discovery, but it is also possible that their inherent multifactorial nature and their etiological and clinical heterogeneity will prove more resistant to refined treatment paradigms. The emergence of population-based resources in electronic health records, coupled with the rapid expansion of mobile devices and digital health may help to refine the measurement of phenotypic outcomes to match the exquisite detail emerging at the molecular level.

3.
Pharmacogenet Genomics ; 27(3): 89-100, 2017 03.
Article in English | MEDLINE | ID: mdl-27984508

ABSTRACT

OBJECTIVE: Proteins involving absorption, distribution, metabolism, and excretion (ADME) play a critical role in drug pharmacokinetics. The type and frequency of genetic variation in the ADME genes differ among populations. The aim of this study was to systematically investigate common and rare ADME coding variation in diverse ethnic populations by exome sequencing. MATERIALS AND METHODS: Data derived from commercial exome capture arrays and next-generation sequencing were used to characterize coding variation in 298 ADME genes in 251 Northeast Asians and 1181 individuals from the 1000 Genomes Project. RESULTS: Approximately 75% of the ADME coding sequence was captured at high quality across the joint samples harboring more than 8000 variants, with 49% of individuals carrying at least one 'knockout' allele. ADME genes carried 50% more nonsynonymous variation than non-ADME genes (P=8.2×10) and showed significantly greater levels of population differentiation (P=7.6×10). Out of the 2135 variants identified that were predicted to be deleterious, 633 were not on commercially available ADME or general-purpose genotyping arrays. Forty deleterious variants within important ADME genes, with frequencies of at least 2% in at least one population, were identified as candidates for future pharmacogenetic studies. CONCLUSION: Exome sequencing was effective in accurately genotyping most ADME variants important for pharmacogenetic research, in addition to identifying rare or potentially de novo coding variants that may be clinically meaningful. Furthermore, as a class, ADME genes are more variable and less sensitive to purifying selection than non-ADME genes.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Oligonucleotide Array Sequence Analysis/methods , Population Groups/genetics , Sequence Analysis, DNA/methods , Exome , Genetic Variation , Genetics, Population , Humans , Male , Polymorphism, Single Nucleotide , Population Groups/ethnology , Principal Component Analysis
4.
Nature ; 465(7296): 305-10, 2010 May 20.
Article in English | MEDLINE | ID: mdl-20485427

ABSTRACT

Malaria is a devastating infection caused by protozoa of the genus Plasmodium. Drug resistance is widespread, no new chemical class of antimalarials has been introduced into clinical practice since 1996 and there is a recent rise of parasite strains with reduced sensitivity to the newest drugs. We screened nearly 2 million compounds in GlaxoSmithKline's chemical library for inhibitors of P. falciparum, of which 13,533 were confirmed to inhibit parasite growth by at least 80% at 2 µM concentration. More than 8,000 also showed potent activity against the multidrug resistant strain Dd2. Most (82%) compounds originate from internal company projects and are new to the malaria community. Analyses using historic assay data suggest several novel mechanisms of antimalarial action, such as inhibition of protein kinases and host-pathogen interaction related targets. Chemical structures and associated data are hereby made public to encourage additional drug lead identification efforts and further research into this disease.


Subject(s)
Antimalarials/analysis , Antimalarials/pharmacology , Drug Discovery , Malaria, Falciparum/drug therapy , Plasmodium falciparum/drug effects , Small Molecule Libraries/analysis , Small Molecule Libraries/pharmacology , Animals , Antimalarials/chemistry , Antimalarials/toxicity , Cell Line, Tumor , Drug Resistance, Multiple/drug effects , Humans , Malaria, Falciparum/parasitology , Models, Biological , Phylogeny , Plasmodium falciparum/enzymology , Plasmodium falciparum/genetics , Plasmodium falciparum/growth & development , Small Molecule Libraries/chemistry , Small Molecule Libraries/toxicity
5.
Nat Genet ; 39(7): 827-9, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17558408

ABSTRACT

We tested 310,605 SNPs for association in 778 individuals with celiac disease and 1,422 controls. Outside the HLA region, the most significant finding (rs13119723; P = 2.0 × 10⁻⁷) was in the KIAA1109-TENR-IL2-IL21 linkage disequilibrium block. We independently confirmed association in two further collections (strongest association at rs6822844, 24 kb 5' of IL21; meta-analysis P = 1.3 × 10⁻¹⁴, odds ratio = 0.63), suggesting that genetic variation in this region predisposes to celiac disease.


Subject(s)
Celiac Disease/genetics , Genetic Predisposition to Disease , Genetic Variation , Genome, Human , Interleukin-2/genetics , Interleukins/genetics , Animals , Chromosomes, Human, Pair 4/genetics , Humans , Linkage Disequilibrium , Mice , Polymorphism, Single Nucleotide , Risk Factors
6.
Nat Genet ; 39(7): 830-2, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17554261

ABSTRACT

A genome-wide association scan in individuals with Crohn's disease by the Wellcome Trust Case Control Consortium detected strong association at four novel loci. We tested 37 SNPs from these and other loci for association in an independent case-control sample. We obtained replication for the autophagy-inducing IRGM gene on chromosome 5q33.1 (replication P = 6.6 × 10⁻⁴, combined P = 2.1 × 10⁻¹⁰) and for nine other loci, including NKX2-3, PTPN2 and gene deserts on chromosomes 1q and 5p13.


Subject(s)
Autophagy/genetics , Crohn Disease/genetics , GTP-Binding Proteins/genetics , Genetic Predisposition to Disease , Genetic Variation , Animals , Case-Control Studies , Humans , Mice , Polymorphism, Single Nucleotide , Sequence Analysis, DNA
7.
Nature ; 461(7265): 747-53, 2009 Oct 08.
Article in English | MEDLINE | ID: mdl-19812666

ABSTRACT

Genome-wide association studies have identified hundreds of genetic variants associated with complex human diseases and traits, and have provided valuable insights into their genetic architecture. Most variants identified so far confer relatively small increments in risk, and explain only a small proportion of familial clustering, leading many to question how the remaining, 'missing' heritability can be explained. Here we examine potential sources of missing heritability and propose research strategies, including and extending beyond current genome-wide association approaches, to illuminate the genetics of complex diseases and enhance its potential to enable effective disease prevention or treatment.


Subject(s)
Genetic Diseases, Inborn/genetics , Genetic Predisposition to Disease/genetics , Genetics, Medical/methods , Genetics, Medical/trends , Genome-Wide Association Study/methods , Genome-Wide Association Study/trends , Humans , Inheritance Patterns/genetics , Pedigree
8.
Nat Genet ; 38(6): 659-62, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16715099

ABSTRACT

Genome-wide association studies involving hundreds of thousands of SNPs in thousands of cases and controls are now underway. The first of many analytical challenges in these studies involves the choice of SNPs to genotype. It is not practical to construct a different panel of tag SNPs for each study, so the first generation of genome-wide scans will use predefined, commercially available marker panels, which will in part dictate their success or failure. We compare different approaches in use today, and show that although many of them provide substantial coverage of common variation in non-African populations, the precise extent is strongly dependent on the frequencies of alleles of interest and on specific considerations of study design. Overall, despite substantial differences in genotyping technologies, marker selection strategies and number of markers assayed, the first-generation high-throughput platforms all offer similar levels of genome coverage.


Subject(s)
Genome, Human , Polymorphism, Single Nucleotide , Case-Control Studies , Haplotypes , Humans , Likelihood Functions , Linkage Disequilibrium
9.
Nat Rev Genet ; 9(5): 356-69, 2008 May.
Article in English | MEDLINE | ID: mdl-18398418

ABSTRACT

The past year has witnessed substantial advances in understanding the genetic basis of many common phenotypes of biomedical importance. These advances have been the result of systematic, well-powered, genome-wide surveys exploring the relationships between common sequence variation and disease predisposition. This approach has revealed over 50 disease-susceptibility loci and has provided insights into the allelic architecture of multifactorial traits. At the same time, much has been learned about the successful prosecution of association studies on such a scale. This Review highlights the knowledge gained, defines areas of emerging consensus, and describes the challenges that remain as researchers seek to obtain more complete descriptions of the susceptibility architecture of biomedical traits of interest and to translate the information gathered into improvements in clinical management.


Subject(s)
Genetic Diseases, Inborn/genetics , Genetic Predisposition to Disease , Genetic Variation , Genome, Human , Quantitative Trait Loci , Quantitative Trait, Heritable , Alleles , Animals , Humans
10.
Nat Genet ; 37(4): 413-7, 2005 Apr.
Article in English | MEDLINE | ID: mdl-15793588

ABSTRACT

After nearly 10 years of intense academic and commercial research effort, large genome-wide association studies for common complex diseases are now imminent. Although these conditions involve a complex relationship between genotype and phenotype, including interactions between unlinked loci, the prevailing strategies for analysis of such studies focus on the locus-by-locus paradigm. Here we consider analytical methods that explicitly look for statistical interactions between loci. We show first that they are computationally feasible, even for studies of hundreds of thousands of loci, and second that even with a conservative correction for multiple testing, they can be more powerful than traditional analyses under a range of models for interlocus interactions. We also show that plausible variations across populations in allele frequencies among interacting loci can markedly affect the power to detect their marginal effects, which may account in part for the well-known difficulties in replicating association results. These results suggest that searching for interactions among genetic loci can be fruitfully incorporated into analysis strategies for genome-wide association studies.


Subject(s)
Genetic Diseases, Inborn/genetics , Genetic Predisposition to Disease , Genome, Human , Alleles , Genetic Linkage , Genetic Markers , Genetics, Population , Humans , Models, Genetic
11.
Nat Genet ; 37(12): 1320-2, 2005 Dec.
Article in English | MEDLINE | ID: mdl-16258542

ABSTRACT

A substantial investment has been made in the generation of large public resources designed to enable the identification of tag SNP sets, but data establishing the adequacy of the sample sizes used are limited. Using large-scale empirical and simulated data sets, we found that the sample sizes used in the HapMap project are sufficient to capture common variation, but that performance declines substantially for variants with minor allele frequencies of <5%.


Subject(s)
Chromosome Mapping , Databases, Nucleic Acid , Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome, Human/genetics , Polymorphism, Single Nucleotide , Gene Frequency , Humans , Linkage Disequilibrium , Sample Size
12.
Nat Genet ; 36(5): 512-7, 2004 May.
Article in English | MEDLINE | ID: mdl-15052271

ABSTRACT

Large-scale association studies hold substantial promise for unraveling the genetic basis of common human diseases. A well-known problem with such studies is the presence of undetected population structure, which can lead to both false positive results and failures to detect genuine associations. Here we examine approximately 15,000 genome-wide single-nucleotide polymorphisms typed in three population groups to assess the consequences of population structure on the coming generation of association studies. The consequences of population structure on association outcomes increase markedly with sample size. For the size of study needed to detect typical genetic effects in common diseases, even the modest levels of population structure within population groups cannot safely be ignored. We also examine one method for correcting for population structure (Genomic Control). Although it often performs well, it may not correct for structure if too few loci are used and may overcorrect in other settings, leading to substantial loss of power. The results of our analysis can guide the design of large-scale association studies.


Subject(s)
Genetic Markers , Genetic Predisposition to Disease , Genetics, Population , Polymorphism, Single Nucleotide/genetics , Genetic Variation , Humans , Linkage Disequilibrium , Models, Genetic , Quantitative Trait, Heritable
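The Genomic Control correction examined in the abstract above can be sketched as follows. This is a minimal illustration of the commonly described median-based form of the method, not the authors' implementation: the function name `genomic_control`, the input list of 1-degree-of-freedom chi-square statistics, and the choice to floor the inflation factor at 1 are all assumptions made for the example.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (inflation factor, corrected statistics).

    chi2_stats: 1-d.f. association statistics from loci assumed to be
    mostly unassociated with the trait (hypothetical input).
    """
    # Inflation factor: observed median relative to the null median.
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # Assumption for this sketch: never inflate statistics when lam < 1.
    lam = max(lam, 1.0)
    # Deflate every statistic by the same factor.
    return lam, [s / lam for s in chi2_stats]


if __name__ == "__main__":
    stats = [0.2, 0.5, 0.9098728, 1.5, 6.0]
    lam, corrected = genomic_control(stats)
    print(lam, corrected)
```

Because the factor is estimated from the median of the observed statistics, an estimate based on only a handful of loci is noisy, which is consistent with the abstract's caution that the method may fail to correct when too few loci are used.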
13.
Nat Genet ; 30(1): 97-101, 2002 Jan.
Article in English | MEDLINE | ID: mdl-11731797

ABSTRACT

Efforts to find disease genes using high-density single-nucleotide polymorphism (SNP) maps will produce data sets that exceed the limitations of current computational tools. Here we describe a new, efficient method for the analysis of dense genetic maps in pedigree data that provides extremely fast solutions to common problems such as allele-sharing analyses and haplotyping. We show that sparse binary trees represent patterns of gene flow in general pedigrees in a parsimonious manner, and derive a family of related algorithms for pedigree traversal. With these trees, exact likelihood calculations can be carried out efficiently for single markers or for multiple linked markers. Using an approximate multipoint calculation that ignores the unlikely possibility of a large number of recombinants further improves speed and provides accurate solutions in dense maps with thousands of markers. Our multipoint engine for rapid likelihood inference (Merlin) is a computer program that uses sparse inheritance trees for pedigree analysis; it performs rapid haplotyping, genotype error detection and affected pair linkage analyses and can handle more markers than other pedigree analysis packages.


Subject(s)
Algorithms , Genetic Linkage , Likelihood Functions , Software , Female , Genotype , Haplotypes , Humans , Male , Meiosis , Pedigree , Polymorphism, Genetic
14.
Nat Genet ; 30(1): 86-91, 2002 Jan.
Article in English | MEDLINE | ID: mdl-11743577

ABSTRACT

Developmental dyslexia is defined as a specific and significant impairment in reading ability that cannot be explained by deficits in intelligence, learning opportunity, motivation or sensory acuity. It is one of the most frequently diagnosed disorders in childhood, representing a major educational and social problem. It is well established that dyslexia is a significantly heritable trait with a neurobiological basis. The etiological mechanisms remain elusive, however, despite being the focus of intensive multidisciplinary research. All attempts to map quantitative-trait loci (QTLs) influencing dyslexia susceptibility have targeted specific chromosomal regions, so that inferences regarding genetic etiology have been made on the basis of very limited information. Here we present the first two complete QTL-based genome-wide scans for this trait, in large samples of families from the United Kingdom and United States. Using single-point analysis, linkage to marker D18S53 was independently identified as being one of the most significant results of the genome in each scan (P ≤ 0.0004 for single word-reading ability in each family sample). Multipoint analysis gave increased evidence of 18p11.2 linkage for single-word reading, yielding top empirical P values of 0.00001 (UK) and 0.0004 (US). Measures related to phonological and orthographic processing also showed linkage at this locus. We replicated linkage to 18p11.2 in a third independent sample of families (from the UK), in which the strongest evidence came from a phoneme-awareness measure (most significant P value = 0.00004). A combined analysis of all UK families confirmed that this newly discovered 18p QTL is probably a general risk factor for dyslexia, influencing several reading-related processes. This is the first report of QTL-based genome-wide scanning for a human cognitive trait.


Subject(s)
Chromosome Mapping/methods , Chromosomes, Human, Pair 18/genetics , Dyslexia/genetics , Quantitative Trait, Heritable , Child , Chromosomes, Human, Pair 6/genetics , Diseases in Twins/genetics , Female , Genetic Heterogeneity , Genetic Linkage , Genetic Markers , Genotype , Humans , Lod Score , Male , Psychological Tests , United Kingdom , United States
15.
Nat Genet ; 34(2): 181-6, 2003 Jun.
Article in English | MEDLINE | ID: mdl-12754510

ABSTRACT

Atopic or immunoglobulin E (IgE)-mediated diseases include the common disorders of asthma, atopic dermatitis and allergic rhinitis. Chromosome 13q14 shows consistent linkage to atopy and the total serum IgE concentration. We previously identified association between total serum IgE levels and a novel 13q14 microsatellite (USAT24G1; ref. 7) and have now localized the underlying quantitative-trait locus (QTL) in a comprehensive single-nucleotide polymorphism (SNP) map. We found replicated association to IgE levels that was attributed to several alleles in a single gene, PHF11. We also found association with these variants to severe clinical asthma. The gene product (PHF11) contains two PHD zinc fingers and probably regulates transcription. Distinctive splice variants were expressed in immune tissues and cells.


Subject(s)
Asthma/genetics , Chromosomes, Human, Pair 13/genetics , Immunoglobulin E/blood , Quantitative Trait Loci , Adult , Alleles , Alternative Splicing , Case-Control Studies , Child , Female , Haplotypes , Humans , Male , Molecular Sequence Data , Polymorphism, Single Nucleotide , Tissue Distribution , Zinc Fingers/genetics
16.
Hum Mutat ; 33(7): 1087-98, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22415848

ABSTRACT

Genetic variation in LRRK2 predisposes to Parkinson disease (PD), which underpins its development as a therapeutic target. Here, we aimed to identify novel genotype-phenotype associations that might support developing LRRK2 therapies for other conditions. We sequenced the 51 exons of LRRK2 in cases comprising 12 common diseases (n = 9,582), and in 4,420 population controls. We identified 739 single-nucleotide variants, 62% of which were observed in only one person, including 316 novel exonic variants. We found evidence of purifying selection for the LRRK2 gene and a trend suggesting that this is more pronounced in the central (ROC-COR-kinase) core protein domains of LRRK2 than the flanking domains. Population genetic analyses revealed that LRRK2 is not especially polymorphic or differentiated in comparison to 201 other drug target genes. Among Europeans, we identified 17 carriers (0.13%) of pathogenic LRRK2 mutations that were not significantly enriched within any disease or in those reporting a family history of PD. Analysis of pathogenic mutations within Europe reveals that the p.Arg1628Pro (c.4883G>C) mutation arose independently in Europe and Asia. Taken together, these findings demonstrate how targeted deep sequencing can help to reveal fundamental characteristics of clinically important loci.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Protein Serine-Threonine Kinases/genetics , Europe , Genetic Predisposition to Disease , Genetics, Population , Humans , Leucine-Rich Repeat Serine-Threonine Protein Kinase-2 , Mutation , Parkinson Disease/genetics , White People/genetics
17.
Nature ; 443(7111): 574-7, 2006 Oct 05.
Article in English | MEDLINE | ID: mdl-17006452

ABSTRACT

Genes in the major histocompatibility complex (MHC) encode proteins important in activating antigen-specific immune responses. Alleles at adjacent MHC loci are often in strong linkage disequilibrium; however, little is known about the mechanisms responsible for this linkage disequilibrium. Here we report that the human MHC HLA-DR2 haplotype, which predisposes to multiple sclerosis, shows more extensive linkage disequilibrium than other common Caucasian HLA haplotypes in the DR region and thus seems likely to have been maintained through positive selection. Characterization of two multiple-sclerosis-associated HLA-DR alleles at separate loci by a functional assay in humanized mice indicates that the linkage disequilibrium between the two alleles may be due to a functional epistatic interaction, whereby one allele modifies the T-cell response activated by the second allele through activation-induced cell death. This functional epistasis is associated with a milder form of multiple-sclerosis-like disease. Such epistatic interaction might prove to be an important general mechanism for modifying exuberant immune responses that are deleterious to the host and could also help to explain the strong linkage disequilibrium in this and perhaps other HLA haplotypes.


Subject(s)
Epistasis, Genetic , HLA-DR2 Antigen/genetics , Haplotypes/genetics , Multiple Sclerosis/genetics , Alleles , Animals , CD4-Positive T-Lymphocytes/immunology , Disease Models, Animal , Encephalomyelitis, Autoimmune, Experimental/genetics , Encephalomyelitis, Autoimmune, Experimental/pathology , Humans , Linkage Disequilibrium/genetics , Mice , Multiple Sclerosis/pathology
18.
J Biopharm Stat ; 22(6): 1174-92, 2012.
Article in English | MEDLINE | ID: mdl-23075016

ABSTRACT

Laboratory safety data are routinely collected in clinical studies for safety monitoring and assessment. We have developed a truncated robust multivariate outlier detection method for identifying subjects with clinically relevant abnormal laboratory measurements. The proposed method can be applied to historical clinical data to establish a multivariate decision boundary that can then be used for future clinical trial laboratory safety data monitoring and assessment. Simulations demonstrate that the proposed method has the ability to detect relevant outliers while automatically excluding irrelevant outliers. Two examples from actual clinical studies are used to illustrate the use of this method for identifying clinically relevant outliers.


Subject(s)
Clinical Trials as Topic/statistics & numerical data , Data Interpretation, Statistical , Drug Monitoring/statistics & numerical data , Models, Biological , Models, Statistical , Multivariate Analysis , Biomarkers/blood , Computer Simulation , Drug-Related Side Effects and Adverse Reactions , Humans , Lipoproteins, LDL/blood , Liver Function Tests , Safety/statistics & numerical data , Triglycerides/blood
19.
Genet Epidemiol ; 34(3): 266-74, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20013941

ABSTRACT

Significant allele flipping, where associations for the same disease occur at opposite alleles of the same bi-allelic locus, is increasing. But when is a significant allele flip genuine? We address the statistical issues of claiming and observing genuine allele flips in actual samples. We show that unless an allele flip is genuine, the probability of observing a significant allele flip in samples ascertained similarly from a common population is negligible. We derive expressions for the expected values of commonly used measures of association, which confirm previous findings that the underlying mechanism of a genuine allele flip is variation in the haplotype frequencies and show further how this variation interacts with variation in the genetic effects to impact allele flipping. We show that for association testing at proxy SNPs, common in genome-wide association studies, variation in haplotype frequencies must coincide with a reversal in the sign of linkage disequilibrium (LD) to trigger genuine allele flips. Using HapMap data and r, rather than r², to highlight previously unobserved effects, we show that unless genetic effects are large, variation in LD is unlikely to cause genuine allele flips in samples drawn from the same population. However, as populations diverge, it is an increasingly viable cause of a genuine allele flip for sufficiently large genetic effect and/or sample sizes. We conclude that evidence of variation in local patterns of LD, ancestral composition of study samples, and environmental exposures between study populations can provide compelling practical evidence in defense of a genuine allele flip.


Subject(s)
Alleles , Genome-Wide Association Study , Polymorphism, Single Nucleotide/genetics , Case-Control Studies , Gene Frequency , Genetic Predisposition to Disease , Haplotypes , Humans , Linkage Disequilibrium , Models, Genetic , Molecular Epidemiology/methods , Probability
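The reason the abstract above works with r rather than r² can be illustrated with a short sketch: the signed correlation between two bi-allelic loci can reverse between populations while r² stays the same. The function `ld_r` and the haplotype and allele frequencies below are hypothetical values chosen for illustration, not data from the study.

```python
from math import sqrt


def ld_r(p_ab, p_a, p_b):
    """Signed LD correlation r between alleles A and B at two loci.

    p_ab: frequency of the A-B haplotype; p_a, p_b: allele frequencies.
    """
    d = p_ab - p_a * p_b  # LD coefficient D
    return d / sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))


if __name__ == "__main__":
    # Two hypothetical populations with identical allele frequencies
    # but different haplotype frequencies.
    r_pop1 = ld_r(p_ab=0.4, p_a=0.5, p_b=0.5)  # A and B tend to co-occur
    r_pop2 = ld_r(p_ab=0.1, p_a=0.5, p_b=0.5)  # A and B tend to repel
    print(r_pop1, r_pop2)        # opposite signs
    print(r_pop1**2, r_pop2**2)  # r squared hides the reversal
```

In this toy case a proxy SNP would tag the causal allele through opposite alleles in the two populations, which is the sign reversal the abstract identifies as necessary for a genuine flip at proxy SNPs.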
20.
Genet Epidemiol ; 34(4): 319-26, 2010 May.
Article in English | MEDLINE | ID: mdl-20088020

ABSTRACT

Genome-wide association (GWA) studies have proved extremely successful in identifying novel genetic loci contributing effects to complex human diseases. In doing so, they have highlighted the fact that many potential loci of modest effect remain undetected, partly due to the need for samples consisting of many thousands of individuals. Large-scale international initiatives, such as the Wellcome Trust Case Control Consortium, the Genetic Association Information Network, and the database of genetic and phenotypic information, aim to facilitate discovery of modest-effect genes by making genome-wide data publicly available, allowing information to be combined for the purpose of pooled analysis. In principle, disease or control samples from these studies could be used to increase the power of any GWA study via judicious use as "genetically matched controls" for other traits. Here, we present the biological motivation for the problem and the theoretical potential for expanding the control group with publicly available disease or reference samples. We demonstrate that a naïve application of this strategy can greatly inflate the false-positive error rate in the presence of population structure. As a remedy, we make use of genome-wide data and model selection techniques to identify "axes" of genetic variation which are associated with disease. These axes are then included as covariates in association analysis to correct for population structure, which can result in increases in power over standard analysis of genetic information from the samples in the original GWA study.


Subject(s)
Genome-Wide Association Study , Alleles , Computer Simulation , Data Interpretation, Statistical , False Positive Reactions , Gene Frequency , Genetic Variation , Heterozygote , Humans , Models, Genetic , Models, Statistical , Odds Ratio , Reference Values , Research Design , Risk