2.
Am J Ophthalmol ; 233: 111-123, 2022 01.
Article in English | MEDLINE | ID: mdl-34166655

ABSTRACT

PURPOSE: To identify functionally related genes associated with diabetic retinopathy (DR) risk using gene set enrichment analyses (GSEA) applied to genome-wide association study (GWAS) meta-analyses. METHODS: We analyzed DR GWAS meta-analyses performed on 3246 Europeans and 2611 African Americans with type 2 diabetes. Gene sets relevant to 5 key DR pathophysiology processes were investigated: tissue injury, vascular events, metabolic events and glial dysregulation, neuronal dysfunction, and inflammation. Keywords relevant to these processes were queried in 4 pathway and ontology databases. Two GSEA methods, Meta-Analysis Gene set Enrichment of variaNT Associations (MAGENTA) and Multi-marker Analysis of GenoMic Annotation (MAGMA), were used. Gene sets were defined to be enriched for gene associations with DR if the P value corrected for multiple testing (Pcorr) was <.05. RESULTS: Five gene sets were significantly enriched for numerous modest genetic associations with DR in one method (MAGENTA or MAGMA) and also at least nominally significant (uncorrected P < .05) in the other method. These pathways were regulation of the lipid catabolic process (2-fold enrichment, Pcorr = .014); nitric oxide biosynthesis (1.92-fold enrichment, Pcorr = .022); lipid digestion, mobilization, and transport (1.6-fold enrichment, P = .032); apoptosis (1.53-fold enrichment, P = .041); and retinal ganglion cell degeneration (2-fold enrichment, Pcorr = .049). The interferon gamma (IFNG) gene, previously implicated in DR by protein-protein interactions in our GWAS, was among the top ranked genes in the nitric oxide pathway (best variant P = .0001). CONCLUSIONS: These GSEA indicate that variants in genes involved in oxidative stress, lipid transport and catabolism, and cell degeneration are enriched for genes associated with DR risk. NOTE: Publication of this article is sponsored by the American Ophthalmological Society.


Subject(s)
Diabetes Mellitus, Type 2 , Diabetic Retinopathy , Diabetes Mellitus, Type 2/genetics , Diabetic Retinopathy/genetics , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide , Risk Factors
4.
F1000Res ; 8: 21, 2019.
Article in English | MEDLINE | ID: mdl-30828438

ABSTRACT

Bioconductor's SummarizedExperiment class unites numerical assay quantifications with sample- and experiment-level metadata. SummarizedExperiment is the standard Bioconductor class for assays that produce matrix-like data, used by over 200 packages. We describe the restfulSE package, a deployment of this data model that supports remote storage. We illustrate use of SummarizedExperiment with remote HDF5 and Google BigQuery back ends, with two applications in cancer genomics. Our intent is to allow the use of familiar and semantically meaningful programmatic idioms to query genomic data, while abstracting the remote interface from end users and developers.


Subject(s)
Genomics , Software , Genome
5.
Diabetes ; 68(2): 441-456, 2019 02.
Article in English | MEDLINE | ID: mdl-30487263

ABSTRACT

To identify genetic variants associated with diabetic retinopathy (DR), we performed a large multiethnic genome-wide association study. Discovery included eight European cohorts (n = 3,246) and seven African American cohorts (n = 2,611). We meta-analyzed across cohorts using inverse-variance weighting, with and without liability threshold modeling of glycemic control and duration of diabetes. Variants with a P value <1 × 10(-5) were investigated in replication cohorts that included 18,545 European, 16,453 Asian, and 2,710 Hispanic subjects. After correction for multiple testing, the C allele of rs142293996 in an intron of nuclear VCP-like (NVL) was associated with DR in European discovery cohorts (P = 2.1 × 10(-9)), but did not reach genome-wide significance after meta-analysis with replication cohorts. We applied the Disease Association Protein-Protein Link Evaluator (DAPPLE) to our discovery results to test for evidence of risk being spread across underlying molecular pathways. One protein-protein interaction network built from genes in regions associated with proliferative DR was found to have significant connectivity (P = 0.0009) and corroborated with gene set enrichment analyses. These findings suggest that genetic variation in NVL, as well as variation within a protein-protein interaction network that includes genes implicated in inflammation, may influence risk for DR.
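The inverse-variance weighting step mentioned above can be sketched as follows. This is a generic fixed-effect estimator, not the study's actual pipeline: the function name `ivw_meta` and the per-cohort numbers are hypothetical, and the liability threshold modeling described in the abstract is not included.

```python
import math
from statistics import NormalDist


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis.

    betas: per-cohort effect estimates (e.g. log odds ratios)
    ses:   per-cohort standard errors of those estimates
    Returns (pooled beta, pooled standard error, z score, two-sided P value).
    """
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided P value from the standard normal distribution.
    # For very large |z| this underflows to 0.0 in double precision.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return beta, se, z, p


# Illustrative (made-up) effect sizes for one variant in three cohorts.
pooled_beta, pooled_se, z_score, p_value = ivw_meta(
    betas=[0.21, 0.35, 0.28],
    ses=[0.10, 0.15, 0.08],
)
print(pooled_beta, pooled_se, z_score, p_value)
```

Cohorts with smaller standard errors dominate the pooled estimate, which is why large cohorts drive the result of such a meta-analysis.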


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study/methods , Blood Glucose/metabolism , Diabetes Mellitus, Type 2/metabolism , Diabetic Retinopathy , Genetic Predisposition to Disease , Genotype , Glycated Hemoglobin/metabolism , Humans , Meta-Analysis as Topic , Polymorphism, Single Nucleotide/genetics , Protein Binding
6.
Acta Ophthalmol ; 96(7): e811-e819, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30178632

ABSTRACT

PURPOSE: Diabetic retinopathy is the most common eye complication in patients with diabetes. The purpose of this study is to identify genetic factors contributing to severe diabetic retinopathy. METHODS: A genome-wide association approach was applied. In the Genetics of Diabetes Audit and Research in Tayside Scotland (GoDARTS) datasets, cases of severe diabetic retinopathy were defined as type 2 diabetic patients who were ever graded as having severe background retinopathy (Level R3) or proliferative retinopathy (Level R4) in at least one eye according to the Scottish Diabetic Retinopathy Grading Scheme or who were once treated by laser photocoagulation. Controls were diabetic individuals whose longitudinal retinopathy screening records were either normal (Level R0) or only with mild background retinopathy (Level R1) in both eyes. Significant Single Nucleotide Polymorphisms (SNPs) were taken forward for meta-analysis using multiple Caucasian cohorts. RESULTS: Five hundred and sixty cases of type 2 diabetes with severe diabetic retinopathy and 4,106 controls were identified in the GoDARTS cohort. We revealed that rs3913535 in the NADPH Oxidase 4 (NOX4) gene reached a p value of 4.05 × 10(-9). Two nearby SNPs, rs10765219 and rs11018670, also showed promising p values (p values = 7.41 × 10(-8) and 1.23 × 10(-8), respectively). In the meta-analysis using multiple Caucasian cohorts (excluding GoDARTS), rs10765219 and rs11018670 showed associations for diabetic retinopathy (p = 0.003 and 0.007, respectively), while the p value of rs3913535 was not significant (p = 0.429). CONCLUSION: This genome-wide association study of severe diabetic retinopathy suggests new evidence for the involvement of the NOX4 gene.


Subject(s)
Diabetes Mellitus, Type 2/complications , Diabetic Retinopathy/genetics , NADPH Oxidase 4/genetics , Polymorphism, Single Nucleotide , Adult , Diabetic Retinopathy/etiology , Diabetic Retinopathy/surgery , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Genotyping Techniques , Humans , Laser Coagulation , Male , Middle Aged , Scotland , White People/genetics
7.
Nat Genet ; 50(4): 621-629, 2018 04.
Article in English | MEDLINE | ID: mdl-29632380

ABSTRACT

We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.


Subject(s)
Gene Expression , Genetic Predisposition to Disease , Bipolar Disorder/genetics , Body Mass Index , Brain/metabolism , Chromatin/genetics , Epigenesis, Genetic , Gene Expression Profiling/statistics & numerical data , Genome-Wide Association Study/statistics & numerical data , Humans , Immune System Diseases/genetics , Linkage Disequilibrium , Models, Genetic , Multifactorial Inheritance , Neurons/metabolism , Schizophrenia/genetics , Tissue Distribution/genetics
8.
Diabetes ; 66(12): 3130-3141, 2017 12.
Article in English | MEDLINE | ID: mdl-28951389

ABSTRACT

Results from observational studies examining dyslipidemia as a risk factor for diabetic retinopathy (DR) have been inconsistent. We evaluated the causal relationship between plasma lipids and DR using a Mendelian randomization approach. We pooled genome-wide association studies summary statistics from 18 studies for two DR phenotypes: any DR (N = 2,969 case and 4,096 control subjects) and severe DR (N = 1,277 case and 3,980 control subjects). Previously identified lipid-associated single nucleotide polymorphisms served as instrumental variables. Meta-analysis to combine the Mendelian randomization estimates from different cohorts was conducted. There was no statistically significant change in odds ratios of having any DR or severe DR for any of the lipid fractions in the primary analysis that used single nucleotide polymorphisms that did not have a pleiotropic effect on another lipid fraction. Similarly, there was no significant association in the Caucasian and Chinese subgroup analyses. This study did not show evidence of a causal role of the four lipid fractions on DR. However, the study had limited power to detect odds ratios less than 1.23 per SD in genetically induced increase in plasma lipid levels, thus we cannot exclude that causal relationships with more modest effect sizes exist.
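One common way to combine per-SNP instruments in summary-data Mendelian randomization is the inverse-variance-weighted (IVW) ratio estimator. The abstract does not say which estimator the authors used, so the sketch below only illustrates the general idea; `ivw_mr` and all input numbers are hypothetical.

```python
import math


def ivw_mr(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance-weighted Mendelian randomization estimate.

    beta_exposure: per-SNP effects on the exposure (e.g. SD of a lipid level)
    beta_outcome:  per-SNP effects on the outcome (e.g. log odds of DR)
    se_outcome:    standard errors of the per-SNP outcome effects
    Returns (causal effect estimate, approximate standard error).
    Assumes independent SNPs and ignores uncertainty in beta_exposure.
    """
    numerator = sum(
        bx * by / s ** 2
        for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / s ** 2 for bx, s in zip(beta_exposure, se_outcome)
    )
    estimate = numerator / denominator
    se = math.sqrt(1.0 / denominator)
    return estimate, se


# Made-up summary statistics for three lipid-associated SNPs.
effect, effect_se = ivw_mr(
    beta_exposure=[0.10, 0.20, 0.30],
    beta_outcome=[0.02, 0.01, 0.04],
    se_outcome=[0.03, 0.04, 0.03],
)
print(effect, effect_se)  # log odds ratio per unit increase in the exposure
```

Exponentiating the estimate gives an odds ratio per unit of genetically predicted exposure, the scale on which the abstract reports its power limit.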


Subject(s)
Diabetic Retinopathy/etiology , Lipids/blood , Mendelian Randomization Analysis , Aged , Diabetic Retinopathy/blood , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , Risk
9.
Am J Hum Genet ; 100(1): 31-39, 2017 Jan 05.
Article in English | MEDLINE | ID: mdl-28017371

ABSTRACT

Mixed models have become the tool of choice for genetic association studies; however, standard mixed model methods may be poorly calibrated or underpowered under family sampling bias and/or case-control ascertainment. Previously, we introduced a liability threshold-based mixed model association statistic (LTMLM) to address case-control ascertainment in unrelated samples. Here, we consider family-biased case-control ascertainment, where case and control subjects are ascertained non-randomly with respect to family relatedness. Previous work has shown that this type of ascertainment can severely bias heritability estimates; we show here that it also impacts mixed model association statistics. We introduce a family-based association statistic (LT-Fam) that is robust to this problem. Similar to LTMLM, LT-Fam is computed from posterior mean liabilities (PML) under a liability threshold model; however, LT-Fam uses published narrow-sense heritability estimates to avoid the problem of biased heritability estimation, enabling correct calibration. In simulations with family-biased case-control ascertainment, LT-Fam was correctly calibrated (average χ(2) = 1.00-1.02 for null SNPs), whereas the Armitage trend test (ATT), standard mixed model association (MLM), and case-control retrospective association test (CARAT) were mis-calibrated (e.g., average χ(2) = 0.50-1.22 for MLM, 0.89-2.65 for CARAT). LT-Fam also attained higher power than other methods in some settings. In 1,259 type 2 diabetes-affected case subjects and 5,765 control subjects from the CARe cohort, downsampled to induce family-biased ascertainment, LT-Fam was correctly calibrated whereas ATT, MLM, and CARAT were again mis-calibrated. Our results highlight the importance of modeling family sampling bias in case-control datasets with related samples.


Subject(s)
Family , Genetic Association Studies/methods , Models, Genetic , Bias , Calibration , Diabetes Mellitus, Type 2/genetics , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Retrospective Studies
10.
Nat Genet ; 47(12): 1385-92, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26523775

ABSTRACT

Heritability analyses of genome-wide association study (GWAS) cohorts have yielded important insights into complex disease architecture, and increasing sample sizes hold the promise of further discoveries. Here we analyze the genetic architectures of schizophrenia in 49,806 samples from the PGC and nine complex diseases in 54,734 samples from the GERA cohort. For schizophrenia, we infer an overwhelmingly polygenic disease architecture in which ≥71% of 1-Mb genomic regions harbor ≥1 variant influencing schizophrenia risk. We also observe significant enrichment of heritability in GC-rich regions and in higher-frequency SNPs for both schizophrenia and GERA diseases. In bivariate analyses, we observe significant genetic correlations (ranging from 0.18 to 0.85) for several pairs of GERA diseases; genetic correlations were on average 1.3 times stronger than the correlations of overall disease liabilities. To accomplish these analyses, we developed a fast algorithm for multicomponent, multi-trait variance-components analysis that overcomes prior computational barriers that made such analyses intractable at this scale.


Subject(s)
Algorithms , Analysis of Variance , Genetic Predisposition to Disease , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Schizophrenia/classification , Schizophrenia/genetics , Aging/genetics , Genome-Wide Association Study , Humans , Phenotype , Risk Factors , Schizophrenia/epidemiology
11.
Am J Hum Genet ; 96(5): 720-30, 2015 May 07.
Article in English | MEDLINE | ID: mdl-25892111

ABSTRACT

We introduce a liability-threshold mixed linear model (LTMLM) association statistic for case-control studies and show that it has a well-controlled false-positive rate and more power than existing mixed-model methods for diseases with low prevalence. Existing mixed-model methods suffer a loss in power under case-control ascertainment, but no solution has been proposed. Here, we solve this problem by using a χ(2) score statistic computed from posterior mean liabilities (PMLs) under the liability-threshold model. Each individual's PML is conditional not only on that individual's case-control status but also on every individual's case-control status and the genetic relationship matrix (GRM) obtained from the data. The PMLs are estimated with a multivariate Gibbs sampler; the liability-scale phenotypic covariance matrix is based on the GRM, and a heritability parameter is estimated via Haseman-Elston regression on case-control phenotypes and then transformed to the liability scale. In simulations of unrelated individuals, the LTMLM statistic was correctly calibrated and achieved higher power than existing mixed-model methods for diseases with low prevalence, and the magnitude of the improvement depended on sample size and severity of case-control ascertainment. In a Wellcome Trust Case Control Consortium 2 multiple sclerosis dataset with >10,000 samples, LTMLM was correctly calibrated and attained a 4.3% improvement (p = 0.005) in χ(2) statistics over existing mixed-model methods at 75 known associated SNPs, consistent with simulations. Larger increases in power are expected at larger sample sizes. In conclusion, case-control studies of diseases with low prevalence can achieve power higher than that in existing mixed-model methods.
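To give a feel for what a posterior mean liability is, the sketch below computes it in the simplest possible setting: a single individual, conditioning only on that individual's own case-control status and an assumed disease prevalence. This is a deliberate simplification, not the LTMLM procedure, which also conditions on all other individuals through the GRM and uses a Gibbs sampler. The function name `posterior_mean_liability` is hypothetical.

```python
from statistics import NormalDist


def posterior_mean_liability(is_case, prevalence):
    """Mean of a standard-normal liability given case-control status alone.

    Under the liability threshold model an individual is a case when the
    liability exceeds the threshold t with P(liability > t) = prevalence.
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    density = std_normal.pdf(threshold)
    if is_case:
        # Mean of a standard normal truncated to (t, +inf).
        return density / prevalence
    # Mean of a standard normal truncated to (-inf, t).
    return -density / (1.0 - prevalence)


# Illustrative prevalence of 1%: cases sit far out in the upper tail,
# controls are only slightly below the population mean of zero.
print(posterior_mean_liability(True, 0.01))
print(posterior_mean_liability(False, 0.01))
```

For a rare disease the case value is large and the control value is close to zero, which matches the abstract's emphasis on diseases with low prevalence: case status then carries much more information than a 0/1 coding suggests.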


Subject(s)
Genetic Association Studies , Models, Genetic , Models, Theoretical , Case-Control Studies , Chromosome Mapping , Computer Simulation , Humans , Multiple Sclerosis/genetics , Multiple Sclerosis/pathology , Phenotype , Polymorphism, Single Nucleotide , Sample Size
12.
Nat Genet ; 46(12): 1356-62, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25383972

ABSTRACT

Despite recent progress on estimating the heritability explained by genotyped SNPs (h(2)g), a large gap between h(2)g and estimates of total narrow-sense heritability (h(2)) remains. Explanations for this gap include rare variants or upward bias in family-based estimates of h(2) due to shared environment or epistasis. We estimate h(2) from unrelated individuals in admixed populations by first estimating the heritability explained by local ancestry (h(2)γ). We show that h(2)γ = 2FSTCθ(1 - θ)h(2), where FSTC measures frequency differences between populations at causal loci and θ is the genome-wide ancestry proportion. Our approach is not susceptible to biases caused by epistasis or shared environment. We applied this approach to the analysis of 13 phenotypes in 21,497 African-American individuals from 3 cohorts. For height and body mass index (BMI), we obtained h(2) estimates of 0.55 ± 0.09 and 0.23 ± 0.06, respectively, which are larger than estimates of h(2)g in these and other data but smaller than family-based estimates of h(2).
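Written with explicit subscripts and superscripts, the relation quoted in the abstract, together with its rearrangement for the total narrow-sense heritability, reads as below. The rearranged form is plain algebra added for illustration and is not quoted from the abstract; the placement of "STC" as a single subscript is an assumption about the flattened notation.

```latex
h_\gamma^2 = 2\,F_{STC}\,\theta\,(1-\theta)\,h^2
\qquad\Longrightarrow\qquad
h^2 = \frac{h_\gamma^2}{2\,F_{STC}\,\theta\,(1-\theta)}
```

Here $h_\gamma^2$ is the heritability explained by local ancestry, $F_{STC}$ measures frequency differences between populations at causal loci, and $\theta$ is the genome-wide ancestry proportion, as defined in the abstract.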


Subject(s)
Genetics, Population/methods , Genome-Wide Association Study , Multifactorial Inheritance , Quantitative Trait, Heritable , Black or African American/genetics , Aged , Black People , Body Mass Index , Cardiovascular Diseases/genetics , Case-Control Studies , Chromosome Mapping , Cohort Studies , Computer Simulation , Epistasis, Genetic , Female , Genotype , Humans , Male , Middle Aged , Models, Genetic , Models, Statistical , Phenotype , Polymorphism, Single Nucleotide , Prostatic Neoplasms/genetics , Reproducibility of Results , United States
13.
PLoS Genet ; 9(5): e1003520, 2013 May.
Article in English | MEDLINE | ID: mdl-23737753

ABSTRACT

Important knowledge about the determinants of complex human phenotypes can be obtained from the estimation of heritability, the fraction of phenotypic variation in a population that is determined by genetic factors. Here, we make use of extensive phenotype data in Iceland, long-range phased genotypes, and a population-wide genealogical database to examine the heritability of 11 quantitative and 12 dichotomous phenotypes in a sample of 38,167 individuals. Most previous estimates of heritability are derived from family-based approaches such as twin studies, which may be biased upwards by epistatic interactions or shared environment. Our estimates of heritability, based on both closely and distantly related pairs of individuals, are significantly lower than those from previous studies. We examine phenotypic correlations across a range of relationships, from siblings to first cousins, and find that the excess phenotypic correlation in these related individuals is predominantly due to shared environment as opposed to dominance or epistasis. We also develop a new method to jointly estimate narrow-sense heritability and the heritability explained by genotyped SNPs. Unlike existing methods, this approach permits the use of information from both closely and distantly related pairs of individuals, thereby reducing the variance of estimates of heritability explained by genotyped SNPs while preventing upward bias. Our results show that common SNPs explain a larger proportion of the heritability than previously thought, with SNPs present on Illumina 300K genotyping arrays explaining more than half of the heritability for the 23 phenotypes examined in this study. Much of the remaining heritability is likely to be due to rare alleles that are not captured by standard genotyping arrays.


Subject(s)
Genealogy and Heraldry , Genetic Variation , Heredity , Quantitative Trait Loci/genetics , Genotype , Humans , Iceland , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide , Siblings
14.
Hum Genet ; 132(9): 1039-47, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23666277

ABSTRACT

Asthma originates from genetic and environmental factors with about half the risk of disease attributable to heritable causes. Genome-wide association studies, mostly in populations of European ancestry, have identified numerous asthma-associated single nucleotide polymorphisms (SNPs). Studies in populations with diverse ancestries allow both for identification of robust associations that replicate across ethnic groups and for improved resolution of associated loci due to different patterns of linkage disequilibrium between ethnic groups. Here we report on an analysis of 745 African-American subjects with asthma and 3,238 African-American control subjects from the Candidate Gene Association Resource (CARe) Consortium, including analysis of SNPs imputed using 1,000 Genomes reference panels and adjustment for local ancestry. We show strong evidence that variation near RAD50/IL13, implicated in studies of European ancestry individuals, replicates in individuals largely of African ancestry. Fine mapping in African ancestry populations also refined the variants of interest for this association. We also provide strong or nominal evidence of replication at loci near ORMDL3/GSDMB, IL1RL1/IL18R1, and 10p14, all previously associated with asthma in European or Japanese populations, but not at the PYHIN1 locus previously reported in studies of African-American samples. These results improve the understanding of asthma genetics and further demonstrate the utility of genetic studies in populations other than those of largely European ancestry.


Subject(s)
Asthma/genetics , Black People/genetics , Chromosomes, Human, Pair 10/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation , Acid Anhydride Hydrolases , Asthma/ethnology , DNA Repair Enzymes/genetics , DNA-Binding Proteins/genetics , Female , Genetic Association Studies , Genetic Loci/genetics , Genotype , Humans , Interleukin-13/genetics , Male , Membrane Proteins/genetics , Neoplasm Proteins/genetics , Polymorphism, Single Nucleotide/genetics , Receptors, Interleukin-1/genetics , Receptors, Interleukin-18/genetics
15.
Bioinformatics ; 29(11): 1399-406, 2013 Jun 01.
Article in English | MEDLINE | ID: mdl-23539302

ABSTRACT

MOTIVATION: Inference of ancestry using genetic data is motivated by applications in genetic association studies, population genetics and personal genomics. Here, we provide methods and software for improved ancestry inference using genome-wide single nucleotide polymorphism (SNP) weights from external reference panels. This approach makes it possible to leverage the rich ancestry information that is available from large external reference panels, without the administrative and computational complexities of re-analyzing the raw genotype data from the reference panel in subsequent studies. RESULTS: We extensively validate our approach in multiple African American, Latino American and European American datasets, making use of genome-wide SNP weights derived from large reference panels, including HapMap 3 populations and 6546 European Americans from the Framingham Heart Study. We show empirically that our approach provides much greater accuracy than either the prevailing ancestry-informative marker (AIM) approach or the analysis of genome-wide target genotypes without a reference panel. For example, in an independent set of 1636 European American genome-wide association study samples, we attained prediction accuracy (R(2)) of 1.000 and 0.994 for the first two principal components using our method, compared with 0.418 and 0.407 using 150 published AIMs or 0.955 and 0.003 by applying principal component analysis directly to the target samples. We finally show that the higher accuracy in inferring ancestry using our method leads to more effective correction for population stratification in association studies. AVAILABILITY: The SNPweights software is available online at http://www.hsph.harvard.edu/faculty/alkes-price/software/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Racial Groups/genetics , Black or African American/genetics , Genotype , HapMap Project , Humans , Principal Component Analysis , Software , United States/ethnology , White People/genetics
16.
PLoS Genet ; 8(11): e1003032, 2012.
Article in English | MEDLINE | ID: mdl-23144628

ABSTRACT

Genetic case-control association studies often include data on clinical covariates, such as body mass index (BMI), smoking status, or age, that may modify the underlying genetic risk of case or control samples. For example, in type 2 diabetes, odds ratios for established variants estimated from low-BMI cases are larger than those estimated from high-BMI cases. An unanswered question is how to use this information to maximize statistical power in case-control studies that ascertain individuals on the basis of phenotype (case-control ascertainment) or phenotype and clinical covariates (case-control-covariate ascertainment). While current approaches improve power in studies with random ascertainment, they often lose power under case-control ascertainment and fail to capture available power increases under case-control-covariate ascertainment. We show that an informed conditioning approach, based on the liability threshold model with parameters informed by external epidemiological information, fully accounts for disease prevalence and non-random ascertainment of phenotype as well as covariates and provides a substantial increase in power while maintaining a properly controlled false-positive rate. Our method outperforms standard case-control association tests with or without covariates, tests of gene x covariate interaction, and previously proposed tests for dealing with covariates in ascertained data, with especially large improvements in the case of case-control-covariate ascertainment. We investigate empirical case-control studies of type 2 diabetes, prostate cancer, lung cancer, breast cancer, rheumatoid arthritis, age-related macular degeneration, and end-stage kidney disease over a total of 89,726 samples. In these datasets, informed conditioning outperforms logistic regression for 115 of the 157 known associated variants investigated (P-value = 1 × 10(-9)). The improvement varied across diseases with a 16% median increase in χ(2) test statistics and a commensurate increase in power. This suggests that applying our method to existing and future association studies of these diseases may identify novel disease loci.


Subject(s)
Case-Control Studies , Genetic Association Studies/statistics & numerical data , Genetic Predisposition to Disease , Models, Genetic , Age Factors , Body Mass Index , Chromosome Mapping , Factor Analysis, Statistical , Female , Genotype , Humans , Logistic Models , Male , Polymorphism, Single Nucleotide , Smoking
17.
Bioinformatics ; 28(13): 1729-37, 2012 Jul 01.
Article in English | MEDLINE | ID: mdl-22556366

ABSTRACT

MOTIVATION: The question of how to best use information from known associated variants when conducting disease association studies has yet to be answered. Some studies compute a marginal P-value for each single nucleotide polymorphism (SNP) independently, ignoring previously discovered variants. Other studies include known variants as covariates in logistic regression, but a weakness of this standard conditioning strategy is that it does not account for disease prevalence and non-random ascertainment, which can induce a correlation structure between candidate variants and known associated variants even if the variants lie on different chromosomes. Here, we propose a new conditioning approach, which is based in part on the classical technique of liability threshold modeling. Roughly, this method estimates model parameters for each known variant while accounting for the published disease prevalence from the epidemiological literature. RESULTS: We show via simulation and application to empirical datasets that our approach outperforms both the no conditioning strategy and the standard conditioning strategy, with a properly controlled false-positive rate. Furthermore, in multiple data sets involving diseases of low prevalence, standard conditioning produces a severe drop in test statistics whereas our approach generally performs as well as or better than no conditioning. Our approach may substantially improve disease gene discovery for diseases with many known risk variants. AVAILABILITY: LTSOFT software is available online at http://www.hsph.harvard.edu/faculty/alkes-price/software/.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Case-Control Studies , Genetic Association Studies , Humans , Logistic Models , Models, Statistical , Prevalence , Risk , Software
18.
Am J Hum Genet ; 89(3): 368-81, 2011 Sep 09.
Article in English | MEDLINE | ID: mdl-21907010

ABSTRACT

The study of recent natural selection in human populations has important applications to human history and medicine. Positive natural selection drives the increase in beneficial alleles and plays a role in explaining diversity across human populations. By discovering traits subject to positive selection, we can better understand the population level response to environmental pressures including infectious disease. Our study examines unusual population differentiation between three large data sets to detect natural selection. The populations examined, African Americans, Nigerians, and Gambians, are genetically close to one another (F(ST) < 0.01 for all pairs), allowing us to detect selection even with moderate changes in allele frequency. We also develop a tree-based method to pinpoint the population in which selection occurred, incorporating information across populations. Our genome-wide significant results corroborate loci previously reported to be under selection in Africans including HBB and CD36. At the HLA locus on chromosome 6, results suggest the existence of multiple, independent targets of population-specific selective pressure. In addition, we report a genome-wide significant (p = 1.36 × 10(-11)) signal of selection in the prostate stem cell antigen (PSCA) gene. The most significantly differentiated marker in our analysis, rs2920283, is highly differentiated in both Africa and East Asia and has prior genome-wide significant associations to bladder and gastric cancers.


Subject(s)
Black People/genetics , Black or African American/genetics , Genetic Variation , Genetics, Population , Genome, Human/genetics , Selection, Genetic , Antigens, Neoplasm/genetics , CD36 Antigens/genetics , GPI-Linked Proteins/genetics , Gambia , Gene Frequency , Genotype , HLA Antigens/genetics , Humans , Models, Genetic , Neoplasm Proteins/genetics , Nigeria , United States
19.
Nature ; 467(7311): 52-8, 2010 Sep 02.
Article in English | MEDLINE | ID: mdl-20811451

ABSTRACT

Despite great progress in identifying genetic variants that influence human disease, most inherited risk remains unexplained. A more complete understanding requires genome-wide studies that fully examine less common alleles in populations with a wide range of ancestry. To inform the design and interpretation of such studies, we genotyped 1.6 million common single nucleotide polymorphisms (SNPs) in 1,184 reference individuals from 11 global populations, and sequenced ten 100-kilobase regions in 692 of these individuals. This integrated data set of common and rare alleles, called 'HapMap 3', includes both SNPs and copy number polymorphisms (CNPs). We characterized population-specific differences among low-frequency variants, measured the improvement in imputation accuracy afforded by the larger reference panel, especially in imputing SNPs with a minor allele frequency of

Subject(s)
DNA Copy Number Variations , Genome, Human , Polymorphism, Single Nucleotide , Population Groups/genetics , Human Genome Project , Humans