ABSTRACT
Large biobank samples provide an opportunity to integrate broad phenotyping, familial records, and molecular genetics data to study complex traits and diseases. We introduce Pearson-Aitken Family Genetic Risk Scores (PA-FGRS), a method for estimating disease liability from patterns of diagnoses in extended, age-censored genealogical records. We then apply the method to study a paradigmatic complex disorder, major depressive disorder (MDD), using the iPSYCH2015 case-cohort study of 30,949 MDD cases, 39,655 random population controls, and more than 2 million relatives. We show that combining PA-FGRS liabilities estimated from family records with molecular genotypes of probands improves three lines of inquiry. Incorporating PA-FGRS liabilities improves classification of MDD over and above polygenic scores, identifies robust genetic contributions to clinical heterogeneity in MDD associated with comorbidity, recurrence, and severity and can improve the power of genome-wide association studies. Our method is flexible and easy to use, and our study approaches are generalizable to other datasets and other complex traits and diseases.
ABSTRACT
Existing methods for generating synthetic genotype data are ill-suited for replicating the effects of assortative mating (AM). We propose rb_dplr, a novel and computationally efficient algorithm for generating high-dimensional binary random variates that effectively recapitulates AM-induced genetic architectures using the Bahadur order-2 approximation of the multivariate Bernoulli distribution. The rBahadur R library is available through the Comprehensive R Archive Network at https://CRAN.R-project.org/package=rBahadur .
Subject(s)
Algorithms, Cell Communication, Binomial Distribution, Computer Simulation, Genotype
ABSTRACT
INTRODUCTION: Tobacco smoking is the leading cause of preventable death globally. Smoking quantity, measured in cigarettes per day, is influenced both by the age of onset of regular smoking (AOS) and by genetic factors, including a strong effect of the nonsynonymous single-nucleotide polymorphism rs16969968. A previous study by Hartz et al. reported an interaction between these two factors, whereby rs16969968 risk allele carriers who started smoking earlier showed increased risk for heavy smoking compared with those who started later. This finding has yet to be replicated in a large, independent sample. METHODS: We performed a preregistered, direct replication attempt of the rs16969968 × AOS interaction on smoking quantity in 128 383 unrelated individuals from the UK Biobank, meta-analyzed across ancestry groups. We fit statistical association models mirroring the original publication as well as formal interaction tests on multiple phenotypic and analytical scales. RESULTS: We replicated the main effects of rs16969968 and AOS on cigarettes per day but failed to replicate the interaction using previous methods. Nominal significance of the rs16969968 × AOS interaction term depended strongly on the scale of analysis and the particular phenotype, as did associations stratified by early/late AOS. No interaction tests passed genome-wide correction (α = 5e-8), and all estimated interaction effect sizes were much smaller in magnitude than previous estimates. CONCLUSIONS: We failed to replicate the strong rs16969968 × AOS interaction effect previously reported. If such gene-moderator interactions influence complex traits, they likely depend on scale of measurement, and current biobanks lack the power to detect significant genome-wide associations given the minute effect sizes expected. IMPLICATIONS: We failed to replicate the strong rs16969968 × AOS interaction effect on smoking quantity previously reported. 
If such gene-moderator interactions influence complex traits, current biobanks lack the power to detect significant genome-wide associations given the minute effect sizes expected. Furthermore, many potential interaction effects are likely to depend on the scale of measurement employed.
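The scale dependence noted in this abstract can be illustrated with a minimal sketch. The variable names, allele frequency, and effect sizes below are hypothetical and are not taken from the study. The simulated outcome is purely additive on the log scale, yet an ordinary least-squares fit on the raw scale still returns a nonzero interaction coefficient.

```python
import numpy as np

def interaction_coefficient(y, g, e):
    """OLS estimate of the g*e coefficient in y ~ 1 + g + e + g*e."""
    X = np.column_stack([np.ones_like(g), g, e, g * e])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[3]

rng = np.random.default_rng(0)
n = 10_000
# Hypothetical risk-allele count (0, 1, 2) and early-onset indicator (0, 1).
g = rng.binomial(2, 0.35, size=n).astype(float)
e = rng.binomial(1, 0.5, size=n).astype(float)

# Assumed generating model: additive on the log scale, no noise, no interaction.
log_y = 1.0 + 0.10 * g + 0.30 * e
raw_y = np.exp(log_y)

beta_log = interaction_coefficient(log_y, g, e)  # essentially zero
beta_raw = interaction_coefficient(raw_y, g, e)  # positive: induced by the scale alone

print(f"log-scale interaction: {beta_log:.2e}")
print(f"raw-scale interaction: {beta_raw:.3f}")
```

Because the exponential transform turns additive effects into multiplicative ones, the raw-scale interaction here reflects only the choice of measurement scale, not any interaction in the generating model.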
Subject(s)
Smoking, Age of Onset, Genetic Predisposition to Disease, Humans, Nerve Tissue Proteins/genetics, Single Nucleotide Polymorphism, Nicotinic Receptors/genetics, Smoking/genetics, Tobacco Smoking
ABSTRACT
BACKGROUND: Linear mixed-effects models (LMM) are a leading method in conducting genome-wide association studies (GWAS) but require residual maximum likelihood (REML) estimation of variance components, which is computationally demanding. Previous work has reduced the computational burden of variance component estimation by replacing direct matrix operations with iterative and stochastic methods and by employing loose tolerances to limit the number of iterations in the REML optimization procedure. Here, we introduce two novel algorithms, stochastic Lanczos derivative-free REML (SLDF_REML) and Lanczos first-order Monte Carlo REML (L_FOMC_REML), that exploit problem structure via the principle of Krylov subspace shift-invariance to speed computation beyond existing methods. Both novel algorithms only require a single round of computation involving iterative matrix operations, after which their respective objectives can be repeatedly evaluated using vector operations. Further, in contrast to existing stochastic methods, SLDF_REML can exploit precomputed genomic relatedness matrices (GRMs), when available, to further speed computation. RESULTS: Results of numerical experiments are congruent with theory and demonstrate that interpreted-language implementations of both algorithms match or exceed existing compiled-language software packages in speed, accuracy, and flexibility. CONCLUSIONS: Both the SLDF_REML and L_FOMC_REML algorithms outperform existing methods for REML estimation of variance components for LMM and are suitable for incorporation into existing GWAS LMM software implementations.
Subject(s)
Algorithms, Genomics, Likelihood Functions, Linear Models, Monte Carlo Method, Software, Stochastic Processes, Time Factors
ABSTRACT
Moore and Thoemmes elaborate on one particular source of difficulty in the study of candidate gene-by-environment interactions (cG × E): how different biologically plausible configurations of gene-environment covariation can bias estimates of cG × E when not explicitly modeled. However, even if cG × E investigators were able to account for the sources of bias Moore and Thoemmes elaborate, it is unlikely that conventional approaches would yield reliable results. Published cG × E findings to date have generally employed inadequate analytic procedures, have relied on samples orders of magnitude too small to detect plausible effects, and have relied on a particular candidate gene approach that has been unfruitful and largely jettisoned in mainstream genetic analyses of complex traits. Analytic procedures for the study of gene-environment interplay must evolve to meet the challenges that the genetic architecture of complex traits presents, and investigators must collaborate on grander scales if we hope to begin to understand how specific genes and environments combine to affect behavior.
Subject(s)
Environment, Gene-Environment Interaction, Genetic Testing, Humans, Phenotype
ABSTRACT
BACKGROUND: The understanding of the molecular genetic contributions to smoking is largely limited to the additive effects of individual single nucleotide polymorphisms (SNPs), but the underlying genetic risk is likely to also include dominance, epistatic, and gene-environment interactions. METHODS: To begin to address this complexity, we attempted to identify genetic interactions between rs16969968, the most replicated SNP associated with smoking quantity, and all SNPs and genes across the genome. RESULTS: Using the UK Biobank European subsample, we found one SNP, rs1892967, and two genes, PCNA and TMEM230, that showed a significant genome-wide interaction with rs16969968 for log10 CPD and raw CPD, respectively, in a sample of 116 442 individuals who self-reported currently or previously smoking. We extended these analyses to individuals of South Asian descent and meta-analyzed the combined sample of 117 212 individuals of European and South Asian ancestry. We replicated the gene findings in a meta-analysis of five Finnish samples (N=40 140): FinHealth, FINRISK, Finnish Twin Cohort, GeneRISK, and Health-2000-2011. CONCLUSIONS: To our knowledge, this represents the first reliable epistatic association between single nucleotide polymorphisms for smoking behaviors and provides a novel direction for possible future functional studies related to this interaction. Furthermore, this work demonstrates the feasibility of these analyses by pooling multiple datasets across various ancestries, which may be applied to other top SNPs for smoking and/or other phenotypes.
Subject(s)
Parkinson Disease, Tobacco Products, Humans, Human Chromosomes Pair 20, Genetic Predisposition to Disease, Genome-Wide Association Study, Membrane Proteins/genetics, Single Nucleotide Polymorphism/genetics, Smoking/genetics, South Asian People, United Kingdom, White People
ABSTRACT
Emerging evidence has shown that assortative mating (AM) is a key factor that shapes the landscape of complex human traits. It can increase the overall prevalence of disorders, influence occurrences of comorbidities, and bias estimation of genetic architectures. However, there is a lack of large-scale studies examining cultural differences and generational trends in AM for psychiatric disorders. Here, using national registry datasets, we conduct the largest-scale AM analyses of nine psychiatric disorders, with up to 1.4 million mated cases and 6 million matched controls. We performed meta-analyses on AM estimates from Taiwan, Denmark, and Sweden to examine the potential impact of cultural differences. Generational changes for people born after the 1930s were investigated as well. We found that AM for psychiatric disorders is consistent across nations and persistent over generations, with a small proportion of disorders showing generational changes in AM. Our results provide additional insight into the mechanisms of AM across psychiatric disorders and have evident implications for the estimation of the genetic architectures of psychiatric disorders.
ABSTRACT
Over three percent of people carry a dominant pathogenic mutation, yet only a fraction of carriers develop disease (incomplete penetrance), and phenotypes from mutations in the same gene range from mild to severe (variable expressivity). Here, we investigate underlying mechanisms for this heterogeneity: variable variant effect sizes, carrier polygenic backgrounds, and modulation of carrier effect by genetic background (epistasis). We leveraged exomes and clinical phenotypes from the UK Biobank and the Mt. Sinai BioMe Biobank to identify carriers of pathogenic variants affecting cardiometabolic traits. We employed recently developed methods to study these cohorts, observing strong statistical support and clinical translational potential for all three mechanisms of variable penetrance and expressivity. For example, scores from our recent model of variant pathogenicity were tightly correlated with phenotype amongst clinical variant carriers, they predicted effects of variants of unknown significance, and they distinguished gain- from loss-of-function variants. We also found that polygenic scores predicted phenotypes amongst pathogenic carriers and that epistatic effects can exceed main carrier effects by an order of magnitude.
ABSTRACT
Biobanks often contain several phenotypes relevant to diseases such as major depressive disorder (MDD), with partly distinct genetic architectures. Researchers face complex tradeoffs between shallow (large sample size, low specificity/sensitivity) and deep (small sample size, high specificity/sensitivity) phenotypes, and the optimal choices are often unclear. Here we propose to integrate these phenotypes to combine the benefits of each. We use phenotype imputation to integrate information across hundreds of MDD-relevant phenotypes, which significantly increases genome-wide association study (GWAS) power and polygenic risk score (PRS) prediction accuracy of the deepest available MDD phenotype in UK Biobank, LifetimeMDD. We demonstrate that imputation preserves specificity in its genetic architecture using a novel PRS-based pleiotropy metric. We further find that integration via summary statistics also enhances GWAS power and PRS predictions, but can introduce nonspecific genetic effects depending on input. Our work provides a simple and scalable approach to improve genetic studies in large biobanks by integrating shallow and deep phenotypes.
Subject(s)
Major Depressive Disorder, Humans, Major Depressive Disorder/genetics, Genetic Predisposition to Disease, Biological Specimen Banks, Genome-Wide Association Study, Multifactorial Inheritance/genetics, Phenotype, Single Nucleotide Polymorphism/genetics
ABSTRACT
Many traits are subject to assortative mating, with recent molecular genetic findings confirming longstanding theoretical predictions that assortative mating induces long-range dependence across causal variants. However, all marker-based heritability estimators implicitly assume mating is random. We provide mathematical and simulation-based evidence demonstrating that both method-of-moments and likelihood-based estimators are biased in the presence of assortative mating and derive corrected heritability estimators for traits subject to assortment. Finally, we demonstrate that the empirical patterns of estimates across methods and sample sizes for real traits subject to assortative mating are congruent with expected assortative mating-induced biases. For example, marker-based heritability estimates for height are 14%-23% higher than corrected estimates using UK Biobank data.
Subject(s)
Algorithms, Population Genetics/methods, Genetic Models, Reproduction/genetics, Bias, Computer Simulation, Female, Genome-Wide Association Study/methods, Humans, Likelihood Functions, Linkage Disequilibrium, Male, Mendelian Randomization Analysis/methods, Phenotype, Single Nucleotide Polymorphism, Heritable Quantitative Trait
ABSTRACT
The observation of genetic correlations between disparate human traits has been interpreted as evidence of widespread pleiotropy. Here, we introduce cross-trait assortative mating (xAM) as an alternative explanation. We observe that xAM affects many phenotypes and that phenotypic cross-mate correlation estimates are strongly associated with genetic correlation estimates (R² = 74%). We demonstrate that existing xAM plausibly accounts for substantial fractions of genetic correlation estimates and that previously reported genetic correlation estimates between some pairs of psychiatric disorders are congruent with xAM alone. Finally, we provide evidence for a history of xAM at the genetic level using cross-trait even/odd chromosome polygenic score correlations. Together, our results demonstrate that previous reports have likely overestimated the true genetic similarity between many phenotypes.
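The even/odd chromosome comparison mentioned in this abstract can be sketched as follows. This is an illustrative reading of the idea, not the authors' pipeline: the function name, the simulated genotypes, and the random effect weights are all assumptions. A score for one trait is built only from odd-numbered chromosomes and a score for the other trait only from even-numbered chromosomes. Since variants on different chromosomes segregate independently, the two scores should be roughly uncorrelated under random mating.

```python
import numpy as np

def even_odd_pgs_correlation(genotypes, chrom, weights_x, weights_y):
    """Correlate a trait-X score from odd chromosomes with a trait-Y score from even chromosomes."""
    odd = (chrom % 2) == 1
    even = ~odd
    pgs_x_odd = genotypes[:, odd] @ weights_x[odd]
    pgs_y_even = genotypes[:, even] @ weights_y[even]
    return float(np.corrcoef(pgs_x_odd, pgs_y_even)[0, 1])

rng = np.random.default_rng(1)
n_people, n_snps = 5_000, 220
chrom = np.repeat(np.arange(1, 23), n_snps // 22)  # 10 hypothetical SNPs per autosome

# Simulated random mating: every genotype is drawn independently.
genotypes = rng.binomial(2, 0.3, size=(n_people, n_snps)).astype(float)
weights_x = rng.normal(size=n_snps)  # hypothetical per-SNP effects on trait X
weights_y = rng.normal(size=n_snps)  # hypothetical per-SNP effects on trait Y

r_random_mating = even_odd_pgs_correlation(genotypes, chrom, weights_x, weights_y)
print(f"even/odd cross-trait correlation under random mating: {r_random_mating:.3f}")
```

In this simulation the correlation differs from zero only by sampling noise, which is the baseline against which a cross-trait even/odd correlation in real data would be compared.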
Subject(s)
Genome-Wide Association Study, Multifactorial Inheritance, Humans, Cell Communication, Phenotype
ABSTRACT
OBJECTIVE: To investigate the genetic architecture of internalizing symptoms in childhood and adolescence. METHOD: In 22 cohorts, multiple univariate genome-wide association studies (GWASs) were performed using repeated assessments of internalizing symptoms, in a total of 64,561 children and adolescents between 3 and 18 years of age. Results were aggregated in meta-analyses that accounted for sample overlap, first using all available data, and then using subsets of measurements grouped by rater, age, and instrument. RESULTS: The meta-analysis of overall internalizing symptoms (INToverall) detected no genome-wide significant hits and showed low single nucleotide polymorphism (SNP) heritability (1.66%, 95% CI = 0.84-2.48%, neffective = 132,260). Stratified analyses indicated rater-based heterogeneity in genetic effects, with self-reported internalizing symptoms showing the highest heritability (5.63%, 95% CI = 3.08%-8.18%). The contribution of additive genetic effects on internalizing symptoms appeared to be stable over age, with overlapping estimates of SNP heritability from early childhood to adolescence. Genetic correlations were observed with adult anxiety, depression, and the well-being spectrum (|rg| > 0.70), as well as with insomnia, loneliness, attention-deficit/hyperactivity disorder, autism, and childhood aggression (range |rg| = 0.42-0.60), whereas there were no robust associations with schizophrenia, bipolar disorder, obsessive-compulsive disorder, or anorexia nervosa. CONCLUSION: Genetic correlations indicate that childhood and adolescent internalizing symptoms share substantial genetic vulnerabilities with adult internalizing disorders and other childhood psychiatric traits, which could partially explain both the persistence of internalizing symptoms over time and the high comorbidity among childhood psychiatric traits. Reducing phenotypic heterogeneity in childhood samples will be key in paving the way to future GWAS success.
Subject(s)
Attention Deficit Disorder with Hyperactivity, Autistic Disorder, Genome-Wide Association Study, Sleep Initiation and Maintenance Disorders, Adolescent, Adult, Aggression, Anxiety/genetics, Attention Deficit Disorder with Hyperactivity/genetics, Autistic Disorder/genetics, Bipolar Disorder, Child, Preschool Child, Depression/genetics, Humans, Loneliness, Single Nucleotide Polymorphism, Schizophrenia, Sleep Initiation and Maintenance Disorders/genetics
ABSTRACT
Childhood aggressive behavior (AGG) has a substantial heritability of around 50%. Here we present a genome-wide association meta-analysis (GWAMA) of childhood AGG, in which all phenotype measures across childhood ages from multiple assessors were included. We analyzed phenotype assessments for a total of 328 935 observations from 87 485 children aged between 1.5 and 18 years, while accounting for sample overlap. We also meta-analyzed within subsets of the data, i.e., within rater, instrument and age. SNP-heritability for the overall meta-analysis (AGGoverall) was 3.31% (SE = 0.0038). We found no genome-wide significant SNPs for AGGoverall. The gene-based analysis returned three significant genes: ST3GAL3 (P = 1.6E-06), PCDH7 (P = 2.0E-06), and IPO13 (P = 2.5E-06). All three genes have previously been associated with educational traits. Polygenic scores based on our GWAMA significantly predicted aggression in a holdout sample of children (variance explained = 0.44%) and in retrospectively assessed childhood aggression (variance explained = 0.20%). Genetic correlations (rg) among rater-specific assessment of AGG ranged from rg = 0.46 between self- and teacher-assessment to rg = 0.81 between mother- and teacher-assessment. We obtained moderate-to-strong rgs with selected phenotypes from multiple domains, but hardly with any of the classical biomarkers thought to be associated with AGG. Significant genetic correlations were observed with most psychiatric and psychological traits (range |rg|: 0.19-1.00), except for obsessive-compulsive disorder. Aggression had a negative genetic correlation (rg = ~-0.5) with cognitive traits and age at first birth. Aggression was strongly genetically correlated with smoking phenotypes (range |rg|: 0.46-0.60). The genetic correlations between aggression and psychiatric disorders were weaker for teacher-reported AGG than for mother- and self-reported AGG. 
The current GWAMA of childhood aggression provides a powerful tool to interrogate the rater-specific genetic etiology of AGG.
Subject(s)
Aggression, Mental Disorders, Adolescent, Child, Preschool Child, Female, Genetic Association Studies, Genome-Wide Association Study, Humans, Infant, Retrospective Studies
ABSTRACT
OBJECTIVE: Interest in candidate gene and candidate gene-by-environment interaction hypotheses regarding major depressive disorder remains strong despite controversy surrounding the validity of previous findings. In response to this controversy, the present investigation empirically identified 18 candidate genes for depression that have been studied 10 or more times and examined evidence for their relevance to depression phenotypes. METHODS: Utilizing data from large population-based and case-control samples (Ns ranging from 62,138 to 443,264 across subsamples), the authors conducted a series of preregistered analyses examining candidate gene polymorphism main effects, polymorphism-by-environment interactions, and gene-level effects across a number of operational definitions of depression (e.g., lifetime diagnosis, current severity, episode recurrence) and environmental moderators (e.g., sexual or physical abuse during childhood, socioeconomic adversity). RESULTS: No clear evidence was found for any candidate gene polymorphism associations with depression phenotypes or any polymorphism-by-environment moderator effects. As a set, depression candidate genes were no more associated with depression phenotypes than noncandidate genes. The authors demonstrate that phenotypic measurement error is unlikely to account for these null findings. CONCLUSIONS: The study results do not support previous depression candidate gene findings, in which large genetic effects are frequently reported in samples orders of magnitude smaller than those examined here. Instead, the results suggest that early hypotheses about depression candidate genes were incorrect and that the large number of associations reported in the depression candidate gene literature are likely to be false positives.
Subject(s)
Adverse Childhood Experiences, Major Depressive Disorder/genetics, Gene-Environment Interaction, Psychological Trauma, Socioeconomic Factors, Genetic Association Studies, Humans, Phenotype, Genetic Polymorphism, Recurrence, Reproducibility of Results, Severity of Illness Index
ABSTRACT
Some of the most widely studied variants in psychiatric genetics include variable number tandem repeat variants (VNTRs) in SLC6A3, DRD4, SLC6A4, and MAOA. While initial findings suggested large effects, their importance with respect to psychiatric phenotypes is the subject of much debate with broadly conflicting results. Despite broad interest, these loci remain absent from the largest available samples, such as the UK Biobank, limiting researchers' ability to test these contentious hypotheses rigorously in large samples. Here, using two independent reference datasets, we report out-of-sample imputation accuracy estimates of >0.96 for all four VNTR variants and one modifying SNP, depending on the reference and target dataset. We describe the imputation procedures of these candidate variants in 486,551 UK Biobank individuals, and have made the imputed variant data available to UK Biobank researchers. This resource, provided to the scientific community, will allow the most rigorous tests to date of the roles of these variants in behavioral and psychiatric phenotypes.
Subject(s)
Biological Specimen Banks, Genetic Loci, Genotype, Mental Disorders/genetics, Minisatellite Repeats, Single Nucleotide Polymorphism, Genome-Wide Association Study, Humans, United Kingdom
ABSTRACT
Externalizing problems (EP), including rule-breaking, aggression, and criminal involvement, are highly prevalent during adolescence, but the adult outcomes of adolescents exhibiting EP are characterized by heterogeneity. Although many youths' EP subside after adolescence, others' persist into adulthood. Characterizing the development of severe EP is essential to prevention and intervention efforts. Multiple predictors of adult antisocial personality disorder (ASPD) and legal outcomes of a large sample (N = 1205) of clinically- or legally-ascertained adolescents (ages 12-19 years) with severe EP were examined. Many psychosocial predictors hypothesized to predict persistence of EP demonstrated zero-order associations with adult outcomes, but accounted for little unique variation after accounting for baseline conduct disorder symptoms (CD) and demographic factors. Baseline measures of intelligence, which explained independent variation in legal outcomes, provided the only consistent exception to this pattern, though future work is needed to parse these effects from those of socioeconomic factors. CD severity during adolescence is a parsimonious index of liability for persistence of EP into adulthood that explains outcome variance above and beyond all other demographic and psychosocial predictors in this sample.
Subject(s)
Adolescent Behavior/physiology, Antisocial Personality Disorder/diagnosis, Conduct Disorder/diagnosis, Criminal Behavior, Expressed Emotion, Juvenile Delinquency, Adolescent, Adolescent Behavior/psychology, Adult, Age of Onset, Antisocial Personality Disorder/complications, Antisocial Personality Disorder/epidemiology, Antisocial Personality Disorder/psychology, Conduct Disorder/complications, Conduct Disorder/epidemiology, Conduct Disorder/psychology, Criminal Behavior/physiology, Dangerous Behavior, Disease Progression, Female, Humans, Juvenile Delinquency/legislation & jurisprudence, Juvenile Delinquency/psychology, Juvenile Delinquency/statistics & numerical data, Longitudinal Studies, Male, Prognosis, Substance-Related Disorders/complications, Substance-Related Disorders/diagnosis, Substance-Related Disorders/epidemiology, Substance-Related Disorders/psychology, Young Adult
ABSTRACT
BACKGROUND AND AIMS: Adolescents with conduct and substance use problems are at increased risk for premature mortality, but the extent to which these risk factors reflect family- or individual-level differences and account for shared or unique variance is unknown. This study examined common and independent contributions to mortality hazard in adolescents ascertained for conduct disorder (CD) and substance use disorder (SUD), their siblings and community controls, hypothesizing that individual differences in CD and SUD severity would explain unique variation in mortality risk beyond that due to clinical/control status and demographic factors. DESIGN: Mortality analysis in a prospective study (Genetics of Antisocial Drug Dependence Study) that began in 1993. SETTING: Multi-site sample recruited in San Diego, California and Denver, Colorado, USA. PARTICIPANTS: A total of 1463 clinical probands were recruited through the juvenile correctional system, court-mandated substance abuse treatment programs and correctional schools, along with 1399 of their siblings, and 904 controls. MEASUREMENTS: Mortality and cause-of-death were assessed via National Death Index search (released October, 2017). FINDINGS: There were 104 deaths documented among 3766 (1168 female) adolescents and young adults (average age 16.79 years at assessment, 32.69 years at death/censoring). Mortality hazard for clinical probands and their siblings was 4.99 times greater than that of controls (95% confidence interval = 2.40-10.40; P < 0.001). After accounting for demographic characteristics, site, clinical status, familial dependence and shared contributions of CD and SUD, CD independently predicted mortality hazard, whereas SUD severity did not. CONCLUSIONS: In the United States, youth with conduct and substance use disorders and their siblings face far greater risk of premature death than demographically similar community controls. 
In contrast to substance use disorder severity, conduct disorder is a robust predictor of unique variance in all-cause mortality hazard beyond other risk factors.
Subject(s)
Conduct Disorder/epidemiology, Premature Mortality, Substance-Related Disorders/epidemiology, Traffic Accidents/mortality, Adolescent, Adult, Case-Control Studies, Cause of Death, Cohort Studies, Drug Overdose/mortality, Female, Humans, Male, Mortality, Proportional Hazards Models, Prospective Studies, Risk Factors, Siblings, Completed Suicide/statistics & numerical data, United States/epidemiology, Violence/statistics & numerical data, Young Adult
ABSTRACT
BACKGROUND: A recent analysis of 25 historical candidate gene polymorphisms for schizophrenia in the largest genome-wide association study conducted to date suggested that these commonly studied variants were no more associated with the disorder than would be expected by chance. However, the same study identified other variants within those candidate genes that demonstrated genome-wide significant associations with schizophrenia. As such, it is possible that variants within historic schizophrenia candidate genes are associated with schizophrenia at levels above those expected by chance, even if the most-studied specific polymorphisms are not. METHODS: The present study used association statistics from the largest schizophrenia genome-wide association study conducted to date as input to a gene set analysis to investigate whether variants within schizophrenia candidate genes are enriched for association with schizophrenia. RESULTS: As a group, variants in the most-studied candidate genes were no more associated with schizophrenia than were variants in control sets of noncandidate genes. While a small subset of candidate genes did appear to be significantly associated with schizophrenia, these genes were not particularly noteworthy given the large number of more strongly associated noncandidate genes. CONCLUSIONS: The history of schizophrenia research should serve as a cautionary tale to candidate gene investigators examining other phenotypes: our findings indicate that the most investigated candidate gene hypotheses of schizophrenia are not well supported by genome-wide association studies, and it is likely that this will be the case for other complex traits as well.