ABSTRACT
An efficient approach to characterizing the disease burden of rare genetic variants is to impute them into large well-phenotyped cohorts with existing genome-wide genotype data using large sequenced reference panels. The success of this approach hinges on the accuracy of rare variant imputation, which remains controversial. For example, a recent study suggested that one cannot adequately impute the HOXB13 G84E mutation associated with prostate cancer risk (carrier frequency of 0.0034 in European ancestry participants in the 1000 Genomes Project). We show that by utilizing the 1000 Genomes Project data plus an enriched reference panel of mutation carriers we were able to accurately impute the G84E mutation into a large cohort of 83,285 non-Hispanic White participants from the Kaiser Permanente Research Program on Genes, Environment and Health Genetic Epidemiology Research on Adult Health and Aging cohort. Imputation authenticity was confirmed via a novel classification and regression tree method, and then empirically validated by analyzing a subset of these subjects plus an additional 1,789 men from Kaiser specifically genotyped for the G84E mutation (r^2 = 0.57, 95% CI = 0.37-0.77). We then show the value of this approach by using the imputed data to investigate the impact of the G84E mutation on age-specific prostate cancer risk and on risk of fourteen other cancers in the cohort. The age-specific risk of prostate cancer among G84E mutation carriers was higher than among non-carriers. Risk estimates from Kaplan-Meier curves were 36.7% versus 13.6% by age 72, and 64.2% versus 24.2% by age 80, for G84E mutation carriers and non-carriers, respectively (p = 3.4 × 10^-12). The G84E mutation was also associated with an increase in risk for the fourteen other most common cancers considered collectively (p = 5.8 × 10^-4) and more so in cases diagnosed with multiple cancer types, both those including and not including prostate cancer, strongly suggesting pleiotropic effects.
[corrected].
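The age-specific risk estimates quoted in this abstract come from Kaplan-Meier curves. As a rough illustration of how such an estimate is formed, the sketch below computes a Kaplan-Meier survival curve from ages at diagnosis or censoring; cumulative risk by a given age is one minus the survival value. The function name and the toy ages are assumptions made for illustration and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each age t where at least one diagnosis occurred.

    times  -- age at diagnosis or at censoring for each person
    events -- True if diagnosed at that age, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        diagnosed = sum(1 for tt, e in zip(times, events) if tt == t and e)
        leaving = sum(1 for tt in times if tt == t)
        if diagnosed:
            # Multiply by the conditional probability of remaining undiagnosed at t.
            survival *= 1 - diagnosed / at_risk
            curve.append((t, survival))
        # Everyone observed at t (diagnosed or censored) leaves the risk set.
        at_risk -= leaving
    return curve


# Toy data (hypothetical): five people, ages in years.
curve = kaplan_meier([60, 65, 65, 70, 72], [True, False, True, True, False])
for age, s in curve:
    print(f"age {age}: cumulative risk {1 - s:.2f}")
```

Comparing two such curves (carriers versus non-carriers) would additionally need a test such as the log-rank test, which this sketch does not implement.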
Subject(s)
Homeodomain Proteins/genetics , Prostatic Neoplasms/epidemiology , Prostatic Neoplasms/genetics , Adult , Aged , Aged, 80 and over , California , Genetic Predisposition to Disease , Genome-Wide Association Study , Germ-Line Mutation , Haplotypes , Humans , Male , Middle Aged , Phylogeny , Polymorphism, Single Nucleotide , Prostatic Neoplasms/pathology , Risk Factors
ABSTRACT
Exome and whole-genome sequencing studies are becoming increasingly common, but little is known about the accuracy of the genotype calls made by the commonly used platforms. Here we use replicate high-coverage sequencing of blood and saliva DNA samples from four European-American individuals to estimate lower bounds on the error rates of Complete Genomics and Illumina HiSeq whole-genome and whole-exome sequencing. Error rates for nonreference genotype calls range from 0.1% to 0.6%, depending on the platform and the depth of coverage. Additionally, we found (1) no difference in the error profiles or rates between blood and saliva samples; (2) Complete Genomics sequences had substantially higher error rates than Illumina sequences had; (3) error rates were higher (up to 6%) for rare or unique variants; (4) error rates generally declined with genotype quality (GQ) score, but in a nonlinear fashion for the Illumina data, likely due to loss of specificity of GQ scores greater than 60; and (5) error rates increased with increasing depth of coverage for the Illumina data. These findings, especially (3)-(5), suggest that caution should be taken in interpreting the results of next-generation sequencing-based association studies, and even more so in clinical application of this technology in the absence of validation by other more robust sequencing or genotyping methods.
Subject(s)
Exome/genetics , Genomics/methods , Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing/methods , Gene Frequency , Genome, Human/genetics , Genotype , Humans , Polymorphism, Single Nucleotide , Reproducibility of Results , White People/genetics
ABSTRACT
Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies.
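The abstract mentions greedy pairwise SNP selection without giving details. One common reading, shown here only as a minimal sketch, is a greedy set-cover loop: repeatedly pick the SNP that tags the most still-uncovered targets at a pairwise r^2 threshold, then drop the targets it covers. The function name, the 0.8 threshold, the use of the target set as the candidate set, and the toy SNP labels are all assumptions; the hybrid method's imputation-based removal step is not implemented.

```python
def greedy_pairwise_tags(r2, targets, threshold=0.8):
    """Greedily choose tag SNPs until every target is covered.

    r2        -- dict mapping frozenset({snp_a, snp_b}) to pairwise r^2
    targets   -- iterable of SNP names to cover (also used as candidates)
    threshold -- minimum r^2 for a tag to count as covering a target (assumed)
    """
    def covers(tag, snp):
        return tag == snp or r2.get(frozenset((tag, snp)), 0.0) >= threshold

    uncovered = set(targets)
    chosen = []
    while uncovered:
        # Candidate covering the most uncovered targets; ties resolved by name.
        best = max(sorted(uncovered),
                   key=lambda c: sum(covers(c, s) for s in uncovered))
        chosen.append(best)
        uncovered -= {s for s in uncovered if covers(best, s)}
    return chosen


# Toy linkage disequilibrium values (hypothetical SNP labels).
toy_r2 = {
    frozenset(("snpA", "snpB")): 0.90,
    frozenset(("snpB", "snpC")): 0.85,
    frozenset(("snpC", "snpD")): 0.30,
}
print(greedy_pairwise_tags(toy_r2, ["snpA", "snpB", "snpC", "snpD"]))
# -> ['snpB', 'snpD']
```

In this toy case snpB tags snpA and snpC, so only snpD needs its own probe.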
Subject(s)
Asian People/genetics , Black or African American/genetics , Genome-Wide Association Study/methods , Hispanic or Latino/genetics , Polymorphism, Single Nucleotide , Algorithms , Asia, Eastern , Genome, Human , Genotype , Humans , Oligonucleotide Array Sequence Analysis/methods , Pilot Projects , White People/genetics
ABSTRACT
The success of genome-wide association studies has paralleled the development of efficient genotyping technologies. We describe the development of a next-generation microarray based on the new highly efficient Affymetrix Axiom genotyping technology that we are using to genotype individuals of European ancestry from the Kaiser Permanente Research Program on Genes, Environment and Health (RPGEH). The array contains 674,517 SNPs, and provides excellent genome-wide as well as gene-based and candidate-SNP coverage. Coverage was calculated using an approach based on imputation and cross validation. Preliminary results for the first 80,301 saliva-derived DNA samples from the RPGEH demonstrate very high quality genotypes, with sample success rates above 94% and over 98% of successful samples having SNP call rates exceeding 98%. At steady state, we have produced 462 million genotypes per week for each Axiom system. The new array provides a valuable addition to the repertoire of tools for large-scale genome-wide association studies.
Subject(s)
Genome-Wide Association Study/methods , High-Throughput Screening Assays , Oligonucleotide Array Sequence Analysis/methods , Polymorphism, Single Nucleotide/genetics , White People/genetics , Humans
ABSTRACT
To identify rare variants associated with prostate cancer susceptibility and better characterize the mechanisms and cumulative disease risk associated with common risk variants, we conducted an integrated study of prostate cancer genetic etiology in two cohorts using custom genotyping microarrays, large imputation reference panels, and functional annotation approaches. Specifically, 11,984 men (6,196 prostate cancer cases and 5,788 controls) of European ancestry from Northern California Kaiser Permanente were genotyped and meta-analyzed with 196,269 men of European ancestry (7,917 prostate cancer cases and 188,352 controls) from the UK Biobank. Three novel loci, including two rare variants (European ancestry minor allele frequency < 0.01, at 3p21.31 and 8p12), were genome-wide significant in a meta-analysis. Gene-based rare variant tests implicated a known prostate cancer gene (HOXB13), as well as a novel candidate gene (ILDR1), which encodes a receptor highly expressed in prostate tissue and is related to the B7/CD28 family of T-cell immune checkpoint markers. Haplotypic patterns of long-range linkage disequilibrium were observed for rare genetic variants at HOXB13 and other loci, reflecting their evolutionary history. In addition, a polygenic risk score (PRS) of 188 prostate cancer variants was strongly associated with risk (90th vs. 40th-60th percentile OR = 2.62, P = 2.55 × 10^-191). Many of the 188 variants exhibited functional signatures of gene expression regulation or transcription factor binding, including a 6-fold difference in log-probability of androgen receptor binding at the variant rs2680708 (17q22). Rare variant and PRS associations, with concomitant functional interpretation of risk mechanisms, can help clarify the full genetic architecture of prostate cancer and other complex traits.
SIGNIFICANCE: This study maps the biological relationships between diverse risk factors for prostate cancer, integrating different functional datasets to interpret and model genome-wide data from over 200,000 men with and without prostate cancer. See related commentary by Lachance, p. 1637.
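A polygenic risk score of the kind reported in this abstract is conventionally a weighted sum of risk-allele dosages, with weights taken from per-variant effect sizes such as log odds ratios. The sketch below shows that arithmetic under those assumptions; the function name and the three toy weights are hypothetical and do not correspond to the 188 variants in the study.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages.

    dosages -- risk-allele counts per variant (0, 1, 2, or fractional if imputed)
    weights -- per-variant effect sizes, e.g. log odds ratios (assumed here)
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Toy example with three hypothetical variants.
toy_weights = [0.10, 0.20, 0.30]
toy_dosages = [0, 1, 2]
print(polygenic_risk_score(toy_dosages, toy_weights))
```

A percentile comparison such as the 90th versus 40th-60th contrast would then be made on the distribution of these scores across the cohort.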
Subject(s)
Multifactorial Inheritance , Prostatic Neoplasms , Genetic Predisposition to Disease , Genome-Wide Association Study , Genomics , Humans , Male , Polymorphism, Single Nucleotide , Prostatic Neoplasms/genetics
ABSTRACT
High-density single-nucleotide polymorphism (SNP) microarrays provide a useful tool for the detection of copy number variants (CNVs). The analysis of such large amounts of data is complicated, especially with regard to determining where copy numbers change and their corresponding values. In this article, we propose a Bayesian multiple change-point model (BMCP) for segmentation and estimation of SNP microarray data. Segmentation concerns separating a chromosome into regions of equal copy number differences between the sample of interest and some reference, and involves the detection of locations of copy number difference changes. Estimation concerns determining true copy number for each segment. Our approach not only gives posterior estimates for the parameters of interest, namely locations for copy number difference changes and true copy number estimates, but also useful confidence measures. In addition, our algorithm can segment multiple samples simultaneously, and infer both common and rare CNVs across individuals. Finally, for studies of CNVs in tumors, we incorporate an adjustment factor for signal attenuation due to tumor heterogeneity or normal contamination that can improve copy number estimates.
Subject(s)
Bayes Theorem , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide , Algorithms , Biometry/methods , Chromosomes, Human , Gene Dosage , Genetic Variation , Humans , Neoplasms/genetics
ABSTRACT
Although cutaneous squamous cell carcinoma (cSCC) is one of the most common malignancies in individuals of European ancestry, the incidence of cSCC in Hispanic/Latinos is also increasing. cSCC has both a genetic and environmental etiology. Here, we examine the role of genetic ancestry, skin pigmentation, and sun exposure in Hispanic/Latinos and non-Hispanic whites on cSCC risk. We observe an increased cSCC risk with greater European ancestry (P = 1.27 × 10^-42) within Hispanic/Latinos and with greater northern (P = 2.38 × 10^-65) and western (P = 2.28 × 10^-49) European ancestry within non-Hispanic whites. These associations are significantly, but not completely, attenuated after considering skin pigmentation-associated loci, history of actinic keratosis, and sun-protected versus sun-exposed anatomical sites. We also report an association of the well-known pigment variant Ala111Thr (rs1426654) at SLC24A5 with cSCC in Hispanic/Latinos. These findings demonstrate a strong correlation of northwestern European genetic ancestry with cSCC risk in both Hispanic/Latinos and non-Hispanic whites, largely but not entirely mediated through its impact on skin pigmentation.
Subject(s)
Carcinoma, Squamous Cell/etiology , Genetic Background , Hispanic or Latino/genetics , Skin Neoplasms/etiology , Skin Pigmentation , White People/genetics , Aged , Aged, 80 and over , Computational Biology/methods , Disease Susceptibility , Female , Humans , Male , Middle Aged , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Risk Assessment , Risk Factors , Skin Pigmentation/genetics
ABSTRACT
Central corneal thickness (CCT) is one of the most heritable human traits, with broad-sense heritability estimates ranging from 0.68 to 0.95. Despite the high heritability and numerous previous association studies, only 8.5% of CCT variance is currently explained. Here, we report the results of a multiethnic meta-analysis of available genome-wide association studies in which we find association between CCT and 98 genomic loci, of which 41 are novel. Among these loci, 20 were significantly associated with keratoconus, and one (RAPSN rs3740685) was significantly associated with glaucoma after Bonferroni correction. Two-sample Mendelian randomization analysis suggests that thinner CCT does not causally increase the risk of primary open-angle glaucoma. This large CCT study explains up to 14.2% of CCT variance and increases substantially our understanding of the etiology of CCT variation. This may open new avenues of investigation into human ocular traits and their relationship to the risk of vision disorders.
Subject(s)
Cornea/pathology , Corneal Diseases/pathology , Ethnicity/genetics , Genetic Loci , Glaucoma/pathology , Polymorphism, Single Nucleotide , Aged , Cohort Studies , Corneal Diseases/ethnology , Corneal Diseases/genetics , Female , Genome-Wide Association Study , Glaucoma/ethnology , Glaucoma/genetics , Humans , Male , Mendelian Randomization Analysis , Meta-Analysis as Topic , Middle Aged , Prognosis
ABSTRACT
BACKGROUND: Telomere length (TL) may serve as a biologic marker of aging. We examined neighborhood and individual-level socioeconomic status (SES) in relation to TL. METHODS: The study included 84,996 non-Hispanic white subjects from the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort, part of the Research Program on Genes, Environment and Health. Relative TL (T/S) was log2 transformed to improve normality and standardized to have mean 0 and variance 1. Neighborhood SES was measured using the Neighborhood Deprivation Index (NDI), and individual SES was measured by self-reported education level. We fit linear regression models of TL on age, sex, smoking, body mass index, comorbidities, NDI, and education level. We tested for differences in the associations by sex and nonlinearity in the association of NDI with TL. RESULTS: Each SD increase in NDI was associated with a decrease of 0.0192 in standardized TL, 95% confidence interval (CI) = -0.0306, -0.0078. There was no evidence of nonlinearity in the association of NDI with TL. We further found that less than high school education was associated with a decrease of 0.1371 in standardized TL, 95% CI = -0.1919, -0.0823 as compared to a college education. There were no differences in the associations by sex. CONCLUSIONS: We found evidence that both lower neighborhood SES and lower individual-level SES are associated with shorter TL among non-Hispanic whites. Our findings suggest that socioeconomic factors may influence aging by contributing to shorter TL.
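The METHODS sentence on transforming relative telomere length can be made concrete with a short sketch: take log2 of each T/S ratio, then subtract the mean and divide by the standard deviation so the result has mean 0 and variance 1. The function name and the toy ratios are assumptions for illustration, and the population standard deviation is used here so that the variance is exactly 1; the study may have used the sample standard deviation instead.

```python
import math
import statistics


def standardize_log2(ts_ratios):
    """Log2-transform T/S ratios, then scale to mean 0 and variance 1."""
    logged = [math.log2(x) for x in ts_ratios]
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)  # population SD (an assumption, see lead-in)
    return [(v - mean) / sd for v in logged]


# Toy T/S ratios (hypothetical values).
z = standardize_log2([0.5, 1.0, 2.0, 4.0])
print([round(v, 3) for v in z])
```

On this scale, a regression coefficient such as the reported -0.0192 per SD of NDI is read directly in standard deviations of log2 T/S.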
ABSTRACT
A genome-wide association study (GWAS) of 94,674 ancestrally diverse Kaiser Permanente members using 478,866 longitudinal electronic health record (EHR)-derived measurements for untreated serum lipid levels empowered multiple new findings: 121 new SNP associations (46 primary, 15 conditional, and 60 in meta-analysis with Global Lipids Genetic Consortium data); an increase of 33-42% in variance explained with multiple measurements; sex differences in genetic impact (greater impact in females for LDL, HDL, and total cholesterol and the opposite for triglycerides); differences in variance explained among non-Hispanic whites, Latinos, African Americans, and East Asians; genetic dominance and epistatic interaction, with strong evidence for both at the ABO and FUT2 genes for LDL; and tissue-specific enrichment of GWAS-associated SNPs among liver, adipose, and pancreas eQTLs. Using EHR pharmacy data, both LDL and triglyceride genetic risk scores (477 SNPs) were strongly predictive of age at initiation of lipid-lowering treatment. These findings highlight the value of longitudinal EHRs for identifying new genetic features of cholesterol and lipoprotein metabolism with implications for lipid treatment and risk of coronary heart disease.
Subject(s)
Electronic Health Records , Genome-Wide Association Study/methods , Lipid Metabolism/genetics , Lipids/blood , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Adult , Aged , Cohort Studies , Databases, Genetic , Electronic Health Records/statistics & numerical data , Ethnicity/genetics , Ethnicity/statistics & numerical data , Female , Gene Frequency , Genetic Linkage , Humans , Linkage Disequilibrium , Lipids/analysis , Longitudinal Studies , Male , Middle Aged
ABSTRACT
Body mass index (BMI), a proxy measure for obesity, is determined by both environmental (including ethnicity, age, and sex) and genetic factors, with > 400 BMI-associated loci identified to date. However, the impact, interplay, and underlying biological mechanisms among BMI, environment, genetics, and ancestry are not completely understood. To further examine these relationships, we utilized 427,509 calendar year-averaged BMI measurements from 100,418 adults from the single large multiethnic Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. We observed substantial independent ancestry and nationality differences, including ancestry principal component interactions and nonlinear effects. To increase the list of BMI-associated variants before assessing other differences, we conducted a genome-wide association study (GWAS) in GERA, with replication in the Genetic Investigation of Anthropometric Traits (GIANT) consortium combined with the UK Biobank (UKB), followed by GWAS in GERA combined with GIANT, with replication in the UKB. We discovered 30 novel independent BMI loci (P < 5.0 × 10^-8) that replicated. We then assessed the proportion of BMI variance explained by sex in the UKB using previously identified loci compared to previously and newly identified loci and found slight increases: from 3.0 to 3.3% for males and from 2.7 to 3.0% for females. Further, the variance explained by previously and newly identified variants decreased with increasing age in the GERA and UKB cohorts, echoed in the variance explained by the entire genome, which also showed gene-age interaction effects. Finally, we conducted a tissue expression QTL enrichment analysis, which revealed that GWAS BMI-associated variants were enriched in the cerebellum, consistent with prior work in humans and mice.
Subject(s)
Body Mass Index , Genetic Loci , Body Weight/ethnology , Body Weight/genetics , Ethnicity/genetics , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Multifactorial Inheritance , Sex Factors
ABSTRACT
Primary open-angle glaucoma (POAG) is a leading cause of irreversible vision loss, yet much of the genetic risk remains unaccounted for, especially in African-Americans, who have a higher risk for developing POAG. We conduct a multiethnic genome-wide association study (GWAS) of POAG in the GERA cohort, with replication in the UK Biobank (UKB), and vice versa, GWAS in UKB with replication in GERA. We identify 24 loci (P < 5.0 × 10^-8), including 14 novel, of which 9 replicate (near FMNL2, PDE7B, TMTC2, IKZF2, CADM2, DGKG, ANKH, EXOC2, and LMX1B). Functional studies support intraocular pressure-related influences of FMNL2 and LMX1B, with certain Lmx1b mutations causing high IOP and glaucoma resembling POAG in mice. The newly identified loci increase the proportion of variance explained in each GERA race/ethnicity group, with the largest gain in African-Americans (0.5-3.1%). A meta-analysis combining GERA and UKB identifies 24 additional loci. Our study provides important insights into glaucoma pathogenesis.
Subject(s)
Glaucoma, Open-Angle/genetics , Aged , Aged, 80 and over , Animals , Cohort Studies , Ethnicity/genetics , Female , Formins , Gene Expression , Gene Knockdown Techniques , Genetic Loci , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Intraocular Pressure/genetics , LIM-Homeodomain Proteins/genetics , Male , Mice , Mice, Inbred C57BL , Mice, Mutant Strains , Middle Aged , Mutation , Polymorphism, Single Nucleotide , Proteins/genetics , Retinal Ganglion Cells/metabolism , Risk Factors , Transcription Factors/genetics , United Kingdom
ABSTRACT
Elevated intraocular pressure (IOP) is a major risk factor for glaucoma, a leading cause of blindness. IOP heritability has been estimated at up to 67%, and to date only 11 IOP loci have been reported, accounting for 1.5% of IOP variability. Here, we conduct a genome-wide association study of IOP in 69,756 untreated individuals of European, Latino, Asian, and African ancestry. Multiple longitudinal IOP measurements were collected through electronic health records and, in total, 356,987 measurements were included. We identify 47 genome-wide significant IOP-associated loci (P < 5 × 10^-8); of the 40 novel loci, 14 replicate at Bonferroni significance in an external genome-wide association study analysis of 37,930 individuals of European and Asian descent. We further examine their effect on the risk of glaucoma within our discovery sample. Using longitudinal IOP measurements from electronic health records improves our power to identify new variants, which together explain 3.7% of IOP variation.
Subject(s)
Genetic Loci/genetics , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Glaucoma/genetics , Intraocular Pressure/genetics , Black or African American/genetics , Aged , Aged, 80 and over , Asian People/genetics , Female , Genetic Predisposition to Disease/ethnology , Glaucoma/ethnology , Glaucoma/physiopathology , Hispanic or Latino/genetics , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , White People/genetics
ABSTRACT
Prostate-specific antigen (PSA) levels have been used for detection and surveillance of prostate cancer (PCa). However, factors other than PCa, such as genetics, can impact PSA. Here we present findings from a genome-wide association study (GWAS) of PSA in 28,503 Kaiser Permanente whites and 17,428 men from replication cohorts. We detect 40 genome-wide significant (P < 5 × 10^-8) single-nucleotide polymorphisms (SNPs): 19 novel, 15 previously identified for PSA (14 of which were also PCa-associated), and 6 previously identified for PCa only. Further analysis incorporating PCa cases suggests that at least half of the 40 SNPs are PSA-associated independent of PCa. The 40 SNPs explain 9.5% of PSA variation in non-Hispanic whites, and the remaining GWAS SNPs explain an additional 31.7%; this percentage is higher in younger men, supporting the genetic basis of PSA levels. These findings provide important information about genetic markers for PSA that may improve PCa screening, thereby reducing over-diagnosis and over-treatment.
Subject(s)
Biomarkers, Tumor/genetics , Genetic Loci , Polymorphism, Single Nucleotide , Prostate-Specific Antigen/genetics , Prostatic Neoplasms/genetics , Adult , Aged , Aged, 80 and over , Alleles , Asian People , Biomarkers, Tumor/metabolism , Black People , Gene Expression , Gene Frequency , Genome-Wide Association Study , Humans , Male , Middle Aged , Prostate/metabolism , Prostate/pathology , Prostate-Specific Antigen/metabolism , Prostatic Neoplasms/diagnosis , Prostatic Neoplasms/ethnology , Prostatic Neoplasms/pathology , White People
ABSTRACT
Age-related macular degeneration (AMD) risk variants in the complement system point to the important role of immune response and inflammation in the pathogenesis of AMD. Although the human leukocyte antigen (HLA) region has a central role in regulating immune response, previous studies of genetic variation in HLA genes and AMD have been limited by sample size or incomplete coverage of the HLA region by first-generation genotyping arrays and imputation panels. Here, we conducted a large-scale HLA fine-mapping study with 4,841 AMD cases and 23,790 controls of non-Hispanic white ancestry from the Kaiser Permanente Genetic Epidemiology Research on Adult Health and Aging cohort. Genotyping was conducted using custom Affymetrix Axiom arrays, with dense coverage of the HLA region. Classic HLA polymorphisms were imputed using SNP2HLA, which utilizes a large reference panel to provide improved imputation accuracy of variants in this region. We examined a total of 6,937 SNPs and 172 classical HLA alleles, conditioning on established AMD risk variants, which revealed novel associations with two non-synonymous SNPs in perfect linkage disequilibrium, rs9274390 and rs41563814 (odds ratio (OR) = 1.21; P = 1.4 × 10^-11), corresponding to amino-acid changes at positions 66 and 67 in HLA-DQB1, respectively, and the DQB1*02 classical HLA allele (OR = 1.22; P = 3.9 × 10^-10) with the risk of AMD. We confirmed these association signals, again conditioning on established risk variants, in the MMAP data set of subjects with advanced AMD (rs9274390/rs41563814: OR = 1.28; P = 1.30 × 10^-3; DQB1*02: OR = 1.32; P = 9.00 × 10^-4). These findings support a role of HLA class II alleles in the risk of AMD.
Subject(s)
HLA-DQ beta-Chains/genetics , Macular Degeneration/genetics , Mutation, Missense , Polymorphism, Single Nucleotide , Aged , Aged, 80 and over , Case-Control Studies , Female , Humans , Linkage Disequilibrium , Macular Degeneration/pathology , Male , White People/genetics
ABSTRACT
PURPOSE: We compared across age-related macular degeneration (AMD) subtypes the effect of AMD risk variants, their predictive power, and heritability. METHODS: The prevalence of AMD was estimated among active non-Hispanic white Kaiser Permanente Northern California members who were at least 65 years of age as of June 2013. The genetic analysis included 5,170 overall AMD cases ascertained from electronic health records (EHR), including 1,239 choroidal neovascularization (CNV) cases and 1,060 nonexudative AMD cases without CNV, and 23,130 controls of non-Hispanic white ancestry from the Kaiser Permanente Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. Imputation was based on the 1000 Genomes Project reference panel. RESULTS: The narrow-sense heritability due to common autosomal single nucleotide polymorphisms (SNPs) was 0.37 for overall AMD, 0.19 for AMD unspecified, 0.20 for nonexudative AMD, and 0.60 for CNV. For the 19 previously reported AMD risk loci, the area under the receiver operating characteristic (ROC) curve was 0.675 for overall AMD, 0.640 for AMD unspecified, 0.678 for nonexudative AMD, and 0.766 for CNV. The individual effects on the risk of AMD for 18 of the 19 SNPs were in a consistent direction with those previously reported, including a protective effect of the APOE ε4 allele. Conversely, the risk of AMD was significantly increased in carriers of the ε2 allele. CONCLUSIONS: These findings provide an independent confirmation of many of the previously identified AMD risk loci, and support a potentially greater role of genetic factors in the development of CNV. The replication of established associations validates the use of EHR in genetic studies of ophthalmologic traits.
Subject(s)
Genetic Predisposition to Disease , Macular Degeneration/genetics , Polymorphism, Genetic , Aged , Alleles , California/epidemiology , Female , Genotype , Humans , Macular Degeneration/epidemiology , Male , Phenotype , Prevalence , Retrospective Studies , Risk Factors
ABSTRACT
Using genome-wide genotypes, we characterized the genetic structure of 103,006 participants in the Kaiser Permanente Northern California multi-ethnic Genetic Epidemiology Research on Adult Health and Aging Cohort and analyzed the relationship to self-reported race/ethnicity. Participants endorsed any of 23 race/ethnicity/nationality categories, which were collapsed into seven major race/ethnicity groups. By self-report the cohort is 80.8% white and 19.2% minority; 93.8% endorsed a single race/ethnicity group, while 6.2% endorsed two or more. Principal component (PC) and admixture analyses were generally consistent with prior studies. Approximately 17% of subjects had genetic ancestry from more than one continent, and 12% were genetically admixed, considering only nonadjacent geographical origins. Self-reported whites were spread on a continuum along the first two PCs, indicating extensive mixing among European nationalities. Self-identified East Asian nationalities correlated with genetic clustering, consistent with extensive endogamy. Individuals of mixed East Asian-European genetic ancestry were easily identified; we also observed a modest amount of European genetic ancestry in individuals self-identified as Filipinos. Self-reported African Americans and Latinos showed extensive European and African genetic ancestry, and Native American genetic ancestry for the latter. Among 3741 genetically identified parent-child pairs, 93% were concordant for self-reported race/ethnicity; among 2018 genetically identified full-sib pairs, 96% were concordant; the lower rate for parent-child pairs was largely due to intermarriage. The parent-child pairs revealed a trend toward increasing exogamy over time; the presence in the cohort of individuals endorsing multiple race/ethnicity categories creates interesting challenges and future opportunities for genetic epidemiologic studies.
Subject(s)
Aging/genetics , Ethnicity/genetics , Genomics , Health , Racial Groups/genetics , Adult , Cohort Studies , Female , Humans , Male , Middle Aged , Molecular Epidemiology , Pedigree , Principal Component Analysis
ABSTRACT
The Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH) Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort includes DNA specimens extracted from saliva samples of 110,266 individuals. Because of its relationship to aging, telomere length measurement was considered an important biomarker to develop on these subjects. To assay relative telomere length (TL) on this large cohort over a short time period, we created a novel high throughput robotic system for TL analysis and informatics. Samples were run in triplicate, along with control samples, in a randomized design. As part of quality control, we determined the within-sample variability and employed thresholds for the elimination of outlying measurements. Of 106,902 samples assayed, 105,539 (98.7%) passed all quality control (QC) measures. As expected, TL in general showed a decline with age and a sex difference. While telomeres showed a negative correlation with age up to 75 years, in those older than 75 years, age positively correlated with longer telomeres, indicative of an association of longer telomeres with more years of survival in those older than 75. Furthermore, while females in general had longer telomeres than males, this difference was significant only for those older than age 50. An additional novel finding was that the variance of TL between individuals increased with age. This study establishes reliable assay and analysis methodologies for measurement of TL in large, population-based human studies. The GERA cohort represents the largest currently available such resource, linked to comprehensive electronic health and genotype data for analysis.
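The abstract says samples were run in triplicate and that within-sample variability thresholds were used to eliminate outlying measurements, but it does not state the metric or cutoff. The sketch below illustrates one plausible form of such a check, using the coefficient of variation across replicates; the function name, the choice of CV as the metric, and the 0.1 cutoff are all assumptions rather than the study's actual QC rule.

```python
import statistics


def triplicate_qc(replicates, max_cv=0.1):
    """Summarize replicate T/S measurements and flag high within-sample variability.

    replicates -- replicate measurements for one sample (at least two values)
    max_cv     -- hypothetical cutoff on the coefficient of variation
    Returns (mean, cv, passed).
    """
    mean = statistics.fmean(replicates)
    cv = statistics.stdev(replicates) / mean
    return mean, cv, cv <= max_cv


# Toy measurements (hypothetical values).
print(triplicate_qc([1.00, 1.02, 0.98]))
print(triplicate_qc([1.00, 1.00, 2.00]))
```

A production pipeline would also need a rule for dropping a single outlying replicate and re-evaluating the remaining pair, which this sketch leaves out.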
Subject(s)
Aging/genetics, Computational Biology/methods, Health, Telomere/genetics, Adult, Automation, Cohort Studies, Female, Genotype, Humans, Mononuclear Leukocytes/metabolism, Male, Molecular Epidemiology, Sex Characteristics
ABSTRACT
The Kaiser Permanente (KP) Research Program on Genes, Environment and Health (RPGEH), in collaboration with the University of California-San Francisco, undertook genome-wide genotyping of >100,000 subjects that constitute the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. The project, which generated >70 billion genotypes, represents the first large-scale use of the Affymetrix Axiom Genotyping Solution. Because genotyping took place over a short 14-month period, creating a near-real-time analysis pipeline for experimental assay quality control and final optimized analyses was critical. Because of the multi-ethnic nature of the cohort, four different ethnic-specific arrays were employed to enhance genome-wide coverage. All assays were performed on DNA extracted from saliva samples. To improve sample call rates and significantly increase genotype concordance, we partitioned the cohort into disjoint packages of plates with similar assay contexts. Using strict QC criteria, the overall genotyping success rate was 103,067 of 109,837 samples assayed (93.8%), with a range of 92.1-95.4% for the four different arrays. Similarly, the SNP genotyping success rate ranged from 98.1 to 99.4% across the four arrays, the variation depending mostly on how many SNPs were included as single copy vs. double copy on a particular array. The high quality and large scale of genotype data created on this cohort, in conjunction with comprehensive longitudinal data from the KP electronic health records of participants, will enable a broad range of highly powered genome-wide association studies on a diversity of traits and conditions.
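As a rough illustration of the sample-level figures above, the sketch below computes a per-sample call rate and an overall success rate. The abstract does not give the QC criteria, so the `min_call_rate` threshold, the use of `None` for a no-call, and all function names are hypothetical.

```python
def call_rate(genotypes):
    """Fraction of non-missing genotype calls; None marks a no-call."""
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def passing_samples(samples, min_call_rate=0.97):
    """Names of samples whose call rate meets a (hypothetical) threshold."""
    return [
        name for name, calls in samples.items()
        if call_rate(calls) >= min_call_rate
    ]


def success_rate_percent(passed, assayed):
    """Overall success rate as a percentage, rounded to one decimal place."""
    return round(100 * passed / assayed, 1)
```

Applied to the counts reported in the abstract, `success_rate_percent(103067, 109837)` gives 93.8, matching the stated overall rate.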
Subject(s)
Aging/genetics, Computational Biology/methods, Genotyping Techniques/methods, Health, Adult, Cohort Studies, Female, Humans, Male, Molecular Epidemiology, Oligonucleotide Array Sequence Analysis, Single Nucleotide Polymorphism, Quality Control
ABSTRACT
A genome-wide association study (GWAS) of prostate cancer in Kaiser Permanente health plan members (7,783 cases, 38,595 controls; 80.3% non-Hispanic white, 4.9% African-American, 7.0% East Asian, and 7.8% Latino) revealed a new independent risk indel rs4646284 at the previously identified locus 6q25.3 that replicated in PEGASUS (N = 7,539) and the Multiethnic Cohort (N = 4,679) with an overall P = 1.0 × 10^-19 (OR, 1.18). Across the 6q25.3 locus, rs4646284 exhibited the strongest association with expression of SLC22A1 (P = 1.3 × 10^-23) and SLC22A3 (P = 3.2 × 10^-52). At the known 19q13.33 locus, rs2659124 (P = 1.3 × 10^-13; OR, 1.18) nominally replicated in PEGASUS. A risk score of 105 known risk SNPs was strongly associated with prostate cancer (P < 1.0 × 10^-8). Comparing the highest to lowest risk score deciles, the OR was 6.22 for non-Hispanic whites, 5.82 for Latinos, 3.77 for African-Americans, and 3.38 for East Asians. In non-Hispanic whites, the 105 risk SNPs explained approximately 7.6% of disease heritability. The entire GWAS array explained approximately 33.4% of heritability, with a 4.3-fold enrichment within DNaseI hypersensitivity sites (P = 0.004). SIGNIFICANCE: Taken together, our findings of independent risk variants, ethnic variation in existing SNP replication, and remaining unexplained heritability have important implications for further clarifying the genetic risk of prostate cancer. Our findings also suggest that there may be much promise in evaluating understudied variation, such as indels and ethnically diverse populations.
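The risk-score comparison above can be sketched in a few lines. The abstract does not say how the 105-SNP score was constructed or how the decile odds ratios were estimated, so the log-odds weighting, the unadjusted 2×2 odds ratio, and the function names below are assumptions for illustration, not the study's method.

```python
import math


def risk_score(allele_counts, log_odds_weights):
    """Weighted sum of risk-allele counts (0, 1, or 2 per SNP).

    Weighting by per-allele log odds ratios is an assumed convention.
    """
    return sum(c * w for c, w in zip(allele_counts, log_odds_weights))


def odds_ratio(cases_high, controls_high, cases_low, controls_low):
    """Unadjusted odds ratio comparing a high-score group to a low-score group."""
    return (cases_high / controls_high) / (cases_low / controls_low)
```

For example, a person carrying two copies of a risk allele with a per-allele OR of 1.18 contributes `2 * math.log(1.18)` to the score from that SNP alone. A published analysis would typically estimate the decile contrast with a covariate-adjusted regression model rather than raw counts.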