ABSTRACT
Mutations in transporters can impact an individual's response to drugs and cause many diseases. Few variants in transporters have been evaluated for their functional impact. Here, we combine saturation mutagenesis and multi-phenotypic screening to dissect the impact of 11,213 missense, single-amino-acid deletion, and synonymous variants across the 554 residues of OCT1, a key liver xenobiotic transporter. By quantifying expression and substrate uptake in parallel, we find that most variants exert their primary effect on protein abundance, a phenotype not commonly measured alongside function. Using our mutagenesis results combined with structure prediction and molecular dynamics simulations, we develop accurate structure-function models of the entire transport cycle, providing biophysical characterization of all known and possible human OCT1 polymorphisms. This work provides a complete functional map of OCT1 variants along with a framework for integrating functional genomics, biophysical modeling, and human genetics to predict variant effects on disease and drug efficacy.
Subject(s)
Molecular Dynamics Simulation, Organic Cation Transporter 1, Protein Conformation, Humans, Biological Transport, HEK293 Cells, Mutation, Missense Mutation, Octamer Transcription Factor-1, Organic Cation Transporter 1/genetics, Organic Cation Transporter 1/metabolism, Pharmacogenetics, Phenotype, Structure-Activity Relationship
ABSTRACT
Polygenic risk scores (PRSs) summarize the genetic predisposition of a complex human trait or disease and may become a valuable tool for advancing precision medicine. However, PRSs that are developed in populations of predominantly European genetic ancestries can increase health disparities due to poor predictive performance in individuals of diverse and complex genetic ancestries. We describe genetic and modifiable risk factors that limit the transferability of PRSs across populations and review the strengths and weaknesses of existing PRS construction methods for diverse ancestries. Developing PRSs that benefit global populations in research and clinical settings provides an opportunity for innovation and is essential for health equity.
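As a minimal sketch of what a PRS computes, assuming the common additive form (a weighted sum of risk-allele dosages; the abstract does not specify a construction method), the calculation can be written as follows. The variant identifiers, weights, and dosages are hypothetical and chosen only for illustration.

```python
# Hypothetical per-variant effect sizes (e.g., log odds ratios from a GWAS).
weights = {"rs_a": 0.12, "rs_b": -0.05, "rs_c": 0.30}


def polygenic_risk_score(dosages, weights):
    """Additive PRS: sum over variants of effect size x allele dosage (0 to 2)."""
    # Variants missing from the individual's genotypes contribute zero here;
    # this is a simplifying assumption of the sketch.
    return sum(weights[v] * dosages.get(v, 0.0) for v in weights)


# Hypothetical individual: two copies of rs_a, one of rs_b, none of rs_c.
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}
score = polygenic_risk_score(person, weights)
print(round(score, 4))
```

Because the weights are estimated in a discovery population, reusing them unchanged in a population with different allele frequencies or linkage patterns is one plausible route to the poor transferability described above.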
Subject(s)
Genetic Predisposition to Disease, Humans, Risk Factors, Multifactorial Inheritance, Precision Medicine, Genome-Wide Association Study
ABSTRACT
The risk of congenital heart defects (CHDs) may be influenced by maternal genes, fetal genes, and their interactions. Existing methods commonly test the effects of maternal and fetal variants one at a time and may have reduced statistical power to detect genetic variants with low minor allele frequencies. In this article, we propose a gene-based association test of interactions for maternal-fetal genotypes (GATI-MFG) using a case-mother and control-mother design. GATI-MFG can integrate the effects of multiple variants within a gene or genomic region and evaluate the joint effect of maternal and fetal genotypes while allowing for their interactions. In simulation studies, GATI-MFG had improved statistical power over alternative methods, such as the single-variant test and functional data analysis (FDA), under various disease scenarios. We further applied GATI-MFG to a two-phase genome-wide association study of CHDs, testing both common and rare variants in 947 CHD case mother-infant pairs and 1306 control mother-infant pairs from the National Birth Defects Prevention Study (NBDPS). After Bonferroni adjustment for 23,035 genes, two genes on chromosome 17, TMEM107 (p = 1.64e-06) and CTC1 (p = 2.0e-06), were significantly associated with CHD in the common-variant analysis. TMEM107 regulates ciliogenesis and ciliary protein composition and has been associated with heterotaxy. CTC1 plays an essential role in protecting telomeres from degradation, a process that has been suggested to be associated with cardiogenesis. Overall, GATI-MFG outperformed the single-variant test and FDA in the simulations, and the results of the application to NBDPS samples are consistent with existing literature supporting the association of TMEM107 and CTC1 with CHDs.
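The Bonferroni step can be checked with a few lines of arithmetic. This sketch assumes a family-wise error rate of 0.05, which the abstract does not state; the gene count and p-values are taken from the text above.

```python
n_genes = 23035          # number of genes tested (from the abstract)
alpha = 0.05             # assumed family-wise error rate; not stated in the abstract
threshold = alpha / n_genes

# Gene-level p-values reported in the abstract.
p_values = {"TMEM107": 1.64e-06, "CTC1": 2.0e-06}
significant = {gene: p < threshold for gene, p in p_values.items()}

print(f"per-gene threshold = {threshold:.3e}")
print(significant)
```

Under that assumption the per-gene threshold is roughly 2.17e-06, so both reported p-values fall below it.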
Subject(s)
Genome-Wide Association Study, Congenital Heart Defects, Female, Humans, Genetic Models, Genotype, Congenital Heart Defects/genetics, Mothers, Case-Control Studies
ABSTRACT
Acute lymphoblastic leukemia (ALL) is the most common childhood cancer. Despite overlap between genetic risk loci for ALL and hematologic traits, the etiological relevance of dysregulated blood-cell homeostasis remains unclear. We investigated this question in a genome-wide association study (GWAS) of childhood ALL (2,666 affected individuals, 60,272 control individuals) and a multi-trait GWAS of nine blood-cell indices in the UK Biobank. We identified 3,000 blood-cell-trait-associated (p < 5.0 × 10^-8) variants, explaining 4.0% to 23.9% of trait variation and including 115 loci associated with blood-cell ratios (LMR, lymphocyte-to-monocyte ratio; NLR, neutrophil-to-lymphocyte ratio; PLR, platelet-to-lymphocyte ratio). ALL susceptibility was genetically correlated with lymphocyte counts (r_g = 0.088, p = 4.0 × 10^-4) and PLR (r_g = -0.072, p = 0.0017). In Mendelian randomization analyses, genetically predicted increase in lymphocyte counts was associated with increased ALL risk (odds ratio [OR] = 1.16, p = 0.031) and strengthened after accounting for other cell types (OR = 1.43, p = 8.8 × 10^-4). We observed positive associations with increasing LMR (OR = 1.22, p = 0.0017) and inverse effects for NLR (OR = 0.67, p = 3.1 × 10^-4) and PLR (OR = 0.80, p = 0.002). Our study shows that a genetically induced shift toward higher lymphocyte counts, overall and in relation to monocytes, neutrophils, and platelets, confers an increased susceptibility to childhood ALL.
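To illustrate how a Mendelian randomization odds ratio is obtained from summary statistics, here is a minimal sketch of the fixed-effect inverse-variance weighted (IVW) estimator. IVW is a commonly used choice and is an assumption here: the abstract does not say which estimator the authors used. All numbers are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error."""
    # Each instrument is weighted by (exposure effect)^2 / (outcome SE)^2.
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    numer = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    beta = numer / sum(weights)
    return beta, math.sqrt(1 / sum(weights))


# Hypothetical summary statistics for two genetic instruments.
bx = [0.10, 0.20]  # variant effects on the exposure (e.g., a blood-cell trait)
by = [0.02, 0.04]  # variant effects on the outcome (log odds of disease)
se = [0.01, 0.01]  # standard errors of the outcome effects

beta, beta_se = ivw_estimate(bx, by, se)
odds_ratio = math.exp(beta)  # OR per unit increase in the exposure
print(round(beta, 3), round(beta_se, 4), round(odds_ratio, 3))
```

The estimate is on the log-odds scale, so exponentiating gives an odds ratio comparable in form to those reported above.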
Subject(s)
Tumor Biomarkers/genetics, Blood Platelets/pathology, Lymphocytes/pathology, Monocytes/pathology, Neutrophils/pathology, Precursor Cell Lymphoblastic Leukemia-Lymphoma/epidemiology, Quantitative Trait Loci, Adult, Aged, Case-Control Studies, Child, Female, Genetic Predisposition to Disease, Genome-Wide Association Study, Humans, Male, Mendelian Randomization Analysis, Middle Aged, Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics, Precursor Cell Lymphoblastic Leukemia-Lymphoma/pathology, Prognosis, Prospective Studies, United Kingdom/epidemiology
ABSTRACT
DNA methylation may be regulated by genetic variants within a genomic region, referred to as methylation quantitative trait loci (mQTLs). The changes of methylation levels can further lead to alterations of gene expression, and influence the risk of various complex human diseases. Detecting mQTLs may provide insights into the underlying mechanism of how genotypic variations may influence the disease risk. In this article, we propose a methylation random field (MRF) method to detect mQTLs by testing the association between the methylation level of a CpG site and a set of genetic variants within a genomic region. The proposed MRF has two major advantages over existing approaches. First, it uses a beta distribution to characterize the bimodal and interval properties of the methylation trait at a CpG site. Second, it considers multiple common and rare genetic variants within a genomic region to identify mQTLs. Through simulations, we demonstrated that the MRF had improved power over other existing methods in detecting rare variants of relatively large effect, especially when the sample size is small. We further applied our method to a study of congenital heart defects with 83 cardiac tissue samples and identified two mQTL regions, MRPS10 and PSORS1C1, which were colocalized with expression QTL in cardiac tissue. In conclusion, the proposed MRF is a useful tool to identify novel mQTLs, especially for studies with limited sample sizes.
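The "bimodal and interval" properties mentioned above can be seen directly from the beta density. This is a sketch only: the shape parameters are hypothetical, and it shows a plain beta distribution rather than the MRF model itself.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution on the open interval (0, 1)."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


# Hypothetical shape parameters; with both below 1 the density is U-shaped,
# piling mass near 0 (unmethylated) and 1 (fully methylated).
a, b = 0.5, 0.5
edge_low, middle, edge_high = (beta_pdf(x, a, b) for x in (0.05, 0.5, 0.95))
print(edge_low > middle, edge_high > middle)
```

The support is restricted to (0, 1), matching a methylation proportion, and the density is higher near both ends than in the middle.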
Subject(s)
Computational Biology/methods, DNA Methylation, Genetic Epigenesis, Epigenomics/methods, Quantitative Trait Loci, Algorithms, Alleles, Bayes Theorem, Computational Biology/standards, CpG Islands, Data Analysis, Epigenomics/standards, Genotype, Humans, Organ Specificity/genetics, Single Nucleotide Polymorphism
ABSTRACT
MOTIVATION: CpG sites within the same genomic region often share similar methylation patterns and tend to be co-regulated by multiple genetic variants that may interact with one another. RESULTS: We propose a multi-trait methylation random field (multi-MRF) method to evaluate the joint association between a set of CpG sites and a set of genetic variants. The proposed method has several advantages. First, it is a multi-trait method that allows flexible correlation structures between neighboring CpG sites (e.g. distance-based correlation). Second, it is also a multi-locus method that integrates the effect of multiple common and rare genetic variants. Third, it models the methylation traits with a beta distribution to characterize their bimodal and interval properties. Through simulations, we demonstrated that the proposed method had improved power over some existing methods under various disease scenarios. We further illustrated the proposed method via an application to a study of congenital heart defects (CHDs) with 83 cardiac tissue samples. Our results suggested that gene BACE2, a methylation quantitative trait locus (QTL) candidate, colocalized with expression QTLs in tibial artery tissue and harbored genetic variants with nominally significant associations in two genome-wide association studies of CHD. AVAILABILITY AND IMPLEMENTATION: https://github.com/chenlyu2656/Multi-MRF. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Genome-Wide Association Study, Quantitative Trait Loci, Methylation, Phenotype, Genomics/methods, DNA Methylation, Single Nucleotide Polymorphism
ABSTRACT
Risk factors that contribute to inter-individual differences in the age-of-onset of allergic diseases are poorly understood. The aim of this study was to identify genetic risk variants associated with the age at which symptoms of allergic disease first develop, considering information from asthma, hay fever and eczema. Self-reported age-of-onset information was available for 117,130 genotyped individuals of European ancestry from the UK Biobank study. For each individual, we identified the earliest age at which asthma, hay fever and/or eczema was first diagnosed and performed a genome-wide association study (GWAS) of this combined age-of-onset phenotype. We identified 50 variants with a significant independent association (P < 3 × 10^-8) with age-of-onset. Forty-five variants had comparable effects on the onset of the three individual diseases and 38 were also associated with allergic disease case-control status in an independent study (n = 222,484). We observed a strong negative genetic correlation between age-of-onset and case-control status of allergic disease (r_g = -0.63, P = 4.5 × 10^-61), indicating that cases with early disease onset have a greater burden of allergy risk alleles than those with late disease onset. Subsequently, a multivariate GWAS of age-of-onset and case-control status identified a further 26 associations that were missed by the univariate analyses of age-of-onset or case-control status only. Collectively, of the 76 variants identified, 18 represent novel associations for allergic disease. We identified 81 likely target genes of the 76 associated variants based on information from expression quantitative trait loci (eQTL) and non-synonymous variants, of which we highlight ADAM15, FOSL2, TRIM8, BMPR2, CD200R1, PRKCQ, NOD2, SMAD4, ABCA7 and UBE2L3. Our results support the notion that early and late onset allergic disease have partly distinct genetic architectures, potentially explaining known differences in pathophysiology between individuals.
Subject(s)
Asthma/genetics, Eczema/genetics, Single Nucleotide Polymorphism, Seasonal Allergic Rhinitis/genetics, Adolescent, Adult, Age of Onset, Aged, Asthma/pathology, Child, Eczema/pathology, Female, Genetic Loci, Genome-Wide Association Study/methods, Humans, Male, Middle Aged, Seasonal Allergic Rhinitis/pathology
ABSTRACT
Patricia Mabry and coauthors discuss application of systems approaches in cancer research.
Subject(s)
Neoplasms, Humans, Neoplasms/epidemiology, Neoplasms/genetics, Neoplasms/therapy, Research
ABSTRACT
BACKGROUND: Up to one of every six individuals diagnosed with one cancer will be diagnosed with a second primary cancer in their lifetime. Genetic factors contributing to the development of multiple primary cancers, beyond known cancer syndromes, have been underexplored. METHODS: To characterize genetic susceptibility to multiple cancers, we conducted a pan-cancer, whole-exome sequencing study of individuals drawn from two large multi-ancestry populations (6429 cases, 165,853 controls). We created two groupings of individuals diagnosed with multiple primary cancers: (1) an overall combined set with at least two cancers across any of 36 organ sites and (2) cancer-specific sets defined by an index cancer at one of 16 organ sites with at least 50 cases from each study population. We then investigated whether variants identified from exome sequencing were associated with these sets of multiple cancer cases in comparison to individuals with one and, separately, no cancers. RESULTS: We identified 22 variant-phenotype associations, 10 of which have not been previously discovered and were significantly overrepresented among individuals with multiple cancers, compared to those with a single cancer. CONCLUSIONS: Overall, we describe variants and genes that may play a fundamental role in the development of multiple primary cancers and improve our understanding of shared mechanisms underlying carcinogenesis.
Subject(s)
Genetic Predisposition to Disease, Multiple Primary Neoplasms, Exome/genetics, Genetic Predisposition to Disease/genetics, Humans, Multiple Primary Neoplasms/genetics, Phenotype, Exome Sequencing
ABSTRACT
MOTIVATION: While gene-environment (GxE) interactions contribute importantly to many different phenotypes, detecting such interactions requires well-powered studies and has proven difficult. To address this, we combine two approaches to improve GxE power: simultaneously evaluating multiple phenotypes and using a two-step analysis approach. Previous work shows that the power to identify a main genetic effect can be improved by simultaneously analyzing multiple related phenotypes. For a univariate phenotype, two-step methods produce higher power for detecting a GxE interaction compared to single-step analysis. Therefore, we propose a two-step approach to test for an overall GxE effect for multiple phenotypes. RESULTS: Using simulations we demonstrate that, when more than one phenotype has a GxE effect (i.e. GxE pleiotropy), our approach offers a substantial gain in power (18-43%) to detect an aggregate-level GxE effect for a multivariate phenotype compared to an analogous two-step method to identify a GxE effect for a univariate phenotype. We applied the proposed approach to simultaneously analyze three lipids, LDL, HDL and triglycerides, with the frequency of alcohol consumption as the environmental factor in the UK Biobank. The method identified two loci with an overall GxE effect on the vector of lipids, one of which was missed by the competing approaches. AVAILABILITY AND IMPLEMENTATION: We provide an R package MPGE implementing the proposed approach which is available from CRAN: https://cran.r-project.org/web/packages/MPGE/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ABSTRACT
Obstructive heart defects (OHDs) share common structural lesions in arteries and cardiac valves, accounting for ~25% of all congenital heart defects. OHDs are highly heritable, resulting from interplay among maternal exposures, genetic susceptibilities, and epigenetic phenomena. A genome-wide association study was conducted in National Birth Defects Prevention Study participants (N_discovery = 3978; N_replication = 2507), investigating the genetic architecture of OHDs using transmission/disequilibrium tests (TDT) in complete case-parental trios (N_discovery_TDT = 440; N_replication_TDT = 275) and case-control analyses separately in infants (N_discovery_CCI = 1635; N_replication_CCI = 990) and mothers (case status defined by infant; N_discovery_CCM = 1703; N_replication_CCM = 1078). In the TDT analysis, the SLC44A2 single nucleotide polymorphism (SNP) rs2360743 was significantly associated with OHD (p_discovery = 4.08 × 10^-9; p_replication = 2.44 × 10^-4). A CAPN11 SNP (rs55877192) was suggestively associated with OHD (p_discovery = 1.61 × 10^-7; p_replication = 0.0016). Two other SNPs were suggestively associated (p < 1 × 10^-6) with OHD in only the discovery sample. In the case-control analyses, no SNPs were genome-wide significant, and, even with relaxed thresholds (p_discovery < 1 × 10^-5 and p_replication < 0.05), only one SNP (rs188255766) in the infant analysis was associated with OHDs (p_discovery = 1.42 × 10^-6; p_replication = 0.04). Additional SNPs with p_discovery < 1 × 10^-5 were in loci supporting previous findings but did not replicate. Overall, there was modest evidence of an association between rs2360743 and rs55877192 and OHD and some evidence validating previously published findings.
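For readers unfamiliar with the TDT, a minimal sketch of the classic statistic follows. It assumes the standard biallelic form, in which transmissions from heterozygous parents are compared with a McNemar-type chi-square on 1 degree of freedom; the abstract does not give implementation details, and the counts below are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Classic TDT chi-square (1 df) and p-value.

    transmitted / untransmitted: number of times heterozygous parents did or
    did not pass the candidate allele to an affected child.
    """
    b, c = transmitted, untransmitted
    chi2 = (b - c) ** 2 / (b + c)
    # Upper-tail probability of a chi-square with 1 df, via the
    # complementary error function.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts, not taken from the study.
chi2, p = tdt(60, 40)
print(chi2, round(p, 4))
```

Under the null hypothesis of no linkage or association, each heterozygous parent transmits the allele with probability one half, so a large imbalance between the two counts is evidence against the null.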
Subject(s)
Genome-Wide Association Study, Congenital Heart Defects, Case-Control Studies, Female, Genetic Predisposition to Disease, Congenital Heart Defects/epidemiology, Congenital Heart Defects/genetics, Humans, Infant, Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: While pollution from vehicle sources is an established risk factor for preterm birth, it is unclear whether distance of residence to the nearest major road or related measures like major road density represent useful measures for characterising risk. OBJECTIVE: To determine whether major road proximity measures (including distance to major road, major road density and traffic volume) are more useful risk factors for preterm birth than other established vehicle-related measures (including particulate matter <2.5 µm in diameter (PM2.5) and diesel particulate matter (diesel PM)). METHODS: This retrospective cohort study included 2.7 million births across the state of California from 2011-2017; each address at delivery was geocoded. Geocoding was used to calculate distance to the nearest major road, major road density within a 500 m radius and major road density weighted by truck volume. We measured associations with preterm birth using risk ratios adjusted for target demographic, clinical, socioeconomic and environmental covariates (aRRs). We compared these to the associations between preterm birth and PM2.5 and diesel PM by census tract of residence. RESULTS: Findings showed that whereas higher mean levels of PM2.5 and diesel PM by census tract were associated with a higher risk of preterm birth, living closer to roads or living in higher traffic density areas was not associated with higher risk. Residence in a census tract with a mean PM2.5 in the top quartile compared with the lowest quartile was associated with the highest observed risk of preterm birth (aRR 1.04, 95% CI 1.04, 1.05). CONCLUSIONS: Over a large geographical region with a diverse population, PM2.5 and diesel PM were associated with preterm birth, while measures of distance to major road were not, suggesting that these distance measures do not serve as a proxy for measures of particulate matter in the context of preterm birth.
Subject(s)
Air Pollutants, Air Pollution, Premature Birth, Air Pollutants/adverse effects, Air Pollutants/analysis, Air Pollution/adverse effects, Air Pollution/analysis, California/epidemiology, Census Tract, Humans, Newborn Infant, Particulate Matter/adverse effects, Particulate Matter/analysis, Premature Birth/epidemiology, Premature Birth/etiology, Retrospective Studies, Risk Factors, Vehicle Emissions/toxicity
ABSTRACT
PURPOSE: We examined the demographic and clinicopathological parameters associated with the time to convert from active surveillance to treatment among men with prostate cancer. MATERIALS AND METHODS: A multi-institutional cohort of 7,279 patients managed with active surveillance had data and biospecimens collected for germline genetic analyses. RESULTS: Of 6,775 men included in the analysis, 2,260 (33.4%) converted to treatment at a median follow-up of 6.7 years. Earlier conversion was associated with higher Gleason grade groups (GG2 vs GG1 adjusted hazard ratio [aHR] 1.57, 95% CI 1.36-1.82; ≥GG3 vs GG1 aHR 1.77, 95% CI 1.29-2.43), serum prostate specific antigen concentrations (aHR per 5 ng/ml increment 1.18, 95% CI 1.11-1.25), tumor stages (cT2 vs cT1 aHR 1.58, 95% CI 1.41-1.77; ≥cT3 vs cT1 aHR 4.36, 95% CI 3.19-5.96) and number of cancerous biopsy cores (3 vs 1-2 cores aHR 1.59, 95% CI 1.37-1.84; ≥4 vs 1-2 cores aHR 3.29, 95% CI 2.94-3.69), and younger age (age continuous per 5-year increase aHR 0.96, 95% CI 0.93-0.99). Patients with high-volume GG1 tumors had a shorter interval to conversion than those with low-volume GG1 tumors and behaved like the higher-risk patients. We found no significant association between the time to conversion and self-reported race or genetic ancestry. CONCLUSIONS: A shorter time to conversion from active surveillance to treatment was associated with higher-risk clinicopathological tumor features. Furthermore, patients with high-volume GG1 tumors behaved similarly to those with intermediate and high-risk tumors. An exploratory analysis of self-reported race and genetic ancestry revealed no association with the time to conversion.
Subject(s)
Prostatectomy/statistics & numerical data, Prostatic Neoplasms/therapy, Watchful Waiting/statistics & numerical data, Aged, Large-Core Needle Biopsy/statistics & numerical data, Disease Progression, Follow-Up Studies, Humans, Kallikreins/blood, Male, Middle Aged, Neoplasm Grading, Neoplasm Staging, Prostate/pathology, Prostate/surgery, Prostate-Specific Antigen/blood, Prostatic Neoplasms/blood, Prostatic Neoplasms/diagnosis, Prostatic Neoplasms/pathology, Risk Assessment/statistics & numerical data, Risk Factors, Time Factors, Tumor Burden
ABSTRACT
Cancer diagnoses are associated with better long-term memory in older adults, possibly reflecting a range of social confounders that increase cancer risk but improve memory. We used spouse's memory as a negative control outcome to evaluate this possible confounding, since spouses share social characteristics and environments, and individuals' cancers are unlikely to cause better memory among their spouses. We estimated the association of an individual's incident cancer diagnosis (exposure) with their own (primary outcome) and their spouse's (negative control outcome) memory decline in 3601 couples from 1998 to 2014 in the Health and Retirement Study, using linear mixed-effects models. Incident cancer predicted better long-term memory for the diagnosed individual. We observed no association between an individual's cancer diagnosis and rate of spousal memory decline. This negative control study suggests that the inverse association between incident cancer and rate of memory decline is unlikely to be attributable to social/behavioral factors shared between spouses.
Subject(s)
Memory/physiology, Neoplasms/diagnosis, Spouses/statistics & numerical data, Aged, Female, Humans, Male, Middle Aged, Surveys and Questionnaires
ABSTRACT
Simultaneous analysis of genetic associations with multiple phenotypes may reveal shared genetic susceptibility across traits (pleiotropy). For a locus exhibiting overall pleiotropy, it is important to identify which specific traits underlie this association. We propose a Bayesian meta-analysis approach (termed CPBayes) that uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. This method uses a unified Bayesian statistical framework based on a spike and slab prior. CPBayes performs a fully Bayesian analysis by employing the Markov Chain Monte Carlo (MCMC) technique Gibbs sampling. It takes into account heterogeneity in the size and direction of the genetic effects across traits. It can be applied to both cohort data and separate studies of multiple traits having overlapping or non-overlapping subjects. Simulations show that CPBayes can produce higher accuracy in the selection of associated traits underlying a pleiotropic signal than the subset-based meta-analysis ASSET. We used CPBayes to undertake a genome-wide pleiotropic association study of 22 traits in the large Kaiser GERA cohort and detected six independent pleiotropic loci associated with at least two phenotypes. This includes a locus at chromosomal region 1q24.2 which exhibits an association simultaneously with the risk of five different diseases: Dermatophytosis, Hemorrhoids, Iron Deficiency, Osteoporosis and Peripheral Vascular Disease. We provide an R-package 'CPBayes' implementing the proposed method.
Subject(s)
Bayes Theorem, Genetic Association Studies/methods, Genetic Association Studies/statistics & numerical data, Genetic Predisposition to Disease, Phenotype, Case-Control Studies, Cohort Studies, Genetic Predisposition to Disease/epidemiology, Humans, Markov Chains, Monte Carlo Method
ABSTRACT
Latinos represent <1% of samples analyzed to date in genome-wide association studies of cancer. The clinical value of genetic information in guiding personalized medicine in populations of non-European ancestry will require additional discovery and risk locus characterization efforts across populations. In the present study, we performed a GWAS of prostate cancer (PrCa) in 2,820 Latino PrCa cases and 5,293 controls to search for novel PrCa risk loci and to examine the generalizability of known PrCa risk loci in Latino men. We also conducted a genetic admixture-mapping scan to identify PrCa risk alleles associated with local ancestry. Genome-wide significant associations were observed with 84 variants, all located at the known PrCa risk regions at 8q24 (128.484-128.548) and 10q11.22 (MSMB gene). In admixture mapping, we observed genome-wide significant associations with local African ancestry at 8q24. Of the 162 established PrCa risk variants that are common in Latino men, 135 (83.3%) had effects that were directionally consistent with those previously reported, among which 55 (34.0%) were statistically significant with p < 0.05. A polygenic risk model of the known PrCa risk variants showed that, compared to men with average risk (25th-75th percentile of the polygenic risk score distribution), men in the top 10% had a 3.19-fold (95% CI: 2.65, 3.84) increased PrCa risk. In conclusion, we found that the known PrCa risk variants can effectively stratify PrCa risk in Latino men. Larger studies in Latino populations will be required to discover and characterize genetic risk variants for PrCa and improve risk stratification for this population.
Subject(s)
Genetic Predisposition to Disease, Genome-Wide Association Study, Hispanic or Latino, Prostatic Neoplasms/epidemiology, Prostatic Neoplasms/genetics, Aged, Alleles, Tumor Biomarkers, Genotype, Humans, Male, Middle Aged, Multifactorial Inheritance, Odds Ratio, Single Nucleotide Polymorphism
ABSTRACT
Inflammatory bowel disease (IBD) is an established risk factor for colorectal cancer. Recent reports suggesting IBD is also a risk factor for prostate cancer (PC) require further investigation. We studied 218 084 men in the population-based UK Biobank cohort, aged 40 to 69 at study entry between 2006 and 2010, with follow-up through mid-2015. We assessed the association between IBD and subsequent PC using multivariable Cox regression analyses, adjusting for age at assessment, ethnic group, UK region, smoking status, alcohol drinking frequency, body mass index, Townsend Deprivation Index, family history of PC and previous prostate-specific antigen testing. Mean age at study entry was 56 years, 94% of the men were white, and 1.1% (n = 2311) had a diagnosis of IBD. After a median follow-up of 78 months, men with IBD had an increased risk of PC (adjusted hazard ratio [aHR] = 1.31, 95% confidence interval [CI] = 1.03-1.67, P = .029). The association with PC was observed only among men with ulcerative colitis (UC; aHR = 1.47, 95% CI = 1.11-1.95, P = .0070), and not among those with Crohn's disease (aHR = 1.06, 95% CI = 0.63-1.80, P = .82). Results are limited by lack of data on frequency of health care interactions. In a large-scale, prospective cohort study, we detected an association of IBD, and UC specifically, with incident PC diagnosis.
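The reported hazard ratios, confidence intervals, and P values can be cross-checked against one another. This sketch assumes a Wald-type 95% interval that is symmetric on the log scale, which is typical for Cox regression output but is not stated in the abstract; the function name is hypothetical.

```python
import math


def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover the z statistic and two-sided p-value from a ratio and its 95% CI."""
    log_est = math.log(estimate)
    # On the log scale the interval is estimate +/- z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_est / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Values reported in the abstract for IBD overall.
z, p = p_from_ratio_ci(1.31, 1.03, 1.67)
print(round(z, 2), round(p, 3))
```

For the overall IBD estimate this gives a p-value close to the reported P = .029; small differences are expected because the published ratio and interval are rounded.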
Subject(s)
Ulcerative Colitis/epidemiology, Crohn Disease/epidemiology, Prostatic Neoplasms/epidemiology, Adult, Aged, Humans, Incidence, Male, Middle Aged, Multivariate Analysis, Prospective Studies, Risk Factors, United Kingdom/ethnology, White People
ABSTRACT
BACKGROUND: The use of cell-free DNA (cfDNA) as a biomarker in cancer is challenging due to the genetic heterogeneity of malignancies and the rarity of tumor-derived molecules. Here we describe and demonstrate a novel machine-learning guided panel design strategy for improving the detection of tumor variants in cfDNA. Using this approach, we first generated a model to classify and score candidate variants for inclusion on a prostate cancer targeted sequencing panel. We then used this panel to screen tumor variants from prostate cancer patients with localized disease in both in silico and hybrid capture settings. METHODS: Whole-genome sequencing (WGS) data from 550 prostate tumors were analyzed to build a targeted sequencing panel of single point and small (< 200 bp) indel mutations, which was subsequently screened in silico against prostate tumor sequences from 5 patients to assess performance against commonly used alternative panel designs. The panel's ability to detect tumor-derived cfDNA variants was then assessed using prospectively collected cfDNA and tumor foci from a test set of 18 prostate cancer patients with localized disease undergoing radical prostatectomy. RESULTS: The panel generated from this approach identified as top candidates mutations in known driver genes (e.g. HRAS) and prostate cancer related transcription factor binding sites (e.g. MYC, AR). It outperformed two commonly used designs in detecting somatic mutations found in the cfDNA of 5 prostate cancer patients when analyzed in an in silico setting. Additionally, hybrid capture and 2500X sequencing of cfDNA molecules using the panel resulted in detection of tumor variants in all 18 patients of a test set, where 15 of the 18 patients had detected variants found in multiple foci. CONCLUSION: Machine learning-prioritized targeted sequencing panels may prove useful for broad and sensitive variant detection in the cfDNA of heterogeneous diseases. This strategy has implications for disease detection and monitoring when applied to the cfDNA isolated from prostate cancer patients.
Subject(s)
Base Sequence/genetics, Circulating Tumor DNA/genetics, Human Genome, Machine Learning, Prostatic Neoplasms/genetics, Adult, Aged, Aged 80 and over, Tumor Biomarkers/genetics, Tumor Biomarkers/isolation & purification, Circulating Tumor DNA/isolation & purification, Cohort Studies, Humans, Male, Middle Aged, Mutation, DNA Sequence Analysis/methods, Whole Genome Sequencing/methods
ABSTRACT
Our understanding of the genetic basis of disease has evolved from descriptions of overall heritability or familiality to the identification of large numbers of risk loci. One can quantify the impact of such loci on disease using a plethora of measures, which can guide future research decisions. However, different measures can attribute varying degrees of importance to a variant. In this Analysis, we consider and contrast the most commonly used measures - specifically, the heritability of disease liability, approximate heritability, sibling recurrence risk, overall genetic variance using a logarithmic relative risk scale, the area under the receiver-operating curve for risk prediction and the population attributable fraction - and give guidelines for their use that should be explicitly considered when assessing the contribution of genetic variants to disease.
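As one concrete case of how measures can rank variants differently, consider the population attributable fraction. This sketch assumes Levin's formula for a single binary exposure, PAF = p(RR - 1) / (1 + p(RR - 1)); the Analysis itself may define the measure differently, and the two variants below are hypothetical.

```python
def population_attributable_fraction(prevalence, relative_risk):
    """Levin's formula for a single binary exposure."""
    excess = prevalence * (relative_risk - 1)
    return excess / (1 + excess)


# Hypothetical variants: one common with a modest effect, one rare with a large effect.
paf_common = population_attributable_fraction(prevalence=0.50, relative_risk=1.2)
paf_rare = population_attributable_fraction(prevalence=0.01, relative_risk=5.0)
print(round(paf_common, 4), round(paf_rare, 4))
```

Ranked by relative risk, the rare variant looks more important; ranked by attributable fraction, the common variant does, which is the kind of disagreement between measures that the text describes.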
Subject(s)
Inborn Genetic Diseases/genetics, Genetic Predisposition to Disease, Genetic Variation/genetics, Statistical Models, Genome-Wide Association Study, Genotype, Humans, Heritable Quantitative Trait, Risk
ABSTRACT
Breast cancer is the most common solid organ malignancy and the most frequent cause of cancer death among women worldwide. Previous research has yielded insights into its genetic etiology, but there remains a gap in the understanding of genetic factors that contribute to risk, and particularly in the biological mechanisms by which genetic variation modulates risk. The National Cancer Institute's "Up for a Challenge" (U4C) competition provided an opportunity to further elucidate the genetic basis of the disease. Our group leveraged the seven datasets made available by the U4C organizers and data from the publicly available UK Biobank cohort to examine associations between imputed gene expression and breast cancer risk. In particular, we used reference datasets describing the breast tissue and whole blood transcriptomes to impute expression levels in breast cancer cases and controls. In trans-ethnic meta-analyses of U4C and UK Biobank data, we found significant associations between breast cancer risk and the expression of RCCD1 (joint p-value: 3.6 × 10^-6) and DHODH (p-value: 7.1 × 10^-6) in breast tissue, as well as a suggestive association for ANKLE1 (p-value: 9.3 × 10^-5). Expression of RCCD1 in whole blood was also suggestively associated with disease risk (p-value: 1.2 × 10^-5), as were expression of ACAP1 (p-value: 1.9 × 10^-5) and LRRC25 (p-value: 5.2 × 10^-5). While genome-wide association studies (GWAS) have implicated RCCD1 and ANKLE1 in breast cancer risk, they have not identified the remaining three genes. Among the genetic variants that contributed to the predicted expression of the five genes, we found 23 nominally (p-value < 0.05) associated with breast cancer risk, among which 15 are not in high linkage disequilibrium with risk variants previously identified by GWAS. In summary, we used a transcriptome-based approach to investigate the genetic underpinnings of breast carcinogenesis. This approach provided an avenue for deciphering the functional relevance of genes and genetic variants involved in breast cancer.