1.
Nature ; 631(8019): 134-141, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38867047

ABSTRACT

Mosaic loss of the X chromosome (mLOX) is the most common clonal somatic alteration in leukocytes of female individuals [1,2], but little is known about its genetic determinants or phenotypic consequences. Here, to address this, we used data from 883,574 female participants across 8 biobanks; 12% of participants exhibited detectable mLOX in approximately 2% of leukocytes. Female participants with mLOX had an increased risk of myeloid and lymphoid leukaemias. Genetic analyses identified 56 common variants associated with mLOX, implicating genes with roles in chromosomal missegregation, cancer predisposition and autoimmune diseases. Exome-sequence analyses identified rare missense variants in FBXO10 that confer a twofold increased risk of mLOX. Only a small fraction of associations was shared with mosaic Y chromosome loss, suggesting that distinct biological processes drive formation and clonal expansion of sex chromosome missegregation. Allelic shift analyses identified X chromosome alleles that are preferentially retained in mLOX, demonstrating variation at many loci under cellular selection. A polygenic score including 44 allelic shift loci correctly inferred the retained X chromosomes in 80.7% of mLOX cases in the top decile. Our results support a model in which germline variants predispose female individuals to acquiring mLOX, with the allelic content of the X chromosome possibly shaping the magnitude of clonal expansion.


Subject(s)
Aneuploidy , Chromosomes, Human, X , Clone Cells , Leukocytes , Mosaicism , Adult , Female , Humans , Male , Middle Aged , Alleles , Autoimmune Diseases/genetics , Biological Specimen Banks , Chromosome Segregation/genetics , Chromosomes, Human, X/genetics , Chromosomes, Human, Y/genetics , Clone Cells/metabolism , Clone Cells/pathology , Exome/genetics , F-Box Proteins/genetics , Genetic Predisposition to Disease/genetics , Germ-Line Mutation , Leukemia/genetics , Leukocytes/metabolism , Models, Genetic , Multifactorial Inheritance/genetics , Mutation, Missense/genetics
2.
Nature ; 606(7916): 999-1006, 2022 06.
Article in English | MEDLINE | ID: mdl-35676472

ABSTRACT

Large-scale human genetic data [1-3] have shown that cancer mutations display strong tissue-selectivity, but how this selectivity arises remains unclear. Here, using experimental models, functional genomics and analyses of patient samples, we demonstrate that the lineage transcription factor paired box 8 (PAX8) is required for oncogenic signalling by two common genetic alterations that cause clear cell renal cell carcinoma (ccRCC) in humans: the germline variant rs7948643 at 11q13.3 and somatic inactivation of the von Hippel-Lindau tumour suppressor (VHL) [4-6]. VHL loss, which is observed in about 90% of ccRCCs, can lead to hypoxia-inducible factor 2α (HIF2A) stabilization [6,7]. We show that HIF2A is preferentially recruited to PAX8-bound transcriptional enhancers, including a pro-tumorigenic cyclin D1 (CCND1) enhancer that is controlled by PAX8 and HIF2A. The ccRCC-protective allele C at rs7948643 inhibits PAX8 binding at this enhancer and downstream activation of CCND1 expression. Co-option of a PAX8-dependent physiological programme that supports the proliferation of normal renal epithelial cells is also required for MYC expression from the ccRCC metastasis-associated amplicons at 8q21.3-q24.3 (ref. 8). These results demonstrate that transcriptional lineage factors are essential for oncogenic signalling and that they mediate tissue-specific cancer risk associated with somatic and inherited genetic variants.


Subject(s)
Carcinogenesis , Kidney Neoplasms , PAX8 Transcription Factor , Signal Transduction , Alleles , Basic Helix-Loop-Helix Transcription Factors/metabolism , Carcinogenesis/genetics , Carcinoma, Renal Cell/metabolism , Carcinoma, Renal Cell/pathology , Cyclin D1/genetics , Gene Expression Regulation, Neoplastic , Humans , Kidney/metabolism , Kidney/pathology , Kidney Neoplasms/metabolism , Kidney Neoplasms/pathology , Mutation , PAX8 Transcription Factor/genetics , PAX8 Transcription Factor/metabolism , Proto-Oncogene Proteins c-myc/genetics , Von Hippel-Lindau Tumor Suppressor Protein/genetics
3.
Am J Hum Genet ; 111(9): 1864-1876, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39137781

ABSTRACT

We performed a series of integrative analyses including transcriptome-wide association studies (TWASs) and proteome-wide association studies (PWASs) of renal cell carcinoma (RCC) to nominate and prioritize molecular targets for laboratory investigation. On the basis of a genome-wide association study (GWAS) of 29,020 affected individuals and 835,670 control individuals and prediction models trained in transcriptomic reference models, our TWAS across four kidney transcriptomes (GTEx kidney cortex, kidney tubules, TCGA-KIRC [The Cancer Genome Atlas kidney renal clear-cell carcinoma], and TCGA-KIRP [TCGA kidney renal papillary cell carcinoma]) identified 38 gene associations (false-discovery rate <5%) in at least two of four transcriptomic panels and identified 12 genes that were independent of GWAS susceptibility regions. Analyses combining TWAS associations across 48 tissues from GTEx identified associations that were replicable in tumor transcriptomes for 23 additional genes. Analyses by the two major histologic types (clear-cell RCC and papillary RCC) revealed subtype-specific associations, although at least three gene associations were common to both subtypes. PWAS identified 13 associated proteins, all mapping to GWAS-significant loci. TWAS-identified genes were enriched for active enhancer or promoter regions in RCC tumors and hypoxia-inducible factor binding sites in relevant cell lines. Using gene expression correlation, common cancers (breast and prostate) and RCC risk factors (e.g., hypertension and BMI) display genetic contributions shared with RCC. Our work identifies potential molecular targets for RCC susceptibility for downstream functional investigation.


Subject(s)
Carcinoma, Renal Cell , Genome-Wide Association Study , Kidney Neoplasms , Proteome , Transcriptome , Carcinoma, Renal Cell/genetics , Humans , Kidney Neoplasms/genetics , Proteome/genetics , Genetic Predisposition to Disease , Gene Expression Regulation, Neoplastic , Polymorphism, Single Nucleotide , Gene Expression Profiling
4.
Hum Mol Genet ; 32(22): 3146-3152, 2023 11 03.
Article in English | MEDLINE | ID: mdl-37565819

ABSTRACT

Age-related clonal expansion of cells harbouring mosaic chromosomal alterations (mCAs) is one manifestation of clonal haematopoiesis. Identifying factors that influence the generation and promotion of clonal expansion of mCAs is key to investigating the role of mCAs in health and disease. Herein, we report on widely measured serum biomarkers and their possible association with mCAs, which could provide new insights into molecular alterations that promote acquisition and clonal expansion. We performed a cross-sectional investigation of the association of 32 widely measured serum biomarkers with autosomal mCAs, mosaic loss of the Y chromosome, and mosaic loss of the X chromosome in 436 784 cancer-free participants from the UK Biobank. mCAs were associated with a range of commonly measured serum biomarkers such as lipid levels, circulating sex hormones, blood sugar homeostasis, inflammation and immune function, vitamins and minerals, kidney function, and liver function. Biomarker levels in participants with mCAs were estimated to differ by up to 5% relative to mCA-free participants, and individuals with higher cell fraction mCAs had greater deviation in mean biomarker values. Polygenic scores associated with sex hormone binding globulin, vitamin D, and total cholesterol were also associated with mCAs. Overall, we observed that commonly used clinical serum biomarkers related to disease risk are associated with mCAs, suggesting mechanisms involved in these diseases could be related to mCA proliferation and clonal expansion.


Subject(s)
Chromosomes, Human, Y , Mosaicism , Humans , Male , Biological Specimen Banks , Cross-Sectional Studies , Biomarkers , United Kingdom
5.
Am J Hum Genet ; 109(12): 2210-2229, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36423637

ABSTRACT

The most recent genome-wide association study (GWAS) of cutaneous melanoma identified 54 risk-associated loci, but functional variants and their target genes for most have not been established. Here, we performed massively parallel reporter assays (MPRAs) by using malignant melanoma and normal melanocyte cells and further integrated multi-layer annotation to systematically prioritize functional variants and susceptibility genes from these GWAS loci. Of 1,992 risk-associated variants tested in MPRAs, we identified 285 from 42 loci (78% of the known loci) displaying significant allelic transcriptional activities in either cell type (FDR < 1%). We further characterized MPRA-significant variants by motif prediction, epigenomic annotation, and statistical/functional fine-mapping to create integrative variant scores, which prioritized one to six plausible candidate variants per locus for the 42 loci and nominated a single variant for 43% of these loci. Overlaying the MPRA-significant variants with genome-wide significant expression or methylation quantitative trait loci (eQTLs or meQTLs, respectively) from melanocytes or melanomas identified candidate susceptibility genes for 60% of variants (172 of 285 variants). CRISPRi of top-scoring variants validated their cis-regulatory effect on the eQTL target genes, MAFF (22q13.1) and GPRC5A (12p13.1). Finally, we identified 36 melanoma-specific and 45 melanocyte-specific MPRA-significant variants, a subset of which are linked to cell-type-specific target genes. Analyses of transcription factor availability in MPRA datasets and variant-transcription-factor interaction in eQTL datasets highlighted the roles of transcription factors in cell-type-specific variant functionality. In conclusion, MPRAs along with variant scoring effectively prioritized plausible candidates for most melanoma GWAS loci and highlighted cellular contexts where the susceptibility variants are functional.


Subject(s)
Melanoma , Skin Neoplasms , Humans , Melanoma/genetics , Skin Neoplasms/genetics , Genome-Wide Association Study , Biological Assay , Transcription Factors , Receptors, G-Protein-Coupled , Melanoma, Cutaneous Malignant
6.
Bioinformatics ; 40(4)2024 03 29.
Article in English | MEDLINE | ID: mdl-38485690

ABSTRACT

MOTIVATION: The acquisition of somatic mutations in hematopoietic stem and progenitor stem cells with resultant clonal expansion, termed clonal hematopoiesis (CH), is associated with increased risk of hematologic malignancies and other adverse outcomes. CH is generally present at low allelic fractions, but clonal expansion and acquisition of additional mutations lead to hematologic cancers in a small proportion of individuals. With high depth and high sensitivity sequencing, CH can be detected in most adults and its clonal trajectory mapped over time. However, accurate CH variant calling is challenging due to the difficulty in distinguishing low frequency CH mutations from sequencing artifacts. The lack of well-validated bioinformatic pipelines for CH calling may contribute to lack of reproducibility in studies of CH. RESULTS: Here, we developed ArCH, an Artifact filtering Clonal Hematopoiesis variant calling pipeline for detecting single nucleotide variants and short insertions/deletions by combining the output of four variant calling tools and filtering based on variant characteristics and sequencing error rate estimation. ArCH is an end-to-end cloud-based pipeline optimized to accept a variety of inputs with customizable parameters adaptable to multiple sequencing technologies, research questions, and datasets. Using deep targeted sequencing data generated from six acute myeloid leukemia patient tumor:normal dilutions, 31 blood samples with orthogonal validation, and 26 blood samples with technical replicates, we show that ArCH improves the sensitivity and positive predictive value of CH variant detection at low allele frequencies compared to standard application of commonly used variant calling approaches. AVAILABILITY AND IMPLEMENTATION: The code for this workflow is available at: https://github.com/kbolton-lab/ArCH.


Subject(s)
Clonal Hematopoiesis , Hematologic Neoplasms , Adult , Humans , High-Throughput Nucleotide Sequencing , Software , Reproducibility of Results , Mutation , Hematopoiesis/genetics
7.
Nature ; 575(7784): 652-657, 2019 11.
Article in English | MEDLINE | ID: mdl-31748747

ABSTRACT

Mosaic loss of chromosome Y (LOY) in circulating white blood cells is the most common form of clonal mosaicism [1-5], yet our knowledge of the causes and consequences of this is limited. Here, using a computational approach, we estimate that 20% of the male population represented in the UK Biobank study (n = 205,011) has detectable LOY. We identify 156 autosomal genetic determinants of LOY, which we replicate in 757,114 men of European and Japanese ancestry. These loci highlight genes that are involved in cell-cycle regulation and cancer susceptibility, as well as somatic drivers of tumour growth and targets of cancer therapy. We demonstrate that genetic susceptibility to LOY is associated with non-haematological effects on health in both men and women, which supports the hypothesis that clonal haematopoiesis is a biomarker of genomic instability in other tissues. Single-cell RNA sequencing identifies dysregulated expression of autosomal genes in leukocytes with LOY and provides insights into why clonal expansion of these cells may occur. Collectively, these data highlight the value of studying clonal mosaicism to uncover fundamental mechanisms that underlie cancer and other ageing-related diseases.


Subject(s)
Chromosome Deletion , Chromosomes, Human, Y/genetics , Genetic Predisposition to Disease/genetics , Genomic Instability/genetics , Leukocytes/pathology , Mosaicism , Adult , Aged , Computational Biology , Databases, Genetic , Female , Genetic Markers/genetics , Humans , Male , Middle Aged , Neoplasms/genetics , United Kingdom
8.
Am J Hum Genet ; 108(9): 1590-1610, 2021 09 02.
Article in English | MEDLINE | ID: mdl-34390653

ABSTRACT

Our study investigated the underlying mechanism for the 14q24 renal cell carcinoma (RCC) susceptibility risk locus identified by a genome-wide association study (GWAS). The sentinel single-nucleotide polymorphism (SNP), rs4903064, at 14q24 confers an allele-specific effect on expression of the double PHD fingers 3 (DPF3) of the BAF SWI/SNF complex as assessed by massively parallel reporter assay, confirmatory luciferase assays, and eQTL analyses. Overexpression of DPF3 in renal cell lines increases growth rates and alters chromatin accessibility and gene expression, leading to inhibition of apoptosis and activation of oncogenic pathways. siRNA interference of multiple DPF3-deregulated genes reduces growth. Our results indicate that germline variation in DPF3, a component of the BAF complex, part of the SWI/SNF complexes, can lead to reduced apoptosis and activation of the STAT3 pathway, both critical in RCC carcinogenesis. In addition, we show that altered DPF3 expression in the 14q24 RCC locus could influence the effectiveness of immunotherapy treatment for RCC by regulating tumor cytokine secretion and immune cell activation.


Subject(s)
Carcinoma, Renal Cell/genetics , Chromosomes, Human, Pair 14 , DNA-Binding Proteins/genetics , Genetic Loci , Kidney Neoplasms/genetics , STAT3 Transcription Factor/genetics , Transcription Factors/genetics , Carcinogenesis/genetics , Carcinogenesis/immunology , Carcinogenesis/pathology , Carcinoma, Renal Cell/immunology , Carcinoma, Renal Cell/pathology , Carcinoma, Renal Cell/therapy , Cell Line, Tumor , Chromatin/chemistry , Chromatin/immunology , Chromatin Assembly and Disassembly/immunology , Cytokines/genetics , Cytokines/immunology , DNA-Binding Proteins/immunology , Gene Expression Regulation , Genetic Predisposition to Disease , Genome, Human , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Humans , Immunotherapy/methods , Kidney Neoplasms/immunology , Kidney Neoplasms/pathology , Kidney Neoplasms/therapy , Polymorphism, Single Nucleotide , STAT3 Transcription Factor/immunology , T-Lymphocytes, Cytotoxic , Transcription Factors/immunology
9.
Thorax ; 79(3): 274-278, 2024 Feb 15.
Article in English | MEDLINE | ID: mdl-38238005

ABSTRACT

We investigated phenotypic leucocyte telomere length (LTL), genetically predicted LTL (gTL), and lung cancer risk among 371 890 participants, including 2829 incident cases, from the UK Biobank. Using multivariable Cox regression, we found dose-response relationships between longer phenotypic LTL (p-trend (continuous) = 2.6×10⁻⁵), longer gTL predicted using a polygenic score with 130 genetic instruments (p-trend (continuous) = 4.2×10⁻¹⁰), and overall lung cancer risk, particularly for adenocarcinoma. The associations were prominent among never smokers. Mendelian Randomization analyses supported causal associations between longer telomere length and lung cancer (HR per 1 SD gTL = 1.87, 95% CI: 1.49 to 2.36, p = 4.0×10⁻⁷), particularly adenocarcinoma (HR per 1 SD gTL = 2.45, 95% CI: 1.69 to 3.57, p = 6.5×10⁻⁶).


Subject(s)
Adenocarcinoma , Lung Neoplasms , Humans , Lung Neoplasms/epidemiology , Lung Neoplasms/genetics , Biological Specimen Banks , Prospective Studies , UK Biobank , Telomere Homeostasis/genetics , Leukocytes , Telomere/genetics
10.
Int J Cancer ; 152(2): 239-248, 2023 01 15.
Article in English | MEDLINE | ID: mdl-36082445

ABSTRACT

Pleiotropy, which consists of a single gene or allelic variant affecting multiple unrelated traits, is common across cancers, with evidence for genome-wide significant loci shared across cancer and noncancer traits. This feature is particularly relevant in multiple myeloma (MM) because several susceptibility loci that have been identified to date are pleiotropic. Therefore, the aim of this study was to identify novel pleiotropic variants involved in MM risk using 28 684 independent single nucleotide polymorphisms (SNPs) from GWAS Catalog that reached a significant association (P < 5 × 10⁻⁸) with their respective trait. The selected SNPs were analyzed in 2434 MM cases and 3446 controls from the International Lymphoma Epidemiology Consortium (InterLymph). The 10 SNPs showing the strongest associations with MM risk in InterLymph were selected for replication in an independent set of 1955 MM cases and 1549 controls from the International Multiple Myeloma rESEarch (IMMEnSE) consortium and 418 MM cases and 147 282 controls from the FinnGen project. The combined analysis of the three studies identified an association between DNAJB4-rs34517439-A and an increased risk of developing MM (OR = 1.22, 95% CI 1.13-1.32, P = 4.81 × 10⁻⁷). rs34517439-A is associated with a modified expression of the FUBP1 gene, which encodes a multifunctional DNA and RNA-binding protein that was observed to influence the regulation of various genes involved in cell cycle regulation, including various oncogenes and oncosuppressors. In conclusion, with a pleiotropic scan approach we identified DNAJB4-rs34517439 as a potentially novel MM risk locus.


Subject(s)
Multiple Myeloma , Humans , Multiple Myeloma/epidemiology , Multiple Myeloma/genetics , Oncogenes , Alleles , Phenotype , Polymorphism, Single Nucleotide , Genome-Wide Association Study , Genetic Predisposition to Disease , HSP40 Heat-Shock Proteins/genetics , DNA-Binding Proteins/genetics , RNA-Binding Proteins
11.
Bioinformatics ; 38(18): 4434-4436, 2022 09 15.
Article in English | MEDLINE | ID: mdl-35900159

ABSTRACT

MOTIVATION: The Division of Cancer Epidemiology and Genetics (DCEG) and the Division of Cancer Prevention (DCP) at the National Cancer Institute (NCI) have recently generated genome-wide association study (GWAS) data for multiple traits in the Prostate, Lung, Colorectal, and Ovarian (PLCO) Genomic Atlas project. The GWAS included 110 000 participants. The dissemination of the genetic association data through a data portal called GWAS Explorer, in a manner that addresses the modern expectations of FAIR reusability by data scientists and engineers, is the main motivation for the development of the open-source JavaScript software development kit (SDK) reported here. RESULTS: The PLCO GWAS Explorer resource relies on a public stateless HTTP application programming interface (API) deployed as the sole backend service for both the landing page's web application and third-party analytical workflows. The core PLCOjs SDK is mapped to each of the API methods, and also to each of the reference graphic visualizations in the GWAS Explorer. A few additional visualization methods extend it. As is the norm with web SDKs, no download or installation is needed and modularization supports targeted code injection for web applications, reactive notebooks (Observable) and node-based web services. AVAILABILITY AND IMPLEMENTATION: code at https://github.com/episphere/plco; project page at https://episphere.github.io/plco.


Subject(s)
Colorectal Neoplasms , Ovarian Neoplasms , United States , Male , Humans , Female , Genome-Wide Association Study , National Cancer Institute (U.S.) , Prostate , Software , Ovarian Neoplasms/genetics , Lung
12.
BMC Med Res Methodol ; 23(1): 153, 2023 06 29.
Article in English | MEDLINE | ID: mdl-37386403

ABSTRACT

BACKGROUND: The rule of thumb that there is little gain in statistical power by obtaining more than 4 controls per case is based on type-1 error α = 0.05. However, association studies that evaluate thousands or millions of associations use smaller α and may have access to plentiful controls. We investigate power gains, and reductions in p-values, when increasing well beyond 4 controls per case, for small α. METHODS: We calculate the power, the median expected p-value, and the minimum detectable odds-ratio (OR), as a function of the number of controls/case, as α decreases. RESULTS: As α decreases, at each ratio of controls per case, the increase in power is larger than for α = 0.05. For α between 10⁻⁶ and 10⁻⁹ (typical for thousands or millions of associations), increasing from 4 controls per case to 10-50 controls per case increases power. For example, a study with power = 0.2 (α = 5 × 10⁻⁸) with 1 control/case has power = 0.65 with 4 controls/case, but with 10 controls/case has power = 0.78, and with 50 controls/case has power = 0.84. For situations where obtaining more than 4 controls per case provides small increases in power beyond 0.9 (at small α), the expected p-value can decrease by orders-of-magnitude below α. Increasing from 1 to 4 controls/case reduces the minimum detectable OR toward the null by 20.9%, and from 4 to 50 controls/case reduces by an additional 9.7%, a result which applies regardless of α and hence also applies to "regular" α = 0.05 epidemiology. CONCLUSIONS: At small α, versus 4 controls/case, recruiting 10 or more controls/case can increase power, reduce the expected p-value by 1-2 orders of magnitude, and meaningfully reduce the minimum detectable OR. These benefits of increasing the controls/case ratio increase as the number of cases increases, although the amount of benefit depends on exposure frequencies and true OR. Provided that controls are comparable to cases, our findings suggest greater sharing of comparable controls in large-scale association studies.
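
The figures quoted in this abstract can be approximately reproduced with a short calculation. The sketch below is not the authors' code. It assumes a two-sided Wald test on the log OR with a fixed number of cases, so that the standard error scales as sqrt(1 + 1/k) for k controls per case, and it reads the "minimum detectable OR" reductions on the log-OR scale. Both are assumptions made for illustration, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_at_ratio(power_at_1, alpha, k):
    """Approximate power with k controls per case, given the power at 1 control per case.

    Assumption: two-sided Wald test on the log odds ratio, with a standard
    error proportional to sqrt(1 + 1/k) for a fixed number of cases.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Noncentrality implied by the stated power at 1 control per case.
    ncp_1 = z_crit + _STD_NORMAL.inv_cdf(power_at_1)
    # Relative to k = 1, the standard error shrinks by sqrt((1 + 1/k) / 2).
    ncp_k = ncp_1 * sqrt(2 / (1 + 1 / k))
    # The far rejection tail is ignored; it is negligible at these alpha levels.
    return _STD_NORMAL.cdf(ncp_k - z_crit)


def detectable_log_or_reduction(k_from, k_to):
    """Fractional shrinkage of the minimum detectable log OR from k_from to k_to controls per case."""
    return 1 - sqrt((1 + 1 / k_to) / (1 + 1 / k_from))


for k in (4, 10, 50):
    print(k, round(power_at_ratio(0.2, 5e-8, k), 2))
print(round(100 * detectable_log_or_reduction(1, 4), 1))
print(round(100 * detectable_log_or_reduction(4, 50), 1))
```

Under these assumptions the reduction depends only on the two ratios and not on α, which is consistent with the abstract's remark that the result applies regardless of α.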


Subject(s)
Control Groups , Odds Ratio , Research Design , Humans
13.
PLoS Genet ; 16(10): e1009078, 2020 10.
Article in English | MEDLINE | ID: mdl-33090998

ABSTRACT

Telomeres are DNA-protein structures at the ends of chromosomes essential in maintaining chromosomal stability. Observational studies have identified associations between telomeres and elevated cancer risk, including hematologic malignancies; but biologic mechanisms relating telomere length to cancer etiology remain unclear. Our study sought to better understand the relationship between telomere length and cancer risk by evaluating genetically-predicted telomere length (gTL) in relation to the presence of clonal somatic copy number alterations (SCNAs) in peripheral blood leukocytes. Genotyping array data were acquired from 431,507 participants in the UK Biobank and used to detect SCNAs from intensity information and infer telomere length using a polygenic risk score (PRS) of variants previously associated with leukocyte telomere length. In total, 15,236 (3.5%) of individuals had a detectable clonal SCNA on an autosomal chromosome. Overall, higher gTL value was positively associated with the presence of an autosomal SCNA (OR = 1.07, 95% CI = 1.05-1.09, P = 1.61×10⁻¹⁵). There was high consistency in effect estimates across strata of chromosomal event location (e.g., telomeric ends, interstitial or whole chromosome event; P_het = 0.37) and strata of copy number state (e.g., gain, loss, or neutral events; P_het = 0.05). Higher gTL value was associated with a greater cellular fraction of clones carrying autosomal SCNAs (β = 0.004, 95% CI = 0.002-0.007, P = 6.61×10⁻⁴). Our population-based examination of gTL and SCNAs suggests inherited components of telomere length do not preferentially impact autosomal SCNA event location or copy number status, but rather likely influence cellular replicative potential.


Subject(s)
Clonal Evolution/genetics , Neoplasms/blood , Telomere Homeostasis/genetics , Telomere/genetics , Adult , Aged , Cell Division/genetics , DNA Copy Number Variations/genetics , Female , Genetics, Population , Humans , Leukocytes/metabolism , Leukocytes/pathology , Male , Middle Aged , Neoplasms/epidemiology , Neoplasms/genetics , United Kingdom/epidemiology
14.
Bioinformatics ; 37(8): 1178-1181, 2021 05 23.
Article in English | MEDLINE | ID: mdl-32926120

ABSTRACT

SUMMARY: A concern when conducting genome-wide association studies (GWAS) is the potential for population stratification, i.e. ancestry-based genetic differences between cases and controls, that if not properly accounted for, could lead to biased association results. We developed PCAmatchR as an open source R package for performing optimal case-control matching using principal component analysis (PCA) to aid in selecting controls that are well matched by ancestry to cases. PCAmatchR takes user supplied PCA outputs and selects matching controls for cases by utilizing a weighted Mahalanobis distance metric which weights each principal component by the percentage of genetic variation explained. Results from the 1000 Genomes Project data demonstrate both the functionality and performance of PCAmatchR for selecting matching controls for case populations as well as reducing inflation of association test statistics. PCAmatchR improves genomic similarity between matched cases and controls, which minimizes the effects of population stratification in GWAS analyses. AVAILABILITY AND IMPLEMENTATION: PCAmatchR is freely available for download on GitHub (https://github.com/machiela-lab/PCAmatchR) or through CRAN (https://CRAN.R-project.org/package=PCAmatchR). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
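
The matching idea described in this abstract can be illustrated with a small sketch. It is written in Python rather than R and does not use or reproduce the PCAmatchR API. It assumes the principal components are uncorrelated (so the covariance is diagonal), that each squared, variance-scaled difference is multiplied by the fraction of variance that component explains, and it substitutes a simple greedy nearest-neighbour assignment for the optimal matching the package performs. All function names and the toy data are hypothetical.

```python
from math import sqrt


def weighted_distance(x, y, pc_variances, weights):
    """Weighted Mahalanobis-style distance between two samples in PC space.

    Assumption: principal components are uncorrelated, so the covariance
    matrix is diagonal with entries pc_variances.
    """
    return sqrt(sum(
        w * (a - b) ** 2 / v
        for a, b, v, w in zip(x, y, pc_variances, weights)
    ))


def greedy_match(cases, controls, pc_variances, weights, n_per_case=1):
    """Assign each case its nearest unused controls (greedy, not optimal)."""
    available = dict(controls)  # control id -> PC coordinates
    matches = {}
    for case_id, case_pcs in cases.items():
        ranked = sorted(
            available,
            key=lambda cid: weighted_distance(
                case_pcs, available[cid], pc_variances, weights),
        )
        matches[case_id] = ranked[:n_per_case]
        for cid in matches[case_id]:
            del available[cid]  # each control is used at most once
    return matches


# Hypothetical toy data: two PCs explaining 80% and 20% of the variance.
cases = {"case1": (0.0, 0.0), "case2": (2.0, 1.0)}
controls = {"ctrlA": (0.1, 0.9), "ctrlB": (1.9, 1.1), "ctrlC": (0.8, 0.0)}
pc_variances = (1.0, 1.0)
weights = (0.8, 0.2)
print(greedy_match(cases, controls, pc_variances, weights))
```

In this toy data the weighting changes the outcome: with weights of 0.8 and 0.2, case1 is matched to ctrlA, which is close on the first component, whereas equal weights would favour ctrlC.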


Subject(s)
Genome-Wide Association Study , Software , Case-Control Studies , Genomics , Principal Component Analysis
15.
BMC Bioinformatics ; 22(1): 608, 2021 Dec 20.
Article in English | MEDLINE | ID: mdl-34930111

ABSTRACT

Genome-wide association studies have identified thousands of genetic susceptibility loci associated with cancer as well as other traits and diseases. Mapping germline variation in identified genetic susceptibility regions to alterations in nearby gene expression nominates candidate genes potentially related to disease risk for further functional investigation. We developed LDexpress as an online resource that integrates population-specific linkage disequilibrium data from the 1000 Genomes (1000G) project and tissue-specific expression data from the Genotype-Tissue Expression project to better study regional germline variation impacting gene expression. LDexpress is a publicly available web tool designed to be easy to use, flexible to conduct a wide range of variant queries, and quick to efficiently investigate dozens of query variants across multiple tissue types. We demonstrate the utility of LDexpress using example genomic queries and anticipate this tool will accelerate understanding of disease etiology by uncovering associations of regional germline variation to nearby gene expression.


Subject(s)
Genetics, Population/methods , Genome-Wide Association Study , Genomics , Humans , Linkage Disequilibrium
16.
Int J Cancer ; 149(5): 1054-1066, 2021 09 01.
Article in English | MEDLINE | ID: mdl-33961701

ABSTRACT

Ewing sarcoma (ES) is the second most common primary bone tumor in children and adolescents. There are few known epidemiological or genetic risk factors for ES. Numerous reports describe incidence rates and trends within the United States, but international comparisons are sparse. We used the Cancer Incidence in Five Continents (CI5) data to estimate age standardized incidence rates (ASRs; cases per million) and 95% confidence intervals (95% CIs), male-to-female incidence rate ratios (IRRs; 95% CI), and the average annual percent change in incidence (AAPC; 95% CI) for ES by geographic region for children and adults aged 0 to 49 years. We also estimated the ASR for each country or country subpopulation among the 10- to 19-year-old age range; capturing the peak incidence of ES. In total, 15 874 ES cases ages 0 to 49 were reported in the CI5 series between 1988 and 2012. AAPC estimates varied by age group and geographic region. Most of the statistically significant AAPCs showed an increased incidence over time; the only statistically significant decreases in incidence were observed among 20- to 29-year-olds and 30- to 39-year-olds in Southern Asia at -1.93% and -1.67%. When categorized by predominant ancestry, we observed countries and subpopulations with predominately African, East Asian, and Southeast Asian ancestry had the lowest incidence rates, whereas Pacific Islanders and populations with predominantly European and North African/Middle Eastern ancestry had the highest. An excess incidence in males was observed in most regions. Our results highlight substantial variation in ES incidence across geographic populations, reflecting potential ancestral influence on disease risk.


Subject(s)
Bone Neoplasms/epidemiology , Global Health/trends , Sarcoma, Ewing/epidemiology , Adolescent , Adult , Aged , Child , Child, Preschool , Female , Follow-Up Studies , Humans , Incidence , Infant , Infant, Newborn , International Agencies , Male , Middle Aged , Prognosis , Time Factors , Young Adult
17.
Int J Health Geogr ; 20(1): 13, 2021 03 18.
Article in English | MEDLINE | ID: mdl-33736677

ABSTRACT

BACKGROUND: Cancer epidemiology studies require sufficient power to assess spatial relationships between exposures and cancer incidence accurately. However, methods for power calculations of spatial statistics are complicated and underdeveloped, and therefore underutilized by investigators. The spatial relative risk function, a cluster detection technique that detects spatial clusters of point-level data for two groups (e.g., cancer cases and controls, two exposure groups), is a commonly used spatial statistic but does not have a readily available power calculation for study design. RESULTS: We developed sparrpowR as an open-source R package to estimate the statistical power of the spatial relative risk function. sparrpowR generates simulated data applying user-defined parameters (e.g., sample size, locations) to detect spatial clusters with high statistical power. We present applications of sparrpowR that perform a power calculation for a study designed to detect a spatial cluster of incident cancer in relation to a point source of numerous environmental emissions. The conducted power calculations demonstrate the functionality and utility of sparrpowR to calculate the local power for spatial cluster detection. CONCLUSIONS: sparrpowR improves the current capacity of investigators to calculate the statistical power of spatial clusters, which assists in designing more efficient studies. This newly developed R package addresses a critically underdeveloped gap in cancer epidemiology by estimating statistical power for a common spatial cluster detection technique.
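
For readers unfamiliar with the statistic, the spatial relative risk function is commonly defined as the log ratio of two kernel density estimates, one for each group. The sketch below illustrates only that statistic at a single location. It is not sparrpowR's implementation and does not perform the simulation-based power calculation the abstract describes. The fixed Gaussian bandwidth, the function names, and the coordinates are assumptions chosen for illustration.

```python
from math import exp, log, pi


def kernel_density(point, sample, bandwidth):
    """Gaussian kernel density estimate at `point` from 2-D `sample` locations."""
    px, py = point
    total = sum(
        exp(-((px - x) ** 2 + (py - y) ** 2) / (2 * bandwidth ** 2))
        for x, y in sample
    )
    return total / (len(sample) * 2 * pi * bandwidth ** 2)


def log_relative_risk(point, cases, controls, bandwidth):
    """Log ratio of the case density to the control density at `point`."""
    return log(kernel_density(point, cases, bandwidth)
               / kernel_density(point, controls, bandwidth))


# Hypothetical coordinates: cases cluster near the origin, controls sit on a grid.
cases = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1), (1.0, 1.0)]
controls = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Positive values indicate a relative excess of cases, negative values a deficit.
print(log_relative_risk((0.0, 0.0), cases, controls, bandwidth=0.5) > 0)
print(log_relative_risk((1.0, 0.0), cases, controls, bandwidth=0.5) < 0)
```

A power calculation in this setting would repeat such an evaluation over many simulated datasets and record how often a cluster is detected at each location.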


Subject(s)
Neoplasms , Cluster Analysis , Humans , Incidence , Spatial Analysis
18.
Am J Epidemiol ; 189(12): 1451-1460, 2020 12 01.
Article in English | MEDLINE | ID: mdl-32613232

ABSTRACT

Although transgenerational effects of exposure to ionizing radiation have long been a concern, human research to date has been confined to studies of disease phenotypes in groups exposed to high doses and high dose rates, such as the Japanese atomic bomb survivors. Transgenerational effects of parental irradiation can be addressed using powerful new genomic technologies. In collaboration with the Ukrainian National Research Center for Radiation Medicine, the US National Cancer Institute, in 2014-2018, initiated a genomic alterations study among children born in selected regions of Ukraine to cleanup workers and/or evacuees exposed to low-dose-rate radiation after the 1986 Chornobyl (Chernobyl) nuclear accident. To investigate whether parental radiation exposure is associated with germline mutations and genomic alterations in the offspring, we are collecting biospecimens from father-mother-offspring constellations to study de novo mutations, minisatellite mutations, copy-number changes, structural variants, genomic insertions and deletions, methylation profiles, and telomere length. Genomic alterations are being examined in relation to parental gonadal dose, reconstructed using questionnaire and measurement data. Subjects are being recruited in exposure categories that will allow examination of parental origin, duration, and timing of exposure in relation to conception. Here we describe the study methodology and recruitment results and provide descriptive information on the first 150 families (mother-father-child(ren)) enrolled.


Subject(s)
Chernobyl Nuclear Accident , Germ-Line Mutation , Maternal Exposure/adverse effects , Paternal Exposure/adverse effects , Radiation Dosage , Adult , Female , Follow-Up Studies , Humans , Male , Young Adult
19.
PLoS Med ; 16(1): e1002724, 2019 01.
Article in English | MEDLINE | ID: mdl-30605491

ABSTRACT

BACKGROUND: Several obesity-related factors have been associated with renal cell carcinoma (RCC), but it is unclear which individual factors directly influence risk. We addressed this question using genetic markers as proxies for putative risk factors and evaluated their relation to RCC risk in a mendelian randomization (MR) framework. This methodology limits bias due to confounding and is not affected by reverse causation. METHODS AND FINDINGS: Genetic markers associated with obesity measures, blood pressure, lipids, type 2 diabetes, insulin, and glucose were initially identified as instrumental variables, and their association with RCC risk was subsequently evaluated in a genome-wide association study (GWAS) of 10,784 RCC patients and 20,406 control participants in a 2-sample MR framework. The effect on RCC risk was estimated by calculating odds ratios (OR_SD) for a standard deviation (SD) increment in each risk factor. The MR analysis indicated that higher body mass index increases the risk of RCC (OR_SD: 1.56, 95% confidence interval [CI] 1.44-1.70), with comparable results for waist-to-hip ratio (OR_SD: 1.63, 95% CI 1.40-1.90) and body fat percentage (OR_SD: 1.66, 95% CI 1.44-1.90). This analysis further indicated that higher fasting insulin (OR_SD: 1.82, 95% CI 1.30-2.55) and diastolic blood pressure (DBP; OR_SD: 1.28, 95% CI 1.11-1.47), but not systolic blood pressure (OR_SD: 0.98, 95% CI 0.84-1.14), increase the risk for RCC. No association with RCC risk was seen for lipids, overall type 2 diabetes, or fasting glucose. CONCLUSIONS: This study provides novel evidence for an etiological role of insulin in RCC, as well as confirmatory evidence that obesity and DBP influence RCC risk.
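As a worked illustration of how the reported odds ratios and confidence intervals relate, the short Python sketch below recovers the log odds ratio and its approximate standard error for the body mass index estimate (1.56, 95% CI 1.44-1.70). It assumes the interval is a Wald-type interval that is symmetric on the log scale, which the abstract does not state; the function name and the 1.96 multiplier are illustrative choices, not the authors' code.

```python
import math


def log_or_and_se(or_point, ci_low, ci_high, z=1.96):
    """Return (log OR, approximate SE), assuming a CI symmetric on the log scale."""
    log_or = math.log(or_point)
    # The interval width on the log scale is 2 * z * SE under the Wald assumption.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_or, se


# Body mass index estimate taken from the abstract above.
log_or, se = log_or_and_se(1.56, 1.44, 1.70)
z_score = log_or / se  # approximate Wald z statistic
```

The same calculation can be applied to the other estimates in the abstract; an interval that spans 1, such as the one reported for systolic blood pressure, yields a z statistic close to zero.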


Subject(s)
Carcinoma, Renal Cell/etiology , Kidney Neoplasms/etiology , Obesity/complications , Blood Glucose/analysis , Blood Pressure , Body Mass Index , Carcinoma, Renal Cell/genetics , Diabetes Mellitus, Type 2/complications , Female , Genetic Markers , Genome-Wide Association Study , Humans , Insulin/blood , Kidney Neoplasms/genetics , Lipids/blood , Male , Mendelian Randomization Analysis , Obesity/genetics , Risk Factors
20.
Hum Mol Genet ; 26(22): 4388-4394, 2017 11 15.
Article in English | MEDLINE | ID: mdl-28973384

ABSTRACT

Recent studies have reported a higher than anticipated frequency of large clonal autosomal mosaic events >2 Mb in size in the aging population. Mosaic events are detected from analyses of intensity parameters of linear stretches with deviations in heterozygous probes of single nucleotide polymorphism microarrays. The non-random distribution of detected mosaic events throughout the genome suggests common mechanisms could influence the formation of mosaic events. Here we use publicly available data tracks from the University of California Santa Cruz Genome Browser to investigate the genomic characteristics of the regions at the terminal ends of two frequent types of large structural mosaic events: telomeric neutral events and interstitial losses. We observed breakpoints are more likely to occur in regions enriched for open chromatin, increased gene density, elevated meiotic recombination rates and in the proximity of repetitive elements. These observations suggest that detected mosaic event breakpoints are preferentially recovered in genomic regions that are observed to be active and thus more accessible to environmental exposures and events related to gene transcription. We propose that errors in DNA repair pathways, such as non-homologous end joining and homologous recombination, may be important cellular mechanisms that lead to the formation of large structural mosaic events such as interstitial losses and copy neutral events that include telomeres. Further studies using next generation sequencing technologies should be instrumental in mapping the specific junctions of mosaic events to the nucleotide and provide insights into the molecular mechanisms responsible for clonal somatic structural events.


Subject(s)
Chromosome Breakage , Chromosomes, Human , Chromatin , DNA Breaks , DNA Copy Number Variations , Databases, Nucleic Acid , Genome, Human , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Mosaicism , Polymorphism, Single Nucleotide , Recombination, Genetic