Search | VHL Search Portal

1.

Automated AI labeling of optic nerve head enables insights into cross-ancestry glaucoma risk and genetic discovery in >280,000 images from UKB and CLSA.

Han, Xikun; Steven, Kaiah; Qassim, Ayub; Marshall, Henry N; Bean, Cameron; Tremeer, Michael; An, Jiyuan; Siggs, Owen M; Gharahkhani, Puya; Craig, Jamie E; Hewitt, Alex W; Trzaskowski, Maciej; MacGregor, Stuart.

Am J Hum Genet ; 108(7): 1204-1216, 2021 07 01.

Article in English | MEDLINE | ID: mdl-34077762

ABSTRACT

Cupping of the optic nerve head, a highly heritable trait, is a hallmark of glaucomatous optic neuropathy. Two key parameters are vertical cup-to-disc ratio (VCDR) and vertical disc diameter (VDD). However, manual assessment often suffers from poor accuracy and is time intensive. Here, we show convolutional neural network models can accurately estimate VCDR and VDD for 282,100 images from both UK Biobank and an independent study (Canadian Longitudinal Study on Aging), enabling cross-ancestry epidemiological studies and new genetic discovery for these optic nerve head parameters. Using the AI approach, we perform a systematic comparison of the distribution of VCDR and VDD and compare these with intraocular pressure and glaucoma diagnoses across various genetically determined ancestries, which provides an explanation for the high rates of normal tension glaucoma in East Asia. We then used the large number of AI gradings to conduct a more powerful genome-wide association study (GWAS) of optic nerve head parameters. Using the AI-based gradings increased estimates of heritability by ~50% for VCDR and VDD. Our GWAS identified more than 200 loci associated with both VCDR and VDD (double the number of loci from previous studies) and uncovered dozens of biological pathways; many of the loci we discovered also confer risk for glaucoma.

Subject(s)

Artificial Intelligence , Glaucoma/genetics , Optic Disk/diagnostic imaging , Adult , Aged , Algorithms , Female , Genome-Wide Association Study , Glaucoma/diagnosis , Glaucoma/pathology , Humans , Image Processing, Computer-Assisted , Inheritance Patterns , Intraocular Pressure , Male , Middle Aged , Nerve Net , Optic Disk/pathology , Photography , Polymorphism, Single Nucleotide , Risk Factors

2.

Evidence for within-species transition between drought response strategies in Nicotiana benthamiana.

Asadyar, Leila; de Felippes, Felipe Fenselau; Bally, Julia; Blackman, Chris J; An, Jiyuan; Sussmilch, Frances C; Moghaddam, Lalehvash; Williams, Brett; Blanksby, Stephen J; Brodribb, Timothy J; Waterhouse, Peter M.

New Phytol ; 2024 Jun 11.

Article in English | MEDLINE | ID: mdl-38863314

ABSTRACT

Nicotiana benthamiana is predominantly distributed in arid habitats across northern Australia. However, none of six geographically isolated accessions shows obvious xerophytic morphological features. To investigate how these tender-looking plants withstand drought, we examined their responses to water deprivation, assessed phenotypic, physiological, and cellular responses, and analysed cuticular wax composition and wax biosynthesis gene expression profiles. Results showed that the Central Australia (CA) accession, globally known as a research tool, has evolved a drought escape strategy with early vigour, short life cycle, and weak, water loss-limiting responses. By contrast, a northern Queensland (NQ) accession responded to drought by slowing growth, inhibiting flowering, increasing leaf cuticle thickness, and altering cuticular wax composition. Under water stress, NQ increased the heat stability and water impermeability of its cuticle by extending the carbon backbone of cuticular long-chain alkanes from c. 25 to 33. This correlated with rapid upregulation of at least five wax biosynthesis genes. In CA, the alkane chain lengths (c. 25) and gene expression profiles remained largely unaltered. This study highlights complex genetic and environmental control over cuticle composition and provides evidence for divergence into at least two fundamentally different drought response strategies within the N. benthamiana species in < 1 million years.

3.

The causal relationship between gastro-oesophageal reflux disease and idiopathic pulmonary fibrosis: a bidirectional two-sample Mendelian randomisation study.

Reynolds, Carl J; Del Greco M, Fabiola; Allen, Richard J; Flores, Carlos; Jenkins, R Gisli; Maher, Toby M; Molyneaux, Philip L; Noth, Imre; Oldham, Justin M; Wain, Louise V; An, Jiyuan; Ong, Jue-Sheng; MacGregor, Stuart; Yates, Tom A; Cullinan, Paul; Minelli, Cosetta.

Eur Respir J ; 61(5), 2023 05.

Article in English | MEDLINE | ID: mdl-37080571

ABSTRACT

BACKGROUND: Gastro-oesophageal reflux disease (GORD) is associated with idiopathic pulmonary fibrosis (IPF) in observational studies. It is not known if this association arises because GORD causes IPF or because IPF causes GORD, or because of confounding by factors, such as smoking, associated with both GORD and IPF. We used bidirectional Mendelian randomisation (MR), where genetic variants are used as instrumental variables to address issues of confounding and reverse causation, to examine how, if at all, GORD and IPF are causally related. METHODS: A bidirectional two-sample MR was performed to estimate the causal effect of GORD on IPF risk and of IPF on GORD risk, using genetic data from the largest GORD (78 707 cases and 288 734 controls) and IPF (4125 cases and 20 464 controls) genome-wide association meta-analyses currently available. RESULTS: GORD increased the risk of IPF, with an OR of 1.6 (95% CI 1.04-2.49; p=0.032). There was no evidence of a causal effect of IPF on the risk of GORD, with an OR of 0.999 (95% CI 0.997-1.000; p=0.245). CONCLUSIONS: We found that GORD increases the risk of IPF, but found no evidence that IPF increases the risk of GORD. GORD should be considered in future studies of IPF risk and interest in it as a potential therapeutic target should be renewed. The mechanisms underlying the effect of GORD on IPF should also be investigated.

Subject(s)

Gastroesophageal Reflux , Idiopathic Pulmonary Fibrosis , Humans , Gastroesophageal Reflux/complications , Gastroesophageal Reflux/genetics , Gastroesophageal Reflux/drug therapy , Genome-Wide Association Study , Idiopathic Pulmonary Fibrosis/genetics , Idiopathic Pulmonary Fibrosis/complications
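
The odds ratio, confidence interval and p-value reported in the abstract of record 3 can be cross-checked with a short calculation. The sketch below is not the authors' method: it assumes a Wald-type interval that is symmetric on the log-odds scale and a two-sided normal test, and the function name `p_from_or_ci` is hypothetical.

```python
import math

def p_from_or_ci(or_est, ci_low, ci_high, z_crit=1.96):
    """Recover the standard error, z statistic and two-sided p-value
    from an odds ratio and its 95% CI.

    Assumption: the CI is symmetric on the log scale (Wald interval).
    """
    # Width of the CI on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_est) / se
    # Two-sided p-value under a standard normal: P(|Z| > |z|).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values as reported in the abstract: OR 1.6 (95% CI 1.04-2.49).
se, z, p = p_from_or_ci(1.6, 1.04, 2.49)
print(f"SE(log OR)={se:.3f}, z={z:.2f}, p={p:.3f}")
```

The recovered p-value should land close to the reported p=0.032; small differences are expected because the published estimate and interval are rounded.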

4.

Multitrait genetic association analysis identifies 50 new risk loci for gastro-oesophageal reflux, seven new loci for Barrett's oesophagus and provides insights into clinical heterogeneity in reflux diagnosis.

Ong, Jue-Sheng; An, Jiyuan; Han, Xikun; Law, Matthew H; Nandakumar, Priyanka; Schumacher, Johannes; Gockel, Ines; Bohmer, Anne; Jankowski, Janusz; Palles, Claire; Olsen, Catherine M; Neale, Rachel E; Fitzgerald, Rebecca; Thrift, Aaron P; Vaughan, Thomas L; Buas, Matthew F; Hinds, David A; Gharahkhani, Puya; Kendall, Bradley J; MacGregor, Stuart.

Gut ; 71(6): 1053-1061, 2022 06.

Article in English | MEDLINE | ID: mdl-34187846

ABSTRACT

OBJECTIVE: Gastro-oesophageal reflux disease (GERD) has heterogeneous aetiology primarily attributable to its symptom-based definitions. GERD genome-wide association studies (GWASs) have shown strong genetic overlaps with established risk factors such as obesity and depression. We hypothesised that the shared genetic architecture between GERD and these risk factors can be leveraged to (1) identify new GERD and Barrett's oesophagus (BE) risk loci and (2) explore potentially heterogeneous pathways leading to GERD and oesophageal complications. DESIGN: We applied multitrait GWAS models combining GERD (78 707 cases; 288 734 controls) and genetically correlated traits including education attainment, depression and body mass index. We also used multitrait analysis to identify BE risk loci. Top hits were replicated in 23andMe (462 753 GERD cases, 24 099 BE cases, 1 484 025 controls). We additionally dissected the GERD loci into obesity-driven and depression-driven subgroups. These subgroups were investigated to determine how they relate to tissue-specific gene expression and to risk of serious oesophageal disease (BE and/or oesophageal adenocarcinoma, EA). RESULTS: We identified 88 loci associated with GERD, with 59 replicating in 23andMe after multiple testing corrections. Our BE analysis identified seven novel loci. Additionally we showed that only the obesity-driven GERD loci (but not the depression-driven loci) were associated with genes enriched in oesophageal tissues and successfully predicted BE/EA. CONCLUSION: Our multitrait model identified many novel risk loci for GERD and BE. We present strong evidence for a genetic underpinning of disease heterogeneity in GERD and show that GERD loci associated with depressive symptoms are not strong predictors of BE/EA relative to obesity-driven GERD loci.

Subject(s)

Barrett Esophagus , Esophageal Neoplasms , Esophagitis, Peptic , Gastroesophageal Reflux , Barrett Esophagus/complications , Barrett Esophagus/diagnosis , Barrett Esophagus/genetics , Esophageal Neoplasms/diagnosis , Esophageal Neoplasms/genetics , Gastroesophageal Reflux/complications , Gastroesophageal Reflux/diagnosis , Gastroesophageal Reflux/genetics , Genome-Wide Association Study , Humans , Obesity/complications , Obesity/genetics

5.

Evaluating the role of alcohol consumption in breast and ovarian cancer susceptibility using population-based cohort studies and two-sample Mendelian randomization analyses.

Ong, Jue-Sheng; Derks, Eske M; Eriksson, Mikael; An, Jiyuan; Hwang, Liang-Dar; Easton, Douglas F; Pharoah, Paul P; Berchuck, Andrew; Kelemen, Linda E; Matsuo, Keitaro; Chenevix-Trench, Georgia; Hall, Per; Bojesen, Stig E; Webb, Penelope M; MacGregor, Stuart.

Int J Cancer ; 148(6): 1338-1350, 2021 03 15.

Article in English | MEDLINE | ID: mdl-32976626

ABSTRACT

Alcohol consumption is correlated positively with risk for breast cancer in observational studies, but observational studies are subject to reverse causation and confounding. The association with epithelial ovarian cancer (EOC) is unclear. We performed both observational Cox regression and two-sample Mendelian randomization (MR) analyses using data from various European cohort studies (observational) and publicly available cancer consortia (MR). These estimates were compared to World Cancer Research Fund (WCRF) findings. In our observational analyses, the multivariable-adjusted hazard ratios (HR) for a one standard drink/day increase were 1.06 (95% confidence interval [CI]: 1.04, 1.08) for breast cancer and 1.00 (0.92, 1.08) for EOC, both of which were consistent with previous WCRF findings. MR ORs per genetically predicted one standard drink/day increase estimated via 34 SNPs using MR-PRESSO were 1.00 (0.93, 1.08) for breast cancer and 0.95 (0.85, 1.06) for EOC. Stratification by EOC subtype or estrogen receptor status in breast cancers made no meaningful difference to the results. For breast cancer, the CIs for the genetically derived estimates include the point-estimate from observational studies so are not inconsistent with a small increase in risk. Our data provide additional evidence that alcohol intake is unlikely to have anything other than a very small effect on risk of EOC.

Subject(s)

Alcohol Drinking/adverse effects , Breast Neoplasms/epidemiology , Carcinoma, Ovarian Epithelial/epidemiology , Ovarian Neoplasms/epidemiology , Causality , Cohort Studies , Female , Humans , Mendelian Randomization Analysis , Odds Ratio

6.

Combined analysis of keratinocyte cancers identifies novel genome-wide loci.

Liyanage, Upekha E; Law, Matthew H; Han, Xikun; An, Jiyuan; Ong, Jue-Sheng; Gharahkhani, Puya; Gordon, Scott; Neale, Rachel E; Olsen, Catherine M; MacGregor, Stuart; Whiteman, David C.

Hum Mol Genet ; 28(18): 3148-3160, 2019 09 15.

Article in English | MEDLINE | ID: mdl-31174203

ABSTRACT

The keratinocyte cancers (KC), basal cell carcinoma (BCC) and squamous cell carcinoma (SCC) are the most common cancers in fair-skinned people. KC treatment represents the second highest cancer healthcare expenditure in Australia. Increasing our understanding of the genetic architecture of KC may provide new avenues for prevention and treatment. We first conducted a series of genome-wide association studies (GWAS) of KC across three European ancestry datasets from Australia, Europe and USA, and used linkage disequilibrium (LD) Score regression (LDSC) to estimate their pairwise genetic correlations. We employed a multiple-trait approach to map genes across the combined set of KC GWAS (total N = 47 742 cases, 634 413 controls). We also performed meta-analyses of BCC and SCC separately to identify trait specific loci. We found substantial genetic correlations (generally 0.5-1) between BCC and SCC suggesting overlapping genetic risk variants. The multiple trait combined KC GWAS identified 63 independent genome-wide significant loci, 29 of which were novel. Individual separate meta-analyses of BCC and SCC identified an additional 13 novel loci not found in the combined KC analysis. Three new loci were implicated using gene-based tests. New loci included common variants in BRCA2 (distinct to known rare high penetrance cancer risk variants), and in CTLA4, a target of immunotherapy in melanoma. We found shared and trait specific genetic contributions to BCC and SCC. Considering both, we identified a total of 79 independent risk loci, 45 of which are novel.

Subject(s)

Carcinoma, Basal Cell/genetics , Carcinoma, Squamous Cell/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Keratinocytes/metabolism , Quantitative Trait Loci , Skin Neoplasms/genetics , Alleles , Carcinoma, Basal Cell/metabolism , Carcinoma, Basal Cell/pathology , Carcinoma, Squamous Cell/metabolism , Carcinoma, Squamous Cell/pathology , Case-Control Studies , Computational Biology/methods , Gene Expression Profiling , Humans , Keratinocytes/pathology , Molecular Sequence Annotation , Odds Ratio , Polymorphism, Single Nucleotide , Quantitative Trait, Heritable , Skin Neoplasms/metabolism , Skin Neoplasms/pathology

7.

Genome-wide association analysis of 95 549 individuals identifies novel loci and genes influencing optic disc morphology.

Han, Xikun; Qassim, Ayub; An, Jiyuan; Marshall, Henry; Zhou, Tiger; Ong, Jue-Sheng; Hassall, Mark M; Hysi, Pirro G; Foster, Paul J; Khaw, Peng T; Mackey, David A; Gharahkhani, Puya; Khawaja, Anthony P; Hewitt, Alex W; Craig, Jamie E; MacGregor, Stuart.

Hum Mol Genet ; 28(21): 3680-3690, 2019 11 01.

Article in English | MEDLINE | ID: mdl-31809533

ABSTRACT

Optic nerve head morphology is affected by several retinal diseases. We measured the vertical optic disc diameter (DD) of the UK Biobank (UKBB) cohort (N = 67 040) and performed the largest genome-wide association study (GWAS) of DD to date. We identified 81 loci (66 novel) for vertical DD. We then replicated the novel loci in International Glaucoma Genetic Consortium (IGGC, N = 22 504) and European Prospective Investigation into Cancer-Norfolk (N = 6005); in general the concordance in effect sizes was very high (correlation in effect size estimates 0.90): 44 of the 66 novel loci were significant at P < 0.05, with 19 remaining significant after Bonferroni correction. We identified another 26 novel loci in the meta-analysis of UKBB and IGGC data. Gene-based analyses identified an additional 57 genes. Human ocular tissue gene expression analysis showed that most of the identified genes are enriched in optic nerve head tissue. Some of the identified loci exhibited pleiotropic effects with vertical cup-to-disc ratio, intraocular pressure, glaucoma and myopia. These results can enhance our understanding of the genetics of optic disc morphology and shed light on the genetic findings for other ophthalmic disorders such as glaucoma and other optic nerve diseases.

Subject(s)

Genome-Wide Association Study , Glaucoma/genetics , Optic Disk/anatomy & histology , Adult , Aged , Databases, Factual , Female , Gene Expression , Glaucoma/metabolism , Humans , Male , Middle Aged , Optic Disk/metabolism , Polymorphism, Single Nucleotide , Prospective Studies

8.

Vitamin D and overall cancer risk and cancer mortality: a Mendelian randomization study.

Ong, Jue-Sheng; Gharahkhani, Puya; An, Jiyuan; Law, Matthew H; Whiteman, David C; Neale, Rachel E; MacGregor, Stuart.

Hum Mol Genet ; 27(24): 4315-4322, 2018 12 15.

Article in English | MEDLINE | ID: mdl-30508204

ABSTRACT

There is considerable debate regarding the role that 25-hydroxyvitamin D [25(OH)D] concentrations play in cancer risk or mortality, with earlier studies drawing mixed conclusions. Using data from the UK Biobank (UKB), we evaluate whether genetically predicted 25(OH)D concentrations are associated with overall cancer susceptibility and cancer mortality using five 25(OH)D genetic markers. Data comprised 438 870 white British UKB participants aged 37-73, including 46 155 cancer cases and 6998 cancer-specific deaths. Participants with keratinocyte cancers and/or benign tumors were excluded from the analysis. Odds ratios were calculated per 20 nmol/L increase in genetically predicted 25(OH)D for cancer risk and cancer mortality. For individual cancer risks, estimates were meta-analyzed with publicly available data using a fixed-effect inverse-variance-weighted model. We demonstrated that genetically low plasma 25(OH)D concentrations were not associated with increased cancer risk or cancer mortality. Stratification by sex or cancer types did not reveal any meaningful differences, albeit with wider confidence intervals. Fixed-effect meta-analysis of our individual cancer risk estimates with those derived from publicly available cancer consortia data and previous studies further reinforced our null Mendelian randomization findings on prostate, lung, colorectal and breast cancers with tight confidence intervals; for ovarian and pancreatic cancers, our estimates were less precise despite being not statistically significant. Taken altogether, our results provide no genetic evidence for an association between vitamin D and overall cancer outcomes, with tight confidence intervals to exclude all but very small effect sizes.

Subject(s)

Mendelian Randomization Analysis , Neoplasms/blood , Neoplasms/genetics , Vitamin D/blood , Adult , Aged , Female , Genetic Predisposition to Disease , Genotype , Humans , Male , Middle Aged , Neoplasms/mortality , Neoplasms/pathology , Polymorphism, Single Nucleotide , Risk Factors , Vitamin D/analogs & derivatives , White People
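
The fixed-effect inverse-variance-weighted model named in the abstract of record 8 pools per-study estimates by weighting each one by the inverse of its variance. The sketch below shows the standard textbook formula, not the authors' code; the function name `ivw_fixed_effect` and the input numbers are hypothetical placeholders, not results from the study.

```python
import math

def ivw_fixed_effect(betas, ses):
    """Fixed-effect inverse-variance-weighted pooling of log odds ratios.

    betas: per-study log odds ratios
    ses:   their standard errors
    Returns the pooled log odds ratio and its standard error.
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal in length")
    weights = [1.0 / se ** 2 for se in ses]  # w_i = 1 / se_i^2
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se

# Hypothetical inputs for illustration only (not taken from the study).
log_ors = [math.log(1.02), math.log(0.97)]
ses = [0.05, 0.08]
pooled, se = ivw_fixed_effect(log_ors, ses)
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled OR={math.exp(pooled):.3f} (95% CI {low:.3f}-{high:.3f})")
```

More precise studies (smaller standard errors) dominate the pooled estimate, which is why adding large consortium data can tighten the confidence intervals as the abstract describes.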

9.

Genetic heterogeneity in self-reported depressive symptoms identified through genetic analyses of the PHQ-9.

Thorp, Jackson G; Marees, Andries T; Ong, Jue-Sheng; An, Jiyuan; MacGregor, Stuart; Derks, Eske M.

Psychol Med ; 50(14): 2385-2396, 2020 10.

Article in English | MEDLINE | ID: mdl-31530331

ABSTRACT

BACKGROUND: Depression is a clinically heterogeneous disorder. Previous large-scale genetic studies of depression have explored genetic risk factors of depression case-control status or aggregated sums of depressive symptoms, ignoring possible clinical or genetic heterogeneity. METHODS: We analyse data from 148 752 subjects of white British ancestry in the UK Biobank who completed nine items of a self-rated measure of current depressive symptoms: the Patient Health Questionnaire (PHQ-9). Genome-Wide Association analyses were conducted for nine symptoms and two composite measures. LD Score Regression was used to calculate SNP-based heritability (h²SNP) and genetic correlations (rg) across symptoms and to investigate genetic correlations with 25 external phenotypes. Genomic structural equation modelling was used to test the genetic factor structure across the nine symptoms. RESULTS: We identified nine genome-wide significant genomic loci (8 novel), with no overlap in loci across symptoms. h²SNP ranged from 6% (concentration problems) to 9% (appetite changes). Genetic correlations ranged from 0.54 to 0.96 (all p < 1.39 × 10⁻³) with 30 of 36 correlations being significantly smaller than one. A two-factor model provided the best fit to the genetic covariance matrix, with factors representing 'psychological' and 'somatic' symptoms. The genetic correlations with external phenotypes showed large variation across the nine symptoms. CONCLUSIONS: Patterns of SNP associations and genetic correlations differ across the nine symptoms, suggesting that current depressive symptoms are genetically heterogeneous. Our study highlights the value of symptom-level analyses in understanding the genetic architecture of a psychiatric trait. Future studies should investigate whether genetic heterogeneity is recapitulated in clinical symptoms of major depression.

Subject(s)

Depression/genetics , Genetic Heterogeneity , Genetic Loci , Genetic Predisposition to Disease , Aged , Aged, 80 and over , Case-Control Studies , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Patient Health Questionnaire , Phenotype , Self Report , United Kingdom , White People/genetics

10.

Potential influence of socioeconomic status on genetic correlations between alcohol consumption measures and mental health.

Marees, Andries T; Smit, Dirk J A; Ong, Jue-Sheng; MacGregor, Stuart; An, Jiyuan; Denys, Damiaan; Vorspan, Florence; van den Brink, Wim; Derks, Eske M.

Psychol Med ; 50(3): 484-498, 2020 02.

Article in English | MEDLINE | ID: mdl-30874500

ABSTRACT

BACKGROUND: Frequency and quantity of alcohol consumption are metrics commonly used to measure alcohol consumption behaviors. Epidemiological studies indicate that these alcohol consumption measures are differentially associated with (mental) health outcomes and socioeconomic status (SES). The current study aims to elucidate to what extent genetic risk factors are shared between frequency and quantity of alcohol consumption, and how these alcohol consumption measures are genetically associated with four broad phenotypic categories: (i) SES; (ii) substance use disorders; (iii) other psychiatric disorders; and (iv) psychological/personality traits. METHODS: Genome-Wide Association analyses were conducted to test genetic associations with alcohol consumption frequency (N = 438 308) and alcohol consumption quantity (N = 307 098 regular alcohol drinkers) within UK Biobank. For the other phenotypes, we used genome-wide association studies summary statistics. Genetic correlations (rg) between the alcohol measures and other phenotypes were estimated using LD score regression. RESULTS: We found a substantial genetic correlation between the frequency and quantity of alcohol consumption (rg = 0.52). Nevertheless, both measures consistently showed opposite genetic correlations with SES traits, and many substance use, psychiatric, and psychological/personality traits. High alcohol consumption frequency was genetically associated with high SES and low risk of substance use disorders and other psychiatric disorders, whereas the opposite applies for high alcohol consumption quantity. CONCLUSIONS: Although the frequency and quantity of alcohol consumption show substantial genetic overlap, they consistently show opposite patterns of genetic associations with SES-related phenotypes. Future studies should carefully consider the potential influence of SES on the shared genetic etiology between alcohol and adverse (mental) health outcomes.

Subject(s)

Alcohol Drinking/genetics , Mental Health , Social Class , Adult , Aged , Alcoholism/genetics , Biological Specimen Banks , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Substance-Related Disorders/genetics , United Kingdom

11.

Using Mendelian randomization to evaluate the causal relationship between serum C-reactive protein levels and age-related macular degeneration.

Han, Xikun; Ong, Jue-Sheng; An, Jiyuan; Hewitt, Alex W; Gharahkhani, Puya; MacGregor, Stuart.

Eur J Epidemiol ; 35(2): 139-146, 2020 Feb.

Article in English | MEDLINE | ID: mdl-31900758

ABSTRACT

Serum C-reactive protein (CRP), an important inflammatory marker, has been associated with age-related macular degeneration (AMD) in observational studies; however, the findings are inconsistent. It remains unclear whether the association between circulating CRP levels and AMD is causal. We used two-sample Mendelian randomization (MR) to evaluate the potential causal relationship between serum CRP levels and AMD risk. We derived genetic instruments for serum CRP levels in 418,642 participants of European ancestry from UK Biobank, and then conducted a genome-wide association study for 12,711 advanced AMD cases and 14,590 controls of European descent from the International AMD Genomics Consortium. Genetic variants which predicted elevated serum CRP levels were associated with advanced AMD (odds ratio [OR] per standard deviation increase in serum CRP levels: 1.31, 95% confidence interval [CI]: 1.19-1.44, P = 5.2 × 10⁻⁸). The OR for the increase in advanced AMD risk when moving from low (< 3 mg/L) to high (> 3 mg/L) CRP levels is 1.29 (95% CI: 1.17-1.41). Our results were unchanged in sensitivity analyses using MR models which make different modelling assumptions. Our findings were broadly similar across the different forms of AMD (intermediate AMD, choroidal neovascularization, and geographic atrophy). We used multivariable MR to adjust for the effects of other potential AMD risk factors including smoking, body mass index, blood pressure and cholesterol; this did not alter our findings. Our study provides strong genetic evidence that higher circulating CRP levels lead to increases in risk for all forms of AMD. These findings highlight the potential utility for using circulating CRP as a biomarker in future trials aimed at modulating AMD risk via systemic therapies.

Subject(s)

C-Reactive Protein/genetics , Macular Degeneration/blood , Macular Degeneration/genetics , Mendelian Randomization Analysis , Aged , Aged, 80 and over , C-Reactive Protein/metabolism , Case-Control Studies , Female , Genome-Wide Association Study , Genotype , Humans , Macular Degeneration/epidemiology , Male , Middle Aged , Polymorphism, Single Nucleotide , Risk Factors

12.

A Two-Stage Mutual Information Based Bayesian Lasso Algorithm for Multi-Locus Genome-Wide Association Studies.

Guo, Hongping; Yu, Zuguo; An, Jiyuan; Han, Guosheng; Ma, Yuanlin; Tang, Runbin.

Entropy (Basel) ; 22(3), 2020 Mar 13.

Article in English | MEDLINE | ID: mdl-33286103

ABSTRACT

Genome-wide association study (GWAS) has turned out to be an essential technology for exploring the genetic mechanism of complex traits. To reduce the complexity of computation, it is well accepted to remove unrelated single nucleotide polymorphisms (SNPs) before GWAS, e.g., by using the iterative sure independence screening expectation-maximization Bayesian Lasso (ISIS EM-BLASSO) method. In this work, a modified version of ISIS EM-BLASSO is proposed, which reduces the number of SNPs by a screening methodology based on Pearson correlation and mutual information, then estimates the effects via EM-Bayesian Lasso (EM-BLASSO), and finally detects the true quantitative trait nucleotides (QTNs) through a likelihood ratio test. We call our method a two-stage mutual information based Bayesian Lasso (MBLASSO). Under three simulation scenarios, MBLASSO improves the statistical power and retains higher effect estimation accuracy compared with three other algorithms. Moreover, MBLASSO performs best on model fitting, the accuracy of detected associations is the highest, and 21 genes can only be detected by MBLASSO in Arabidopsis thaliana datasets.

13.

Effect of increased body mass index on risk of diagnosis or death from cancer.

Gharahkhani, Puya; Ong, Jue-Sheng; An, Jiyuan; Law, Matthew H; Whiteman, David C; Neale, Rachel E; MacGregor, Stuart.

Br J Cancer ; 120(5): 565-570, 2019 03.

Article in English | MEDLINE | ID: mdl-30733581

ABSTRACT

BACKGROUND: Whether body mass index (BMI) is causally associated with the risk of being diagnosed with or dying from any cancer remains unclear. Weight reduction has clinical importance for cancer control only if weight gain causes cancer development or death. We aimed to answer the question 'does genetically predicted BMI influence my risk of being diagnosed with or dying from any cancer'. METHODS: We used a Mendelian randomisation (MR) approach to estimate causal effect of BMI in 46,155 white-British participants aged between 40 and 69 years at recruitment (median age at follow-up 61 years) from the UK Biobank, who developed any type of cancer, among whom 6998 died from cancer. To derive MR instruments for BMI, we selected up to 390,628 cancer-free participants. RESULTS: For each standard deviation (4.78 units) increase in genetically predicted BMI, we estimated a causal odds ratio (COR) of 1.07 (1.02-1.12) and 1.28 (1.16-1.41) for overall cancer risk and mortality, respectively. The corresponding estimates were similar for males and females, and smokers and non-smokers. CONCLUSIONS: Higher genetically predicted BMI increases the risk of being diagnosed with or dying from any cancer. These data suggest that increased overall weight may causally increase overall cancer incidence and mortality among Europeans.

Subject(s)

Neoplasms/epidemiology , Neoplasms/mortality , Obesity/epidemiology , Adult , Aged , Body Mass Index , Female , Humans , Male , Mendelian Randomization Analysis , Middle Aged , Neoplasms/genetics , Obesity/genetics , Overweight/epidemiology , Overweight/genetics , United Kingdom , White People

14.

Height and overall cancer risk and mortality: evidence from a Mendelian randomisation study on 310,000 UK Biobank participants.

Ong, Jue-Sheng; An, Jiyuan; Law, Matthew H; Whiteman, David C; Neale, Rachel E; Gharahkhani, Puya; MacGregor, Stuart.

Br J Cancer ; 118(9): 1262-1267, 2018 05.

Article in English | MEDLINE | ID: mdl-29581483

ABSTRACT

BACKGROUND: Observational studies have shown that being taller is associated with greater cancer risk. However, the interpretation of such studies can be hampered by important issues such as confounding and reporting bias. METHODS: We used the UK Biobank resource to develop genetic predictors of height and applied these in a Mendelian randomisation framework to estimate the causal relationship between height and cancer. Up to 438,870 UK Biobank participants were considered in our analysis. We addressed two primary cancer outcomes, cancer incidence by age ~60 and cancer mortality by age ~60 (where age ~60 is the typical age of UK Biobank participants). RESULTS: We found that each genetically predicted 9 cm increase in height conferred an odds ratio of 1.10 (95% confidence interval 1.07-1.13) and 1.09 (1.02-1.16) for diagnosis of any cancer and death from any cancer, respectively. For both risk and mortality, the effect was larger in females than in males. CONCLUSIONS: Height increases the risk of being diagnosed with and dying from cancer. These findings from Mendelian randomisation analyses agree with observational studies and provide evidence that they were not likely to have been strongly affected by confounding or reporting bias.

Subject(s)

Biological Specimen Banks/statistics & numerical data , Body Height/physiology , Neoplasms/epidemiology , Case-Control Studies , Databases, Factual/statistics & numerical data , Female , Humans , Male , Mendelian Randomization Analysis , Middle Aged , Mortality , Neoplasms/mortality , Registries/statistics & numerical data , Risk Factors , United Kingdom/epidemiology

15.

Pre-B acute lymphoblastic leukaemia recurrent fusion, EP300-ZNF384, is associated with a distinct gene expression.

McClure, Barbara J; Heatley, Susan L; Kok, Chung H; Sadras, Teresa; An, Jiyuan; Hughes, Timothy P; Lock, Richard B; Yeung, David; Sutton, Rosemary; White, Deborah L.

Br J Cancer ; 118(7): 1000-1004, 2018 04.

Article in English | MEDLINE | ID: mdl-29531323

ABSTRACT

BACKGROUND: Zinc-finger protein 384 (ZNF384) fusions are an emerging subtype of precursor B-cell acute lymphoblastic leukaemia (pre-B-ALL) and here we further characterised their prevalence, survival outcomes and transcriptome. METHODS: Bone marrow mononuclear cells from 274 BCR-ABL1-negative pre-B-ALL patients were immunophenotyped and characterised at the transcriptome level. Transcriptomic data were analysed by principal component analysis and gene-set enrichment analysis to identify gene and pathway expression changes. RESULTS: We exclusively detect E1A-associated protein p300 (EP300)-ZNF384 in 5.7% of BCR-ABL1-negative adolescent/young adult (AYA)/adult pre-B-ALL patients. EP300-ZNF384 patients do not appear to be a high-risk subgroup. Transcriptomic analysis revealed that EP300-ZNF384 samples have a distinct gene expression profile that results in the up-regulation of Janus kinase/signal transducers and activators of transcription (JAK/STAT) and cell adhesion pathways and down-regulation of cell cycle and DNA repair pathways. CONCLUSIONS: Importantly, this report contributes to a better overview of the incidence of EP300-ZNF384 patients and shows that they have a distinct gene signature with concurrent up-regulation of the JAK-STAT pathway, reduced expression of B-cell regulators and reduced DNA repair capacity.

Subject(s)

E1A-Associated p300 Protein/genetics , Oncogene Proteins, Fusion/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/epidemiology , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Trans-Activators/genetics , Transcriptome , Adolescent , Adult , Child , Female , Gene Expression Profiling , Gene Expression Regulation, Leukemic , Gene Frequency , Genes, abl/genetics , Humans , Janus Kinases/metabolism , Male , Precursor Cell Lymphoblastic Leukemia-Lymphoma/mortality , Precursor Cell Lymphoblastic Leukemia-Lymphoma/pathology , Recurrence , STAT Transcription Factors/metabolism , Signal Transduction/genetics , Survival Analysis , Young Adult

16.

A computational analysis of the genetic and transcript diversity at the kallikrein locus.

Lai, John; An, Jiyuan; Srinivasan, Srilakshmi; Clements, Judith A; Batra, Jyotsna.

Biol Chem ; 397(12): 1307-1313, 2016 12 01.

Article in English | MEDLINE | ID: mdl-27289002

ABSTRACT

The kallikrein-related peptidase gene family (KLKs) comprises 15 genes located at 19q13.3-13.4. KLKs have chymotrypsin and/or trypsin-like activity, but the tissue/organ expression profile of each KLK varies considerably. Thus, the role of KLKs in human biology is also very diverse, and the deregulation of their function results in a wide range of diseases. Here, we have cataloged the transcript (variants and fusions) and genetic (single nucleotide polymorphisms, small insertions/deletions, copy number variations (CNVs), and short tandem repeats) diversity at the KLK locus, providing a data set for researchers to explore the mechanisms through which KLK function may be deregulated. We reveal that the KLK locus hosts 85 fusion transcripts and 80 variant transcripts. Interestingly, some fusion transcripts comprise up to 6 KLK genes. Our analysis of genetic variations of 2504 individuals from the 1000 Genomes Project indicated that the KLK locus is rich in genetic diversity, with some fusion transcripts harboring over 1000 single nucleotide variations. We also found evidence from the literature linking 2387 KLK genetic variants with many types of diseases. Finally, genotyping data from the 131 KLK genetic variants in the NCI-60 cancer cell lines are provided as a resource for the cancer and KLK field.

Subject(s)

Genetic Loci/genetics , Genetic Variation , Genomics , Kallikreins/genetics , Cluster Analysis , Humans , RNA, Messenger/genetics , RNA, Messenger/metabolism

17.

J-Circos: an interactive Circos plotter.

An, Jiyuan; Lai, John; Sajjanhar, Atul; Batra, Jyotsna; Wang, Chenwei; Nelson, Colleen C.

Bioinformatics ; 31(9): 1463-5, 2015 May 01.

Article in English | MEDLINE | ID: mdl-25540184

ABSTRACT

SUMMARY: Circos plots are graphical outputs that display three-dimensional chromosomal interactions and fusion transcripts. However, the Circos plot tool is not an interactive visualization tool, but rather a figure generator. For example, it does not enable data to be added dynamically, nor does it provide information for specific data points interactively. Recently, an R-based Circos tool (RCircos) has been developed to integrate Circos into R, but similarly, RCircos can only be used to generate plots. Thus, we have developed a Circos plot tool (J-Circos) that is an interactive visualization tool that can plot Circos figures, dynamically add data to the figure, and provide information for specific data points using mouse hover display and zoom in/out functions. J-Circos is written in Java so that it can be used on most operating systems (Windows, MacOS, Linux). Users can input data into J-Circos using flat data formats, as well as from the graphical user interface (GUI). J-Circos will enable biologists to better study more complex chromosomal interactions and fusion transcripts that are otherwise difficult to visualize from next-generation sequencing data. AVAILABILITY AND IMPLEMENTATION: J-Circos and its manual are freely available at http://www.australianprostatecentre.org/research/software/jcircos CONTACT: j.an@qut.edu.au SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Chromosomes , Computer Graphics , Gene Fusion , Software

18.

RNASeqBrowser: a genome browser for simultaneous visualization of raw strand specific RNAseq reads and UCSC genome browser custom tracks.

An, Jiyuan; Lai, John; Wood, David L A; Sajjanhar, Atul; Wang, Chenwei; Tevz, Gregor; Lehman, Melanie L; Nelson, Colleen C.

BMC Genomics ; 16: 145, 2015 Mar 01.

Article in English | MEDLINE | ID: mdl-25766521

ABSTRACT

BACKGROUND: Strand specific RNAseq data is now more common in RNAseq projects. Visualizing RNAseq data has become an important matter in the analysis of sequencing data. The most widely used visualization tool is the UCSC genome browser, which introduced the custom track concept that enabled researchers to simultaneously visualize gene expression at a particular locus from multiple experiments. The objective of our software tool is to provide a friendly interface for visualization of RNAseq datasets. RESULTS: This paper introduces a visualization tool (RNASeqBrowser) that incorporates and extends the functionality of the UCSC genome browser. For example, RNASeqBrowser simultaneously displays read coverage, SNPs, InDels and raw read tracks with other BED and wiggle tracks -- all being dynamically built from the BAM file. Paired reads are also connected in the browser to enable easier identification of novel exon/intron borders and chimaeric transcripts. Strand specific RNAseq data is also supported by RNASeqBrowser, which displays reads above (positive strand transcripts) or below (negative strand transcripts) a central line. Finally, RNASeqBrowser was designed for ease of use for users with few bioinformatic skills, and incorporates the features of many genome browsers into one platform. CONCLUSIONS: The features of RNASeqBrowser: (1) RNASeqBrowser integrates UCSC genome browser and NGS visualization tools such as IGV. It extends the functionality of the UCSC genome browser by adding several new types of tracks to show NGS data such as individual raw reads, SNPs and InDels. (2) RNASeqBrowser can dynamically generate RNA secondary structure. It is useful for identifying non-coding RNA such as miRNA. (3) Overlaying NGS wiggle data is helpful in displaying differential expression and is simple to implement in RNASeqBrowser. (4) NGS data accumulates a lot of raw reads. Thus, RNASeqBrowser collapses exact duplicate reads to reduce visualization space. Normal PCs can show many windows of NGS individual raw reads without much delay. (5) Multiple popup windows of individual raw reads provide users with more viewing space. This avoids existing approaches (such as IGV) which squeeze all raw reads into one window. This will be helpful for visualizing multiple datasets simultaneously. RNASeqBrowser and its manual are freely available at http://www.australianprostatecentre.org/research/software/rnaseqbrowser or http://sourceforge.net/projects/rnaseqbrowser/.
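The duplicate collapsing mentioned in point (4) can be illustrated with a minimal Python sketch. This is not RNASeqBrowser's own code, which the abstract does not describe. The read layout (chromosome, start, strand, sequence) and the function name `collapse_duplicates` are assumptions made for illustration only.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical read record: (chromosome, start, strand, sequence).
Read = Tuple[str, int, str, str]


def collapse_duplicates(reads: Iterable[Read]) -> List[Tuple[Read, int]]:
    """Collapse exact duplicate reads into (read, count) pairs.

    Two reads count as duplicates only if every field matches.
    First-seen order is preserved (Counter keeps insertion order).
    """
    counts = Counter(reads)
    return list(counts.items())


# Made-up reads, for illustration only.
reads = [
    ("chr1", 100, "+", "ACGT"),
    ("chr1", 100, "+", "ACGT"),
    ("chr1", 100, "-", "ACGT"),
    ("chr2", 50, "+", "GGCC"),
]
collapsed = collapse_duplicates(reads)
for read, count in collapsed:
    print(read, count)
```

Keeping the count alongside each unique read means a viewer could draw one row per distinct read while still reporting how many copies it stands for.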

Subject(s)

Databases, Genetic , Genome , Sequence Analysis, RNA/methods , Software , Computational Biology/methods , INDEL Mutation/genetics , Internet

19.

Fusion transcript loci share many genomic features with non-fusion loci.

Lai, John; An, Jiyuan; Seim, Inge; Walpole, Carina; Hoffman, Andrea; Moya, Leire; Srinivasan, Srilakshmi; Perry-Keene, Joanna L; Wang, Chenwei; Lehman, Melanie L; Nelson, Colleen C; Clements, Judith A; Batra, Jyotsna.

BMC Genomics ; 16: 1021, 2015 Dec 01.

Article in English | MEDLINE | ID: mdl-26626734

ABSTRACT

BACKGROUND: Fusion transcripts are found in many tissues and have the potential to create novel functional products. Here, we investigate the genomic sequences around fusion junctions to better understand the transcriptional mechanisms mediating fusion transcription/splicing. We analyzed data from prostate (cancer) cells as previous studies have shown extensively that these cells readily undergo fusion transcription. RESULTS: We used the FusionMap program to identify high-confidence fusion transcripts from RNAseq data. The RNAseq datasets were from our (N = 8) and other (N = 14) clinical prostate tumors with adjacent non-cancer cells, and from the LNCaP prostate cancer cell line that were mock-, androgen- (DHT), and anti-androgen- (bicalutamide, enzalutamide) treated. In total, 185 fusion transcripts were identified from all RNAseq datasets. The majority (76%) of these fusion transcripts were 'read-through chimeras' derived from adjacent genes in the genome. Characterization of sequences at fusion loci was carried out using a combination of the FusionMap program, custom Perl scripts, and the RNAfold program. Our computational analysis indicated that most fusion junctions (76%) use the consensus GT-AG intron donor-acceptor splice site, and most fusion transcripts (85%) maintained the open reading frame. We assessed whether parental genes of fusion transcripts have the potential to form complementary base pairing between parental genes which might bring them into physical proximity. Our computational analysis of sequences flanking fusion junctions at parental loci indicates that these loci have a similar propensity to hybridize as non-fusion loci. The abundance of repetitive sequences at fusion and non-fusion loci was also investigated given that SINE repeats are involved in aberrant gene transcription. We found few instances of repetitive sequences at both fusion and non-fusion junctions. Finally, RT-qPCR was performed on RNA from both clinical prostate tumors and adjacent non-cancer cells (N = 7), and LNCaP cells treated as above to validate the expression of seven fusion transcripts and their respective parental genes. We reveal that fusion transcript expression is similar to the expression of parental genes. CONCLUSIONS: Fusion transcripts maintain the open reading frame, and likely use the same transcriptional machinery as non-fusion transcripts as they share many genomic features at splice/fusion junctions.
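The GT-AG consensus check described above can be sketched in Python as follows. The study itself used FusionMap and custom Perl scripts, so this is only an illustration under stated assumptions: each junction is represented by the two bases immediately downstream of the 5' partner's junction (the donor) and the two bases immediately upstream of the 3' partner's junction (the acceptor). The function names and example dinucleotides are hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical junction record: (donor dinucleotide, acceptor dinucleotide).
Junction = Tuple[str, str]


def is_gt_ag(donor: str, acceptor: str) -> bool:
    """Return True if the flanking dinucleotides match the GT-AG consensus."""
    return donor.upper() == "GT" and acceptor.upper() == "AG"


def fraction_gt_ag(junctions: Iterable[Junction]) -> float:
    """Fraction of junctions using GT-AG (0.0 for an empty input)."""
    junctions = list(junctions)
    if not junctions:
        return 0.0
    hits = sum(is_gt_ag(donor, acceptor) for donor, acceptor in junctions)
    return hits / len(junctions)


# Made-up junctions, for illustration only.
examples = [("GT", "AG"), ("gt", "ag"), ("GC", "AG"), ("AT", "AC")]
frac = fraction_gt_ag(examples)
print(f"{frac:.0%} of example junctions use GT-AG")
```

Applying the same tally to both fusion and non-fusion junctions would give the kind of side-by-side comparison the abstract reports.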

Subject(s)

Gene Expression Regulation, Neoplastic , Prostatic Neoplasms/genetics , Quantitative Trait Loci , RNA Splicing , Transcription, Genetic , Androgens/pharmacology , Antineoplastic Agents, Hormonal/pharmacology , Computational Biology/methods , Conserved Sequence , Datasets as Topic , Gene Expression Regulation, Neoplastic/drug effects , High-Throughput Nucleotide Sequencing , Humans , Male , Nucleotide Motifs , RNA Splice Sites , Repetitive Sequences, Nucleic Acid

20.

miRDeep*: an integrated application tool for miRNA identification from RNA sequencing data.

An, Jiyuan; Lai, John; Lehman, Melanie L; Nelson, Colleen C.

Nucleic Acids Res ; 41(2): 727-37, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23221645

ABSTRACT

miRDeep and its variants are widely used to quantify known and novel microRNA (miRNA) from small RNA sequencing (RNAseq). This article describes miRDeep*, our integrated miRNA identification tool, which is modeled on miRDeep, but the precision of detecting novel miRNAs is improved by introducing new strategies to identify precursor miRNAs. miRDeep* has a user-friendly graphic interface and accepts raw data in FastQ and Sequence Alignment Map (SAM) or the binary equivalent (BAM) format. Known and novel miRNA expression levels, as measured by the number of reads, are displayed in an interface, which shows each RNAseq read relative to the pre-miRNA hairpin. The secondary pre-miRNA structure and read locations for each predicted miRNA are shown and kept in a separate figure file. Moreover, the target genes of known and novel miRNAs are predicted using the TargetScan algorithm, and the targets are ranked according to the confidence score. miRDeep* is an integrated standalone application where sequence alignment, pre-miRNA secondary structure calculation and graphical display are purely Java coded. This application tool can be executed using a normal personal computer with 1.5 GB of memory. Further, we show that miRDeep* outperformed existing miRNA prediction tools using our LNCaP and other small RNAseq datasets. miRDeep* is freely available online at http://www.australianprostatecentre.org/research/software/mirdeep-star.

Subject(s)

Gene Expression Profiling , High-Throughput Nucleotide Sequencing , MicroRNAs/chemistry , Sequence Analysis, RNA , Software , Cell Line, Tumor , Humans , Male , MicroRNAs/metabolism , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism
