The diagnostic yield of exome and genome sequencing remains low (8-70%) because knowledge of the genes that cause disease is incomplete. To improve this, we use RNA-seq data from 31,499 samples to predict which genes cause specific disease phenotypes, and develop GeneNetwork Assisted Diagnostic Optimization (GADO). We show that this unbiased method, which does not rely on specific knowledge of individual genes, is effective both in identifying previously unknown disease-gene associations and in flagging genes that have previously been incorrectly implicated in disease. GADO can be run by supplying HPO terms and a list of genes that contain candidate variants. Finally, applying GADO to a cohort of 61 patients for whom exome-sequencing analysis had not resulted in a genetic diagnosis yields likely causative genes for ten cases.
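The prioritization step described above, which takes HPO terms and a list of candidate genes, can be illustrated with a minimal sketch. It assumes that each HPO term already has a per-gene prediction z-score and that scores are combined across terms by summing and dividing by the square root of the number of terms. The abstract does not specify the combination rule, so this rule, the function name `rank_candidates`, the gene names and all scores are hypothetical.

```python
from math import sqrt


def rank_candidates(term_scores, hpo_terms, candidate_genes):
    """Rank candidate genes by a combined z-score over the patient's HPO terms.

    term_scores: dict mapping HPO term -> {gene: prediction z-score}.
    The combination rule (sum of z-scores / sqrt(n)) is an assumption.
    """
    ranked = []
    for gene in candidate_genes:
        # Collect the scores available for this gene across the supplied terms.
        zs = [term_scores[t][gene] for t in hpo_terms
              if gene in term_scores.get(t, {})]
        if not zs:
            continue  # no prediction for this gene; leave it unranked
        ranked.append((gene, sum(zs) / sqrt(len(zs))))
    # Highest combined score first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical per-term scores for three placeholder genes.
term_scores = {
    "HP:0001250": {"GENE_A": 3.0, "GENE_B": 0.5, "GENE_C": -1.0},
    "HP:0001263": {"GENE_A": 1.0, "GENE_B": 2.5},
}
ranking = rank_candidates(
    term_scores,
    hpo_terms=["HP:0001250", "HP:0001263"],
    candidate_genes=["GENE_A", "GENE_B", "GENE_C"],
)
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

Genes without a prediction for any supplied term are skipped rather than given a score of zero, so missing data does not masquerade as evidence against a gene.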

BACKGROUND: Atherosclerosis starts in childhood, but low-density lipoprotein cholesterol (LDL-C), a causal risk factor, is mostly studied and treated only after clinical events have occurred. Women are usually affected later in life than men and are underdiagnosed, undertreated, and understudied in cardiovascular trials and research. This study aims at a better understanding of the lifestyle and genetic factors that affect LDL-C in young women. METHODS: For every year of age, we randomly selected 8 women with LDL-C ≤1st percentile (≤50 mg/dL) and 8 women with LDL-C ≥99th percentile (≥186 mg/dL) from 28 000 female participants aged 25 to 40 years in a population-based cohort study. The resulting groups include 119 and 121 women, respectively, with an average age of 33 years. A gene-sequencing panel was used to assess established monogenic and polygenic origins of these phenotypes. Information on lifestyle was extracted from questionnaires. A healthy lifestyle score was allocated based on a recently developed algorithm. RESULTS: Of the women with LDL-C ≤1st percentile, 19 (15.7%) carried mutations that cause monogenic hypocholesterolemia and 60 (49.6%) were genetically predisposed to low LDL-C on the basis of an extremely low weighted genetic risk score. In comparison with control groups, a healthier lifestyle was not associated with low LDL-C in women without genetic predispositions. Among women with LDL-C ≥99th percentile, 20 women (16.8%) carried mutations that cause familial hypercholesterolemia, whereas 25 (21%) were predisposed to high LDL-C on the basis of a high weighted genetic risk score. The women in whom no genetic origin for hypercholesterolemia could be identified had a significantly less favorable lifestyle than controls.
CONCLUSIONS: This study highlights the need for early assessment of the cardiovascular risk profile in apparently healthy young women to identify those with LDL-C ≥99th percentile for their age: first, because, in this study, 17% of the cases were molecularly diagnosed with familial hypercholesterolemia, which needs further attention; second, because our data indicate that an unfavorable lifestyle is significantly associated with severe hypercholesterolemia in genetically unaffected women, which may also need further attention.

BACKGROUND: The majority of coeliac disease (CD) patients are not properly diagnosed and therefore remain untreated, leading to a greater risk of developing CD-associated complications. The major genetic risk heterodimers, HLA-DQ2 and DQ8, are already used clinically to help exclude disease. However, approximately 40% of the population carry these alleles and the majority never develop CD. OBJECTIVE: We explored whether CD risk prediction can be improved by adding non-HLA susceptibility variants to common HLA testing. DESIGN: We developed an average weighted genetic risk score with 10, 26 and 57 single nucleotide polymorphisms (SNPs) in 2675 cases and 2815 controls and assessed the improvement in risk prediction provided by the non-HLA SNPs. Moreover, we assessed the transferability of the genetic risk model with 26 non-HLA variants to a nested case-control population (n=1709) and a prospective cohort (n=1245) and then tested how well this model predicted CD outcome for 985 independent individuals. RESULTS: Adding 57 non-HLA variants to HLA testing showed a statistically significant improvement compared to scores from models based on HLA only, HLA plus 10 SNPs and HLA plus 26 SNPs. With 57 non-HLA variants, the area under the receiver operating characteristic curve reached 0.854 compared to 0.823 for HLA only, and 11.1% of individuals were reclassified to a more accurate risk group. We show that the risk model with HLA plus 26 SNPs is useful in independent populations. CONCLUSIONS: Predicting risk with 57 additional non-HLA variants improved the identification of potential CD patients. This demonstrates a possible role for combined HLA and non-HLA genetic testing in diagnostic work for CD.
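The two quantities this abstract relies on, an average weighted genetic risk score and the area under the receiver operating characteristic curve, can be sketched as follows. The sketch assumes that each SNP weight is a per-allele effect size such as a log odds ratio and that the score is the weighted sum of risk-allele counts divided by the number of SNPs. The abstract does not give the exact weighting, so the SNP names, weights and genotypes below are hypothetical.

```python
def weighted_grs(allele_counts, weights):
    """Average weighted genetic risk score for one individual.

    allele_counts: {snp: number of risk alleles (0, 1 or 2)}
    weights: {snp: per-allele weight, assumed here to be a log odds ratio}
    """
    return sum(weights[s] * allele_counts[s] for s in weights) / len(weights)


def auc(case_scores, control_scores):
    """AUC as the probability that a randomly chosen case scores higher
    than a randomly chosen control; ties count as one half."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical weights and genotypes for three placeholder SNPs.
weights = {"snp1": 0.4, "snp2": 0.2, "snp3": 0.1}
cases = [
    {"snp1": 2, "snp2": 1, "snp3": 1},
    {"snp1": 1, "snp2": 2, "snp3": 0},
]
controls = [
    {"snp1": 0, "snp2": 1, "snp3": 0},
    {"snp1": 2, "snp2": 0, "snp3": 1},
]

case_scores = [weighted_grs(g, weights) for g in cases]
control_scores = [weighted_grs(g, weights) for g in controls]
auc_value = auc(case_scores, control_scores)
print(f"AUC = {auc_value:.2f}")
```

The pairwise definition of the AUC is quadratic in sample size. It is used here because it makes the interpretation explicit; for cohorts the size of those in the abstract, a rank-based implementation would be the practical choice.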

