Results 1 - 20 of 62
1.
Cell ; 161(3): 647-660, 2015 Apr 23.
Article in English | MEDLINE | ID: mdl-25910212

ABSTRACT

How disease-associated mutations impair protein activities in the context of biological networks remains mostly undetermined. Although a few renowned alleles are well characterized, functional information is missing for over 100,000 disease-associated variants. Here we functionally profile several thousand missense mutations across a spectrum of Mendelian disorders using various interaction assays. The majority of disease-associated alleles exhibit wild-type chaperone binding profiles, suggesting they preserve protein folding or stability. While common variants from healthy individuals rarely affect interactions, two-thirds of disease-associated alleles perturb protein-protein interactions, with half corresponding to "edgetic" alleles affecting only a subset of interactions while leaving most other interactions unperturbed. With transcription factors, many alleles that leave protein-protein interactions intact affect DNA binding. Different mutations in the same gene leading to different interaction profiles often result in distinct disease phenotypes. Thus disease-associated alleles that perturb distinct protein activities rather than grossly affecting folding and stability are relatively widespread.


Subject(s)
Disease/genetics , Mutation, Missense , Protein Interaction Maps , Proteins/genetics , Proteins/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Genome-Wide Association Study , Humans , Open Reading Frames , Protein Folding , Protein Stability
2.
Am J Hum Genet ; 109(1): 33-49, 2022 01 06.
Article in English | MEDLINE | ID: mdl-34951958

ABSTRACT

The identification of genes that evolve under recessive natural selection is a long-standing goal of population genetics research that has important applications to the discovery of genes associated with disease. We found that commonly used methods to evaluate selective constraint at the gene level are highly sensitive to genes under heterozygous selection but ubiquitously fail to detect recessively evolving genes. Additionally, more sophisticated likelihood-based methods designed to detect recessivity similarly lack power for a human gene of realistic length from current population sample sizes. However, extensive simulations suggested that recessive genes may be detectable in aggregate. Here, we offer a method informed by population genetics simulations designed to detect recessive purifying selection in gene sets. Applying this to empirical gene sets produced significant enrichments for strong recessive selection in genes previously inferred to be under recessive selection in a consanguineous cohort and in genes involved in autosomal recessive monogenic disorders.


Subject(s)
Gene Frequency , Genes, Recessive , Genetics, Population , Selection, Genetic , Algorithms , Alleles , Genes, Dominant , Genetic Predisposition to Disease , Genetic Variation , Genetics, Population/methods , Genomics/methods , Genotype , Humans , Inheritance Patterns , Likelihood Functions , Models, Genetic , Mutation , United Kingdom
3.
PLoS Genet ; 18(11): e1010367, 2022 11.
Article in English | MEDLINE | ID: mdl-36327219

ABSTRACT

Host genetics is a key determinant of COVID-19 outcomes. Previously, the COVID-19 Host Genetics Initiative genome-wide association study used common variants to identify multiple loci associated with COVID-19 outcomes. However, variants with the largest impact on COVID-19 outcomes are expected to be rare in the population. Hence, studying rare variants may provide additional insights into disease susceptibility and pathogenesis, thereby informing therapeutics development. Here, we combined whole-exome and whole-genome sequencing from 21 cohorts across 12 countries and performed rare variant exome-wide burden analyses for COVID-19 outcomes. In an analysis of 5,085 severe disease cases and 571,737 controls, we observed that carrying a rare deleterious variant in the SARS-CoV-2 sensor toll-like receptor TLR7 (on chromosome X) was associated with a 5.3-fold increase in severe disease (95% CI: 2.75-10.05, p = 5.41 × 10⁻⁷). This association was consistent across sexes. These results further support TLR7 as a genetic determinant of severe disease and suggest that larger studies on rare variants influencing COVID-19 outcomes could provide additional insights.


Subject(s)
COVID-19 , Exome , Humans , Exome/genetics , Genome-Wide Association Study , COVID-19/genetics , Genetic Predisposition to Disease , Toll-Like Receptor 7/genetics , SARS-CoV-2/genetics
4.
Lancet ; 401(10372): 215-225, 2023 Jan 21.
Article in English | MEDLINE | ID: mdl-36563696

ABSTRACT

BACKGROUND: Binary diagnosis of coronary artery disease does not preserve the complexity of disease or quantify its severity or its associated risk with death; hence, a quantitative marker of coronary artery disease is warranted. We evaluated a quantitative marker of coronary artery disease derived from probabilities of a machine learning model. METHODS: In this cohort study, we developed and validated a coronary artery disease-predictive machine learning model using 95 935 electronic health records and assessed its probabilities as in-silico scores for coronary artery disease (ISCAD; range 0 [lowest probability] to 1 [highest probability]) in participants in two longitudinal biobank cohorts. We measured the association of ISCAD with clinical outcomes-namely, coronary artery stenosis, obstructive coronary artery disease, multivessel coronary artery disease, all-cause death, and coronary artery disease sequelae. FINDINGS: Among 95 935 participants, 35 749 were from the BioMe Biobank (median age 61 years [IQR 18]; 14 599 [41%] were male and 21 150 [59%] were female; 5130 [14%] were with diagnosed coronary artery disease) and 60 186 were from the UK Biobank (median age 62 [15] years; 25 031 [42%] male and 35 155 [58%] female; 8128 [14%] with diagnosed coronary artery disease). The model predicted coronary artery disease with an area under the receiver operating characteristic curve of 0·95 (95% CI 0·94-0·95; sensitivity of 0·94 [0·94-0·95] and specificity of 0·82 [0·81-0·83]) and 0·93 (0·92-0·93; sensitivity of 0·90 [0·89-0·90] and specificity of 0·88 [0·87-0·88]) in the BioMe validation and holdout sets, respectively, and 0·91 (0·91-0·91; sensitivity of 0·84 [0·83-0·84] and specificity of 0·83 [0·82-0·83]) in the UK Biobank external test set. ISCAD captured coronary artery disease risk from known risk factors, pooled cohort equations, and polygenic risk scores. 
Coronary artery stenosis increased quantitatively with ascending ISCAD quartiles (increase per quartile of 12 percentage points), including risk of obstructive coronary artery disease, multivessel coronary artery disease, and stenosis of major coronary arteries. Hazard ratios (HRs) and prevalence of all-cause death increased stepwise over ISCAD deciles (decile 1: HR 1·0 [95% CI 1·0-1·0], 0·2% prevalence; decile 6: 11 [3·9-31], 3·1% prevalence; and decile 10: 56 [20-158], 11% prevalence). A similar trend was observed for recurrent myocardial infarction. 12 (46%) undiagnosed individuals with high ISCAD (≥0·9) had clinical evidence of coronary artery disease according to the 2014 American College of Cardiology/American Heart Association Task Force guidelines. INTERPRETATION: Electronic health record-based machine learning was used to generate an in-silico marker for coronary artery disease that can non-invasively quantify atherosclerosis and risk of death on a continuous spectrum, and identify underdiagnosed individuals. FUNDING: National Institutes of Health.


Subject(s)
Coronary Artery Disease , Coronary Stenosis , Humans , Male , Female , Middle Aged , Coronary Artery Disease/diagnosis , Coronary Artery Disease/epidemiology , Cohort Studies , Predictive Value of Tests , Coronary Stenosis/diagnosis , Risk Factors , Machine Learning , Coronary Angiography
5.
PLoS Genet ; 17(1): e1009337, 2021 01.
Article in English | MEDLINE | ID: mdl-33493176

ABSTRACT

Understanding the relationship between natural selection and phenotypic variation has been a long-standing challenge in human population genetics. With the emergence of biobank-scale datasets, along with new statistical metrics to approximate strength of purifying selection at the variant level, it is now possible to correlate a proxy of individual relative fitness with a range of medical phenotypes. We calculated a per-individual deleterious load score by summing the total number of derived alleles per individual after incorporating a weight that approximates strength of purifying selection. We assessed four methods for the weight, including GERP, phyloP, CADD, and fitcons. By quantitatively tracking each of these scores with the site frequency spectrum, we identified phyloP as the most appropriate weight. The phyloP-weighted load score was then calculated across 15,129,142 variants in 335,161 individuals from the UK Biobank and tested for association on 1,380 medical phenotypes. After accounting for multiple test correction, we observed a strong association of the load score amongst coding sites only on 27 traits including body mass, adiposity and metabolic rate. We further observed that the association signals were driven by common variants (derived allele frequency > 5%) with high phyloP score (phyloP > 2). Finally, through permutation analyses, we showed that the load score amongst coding sites had an excess of nominally significant associations on many medical phenotypes. These results suggest a broad impact of deleterious load on medical phenotypes and highlight the deleterious load score as a tool to disentangle the complex relationship between natural selection and medical phenotypes.


Subject(s)
Evolution, Molecular , Genetic Fitness/genetics , Genetics, Population , Selection, Genetic/genetics , Alleles , Biological Specimen Banks , Body Mass Index , Female , Gene Frequency , Genetic Association Studies , Genetic Predisposition to Disease , Genetic Variation/genetics , Humans , Male , United Kingdom
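
The per-individual load score described in record 5 (derived-allele counts summed after applying a per-variant weight) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the weight is applied as a simple product with the allele count, and the function name `load_score` and all input values are hypothetical.

```python
from typing import Sequence


def load_score(derived_allele_counts: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of derived-allele counts for one individual.

    derived_allele_counts: 0, 1 or 2 derived alleles carried at each variant.
    weights: one conservation weight per variant (e.g. a phyloP-like score).
    Assumption: the weight multiplies the allele count; the abstract does not
    spell out the exact weighting scheme.
    """
    if len(derived_allele_counts) != len(weights):
        raise ValueError("one weight is required per variant")
    return sum(count * weight for count, weight in zip(derived_allele_counts, weights))


# Hypothetical individual genotyped at four variants.
counts = [0, 1, 2, 1]
weights = [0.5, 2.0, 3.0, 1.0]  # made-up weights, for illustration only
score = load_score(counts, weights)
print(score)
```

With these made-up inputs the score is 0·0.5 + 1·2.0 + 2·3.0 + 1·1.0; a real analysis would compute one such score per individual before testing association with phenotypes.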
6.
J Wound Care ; 31(4): 340-347, 2022 Apr 02.
Article in English | MEDLINE | ID: mdl-35404693

ABSTRACT

OBJECTIVE: This study aimed to explore the efficacy of the IV3000 semi-occlusive, transparent adhesive film dressing in the non-surgical management of simple as well as more complex fingertip injuries. METHOD: In this qualitative study, patients with fingertip injuries were prospectively recruited and treated conservatively with the dressing between 2015 and 2017. Inclusion criteria included any fingertip injury with tissue loss and patient consent for non-surgical treatment consistent with the study protocol. Exclusion criteria included injuries needing surgical intervention for tendon injury or exposure, joint dislocations, distal phalangeal fractures requiring fixation, bone exposure, isolated nail bed lacerations and any patients eligible for surgical repair who did not wish to be managed conservatively. RESULTS: A total of 64 patients took part in the study. The patients treated with the dressing were asked to rate functional outcome, of whom 40 (62.5%) patients reported the outcome as 'excellent', 19 (29.7%) as 'satisfactory', five (7.8%) as 'indifferent' and none (0%) as 'unsatisfactory'. A reduced pulp volume at completion of healing was felt by 21 (32.8%) patients, but all patients were 'satisfied' with the aesthetic appearance of their fingertips at final clinical review. Average healing time was 4.5 weeks across the group, with the average time for return to work being just under one week. We estimate a 60% reduction in cost with the conservative versus the surgical management option. CONCLUSION: This study showed that, for participants, the IV3000 dressing was an affordable and effective option for the conservative treatment of simple fingertip injuries and in the management of more complex fingertip injuries.


Subject(s)
Finger Injuries , Occlusive Dressings , Bandages , Costs and Cost Analysis , Finger Injuries/therapy , Humans , Wound Healing
7.
JAMA ; 327(4): 350-359, 2022 01 25.
Article in English | MEDLINE | ID: mdl-35076666

ABSTRACT

Importance: Population-based assessment of disease risk associated with gene variants informs clinical decisions and risk stratification approaches. Objective: To evaluate the population-based disease risk of clinical variants in known disease predisposition genes. Design, Setting, and Participants: This cohort study included 72 434 individuals with 37 780 clinical variants who were enrolled in the BioMe Biobank from 2007 onwards with follow-up until December 2020 and the UK Biobank from 2006 to 2010 with follow-up until June 2020. Participants had linked exome and electronic health record data, were older than 20 years, and were of diverse ancestral backgrounds. Exposures: Variants previously reported as pathogenic or predicted to cause a loss of protein function by bioinformatic algorithms (pathogenic/loss-of-function variants). Main Outcomes and Measures: The primary outcome was the disease risk associated with clinical variants. The risk difference (RD) between the prevalence of disease in individuals with a variant allele (penetrance) vs in individuals with a normal allele was measured. Results: Among 72 434 study participants, 43 395 were from the UK Biobank (mean [SD] age, 57 [8.0] years; 24 065 [55%] women; 2948 [7%] non-European) and 29 039 were from the BioMe Biobank (mean [SD] age, 56 [16] years; 17 355 [60%] women; 19 663 [68%] non-European). Of 5360 pathogenic/loss-of-function variants, 4795 (89%) were associated with an RD less than or equal to 0.05. Mean penetrance was 6.9% (95% CI, 6.0%-7.8%) for pathogenic variants and 0.85% (95% CI, 0.76%-0.95%) for benign variants reported in ClinVar (difference, 6.0 [95% CI, 5.6-6.4] percentage points), with a median of 0% for both groups due to large numbers of nonpenetrant variants. 
Penetrance of pathogenic/loss-of-function variants for late-onset diseases was modified by age: mean penetrance was 10.3% (95% CI, 9.0%-11.6%) in individuals 70 years or older and 8.5% (95% CI, 7.9%-9.1%) in individuals 20 years or older (difference, 1.8 [95% CI, 0.40-3.3] percentage points). Penetrance of pathogenic/loss-of-function variants was heterogeneous even in known disease predisposition genes, including BRCA1 (mean [range], 38% [0%-100%]), BRCA2 (mean [range], 38% [0%-100%]), and PALB2 (mean [range], 26% [0%-100%]). Conclusions and Relevance: In 2 large biobank cohorts, the estimated penetrance of pathogenic/loss-of-function variants was variable but generally low. Further research of population-based penetrance is needed to refine variant interpretation and clinical evaluation of individuals with these variant alleles.


Subject(s)
Genetic Predisposition to Disease , Genetic Variation , Loss of Function Mutation , Penetrance , Aged , Biological Specimen Banks , Cohort Studies , Female , Humans , Male , Mutation , United Kingdom
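
The risk difference (RD) defined in record 7, the prevalence of disease among carriers of a variant allele (penetrance) minus the prevalence among non-carriers, reduces to simple arithmetic. The sketch below assumes plain case counts as input; the function names and the numbers are hypothetical and are not taken from the study.

```python
def prevalence(cases: int, total: int) -> float:
    """Fraction of a group that has the disease."""
    if total <= 0:
        raise ValueError("group size must be positive")
    return cases / total


def risk_difference(carrier_cases: int, carriers: int,
                    noncarrier_cases: int, noncarriers: int) -> float:
    """Penetrance in carriers minus disease prevalence in non-carriers."""
    penetrance = prevalence(carrier_cases, carriers)
    baseline = prevalence(noncarrier_cases, noncarriers)
    return penetrance - baseline


# Hypothetical counts: 7 of 100 carriers and 10 of 1000 non-carriers affected.
rd = risk_difference(7, 100, 10, 1000)
print(f"RD = {rd:.3f}")  # roughly 0.06, i.e. about 6 percentage points
```

An RD at or below 0.05, the threshold quoted in the abstract, would correspond to a carrier group whose disease prevalence exceeds the baseline by no more than 5 percentage points.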
8.
Annu Rev Genomics Hum Genet ; 19: 289-301, 2018 08 31.
Article in English | MEDLINE | ID: mdl-29641912

ABSTRACT

While sequence-based genetic tests have long been available for specific loci, especially for Mendelian disease, the rapidly falling costs of genome-wide genotyping arrays, whole-exome sequencing, and whole-genome sequencing are moving us toward a future where full genomic information might inform the prognosis and treatment of a variety of diseases, including complex disease. Similarly, the availability of large populations with full genomic information has enabled new insights about the etiology and genetic architecture of complex disease. Insights from the latest generation of genomic studies suggest that our categorization of diseases as complex may conceal a wide spectrum of genetic architectures and causal mechanisms that ranges from Mendelian forms of complex disease to complex regulatory structures underlying Mendelian disease. Here, we review these insights, along with advances in the prediction of disease risk and outcomes from full genomic information.


Subject(s)
Genetic Diseases, Inborn/genetics , Genetic Diseases, Inborn/complications , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Exome Sequencing
9.
Nature ; 524(7564): 225-9, 2015 Aug 13.
Article in English | MEDLINE | ID: mdl-26123021

ABSTRACT

Patterns of amino acid conservation have served as a tool for understanding protein evolution. The same principles have also found broad application in human genomics, driven by the need to interpret the pathogenic potential of variants in patients. Here we performed a systematic comparative genomics analysis of human disease-causing missense variants. We found that an appreciable fraction of disease-causing alleles are fixed in the genomes of other species, suggesting a role for genomic context. We developed a model of genetic interactions that predicts most of these to be simple pairwise compensations. Functional testing of this model on two known human disease genes revealed discrete cis amino acid residues that, although benign on their own, could rescue the human mutations in vivo. This approach was also applied to ab initio gene discovery to support the identification of a de novo disease driver in BTG2 that is subject to protective cis-modification in more than 50 species. Finally, on the basis of our data and models, we developed a computational tool to predict candidate residues subject to compensation. Taken together, our data highlight the importance of cis-genomic context as a contributor to protein evolution; they provide an insight into the complexity of allele effect on phenotype; and they are likely to assist methods for predicting allele pathogenicity.


Subject(s)
Disease/genetics , Genomics , Mutation, Missense/genetics , Suppression, Genetic/genetics , Adaptor Proteins, Signal Transducing/genetics , Alleles , Animals , Evolution, Molecular , Genome, Human/genetics , Humans , Immediate-Early Proteins/genetics , Microcephaly/genetics , Microtubule-Associated Proteins , Phenotype , Proteins/genetics , Sequence Alignment , Tumor Suppressor Proteins/genetics
10.
Surgeon ; 19(6): e338-e343, 2021 Dec.
Article in English | MEDLINE | ID: mdl-32994124

ABSTRACT

AIMS: Under the Ionising Radiation Medical Exposure Regulations, hospitals using fluoroscopy and image intensifiers should monitor doses from exposures using ionising radiation. There is a need for national diagnostic reference levels to advise Orthopaedic and Plastic surgeons on safe screening times and radiation doses for patients having upper limb surgical procedures. METHODS: Retrospective study of all patients who underwent upper limb surgical procedures requiring intra-operative mini C-arm image intensifier use at our hospital between 2013 and 2019. This included results from three machines in different rooms. Procedures were classified as closed and open procedures. RESULTS: Information on a total of 2910 procedures over 6 years (June 2013 to June 2019) was obtained. 133 procedures with incomplete data and 4 cases of lower extremities were excluded. 1719 closed procedures had a median dose area product of 0.48 cGy·cm² and median screening time of 7 s, compared to 1054 open procedures, with a median dose area product of 1.88 cGy·cm² and median screening time of 28 s. National diagnostic reference levels are set at the third quartile and indicate the difference between good and poor practice. For diagnostic reference levels, we suggest a dose area product of 0.82 cGy·cm² and a screening time of 11 s for closed procedures and a dose area product of 3.07 cGy·cm² and screening time of 40 s for open procedures. Public Health England state that national diagnostic reference levels should be derived from multiple patients, radiology rooms and hospitals. Our data meets the first two criteria and is an initial step in establishing national diagnostic reference levels for upper limb mini C-arm use. CONCLUSIONS: This large audit reports results, which, with further work across multiple hospital sites, should lead to establishing national diagnostic reference levels for mini C-arm fluoroscopy for upper limb Orthopaedic procedures.


Subject(s)
Diagnostic Reference Levels , Radiation Exposure , Fluoroscopy , Humans , Retrospective Studies , Upper Extremity/surgery
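
Record 10 states that diagnostic reference levels are set at the third quartile of the observed values. A minimal sketch of that calculation is given below, using Python's standard `statistics.quantiles`. The choice of the "inclusive" interpolation method is an assumption (the abstract does not say how quartiles were computed), and the screening times are invented for illustration.

```python
import statistics
from typing import Sequence


def third_quartile(values: Sequence[float]) -> float:
    """Third quartile (75th percentile) of the observed values.

    Assumption: the 'inclusive' method, which treats the data as the whole
    population of interest; other quartile conventions give slightly
    different results on small samples.
    """
    _q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3


# Hypothetical screening times in seconds (not data from the audit).
screening_times_s = [3.0, 5.0, 7.0, 9.0, 12.0]
print(third_quartile(screening_times_s))
```

The same function could be applied separately to dose area product and screening time, and separately to closed and open procedures, to propose one reference level per category.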
11.
Med Sci Monit ; 26: e923331, 2020 Apr 07.
Article in English | MEDLINE | ID: mdl-32255771

ABSTRACT

BACKGROUND: Osteoarthritis (OA) is a common disorder in the elderly. OA influences the daily life of patients and has become a worldwide health problem. It is still unclear whether the pathogenesis mechanism is different between males and females. This study investigated the differentially expressed genes (DEGs) and explored the different signaling pathways of OA between males and females. MATERIAL AND METHODS: Data sets of GSE55457, GSE55584, and GSE12021 were retrieved from Gene Expression Omnibus to conduct DEGs analysis. Enrichment analysis of Kyoto Encyclopedia of Genes and Genomes pathway and Gene Ontology term was performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) bioinformatics tool. The protein interaction network was constructed in Cytoscape 3.7.2. qRT-PCR was then performed to validate the expression of hub genes in OA patients and healthy people. RESULTS: In total, 4 co-upregulated and 10 co-downregulated genes were identified. We found that enriched pathways were different between males and females. BCL2L1, EEF1A1, EEF2, HNRNPD, and PABPN1 were considered as hub genes in OA pathogenesis in males, while EEF2, EEF1A1, RPL37A, FN1 were considered as hub genes in OA pathogenesis in females. Consistent with the bioinformatics analysis, the qRT-PCR analysis also showed that the gene expression of BCL2L1, HNRNPD, and PABPN1 was significantly lower in male OA patients. In contrast, EEF2, EEF1A1, and RPL37A were significantly lower in female OA patients. CONCLUSIONS: The DEGs identified may be involved in different OA disease progression mechanisms between males and females, and they are considered as treatment targets or prognosis markers for males and females. The pathogenesis mechanism is sex-dependent.


Subject(s)
Computational Biology/methods , Osteoarthritis/genetics , Osteoarthritis/pathology , Aged , Aged, 80 and over , Databases, Genetic , Eukaryotic Initiation Factor-2/genetics , Female , Gene Expression Profiling/methods , Gene Ontology , Gene Regulatory Networks/genetics , Heterogeneous Nuclear Ribonucleoprotein D0/genetics , Humans , Male , Middle Aged , Osteoarthritis/metabolism , Peptide Elongation Factor 1/genetics , Poly(A)-Binding Protein I/genetics , Protein Interaction Maps/genetics , Sex Characteristics , Signal Transduction , Software , Transcriptome , bcl-X Protein/genetics
12.
PLoS Med ; 16(1): e1002725, 2019 01.
Article in English | MEDLINE | ID: mdl-30645594

ABSTRACT

BACKGROUND: Studies have shown strong positive associations between serum urate (SU) levels and chronic kidney disease (CKD) risk; however, whether the relation is causal remains uncertain. We evaluate whether genetic data are consistent with a causal impact of SU level on the risk of CKD and estimated glomerular filtration rate (eGFR). METHODS AND FINDINGS: We used Mendelian randomization (MR) methods to evaluate the presence of a causal effect. We used aggregated genome-wide association data (N = 110,347 for SU, N = 69,374 for gout, N = 133,413 for eGFR, N = 117,165 for CKD), electronic-medical-record-linked UK Biobank data (N = 335,212), and population-based cohorts (N = 13,425), all in individuals of European ancestry, for SU levels and CKD. Our MR analysis showed that SU has a causal effect on neither eGFR level nor CKD risk across all MR analyses (all P > 0.05). These null associations contrasted with our epidemiological association findings from the 4 population-based cohorts (change in eGFR level per 1-mg/dl [59.48 µmol/l] increase in SU: -1.99 ml/min/1.73 m²; 95% CI -2.86 to -1.11; P = 8.08 × 10⁻⁶; odds ratio [OR] for CKD: 1.48; 95% CI 1.32 to 1.65; P = 1.52 × 10⁻¹¹). In contrast, the same MR approaches showed that SU has a causal effect on the risk of gout (OR estimates ranging from 3.41 to 6.04 per 1-mg/dl increase in SU, all P < 10⁻³), which served as a positive control of our approach. Overall, our MR analysis had >99% power to detect a causal effect of SU level on the risk of CKD of the same magnitude as the observed epidemiological association between SU and CKD. Limitations of this study include the lifelong effect of a genetic perturbation not being the same as an acute perturbation, the inability to study non-European populations, and some sample overlap between the datasets used in the study. 
CONCLUSIONS: Evidence from our series of causal inference approaches using genetics does not support a causal effect of SU level on eGFR level or CKD risk. Reducing SU levels is unlikely to reduce the risk of CKD development.


Subject(s)
Renal Insufficiency, Chronic/etiology , Uric Acid/blood , Adult , Age Factors , Female , Genome-Wide Association Study , Glomerular Filtration Rate/genetics , Humans , Male , Mendelian Randomization Analysis , Renal Insufficiency, Chronic/blood , Renal Insufficiency, Chronic/genetics , Sex Factors , Young Adult
14.
J Allergy Clin Immunol ; 142(5): 1375-1390, 2018 11.
Article in English | MEDLINE | ID: mdl-30409247

ABSTRACT

Itch is a common sensory experience that is prevalent in patients with inflammatory skin diseases, as well as in those with systemic and neuropathic conditions. In patients with these conditions, itch is often severe and significantly affects quality of life. Itch is encoded by 2 major neuronal pathways: histaminergic (in acute itch) and nonhistaminergic (in chronic itch). In the majority of cases, crosstalk existing between keratinocytes, the immune system, and nonhistaminergic sensory nerves is responsible for the pathophysiology of chronic itch. This review provides an overview of the current understanding of the molecular, neural, and immune mechanisms of itch: beginning in the skin, proceeding to the spinal cord, and eventually ascending to the brain, where itch is processed. Our understanding of the mechanisms of chronic itch is growing, as is our pipeline of more targeted topical and systemic therapies. Our therapeutic armamentarium for treating chronic itch has expanded in the last 5 years, with developments of topical and systemic treatments targeting the neural and immune systems.


Subject(s)
Pruritus , Animals , Brain/physiology , Chronic Disease , Humans , Neurons/physiology , Pruritus/etiology , Pruritus/metabolism , Pruritus/physiopathology , Pruritus/therapy , Spinal Cord/physiology
15.
Genet Med ; 20(9): 936-941, 2018 09.
Article in English | MEDLINE | ID: mdl-29388949

ABSTRACT

PURPOSE: Over 150,000 variants have been reported to cause Mendelian disease in the medical literature. It is still difficult to leverage this knowledge base in clinical practice, as many reports lack strong statistical evidence or may include false associations. Clinical laboratories assess whether these variants (along with newly observed variants that are adjacent to these published ones) underlie clinical disorders. METHODS: We investigated whether citation data, including journal impact factor and the number of cited variants (NCV) in each gene with published disease associations, can be used to improve variant assessment. RESULTS: Surprisingly, we found that impact factor is not predictive of pathogenicity, but the NCV score for each gene can provide statistical support for prediction of pathogenicity. When this gene-level citation metric is combined with variant-level evolutionary conservation and structural features, classification accuracy reaches 89.5%. Further, variants identified in clinical exome sequencing cases have higher NCVs than do simulated rare variants from the Exome Aggregation Consortium database within the same set of genes and functional consequences (P < 2.22 × 10⁻¹⁶). CONCLUSION: Aggregate citation data can complement existing variant-based predictive algorithms, and can boost their performance without the need to access and review large numbers of papers. The NCV is a slow-growing metric of scientific knowledge about each gene's association with disease.


Subject(s)
Computational Biology/methods , Genome-Wide Association Study/methods , Algorithms , Databases, Genetic , Forecasting , Genetic Variation , Humans , Journal Impact Factor
17.
PLoS Genet ; 11(10): e1005622, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26509271

ABSTRACT

Large genome-wide association studies (GWAS) have identified many genetic loci associated with risk for myocardial infarction (MI) and coronary artery disease (CAD). Concurrently, efforts such as the National Institutes of Health (NIH) Roadmap Epigenomics Project and the Encyclopedia of DNA Elements (ENCODE) Consortium have provided unprecedented data on functional elements of the human genome. In the present study, we systematically investigate the biological link between genetic variants associated with this complex disease and their impacts on gene function. First, we examined the heritability of MI/CAD according to genomic compartments. We observed that single nucleotide polymorphisms (SNPs) residing within nearby regulatory regions show significant polygenicity and contribute between 59-71% of the heritability for MI/CAD. Second, we showed that the polygenicity and heritability explained by these SNPs are enriched in histone modification marks in specific cell types. Third, we found that a statistically higher number of 45 MI/CAD-associated SNPs that have been identified from large-scale GWAS studies reside within certain functional elements of the genome, particularly in active enhancer and promoter regions. Finally, we observed significant heterogeneity of this signal across cell types, with strong signals observed within adipose nuclei, as well as brain and spleen cell types. These results suggest that the genetic etiology of MI/CAD is largely explained by tissue-specific regulatory perturbation within the human genome.


Subject(s)
Coronary Artery Disease/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Polymorphism, Single Nucleotide/genetics , Coronary Artery Disease/pathology , Genome, Human , Genotype , Humans , Regulatory Sequences, Nucleic Acid , Risk Factors
18.
Mol Biol Evol ; 33(10): 2555-64, 2016 10.
Article in English | MEDLINE | ID: mdl-27436009

ABSTRACT

Deleterious mutations are expected to evolve under negative selection and are usually purged from the population. However, deleterious alleles segregate in the human population and some disease-associated variants are maintained at considerable frequencies. Here, we test the hypothesis that balancing selection may counteract purifying selection in neighboring regions and thus maintain deleterious variants at higher frequency than expected from their detrimental fitness effect. We first show in realistic simulations that balancing selection reduces the density of polymorphic sites surrounding a locus under balancing selection, but at the same time markedly increases the population frequency of the remaining variants, including even substantially deleterious alleles. To test the predictions of our simulations empirically, we then use whole-exome sequencing data from 6,500 human individuals and focus on the most established example for balancing selection in the human genome, the major histocompatibility complex (MHC). Our analysis shows an elevated frequency of putatively deleterious coding variants in nonhuman leukocyte antigen (non-HLA) genes localized in the MHC region. The mean frequency of these variants declined with physical distance from the classical HLA genes, indicating dependency on genetic linkage. These results reveal an indirect cost of the genetic diversity maintained by balancing selection, which has hitherto been perceived as mostly advantageous, and have implications both for the evolution of recombination and also for the epidemiology of various MHC-associated diseases.


Subject(s)
HLA Antigens/genetics , Major Histocompatibility Complex/genetics , Selection, Genetic , Sequence Deletion , Alleles , Biological Evolution , Computer Simulation , Databases, Genetic , Evolution, Molecular , Gene Frequency/genetics , Genetic Variation , Genome, Human , Haplotypes/genetics , Humans , Polymorphism, Genetic/genetics
19.
Hum Mutat ; 36(10): 998-1003, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26378430

ABSTRACT

Clinical sequencing is expanding, but causal variants are still not identified in the majority of cases. These unsolved cases can aid in gene discovery when individuals with similar phenotypes are identified in systems such as the Matchmaker Exchange. We describe risks for gene discovery in this growing set of unsolved cases. In a set of rare disease cases with the same phenotype, it is not difficult to find two individuals with the same phenotype that carry variants in the same gene. We quantify the risk of false-positive association in a cohort of individuals with the same phenotype, using the prior probability of observing a variant in each gene from over 60,000 individuals (Exome Aggregation Consortium). Based on the number of individuals with a genic variant, cohort size, specific gene, and mode of inheritance, we calculate a P value that the match represents a true association. A match in two of 10 patients in MECP2 is statistically significant (P = 0.0014), whereas a match in TTN would not reach significance, as expected (P > 0.999). Finally, we analyze the probability of matching in clinical exome cases to estimate the number of cases needed to identify genes related to different disorders. We offer Rare Disease Match, an online tool to mitigate the uncertainty of false-positive associations.
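
The kind of calculation this abstract describes can be illustrated with a simple binomial tail probability. The sketch below is only an illustration under stated assumptions, not the authors' method or the Rare Disease Match implementation: it assumes each individual independently carries a qualifying variant in the gene with a fixed background probability, and the function name and the carrier rates in the usage lines are hypothetical placeholders rather than values from the Exome Aggregation Consortium.

```python
from math import comb


def chance_match_probability(k: int, n: int, carrier_rate: float) -> float:
    """Probability that at least k of n unrelated individuals carry a
    qualifying variant in one gene purely by chance.

    Assumption (not from the source): carriers are independent and each
    individual carries such a variant with probability `carrier_rate`.
    """
    if not 0.0 <= carrier_rate <= 1.0:
        raise ValueError("carrier_rate must be between 0 and 1")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    # P(X >= k) = 1 - P(X <= k - 1) for X ~ Binomial(n, carrier_rate)
    below = sum(
        comb(n, i) * carrier_rate**i * (1.0 - carrier_rate) ** (n - i)
        for i in range(k)
    )
    return min(1.0, max(0.0, 1.0 - below))


if __name__ == "__main__":
    # Hypothetical background rates, chosen only to contrast a gene that
    # rarely carries variants with one that almost always does.
    for label, rate in [("rarely variant gene", 0.005), ("highly variable gene", 0.9)]:
        p = chance_match_probability(2, 10, rate)
        print(f"{label}: P(at least 2 of 10 by chance) = {p:.4g}")
```

Under this simplified model, a gene with a low background carrier rate yields a small chance-match probability for two carriers among ten patients, while a gene in which most people carry a variant yields a probability close to 1, which is the qualitative contrast the abstract draws between MECP2 and TTN.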


Subject(s)
Computational Biology/methods , Genetic Association Studies/methods , Rare Diseases/genetics , Algorithms , Databases, Genetic , Exome , False Positive Reactions , Genetic Variation , Humans , Phenotype , Web Browser
20.
Am J Hum Genet ; 88(2): 183-92, 2011 Feb 11.
Article in English | MEDLINE | ID: mdl-21310275

ABSTRACT

Assessing the significance of novel genetic variants revealed by DNA sequencing is a major challenge to the integration of genomic techniques with medical practice. Many variants remain difficult to classify by traditional genetic methods. Computational methods have been developed that could contribute to classifying these variants, but they have not been properly validated and are generally not considered mature enough to be used effectively in a clinical setting. We developed a computational method for predicting the effects of missense variants detected in patients with hypertrophic cardiomyopathy (HCM). We used a curated clinical data set of 74 missense variants in six genes associated with HCM to train and validate an automated predictor. The predictor is based on support vector regression and uses phylogenetic and structural features specific to genes involved in HCM. Ten-fold cross validation estimated our predictor's sensitivity at 94% (95% confidence interval: 83%-98%) and specificity at 89% (95% confidence interval: 72%-100%). This corresponds to an odds ratio of 10 for a prediction of pathogenic (95% confidence interval: 4.0-infinity), or an odds ratio of 9.9 for a prediction of benign (95% confidence interval: 4.6-21). Coverage (proportion of variants for which a prediction was made) was 57% (95% confidence interval: 49%-64%). This performance exceeds that of existing methods that are not specifically designed for HCM. The accuracy of this predictor provides support for the clinical use of automated predictions alongside family segregation and population frequency data in the interpretation of new missense variants and suggests future development of similar tools for other diseases.
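
The performance measures named in this abstract (sensitivity, specificity, and coverage) can be computed from a list of labeled predictions as in the minimal sketch below. It assumes a predictor that may decline to make a call, represented here as `None`; the function name and the example data are hypothetical and do not come from the study's 74-variant data set, and the sketch does not attempt to reproduce the reported confidence intervals or odds ratios.

```python
from math import nan
from typing import Dict, Optional, Sequence, Tuple


def summarize_predictions(
    calls: Sequence[Tuple[bool, Optional[bool]]]
) -> Dict[str, float]:
    """Summarize (truth, prediction) pairs, where True means pathogenic,
    False means benign, and a prediction of None means no call was made."""
    tp = fn = tn = fp = 0
    for truth, predicted in calls:
        if predicted is None:
            continue  # no prediction: counts against coverage only
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif not truth and not predicted:
            tn += 1
        else:
            fp += 1
    made = tp + fn + tn + fp
    return {
        # Fraction of truly pathogenic variants called pathogenic
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Fraction of truly benign variants called benign
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # Fraction of all variants for which a prediction was made
        "coverage": made / len(calls) if calls else nan,
    }


if __name__ == "__main__":
    # Hypothetical toy data: (is_pathogenic, predicted_pathogenic_or_None)
    example = [
        (True, True), (True, True), (True, False), (True, None),
        (False, False), (False, False), (False, True), (False, None),
    ]
    print(summarize_predictions(example))
```

Keeping uncalled variants out of the sensitivity and specificity counts, and reporting them separately as coverage, mirrors the way the abstract states accuracy and coverage as distinct figures.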


Subject(s)
Cardiomyopathy, Hypertrophic/genetics , Computational Biology , Genetic Variation/genetics , Mutation, Missense/genetics , Nuclear Proteins/genetics , Genetic Predisposition to Disease , Humans