1.
Bioinformatics ; 33(1): 79-86, 2017 01 01.
Article in English | MEDLINE | ID: mdl-27591082

ABSTRACT

MOTIVATION: Fine mapping is a widely used approach for identifying the causal variant(s) at disease-associated loci. Standard methods (e.g. multiple regression) require individual-level genotypes. Recent fine mapping methods using summary-level data require the pairwise correlation coefficients (r²) of the variants. However, haplotypes, rather than pairwise r², are the true biological representation of linkage disequilibrium (LD) among multiple loci. In this article, we present an empirical iterative method, HAPlotype Regional Association analysis Program (HAPRAP), that enables fine mapping using summary statistics and haplotype information from an individual-level reference panel. RESULTS: Simulations with individual-level genotypes show that the results of HAPRAP and multiple regression are highly consistent. In simulation with summary-level data, we demonstrate that HAPRAP is less sensitive to poor LD estimates. In a parametric simulation using Genetic Investigation of ANthropometric Traits height data, HAPRAP performs well with a small training sample size (N < 2000) while other methods become suboptimal. Moreover, HAPRAP's performance is not affected substantially by single nucleotide polymorphisms (SNPs) with low minor allele frequencies. We applied the method to existing quantitative trait and binary outcome meta-analyses (human height, QTc interval and gallbladder disease); all previously reported association signals were replicated and two additional variants were independently associated with human height. Due to the growing availability of summary-level data, the value of HAPRAP is likely to increase markedly for future analyses (e.g. functional prediction and identification of instruments for Mendelian randomization).
AVAILABILITY AND IMPLEMENTATION: The HAPRAP package and documentation are available at http://apps.biocompute.org.uk/haprap/ CONTACT: jie.zheng@bristol.ac.uk or tom.gaunt@bristol.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
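
As a minimal sketch of how pairwise LD relates to haplotype data, the following computes the squared correlation between two biallelic SNPs from phased haplotypes in a reference panel. It is not HAPRAP itself; the function name `ld_r2` and the toy panel are hypothetical and serve only to illustrate that the pairwise statistic is a summary derived from haplotype frequencies.

```python
def ld_r2(haplotypes):
    """Squared correlation (r^2) between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased haplotype,
    with each allele coded 0 or 1. (Hypothetical input format.)
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Toy reference panel of 10 haplotypes (invented for illustration).
panel = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] + [(0, 1)]
print(round(ld_r2(panel), 4))
```

With more than two SNPs, the full haplotype table carries joint information that the set of pairwise values does not, which is the point the abstract makes.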


Subject(s)
Chromosome Mapping/methods , Haplotypes , Polymorphism, Single Nucleotide , Software , Gene Frequency , Genome-Wide Association Study , Genotype , Humans , Linkage Disequilibrium , Quantitative Trait, Heritable , Sample Size
2.
Neurobiol Learn Mem ; 146: 37-46, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29032015

ABSTRACT

BACKGROUND: APOE ε4 allele possession is associated with an increased risk of Alzheimer's disease. Its effects earlier in life are less well understood. Previous studies have reported both detrimental effects and a lack of effect on cognition outside dementia. We used genotype-based recall from the ALSPAC study to investigate whether APOE genotype influences cognition in earlier adult life. METHODS: We invited all individuals with the rarer ε22 or ε44 genotypes and equal numbers of those with ε32, ε33 or ε34 APOE genotypes (total n invited = 1936, ages 23-67). Participants were screened for dementia using the Addenbrooke's Cognitive Examination Revised (ACE-R). Participants were asked to complete a 3 h battery of neuropsychological tests covering a range of cognitive domains. The primary outcome was performance on the Rey Auditory Verbal Learning Test (RAVLT). Transformation of variables was used where required to permit parametric testing. As genotypes are unlikely to be confounded, unadjusted analyses were performed. RESULTS: 114 participants were recruited to the study (39 ε33, 27 ε34, 15 ε44, 26 ε32 & 7 ε22). ε4+ participants had higher scores on the cognitive failures questionnaire (10 point increase, p = 0.006) but no deficits on objective cognitive testing. ε2 carriers had slightly better episodic memory performance (p = 0.016), slightly improved n-back accuracy and better executive functioning (trails A&B, p = 0.005). CONCLUSIONS: It is intriguing that the ε2+ group performed better as this group have a lower risk of Alzheimer's disease. Most previous studies have analysed as ε4/non ε4 so may have missed this effect.


Subject(s)
Apolipoprotein E2/genetics , Apolipoprotein E4/genetics , Cognition/physiology , Executive Function/physiology , Memory, Episodic , Verbal Learning/physiology , Adult , Aged , Female , Genotype , Humans , Male , Middle Aged , Neuropsychological Tests , United Kingdom , Young Adult
3.
J Med Genet ; 53(4): 280-8, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26831755

ABSTRACT

BACKGROUND: Several recessive Mendelian disorders are common in Europeans, including cystic fibrosis (CFTR), medium-chain-acyl-Co-A-dehydrogenase deficiency (ACADM), phenylketonuria (PAH) and alpha 1-antitrypsin deficiency (SERPINA1). METHODS: In a multicohort study of >19,000 older individuals, we investigated the relevant phenotypes in heterozygotes for these genes: lung function (forced expiratory volume in 1 second (FEV1), forced vital capacity (FVC)) for CFTR and SERPINA1; cognitive measures for ACADM and PAH; and physical capability for ACADM, PAH and SERPINA1. RESULTS: Findings were mostly negative but lung function in SERPINA1 (protease inhibitor (PI) Z allele, rs28929474) showed enhanced FEV1 and FVC (0.13 z-score increase in FEV1 (p=1.7 × 10⁻⁵) and 0.16 z-score increase in FVC (p=5.2 × 10⁻⁸)) in PI-MZ individuals. Height adjustment (a known, strong correlate of FEV1 and FVC) revealed strong positive height associations of the Z allele (1.50 cm increase in height (p=3.6 × 10⁻¹⁰)). CONCLUSIONS: The PI-MZ rare (2%) SNP effect is nearly four times greater than the 'top' common height SNP in HMGA2. However, height only partially attenuates the SERPINA1-FEV1 or FVC association (around 50%) and vice versa. Height SNP variants have recently been shown to be positively selected collectively in North versus South Europeans, while the Z allele high frequency is localised to North Europe. Although PI-ZZ is clinically disadvantageous to lung function, PI-MZ increases both height and respiratory function; potentially a balanced polymorphism. Partial blockade of PI could conceivably form part of a future poly-therapeutic approach in very short children. The notion that elastase inhibition should benefit patients with chronic obstructive pulmonary disease may also merit re-evaluation.
PI is already a therapeutic target: our findings invite a reconsideration of the optimum level in respiratory care and novel pathway potential for development of agents for the management of growth disorders.


Subject(s)
Cystic Fibrosis/genetics , Pulmonary Disease, Chronic Obstructive/genetics , alpha 1-Antitrypsin Deficiency/genetics , alpha 1-Antitrypsin/genetics , Alleles , Cystic Fibrosis/epidemiology , Cystic Fibrosis/pathology , Europe , Female , Forced Expiratory Volume/genetics , Genotype , HMGA2 Protein/genetics , Heterozygote , Humans , Male , Phenotype , Phenylketonurias/epidemiology , Phenylketonurias/genetics , Phenylketonurias/pathology , Polymorphism, Genetic , Pulmonary Disease, Chronic Obstructive/epidemiology , Pulmonary Disease, Chronic Obstructive/pathology , alpha 1-Antitrypsin Deficiency/epidemiology , alpha 1-Antitrypsin Deficiency/pathology
4.
Ann Hum Genet ; 80(3): 187-96, 2016 May.
Article in English | MEDLINE | ID: mdl-27000383

ABSTRACT

Consanguineous offspring have elevated levels of homozygosity. Autozygous stretches within their genome are likely to harbour loss of function (LoF) mutations which will lead to complete inactivation or dysfunction of genes. Studying consanguineous offspring with clinical phenotypes has been very useful for identifying disease causal mutations. However, at present, most of the genes in the human genome have no disorder associated with them or have unknown function. This is presumably mostly due to the fact that homozygous LoF variants are not observed in outbred populations which are the main focus of large sequencing projects. However, another reason may be that many genes in the genome, even when completely "knocked out," do not cause a distinct or defined phenotype. Here, we discuss the benefits and implications of studying consanguineous populations, as opposed to the traditional approach of analysing a subset of consanguineous families or individuals with disease. We suggest that studying consanguineous populations "as a whole" can speed up the characterisation of novel gene functions as well as indicating nonessential genes and/or regions in the human genome. We also suggest designing a single nucleotide variant (SNV) array to make the process more efficient.


Subject(s)
Consanguinity , Genetics, Population , Genome, Human , Chromosome Mapping , Gene Silencing , Heterozygote , Homozygote , Humans , Phenotype
5.
Bioinformatics ; 31(10): 1536-43, 2015 May 15.
Article in English | MEDLINE | ID: mdl-25583119

ABSTRACT

MOTIVATION: Technological advances have enabled the identification of an increasingly large spectrum of single nucleotide variants within the human genome, many of which may be associated with monogenic disease or complex traits. Here, we propose an integrative approach, named FATHMM-MKL, to predict the functional consequences of both coding and non-coding sequence variants. Our method utilizes various genomic annotations, which have recently become available, and learns to weight the significance of each component annotation source. RESULTS: We show that our method outperforms current state-of-the-art algorithms, CADD and GWAVA, when predicting the functional consequences of non-coding variants. In addition, FATHMM-MKL is comparable to the best of these algorithms when predicting the impact of coding variants. The method includes a confidence measure to rank order predictions.


Subject(s)
Algorithms , Genetic Variation/genetics , Genome, Human , Molecular Sequence Annotation , Open Reading Frames/genetics , Untranslated Regions/genetics , Genome-Wide Association Study , Genomics/methods , Humans , Phenotype
6.
Am J Hum Genet ; 90(3): 410-25, 2012 Mar 09.
Article in English | MEDLINE | ID: mdl-22325160

ABSTRACT

To identify genetic factors contributing to type 2 diabetes (T2D), we performed large-scale meta-analyses by using a custom ∼50,000 SNP genotyping array (the ITMAT-Broad-CARe array) with ∼2000 candidate genes in 39 multiethnic population-based studies, case-control studies, and clinical trials totaling 17,418 cases and 70,298 controls. First, meta-analysis of 25 studies comprising 14,073 cases and 57,489 controls of European descent confirmed eight established T2D loci at genome-wide significance. In silico follow-up analysis of putative association signals found in independent genome-wide association studies (including 8,130 cases and 38,987 controls) performed by the DIAGRAM consortium identified a T2D locus at genome-wide significance (GATAD2A/CILP2/PBX4; p = 5.7 × 10⁻⁹) and two loci exceeding study-wide significance (SREBF1 and TH/INS; p < 2.4 × 10⁻⁶). Second, meta-analyses of 1,986 cases and 7,695 controls from eight African-American studies identified study-wide-significant (p = 2.4 × 10⁻⁷) variants in HMGA2 and replicated variants in TCF7L2 (p = 5.1 × 10⁻¹⁵). Third, conditional analysis revealed multiple known and novel independent signals within five T2D-associated genes in samples of European ancestry and within HMGA2 in African-American samples. Fourth, a multiethnic meta-analysis of all 39 studies identified T2D-associated variants in BCL2 (p = 2.1 × 10⁻⁸). Finally, a composite genetic score of SNPs from new and established T2D signals was significantly associated with increased risk of diabetes in African-American, Hispanic, and Asian populations. In summary, large-scale meta-analysis involving a dense gene-centric approach has uncovered additional loci and variants that contribute to T2D risk and suggests substantial overlap of T2D association signals across multiple ethnic groups.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Genetic Loci , Adolescent , Adult , Aged , Aged, 80 and over , Case-Control Studies , Diabetes Mellitus, Type 2/ethnology , Ethnicity , Female , Follow-Up Studies , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Genotype , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , Young Adult
7.
Bioinformatics ; 30(6): 838-45, 2014 Mar 15.
Article in English | MEDLINE | ID: mdl-24162466

ABSTRACT

MOTIVATION: Within medical research there is an increasing trend toward deriving multiple types of data from the same individual. The most effective prognostic prediction methods should use all available data, as this maximizes the amount of information used. In this article, we consider a variety of learning strategies to boost prediction performance based on the use of all available data. IMPLEMENTATION: We consider data integration via the use of multiple kernel learning supervised learning methods. We propose a scheme in which feature selection by statistical score is performed separately per data type and by pathway membership. We further consider the introduction of a confidence measure for the class assignment, both to remove some ambiguously labeled datapoints from the training data and to implement a cautious classifier that only makes predictions when the associated confidence is high. RESULTS: We use the METABRIC dataset for breast cancer, with prediction of survival at 2000 days from diagnosis. Predictive accuracy is improved by using kernels that exclusively use those genes, as features, which are known members of particular pathways. We show that yet further improvements can be made by using a range of additional kernels based on clinical covariates such as Estrogen Receptor (ER) status. Using this range of measures to improve prediction performance, we show that the test accuracy on new instances is nearly 80%, though predictions are only made on 69.2% of the patient cohort. AVAILABILITY: https://github.com/jseoane/FSMKL CONTACT: J.Seoane@bristol.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
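
The idea of a cautious classifier that only predicts when confidence is high can be sketched as follows. This is an illustrative assumption rather than the authors' multiple kernel learning method: the inputs are taken to be class-1 probabilities from any classifier, and the function names and the threshold of 0.8 are hypothetical.

```python
def cautious_predict(probabilities, threshold=0.8):
    """Predict 1 or 0 only when confident; otherwise abstain (None).

    probabilities: estimated probability of class 1 for each instance.
    threshold: hypothetical confidence cut-off (0.5 < threshold <= 1).
    """
    predictions = []
    for p in probabilities:
        if p >= threshold:
            predictions.append(1)
        elif p <= 1 - threshold:
            predictions.append(0)
        else:
            predictions.append(None)  # too ambiguous to call
    return predictions


def coverage_and_accuracy(predictions, labels):
    """Fraction of instances predicted, and accuracy on those instances."""
    made = [(p, y) for p, y in zip(predictions, labels) if p is not None]
    coverage = len(made) / len(labels)
    accuracy = sum(p == y for p, y in made) / len(made) if made else float("nan")
    return coverage, accuracy


# Invented example: two confident calls, two abstentions.
preds = cautious_predict([0.95, 0.5, 0.1, 0.7])
print(preds, coverage_and_accuracy(preds, [1, 1, 0, 0]))
```

Raising the threshold trades coverage for accuracy, which mirrors the abstract's report of accuracy on the subset of patients for whom a prediction is made.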


Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Disease Progression , Gene Expression Regulation , Genomics , Humans , Metabolomics , Software
8.
Hum Genomics ; 8: 11, 2014 Jun 30.
Article in English | MEDLINE | ID: mdl-24980617

ABSTRACT

As the number of non-synonymous single nucleotide polymorphisms (nsSNPs) identified through whole-exome/whole-genome sequencing programs increases, researchers and clinicians are becoming increasingly reliant upon computational prediction algorithms designed to prioritize potential functional variants for further study. A large proportion of existing prediction algorithms are 'disease agnostic' but are nevertheless quite capable of predicting when a mutation is likely to be deleterious. However, most clinical and research applications of these algorithms relate to specific diseases and would therefore benefit from an approach that discriminates functional variants specifically related to that disease from those which are not. In a whole-exome/whole-genome sequencing context, such an approach could substantially reduce the number of false positive candidate mutations. Here, we test this postulate by incorporating a disease-specific weighting scheme into the Functional Analysis through Hidden Markov Models (FATHMM) algorithm. When compared to traditional prediction algorithms, we observed an overall reduction in the number of false positives identified using a disease-specific approach to functional prediction across 17 distinct disease concepts/categories. Our results illustrate the potential benefits of making disease-specific predictions when prioritizing candidate variants in relation to specific diseases. A web-based implementation of our algorithm is available at http://fathmm.biocompute.org.uk.


Subject(s)
Amino Acid Substitution/genetics , Mutation/genetics , Polymorphism, Single Nucleotide/genetics , Software , Computational Biology , Humans , Internet , Markov Chains , Phenotype
9.
PLoS Comput Biol ; 10(10): e1003876, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25329069

ABSTRACT

Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and for testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.


Subject(s)
Genome-Wide Association Study/methods , Genomics/methods , Models, Genetic , Aged , Blood Coagulation Factors/genetics , Cluster Analysis , Databases, Genetic , Female , Humans , Hypertrophy, Left Ventricular/genetics , Middle Aged , Phenotype , Polymorphism, Single Nucleotide/genetics
10.
Hum Mutat ; 35(12): 1446-8, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25224326

ABSTRACT

Primary ciliary dyskinesia (PCD) is an autosomal-recessive disorder characterized by impaired ciliary function that leads to subsequent clinical phenotypes such as chronic sinopulmonary disease. PCD is also a genetically heterogeneous disorder with many single gene mutations leading to similar clinical phenotypes. Here, we present a novel PCD causal gene, coiled-coil domain containing 151 (CCDC151), which has been shown to be essential in the motile cilia of many animals, including other vertebrates, but whose effects in humans had not been observed until now. We observed a novel nonsense mutation in a homozygous state in the CCDC151 gene (NM_145045.4:c.925G>T:p.[E309*]) in a clinically diagnosed PCD patient from a consanguineous family of Arabic ancestry. The variant was absent in 238 randomly selected individuals, indicating that the variant is rare and likely not to be a founder mutation. Our finding also shows that given prior knowledge from model organisms, even a single whole-exome sequence can be sufficient to discover a novel causal gene.


Subject(s)
Carrier Proteins/genetics , Codon, Nonsense , Genetic Predisposition to Disease , Kartagener Syndrome/genetics , Humans
11.
Am J Hum Genet ; 89(6): 688-700, 2011 Dec 09.
Article in English | MEDLINE | ID: mdl-22100073

ABSTRACT

Raised blood pressure (BP) is a major risk factor for cardiovascular disease. Previous studies have identified 47 distinct genetic variants robustly associated with BP, but collectively these explain only a few percent of the heritability for BP phenotypes. To find additional BP loci, we used a bespoke gene-centric array to genotype an independent discovery sample of 25,118 individuals that combined hypertensive case-control and general population samples. We followed up four SNPs associated with BP at our p < 8.56 × 10⁻⁷ study-specific significance threshold and six suggestively associated SNPs in a further 59,349 individuals. We identified and replicated a SNP at LSP1/TNNT3, a SNP at MTHFR-NPPB independent (r² = 0.33) of previous reports, and replicated SNPs at AGT and ATP2B1 reported previously. An analysis of combined discovery and follow-up data identified SNPs significantly associated with BP at p < 8.56 × 10⁻⁷ at four further loci (NPR3, HFE, NOS3, and SOX6). The high number of discoveries made with modest genotyping effort can be attributed to using a large-scale yet targeted genotyping array and to the development of a weighting scheme that maximized power when meta-analyzing results from samples ascertained with extreme phenotypes, in combination with results from nonascertained or population samples. Chromatin immunoprecipitation and transcript expression data highlight potential gene regulatory mechanisms at the MTHFR and NOS3 loci. These results provide candidates for further study to help dissect mechanisms affecting BP and highlight the utility of studying SNPs and samples that are independent of those studied previously even when the sample size is smaller than that in previous studies.


Subject(s)
Genetic Loci , Hypertension/genetics , Oligonucleotide Array Sequence Analysis , Adult , Aged , Blood Pressure/genetics , Case-Control Studies , Female , Gene Expression Profiling , Gene Frequency , Genome-Wide Association Study , Haplotypes , Humans , Linkage Disequilibrium , Male , Methylenetetrahydrofolate Reductase (NADPH2)/genetics , Middle Aged , Plasma Membrane Calcium-Transporting ATPases/genetics , Polymorphism, Single Nucleotide , Receptors, Atrial Natriuretic Factor/genetics , Sequence Analysis, DNA
12.
Bioinformatics ; 29(12): 1504-10, 2013 Jun 15.
Article in English | MEDLINE | ID: mdl-23620363

ABSTRACT

MOTIVATION: The number of missense mutations being identified in cancer genomes has greatly increased as a consequence of technological advances and the reduced cost of whole-genome/whole-exome sequencing methods. However, a high proportion of the amino acid substitutions detected in cancer genomes have little or no effect on tumour progression (passenger mutations). Therefore, accurate automated methods capable of discriminating between driver (cancer-promoting) and passenger mutations are becoming increasingly important. In our previous work, we developed the Functional Analysis through Hidden Markov Models (FATHMM) software and, using a model weighted for inherited disease mutations, observed improved performances over alternative computational prediction algorithms. Here, we describe an adaptation of our original algorithm that incorporates a cancer-specific model to potentiate the functional analysis of driver mutations. RESULTS: The performance of our algorithm was evaluated using two separate benchmarks. In our analysis, we observed improved performances when distinguishing between driver mutations and other germ line variants (both disease-causing and putatively neutral mutations). In addition, when discriminating between somatic driver and passenger mutations, we observed performances comparable with the leading computational prediction algorithms: SPF-Cancer and TransFIC. AVAILABILITY AND IMPLEMENTATION: A web-based implementation of our cancer-specific model, including a downloadable stand-alone package, is available at http://fathmm.biocompute.org.uk.


Subject(s)
Amino Acid Substitution , DNA Mutational Analysis/methods , Neoplasms/genetics , Algorithms , Genomics , Humans , Mutation, Missense , Software
13.
Eur Heart J ; 34(13): 972-81, 2013 Apr.
Article in English | MEDLINE | ID: mdl-22977227

ABSTRACT

AIMS: The aim of this study was to quantify the collective effect of common lipid-associated single nucleotide polymorphisms (SNPs) on blood lipid levels, cardiovascular risk, use of lipid-lowering medication, and risk of coronary heart disease (CHD) events. METHODS AND RESULTS: Analysis was performed in two prospective cohorts: Whitehall II (WHII; N = 5059) and the British Women's Heart and Health Study (BWHHS; N = 3414). For each participant, scores were calculated based on the cumulative effect of multiple genetic variants influencing total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG). Compared with the bottom quintile, individuals in the top quintile of the LDL-C genetic score distribution had higher LDL-C {mean difference of 0.85 [95% confidence interval (CI) = 0.76-0.94] and 0.63 [95% CI = 0.50-0.76] mmol/l in WHII and BWHHS, respectively}. They also tended to have greater odds of having 'high-risk' status (Framingham 10-year cardiovascular disease risk >20%) [WHII: odds ratio (OR) = 1.36 (0.93-1.98), BWHHS: OR = 1.49 (1.14-1.94)]; receiving lipid-lowering treatment [WHII: OR = 2.38 (1.57-3.59), BWHHS: OR = 2.24 (1.52-3.29)]; and CHD events [WHII: OR = 1.43 (1.02-2.00), BWHHS: OR = 1.31 (0.99-1.72)]. Similar associations were observed for the TC score in both studies. The TG score was associated with high-risk status and medication use in both studies. Neither HDL nor TG scores were associated with the risk of coronary events. The genetic scores did not improve discrimination over the Framingham risk score. CONCLUSION: At the population level, common SNPs associated with LDL-C and TC contribute to blood lipid variation, cardiovascular risk, use of lipid-lowering medications and coronary events. However, their effects are too small to discriminate future lipid-lowering medication requirements or coronary events.
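
A genetic score of the kind described, and the top-versus-bottom quintile comparison, can be sketched as below. The abstract does not give the exact scoring formula, so the weighted sum of allele counts is an assumption; the function names, weights and data are hypothetical.

```python
def genetic_score(allele_counts, weights):
    """Weighted allele score for one participant.

    allele_counts: number of trait-raising alleles (0, 1 or 2) per SNP.
    weights: assumed per-allele effect sizes, one per SNP (hypothetical).
    """
    return sum(count * w for count, w in zip(allele_counts, weights))


def top_vs_bottom_quintile(scores, outcomes):
    """Mean outcome in the top score quintile minus the bottom quintile."""
    k = len(scores) // 5
    if k == 0:
        raise ValueError("need at least five participants")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    bottom = [outcomes[i] for i in order[:k]]
    top = [outcomes[i] for i in order[-k:]]
    return sum(top) / k - sum(bottom) / k


# Invented data: 10 participants, score and LDL-C (mmol/l) rising together.
scores = [float(i) for i in range(10)]
ldl = [3.0 + 0.1 * i for i in range(10)]
print(round(top_vs_bottom_quintile(scores, ldl), 2))
```

A mean difference between extreme quintiles can be sizeable even when the score discriminates poorly between individuals, which is consistent with the abstract's conclusion.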


Subject(s)
Cardiovascular Diseases/genetics , Hyperlipidemias/genetics , Polymorphism, Single Nucleotide/genetics , Adult , Aged , Cholesterol, HDL/genetics , Cholesterol, HDL/metabolism , Cholesterol, LDL/genetics , Cholesterol, LDL/metabolism , Coronary Disease/genetics , Female , Genotype , Humans , Hyperlipidemias/drug therapy , Hypolipidemic Agents/therapeutic use , Male , Middle Aged , Phenotype , Prospective Studies , Risk Factors , Triglycerides/genetics , Triglycerides/metabolism
14.
Hum Mutat ; 34(1): 57-65, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23033316

ABSTRACT

The rate at which nonsynonymous single nucleotide polymorphisms (nsSNPs) are being identified in the human genome is increasing dramatically owing to advances in whole-genome/whole-exome sequencing technologies. Automated methods capable of accurately and reliably distinguishing between pathogenic and functionally neutral nsSNPs are therefore assuming ever-increasing importance. Here, we describe the Functional Analysis Through Hidden Markov Models (FATHMM) software and server: a species-independent method with optional species-specific weightings for the prediction of the functional effects of protein missense variants. Using a model weighted for human mutations, we obtained performance accuracies that outperformed traditional prediction methods (i.e., SIFT, PolyPhen, and PANTHER) on two separate benchmarks. Furthermore, in one benchmark, we achieve performance accuracies that outperform current state-of-the-art prediction methods (i.e., SNPs&GO and MutPred). We demonstrate that FATHMM can be efficiently applied to high-throughput/large-scale human and nonhuman genome sequencing projects with the added benefit of phenotypic outcome associations. To illustrate this, we evaluated nsSNPs in wheat (Triticum spp.) to identify some of the important genetic variants responsible for the phenotypic differences introduced by intense selection during domestication. A Web-based implementation of FATHMM, including a high-throughput batch facility and a downloadable standalone package, is available at http://fathmm.biocompute.org.uk.


Subject(s)
Algorithms , Amino Acid Substitution , Computational Biology/methods , Mutation , Proteins/genetics , Genetic Association Studies/methods , Genotype , Humans , Internet , Phenotype , Polymorphism, Single Nucleotide , Proteins/metabolism , Reproducibility of Results , Software , Triticum/genetics
15.
Ann Hum Genet ; 77(1): 67-79, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23278391

ABSTRACT

Genome-Wide Association Studies (GWAS) frequently incorporate meta-analysis within their framework. However, conditional analysis of individual-level data, which is an established approach for fine mapping of causal sites, is often precluded where only group-level summary data are available for analysis. Here, we present a numerical and graphical approach, "sequential sentinel SNP regional association plot" (SSS-RAP), which estimates regression coefficients (beta) with their standard errors using the meta-analysis summary results directly. Under an additive model, typical for genes with small effect, the effect for a sentinel SNP can be transformed to the predicted effect for a possibly dependent SNP through a 2×2 2-SNP haplotypes table. The approach assumes Hardy-Weinberg equilibrium for test SNPs. SSS-RAP is available as a Web-tool (http://apps.biocompute.org.uk/sssrap/sssrap.cgi). To develop and illustrate SSS-RAP we analyzed lipid and ECG traits data from the British Women's Heart and Health Study (BWHHS), evaluated a meta-analysis for ECG trait and presented several simulations. We compared results with existing approaches such as model selection methods and conditional analysis. Generally findings were consistent. SSS-RAP represents a tool for testing independence of SNP association signals using meta-analysis data, and is also a convenient approach based on biological principles for fine mapping in group level summary data.
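
The transformation of a sentinel SNP effect into a predicted effect at a correlated SNP can be illustrated with the standard expectation under an additive model and Hardy-Weinberg equilibrium: if only the sentinel is causal, the marginal effect at the second SNP is the sentinel effect scaled by D / (p_B(1 - p_B)), where D is the LD coefficient from the 2×2 haplotype table. This is a sketch of that textbook relationship, not necessarily the exact SSS-RAP computation; the function name and frequencies are hypothetical.

```python
def predicted_effect(beta_sentinel, hap_freqs):
    """Expected marginal effect at a test SNP given the sentinel effect.

    beta_sentinel: per-allele effect of the sentinel SNP (allele coded 1).
    hap_freqs: dict mapping (sentinel_allele, test_allele) to haplotype
        frequency, with alleles coded 0/1 and frequencies summing to 1.
    Assumes an additive model, Hardy-Weinberg equilibrium, and that the
    sentinel accounts for the whole signal in the region.
    """
    p_a = hap_freqs[(1, 0)] + hap_freqs[(1, 1)]   # sentinel allele frequency
    p_b = hap_freqs[(0, 1)] + hap_freqs[(1, 1)]   # test allele frequency
    d = hap_freqs[(1, 1)] - p_a * p_b             # LD coefficient D
    return beta_sentinel * d / (p_b * (1 - p_b))


# Invented 2x2 haplotype table.
table = {(1, 1): 0.4, (0, 0): 0.4, (1, 0): 0.1, (0, 1): 0.1}
print(round(predicted_effect(0.5, table), 3))
```

Comparing this predicted effect with the observed summary effect at the test SNP indicates whether the second SNP carries a signal beyond what LD with the sentinel explains.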


Subject(s)
Meta-Analysis as Topic , Polymorphism, Single Nucleotide , Electrocardiography , Gene Frequency , Haplotypes , Humans , Linkage Disequilibrium , Lipids/analysis , Regression Analysis
16.
Clin Chem ; 59(1): 234-44, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23169475

ABSTRACT

BACKGROUND: Prostate-specific antigen (PSA), a widely used biomarker for prostate cancer (PCa), is encoded by a kallikrein gene (KLK3, kallikrein-related peptidase 3). Serum PSA concentrations vary in the population, with PCa patients generally showing higher PSA concentrations than control individuals, although a small proportion of individuals in the population display very low PSA concentrations. We hypothesized that very low PSA concentrations might reflect gene-inactivating mutations in KLK3 that lead to abnormally reduced gene expression. METHODS: We have sequenced all KLK3 exons and the promoter and searched for gross deletions or duplications in KLK3 in the 30 individuals with the lowest observed PSA concentrations in a sample of approximately 85 000 men from the Prostate Testing for Cancer and Treatment (ProtecT) study. The ProtecT study examines a community-based population of men from across the UK with little prior PSA testing. RESULTS: We observed no stop codons or frameshift mutations, but we did find 30 single-base genetic variants, including 3 variants not described previously. These variants included missense variants that could be functionally inactivating and splicing variants. At this stage, however, we cannot confidently conclude whether these variants markedly lower PSA concentration or activity. More importantly, we identified 3 individuals with different large heterozygous deletions that encompass all KLK3 exons. The absence of a functional copy of KLK3 in these individuals is consistent with their reduced serum PSA concentrations. CONCLUSIONS: The clinical interpretation of the PSA test for individuals with KLK3 gene inactivation could lead to false-negative PSA findings used for screening, diagnosis, or monitoring of PCa.


Subject(s)
Gene Deletion , Kallikreins/genetics , Prostate-Specific Antigen/blood , Exons , Humans , Male , Mutation , Polymerase Chain Reaction , Prostate-Specific Antigen/genetics
17.
J Nutr ; 143(5): 606-12, 2013 May.
Article in English | MEDLINE | ID: mdl-23468552

ABSTRACT

Several investigations have observed positive associations between good nutritional status, as indicated by micronutrients, and cognitive measures; however, these associations may not be causal. Genetic polymorphisms that affect nutritional biomarkers may be useful for providing evidence for associations between micronutrients and cognitive measures. As part of the Healthy Ageing across the Life Course (HALCyon) program, men and women aged between 44 and 90 y from 6 UK cohorts were genotyped for polymorphisms associated with circulating concentrations of iron [rs4820268 transmembrane protease, serine 6 (TMPRSS6) and rs1800562 hemochromatosis (HFE)], vitamin B-12 [rs492602 fucosyltransferase 2 (FUT2)], vitamin D [rs2282679 group-specific component (GC)], and β-carotene [rs6564851 beta-carotene 15,15'-monooxygenase 1 (BCMO1)]. Meta-analysis was used to pool within-study effects of the associations between these polymorphisms and the following measures of cognitive capability: word recall, phonemic fluency, semantic fluency, and search speed. Among the several statistical tests conducted, we found little evidence for associations. We found the minor allele of rs1800562 was associated with poorer word recall scores [pooled β on Z-score for carriers vs. noncarriers: -0.05 (95% CI: -0.09, -0.004); P = 0.03, n = 14,105], as was the vitamin D-raising allele of rs2282679 [pooled β per T allele: -0.03 (95% CI: -0.05, -0.003); P = 0.03, n = 16,527]. However, there was no evidence for other associations. Our findings provide little evidence to support associations between these genotypes and cognitive capability in older adults. Further investigations are required to elucidate whether the previous positive associations from observational studies between circulating measures of these micronutrients and cognitive performance are due to confounding and reverse causality.


Subject(s)
Cognition/physiology , Iron , Mental Recall/physiology , Micronutrients/blood , Nutritional Status/genetics , Polymorphism, Genetic , Vitamin D , Adult , Aged , Aged, 80 and over , Alleles , Biomarkers/blood , Cognition Disorders/blood , Cognition Disorders/genetics , Female , Genotype , Hemochromatosis/genetics , Humans , Iron/blood , Male , Middle Aged , United Kingdom , Vitamin D/blood , Vitamin D/genetics , Vitamin D-Binding Protein/genetics
18.
Arterioscler Thromb Vasc Biol ; 32(8): 2029-34, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22679311

ABSTRACT

OBJECTIVE: Short leukocyte telomere length (LTL) is associated with cardiovascular (CV) disease in adulthood. However, the biological basis of this association remains unclear. We sought to define early determinants of the association between CV disease and LTL in an adolescent population. METHODS AND RESULTS: One thousand eighty adolescents, aged 13 to 16 years and participating in the Ten Towns Heart Health Study, provided blood samples for DNA extraction and measurement of a range of CV risk factors. LTL was measured by real-time polymerase chain reaction. LTL was inversely associated with age (P=0.04), longer in females than in males (P=0.03), and longer in South Asians than in white Europeans (P=0.01). No associations were found between LTL and traditional CV risk factors. There was a significant and inverse association between LTL and inflammatory markers, including C-reactive protein (P<0.001) and fibrinogen (P=0.001). The associations between LTL and inflammatory markers were not affected by multiple adjustments for behavioral and metabolic factors. CONCLUSIONS: High levels of inflammation are associated with shorter LTL from early adolescence; traditional CV risk factors have little association with LTL in adolescence. Inflammation in early life may play a causal role in the adult association between short LTL and CV disease.


Subject(s)
Cardiovascular Diseases/etiology , Inflammation/genetics , Leukocytes/physiology , Telomere , Adolescent , C-Reactive Protein/analysis , Female , Humans , Inflammation/blood , Male , Risk Factors
19.
Nucleic Acids Res ; 39(8): e54, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21300641

ABSTRACT

We describe a generic design for ratiometric analysis suitable for determination of the copy number variation (CNV) class of a gene. Following two initial sequence-specific PCR priming cycles, both ends of both amplicons (one test and one reference) in a duplex reaction are all primed by the same universal primer (UP). Following each amplification denaturation step, the UP target and its reverse complement (UP') in each strand form a hairpin. The bases immediately beyond the 3'-end of the UP and 5' of UP' are chosen so as not to base pair in the hairpin (otherwise priming is ablated). This hairpin creates a single constant environment for priming events and chaperones free 3'-ends of amplicon strands. The resultant 'amplification ratio control system' (ARCS) permits ratiometric representation of amplicons relative to the original template into PCR plateau phase. These advantages circumvent the need for real-time PCR for quantitation. Choice of different %(G+C) content for the target and reference amplicons allows liquid phase thermal melt discrimination and quantitation of amplicons. The design is generic, simple to set up and economical. Comparisons with real-time PCR and other techniques are made, and CNV assays are demonstrated for the haptoglobin duplicon and the 'chemokine (C-C motif) ligand 3-like 1' gene.


Subject(s)
Gene Dosage , Polymerase Chain Reaction/methods , Base Pairing , Chemokines, CC/genetics , DNA Copy Number Variations , DNA Primers/chemistry , Genotype , Haptoglobins/genetics , Nucleic Acid Denaturation
20.
Laterality ; 18(2): 251-61, 2013.
Article in English | MEDLINE | ID: mdl-22721421

ABSTRACT

A recent report found that left-handed adolescents were more than three times more likely to have an apolipoprotein E (APOE) ε2 allele. This study was unable to replicate this association in young adults (N=166). A meta-analysis of nine other datasets (N=360 to 7559, power > 0.999), including that of the National Alzheimer's Coordinating Center, also failed to find an over-representation of ε2 among left-handers, indicating that this earlier outcome was most likely a statistical artefact.


Subject(s)
Alleles , Apolipoproteins E/genetics , Functional Laterality/genetics , Adolescent , Female , Genetic Association Studies , Genotype , Humans , Male , Young Adult