Results 1 - 20 of 78
1.
Cell ; 167(5): 1415-1429.e19, 2016 11 17.
Article in English | MEDLINE | ID: mdl-27863252

ABSTRACT

Many common variants have been associated with hematological traits, but identification of causal genes and pathways has proven challenging. We performed a genome-wide association analysis in the UK Biobank and INTERVAL studies, testing 29.5 million genetic variants for association with 36 red cell, white cell, and platelet properties in 173,480 European-ancestry participants. This effort yielded hundreds of low frequency (<5%) and rare (<1%) variants with a strong impact on blood cell phenotypes. Our data highlight general properties of the allelic architecture of complex traits, including the proportion of the heritable component of each blood trait explained by the polygenic signal across different genome regulatory domains. Finally, through Mendelian randomization, we provide evidence of shared genetic pathways linking blood cell indices with complex pathologies, including autoimmune diseases, schizophrenia, and coronary heart disease and evidence suggesting previously reported population associations between blood cell indices and cardiovascular disease may be non-causal.


Subject(s)
Genetic Variation , Genome-Wide Association Study , Hematopoietic Stem Cells/metabolism , Immune System Diseases/genetics , Alleles , Cell Differentiation , Genetic Predisposition to Disease , Hematopoietic Stem Cells/pathology , Humans , Immune System Diseases/pathology , Polymorphism, Single Nucleotide , Quantitative Trait Loci , White People/genetics
2.
Nature ; 2024 May 20.
Article in English | MEDLINE | ID: mdl-38768635

ABSTRACT

Rare coding variants that significantly impact function provide insights into the biology of a gene. However, ascertaining their frequency requires large sample sizes. Here, we present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. 23% of the Regeneron Genetics Center Million Exome data (RGC-ME) comes from non-European individuals of African, East Asian, Indigenous American, Middle Eastern, and South Asian ancestry. This catalogue includes over 10.4 million missense and 1.1 million predicted loss-of-function (pLOF) variants. We identify individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which have not been previously reported. From precise quantitative estimates of selection against heterozygous loss-of-function, we identify 3,988 loss-of-function intolerant genes, including 86 that were previously assessed as tolerant and 1,153 lacking established disease annotation. We also define regions of missense depletion at high resolution. Notably, 1,482 genes have regions depleted of missense variants despite being tolerant to pLOF variants. Finally, we estimate that 3% of individuals have a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, we make this important resource of coding variation from the RGC-ME accessible via a public variant allele frequency browser.

3.
Nature ; 622(7984): 784-793, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37821707

ABSTRACT

The Mexico City Prospective Study is a prospective cohort of more than 150,000 adults recruited two decades ago from the urban districts of Coyoacán and Iztapalapa in Mexico City. Here we generated genotype and exome-sequencing data for all individuals and whole-genome sequencing data for 9,950 selected individuals. We describe high levels of relatedness and substantial heterogeneity in ancestry composition across individuals. Most sequenced individuals had admixed Indigenous American, European and African ancestry, with extensive admixture from Indigenous populations in central, southern and southeastern Mexico. Indigenous Mexican segments of the genome had lower levels of coding variation but an excess of homozygous loss-of-function variants compared with segments of African and European origin. We estimated ancestry-specific allele frequencies at 142 million genomic variants, with an effective sample size of 91,856 for Indigenous Mexican ancestry at exome variants, all available through a public browser. Using whole-genome sequencing, we developed an imputation reference panel that outperforms existing panels at common variants in individuals with high proportions of central, southern and southeastern Indigenous Mexican ancestry. Our work illustrates the value of genetic studies in diverse populations and provides foundational imputation and allele frequency resources for future genetic studies in Mexico and in the United States, where the Hispanic/Latino population is predominantly of Mexican descent.


Subject(s)
Exome Sequencing , Genome, Human , Genotype , Hispanic or Latino , Adult , Humans , Africa/ethnology , Americas/ethnology , Europe/ethnology , Gene Frequency/genetics , Genetics, Population , Genome, Human/genetics , Genotyping Techniques , Hispanic or Latino/genetics , Homozygote , Loss of Function Mutation/genetics , Mexico , Prospective Studies
4.
Nature ; 612(7939): 301-309, 2022 12.
Article in English | MEDLINE | ID: mdl-36450978

ABSTRACT

Clonal haematopoiesis involves the expansion of certain blood cell lineages and has been associated with ageing and adverse health outcomes. Here we use exome sequence data on 628,388 individuals to identify 40,208 carriers of clonal haematopoiesis of indeterminate potential (CHIP). Using genome-wide and exome-wide association analyses, we identify 24 loci (21 of which are novel) where germline genetic variation influences predisposition to CHIP, including missense variants in the lymphocytic antigen coding gene LY75, which are associated with reduced incidence of CHIP. We also identify novel rare variant associations with clonal haematopoiesis and telomere length. Analysis of 5,041 health traits from the UK Biobank (UKB) found relationships between CHIP and severe COVID-19 outcomes, cardiovascular disease, haematologic traits, malignancy, smoking, obesity, infection and all-cause mortality. Longitudinal and Mendelian randomization analyses revealed that CHIP is associated with solid cancers, including non-melanoma skin cancer and lung cancer, and that CHIP linked to DNMT3A is associated with the subsequent development of myeloid but not lymphoid leukaemias. Additionally, contrary to previous findings from the initial 50,000 UKB exomes, our results in the full sample do not support a role for IL-6 inhibition in reducing the risk of cardiovascular disease among CHIP carriers. Our findings demonstrate that CHIP represents a complex set of heterogeneous phenotypes with shared and unique germline genetic causes and varied clinical implications.


Subject(s)
COVID-19 , Cardiovascular Diseases , Humans , Clonal Hematopoiesis/genetics , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/genetics
5.
Nature ; 599(7886): 628-634, 2021 11.
Article in English | MEDLINE | ID: mdl-34662886

ABSTRACT

A major goal in human genetics is to use natural variation to understand the phenotypic consequences of altering each protein-coding gene in the genome. Here we used exome sequencing to explore protein-altering variants and their consequences in 454,787 participants in the UK Biobank study. We identified 12 million coding variants, including around 1 million loss-of-function and around 1.8 million deleterious missense variants. When these were tested for association with 3,994 health-related traits, we found 564 genes with trait associations at P ≤ 2.18 × 10^-11. Rare variant associations were enriched in loci from genome-wide association studies (GWAS), but most (91%) were independent of common variant signals. We discovered several risk-increasing associations with traits related to liver disease, eye disease and cancer, among others, as well as risk-lowering associations for hypertension (SLC9A3R2), diabetes (MAP3K15, FAM234A) and asthma (SLC27A3). Six genes were associated with brain imaging phenotypes, including two involved in neural development (GBE1, PLD1). Of the signals available and powered for replication in an independent cohort, 81% were confirmed; furthermore, association signals were generally consistent across individuals of European, Asian and African ancestry. We illustrate the ability of exome sequencing to identify gene-trait associations, elucidate gene function and pinpoint effector genes that underlie GWAS signals at scale.


Subject(s)
Biological Specimen Banks , Databases, Genetic , Exome Sequencing , Exome/genetics , Africa/ethnology , Asia/ethnology , Asthma/genetics , Diabetes Mellitus/genetics , Europe/ethnology , Eye Diseases/genetics , Female , Genetic Predisposition to Disease/genetics , Genetic Variation , Genome-Wide Association Study , Humans , Hypertension/genetics , Liver Diseases/genetics , Male , Mutation , Neoplasms/genetics , Quantitative Trait, Heritable , United Kingdom
6.
Nature ; 586(7831): 749-756, 2020 10.
Article in English | MEDLINE | ID: mdl-33087929

ABSTRACT

The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world. Here we describe the release of exome-sequence data for the first 49,960 study participants, revealing approximately 4 million coding variants (of which around 98.6% have a frequency of less than 1%). The data include 198,269 autosomal predicted loss-of-function (LOF) variants, a more than 14-fold increase compared to the imputed sequence. Nearly all genes (more than 97%) had at least one carrier with a LOF variant, and most genes (more than 69%) had at least ten carriers with a LOF variant. We illustrate the power of characterizing LOF variants in this population through association analyses across 1,730 phenotypes. In addition to replicating established associations, we found novel LOF variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical importance, and show that 2% of this population has a medically actionable variant. Furthermore, we characterize the penetrance of cancer in carriers of pathogenic BRCA1 and BRCA2 variants. Exome sequences from the first 49,960 participants highlight the promise of genome sequencing in large population-based studies and are now accessible to the scientific community.


Subject(s)
Databases, Genetic , Exome Sequencing , Exome/genetics , Loss of Function Mutation/genetics , Phenotype , Aged , Bone Density/genetics , Collagen Type VI/genetics , Demography , Female , Genes, BRCA1 , Genes, BRCA2 , Genotype , Humans , Ion Channels/genetics , Male , Middle Aged , Neoplasms/genetics , Penetrance , Peptide Fragments/genetics , United Kingdom , Varicose Veins/genetics , ras GTPase-Activating Proteins/genetics
7.
N Engl J Med ; 387(4): 332-344, 2022 07 28.
Article in English | MEDLINE | ID: mdl-35939579

ABSTRACT

BACKGROUND: Exome sequencing in hundreds of thousands of persons may enable the identification of rare protein-coding genetic variants associated with protection from human diseases like liver cirrhosis, providing a strategy for the discovery of new therapeutic targets. METHODS: We performed a multistage exome sequencing and genetic association analysis to identify genes in which rare protein-coding variants were associated with liver phenotypes. We conducted in vitro experiments to further characterize associations. RESULTS: The multistage analysis involved 542,904 persons with available data on liver aminotransferase levels, 24,944 patients with various types of liver disease, and 490,636 controls without liver disease. We found that rare coding variants in APOB, ABCB4, SLC30A10, and TM6SF2 were associated with increased aminotransferase levels and an increased risk of liver disease. We also found that variants in CIDEB, which encodes a structural protein found in hepatic lipid droplets, had a protective effect. The burden of rare predicted loss-of-function variants plus missense variants in CIDEB (combined carrier frequency, 0.7%) was associated with decreased alanine aminotransferase levels (beta per allele, -1.24 U per liter; 95% confidence interval [CI], -1.66 to -0.83; P = 4.8×10^-9) and with 33% lower odds of liver disease of any cause (odds ratio per allele, 0.67; 95% CI, 0.57 to 0.79; P = 9.9×10^-7). Rare coding variants in CIDEB were associated with a decreased risk of liver disease across different underlying causes and different degrees of severity, including cirrhosis of any cause (odds ratio per allele, 0.50; 95% CI, 0.36 to 0.70). Among 3599 patients who had undergone bariatric surgery, rare coding variants in CIDEB were associated with a decreased nonalcoholic fatty liver disease activity score (beta per allele in score units, -0.98; 95% CI, -1.54 to -0.41 [scores range from 0 to 8, with higher scores indicating more severe disease]). In human hepatoma cell lines challenged with oleate, CIDEB small interfering RNA knockdown prevented the buildup of large lipid droplets. CONCLUSIONS: Rare germline mutations in CIDEB conferred substantial protection from liver disease. (Funded by Regeneron Pharmaceuticals.)


Subject(s)
Apoptosis Regulatory Proteins , Germ-Line Mutation , Liver Diseases , Apoptosis Regulatory Proteins/genetics , Apoptosis Regulatory Proteins/metabolism , Genetic Predisposition to Disease/genetics , Genetic Predisposition to Disease/prevention & control , Humans , Liver/metabolism , Liver Diseases/genetics , Liver Diseases/metabolism , Liver Diseases/prevention & control , Transaminases/genetics , Exome Sequencing
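
Note on record 7: the abstract reports an odds ratio of 0.67 (95% CI, 0.57 to 0.79) as "33% lower odds". The sketch below shows that conversion and, as a rough consistency check, back-calculates an approximate two-sided P value from the confidence interval. It assumes a normal approximation on the log-odds scale and a z of 1.96 for the 95% interval; the function names are hypothetical, and this is not the authors' analysis. Because the published interval is rounded to two decimals, only the order of magnitude of the P value should be expected to agree with the reported one.

```python
import math


def percent_lower_odds(odds_ratio):
    """Convert an odds ratio below 1 into 'percent lower odds'."""
    return (1.0 - odds_ratio) * 100.0


def approx_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate two-sided P value from an odds ratio and its 95% CI.

    Assumption: the estimate is normally distributed on the log scale,
    so the CI half-width on that scale equals z_crit standard errors.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    print(round(percent_lower_odds(0.67)))        # percent lower odds
    print(approx_p_from_or_ci(0.67, 0.57, 0.79))  # order of 1e-6
```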
9.
Nature ; 562(7726): 210-216, 2018 10.
Article in English | MEDLINE | ID: mdl-30305740

ABSTRACT

The genetic architecture of brain structure and function is largely unknown. To investigate this, we carried out genome-wide association studies of 3,144 functional and structural brain imaging phenotypes from UK Biobank (discovery dataset 8,428 subjects). Here we show that many of these phenotypes are heritable. We identify 148 clusters of associations between single nucleotide polymorphisms and imaging phenotypes that replicate at P < 0.05, when we would expect 21 to replicate by chance. Notable significant, interpretable associations include: iron transport and storage genes, related to magnetic susceptibility of subcortical brain tissue; extracellular matrix and epidermal growth factor genes, associated with white matter micro-structure and lesions; genes that regulate mid-line axon development, associated with organization of the pontine crossing tract; and overall 17 genes involved in development, pathway signalling and plasticity. Our results provide insights into the genetic architecture of the brain that are relevant to neurological and psychiatric disorders, brain development and ageing.


Subject(s)
Biological Specimen Banks , Brain/diagnostic imaging , Genome-Wide Association Study , Heredity , Neuroimaging , Phenotype , Polymorphism, Single Nucleotide/genetics , Aging/genetics , Brain/anatomy & histology , Brain/growth & development , Brain/pathology , Datasets as Topic , Epidermal Growth Factor/genetics , Extracellular Matrix , Female , Humans , Iron/metabolism , Male , Neuronal Plasticity/genetics , Putamen/anatomy & histology , Putamen/metabolism , Signal Transduction/genetics , United Kingdom , White Matter/anatomy & histology , White Matter/metabolism , White Matter/pathology
10.
Nature ; 562(7726): 203-209, 2018 10.
Article in English | MEDLINE | ID: mdl-30305743

ABSTRACT

The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.


Subject(s)
Databases, Factual , Genomics , Phenotype , Adult , Aged , Alleles , Biomarkers/blood , Biomarkers/urine , Body Height/genetics , Brain/diagnostic imaging , Cohort Studies , Databases, Genetic , Electronic Health Records , Family , Female , Genome-Wide Association Study , Haplotypes/genetics , Humans , Life Style , Major Histocompatibility Complex/genetics , Male , Middle Aged , Quality Control , Racial Groups/genetics , United Kingdom
12.
Proc Natl Acad Sci U S A ; 118(40), 2021 10 05.
Article in English | MEDLINE | ID: mdl-34580220

ABSTRACT

We present a comprehensive statistical framework to analyze data from genome-wide association studies of polygenic traits, producing interpretable findings while controlling the false discovery rate. In contrast with standard approaches, our method can leverage sophisticated multivariate algorithms but makes no parametric assumptions about the unknown relation between genotypes and phenotype. Instead, we recognize that genotypes can be considered as a random sample from an appropriate model, encapsulating our knowledge of genetic inheritance and human populations. This allows the generation of imperfect copies (knockoffs) of these variables that serve as ideal negative controls, correcting for linkage disequilibrium and accounting for unknown population structure, which may be due to diverse ancestries or familial relatedness. The validity and effectiveness of our method are demonstrated by extensive simulations and by applications to the UK Biobank data. These analyses confirm our method is powerful relative to state-of-the-art alternatives, while comparisons with other studies validate most of our discoveries. Finally, fast software is made available for researchers to analyze Biobank-scale datasets.


Subject(s)
Genome, Human/genetics , Algorithms , Genome-Wide Association Study/methods , Genotype , Humans , Linkage Disequilibrium/genetics , Multifactorial Inheritance/genetics , Phenotype , Software
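
Note on record 12: the abstract describes using knockoffs as negative controls to control the false discovery rate. A minimal sketch of the standard "knockoff+" selection rule from the knockoff-filter literature is given below, assuming each variable j already has a statistic W_j that tends to be large and positive when the real variable looks more important than its knockoff copy. How the knockoffs and the W_j are constructed is the substance of the paper and is not reproduced here; the function names and the example values are hypothetical.

```python
def knockoff_plus_threshold(w, q=0.1):
    """Smallest t > 0 with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q.

    Returns infinity when no threshold meets the target FDR level q.
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)  # estimated false positives
        positives = sum(1 for x in w if x >= t)   # candidate discoveries
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_select(w, q=0.1):
    """Indices of variables whose statistic reaches the threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


if __name__ == "__main__":
    # Hypothetical statistics for ten variables.
    w = [5.0, 4.0, 3.0, 2.5, 2.0, 1.5, -1.0, 0.5, -0.3, 0.1]
    print(knockoff_select(w, q=0.2))
```

The "+1" in the numerator is what makes the rule conservative: with few variables and a strict q, it can legitimately select nothing.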
13.
Am J Hum Genet ; 107(4): 698-713, 2020 10 01.
Article in English | MEDLINE | ID: mdl-32888427

ABSTRACT

The contribution of gene-by-environment (GxE) interactions for many human traits and diseases is poorly characterized. We propose a Bayesian whole-genome regression model for joint modeling of main genetic effects and GxE interactions in large-scale datasets, such as the UK Biobank, where many environmental variables have been measured. The method is called LEMMA (Linear Environment Mixed Model Analysis) and estimates a linear combination of environmental variables, called an environmental score (ES), that interacts with genetic markers throughout the genome. The ES provides a readily interpretable way to examine the combined effect of many environmental variables. The ES can be used both to estimate the proportion of phenotypic variance attributable to GxE effects and to test for GxE effects at genetic variants across the genome. GxE effects can induce heteroskedasticity in quantitative traits, and LEMMA accounts for this by using robust standard error estimates when testing for GxE effects. When applied to body mass index, systolic blood pressure, diastolic blood pressure, and pulse pressure in the UK Biobank, we estimate that 9.3%, 3.9%, 1.6%, and 12.5%, respectively, of phenotypic variance is explained by GxE interactions and that low-frequency variants explain most of this variance. We also identify three loci that interact with the estimated environmental scores (-log10 p > 7.3).


Subject(s)
Gene-Environment Interaction , Genome, Human , Models, Statistical , Quantitative Trait Loci , Quantitative Trait, Heritable , Bayes Theorem , Blood Pressure/physiology , Body Mass Index , Datasets as Topic , Genetic Markers , Humans , United Kingdom
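
Note on record 13: the significance cut-off in the abstract is stated on the -log10 scale. The short sketch below converts between the two scales and shows that a -log10 P value of 7.3 corresponds to roughly the conventional genome-wide threshold of 5 × 10^-8. The function names are illustrative only.

```python
import math


def neg_log10_to_p(neg_log10_p):
    """Convert a -log10(P) value back to a P value."""
    return 10.0 ** (-neg_log10_p)


def p_to_neg_log10(p):
    """Convert a P value to the -log10 scale used in Manhattan plots."""
    return -math.log10(p)


if __name__ == "__main__":
    print(neg_log10_to_p(7.3))   # close to 5e-8
    print(p_to_neg_log10(5e-8))  # close to 7.3
```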
14.
PLoS Genet ; 16(11): e1009049, 2020 11.
Article in English | MEDLINE | ID: mdl-33196638

ABSTRACT

Genotype imputation is the process of predicting unobserved genotypes in a sample of individuals using a reference panel of haplotypes. In the last 10 years reference panels have increased in size by more than 100-fold. Increasing reference panel size improves accuracy of markers with low minor allele frequencies but poses ever-increasing computational challenges for imputation methods. Here we present IMPUTE5, a genotype imputation method that can scale to reference panels with millions of samples. This method continues to refine the observation made in the IMPUTE2 method, that accuracy is optimized via use of a custom subset of haplotypes when imputing each individual. It achieves fast, accurate, and memory-efficient imputation by selecting haplotypes using the Positional Burrows-Wheeler Transform (PBWT). By using the PBWT data structure at genotyped markers, IMPUTE5 identifies locally best-matching haplotypes and long identical-by-state segments. The method then uses the selected haplotypes as conditioning states within the IMPUTE model. Using the HRC reference panel, which has ∼65,000 haplotypes, we show that IMPUTE5 is up to 30x faster than MINIMAC4 and up to 3x faster than BEAGLE5.1, and uses less memory than both these methods. Using simulated reference panels we show that IMPUTE5 scales sub-linearly with reference panel size. For example, keeping the number of imputed markers constant, increasing the reference panel size from 10,000 to 1 million haplotypes requires less than twice the computation time. As the reference panel increases in size IMPUTE5 is able to utilize a smaller number of reference haplotypes, thus reducing computational cost.


Subject(s)
Computational Biology/methods , Genome-Wide Association Study/methods , Haplotypes/genetics , Alleles , Forecasting/methods , Gene Frequency/genetics , Genotype , Humans , Models, Theoretical , Polymorphism, Single Nucleotide/genetics
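
Note on record 14: the abstract relies on the Positional Burrows-Wheeler Transform to find locally best-matching haplotypes. The sketch below illustrates only the core idea of the PBWT, namely that a stable partition by allele at each site keeps haplotypes sorted by their reversed prefixes, so haplotypes sharing a long match ending at a site sit next to each other in the ordering. It is a simplified illustration, not the IMPUTE5 implementation: it omits the divergence array used in practice to measure match lengths, it assumes biallelic 0/1 haplotypes with no missing data, and the function names and toy panel are hypothetical.

```python
def pbwt_orders(haps):
    """Return the positional prefix ordering before site 0 and after each site.

    haps is a list of equal-length lists of 0/1 alleles. orders[k] lists
    haplotype indices sorted by their reversed prefix over sites 0..k-1.
    """
    m = len(haps)
    n = len(haps[0]) if m else 0
    order = list(range(m))
    orders = [order]
    for k in range(n):
        # Stable partition: allele-0 haplotypes first, then allele-1,
        # each group keeping the order inherited from the previous site.
        zeros = [i for i in order if haps[i][k] == 0]
        ones = [i for i in order if haps[i][k] == 1]
        order = zeros + ones
        orders.append(order)
    return orders


def neighbours(order, target, width=1):
    """Haplotypes within `width` positions of `target` in one ordering."""
    pos = order.index(target)
    lo = max(0, pos - width)
    hi = min(len(order), pos + width + 1)
    return [h for h in order[lo:hi] if h != target]


if __name__ == "__main__":
    panel = [
        [0, 1, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 1, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],  # identical to haplotype 0
    ]
    final_order = pbwt_orders(panel)[-1]
    print(final_order)
    print(neighbours(final_order, 4))
```

In the toy panel, haplotype 4 ends up adjacent to haplotype 0 (an exact copy) and haplotype 1 (which matches it over the last three sites), which is the behaviour a haplotype-selection step would exploit.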
15.
Bioinformatics ; 36(24): 5632-5639, 2021 Apr 05.
Article in English | MEDLINE | ID: mdl-33367483

ABSTRACT

MOTIVATION: Gene-environment (GxE) interactions are one of the least studied aspects of the genetic architecture of human traits and diseases. The environment of an individual is inherently high dimensional, evolves through time and can be expensive and time consuming to measure. The UK Biobank study, with all 500 000 participants having undergone an extensive baseline questionnaire, represents a unique opportunity to assess GxE heritability for many traits and diseases in a well powered setting. RESULTS: We have developed a randomized Haseman-Elston non-linear regression method applicable when many environmental variables have been measured on each individual. The method (GPLEMMA) simultaneously estimates a linear environmental score (ES) and its GxE heritability. We compare the method via simulation to a whole-genome regression approach (LEMMA) for estimating GxE heritability. We show that GPLEMMA is more computationally efficient than LEMMA on large datasets, and produces results highly correlated with those from LEMMA when applied to simulated data and real data from the UK Biobank. AVAILABILITY AND IMPLEMENTATION: Software implementing the GPLEMMA method is available from https://jmarchini.org/gplemma/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

16.
Nature ; 526(7571): 68-74, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26432245

ABSTRACT

The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.


Subject(s)
Genetic Variation/genetics , Genetics, Population/standards , Genome, Human/genetics , Genomics/standards , Internationality , Datasets as Topic , Demography , Disease Susceptibility , Exome/genetics , Genetics, Medical , Genome-Wide Association Study , Genotype , Haplotypes/genetics , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation/genetics , Physical Chromosome Mapping , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Rare Diseases/genetics , Reference Standards , Sequence Analysis, DNA
17.
Stroke ; 51(7): 2111-2121, 2020 07.
Article in English | MEDLINE | ID: mdl-32517579

ABSTRACT

BACKGROUND AND PURPOSE: Periventricular white matter hyperintensities (WMH; PVWMH) and deep WMH (DWMH) are regional classifications of WMH and reflect proposed differences in cause. In the first such study to date, we undertook genome-wide association analyses of DWMH and PVWMH to show that these phenotypes have different genetic underpinnings. METHODS: Participants were aged 45 years and older, free of stroke and dementia. We conducted genome-wide association analyses of PVWMH and DWMH in 26,654 participants from CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology), ENIGMA (Enhancing Neuro-Imaging Genetics Through Meta-Analysis), and the UKB (UK Biobank). Regional correlations were investigated using the genome-wide association analyses-pairwise method. Cross-trait genetic correlations between PVWMH, DWMH, stroke, and dementia were estimated using LDSC. RESULTS: In the discovery and replication analysis, for PVWMH only, we found associations on chromosomes 2 (NBEAL), 10q23.1 (TSPAN14/FAM231A), and 10q24.33 (SH3PXD2A). In the much larger combined meta-analysis of all cohorts, we identified ten significant regions for PVWMH: chromosomes 2 (3 regions), 6, 7, 10 (2 regions), 13, 16, and 17q23.1. New loci of interest include 7q36.1 (NOS3) and 16q24.2. In both the discovery/replication and combined analysis, we found genome-wide significant associations for the 17q25.1 locus for both DWMH and PVWMH. Using gene-based association analysis, 19 genes across all regions were identified for PVWMH only, including the new genes: CALCRL (2q32.1), KLHL24 (3q27.1), VCAN (5q27.1), and POLR2F (22q13.1). Thirteen genes in the 17q25.1 locus were significant for both phenotypes. More extensive genetic correlations were observed for PVWMH with small vessel ischemic stroke. There were no associations with dementia for either phenotype. CONCLUSIONS: Our study confirms these phenotypes have distinct and also shared genetic architectures. Genetic analyses indicated PVWMH was more associated with ischemic stroke whilst DWMH loci were implicated in vascular, astrocyte, and neuronal function. Our study confirms these phenotypes are distinct neuroimaging classifications and identifies new candidate genes associated with PVWMH only.


Subject(s)
Brain/pathology , Cerebral Small Vessel Diseases/genetics , Cerebral Small Vessel Diseases/pathology , Genetic Predisposition to Disease/genetics , White Matter/pathology , Aged , Brain/diagnostic imaging , Cerebral Small Vessel Diseases/diagnostic imaging , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , White Matter/diagnostic imaging
18.
Brain ; 142(10): 2938-2947, 2019 10 01.
Article in English | MEDLINE | ID: mdl-31504236

ABSTRACT

Ninety per cent of the human population has been right-handed since the Paleolithic, yet the brain signature and genetic basis of handedness remain poorly characterized. Here, we correlated brain imaging phenotypes from ∼9000 UK Biobank participants with handedness, and with loci found significantly associated with handedness after we performed genome-wide association studies (GWAS) in ∼400 000 of these participants. Our imaging-handedness analysis revealed an increase in functional connectivity between left and right language networks in left-handers. GWAS of handedness uncovered four significant loci (rs199512, rs45608532, rs13017199, and rs3094128), three of which are in, or expression quantitative trait loci of, genes encoding proteins involved in brain development and patterning. These included microtubule-related MAP2 and MAPT, as well as WNT3 and MICB, all implicated in the pathogenesis of diseases such as Parkinson's, Alzheimer's and schizophrenia. In particular, with rs199512, we identified a common genetic influence on handedness, psychiatric phenotypes, Parkinson's disease, and the integrity of white matter tracts connecting the same language-related regions identified in the handedness-imaging analysis. This study has identified in the general population genome-wide significant loci for human handedness in, and expression quantitative trait loci of, genes associated with brain development, microtubules and patterning. We suggest that these genetic variants contribute to neurodevelopmental lateralization of brain organization, which in turn influences both the handedness phenotype and the predisposition to develop certain neurological and psychiatric diseases.


Subject(s)
Functional Laterality/genetics , Mental Disorders/diagnostic imaging , Mental Disorders/genetics , Adult , Brain/physiology , Brain Mapping/methods , Female , Functional Laterality/physiology , Genome-Wide Association Study , Humans , Language , Magnetic Resonance Imaging/methods , Male , Microtubules/genetics , Neuroimaging/methods , Parkinson Disease/genetics , Phenotype , White Matter/diagnostic imaging
19.
Bioinformatics ; 32(13): 1974-80, 2016 07 01.
Article in English | MEDLINE | ID: mdl-27153703

ABSTRACT

MOTIVATION: There is growing recognition that estimating haplotypes from high coverage sequencing of single samples in clinical settings is an important problem. At the same time very large datasets consisting of tens and hundreds of thousands of high-coverage sequenced samples will soon be available. We describe a method that takes advantage of these huge human genetic variation resources and rare variant sharing patterns to estimate haplotypes on single sequenced samples. Sharing rare variants between two individuals is more likely to arise from a recent common ancestor and, hence, also more likely to indicate similar shared haplotypes over a substantial flanking region of sequence. RESULTS: Our method exploits this idea to select a small set of highly informative copying states within a Hidden Markov Model (HMM) phasing algorithm. Using rare variants in this way allows us to avoid iterative MCMC methods to infer haplotypes. Compared to other approaches that do not explicitly use rare variants we obtain significant gains in phasing accuracy, less variation over phasing runs and improvements in speed. For example, using a reference panel of 7420 haplotypes from the UK10K project, we are able to reduce switch error rates by up to 50% when phasing samples sequenced at high-coverage. In addition, a single step rephasing of the UK10K panel, using rare variant information, has a downstream impact on phasing performance. These results represent a proof of concept that rare variant sharing patterns can be utilized to phase large high-coverage sequencing studies such as the 100 000 Genomes Project dataset. AVAILABILITY AND IMPLEMENTATION: A webserver that includes an implementation of this new method and allows phasing of high-coverage clinical samples is available at https://phasingserver.stats.ox.ac.uk/. CONTACT: marchini@stats.ox.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Genetic Variation , Haplotypes , Algorithms , Alleles , Genotype , Humans
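
Note on record 19: the abstract measures phasing accuracy by the switch error rate. A minimal sketch of one common way to compute it for a single individual is shown below: restrict to heterozygous sites, record at each one whether the first estimated haplotype carries the same allele as the first true haplotype, and count how often that indicator flips between consecutive heterozygous sites. The sketch assumes the estimated genotypes agree with the true genotypes at every site and that there are no missing data; the function name and example haplotypes are hypothetical and are not taken from the paper.

```python
def switch_error_rate(true_h1, true_h2, est_h1, est_h2):
    """Fraction of consecutive heterozygous site pairs with a phase switch."""
    # Phase is only defined at sites where the two true haplotypes differ.
    het_sites = [i for i in range(len(true_h1)) if true_h1[i] != true_h2[i]]
    if len(het_sites) < 2:
        return 0.0
    # True where the first estimated haplotype matches the first true one.
    # est_h2 is implied at het sites when the genotypes agree (assumption).
    in_phase = [est_h1[i] == true_h1[i] for i in het_sites]
    switches = sum(1 for a, b in zip(in_phase, in_phase[1:]) if a != b)
    return switches / (len(het_sites) - 1)


if __name__ == "__main__":
    truth_a = [0, 1, 0, 1, 0, 0]
    truth_b = [1, 0, 1, 0, 0, 1]
    # Estimated phase is correct for the first two het sites, then flips once.
    est_a = [0, 1, 1, 0, 0, 1]
    est_b = [1, 0, 0, 1, 0, 0]
    print(switch_error_rate(truth_a, truth_b, est_a, est_b))
```

Swapping the labels of the two estimated haplotypes leaves the result unchanged, which is the intended behaviour, since haplotype order within an individual is arbitrary.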