Results 1 - 15 of 15
1.
Nature ; 631(8021): 583-592, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38768635

ABSTRACT

Rare coding variants that substantially affect function provide insights into the biology of a gene [1-3]. However, ascertaining the frequency of such variants requires large sample sizes [4-8]. Here we present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. In total, 23% of the Regeneron Genetics Center Million Exome (RGC-ME) data come from individuals of African, East Asian, Indigenous American, Middle Eastern and South Asian ancestry. The catalogue includes more than 10.4 million missense and 1.1 million predicted loss-of-function (pLOF) variants. We identify individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which have not been previously reported. From precise quantitative estimates of selection against heterozygous loss of function (LOF), we identify 3,988 LOF-intolerant genes, including 86 that were previously assessed as tolerant and 1,153 that lack established disease annotation. We also define regions of missense depletion at high resolution. Notably, 1,482 genes have regions that are depleted of missense variants despite being tolerant of pLOF variants. Finally, we estimate that 3% of individuals have a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, we make this resource of coding variation from the RGC-ME dataset publicly accessible through a variant allele frequency browser.


Subject(s)
Genetic Variation, Humans, Genetic Variation/genetics, Exome Sequencing, Loss of Function Mutation/genetics, Exome/genetics, Heterozygote, Missense Mutation/genetics, Gene Frequency, Alleles, Open Reading Frames/genetics
2.
Nature ; 599(7886): 628-634, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34662886

ABSTRACT

A major goal in human genetics is to use natural variation to understand the phenotypic consequences of altering each protein-coding gene in the genome. Here we used exome sequencing [1] to explore protein-altering variants and their consequences in 454,787 participants in the UK Biobank study [2]. We identified 12 million coding variants, including around 1 million loss-of-function and around 1.8 million deleterious missense variants. When these were tested for association with 3,994 health-related traits, we found 564 genes with trait associations at P ≤ 2.18 × 10⁻¹¹. Rare variant associations were enriched in loci from genome-wide association studies (GWAS), but most (91%) were independent of common variant signals. We discovered several risk-increasing associations with traits related to liver disease, eye disease and cancer, among others, as well as risk-lowering associations for hypertension (SLC9A3R2), diabetes (MAP3K15, FAM234A) and asthma (SLC27A3). Six genes were associated with brain imaging phenotypes, including two involved in neural development (GBE1, PLD1). Of the signals available and powered for replication in an independent cohort, 81% were confirmed; furthermore, association signals were generally consistent across individuals of European, Asian and African ancestry. We illustrate the ability of exome sequencing to identify gene-trait associations, elucidate gene function and pinpoint effector genes that underlie GWAS signals at scale.


Subject(s)
Biological Specimen Banks, Genetic Databases, Exome Sequencing, Exome/genetics, Africa/ethnology, Asia/ethnology, Asthma/genetics, Diabetes Mellitus/genetics, Europe/ethnology, Eye Diseases/genetics, Female, Genetic Predisposition to Disease/genetics, Genetic Variation, Genome-Wide Association Study, Humans, Hypertension/genetics, Liver Diseases/genetics, Male, Mutation, Neoplasms/genetics, Heritable Quantitative Trait, United Kingdom
3.
Nature ; 586(7831): 749-756, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33087929

ABSTRACT

The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world [1]. Here we describe the release of exome-sequence data for the first 49,960 study participants, revealing approximately 4 million coding variants (of which around 98.6% have a frequency of less than 1%). The data include 198,269 autosomal predicted loss-of-function (LOF) variants, a more than 14-fold increase compared to the imputed sequence. Nearly all genes (more than 97%) had at least one carrier with a LOF variant, and most genes (more than 69%) had at least ten carriers with a LOF variant. We illustrate the power of characterizing LOF variants in this population through association analyses across 1,730 phenotypes. In addition to replicating established associations, we found novel LOF variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical importance, and show that 2% of this population has a medically actionable variant. Furthermore, we characterize the penetrance of cancer in carriers of pathogenic BRCA1 and BRCA2 variants. Exome sequences from the first 49,960 participants highlight the promise of genome sequencing in large population-based studies and are now accessible to the scientific community.


Subject(s)
Genetic Databases, Exome Sequencing, Exome/genetics, Loss of Function Mutation/genetics, Phenotype, Aged, Bone Density/genetics, Collagen Type VI/genetics, Demography, Female, BRCA1 Genes, BRCA2 Genes, Genotype, Humans, Ion Channels/genetics, Male, Middle Aged, Neoplasms/genetics, Penetrance, Peptide Fragments/genetics, United Kingdom, Varicose Veins/genetics, ras GTPase-Activating Proteins/genetics
4.
Nature ; 570(7759): 71-76, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31118516

ABSTRACT

Protein-coding genetic variants that strongly affect disease risk can yield relevant clues to disease pathogenesis. Here we report exome-sequencing analyses of 20,791 individuals with type 2 diabetes (T2D) and 24,440 non-diabetic control participants from 5 ancestries. We identify gene-level associations of rare variants (with minor allele frequencies of less than 0.5%) in 4 genes at exome-wide significance, including a series of more than 30 SLC30A8 alleles that conveys protection against T2D, and in 12 gene sets, including those corresponding to T2D drug targets (P = 6.1 × 10⁻³) and candidate genes from knockout mice (P = 5.2 × 10⁻³). Within our study, the strongest T2D gene-level signals for rare variants explain at most 25% of the heritability of the strongest common single-variant signals, and the gene-level effect sizes of the rare variants that we observed in established T2D drug targets will require 75,000-185,000 sequenced cases to achieve exome-wide significance. We propose a method to interpret these modest rare-variant associations and to incorporate these associations into future target or gene prioritization efforts.


Subject(s)
Type 2 Diabetes Mellitus/genetics, Exome Sequencing, Exome/genetics, Animals, Case-Control Studies, Decision Support Techniques, Female, Gene Frequency, Genome-Wide Association Study, Humans, Male, Mice, Knockout Mice
5.
Am J Hum Genet ; 108(7): 1350-1355, 2021 Jul 01.
Article in English | MEDLINE | ID: mdl-34115965

ABSTRACT

Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) causes coronavirus disease 2019 (COVID-19), a respiratory illness that can result in hospitalization or death. We used exome sequence data to investigate associations between rare genetic variants and seven COVID-19 outcomes in 586,157 individuals, including 20,952 with COVID-19. After accounting for multiple testing, we did not identify any clear associations with rare variants either exome wide or when specifically focusing on (1) 13 interferon pathway genes in which rare deleterious variants have been reported in individuals with severe COVID-19, (2) 281 genes located in susceptibility loci identified by the COVID-19 Host Genetics Initiative, or (3) 32 additional genes of immunologic relevance and/or therapeutic potential. Our analyses indicate there are no significant associations with rare protein-coding variants with detectable effect sizes at our current sample sizes. Analyses will be updated as additional data become available, and results are publicly available through the Regeneron Genetics Center COVID-19 Results Browser.


Subject(s)
COVID-19/diagnosis, COVID-19/genetics, Exome Sequencing, Exome/genetics, Genetic Predisposition to Disease, Hospitalization/statistics & numerical data, COVID-19/immunology, COVID-19/therapy, Female, Humans, Interferons/genetics, Male, Prognosis, SARS-CoV-2, Sample Size
6.
Genet Epidemiol ; 45(6): 664-681, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34184762

ABSTRACT

Serum alanine aminotransferase (ALT) and aspartate aminotransferase (AST) are biomarkers for liver health. Here we report the largest genome-wide association analysis to date of serum ALT and AST levels in over 388k people of European ancestry from UK biobank and DiscovEHR. Eleven million imputed markers with a minor allele frequency (MAF) ≥ 0.5% were analyzed. Overall, 300 ALT and 336 AST independent genome-wide significant associations were identified. Among them, 81 ALT and 61 AST associations are reported for the first time. Genome-wide interaction study identified 9 ALT and 12 AST independent associations significantly modified by body mass index (BMI), including several previously reported potential liver disease therapeutic targets, for example, PNPLA3, HSD17B13, and MARC1. While further work is necessary to understand the effect of ALT and AST-associated variants on liver disease, the weighted burden of significant BMI-modified signals is significantly associated with liver disease outcomes. In summary, this study identifies genetic associations which offer an important step forward in understanding the genetic architecture of serum ALT and AST levels. Significant interactions between BMI and genetic loci not only highlight the important role of adiposity in liver damage but also shed light on the genetic etiology of liver disease in obese individuals.


Subject(s)
Alanine Transaminase/blood, Aspartate Aminotransferases/blood, Body Mass Index, Genome-Wide Association Study, Humans
7.
Am J Hum Genet ; 102(1): 103-115, 2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29290336

ABSTRACT

Atrial fibrillation (AF) is a common cardiac arrhythmia and a major risk factor for stroke, heart failure, and premature death. The pathogenesis of AF remains poorly understood, which contributes to the current lack of highly effective treatments. To understand the genetic variation and biology underlying AF, we undertook a genome-wide association study (GWAS) of 6,337 AF individuals and 61,607 AF-free individuals from Norway, including replication in an additional 30,679 AF individuals and 278,895 AF-free individuals. Through genotyping and dense imputation mapping from whole-genome sequencing, we tested almost nine million genetic variants across the genome and identified seven risk loci, including two novel loci. One novel locus (lead single-nucleotide variant [SNV] rs12614435; p = 6.76 × 10⁻¹⁸) comprised intronic and several highly correlated missense variants situated in the I-, A-, and M-bands of titin, which is the largest protein in humans and responsible for the passive elasticity of heart and skeletal muscle. The other novel locus (lead SNV rs56202902; p = 1.54 × 10⁻¹¹) covered a large, gene-dense chromosome 1 region that has previously been linked to cardiac conduction. Pathway and functional enrichment analyses suggested that many AF-associated genetic variants act through a mechanism of impaired muscle cell differentiation and tissue formation during fetal heart development.


Subject(s)
Atrial Fibrillation/genetics, Genetic Loci, Genetic Predisposition to Disease, Genome-Wide Association Study, Heart/embryology, Nucleic Acid Regulatory Sequences/genetics, Humans, Inheritance Patterns/genetics, Multifactorial Inheritance/genetics, Organ Specificity/genetics, Physical Chromosome Mapping, Quantitative Trait Loci/genetics, Reproducibility of Results, Risk Factors
8.
Genome Res ; 28(7): 1039-1052, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29773658

ABSTRACT

Current approaches to detect and characterize mosaic chromosomal aneuploidy are limited by sensitivity, efficiency, cost, or the need to culture cells. We describe the mosaic aneuploidy detection by massively parallel sequencing (MAD-seq) capture assay and the MADSEQ analytical approach that allow low (<10%) levels of mosaicism for chromosomal aneuploidy or regional loss of heterozygosity to be detected, assigned to a meiotic or mitotic origin, and quantified as a proportion of the cells in the sample. We show results from a multi-ethnic MAD-seq (meMAD-seq) capture design that works equally well in populations of diverse racial and ethnic origins and how the MADSEQ analytical approach can be applied to exome or whole-genome sequencing data, revealing previously unrecognized aneuploidy or copy number neutral loss of heterozygosity in samples studied by the 1000 Genomes Project, cell lines from public repositories, and one of the Illumina Platinum Genomes samples. We have made the meMAD-seq capture design and MADSEQ analytical software open for unrestricted use, with the goal that they can be applied in clinical samples to allow new insights into the unrecognized prevalence of mosaic chromosomal aneuploidy in humans and its phenotypic associations.


Subject(s)
Chromosomes/genetics, High-Throughput Nucleotide Sequencing/methods, Aneuploidy, Exome/genetics, Female, Genome/genetics, Humans, Male, Mosaicism, Software
9.
bioRxiv ; 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37214792

ABSTRACT

Coding variants that have significant impact on function can provide insights into the biology of a gene but are typically rare in the population. Identifying and ascertaining the frequency of such rare variants requires very large sample sizes. Here, we present the largest catalog of human protein-coding variation to date, derived from exome sequencing of 985,830 individuals of diverse ancestry to serve as a rich resource for studying rare coding variants. Individuals of African, Admixed American, East Asian, Middle Eastern, and South Asian ancestry account for 20% of this Exome dataset. Our catalog of variants includes approximately 10.5 million missense (54% novel) and 1.1 million predicted loss-of-function (pLOF) variants (65% novel, 53% observed only once). We identified individuals with rare homozygous pLOF variants in 4,874 genes, and for 1,838 of these this work is the first to document at least one pLOF homozygote. Additional insights from the RGC-ME dataset include 1) improved estimates of selection against heterozygous loss-of-function and identification of 3,459 genes intolerant to loss-of-function, 83 of which were previously assessed as tolerant to loss-of-function and 1,241 that lack disease annotations; 2) identification of regions depleted of missense variation in 457 genes that are tolerant to loss-of-function; 3) functional interpretation for 10,708 variants of unknown or conflicting significance reported in ClinVar as cryptic splice sites using splicing score thresholds based on empirical variant deleteriousness scores derived from RGC-ME; and 4) an observation that approximately 3% of sequenced individuals carry a clinically actionable genetic variant in the ACMG SF 3.1 list of genes. We make this important resource of coding variation available to the public through a variant allele frequency browser. 
We anticipate that this report and the RGC-ME dataset will serve as a valuable reference for understanding rare coding variation and help advance precision medicine efforts.

10.
Nat Genet ; 54(4): 382-392, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35241825

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) enters human host cells via angiotensin-converting enzyme 2 (ACE2) and causes coronavirus disease 2019 (COVID-19). Here, through a genome-wide association study, we identify a variant (rs190509934, minor allele frequency 0.2-2%) that downregulates ACE2 expression by 37% (P = 2.7 × 10⁻⁸) and reduces the risk of SARS-CoV-2 infection by 40% (odds ratio = 0.60, P = 4.5 × 10⁻¹³), providing human genetic evidence that ACE2 expression levels influence COVID-19 risk. We also replicate the associations of six previously reported risk variants, of which four were further associated with worse outcomes in individuals infected with the virus (in/near LZTFL1, MHC, DPP9 and IFNAR2). Lastly, we show that common variants define a risk score that is strongly associated with severe disease among cases and modestly improves the prediction of disease severity relative to demographic and clinical factors alone.


Subject(s)
COVID-19, Angiotensin-Converting Enzyme 2/genetics, COVID-19/genetics, Genome-Wide Association Study, Humans, Risk Factors, SARS-CoV-2/genetics
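
Note on the arithmetic in the abstract above: the 40% figure corresponds to one minus the reported odds ratio of 0.60. The sketch below shows only that conversion. The function name is hypothetical and does not come from the article, and an odds ratio approximates a risk ratio only when the outcome is uncommon in the population studied.

```python
def percent_reduction_in_odds(odds_ratio: float) -> float:
    """Percent reduction in odds implied by an odds ratio (hypothetical helper)."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    # An odds ratio of 1.0 means no change; values below 1.0 mean lower odds.
    return (1.0 - odds_ratio) * 100.0


reduction = percent_reduction_in_odds(0.60)  # odds ratio quoted in the abstract
print(f"{reduction:.0f}% lower odds")
```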
11.
HGG Adv ; 2(3): 100039, 2021 Jul 08.
Article in English | MEDLINE | ID: mdl-35047837

ABSTRACT

Parent-of-origin (PoO) effects refer to the differential phenotypic impacts of genetic variants dependent on their parental inheritance due to imprinting. While PoO effects can influence complex traits, they may be poorly captured by models that do not differentiate the parental origin of the variant. The aim of this study was to conduct a genome-wide screen for PoO effects on a broad range of clinical traits derived from electronic health records (EHR) in the DiscovEHR study enriched with familial relationships. Using pairwise kinship estimates from genetic data and demographic data, we identified 22,051 offspring among 134,049 individuals in the DiscovEHR study. PoO of ~9 million variants was assigned in the offspring by comparing offspring and parental genotypes and haplotypes. We then performed genome-wide PoO association analyses across 154 quantitative and 611 binary traits extracted from EHR. Of the 732 significant PoO associations identified (p < 5 × 10⁻⁸), we attempted to replicate 274 PoO associations in the UK Biobank study with 5,015 offspring and replicated 9 PoO associations (p < 0.05). In summary, our study implements a bioinformatic and statistical approach to examine PoO effects genome-wide in a large population study enriched with familial relationships and systematically characterizes PoO effects on hundreds of clinical traits derived from EHR. Our results suggest that, while the statistical power to detect PoO effects remains modest, accurately modeling PoO effects has the potential to find new associations that may have been missed by the standard additive model, further enhancing the mechanistic understanding of genetic influence on complex traits.
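
The abstract above says that parent of origin was assigned by comparing offspring and parental genotypes and haplotypes, without giving the procedure. The following is a minimal sketch of the genotype-only part of that idea under stated assumptions: a biallelic variant, a complete mother-father-child trio, and genotypes coded as alternate-allele counts (0, 1, 2). The function name and return labels are hypothetical, and the haplotype-based resolution of ambiguous cases mentioned in the abstract is not attempted here.

```python
from typing import Optional


def alt_allele_origin(child: int, mother: int, father: int) -> Optional[str]:
    """Infer which parent transmitted the alternate allele to a heterozygous child.

    Genotypes are alternate-allele counts (0, 1 or 2). Returns None when the
    child is homozygous, because then both parents transmitted the same allele.
    """
    for genotype in (child, mother, father):
        if genotype not in (0, 1, 2):
            raise ValueError("genotypes must be 0, 1 or 2")
    if child != 1:
        return None

    # A parent can transmit alt if it carries at least one alt allele,
    # and can transmit ref if it carries at least one ref allele.
    maternal_possible = mother >= 1 and father <= 1  # alt from mother, ref from father
    paternal_possible = father >= 1 and mother <= 1  # alt from father, ref from mother

    if maternal_possible and paternal_possible:
        return "ambiguous"  # e.g. both parents heterozygous; would need phasing
    if maternal_possible:
        return "maternal"
    if paternal_possible:
        return "paternal"
    return "mendelian_error"  # no transmission pattern explains the trio


print(alt_allele_origin(child=1, mother=0, father=1))
print(alt_allele_origin(child=1, mother=1, father=1))
```

In this toy scheme, trios in which both parents are heterozygous remain unresolved; that is the kind of case where haplotype information would be needed.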

12.
Nat Genet ; 53(7): 1097-1103, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34017140

ABSTRACT

Genome-wide association analysis of cohorts with thousands of phenotypes is computationally expensive, particularly when accounting for sample relatedness or population structure. Here we present a novel machine-learning method called REGENIE for fitting a whole-genome regression model for quantitative and binary phenotypes that is substantially faster than alternatives in multi-trait analyses while maintaining statistical efficiency. The method naturally accommodates parallel analysis of multiple phenotypes and requires only local segments of the genotype matrix to be loaded in memory, in contrast to existing alternatives, which must load genome-wide matrices into memory. This results in substantial savings in compute time and memory usage. We introduce a fast, approximate Firth logistic regression test for unbalanced case-control phenotypes. The method is ideally suited to take advantage of distributed computing frameworks. We demonstrate the accuracy and computational benefits of this approach using the UK Biobank dataset with up to 407,746 individuals.


Subject(s)
Computational Biology, Genome-Wide Association Study, Genomics, Case-Control Studies, Computational Biology/methods, Genome-Wide Association Study/methods, Genomics/methods, Genotype, Humans, Logistic Models, Machine Learning, Phenotype, Reproducibility of Results
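
The abstract above states that the method needs only local segments of the genotype matrix in memory, but it does not spell out the algorithm. The sketch below is therefore not REGENIE's implementation. It is a generic illustration of that memory-access pattern, assuming ridge regression is fitted independently on each block of adjacent variants. All function names, the block size and the penalty value are hypothetical, and the toy matrix is held in memory, whereas a real pipeline would read each block from disk.

```python
from typing import Iterator, List

Matrix = List[List[float]]


def iter_blocks(genotypes: Matrix, block_size: int) -> Iterator[Matrix]:
    """Yield column blocks (samples x variants) one at a time."""
    n_variants = len(genotypes[0])
    for start in range(0, n_variants, block_size):
        yield [row[start:start + block_size] for row in genotypes]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def block_ridge_predictions(genotypes: Matrix, phenotype: List[float],
                            block_size: int, lam: float) -> Matrix:
    """Fit ridge regression per block; return one prediction vector per block."""
    predictions = []
    for block in iter_blocks(genotypes, block_size):
        p = len(block[0])
        # Normal equations with ridge penalty: (X'X + lam * I) beta = X'y
        xtx = [[sum(r[i] * r[j] for r in block) + (lam if i == j else 0.0)
                for j in range(p)] for i in range(p)]
        xty = [sum(r[i] * y for r, y in zip(block, phenotype)) for i in range(p)]
        beta = solve(xtx, xty)
        predictions.append([sum(b * g for b, g in zip(beta, r)) for r in block])
    return predictions


# Toy data: 4 samples x 4 variants, alternate-allele counts.
toy_genotypes = [[0.0, 1.0, 2.0, 0.0],
                 [1.0, 1.0, 0.0, 2.0],
                 [2.0, 0.0, 1.0, 1.0],
                 [0.0, 2.0, 1.0, 0.0]]
toy_phenotype = [0.5, 1.2, 2.1, 0.3]
per_block = block_ridge_predictions(toy_genotypes, toy_phenotype, block_size=2, lam=1.0)
print(len(per_block), "block-level predictors")
```

The point of the design is that each iteration touches only one block of columns, so peak memory scales with the block size rather than with the number of variants genome-wide.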
13.
Nat Commun ; 9(1): 2252, 2018 Jun 13.
Article in English | MEDLINE | ID: mdl-29899519

ABSTRACT

Angiopoietin-like 4 (ANGPTL4) is an endogenous inhibitor of lipoprotein lipase that modulates lipid levels, coronary atherosclerosis risk, and nutrient partitioning. We hypothesize that loss of ANGPTL4 function might improve glucose homeostasis and decrease risk of type 2 diabetes (T2D). We investigate protein-altering variants in ANGPTL4 among 58,124 participants in the DiscovEHR human genetics study, with follow-up studies in 82,766 T2D cases and 498,761 controls. Carriers of p.E40K, a variant that abolishes ANGPTL4 ability to inhibit lipoprotein lipase, have lower odds of T2D (odds ratio 0.89, 95% confidence interval 0.85-0.92, p = 6.3 × 10⁻¹⁰), lower fasting glucose, and greater insulin sensitivity. Predicted loss-of-function variants are associated with lower odds of T2D among 32,015 cases and 84,006 controls (odds ratio 0.71, 95% confidence interval 0.49-0.99, p = 0.041). Functional studies in Angptl4-deficient mice confirm improved insulin sensitivity and glucose homeostasis. In conclusion, genetic inactivation of ANGPTL4 is associated with improved glucose homeostasis and reduced risk of T2D.


Subject(s)
Angiopoietin-Like Protein 4/deficiency, Angiopoietin-Like Protein 4/genetics, Type 2 Diabetes Mellitus/genetics, Type 2 Diabetes Mellitus/metabolism, Amino Acid Substitution, Angiopoietin-Like Protein 4/metabolism, Animals, Blood Glucose/metabolism, Case-Control Studies, Type 2 Diabetes Mellitus/etiology, Female, Gene Silencing, Genetic Association Studies, Genetic Variation, Heterozygote, Homeostasis, Humans, Insulin Resistance/genetics, Lipoprotein Lipase/metabolism, Male, Mice, Inbred C57BL Mice, Knockout Mice, Risk Factors, Exome Sequencing
14.
Nat Genet ; 48(6): 593-9, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27111036

ABSTRACT

We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including single-nucleotide variants, multiple-nucleotide variants, insertions and deletions, short tandem repeats, and copy number variants. Of these, copy number variants contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree on the basis of binary single-nucleotide variants and projected the more complex variants onto it, estimating the number of mutations for each class. Our phylogeny shows bursts of extreme expansion in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations.


Subject(s)
Human Y Chromosomes, Demography, Haplotypes, Humans, Male, Mutation, Phylogeny, Single Nucleotide Polymorphism
15.
Science ; 354(6319), 2016 Dec 23.
Article in English | MEDLINE | ID: mdl-28008009

ABSTRACT

The DiscovEHR collaboration between the Regeneron Genetics Center and Geisinger Health System couples high-throughput sequencing to an integrated health care system using longitudinal electronic health records (EHRs). We sequenced the exomes of 50,726 adult participants in the DiscovEHR study to identify ~4.2 million rare single-nucleotide variants and insertion/deletion events, of which ~176,000 are predicted to result in a loss of gene function. Linking these data to EHR-derived clinical phenotypes, we find clinical associations supporting therapeutic targets, including genes encoding drug targets for lipid lowering, and identify previously unidentified rare alleles associated with lipid levels and other blood level traits. About 3.5% of individuals harbor deleterious variants in 76 clinically actionable genes. The DiscovEHR data set provides a blueprint for large-scale precision medicine initiatives and genomics-guided therapeutic discovery.


Subject(s)
Integrated Delivery of Health Care, Disease/genetics, Electronic Health Records, Exome/genetics, High-Throughput Nucleotide Sequencing, Adult, Drug Design, Gene Frequency, Genomics, Humans, Hypolipidemic Agents/pharmacology, INDEL Mutation, Lipids/blood, Molecular Targeted Therapy, Single Nucleotide Polymorphism, DNA Sequence Analysis