Search | BVS Regional Portal

1.

A deep catalogue of protein-coding variation in 983,578 individuals.

Sun, Kathie Y; Bai, Xiaodong; Chen, Siying; Bao, Suying; Zhang, Chuanyi; Kapoor, Manav; Backman, Joshua; Joseph, Tyler; Maxwell, Evan; Mitra, George; Gorovits, Alexander; Mansfield, Adam; Boutkov, Boris; Gokhale, Sujit; Habegger, Lukas; Marcketta, Anthony; Locke, Adam E; Ganel, Liron; Hawes, Alicia; Kessler, Michael D; Sharma, Deepika; Staples, Jeffrey; Bovijn, Jonas; Gelfman, Sahar; Di Gioia, Alessandro; Rajagopal, Veera M; Lopez, Alexander; Varela, Jennifer Rico; Alegre, Jesus; Berumen, Jaime; Tapia-Conyer, Roberto; Kuri-Morales, Pablo; Torres, Jason; Emberson, Jonathan; Collins, Rory; Cantor, Michael; Thornton, Timothy; Kang, Hyun Min; Overton, John D; Shuldiner, Alan R; Cremona, M Laura; Nafde, Mona; Baras, Aris; Abecasis, Goncalo; Marchini, Jonathan; Reid, Jeffrey G; Salerno, William; Balasubramanian, Suganthi.

Nature ; 2024 May 20.

Article in English | MEDLINE | ID: mdl-38768635

ABSTRACT

Rare coding variants that significantly impact function provide insights into the biology of a gene [1-3]. However, ascertaining their frequency requires large sample sizes [4-8]. Here, we present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. 23% of the Regeneron Genetics Center Million Exome data (RGC-ME) comes from non-European individuals of African, East Asian, Indigenous American, Middle Eastern, and South Asian ancestry. This catalogue includes over 10.4 million missense and 1.1 million predicted loss-of-function (pLOF) variants. We identify individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which have not been previously reported. From precise quantitative estimates of selection against heterozygous loss-of-function, we identify 3,988 loss-of-function intolerant genes, including 86 that were previously assessed as tolerant and 1,153 lacking established disease annotation. We also define regions of missense depletion at high resolution. Notably, 1,482 genes have regions depleted of missense variants despite being tolerant to pLOF variants. Finally, we estimate that 3% of individuals have a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, we make this important resource of coding variation from the RGC-ME accessible via a public variant allele frequency browser.

2.

Author Correction: Genotyping, sequencing and analysis of 140,000 adults from Mexico City.

Ziyatdinov, Andrey; Torres, Jason; Alegre-Díaz, Jesús; Backman, Joshua; Mbatchou, Joelle; Turner, Michael; Gaynor, Sheila M; Joseph, Tyler; Zou, Yuxin; Liu, Daren; Wade, Rachel; Staples, Jeffrey; Panea, Razvan; Popov, Alex; Bai, Xiaodong; Balasubramanian, Suganthi; Habegger, Lukas; Lanche, Rouel; Lopez, Alex; Maxwell, Evan; Jones, Marcus; García-Ortiz, Humberto; Ramirez-Reyes, Raul; Santacruz-Benítez, Rogelio; Nag, Abhishek; Smith, Katherine R; Damask, Amy; Lin, Nan; Paulding, Charles; Reppell, Mark; Zöllner, Sebastian; Jorgenson, Eric; Salerno, William; Petrovski, Slavé; Overton, John; Reid, Jeffrey; Thornton, Timothy A; Abecasis, Gonçalo; Berumen, Jaime; Orozco-Orozco, Lorena; Collins, Rory; Baras, Aris; Hill, Michael R; Emberson, Jonathan R; Marchini, Jonathan; Kuri-Morales, Pablo; Tapia-Conyer, Roberto.

Nature ; 626(8001): E18, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38332034

3.

Genotyping, sequencing and analysis of 140,000 adults from Mexico City.

Ziyatdinov, Andrey; Torres, Jason; Alegre-Díaz, Jesús; Backman, Joshua; Mbatchou, Joelle; Turner, Michael; Gaynor, Sheila M; Joseph, Tyler; Zou, Yuxin; Liu, Daren; Wade, Rachel; Staples, Jeffrey; Panea, Razvan; Popov, Alex; Bai, Xiaodong; Balasubramanian, Suganthi; Habegger, Lukas; Lanche, Rouel; Lopez, Alex; Maxwell, Evan; Jones, Marcus; García-Ortiz, Humberto; Ramirez-Reyes, Raul; Santacruz-Benítez, Rogelio; Nag, Abhishek; Smith, Katherine R; Damask, Amy; Lin, Nan; Paulding, Charles; Reppell, Mark; Zöllner, Sebastian; Jorgenson, Eric; Salerno, William; Petrovski, Slavé; Overton, John; Reid, Jeffrey; Thornton, Timothy A; Abecasis, Gonçalo; Berumen, Jaime; Orozco-Orozco, Lorena; Collins, Rory; Baras, Aris; Hill, Michael R; Emberson, Jonathan R; Marchini, Jonathan; Kuri-Morales, Pablo; Tapia-Conyer, Roberto.

Nature ; 622(7984): 784-793, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37821707

ABSTRACT

The Mexico City Prospective Study is a prospective cohort of more than 150,000 adults recruited two decades ago from the urban districts of Coyoacán and Iztapalapa in Mexico City [1]. Here we generated genotype and exome-sequencing data for all individuals and whole-genome sequencing data for 9,950 selected individuals. We describe high levels of relatedness and substantial heterogeneity in ancestry composition across individuals. Most sequenced individuals had admixed Indigenous American, European and African ancestry, with extensive admixture from Indigenous populations in central, southern and southeastern Mexico. Indigenous Mexican segments of the genome had lower levels of coding variation but an excess of homozygous loss-of-function variants compared with segments of African and European origin. We estimated ancestry-specific allele frequencies at 142 million genomic variants, with an effective sample size of 91,856 for Indigenous Mexican ancestry at exome variants, all available through a public browser. Using whole-genome sequencing, we developed an imputation reference panel that outperforms existing panels at common variants in individuals with high proportions of central, southern and southeastern Indigenous Mexican ancestry. Our work illustrates the value of genetic studies in diverse populations and provides foundational imputation and allele frequency resources for future genetic studies in Mexico and in the United States, where the Hispanic/Latino population is predominantly of Mexican descent.

Subjects

Exome Sequencing, Human Genome, Genotype, Hispanic or Latino, Adult, Humans, Africa/ethnology, Americas/ethnology, Europe/ethnology, Gene Frequency/genetics, Population Genetics, Human Genome/genetics, Genotyping Techniques, Hispanic or Latino/genetics, Homozygote, Loss of Function Mutation/genetics, Mexico, Prospective Studies

4.

Rare coding variants in CHRNB2 reduce the likelihood of smoking.

Rajagopal, Veera M; Watanabe, Kyoko; Mbatchou, Joelle; Ayer, Ariane; Quon, Peter; Sharma, Deepika; Kessler, Michael D; Praveen, Kavita; Gelfman, Sahar; Parikshak, Neelroop; Otto, Jacqueline M; Bao, Suying; Chim, Shek Man; Pavlopoulos, Elias; Avbersek, Andreja; Kapoor, Manav; Chen, Esteban; Jones, Marcus B; Leblanc, Michelle; Emberson, Jonathan; Collins, Rory; Torres, Jason; Morales, Pablo Kuri; Tapia-Conyer, Roberto; Alegre, Jesus; Berumen, Jaime; Shuldiner, Alan R; Balasubramanian, Suganthi; Abecasis, Gonçalo R; Kang, Hyun M; Marchini, Jonathan; Stahl, Eli A; Jorgenson, Eric; Sanchez, Robert; Liedtke, Wolfgang; Anderson, Matthew; Cantor, Michael; Lederer, David; Baras, Aris; Coppola, Giovanni.

Nat Genet ; 55(7): 1138-1148, 2023 07.

Article in English | MEDLINE | ID: mdl-37308787

ABSTRACT

Human genetic studies of smoking behavior have been thus far largely limited to common variants. Studying rare coding variants has the potential to identify drug targets. We performed an exome-wide association study of smoking phenotypes in up to 749,459 individuals and discovered a protective association in CHRNB2, encoding the β2 subunit of the α4β2 nicotinic acetylcholine receptor. Rare predicted loss-of-function and likely deleterious missense variants in CHRNB2 in aggregate were associated with a 35% decreased odds for smoking heavily (odds ratio (OR) = 0.65, confidence interval (CI) = 0.56-0.76, P = 1.9 × 10⁻⁸). An independent common variant association in the protective direction (rs2072659; OR = 0.96; CI = 0.94-0.98; P = 5.3 × 10⁻⁶) was also evident, suggesting an allelic series. Our findings in humans align with decades-old experimental observations in mice that β2 loss abolishes nicotine-mediated neuronal responses and attenuates nicotine self-administration. Our genetic discovery will inspire future drug designs targeting CHRNB2 in the brain for the treatment of nicotine addiction.

Subjects

Nicotine, Tobacco Use Disorder, Humans, Animals, Mice, Smoking/genetics, Tobacco Use Disorder/genetics, Phenotype, Odds Ratio
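
The abstract above reports effect sizes both as odds ratios and as percent changes in odds (OR = 0.65 read as "35% decreased odds"). The short sketch below shows that arithmetic. The function names and the 2x2 counts are hypothetical and chosen only for illustration; the study's estimates come from association models, not from a raw 2x2 table.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds implied by an odds ratio.

    Negative values mean lower odds (a protective direction).
    """
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_from_counts(carrier_cases: int, carrier_controls: int,
                           noncarrier_cases: int, noncarrier_controls: int) -> float:
    """Unadjusted odds ratio from a 2x2 table (hypothetical helper)."""
    odds_carriers = carrier_cases / carrier_controls
    odds_noncarriers = noncarrier_cases / noncarrier_controls
    return odds_carriers / odds_noncarriers


# Odds ratios quoted in the abstract, converted to percent changes in odds.
for label, value in [("rare-variant burden", 0.65), ("rs2072659", 0.96)]:
    print(label, round(percent_change_in_odds(value)), "% change in odds")

# Made-up counts that happen to give OR = 0.65 (assumption, not study data).
print(odds_ratio_from_counts(13, 20, 100, 100))
```

An odds ratio below 1 therefore maps to a decrease of (1 - OR) × 100 percent in the odds, which is how 0.65 corresponds to 35% and 0.96 to about 4%.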

5.

A deep catalog of protein-coding variation in 985,830 individuals.

Sun, Kathie Y; Bai, Xiaodong; Chen, Siying; Bao, Suying; Kapoor, Manav; Zhang, Chuanyi; Backman, Joshua; Joseph, Tyler; Maxwell, Evan; Mitra, George; Gorovits, Alexander; Mansfield, Adam; Boutkov, Boris; Gokhale, Sujit; Habegger, Lukas; Marcketta, Anthony; Locke, Adam; Kessler, Michael D; Sharma, Deepika; Staples, Jeffrey; Bovijn, Jonas; Gelfman, Sahar; Gioia, Alessandro Di; Rajagopal, Veera; Lopez, Alexander; Varela, Jennifer Rico; Alegre, Jesus; Berumen, Jaime; Tapia-Conyer, Roberto; Kuri-Morales, Pablo; Torres, Jason; Emberson, Jonathan; Collins, Rory; Cantor, Michael; Thornton, Timothy; Kang, Hyun Min; Overton, John; Shuldiner, Alan R; Cremona, M Laura; Nafde, Mona; Baras, Aris; Abecasis, Goncalo; Marchini, Jonathan; Reid, Jeffrey G; Salerno, William; Balasubramanian, Suganthi.

bioRxiv ; 2023 Nov 02.

Article in English | MEDLINE | ID: mdl-37214792

ABSTRACT

Coding variants that have significant impact on function can provide insights into the biology of a gene but are typically rare in the population. Identifying and ascertaining the frequency of such rare variants requires very large sample sizes. Here, we present the largest catalog of human protein-coding variation to date, derived from exome sequencing of 985,830 individuals of diverse ancestry to serve as a rich resource for studying rare coding variants. Individuals of African, Admixed American, East Asian, Middle Eastern, and South Asian ancestry account for 20% of this exome dataset. Our catalog of variants includes approximately 10.5 million missense (54% novel) and 1.1 million predicted loss-of-function (pLOF) variants (65% novel, 53% observed only once). We identified individuals with rare homozygous pLOF variants in 4,874 genes, and for 1,838 of these this work is the first to document at least one pLOF homozygote. Additional insights from the RGC-ME dataset include 1) improved estimates of selection against heterozygous loss-of-function and identification of 3,459 genes intolerant to loss-of-function, 83 of which were previously assessed as tolerant to loss-of-function and 1,241 that lack disease annotations; 2) identification of regions depleted of missense variation in 457 genes that are tolerant to loss-of-function; 3) functional interpretation for 10,708 variants of unknown or conflicting significance reported in ClinVar as cryptic splice sites using splicing score thresholds based on empirical variant deleteriousness scores derived from RGC-ME; and 4) an observation that approximately 3% of sequenced individuals carry a clinically actionable genetic variant in the ACMG SF 3.1 list of genes. We make this important resource of coding variation available to the public through a variant allele frequency browser. We anticipate that this report and the RGC-ME dataset will serve as a valuable reference for understanding rare coding variation and help advance precision medicine efforts.

6.

Author Correction: Common and rare variant associations with clonal haematopoiesis phenotypes.

Kessler, Michael D; Damask, Amy; O'Keeffe, Sean; Banerjee, Nilanjana; Li, Dadong; Watanabe, Kyoko; Marketta, Anthony; Van Meter, Michael; Semrau, Stefan; Horowitz, Julie; Tang, Jing; Kosmicki, Jack A; Rajagopal, Veera M; Zou, Yuxin; Houvras, Yariv; Ghosh, Arkopravo; Gillies, Christopher; Mbatchou, Joelle; White, Ryan R; Verweij, Niek; Bovijn, Jonas; Parikshak, Neelroop N; LeBlanc, Michelle G; Jones, Marcus; Glass, David J; Lotta, Luca A; Cantor, Michael N; Atwal, Gurinder S; Locke, Adam E; Ferreira, Manuel A R; Deering, Raquel; Paulding, Charles; Shuldiner, Alan R; Thurston, Gavin; Ferrando, Adolfo A; Salerno, Will; Reid, Jeffrey G; Overton, John D; Marchini, Jonathan; Kang, Hyun M; Baras, Aris; Abecasis, Gonçalo R; Jorgenson, Eric.

Nature ; 615(7950): E3, 2023 Mar.

Article in English | MEDLINE | ID: mdl-36807635

7.

Common and rare variant associations with clonal haematopoiesis phenotypes.

Kessler, Michael D; Damask, Amy; O'Keeffe, Sean; Banerjee, Nilanjana; Li, Dadong; Watanabe, Kyoko; Marketta, Anthony; Van Meter, Michael; Semrau, Stefan; Horowitz, Julie; Tang, Jing; Kosmicki, Jack A; Rajagopal, Veera M; Zou, Yuxin; Houvras, Yariv; Ghosh, Arkopravo; Gillies, Christopher; Mbatchou, Joelle; White, Ryan R; Verweij, Niek; Bovijn, Jonas; Parikshak, Neelroop N; LeBlanc, Michelle G; Jones, Marcus; Glass, David J; Lotta, Luca A; Cantor, Michael N; Atwal, Gurinder S; Locke, Adam E; Ferreira, Manuel A R; Deering, Raquel; Paulding, Charles; Shuldiner, Alan R; Thurston, Gavin; Ferrando, Adolfo A; Salerno, Will; Reid, Jeffrey G; Overton, John D; Marchini, Jonathan; Kang, Hyun M; Baras, Aris; Abecasis, Gonçalo R; Jorgenson, Eric.

Nature ; 612(7939): 301-309, 2022 12.

Article in English | MEDLINE | ID: mdl-36450978

ABSTRACT

Clonal haematopoiesis involves the expansion of certain blood cell lineages and has been associated with ageing and adverse health outcomes [1-5]. Here we use exome sequence data on 628,388 individuals to identify 40,208 carriers of clonal haematopoiesis of indeterminate potential (CHIP). Using genome-wide and exome-wide association analyses, we identify 24 loci (21 of which are novel) where germline genetic variation influences predisposition to CHIP, including missense variants in the lymphocytic antigen coding gene LY75, which are associated with reduced incidence of CHIP. We also identify novel rare variant associations with clonal haematopoiesis and telomere length. Analysis of 5,041 health traits from the UK Biobank (UKB) found relationships between CHIP and severe COVID-19 outcomes, cardiovascular disease, haematologic traits, malignancy, smoking, obesity, infection and all-cause mortality. Longitudinal and Mendelian randomization analyses revealed that CHIP is associated with solid cancers, including non-melanoma skin cancer and lung cancer, and that CHIP linked to DNMT3A is associated with the subsequent development of myeloid but not lymphoid leukaemias. Additionally, contrary to previous findings from the initial 50,000 UKB exomes [6], our results in the full sample do not support a role for IL-6 inhibition in reducing the risk of cardiovascular disease among CHIP carriers. Our findings demonstrate that CHIP represents a complex set of heterogeneous phenotypes with shared and unique germline genetic causes and varied clinical implications.

Subjects

COVID-19, Cardiovascular Diseases, Humans, Clonal Hematopoiesis/genetics, Cardiovascular Diseases/epidemiology, Cardiovascular Diseases/genetics

8.

Universal clinical Parkinson's disease axes identify a major influence of neuroinflammation.

Sandor, Cynthia; Millin, Stephanie; Dahl, Andrew; Schalkamp, Ann-Kathrin; Lawton, Michael; Hubbard, Leon; Rahman, Nabila; Williams, Nigel; Ben-Shlomo, Yoav; Grosset, Donald G; Hu, Michele T; Marchini, Jonathan; Webber, Caleb.

Genome Med ; 14(1): 129, 2022 11 16.

Article in English | MEDLINE | ID: mdl-36384636

ABSTRACT

BACKGROUND: There is large individual variation in both clinical presentation and progression between Parkinson's disease patients. Generation of deeply and longitudinally phenotyped patient cohorts has enormous potential to identify disease subtypes for prognosis and therapeutic targeting. METHODS: Replicating across three large Parkinson's cohorts (Oxford Discovery cohort (n = 842)/Tracking UK Parkinson's study (n = 1807) and Parkinson's Progression Markers Initiative (n = 472)) with clinical observational measures collected longitudinally over 5-10 years, we developed a Bayesian multiple phenotypes mixed model incorporating genetic relationships between individuals able to explain many diverse clinical measurements as a smaller number of continuous underlying factors ("phenotypic axes"). RESULTS: When applied to disease severity at diagnosis, the most influential of three phenotypic axes "Axis 1" was characterised by severe non-tremor motor phenotype, anxiety and depression at diagnosis, accompanied by faster progression in cognitive function measures. Axis 1 was associated with increased genetic risk of Alzheimer's disease and reduced CSF Aβ1-42 levels. As observed previously for Alzheimer's disease genetic risk, and in contrast to Parkinson's disease genetic risk, the loci influencing Axis 1 were associated with microglia-expressed genes implicating neuroinflammation. When applied to measures of disease progression for each individual, integration of Alzheimer's disease genetic loci haplotypes improved the accuracy of progression modelling, while integrating Parkinson's disease genetics did not. CONCLUSIONS: We identify universal axes of Parkinson's disease phenotypic variation which reveal that Parkinson's patients with high concomitant genetic risk for Alzheimer's disease are more likely to present with severe motor and non-motor features at baseline and progress more rapidly to early dementia.

Subjects

Alzheimer Disease, Parkinson Disease, Humans, Parkinson Disease/genetics, Neuroinflammatory Diseases, Bayes Theorem, Cohort Studies

9.

Germline Mutations in CIDEB and Protection against Liver Disease.

Verweij, Niek; Haas, Mary E; Nielsen, Jonas B; Sosina, Olukayode A; Kim, Minhee; Akbari, Parsa; De, Tanima; Hindy, George; Bovijn, Jonas; Persaud, Trikaldarshi; Miloscio, Lawrence; Germino, Mary; Panagis, Lampros; Watanabe, Kyoko; Mbatchou, Joelle; Jones, Marcus; LeBlanc, Michelle; Balasubramanian, Suganthi; Lammert, Craig; Enhörning, Sofia; Melander, Olle; Carey, David J; Still, Christopher D; Mirshahi, Tooraj; Rader, Daniel J; Parasoglou, Prodromos; Walls, Johnathon R; Overton, John D; Reid, Jeffrey G; Economides, Aris; Cantor, Michael N; Zambrowicz, Brian; Murphy, Andrew J; Abecasis, Goncalo R; Ferreira, Manuel A R; Smagris, Eriks; Gusarova, Viktoria; Sleeman, Mark; Yancopoulos, George D; Marchini, Jonathan; Kang, Hyun M; Karalis, Katia; Shuldiner, Alan R; Della Gatta, Giusy; Locke, Adam E; Baras, Aris; Lotta, Luca A.

N Engl J Med ; 387(4): 332-344, 2022 07 28.

Article in English | MEDLINE | ID: mdl-35939579

ABSTRACT

BACKGROUND: Exome sequencing in hundreds of thousands of persons may enable the identification of rare protein-coding genetic variants associated with protection from human diseases like liver cirrhosis, providing a strategy for the discovery of new therapeutic targets. METHODS: We performed a multistage exome sequencing and genetic association analysis to identify genes in which rare protein-coding variants were associated with liver phenotypes. We conducted in vitro experiments to further characterize associations. RESULTS: The multistage analysis involved 542,904 persons with available data on liver aminotransferase levels, 24,944 patients with various types of liver disease, and 490,636 controls without liver disease. We found that rare coding variants in APOB, ABCB4, SLC30A10, and TM6SF2 were associated with increased aminotransferase levels and an increased risk of liver disease. We also found that variants in CIDEB, which encodes a structural protein found in hepatic lipid droplets, had a protective effect. The burden of rare predicted loss-of-function variants plus missense variants in CIDEB (combined carrier frequency, 0.7%) was associated with decreased alanine aminotransferase levels (beta per allele, -1.24 U per liter; 95% confidence interval [CI], -1.66 to -0.83; P = 4.8×10⁻⁹) and with 33% lower odds of liver disease of any cause (odds ratio per allele, 0.67; 95% CI, 0.57 to 0.79; P = 9.9×10⁻⁷). Rare coding variants in CIDEB were associated with a decreased risk of liver disease across different underlying causes and different degrees of severity, including cirrhosis of any cause (odds ratio per allele, 0.50; 95% CI, 0.36 to 0.70). Among 3599 patients who had undergone bariatric surgery, rare coding variants in CIDEB were associated with a decreased nonalcoholic fatty liver disease activity score (beta per allele in score units, -0.98; 95% CI, -1.54 to -0.41 [scores range from 0 to 8, with higher scores indicating more severe disease]). In human hepatoma cell lines challenged with oleate, CIDEB small interfering RNA knockdown prevented the buildup of large lipid droplets. CONCLUSIONS: Rare germline mutations in CIDEB conferred substantial protection from liver disease. (Funded by Regeneron Pharmaceuticals.)

Subjects

Apoptosis Regulatory Proteins, Germ-Line Mutation, Liver Diseases, Apoptosis Regulatory Proteins/genetics, Apoptosis Regulatory Proteins/metabolism, Genetic Predisposition to Disease/genetics, Genetic Predisposition to Disease/prevention & control, Humans, Liver/metabolism, Liver Diseases/genetics, Liver Diseases/metabolism, Liver Diseases/prevention & control, Transaminases/genetics, Exome Sequencing

10.

Population-scale analysis of common and rare genetic variation associated with hearing loss in adults.

Praveen, Kavita; Dobbyn, Lee; Gurski, Lauren; Ayer, Ariane H; Staples, Jeffrey; Mishra, Shawn; Bai, Yu; Kaufman, Alexandra; Moscati, Arden; Benner, Christian; Chen, Esteban; Chen, Siying; Popov, Alexander; Smith, Janell; Melander, Olle; Jones, Marcus B; Marchini, Jonathan; Balasubramanian, Suganthi; Zambrowicz, Brian; Drummond, Meghan C; Baras, Aris; Abecasis, Goncalo R; Ferreira, Manuel A; Stahl, Eli A; Coppola, Giovanni.

Commun Biol ; 5(1): 540, 2022 06 03.

Article in English | MEDLINE | ID: mdl-35661827

ABSTRACT

To better understand the genetics of hearing loss, we performed a genome-wide association meta-analysis with 125,749 cases and 469,497 controls across five cohorts. We identified 53 loci affecting hearing loss risk, including common coding variants in COL9A3 and TMPRSS3. Through exome sequencing of 108,415 cases and 329,581 controls, we observed rare coding associations with 11 Mendelian hearing loss genes, including additive effects in known hearing loss genes GJB2 (Gly12fs; odds ratio [OR] = 1.21, P = 4.2 × 10⁻¹¹) and SLC26A5 (gene burden; OR = 1.96, P = 2.8 × 10⁻¹⁷). We also identified hearing loss associations with rare coding variants in FSCN2 (OR = 1.14, P = 1.9 × 10⁻¹⁵) and KLHDC7B (OR = 2.14, P = 5.2 × 10⁻³⁰). Our results suggest a shared etiology between Mendelian and common hearing loss in adults. This work illustrates the potential of large-scale exome sequencing to elucidate the genetic architecture of common disorders where both common and rare variation contribute to risk.

Subjects

Genome-Wide Association Study, Hearing Loss, Exome/genetics, Genetic Variation, Genome-Wide Association Study/methods, Hearing Loss/genetics, Humans, Membrane Proteins/genetics, Neoplasm Proteins/genetics, Serine Endopeptidases/genetics, Exome Sequencing

11.

Genome-wide analysis provides genetic evidence that ACE2 influences COVID-19 risk and yields risk scores associated with severe disease.

Horowitz, Julie E; Kosmicki, Jack A; Damask, Amy; Sharma, Deepika; Roberts, Genevieve H L; Justice, Anne E; Banerjee, Nilanjana; Coignet, Marie V; Yadav, Ashish; Leader, Joseph B; Marcketta, Anthony; Park, Danny S; Lanche, Rouel; Maxwell, Evan; Knight, Spencer C; Bai, Xiaodong; Guturu, Harendra; Sun, Dylan; Baltzell, Asher; Kury, Fabricio S P; Backman, Joshua D; Girshick, Ahna R; O'Dushlaine, Colm; McCurdy, Shannon R; Partha, Raghavendran; Mansfield, Adam J; Turissini, David A; Li, Alexander H; Zhang, Miao; Mbatchou, Joelle; Watanabe, Kyoko; Gurski, Lauren; McCarthy, Shane E; Kang, Hyun M; Dobbyn, Lee; Stahl, Eli; Verma, Anurag; Sirugo, Giorgio; Ritchie, Marylyn D; Jones, Marcus; Balasubramanian, Suganthi; Siminovitch, Katherine; Salerno, William J; Shuldiner, Alan R; Rader, Daniel J; Mirshahi, Tooraj; Locke, Adam E; Marchini, Jonathan; Overton, John D; Carey, David J.

Nat Genet ; 54(4): 382-392, 2022 04.

Article in English | MEDLINE | ID: mdl-35241825

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) enters human host cells via angiotensin-converting enzyme 2 (ACE2) and causes coronavirus disease 2019 (COVID-19). Here, through a genome-wide association study, we identify a variant (rs190509934, minor allele frequency 0.2-2%) that downregulates ACE2 expression by 37% (P = 2.7 × 10⁻⁸) and reduces the risk of SARS-CoV-2 infection by 40% (odds ratio = 0.60, P = 4.5 × 10⁻¹³), providing human genetic evidence that ACE2 expression levels influence COVID-19 risk. We also replicate the associations of six previously reported risk variants, of which four were further associated with worse outcomes in individuals infected with the virus (in/near LZTFL1, MHC, DPP9 and IFNAR2). Lastly, we show that common variants define a risk score that is strongly associated with severe disease among cases and modestly improves the prediction of disease severity relative to demographic and clinical factors alone.

Subjects

COVID-19, Angiotensin-Converting Enzyme 2/genetics, COVID-19/genetics, Genome-Wide Association Study, Humans, Risk Factors, SARS-CoV-2/genetics

12.

Exome sequencing and analysis of 454,787 UK Biobank participants.

Backman, Joshua D; Li, Alexander H; Marcketta, Anthony; Sun, Dylan; Mbatchou, Joelle; Kessler, Michael D; Benner, Christian; Liu, Daren; Locke, Adam E; Balasubramanian, Suganthi; Yadav, Ashish; Banerjee, Nilanjana; Gillies, Christopher E; Damask, Amy; Liu, Simon; Bai, Xiaodong; Hawes, Alicia; Maxwell, Evan; Gurski, Lauren; Watanabe, Kyoko; Kosmicki, Jack A; Rajagopal, Veera; Mighty, Jason; Jones, Marcus; Mitnaul, Lyndon; Stahl, Eli; Coppola, Giovanni; Jorgenson, Eric; Habegger, Lukas; Salerno, William J; Shuldiner, Alan R; Lotta, Luca A; Overton, John D; Cantor, Michael N; Reid, Jeffrey G; Yancopoulos, George; Kang, Hyun M; Marchini, Jonathan; Baras, Aris; Abecasis, Gonçalo R; Ferreira, Manuel A R.

Nature ; 599(7886): 628-634, 2021 11.

Article in English | MEDLINE | ID: mdl-34662886

ABSTRACT

A major goal in human genetics is to use natural variation to understand the phenotypic consequences of altering each protein-coding gene in the genome. Here we used exome sequencing [1] to explore protein-altering variants and their consequences in 454,787 participants in the UK Biobank study [2]. We identified 12 million coding variants, including around 1 million loss-of-function and around 1.8 million deleterious missense variants. When these were tested for association with 3,994 health-related traits, we found 564 genes with trait associations at P ≤ 2.18 × 10⁻¹¹. Rare variant associations were enriched in loci from genome-wide association studies (GWAS), but most (91%) were independent of common variant signals. We discovered several risk-increasing associations with traits related to liver disease, eye disease and cancer, among others, as well as risk-lowering associations for hypertension (SLC9A3R2), diabetes (MAP3K15, FAM234A) and asthma (SLC27A3). Six genes were associated with brain imaging phenotypes, including two involved in neural development (GBE1, PLD1). Of the signals available and powered for replication in an independent cohort, 81% were confirmed; furthermore, association signals were generally consistent across individuals of European, Asian and African ancestry. We illustrate the ability of exome sequencing to identify gene-trait associations, elucidate gene function and pinpoint effector genes that underlie GWAS signals at scale.

Subjects

Biological Specimen Banks, Genetic Databases, Exome Sequencing, Exome/genetics, Africa/ethnology, Asia/ethnology, Asthma/genetics, Diabetes Mellitus/genetics, Europe/ethnology, Eye Diseases/genetics, Female, Genetic Predisposition to Disease/genetics, Genetic Variation, Genome-Wide Association Study, Humans, Hypertension/genetics, Liver Diseases/genetics, Male, Mutation, Neoplasms/genetics, Heritable Quantitative Trait, United Kingdom

13.

False discovery rate control in genome-wide association studies with population structure.

Sesia, Matteo; Bates, Stephen; Candès, Emmanuel; Marchini, Jonathan; Sabatti, Chiara.

Proc Natl Acad Sci U S A ; 118(40), 2021 10 05.

Article in English | MEDLINE | ID: mdl-34580220

ABSTRACT

We present a comprehensive statistical framework to analyze data from genome-wide association studies of polygenic traits, producing interpretable findings while controlling the false discovery rate. In contrast with standard approaches, our method can leverage sophisticated multivariate algorithms but makes no parametric assumptions about the unknown relation between genotypes and phenotype. Instead, we recognize that genotypes can be considered as a random sample from an appropriate model, encapsulating our knowledge of genetic inheritance and human populations. This allows the generation of imperfect copies (knockoffs) of these variables that serve as ideal negative controls, correcting for linkage disequilibrium and accounting for unknown population structure, which may be due to diverse ancestries or familial relatedness. The validity and effectiveness of our method are demonstrated by extensive simulations and by applications to the UK Biobank data. These analyses confirm our method is powerful relative to state-of-the-art alternatives, while comparisons with other studies validate most of our discoveries. Finally, fast software is made available for researchers to analyze Biobank-scale datasets.

Subjects

Human Genome/genetics, Algorithms, Genome-Wide Association Study/methods, Genotype, Humans, Linkage Disequilibrium/genetics, Multifactorial Inheritance/genetics, Phenotype, Software
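
The abstract above describes knockoffs as negative controls used to control the false discovery rate. As a rough illustration of the final selection step only, the sketch below applies the standard knockoff+ threshold to a vector of feature statistics W, where a large positive W_j means variable j looks more important than its knockoff copy. The function name and the example statistics are assumptions made for illustration; how the paper generates knockoffs and computes its statistics is not shown here.

```python
def knockoff_plus_select(w, q):
    """Select variables with the knockoff+ rule at target FDR level q.

    w: list of knockoff statistics (positive favours the real variable,
       negative favours its knockoff copy).
    Returns the indices j with w[j] >= tau, where tau is the smallest
    threshold t such that (1 + #{w <= -t}) / max(1, #{w >= t}) <= q.
    """
    # Candidate thresholds are the distinct non-zero magnitudes of w.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)  # knockoff "wins": estimates false positives
        positives = sum(1 for x in w if x >= t)   # variables that would be selected
        if (1 + negatives) / max(1, positives) <= q:
            return [j for j, x in enumerate(w) if x >= t]
    return []  # no threshold meets the target level


# Made-up statistics for ten variables (assumption, not real data).
w_example = [5.0, 4.0, 3.0, 2.5, -0.5, 2.0, 1.5, 1.0, -0.2, 0.8]
print(knockoff_plus_select(w_example, q=0.2))  # → [0, 1, 2, 3, 5, 6, 7, 9]
```

The count of large negative statistics stands in for the unknown number of false positives, which is why valid knockoffs can serve as negative controls without parametric assumptions about the phenotype model.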

14.

Computationally efficient whole-genome regression for quantitative and binary traits.

Mbatchou, Joelle; Barnard, Leland; Backman, Joshua; Marcketta, Anthony; Kosmicki, Jack A; Ziyatdinov, Andrey; Benner, Christian; O'Dushlaine, Colm; Barber, Mathew; Boutkov, Boris; Habegger, Lukas; Ferreira, Manuel; Baras, Aris; Reid, Jeffrey; Abecasis, Goncalo; Maxwell, Evan; Marchini, Jonathan.

Nat Genet ; 53(7): 1097-1103, 2021 07.

Article in English | MEDLINE | ID: mdl-34017140

ABSTRACT

Genome-wide association analysis of cohorts with thousands of phenotypes is computationally expensive, particularly when accounting for sample relatedness or population structure. Here we present a novel machine-learning method called REGENIE for fitting a whole-genome regression model for quantitative and binary phenotypes that is substantially faster than alternatives in multi-trait analyses while maintaining statistical efficiency. The method naturally accommodates parallel analysis of multiple phenotypes and requires only local segments of the genotype matrix to be loaded in memory, in contrast to existing alternatives, which must load genome-wide matrices into memory. This results in substantial savings in compute time and memory usage. We introduce a fast, approximate Firth logistic regression test for unbalanced case-control phenotypes. The method is ideally suited to take advantage of distributed computing frameworks. We demonstrate the accuracy and computational benefits of this approach using the UK Biobank dataset with up to 407,746 individuals.

Subjects

Computational Biology, Genome-Wide Association Study, Genomics, Case-Control Studies, Computational Biology/methods, Genome-Wide Association Study/methods, Genomics/methods, Genotype, Humans, Logistic Models, Machine Learning, Phenotype, Reproducibility of Results

15.

A non-linear regression method for estimation of gene-environment heritability.

Kerin, Matthew; Marchini, Jonathan.

Bioinformatics ; 36(24): 5632-5639, 2021 Apr 05.

Article in English | MEDLINE | ID: mdl-33367483

ABSTRACT

MOTIVATION: Gene-environment (GxE) interactions are one of the least studied aspects of the genetic architecture of human traits and diseases. The environment of an individual is inherently high dimensional, evolves through time and can be expensive and time consuming to measure. The UK Biobank study, with all 500 000 participants having undergone an extensive baseline questionnaire, represents a unique opportunity to assess GxE heritability for many traits and diseases in a well powered setting. RESULTS: We have developed a randomized Haseman-Elston non-linear regression method applicable when many environmental variables have been measured on each individual. The method (GPLEMMA) simultaneously estimates a linear environmental score (ES) and its GxE heritability. We compare the method via simulation to a whole-genome regression approach (LEMMA) for estimating GxE heritability. We show that GPLEMMA is more computationally efficient than LEMMA on large datasets, and produces results highly correlated with those from LEMMA when applied to simulated data and real data from the UK Biobank. AVAILABILITY AND IMPLEMENTATION: Software implementing the GPLEMMA method is available from https://jmarchini.org/gplemma/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
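As a point of reference for the Haseman-Elston family of estimators mentioned in this abstract, the following is a minimal sketch of the classical (linear, additive) Haseman-Elston regression only. It is not the randomized non-linear GPLEMMA method and not the authors' code. It assumes a phenotype vector `y` that has already been standardized and a precomputed genetic relatedness matrix `K`; the function name `haseman_elston_h2` and the toy values are hypothetical.

```python
def haseman_elston_h2(y, K):
    """Classical Haseman-Elston estimate of additive heritability.

    Regresses the phenotype cross-products y[i] * y[j] on the off-diagonal
    relatedness values K[i][j] (i < j) through the origin. The slope of
    that regression is the heritability estimate.

    Assumptions (not taken from the abstract): y is standardized to mean 0
    and variance 1, and K is a symmetric n x n relatedness matrix.
    """
    n = len(y)
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            numerator += K[i][j] * y[i] * y[j]
            denominator += K[i][j] ** 2
    if denominator == 0.0:
        raise ValueError("all off-diagonal relatedness values are zero")
    return numerator / denominator


if __name__ == "__main__":
    # Toy, hand-made values purely for illustration.
    y = [1.0, -1.0, 0.0]
    K = [
        [1.0, 0.5, 0.0],
        [0.5, 1.0, 0.25],
        [0.0, 0.25, 1.0],
    ]
    print(haseman_elston_h2(y, K))
```

The double loop makes the pairwise structure explicit but costs O(n^2) time. Avoiding that cost on biobank-scale data is, as we understand it, the motivation for randomized variants such as the one the abstract describes.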

16.

Genotype imputation using the Positional Burrows Wheeler Transform.

Rubinacci, Simone; Delaneau, Olivier; Marchini, Jonathan.

PLoS Genet ; 16(11): e1009049, 2020 11.

Article in English | MEDLINE | ID: mdl-33196638

ABSTRACT

Genotype imputation is the process of predicting unobserved genotypes in a sample of individuals using a reference panel of haplotypes. In the last 10 years reference panels have increased in size by more than 100 fold. Increasing reference panel size improves accuracy of markers with low minor allele frequencies but poses ever increasing computational challenges for imputation methods. Here we present IMPUTE5, a genotype imputation method that can scale to reference panels with millions of samples. This method continues to refine the observation made in the IMPUTE2 method, that accuracy is optimized via use of a custom subset of haplotypes when imputing each individual. It achieves fast, accurate, and memory-efficient imputation by selecting haplotypes using the Positional Burrows Wheeler Transform (PBWT). By using the PBWT data structure at genotyped markers, IMPUTE5 identifies locally best matching haplotypes and long identical by state segments. The method then uses the selected haplotypes as conditioning states within the IMPUTE model. Using the HRC reference panel, which has ∼65,000 haplotypes, we show that IMPUTE5 is up to 30x faster than MINIMAC4 and up to 3x faster than BEAGLE5.1, and uses less memory than both these methods. Using simulated reference panels we show that IMPUTE5 scales sub-linearly with reference panel size. For example, keeping the number of imputed markers constant, increasing the reference panel size from 10,000 to 1 million haplotypes requires less than twice the computation time. As the reference panel increases in size IMPUTE5 is able to utilize a smaller number of reference haplotypes, thus reducing computational cost.
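To make the PBWT data structure named in this abstract more concrete, here is a minimal sketch of the generic construction of the positional prefix array and divergence array for a small binary haplotype panel. It illustrates the general PBWT technique only. It is not IMPUTE5's implementation and does not perform haplotype selection or imputation. The function name `pbwt_arrays` and the toy panel are hypothetical.

```python
def pbwt_arrays(haps):
    """Build the final PBWT prefix and divergence arrays.

    haps: list of M haplotypes, each a list of N alleles coded 0/1.

    Returns (ppa, div) after processing all N sites:
      ppa - haplotype indices sorted by their reversed prefixes, so that
            neighbours in the ordering share the longest recent match;
      div - for each position in ppa, the site index at which the match
            with the haplotype just above it begins (N means no match).
    """
    m = len(haps)
    n = len(haps[0]) if m else 0
    ppa = list(range(m))   # initial order: as given
    div = [0] * m          # all (empty) prefixes match from site 0

    for k in range(n):
        zeros, ones = [], []          # haplotypes carrying 0 / 1 at site k
        div_zeros, div_ones = [], []  # their divergence values
        p = q = k + 1                 # running match starts for each bucket
        for idx, start in zip(ppa, div):
            # Track the latest match start seen since the last 0 / last 1.
            if start > p:
                p = start
            if start > q:
                q = start
            if haps[idx][k] == 0:
                zeros.append(idx)
                div_zeros.append(p)
                p = 0
            else:
                ones.append(idx)
                div_ones.append(q)
                q = 0
        # Stable partition: all 0-carriers first, then all 1-carriers.
        ppa = zeros + ones
        div = div_zeros + div_ones
    return ppa, div


if __name__ == "__main__":
    panel = [
        [0, 1, 0],
        [1, 1, 0],
        [0, 0, 1],
        [0, 1, 0],
    ]
    ppa, div = pbwt_arrays(panel)
    print(ppa, div)
```

Each site is processed with a single pass over the current ordering, so the construction is linear in the size of the panel. Adjacent entries in `ppa` with small `div` values are the long shared segments a method could use to pick closely matching reference haplotypes.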

Subjects

Computational Biology/methods , Genome-Wide Association Study/methods , Haplotypes/genetics , Alleles , Forecasting/methods , Gene Frequency/genetics , Genotype , Humans , Models, Theoretical , Polymorphism, Single Nucleotide/genetics

17.

Exome sequencing and characterization of 49,960 individuals in the UK Biobank.

Van Hout, Cristopher V; Tachmazidou, Ioanna; Backman, Joshua D; Hoffman, Joshua D; Liu, Daren; Pandey, Ashutosh K; Gonzaga-Jauregui, Claudia; Khalid, Shareef; Ye, Bin; Banerjee, Nilanjana; Li, Alexander H; O'Dushlaine, Colm; Marcketta, Anthony; Staples, Jeffrey; Schurmann, Claudia; Hawes, Alicia; Maxwell, Evan; Barnard, Leland; Lopez, Alexander; Penn, John; Habegger, Lukas; Blumenfeld, Andrew L; Bai, Xiaodong; O'Keeffe, Sean; Yadav, Ashish; Praveen, Kavita; Jones, Marcus; Salerno, William J; Chung, Wendy K; Surakka, Ida; Willer, Cristen J; Hveem, Kristian; Leader, Joseph B; Carey, David J; Ledbetter, David H; Cardon, Lon; Yancopoulos, George D; Economides, Aris; Coppola, Giovanni; Shuldiner, Alan R; Balasubramanian, Suganthi; Cantor, Michael; Nelson, Matthew R; Whittaker, John; Reid, Jeffrey G; Marchini, Jonathan; Overton, John D; Scott, Robert A; Abecasis, Gonçalo R; Yerges-Armstrong, Laura.

Nature ; 586(7831): 749-756, 2020 10.

Article in English | MEDLINE | ID: mdl-33087929

ABSTRACT

The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world1. Here we describe the release of exome-sequence data for the first 49,960 study participants, revealing approximately 4 million coding variants (of which around 98.6% have a frequency of less than 1%). The data include 198,269 autosomal predicted loss-of-function (LOF) variants, a more than 14-fold increase compared to the imputed sequence. Nearly all genes (more than 97%) had at least one carrier with a LOF variant, and most genes (more than 69%) had at least ten carriers with a LOF variant. We illustrate the power of characterizing LOF variants in this population through association analyses across 1,730 phenotypes. In addition to replicating established associations, we found novel LOF variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical importance, and show that 2% of this population has a medically actionable variant. Furthermore, we characterize the penetrance of cancer in carriers of pathogenic BRCA1 and BRCA2 variants. Exome sequences from the first 49,960 participants highlight the promise of genome sequencing in large population-based studies and are now accessible to the scientific community.

Subjects

Databases, Genetic , Exome Sequencing , Exome/genetics , Loss of Function Mutation/genetics , Phenotype , Aged , Bone Density/genetics , Collagen Type VI/genetics , Demography , Female , Genes, BRCA1 , Genes, BRCA2 , Genotype , Humans , Ion Channels/genetics , Male , Middle Aged , Neoplasms/genetics , Penetrance , Peptide Fragments/genetics , United Kingdom , Varicose Veins/genetics , ras GTPase-Activating Proteins/genetics

18.

Inferring Gene-by-Environment Interactions with a Bayesian Whole-Genome Regression Model.

Kerin, Matthew; Marchini, Jonathan.

Am J Hum Genet ; 107(4): 698-713, 2020 10 01.

Article in English | MEDLINE | ID: mdl-32888427

ABSTRACT

The contribution of gene-by-environment (GxE) interactions for many human traits and diseases is poorly characterized. We propose a Bayesian whole-genome regression model for joint modeling of main genetic effects and GxE interactions in large-scale datasets, such as the UK Biobank, where many environmental variables have been measured. The method is called LEMMA (Linear Environment Mixed Model Analysis) and estimates a linear combination of environmental variables, called an environmental score (ES), that interacts with genetic markers throughout the genome. The ES provides a readily interpretable way to examine the combined effect of many environmental variables. The ES can be used both to estimate the proportion of phenotypic variance attributable to GxE effects and to test for GxE effects at genetic variants across the genome. GxE effects can induce heteroskedasticity in quantitative traits, and LEMMA accounts for this by using robust standard error estimates when testing for GxE effects. When applied to body mass index, systolic blood pressure, diastolic blood pressure, and pulse pressure in the UK Biobank, we estimate that 9.3%, 3.9%, 1.6%, and 12.5%, respectively, of phenotypic variance is explained by GxE interactions and that low-frequency variants explain most of this variance. We also identify three loci that interact with the estimated environmental scores (-log10p>7.3).

Subjects

Gene-Environment Interaction , Genome, Human , Models, Statistical , Quantitative Trait Loci , Quantitative Trait, Heritable , Bayes Theorem , Blood Pressure/physiology , Body Mass Index , Datasets as Topic , Genetic Markers , Humans , United Kingdom

19.

Common Genetic Variation Indicates Separate Causes for Periventricular and Deep White Matter Hyperintensities.

Armstrong, Nicola J; Mather, Karen A; Sargurupremraj, Muralidharan; Knol, Maria J; Malik, Rainer; Satizabal, Claudia L; Yanek, Lisa R; Wen, Wei; Gudnason, Vilmundur G; Dueker, Nicole D; Elliott, Lloyd T; Hofer, Edith; Bis, Joshua; Jahanshad, Neda; Li, Shuo; Logue, Mark A; Luciano, Michelle; Scholz, Markus; Smith, Albert V; Trompet, Stella; Vojinovic, Dina; Xia, Rui; Alfaro-Almagro, Fidel; Ames, David; Amin, Najaf; Amouyel, Philippe; Beiser, Alexa S; Brodaty, Henry; Deary, Ian J; Fennema-Notestine, Christine; Gampawar, Piyush G; Gottesman, Rebecca; Griffanti, Ludovica; Jack, Clifford R; Jenkinson, Mark; Jiang, Jiyang; Kral, Brian G; Kwok, John B; Lampe, Leonie; C M Liewald, David; Maillard, Pauline; Marchini, Jonathan; Bastin, Mark E; Mazoyer, Bernard; Pirpamer, Lukas; Rafael Romero, José; Roshchupkin, Gennady V; Schofield, Peter R; Schroeter, Matthias L; Stott, David J.

Stroke ; 51(7): 2111-2121, 2020 07.

Article in English | MEDLINE | ID: mdl-32517579

ABSTRACT

BACKGROUND AND PURPOSE: Periventricular white matter hyperintensities (WMH; PVWMH) and deep WMH (DWMH) are regional classifications of WMH and reflect proposed differences in cause. In the first study, to date, we undertook genome-wide association analyses of DWMH and PVWMH to show that these phenotypes have different genetic underpinnings. METHODS: Participants were aged 45 years and older, free of stroke and dementia. We conducted genome-wide association analyses of PVWMH and DWMH in 26,654 participants from CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology), ENIGMA (Enhancing Neuro-Imaging Genetics Through Meta-Analysis), and the UKB (UK Biobank). Regional correlations were investigated using the genome-wide association analyses-pairwise method. Cross-trait genetic correlations between PVWMH, DWMH, stroke, and dementia were estimated using LDSC. RESULTS: In the discovery and replication analysis, for PVWMH only, we found associations on chromosomes 2 (NBEAL), 10q23.1 (TSPAN14/FAM231A), and 10q24.33 (SH3PXD2A). In the much larger combined meta-analysis of all cohorts, we identified ten significant regions for PVWMH: chromosomes 2 (3 regions), 6, 7, 10 (2 regions), 13, 16, and 17q23.1. New loci of interest include 7q36.1 (NOS3) and 16q24.2. In both the discovery/replication and combined analysis, we found genome-wide significant associations for the 17q25.1 locus for both DWMH and PVWMH. Using gene-based association analysis, 19 genes across all regions were identified for PVWMH only, including the new genes: CALCRL (2q32.1), KLHL24 (3q27.1), VCAN (5q27.1), and POLR2F (22q13.1). Thirteen genes in the 17q25.1 locus were significant for both phenotypes. More extensive genetic correlations were observed for PVWMH with small vessel ischemic stroke. There were no associations with dementia for either phenotype. CONCLUSIONS: Our study confirms these phenotypes have distinct and also shared genetic architectures. Genetic analyses indicated PVWMH was more associated with ischemic stroke whilst DWMH loci were implicated in vascular, astrocyte, and neuronal function. Our study confirms these phenotypes are distinct neuroimaging classifications and identifies new candidate genes associated with PVWMH only.

Subjects

Brain/pathology , Cerebral Small Vessel Diseases/genetics , Cerebral Small Vessel Diseases/pathology , Genetic Predisposition to Disease/genetics , White Matter/pathology , Aged , Brain/diagnostic imaging , Cerebral Small Vessel Diseases/diagnostic imaging , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , White Matter/diagnostic imaging

20.

Retraction Note: 11,670 whole-genome sequences representative of the Han Chinese population from the CONVERGE project.

Cai, Na; Bigdeli, Tim B; Kretzschmar, Warren W; Li, Yihan; Liang, Jieqin; Hu, Jingchu; Peterson, Roseann E; Bacanu, Silviu; Webb, Bradley Todd; Riley, Brien; Li, Qibin; Marchini, Jonathan; Mott, Richard; Kendler, Kenneth S; Flint, Jonathan.

Sci Data ; 7(1): 123, 2020 04 16.

Article in English | MEDLINE | ID: mdl-32300216

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.
