Search | Secretaria de Estado da Saúde

A comprehensive analysis of clinical and polygenic germline influences on somatic mutational burden.

Taraszka, Kodi; Groha, Stefan; King, David; Tell, Robert; White, Kevin; Ziv, Elad; Zaitlen, Noah; Gusev, Alexander.

Am J Hum Genet; 111(2): 242-258, 2024 Feb 01.

Article in English | MEDLINE | ID: mdl-38211585

ABSTRACT

Tumor mutational burden (TMB), the total number of somatic mutations in the tumor, and copy number burden (CNB), the corresponding measure of aneuploidy, are established fundamental somatic features and emerging biomarkers for immunotherapy. However, the genetic and non-genetic influences on TMB/CNB and, critically, the manner by which they influence patient outcomes remain poorly understood. Here, we present a large germline-somatic study of TMB/CNB with >23,000 individuals across 17 cancer types, of which 12,000 also have extensive clinical, treatment, and overall survival (OS) measurements available. We report dozens of clinical associations with TMB/CNB, observing older age and male sex to have a strong effect on TMB and weaker impact on CNB. We additionally identified significant germline influences on TMB/CNB, including fine-scale European ancestry and germline polygenic risk scores (PRSs) for smoking, tanning, white blood cell counts, and educational attainment. We quantify the causal effect of exposures on somatic mutational processes using Mendelian randomization. Many of the identified features associated with TMB/CNB were additionally associated with OS for individuals treated at a single tertiary cancer center. For individuals receiving immunotherapy, we observed a complex relationship between PRSs for educational attainment, self-reported college attainment, TMB, and survival, suggesting that the influence of this biomarker may be substantially modified by socioeconomic status. While the accumulation of somatic alterations is a stochastic process, our work demonstrates that it can be shaped by host characteristics including germline genetics.

Subjects

Neoplasms; Humans; Male; Mutation/genetics; Neoplasms/genetics; Neoplasms/pathology; Immunotherapy; Biomarkers, Tumor/genetics; Germ Cells/pathology

Leveraging pleiotropy for joint analysis of genome-wide association studies with per trait interpretations.

Taraszka, Kodi; Zaitlen, Noah; Eskin, Eleazar.

PLoS Genet; 18(11): e1010447, 2022 Nov.

Article in English | MEDLINE | ID: mdl-36342933

ABSTRACT

We introduce the pleiotropic association test (PAT) for joint analysis of multiple traits using genome-wide association study (GWAS) summary statistics. The method utilizes the decomposition of phenotypic covariation into genetic and environmental components to create a likelihood ratio test statistic for each genetic variant. Though PAT does not directly interpret which trait(s) drive the association, a per trait interpretation of the omnibus p-value is provided through an extension to the meta-analysis framework, m-values. In simulations, we show PAT controls the false positive rate, increases statistical power, and is robust to model misspecifications of genetic effect. Additionally, simulations comparing PAT to three multi-trait methods, HIPO, MTAG, and ASSET, show PAT identified 15.3% more omnibus associations over the next best method. When these associations were interpreted on a per trait level using m-values, PAT had 37.5% more true per trait interpretations with a 0.92% false positive assignment rate. When analyzing four traits from the UK Biobank, PAT discovered 22,095 novel variants. Through the m-values interpretation framework, the number of per trait associations was almost tripled for two traits and nearly doubled for another trait relative to the original single trait GWAS.

Subjects

Genome-Wide Association Study; Polymorphism, Single Nucleotide; Genetic Pleiotropy; Genome-Wide Association Study/methods; Phenotype; Polymorphism, Single Nucleotide/genetics; Meta-Analysis as Topic

Identifying causal variants by fine mapping across multiple studies.

LaPierre, Nathan; Taraszka, Kodi; Huang, Helen; He, Rosemary; Hormozdiari, Farhad; Eskin, Eleazar.

PLoS Genet; 17(9): e1009733, 2021 Sep.

Article in English | MEDLINE | ID: mdl-34543273

ABSTRACT

Increasingly large Genome-Wide Association Studies (GWAS) have yielded numerous variants associated with many complex traits, motivating the development of "fine mapping" methods to identify which of the associated variants are causal. Additionally, GWAS of the same trait for different populations are increasingly available, raising the possibility of refining fine mapping results further by leveraging different linkage disequilibrium (LD) structures across studies. Here, we introduce multiple study causal variants identification in associated regions (MsCAVIAR), a method that extends the popular CAVIAR fine mapping framework to a multiple study setting using a random effects model. MsCAVIAR only requires summary statistics and LD as input, accounts for uncertainty in association statistics using a multivariate normal model, allows for multiple causal variants at a locus, and explicitly models the possibility of different SNP effect sizes in different populations. We demonstrate the efficacy of MsCAVIAR in both a simulation study and a trans-ethnic, trans-biobank fine mapping analysis of High Density Lipoprotein (HDL).

Subjects

Genome-Wide Association Study; Causality; Chromosome Mapping/methods; Humans; Linkage Disequilibrium; Lipoproteins, HDL/genetics; Polymorphism, Single Nucleotide

MSGene: a multistate model using genetic risk and the electronic health record applied to lifetime risk of coronary artery disease.

Urbut, Sarah M; Yeung, Ming Wai; Khurshid, Shaan; Cho, So Mi Jemma; Schuermans, Art; German, Jakob; Taraszka, Kodi; Paruchuri, Kaavya; Fahed, Akl C; Ellinor, Patrick T; Trinquart, Ludovic; Parmigiani, Giovanni; Gusev, Alexander; Natarajan, Pradeep.

Nat Commun; 15(1): 4884, 2024 Jun 07.

Article in English | MEDLINE | ID: mdl-38849421

ABSTRACT

Coronary artery disease (CAD) is the leading cause of death among adults worldwide. Accurate risk stratification can support optimal lifetime prevention. Current methods lack the ability to incorporate new information throughout the life course or to combine innate genetic risk factors with acquired lifetime risk. We designed a general multistate model (MSGene) to estimate age-specific transitions across 10 cardiometabolic states, dependent on clinical covariates and a CAD polygenic risk score. This model is designed to handle longitudinal data over the lifetime to address this unmet need and support clinical decision-making. We analyze longitudinal data from 480,638 UK Biobank participants and compare predicted lifetime risk with the 30-year Framingham risk score. MSGene improves discrimination (C-index 0.71 vs 0.66), age of high-risk detection (C-index 0.73 vs 0.52), and overall prediction (RMSE 1.1% vs 10.9%) in held-out data. We also use MSGene to refine estimates of lifetime absolute risk reduction from statin initiation. Our findings underscore our multistate model's potential public health value for accurate lifetime CAD risk estimation using clinical factors and increasingly available genetics toward earlier, more effective prevention.

Subjects

Coronary Artery Disease; Electronic Health Records; Humans; Coronary Artery Disease/genetics; Coronary Artery Disease/epidemiology; Male; Female; Middle Aged; Electronic Health Records/statistics & numerical data; Aged; Risk Assessment/methods; Risk Factors; Adult; Genetic Predisposition to Disease; Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use; United Kingdom/epidemiology; Longitudinal Studies; Multifactorial Inheritance/genetics

Ensemble neural network model for detecting thyroid eye disease using external photographs.

Karlin, Justin; Gai, Lisa; LaPierre, Nathan; Danesh, Kayla; Farajzadeh, Justin; Palileo, Bea; Taraszka, Kodi; Zheng, Jie; Wang, Wei; Eskin, Eleazar; Rootman, Daniel.

Br J Ophthalmol; 107(11): 1722-1729, 2023 Nov.

Article in English | MEDLINE | ID: mdl-36126104

ABSTRACT

PURPOSE: To describe an artificial intelligence platform that detects thyroid eye disease (TED). DESIGN: Development of a deep learning model. METHODS: 1944 photographs from a clinical database were used to train a deep learning model. 344 additional images ('test set') were used to calculate performance metrics. Receiver operating characteristic, precision-recall curves and heatmaps were generated. From the test set, 50 images were randomly selected ('survey set') and used to compare model performance with ophthalmologist performance. 222 images obtained from a separate clinical database were used to assess model recall and to quantitate model performance with respect to disease stage and grade. RESULTS: The model achieved test set accuracy of 89.2%, specificity 86.9%, recall 93.4%, precision 79.7% and an F1 score of 86.0%. Heatmaps demonstrated that the model identified pixels corresponding to clinical features of TED. On the survey set, the ensemble model achieved accuracy, specificity, recall, precision and F1 score of 86%, 84%, 89%, 77% and 82%, respectively. 27 ophthalmologists achieved mean performance of 75%, 82%, 63%, 72% and 66%, respectively. On the second test set, the model achieved recall of 91.9%, with higher recall for moderate to severe (98.2%, n=55) and active disease (98.3%, n=60), as compared with mild (86.8%, n=68) or stable disease (85.7%, n=63). CONCLUSIONS: The deep learning classifier is a novel approach to identify TED and is a first step in the development of tools to improve diagnostic accuracy and lower barriers to specialist evaluation.

MSGene: Derivation and validation of a multistate model for lifetime risk of coronary artery disease using genetic risk and the electronic health record.

Urbut, Sarah M; Yeung, Ming Wai; Khurshid, Shaan; Cho, So Mi Jemma; Schuermans, Art; German, Jakob; Taraszka, Kodi; Fahed, Akl C; Ellinor, Patrick; Trinquart, Ludovic; Parmigiani, Giovanni; Gusev, Alexander; Natarajan, Pradeep.

medRxiv; 2023 Nov 08.

Article in English | MEDLINE | ID: mdl-37986972

ABSTRACT

Currently, coronary artery disease (CAD) is the leading cause of death among adults worldwide. Accurate risk stratification can support optimal lifetime prevention. We designed a novel and general multistate model (MSGene) to estimate age-specific transitions across 10 cardiometabolic states, dependent on clinical covariates and a CAD polygenic risk score. MSGene supports decision making about CAD prevention related to any of these states. We analyzed longitudinal data from 480,638 UK Biobank participants and compared predicted lifetime risk with the 30-year Framingham risk score. MSGene improved discrimination (C-index 0.71 vs 0.66), age of high-risk detection (C-index 0.73 vs 0.52), and overall prediction (RMSE 1.1% vs 10.9%), with external validation. We also used MSGene to refine estimates of lifetime absolute risk reduction from statin initiation. Our findings underscore the potential public health value of our novel multistate model for accurate lifetime CAD risk estimation using clinical factors and increasingly available genetics.

Ancestry-driven recalibration of tumor mutational burden and disparate clinical outcomes in response to immune checkpoint inhibitors.

Nassar, Amin H; Adib, Elio; Abou Alaiwi, Sarah; El Zarif, Talal; Groha, Stefan; Akl, Elie W; Nuzzo, Pier Vitale; Mouhieddine, Tarek H; Perea-Chamblee, Tomin; Taraszka, Kodi; El-Khoury, Habib; Labban, Muhieddine; Fong, Christopher; Arora, Kanika S; Labaki, Chris; Xu, Wenxin; Sonpavde, Guru; Haddad, Robert I; Mouw, Kent W; Giannakis, Marios; Hodi, F Stephen; Zaitlen, Noah; Schoenfeld, Adam J; Schultz, Nikolaus; Berger, Michael F; MacConaill, Laura E; Ananda, Guruprasad; Kwiatkowski, David J; Choueiri, Toni K; Schrag, Deborah; Carrot-Zhang, Jian; Gusev, Alexander.

Cancer Cell; 40(10): 1161-1172.e5, 2022 Oct 10.

Article in English | MEDLINE | ID: mdl-36179682

ABSTRACT

The immune checkpoint inhibitor (ICI) pembrolizumab is US FDA approved for treatment of solid tumors with high tumor mutational burden (TMB-high; ≥10 variants/Mb). However, the extent to which TMB-high generalizes as an accurate biomarker in diverse patient populations is largely unknown. Using two clinical cohorts, we investigated the interplay between genetic ancestry, TMB, and tumor-only versus tumor-normal paired sequencing in solid tumors. TMB estimates from tumor-only panels substantially overclassified individuals into the clinically important TMB-high group due to germline contamination, and this bias was particularly pronounced in patients with Asian/African ancestry. Among patients with non-small cell lung cancer treated with ICIs, those misclassified as TMB-high from tumor-only panels did not associate with improved outcomes. TMB-high was significantly associated with improved outcomes only in European ancestries and merits validation in non-European ancestry populations. Ancestry-aware tumor-only TMB calibration and ancestry-diverse biomarker studies are critical to ensure that existing disparities are not exacerbated in precision medicine.

Subjects

Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Biomarkers, Tumor/genetics; Carcinoma, Non-Small-Cell Lung/drug therapy; Carcinoma, Non-Small-Cell Lung/genetics; Carcinoma, Non-Small-Cell Lung/pathology; Humans; Immune Checkpoint Inhibitors/pharmacology; Immune Checkpoint Inhibitors/therapeutic use; Lung Neoplasms/genetics; Mutation; Tumor Burden

Constructing germline research cohorts from the discarded reads of clinical tumor sequences.

Gusev, Alexander; Groha, Stefan; Taraszka, Kodi; Semenov, Yevgeniy R; Zaitlen, Noah.

Genome Med; 13(1): 179, 2021 Nov 08.

Article in English | MEDLINE | ID: mdl-34749793

ABSTRACT

BACKGROUND: Hundreds of thousands of cancer patients have had targeted (panel) tumor sequencing to identify clinically meaningful mutations. In addition to improving patient outcomes, this activity has led to significant discoveries in basic and translational domains. However, the targeted nature of clinical tumor sequencing has a limited scope, especially for germline genetics. In this work, we assess the utility of discarded, off-target reads from tumor-only panel sequencing for the recovery of genome-wide germline genotypes through imputation. METHODS: We developed a framework for inference of germline variants from tumor panel sequencing, including imputation, quality control, inference of genetic ancestry, germline polygenic risk scores, and HLA alleles. We benchmarked our framework on 833 individuals with tumor sequencing and matched germline SNP array data. We then applied our approach to a prospectively collected panel sequencing cohort of 25,889 tumors. RESULTS: We demonstrate high to moderate accuracy of each inferred feature relative to direct germline SNP array genotyping: individual common variants were imputed with a mean accuracy (correlation) of 0.86, genetic ancestry was inferred with a correlation of > 0.98, polygenic risk scores were inferred with a correlation of > 0.90, and individual HLA alleles were inferred with a correlation of > 0.80. We demonstrate a minimal influence on the accuracy of somatic copy number alterations and other tumor features. We showcase the feasibility and utility of our framework by analyzing 25,889 tumors and identifying the relationships between genetic ancestry, polygenic risk, and tumor characteristics that could not be studied with conventional on-target tumor data. CONCLUSIONS: We conclude that targeted tumor sequencing can be leveraged to build rich germline research cohorts from existing data and make our analysis pipeline publicly available to facilitate this effort.

Subjects

Genetic Predisposition to Disease/genetics; Germ Cells; Neoplasms/genetics; Sequence Analysis, DNA; Alleles; Computational Biology; DNA Copy Number Variations; Gene Frequency; Genome-Wide Association Study; Genotype; Genotyping Techniques; High-Throughput Nucleotide Sequencing; Humans; Mutation; Polymorphism, Single Nucleotide

Technology dictates algorithms: recent developments in read alignment.

Alser, Mohammed; Rotman, Jeremy; Deshpande, Dhrithi; Taraszka, Kodi; Shi, Huwenbo; Baykal, Pelin Icer; Yang, Harry Taegyun; Xue, Victor; Knyazev, Sergey; Singer, Benjamin D; Balliu, Brunilda; Koslicki, David; Skums, Pavel; Zelikovsky, Alex; Alkan, Can; Mutlu, Onur; Mangul, Serghei.

Genome Biol; 22(1): 249, 2021 Aug 26.

Article in English | MEDLINE | ID: mdl-34446078

ABSTRACT

Aligning sequencing reads onto a reference is an essential step of the majority of genomic analysis pipelines. Computational algorithms for read alignment have evolved in accordance with technological advances, leading to today's diverse array of alignment methods. We provide a systematic survey of algorithmic foundations and methodologies across 107 alignment methods, for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read alignment. We discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology.

Subjects

Algorithms; Computational Biology/methods; Sequence Alignment; Genome, Human; HIV/physiology; Humans; Metagenomics; Sulfites