Search | Virtual Health Library

1.

Large-scale machine-learning-based phenotyping significantly improves genomic discovery for optic nerve head morphology.

Alipanahi, Babak; Hormozdiari, Farhad; Behsaz, Babak; Cosentino, Justin; McCaw, Zachary R; Schorsch, Emanuel; Sculley, D; Dorfman, Elizabeth H; Foster, Paul J; Peng, Lily H; Phene, Sonia; Hammel, Naama; Carroll, Andrew; Khawaja, Anthony P; McLean, Cory Y.

Am J Hum Genet ; 108(7): 1217-1230, 2021 07 01.

Article in English | MEDLINE | ID: mdl-34077760

ABSTRACT

Genome-wide association studies (GWASs) require accurate cohort phenotyping, but expert labeling can be costly, time-intensive, and variable. Here, we develop a machine learning (ML) model to predict glaucomatous optic nerve head features from color fundus photographs. We used the model to predict vertical cup-to-disc ratio (VCDR), a diagnostic parameter and cardinal endophenotype for glaucoma, in 65,680 Europeans in the UK Biobank (UKB). A GWAS of ML-based VCDR identified 299 independent genome-wide significant (GWS; p ≤ 5 × 10⁻⁸) hits in 156 loci. The ML-based GWAS replicated 62 of 65 GWS loci from a recent VCDR GWAS in the UKB for which two ophthalmologists manually labeled images for 67,040 Europeans. The ML-based GWAS also identified 93 novel loci, significantly expanding our understanding of the genetic etiologies of glaucoma and VCDR. Pathway analyses support the biological significance of the novel hits to VCDR: select loci are near genes involved in neuronal and synaptic biology or harbor variants known to cause severe Mendelian ophthalmic disease. Finally, the ML-based GWAS results significantly improve polygenic prediction of VCDR and primary open-angle glaucoma in the independent EPIC-Norfolk cohort.

Subjects

Machine Learning , Optic Disk/anatomy & histology , Datasets as Topic , Fluorescein Angiography , Genome-Wide Association Study , Glaucoma, Open-Angle/diagnostic imaging , Humans , Models, Anatomic , Optic Disk/diagnostic imaging , Phenotype , Risk Assessment

2.

Predicting cardiovascular disease risk using photoplethysmography and deep learning.

Weng, Wei-Hung; Baur, Sebastien; Daswani, Mayank; Chen, Christina; Harrell, Lauren; Kakarmath, Sujay; Jabara, Mariam; Behsaz, Babak; McLean, Cory Y; Matias, Yossi; Corrado, Greg S; Shetty, Shravya; Prabhakara, Shruthi; Liu, Yun; Danaei, Goodarz; Ardila, Diego.

PLOS Glob Public Health ; 4(6): e0003204, 2024.

Article in English | MEDLINE | ID: mdl-38833495

ABSTRACT

Cardiovascular diseases (CVDs) are responsible for a large proportion of premature deaths in low- and middle-income countries. Early CVD detection and intervention are critical in these populations, yet many existing CVD risk scores require a physical examination or lab measurements, which can be challenging in such health systems due to limited accessibility. We investigated the potential to use photoplethysmography (PPG), a sensing technology available on most smartphones that can potentially enable large-scale screening at low cost, for CVD risk prediction. We developed a deep learning PPG-based CVD risk score (DLS) to predict the probability of having major adverse cardiovascular events (MACE: non-fatal myocardial infarction, stroke, and cardiovascular death) within ten years, given only age, sex, smoking status and PPG as predictors. We compare the DLS with the office-based refit-WHO score, which adopts the shared predictors from WHO and Globorisk scores (age, sex, smoking status, height, weight and systolic blood pressure) but is refitted on the UK Biobank (UKB) cohort. All models were trained on a development dataset (141,509 participants) and evaluated on a geographically separate test dataset (54,856 participants), both from UKB. The DLS's C-statistic (71.1%, 95% CI 69.9-72.4) is non-inferior to that of the office-based refit-WHO score (70.9%, 95% CI 69.7-72.2; non-inferiority margin of 2.5%, p<0.01) in the test dataset. The calibration of the DLS is satisfactory, with a 1.8% mean absolute calibration error. Adding DLS features to the office-based score increases the C-statistic by 1.0% (95% CI 0.6-1.4). The DLS predicts ten-year MACE risk with performance comparable to the office-based refit-WHO score. Interpretability analyses suggest that the DLS-extracted features are related to PPG waveform morphology and are independent of heart rate. Our study provides a proof-of-concept and suggests the potential of PPG-based strategies for community-based primary prevention in resource-limited regions.
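The two quantities this abstract leans on, a C-statistic and a non-inferiority decision, can be illustrated with a minimal sketch. It is not the study's method: it treats the outcome as a plain binary label and ignores censoring (a time-to-event analysis would need a censoring-aware concordance), and the function names and toy data are hypothetical.

```python
from itertools import product


def c_statistic(scores, events):
    """Concordance for a binary outcome (assumption: no censoring).

    Returns the fraction of (event, non-event) pairs in which the event
    case received the higher score; tied scores count as half.
    """
    cases = [s for s, e in zip(scores, events) if e == 1]
    controls = [s for s, e in zip(scores, events) if e == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for case, control in product(cases, controls):
        if case > control:
            concordant += 1.0
        elif case == control:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


def is_non_inferior(diff_ci_lower, margin):
    """Non-inferiority holds when the lower confidence bound of
    (new score minus reference score) lies above -margin."""
    return diff_ci_lower > -margin


# Toy data, invented for illustration only.
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
toy_events = [1, 0, 1, 0, 0]
toy_c = c_statistic(toy_scores, toy_events)
```

In practice the confidence bound passed to `is_non_inferior` would come from a resampling procedure such as the bootstrap applied to the difference in C-statistics; the abstract does not specify which procedure was used.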

3.

Utilizing multimodal AI to improve genetic analyses of cardiovascular traits.

Zhou, Yuchen; Cosentino, Justin; Yun, Taedong; Biradar, Mahantesh I; Shreibati, Jacqueline; Lai, Dongbing; Schwantes-An, Tae-Hwi; Luben, Robert; McCaw, Zachary; Engmann, Jorgen; Providencia, Rui; Schmidt, Amand Floriaan; Munroe, Patricia; Yang, Howard; Carroll, Andrew; Khawaja, Anthony P; McLean, Cory Y; Behsaz, Babak; Hormozdiari, Farhad.

medRxiv ; 2024 Mar 20.

Article in English | MEDLINE | ID: mdl-38562791

ABSTRACT

Electronic health records, biobanks, and wearable biosensors contain multiple high-dimensional clinical data (HDCD) modalities (e.g., ECG, photoplethysmography (PPG), and MRI) for each individual. Access to multimodal HDCD provides a unique opportunity for genetic studies of complex traits because different modalities relevant to a single physiological system (e.g., circulatory system) encode complementary and overlapping information. We propose a novel multimodal deep learning method, M-REGLE, for discovering genetic associations from a joint representation of multiple complementary HDCD modalities. We showcase the effectiveness of this model by applying it to several cardiovascular modalities. M-REGLE jointly learns a lower-dimensional representation (i.e., latent factors) of multimodal HDCD using a convolutional variational autoencoder, performs genome-wide association studies (GWAS) on each latent factor, then combines the results to study the genetics of the underlying system. To validate the advantages of M-REGLE and multimodal learning, we apply it to common cardiovascular modalities (PPG and ECG), and compare its results to unimodal learning methods in which representations are learned from each data modality separately, but the downstream genetic analyses are performed on the combined unimodal representations. M-REGLE identifies 19.3% more loci on the 12-lead ECG dataset and 13.0% more loci on the ECG lead I + PPG dataset, and its genetic risk score significantly outperforms the unimodal risk score at predicting cardiac phenotypes, such as atrial fibrillation (Afib), in multiple biobanks.
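The abstract does not say how the per-latent-factor GWAS results are combined. One common option, assumed here purely for illustration, is to sum the 1-degree-of-freedom chi-square statistics of a variant across k approximately uncorrelated latent factors and compare the total against a chi-square distribution with k degrees of freedom. The function names and toy statistics below are hypothetical.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability of a chi-square variable with integer dof.

    Uses the recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    with h = x / 2, started from Q(1/2, h) = erfc(sqrt(h)) for odd dof
    or Q(1, h) = exp(-h) for even dof.
    """
    if dof < 1 or x < 0:
        raise ValueError("dof must be >= 1 and x must be >= 0")
    if x == 0:
        return 1.0
    half = x / 2.0
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    while a < dof / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


def combine_latent_gwas(chi2_per_latent):
    """Sum one variant's per-factor chi-square statistics.

    Assumption: the latent factors are (approximately) uncorrelated, so
    under the null the sum follows a chi-square with len(...) dof.
    """
    total = sum(chi2_per_latent)
    return total, chi2_sf(total, len(chi2_per_latent))


# Toy per-factor statistics for a single variant, invented for illustration.
toy_total, toy_p = combine_latent_gwas([10.0, 6.0, 4.0, 12.0])
```

If the latent factors are noticeably correlated, the summed statistic no longer follows this null distribution, and the factors would first need to be decorrelated (for example by a principal component rotation) before such a combination is meaningful.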

4.

Unsupervised representation learning on high-dimensional clinical data improves genomic discovery and prediction.

Yun, Taedong; Cosentino, Justin; Behsaz, Babak; McCaw, Zachary R; Hill, Davin; Luben, Robert; Lai, Dongbing; Bates, John; Yang, Howard; Schwantes-An, Tae-Hwi; Zhou, Yuchen; Khawaja, Anthony P; Carroll, Andrew; Hobbs, Brian D; Cho, Michael H; McLean, Cory Y; Hormozdiari, Farhad.

Nat Genet ; 2024 Jul 08.

Article in English | MEDLINE | ID: mdl-38977853

ABSTRACT

Although high-dimensional clinical data (HDCD) are increasingly available in biobank-scale datasets, their use for genetic discovery remains challenging. Here we introduce an unsupervised deep learning model, Representation Learning for Genetic Discovery on Low-Dimensional Embeddings (REGLE), for discovering associations between genetic variants and HDCD. REGLE leverages variational autoencoders to compute nonlinear disentangled embeddings of HDCD, which become the inputs to genome-wide association studies (GWAS). REGLE can uncover features not captured by existing expert-defined features and enables the creation of accurate disease-specific polygenic risk scores (PRSs) in datasets with very little labeled data. We apply REGLE to perform GWAS on respiratory and circulatory HDCD: spirograms measuring lung function and photoplethysmograms measuring blood volume changes. REGLE replicates known loci while identifying others not previously detected. REGLE embeddings are predictive of overall survival, and PRSs constructed from REGLE loci improve disease prediction across multiple biobanks. Overall, REGLE embeddings contain clinically relevant information beyond that captured by existing expert-defined features, leading to improved genetic discovery and disease prediction.

5.

Inference of chronic obstructive pulmonary disease with deep learning on raw spirograms identifies new genetic loci and improves risk models.

Cosentino, Justin; Behsaz, Babak; Alipanahi, Babak; McCaw, Zachary R; Hill, Davin; Schwantes-An, Tae-Hwi; Lai, Dongbing; Carroll, Andrew; Hobbs, Brian D; Cho, Michael H; McLean, Cory Y; Hormozdiari, Farhad.

Nat Genet ; 55(5): 787-795, 2023 05.

Article in English | MEDLINE | ID: mdl-37069358

ABSTRACT

Chronic obstructive pulmonary disease (COPD), the third leading cause of death worldwide, is highly heritable. While COPD is clinically defined by applying thresholds to summary measures of lung function, a quantitative liability score has more power to identify genetic signals. Here we train a deep convolutional neural network on noisy self-reported and International Classification of Diseases labels to predict COPD case-control status from high-dimensional raw spirograms and use the model's predictions as a liability score. The machine-learning-based (ML-based) liability score accurately discriminates COPD cases and controls, and predicts COPD-related hospitalization without any domain-specific knowledge. Moreover, the ML-based liability score is associated with overall survival and exacerbation events. A genome-wide association study on the ML-based liability score replicates existing COPD and lung function loci and also identifies 67 new loci. Lastly, our method provides a general framework to use ML methods and medical-record-based labels that does not require domain knowledge or expert curation to improve disease prediction and genomic discovery for drug design.

Subjects

Deep Learning , Pulmonary Disease, Chronic Obstructive , Humans , Genome-Wide Association Study/methods , Pulmonary Disease, Chronic Obstructive/genetics , Genetic Loci , Polymorphism, Single Nucleotide/genetics

6.

Unsupervised representation learning improves genomic discovery and risk prediction for respiratory and circulatory functions and diseases.

Yun, Taedong; Cosentino, Justin; Behsaz, Babak; McCaw, Zachary R; Hill, Davin; Luben, Robert; Lai, Dongbing; Bates, John; Yang, Howard; Schwantes-An, Tae-Hwi; Zhou, Yuchen; Khawaja, Anthony P; Carroll, Andrew; Hobbs, Brian D; Cho, Michael H; McLean, Cory Y; Hormozdiari, Farhad.

medRxiv ; 2023 Aug 29.

Article in English | MEDLINE | ID: mdl-37163049

ABSTRACT

High-dimensional clinical data are becoming more accessible in biobank-scale datasets. However, effectively utilizing high-dimensional clinical data for genetic discovery remains challenging. Here we introduce a general deep learning-based framework, REpresentation learning for Genetic discovery on Low-dimensional Embeddings (REGLE), for discovering associations between genetic variants and high-dimensional clinical data. REGLE uses convolutional variational autoencoders to compute a non-linear, low-dimensional, disentangled embedding of the data with highly heritable individual components. REGLE can incorporate expert-defined or clinical features and provides a framework to create accurate disease-specific polygenic risk scores (PRS) in datasets which have minimal expert phenotyping. We apply REGLE to both respiratory and circulatory systems: spirograms which measure lung function and photoplethysmograms (PPG) which measure blood volume changes. Genome-wide association studies on REGLE embeddings identify more genome-wide significant loci than existing methods and replicate known loci for both spirograms and PPG, demonstrating the generality of the framework. Furthermore, these embeddings are associated with overall survival. Finally, we construct a set of PRSs that improve predictive performance of asthma, chronic obstructive pulmonary disease, hypertension, and systolic blood pressure in multiple biobanks. Thus, REGLE embeddings can quantify clinically relevant features that are not currently captured in a standardized or automated way.
