Unsupervised representation learning improves genomic discovery and risk prediction for respiratory and circulatory functions and diseases.
Yun, Taedong; Cosentino, Justin; Behsaz, Babak; McCaw, Zachary R; Hill, Davin; Luben, Robert; Lai, Dongbing; Bates, John; Yang, Howard; Schwantes-An, Tae-Hwi; Zhou, Yuchen; Khawaja, Anthony P; Carroll, Andrew; Hobbs, Brian D; Cho, Michael H; McLean, Cory Y; Hormozdiari, Farhad.
Affiliation
  • Yun T; Google Research, Cambridge, MA 02142, USA.
  • Cosentino J; Google Research, Palo Alto, CA 94304, USA.
  • Behsaz B; Google Research, Cambridge, MA 02142, USA.
  • McCaw ZR; Google Research, Palo Alto, CA 94304, USA.
  • Hill D; Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 94304, USA.
  • Luben R; Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA.
  • Lai D; NIHR Biomedical Research Centre at Moorfields Eye Hospital & UCL Institute of Ophthalmology, London EC1V 9EL, UK.
  • Bates J; MRC Epidemiology Unit, University of Cambridge, Cambridge CB2 0SL, UK.
  • Yang H; Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA.
  • Schwantes-An TH; Verily Life Sciences, South San Francisco, CA 94080, USA.
  • Zhou Y; Google Research, Palo Alto, CA 94304, USA.
  • Khawaja AP; Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA.
  • Carroll A; Division of Cardiology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202, USA.
  • Hobbs BD; Google Research, Cambridge, MA 02142, USA.
  • Cho MH; NIHR Biomedical Research Centre at Moorfields Eye Hospital & UCL Institute of Ophthalmology, London EC1V 9EL, UK.
  • McLean CY; MRC Epidemiology Unit, University of Cambridge, Cambridge CB2 0SL, UK.
  • Hormozdiari F; Google Research, Palo Alto, CA 94304, USA.
medRxiv; 2023 Aug 29.
Article in En | MEDLINE | ID: mdl-37163049
ABSTRACT
High-dimensional clinical data are becoming more accessible in biobank-scale datasets. However, using these data effectively for genetic discovery remains challenging. Here we introduce a general deep learning-based framework, REpresentation learning for Genetic discovery on Low-dimensional Embeddings (REGLE), for discovering associations between genetic variants and high-dimensional clinical data. REGLE uses convolutional variational autoencoders to compute a non-linear, low-dimensional, disentangled embedding of the data with highly heritable individual components. REGLE can incorporate expert-defined or clinical features and provides a framework to create accurate disease-specific polygenic risk scores (PRS) in datasets that have minimal expert phenotyping. We apply REGLE to both respiratory and circulatory systems: spirograms, which measure lung function, and photoplethysmograms (PPG), which measure blood volume changes. Genome-wide association studies on REGLE embeddings identify more genome-wide significant loci than existing methods and replicate known loci for both spirograms and PPG, demonstrating the generality of the framework. Furthermore, these embeddings are associated with overall survival. Finally, we construct a set of PRSs that improve predictive performance for asthma, chronic obstructive pulmonary disease, hypertension, and systolic blood pressure in multiple biobanks. Thus, REGLE embeddings can quantify clinically relevant features that are not currently captured in a standardized or automated way.
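
The abstract does not state REGLE's loss function or architecture in detail. As a rough illustration only, the sketch below shows the two generic ingredients of a standard variational autoencoder objective: the reparameterized sample of the latent embedding and the KL regularizer toward a standard normal prior. The function names, the squared-error reconstruction term, and the unweighted sum of the two terms are assumptions made for this example, not details taken from the paper.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per latent coordinate."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over coordinates."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def negative_elbo(x, x_hat, mu, log_var):
    """Assumed loss: squared-error reconstruction plus the KL regularizer."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + kl_to_standard_normal(mu, log_var)

# Hypothetical toy values: a 2-dimensional latent embedding.
rng = random.Random(0)
z = reparameterize([0.0, 1.0], [0.0, 0.0], rng)
loss = negative_elbo([1.0, 2.0], [1.0, 2.0], [0.0, 1.0], [0.0, 0.0])
```

In a pipeline of the kind the abstract describes, each coordinate of the learned embedding would then be treated as a quantitative phenotype for a genome-wide association study; that step is not shown here.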

Full text: 1 | Database: MEDLINE | Study type: Etiology_studies / Prognostic_studies / Risk_factors_studies | Language: En | Journal: MedRxiv | Publication year: 2023 | Document type: Article | Affiliation country: United States
