Results 1 - 20 of 49
1.
Nat Commun ; 15(1): 4304, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38773065

ABSTRACT

Increased left atrial volume and decreased left atrial function have long been associated with atrial fibrillation. The availability of large-scale cardiac magnetic resonance imaging data paired with genetic data provides a unique opportunity to assess the genetic contributions to left atrial structure and function, and understand their relationship with risk for atrial fibrillation. Here, we use deep learning and surface reconstruction models to measure left atrial minimum volume, maximum volume, stroke volume, and emptying fraction in 40,558 UK Biobank participants. In a genome-wide association study of 35,049 participants without pre-existing cardiovascular disease, we identify 20 common genetic loci associated with left atrial structure and function. We find that polygenic contributions to increased left atrial volume are associated with atrial fibrillation and its downstream consequences, including stroke. Through Mendelian randomization, we find evidence supporting a causal role for left atrial enlargement and dysfunction on atrial fibrillation risk.


Subjects
Atrial Fibrillation , Deep Learning , Genome-Wide Association Study , Heart Atria , Humans , Atrial Fibrillation/physiopathology , Atrial Fibrillation/genetics , Atrial Fibrillation/diagnostic imaging , Heart Atria/diagnostic imaging , Heart Atria/physiopathology , Heart Atria/pathology , Male , Female , Middle Aged , Aged , Magnetic Resonance Imaging , Mendelian Randomization Analysis , Risk Factors , Atrial Function, Left/physiology , Stroke Volume , Stroke , United Kingdom/epidemiology , Genetic Loci , Genetic Predisposition to Disease
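The four left atrial measures named in the abstract above are related by simple arithmetic. The following is a minimal sketch, assuming the conventional definitions (stroke volume as maximum minus minimum volume, emptying fraction as stroke volume divided by maximum volume). The function name and the example volumes are hypothetical and are not taken from the study.

```python
def left_atrial_function(la_min_ml: float, la_max_ml: float) -> tuple[float, float]:
    """Return (stroke volume in mL, emptying fraction as a 0-1 ratio).

    Assumes the conventional definitions; the study's exact
    implementation is not described in the abstract.
    """
    if la_min_ml < 0 or la_max_ml <= 0 or la_min_ml > la_max_ml:
        raise ValueError("expected 0 <= la_min_ml <= la_max_ml and la_max_ml > 0")
    stroke_volume = la_max_ml - la_min_ml          # volume emptied per cycle
    emptying_fraction = stroke_volume / la_max_ml  # share of max volume emptied
    return stroke_volume, emptying_fraction


# Hypothetical volumes, for illustration only.
sv, ef = left_atrial_function(la_min_ml=30.0, la_max_ml=75.0)
```

Under these definitions, a larger minimum volume at a fixed maximum volume lowers both stroke volume and emptying fraction.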
2.
Nat Med ; 30(6): 1749-1760, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38806679

ABSTRACT

Fibrotic diseases affect multiple organs and are associated with morbidity and mortality. To examine organ-specific and shared biologic mechanisms that underlie fibrosis in different organs, we developed machine learning models to quantify T1 time, a marker of interstitial fibrosis, in the liver, pancreas, heart and kidney among 43,881 UK Biobank participants who underwent magnetic resonance imaging. In phenome-wide association analyses, we demonstrate the association of increased organ-specific T1 time, reflecting increased interstitial fibrosis, with prevalent diseases across multiple organ systems. In genome-wide association analyses, we identified 27, 18, 11 and 10 independent genetic loci associated with liver, pancreas, myocardial and renal cortex T1 time, respectively. There was a modest genetic correlation between the examined organs. Several loci overlapped across the examined organs implicating genes involved in a myriad of biologic pathways including metal ion transport (SLC39A8, HFE and TMPRSS6), glucose metabolism (PCK2), blood group antigens (ABO and FUT2), immune function (BANK1 and PPP3CA), inflammation (NFKB1) and mitosis (CENPE). Finally, we found that an increasing number of organs with T1 time falling in the top quintile was associated with increased mortality in the population. Individuals with a high burden of fibrosis in ≥3 organs had a 3-fold increase in mortality compared to those with a low burden of fibrosis across all examined organs in multivariable-adjusted analysis (hazard ratio = 3.31, 95% confidence interval 1.77-6.19; P = 1.78 × 10⁻⁴). By leveraging machine learning to quantify T1 time across multiple organs at scale, we uncovered new organ-specific and shared biologic pathways underlying fibrosis that may provide therapeutic targets.


Subjects
Fibrosis , Genome-Wide Association Study , Magnetic Resonance Imaging , Humans , Male , Female , Middle Aged , Machine Learning , Aged , Pancreas/pathology , Pancreas/diagnostic imaging , Organ Specificity/genetics , Kidney/pathology , Liver/pathology , Liver/metabolism , Myocardium/pathology , Myocardium/metabolism , Adult
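The hazard ratio, confidence interval and P value reported in the abstract above can be cross-checked against one another. This is a sketch under stated assumptions: it treats the interval as a Wald-type 95% interval that is symmetric on the log scale and the test as a two-sided normal test. The abstract does not say how the interval or P value was computed, and the function name is hypothetical.

```python
import math


def p_from_hr_ci(hr: float, ci_low: float, ci_high: float, z_crit: float = 1.96):
    """Recover an approximate z statistic and two-sided P value from HR and CI.

    Assumption: the CI is hr * exp(+/- z_crit * se) on the log-hazard scale.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)  # SE of log(HR)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p


# Values as reported in the abstract: HR 3.31, 95% CI 1.77-6.19.
z, p = p_from_hr_ci(3.31, 1.77, 6.19)
```

Under these assumptions the recovered P value is on the order of 10⁻⁴, which agrees with the reported figure.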
3.
Nat Biotechnol ; 42(4): 582-586, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37291427

ABSTRACT

Full-length RNA-sequencing methods using long-read technologies can capture complete transcript isoforms, but their throughput is limited. We introduce multiplexed arrays isoform sequencing (MAS-ISO-seq), a technique for programmably concatenating complementary DNAs (cDNAs) into molecules optimal for long-read sequencing, increasing the throughput >15-fold to nearly 40 million cDNA reads per run on the Sequel IIe sequencer. When applied to single-cell RNA sequencing of tumor-infiltrating T cells, MAS-ISO-seq demonstrated a 12- to 32-fold increase in the discovery of differentially spliced genes.


Subjects
High-Throughput Nucleotide Sequencing , RNA Isoforms , DNA, Complementary/genetics , RNA Isoforms/genetics , High-Throughput Nucleotide Sequencing/methods , Protein Isoforms/genetics , Sequence Analysis, RNA/methods , Transcriptome , Gene Expression Profiling/methods , RNA/genetics
4.
Nat Commun ; 14(1): 5419, 2023 Sep 05.
Article in English | MEDLINE | ID: mdl-37669985

ABSTRACT

Recently, large-scale genomic projects such as All of Us and the UK Biobank have introduced a new research paradigm where data are stored centrally in cloud-based Trusted Research Environments (TREs). To characterize the advantages and drawbacks of different TRE attributes in facilitating cross-cohort analysis, we conduct a Genome-Wide Association Study of standard lipid measures using two approaches: meta-analysis and pooled analysis. Comparison of full summary data from both approaches with an external study shows strong correlation of known loci with lipid levels (R² ~ 83-97%). Importantly, 90 variants meet the significance threshold only in the meta-analysis and 64 variants are significant only in pooled analysis, with approximately 20% of variants in each of those groups being most prevalent in non-European, non-Asian ancestry individuals. These findings have important implications, as technical and policy choices lead to cross-cohort analyses generating similar, but not identical results, particularly for non-European ancestral populations.


Subjects
Genome-Wide Association Study , Population Health , Humans , Genomics , Policy , Lipids
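The abstract above contrasts meta-analysis with pooled analysis but does not name the meta-analysis method. As an illustration only, the sketch below assumes a fixed-effect, inverse-variance-weighted combination of per-cohort effect estimates, a common choice for GWAS summary statistics. The function name and the per-cohort numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas: list[float], ses: list[float]) -> tuple[float, float]:
    """Combine per-cohort effect sizes by inverse-variance weighting.

    Returns (combined beta, combined standard error).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]  # more precise cohorts weigh more
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se


# Hypothetical effect of one variant on a lipid measure in two cohorts.
beta, se = fixed_effect_meta(betas=[0.10, 0.14], ses=[0.02, 0.04])
```

A pooled analysis would instead fit one model to the combined individual-level data, which is why the two approaches can yield similar but not identical sets of significant variants, as the abstract reports.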
5.
Nat Methods ; 20(9): 1323-1335, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37550580

ABSTRACT

Droplet-based single-cell assays, including single-cell RNA sequencing (scRNA-seq), single-nucleus RNA sequencing (snRNA-seq) and cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), generate considerable background noise counts, the hallmark of which is nonzero counts in cell-free droplets and off-target gene expression in unexpected cell types. Such systematic background noise can lead to batch effects and spurious differential gene expression results. Here we develop a deep generative model based on the phenomenology of noise generation in droplet-based assays. The proposed model accurately distinguishes cell-containing droplets from cell-free droplets, learns the background noise profile and provides noise-free quantification in an end-to-end fashion. We implement this approach in the scalable and robust open-source software package CellBender. Analysis of simulated data demonstrates that CellBender operates near the theoretically optimal denoising limit. Extensive evaluations using real datasets and experimental benchmarks highlight enhanced concordance between droplet-based single-cell data and established gene expression patterns, while the learned background noise profile provides evidence of degraded or uncaptured cell types.


Subjects
RNA, Small Nuclear , Software , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods
6.
Annu Rev Biomed Data Sci ; 6: 443-464, 2023 Aug 10.
Article in English | MEDLINE | ID: mdl-37561600

ABSTRACT

The All of Us Research Program's Data and Research Center (DRC) was established to help acquire, curate, and provide access to one of the world's largest and most diverse datasets for precision medicine research. Already, over 500,000 participants are enrolled in All of Us, 80% of whom are underrepresented in biomedical research, and data are being analyzed by a community of over 2,300 researchers. The DRC created this thriving data ecosystem by collaborating with engaged participants, innovative program partners, and empowered researchers. In this review, we first describe how the DRC is organized to meet the needs of this broad group of stakeholders. We then outline guiding principles, common challenges, and innovative approaches used to build the All of Us data ecosystem. Finally, we share lessons learned to help others navigate important decisions and trade-offs in building a modern biomedical data platform.


Subjects
Biomedical Research , Population Health , Humans , Ecosystem , Precision Medicine
7.
Nature ; 619(7971): 828-836, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37438524

ABSTRACT

Splice-switching antisense oligonucleotides (ASOs) could be used to treat a subset of individuals with genetic diseases¹, but the systematic identification of such individuals remains a challenge. Here we performed whole-genome sequencing analyses to characterize genetic variation in 235 individuals (from 209 families) with ataxia-telangiectasia, a severely debilitating and life-threatening recessive genetic disorder²,³, yielding a complete molecular diagnosis in almost all individuals. We developed a predictive taxonomy to assess the amenability of each individual to splice-switching ASO intervention; 9% and 6% of the individuals had variants that were 'probably' or 'possibly' amenable to ASO splice modulation, respectively. Most amenable variants were in deep intronic regions that are inaccessible to exon-targeted sequencing. We developed ASOs that successfully rescued mis-splicing and ATM cellular signalling in patient fibroblasts for two recurrent variants. In a pilot clinical study, one of these ASOs was used to treat a child who had been diagnosed with ataxia-telangiectasia soon after birth, and showed good tolerability without serious adverse events for three years. Our study provides a framework for the prospective identification of individuals with genetic diseases who might benefit from a therapeutic approach involving splice-switching ASOs.


Subjects
Ataxia Telangiectasia , RNA Splicing , Child , Humans , Ataxia Telangiectasia/drug therapy , Ataxia Telangiectasia/genetics , Oligonucleotides, Antisense/genetics , Oligonucleotides, Antisense/pharmacology , Oligonucleotides, Antisense/therapeutic use , Prospective Studies , RNA Splicing/drug effects , RNA Splicing/genetics , Whole Genome Sequencing , Introns , Exons , Precision Medicine , Pilot Projects
9.
J Am Coll Cardiol ; 81(14): 1320-1335, 2023 Apr 11.
Article in English | MEDLINE | ID: mdl-37019578

ABSTRACT

BACKGROUND: As the largest conduit vessel, the aorta is responsible for the conversion of phasic systolic inflow from ventricular ejection into more continuous peripheral blood delivery. Systolic distention and diastolic recoil conserve energy and are enabled by the specialized composition of the aortic extracellular matrix. Aortic distensibility decreases with age and vascular disease. OBJECTIVES: In this study, we sought to discover epidemiologic correlates and genetic determinants of aortic distensibility and strain. METHODS: We trained a deep learning model to quantify thoracic aortic area throughout the cardiac cycle from cardiac magnetic resonance images and calculated aortic distensibility and strain in 42,342 UK Biobank participants. RESULTS: Descending aortic distensibility was inversely associated with future incidence of cardiovascular diseases, such as stroke (HR: 0.59 per SD; P = 0.00031). The heritabilities of aortic distensibility and strain were 22% to 25% and 30% to 33%, respectively. Common variant analyses identified 12 and 26 loci for ascending and 11 and 21 loci for descending aortic distensibility and strain, respectively. Of the newly identified loci, 22 were not significantly associated with thoracic aortic diameter. Nearby genes were involved in elastogenesis and atherosclerosis. Aortic strain and distensibility polygenic scores had modest effect sizes for predicting cardiovascular outcomes (delaying or accelerating disease onset by 2%-18% per SD change in scores) and remained statistically significant predictors after accounting for aortic diameter polygenic scores. CONCLUSIONS: Genetic determinants of aortic function influence risk for stroke and coronary artery disease and may lead to novel targets for medical intervention.


Subjects
Aortic Diseases , Stroke , Humans , Aorta, Thoracic , Aorta , Aortic Diseases/pathology , Magnetic Resonance Imaging
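The abstract above derives aortic strain and distensibility from the aortic area measured across the cardiac cycle but does not give the formulas. The sketch below assumes the commonly used area-based definitions: strain as the relative change in cross-sectional area, and distensibility as strain divided by pulse pressure. The function names, the choice of pulse pressure measurement, and the example values are all hypothetical.

```python
def aortic_strain(area_min_mm2: float, area_max_mm2: float) -> float:
    """Relative change in cross-sectional area over the cardiac cycle (unitless)."""
    if area_min_mm2 <= 0 or area_max_mm2 < area_min_mm2:
        raise ValueError("expected 0 < area_min_mm2 <= area_max_mm2")
    return (area_max_mm2 - area_min_mm2) / area_min_mm2


def aortic_distensibility(area_min_mm2: float, area_max_mm2: float,
                          pulse_pressure_mmhg: float) -> float:
    """Strain per unit pulse pressure, in 1/mmHg (assumed definition)."""
    if pulse_pressure_mmhg <= 0:
        raise ValueError("pulse pressure must be positive")
    return aortic_strain(area_min_mm2, area_max_mm2) / pulse_pressure_mmhg


# Hypothetical measurements, for illustration only.
strain = aortic_strain(600.0, 660.0)
distensibility = aortic_distensibility(600.0, 660.0, pulse_pressure_mmhg=50.0)
```

Under these definitions a stiffer aorta shows a smaller area change for the same pulse pressure, and therefore lower distensibility.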
10.
Nat Commun ; 14(1): 2436, 2023 Apr 28.
Article in English | MEDLINE | ID: mdl-37105979

ABSTRACT

A fundamental challenge in diagnostics is integrating multiple modalities to develop a joint characterization of physiological state. Using the heart as a model system, we develop a cross-modal autoencoder framework for integrating distinct data modalities and constructing a holistic representation of cardiovascular state. In particular, we use our framework to construct such cross-modal representations from cardiac magnetic resonance images (MRIs), containing structural information, and electrocardiograms (ECGs), containing myoelectric information. We leverage the learned cross-modal representation to (1) improve phenotype prediction from a single, accessible phenotype such as ECGs; (2) enable imputation of hard-to-acquire cardiac MRIs from easy-to-acquire ECGs; and (3) develop a framework for performing genome-wide association studies in an unsupervised manner. Our results systematically integrate distinct diagnostic modalities into a common representation that better characterizes physiologic state.


Subjects
Cardiovascular System , Genome-Wide Association Study , Heart/diagnostic imaging , Cardiovascular System/diagnostic imaging , Electrocardiography , Learning
11.
Nat Genet ; 55(5): 777-786, 2023 May.
Article in English | MEDLINE | ID: mdl-37081215

ABSTRACT

Myocardial interstitial fibrosis is associated with cardiovascular disease and adverse prognosis. Here, to investigate the biological pathways that underlie fibrosis in the human heart, we developed a machine learning model to measure native myocardial T1 time, a marker of myocardial fibrosis, in 41,505 UK Biobank participants who underwent cardiac magnetic resonance imaging. Greater T1 time was associated with diabetes mellitus, renal disease, aortic stenosis, cardiomyopathy, heart failure, atrial fibrillation, conduction disease and rheumatoid arthritis. Genome-wide association analysis identified 11 independent loci associated with T1 time. The identified loci implicated genes involved in glucose transport (SLC2A12), iron homeostasis (HFE, TMPRSS6), tissue repair (ADAMTSL1, VEGFC), oxidative stress (SOD2), cardiac hypertrophy (MYH7B) and calcium signaling (CAMK2D). Using a transforming growth factor β1-mediated cardiac fibroblast activation assay, we found that 9 of the 11 loci consisted of genes that exhibited temporal changes in expression or open chromatin conformation supporting their biological relevance to myofibroblast cell state acquisition. By harnessing machine learning to perform large-scale quantification of myocardial interstitial fibrosis using cardiac imaging, we validate associations between cardiac fibrosis and disease, and identify new biologically relevant pathways underlying fibrosis.


Subjects
Cardiomyopathies , Genome-Wide Association Study , Humans , Myocardium/pathology , Heart , Cardiomyopathies/genetics , Cardiomyopathies/pathology , Fibrosis
12.
Nat Commun ; 14(1): 1558, 2023 Mar 21.
Article in English | MEDLINE | ID: mdl-36944631

ABSTRACT

Left ventricular mass is a risk marker for cardiovascular events, and may indicate an underlying cardiomyopathy. Cardiac magnetic resonance is the gold-standard for left ventricular mass estimation, but is challenging to obtain at scale. Here, we use deep learning to enable genome-wide association study of cardiac magnetic resonance-derived left ventricular mass indexed to body surface area within 43,230 UK Biobank participants. We identify 12 genome-wide associations (1 known at TTN and 11 novel for left ventricular mass), implicating genes previously associated with cardiac contractility and cardiomyopathy. Cardiac magnetic resonance-derived indexed left ventricular mass is associated with incident dilated and hypertrophic cardiomyopathies, and implantable cardioverter-defibrillator implant. An indexed left ventricular mass polygenic risk score ≥90th percentile is also associated with incident implantable cardioverter-defibrillator implant in separate UK Biobank (hazard ratio 1.22, 95% CI 1.05-1.44) and Mass General Brigham (hazard ratio 1.75, 95% CI 1.12-2.74) samples. Here, we perform a genome-wide association study of cardiac magnetic resonance-derived indexed left ventricular mass to identify 11 novel variants and demonstrate that cardiac magnetic resonance-derived and genetically predicted indexed left ventricular mass are associated with incident cardiomyopathy.


Subjects
Cardiomyopathies , Deep Learning , Humans , Genome-Wide Association Study , Magnetic Resonance Imaging, Cine , Magnetic Resonance Spectroscopy , Predictive Value of Tests
13.
Circ Genom Precis Med ; 16(1): e003676, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36580284

ABSTRACT

BACKGROUND: Absence of a dicrotic notch on finger photoplethysmography is an easily ascertainable and inexpensive trait that has been associated with age and prevalent cardiovascular disease. However, the trait exists along a continuum, and little is known about its genetic underpinnings or prognostic value for incident cardiovascular disease. METHODS: In 169 787 participants in the UK Biobank, we identified absent dicrotic notch on photoplethysmography and created a novel continuous trait reflecting notch smoothness using machine learning. Next, we determined the heritability, genetic basis, polygenic risk, and clinical relations for the binary absent notch trait and the newly derived continuous notch smoothness trait. RESULTS: Heritability of the continuous notch smoothness trait was 7.5%, compared with 5.6% for the binary absent notch trait. A genome-wide association study of notch smoothness identified 15 significant loci, implicating genes including NT5C2 (P=1.2×10⁻²⁶), IGFBP3 (P=4.8×10⁻¹⁸), and PHACTR1 (P=1.4×10⁻¹³), compared with 6 loci for the binary absent notch trait. Notch smoothness stratified risk of incident myocardial infarction or coronary artery disease, stroke, heart failure, and aortic stenosis. A polygenic risk score for notch smoothness was associated with incident cardiovascular disease and all-cause death in UK Biobank participants without available photoplethysmography data. CONCLUSIONS: We found that a machine learning-derived continuous trait reflecting dicrotic notch smoothness on photoplethysmography was heritable and associated with genes involved in vascular stiffness. Greater notch smoothness was associated with greater risk of incident cardiovascular disease. Raw digital phenotyping may identify individuals at risk for disease via specific genetic pathways.


Subjects
Cardiovascular Diseases , Coronary Artery Disease , Humans , Genome-Wide Association Study , Risk Factors , Phenotype
14.
JACC Adv ; 1(3), 2022 Aug.
Article in English | MEDLINE | ID: mdl-36147540

ABSTRACT

BACKGROUND: State-of-the-art genetic risk interpretation for a common complex disease such as coronary artery disease (CAD) requires assessment for both monogenic variants-such as those related to familial hypercholesterolemia-as well as the cumulative impact of many common variants, as quantified by a polygenic score. OBJECTIVES: The objective of the study was to describe a combined monogenic and polygenic CAD risk assessment program and examine its impact on patient understanding and changes to clinical management. METHODS: Study participants attended an initial visit in a preventive genomics clinic and a disclosure visit to discuss results and recommendations, primarily via telemedicine. Digital postdisclosure surveys and chart review evaluated the impact of disclosure. RESULTS: There were 60 participants (mean age 51 years, 37% women, 72% with no known CAD), including 30 (50%) referred by their cardiologists and 30 (50%) self-referred. Two (3%) participants had a monogenic variant pathogenic for familial hypercholesterolemia, and 19 (32%) had a high polygenic score in the top quintile of the population distribution. In a postdisclosure survey, both the genetic test report (in 80% of participants) and the discussion with the clinician (in 89% of participants) were ranked as very or extremely helpful in understanding the result. Of the 42 participants without CAD, 17 or 40% had a change in management, including statin initiation, statin intensification, or coronary imaging. CONCLUSIONS: Combined monogenic and polygenic assessments for CAD risk provided by preventive genomics clinics are beneficial for patients and result in changes in management in a significant portion of patients.

15.
Cardiovasc Digit Health J ; 3(4): 161-170, 2022 Aug.
Article in English | MEDLINE | ID: mdl-36046430

ABSTRACT

Background and Objective: Postexercise heart rate recovery (HRR) is an important indicator of cardiac autonomic function and abnormal HRR is associated with adverse outcomes. We hypothesized that deep learning on resting electrocardiogram (ECG) tracings may identify individuals with impaired HRR. Methods: We trained a deep learning model (convolutional neural network) to infer HRR based on resting ECG waveforms (HRRpred) among UK Biobank participants who had undergone exercise testing. We examined the association of HRRpred with incident cardiovascular disease using Cox models, and investigated the genetic architecture of HRRpred in genome-wide association analysis. Results: Among 56,793 individuals (mean age 57 years, 51% women), the HRRpred model was moderately correlated with actual HRR (r = 0.48, 95% confidence interval [CI] 0.47-0.48). Over a median follow-up of 10 years, we observed 2060 incident diabetes mellitus (DM) events, 862 heart failure events, and 2065 deaths. Higher HRRpred was associated with lower risk of DM (hazard ratio [HR] 0.79 per 1 standard deviation change, 95% CI 0.76-0.83), heart failure (HR 0.89, 95% CI 0.83-0.95), and death (HR 0.83, 95% CI 0.79-0.86). After accounting for resting heart rate, the associations of HRRpred with incident DM and all-cause mortality were similar. Genetic determinants of HRRpred included known heart rate, cardiac conduction system, cardiomyopathy, and metabolic trait loci. Conclusion: Deep learning-derived estimates of HRR using resting ECG were independently associated with future clinical outcomes, including new-onset DM and all-cause mortality. Inferring postexercise heart rate response from a resting ECG may have clinical implications, and its impact on preventive strategies warrants future study.

16.
J Am Coll Cardiol ; 80(5): 486-497, 2022 Aug 02.
Article in English | MEDLINE | ID: mdl-35902171

ABSTRACT

BACKGROUND: The left ventricular outflow tract (LVOT) and ascending aorta are spatially complex, with distinct pathologies and embryologic origins. Prior work examined the genetics of thoracic aortic diameter in a single plane. OBJECTIVES: We sought to elucidate the genetic basis for the diameter of the LVOT, aortic root, and ascending aorta. METHODS: Using deep learning, we analyzed 2.3 million cardiac magnetic resonance images from 43,317 UK Biobank participants. We computed the diameters of the LVOT, the aortic root, and at 6 locations of ascending aorta. For each diameter, we conducted a genome-wide association study and generated a polygenic score. Finally, we investigated associations between these scores and disease incidence. RESULTS: A total of 79 loci were significantly associated with at least 1 diameter. Of these, 35 were novel, and most were associated with 1 or 2 diameters. A polygenic score of aortic diameter approximately 13 mm from the sinotubular junction most strongly predicted thoracic aortic aneurysm (n = 427,016; mean HR: 1.42 per SD; 95% CI: 1.34-1.50; P = 6.67 × 10⁻²¹). A polygenic score predicting a smaller aortic root was predictive of aortic stenosis (n = 426,502; mean HR: 1.08 per SD; 95% CI: 1.03-1.12; P = 5 × 10⁻⁶). CONCLUSIONS: We detected distinct genetic loci underpinning the diameters of the LVOT, aortic root, and at several segments of ascending aorta. We spatially defined a region of aorta whose genetics may be most relevant to predicting thoracic aortic aneurysm. We further described a genetic signature that may predispose to aortic stenosis. Understanding genetic contributions to proximal aortic diameter may enable identification of individuals at risk for aortic disease and facilitate prioritization of therapeutic targets.


Subjects
Aneurysm , Aortic Aneurysm, Thoracic , Aortic Valve Stenosis , Aorta/diagnostic imaging , Aorta/pathology , Aortic Aneurysm, Thoracic/diagnosis , Aortic Aneurysm, Thoracic/epidemiology , Aortic Aneurysm, Thoracic/genetics , Aortic Valve Stenosis/genetics , Constriction, Pathologic , Genome-Wide Association Study , Humans
17.
Nat Genet ; 54(6): 792-803, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35697867

ABSTRACT

Congenital heart diseases often involve maldevelopment of the evolutionarily recent right heart chamber. To gain insight into right heart structure and function, we fine-tuned deep learning models to recognize the right atrium, right ventricle and pulmonary artery, measuring right heart structures in 40,000 individuals from the UK Biobank with magnetic resonance imaging. Genome-wide association studies identified 130 distinct loci associated with at least one right heart measurement, of which 72 were not associated with left heart structures. Loci were found near genes previously linked with congenital heart disease, including NKX2-5, TBX5/TBX3, WNT9B and GATA4. A genome-wide polygenic predictor of right ventricular ejection fraction was associated with incident dilated cardiomyopathy (hazard ratio, 1.33 per standard deviation; P = 7.1 × 10⁻¹³) and remained significant after accounting for a left ventricular polygenic score. Harnessing deep learning to perform large-scale cardiac phenotyping, our results yield insights into the genetic determinants of right heart structure and function.


Subjects
Cardiomyopathy, Dilated , Heart Defects, Congenital , Cardiomyopathy, Dilated/pathology , Genome-Wide Association Study , Heart , Humans , Stroke Volume , Ventricular Function, Right
19.
NPJ Digit Med ; 5(1): 47, 2022 Apr 08.
Article in English | MEDLINE | ID: mdl-35396454

ABSTRACT

Electronic health record (EHR) datasets are statistically powerful but are subject to ascertainment bias and missingness. Using the Mass General Brigham multi-institutional EHR, we approximated a community-based cohort by sampling patients receiving longitudinal primary care between 2001-2018 (Community Care Cohort Project [C3PO], n = 520,868). We utilized natural language processing (NLP) to recover vital signs from unstructured notes. We assessed the validity of C3PO by deploying established risk models for myocardial infarction/stroke and atrial fibrillation. We then compared C3PO to Convenience Samples including all individuals from the same EHR with complete data, but without a longitudinal primary care requirement. NLP reduced the missingness of vital signs by 31%. NLP-recovered vital signs were highly correlated with values derived from structured fields (Pearson r range 0.95-0.99). Atrial fibrillation and myocardial infarction/stroke incidence were lower and risk models were better calibrated in C3PO as opposed to the Convenience Samples (calibration error range for myocardial infarction/stroke: 0.012-0.030 in C3PO vs. 0.028-0.046 in Convenience Samples; calibration error for atrial fibrillation 0.028 in C3PO vs. 0.036 in Convenience Samples). Sampling patients receiving regular primary care and using NLP to recover missing data may reduce bias and maximize generalizability of EHR research.

20.
Cell Genom ; 2(1), 2022 Jan 12.
Article in English | MEDLINE | ID: mdl-35199087

ABSTRACT

The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement while also adding security measures for active threat detection and monitoring and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consists of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types.
