ABSTRACT
Rationale: Cigarette smoking (CS) impairs B cell function and antibody production, increasing infection risk. The impact of e-cigarette use ('vaping') and combined CS and vaping ('dual-use') on B cell activity is unclear. Objective: To examine B cell receptor sequencing (BCR-seq) profiles associated with CS, vaping, dual-use, COPD-related outcomes, and demographic factors. Methods: BCR-seq was performed on blood RNA samples from 234 participants in the COPDGene study. We assessed multivariable associations of B cell function measures (immunoglobulin heavy chain (IGH) subclass expression and usage, class-switching, V-segment usage, and clonal expansion) with CS, vaping, dual-use, COPD severity, age, sex, and race. We adjusted for multiple comparisons using the Benjamini-Hochberg method, identifying significant associations at 5% FDR and suggestive associations at 10% FDR. Results: Among 234 non-Hispanic white (NHW) and African American (AA) participants, CS and dual-use were significantly positively associated with increased secretory IgA production, with dual-use showing the strongest associations. Dual-use was positively associated with class switching and B cell clonal expansion, indicating increased B cell activation, with similar trends in those only smoking or only vaping. We observed significant associations between race and IgG antibody usage. AA participants had higher IgG subclass proportions and lower IgM usage compared to NHW participants. Conclusions: CS and vaping additively enhance B cell activation, most notably in dual-users. Self-reported race was strongly associated with IgG isotype usage. These findings highlight associations between B cell activation and antibody transcription, suggesting potential differences in immune and vaccine responses linked to CS, vaping, and race.
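The multiple-comparison step named in the Methods above (Benjamini-Hochberg, with significant associations at 5% FDR and suggestive associations at 10% FDR) can be sketched as follows. This is only an illustrative sketch: the p-values are hypothetical, the function names are invented for this example, and nothing here reproduces the study's actual analysis code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(q):
    """Label an adjusted p-value using the 5% and 10% FDR thresholds."""
    if q <= 0.05:
        return "significant"
    if q <= 0.10:
        return "suggestive"
    return "not significant"


# Hypothetical raw p-values, one per tested association.
p_values = [0.001, 0.012, 0.025, 0.07, 0.20]
q_values = benjamini_hochberg(p_values)
labels = [classify(q) for q in q_values]

for p, q, label in zip(p_values, q_values, labels):
    print(f"p={p:.3f}  q={q:.4f}  {label}")
```

In this toy input the fourth association passes only the 10% threshold, which is the kind of result the abstract calls suggestive.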
ABSTRACT
BACKGROUND: Chronic obstructive pulmonary disease (COPD) exhibits considerable progression heterogeneity. We hypothesized that elastic principal graph analysis (EPGA) would identify distinct clinical phenotypes and their longitudinal relationships. METHODS: Cross-sectional data from 8,972 tobacco-exposed COPDGene participants, with and without COPD, were used to train a model with EPGA, using thirty clinical, physiologic and CT features. Principal component analysis (PCA) was used to reduce data dimensionality to six principal components. An elastic principal tree was fitted to the reduced space. 4,585 participants from COPDGene Phase 2 were used to test longitudinal trajectories. 2,652 participants from SPIROMICS tested external reproducibility. RESULTS: Our analysis used cross-sectional data to create an elastic principal tree, where the concept of time is represented by distance on the tree. Six clinically distinct tree segments were identified that differed by lung function, symptoms, and CT features: 1) Subclinical (SC); 2) Parenchymal Abnormality (PA); 3) Chronic Bronchitis (CB); 4) Emphysema Male (EM); 5) Emphysema Female (EF); and 6) Severe Airways (SA) disease. Cross-sectional SPIROMICS data confirmed similar groupings. 5-year data from COPDGene mapped longitudinal changes onto the tree. 29% of patients changed segment during follow-up; longitudinal trajectories confirmed a net flow of patients along the tree, from SC towards Emphysema, although alternative trajectories were noted, through airway disease predominant phenotypes, CB and SA. CONCLUSION: This novel analytic methodology provides an approach to defining longitudinal phenotypic trajectories using cross sectional data. These insights are clinically relevant and could facilitate precision therapy and future trials to modify disease progression.
ABSTRACT
RATIONALE: While many studies have examined gene expression in lung tissue, the gene regulatory processes underlying emphysema are still not well understood. Finding efficient non-imaging screening methods and disease-modifying therapies has been challenging, but knowledge of the transcriptomic features of emphysema may help in this effort. OBJECTIVES: Our goals were to identify emphysema-associated biological pathways through transcriptomic analysis of bulk lung tissue, to determine the lung cell types in which these emphysema-associated pathways are altered, and to detect unique and overlapping transcriptomic signatures in blood and lung samples. METHODS: Using RNA-sequencing data from 446 samples in the Lung Tissue Research Consortium (LTRC) and 3,606 blood samples from the COPDGene study, we examined the transcriptomic features of chest computed tomography-quantified emphysema. We also leveraged publicly available lung single-cell RNA-sequencing data to identify cell types showing COPD-associated differential expression of the emphysema pathways found in the bulk analyses. MEASUREMENTS AND MAIN RESULTS: In the bulk lung RNA-seq analysis, 1,087 differentially expressed genes and 34 dysregulated pathways were significantly associated with emphysema. We observed alternative splicing of several genes and increased activity in pluripotency and cell barrier function pathways. Lung tissue and blood samples shared differentially expressed genes and biological pathways. Multiple lung cell types displayed dysregulation of epithelial barrier function pathways, and distinct pathway activities were observed among various macrophage subpopulations. CONCLUSIONS: This study identified emphysema-related changes in gene expression and alternative splicing, cell-type specific dysregulated pathways, and instances of shared pathway dysregulation between blood and lung.
ABSTRACT
Hypoxic reprogramming of vasculature relies on genetic, epigenetic, and metabolic circuitry, but the control points are unknown. In pulmonary arterial hypertension (PAH), a disease driven by hypoxia inducible factor (HIF)-dependent vascular dysfunction, HIF-2α promoted expression of neighboring genes, long noncoding RNA (lncRNA) histone lysine N-methyltransferase 2E-antisense 1 (KMT2E-AS1) and histone lysine N-methyltransferase 2E (KMT2E). KMT2E-AS1 stabilized KMT2E protein to increase epigenetic histone 3 lysine 4 trimethylation (H3K4me3), driving HIF-2α-dependent metabolic and pathogenic endothelial activity. This lncRNA axis also increased HIF-2α expression across epigenetic, transcriptional, and posttranscriptional contexts, thus promoting a positive feedback loop to further augment HIF-2α activity. We identified a genetic association between rs73184087, a single-nucleotide variant (SNV) within a KMT2E intron, and disease risk in PAH discovery and replication patient cohorts and in a global meta-analysis. This SNV displayed allele (G)-specific association with HIF-2α, engaged in long-range chromatin interactions, and induced the lncRNA-KMT2E tandem in hypoxic (G/G) cells. In vivo, KMT2E-AS1 deficiency protected against PAH in mice, as did pharmacologic inhibition of histone methylation in rats. Conversely, forced lncRNA expression promoted more severe PH. Thus, the KMT2E-AS1/KMT2E pair orchestrates across convergent multi-ome landscapes to mediate HIF-2α pathobiology and represents a key clinical target in pulmonary hypertension.
Subject(s)
Pulmonary Hypertension, Long Noncoding RNA, Humans, Rats, Animals, Mice, Alleles, Pulmonary Hypertension/genetics, Histones, Long Noncoding RNA/genetics, Rodentia, Lysine, Familial Primary Pulmonary Hypertension, Hypoxia/genetics, Methyltransferases, Basic Helix-Loop-Helix Transcription Factors/genetics
ABSTRACT
Rationale: Emphysema is a chronic obstructive pulmonary disease phenotype with important prognostic implications. Identifying blood-based biomarkers of emphysema will facilitate early diagnosis and development of targeted therapies. Objectives: To discover blood omics biomarkers for chest computed tomography-quantified emphysema and develop predictive biomarker panels. Methods: Emphysema blood biomarker discovery was performed using differential gene expression, alternative splicing, and protein association analyses in a training sample of 2,370 COPDGene participants with available blood RNA sequencing, plasma proteomics, and clinical data. Internal validation was conducted in a COPDGene testing sample (n = 1,016), and external validation was done in the ECLIPSE study (n = 526). Because low body mass index (BMI) and emphysema often co-occur, we performed a mediation analysis to quantify the effect of BMI on gene and protein associations with emphysema. Elastic net models with bootstrapping were also developed in the training sample sequentially using clinical, blood cell proportions, RNA-sequencing, and proteomic biomarkers to predict quantitative emphysema. Model accuracy was assessed by the area under the receiver operating characteristic curves for subjects stratified into tertiles of emphysema severity. Measurements and Main Results: Totals of 3,829 genes, 942 isoforms, 260 exons, and 714 proteins were significantly associated with emphysema (false discovery rate, 5%) and yielded 11 biological pathways. Seventy-four percent of these genes and 62% of these proteins showed mediation by BMI. Our prediction models demonstrated reasonable predictive performance in both COPDGene and ECLIPSE. The highest-performing model used clinical, blood cell, and protein data (area under the receiver operating characteristic curve in COPDGene testing, 0.90; 95% confidence interval, 0.85-0.90). 
Conclusions: Blood transcriptome and proteome-wide analyses revealed key biological pathways of emphysema and enhanced the prediction of emphysema.
Subject(s)
Emphysema, Chronic Obstructive Pulmonary Disease, Pulmonary Emphysema, Humans, Transcriptome, Proteomics, Pulmonary Emphysema/genetics, Pulmonary Emphysema/complications, Biomarkers, Gene Expression Profiling
ABSTRACT
Rationale: Chronic obstructive pulmonary disease (COPD) is characterized by pathologic changes in the airways, lung parenchyma, and persistent inflammation, but the links between lung structural changes and blood transcriptome patterns have not been fully described. Objectives: The objective of this study was to identify novel relationships between lung structural changes measured by chest computed tomography (CT) and blood transcriptome patterns measured by blood RNA sequencing (RNA-seq). Methods: CT scan images and blood RNA-seq gene expression from 1223 participants in the COPD Genetic Epidemiology (COPDGene®) study were jointly analyzed using deep learning to identify shared aspects of inflammation and lung structural changes that we labeled image-expression axes (IEAs). We related IEAs to COPD-related measurements and prospective health outcomes through regression and Cox proportional hazards models and tested them for biological pathway enrichment. Results: We identified 2 distinct IEAs: IEAemph, which captures an emphysema-predominant process with a strong positive correlation to CT emphysema and a negative correlation to forced expiratory volume in 1 second and body mass index (BMI); and IEAairway, which captures an airway-predominant process with a positive correlation to BMI and airway wall thickness and a negative correlation to emphysema. Pathway enrichment analysis identified 29 and 13 pathways significantly associated with IEAemph and IEAairway, respectively (adjusted p<0.001). Conclusions: Integration of CT scans and blood RNA-seq data identified 2 IEAs that capture distinct inflammatory processes associated with emphysema and airway-predominant COPD.
ABSTRACT
INTRODUCTION: Chronic obstructive pulmonary disease (COPD) can progress across several domains, complicating the identification of the determinants of disease progression. In our previous work, we applied k-means clustering to spirometric and chest radiological measures to identify four COPD-related subtypes: 'relatively resistant smokers (RRS)', 'mild upper lobe-predominant emphysema (ULE)', 'airway-predominant disease (AD)' and 'severe emphysema (SE)'. In the current study, we examined the associations of these subtypes with longitudinal COPD-related health measures as well as blood transcriptomic and plasma proteomic biomarkers. METHODS: We included 8266 non-Hispanic white and African-American smokers from the COPDGene study. We used linear regression to investigate cluster associations with 5-year prospective changes in spirometric and radiological measures and with gene expression and protein levels. We used Cox proportional hazards models to test for cluster associations with prospective exacerbations, comorbidities and mortality. RESULTS: The RRS, ULE, AD and SE clusters represented 39%, 15%, 26% and 20% of the studied cohort at baseline, respectively. The SE cluster had the greatest 5-year FEV1 (forced expiratory volume in 1 s) and emphysema progression, and the highest risks of exacerbations, cardiovascular disease and mortality. The AD cluster had the highest diabetes risk. After adjustments, only the SE cluster had an elevated respiratory mortality risk, while the ULE, AD and SE clusters had elevated all-cause mortality risks. These clusters also demonstrated differential protein and gene expression biomarker associations, mostly related to inflammatory and immune processes. CONCLUSION: COPD k-means subtypes demonstrate varying rates of disease progression, prospective comorbidities, mortality and associations with transcriptomic and proteomic biomarkers. 
These findings emphasise the clinical and biological relevance of these subtypes, which call for more study for translation into clinical practice. TRIAL REGISTRATION NUMBER: NCT00608764.
Subject(s)
Emphysema, Chronic Obstructive Pulmonary Disease, Pulmonary Emphysema, Biomarkers, Cluster Analysis, Disease Progression, Emphysema/complications, Humans, Prospective Studies, Proteomics, Chronic Obstructive Pulmonary Disease/complications, Pulmonary Emphysema/complications, Pulmonary Emphysema/diagnostic imaging, X-Ray Computed Tomography
ABSTRACT
Background: The heterogeneous nature of chronic obstructive pulmonary disease (COPD) complicates the identification of the predictors of disease progression. We aimed to improve the prediction of disease progression in COPD by using machine learning and incorporating a rich dataset of phenotypic features. Methods: We included 4496 smokers with available data from their enrollment and 5-year follow-up visits in the COPD Genetic Epidemiology (COPDGene®) study. We constructed linear regression (LR) and supervised random forest models to predict 5-year progression in forced expiratory volume in 1 second (FEV1) from 46 baseline features. Using cross-validation, we randomly partitioned participants into training and testing samples. We also validated the results in the COPDGene 10-year follow-up visit. Results: Predicting the change in FEV1 over time is more challenging than simply predicting the future absolute FEV1 level. For random forest, R-squared was 0.15 and the area under the receiver operating characteristic (ROC) curve for predicting participants in the top quartile of observed progression was 0.71 in the testing sample; the corresponding values in the validation sample were 0.10 and 0.70. Random forest provided slightly better performance than LR. The accuracy was best for Global Initiative for Chronic Obstructive Lung Disease (GOLD) grades 1-2 participants, and it was harder to achieve accurate prediction in advanced stages of the disease. Predictive variables differed in their relative importance, both overall and across GOLD grades. Conclusion: Random forest, along with deep phenotyping, predicts FEV1 progression with reasonable accuracy. There is significant room for improvement in future models. This prediction model facilitates the identification of smokers at increased risk for rapid disease progression. Such findings may be useful in the selection of patient populations for targeted clinical trials.
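The evaluation metric described above (area under the ROC curve for identifying participants in the top quartile of observed progression) can be illustrated with a small sketch. All values below are hypothetical, the function name `auc` is chosen for this example only, and the pairwise formulation is one standard way to compute the area, not necessarily the one the authors used.

```python
def auc(scores, labels):
    """Area under the ROC curve as the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical 5-year FEV1 changes in mL (more negative = faster decline).
observed = [-400, -350, -120, -90, -60, -30, -10, 20]
predicted = [-300, -100, -150, -80, -200, -20, -40, 10]

# Label the top quartile of observed progression (the fastest decliners).
k = len(observed) // 4
cutoff = sorted(observed)[k - 1]
labels = [1 if change <= cutoff else 0 for change in observed]

# Use predicted decline as the score, so that larger means higher risk.
scores = [-change for change in predicted]
result = auc(scores, labels)
print(f"AUC for top-quartile progression: {result:.3f}")
```

Framing progression as a top-quartile classification makes the result readable as a ranking quality, which is why an AUC near 0.7 can coexist with a low R-squared for the continuous change.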
ABSTRACT
Rationale: Multiple studies have demonstrated an increased risk of chronic obstructive pulmonary disease (COPD) in heterozygous carriers of the AAT (alpha-1 antitrypsin) Z allele. However, it is not known if MZ subjects with COPD are phenotypically different from noncarriers (MM genotype) with COPD. Objectives: To assess if MZ subjects with COPD have different clinical features compared with MM subjects with COPD. Methods: Genotypes of SERPINA1 were ascertained by using whole-genome sequencing data in three independent studies. We compared outcomes between MM subjects with COPD and MZ subjects with COPD in each study and combined the results in a meta-analysis. We performed longitudinal and survival analyses to compare outcomes in MM and MZ subjects with COPD over time. Measurements and Main Results: We included 290 MZ subjects with COPD and 6,184 MM subjects with COPD across the three studies. MZ subjects had a lower FEV1% predicted and greater quantitative emphysema on chest computed tomography scans compared with MM subjects. In a meta-analysis, the FEV1 was 3.9% lower (95% confidence interval [CI], -6.55% to -1.26%) and emphysema (the percentage of lung attenuation areas <-950 HU) was 4.14% greater (95% CI, 1.44% to 6.84%) in MZ subjects. We found one gene, PGF (placental growth factor), to be differentially expressed in lung tissue from one study between MZ subjects and MM subjects. Conclusions: Carriers of the AAT Z allele (those who were MZ heterozygous) with COPD had lower lung function and more emphysema than MM subjects with COPD. Taken with the subtle differences in gene expression between the two groups, our findings suggest that MZ subjects represent an endotype of COPD.
Subject(s)
Genotype, Heterozygote, Phenotype, Chronic Obstructive Pulmonary Disease/genetics, alpha 1-Antitrypsin/genetics, Adult, Aged, Aged 80 and over, Case-Control Studies, Female, Genetic Markers, Humans, Longitudinal Studies, Male, Middle Aged, Chronic Obstructive Pulmonary Disease/diagnosis, Chronic Obstructive Pulmonary Disease/mortality, Chronic Obstructive Pulmonary Disease/physiopathology, Respiratory Function Tests, Survival Analysis, Whole Genome Sequencing
ABSTRACT
Rationale: The ability of peripheral blood biomarkers to assess chronic obstructive pulmonary disease (COPD) risk and progression is unknown. Genetics and gene expression may capture important aspects of COPD-related biology that predict disease activity. Objectives: Develop a transcriptional risk score (TRS) for COPD and assess the contribution of the TRS and a polygenic risk score (PRS) for disease susceptibility and progression. Methods: We randomly split 2,569 COPDGene (Genetic Epidemiology of COPD) participants with whole-blood RNA sequencing into training (n = 1,945) and testing (n = 624) samples and used 468 ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate End-points) COPD cases with microarray data for replication. We developed a TRS using penalized regression (least absolute shrinkage and selection operator) to model FEV1/FVC and studied the predictive value of TRS for COPD (Global Initiative for Chronic Obstructive Lung Disease 2-4), prospective FEV1 change (ml/yr), and additional COPD-related traits. We adjusted for potential confounders, including age and smoking. We evaluated the predictive performance of the TRS in the context of a previously derived PRS and clinical factors. Measurements and Main Results: The TRS included 147 transcripts and was associated with COPD (odds ratio, 3.3; 95% confidence interval [CI], 2.4-4.5; P < 0.001), FEV1 change (β, -17 ml/yr; 95% CI, -28 to -6.6; P = 0.002), and other COPD-related traits. In ECLIPSE cases, we replicated the association with FEV1 change (β, -8.2; 95% CI, -15 to -1; P = 0.025) and the majority of other COPD-related traits. Models including PRS, TRS, and clinical factors were more predictive of COPD (area under the receiver operator characteristic curve, 0.84) and annualized FEV1 change compared with models with one risk score or clinical factors alone. 
Conclusions: Blood transcriptomics can improve prediction of COPD and lung function decline when added to a PRS and clinical risk factors.
Subject(s)
Biomarkers/blood, Disease Progression, Chronic Obstructive Pulmonary Disease/blood, Chronic Obstructive Pulmonary Disease/genetics, Chronic Obstructive Pulmonary Disease/physiopathology, Risk Assessment/methods, Aged, Female, Gene Expression Regulation, Genetic Predisposition to Disease, Humans, Male, Middle Aged, Multifactorial Inheritance, Odds Ratio, Phenotype, Predictive Value of Tests, Prospective Studies, Risk Factors, Severity of Illness Index, Transcription Factors
ABSTRACT
Cigarette smoking induces a profound transcriptomic and systemic inflammatory response. Previous studies have focused on gene level differential expression of smoking, but the genome-wide effects of smoking on alternative isoform regulation have not yet been described. We conducted RNA sequencing in whole-blood samples of 454 current and 767 former smokers in the COPDGene Study, and we analyzed the effects of smoking on differential usage of isoforms and exons. At 10% FDR, we detected 3167 differentially expressed genes, 945 differentially used isoforms and 160 differentially used exons. Isoform switch analysis revealed widespread 3' UTR lengthening associated with cigarette smoking. The lengthening of these 3' UTRs was consistent with alternative usage of distal polyadenylation sites, and these extended 3' UTR regions were significantly enriched with functional sequence elements including microRNA and RNA-protein binding sites. These findings warrant further studies on alternative polyadenylation events as potential biomarkers and novel therapeutic targets for smoking-related diseases.
Subject(s)
Cigarette Smoking, Polyadenylation, 3' Untranslated Regions, Cigarette Smoking/adverse effects, Cigarette Smoking/genetics, Protein Isoforms/genetics, Smoking/adverse effects, Smoking/genetics
ABSTRACT
Most predictive models based on gene expression data do not leverage information related to gene splicing, despite the fact that splicing is a fundamental feature of eukaryotic gene expression. Cigarette smoking is an important environmental risk factor for many diseases, and it has profound effects on gene expression. Using smoking status as a prediction target, we developed deep neural network predictive models using gene, exon, and isoform level quantifications from RNA sequencing data in 2,557 subjects in the COPDGene Study. We observed that models using exon and isoform quantifications clearly outperformed gene-level models when using data from 5 genes from a previously published prediction model. Whereas the test set performance of the previously published model was 0.82 in the original publication, our exon-based models including an exon-to-isoform mapping layer achieved a test set AUC (area under the receiver operating characteristic curve) of 0.88, which improved to an AUC of 0.94 using exon quantifications from a larger set of genes. Isoform variability is an important source of latent information in RNA-seq data that can be used to improve clinical prediction models.
Subject(s)
Deep Learning, Statistical Models, RNA-Seq/methods, Smoking, Aged, Computational Biology, Exons/genetics, Female, Gene Expression Profiling, Humans, Male, Middle Aged, Protein Isoforms/genetics, ROC Curve, Smoking/epidemiology, Smoking/genetics
Subject(s)
Pulmonary Artery/physiopathology, Chronic Obstructive Pulmonary Disease/physiopathology, Smoking/adverse effects, Vascular Diseases/physiopathology, Adult, Aged, Computed Tomography Angiography, Disease Progression, Female, Forced Expiratory Volume, Humans, Male, Middle Aged, Pulmonary Artery/diagnostic imaging, Chronic Obstructive Pulmonary Disease/etiology, Chronic Obstructive Pulmonary Disease/mortality, Severity of Illness Index, Smoking/mortality, Spirometry, Vascular Diseases/diagnostic imaging, Vascular Diseases/etiology, Vascular Diseases/mortality, Vital Capacity
ABSTRACT
The quality of service in healthcare is constantly challenged by outlier events such as pandemics (e.g., COVID-19) and natural disasters (such as hurricanes and earthquakes). In most cases, such events lead to critical uncertainties in decision-making, as well as in multiple medical and economic aspects of a hospital. External (geographic) or internal (medical and managerial) factors lead to shifts in planning and budgeting but, most importantly, reduce confidence in conventional processes. In some cases, support from other hospitals proves necessary, which further complicates planning. This paper presents three data-driven methods that provide indicators to help healthcare managers organize their finances and identify the best plan for resource allocation and sharing. Conventional decision-making methods fall short in recommending validated policies for managers. Using reinforcement learning, genetic algorithms, the traveling salesman problem, and clustering, we experimented with different healthcare variables and presented tools and outcomes that could be applied at health institutions. Experiments are performed; the results are recorded, evaluated, and presented.
ABSTRACT
COPD is a heterogeneous syndrome. Many COPD subtypes have been proposed, but there is not yet consensus on how many COPD subtypes there are and how they should be defined. The COPD Genetic Epidemiology Study (COPDGene), which has generated 10-year longitudinal chest imaging, spirometry, and molecular data, is a rich resource for relating COPD phenotypes to underlying genetic and molecular mechanisms. In this article, we place COPDGene clustering studies in context with other highly cited COPD clustering studies, and summarize the main COPD subtype findings from COPDGene. First, most manifestations of COPD occur along a continuum, which explains why continuous aspects of COPD or disease axes may be more accurate and reproducible than subtypes identified through clustering methods. Second, continuous COPD-related measures can be used to create subgroups through the use of predictive models to define cut-points, and we review COPDGene research on blood eosinophil count thresholds as a specific example. Third, COPD phenotypes identified or prioritized through machine learning methods have led to novel biological discoveries, including novel emphysema genetic risk variants and systemic inflammatory subtypes of COPD. Fourth, trajectory-based COPD subtyping captures differences in the longitudinal evolution of COPD, addressing a major limitation of clustering analyses that are confounded by disease severity. Ongoing longitudinal characterization of subjects in COPDGene will provide useful insights about the relationship between lung imaging parameters, molecular markers, and COPD progression that will enable the identification of subtypes based on underlying disease processes and distinct patterns of disease progression, with the potential to improve the clinical relevance and reproducibility of COPD subtypes.
Subject(s)
Machine Learning, Molecular Epidemiology, Chronic Obstructive Pulmonary Disease/classification, Chronic Obstructive Pulmonary Disease/epidemiology, Chronic Obstructive Pulmonary Disease/genetics, Cluster Analysis, Diagnostic Imaging, Disease Progression, Genetic Predisposition to Disease, Genome-Wide Association Study, Humans, Phenotype, Respiratory Function Tests
ABSTRACT
BACKGROUND: Chronic obstructive pulmonary disease (COPD) remains a major cause of morbidity and mortality. Present-day diagnostic criteria are based largely on spirometry. Accumulating evidence has identified a substantial number of individuals without spirometric evidence of COPD who suffer from respiratory symptoms and/or increased morbidity and mortality. There is a clear need for an expanded definition of COPD that is linked to physiologic, structural (computed tomography [CT]) and clinical evidence of disease. Using data from the COPD Genetic Epidemiology study (COPDGene®), we hypothesized that an integrated approach that includes environmental exposure, clinical symptoms, chest CT imaging and spirometry better defines disease and captures the likelihood of progression of respiratory obstruction and mortality. METHODS: Four key disease characteristics - environmental exposure (cigarette smoking), clinical symptoms (dyspnea and/or chronic bronchitis), chest CT imaging abnormalities (emphysema, gas trapping and/or airway wall thickening), and abnormal spirometry - were evaluated in a group of 8784 current and former smokers who were participants in COPDGene® Phase 1. Using these 4 disease characteristics, 8 categories of participants were identified and evaluated for odds of spirometric disease progression (FEV1 > 350 ml loss over 5 years), and the hazard ratio for all-cause mortality was examined. RESULTS: Using smokers without symptoms, CT imaging abnormalities or airflow obstruction as the reference population, individuals were classified as Possible COPD, Probable COPD and Definite COPD. Current Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria would diagnose 4062 (46%) of the 8784 study participants with COPD. The proposed COPDGene® 2019 diagnostic criteria would add an additional 3144 participants. Under the new criteria, 82% of the 8784 study participants would be diagnosed with Possible, Probable or Definite COPD. 
These COPD groups showed increased risk of disease progression and mortality. Mortality increased in patients as the number of their COPD characteristics increased, with a maximum hazard ratio for all-cause mortality of 5.18 (95% confidence interval [CI]: 4.15-6.48) in those with all 4 disease characteristics. CONCLUSIONS: A substantial portion of smokers with respiratory symptoms and imaging abnormalities do not manifest spirometric obstruction as defined by population normals. These individuals are at significant risk of death and spirometric disease progression. We propose to redefine the diagnosis of COPD through an integrated approach using environmental exposure, clinical symptoms, CT imaging and spirometric criteria. These expanded criteria offer the potential to stimulate both current and future interventions that could slow or halt disease progression in patients before disability or irreversible lung structural changes develop.
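The counts reported in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above (8784 participants, 4062 diagnosed under GOLD, 3144 added by the proposed criteria). The enumeration of 8 categories rests on an assumption made for this example: because every participant is a current or former smoker, the exposure characteristic is present in all of them, leaving three yes/no characteristics. The abstract itself does not spell out how the 8 categories were formed.

```python
from itertools import product

total_participants = 8784
gold_diagnosed = 4062      # diagnosed under current GOLD criteria
added_by_proposal = 3144   # additional participants under the proposed criteria

expanded_diagnosed = gold_diagnosed + added_by_proposal
gold_pct = round(100 * gold_diagnosed / total_participants)
expanded_pct = round(100 * expanded_diagnosed / total_participants)

print(f"GOLD criteria: {gold_diagnosed} of {total_participants} ({gold_pct}%)")
print(f"Expanded criteria: {expanded_diagnosed} of {total_participants} ({expanded_pct}%)")

# Assumption for illustration: exposure is present in every participant,
# so categories come from the three remaining binary characteristics.
characteristics = ("symptoms", "ct_abnormality", "abnormal_spirometry")
categories = list(product([False, True], repeat=len(characteristics)))
print(f"Number of categories: {len(categories)}")
```

Under that assumption the all-False combination corresponds to the reference population (smokers without symptoms, CT abnormalities or airflow obstruction), and the rounded percentages match the 46% and 82% quoted in the Results.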
ABSTRACT
Genome-wide association studies (GWAS) have identified multiple associations with emphysema apicobasal distribution (EABD), but the biological functions of these variants are unknown. To characterize the functions of EABD-associated variants, we integrated GWAS results with 1) expression quantitative trait loci (eQTL) from the Genotype Tissue Expression (GTEx) project and subjects in the COPDGene (Genetic Epidemiology of COPD) study and 2) cell type epigenomic marks from the Roadmap Epigenomics project. On the basis of these analyses, we selected a variant near ACVR1B (activin A receptor type 1B) for functional validation. SNPs from 168 loci with P values less than 5 × 10⁻⁵ in the largest GWAS meta-analysis of EABD were analyzed. Eighty-four loci overlapped eQTL, with 12 of these loci showing greater than 80% likelihood of harboring a single, shared GWAS and eQTL causal variant. Seventeen cell types were enriched for overlap between EABD loci and Roadmap Epigenomics marks (permutation P < 0.05), with the strongest enrichment observed in CD4+, CD8+, and regulatory T cells. We selected a putative causal variant, rs7962469, associated with ACVR1B expression in lung tissue for additional functional investigation, and reporter assays confirmed allele-specific regulatory activity for this variant in human bronchial epithelial and Jurkat immune cell lines. ACVR1B expression levels exhibit a nominally significant association with emphysema distribution. EABD-associated loci are preferentially enriched in regulatory elements of multiple cell types, most notably T-cell subsets. Multiple EABD loci colocalize to regulatory elements that are active across multiple tissues and cell types, and functional analyses confirm the presence of an EABD-associated functional variant that regulates ACVR1B expression, indicating that transforming growth factor-β signaling plays a role in the EABD phenotype. Clinical trial registered with www.clinicaltrials.gov (NCT00608764).
Subject(s)
Activin Receptors, Type I/genetics; Genetic Predisposition to Disease/genetics; Pulmonary Emphysema/genetics; Transforming Growth Factor beta1/metabolism; Cell Line, Tumor; Genome-Wide Association Study; Humans; Jurkat Cells; Lung/pathology; Polymorphism, Single Nucleotide/genetics; Proof of Concept Study; Quantitative Trait Loci/genetics; T-Lymphocyte Subsets/immunology
ABSTRACT
Predicting disease status for a complex human disease using genomic data is an important, yet challenging, step in personalized medicine. Among many challenges, the so-called curse of dimensionality results in unsatisfactory performance of many state-of-the-art machine learning algorithms. A major recent advance in machine learning is the rapid development of deep learning algorithms that can efficiently extract meaningful features from high-dimensional and complex datasets through a stacked and hierarchical learning process. Deep learning has shown breakthrough performance in several areas including image recognition, natural language processing, and speech recognition. However, the performance of deep learning in predicting disease status using genomic datasets is still not well studied. In this article, we review the four relevant articles that we found through a thorough literature search. All four articles first used auto-encoders to project high-dimensional genomic data to a low-dimensional space and then applied state-of-the-art machine learning algorithms to predict disease status based on the low-dimensional representations. These deep learning approaches outperformed existing prediction methods, such as prediction based on transcript-wise screening and prediction based on principal component analysis. The limitations of the current deep learning approaches and possible improvements are also discussed.
ABSTRACT
BACKGROUND: Emphysema has considerable variability in its regional distribution. Craniocaudal emphysema distribution is an important predictor of the response to lung volume reduction. However, there is little consensus regarding how to define upper lobe-predominant and lower lobe-predominant emphysema subtypes. Consequently, the clinical and genetic associations with these subtypes are poorly characterized. METHODS: We sought to identify subgroups characterized by upper-lobe or lower-lobe emphysema predominance and comparable amounts of total emphysema by analyzing data from 9,210 smokers without alpha-1-antitrypsin deficiency in the Genetic Epidemiology of COPD (COPDGene) cohort. CT densitometric emphysema was measured in each lung lobe. Random forest clustering was applied to lobar emphysema variables after regressing out the effects of total emphysema. Clusters were tested for association with clinical and imaging outcomes at baseline and at 5-year follow-up. Their associations with genetic variants were also compared. RESULTS: Three clusters were identified: minimal emphysema (n = 1,312), upper lobe-predominant emphysema (n = 905), and lower lobe-predominant emphysema (n = 796). Despite a similar amount of total emphysema, the lower-lobe group had more severe airflow obstruction at baseline and higher rates of metabolic syndrome compared with subjects with upper-lobe predominance. The group with upper-lobe predominance had greater 5-year progression of emphysema, gas trapping, and dyspnea. Differential associations with known COPD genetic risk variants were noted. CONCLUSIONS: Subgroups of smokers defined by upper-lobe or lower-lobe emphysema predominance exhibit different functional and radiological disease progression rates, and the upper-lobe predominant subtype shows evidence of association with known COPD genetic risk variants. These subgroups may be useful in the development of personalized treatments for COPD.
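The step of "regressing out the effects of total emphysema" before clustering can be illustrated with a minimal sketch. It assumes a simple linear regression (with intercept) of one lobar emphysema variable on total emphysema, which the abstract does not specify; the function name `residualize` and the numbers are hypothetical, and the random forest clustering itself is not shown.

```python
def residualize(y, x):
    """Return residuals of y after least-squares regression on x (with intercept)."""
    n = len(y)
    if n != len(x) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    # What remains of y once the linear effect of x is removed.
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical values (percent emphysema), for illustration only.
total_emphysema = [2.0, 5.0, 10.0, 20.0, 30.0]
upper_lobe_emphysema = [3.0, 4.0, 15.0, 18.0, 40.0]
adjusted_upper = residualize(upper_lobe_emphysema, total_emphysema)
```

The residuals are uncorrelated with total emphysema by construction, so a clustering applied to them groups subjects by where emphysema is located rather than by how much there is.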
Subject(s)
Pulmonary Emphysema/pathology; Aged; Comorbidity; Disease Progression; Female; Forced Expiratory Volume/physiology; Humans; Male; Middle Aged; Pulmonary Emphysema/physiopathology; Severity of Illness Index; Tomography, X-Ray Computed
ABSTRACT
RATIONALE: Emphysema has considerable variability in the severity and distribution of parenchymal destruction throughout the lungs. Upper lobe-predominant emphysema has emerged as an important predictor of response to lung volume reduction surgery. Yet, aside from alpha-1 antitrypsin deficiency, the genetic determinants of emphysema distribution remain largely unknown. OBJECTIVES: To identify the genetic influences on emphysema distribution in non-alpha-1 antitrypsin-deficient smokers. METHODS: A total of 11,532 subjects with complete genotype and computed tomography densitometry data in the COPDGene (Genetic Epidemiology of Chronic Obstructive Pulmonary Disease [COPD]; non-Hispanic white and African American), ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints), and GenKOLS (Genetics of Chronic Obstructive Lung Disease) studies were analyzed. Two computed tomography scan emphysema distribution measures (difference between upper-third and lower-third emphysema; ratio of upper-third to lower-third emphysema) were tested for genetic associations in all study subjects. Separate analyses in each study population were followed by a fixed-effect meta-analysis. Single-nucleotide polymorphism-, gene-, and pathway-based approaches were used. In silico functional evaluation was also performed. MEASUREMENTS AND MAIN RESULTS: We identified five loci associated with emphysema distribution at genome-wide significance. These loci included two previously reported associations with COPD susceptibility (4q31 near HHIP and 15q25 near CHRNA5) and three new associations near SOWAHB, TRAPPC9, and KIAA1462. Gene set analysis and in silico functional evaluation revealed pathways and cell types that may potentially contribute to the pathogenesis of emphysema distribution. CONCLUSIONS: This multicohort genome-wide association study identified new genomic loci associated with differential emphysematous destruction throughout the lungs. These findings may point to new biologic pathways on which to expand diagnostic and therapeutic approaches in chronic obstructive pulmonary disease. Clinical trial registered with www.clinicaltrials.gov (NCT00608764).
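The fixed-effect meta-analysis step that combines the separate per-cohort analyses can be sketched as follows. This assumes the common inverse-variance-weighted form, which the abstract does not state explicitly; the function name `fixed_effect_meta` and the effect sizes are hypothetical and are not results from the study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect pooling of per-study estimates.

    Returns the pooled effect, its standard error, the z statistic,
    and a two-sided p-value from the standard normal distribution.
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / (se * se) for se in ses]  # weight = 1 / variance
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-cohort effect sizes and standard errors for one SNP.
cohort_betas = [0.10, 0.14, 0.08]
cohort_ses = [0.02, 0.05, 0.04]
pooled_beta, pooled_se, pooled_z, pooled_p = fixed_effect_meta(cohort_betas, cohort_ses)
```

Because each cohort is weighted by the inverse of its variance, larger and more precise cohorts dominate the pooled estimate, and the pooled standard error is smaller than that of any single cohort.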