ABSTRACT
Chronic obstructive pulmonary disease (COPD) is a complex disease influenced by well-established environmental exposures (most notably, cigarette smoking) and incompletely defined genetic factors. The chromosome 4q region harbors multiple genetic risk loci for COPD, including signals near HHIP, FAM13A, GSTCD, TET2, and BTC. Leveraging RNA-Seq data from lung tissue in COPD cases and controls, we estimated the co-expression network for genes in the 4q region bounded by HHIP and BTC (~70 Mb), through partial correlations informed by protein-protein interactions. We identified several co-expressed gene pairs based on partial correlations, including NPNT-HHIP, BTC-NPNT and FAM13A-TET2, which were replicated in independent lung tissue cohorts. Upon clustering the co-expression network, we observed that four genes previously associated with COPD (BTC, HHIP, NPNT and PPM1K) appeared in the same network community. Finally, we discovered a sub-network of genes differentially co-expressed between COPD cases and controls (including FAM13A, PPA2, PPM1K and TET2). Many of these genes were previously implicated in cell-based knock-out experiments, including the knock-out of SPP1, which belongs to the same genomic region and could be a potential local key regulatory gene. These analyses identify chromosome 4q as a region enriched for COPD genetic susceptibility and differential co-expression.
ABSTRACT
Rationale: Cigarette smoking (CS) impairs B cell function and antibody production, increasing infection risk. The impact of e-cigarette use ('vaping') and combined CS and vaping ('dual-use') on B cell activity is unclear. Objective: To examine B cell receptor sequencing (BCR-seq) profiles associated with CS, vaping, dual-use, COPD-related outcomes, and demographic factors. Methods: BCR-seq was performed on blood RNA samples from 234 participants in the COPDGene study. We assessed multivariable associations of B cell function measures (immunoglobulin heavy chain (IGH) subclass expression and usage, class-switching, V-segment usage, and clonal expansion) with CS, vaping, dual-use, COPD severity, age, sex, and race. We adjusted for multiple comparisons using the Benjamini-Hochberg method, identifying significant associations at 5% FDR and suggestive associations at 10% FDR. Results: Among 234 non-Hispanic white (NHW) and African American (AA) participants, CS and dual-use were significantly positively associated with increased secretory IgA production, with dual-use showing the strongest associations. Dual-use was positively associated with class switching and B cell clonal expansion, indicating increased B cell activation, with similar trends in those only smoking or only vaping. We observed significant associations between race and IgG antibody usage. AA participants had higher IgG subclass proportions and lower IgM usage compared to NHW participants. Conclusions: CS and vaping additively enhance B cell activation, most notably in dual-users. Self-reported race was strongly associated with IgG isotype usage. These findings highlight associations between B cell activation and antibody transcription, suggesting potential differences in immune and vaccine responses linked to CS, vaping, and race.
ABSTRACT
Rates of cannabis initiation among teenagers and young adults are increasing. Further, the use of various forms of cannabis (smoked or vaped) with nicotine (dual use) is increasingly common among young people. The health effects of dual use are less well understood, particularly in the context of high-potency cannabis products and across different routes of administration, which makes future health outcomes difficult to predict. Cannabis use has long been associated with decreased activity and increased snacking, both of which could portend an increased risk of metabolic and cardiovascular disease, particularly when these habits begin during formative years. However, modern forms of cannabis may not have these same effects. Here, we assess whether cannabis use alone and dual use of cannabis with nicotine impact dietary and exercise habits in young people. An anonymous, social media-based survey was designed based on the UC San Diego Inhalant Questionnaire and published diet and exercise questionnaires. A total of 457 surveys were completed. Young sole cannabis users represented 29% of responders, 16% were dual users of cannabis and nicotine, and 55% were non-users of either drug. Although the sole use of cannabis was not associated with dietary or activity differences relative to non-users, dual users of cannabis and nicotine reported higher consumption of unhealthy sugars. This novel finding of dual use being associated with increased sugar intake in young people raises concerns for an increased risk of metabolic syndrome and cardiovascular disease in this population.
Subjects
Marijuana Smoking, Humans, Adolescent, Female, Male, Young Adult, Marijuana Smoking/epidemiology, Adult, Surveys and Questionnaires, Dietary Sugars, Exercise
ABSTRACT
Fibrosis drives end-organ damage in many diseases. However, clinical trials targeting individual upstream activators of fibroblasts, such as TGFβ, have largely failed. Here, we target the leukemia inhibitory factor receptor (LIFR) as a "master amplifier" of multiple upstream activators of lung fibroblasts. In idiopathic pulmonary fibrosis (IPF), the most common fibrotic lung disease, we found that lung myofibroblasts had high LIF expression. Further, TGFβ1, one of the key drivers of fibrosis, upregulated LIF expression in IPF fibroblasts. In vitro anti-LIFR antibody blocking on human IPF lung fibroblasts reduced induction of profibrotic genes downstream of TGFβ1, IL-4 and IL-13. Further, siRNA silencing of LIFR in IPF precision cut lung slices reduced expression of fibrotic proteins. Together, we find that LIFR drives an autocrine positive feedback loop that amplifies and sustains pathogenic activation of IPF fibroblasts downstream of multiple external stimuli, implicating LIFR as a therapeutic target in fibrosis. Significance Statement: Fibroblasts have a central role in the pathogenesis of fibrotic diseases. However, due in part to multiple profibrotic stimuli, targeting a single activator of fibroblasts, like TGFβ, has not yielded successful clinical treatments. We hypothesized that a more effective therapeutic strategy is identifying a downstream "master amplifier" of a range of upstream profibrotic stimuli. This study finds that the leukemia inhibitory factor receptor (LIFR) on fibrotic lung fibroblasts amplifies multiple profibrotic stimuli, such as IL-13 and TGFβ. Blocking LIFR reduced fibrosis in ex vivo lung tissue from patients with idiopathic pulmonary fibrosis (IPF). LIFR, acting as a master amplifier downstream of fibroblast activation, offers an alternative therapeutic strategy for fibrotic diseases.
ABSTRACT
Chronic obstructive pulmonary disease (COPD) is the third leading cause of death worldwide. The primary causes of COPD are environmental, including cigarette smoking; however, genetic susceptibility also contributes to COPD risk. Genome-wide association studies (GWASs) have revealed more than 80 genetic loci associated with COPD, leading to the identification of multiple COPD GWAS genes. However, the biological relationships between the identified COPD susceptibility genes are largely unknown. Genes associated with a complex disease are often in close network proximity, i.e. their protein products often interact directly with each other and/or similar proteins. In this study, we use affinity purification mass spectrometry (AP-MS) to identify protein interactions with HHIP, a well-established COPD GWAS gene which is part of the sonic hedgehog pathway, in two disease-relevant lung cell lines (IMR90 and 16HBE). To better understand the network neighborhood of HHIP, its proximity to the protein products of other COPD GWAS genes, and its functional role in COPD pathogenesis, we create HUBRIS, a protein-protein interaction network compiled from 8 publicly available databases. We identified both common and cell type-specific protein-protein interactors of HHIP. We find that our newly identified interactions shorten the network distance between HHIP and the protein products of several COPD GWAS genes, including DSP, MFAP2, TET2, and FBLN5. These new shorter paths include proteins that are encoded by genes involved in extracellular matrix and tissue organization. We found and validated interactions with proteins that provide new insights into COPD pathobiology, including CAVIN1 (IMR90) and TP53 (16HBE). The newly discovered HHIP interactions with CAVIN1 and TP53 implicate HHIP in the response to oxidative stress.
ABSTRACT
Chronic Obstructive Pulmonary Disease (COPD) is a complex, heterogeneous disease. Traditional subtyping methods generally focus on either the clinical manifestations or the molecular endotypes of the disease, resulting in classifications that do not fully capture the disease's complexity. Here, we bridge this gap by introducing a subtyping pipeline that integrates clinical and gene expression data with variational autoencoders. We apply this methodology to the COPDGene study, a large study of current and former smoking individuals with and without COPD. Our approach generates a set of vector embeddings, called Personalized Integrated Profiles (PIPs), that recapitulate the joint clinical and molecular state of the subjects in the study. Prediction experiments show that the PIPs have a predictive accuracy comparable to or better than other embedding approaches. Using trajectory learning approaches, we analyze the main trajectories of variation in the PIP space and identify five well-separated subtypes with distinct clinical phenotypes, expression signatures, and disease outcomes. Notably, these subtypes are more robust to data resampling compared to those identified using traditional clustering approaches. Overall, our findings provide new avenues to establish fine-grained associations between the clinical characteristics, molecular processes, and disease outcomes of COPD.
ABSTRACT
Rationale: Constantly exposed to the external environment and mutagens such as tobacco smoke, human lungs have one of the highest somatic mutation rates among all human organs. However, the relationship of these mutations to lung disease and function is not known. Objectives: To identify the prevalence and significance of clonal somatic mutations in chronic lung diseases. Methods: We analyzed the clonal somatic mutations from 1,251 samples of normal and diseased noncancerous lung tissue RNA sequencing with paired whole-genome sequencing from the Lung Tissue Research Consortium. We examined the associations of somatic mutations with lung function, disease status, and computationally deconvoluted cell types in two of the most common diseases represented in our dataset, chronic obstructive pulmonary disease (COPD; 29%) and idiopathic pulmonary fibrosis (IPF; 13%). Measurements and Main Results: Clonal somatic mutational burden was associated with reduced lung function in both COPD and IPF. We identified an increased prevalence of clonal somatic mutations in individuals with IPF compared with normal control subjects and individuals with COPD independent of age and smoking status. IPF clonal somatic mutations were enriched in disease-related and airway epithelial-expressed genes such as MUC5B in IPF. Patients who were MUC5B risk variant carriers had increased odds of developing somatic mutations of MUC5B that were explained by increased expression of MUC5B. Conclusions: Our identification of an increased prevalence of clonal somatic mutation in diseased lung that correlates with airway epithelial gene expression and disease severity highlights for the first time the role of somatic mutational processes in lung disease genetics.
Subjects
Idiopathic Pulmonary Fibrosis, Chronic Obstructive Pulmonary Disease, Humans, Idiopathic Pulmonary Fibrosis/genetics, Idiopathic Pulmonary Fibrosis/metabolism, Lung/metabolism, Mutation/genetics, Respiratory Physiological Phenomena, Chronic Obstructive Pulmonary Disease/epidemiology, Chronic Obstructive Pulmonary Disease/genetics, Chronic Obstructive Pulmonary Disease/metabolism
ABSTRACT
BACKGROUND: The lifetime risk of developing clinical COPD among smokers ranges from 13% to 22%. Identifying at-risk individuals who will develop overt disease in a reasonable timeframe may allow for early intervention. We hypothesised that readily available clinical and physiological variables could help identify ever-smokers at higher risk of developing chronic airflow limitation (CAL). METHODS: Among 2273 Lovelace Smokers' Cohort (LSC) participants, we included 677 (mean age 54 years) with normal spirometry at baseline and a minimum of three spirometries, each 1 year apart. Repeated spirometric measurements were used to determine incident CAL. Using logistic regression, we related variables obtained at baseline (demographics, anthropometrics, smoking history, modified Medical Research Council dyspnoea scale, St George's Respiratory Questionnaire, comorbidities and spirometry) to incident CAL as defined by the Global Initiative for Chronic Obstructive Lung Disease and lower limit of normal criteria. The predictive model derived from the LSC was validated in subjects from the COPDGene study. RESULTS: Over 6.3 years, the incidence of CAL was 26 cases per 1000 person-years. The strongest independent predictors were forced expiratory volume in 1 s (FEV1)/forced vital capacity (FVC) <0.75, having smoked ≥30 pack-years, body mass index (BMI) ≤25 kg·m⁻² and symptoms of chronic bronchitis. Having all four predictors increased the risk of developing CAL over 6 years to 85% (area under the receiver operating characteristic curve (AUC ROC) 0.84, 95% CI 0.81-0.89). The prediction model showed similar results when applied to subjects in the COPDGene study with a follow-up period of 10 years (AUC ROC 0.77, 95% CI 0.72-0.81). CONCLUSION: In middle-aged ever-smokers, a simple predictive model with FEV1/FVC, smoking history, BMI and chronic bronchitis helps identify subjects at high risk of developing CAL.
Subjects
Chronic Bronchitis, Chronic Obstructive Pulmonary Disease, Middle Aged, Humans, Chronic Bronchitis/diagnosis, Chronic Bronchitis/epidemiology, Chronic Bronchitis/complications, Chronic Obstructive Pulmonary Disease/diagnosis, Chronic Obstructive Pulmonary Disease/epidemiology, Forced Expiratory Volume, Vital Capacity, Smoking/epidemiology, Spirometry/methods, Lung
ABSTRACT
Detection of viruses by RNA and DNA sequencing has improved the understanding of the human virome. We sought to identify blood viral signatures through secondary use of RNA-sequencing (RNA-seq) data in a large study cohort. The ability to reveal undiagnosed infections with public health implications among study subjects with available sequencing data could enable epidemiologic surveys and may lead to diagnosis and therapeutic interventions, leveraging existing research data in a clinical context. We detected viral RNA in peripheral blood RNA-seq data from a COPD-enriched population of current and former smokers. Correlation between viral detection and both reported infections and relevant disease outcomes was evaluated. We identified Hepatitis C virus (HCV) RNA in 228 subjects and HIV RNA in 30 subjects. Overall, we observed 31 viral species, including Epstein-Barr virus and Cytomegalovirus. We observed an enrichment of HCV and HIV infections among subjects reporting liver disease and HIV infections, respectively. Higher interferon expression scores were observed in the subjects with HCV and HIV infections. Through secondary use of RNA-seq from a cohort of current and former smokers, we detected peripheral blood viral signatures. We identified HIV and HCV, highlighting potential public health implications for the approach described in this study. We observed correlations with reported infections, chronic infection outcomes and the host transcriptomic response, providing evidence to support the validity of the approach.
Subjects
Epstein-Barr Virus Infections, HIV Infections, Hepatitis C, Humans, Hepacivirus/genetics, HIV Infections/diagnosis, HIV Infections/genetics, HIV Infections/complications, Epstein-Barr Virus Infections/complications, Smokers, Human Herpesvirus 4/genetics, Hepatitis C/diagnosis, Hepatitis C/genetics, Hepatitis C/complications, RNA, Viral RNA/genetics
ABSTRACT
Low-dose computed tomography (LDCT) is an effective screening test to decrease lung cancer deaths. Lung cancer screening may be a teachable moment helping people who smoke to quit, which may result in increased benefit of screening. Innovative strategies are needed to engage high-risk individuals in learning about LDCT screening. More precise methods, such as polygenic risk scores, quantify genetic predisposition to tobacco use and may help optimize lung health interventions. We present the ESCAPE (Enhanced Smoking Cessation Approach to Promote Empowerment) protocol. This study will test a smoking cessation intervention using personal stories and a lung cancer screening decision aid compared to standard care (brief advice, referral to a quit line, and a lung cancer screening decision aid), examine the relationship between a polygenic risk score and smoking abstinence, and describe perceptions about integration of genomic information into smoking cessation treatment. A randomized controlled trial followed by a sequential explanatory mixed methods approach will compare the efficacy of the interventions. Interviews will add insight into the use of genomic information and risk perceptions to tailor smoking cessation treatment. Two hundred and fifty individuals will be recruited from primary care, community-based organizations, mailing lists and through social media. Data will be collected at baseline and at 1, 3 and 6 months. The primary outcomes are 7-day point prevalence smoking abstinence and stage of lung cancer screening at 6 months. The results from this study will provide information to refine the ESCAPE intervention and facilitate integration of precision health into future lung health interventions. Clinical trial registration number: NCT0469129T.
Subjects
Lung Neoplasms, Smoking Cessation, Humans, Smoking Cessation/methods, Early Detection of Cancer/methods, Lung Neoplasms/diagnosis, Lung Neoplasms/epidemiology, Lung, Smoking/epidemiology, Smoking/therapy, Randomized Controlled Trials as Topic
ABSTRACT
Aberrant splicing underlies many human diseases, including cancer, cardiovascular diseases and neurological disorders. Genome-wide mapping of splicing quantitative trait loci (sQTLs) has shown that genetic regulation of alternative splicing is widespread. However, identification of the corresponding isoform or protein products associated with disease-associated sQTLs is challenging with short-read RNA-seq, which cannot precisely characterize full-length transcript isoforms. Furthermore, contemporary sQTL interpretation often relies on reference transcript annotations, which are incomplete. Solutions to these issues may be found through integration of newly emerging long-read sequencing technologies. Long-read sequencing offers the capability to sequence full-length mRNA transcripts and, in some cases, to link sQTLs to transcript isoforms containing disease-relevant protein alterations. Here, we provide an overview of sQTL mapping approaches, the use of long-read sequencing to characterize sQTL effects on isoforms, the linkage of RNA isoforms to protein-level functions and comment on future directions in the field. Based on recent progress, long-read RNA sequencing promises to be part of the human disease genetics toolkit to discover and treat protein isoforms causing rare and complex diseases.
Subjects
Human Genetics, RNA Isoforms, Humans, Protein Isoforms/genetics, Protein Isoforms/metabolism, RNA Isoforms/genetics, Messenger RNA/genetics, RNA Sequence Analysis
ABSTRACT
BACKGROUND: Interstitial lung abnormalities (ILA) are radiologic findings that may progress to idiopathic pulmonary fibrosis (IPF). Blood gene expression profiles can predict IPF mortality, but whether these same genes associate with ILA and ILA outcomes is unknown. This study evaluated whether a previously described blood gene expression profile associated with IPF mortality is associated with ILA and all-cause mortality. METHODS: In COPDGene and ECLIPSE study participants with visual scoring of ILA and gene expression data, we evaluated the association of a previously described IPF mortality score with ILA and mortality. We also trained a new ILA score, derived using genes from the IPF score, in a subset of COPDGene. We tested the association with ILA and mortality on the remainder of COPDGene and ECLIPSE. RESULTS: In 1469 COPDGene (training n = 734; testing n = 735) and 571 ECLIPSE participants, the IPF score was not associated with ILA or mortality. However, an ILA score derived from IPF score genes was associated with ILA (meta-analysis of test datasets OR 1.4 [95% CI: 1.2-1.6]) and mortality (HR 1.25 [95% CI: 1.12-1.41]). Six of the 11 genes in the ILA score had discordant directions of effects compared to the IPF score. The ILA score partially mediated the effects of age on mortality (11.8% proportion mediated). CONCLUSIONS: An ILA gene expression score, derived from IPF mortality-associated genes, identified genes with concordant and discordant effects on IPF mortality and ILA. These results suggest shared and unique biologic processes among those with ILA, IPF, aging, and death.
Subjects
Idiopathic Pulmonary Fibrosis, Interstitial Lung Diseases, Cohort Studies, Humans, Idiopathic Pulmonary Fibrosis/diagnosis, Idiopathic Pulmonary Fibrosis/genetics, Lung, Interstitial Lung Diseases/diagnosis, Interstitial Lung Diseases/genetics, X-Ray Computed Tomography, Transcriptome/genetics
ABSTRACT
BACKGROUND: Chronic obstructive pulmonary disease (COPD) and idiopathic pulmonary fibrosis (IPF) are characterized by shared exposures and clinical features, but distinct genetic and pathologic features exist. These features have not been well-studied using large-scale gene expression datasets. We hypothesized that there are divergent gene, pathway, and cellular signatures between COPD and IPF. METHODS: We performed RNA-sequencing on lung tissues from individuals with IPF (n = 231) and COPD (n = 377) compared to control (n = 267), defined as individuals with normal spirometry. We grouped the overlapping differential expression gene sets based on direction of expression and examined the resultant sets for genes of interest, pathway enrichment, and cell composition. Using gene set variation analysis, we validated the overlap group gene sets in independent COPD and IPF data sets. RESULTS: We found 5010 genes differentially expressed between COPD and control, and 11,454 genes differentially expressed between IPF and control (1% false discovery rate). 3846 genes overlapped between IPF and COPD. Several pathways were enriched for genes upregulated in COPD and downregulated in IPF; however, no pathways were enriched for genes downregulated in COPD and upregulated in IPF. There were many myeloid cell genes with increased expression in COPD but decreased in IPF. We found that the genes upregulated in COPD but downregulated in IPF were associated with lower lung function in the independent validation cohorts. CONCLUSIONS: We identified a divergent gene expression signature between COPD and IPF, with increased expression in COPD and decreased in IPF. This signature is associated with worse lung function in both COPD and IPF.
Subjects
Idiopathic Pulmonary Fibrosis, Chronic Obstructive Pulmonary Disease, Humans, Idiopathic Pulmonary Fibrosis/pathology, Lung/pathology, Chronic Obstructive Pulmonary Disease/complications, Chronic Obstructive Pulmonary Disease/diagnosis, Chronic Obstructive Pulmonary Disease/genetics, RNA Sequence Analysis, Transcriptome/genetics
ABSTRACT
Rationale: The ability of peripheral blood biomarkers to assess chronic obstructive pulmonary disease (COPD) risk and progression is unknown. Genetics and gene expression may capture important aspects of COPD-related biology that predict disease activity. Objectives: Develop a transcriptional risk score (TRS) for COPD and assess the contribution of the TRS and a polygenic risk score (PRS) for disease susceptibility and progression. Methods: We randomly split 2,569 COPDGene (Genetic Epidemiology of COPD) participants with whole-blood RNA sequencing into training (n = 1,945) and testing (n = 624) samples and used 468 ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate End-points) COPD cases with microarray data for replication. We developed a TRS using penalized regression (least absolute shrinkage and selection operator) to model FEV1/FVC and studied the predictive value of TRS for COPD (Global Initiative for Chronic Obstructive Lung Disease 2-4), prospective FEV1 change (ml/yr), and additional COPD-related traits. We adjusted for potential confounders, including age and smoking. We evaluated the predictive performance of the TRS in the context of a previously derived PRS and clinical factors. Measurements and Main Results: The TRS included 147 transcripts and was associated with COPD (odds ratio, 3.3; 95% confidence interval [CI], 2.4-4.5; P < 0.001), FEV1 change (β, -17 ml/yr; 95% CI, -28 to -6.6; P = 0.002), and other COPD-related traits. In ECLIPSE cases, we replicated the association with FEV1 change (β, -8.2; 95% CI, -15 to -1; P = 0.025) and the majority of other COPD-related traits. Models including PRS, TRS, and clinical factors were more predictive of COPD (area under the receiver operator characteristic curve, 0.84) and annualized FEV1 change compared with models with one risk score or clinical factors alone.
Conclusions: Blood transcriptomics can improve prediction of COPD and lung function decline when added to a PRS and clinical risk factors.
Subjects
Biomarkers/blood, Disease Progression, Chronic Obstructive Pulmonary Disease/blood, Chronic Obstructive Pulmonary Disease/genetics, Chronic Obstructive Pulmonary Disease/physiopathology, Risk Assessment/methods, Aged, Female, Gene Expression Regulation, Genetic Predisposition to Disease, Humans, Male, Middle Aged, Multifactorial Inheritance, Odds Ratio, Phenotype, Predictive Value of Tests, Prospective Studies, Risk Factors, Severity of Illness Index, Transcription Factors
ABSTRACT
Cigarette smoking induces a profound transcriptomic and systemic inflammatory response. Previous studies have focused on gene level differential expression of smoking, but the genome-wide effects of smoking on alternative isoform regulation have not yet been described. We conducted RNA sequencing in whole-blood samples of 454 current and 767 former smokers in the COPDGene Study, and we analyzed the effects of smoking on differential usage of isoforms and exons. At 10% FDR, we detected 3167 differentially expressed genes, 945 differentially used isoforms and 160 differentially used exons. Isoform switch analysis revealed widespread 3' UTR lengthening associated with cigarette smoking. The lengthening of these 3' UTRs was consistent with alternative usage of distal polyadenylation sites, and these extended 3' UTR regions were significantly enriched with functional sequence elements including microRNA and RNA-protein binding sites. These findings warrant further studies on alternative polyadenylation events as potential biomarkers and novel therapeutic targets for smoking-related diseases.
Subjects
Cigarette Smoking, Polyadenylation, 3' Untranslated Regions, Cigarette Smoking/adverse effects, Cigarette Smoking/genetics, Protein Isoforms/genetics, Smoking/adverse effects, Smoking/genetics
ABSTRACT
Most predictive models based on gene expression data do not leverage information related to gene splicing, despite the fact that splicing is a fundamental feature of eukaryotic gene expression. Cigarette smoking is an important environmental risk factor for many diseases, and it has profound effects on gene expression. Using smoking status as a prediction target, we developed deep neural network predictive models using gene, exon, and isoform level quantifications from RNA sequencing data in 2,557 subjects in the COPDGene Study. We observed that models using exon and isoform quantifications clearly outperformed gene-level models when using data from 5 genes from a previously published prediction model. Whereas the test set performance of the previously published model was 0.82 in the original publication, our exon-based models including an exon-to-isoform mapping layer achieved a test set AUC (area under the receiver operating characteristic curve) of 0.88, which improved to an AUC of 0.94 using exon quantifications from a larger set of genes. Isoform variability is an important source of latent information in RNA-seq data that can be used to improve clinical prediction models.
Subjects
Deep Learning, Statistical Models, RNA-Seq/methods, Smoking, Aged, Computational Biology, Exons/genetics, Female, Gene Expression Profiling, Humans, Male, Middle Aged, Protein Isoforms/genetics, ROC Curve, Smoking/epidemiology, Smoking/genetics
ABSTRACT
The human microbiome has a role in the development of multiple diseases. Individual microbiome profiles are highly personalized, though many species are shared. Understanding the relationship between the human microbiome and disease may inform future individualized treatments. We hypothesize the blood microbiome signature may be a surrogate for some lung microbial characteristics. We sought associations between the blood microbiome signature and lung-relevant host factors. Based on reads not mapped to the human genome, we detected microbial nucleic acids through secondary use of peripheral blood RNA-sequencing from 2,590 current and former smokers with and without chronic obstructive pulmonary disease (COPD) from the COPDGene study. We used the Genome Analysis Toolkit (GATK) microbial pipeline PathSeq to infer microbial profiles. We tested associations between the inferred profiles and lung disease relevant phenotypes and examined links to host gene expression pathways. We replicated our analyses using a second independent set of blood RNA-seq data from 1,065 COPDGene study subjects and performed a meta-analysis across the two studies. The four phyla with highest abundance across all subjects were Proteobacteria, Actinobacteria, Firmicutes and Bacteroidetes. In our meta-analysis, we observed associations (q-value < 0.05) between Acinetobacter, Serratia, Streptococcus and Bacillus inferred abundances and Modified Medical Research Council (mMRC) dyspnea score. Current smoking status was associated (q < 0.05) with Acinetobacter, Serratia and Cutibacterium abundance. All 12 taxa investigated were associated with at least one white blood cell distribution variable. Abundance for nine of the 12 taxa was associated with sex, and seven of the 12 taxa were associated with race. Host-microbiome interaction analysis revealed clustering of genera associated with mMRC dyspnea score and smoking status, through shared links to several host pathways. 
This study is the first to identify a bacterial microbiome signature in the peripheral blood of current and former smokers. Understanding the relationships between systemic microbial signatures and lung-related phenotypes may inform novel interventions and aid understanding of the systemic effects of smoking.
Subjects
Microbiota, Sepsis/microbiology, Smokers, Aged, Aged 80 and over, Disease Susceptibility, Female, Follow-Up Studies, Genetic Predisposition to Disease, Host Microbial Interactions, Host-Pathogen Interactions, Humans, Lung/microbiology, Male, Meta-Analysis as Topic, Middle Aged, Chronic Obstructive Pulmonary Disease/complications, Chronic Obstructive Pulmonary Disease/diagnosis, Chronic Obstructive Pulmonary Disease/etiology, Respiratory Function Tests, Sepsis/diagnosis, Sepsis/etiology, Smoking/adverse effects
ABSTRACT
BACKGROUND: Attenuation of transforming growth factor β by blocking angiotensin II has been shown to reduce emphysema in a murine model. General population studies have demonstrated that the use of angiotensin converting enzyme inhibitors (ACEis) and angiotensin-receptor blockers (ARBs) is associated with reduction of emphysema progression in former smokers and that the use of ACEis is associated with reduction of FEV1 progression in current smokers. RESEARCH QUESTION: Is use of ACEi and ARB associated with less progression of emphysema and FEV1 decline among individuals with COPD or baseline emphysema? METHODS: Former and current smokers from the Genetic Epidemiology of COPD Study who attended baseline and 5-year follow-up visits, did not change smoking status, and underwent chest CT imaging were included. Adjusted linear mixed models were used to evaluate progression of adjusted lung density (ALD), percent emphysema (% total lung volume < -950 Hounsfield units [HU]), 15th percentile of the attenuation histogram (attenuation [in HU] below which 15% of voxels are situated plus 1,000 HU), and lung function decline over 5 years between ACEi and ARB users and nonusers in those with spirometry-confirmed COPD, as well as all participants and those with baseline emphysema. Effect modification by smoking status also was investigated. RESULTS: Over 5 years of follow-up, compared with nonusers, ACEi and ARB users with COPD showed slower ALD progression (adjusted mean difference [aMD], 1.6; 95% CI, 0.34-2.9). Slowed lung function decline was not observed based on phase 1 medication (aMD of FEV1 % predicted, 0.83; 95% CI, -0.62 to 2.3), but was when analysis was limited to consistent ACEi and ARB users (aMD of FEV1 % predicted, 1.9; 95% CI, 0.14-3.6). No effect modification by smoking status was found for radiographic outcomes, and the lung function effect was more pronounced in former smokers. Results were similar among participants with baseline emphysema.
INTERPRETATION: Among participants with spirometry-confirmed COPD or baseline emphysema, ACEi and ARB use was associated with slower progression of emphysema and lung function decline. TRIAL REGISTRY: ClinicalTrials.gov; No.: NCT00608764; URL: www.clinicaltrials.gov.