ABSTRACT
Rheumatoid arthritis is a prototypical autoimmune disease that causes joint inflammation and destruction [1]. There is currently no cure for rheumatoid arthritis, and the effectiveness of treatments varies across patients, suggesting an undefined pathogenic diversity [1,2]. Here, to deconstruct the cell states and pathways that characterize this pathogenic heterogeneity, we profiled the full spectrum of cells in inflamed synovium from patients with rheumatoid arthritis. We used multi-modal single-cell RNA-sequencing and surface protein data coupled with histology of synovial tissue from 79 donors to build a single-cell atlas of rheumatoid arthritis synovial tissue that includes more than 314,000 cells. We stratified tissues into six groups, referred to as cell-type abundance phenotypes (CTAPs), each characterized by selectively enriched cell states. These CTAPs demonstrate the diversity of synovial inflammation in rheumatoid arthritis, ranging from samples enriched for T and B cells to those largely lacking lymphocytes. Disease-relevant cell states, cytokines, risk genes, histology and serology metrics are associated with particular CTAPs. CTAPs are dynamic and can predict treatment response, highlighting the clinical utility of classifying rheumatoid arthritis synovial phenotypes. This comprehensive atlas and molecular, tissue-based stratification of rheumatoid arthritis synovial tissue reveal new insights into rheumatoid arthritis pathology and heterogeneity that could inform novel targeted treatments.
Subjects
Rheumatoid Arthritis , Humans , Rheumatoid Arthritis/complications , Rheumatoid Arthritis/genetics , Rheumatoid Arthritis/immunology , Rheumatoid Arthritis/pathology , Cytokines/metabolism , Inflammation/complications , Inflammation/genetics , Inflammation/immunology , Inflammation/pathology , Synovial Membrane/pathology , T-Lymphocytes/immunology , B-Lymphocytes/immunology , Genetic Predisposition to Disease/genetics , Phenotype , Single-Cell Gene Expression Analysis
ABSTRACT
This study aims to determine the shared genetic architecture between COVID-19 severity and existing medical conditions using electronic health record (EHR) data. We conducted a Phenome-Wide Association Study (PheWAS) of genetic variants associated with critical illness (n = 35) or hospitalization (n = 42) due to severe COVID-19 using genome-wide association summary data from the Host Genetics Initiative. PheWAS analysis was performed using genotype-phenotype data from the Veterans Affairs Million Veteran Program (MVP). Phenotypes were defined by International Classification of Diseases (ICD) codes mapped to clinically relevant groups using published PheWAS methods. Among 658,582 Veterans, variants associated with severe COVID-19 were tested for association across 1,559 phenotypes. Variants at the ABO locus (rs495828, rs505922) were associated with the largest number of phenotypes (n = 53 for rs495828 and n = 59 for rs505922); the strongest associations were with venous embolism (rs495828: OR 1.33, p = 1.32 × 10^-199) and thrombosis (rs505922: OR 1.33, p = 2.2 × 10^-265). Among 67 respiratory conditions tested, 11 had significant associations, including the MUC5B locus (rs35705950) with increased risk of idiopathic fibrosing alveolitis (OR 2.83, p = 4.12 × 10^-191) and CRHR1 (rs61667602) with reduced risk of pulmonary fibrosis (OR 0.84, p = 2.26 × 10^-12). The TYK2 locus (rs11085727) was associated with reduced risk for autoimmune conditions, e.g., psoriasis (OR 0.88, p = 6.48 × 10^-23) and lupus (OR 0.84, p = 3.97 × 10^-6). PheWAS stratified by ancestry demonstrated differences in genotype-phenotype associations. LMNA (rs581342) was associated with neutropenia (OR 1.29, p = 4.1 × 10^-13) among Veterans of African and Hispanic ancestry but not European ancestry. Overall, we observed a shared genetic architecture between COVID-19 severity and conditions related to underlying risk factors for severe and poor COVID-19 outcomes.
Differing genotype-phenotype associations across ancestries may inform the heterogeneous outcomes observed with COVID-19. Divergent associations between risk for severe COVID-19 and autoimmune inflammatory conditions, both respiratory and non-respiratory, highlight the shared pathways and fine balance of immune host response and autoimmunity, and the caution required when considering treatment targets.
Subjects
COVID-19 , Veterans , COVID-19/epidemiology , COVID-19/genetics , Genetic Association Studies , Genome-Wide Association Study/methods , Humans , Single Nucleotide Polymorphism/genetics
ABSTRACT
MOTIVATION: Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. METHODS: We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects. RESULTS: We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens. AVAILABILITY AND IMPLEMENTATION: The code is available at https://github.com/celehs/M2REMAP, and prediction results are provided at https://shiny.parse-health.org/drugs-diseases-dev/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Drug-Related Side Effects and Adverse Reactions , Humans , Drug Development , Electronic Health Records , Computer Neural Networks , Pharmacovigilance
ABSTRACT
OBJECTIVES: Rheumatoid arthritis (RA) and atherosclerosis share many common inflammatory pathways. We studied whether a multi-biomarker panel for RA disease activity (MBDA) would associate with changes in arterial inflammation in an interventional trial. METHODS: In the TARGET Trial, RA patients with active disease despite methotrexate were randomly assigned to the addition of either a TNF inhibitor or sulfasalazine+hydroxychloroquine (triple therapy). Baseline and 24-week follow-up 18F-fluorodeoxyglucose (FDG) positron emission tomography/computed tomography scans were assessed for change in arterial inflammation measured as the maximal arterial target-to-blood background ratio of FDG uptake in the most diseased segment of the carotid arteries or aorta (MDS-TBRmax). The MBDA test, measured at baseline and weeks 6, 18, and 24, was assessed for its association with the change in MDS-TBRmax. RESULTS: Interpretable scans were available at baseline and week 24 for n = 112 patients. The MBDA score at week 24 was significantly correlated with the change in MDS-TBRmax (Spearman's rho = 0.239; p = 0.011) and remained significantly associated after adjustment for relevant confounders. Those with low MBDA at week 24 had a statistically significant adjusted reduction in arterial inflammation of 0.35 units vs no significant reduction in those who did not achieve low MBDA. Neither DAS28-CRP nor CRP predicted change in arterial inflammation. The MBDA component with the strongest association with change in arterial inflammation was serum amyloid A (SAA). CONCLUSIONS: Among treated RA patients, achieved MBDA predicts changes in arterial inflammation. Achieving low MBDA at 24 weeks was associated with clinically meaningful reductions in arterial inflammation, regardless of treatment.
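The abstract above reports a Spearman rank correlation between the week-24 MBDA score and the change in arterial inflammation. As a minimal sketch of how such a coefficient is computed (not the trial's analysis code; the function names `average_ranks` and `spearman_rho` are illustrative assumptions), Spearman's rho is the Pearson correlation of the rank-transformed values, with tied values sharing their average rank:

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5
```

Because only ranks enter the calculation, the coefficient is insensitive to monotone transformations of either variable, which is one reason it is a common choice for biomarker scores.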
ABSTRACT
In many modern machine learning applications, changes in covariate distributions and difficulty in acquiring outcome information have posed challenges to robust model training and evaluation. Numerous transfer learning methods have been developed to robustly adapt the model itself to some unlabeled target populations using existing labeled data in a source population. However, there is a paucity of literature on transferring performance metrics, especially receiver operating characteristic (ROC) parameters, of a trained model. In this paper, we aim to evaluate the performance of a trained binary classifier on an unlabeled target population based on ROC analysis. We propose Semisupervised Transfer lEarning of Accuracy Measures (STEAM), an efficient three-step estimation procedure that employs (1) double-index modeling to construct calibrated density ratio weights and (2) robust imputation to leverage the large amount of unlabeled data to improve estimation efficiency. We establish the consistency and asymptotic normality of the proposed estimator under the correct specification of either the density ratio model or the outcome model. We also correct for potential overfitting bias in the estimators in finite samples with cross-validation. We compare our proposed estimators to existing methods and show reductions in bias and gains in efficiency through simulations. We illustrate the practical utility of the proposed method on evaluating prediction performance of a phenotyping model for rheumatoid arthritis (RA) on a temporally evolving EHR cohort.
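To illustrate the role of density ratio weights in transferring an accuracy measure, the sketch below computes an importance-weighted AUC from labeled source data. It is not the STEAM procedure: it assumes the density ratio weights are already available, and it omits the double-index calibration, the imputation step and the cross-validation described above. The function name `weighted_auc` is an illustrative assumption.

```python
from typing import Sequence


def weighted_auc(scores: Sequence[float],
                 labels: Sequence[int],
                 weights: Sequence[float]) -> float:
    """Importance-weighted AUC.

    Estimates P(score of a case > score of a non-case), counting ties as 1/2,
    where each case/non-case pair contributes the product of its two weights.
    With all weights equal to 1 this reduces to the ordinary empirical AUC.
    """
    pos = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 1]
    neg = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    num = 0.0
    den = 0.0
    for s_pos, w_pos in pos:
        for s_neg, w_neg in neg:
            pair_weight = w_pos * w_neg
            den += pair_weight
            if s_pos > s_neg:
                num += pair_weight
            elif s_pos == s_neg:
                num += 0.5 * pair_weight
    return num / den
```

Upweighting source observations that resemble the target population shifts the estimate toward the accuracy expected in the target, provided the weights are well specified.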
Subjects
Machine Learning , Supervised Machine Learning , Humans , ROC Curve , Research Design , Bias
ABSTRACT
BACKGROUND: We aimed to determine whether integrating concepts from the notes from the electronic health record (EHR) data using natural language processing (NLP) could improve the identification of gout flares. METHODS: Using Medicare claims linked with EHR, we selected gout patients who initiated the urate-lowering therapy (ULT). Patients' 12-month baseline period and on-treatment follow-up were segmented into 1-month units. We retrieved EHR notes for months with gout diagnosis codes and processed notes for NLP concepts. We selected a random sample of 500 patients and reviewed each of their notes for the presence of a physician-documented gout flare. Months containing at least 1 note mentioning gout flares were considered months with events. We used 60% of patients to train predictive models with LASSO. We evaluated the models by the area under the curve (AUC) in the validation data and examined positive/negative predictive values (P/NPV). RESULTS: We extracted and labeled 839 months of follow-up (280 with gout flares). The claims-only model selected 20 variables (AUC = 0.69). The NLP concept-only model selected 15 (AUC = 0.69). The combined model selected 32 claims variables and 13 NLP concepts (AUC = 0.73). The claims-only model had a PPV of 0.64 [0.50, 0.77] and an NPV of 0.71 [0.65, 0.76], whereas the combined model had a PPV of 0.76 [0.61, 0.88] and an NPV of 0.71 [0.65, 0.76]. CONCLUSION: Adding NLP concept variables to claims variables resulted in a small improvement in the identification of gout flares. Our data-driven claims-only model and our combined claims/NLP-concept model outperformed existing rule-based claims algorithms reliant on medication use, diagnosis, and procedure codes.
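The positive and negative predictive values reported above follow directly from the confusion-matrix counts in the validation data. A minimal sketch, with the function name `predictive_values` as an illustrative assumption (the abstract does not report the underlying counts):

```python
from typing import Tuple


def predictive_values(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): share of flagged months that truly had a flare.
    NPV = TN / (TN + FN): share of unflagged months that truly had no flare.
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("PPV/NPV undefined when no months are flagged or unflagged")
    return tp / (tp + fp), tn / (tn + fn)
```

For example, `predictive_values(tp=30, fp=10, tn=70, fn=30)` gives a PPV of 0.75 and an NPV of 0.70; these counts are made up for illustration only.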
Subjects
Gout , Aged , Humans , United States/epidemiology , Gout/diagnosis , Gout/epidemiology , Natural Language Processing , Electronic Health Records , Medicare , Symptom Flare Up , Algorithms
ABSTRACT
Rheumatoid arthritis (RA) affects 24.5 million people worldwide and has been associated with increased cancer risks. However, the extent to which the observed risks are related to the pathophysiology of rheumatoid arthritis or its treatments is unknown. Leveraging nationwide health insurance claims data with 85.97 million enrollees across 8 years, we identified 92 864 patients without cancers at the time of rheumatoid arthritis diagnoses. We matched 68 415 of these patients with participants without rheumatoid arthritis by sex, race, age and inferred health and economic status and compared their risks of developing all cancer types. By 12 months after the diagnosis of rheumatoid arthritis, rheumatoid arthritis patients were 1.21 (95% confidence interval [CI] [1.14, 1.29]) times more likely to develop any cancer compared with matched enrollees without rheumatoid arthritis. In particular, the risk of developing lymphoma is 2.08 (95% CI [1.67, 2.58]) times higher in the rheumatoid arthritis group, and the risk of developing lung cancer is 1.69 (95% CI [1.32, 2.13]) times higher. We further identified the five most commonly used drugs in treating rheumatoid arthritis, and the log-rank test showed none of them is implicated with a significantly increased cancer risk compared with rheumatoid arthritis patients without that specific drug. Our study suggested that the pathophysiology of rheumatoid arthritis, rather than its treatments, is implicated in the development of subsequent cancers. Our method is extensible to investigating the connections among drugs, diseases and comorbidities at scale.
Subjects
Rheumatoid Arthritis , Lung Neoplasms , Lymphoma , Humans , Rheumatoid Arthritis/complications , Rheumatoid Arthritis/epidemiology , Rheumatoid Arthritis/drug therapy , Comorbidity , Lung Neoplasms/etiology , Lung Neoplasms/complications , Data Analysis
ABSTRACT
With the worldwide digitalisation of medical records, electronic health records (EHRs) have become an increasingly important source of real-world data (RWD). RWD can complement traditional study designs because it captures almost the complete variety of patients, leading to more generalisable results. For rheumatology, these data are particularly interesting as our diseases are uncommon and often take years to develop. In this review, we discuss the following concepts related to the use of EHR for research and considerations for translation into clinical care: EHR data contain a broad collection of healthcare data covering the multitude of real-life patients and the healthcare processes related to their care. Machine learning (ML) is a powerful method that allows us to leverage a large amount of heterogeneous clinical data for clinical algorithms, but requires extensive training, testing, and validation. Patterns discovered in EHR data using ML are applicable to real-life settings but are also prone to capturing the local EHR structure, limiting generalisability outside the EHR(s) from which they were developed. Population studies on EHR necessitate knowledge of the factors influencing the data available in the EHR to circumvent biases, for example, access to medical care and insurance status. In summary, EHR data represent a rapidly growing and key resource for real-world studies. However, transforming RWD EHR data for research and for real-world evidence using ML requires knowledge of EHR systems and their differences from existing observational data to ensure that studies incorporate rigorous methods that acknowledge or address factors such as access to care, noise in the data, missingness and indication bias.
Subjects
Artificial Intelligence , Electronic Health Records , Humans , Algorithms , Machine Learning , Research Design
ABSTRACT
OBJECTIVE: Recent large-scale randomised trials demonstrate that immunomodulators reduce cardiovascular (CV) events among the general population. However, it is uncertain whether these effects apply to rheumatoid arthritis (RA) and if certain treatment strategies in RA reduce CV risk to a greater extent. METHODS: Patients with active RA despite use of methotrexate were randomly assigned to addition of a tumour necrosis factor (TNF) inhibitor (TNFi) or addition of sulfasalazine and hydroxychloroquine (triple therapy) for 24 weeks. Baseline and follow-up 18F-fluorodeoxyglucose-positron emission tomography/CT scans were assessed for change in arterial inflammation, an index of CV risk, measured as an arterial target-to-background ratio (TBR) in the carotid arteries and aorta. RESULTS: 115 patients completed the protocol. The two treatment groups were well balanced with a median age of 58 years, 71% women, 57% seropositive and a baseline disease activity score in 28 joints of 4.8 (IQR 4.0, 5.6). Baseline TBR was similar across the two groups. Significant TBR reductions were observed in both groups (ΔTNFi: -0.24 (SD=0.51); Δtriple therapy: -0.19 (SD=0.51)), without difference between groups (difference in Δs: -0.02, 95% CI -0.19 to 0.15, p=0.79). While disease activity was significantly reduced across both treatment groups, there was no association with change in TBR (β=0.04, 95% CI -0.03 to 0.10). CONCLUSION: We found that addition of either a TNFi or triple therapy resulted in clinically important improvements in vascular inflammation. However, the addition of a TNFi did not reduce arterial inflammation more than triple therapy. TRIAL REGISTRATION NUMBER: NCT02374021.
Subjects
Antirheumatic Agents , Arteritis , Rheumatoid Arthritis , Cardiovascular Diseases , Humans , Female , Middle Aged , Male , Antirheumatic Agents/adverse effects , Cardiovascular Diseases/prevention & control , Cardiovascular Diseases/chemically induced , Tumor Necrosis Factor-alpha , Risk Factors , Rheumatoid Arthritis/diagnostic imaging , Rheumatoid Arthritis/drug therapy , Rheumatoid Arthritis/chemically induced , Methotrexate/therapeutic use , Immunologic Factors/therapeutic use , Heart Disease Risk Factors , Arteritis/chemically induced , Arteritis/drug therapy , Treatment Outcome
ABSTRACT
OBJECTIVE: Electronic health records (EHR), containing detailed longitudinal clinical information on a large number of patients and covering broad patient populations, open opportunities for comprehensive predictive modeling of disease progression and treatment response. However, since EHRs were originally constructed for administrative purposes not for research, in the EHR-linked studies, it is often not feasible to capture reliable information for analytical variables, especially in the survival setting, when both accurate event status and event times are needed for model building. For example, progression-free survival (PFS), a commonly used survival outcome for cancer patients, often involves complex information embedded in free-text clinical notes and cannot be extracted reliably. Proxies of PFS time such as time to the first mention of progression in the notes are at best good approximations to the true event time. This leads to difficulty in efficiently estimating event rates for an EHR patient cohort. Estimating survival rates based on error-prone outcome definitions can lead to biased results and hamper the power in the downstream analysis. On the other hand, extracting accurate event time information via manual annotation is time and resource intensive. The objective of this study is to develop a calibrated survival rate estimator using noisy outcomes from EHR data. MATERIALS AND METHODS: In this paper, we propose a two-stage semi-supervised calibration of noisy event rate (SCANER) estimator that can effectively overcome censoring induced dependency and attains more robust performance (i.e., not sensitive to misspecification of the imputation model) by fully utilizing both a small-labeled set of gold-standard survival outcomes annotated via manual chart review and a set of proxy features automatically captured via EHR in the unlabeled set. 
We validate the SCANER estimator by estimating the PFS rates for a virtual cohort of lung cancer patients from one large tertiary care center and the ICU-free survival rates for COVID patients from two large tertiary care centers. RESULTS: In terms of survival rate estimates, the SCANER had very similar point estimates compared to the complete-case Kaplan Meier estimator. On the other hand, other benchmark methods for comparison, which fail to account for the induced dependency between event time and the censoring time conditioning on surrogate outcomes, produced biased results across all three case studies. In terms of standard errors, the SCANER estimator was more efficient than the KM estimator, with up to 50% efficiency gain. CONCLUSION: The SCANER estimator achieves more efficient, robust, and accurate survival rate estimates compared to existing approaches. This promising new approach can also improve the resolution (i.e., granularity of event time) by using labels conditioning on multiple surrogates, particularly among less common or poorly coded conditions.
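The benchmark named above, the complete-case Kaplan-Meier estimator, can be sketched in a few lines. This is the standard product-limit estimator applied to gold-standard labels only, not the SCANER estimator; the function name `kaplan_meier` is an illustrative assumption.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    Returns (t, S(t)) at each distinct time where at least one event occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t.
        at_risk -= n_leaving
    return curve
```

Because this estimator uses only manually annotated outcomes, its standard errors depend on the size of the labeled set, which is the inefficiency that semi-supervised approaches aim to reduce.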
Subjects
COVID-19 , Lung Neoplasms , Humans , Electronic Health Records , Calibration , Survival Analysis
ABSTRACT
Rationale: A common MUC5B gene polymorphism, rs35705950-T, is associated with idiopathic pulmonary fibrosis (IPF), but its role in severe acute respiratory syndrome coronavirus 2 infection and disease severity is unclear. Objectives: To assess whether rs35705950-T confers differential risk for clinical outcomes associated with coronavirus disease (COVID-19) infection among participants in the Million Veteran Program (MVP). Methods: The MUC5B rs35705950-T allele was directly genotyped among MVP participants; clinical events and comorbidities were extracted from the electronic health records. Associations between the incidence or severity of COVID-19 and rs35705950-T were analyzed within each ancestry group in the MVP followed by transancestry meta-analysis. Replication and joint meta-analysis were conducted using summary statistics from the COVID-19 Host Genetics Initiative (HGI). Sensitivity analyses with adjustment for additional covariates (body mass index, Charlson comorbidity index, smoking, asbestosis, rheumatoid arthritis with interstitial lung disease, and IPF) and associations with post-COVID-19 pneumonia were performed in MVP subjects. Measurements and Main Results: The rs35705950-T allele was associated with fewer COVID-19 hospitalizations in transancestry meta-analyses within the MVP (N_cases = 4,325; N_controls = 507,640; OR, 0.89 [0.82-0.97]; P = 6.86 × 10^-3) and joint meta-analyses with the HGI (N_cases = 13,320; N_controls = 1,508,841; OR, 0.90 [0.86-0.95]; P = 8.99 × 10^-5). The rs35705950-T allele was not associated with reduced COVID-19 positivity in transancestry meta-analysis within the MVP (N_cases = 19,168; N_controls = 492,854; OR, 0.98 [0.95-1.01]; P = 0.06) but was nominally significant (P < 0.05) in the joint meta-analysis with the HGI (N_cases = 44,820; N_controls = 1,775,827; OR, 0.97 [0.95-1.00]; P = 0.03). Associations were not observed with severe outcomes or mortality.
Among individuals of European ancestry in the MVP, rs35705950-T was associated with fewer post-COVID-19 pneumonia events (OR, 0.82 [0.72-0.93]; P = 0.001). Conclusions: The MUC5B variant rs35705950-T may confer protection in COVID-19 hospitalizations.
Subjects
COVID-19 , Idiopathic Pulmonary Fibrosis , Humans , COVID-19/epidemiology , COVID-19/genetics , Mucin-5B/genetics , Genetic Polymorphism , Idiopathic Pulmonary Fibrosis/genetics , Genotype , Hospitalization , Genetic Predisposition to Disease/genetics
ABSTRACT
OBJECTIVE: The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such integrative analyses is the lack of semantic interoperability across different institutions due to coding differences. We propose a Multiview Incomplete Knowledge Graph Integration (MIKGI) algorithm to integrate information from multiple sources with partially overlapping EHR concept codes to enable translations between healthcare systems. METHODS: The MIKGI algorithm combines knowledge graph information from (i) embeddings trained from the co-occurrence patterns of medical codes within each EHR system and (ii) semantic embeddings of the textual strings of all medical codes obtained from the Self-Aligning Pretrained BERT (SAPBERT) algorithm. Due to the heterogeneity in the coding across healthcare systems, each EHR source provides partial coverage of the available codes. MIKGI synthesizes the incomplete knowledge graphs derived from these multi-source embeddings by minimizing a spherical loss function that combines the pairwise directional similarities of embeddings computed from all available sources. MIKGI outputs harmonized semantic embedding vectors for all EHR codes, which improves the quality of the embeddings and enables direct assessment of both similarity and relatedness between any pair of codes from multiple healthcare systems. RESULTS: With EHR co-occurrence data from Veterans Affairs (VA) healthcare and Mass General Brigham (MGB), the MIKGI algorithm produces high-quality embeddings for a variety of downstream tasks including detecting known similar or related entity pairs and mapping VA local codes to the relevant EHR codes used at MGB. Based on the cosine similarity of the MIKGI trained embeddings, the AUC was 0.918 for detecting similar entity pairs and 0.809 for detecting related pairs.
For cross-institutional medical code mapping, the top 1 and top 5 accuracy were 91.0% and 97.5% when mapping medication codes at VA to RxNorm medication codes at MGB; 59.1% and 75.8% when mapping VA local laboratory codes to LOINC hierarchy. When trained with 500 labels, the lab code mapping attained top 1 and 5 accuracy at 77.7% and 87.9%. MIKGI also attained best performance in selecting VA local lab codes for desired laboratory tests and COVID-19 related features for COVID EHR studies. Compared to existing methods, MIKGI attained the most robust performance with accuracy the highest or near the highest across all tasks. CONCLUSIONS: The proposed MIKGI algorithm can effectively integrate incomplete summary data from biomedical text and EHR data to generate harmonized embeddings for EHR codes for knowledge graph modeling and cross-institutional translation of EHR codes.
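The top 1 and top 5 mapping accuracies above rest on ranking candidate codes by the cosine similarity of their embeddings. The sketch below shows that ranking step only, assuming harmonized embeddings are already available; it does not implement MIKGI's training or its spherical loss. The function names `cosine` and `top_k_matches` and the code labels in the usage note are illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def top_k_matches(query: Sequence[float],
                  candidates: Dict[str, Sequence[float]],
                  k: int) -> List[str]:
    """Return the k candidate codes most similar to the query embedding."""
    ranked = sorted(candidates,
                    key=lambda code: cosine(query, candidates[code]),
                    reverse=True)
    return ranked[:k]
```

With hypothetical candidates `{"A": [1, 0], "B": [0, 1], "C": [1, 1]}` and a query embedding `[1, 0.1]`, `top_k_matches(..., k=2)` returns `["A", "C"]`. Top-k accuracy is then the fraction of source codes whose correct target code appears among the k returned candidates.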
Subjects
COVID-19 , Electronic Health Records , Algorithms , Humans , Logical Observation Identifiers Names and Codes , Automated Pattern Recognition
ABSTRACT
OBJECTIVE: Examine the association of methotrexate (MTX) use with cardiovascular disease (CVD) in rheumatoid arthritis (RA) using marginal structural models (MSM) and determine if CVD risk is mediated through modification of disease activity. METHODS: We identified incident CVD events (coronary artery disease (CAD), stroke, heart failure (HF) hospitalisation, CVD death) within a multicentre, prospective cohort of US Veterans with RA. A 28-joint Disease Activity Score with C-reactive protein (DAS28-CRP) was collected at regular visits and medication exposures were determined by linking to pharmacy dispensing data. MSMs were used to estimate the treatment effect of MTX on risk of incident CVD, accounting for time-varying confounders between receiving MTX and CVD events. A mediation analysis was performed to estimate the indirect effects of methotrexate on CVD risk through modification of RA disease activity. RESULTS: Among 2044 RA patients (90% male, mean age 63.9 years, baseline DAS28-CRP 3.6), there were 378 incident CVD events. Using MSM, MTX use was associated with a 24% reduced risk of composite CVD events (HR 0.76, 95% CI 0.58 to 0.99) including a 57% reduction in HF hospitalisations (HR 0.43, 95% CI 0.24 to 0.77). Individual associations with CAD, stroke and CVD death were not statistically significant. In mediation analyses, there was no evidence of indirect effects of MTX on CVD risk through disease activity modification (HR 1.03, 95% CI 0.80 to 1.32). CONCLUSIONS: MTX use in RA was associated with a reduced risk of CVD events, particularly HF-related hospitalisations. These associations were not mediated through reductions in RA disease activity, suggesting alternative MTX-related mechanisms may modify CVD risk in this population.
Subjects
Antirheumatic Agents/therapeutic use , Rheumatoid Arthritis/drug therapy , Coronary Artery Disease/epidemiology , Heart Disease Risk Factors , Heart Failure/epidemiology , Hospitalization/statistics & numerical data , Methotrexate/therapeutic use , Stroke/epidemiology , Aged , Rheumatoid Arthritis/epidemiology , Rheumatoid Arthritis/physiopathology , Cardiovascular Diseases/mortality , Female , Humans , Incidence , Male , Middle Aged , Proportional Hazards Models
ABSTRACT
PURPOSE OF REVIEW: Patients with chronic inflammatory disease have an increased risk of cardiovascular disease. This article reviews the current evidence of cardiovascular prevention in three common systemic inflammatory disorders (SIDs): psoriasis, rheumatoid arthritis, and systemic lupus erythematosus. RECENT FINDINGS: General population cardiovascular risk assessment tools currently underestimate cardiovascular risk and disease-specific risk assessment tools are an area of active investigation. A disease-specific cardiovascular risk estimator has not been shown to more accurately predict risk compared with the current guidelines. Rheumatoid arthritis-specific risk estimators have been shown to better predict cardiovascular risk in some cohorts and not others. Systemic lupus erythematosus-specific scores have also been proposed and require further validation, whereas psoriasis is an open area of active investigation. The current role of universal prevention treatment with statin therapy in patients with SID remains unclear. Aggressive risk factor modification and control of disease activity are important interventions to reduce cardiovascular risk. SUMMARY: A comprehensive approach that includes cardiovascular risk factor modification, control of systemic inflammation, and increased patient and physician awareness is needed in cardiovascular prevention of chronic inflammation. Clinical trials are currently underway to test whether disease-specific anti-inflammatory therapies will reduce cardiovascular risk.
Subjects
Rheumatoid Arthritis , Cardiovascular Diseases , Psoriasis , Rheumatoid Arthritis/complications , Rheumatoid Arthritis/drug therapy , Cardiovascular Diseases/etiology , Cardiovascular Diseases/prevention & control , Chronic Disease , Humans , Inflammation/complications , Risk Factors
ABSTRACT
Crohn's disease (CD) and ulcerative colitis (UC) are heterogeneous. With the availability of therapeutic classes with distinct immunologic mechanisms of action, it has become imperative to identify markers that predict the likelihood of response to each drug class. However, robust development of such tools has been challenging because of the need for large prospective cohorts with systematic and careful assessment of treatment response using validated indices. Most hospitals in the United States use electronic health records (EHRs) that warehouse a large amount of narrative (free-text) and codified (administrative) data generated during routine clinical care. These data have been used to construct virtual disease cohorts for epidemiologic research as well as for defining the genetic basis of disease states or discrete laboratory values [1-3]. Whether EHR-based data can be used to validate genetic associations for more nuanced outcomes such as treatment response has not been examined previously.
Subjects
Ulcerative Colitis , Crohn Disease , Inflammatory Bowel Diseases , Electronic Health Records , Humans , Inflammatory Bowel Diseases/drug therapy , Prospective Studies , United States
ABSTRACT
OBJECTIVES: To develop classification algorithms that accurately identify axial SpA (axSpA) patients in electronic health records, and compare the performance of algorithms incorporating free-text data against approaches using only International Classification of Diseases (ICD) codes. METHODS: An enriched cohort of 7853 eligible patients was created from electronic health records of two large hospitals using automated searches (≥1 ICD code combined with simple text searches). Key disease concepts from free-text data were extracted using NLP and combined with ICD codes to develop algorithms. We created both supervised regression-based algorithms (on a training set of 127 axSpA cases and 423 non-cases) and unsupervised algorithms to identify patients with high probability of having axSpA from the enriched cohort. Their performance was compared against classifications using ICD codes only. RESULTS: NLP extracted four disease concepts of high predictive value: ankylosing spondylitis, sacroiliitis, HLA-B27 and spondylitis. The unsupervised algorithm, incorporating both the NLP concept and ICD code for AS, identified the greatest number of patients. By setting the probability threshold to attain 80% positive predictive value, it identified 1509 axSpA patients (mean age 53 years, 71% male). Sensitivity was 0.78, specificity 0.94 and area under the curve 0.93. The two supervised algorithms performed similarly but identified fewer patients. All three outperformed traditional approaches using ICD codes alone (area under the curve 0.80-0.87). CONCLUSION: Algorithms incorporating free-text data can accurately identify axSpA patients in electronic health records. Large cohorts identified using these novel methods offer exciting opportunities for future clinical research.
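Setting a probability threshold to attain a target positive predictive value, as described above, can be sketched as a scan over candidate cut-offs in a labeled validation set. This is an illustrative assumption about the mechanics, not the study's code; the function name `threshold_for_ppv` is hypothetical.

```python
from typing import List, Optional, Sequence


def threshold_for_ppv(probs: Sequence[float],
                      labels: Sequence[int],
                      target_ppv: float) -> Optional[float]:
    """Return the lowest threshold t whose PPV is at least target_ppv.

    Patients with predicted probability >= t are classified as cases.
    Choosing the lowest qualifying t identifies the most patients while
    still meeting the target. Returns None if no threshold qualifies.
    """
    best: Optional[float] = None
    # Scan every observed probability as a candidate cut-off, highest first.
    for t in sorted(set(probs), reverse=True):
        flagged: List[int] = [y for p, y in zip(probs, labels) if p >= t]
        ppv = sum(flagged) / len(flagged)
        if ppv >= target_ppv:
            best = t
    return best
```

PPV is not guaranteed to fall monotonically as the threshold is lowered, so the scan checks every candidate rather than stopping at the first failure.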
Subjects
Electronic Health Records/statistics & numerical data , Natural Language Processing , Quality Improvement , Spondylarthritis/classification , Ankylosing Spondylitis/classification , Aged , Algorithms , Area Under Curve , Cohort Studies , Female , Humans , International Classification of Diseases , Male , Middle Aged , Sensitivity and Specificity , Spondylarthritis/epidemiology , Ankylosing Spondylitis/epidemiology
ABSTRACT
OBJECTIVE: The objective of this study was to compare the performance of a rheumatoid arthritis (RA) algorithm, developed and trained in 2010 using natural language processing and machine learning, on the original data with its performance on updated data containing ICD10 codes, new RA treatments, and a new electronic medical records (EMR) system. METHODS: We extracted data from subjects with ≥1 RA International Classification of Diseases (ICD) code from the EMR of two large academic centres to create a data mart. Gold standard RA cases were identified by reviewing a random 200 subjects from the data mart and a random 100 subjects who had only RA ICD10 codes. We compared the performance of the following algorithms on the original 2010 data and on the updated data: (i) a published 2010 RA algorithm; (ii) an updated algorithm, incorporating ICD10 RA codes and new DMARDs; and (iii) a published rule-based algorithm using ICD codes only (≥3 RA ICD codes). RESULTS: The gold standard RA cases had a mean age of 65.5 years; 78.7% were female and 74.1% were positive for rheumatoid factor (RF) or antibodies to cyclic citrullinated peptide (anti-CCP). The positive predictive value (PPV) for ≥3 RA ICD codes was 54%, compared with 56% in 2010. At a specificity of 95%, the PPV of the 2010 algorithm and of the updated version were both 91%, compared with 94% (95% CI: 91, 96%) in 2010. In subjects with ICD10 data only, the PPV for the updated 2010 RA algorithm was 93%. CONCLUSION: When validated on the updated data, the 2010 RA algorithm showed performance characteristics similar to those on the 2010 data. While the 2010 algorithm continued to perform better than the rule-based approach, the PPV of the latter also remained stable over time.
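As background for reading PPV figures reported at a fixed specificity, the standard Bayes relation between sensitivity, specificity, prevalence and PPV can be written out as a short sketch. The function name `ppv` and all input values below are illustrative assumptions and are not taken from the study.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule.

    PPV = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical inputs: specificity held at 0.95, sensitivity at 0.90, while the
# share of true cases in the screened population (prevalence) varies.
for prevalence in (0.2, 0.5):
    print(prevalence, round(ppv(0.90, 0.95, prevalence), 3))
```

The point of the sketch is that, with specificity held constant, PPV still moves with the proportion of true cases in the population being classified, which is one reason PPV is usually re-checked when an algorithm is applied to new data.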
Subjects
Rheumatoid Arthritis , International Classification of Diseases , Algorithms , Electronic Health Records , Humans
ABSTRACT
OBJECTIVES: This study aimed to compare comorbidities and biologic DMARD (bDMARD) use between ankylosing spondylitis (AS) and non-radiographic axial spondyloarthritis (nr-axSpA) patients, using a large cohort of patients from routine clinical practice in the United States. METHODS: We performed a cross-sectional study using electronic medical records from two academic hospitals in the United States. Data were extracted using automated searches (⩾3 ICD codes combined with text searches) and supplemented with manual chart review. Patients were categorized as having AS or nr-axSpA according to classification criteria. Disease features, comorbidities (from a list of 39 chronic conditions) and history of bDMARD prescription were compared using descriptive statistics. RESULTS: Among 965 patients identified, 775 (80%) were classified as having axSpA. The cohort was predominantly male (74%) with a mean age of 52.5 years (s.d. 16.8). AS patients were significantly older (54 vs 46 years), more frequently male (77% vs 64%) and had higher serum inflammatory markers than those with nr-axSpA (median CRP 3.4 vs 2.2 mg/dl). Half of all patients had at least one comorbidity. The mean number of comorbidities was 1.5 (s.d. 2.2) and was similar between the AS and nr-axSpA groups. A history of bDMARD use was seen in 55% of patients, with no difference between groups. The most commonly prescribed bDMARDs were adalimumab (31%) and etanercept (29%). Ever-prescriptions of individual bDMARDs were similar between AS and nr-axSpA. CONCLUSION: Despite age differences, nr-axSpA patients had a comorbidity burden similar to that of patients with AS. Both groups received comparable bDMARD treatment in this United States clinic-based cohort.
Subjects
Antirheumatic Agents/therapeutic use , Biological Products/therapeutic use , Chronic Disease/epidemiology , Spondylarthritis/epidemiology , Ankylosing Spondylitis/epidemiology , Adult , Aged , Chronic Disease/drug therapy , Comorbidity , Cross-Sectional Studies , Female , Humans , Inflammation Mediators/blood , Male , Middle Aged , Spondylarthritis/blood , Spondylarthritis/drug therapy , Ankylosing Spondylitis/blood , Ankylosing Spondylitis/drug therapy , United States
Subjects
Rheumatoid Arthritis , Biomarkers , Cardiovascular Diseases , Troponin T , Humans , Rheumatoid Arthritis/blood , Rheumatoid Arthritis/complications , Troponin T/blood , Female , Male , Cardiovascular Diseases/etiology , Cardiovascular Diseases/blood , Middle Aged , Aged , Biomarkers/blood
ABSTRACT
Electronic medical records (EMR) data linked with genomic data have facilitated efficient, large-scale translational studies. One major challenge in using EMR for translational research is the difficulty of annotating disease phenotypes accurately and efficiently, owing to the low accuracy of billing codes and the time required for manual chart review. Recent efforts such as those by the Electronic Medical Records and Genomics (eMERGE) Network and Informatics for Integrating Biology & the Bedside (i2b2) have led to an increasing number of algorithms available for classifying various disease phenotypes. Investigators can apply such algorithms to obtain predicted phenotypes for their specific EMR study. They typically perform a small validation study within their cohort to assess the algorithm's performance and then treat the algorithm classification as the true phenotype in downstream genetic association analyses. Although these algorithms perform better than simple billing codes, they may not port well across institutions, leading to bias and low power in association studies. In this paper, we propose a semi-supervised method to make inferences about both the accuracy of multiple available algorithms and the effect of genetic markers on the true phenotype. The method leverages information from both a large set of unlabeled data, where genetic markers and algorithm outputs are available, and a small validation set, where labels are additionally available. Simulation studies show that the proposed method substantially outperforms existing methods from the missing data literature. The proposed method is applied to an EMR study of how low-density lipoprotein risk alleles affect the risk of cardiovascular disease among patients with rheumatoid arthritis.
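The bias mentioned in this abstract, which arises when an imperfect algorithm classification is treated as the true phenotype, can be illustrated with a small calculation. This is not the semi-supervised method proposed in the paper; it only shows, under assumed non-differential misclassification, how an odds ratio computed from algorithm labels is pulled toward 1. All rates and the function names `observed_case_rate` and `odds_ratio` are hypothetical.

```python
def observed_case_rate(true_rate: float, sensitivity: float,
                       specificity: float) -> float:
    """Share of patients the algorithm labels as cases, given the true case rate.

    True cases are detected with probability `sensitivity`; non-cases are
    wrongly flagged with probability 1 - `specificity`.
    """
    return sensitivity * true_rate + (1.0 - specificity) * (1.0 - true_rate)


def odds_ratio(rate_exposed: float, rate_unexposed: float) -> float:
    """Odds ratio comparing case rates in two groups."""
    odds_exposed = rate_exposed / (1.0 - rate_exposed)
    odds_unexposed = rate_unexposed / (1.0 - rate_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical true case rates in risk-allele carriers and non-carriers.
rate_carriers, rate_noncarriers = 0.30, 0.20
# Hypothetical algorithm accuracy, assumed equal in both genotype groups.
sens, spec = 0.80, 0.90

true_or = odds_ratio(rate_carriers, rate_noncarriers)
observed_or = odds_ratio(
    observed_case_rate(rate_carriers, sens, spec),
    observed_case_rate(rate_noncarriers, sens, spec),
)
print(round(true_or, 3), round(observed_or, 3))
```

Under these assumptions the observed odds ratio lies between 1 and the true odds ratio, which is the attenuation that motivates modelling algorithm accuracy jointly with the genetic effect rather than taking the labels at face value.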