Results 1 - 20 of 45
1.
Bioinformatics ; 39(2), 2023 02 03.
Article in English | MEDLINE | ID: mdl-36805623

ABSTRACT

MOTIVATION: Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. METHODS: We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects. RESULTS: We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens. AVAILABILITY AND IMPLEMENTATION: The code is available at https://github.com/celehs/M2REMAP, and prediction results are provided at https://shiny.parse-health.org/drugs-diseases-dev/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Drug-Related Side Effects and Adverse Reactions , Humans , Drug Development , Electronic Health Records , Neural Networks, Computer , Pharmacovigilance
2.
Pharmacoepidemiol Drug Saf ; 33(1): e5684, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37654015

ABSTRACT

BACKGROUND: We aimed to determine whether integrating concepts from the notes from the electronic health record (EHR) data using natural language processing (NLP) could improve the identification of gout flares. METHODS: Using Medicare claims linked with EHR, we selected gout patients who initiated the urate-lowering therapy (ULT). Patients' 12-month baseline period and on-treatment follow-up were segmented into 1-month units. We retrieved EHR notes for months with gout diagnosis codes and processed notes for NLP concepts. We selected a random sample of 500 patients and reviewed each of their notes for the presence of a physician-documented gout flare. Months containing at least 1 note mentioning gout flares were considered months with events. We used 60% of patients to train predictive models with LASSO. We evaluated the models by the area under the curve (AUC) in the validation data and examined positive/negative predictive values (P/NPV). RESULTS: We extracted and labeled 839 months of follow-up (280 with gout flares). The claims-only model selected 20 variables (AUC = 0.69). The NLP concept-only model selected 15 (AUC = 0.69). The combined model selected 32 claims variables and 13 NLP concepts (AUC = 0.73). The claims-only model had a PPV of 0.64 [0.50, 0.77] and an NPV of 0.71 [0.65, 0.76], whereas the combined model had a PPV of 0.76 [0.61, 0.88] and an NPV of 0.71 [0.65, 0.76]. CONCLUSION: Adding NLP concept variables to claims variables resulted in a small improvement in the identification of gout flares. Our data-driven claims-only model and our combined claims/NLP-concept model outperformed existing rule-based claims algorithms reliant on medication use, diagnosis, and procedure codes.
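The positive and negative predictive values reported above follow directly from a confusion matrix. As a minimal sketch, assuming binary month-level labels (1 = month with a documented flare) and binary model predictions, the calculation could look like the following. The function name and the toy labels are invented for illustration and are not study data.

```python
def predictive_values(y_true, y_pred):
    """Return (PPV, NPV) for binary labels, where 1 marks a month with a flare."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # flagged, truly a flare
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # flagged, no flare
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not flagged, no flare
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # not flagged, missed flare
    # Guard against empty denominators (no positive or no negative predictions).
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Toy labels, invented for illustration only (not data from the study).
y_true = [1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 1, 0, 1]
ppv, npv = predictive_values(y_true, y_pred)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

The confidence intervals quoted in the abstract would need a separate interval method, which the abstract does not specify and this sketch does not attempt.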


Subjects
Gout , Aged , Humans , United States/epidemiology , Gout/diagnosis , Gout/epidemiology , Natural Language Processing , Electronic Health Records , Medicare , Symptom Flare Up , Algorithms
3.
J Med Internet Res ; 25: e45662, 2023 05 25.
Article in English | MEDLINE | ID: mdl-37227772

ABSTRACT

Although randomized controlled trials (RCTs) are the gold standard for establishing the efficacy and safety of a medical treatment, real-world evidence (RWE) generated from real-world data has been vital in postapproval monitoring and is being promoted for the regulatory process of experimental therapies. An emerging source of real-world data is electronic health records (EHRs), which contain detailed information on patient care in both structured (eg, diagnosis codes) and unstructured (eg, clinical notes and images) forms. Despite the granularity of the data available in EHRs, the critical variables required to reliably assess the relationship between a treatment and clinical outcome are challenging to extract. To address this fundamental challenge and accelerate the reliable use of EHRs for RWE, we introduce an integrated data curation and modeling pipeline consisting of 4 modules that leverage recent advances in natural language processing, computational phenotyping, and causal modeling techniques with noisy data. Module 1 consists of techniques for data harmonization. We use natural language processing to recognize clinical variables from RCT design documents and map the extracted variables to EHR features with description matching and knowledge networks. Module 2 then develops techniques for cohort construction using advanced phenotyping algorithms to both identify patients with diseases of interest and define the treatment arms. Module 3 introduces methods for variable curation, including a list of existing tools to extract baseline variables from different sources (eg, codified, free text, and medical imaging) and end points of various types (eg, death, binary, temporal, and numerical). Finally, module 4 presents validation and robust modeling methods, and we propose a strategy to create gold-standard labels for EHR variables of interest to validate data curation quality and perform subsequent causal modeling for RWE. 
In addition to the workflow proposed in our pipeline, we also develop a reporting guideline for RWE that covers the necessary information to facilitate transparent reporting and reproducibility of results. Moreover, our pipeline is highly data driven, enhancing study data with a rich variety of publicly available information and knowledge sources. We also showcase our pipeline and provide guidance on the deployment of relevant tools by revisiting the emulation of the Clinical Outcomes of Surgical Therapy Study Group Trial on laparoscopy-assisted colectomy versus open colectomy in patients with early-stage colon cancer. We also draw on existing literature on EHR emulation of RCTs together with our own studies with the Mass General Brigham EHR.


Subjects
Colonic Neoplasms , Electronic Health Records , Humans , Algorithms , Informatics , Research Design
4.
J Biomed Inform ; 132: 104109, 2022 08.
Article in English | MEDLINE | ID: mdl-35660521

ABSTRACT

OBJECTIVE: Accurately assigning phenotype information to individual patients via computational phenotyping using Electronic Health Records (EHRs) has been seen as the first step towards enabling EHRs for precision medicine research. Chart review labels annotated by clinical experts, also known as "gold standard" labels, are essential for the development and validation of computational phenotyping algorithms. However, given the complexity of EHR systems, the process of chart review is both labor intensive and time consuming. We propose a fully automated algorithm, referred to as pGUESS, to rank EHR notes according to their relevance to a given phenotype. By identifying the most relevant notes, pGUESS can greatly improve the efficiency and accuracy of chart reviews. METHOD: pGUESS uses prior guided semantic similarity to measure the informativeness of a clinical note to a given phenotype. We first select candidate clinical concepts from a pool of comprehensive medical concepts using public knowledge sources and then derive the semantic embedding vector (SEV) for a reference article (SEVref) and each note (SEVnote). The algorithm scores the relevance of a note as the cosine similarity between SEVnote and SEVref. RESULTS: The algorithm was validated against four sets of 200 notes that were manually annotated by clinical experts to assess their informativeness to one of three disease phenotypes. pGUESS algorithm substantially outperforms existing unsupervised approaches for classifying the relevance status with respect to both accuracy and scalability across phenotypes. Averaging over the three phenotypes, the rank correlation between the algorithm ranking and gold standard label was 0.64 for pGUESS, but only 0.47 and 0.35 for the next two best performing algorithms. pGUESS is also much more computationally scalable compared to existing algorithms. 
CONCLUSION: The pGUESS algorithm can substantially reduce the burden of chart review and holds potential for improving the efficiency and accuracy of human annotation.
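The scoring step described in the METHOD section, cosine similarity between SEVnote and SEVref, can be illustrated with a short sketch. This is not the pGUESS implementation: how the semantic embedding vectors are derived is not shown, and the three-dimensional vectors, note identifiers, and function names below are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_notes(sev_ref, sev_notes):
    """Rank note ids by similarity of each note vector to the reference vector, highest first."""
    scores = {note_id: cosine_similarity(vec, sev_ref) for note_id, vec in sev_notes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy vectors; real SEVs would be high-dimensional.
sev_ref = [1.0, 0.0, 1.0]
sev_notes = {
    "note_a": [1.0, 0.0, 1.0],
    "note_b": [0.0, 1.0, 0.0],
    "note_c": [1.0, 1.0, 0.0],
}
ranking = rank_notes(sev_ref, sev_notes)
for note_id, score in ranking:
    print(note_id, round(score, 3))
```

A reviewer would then read notes from the top of the ranking down, which is the sense in which the abstract describes reducing chart-review burden.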


Subjects
Algorithms , Semantics , Electronic Health Records , Humans , Natural Language Processing , Phenotype , Precision Medicine
5.
J Biomed Inform ; 133: 104147, 2022 09.
Article in English | MEDLINE | ID: mdl-35872266

ABSTRACT

OBJECTIVE: The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such integrative analyses is the lack of semantic interoperability across different institutions due to coding differences. We propose a Multiview Incomplete Knowledge Graph Integration (MIKGI) algorithm to integrate information from multiple sources with partially overlapping EHR concept codes to enable translations between healthcare systems. METHODS: The MIKGI algorithm combines knowledge graph information from (i) embeddings trained from the co-occurrence patterns of medical codes within each EHR system and (ii) semantic embeddings of the textual strings of all medical codes obtained from the Self-Aligning Pretrained BERT (SAPBERT) algorithm. Due to the heterogeneity in the coding across healthcare systems, each EHR source provides partial coverage of the available codes. MIKGI synthesizes the incomplete knowledge graphs derived from these multi-source embeddings by minimizing a spherical loss function that combines the pairwise directional similarities of embeddings computed from all available sources. MIKGI outputs harmonized semantic embedding vectors for all EHR codes, which improves the quality of the embeddings and enables direct assessment of both similarity and relatedness between any pair of codes from multiple healthcare systems. RESULTS: With EHR co-occurrence data from Veteran Affairs (VA) healthcare and Mass General Brigham (MGB), MIKGI algorithm produces high quality embeddings for a variety of downstream tasks including detecting known similar or related entity pairs and mapping VA local codes to the relevant EHR codes used at MGB. Based on the cosine similarity of the MIKGI trained embeddings, the AUC was 0.918 for detecting similar entity pairs and 0.809 for detecting related pairs. 
For cross-institutional medical code mapping, the top 1 and top 5 accuracy were 91.0% and 97.5% when mapping medication codes at VA to RxNorm medication codes at MGB; 59.1% and 75.8% when mapping VA local laboratory codes to LOINC hierarchy. When trained with 500 labels, the lab code mapping attained top 1 and 5 accuracy at 77.7% and 87.9%. MIKGI also attained best performance in selecting VA local lab codes for desired laboratory tests and COVID-19 related features for COVID EHR studies. Compared to existing methods, MIKGI attained the most robust performance with accuracy the highest or near the highest across all tasks. CONCLUSIONS: The proposed MIKGI algorithm can effectively integrate incomplete summary data from biomedical text and EHR data to generate harmonized embeddings for EHR codes for knowledge graph modeling and cross-institutional translation of EHR codes.
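The top 1 and top 5 accuracy figures above can be read as follows: for each source code, rank candidate target codes by cosine similarity of their embeddings and check whether the correct target appears among the first k. The sketch below assumes harmonized embeddings are already available; it does not reproduce MIKGI training, and all code identifiers and vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def top_k_accuracy(source_vecs, target_vecs, gold, k):
    """Fraction of source codes whose gold target is among the k most similar targets."""
    hits = 0
    for src, vec in source_vecs.items():
        ranked = sorted(target_vecs, key=lambda t: cosine(vec, target_vecs[t]), reverse=True)
        if gold[src] in ranked[:k]:
            hits += 1
    return hits / len(source_vecs)


# Hypothetical two-dimensional embeddings and code names, for illustration only.
source_vecs = {"local_lab_1": [1.0, 0.0], "local_lab_2": [0.6, 0.8]}
target_vecs = {"target_x": [1.0, 0.1], "target_y": [0.0, 1.0], "target_z": [1.0, 1.0]}
gold = {"local_lab_1": "target_x", "local_lab_2": "target_y"}

top1 = top_k_accuracy(source_vecs, target_vecs, gold, k=1)
top2 = top_k_accuracy(source_vecs, target_vecs, gold, k=2)
print(top1, top2)
```

In this toy case the second source code is closest to a wrong target and its gold target is ranked second, so accuracy rises when k increases from 1 to 2, mirroring the gap between top 1 and top 5 in the abstract.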


Subjects
COVID-19 , Electronic Health Records , Algorithms , Humans , Logical Observation Identifiers Names and Codes , Pattern Recognition, Automated
6.
Rheumatology (Oxford) ; 59(12): 3759-3766, 2020 12 01.
Article in English | MEDLINE | ID: mdl-32413107

ABSTRACT

OBJECTIVE: The objective of this study was to compare the performance of an RA algorithm developed and trained in 2010 utilizing natural language processing and machine learning, using updated data containing ICD10, new RA treatments, and a new electronic medical records (EMR) system. METHODS: We extracted data from subjects with ≥1 RA International Classification of Diseases (ICD) codes from the EMR of two large academic centres to create a data mart. Gold standard RA cases were identified from reviewing a random sample of 200 subjects from the data mart, and a random sample of 100 subjects who only have RA ICD10 codes. We compared the performance of the following algorithms using the original 2010 data with updated data: (i) a published 2010 RA algorithm; (ii) an updated algorithm, incorporating ICD10 RA codes and new DMARDs; and (iii) a published algorithm using ICD codes only, ICD RA code ≥3. RESULTS: The gold standard RA cases had mean age 65.5 years, 78.7% female, 74.1% RF or antibodies to cyclic citrullinated peptide (anti-CCP) positive. The positive predictive value (PPV) for ≥3 RA ICD was 54%, compared with 56% in 2010. At a specificity of 95%, the PPV of the 2010 algorithm and the updated version were both 91%, compared with 94% (95% CI: 91, 96%) in 2010. In subjects with ICD10 data only, the PPV for the updated 2010 RA algorithm was 93%. CONCLUSION: The 2010 RA algorithm, validated with the updated data, showed performance characteristics similar to those with the 2010 data. While the 2010 algorithm continued to perform better than the rule-based approach, the PPV of the latter also remained stable over time.


Subjects
Arthritis, Rheumatoid , International Classification of Diseases , Algorithms , Electronic Health Records , Humans
7.
Rheumatology (Oxford) ; 59(5): 1059-1065, 2020 05 01.
Article in English | MEDLINE | ID: mdl-31535693

ABSTRACT

OBJECTIVES: To develop classification algorithms that accurately identify axial SpA (axSpA) patients in electronic health records, and compare the performance of algorithms incorporating free-text data against approaches using only International Classification of Diseases (ICD) codes. METHODS: An enriched cohort of 7853 eligible patients was created from electronic health records of two large hospitals using automated searches (⩾1 ICD codes combined with simple text searches). Key disease concepts from free-text data were extracted using NLP and combined with ICD codes to develop algorithms. We created both supervised regression-based algorithms-on a training set of 127 axSpA cases and 423 non-cases-and unsupervised algorithms to identify patients with high probability of having axSpA from the enriched cohort. Their performance was compared against classifications using ICD codes only. RESULTS: NLP extracted four disease concepts of high predictive value: ankylosing spondylitis, sacroiliitis, HLA-B27 and spondylitis. The unsupervised algorithm, incorporating both the NLP concept and ICD code for AS, identified the greatest number of patients. By setting the probability threshold to attain 80% positive predictive value, it identified 1509 axSpA patients (mean age 53 years, 71% male). Sensitivity was 0.78, specificity 0.94 and area under the curve 0.93. The two supervised algorithms performed similarly but identified fewer patients. All three outperformed traditional approaches using ICD codes alone (area under the curve 0.80-0.87). CONCLUSION: Algorithms incorporating free-text data can accurately identify axSpA patients in electronic health records. Large cohorts identified using these novel methods offer exciting opportunities for future clinical research.
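Setting a probability threshold to attain a target positive predictive value, as described in the RESULTS, can be sketched as a scan over candidate cutoffs. The abstract does not say how the threshold was selected, so the rule below (choose the lowest cutoff whose PPV meets the target, which flags the most patients) is an assumption, as are the function name and the toy probabilities and labels.

```python
def threshold_for_ppv(probs, labels, target_ppv):
    """Return the lowest probability cutoff whose PPV meets the target, or None if none does."""
    best = None
    # Scan cutoffs from high to low; PPV need not be monotone, so check every cutoff
    # and keep the last (lowest) one that still satisfies the target.
    for cutoff in sorted(set(probs), reverse=True):
        flagged = [label for p, label in zip(probs, labels) if p >= cutoff]
        ppv = sum(flagged) / len(flagged)
        if ppv >= target_ppv:
            best = cutoff
    return best


# Toy predicted probabilities and gold labels (1 = case), invented for illustration.
probs = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 1, 0, 1, 0, 0]

cutoff = threshold_for_ppv(probs, labels, target_ppv=0.80)
n_flagged = sum(1 for p in probs if p >= cutoff)
print(cutoff, n_flagged)
```

In practice the cutoff would be chosen on labelled data and then applied to the full enriched cohort, where PPV can only be estimated rather than observed.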


Subjects
Electronic Health Records/statistics & numerical data , Natural Language Processing , Quality Improvement , Spondylarthritis/classification , Spondylitis, Ankylosing/classification , Aged , Algorithms , Area Under Curve , Cohort Studies , Female , Humans , International Classification of Diseases , Male , Middle Aged , Sensitivity and Specificity , Spondylarthritis/epidemiology , Spondylitis, Ankylosing/epidemiology
8.
Eur J Epidemiol ; 34(2): 153-162, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30535584

ABSTRACT

We developed algorithms to identify pregnant women with suicidal behavior using information extracted from clinical notes by natural language processing (NLP) in electronic medical records. Using both codified data and NLP applied to unstructured clinical notes, we first screened pregnant women in Partners HealthCare for suicidal behavior. Psychiatrists manually reviewed clinical charts to identify relevant features for suicidal behavior and to obtain gold-standard labels. Using the adaptive elastic net, we developed algorithms to classify suicidal behavior. We then validated the algorithms in an independent validation dataset. From 275,843 women with codes related to pregnancy or delivery, 9331 women screened positive for suicidal behavior by either codified data (N = 196) or NLP (N = 9145). Using expert-curated features, our algorithm achieved an area under the curve of 0.83. By setting a positive predictive value comparable to that of diagnostic codes related to suicidal behavior (0.71), we obtained a sensitivity of 0.34, specificity of 0.96, and negative predictive value of 0.83. The algorithm identified 1423 pregnant women with suicidal behavior among 9331 women screened positive. Mining unstructured clinical notes using NLP resulted in an 11-fold increase in the number of pregnant women identified with suicidal behavior, compared with reliance on diagnostic codes alone.


Subjects
Electronic Health Records , International Classification of Diseases/standards , Natural Language Processing , Pregnancy Complications , Suicidal Ideation , Algorithms , Data Mining , Female , Humans , Pregnancy
9.
J Biomed Inform ; 100: 103322, 2019 12.
Article in English | MEDLINE | ID: mdl-31672532

ABSTRACT

OBJECTIVE: With its increasingly widespread adoption, electronic health records (EHR) have enabled phenotypic information extraction at an unprecedented granularity and scale. However, often a medical concept (e.g. diagnosis, prescription, symptom) is described in various synonyms across different EHR systems, hindering data integration for signal enhancement and complicating dimensionality reduction for knowledge discovery. Despite existing ontologies and hierarchies, tremendous human effort is needed for curation and maintenance - a process that is both unscalable and susceptible to subjective biases. This paper aims to develop a data-driven approach to automate grouping medical terms into clinically relevant concepts by combining multiple up-to-date data sources in an unbiased manner. METHODS: We present a novel data-driven grouping approach - multi-view banded spectral clustering (mvBSC) combining summary data from multiple healthcare systems. The proposed method consists of a banding step that leverages the prior knowledge from the existing coding hierarchy, and a combining step that performs spectral clustering on an optimally weighted matrix. RESULTS: We apply the proposed method to group ICD-9 and ICD-10-CM codes together by integrating data from two healthcare systems. We show grouping results and hierarchies for 13 representative disease categories. Individual grouping qualities were evaluated using normalized mutual information, adjusted Rand index, and F1-measure, and were found to consistently exhibit great similarity to the existing manual grouping counterpart. The resulting ICD groupings also enjoy comparable interpretability and are well aligned with the current ICD hierarchy. CONCLUSION: The proposed approach, by systematically leveraging multiple data sources, is able to overcome bias while maximizing consensus to achieve generalizability. 
It has the advantage of being efficient, scalable, and adaptive to the evolving human knowledge reflected in the data, showing a significant step toward automating medical knowledge integration.


Subjects
Electronic Health Records , International Classification of Diseases , Algorithms , Automation , Cluster Analysis , Humans
10.
BMC Med Inform Decis Mak ; 19(1): 226, 2019 11 15.
Article in English | MEDLINE | ID: mdl-31730484

ABSTRACT

BACKGROUND: Electronic medical records (EMR) contain numerical data important for clinical outcomes research, such as vital signs and cardiac ejection fractions (EF), which tend to be embedded in narrative clinical notes. In current practice, these data are often manually extracted for use in research studies. However, due to the large volume of notes in datasets, manually extracting numerical data often becomes infeasible. The objective of this study is to develop and validate a natural language processing (NLP) tool that can efficiently extract numerical clinical data from narrative notes. RESULTS: To validate the accuracy of the tool EXTraction of EMR Numerical Data (EXTEND), we developed a reference standard by manually extracting vital signs from 285 notes, EF values from 300 notes, and glycated hemoglobin (HbA1C) and serum creatinine from 890 notes. For each parameter of interest, we calculated the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and F1 score of EXTEND using two metrics: (1) completion of data extraction, and (2) accuracy of data extraction compared to the actual values in the note verified by chart review. At the note level, extraction by EXTEND was considered correct only if it accurately detected and extracted all values of interest in a note. Using manually annotated labels as the gold standard, the note-level accuracy of EXTEND in capturing the numerical vital sign values, EF, HbA1C and creatinine ranged from 0.88 to 0.95 for sensitivity, 0.95 to 1.0 for specificity, 0.95 to 1.0 for PPV, 0.89 to 0.99 for NPV, and 0.92 to 0.96 in F1 scores. Compared to the actual value level, the sensitivity, PPV, and F1 score of EXTEND ranged from 0.91 to 0.95, 0.95 to 1.0 and 0.95 to 0.96. CONCLUSIONS: EXTEND is an efficient, flexible tool that uses knowledge-based rules to extract clinical numerical parameters with high accuracy. 
By increasing dictionary terms and developing new rules, the usage of EXTEND can easily be expanded to extract additional numerical data important in clinical outcomes research.
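The general idea of pairing dictionary terms with rules to pull numbers out of narrative text can be illustrated with regular expressions. The sketch below is not EXTEND and does not reproduce its dictionary or rules; the term lists, patterns, function name, and sample note are all hypothetical, and real notes would need handling for ranges, units, and negation that is omitted here.

```python
import re

# Hypothetical term dictionary: each parameter maps to a pattern made of
# trigger terms, a short non-numeric gap, and a captured numeric value.
PATTERNS = {
    "ejection_fraction": re.compile(
        r"\b(?:ejection fraction|LVEF|EF)\b[^0-9]{0,20}?(\d{1,2}(?:\.\d+)?)\s*%",
        re.IGNORECASE,
    ),
    "hba1c": re.compile(
        r"\b(?:HbA1c|hemoglobin A1c|A1c)\b[^0-9]{0,20}?(\d{1,2}(?:\.\d+)?)",
        re.IGNORECASE,
    ),
}


def extract_values(note):
    """Return, for each parameter, the list of numeric values found in the note text."""
    return {
        name: [float(match.group(1)) for match in pattern.finditer(note)]
        for name, pattern in PATTERNS.items()
    }


# Invented sample text, not taken from any real record.
note = "Echo today: LVEF 55 %. HbA1c of 7.2 noted; creatinine stable."
values = extract_values(note)
print(values)
```

Extending coverage in this style means adding trigger terms or new patterns to the dictionary, which is the kind of expansion the conclusion describes in general terms.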


Subjects
Electronic Health Records , Information Storage and Retrieval , Natural Language Processing , Algorithms , Creatinine/blood , Glycated Hemoglobin/metabolism , Humans , Sensitivity and Specificity , Stroke Volume , Vital Signs
11.
J Pediatr ; 188: 224-231.e5, 2017 09.
Article in English | MEDLINE | ID: mdl-28625502

ABSTRACT

OBJECTIVES: To compare registry and electronic health record (EHR) data mining approaches for cohort ascertainment in patients with pediatric pulmonary hypertension (PH) in an effort to overcome some of the limitations of registry enrollment alone in identifying patients with particular disease phenotypes. STUDY DESIGN: This study was a single-center retrospective analysis of EHR and registry data at Boston Children's Hospital. The local Informatics for Integrating Biology and the Bedside (i2b2) data warehouse was queried for billing codes, prescriptions, and narrative data related to pediatric PH. Computable phenotype algorithms were developed by fitting penalized logistic regression models to a physician-annotated training set. Algorithms were applied to a candidate patient cohort, and performance was evaluated using a separate set of 136 records and 179 registry patients. We compared clinical and demographic characteristics of patients identified by computable phenotype and the registry. RESULTS: The computable phenotype had an area under the receiver operating characteristics curve of 90% (95% CI, 85%-95%), a positive predictive value of 85% (95% CI, 77%-93%), and identified 413 patients (an additional 231%) with pediatric PH who were not enrolled in the registry. Patients identified by the computable phenotype were clinically distinct from registry patients, with a greater prevalence of diagnoses related to perinatal distress and left heart disease. CONCLUSIONS: Mining of EHRs using computable phenotypes identified a large cohort of patients not recruited using a classic registry. Fusion of EHR and registry data can improve cohort ascertainment for the study of rare diseases. TRIAL REGISTRATION: ClinicalTrials.gov: NCT02249923.


Subjects
Data Mining , Electronic Health Records , Hypertension, Pulmonary/diagnosis , Registries , Algorithms , Child , Humans , Hypertension, Pulmonary/epidemiology , Phenotype , Predictive Value of Tests , Retrospective Studies , Sensitivity and Specificity , United States/epidemiology
12.
Radiographics ; 36(1): 176-91, 2016.
Article in English | MEDLINE | ID: mdl-26761536

ABSTRACT

The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively "mine" these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. "Intelligent" search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications.


Subjects
Biomedical Research/organization & administration , Data Mining/methods , Electronic Health Records/organization & administration , Natural Language Processing , Radiology/organization & administration , Vocabulary, Controlled , Humans , Machine Learning , Pattern Recognition, Automated/methods
13.
J Comput Assist Tomogr ; 40(3): 387-92, 2016.
Article in English | MEDLINE | ID: mdl-26938697

ABSTRACT

OBJECTIVE: The aim of this study was to prospectively test the performance and potential for clinical integration of software that automatically calculates the right-to-left ventricular (RV/LV) diameter ratio from computed tomography pulmonary angiography images. METHODS: Using 115 computed tomography pulmonary angiography images that were positive for acute pulmonary embolism, we prospectively evaluated RV/LV ratio measurements that were obtained as follows: (1) completely manual measurement (reference standard), (2) completely automated measurement using the software, and (3 and 4) using a customized software interface that allowed 2 independent radiologists to manually adjust the automatically positioned calipers. RESULTS: Automated measurements underestimated (P < 0.001) the reference standard (1.09 [0.25] vs 1.03 [0.35]). With manual correction of the automatically positioned calipers, the mean ratio became closer to the reference standard (1.06 [0.29] by read 1 and 1.07 [0.30] by read 2), and the correlation improved (r = 0.675 to 0.872 and 0.887). The mean time required for manual adjustment (37 [20] seconds) was significantly less than the time required to perform measurements entirely manually (100 [23] seconds). CONCLUSIONS: Automated CT RV/LV diameter ratio software shows promise for integration into the clinical workflow for patients with acute pulmonary embolism.


Subjects
Computed Tomography Angiography , Heart Ventricles/diagnostic imaging , Pattern Recognition, Automated/methods , Pulmonary Artery/diagnostic imaging , Pulmonary Embolism/diagnostic imaging , Software , Algorithms , Heart Ventricles/anatomy & histology , Humans , Machine Learning , Middle Aged , Observer Variation , Organ Size , Pulmonary Embolism/pathology , Radiographic Image Enhancement , Radiographic Image Interpretation, Computer-Assisted/methods , Reproducibility of Results , Sensitivity and Specificity
14.
Radiographics ; 35(7): 1965-88, 2015.
Article in English | MEDLINE | ID: mdl-26562233

ABSTRACT

While use of advanced visualization in radiology is instrumental in diagnosis and communication with referring clinicians, there is an unmet need to render Digital Imaging and Communications in Medicine (DICOM) images as three-dimensional (3D) printed models capable of providing both tactile feedback and tangible depth information about anatomic and pathologic states. Three-dimensional printed models, already entrenched in the nonmedical sciences, are rapidly being embraced in medicine as well as in the lay community. Incorporating 3D printing from images generated and interpreted by radiologists presents particular challenges, including training, materials and equipment, and guidelines. The overall costs of a 3D printing laboratory must be balanced by the clinical benefits. It is expected that the number of 3D-printed models generated from DICOM images for planning interventions and fabricating implants will grow exponentially. Radiologists should at a minimum be familiar with 3D printing as it relates to their field, including types of 3D printing technologies and materials used to create 3D-printed anatomic models, published applications of models to date, and clinical benefits in radiology. Online supplemental material is available for this article.


Subjects
Models, Anatomic , Printing, Three-Dimensional , Radiology/methods , Audiovisual Aids , Humans , Phantoms, Imaging , Printing, Three-Dimensional/economics , Printing, Three-Dimensional/instrumentation , Printing, Three-Dimensional/trends , Prosthesis Design , Resins, Synthetic , Rheology , Software , Surgery, Computer-Assisted , Tissue Engineering/methods , Tomography, X-Ray Computed
15.
ACR Open Rheumatol ; 2024 May 15.
Article in English | MEDLINE | ID: mdl-38747148

ABSTRACT

OBJECTIVE: We aimed to examine the feasibility of applying natural language processing (NLP) to unstructured electronic health record (EHR) documents to detect the presence of financial insecurity among patients with rheumatologic disease enrolled in an integrated care management program (iCMP). METHODS: We incorporated supervised, rule-based NLP and statistical methods to identify financial insecurity among patients with rheumatic conditions enrolled in an iCMP (n = 20,395) in a multihospital EHR system. We constructed a lexicon for financial insecurity using data from available knowledge sources and then reviewed EHR notes from 538 randomly selected individuals (training cohort n = 366, validation cohort n = 172). We manually categorized records as having "definite," "possible," or "no" mention of financial insecurity. All available notes were processed using Narrative Information Linear Extraction, a rule-based version of NLP. Logistic regression, least absolute shrinkage and selection operator (LASSO), and random forest models were trained using the NLP features for financial insecurity, and their performance characteristics were compared with the reference standard. RESULTS: A total of 245,142 notes were processed from 538 individual patient records. Financial insecurity was present among 100 (27%) individuals in the training cohort and 63 (37%) in the validation cohort. The LASSO and random forest models performed identically and slightly better than logistic regression, with positive predictive values of 0.90, sensitivities of 0.29, and specificities of 0.98. CONCLUSION: The development of a context-driven lexicon used with rule-based NLP to extract data that identify financial insecurity is feasible and improved capture of the presence of financial insecurity with high accuracy. 
In the absence of a standard lexicon and construct definition for financial insecurity status, additional studies are needed to optimize the sensitivity of algorithms to categorize financial insecurity with construct validity.
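
As an illustration of how the performance characteristics reported in this abstract relate to one another, the following is a minimal Python sketch, not the authors' code. The function name `diagnostic_metrics` and the confusion-matrix counts are hypothetical: the abstract reports only the rounded metrics and the validation cohort size (172, of whom 63 had financial insecurity), so the counts below are simply one set of values chosen to be consistent with those figures.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Return sensitivity, specificity, and PPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that were flagged
        "specificity": tn / (tn + fp),  # share of non-cases correctly not flagged
        "ppv": tp / (tp + fp),          # share of flagged records that are true cases
    }


# Hypothetical counts (NOT reported in the abstract), chosen so that
# tp + fn = 63 positives and fp + tn = 109 negatives, i.e. 172 in total.
metrics = diagnostic_metrics(tp=18, fp=2, fn=45, tn=107)
print({name: round(value, 2) for name, value in metrics.items()})
```

With these assumed counts the rounded values match the reported 0.29, 0.98, and 0.90, which shows how a classifier can have a high positive predictive value while still missing most true cases.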

16.
Semin Arthritis Rheum ; 67: 152468, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38788567

ABSTRACT

OBJECTIVE: Cardiovascular disease (CVD) risk is increased in SLE and underestimated by general population prediction algorithms. We aimed to develop a novel SLE-specific prediction tool, SLECRISK, to provide a more accurate estimate of CVD risk in SLE. METHODS: We studied patients in the Brigham and Women's Hospital SLE cohort. We collected one-year baseline data including the presence of traditional CVD factors and SLE-related features at cohort enrollment. Ten-year follow-up for the first major adverse cardiovascular event (MACE; myocardial infarction (MI), stroke, or cardiac death) began at day +1 following the baseline period (index date). MACE identified by ICD-9/10 codes were adjudicated by board-certified cardiologists. Least absolute shrinkage and selection operator regression selected SLE-related variables to add to the American College of Cardiology/American Heart Association (ACC/AHA) Pooled Cohort Risk Equations 10-year risk Cox regression model. Model fit statistics and performance (sensitivity, specificity, positive/negative predictive value, c-statistic) for predicting moderate/high 10-year risk (≥7.5%) of MACE were assessed and compared to ACC/AHA, Framingham risk score (FRS), and modified FRS (mFRS). Optimism-adjusted internal validation was performed using bootstrapping. RESULTS: We included 1,243 patients with 90 MACEs (46 MIs, 36 strokes, 19 cardiac deaths) over 8946.5 person-years of follow-up. SLE variables selected for the new prediction algorithm (SLECRISK) were SLE activity (remission/mild vs. moderate/severe), disease duration (years), creatinine (mg/dL), anti-dsDNA, anti-RNP, lupus anticoagulant, anti-Ro positivity, and low C4. The sensitivity for detecting moderate/high risk (≥7.5%) of MACE using SLECRISK was 0.74 (95% CI: 0.65, 0.83), which was better than the sensitivity of the ACC/AHA model (0.38 (95% CI: 0.28, 0.48)). It also identified 3.4-fold more moderate/high-risk patients than the ACC/AHA. Patients who were moderate/high-risk according to SLECRISK but not ACC/AHA were more likely to be young women with severe SLE and few other traditional CVD risk factors. Model performance was similar across SLECRISK, FRS, and mFRS. CONCLUSION: The novel SLECRISK tool is more sensitive than the ACC/AHA for predicting moderate/high 10-year risk for MACE and may be particularly useful in predicting risk for young females with severe SLE. Future external validation studies utilizing cohorts with more severe SLE are needed.


Subjects
Cardiovascular Diseases , Systemic Lupus Erythematosus , Humans , Systemic Lupus Erythematosus/complications , Female , Male , Middle Aged , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/etiology , Adult , Risk Assessment/methods , Heart Disease Risk Factors , Risk Factors , Precision Medicine
17.
Patterns (N Y) ; 5(1): 100906, 2024 Jan 12.
Article in English | MEDLINE | ID: mdl-38264714

ABSTRACT

Electronic health record (EHR) data are increasingly used to support real-world evidence studies but are limited by the lack of precise timings of clinical events. Here, we propose a label-efficient incident phenotyping (LATTE) algorithm to accurately annotate the timing of clinical events from longitudinal EHR data. By leveraging the pre-trained semantic embeddings, LATTE selects predictive features and compresses their information into longitudinal visit embeddings through visit attention learning. LATTE models the sequential dependency between the target event and visit embeddings to derive the timings. To improve label efficiency, LATTE constructs longitudinal silver-standard labels from unlabeled patients to perform semi-supervised training. LATTE is evaluated on the onset of type 2 diabetes, heart failure, and relapses of multiple sclerosis. LATTE consistently achieves substantial improvements over benchmark methods while providing high prediction interpretability. The event timings are shown to help discover risk factors of heart failure among patients with rheumatoid arthritis.

18.
J Am Heart Assoc ; 13(9): e030387, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38686879

ABSTRACT

BACKGROUND: Coronary microvascular dysfunction as measured by myocardial flow reserve (MFR) is associated with increased cardiovascular risk in rheumatoid arthritis (RA). The objective of this study was to determine the association of reduced inflammation with MFR and other measures of cardiovascular risk. METHODS AND RESULTS: Patients with RA with active disease about to initiate a tumor necrosis factor inhibitor were enrolled (NCT02714881). All subjects underwent a cardiac perfusion positron emission tomography scan to quantify MFR at baseline before tumor necrosis factor inhibitor initiation, and after tumor necrosis factor inhibitor initiation at 24 weeks. MFR <2.5 in the absence of obstructive coronary artery disease was defined as coronary microvascular dysfunction. Blood samples at baseline and 24 weeks were measured for inflammatory markers (eg, high-sensitivity C-reactive protein [hsCRP], interleukin-1β, and high-sensitivity cardiac troponin T [hs-cTnT]). The primary outcome was mean MFR before and after tumor necrosis factor inhibitor initiation, with Δhs-cTnT as the secondary outcome. Secondary and exploratory analyses included the correlation between ΔhsCRP and other inflammatory markers with MFR and hs-cTnT. We studied 66 subjects, 82% of whom were women, with a mean RA duration of 7.4 years. The median atherosclerotic cardiovascular disease risk was 2.5%; 47% had coronary microvascular dysfunction and 23% had detectable hs-cTnT. We observed no change in mean MFR from before (2.65) to after treatment (2.64; P=0.6), nor in hs-cTnT. A correlation was observed between a reduction in hsCRP and interleukin-1β with a reduction in hs-cTnT. CONCLUSIONS: In this RA cohort with low prevalence of cardiovascular risk factors, nearly 50% of subjects had coronary microvascular dysfunction at baseline. A reduction in inflammation was not associated with improved MFR. However, a modest reduction in interleukin-1β, but in no other inflammatory pathway, was correlated with a reduction in subclinical myocardial injury. REGISTRATION: URL: https://www.clinicaltrials.gov; Unique identifier: NCT02714881.


Subjects
Rheumatoid Arthritis , Biomarkers , Coronary Circulation , Inflammation , Microcirculation , Aged , Female , Humans , Male , Middle Aged , Antirheumatic Agents/therapeutic use , Rheumatoid Arthritis/physiopathology , Rheumatoid Arthritis/complications , Rheumatoid Arthritis/blood , Biomarkers/blood , C-Reactive Protein/metabolism , Coronary Artery Disease/physiopathology , Coronary Artery Disease/blood , Coronary Artery Disease/diagnosis , Coronary Circulation/physiology , Coronary Vessels/physiopathology , Coronary Vessels/diagnostic imaging , Myocardial Fractional Flow Reserve/physiology , Heart Disease Risk Factors , Inflammation/blood , Inflammation/physiopathology , Inflammation Mediators/blood , Interleukin-1beta/blood , Myocardial Perfusion Imaging/methods , Positron-Emission Tomography , Treatment Outcome , Troponin T/blood , Tumor Necrosis Factor Inhibitors/therapeutic use
19.
JAMA Intern Med ; 183(10): 1090-1097, 2023 10 01.
Article in English | MEDLINE | ID: mdl-37603326

ABSTRACT

Importance: The US Food and Drug Administration (FDA) is building a national postmarketing surveillance system for medical devices, moving to a "total product life cycle" approach whereby more limited premarketing data are balanced with postmarketing surveillance to capture rare adverse events and long-term safety issues. Objective: To assess the methodological requirements and feasibility of postmarketing device surveillance using endovascular aneurysm repair devices (EVARs), which have been the subject of safety concerns, using clinical data from a large health care system. Design, Setting, and Participants: This retrospective cohort study included patients with electronic health record (EHR) data in the Veterans Affairs Corporate Data Warehouse. Exposure: Implantation of an AFX Endovascular AAA System (AFX) device (any of 3 iterations) or a non-AFX comparator EVAR device from January 1, 2011, to December 21, 2021. Main Outcomes and Measures: The primary outcomes were rates of type III endoleaks and all-cause mortality; and rates of these outcomes associated with AFX devices compared with non-AFX devices, assessed using Cox proportional hazards regression models and doubly robust causal modeling. Information on type III endoleaks was available only as free-text mentions in clinical notes, while all-cause mortality data could be extracted using structured data. Device-specific information required by the FDA is ascertained using unique device identifiers (UDIs), which include factors such as model numbers, catalog numbers, and manufacturer-specific product codes. The availability of UDIs in EHRs was assessed. Results: In total, 13 941 patients (mean [SD] age, 71.8 [7.4] years) received 1 of the devices of interest (AFX with Strata [AFX-S]: 718 patients [5.2%]; AFX with Duraply [AFX-D]: 404 patients [2.9%]; or AFX2: 682 patients [4.9%]), and 12 137 (87.1%) received non-AFX devices. 
The UDIs were not recorded in the EHR for any patient with an AFX device, and partial UDIs were available for 19 patients (0.1%) with a non-AFX device. This necessitated the development of advanced natural language processing tools to define the cohort of patients for analysis. The study identified a significantly higher risk of type III endoleaks at 5 years among patients receiving any of the AFX device iterations, including the most recent version, AFX2 (11.6%; 95% CI, 8.1%-15.1%) compared with that among patients with non-AFX devices (5.7%; 95% CI, 2.2%-9.2%; absolute risk difference, 5.9%; 95% CI, 2.3%-9.4%). However, there was no significantly higher all-cause mortality for any of the AFX device iterations, including for AFX2 (19.0%; 95% CI, 16.0%-22.0%) compared with non-AFX devices (18.0%; 95% CI, 15.0%-21.0%; absolute risk difference, 1.0%; 95% CI, -2.1% to 4.1%). Conclusions and Relevance: The findings of this cohort study suggest that clinical data can be used for the postmarketing device surveillance required by the FDA. The study also highlights ongoing challenges to performing larger-scale surveillance, including lack of consistent use of UDIs and insufficient relevant structured data to efficiently capture certain outcomes of interest.
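
The absolute risk differences quoted in this abstract are simple subtractions of the 5-year risks. The short Python sketch below only re-derives those point estimates from the percentages given above; the function name `absolute_risk_difference` is illustrative, and the confidence intervals are not recomputed because the abstract does not give the information needed to do so.

```python
def absolute_risk_difference(risk_exposed_pct: float, risk_comparator_pct: float) -> float:
    """Difference between two risks, in percentage points, rounded to one decimal."""
    return round(risk_exposed_pct - risk_comparator_pct, 1)


# 5-year risks as reported in the abstract: AFX2 versus non-AFX devices.
endoleak_rd = absolute_risk_difference(11.6, 5.7)    # type III endoleaks
mortality_rd = absolute_risk_difference(19.0, 18.0)  # all-cause mortality

print(f"Type III endoleak risk difference: {endoleak_rd} percentage points")
print(f"All-cause mortality risk difference: {mortality_rd} percentage points")
```

Both results agree with the differences stated in the abstract (5.9 and 1.0 percentage points).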


Subjects
Abdominal Aortic Aneurysm , Blood Vessel Prosthesis Implantation , Endovascular Procedures , Humans , Aged , Blood Vessel Prosthesis , Endoleak/etiology , Endovascular Aneurysm Repair , Abdominal Aortic Aneurysm/etiology , Abdominal Aortic Aneurysm/mortality , Abdominal Aortic Aneurysm/surgery , Retrospective Studies , Cohort Studies , Treatment Outcome , Endovascular Procedures/adverse effects , Endovascular Procedures/instrumentation
20.
Arthritis Care Res (Hoboken) ; 75(12): 2529-2536, 2023 12.
Article in English | MEDLINE | ID: mdl-37331999

ABSTRACT

OBJECTIVE: Social determinants of health (SDoH), such as poverty, are associated with increased burden and severity of rheumatic and musculoskeletal diseases. This study was undertaken to examine the prevalence and documentation of SDoH-related needs in electronic health records (EHRs) of individuals with these conditions. METHODS: We randomly selected individuals with ≥1 International Classification of Diseases, Ninth/Tenth Revision (ICD-9/10) code for a rheumatic/musculoskeletal condition enrolled in a multihospital integrated care management program that coordinates care for medically and/or psychosocially complex individuals. We assessed SDoH documentation using terms for financial needs, food insecurity, housing instability, transportation, and medication access according to EHR note review and ICD-10 SDoH billing codes (Z codes). We used multivariable logistic regression to examine associations between demographic factors (age, gender, race, ethnicity, insurance) and ≥1 (versus 0) SDoH need, expressed as the odds ratio (OR) with 95% confidence interval (95% CI). RESULTS: Among 558 individuals with rheumatic/musculoskeletal conditions, 249 (45%) had ≥1 SDoH need documented in EHR notes by social workers, care coordinators, nurses, and physicians. A total of 171 individuals (31%) had financial insecurity, 105 (19%) had transportation needs, and 94 (17%) had food insecurity; 5% had ≥1 related Z code. In the multivariable model, the odds of having ≥1 SDoH need were 2.45 times higher (95% CI 1.17-5.11) for Black versus White individuals and significantly higher for Medicaid or Medicare beneficiaries versus commercially insured individuals. CONCLUSION: Nearly half of this sample of complex care management patients with rheumatic/musculoskeletal conditions had SDoH documented within EHR notes; financial insecurity was the most prevalent. Only 5% of patients had representative billing codes, suggesting that systematic strategies to extract SDoH from notes are needed.
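
Odds ratios from logistic regression are usually reported with intervals that are symmetric on the log scale. The Python sketch below is an illustration only: it assumes a Wald-type 95% interval (the abstract does not say how the interval was computed) and uses the reported bounds 1.17 and 5.11 to back out the implied point estimate and the standard error of the log odds ratio. The function name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(lower: float, upper: float, z: float = 1.96) -> tuple[float, float]:
    """Recover the implied point estimate and log-scale SE from a ratio's CI.

    Assumes the interval is exp(log(OR) +/- z * SE), i.e. a Wald-type interval.
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    point_estimate = math.exp((log_lower + log_upper) / 2)  # geometric midpoint
    standard_error = (log_upper - log_lower) / (2 * z)
    return point_estimate, standard_error


# Bounds reported in the abstract for Black versus White individuals.
implied_or, se_log_or = log_scale_summary(1.17, 5.11)
print(f"Implied OR: {implied_or:.2f}, SE of log(OR): {se_log_or:.2f}")
```

Under that assumption, the geometric midpoint of the interval lands close to the reported odds ratio of 2.45, which is what a log-scale interval would produce.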


Subjects
Integrated Delivery of Health Care , Musculoskeletal Diseases , Rheumatic Diseases , United States/epidemiology , Humans , Aged , Social Determinants of Health , Medicare , Musculoskeletal Diseases/diagnosis , Musculoskeletal Diseases/epidemiology , Musculoskeletal Diseases/therapy , Documentation , Rheumatic Diseases/diagnosis , Rheumatic Diseases/epidemiology , Rheumatic Diseases/therapy