ABSTRACT
OBJECTIVE: Analysis of healthcare Real-World Data (RWD) provides an opportunity to observe actual patient diagnostic, treatment and outcomes events. However, researchers should understand the possible limitations of RWD. In particular, these data may be incomplete, which would affect the validity of study conclusions. MATERIALS AND METHODS: The completeness of medication RWD was investigated by analyzing the incidence of various diagnosis-medication couplets: the occurrence of a certain medication in the RWD for a patient having a certain diagnosis. Diagnosis and medication data were obtained from 61 U.S. medical data provider organizations, members of the TriNetX global research network. The number of patients having 22 diagnoses and expected medications were obtained at each institution, and the percent completion of each diagnosis-medication couplet calculated. The study hypothesis is that the degree of couplet completeness can serve as a proxy for overall completeness of medication data for a given organization. RESULTS: Five diagnosis-medication couplets were found to be reliable proxies, having at least a peak 87% observed completeness for the organizations studied: Type 1 diabetes mellitus and insulin; asthma and albuterol; congestive heart failure and diuretics; cardiovascular disease and aspirin; hypothyroidism and levothyroxine. DISCUSSION: These couplets were validated as reliable indicators by determining their status as standards of care. The degree to which patients with these five diagnoses had the specified associated medication was consistent within an organization data set. CONCLUSION: The overall degree of medication data completeness for an organization can be assessed by measuring the completeness of certain indicator diagnosis-medication couplets.
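The couplet completeness measure described in this abstract can be illustrated with a minimal sketch. The record layout, the function name `couplet_completeness`, and the toy patients are assumptions made for illustration; they are not the study's code or data.

```python
def couplet_completeness(patients, diagnosis, medication):
    """Percent of patients with `diagnosis` who also have `medication` recorded."""
    with_dx = [p for p in patients if diagnosis in p["diagnoses"]]
    if not with_dx:
        return None  # the couplet cannot be assessed without diagnosed patients
    with_both = sum(1 for p in with_dx if medication in p["medications"])
    return 100.0 * with_both / len(with_dx)


# Hypothetical toy records for a single organization (not real data).
patients = [
    {"diagnoses": {"type 1 diabetes"}, "medications": {"insulin"}},
    {"diagnoses": {"type 1 diabetes"}, "medications": set()},
    {"diagnoses": {"asthma"}, "medications": {"albuterol"}},
    {"diagnoses": {"type 1 diabetes", "asthma"},
     "medications": {"insulin", "albuterol"}},
]

t1d_insulin = couplet_completeness(patients, "type 1 diabetes", "insulin")
asthma_albuterol = couplet_completeness(patients, "asthma", "albuterol")
print(f"T1D/insulin: {t1d_insulin:.1f}%  asthma/albuterol: {asthma_albuterol:.1f}%")
```

A low percentage for a couplet that is a standard of care would suggest missing medication data rather than unusual prescribing, which is the reasoning behind using such couplets as a proxy.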
Subject(s)
Heart Failure; Insulin; Humans
ABSTRACT
Objective: Clinical research networks facilitate collaborative research, but data sharing remains a common barrier. Materials and Methods: The TriNetX platform provides real-time access to electronic health record (EHR)-derived, anonymized data from 173 healthcare organizations (HCOs) and tools for queries and analysis. In 2022, 4 pediatric HCOs worked with TriNetX leadership to found the Pediatric Collaboratory Network (PCN), facilitated via a multi-institutional data-use agreement (DUA). The DUA enables collaborative study design and execution, with institutional review board-approved transfer of complete datasets for further analyses on a per-protocol basis. Results and Discussion: Of the 41.2 million children with TriNetX records, the PCN represents nearly 10%. The PCN assisted several early-career investigators to bring study concepts from conception to an international scientific meeting presentation and journal submission. Conclusion: The PCN facilitates EHR vendor-agnostic multicenter pediatric research on the global TriNetX platform. Continued growth of the PCN will advance knowledge in pediatric health.
ABSTRACT
BACKGROUND: A wealth of clinically relevant information is only obtainable within unstructured clinical narratives, leading to great interest in clinical natural language processing (NLP). While a multitude of approaches to NLP exist, current algorithm development approaches have limitations that can slow the development process. These limitations are exacerbated when the task is emergent, as is the case currently for NLP extraction of signs and symptoms of COVID-19 and postacute sequelae of SARS-CoV-2 infection (PASC). OBJECTIVE: This study aims to highlight the current limitations of existing NLP algorithm development approaches that are exacerbated by NLP tasks surrounding emergent clinical concepts and to illustrate our approach to addressing these issues through the use case of developing an NLP system for the signs and symptoms of COVID-19 and PASC. METHODS: We used 2 preexisting studies on PASC as a baseline to determine a set of concepts that should be extracted by NLP. This concept list was then used in conjunction with the Unified Medical Language System to autonomously generate an expanded lexicon to weakly annotate a training set, which was then reviewed by a human expert to generate a fine-tuned NLP algorithm. The annotations from a fully human-annotated test set were then compared with NLP results from the fine-tuned algorithm. The NLP algorithm was then deployed to 10 additional sites that were also running our NLP infrastructure. Of these 10 sites, 5 were used to conduct a federated evaluation of the NLP algorithm. RESULTS: An NLP algorithm consisting of 12,234 unique normalized text strings corresponding to 2366 unique concepts was developed to extract COVID-19 or PASC signs and symptoms. An unweighted mean dictionary coverage of 77.8% was found for the 5 sites. CONCLUSIONS: The evolutionary and time-critical nature of the PASC NLP task significantly complicates existing approaches to NLP algorithm development. 
In this work, we present a hybrid approach using the Open Health Natural Language Processing Toolkit aimed at addressing these needs with a dictionary-based weak labeling step that minimizes the need for additional expert annotation while still preserving the fine-tuning capabilities of expert involvement.
ABSTRACT
OBJECTIVES: Analysis of health care real-world data (RWD) provides an opportunity to observe actual patient diagnostic, treatment, and outcome events. However, researchers should understand the possible limitations of RWD. In particular, the dates in these data may be shifted from their actual values, which might affect the validity of study conclusions. METHODS: A methodology for detecting the presence of shifted dates in RWD was developed by considering various approaches to confirm the expected occurrences of medical events, including unique temporal occurrences as well as recurring seasonal or weekday patterns in diagnoses or procedures. Diagnosis and procedure data were obtained from 71 U.S. health care data provider organizations (HCOs), members of the TriNetX global research network. Synthetic data were generated for various degrees of date shifting corresponding to the diagnoses and procedures studied, yielding the resulting patterns when various degrees of shifting (including no shift) were applied. These patterns were compared with those produced for each HCO to predict the presence and degree of date shifting. These predictions were compared with statements of date shifting by the originating HCOs to determine the predictive accuracy of the methods studied. RESULTS: Twenty-eight of the 71 HCOs analyzed were predicted by the methodology and confirmed by their data providers to have shifted data. Likewise, 39 were predicted and confirmed to not have shifted data. With four HCOs, agreement between predicted and stated date shifting status was not obtained. The occurrence of routine medical exams, which take place only on weekdays, was most predictive (0.92 correlation coefficient) of the presence or absence of date shifting for these U.S. HCOs. CONCLUSION: The presence of date shifting for U.S. HCOs may be reliably detected by assessing whether routine exams, which should always occur on weekdays, do so in the data.
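The weekday signal described in this abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the study's methodology: the function name `weekday_fraction`, the toy exam dates, and the fixed two-day shift are all hypothetical.

```python
from datetime import date, timedelta


def weekday_fraction(dates):
    """Fraction of dates that fall on Monday through Friday."""
    if not dates:
        raise ValueError("at least one date is required")
    return sum(1 for d in dates if d.weekday() < 5) / len(dates)


# Hypothetical routine-exam dates: every weekday for four weeks,
# starting on Monday 2021-03-01.
start = date(2021, 3, 1)
exam_dates = [
    start + timedelta(days=i)
    for i in range(28)
    if (start + timedelta(days=i)).weekday() < 5
]

# Simulate a data set whose dates were all shifted forward by two days.
shifted_dates = [d + timedelta(days=2) for d in exam_dates]

unshifted_fraction = weekday_fraction(exam_dates)
shifted_fraction = weekday_fraction(shifted_dates)
print(unshifted_fraction, shifted_fraction)
```

Unshifted routine exams sit entirely on weekdays, while shifting pushes some of them onto weekends; with random per-patient shifts the fraction would be expected to drift toward 5/7.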
Subject(s)
Data Accuracy; Delivery of Health Care; Electronic Health Records; Humans; Health Facilities; Health Personnel
ABSTRACT
BACKGROUND: Pancreatic Duct Adenocarcinoma (PDAC) screening can enable early-stage disease detection and long-term survival. Current guidelines use inherited predisposition, with about 10% of PDAC cases eligible for screening. Using Electronic Health Record (EHR) data from a multi-institutional federated network, we developed and validated a PDAC RISk Model (Prism) for the general US population to extend early PDAC detection. METHODS: Neural Network (PrismNN) and Logistic Regression (PrismLR) were developed using EHR data from 55 US Health Care Organisations (HCOs) to predict PDAC risk 6-18 months before diagnosis for patients 40 years or older. Model performance was assessed using Area Under the Curve (AUC) and calibration plots. Models were internal-externally validated by geographic location, race, and time. Simulated model deployment evaluated Standardised Incidence Ratio (SIR) and other metrics. FINDINGS: With 35,387 PDAC cases, 1,500,081 controls, and 87 features per patient, PrismNN obtained a test AUC of 0.826 (95% CI: 0.824-0.828) (PrismLR: 0.800 (95% CI: 0.798-0.802)). PrismNN's average internal-external validation AUCs were 0.740 for locations, 0.828 for races, and 0.789 (95% CI: 0.762-0.816) for time. At SIR = 5.10 (exceeding the current screening inclusion threshold) in simulated model deployment, PrismNN sensitivity was 35.9% (specificity 95.3%). INTERPRETATION: Prism models demonstrated good accuracy and generalizability across diverse populations. PrismNN could find 3.5 times more cases at comparable risk than current screening guidelines. The small number of features provided a basis for model interpretation. Integration with the federated network provided data from a large, heterogeneous patient population and a pathway to future clinical deployment. FUNDING: Prevent Cancer Foundation, TriNetX, Boeing, DARPA, NSF, and Aarno Labs.
Subject(s)
Carcinoma, Pancreatic Ductal; Pancreatic Neoplasms; Humans; Carcinoma, Pancreatic Ductal/pathology; Logistic Models; Pancreatic Neoplasms/diagnosis; Pancreatic Neoplasms/epidemiology; Pancreatic Neoplasms/etiology; Retrospective Studies; Multicenter Studies as Topic
ABSTRACT
Objective: This article describes a scalable, performant, sustainable global network of electronic health record data for biomedical and clinical research. Materials and Methods: TriNetX has created a technology platform characterized by a conservative security and governance model that facilitates collaboration and cooperation between industry participants, such as pharmaceutical companies and contract research organizations, and academic and community-based healthcare organizations (HCOs). HCOs participate on the network in return for access to a suite of analytics capabilities, large networks of de-identified data, and more sponsored trial opportunities. Industry participants provide the financial resources to support, expand, and improve the technology platform in return for access to network data, which provides increased efficiencies in clinical trial design and deployment. Results: TriNetX is a growing global network, expanding from 55 HCOs and 7 countries in 2017 to over 220 HCOs and 30 countries in 2022. Over 19 000 sponsored clinical trial opportunities have been initiated through the TriNetX network. There have been over 350 peer-reviewed scientific publications based on the network's data. Conclusions: The continued growth of the TriNetX network and its yield of clinical trial collaborations and published studies indicates that this academic-industry structure is a safe, proven, sustainable path for building and maintaining research-centric data networks.
ABSTRACT
PURPOSE: To explore medications and their administration patterns in real-world patients with breast cancer. METHODS: A retrospective study was performed using TriNetX, a federated network of deidentified, Health Insurance Portability and Accountability Act-compliant data from 21 health care organizations across North America. Patients diagnosed with breast cancer between January 1, 2013, and May 31, 2022, were included. We investigated a rule-based and unsupervised learning algorithm to extract medications and their administration patterns. To group similar administration patterns, we used three features in k-means clustering: total number of administrations, median number of days between administrations, and standard deviation of the days between administrations. We explored the first three lines of therapy for patients classified into six groups on the basis of their stage at diagnosis (early as stages I-III v late as stage IV) and the sensitivity of the tumor's receptors to targeted therapies: hormone receptor-positive/human epidermal growth factor 2-negative (HR+/ERBB2-), ERBB2-positive (ERBB2+/HR±), or triple-negative (TN; HR-/ERBB2-). To add credence to the derived regimens, we compared them to the National Comprehensive Cancer Network (NCCN): Breast Cancer (version 2.2023) recommendations. RESULTS: In early-stage HR+/ERBB2- and TN groups, the most common regimens were (1) cyclophosphamide and docetaxel, administered once every 3 weeks for three to six cycles and (2) cyclophosphamide and doxorubicin, administered once every 2 weeks for four cycles, followed by paclitaxel administered once every week for 12 cycles. In the early-stage ERBB2+/HR± group, most patients were administered carboplatin and docetaxel with or without pertuzumab and with trastuzumab (for six or more cycles). 
Medications most commonly administered in our data set (7,798 patients) agreed with recommendations from the NCCN in terms of medications (regimens), number of administrations (cycles), and days between administrations (cycle length). CONCLUSION: Although there is a general agreement with the NCCN Guidelines, real-world medication data exhibit variability in the medications and their administration patterns.
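The three clustering features named in this abstract (total number of administrations, median days between administrations, and standard deviation of the days between administrations) can be computed as in the sketch below. The function name, the toy dates, and the choice of population standard deviation are assumptions; the abstract does not say which standard deviation was used.

```python
from datetime import date, timedelta
from statistics import median, pstdev


def administration_features(admin_dates):
    """Return (count, median gap in days, std. dev. of gaps in days)."""
    ordered = sorted(admin_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        # A single administration has no gaps; report zeros for both gap features.
        return (len(ordered), 0.0, 0.0)
    # Population standard deviation is an assumption made for this sketch.
    return (len(ordered), float(median(gaps)), float(pstdev(gaps)))


# Hypothetical patient: one drug every 3 weeks for 4 cycles,
# another drug weekly for 12 cycles.
every_three_weeks = [date(2021, 1, 4) + timedelta(weeks=3 * i) for i in range(4)]
weekly = [date(2021, 4, 5) + timedelta(weeks=i) for i in range(12)]

q3w_features = administration_features(every_three_weeks)
weekly_features = administration_features(weekly)
print(q3w_features, weekly_features)
```

Feature tuples like these, one per patient and medication, are the kind of input a k-means implementation would group into administration patterns.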
Subject(s)
Breast Neoplasms; Humans; Female; Breast Neoplasms/drug therapy; Breast Neoplasms/etiology; Docetaxel/therapeutic use; Retrospective Studies; Antineoplastic Combined Chemotherapy Protocols/therapeutic use; Cyclophosphamide
ABSTRACT
PURPOSE: This is an update to a previously published report characterizing the impact that efforts to control the COVID-19 pandemic have had on the normal course of cancer-related encounters. METHODS: Data were analyzed from 22 US health care organizations (members of the TriNetX global network) having relevant, up-to-date encounter data. Although the original study compared encounter data pre-COVID-19 (January-April 2019) with the corresponding months in 2020, this update considers data through April 2021. As before, cohorts were generated for all neoplasm patients (malignant, benign, in situ, and of unspecified behavior), all new incidence neoplasm patients, exclusively malignant neoplasm patients, and new incidence malignant neoplasm patients. Data on the initial cancer stage were available for calendar year 2020 from about one third of the study's organizations. RESULTS: Although COVID-19 cases fluctuated through 2021, newly diagnosed cancers closely paralleled the prepandemic base year 2019. Similarly, screening for breast, colorectal, and cervical cancers quickly recovered beginning in May 2020 to prepandemic numbers. Preliminary data for the initial cancer stage showed no significant difference (P > .10) in distribution for breast or colon cancers between 2019 and 2020. CONCLUSION: Although the number of COVID-19 cases fluctuated, the steep declines observed during March and April 2020 in screening for breast and colon cancer and patients with newly diagnosed cancer did not continue through the rest of 2020 and into April 2021. Screening and new incidence cancer numbers quickly rose compared with prepandemic levels. The concern that more patients with advanced-stage cancer would be seen in the months following the drastic dips of March-April 2020 was not realized as the major disruption to normal cancer care was limited to these 2 months.
Subject(s)
COVID-19; Neoplasms; COVID-19/epidemiology; Humans; Incidence; Neoplasms/diagnosis; Neoplasms/epidemiology; Neoplasms/therapy; Pandemics; SARS-CoV-2
ABSTRACT
The availability of next-generation sequencing (NGS) technologies and their continually declining costs have resulted in the accumulation of large genomic data sets. NGS results have traditionally been delivered in PDF format, and in some cases, structured data, e.g., XML or JSON formats, are also made available, but there is a lack of uniformity around the profiling of external vendor testing platforms. Atrium Health Wake Forest Baptist and TriNetX have harmonized and mapped genomic data to FHIR Genomic standards and imported it into the TriNetX database through a data pipeline. This process is translatable to other sequencing platforms and to other institutions. The addition of genotypic data to the reservoir of phenotypic data in the TriNetX database will promote (i) enhanced industry trial recruitment, (ii) comprehensive intra-institutional genomic benchmarking/quality improvement, and eventually (iii) sweeping inter-institutional genomic research and treatment paradigm shifts.
ABSTRACT
Including social determinants of health (SDoH) data in health outcomes research is essential for studying the sources of healthcare disparities and developing strategies to mitigate stressors. In this report, we describe a pragmatic design and approach to explore the encoding needs for transmitting SDoH screening tool responses from a large safety-net hospital into the National Covid Cohort Collaborative (N3C) OMOP dataset. We provide a stepwise account of designing data mapping and ingestion for patient-level SDoH and summarize the results of screening. Our approach demonstrates that sharing of these important data - typically stored as non-standard, EHR vendor specific codes - is feasible. As SDoH screening gains broader use nationally, the approach described in this paper could be used for other screening instruments and improve the interoperability of these important data.
ABSTRACT
Recent findings have shown that the continued expansion of the scope and scale of data collected in electronic health records is making the protection of personally identifiable information (PII) more challenging and, if not addressed, may inadvertently put our institutions and patients at risk. As clinical terminologies expand to include new terms that may capture PII (e.g., Patient First Name, Patient Phone Number), institutions may start using them in clinical data capture (and in some cases, they already have). Once in use, PII-containing values associated with these terms may find their way into laboratory or observation data tables via extract-transform-load jobs intended to process structured data, putting institutions at risk of unintended disclosure. Here we aim to inform the informatics community of these findings, as well as put out a call to action for remediation by the community.
ABSTRACT
OBJECTIVE: In response to COVID-19, the informatics community united to aggregate as much clinical data as possible to characterize this new disease and reduce its impact through collaborative analytics. The National COVID Cohort Collaborative (N3C) is now the largest publicly available HIPAA limited dataset in US history with over 6.4 million patients and is a testament to a partnership of over 100 organizations. MATERIALS AND METHODS: We developed a pipeline for ingesting, harmonizing, and centralizing data from 56 contributing data partners using 4 federated Common Data Models. N3C data quality (DQ) review involves both automated and manual procedures. In the process, several DQ heuristics were discovered in our centralized context, both within the pipeline and during downstream project-based analysis. Feedback to the sites led to many local and centralized DQ improvements. RESULTS: Beyond well-recognized DQ findings, we discovered 15 heuristics relating to source Common Data Model conformance, demographics, COVID tests, conditions, encounters, measurements, observations, coding completeness, and fitness for use. Of 56 sites, 37 sites (66%) demonstrated issues through these heuristics. These 37 sites demonstrated improvement after receiving feedback. DISCUSSION: We encountered site-to-site differences in DQ which would have been challenging to discover using federated checks alone. We have demonstrated that centralized DQ benchmarking reveals unique opportunities for DQ improvement that will support improved research analytics locally and in aggregate. CONCLUSION: By combining rapid, continual assessment of DQ with a large volume of multisite data, it is possible to support more nuanced scientific questions with the scale and rigor that they require.
Subject(s)
COVID-19; Cohort Studies; Data Accuracy; Health Insurance Portability and Accountability Act; Humans; United States
ABSTRACT
Importance: The National COVID Cohort Collaborative (N3C) is a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative COVID-19 cohort to date. This multicenter data set can support robust evidence-based development of predictive and diagnostic tools and inform clinical care and policy. Objectives: To evaluate COVID-19 severity and risk factors over time and assess the use of machine learning to predict clinical severity. Design, Setting, and Participants: In a retrospective cohort study of 1 926 526 US adults with SARS-CoV-2 infection (polymerase chain reaction >99% or antigen <1%) and adult patients without SARS-CoV-2 infection who served as controls from 34 medical centers nationwide between January 1, 2020, and December 7, 2020, patients were stratified using a World Health Organization COVID-19 severity scale and demographic characteristics. Differences between groups over time were evaluated using multivariable logistic regression. Random forest and XGBoost models were used to predict severe clinical course (death, discharge to hospice, invasive ventilatory support, or extracorporeal membrane oxygenation). Main Outcomes and Measures: Patient demographic characteristics and COVID-19 severity using the World Health Organization COVID-19 severity scale and differences between groups over time using multivariable logistic regression. Results: The cohort included 174 568 adults who tested positive for SARS-CoV-2 (mean [SD] age, 44.4 [18.6] years; 53.2% female) and 1 133 848 adult controls who tested negative for SARS-CoV-2 (mean [SD] age, 49.5 [19.2] years; 57.1% female). Of the 174 568 adults with SARS-CoV-2, 32 472 (18.6%) were hospitalized, and 6565 (20.2%) of those had a severe clinical course (invasive ventilatory support, extracorporeal membrane oxygenation, death, or discharge to hospice).
Of the hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March to April 2020 to 8.6% in September to October 2020 (P = .002 for monthly trend). Using 64 inputs available on the first hospital day, this study predicted a severe clinical course using random forest and XGBoost models (area under the receiver operating curve = 0.87 for both) that were stable over time. The factor most strongly associated with clinical severity was pH; this result was consistent across machine learning methods. In a separate multivariable logistic regression model built for inference, age (odds ratio [OR], 1.03 per year; 95% CI, 1.03-1.04), male sex (OR, 1.60; 95% CI, 1.51-1.69), liver disease (OR, 1.20; 95% CI, 1.08-1.34), dementia (OR, 1.26; 95% CI, 1.13-1.41), African American (OR, 1.12; 95% CI, 1.05-1.20) and Asian (OR, 1.33; 95% CI, 1.12-1.57) race, and obesity (OR, 1.36; 95% CI, 1.27-1.46) were independently associated with higher clinical severity. Conclusions and Relevance: This cohort study found that COVID-19 mortality decreased over time during 2020 and that patient demographic characteristics and comorbidities were associated with higher clinical severity. The machine learning models accurately predicted ultimate clinical severity using commonly collected clinical data from the first 24 hours of a hospital admission.
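As a reading aid, the proportions reported in this abstract can be reproduced from its counts, and the per-year odds ratio for age can be translated to a longer interval. This is a sketch: the variable names are illustrative, and the ten-year figure assumes the log-linear age effect implied by a logistic regression with a linear age term, applied to the rounded odds ratio of 1.03.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts as reported in the abstract.
positive_adults = 174_568
hospitalized = 32_472
severe_course = 6_565

hospitalized_pct = percent(hospitalized, positive_adults)
severe_pct = percent(severe_course, hospitalized)

# Reported odds ratio of 1.03 per year of age; compounding over ten years
# is an approximation because the published value is rounded.
age_or_per_year = 1.03
age_or_per_decade = age_or_per_year ** 10

print(f"hospitalized: {hospitalized_pct:.1f}%")
print(f"severe among hospitalized: {severe_pct:.1f}%")
print(f"approximate OR per 10 years of age: {age_or_per_decade:.2f}")
```

Under those assumptions, a small per-year odds ratio corresponds to roughly a one-third increase in the odds of higher severity for each additional decade of age.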
Subject(s)
COVID-19; Databases, Factual; Forecasting; Hospitalization; Models, Biological; Severity of Illness Index; Adult; Aged; Aged, 80 and over; COVID-19/ethnology; COVID-19/mortality; Comorbidity; Ethnicity; Extracorporeal Membrane Oxygenation; Female; Humans; Hydrogen-Ion Concentration; Male; Middle Aged; Pandemics; Respiration, Artificial; Retrospective Studies; Risk Factors; SARS-CoV-2; United States; Young Adult
ABSTRACT
Background: The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative U.S. cohort of COVID-19 cases and controls to date. This multi-center dataset supports robust evidence-based development of predictive and diagnostic tools and informs critical care and policy. Methods and Findings: In a retrospective cohort study of 1,926,526 patients from 34 medical centers nationwide, we stratified patients using a World Health Organization COVID-19 severity scale and demographics; we then evaluated differences between groups over time using multivariable logistic regression. We established vital signs and laboratory values among COVID-19 patients with different severities, providing the foundation for predictive analytics. The cohort included 174,568 adults with severe acute respiratory syndrome associated with SARS-CoV-2 (PCR >99% or antigen <1%) as well as 1,133,848 adult patients that served as lab-negative controls. Among 32,472 hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March/April 2020 to 8.6% in September/October 2020 (p = 0.002 monthly trend). In a multivariable logistic regression model, age, male sex, liver disease, dementia, African-American and Asian race, and obesity were independently associated with higher clinical severity. To demonstrate the utility of the N3C cohort for analytics, we used machine learning (ML) to predict clinical severity and risk factors over time. 
Using 64 inputs available on the first hospital day, we predicted a severe clinical course (death, discharge to hospice, invasive ventilation, or extracorporeal membrane oxygenation) using random forest and XGBoost models (AUROC 0.86 and 0.87 respectively) that were stable over time. The most powerful predictors in these models are patient age and widely available vital sign and laboratory values. The established expected trajectories for many vital signs and laboratory values among patients with different clinical severities validates observations from smaller studies, and provides comprehensive insight into COVID-19 characterization in U.S. patients. Conclusions: This is the first description of an ongoing longitudinal observational study of patients seen in diverse clinical settings and geographical regions and is the largest COVID-19 cohort in the United States. Such data are the foundation for ML models that can be the basis for generalizable clinical decision support tools. The N3C Data Enclave is unique in providing transparent, reproducible, easily shared, versioned, and fully auditable data and analytic provenance for national-scale patient-level EHR data. The N3C is built for intensive ML analyses by academic, industry, and citizen scientists internationally. Many observational correlations can inform trial designs and care guidelines for this new disease.
ABSTRACT
OBJECTIVE: Coronavirus disease 2019 (COVID-19) poses societal challenges that require expeditious data and knowledge sharing. Though organizational clinical data are abundant, these are largely inaccessible to outside researchers. Statistical, machine learning, and causal analyses are most successful with large-scale data beyond what is available in any given organization. Here, we introduce the National COVID Cohort Collaborative (N3C), an open science community focused on analyzing patient-level data from many centers. MATERIALS AND METHODS: The Clinical and Translational Science Award Program and scientific community created N3C to overcome technical, regulatory, policy, and governance barriers to sharing and harmonizing individual-level clinical data. We developed solutions to extract, aggregate, and harmonize data across organizations and data models, and created a secure data enclave to enable efficient, transparent, and reproducible collaborative analytics. RESULTS: Organized in inclusive workstreams, we created legal agreements and governance for organizations and researchers; data extraction scripts to identify and ingest positive, negative, and possible COVID-19 cases; a data quality assurance and harmonization pipeline to create a single harmonized dataset; population of the secure data enclave with data, machine learning, and statistical analytics tools; dissemination mechanisms; and a synthetic data pilot to democratize data access. CONCLUSIONS: The N3C has demonstrated that a multisite collaborative learning health network can overcome barriers to rapidly build a scalable infrastructure incorporating multiorganizational clinical data for COVID-19 analytics. We expect this effort to save lives by enabling rapid collaboration among clinicians, researchers, and data scientists to identify treatments and specialized care and thereby reduce the immediate and long-term impacts of COVID-19.
Subject(s)
COVID-19; Data Science/organization & administration; Information Dissemination; Intersectoral Collaboration; Computer Security; Data Analysis; Ethics Committees, Research; Government Regulation; Humans; National Institutes of Health (U.S.); United States
ABSTRACT
PURPOSE: While there are studies under way to characterize the direct effects of the COVID-19 pandemic on the care of patients with cancer, there have been few quantitative reports of the impact that efforts to control the pandemic have had on the normal course of cancer diagnosis and treatment encounters. METHODS: We used the TriNetX platform to analyze 20 health care institutions that have relevant, up-to-date encounter data. Using this COVID and Cancer Research Network (CCRN), we compared cancer cohorts identified by querying encounter data pre-COVID (January 2019-April 2019) and current (January 2020-April 2020). Cohorts were generated for all patients with neoplasms (malignant, benign, in situ, and of unspecified behavior), with new incidence neoplasms (first encounter), with exclusively malignant neoplasms, and with new incidence malignant neoplasms. Data from a UK institution were similarly analyzed. Additional analyses were performed on patients with selected cancers, as well as on those having had cancer screening. RESULTS: Clear trends were identified that suggest a significant decline in all current cohorts explored, with April 2020 displaying the largest decrease in the number of patients with cancer having encounters. Of the cancer types analyzed, lung, colorectal, and hematologic cancer cohorts exhibited smaller decreases in size in April 2020 versus 2019 (-39.1%, -39.9%, -39.1%, respectively) compared with cohort size decreases for breast cancer, prostate cancer, and melanoma (-47.7%, -49.1%, -51.8%, respectively). In addition, cancer screenings declined drastically, with breast cancer screenings dropping by -89.2% and colorectal cancer screenings by -84.5%. CONCLUSION: Trends seen in the CCRN clearly suggest a significant decrease in all cancer-related patient encounters as a result of the pandemic. 
The steep decreases in cancer screening and patients with a new incidence of cancer suggest the possibility of a future increase in patients with later-stage cancer being seen initially as well as an increased demand for cancer screening procedures as delayed tests are rescheduled.
Subject(s)
Coronavirus Infections/epidemiology; Early Detection of Cancer/trends; Neoplasms/classification; Neoplasms/epidemiology; Pneumonia, Viral/epidemiology; COVID-19; Cohort Studies; Comorbidity; Female; Humans; Incidence; Male; Pandemics; United Kingdom/epidemiology; United States/epidemiology
ABSTRACT
Defining patient-to-patient similarity is essential for the development of precision medicine in clinical care and research. Conceptually, the identification of similar patient cohorts appears straightforward; however, universally accepted definitions remain elusive. Simultaneously, an explosion of vendors and published algorithms has emerged, all providing varied levels of functionality in identifying patient similarity categories. To provide clarity and a common framework for patient similarity, a workshop was convened at the American Medical Informatics Association 2019 Annual Meeting. This workshop included invited discussants from academia, the biotechnology industry, the FDA, and private practice oncology groups. Drawing from a broad range of backgrounds, workshop participants were able to coalesce around 4 major patient similarity classes: (1) feature, (2) outcome, (3) exposure, and (4) mixed-class. This perspective examines these 4 classes more critically and offers the medical informatics community a means of communicating their work on this important topic.
Subject(s)
Precision Medicine , Female , Humans , Male , Medical Informatics , Terminology as Topic
ABSTRACT
BACKGROUND AND OBJECTIVE: Clinical guidelines discourage antibiotic prescribing for many acute respiratory infections (ARIs), especially for non-antibiotic appropriate diagnoses. Electronic health record (EHR)-based clinical decision support has the potential to improve antibiotic prescribing for ARIs. METHODS: We randomly assigned 27 primary care clinics to receive an EHR-integrated, documentation-based clinical decision support system for the care of patients with ARIs - the ARI Smart Form - or to offer usual care. The primary outcome was the antibiotic prescribing rate for ARIs in an intent-to-intervene analysis based on administrative diagnoses. RESULTS: During the intervention period, patients made 21 961 ARI visits to study clinics. Intervention clinicians used the ARI Smart Form in 6% of 11 954 ARI visits. The antibiotic prescribing rate in the intervention clinics was 39% versus 43% in the control clinics (odds ratio (OR), 0.8; 95% confidence interval (CI), 0.6-1.2, adjusted for clustering by clinic). For antibiotic appropriate ARI diagnoses, the antibiotic prescribing rate was 54% in the intervention clinics and 59% in the control clinics (OR, 0.8; 95% CI, 0.5-1.3). For non-antibiotic appropriate diagnoses, the antibiotic prescribing rate was 32% in the intervention clinics and 34% in the control clinics (OR, 0.9; 95% CI, 0.6-1.4). When the ARI Smart Form was used, based on diagnoses entered on the form, the antibiotic prescribing rate was 49% overall, 88% for antibiotic appropriate diagnoses and 27% for non-antibiotic appropriate diagnoses. In an as-used analysis, the ARI Smart Form was associated with a lower antibiotic prescribing rate for acute bronchitis (OR, 0.5; 95% CI, 0.3-0.8). CONCLUSIONS: The ARI Smart Form neither reduced overall antibiotic prescribing nor significantly improved the appropriateness of antibiotic prescribing for ARIs, but it was not widely used. When used, the ARI Smart Form may improve diagnostic accuracy compared to administrative diagnoses and may reduce antibiotic prescribing for certain diagnoses.
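As a rough illustration of how the reported prescribing rates relate to the reported odds ratios, the sketch below computes crude (unadjusted) odds ratios from the intervention and control percentages given in the abstract. The function names are hypothetical, and the calculation ignores the clustering by clinic that the study adjusted for, so it is only an approximation of the published estimates and cannot reproduce the confidence intervals.

```python
def odds(p: float) -> float:
    """Convert a proportion (0 < p < 1) to odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_intervention: float, p_control: float) -> float:
    """Crude odds ratio of intervention versus control (no clustering adjustment)."""
    return odds(p_intervention) / odds(p_control)


# Prescribing rates reported in the abstract: (intervention, control).
reported_rates = {
    "all ARI visits": (0.39, 0.43),
    "antibiotic appropriate": (0.54, 0.59),
    "non-antibiotic appropriate": (0.32, 0.34),
}

# Crude odds ratio for each diagnosis group.
crude = {label: odds_ratio(pi, pc) for label, (pi, pc) in reported_rates.items()}

for label, value in crude.items():
    print(f"{label}: crude OR = {value:.2f}")
```

To one decimal place, the crude values agree with the adjusted odds ratios in the abstract (0.8, 0.8, and 0.9), which is consistent with the small differences in prescribing rates between study arms.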
Subject(s)
Anti-Bacterial Agents/therapeutic use , Decision Support Systems, Clinical , Medical Records Systems, Computerized , Respiratory Tract Infections/drug therapy , Acute Disease , Adult , Cluster Analysis , Female , Humans , Male , Middle Aged , United States
ABSTRACT
Clinical decision support systems (CDSS) integrated within Electronic Medical Records (EMR) hold the promise of improving healthcare quality. To date, the effectiveness of CDSS has been less than expected, especially concerning the ambulatory management of chronic diseases. This is due, in part, to the fact that clinicians do not use CDSS fully. Barriers to clinicians' use of CDSS have included lack of integration into workflow, software usability issues, and poor relevance of the content to the patient at hand. At Partners HealthCare, we are developing "Smart Forms" to facilitate documentation-based clinical decision support. Rather than being interruptive in nature, the Smart Form enables writing a multi-problem visit note while capturing coded information and providing sophisticated decision support in the form of tailored recommendations for care. The current version of the Smart Form is designed around two chronic diseases: coronary artery disease and diabetes mellitus. The Smart Form has the potential to improve the care of patients with both acute and chronic conditions.
Subject(s)
Decision Support Systems, Clinical , Disease Management , Medical Records Systems, Computerized , User-Computer Interface , Documentation , Humans , Systems Integration
ABSTRACT
Clinical trials, whether industry sponsored, cooperative group sponsored, or investigator initiated, have an unacceptable rate of failure as a result of the inability to recruit sufficient numbers of patients. Even those trials that are completed often require time-consuming protocol amendments to achieve accrual goals. These inefficiencies in clinical trial research result in increasing costs and prolong the time needed to bring improved treatments to cancer clinical practice. TriNetX has developed a clinical research collaboration platform, deployed by a federated network of health care organizations (HCOs), pharmaceutical firms (Pharma), and contract research organizations (CROs), to enable data-driven clinical research study design to reduce accrual failure and protocol amendment. Currently, the network extends to 55 HCOs and covers 84 million patients, mostly within the United States, but with a growing international presence. (Many of the HCOs in the United States are Clinical and Translational Science Awardees and/or National Cancer Institute-designated cancer centers.) The TriNetX business model includes Pharma and the CROs as sponsors whose subscriptions financially support the network, including the software and hardware costs of the HCOs. Furthermore, as each HCO network member has its data harmonized with the TriNetX model upon joining, data sharing among them does not require any technical processes to establish connectivity. To date, on the basis of the data on the network, HCOs have been presented approximately 757 studies by Pharma and CROs, and four data-sharing subnetworks have been formed among member HCOs.