Results 1 - 20 of 59
2.
J Clin Transl Sci ; 8(1): e17, 2024.
Article in English | MEDLINE | ID: mdl-38384919

ABSTRACT

Introduction: The focus on social determinants of health (SDOH) and their impact on health outcomes is evident in U.S. federal actions by the Centers for Medicare & Medicaid Services and the Office of the National Coordinator for Health Information Technology. The disproportionate impact of COVID-19 on minorities and communities of color heightened awareness of health inequities and the need for more robust SDOH data collection. Four Clinical and Translational Science Award (CTSA) hubs comprising the Texas Regional CTSA Consortium (TRCC) undertook an inventory to understand which contextual-level SDOH datasets are offered centrally and which individual-level SDOH are collected in structured fields, potentially for all patients, in each electronic health record (EHR) system. Methods: Hub teams identified American Community Survey (ACS) datasets available via their enterprise data warehouses for research. Each hub's EHR analyst team identified structured fields available in their EHR for SDOH using a collection instrument based on a 2021 PCORnet survey and conducted an SDOH field completion rate analysis. Results: One hub offered ACS datasets centrally. All hubs collected eleven SDOH elements in structured EHR fields; two also collected Homeless and Veteran statuses. Completeness at the four hubs was 80%-98% for Ethnicity and Race, and < 10% for Education, Financial Strain, Food Insecurity, Housing Security/Stability, Interpersonal Violence, Social Isolation, Stress, and Transportation. Conclusion: Completeness levels for SDOH data in the EHRs at TRCC hubs varied and were low for most measures. Multiple system-level discussions may be necessary to increase standardized, EHR-based SDOH data collection and harmonization to drive effective value-based care, health disparities research, translational interventions, and evidence-based policy.
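The field completion rate analysis described above amounts to counting non-missing values per structured field. A minimal pandas sketch under assumed conditions: a flat extract with one row per patient and one column per structured SDOH field, with the file and column names invented for illustration.

    import pandas as pd

    # Hypothetical flat extract: one row per patient, one column per structured SDOH field.
    ehr = pd.read_csv("sdoh_structured_fields.csv")

    sdoh_fields = [
        "ethnicity", "race", "education", "financial_strain", "food_insecurity",
        "housing_stability", "interpersonal_violence", "social_isolation",
        "stress", "transportation",
    ]

    # Completion rate = share of patients with a non-missing value in each field.
    completion = (
        ehr[sdoh_fields]
        .notna()
        .mean()
        .mul(100)
        .round(1)
        .rename("completion_pct")
        .sort_values(ascending=False)
    )
    print(completion)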

3.
J Clin Transl Sci ; 7(1): e130, 2023.
Article in English | MEDLINE | ID: mdl-37396818

ABSTRACT

Background: Electronic health record (EHR) data have many quality problems that may affect research results and decision support systems. Many methods have been used to evaluate EHR data quality; however, there is not yet consensus on best practice. We used a rule-based approach to assess the variability of EHR data quality across multiple healthcare systems. Methods: To quantify data quality concerns across healthcare systems in a PCORnet Clinical Research Network, we used a previously tested rule-based framework tailored to the PCORnet Common Data Model to perform data quality assessment at 13 clinical sites across eight states. Results were compared with the current PCORnet data curation process to explore the differences between the two methods. Additional analyses of testosterone therapy prescribing were used to explore clinical care variability and quality. Results: The framework detected discrepancies across sites, revealing clear variability in data quality between sites. The detailed requirements encoded in the rules captured additional data errors with a specificity that aids remediation of technical errors, compared with the current PCORnet data curation process. Other rules designed to detect logical and clinical inconsistencies may also support clinical care variability and quality programs. Conclusion: Rule-based EHR data quality methods quantified significant discrepancies across all sites. Medication and laboratory data sources were causes of data errors.
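To illustrate the rule-template idea, each rule can be expressed as an identifier, a description, and a predicate evaluated against a common-data-model extract, with counts of flagged rows compared across sites. A rough sketch; the file, columns, and rules below are hypothetical, not the published templates.

    import pandas as pd

    # Hypothetical flattened extract joining lab results with patient demographics.
    lab = pd.read_csv("lab_result_with_demographics.csv",
                      parse_dates=["result_date", "birth_date"])

    # Each rule: (id, description, predicate that flags violating rows).
    rules = [
        ("LAB-001", "Lab result date precedes patient birth date",
         lambda df: df["result_date"] < df["birth_date"]),
        ("LAB-002", "Quantitative result recorded without a unit",
         lambda df: df["result_num"].notna() & df["result_unit"].isna()),
    ]

    for rule_id, description, predicate in rules:
        flagged = int(predicate(lab).sum())
        print(f"{rule_id}: {description} -> {flagged} rows flagged")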

4.
AMIA Jt Summits Transl Sci Proc ; 2023: 632-641, 2023.
Article in English | MEDLINE | ID: mdl-37350921

ABSTRACT

The 21st Century Cures Act allows the US Food and Drug Administration to consider real-world data (RWD) for new indications or post-approval study requirements. However, there is limited guidance as to the relative quality of different RWD types. The ACE-RWD program will compare the quality of EHR clinical data, EHR billing data, and linked healthcare claims data to traditional clinical trial data collection methods. ACE-RWD is being conducted alongside 5-10 ancillary studies, with five sponsors, across multiple therapeutic areas. Each ancillary study will be conducted after or in parallel with its parent clinical study at a minimum of two clinical sites. Although not required, it is anticipated that EHR clinical and EHR billing data will be obtained via EHR-to-eCRF mechanisms based on the Health Level Seven (HL7®) Fast Healthcare Interoperability Resources (FHIR®) standard.
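An EHR-to-eCRF exchange over FHIR typically means querying the site's FHIR server for the resources a form needs and flattening them into form fields. A minimal sketch of that pattern using the Python requests library; the endpoint, patient ID, and search parameters are assumptions, and a production pipeline would also handle authorization, paging, and provenance.

    import requests

    # Hypothetical FHIR endpoint and patient identifier.
    BASE_URL = "https://ehr.example.org/fhir"
    PATIENT_ID = "12345"

    resp = requests.get(
        f"{BASE_URL}/Observation",
        params={"patient": PATIENT_ID, "category": "vital-signs", "_count": 50},
        headers={"Accept": "application/fhir+json"},
        timeout=30,
    )
    resp.raise_for_status()
    bundle = resp.json()

    # Flatten the returned Bundle into (code, value, unit, time) tuples for the eCRF.
    for entry in bundle.get("entry", []):
        obs = entry["resource"]
        coding = obs.get("code", {}).get("coding", [{}])[0]
        quantity = obs.get("valueQuantity", {})
        print(coding.get("display"), quantity.get("value"), quantity.get("unit"),
              obs.get("effectiveDateTime"))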

5.
Res Sq ; 2023 Mar 27.
Article in English | MEDLINE | ID: mdl-37034600

ABSTRACT

Background: Medical record abstraction (MRA) is a commonly used method for data collection in clinical research, but it is prone to error, and the influence of quality control (QC) measures is seldom and inconsistently assessed during the course of a study. We employed a novel, standardized MRA-QC framework as part of an ongoing observational study in an effort to control MRA error rates. To assess the effectiveness of our framework, we compared our error rates against traditional MRA studies that had not reported using formalized MRA-QC methods. Thus, the objective of this study was to compare the MRA error rates derived from the literature with the error rates found in a study using MRA as the sole method of data collection that employed an MRA-QC framework. Methods: Using a moderator meta-analysis with a Q-test, the MRA error rates from the meta-analysis of the literature were compared with the error rate from a recent study that implemented formalized MRA training and continuous QC processes. Results: The MRA process for data acquisition in clinical research was associated with both high and highly variable error rates (70-2,784 errors per 10,000 fields). Error rates for the study using our MRA-QC framework were between 1.04% (optimistic, all-field rate) and 2.57% (conservative, populated-field rate), or 104-257 errors per 10,000 fields, which is 4.00-5.53 percentage points lower than the observed rates from the literature (p < 0.0001). Conclusions: Review of the literature indicated that the accuracy associated with MRA varied widely across studies. However, our results demonstrate that, with appropriate training and continuous QC, MRA error rates can be significantly controlled during the course of a clinical research study.
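The optimistic and conservative rates differ only in the denominator: every field eligible for abstraction versus only the fields that actually contained data. A small arithmetic sketch with invented counts (not the study's):

    # Optimistic vs. conservative MRA error rates, using illustrative counts.
    errors = 260                # discrepancies found during QC review
    total_fields = 25_000       # every field eligible for abstraction
    populated_fields = 10_100   # fields that actually contained data

    all_field_rate = errors / total_fields            # optimistic denominator
    populated_field_rate = errors / populated_fields  # conservative denominator

    print(f"All-field rate:       {all_field_rate:.2%}")
    print(f"Populated-field rate: {populated_field_rate:.2%}")
    print(f"Errors per 10,000 fields: {all_field_rate * 10_000:.0f}"
          f" - {populated_field_rate * 10_000:.0f}")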

6.
Contemp Clin Trials ; 128: 107144, 2023 05.
Article in English | MEDLINE | ID: mdl-36898625

ABSTRACT

BACKGROUND: eSource software is used to automatically copy a patient's electronic health record data into a clinical study's electronic case report form. However, there is little evidence to assist sponsors in identifying the best sites for multi-center eSource studies. METHODS: We developed an eSource site readiness survey. The survey was administered to principal investigators, clinical research coordinators, and chief research information officers at Pediatric Trial Network sites. RESULTS: A total of 61 respondents were included in this study (clinical research coordinator, 22; principal investigator, 20; and chief research information officer, 19). Clinical research coordinators and principal investigators ranked medication administration, medication orders, laboratory, medical history, and vital signs data as having the highest priority for automation. While most organizations used some electronic health record research functions (clinical research coordinator, 77%; principal investigator, 75%; and chief research information officer, 89%), only 21% of sites were using Fast Healthcare Interoperability Resources standards to exchange patient data with other institutions. Respondents generally gave lower readiness for change ratings to organizations that did not have a separate research information technology group and where researchers practiced in hospitals not operated by their medical schools. CONCLUSIONS: Site readiness to participate in eSource studies is not merely a technical problem. While technical capabilities are important, organizational priorities, structure, and the site's support of clinical research functions are equally important considerations.


Subject(s)
Electronic Health Records , Software , Humans , Child , Surveys and Questionnaires , Electronics , Data Collection
7.
Contemp Clin Trials ; 126: 107110, 2023 03.
Article in English | MEDLINE | ID: mdl-36738915

ABSTRACT

Children have historically been underrepresented in randomized controlled trials and multi-center studies. This is particularly true for children who reside in rural and underserved areas. Conducting multi-center trials in rural areas presents unique informatics challenges. These challenges call for increased attention towards informatics infrastructure and the need for development and application of sound informatics approaches to the collection, processing, and management of data for clinical studies. By modifying existing local infrastructure and utilizing open source tools, we have been able to successfully deploy a multi-site data coordinating and operations center. We report our implementation decisions for data collection and management for the IDeA States Pediatric Clinical Trial Network (ISPCTN) based on the functionality needed for the ISPCTN, our synthesis of the extant literature in data collection and management methodology, and Good Clinical Data Management Practices.


Subject(s)
Data Management , Informatics , Child , Humans , Data Collection , Rural Population
8.
Res Sq ; 2023 Dec 21.
Article in English | MEDLINE | ID: mdl-38196643

ABSTRACT

Background: In clinical research, prevention of systematic and random errors in the data collected is paramount to ensuring reproducibility of trial results and the safety and efficacy of the resulting interventions. Over the last 40 years, empirical assessments of data accuracy in clinical research have been reported in the literature. Although there have been reports of data error and discrepancy rates in clinical studies, there has been little systematic synthesis of these results. Further, with a few notable exceptions, little evidence exists regarding the relative accuracy of different data processing methods. We aim to address this gap by evaluating error rates for four data processing methods. Methods: A systematic review of the literature identified through PubMed was performed to identify studies that evaluated the quality of data obtained through data processing methods typically used in clinical trials: medical record abstraction (MRA), optical scanning, single data entry, and double data entry. Quantitative information on data accuracy was abstracted from the manuscripts and pooled. Meta-analysis of single proportions, based on the Freeman-Tukey transformation and the generalized linear mixed model approach, was used to derive an overall estimate of error rates across the data processing methods used in each study for comparison. Results: A total of 93 papers (published from 1978 to 2008) meeting our inclusion criteria were categorized according to their data processing methods. The accuracy associated with data processing methods varied widely, with error rates ranging from 2 errors per 10,000 fields to 2,784 errors per 10,000 fields. MRA was associated with both high and highly variable error rates, having a pooled error rate of 6.57% (95% CI: 5.51, 7.72). In comparison, the pooled error rates for optical scanning, single data entry, and double data entry were 0.74% (0.21, 1.60), 0.29% (0.24, 0.35), and 0.14% (0.08, 0.20), respectively. Conclusions: Data processing and cleaning methods may explain a significant amount of the variability in data accuracy. MRA error rates, for example, were high enough to impact decisions made using the data and could necessitate increases in sample sizes to preserve statistical power. Thus, the choice of data processing methods can likely impact process capability and, ultimately, the validity of trial results.
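As a sketch of the pooling step, the Freeman-Tukey double-arcsine transformation stabilizes the variance of per-study error proportions before they are combined and back-transformed. The counts below are invented, and simple inverse-variance (fixed-effect) pooling stands in for the random-effects/GLMM machinery actually used.

    import numpy as np

    # Illustrative per-study counts: errors found and fields checked.
    errors = np.array([35, 120, 8, 260])
    fields = np.array([5_000, 4_300, 11_000, 9_400])

    # Freeman-Tukey double-arcsine transformation of each study's proportion.
    t = (np.arcsin(np.sqrt(errors / (fields + 1)))
         + np.arcsin(np.sqrt((errors + 1) / (fields + 1))))
    var_t = 1.0 / (fields + 0.5)

    # Inverse-variance pooling (fixed effect, for simplicity).
    weights = 1.0 / var_t
    t_pooled = np.sum(weights * t) / np.sum(weights)

    # Miller's back-transformation to the proportion scale, using the harmonic mean n.
    n_harm = len(fields) / np.sum(1.0 / fields)
    sin_t = np.sin(t_pooled)
    p_pooled = 0.5 * (1 - np.sign(np.cos(t_pooled)) *
                      np.sqrt(1 - (sin_t + (sin_t - 1 / sin_t) / n_harm) ** 2))
    print(f"Pooled error rate: {p_pooled:.2%}")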

9.
Contemp Clin Trials ; 122: 106953, 2022 11.
Article in English | MEDLINE | ID: mdl-36202199

ABSTRACT

BACKGROUND: Single Institutional Review Boards (sIRB) are not achieving the benefits envisioned by the National Institutes of Health. The recently published Health Level Seven (HL7®) Fast Healthcare Interoperability Resources (FHIR®) data exchange standard seeks to improve sIRB operational efficiency. METHODS AND RESULTS: We conducted a study to determine whether the use of this standard would be economically attractive for sIRB workflows collectively and for Reviewing and Relying institutions. We examined four sIRB-associated workflows at a single institution: (1) Initial Study Protocol Application, (2) Site Addition for an Approved sIRB study, (3) Continuing Review, and (4) Medical and Non-Medical Event Reporting. Task-level information identified personnel roles and their associated hour requirements for completion. Tasks that would be eliminated by the data exchange standard were identified. Personnel costs were estimated using annual salaries by role. No tasks would be eliminated in the Initial Study Protocol Application or Medical and Non-Medical Event Reporting workflows through use of the proposed data exchange standard. Site Addition workflow hours would be reduced by 2.50 h per site (from 15.50 to 13.00 h) and Continuing Review hours would be reduced by 9.00 h per site per study year (from 36.50 to 27.50 h). Associated costs savings were $251 for the Site Addition workflow (from $1609 to $1358) and $1033 for the Continuing Review workflow (from $4110 to $3076). CONCLUSION: Use of the proposed HL7 FHIR® data exchange standard would be economically attractive for sIRB workflows collectively and for each entity participating in the new workflows.
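A quick arithmetic check against the figures reported above: dividing each workflow's dollar savings by its eliminated hours gives the implied blended personnel cost per eliminated hour.

    # Figures taken from the abstract above; the per-hour rate is derived, not reported.
    workflows = {
        "Site Addition": {"hours_saved": 15.50 - 13.00, "cost_saved": 251},
        "Continuing Review": {"hours_saved": 36.50 - 27.50, "cost_saved": 1033},
    }

    for name, wf in workflows.items():
        rate = wf["cost_saved"] / wf["hours_saved"]
        print(f"{name}: {wf['hours_saved']:.1f} h eliminated, ${wf['cost_saved']} saved "
              f"(~${rate:.0f} per eliminated hour)")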


Subject(s)
Electronic Health Records , Ethics Committees, Research , Humans , Health Level Seven
10.
BMC Med Res Methodol ; 22(1): 227, 2022 08 15.
Article in English | MEDLINE | ID: mdl-35971057

ABSTRACT

BACKGROUND: Studies have shown that data collection by medical record abstraction (MRA) is a significant source of error in clinical research studies relying on secondary use data. Yet, the quality of data collected using MRA is seldom assessed. We employed a novel, theory-based framework for data quality assurance and quality control of MRA. The objective of this work is to determine the potential impact of formalized MRA training and continuous quality control (QC) processes on data quality over time. METHODS: We conducted a retrospective analysis of QC data collected during a cross-sectional medical record review of mother-infant dyads with Neonatal Opioid Withdrawal Syndrome. A confidence interval approach was used to calculate crude (Wald's method) and adjusted (generalized estimating equation) error rates over time. We calculated error rates using the number of errors divided by total fields ("all-field" error rate) and populated fields ("populated-field" error rate) as the denominators, to provide both an optimistic and a conservative measurement, respectively. RESULTS: On average, the ACT NOW CE Study maintained an error rate between 1% (optimistic) and 3% (conservative). Additionally, we observed a decrease of 0.51 percentage points with each additional QC Event conducted. CONCLUSIONS: Formalized MRA training and continuous QC resulted in lower error rates than have been found in previous literature and a decrease in error rates over time. This study newly demonstrates the importance of continuous process controls for MRA within the context of a multi-site clinical research study.
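A minimal sketch of the crude (Wald) interval for an error rate, using invented counts; the study's adjusted estimates come from a generalized estimating equation model instead.

    import math

    # Illustrative counts: errors found and fields reviewed (populated-field denominator).
    errors = 210
    fields_reviewed = 10_000

    p_hat = errors / fields_reviewed
    z = 1.96  # 95% confidence
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / fields_reviewed)

    print(f"Error rate: {p_hat:.2%} "
          f"(95% CI {p_hat - half_width:.2%} to {p_hat + half_width:.2%})")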


Subject(s)
Data Accuracy , Medical Records , Data Collection , Humans , Infant, Newborn , Research Design , Retrospective Studies
11.
Article in English | MEDLINE | ID: mdl-35373222

ABSTRACT

Colonoscopy is a screening and diagnostic procedure for detection of colorectal carcinomas with specific quality metrics that monitor and improve adenoma detection rates. These quality metrics are stored in disparate documents, i.e., colonoscopy, pathology, and radiology reports. The lack of integrated, standardized documentation is impeding colorectal cancer research. Clinical concept extraction using Natural Language Processing (NLP) and Machine Learning (ML) techniques is an alternative to manual data abstraction. Contextual word embedding models such as BERT (Bidirectional Encoder Representations from Transformers) and FLAIR have enhanced the performance of NLP tasks. Combining multiple clinically trained embeddings can improve word representations and boost the performance of clinical NLP systems. The objective of this study is to extract comprehensive clinical concepts from the consolidated colonoscopy documents using concatenated clinical embeddings. We built high-quality annotated corpora for three report types. BERT and FLAIR embeddings were trained on unlabeled colonoscopy-related documents. We built a hybrid Artificial Neural Network (h-ANN) to concatenate and fine-tune BERT and FLAIR embeddings. To extract concepts of interest from the three report types, three models were initialized from the h-ANN and fine-tuned using the annotated corpora. The models achieved best F1-scores of 91.76%, 92.25%, and 88.55% for colonoscopy, pathology, and radiology reports, respectively.
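Concatenating contextual embeddings is directly supported by the flair library's stacked-embedding abstraction. A minimal sketch using stock checkpoints rather than the domain-tuned BERT and FLAIR models described above:

    from flair.data import Sentence
    from flair.embeddings import FlairEmbeddings, StackedEmbeddings, TransformerWordEmbeddings

    # Concatenate a BERT-style transformer embedding with forward/backward FLAIR embeddings.
    embeddings = StackedEmbeddings([
        TransformerWordEmbeddings("bert-base-cased"),
        FlairEmbeddings("news-forward"),
        FlairEmbeddings("news-backward"),
    ])

    sentence = Sentence("A 6 mm sessile polyp was found in the ascending colon.")
    embeddings.embed(sentence)

    # Each token now carries the concatenated vector a downstream tagger would consume.
    for token in sentence:
        print(token.text, token.embedding.shape)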

12.
Article in English | MEDLINE | ID: mdl-35386186

ABSTRACT

Clinical named entity recognition (NER) is an essential building block for many downstream natural language processing (NLP) applications such as information extraction and de-identification. Recently, deep learning (DL) methods that utilize word embeddings have become popular in clinical NLP tasks. However, there has been little work on evaluating and combining word embeddings trained on different domains. The goal of this study is to improve the performance of NER in clinical discharge summaries by developing a DL model that combines different embeddings, and to investigate the combination of standard and contextual embeddings from the general and clinical domains. We developed: 1) a high-quality, human-annotated internal corpus of discharge summaries, and 2) an NER model with an input embedding layer that combines different embeddings: standard word embeddings, context-based word embeddings, a character-level word embedding using a convolutional neural network (CNN), and external knowledge sources along with word features encoded as one-hot vectors. The embedding layer was followed by bidirectional long short-term memory (Bi-LSTM) and conditional random field (CRF) layers. The proposed model matches or exceeds state-of-the-art performance on two publicly available data sets and achieves an F1 score of 94.31% on the internal corpus. After incorporating mixed-domain, clinically pre-trained contextual embeddings, the F1 score further improved to 95.36% on the internal corpus. This study demonstrates an efficient way of combining different embeddings that improves recognition performance and aids the downstream de-identification of clinical notes.

13.
JAMIA Open ; 5(1): ooac010, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35274085

ABSTRACT

Objective: To inform training needs for the revised Certified Clinical Data Manager (CCDMTM) Exam. Introduction: Clinical data managers are responsible for processing the data on which research conclusions and regulatory decisions are based, highlighting the importance of applying effective data management practices. The use of practice standards such as the Good Clinical Data Management Practices increases confidence in the data, and study conclusions likely carry more weight when standard practices are followed. Methods: A quantitative, descriptive study applying classical test theory was undertaken to analyze past data from the CCDMTM Exam and identify potential training needs. Data across 952 sequential exam attempts were pooled for analysis. Results: Competency domain-level analysis identified training needs in four areas: design tasks; data processing tasks; programming tasks; and coordination and management tasks. Conclusions: Analysis of past CCDMTM Exam results using classical test theory identified training needs reflective of exam takers. Training in the identified areas could benefit CCDMTM Exam takers and improve their ability to apply effective data management practices. While this may not reflect individual or organizational needs, recommendations for assessing individual and organizational training needs are provided.
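A classical-test-theory item analysis largely comes down to item difficulty and item-total discrimination. A minimal pandas sketch over a hypothetical response matrix (the file name and 0/1 layout are assumptions):

    import pandas as pd

    # Hypothetical response matrix: rows are exam attempts, columns are items,
    # values are 1 (correct) or 0 (incorrect).
    responses = pd.read_csv("exam_item_responses.csv")

    total_score = responses.sum(axis=1)

    item_stats = pd.DataFrame({
        # Difficulty: proportion of examinees answering the item correctly.
        "difficulty": responses.mean(),
        # Discrimination: corrected item-total correlation (item vs. total minus the item).
        "discrimination": responses.apply(lambda item: item.corr(total_score - item)),
    })

    # Items with low discrimination or extreme difficulty point to content areas
    # where additional training material may help.
    print(item_stats.sort_values("discrimination").head(10))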

14.
Article in English | MEDLINE | ID: mdl-35300321

ABSTRACT

Colonoscopy plays a critical role in screening for colorectal carcinomas (CC). Unfortunately, the data related to this procedure are stored in disparate documents: colonoscopy, pathology, and radiology reports. The lack of integrated, standardized documentation is impeding accurate reporting of quality metrics and clinical and translational research. Natural language processing (NLP) has been used as an alternative to manual data abstraction. Performance of Machine Learning (ML)-based NLP solutions is heavily dependent on the accuracy of annotated corpora. The availability of large annotated corpora is limited due to data privacy laws and the cost and effort required. In addition, the manual annotation process is error-prone, making the lack of quality annotated corpora the largest bottleneck in deploying ML solutions. The objective of this study is to identify clinical entities critical to colonoscopy quality and to build a high-quality annotated corpus using domain-specific taxonomies and standardized annotation guidelines. The annotated corpus can be used to train ML models for a variety of downstream tasks.

15.
AMIA Annu Symp Proc ; 2022: 775-784, 2022.
Article in English | MEDLINE | ID: mdl-37128433

ABSTRACT

Individual researchers and research networks have developed and applied different approaches to assess the data quality of electronic health record (EHR) data. A previously published rules-based method to evaluate the data quality of EHR data provides deeper levels of data quality analysis. To examine the effectiveness and generalizability of the rule-based framework, we reprogrammed and translated published rule templates to operate against the PCORnet Common Data Model and executed them against a database for a single center of the Greater Plains Collaborative (GPC) PCORnet Clinical Research Network. The framework detected additional data errors and logical inconsistencies not revealed by current data quality procedures. Laboratory and medication data were more vulnerable to errors. Hemolyzed samples in the emergency department and metformin prescribing in ambulatory clinics are further described to illustrate application of specific rule-based findings by researchers to engage their health systems in evaluating healthcare delivery and clinical quality concerns.


Subject(s)
Data Accuracy , Electronic Health Records , Humans , Patient Outcome Assessment , Delivery of Health Care
16.
Clin Transl Sci ; 15(2): 309-321, 2022 02.
Article in English | MEDLINE | ID: mdl-34706145

ABSTRACT

Artificial intelligence (AI) is transforming many domains, including finance, agriculture, defense, and biomedicine. In this paper, we focus on the role of AI in clinical and translational research (CTR), including preclinical research (T1), clinical research (T2), clinical implementation (T3), and public (or population) health (T4). Given the rapid evolution of AI in CTR, we present three complementary perspectives: (1) scoping literature review, (2) survey, and (3) analysis of federally funded projects. For each CTR phase, we addressed challenges, successes, failures, and opportunities for AI. We surveyed Clinical and Translational Science Award (CTSA) hubs regarding AI projects at their institutions. Nineteen of 63 CTSA hubs (30%) responded to the survey. The most common funding source (48.5%) was the federal government. The most common translational phase was T2 (clinical research, 40.2%). Clinicians were the intended users in 44.6% of projects and researchers in 32.3% of projects. The most common computational approaches were supervised machine learning (38.6%) and deep learning (34.2%). The number of projects steadily increased from 2012 to 2020. Finally, we analyzed 2604 AI projects at CTSA hubs using the National Institutes of Health Research Portfolio Online Reporting Tools (RePORTER) database for 2011-2019. We mapped available abstracts to Medical Subject Headings and found that nervous system (16.3%) and mental disorders (16.2%) were the most common topics addressed. From a computational perspective, big data (32.3%) and deep learning (30.0%) were most common. This work represents a snapshot in time of the role of AI in the CTSA program.


Subject(s)
Artificial Intelligence , Translational Science, Biomedical , Humans , Translational Research, Biomedical , United States
17.
Ther Innov Regul Sci ; 55(6): 1250-1257, 2021 11.
Article in English | MEDLINE | ID: mdl-34228318

ABSTRACT

BACKGROUND: The 21st Century Cures Act allows the US Food and Drug Administration (FDA) to utilize real-world data (RWD) to create real-world evidence (RWE) for new indications or post-approval study requirements. We compared central adjudication with two insurance claims data sources to understand how endpoint accuracy differences impact RWE results. METHODS: We developed a decision analytic model to compare differences in efficacy (all-cause death, stroke, and myocardial infarction) and safety (bleeding requiring transfusion) results for a simulated acute coronary syndrome antiplatelet therapy clinical trial. Endpoint accuracy metrics were derived from previous studies that compared centrally adjudicated and insurance claims-based clinical trial endpoints. RESULTS: Efficacy endpoint results per 100 patients were similar for the central adjudication model (intervention event rate, 11.3; control, 13.7; difference, 2.4) and the prospective claims data collection model (intervention event rate, 11.2; control, 13.6; difference, 2.3). However, the retrospective claims linking model's efficacy results were larger (intervention event rate, 14.6; control, 18.0; difference, 3.4). True positive event rate results (intervention, control, and difference) for both insurance claims-based models were less than those of the central adjudication model due to false negative events. Differences in false positive event rates were responsible for the differences in efficacy results between the two insurance claims-based models. CONCLUSION: Efficacy endpoint results differed by data source. Investigators need guidance to determine which data sources produce regulatory-grade RWE.
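The mechanism such a model captures can be sketched as simple arithmetic: the observed event rate combines true events that are detected (sensitivity) and non-events misclassified as events (1 - specificity). The accuracy values below are placeholders rather than the study's estimates, so the output will not reproduce the results reported above.

    # Observed event rate per 100 patients under imperfect endpoint ascertainment.
    def observed_rate(true_rate, sensitivity, specificity):
        # True positives plus false positives among the non-events.
        return true_rate * sensitivity + (100 - true_rate) * (1 - specificity)

    true_intervention, true_control = 11.3, 13.7   # adjudicated rates per 100 patients

    for label, sens, spec in [("prospective claims", 0.95, 0.998),
                              ("retrospective claims linkage", 0.90, 0.95)]:
        obs_i = observed_rate(true_intervention, sens, spec)
        obs_c = observed_rate(true_control, sens, spec)
        print(f"{label}: intervention {obs_i:.1f}, control {obs_c:.1f}, "
              f"difference {obs_c - obs_i:.1f}")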


Subject(s)
Insurance , Myocardial Infarction , Stroke , Humans , Prospective Studies , Retrospective Studies
18.
Stud Health Technol Inform ; 281: 799-803, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34042688

ABSTRACT

The ongoing COVID-19 pandemic has become the most impactful pandemic of the past century. The SARS-CoV-2 virus has spread rapidly across the globe affecting and straining global health systems. More than 2 million people have died from COVID-19 (as of 30 January 2021). To lessen the pandemic's impact, advanced methods such as Artificial Intelligence models are proposed to predict mortality, morbidity, disease severity, and other outcomes and sequelae. We performed a rapid scoping literature review to identify the deep learning techniques that have been applied to predict hospital mortality in COVID-19 patients. Our review findings provide insights on the important deep learning models, data types, and features that have been reported in the literature. These summary findings will help scientists build reliable and accurate models for better intervention strategies for predicting mortality in current and future pandemic situations.


Subject(s)
COVID-19 , Deep Learning , Artificial Intelligence , Humans , Pandemics , SARS-CoV-2
19.
Stud Health Technol Inform ; 281: 183-187, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34042730

ABSTRACT

Endoscopy procedures are often performed with either moderate or deep sedation. While deep sedation is costly, procedures with moderate sedation are not always well tolerated, resulting in patient discomfort, and are often aborted. Due to a lack of clear guidelines, the decision to use moderate sedation or anesthesia for a procedure is made by the providers, leading to high variability in clinical practice. The objective of this study was to build a Machine Learning (ML) model that predicts whether a colonoscopy can be successfully completed with moderate sedation based on patients' demographics, comorbidities, and prescribed medications. An XGBoost model was trained and tested on 10,025 colonoscopies (70%-30% split) performed at the University of Arkansas for Medical Sciences (UAMS). XGBoost achieved an average area under the receiver operating characteristic curve (AUC) of 0.762; the F1-score for predicting procedures that need moderate sedation was 0.85, with precision and recall of 0.81 and 0.89, respectively. The proposed model can be employed as a decision support tool for physicians to bolster their confidence when choosing between moderate sedation and anesthesia for a colonoscopy procedure.
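A minimal sketch of this kind of modeling setup, using the xgboost and scikit-learn libraries with a hypothetical feature file and a 70/30 split; the column names and hyperparameters are placeholders, not the study's configuration.

    import pandas as pd
    import xgboost as xgb
    from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score
    from sklearn.model_selection import train_test_split

    # Hypothetical per-procedure feature table: demographics, comorbidity flags,
    # and medication indicators, plus a binary label for moderate-sedation success.
    data = pd.read_csv("colonoscopy_cohort.csv")
    X = data.drop(columns=["moderate_sedation_success"])
    y = data["moderate_sedation_success"]

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=42)

    model = xgb.XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05)
    model.fit(X_train, y_train)

    proba = model.predict_proba(X_test)[:, 1]
    pred = (proba >= 0.5).astype(int)
    print("AUC:", roc_auc_score(y_test, proba))
    print("F1:", f1_score(y_test, pred),
          "Precision:", precision_score(y_test, pred),
          "Recall:", recall_score(y_test, pred))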


Subject(s)
Anesthesia , Colonoscopy , Conscious Sedation , Humans , Machine Learning
20.
Stud Health Technol Inform ; 281: 397-401, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34042773

ABSTRACT

Direct extraction and use of electronic health record (EHR) data is a long-term and multifaceted endeavor that includes design, development, implementation and evaluation of methods and tools for semi-automating tasks in the research data collection process, including, but not limited to, medical record abstraction (MRA). A systematic mapping of study data elements was used to measure the coverage of the Health Level Seven (HL7®) Fast Healthcare Interoperability Resources (FHIR®) standard for a federally sponsored, pragmatic cardiovascular randomized controlled trial (RCT) targeting adults. We evaluated site-level implementations of the HL7® FHIR® standard to investigate study- and site-level differences that could affect coverage and offer insight into the feasibility of a FHIR-based eSource solution for multicenter clinical research.
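The coverage measurement itself can be sketched as a mapping from study data elements to the FHIR resources a site exposes, with coverage reported as the share of elements that map to something structured. The element names and resource assignments below are invented for illustration, not drawn from the trial's data dictionary.

    # Hypothetical mapping of study data elements to FHIR R4 resources;
    # None marks elements with no structured FHIR representation at a site.
    element_to_fhir = {
        "systolic_blood_pressure": "Observation",
        "ldl_cholesterol": "Observation",
        "statin_prescription": "MedicationRequest",
        "myocardial_infarction_history": "Condition",
        "smoking_status": "Observation",
        "nyha_class": None,              # captured only as free text at this site
        "six_minute_walk_distance": None,
    }

    mapped = [e for e, resource in element_to_fhir.items() if resource]
    coverage = len(mapped) / len(element_to_fhir)
    print(f"FHIR coverage: {coverage:.0%} ({len(mapped)}/{len(element_to_fhir)} elements)")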


Subject(s)
Electronic Health Records , Health Level Seven