Results 1 - 20 of 44
1.
Sci Rep ; 14(1): 20307, 2024 09 02.
Article in English | MEDLINE | ID: mdl-39218966

ABSTRACT

Citizen science data on biodiversity has experienced significant growth, largely driven by advancements in technology that facilitate data sharing. In recent years, mobile applications have provided a substantial boost to both the collection and sharing of this data. While this accessible information is undoubtedly valuable, we must consider the numerous biases present in this data when used for high-quality research. In this study, we analyse citizen science data for the birds of the Iberian Peninsula, comprising over 23 million unique records after filtering for duplicates (i.e., instances where the same observation was uploaded more than once). Using a 10 × 10 km square grid, we present information on well-surveyed cells (completeness) as well as temporal, taxonomic, geographical, and land use coverages. We found that the completeness of information is generally high, with better coverage around densely populated areas such as major cities and coastal regions, as well as popular birding destinations, which are frequently visited. The Mediterranean ecoregion and wetlands exhibit the highest levels of surveying. Furthermore, we observed an increase in temporal coverage since the 1980s and broad taxonomic coverage across all bird orders in the Iberian region. Our results underscore the utility of citizen science for many locations, as expressed in 10 × 10 km cells. However, they also highlight the inadequate data completeness across part of the territory, particularly in forested or sparsely inhabited areas. These findings not only identify cells suitable for bird diversity or conservation studies but also indicate areas where citizen-driven bird recording efforts should be encouraged.


Subject(s)
Biodiversity , Birds , Citizen Science , Animals , Spain , Data Accuracy , Portugal , Conservation of Natural Resources/methods
2.
Int J Equity Health ; 23(1): 143, 2024 Jul 18.
Article in English | MEDLINE | ID: mdl-39026324

ABSTRACT

BACKGROUND: Race and ethnicity are important drivers of health inequalities worldwide. However, the recording of race/ethnicity in data systems is frequently insufficient, particularly in low- and middle-income countries. The aim of this study is to descriptively analyse trends in data completeness in race/color records in hospital admissions and the rates of hospitalizations by various causes for Black and White individuals. METHODS: We conducted a longitudinal analysis, examining hospital admission data from Brazil's Hospital Information System (SIH) between 2010 and 2022, and analysed trends in reporting completeness and racial inequalities. These hospitalization records were examined based on year, quarter, cause of admission (using International Classification of Diseases (ICD-10) codes), and race/color (categorized as Black, White, or missing). We examined the patterns in hospitalization rates and the prevalence of missing data over time. RESULTS: Over the study period, there was a notable improvement in data completeness regarding race/color in hospital admissions in Brazil. The proportion of missing values on race decreased from 34.7% in 2010 to 21.2% in 2020. As data completeness improved, racial inequalities in hospitalization rates became more evident across several causes, including assaults, tuberculosis, hypertensive diseases, at-risk hospitalizations during pregnancy and motorcycle accidents. CONCLUSIONS: The study highlights the critical role of data quality in identifying and addressing racial health inequalities. Improved data completeness has revealed previously hidden inequalities in health records, emphasizing the need for comprehensive data collection to inform equitable health policies and interventions. Policymakers working in areas where socioeconomic data reporting (including on race and ethnicity) is suboptimal should address data completeness to fully understand the scale of health inequalities.


Subject(s)
Health Information Systems , Health Status Disparities , Healthcare Disparities , Hospital Information Systems , Female , Humans , Male , Brazil , Health Information Systems/standards , Healthcare Disparities/statistics & numerical data , Hospital Information Systems/standards , Hospitalization/statistics & numerical data , Longitudinal Studies , Racial Groups/statistics & numerical data , Socioeconomic Factors , White People/statistics & numerical data , Black People/statistics & numerical data
3.
Popul Health Metr ; 22(1): 12, 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38879515

ABSTRACT

BACKGROUND: Heterogeneity in national SARS-CoV-2 infection surveillance capabilities may compromise global enumeration and tracking of COVID-19 cases and deaths and bias analyses of the pandemic's tolls. Taking account of heterogeneity in data completeness may thus help clarify analyses of the relationship between COVID-19 outcomes and standard preparedness measures. METHODS: We examined country-level associations of pandemic preparedness capacities inventories, from the Global Health Security (GHS) Index and Joint External Evaluation (JEE), on SARS-CoV-2 infection and COVID-19 death data completion rates adjusted for income. Analyses were stratified by 100, 100-300, 300-500, and 500-700 days after the first reported case in each country. We subsequently reevaluated the relationship of pandemic preparedness on SARS-CoV-2 infection and age-standardized COVID-19 death rates adjusted for cross-country differentials in data completeness during the pre-vaccine era. RESULTS: Every 10% increase in the GHS Index was associated with a 14.9% (95% confidence interval 8.34-21.8%) increase in SARS-CoV-2 infection completion rate and a 10.6% (5.91-15.4%) increase in the death completion rate during the entire observation period. Disease prevention (infections: β = 1.08 [1.05-1.10], deaths: β = 1.05 [1.04-1.07]), detection (infections: β = 1.04 [1.01-1.06], deaths: β = 1.03 [1.01-1.05]), response (infections: β = 1.06 [1.00-1.13], deaths: β = 1.05 [1.00-1.10]), health system (infections: β = 1.06 [1.03-1.10], deaths: β = 1.05 [1.03-1.07]), and risk environment (infections: β = 1.27 [1.15-1.41], deaths: β = 1.15 [1.08-1.23]) were associated with both data completeness outcomes. Effect sizes of GHS Index on infection completion (Low income: β = 1.18 [1.04-1.34], Lower Middle income: β = 1.41 [1.16-1.71]) and death completion rates (Low income: β = 1.19 [1.09-1.31], Lower Middle income: β = 1.25 [1.10-1.43]) were largest in LMICs.
After adjustment for cross-country differences in data completeness, each 10% increase in the GHS Index was associated with a 13.5% (4.80-21.4%) decrease in SARS-CoV-2 infection rate at 100 days and a 9.10% (1.07-16.5%) decrease at 300 days. For age-standardized COVID-19 death rates, each 10% increase in the GHS Index was associated with a 15.7% (5.19-25.0%) decrease at 100 days and a 10.3% (-0.00-19.5%) decrease at 300 days. CONCLUSIONS: Results support the pre-pandemic hypothesis that countries with greater pandemic preparedness capacities have larger SARS-CoV-2 infection and mortality data completeness rates and lower COVID-19 disease burdens. More high-quality data of COVID-19 impact based on direct measurement are needed.


Subject(s)
COVID-19 , Global Health , Pandemic Preparedness , Humans , COVID-19/mortality , COVID-19/prevention & control , COVID-19/epidemiology
4.
J Proteome Res ; 22(11): 3508-3518, 2023 11 03.
Article in English | MEDLINE | ID: mdl-37815119

ABSTRACT

The ubiquity of mass spectrometry-based bottom-up proteomic analyses as a component of biological investigation mandates the validation of methodologies that increase acquisition efficiency, improve sample coverage, and enhance profiling depth. Chromatographic separation is often ignored as an area of potential improvement, with most analyses relying on traditional reversed-phase liquid chromatography (RPLC); this consistent reliance on a single chromatographic paradigm fundamentally limits our view of the observable proteome. Herein, we build upon early reports and validate porous graphitic carbon chromatography (PGC) as a facile means to substantially enhance proteomic coverage without changes to sample preparation, instrument configuration, or acquisition methods. Analysis of offline fractionated cell line digests using both separations revealed an increase in peptide and protein identifications by 43% and 24%, respectively. Increased identifications provided more comprehensive coverage of cellular components and biological processes independent of protein abundance, highlighting the substantial quantity of proteomic information that may go undetected in standard analyses. We further utilize these data to reveal that label-free quantitative analyses using RPLC separations alone may not be reflective of actual protein constituency. Together, these data highlight the value and comprehension offered through PGC-MS proteomic analyses. RAW proteomic data have been uploaded to the MassIVE repository with the primary accession code MSV000091495.


Subject(s)
Carbon , Graphite , Proteomics/methods , Porosity , Chromatography, Reverse-Phase/methods , Graphite/chemistry , Proteome/chemistry
5.
J Am Med Inform Assoc ; 30(12): 1985-1994, 2023 11 17.
Article in English | MEDLINE | ID: mdl-37632234

ABSTRACT

OBJECTIVE: Patients who receive most care within a single healthcare system (colloquially called a "loyalty cohort" since they typically return to the same providers) have mostly complete data within that organization's electronic health record (EHR). Loyalty cohorts have low data missingness, which can unintentionally bias research results. Using proxies of routine care and healthcare utilization metrics, we compute a per-patient score that identifies a loyalty cohort. MATERIALS AND METHODS: We implemented a computable program for the widely adopted i2b2 platform that identifies loyalty cohorts in EHRs based on a machine-learning model, which was previously validated using linked claims data. We developed a novel validation approach, which tests, using only EHR data, whether patients returned to the same healthcare system after the training period. We evaluated these tools at 3 institutions using data from 2017 to 2019. RESULTS: Loyalty cohort calculations to identify patients who returned during a 1-year follow-up yielded a mean area under the receiver operating characteristic curve of 0.77 using the original model and 0.80 after calibrating the model at individual sites. Factors such as multiple medications or visits contributed significantly at all sites. Screening tests' contributions (eg, colonoscopy) varied across sites, likely due to coding and population differences. DISCUSSION: This open-source implementation of a "loyalty score" algorithm had good predictive power. Enriching research cohorts by utilizing these low-missingness patients is a way to obtain the data completeness necessary for accurate causal analysis. CONCLUSION: i2b2 sites can use this approach to select cohorts with mostly complete EHR data.


Subject(s)
Algorithms , Electronic Health Records , Humans , Machine Learning , Delivery of Health Care , Electronics
6.
Traffic Inj Prev ; 24(sup1): S131-S140, 2023.
Article in English | MEDLINE | ID: mdl-37267005

ABSTRACT

OBJECTIVE: Regulations are currently being drafted by the European Commission for the safe introduction of automated driving systems (ADSs) with conditional or higher automation (SAE level 3 and above). One of the main challenges for complying with the drafted regulations is proving that the residual risk of an ADS is lower than the existing state of the art without the ADS and that the current safety state of European roads is not compromised. Therefore, much research has been conducted to estimate the safety risk of ADS. One proposed method for estimating the risk is data-driven, scenario-based assessment, where tests are partially automatically generated based on recorded traffic data. Although this is a promising method, uncertainties in the estimated risk arise from, among others, the limited number of tests that are conducted and the limited data that have been used to generate the tests. This work addresses the following question: "Given the limitations of the data and the number of tests, what is the uncertainty of the estimated safety risk of the ADS?" METHODS: To compute the safety risk, parameterized test scenarios are based on large-scale collections of road scenarios that are stored in a scenario database. The exposure of the scenarios and the parameter distributions are estimated using the data as well as confidence bounds of these estimates. Next, virtual simulations are conducted of the scenarios for a variety of parameter values. Using a probabilistic framework, all results are combined to estimate the residual risk as well as the uncertainty of this estimation. RESULTS: The results are used to provide confidence bounds on the calculated fatality rate in case an ADS is implemented in the vehicle. For example, using the proposed probabilistic framework, it is possible to claim with 95% certainty that the fatality rate is less than 10^-7 fatalities per hour of driving.
The proposed method is illustrated with a case study in which the risk and its uncertainty are quantified for a longitudinal controller in 3 different types of scenarios. The case study code is publicly available. CONCLUSIONS: If results show that the uncertainty is too high, the proposed method allows answering questions like "How much more data do we need?" or "How many more (virtual) simulations must be conducted?" Therefore, the method can be used to set requirements on the amount of data and the number of (virtual) simulations. For a reliable risk estimate, though, much more data are needed than those used in the case study. Furthermore, because the method relies on (virtual) simulations, the reliability of the result depends on the validity of the models used in the simulations. The presented case study illustrates that the proposed method is able to quantify the uncertainty of the estimated safety risk of an ADS. Future work involves incorporating the proposed method into the type approval framework for future ADSs of SAE levels 3, 4, and 5, as proposed in the upcoming European Union implementing regulation for ADS.


Subject(s)
Automobile Driving , Humans , Accidents, Traffic , Reproducibility of Results , Automation , Records
7.
Stud Health Technol Inform ; 305: 390-393, 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37387047

ABSTRACT

Data quality is a primary barrier to using electronic medical records (EMR) data for clinical and research purposes. Although EMR has been in use for a long time in LMICs, its data has seldom been used. This study aimed to assess the completeness of demographic and clinical data in a tertiary hospital in Rwanda. We conducted a cross-sectional study and assessed data for 92,153 patients recorded in the EMR from October 1st to December 31st, 2022. The findings indicated that over 92% of social demographic data elements were complete, and the completeness of clinical data elements ranged from 27% to 89%. The completeness of data varied markedly by department. We recommend an exploratory study to understand further the reasons associated with the completeness of data in clinical departments.
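The abstract reports completeness per data element but does not describe how it was computed. A minimal sketch of one common approach follows, assuming completeness means the percentage of records with a non-empty value for a field. The function names, field names, and sample records are hypothetical and are not taken from the study.

```python
from typing import Any, Dict, Iterable, List, Mapping


def is_filled(value: Any) -> bool:
    """Treat None and empty/whitespace-only strings as missing (an assumption)."""
    return value is not None and str(value).strip() != ""


def field_completeness(
    records: Iterable[Mapping[str, Any]], fields: List[str]
) -> Dict[str, float]:
    """Return, for each field, the percentage of records with a filled value."""
    rows = list(records)
    if not rows:
        return {name: 0.0 for name in fields}
    return {
        name: 100.0 * sum(1 for row in rows if is_filled(row.get(name))) / len(rows)
        for name in fields
    }


# Hypothetical records, for illustration only.
sample = [
    {"sex": "F", "date_of_birth": "1990-01-01", "diagnosis": "J18"},
    {"sex": "M", "date_of_birth": "", "diagnosis": None},
    {"sex": "F", "date_of_birth": "1985-05-12", "diagnosis": " "},
    {"sex": "", "date_of_birth": "2001-09-30", "diagnosis": "A09"},
]
print(field_completeness(sample, ["sex", "date_of_birth", "diagnosis"]))
```

Grouping the same calculation by department would give the kind of per-department comparison the abstract mentions.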


Subject(s)
Data Accuracy , Electronic Health Records , Humans , Rwanda , Tertiary Care Centers , Cross-Sectional Studies
8.
Bull Cancer ; 110(9): 873-882, 2023 Sep.
Article in French | MEDLINE | ID: mdl-36949001

ABSTRACT

BACKGROUND: Over the last three decades the incidence of thyroid cancer (TC) has increased in many regions of the world; however, little is known about TC incidence and trends in Algeria. MATERIAL AND METHODS: Using data from the Oran cancer registry (OCR) we assessed TC incidence and trends in Oran for the period 1996-2013 with the historical data method. The incidence curves were unstable and did not show any clear trend. Therefore, we actively collected data on TC for the period 1996-2013 using the multisource approach and the independent case ascertainment method. RESULTS: Analysis of actively collected and validated data showed a significant increase in the incidence of TC. We compared the two databases to identify differences. There were 558 TC cases during the period 1996-2013 in the OCR, while our active data collection enabled us to find 1,391 TC cases during the same period. The completeness rate in the OCR was 40.1%. These differences were due to our approach, which consisted of the inclusion of a greater number of health facilities and laboratories (44 versus 23 in the OCR), and the active data collection that we undertook in the nuclear medicine facility of the University Hospital of Tlemcen. CONCLUSIONS: The application of the recommendations of the International Agency for Research on Cancer (IARC) to enhance data completeness and quality, and an active collection of TC data in the nuclear medicine facility of the University Hospital of Tlemcen should make the OCR an essential tool for decision-making in public health and for directing health policy towards health priorities.
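The reported completeness rate is consistent with dividing the registry case count by the actively ascertained case count. A short sketch of that arithmetic, using the two counts given in the abstract; the function name is an illustrative choice, not the authors' terminology.

```python
def completeness_rate(registry_cases: int, ascertained_cases: int) -> float:
    """Registry cases as a percentage of all independently ascertained cases."""
    if ascertained_cases <= 0:
        raise ValueError("ascertained_cases must be positive")
    return 100.0 * registry_cases / ascertained_cases


# Counts reported in the abstract: 558 cases in the OCR, 1,391 found by active collection.
print(f"{completeness_rate(558, 1391):.1f}%")  # 40.1%
```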


Subject(s)
Thyroid Neoplasms , Humans , Algeria/epidemiology , Thyroid Neoplasms/epidemiology , Data Collection/methods , Registries , Incidence
9.
BMC Infect Dis ; 23(1): 104, 2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36814192

ABSTRACT

BACKGROUND: Routinely collected population-wide health data are often used to understand mortality trends including child mortality, as these data are often available more readily or quickly and for lower geographic levels than population-wide mortality data. However, understanding the completeness and accuracy of routine health data sources is essential for their appropriate interpretation and use. This study aims to assess the accuracy of diagnostic coding for public sector in-facility childhood (age < 5 years) infectious disease deaths (lower respiratory tract infections [LRTI], diarrhoea, meningitis, and tuberculous meningitis [TBM]) in routine hospital information systems (RHIS) through comparison with causes of death identified in a child death audit system (Child Healthcare Problem Identification Programme [Child PIP]) and the vital registration system (Death Notification [DN] Surveillance) in the Western Cape, South Africa and to calculate admission mortality rates (number of deaths in admitted patients per 1000 live births) using the best available data from all sources. METHODS: The three data sources: RHIS, Child PIP, and DN Surveillance are integrated and linked by the Western Cape Provincial Health Data Centre using a unique patient identifier. We calculated the deduplicated total number of infectious disease deaths and estimated admission mortality rates using all three data sources. We determined the completeness of Child PIP and DN Surveillance in identifying deaths recorded in RHIS and the level of agreement for causes of death between data sources. RESULTS: Completeness of recorded in-facility infectious disease deaths in Child PIP (23/05/2007-08/02/2021) and DN Surveillance (2010-2013) was 70% and 69% respectively. The greatest agreement in infectious causes of death were for diarrhoea and LRTI: 92% and 84% respectively between RHIS and Child PIP, and 98% and 83% respectively between RHIS and DN Surveillance. 
In-facility infectious disease admission mortality rates decreased significantly for the province: 1.60 (95% CI: 1.37-1.85) to 0.73 (95% CI: 0.56-0.93) deaths per 1000 live births from 2007 to 2020. CONCLUSION: RHIS had accurate causes of death amongst children dying from infectious diseases, particularly for diarrhoea and LRTI, with declining in-facility admission mortality rates over time. We recommend integrating data sources to ensure the most accurate assessment of child deaths.


Subject(s)
Communicable Diseases , Respiratory Tract Infections , Child , Humans , Infant , Child, Preschool , Cause of Death , South Africa/epidemiology , Information Sources , Public Sector , Diarrhea
10.
Health Aff Sch ; 1(4): qxad047, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38756741

ABSTRACT

Variation in availability, format, and standardization of patient attributes across health care organizations impacts patient-matching performance. We report on the changing nature of patient-matching features available from 2010-2020 across diverse care settings. We asked 38 health care provider organizations about their current patient attribute data-collection practices. All sites collected name, date of birth (DOB), address, and phone number. Name, DOB, current address, social security number (SSN), sex, and phone number were most commonly used for cross-provider patient matching. Electronic health record queries for a subset of 20 participating sites revealed that DOB, first name, last name, city, and postal codes were highly available (>90%) across health care organizations and time. SSN declined slightly in the last years of the study period. Birth sex, gender identity, language, country full name, country abbreviation, health insurance number, ethnicity, cell phone number, email address, and weight increased over 50% from 2010 to 2020. Understanding the wide variation in available patient attributes across care settings in the United States can guide selection and standardization efforts for improved patient matching in the United States.

11.
Front Digit Health ; 4: 856010, 2022.
Article in English | MEDLINE | ID: mdl-36506847

ABSTRACT

Objective: The present study aimed to assess the quality of electronic medical records (EMR) retrieved from hospital information systems (HIS) of three educational hospitals in Mashhad, Iran. Methods: In this multi-center, cross-sectional study, inpatient electronic records collected from three academic hospitals were categorized into five data groups, namely demographics (D); care handler (CH), indicating the doers of the medical actions; diagnosis and treatment (DT); administrative and financial (AF); and laboratory and paraclinical (LP). Next, we asked 25 physicians from the three academic hospitals to determine data elements of medical research and education value (called research and educational data) in every group. Subsequently, the quality of the five data groups (completeness × accuracy) was reported for the entire sampled data and for those specified as research and educational data, based on the exact concordance between electronic medical records and corresponding paper records. HISRA, standing for HIS recording ability, was also assessed compared to data elements of standard paper forms. Results: For the entire data, HISRA was 58.5%. In all hospitals, the highest data quality (more than 90%) belonged to the D and AF data groups, and the lowest quality to the CH and DT groups (less than 50% and 60%, respectively). For research and educational data, HISRA was 47%, and the quality of the D and AF data groups was the highest (nearly 100%), while CH and DT stood around 50% and 60%, respectively. The quality of the LP data group was almost 85% in all hospitals but hospital C (well over 30%). Total data quality for the hospitals was mostly below 70%. Conclusions: The low quality of electronic medical records was mostly a result of incompleteness, while the accuracy was relatively good. Results showed that the HIS application development mainly focused on administrative and financial aspects rather than academic and clinical goals.

12.
J Patient Rep Outcomes ; 6(1): 128, 2022 Dec 22.
Article in English | MEDLINE | ID: mdl-36547735

ABSTRACT

BACKGROUND: To understand our performance with respect to the collection and reporting of patient-reported outcome (PRO) measure (PROM) data, we examined the protocol content, data completeness and publication of PROs from interventional trials conducted at the Royal Marsden NHS Foundation Trust (RM) and explored factors associated with data missingness and PRO publication. DESIGN: From local records, we identified closed, intervention trials sponsored by RM that opened after 1995 and collected PROMs as primary, secondary or exploratory outcomes. Protocol data were extracted by two researchers and scored against the SPIRIT-PRO (PRO protocol content checklist; score 0-100, higher scores indicate better completeness). For studies with locally held datasets, the information team summarized for each study, PRO completion defined as the number of expected (as per protocol) PRO measurements versus the number of actual (i.e. completed) PRO measurements captured in the study data set. Relevant publications were identified by searching three online databases and chief investigator request. Data were extracted and each publication scored against the CONSORT-PRO (PRO manuscript content checklist; scored as SPIRIT-PRO above). Descriptive statistics are presented with exploratory comparisons of point estimates and 95% confidence intervals. RESULTS: Twenty-six of 65 studies were included in the review. Nineteen studies had accessible datasets and 18 studies published at least one article. Fourteen studies published PRO results. Most studies had a clinical (rather than PRO) primary outcome (16/26). Across all studies, responses in respect of 35 of 69 PROMs were published. Trial protocols scored on average 46.7 (range 7.1-92.9) on the SPIRIT-PRO. Among studies with accessible data, half (10/19) had less than 25% missing measurements. Publications scored on average 80.9 (range 36-100%) on the CONSORT-PRO. 
Studies that published PRO results had somewhat fewer missing measurements (19% [7-32%] vs 60% [-26 to 146%]). For individual PROMs within studies, missing measurements were lower for those that were published (17% [10-24%] vs 41% [18-63%]). Studies with higher SPIRIT-PRO scores and PROs as primary endpoints (13% [4-22%] vs 39% [10-58%]) had fewer missing measurements. CONCLUSIONS: Missing data may affect publication of PROs. Extent of inclusion of SPIRIT-PRO protocol items and PROs as primary endpoints may improve data completeness. Preliminary evidence from the study suggests a future larger study examining the relationship between PRO completion and publication is warranted.

13.
Nanotoxicology ; 16(2): 195-216, 2022 03.
Article in English | MEDLINE | ID: mdl-35506346

ABSTRACT

This manuscript proposes a methodology to assess the completeness and quality of physicochemical and hazard datasets for risk assessment purposes. The approach is also specifically applicable to similarity assessment as a basis for grouping of (nanoforms of) chemical substances as well as for classification of the substances according to the Classification, Labeling and Packaging regulation. The unique goal of this approach is to assess data quality in such a way that all the steps are automated, thus reducing reliance on expert judgment. The analysis starts from available (meta)data as provided in the data entry templates developed by the NanoSafety community and used for import into the eNanoMapper database. The methodology is implemented in the templates as a traffic light system: the providers of the data can see in real time the completeness scores calculated by the system for their datasets in green, yellow, or red. This is an interactive feedback feature that is intended to provide an incentive for anyone inserting data into the database to deliver more complete and higher quality datasets. The users of the data can also see this information both in the data entry templates and on the database interface, which enables them to select better datasets for their assessments. The proposed methodology has been partially implemented in the eNanoMapper database and in a Weight of Evidence approach for the regulatory classification of nanomaterials. It was fully implemented in a publicly available online R tool.


Subject(s)
Data Accuracy , Nanostructures , Databases, Factual , Nanostructures/chemistry , Risk Assessment/methods
14.
Front Pharmacol ; 13: 845949, 2022.
Article in English | MEDLINE | ID: mdl-35444533

ABSTRACT

Objective: To evaluate the continuity and completeness of electronic health record (EHR) data, and the concordance of select clinical outcomes and baseline comorbidities between EHR and linked claims data, from three healthcare delivery systems in Taiwan. Methods: We identified oral hypoglycemic agent (OHA) users from the Integrated Medical Database of National Taiwan University Hospital (NTUH-iMD), which was linked to the National Health Insurance Research Database (NHIRD), from June 2011 to December 2016. A secondary evaluation involved two additional EHR databases. We created consecutive 90-day periods before and after the first recorded OHA prescription and defined patients as having continuous EHR data if there was at least one encounter or prescription in a 90-day interval. EHR data completeness was measured by dividing the number of encounters in the NTUH-iMD by the number of encounters in the NHIRD. We assessed the concordance between EHR and claims data on three clinical outcomes (cardiovascular events, nephropathy-related events, and heart failure admission). We used individual comorbidities that comprised the Charlson comorbidity index to examine the concordance of select baseline comorbidities between EHRs and claims. Results: We identified 39,268 OHA users in the NTUH-iMD. Thirty-one percent (n = 12,296) of these users contributed to the analysis that examined data continuity during the 6-month baseline and 24-month follow-up period; 31% (n = 3,845) of the 12,296 users had continuous data during this 30-month period and EHR data completeness was 52%. The concordance of major cardiovascular events, nephropathy-related events, and heart failure admission was moderate, with the NTUH-iMD capturing 49-55% of the outcome events recorded in the NHIRD. The concordance of comorbidities was considerably different between the NTUH-iMD and NHIRD, with an absolute standardized difference >0.1 for most comorbidities examined.
Across the three EHR databases studied, 29-55% of the OHA users had continuous records during the 6-month baseline and 24-month follow-up period. Conclusion: EHR data continuity and data completeness may be suboptimal. A thorough evaluation of data continuity and completeness is recommended before conducting clinical and translational research using EHR data in Taiwan.
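The continuity and completeness definitions in the Methods can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the 6-month baseline and 24-month follow-up are approximated as 2 and 8 consecutive 90-day windows, windows are half-open, and a patient counts as continuous only if every window holds at least one encounter or prescription date. All function and parameter names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable


def has_continuous_data(
    index_date: date,
    event_dates: Iterable[date],
    baseline_periods: int = 2,   # assumed: ~6 months before the first OHA prescription
    followup_periods: int = 8,   # assumed: ~24 months after it
    period_days: int = 90,
) -> bool:
    """True if every 90-day window around index_date contains at least one event."""
    events = list(event_dates)
    for k in range(-baseline_periods, followup_periods):
        start = index_date + timedelta(days=k * period_days)
        end = start + timedelta(days=period_days)  # half-open window [start, end)
        if not any(start <= d < end for d in events):
            return False
    return True


def encounter_completeness(ehr_encounters: int, claims_encounters: int) -> float:
    """EHR encounters as a percentage of encounters recorded in the claims data."""
    if claims_encounters <= 0:
        raise ValueError("claims_encounters must be positive")
    return 100.0 * ehr_encounters / claims_encounters


# Hypothetical patient with one visit in each of the ten windows.
index = date(2012, 1, 1)
visits = [index + timedelta(days=90 * k + 10) for k in range(-2, 8)]
print(has_continuous_data(index, visits))
print(encounter_completeness(52, 100))
```

Dropping any one visit from the list leaves a window empty, so the same patient would no longer count as continuous.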

15.
Article in English | MEDLINE | ID: mdl-35270590

ABSTRACT

Public health agencies routinely collect time-referenced records to describe and compare foodborne outbreak characteristics. Few studies provide comprehensive metadata to inform researchers of data limitations prior to conducting statistical modeling. We described the completeness of 103 variables for 22,792 outbreaks publicly reported by the United States Centers for Disease Control and Prevention's (US CDC's) electronic Foodborne Outbreak Reporting System (eFORS) and National Outbreak Reporting System (NORS). We compared monthly trends of completeness during eFORS (1998−2008) and NORS (2009−2019) reporting periods using segmented time series analyses adjusted for seasonality. We quantified the overall, annual, and monthly completeness as the percentage of outbreaks with blank records per our study period, calendar year, and study month, respectively. We found that outbreaks of unknown genus (n = 7401), Norovirus (n = 6414), Salmonella (n = 2872), Clostridium (n = 944), and multiple genera (n = 779) accounted for 80.77% of all outbreaks. However, crude completeness ranged from 46.06% to 60.19% across the 103 variables assessed. Variables with the lowest crude completeness (ranging 3.32−6.98%) included pathogen, specimen etiological testing, and secondary transmission traceback information. Variables with low (<35%) average monthly completeness during eFORS increased by 0.33−0.40%/month after transitioning to NORS, most likely due to the expansion of surveillance capacity and coverage within the new reporting system. Examining completeness metrics in outbreak surveillance systems provides essential information on the availability of data for public reuse. These metadata offer important insights for public health statisticians and modelers to precisely monitor and track the geographic spread, event duration, and illness intensity of foodborne outbreaks.
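Monthly completeness of a single variable, as tracked in the abstract, could be computed along these lines. This is a sketch only: it assumes completeness is the percentage of outbreaks in a given month whose value for the variable is not blank, and the record layout (`year`, `month`, and variable keys) is hypothetical rather than the eFORS or NORS schema.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping, Tuple


def monthly_completeness(
    outbreaks: Iterable[Mapping[str, Any]], variable: str
) -> Dict[Tuple[int, int], float]:
    """Map (year, month) to the percentage of outbreaks with a non-blank value."""
    totals: Dict[Tuple[int, int], int] = defaultdict(int)
    filled: Dict[Tuple[int, int], int] = defaultdict(int)
    for outbreak in outbreaks:
        key = (outbreak["year"], outbreak["month"])
        totals[key] += 1
        value = outbreak.get(variable)
        if value is not None and str(value).strip() != "":
            filled[key] += 1
    return {key: 100.0 * filled[key] / totals[key] for key in sorted(totals)}


# Hypothetical outbreak records, for illustration only.
reports = [
    {"year": 2008, "month": 12, "genus": "Salmonella"},
    {"year": 2008, "month": 12, "genus": ""},
    {"year": 2009, "month": 1, "genus": "Norovirus"},
    {"year": 2009, "month": 1, "genus": "Clostridium"},
]
print(monthly_completeness(reports, "genus"))
```

The resulting monthly series is the kind of input a segmented time series analysis would take to compare the two reporting periods.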


Subject(s)
Foodborne Diseases , Norovirus , Centers for Disease Control and Prevention, U.S. , Disease Outbreaks , Foodborne Diseases/epidemiology , Foodborne Diseases/etiology , Humans , Population Surveillance , United States/epidemiology
16.
J Am Med Inform Assoc ; 29(7): 1225-1232, 2022 06 14.
Article in English | MEDLINE | ID: mdl-35357470

ABSTRACT

BACKGROUND: Electronic health record (EHR) discontinuity, that is, receiving care outside of a given EHR system, can lead to substantial information bias. We aimed to determine whether a previously described EHR-continuity prediction model can reduce the misclassification of 4 commonly used risk scores in pharmacoepidemiology. METHODS: The study cohort consisted of patients aged ≥ 65 years identified in 2 US EHR systems linked with Medicare claims data from 2007 to 2017. We calculated 4 risk scores, CHA2DS2-VASc, HAS-BLED, combined comorbidity score (CCS), and claims-based frailty index (CFI), based on information recorded in the 365 days before cohort entry, and assessed their misclassification by comparing score values based on EHR data alone versus the linked EHR-claims data. CHA2DS2-VASc and HAS-BLED were assessed in atrial fibrillation (AF) patients, whereas CCS and CFI were assessed in the general population. RESULTS: Our study cohort included 204 014 patients (26 537 with nonvalvular AF) in system 1 and 115 726 patients (15 529 with nonvalvular AF) in system 2. Comparing the low versus high predicted EHR continuity in system 1, the proportion of patients with misclassification of ≥2 categories improved from 55% to 16% for CHA2DS2-VASc, from 55% to 12% for HAS-BLED, from 37% to 16% for CCS, and from 10% to 2% for CFI. A similar pattern was found in system 2. CONCLUSIONS: Using a previously described prediction model to identify patients with high EHR continuity may significantly reduce misclassification for the commonly used risk scores in EHR-based comparative studies.


Subject(s)
Atrial Fibrillation , Stroke , Aged , Anticoagulants , Atrial Fibrillation/diagnosis , Electronic Health Records , Humans , Medicare , Risk Assessment , Risk Factors , United States
17.
Vaccine ; 40(5): 752-756, 2022 01 31.
Article in English | MEDLINE | ID: mdl-34980508

ABSTRACT

BACKGROUND: The Vaccine Safety Datalink (VSD) uses vaccination data from electronic health records (EHR) at eight integrated health systems to monitor vaccine safety. Accurate capture of data from vaccines administered outside of the health system is critical for vaccine safety research, especially for COVID-19 vaccines, where many are administered in non-traditional settings. However, timely access to data from Immunization Information Systems (IIS) and their inclusion in VSD safety assessments are not well understood. METHODS: We surveyed the eight data-contributing VSD sites to assess: 1) status of sending data to IIS; 2) status of receiving data from IIS; and 3) integration of IIS data into the site EHR. Sites reported separately for COVID-19 vaccination to capture any differences in capacity to receive and integrate data on COVID-19 vaccines versus other vaccines. RESULTS: All VSD sites send data to and receive data from their state IIS. All eight sites (100%) routinely integrate IIS data for COVID-19 vaccines into VSD research studies. Six sites (75%) also routinely integrate all other vaccination data; two sites integrate data from IIS following a reconciliation process, which can result in delays to integration into VSD datasets. CONCLUSIONS: COVID-19 vaccines are being administered in a variety of non-traditional settings, where IIS are commonly used as centralized reporting systems. All eight VSD sites receive and integrate COVID-19 vaccine data from IIS, which positions the VSD well for conducting quality assessments of vaccine safety. Efforts to improve the timely receipt of all vaccination data will improve capacity to conduct vaccine safety assessments within the VSD.


Subject(s)
COVID-19 , Vaccines , COVID-19 Vaccines , Humans , Immunization , Information Systems , SARS-CoV-2 , United States , Vaccination/adverse effects , Vaccines/adverse effects
18.
Int J Health Policy Manag ; 11(7): 937-946, 2022 07 01.
Article in English | MEDLINE | ID: mdl-33327687

ABSTRACT

BACKGROUND: During 2012-2015, the Federal Government of Nigeria launched the Subsidy Reinvestment and Empowerment Programme (SURE-P), a health system strengthening (HSS) programme with a Maternal and Child Health component (SURE-P/MCH), which was monitored using the Health Management Information Systems (HMIS) data reporting tools. Good quality data is essential for health policy and planning decisions, yet little is known about whether and how broad health systems strengthening programmes affect the quality of data. This paper explores the effects of the SURE-P/MCH on completeness of MCH data in the National HMIS. METHODS: This mixed-methods study was undertaken in Anambra state, southeast Nigeria. A standardized proforma was used to collect facility-level data from the facility registers on MCH services to assess the completeness of data from 2 intervention clusters and 1 control cluster. The facility data was collected to cover the periods before, during, and after the SURE-P intervention activities. Qualitative in-depth interviews were conducted with purposefully identified health facility workers to identify their views and experiences of changes in data quality throughout these 3 periods. RESULTS: Quantitative analysis of the facility data showed that data completeness improved substantially, starting before SURE-P and continuing during SURE-P, but across all clusters (ie, including the control). Health workers also felt that data completeness improved during the SURE-P but declined with the cessation of the programme. We also found that challenges to data completeness depend on many factors, including a high burden on providers for data collection, the many variables to be filled in the data collection tools, and a lack of health worker incentives. CONCLUSION: Quantitative analysis showed improved data completeness, and health workers believed the SURE-P/MCH had contributed to the improvement. The functioning of national HMIS is inevitably linked with other health systems components. While health systems strengthening programmes have great potential for improved overall systems performance, a more granular understanding of their implications for specific components, such as the resultant quality of HMIS data, is needed.


Subject(s)
Child Health Services , Health Information Systems , Management Information Systems , Maternal Health Services , Child , Humans , Female , Pregnancy , Nigeria , Family
19.
IUCrJ ; 8(Pt 6): 855-856, 2021 Nov 01.
Article in English | MEDLINE | ID: mdl-34804538

ABSTRACT

Tchon & Makal [IUCrJ (2021), 8, 1006-1017] use numerical simulations to explore the dependence of data completeness on crystal orientation, X-ray energy and diamond anvil cell geometry for high-pressure diffraction experiments. Their completeness heat maps for different Laue classes can be used to guide optimization of high-pressure single-crystal diffraction experiments.

20.
IUCrJ ; 8(Pt 6): 1006-1017, 2021 Nov 01.
Article in English | MEDLINE | ID: mdl-34804552

ABSTRACT

Sufficiently high completeness of diffraction data is necessary to correctly determine the space group, observe solid-state structural transformations or investigate charge density distribution under pressure. Regrettably, experiments performed at high pressure in a diamond anvil cell (DAC) yield inherently incomplete datasets. The present work systematizes the combined influence of radiation wavelength, DAC opening angle and sample orientation in a DAC on the completeness of diffraction data collected in a single-crystal high-pressure (HP) experiment with the help of dedicated software. In particular, the impact of the sample orientation on the achievable data completeness is quantified and proved to be substantial. Graphical guides for estimating the most beneficial sample orientation depending on the sample Laue class and assuming a few commonly used experimental setups are proposed. The usefulness of these guides has been tested in the case of luminescent 1,3-diacetylpyrene, suspected to undergo transitions from the α phase (Pnma) to the γ phase (Pn2₁a) and δ phase (P112₁/a) under pressure. Effective sample orientation has ensured over 90% coverage even for the monoclinic system and enabled unrestrained structure refinements and access to complete systematic extinction patterns.
