ABSTRACT
CONTEXT: Occupation and industry are basic data elements that, when collected during public health investigations, can be key to understanding patterns of disease transmission and developing effective prevention measures. OBJECTIVE: To assess the completeness and quality of occupation and industry data among select notifiable conditions in Washington and discuss potential improvements to current data collection efforts. DESIGN: We evaluated occupation and industry data, collected by local health departments during routine case investigations, for 11 notifiable conditions, selected for inclusion based on an established or plausible link to occupational exposure. SETTING AND PARTICIPANTS: Confirmed cases of select notifiable conditions among Washington residents aged 16 to 64 years, for years 2019-2021. MAIN OUTCOME MEASURES: We calculated the percentage of cases among working-age adults reported as employed, the percentage with occupation and industry data collected, and the percentage assigned standard occupation and industry codes. We identified the most common responses for occupation and industry and challenges of assigning codes to those responses. RESULTS: Among the 11 conditions evaluated, one-third of cases aged 16 to 64 years were reported as employed. Among the cases reported as employed, 91.5% reported occupation data and 30.5% reported industry data. "Self-employed" was among the top responses for occupation, a response that does not describe a specific job and could not be assigned an occupation code. In the absence of additional information, 4 of the most common responses for industry could not be coded: "health care," "technology," "tech," and "food." CONCLUSION: Routine collection of informative occupation and industry data among working-age adults is largely absent from case investigations in Washington. Methods of data collection that improve quality while minimizing the burden of collection should be pursued. Suggestions for improving data quality are discussed.
Subjects
Data Accuracy, Industries, Adult, Humans, Washington/epidemiology, Occupations, Data Collection
ABSTRACT
Correctly designed flow cytometry (virometry) assays allow accurate detection and enumeration of viruses in water. However, rigorous controls and calibrators are needed to obtain quality data. In the absence of proper controls, the use of fluorescent dyes for virus enumeration can produce false-positive signals and lead to incorrect estimates of total virus counts by misreporting colloidal particles as virions. Here we describe a protocol that addresses the problems that can confound virometry data accuracy.
Subjects
Bacteriophages, Biological Assay, Data Accuracy, Flow Cytometry, Fluorescent Dyes
ABSTRACT
BACKGROUND: In the era of healthcare digital transformation, using electronic health record (EHR) data to generate endpoint estimates for active monitoring is highly desirable in chronic disease management. However, traditional predictive modeling strategies that rely on well-curated data sets can have limited real-world implementation potential due to the many data quality issues in EHR data. METHODS: We propose a novel predictive modeling approach, GRU-D-Weibull, which models the Weibull distribution using gated recurrent units with decay (GRU-D), for real-time individualized endpoint prediction and population-level risk management with EHR data. EXPERIMENTS: We systematically evaluated performance and demonstrated the real-world implementability of the proposed approach through individual-level endpoint prediction in a cohort of patients with chronic kidney disease stage 4 (CKD4). A total of 536 features, including ICD/CPT codes, medications, laboratory tests, vital measurements, and demographics, were retrieved for 6879 CKD4 patients. Performance metrics, including C-index, L1-loss, Parkes' error, and predicted survival probability at the time of event, were compared between GRU-D-Weibull and alternative approaches, including an accelerated failure time model (AFT), XGBoost-based AFT (XGB(AFT)), random survival forest (RSF), and Nnet-survival. Both in-process and post-process calibration of the GRU-D-Weibull-generated survival probabilities were tested. RESULTS: GRU-D-Weibull achieved a C-index of ~0.7 at the index date, increasing to ~0.77 at 4.3 years of follow-up, comparable to that of RSF. GRU-D-Weibull achieved an absolute L1-loss of ~1.1 years (SD ≈ 0.95) at the CKD4 index date and a minimum of ~0.45 years (SD ≈ 0.3) at 4 years of follow-up, compared with the second-ranked RSF at ~1.4 years (SD ≈ 1.1) at the index date and ~0.64 years (SD ≈ 0.26) at 4 years; both significantly outperformed the remaining approaches. GRU-D-Weibull constrained the predicted survival probability at the time of event to a smaller and more stable range than competing models throughout follow-up. Significant correlations were observed between prediction error and the missing proportions of all major categories of input features at the index date (Corr ~0.1 to ~0.3), which faded within 1 year after the index date as more data became available. Through post-training recalibration, we achieved close alignment between predicted and observed survival probabilities across multiple prediction horizons at different time points during follow-up. CONCLUSION: GRU-D-Weibull shows advantages over competing methods in handling the missingness commonly encountered in EHR data and provides both probability and point estimates for diverse prediction horizons during follow-up. The experiments highlight the potential of GRU-D-Weibull as a suitable candidate for individualized endpoint risk management, using real-time clinical data to generate endpoint estimates for monitoring. Additional research is warranted to evaluate the influence of different data quality aspects on prediction performance. Furthermore, collaboration with clinicians is essential to explore the integration of this approach into clinical workflows and to evaluate its effects on decision-making processes and patient outcomes.
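To make the modelling idea concrete, here is a minimal sketch (not the authors' implementation) of a recurrent network that emits Weibull shape and scale parameters per visit and is trained with a censoring-aware negative log-likelihood. GRU-D's decay mechanism is simplified to a plain GRU, and all shapes, feature counts, and data are illustrative.

import torch
import torch.nn as nn

class RNNWeibull(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)   # -> log(shape k), log(scale lam)

    def forward(self, x):                  # x: (batch, time, n_features)
        h, _ = self.rnn(x)
        k, lam = self.head(h).chunk(2, dim=-1)
        return k.exp(), lam.exp()          # exponentiate to keep parameters positive

def weibull_nll(k, lam, t, event):
    """Negative log-likelihood; event=1 means observed, 0 means right-censored."""
    log_f = (torch.log(k) - torch.log(lam)
             + (k - 1) * (torch.log(t) - torch.log(lam)) - (t / lam) ** k)
    log_s = -((t / lam) ** k)              # log of the Weibull survival function
    return -(event * log_f + (1 - event) * log_s).mean()

# Toy usage: 8 patients, 5 visits, 10 features; per-visit time-to-event targets.
model = RNNWeibull(n_features=10)
x = torch.rand(8, 5, 10)
t = torch.rand(8, 5, 1) * 4 + 0.1          # years until endpoint or censoring
event = torch.randint(0, 2, (8, 5, 1)).float()
k, lam = model(x)
loss = weibull_nll(k, lam, t, event)
loss.backward()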
Subjects
Current Procedural Terminology, Data Accuracy, Humans, International Classification of Diseases, Probability, Random Forest
ABSTRACT
Underestimation of the pertussis burden has impeded understanding of transmission and prevents effective policy and prevention measures from being prioritized and enacted. Capture-recapture analyses can improve burden estimates; however, uncertainty remains about incorporating health administrative data because of accuracy limitations. The aim of this study was to explore the impact of pertussis case definitions and data accuracy on capture-recapture estimates. We used a dataset from March 7, 2010 to December 31, 2017 comprising pertussis case report, laboratory, and health administrative data. We compared Chao capture-recapture abundance estimates using prevalence, incidence, and adjusted false-positive case definitions; the latter was developed by removing the proportion of false-positive physician-billing-code-only case episodes after validation. We calculated sensitivity by dividing the number of observed cases by the estimated abundance. Abundance estimates demonstrated that a high proportion of cases were missed by all sources. Under the primary analysis, the highest sensitivity of 78.5% (95% CI 76.2-80.9%) for those less than one year of age was obtained using all sources after adjusting for false positives; sensitivity dropped to 43.1% (95% CI 42.4-43.8%) for those one year of age or older. Most code-only episodes were false positives (91.0%), leading to considerably lower abundance estimates and improved laboratory testing and case report sensitivity under this definition. Accuracy limitations can be accounted for in capture-recapture analyses using different case definitions and adjustment; the latter enhanced the validity of estimates, furthering the utility of capture-recapture methods in epidemiological research. The findings demonstrate that all sources consistently fail to detect pertussis cases and that detection differs by age, suggesting ascertainment and testing bias. The results demonstrate the value of incorporating real-time health administrative data into public health surveillance if accuracy limitations can be addressed.
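For readers unfamiliar with the method, the following sketch shows the standard Chao lower-bound abundance estimator and the sensitivity calculation described above; the counts are hypothetical and this is not the study's code.

def chao_abundance(s_obs: int, f1: int, f2: int) -> float:
    """Chao lower-bound estimate of total cases (observed plus unobserved).

    f1 = cases captured by exactly one data source,
    f2 = cases captured by exactly two data sources.
    """
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    return s_obs + f1 * (f1 - 1) / 2.0     # bias-corrected form when f2 == 0

def sensitivity(observed: int, abundance: float) -> float:
    return observed / abundance

n_hat = chao_abundance(s_obs=1200, f1=700, f2=300)   # hypothetical counts
print(f"estimated burden {n_hat:.0f}, sensitivity {sensitivity(1200, n_hat):.1%}")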
Subjects
Whooping Cough, Humans, Ontario/epidemiology, Whooping Cough/epidemiology, Whooping Cough/prevention & control, Public Health Surveillance, Data Accuracy, Prevalence
ABSTRACT
BACKGROUND: Surveys of hospitalized patients are important for research and for learning about unobservable medical issues (eg, mental health, quality of life, and symptoms), but there has been little work examining survey data quality in this population, whose capacity to respond to survey items may differ from the general population's. OBJECTIVE: The aim of this study is to determine what factors drive response rates, survey drop-offs, and missing data in surveys of hospitalized patients. METHODS: Cross-sectional surveys were distributed on an inpatient tablet to patients in a large, midwestern US hospital. Three versions were tested: 1 with 174 items and 2 with 111 items; one 111-item version had missing-item reminders that prompted participants when they did not answer items. Response rate, drop-off rate (abandoning the survey before completion), and item missingness (skipping items) were examined to assess data quality. Chi-square tests, Kaplan-Meier survival curves, and distribution charts were used to compare data quality among survey versions. Response duration was computed for each version. RESULTS: Overall, 2981 patients responded. Response rate did not differ between the 174- and 111-item versions (81.7% vs 83%, P=.53). Drop-off was significantly reduced when the survey was shortened (65.7% vs 20.2% of participants dropped off, P<.001). Approximately one-quarter of participants dropped off by item 120, and over half dropped off by item 158. The percentage of participants with missing data decreased substantially when missing-item reminders were added (77.2% vs 31.7% of participants, P<.001). The mean percentage of items with missing data was reduced in the shorter survey (40.7% vs 20.3% of items missing); with missing-item reminders, it was further reduced (20.3% vs 11.7% of items missing). Across versions, for the median participant, each item added 24.6 seconds to a survey's duration. CONCLUSIONS: Hospitalized patients may have a higher tolerance for longer surveys than the general population, but surveys given to hospitalized patients should have a maximum of 120 items to ensure high rates of completion. Missing-item prompts should be used to reduce missing data. Future research should examine generalizability to nonhospitalized individuals.
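The drop-off analysis can be reproduced in spirit by treating the item number at which a respondent abandons the survey as a time-to-event outcome. The sketch below uses the lifelines library with fabricated data, purely as an illustration of the Kaplan-Meier approach named above.

import numpy as np
from lifelines import KaplanMeierFitter

rng = np.random.default_rng(1)
n_items = 111
drop_item = rng.integers(10, 160, 300)    # hypothetical drop-off points per respondent
dropped = drop_item <= n_items            # respondents who finished are censored
duration = np.minimum(drop_item, n_items)

kmf = KaplanMeierFitter()
kmf.fit(duration, event_observed=dropped)
print(kmf.survival_function_.tail())      # share still responding at each item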
Subjects
Inpatients, Quality of Life, Humans, Cross-Sectional Studies, Data Accuracy, Electronics
ABSTRACT
The process of peer review has been the gold standard for evaluating medical science, but the recent COVID-19 pandemic, new methods of communication, larger volumes of research, and an evolving publication landscape have placed significant pressures on this system. A task force convened by the American College of Cardiology identified the 5 most significant controversies associated with the current peer-review process: the effect of preprints, reviewer blinding, reviewer selection, reviewer incentivization, and publication of peer-reviewer comments. Although specific solutions to these issues will vary, regardless of how scientific communication evolves, peer review must remain an essential process for ensuring scientific integrity, timely dissemination of information, and better patient care. In medicine, the peer-review process is crucial because harm can occur if poor-quality data or incorrect conclusions are published. With the dramatic increase in scientific publications and new methods of communication, high-quality peer review is more important now than ever.
Subjects
Medicine, Pandemics, Humans, Peer Review/methods, Communication, Data Accuracy, Peer Review, Research
ABSTRACT
Increasingly, applied social scientists and clinicians recognize the value of engaging transgender and gender-diverse (TGD) people, particularly TGD individuals with lived experience as care recipients (peers), to inform the provision of gender-affirming care. Despite this trend, few researchers have systematically examined how this group can contribute to and enhance the development and delivery of interventions intended to affirm gender diversity. In this article, we address limitations in the literature by drawing on a secondary analysis of qualitative data (originally collected to examine the peer support experiences of TGD individuals) to explore the potential that TGD peers hold for elevating gender-affirming care. The study was informed methodologically by an abductive approach to grounded theory, and conceptually by critical resilience and intersectional scholarship. Data collection involved virtual, semi-structured interviews with 35 TGD individuals in two Canadian cities who indicated having experiences of seeking, receiving, and/or providing peer support. Data analysis comprised an iterative, abductive process of cross-referencing participant accounts with relevant scholarship to arrive at an account of how TGD peers may contribute to the growth of gender-affirming care. Our findings suggest, broadly, that TGD peers may enhance gender-affirming care by: (1) validating a growing diversity of embodiments and experiences in healthcare decision-making, (2) nurturing and diversifying relevant networks of safety, community support, and advocacy outside formal systems of care, and (3) strengthening possibilities for resisting and transforming existing healthcare systems. After outlining these findings, we briefly consider the implications of our analysis and leverage our inferences to substantiate the notion of community-driven gender-affirming care, meaning care that is intentional in its incorporation of relevant community stakeholders to shape governance and service provision. We conclude with reflections on the promise of community-driven care at a time of heightened volatility across systems serving TGD populations.
Subjects
Transgender Persons, Humans, Canada, Cities, Data Accuracy, Data Analysis
ABSTRACT
The US National Library of Medicine has created and maintained the PubMed® database, a collection of over 33.8 million records containing citations and abstracts from the biomedical and life sciences literature. This database is an important resource for researchers and information service providers alike. As part of our work on creating an author graph for coronaviruses, we encountered several data quality issues with records from a curated subset of the PubMed database called MEDLINE. We provide a data quality assessment for records selected from the MEDLINE database and report on several issues, ranging from parsing problems (e.g. character encodings and schema definition weaknesses) to low scores for identifiers against several data quality metrics (e.g. completeness, validity and uniqueness). Database URL: https://pubmed.ncbi.nlm.nih.gov.
Subjects
Data Accuracy, United States, MEDLINE, PubMed, Databases, Factual, National Library of Medicine (U.S.)
ABSTRACT
BACKGROUND: Although gene expression data play significant roles in biological and medical studies, their applications are hampered by the difficulty and high expense of gathering them through biological experiments. Generating high-quality gene expression data with computational methods is therefore an urgent problem. WGAN-GP, a generative adversarial network-based method, has been successfully applied to augmenting gene expression data. However, mode collapse or overfitting may occur with small training samples because only one discriminator is used in the method. RESULTS: In this study, an improved data augmentation approach, MDWGAN-GP, a generative adversarial network model with multiple discriminators, is proposed. In addition, a novel method is devised for enriching training samples based on a linear graph convolutional network. Extensive experiments were conducted on real biological data. CONCLUSIONS: The experimental results demonstrate that, compared with other state-of-the-art methods, MDWGAN-GP produces higher-quality generated gene expression data in most cases.
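As a rough illustration of the core mechanism behind MDWGAN-GP, the sketch below shows a WGAN-GP gradient penalty averaged over multiple critics. The paper's actual architecture and graph-based sample enrichment are not reproduced here; critic definitions, shapes, and data are assumptions for the demo.

import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    # Penalize deviation of the critic's gradient norm from 1 along
    # random interpolations between real and generated samples.
    eps = torch.rand(real.size(0), 1, device=real.device)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grad = torch.autograd.grad(critic(x_hat).sum(), x_hat, create_graph=True)[0]
    return lambda_gp * ((grad.norm(2, dim=1) - 1) ** 2).mean()

def critics_loss(critics, real, fake):
    fake = fake.detach()   # the critics' update must not backprop into the generator
    losses = [c(fake).mean() - c(real).mean() + gradient_penalty(c, real, fake)
              for c in critics]
    return torch.stack(losses).mean()   # average the WGAN-GP loss over all critics

# Tiny smoke test with linear critics and random "expression profiles".
critics = [torch.nn.Linear(50, 1) for _ in range(3)]
real, fake = torch.randn(16, 50), torch.randn(16, 50)
print(critics_loss(critics, real, fake).item())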
Subjects
Data Accuracy, Gene Expression
ABSTRACT
Evidence synthesis, embedded within a systematic review of the literature, is a well-established approach for collating and combining all the relevant information on a particular research question. A robust synthesis can establish the evidence base, which underpins best practice guidance. Such endeavours are frequently used by policymakers and practitioners to inform their decision making. Traditionally, an evidence synthesis of interventions consisted of a meta-analysis of quantitative data comparing two treatment alternatives addressing a specific and focussed clinical question. However, as the methods in the field have evolved, especially in response to the increasingly complex healthcare questions, more advanced evidence synthesis techniques have been developed. These can deal with extended data structures considering more than two treatment alternatives (network meta-analysis) and complex multicomponent interventions. The array of questions capable of being answered has also increased with specific approaches being developed for different evidence types including diagnostic, prognostic and qualitative data. Furthermore, driven by a desire for increasingly up-to-date evidence summaries, living systematic reviews have emerged. All of these methods can potentially have a role in informing older adult healthcare decisions. The aim of this review is to increase awareness and uptake of the increasingly comprehensive array of newer synthesis methods available and highlight their utility for answering clinically relevant questions in the context of older adult research, giving examples of where such techniques have already been effectively applied within the field. Their strengths and limitations are discussed, and we suggest user-friendly software options to implement the methods described.
Subjects
Data Accuracy, Humans, Aged, Network Meta-Analysis
ABSTRACT
Numerous studies make extensive use of healthcare data, including human materials and clinical information, and acknowledge its significance. However, limitations in data collection methods can impact the quality of healthcare data obtained from multiple institutions. To secure high-quality data related to human materials, research focused on data quality is necessary. This study validated the quality of data collected in 2020 from the 16 institutions constituting the Korea Biobank Network using 104 validation rules. The validation rules were developed based on the DQ4HEALTH model and were divided into four dimensions: completeness, validity, accuracy, and uniqueness. The Korea Biobank Network collects and manages human materials and clinical information from multiple biobanks and is in the process of developing a common data model for data integration. The data quality verification revealed an error rate of 0.74%. Furthermore, the data from each institution were analyzed to examine the relationship between an institution's characteristics and its error count. A chi-square test indicated that error counts were not independent of institution. To characterize this relationship, a correlation analysis was conducted; the results revealed the factors with high correlation coefficients to the error count. The findings suggest that data quality was affected by biases in the evaluation system, including each institution's IT environment and infrastructure and the number of samples collected. These results highlight the need to consider the scalability of research quality when evaluating clinical epidemiological information linked to human materials in future data quality validation studies.
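A toy version of the four validation dimensions might look like the following; the field names, rules, and records are hypothetical and do not reflect the Korea Biobank Network's actual schema or rule set.

records = [
    {"id": "S001", "sex": "F", "age": 54, "sample_vol_ml": 2.5},
    {"id": "S002", "sex": "X", "age": None, "sample_vol_ml": -1.0},
    {"id": "S001", "sex": "M", "age": 41, "sample_vol_ml": 3.0},
]

errors = 0
seen_ids = set()
for r in records:
    errors += sum(v is None for v in r.values())          # completeness: no missing fields
    errors += int(r["sex"] not in {"M", "F"})             # validity: allowed code list
    vol = r["sample_vol_ml"]
    errors += int(vol is None or vol <= 0)                # accuracy: plausible range
    errors += int(r["id"] in seen_ids)                    # uniqueness: no duplicate IDs
    seen_ids.add(r["id"])

total_checks = 4 * len(records)
print(f"error rate: {errors / total_checks:.2%}")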
Subjects
Biological Specimen Banks, Data Accuracy, Humans, Specimen Handling/methods, Delivery of Health Care, Republic of Korea
ABSTRACT
Monitoring of clinical trials is critical to the protection of human subjects and the conduct of high-quality research. Even though the adoption of risk-based monitoring (RBM) has been suggested for many years, the RBM approach has been less widespread than expected. Centralized monitoring is one of the RBM pillars, together with remote site-monitoring visits, reduced Source Data Verification (SDV), and Source Document Review (SDR). The COVID-19 pandemic disrupted the conduct of clinical trials, as on-site monitoring visits were suspended. In this context, the transition to RBM by all actors involved in clinical trials has been encouraged. To ensure the highest quality of data within a COVID-19 clinical trial, a centralized monitoring tool, together with Case Report Forms (CRFs) and synchronous automated routines, was developed at the clinical research platform of Fiocruz, Brazilian Ministry of Health. This paper describes how these tools were developed, along with their features, advantages, and limitations. The software code and the CRFs are available at the Fiocruz Data Repository for Research (Arca Dados), reaffirming Fiocruz's commitment to Open Science practices.
Subjects
Data Accuracy, Pandemics, Humans, Pandemics/prevention & control, Software, Brazil
ABSTRACT
BACKGROUND: Missingness in health care data poses significant challenges in the development and implementation of artificial intelligence (AI) and machine learning solutions. Identifying and addressing these challenges is critical to ensuring the continued growth and accuracy of these models as well as their equitable and effective use in health care settings. OBJECTIVE: This study aims to explore the challenges, opportunities, and potential solutions related to missingness in health care data for AI applications through the conduct of a digital conference and thematic analysis of the conference proceedings. METHODS: A digital conference was held in September 2022, attracting 861 registered participants, with 164 (19%) attending the live event. The conference featured presentations and panel discussions by experts in AI, machine learning, and health care. Transcripts of the event were analyzed using the stepwise framework of Braun and Clarke to identify key themes related to missingness in health care data. RESULTS: Three principal themes emerged from the analysis: data quality and bias, human input in model development, and trust and privacy. Topics included the accuracy of predictive models, lack of inclusion of underrepresented communities, partnership with physicians and other populations, challenges with sensitive health care data, and fostering trust with patients and the health care community. CONCLUSIONS: Addressing the challenges of data quality, human input, and trust is vital when devising and using machine learning algorithms in health care. Recommendations include expanding data collection efforts to reduce gaps and biases, involving medical professionals in the development and implementation of AI models, and developing clear ethical guidelines to safeguard patient privacy. Further research and ongoing discussions are needed to ensure these conclusions remain relevant as health care and AI continue to evolve.
Subjects
Artificial Intelligence, Machine Learning, Humans, Algorithms, Data Accuracy, Data Collection
ABSTRACT
OBJECTIVES: Although ChatGPT was not developed for medical use, there is growing interest in its use in medical fields. Understanding its capabilities and the precautions for its use in the medical field is an urgent matter. We hypothesized that the amount of information published in a given medical field would be proportional to the amount of training ChatGPT receives for that field, and hence to the accuracy of its answers. STUDY DESIGN: A non-clinical experimental study. METHODS: We administered the Japanese National Medical Examination to GPT-3.5 and GPT-4 to examine the accuracy and consistency of their responses. We counted the total number of documents in the Web of Science Core Collection per medical field and assessed the relationship with ChatGPT's accuracy. We also fit multivariable-adjusted models to investigate risk factors for incorrect answers. RESULTS: GPT-4 achieved an accuracy rate of 81.0% and a consistency rate of 88.8% on the exam, both improvements over GPT-3.5. A positive correlation was observed between the accuracy rate and the consistency rate (R = 0.51, P < 0.001). The number of documents per medical field was significantly correlated with the accuracy rate in that field (R = 0.44, P < 0.05), and relatively few publications was an independent risk factor for incorrect answers. CONCLUSIONS: Checking consistency may help identify incorrect answers when using ChatGPT. Users should be aware that the accuracy of ChatGPT's answers may decrease when it is asked about topics with limited published information, such as new drugs and diseases.
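The consistency check recommended in the conclusion could be operationalized roughly as below: ask the model the same question several times and flag answers whose modal response is weakly supported. Here, ask_model is a hypothetical stand-in for an LLM call, and the 0.8 threshold is arbitrary.

from collections import Counter

def ask_model(question: str, trial: int) -> str:
    return ["B", "B", "C", "B", "D"][trial % 5]    # canned replies for the demo

def consistency(question: str, n_trials: int = 5):
    answers = [ask_model(question, t) for t in range(n_trials)]
    top, count = Counter(answers).most_common(1)[0]
    return top, count / n_trials                   # modal answer and its support

answer, rate = consistency("Which drug is first-line for condition X?")
if rate < 0.8:   # low agreement across repeated runs suggests an unreliable answer
    print(f"low consistency ({rate:.0%}) - treat answer '{answer}' with caution")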
Subjects
Artificial Intelligence, Humans, Risk Factors, Data Accuracy
ABSTRACT
BACKGROUND: Good-quality data are key to quality health care. In 2017, WHO launched the Quality of Care Network (QCN) to reduce maternal and newborn mortality and stillbirths via learning and sharing networks. Guided by the principles of equity and dignity, the network members agreed to implement the programme in 2017-2021. OBJECTIVE: This paper explores how QCN has contributed to improving data quality and identifies factors influencing the quality of data in Ethiopia. METHODS: We conducted a qualitative study in selected QCN facilities in Ethiopia using key informant interviews and observation. We interviewed 40 people at the national, sub-national, and facility levels. Non-participant observations were carried out in four purposively selected health facilities, and we accessed monthly reports from 41 QCN learning facilities. A codebook was prepared following a deductive and inductive analytical approach; data were coded using NVivo 12 and thematically analysed. RESULTS: There was a general perception that QCN had improved health data documentation and use in the learning facilities, achieved through coaching, learning, and building on pre-existing initiatives. QCN also enhanced the data elements available by introducing a broader set of quality indicators. However, the perception of poor data quality persisted. Factors negatively affecting data quality included a lack of integration of QCN data within routine health system activities, the perception that QCN was a pilot, and a lack of inclusive engagement at different levels. Both individual and system capabilities needed to be strengthened. CONCLUSION: There is evidence of QCN's contribution to improving data awareness. However, a lack of inclusive engagement of actors, poor alignment, and limited skills for data collection and analysis continued to affect data quality and use. In the absence of new resources, integrating new data activities within existing routine health information systems emerged as the most important potential action for positive change.
Subjects
Data Accuracy, Trust, Infant, Newborn, Humans, Ethiopia, Quality of Health Care, Surveys and Questionnaires, Health Facilities
ABSTRACT
BACKGROUND: The COVID-19 pandemic has spread across the world. Achieving sufficient immunization coverage to end the global pandemic depends on acceptance of the COVID-19 vaccine, which has faced major challenges worldwide. In low-income and developing countries, only 22.7% of the population has received at least one dose of the COVID-19 vaccine, meaning that a large share of the population remains unvaccinated even where the vaccine is available. The aim of this study was to assess COVID-19 vaccine acceptance and its associated factors in Debre Berhan City, Ethiopia, in 2022. METHODS: A mixed-methods approach comprising both qualitative interviews and a quantitative survey was used among participants in Debre Berhan City. A multi-stage sampling technique was used to recruit the study participants, and in-depth interviews were used for the qualitative data. Data were collected with a face-to-face interview questionnaire from June 8 to July 8, 2022, entered using EpiData version 4.6, and analyzed using SPSS version 25. Variables with a p-value less than 0.25 in bivariable logistic regression analysis were entered into multivariable logistic regression analysis. Logistic regression was employed, and a p-value <0.05 was considered statistically significant. RESULTS: A total of 765 participants were included in the study, with a response rate of 97.08%. More than half (52.9%) of the respondents were willing to accept the COVID-19 vaccine. Contact with a COVID-19 patient (AOR = 3.98; 95% CI: 1.30-12.14), good knowledge of the COVID-19 vaccine (AOR = 4.63; 95% CI: 1.84-11.63), and a positive attitude toward the COVID-19 vaccine (AOR = 3.41; 95% CI: 1.34-8.69) were statistically significantly associated with COVID-19 vaccine acceptance. CONCLUSION AND RECOMMENDATION: The present study revealed that acceptance of the COVID-19 vaccine was 52.9%, and a significant proportion of participants were hesitant to receive the vaccine or refused to be vaccinated. Variables significantly associated with COVID-19 vaccine acceptance were contact with a COVID-19 patient, good knowledge of the COVID-19 vaccine, and a positive attitude toward it. Various stakeholders should apprise the public of the cause of the disease and the scientific development of the vaccine in order to enhance acceptance.
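The two-stage modelling strategy (bivariable screen at p < 0.25, then multivariable logistic regression reporting adjusted odds ratios) can be sketched as follows; the data and variable names are invented for illustration, and this Python/statsmodels version stands in for the study's SPSS workflow.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "accept": rng.integers(0, 2, 500),
    "contact_covid": rng.integers(0, 2, 500),
    "good_knowledge": rng.integers(0, 2, 500),
    "positive_attitude": rng.integers(0, 2, 500),
})

# Stage 1: keep predictors with a bivariable p-value below 0.25.
candidates = []
for var in ["contact_covid", "good_knowledge", "positive_attitude"]:
    fit = smf.logit(f"accept ~ {var}", data=df).fit(disp=0)
    if fit.pvalues[var] < 0.25:
        candidates.append(var)

# Stage 2: multivariable model; exponentiated coefficients are the AORs.
if candidates:
    final = smf.logit("accept ~ " + " + ".join(candidates), data=df).fit(disp=0)
    print(np.exp(final.params))       # adjusted odds ratios
    print(np.exp(final.conf_int()))   # 95% confidence intervals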
Subjects
COVID-19 Vaccines, COVID-19, Humans, COVID-19/epidemiology, COVID-19/prevention & control, Ethiopia/epidemiology, Pandemics, Data Accuracy
ABSTRACT
Large high-quality datasets are essential for building powerful artificial intelligence (AI) algorithms capable of supporting advances in cardiac clinical research. However, researchers working with electrocardiogram (ECG) signals struggle to access or build such datasets. The aim of the present work is to shed light on a potential solution to the lack of large, easily accessible ECG datasets. First, the main causes of this lack are identified and examined. Afterward, the potential and limitations of cardiac data generation via deep generative models (DGMs) are analyzed in depth. These promising algorithms have been found capable not only of generating large quantities of ECG signals but also of supporting data anonymization, simplifying data sharing while respecting patients' privacy. Their application could help research progress and cooperation in the name of open science. However, several aspects, such as standardized synthetic data quality evaluation and algorithm stability, need to be explored further.
Subjects
Artificial Intelligence, Electrocardiography, Humans, Heart, Algorithms, Data Accuracy
ABSTRACT
The long-term protection and restoration of aquatic resources depend on robust monitoring data, which in turn require systematic quality control and analysis tools. The MassWateR R package facilitates quality control, analysis, and data sharing for discrete surface water quality data collected by monitoring programs of varying size and technical capacity. The tools were developed to address regional needs for programs in Massachusetts, USA, but the principles and outputs are applicable to monitoring data collected anywhere. Users can create quality control reports, perform outlier analyses, and assess trends by season, date, and site for more than 40 parameters. Users can also prepare data for submission to the United States Environmental Protection Agency Water Quality Exchange, thereby sharing data with the largest water quality database in the United States. The automated and reproducible workflow offered by MassWateR is expected to increase the quantity and quality of publicly available data to support the management of aquatic resources.
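MassWateR itself is an R package; as a language-agnostic illustration of the kind of per-site outlier screen such QC tools perform, here is a minimal IQR-based check in Python (this does not reproduce MassWateR's API, and the site names and values are fabricated).

import pandas as pd

df = pd.DataFrame({
    "site": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "do_mgl": [8.1, 7.9, 8.3, 2.0, 9.0, 9.2, 8.8, 15.5],   # dissolved oxygen, mg/L
})

def flag_outliers(g: pd.DataFrame, col: str = "do_mgl") -> pd.DataFrame:
    # Flag values outside the 1.5 * IQR fences, computed per monitoring site.
    q1, q3 = g[col].quantile([0.25, 0.75])
    iqr = q3 - q1
    g = g.copy()
    g["outlier"] = (g[col] < q1 - 1.5 * iqr) | (g[col] > q3 + 1.5 * iqr)
    return g

print(df.groupby("site", group_keys=False).apply(flag_outliers))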
Subjects
Environmental Monitoring, Water Quality, United States, Databases, Factual, Data Accuracy, Quality Control
ABSTRACT
Objective: Systematic reviews and other evidence synthesis projects require systematic search methods. Search systems require several essential attributes to support systematic searching; however, many systems used in evidence synthesis fail to meet one or more of these requirements. I undertook a qualitative study to examine the effects of these limitations on systematic searching and how searchers select information sources for evidence synthesis projects. Methods: Qualitative data were collected from interviews with twelve systematic searchers. Data were analyzed using reflexive thematic analysis. Results: I used thematic analysis to identify two key themes relating to search systems: systems shape search processes, and systematic searching occurs within the information market. Many systems required for systematic reviews, in particular sources of unpublished studies, are not designed for systematic searching. Participants described various workarounds for the limitations they encounter in these systems. Economic factors influence searchers' selection of sources to search, as well as the degree to which vendors prioritize these users. Conclusion: Interviews with systematic searchers suggest priorities for improving search systems, and barriers to improvement that must be overcome. Vendors must understand the unique requirements of systematic searching and recognize systematic searchers as a distinct group of users. Better interfaces and improved functionality will result in more efficient evidence synthesis.
Subjects
Data Accuracy, Information Sources, Humans, Systematic Reviews as Topic, Qualitative Research
ABSTRACT
OBJECTIVE: The neurosurgical match is a challenging process for applicants and programs alike. Programs must narrow a wide field of applicants to interview and then determine how to rank them after limited interaction. To streamline this, programs commonly screen applicants using United States Medical Licensing Examination (USMLE) Step scores. However, this approach removes nuance from a consequential decision and exacerbates existing biases. The primary objective of this study was to demonstrate the feasibility of making minor modifications to the residency application process, as the authors have done at their institution, specifically by reducing the prominence of USMLE board scores and Alpha Omega Alpha (AΩA) status, both of which have been identified as bearing racial biases. METHODS: At the authors' institution, residents and attendings holistically reviewed applications with intentional redundancy, so that every file was reviewed by two individuals. Reviewers were blinded to applicants' photographs and test scores. On interview day, each applicant was evaluated for their strength in three domains: knowledge, commitment to neurosurgery, and integrity. For rank discussions, applicants were reviewed in the order of their domain scores, and USMLE scores were then unblinded. The authors' rank list was analyzed by regressing rank position on AΩA status, Step 1 score, Step 2 score, subinternship, and total interview score. RESULTS: No variable had a significant effect on the rank list except total interview score, for which a single-point increase corresponded to a 15-position improvement on the rank list, holding all other variables constant (p < 0.05). CONCLUSIONS: The goal of this holistic review and domain-based interview process is to mitigate bias by shifting the focus to selected core qualities in lieu of traditional metrics. Since implementation, the authors' final rank lists have closely reflected the total interview score but have not been significantly affected by board scores or AΩA status. This system allows for the removal of known sources of bias early in the process, with the aim of reducing potential downstream effects and ultimately promoting a final list that is more reflective of stated values.
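The rank-list regression described above could be sketched as follows with ordinary least squares; the data frame, column names, and values are fabricated for illustration and do not reproduce the authors' analysis.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 60
df = pd.DataFrame({
    "rank_position": rng.permutation(np.arange(1, n + 1)),  # 1 = top of the list
    "aoa": rng.integers(0, 2, n),
    "step1": rng.normal(245, 10, n).round(),
    "step2": rng.normal(250, 10, n).round(),
    "subinternship": rng.integers(0, 2, n),
    "interview_score": rng.normal(7, 1.2, n),
})

model = smf.ols(
    "rank_position ~ aoa + step1 + step2 + subinternship + interview_score",
    data=df,
).fit()
print(model.summary().tables[1])   # inspect which predictors move the rank list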