ABSTRACT
BACKGROUND: Pulmonary nodules represent a growing health care burden because of delayed diagnosis of malignant lesions and overtesting for benign processes. Clinical prediction models were developed to inform physician assessment of the pretest probability of nodule malignancy but have not been validated in a high-risk cohort of nodules for which biopsy was ultimately performed. RESEARCH QUESTION: Do guideline-recommended prediction models sufficiently discriminate between benign and malignant nodules when applied to cases referred for biopsy by navigational bronchoscopy? STUDY DESIGN AND METHODS: We assembled a prospective cohort of 322 indeterminate pulmonary nodules in 282 patients referred to a tertiary medical center for diagnostic navigational bronchoscopy between 2017 and 2019. We calculated the probability of malignancy for each nodule using the Brock model, the Mayo Clinic model, and the Veterans Affairs (VA) model. In the subset of 168 patients who also had PET-CT scans before biopsy, we additionally calculated the probability of malignancy using the Herder model. Model performance was evaluated by calculating the area under the receiver operating characteristic curve (AUC) for each model. RESULTS: The study cohort contained 185 malignant and 137 benign nodules (57% prevalence of malignancy). The malignant and benign nodules were similar in size, with median longest diameters of 15 mm (benign) and 16 mm (malignant). The Brock, Mayo Clinic, and VA models showed similar performance in the entire cohort (Brock AUC, 0.70; 95% CI, 0.64-0.76; Mayo Clinic AUC, 0.70; 95% CI, 0.64-0.76; VA AUC, 0.67; 95% CI, 0.62-0.74). For the 168 nodules with available PET-CT scans, the Herder model had an AUC of 0.77 (95% CI, 0.68-0.85). INTERPRETATION: Currently available clinical models provide insufficient discrimination between benign and malignant nodules in the common clinical scenario in which a patient is referred for biopsy, especially when PET-CT scan information is not available.
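A minimal sketch of how such a model's discrimination can be checked: the Mayo Clinic model's logistic form with its commonly cited coefficients (Swensen et al., 1997, treated here as an assumption), scored on an invented toy cohort rather than the study's data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def mayo_probability(age, smoker, cancer_history, diameter_mm, spiculation, upper_lobe):
    # Logistic model with the commonly cited Mayo coefficients (Swensen et al., 1997)
    x = (-6.8272 + 0.0391 * age + 0.7917 * smoker + 1.3388 * cancer_history
         + 0.1274 * diameter_mm + 1.0407 * spiculation + 0.7838 * upper_lobe)
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical nodules: age, ever-smoker, prior extrathoracic cancer (>5 y before),
# diameter (mm), spiculation, upper-lobe location, paired with biopsy-proven labels
nodules = np.array([[67, 1, 0, 16, 1, 1],
                    [55, 0, 0, 15, 0, 0],
                    [72, 1, 1, 22, 1, 0],
                    [48, 0, 0, 9, 0, 1]])
malignant = np.array([1, 0, 1, 0])

probs = np.array([mayo_probability(*row) for row in nodules])
print("Mayo model AUC on this toy cohort:", roc_auc_score(malignant, probs))
```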
ABSTRACT
In Pakistan, the assessment of road safety measures within road safety management systems is commonly seen as the weakest component. Accident prediction models are essential for road authorities, road designers, and road safety specialists: they facilitate the examination of safety concerns, the identification of safety improvements, and the projection of the potential impact of these modifications in terms of collision reduction. Against this background, the goal of this paper is to utilize the 2-tuple linguistic q-rung orthopair fuzzy set (2TLq-ROFS), a new decision tool well suited to handling uncertain or imprecise information in practical decision-making processes. To deal with multi-attribute group decision-making problems in road safety management, the paper proposes a new 2TLq-ROF method that combines the integrated determination of objective criteria weights (IDOCRIW) technique with the qualitative flexible multiple criteria (QUALIFLEX) decision analysis method, using a weighted power average (WPA) operator defined on 2TLq-ROF numbers. The IDOCRIW method is used to calculate the attribute weights, and the QUALIFLEX method is used to rank the options. To show the viability of the proposed approach, we perform a case study on the evaluation of accident prediction models in road safety management. Finally, experimental results and comparisons with existing methods illustrate the benefits of the suggested approach, showing that it is more practical than, and compatible with, existing approaches.
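QUALIFLEX ranks alternatives by scoring every candidate permutation against each criterion. A minimal crisp (non-fuzzy) sketch of that core idea follows, with a toy decision matrix and hypothetical weights standing in for the IDOCRIW output; the paper's 2TLq-ROF and WPA machinery is omitted.

```python
from itertools import permutations

# Toy decision matrix: three accident prediction models scored on three criteria
scores = {"M1": [0.7, 0.5, 0.9],
          "M2": [0.6, 0.8, 0.4],
          "M3": [0.9, 0.3, 0.6]}
weights = [0.5, 0.3, 0.2]  # hypothetical criteria weights (the paper derives these via IDOCRIW)

def concordance(order):
    """QUALIFLEX concordance: reward each pairwise ranking the data agrees with."""
    total = 0.0
    for i, a in enumerate(order):
        for b in order[i + 1:]:                       # a is ranked ahead of b
            for w, sa, sb in zip(weights, scores[a], scores[b]):
                total += w * ((sa > sb) - (sa < sb))  # +w, -w, or 0 per criterion
    return total

# Evaluate all permutations of the alternatives and keep the most concordant one
best = max(permutations(scores), key=concordance)
print("QUALIFLEX ranking:", " > ".join(best))
```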
ABSTRACT
Background: Colorectal cancer (CRC) incidence and mortality are increasing internationally. Endoscopy services are under significant pressure, with many overwhelmed. Faecal immunochemical testing (FIT) has been advocated to identify a high-risk population of symptomatic patients requiring definitive investigation by colonoscopy. Combining FIT with other factors in a risk prediction model could further improve performance in identifying those requiring investigation most urgently. We systematically reviewed the performance of models predicting risk of CRC and/or advanced colorectal polyps (ACP) in symptomatic patients, with a particular focus on models including FIT. Methods: The review protocol was published on PROSPERO (CRD42022314710). Searches were conducted from database inception to April 2023 in MEDLINE, EMBASE, the Cochrane libraries, SCOPUS, and CINAHL. Risk of bias of each study was assessed using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). A narrative synthesis based on the guidelines for Synthesis Without Meta-Analysis was performed due to study heterogeneity. Findings: We included 62 studies; 23 included FIT (n = 22) or guaiac faecal occult blood testing (n = 1) combined with one or more other variables. Twenty-one studies were conducted solely in primary care. Generally, prediction models including FIT consistently had good discriminatory ability for CRC/ACP (i.e. AUC >0.8) and performed better than models without FIT, although some models without FIT also performed well. However, many studies did not present calibration, and internal and external validation were limited. Two studies were rated as low risk of bias; neither model included FIT. Interpretation: Risk prediction models, both including and not including FIT, show promise for identifying those most at risk of colorectal neoplasia. Substantial limitations in the evidence remain, including heterogeneity, high risk of bias, and lack of external validation. Further evaluation in studies adhering to gold-standard methodology, in appropriate populations, is required before widespread adoption in clinical practice. Funding: National Institute for Health and Care Research (NIHR) Health Technology Assessment (HTA) Programme (project number 133852).
ABSTRACT
Background: Glycosylated hemoglobin (HbA1c) is recommended for diagnosing and monitoring type 2 diabetes (T2D). However, the monitoring frequency in real-world practice has not yet reached the frequency recommended in the guidelines. Developing machine learning models to screen for poor glycemic control in patients with T2D could optimize management and decrease medical service costs. Methods: This study was carried out on patients with T2D who were examined for HbA1c at the Sichuan Provincial People's Hospital from April 2018 to December 2019. Characteristics were extracted from interviews and electronic medical records. After data pre-processing, the data (with and without fasting blood glucose (FBG) values) were randomly divided into a training dataset and a test dataset at a ratio of 8:2. Four imputation methods, four feature screening methods, and six machine learning algorithms were used to optimize the data and develop models. Models were compared on the basis of predictive performance metrics, especially the model benefit (MB), which combines the confusion matrix with the economic burden associated with therapeutic inertia. The contributions of features were interpreted using SHapley Additive exPlanations (SHAP). Finally, we validated the sample size on the best model. Results: The study included 980 patients with T2D, of whom 513 (52.3%) were defined as positive (needing to undergo the HbA1c test). The models trained on the data including FBG showed better predictive performance than the models excluding the FBG value. The best of the 192 trained models used modified random forest as the imputation method, ElasticNet as the feature screening method, and the LightGBM algorithm; its MB, AUC, and AUPRC were 43,475.750 (¥), 0.972, 0.944, and 0.974, respectively. The FBG values, previous HbA1c values, having a rational and reasonable diet, health status scores, metformin manufacturer, measurement interval, EQ-5D scores, occupational status, and age were the most significant contributors to the prediction model. Conclusion: We found that MB could serve as an indicator of model prediction performance. The proposed model performed well in identifying patients with T2D who need to undergo the HbA1c test and could help improve individualized T2D management.
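A compact sketch of the general pipeline described here, on synthetic data: ElasticNet feature screening followed by a boosted-tree classifier and a cost-weighted "model benefit". sklearn's HistGradientBoostingClassifier stands in for LightGBM, and the costs are invented placeholders, not the study's figures.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNetCV
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score

# Synthetic stand-in for the clinical dataset (980 patients, mixed features)
X, y = make_classification(n_samples=980, n_features=30, n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)  # 8:2 ratio

# Feature screening: regress the 0/1 label with ElasticNet, keep non-zero coefficients
enet = ElasticNetCV(cv=5).fit(X_tr, y_tr)
keep = np.flatnonzero(enet.coef_)

# Gradient-boosted trees as a stand-in for LightGBM (same model family)
clf = HistGradientBoostingClassifier(random_state=0).fit(X_tr[:, keep], y_tr)
tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te[:, keep])).ravel()

# "Model benefit": confusion matrix weighted by hypothetical per-case costs (yuan);
# the actual cost structure used in the study differs
COST_MISSED, COST_TEST = 5000.0, 60.0
mb = tp * (COST_MISSED - COST_TEST) - fp * COST_TEST - fn * COST_MISSED
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te[:, keep])[:, 1]))
print("Model benefit (toy costs):", mb)
```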
ABSTRACT
A non-invasive risk assessment tool capable of stratifying coronary artery stenosis into high and low risk would reduce the number of patients who undergo invasive fractional flow reserve (FFR) measurement, the current gold-standard procedure for assessing coronary artery disease. Current statistics-based models that predict whether FFR is above or below the threshold for physiological significance rely entirely on anatomical parameters, such as percent diameter stenosis (%DS), resulting in models not accurate enough for clinical application. Coronary artery flow rate (CFR) was therefore added to an anatomy-only logistic regression model to quantify its added predictive value. Initial hypothesis testing on a cohort of 96 coronary artery segments with some degree of stenosis found a higher mean CFR in the group with low FFR (<0.8; µ = 2.37 ml/s) compared to the group with high FFR (>0.8; µ = 1.85 ml/s) (p-value = 0.046). Logistic regression using both %DS and CFR (AUC = 0.78) outperformed logistic regression models using only %DS (AUC = 0.71) or only CFR (AUC = 0.62). Including physiological parameters in addition to anatomical parameters is necessary to improve statistics-based models for classifying high versus low FFR.
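A sketch of the model comparison on synthetic data: logistic regression over %DS only, CFR only, and both, scored by AUC. The simulated relationship between the predictors and low FFR is assumed for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 96
pct_ds = rng.uniform(30, 90, n)   # percent diameter stenosis
cfr = rng.normal(2.0, 0.5, n)     # coronary flow rate, ml/s
# Assumed data-generating process: low FFR more likely with high %DS and high flow,
# consistent with the higher mean CFR reported in the low-FFR group
logit = 0.08 * (pct_ds - 60) + 1.2 * (cfr - 2.0)
low_ffr = rng.random(n) < 1 / (1 + np.exp(-logit))

X_both = np.column_stack([pct_ds, cfr])
for name, X in [("%DS only", pct_ds[:, None]), ("CFR only", cfr[:, None]), ("%DS + CFR", X_both)]:
    model = LogisticRegression().fit(X, low_ffr)
    print(name, "AUC:", round(roc_auc_score(low_ffr, model.predict_proba(X)[:, 1]), 2))
```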
ABSTRACT
Current colorectal cancer (CRC) screening recommendations take a "one-size-fits-all" approach using age as the major criterion to initiate screening. Precision screening that incorporates factors beyond age to risk stratify individuals could improve on current approaches and optimally use available resources, with benefits for patients, providers, and health care systems. Prediction models could identify high-risk groups who would benefit from more intensive screening, while low-risk groups could be recommended less intensive screening incorporating noninvasive screening modalities. In addition to age, prediction models incorporate well-established risk factors such as genetics (eg, family CRC history, germline, and polygenic risk scores), lifestyle (eg, smoking, alcohol, diet, and physical inactivity), sex, and race and ethnicity, among others. Although several risk prediction models have been validated, few have been systematically studied for risk-adapted population CRC screening. In order to envisage clinical implementation of precision screening in the future, it will be critical to develop reliable and accurate prediction models that apply to all individuals in a population; prospectively study risk-adapted CRC screening on the population level; garner acceptance from patients and providers; and assess feasibility, resources, cost, and cost-effectiveness of these new paradigms. This review evaluates the current state of risk prediction modeling and provides a roadmap for future implementation of precision CRC screening.
Subjects
Colorectal Neoplasms, Early Detection of Cancer, Humans, Colorectal Neoplasms/diagnosis, Colorectal Neoplasms/epidemiology, Colorectal Neoplasms/genetics, Risk Factors, Life Style, Risk Assessment, Colonoscopy, Mass Screening
ABSTRACT
BACKGROUND: The benefits and harms of breast screening may be better balanced through a risk-stratified approach. We conducted a systematic review assessing the accuracy of questionnaire-based risk assessment tools for this purpose. METHODS: Population: asymptomatic women aged ≥40 years; Intervention: questionnaire-based risk assessment tool (incorporating breast density and polygenic risk where available); Comparison: different tool applied to the same population; Primary outcome: breast cancer incidence; Scope: external validation studies identified from databases including Medline and Embase (1 January 2008 to 20 July 2021). We assessed calibration (goodness-of-fit) between expected and observed cancers and compared observed cancer rates by risk group. Risk of bias was assessed with PROBAST. RESULTS: Of 5124 records, 13 were included, examining 11 tools across 15 cohorts. The Gail tool was most represented (n = 11), followed by Tyrer-Cuzick (n = 5), BRCAPRO and iCARE-Lit (n = 3). No tool was consistently well calibrated across multiple studies, and breast density or polygenic risk scores did not improve calibration. Most tools identified a risk group with higher rates of observed cancers, but few tools identified lower-risk groups across different settings. All tools demonstrated a high risk of bias. CONCLUSION: Some risk tools can identify groups of women at higher or lower breast cancer risk, but this is highly dependent on the setting and population.
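Calibration of the kind assessed here compares expected (predicted) with observed cancers, overall and across risk groups. A minimal sketch on simulated data, where the tool is made to under-predict by construction; all numbers are invented.

```python
import numpy as np

def expected_observed(pred_risk, observed_cancer, n_groups=5):
    """Print per-quantile and overall expected/observed (E/O) ratios for a risk tool."""
    order = np.argsort(pred_risk)  # sort women from lowest to highest predicted risk
    for grp in np.array_split(order, n_groups):
        e, o = pred_risk[grp].sum(), observed_cancer[grp].sum()
        print(f"group: E={e:.1f}  O={o}  E/O={e / max(o, 1):.2f}")
    print(f"overall E/O = {pred_risk.sum() / observed_cancer.sum():.2f}")

rng = np.random.default_rng(1)
risk = rng.beta(2, 40, 5000)              # simulated predicted risks
cancer = rng.random(5000) < risk * 1.2    # tool under-predicts by ~20% by construction
expected_observed(risk, cancer.astype(int))
```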
ABSTRACT
INTRODUCTION: Personalized disease management informed by quantitative risk prediction has the potential to improve patient care and outcomes. The integration of risk prediction into clinical workflow should be informed by the experiences and preferences of stakeholders, and the impact of such integration should be evaluated in prospective comparative studies. The objectives of the IMplementing Predictive Analytics towards efficient chronic obstructive pulmonary disease (COPD) treatments (IMPACT) study are to integrate an exacerbation risk prediction tool into routine care and to determine its impact on prescription appropriateness (primary outcome) and on medication adherence, quality of life, exacerbation rates, and sex and gender disparities in COPD care (secondary outcomes). METHODS: IMPACT will be conducted in two phases. Phase 1 will include the systematic and user-centered development of two decision support tools: (1) a decision tool for pulmonologists called the ACCEPT decision intervention (ADI), which combines risk prediction from the previously developed Acute COPD Exacerbation Prediction Tool with treatment algorithms recommended by the Canadian Thoracic Society's COPD pharmacotherapy guidelines, and (2) an information pamphlet for COPD patients (patient tool), tailored to their prescribed medication, clinical needs, and lung function. In phase 2, we will conduct a stepped-wedge cluster randomized controlled trial in two outpatient respiratory clinics to evaluate the impact of the decision support tools on quality of care and patient outcomes. Clusters will be practicing pulmonologists (n ≥ 24), who will progressively switch to the intervention over 18 months. At the end of the study, a qualitative process evaluation will be carried out to determine the barriers to and enablers of uptake of the tools. DISCUSSION: The IMPACT study coincides with a planned harmonization of electronic health record systems across tertiary care centers in British Columbia, Canada. This harmonization, combined with IMPACT's implementation-oriented design and partnership with stakeholders, will facilitate integration of the tools into routine care if the results of the proposed study reveal a positive association with improvements in the process and outcomes of clinical care. The process evaluation at the end of the trial will inform subsequent design iterations before large-scale implementation. TRIAL REGISTRATION: NCT05309356.
ABSTRACT
Because the U.S. is a major player in the international oil market, it is interesting to study whether aggregate and state-level economic conditions can predict the subsequent realized volatility of oil price returns. To address this research question, we frame our analysis in terms of variants of the popular heterogeneous autoregressive realized volatility (HAR-RV) model. To estimate the models, we use quantile-regression and quantile machine learning (Lasso) estimators. Our estimation results highlight the differential effects of economic conditions on the quantiles of the conditional distribution of realized volatility. Using weekly data for the period April 1987 to December 2021, we document evidence of predictability at biweekly and monthly horizons.
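A minimal sketch of a HAR-RV specification estimated by quantile regression (statsmodels' QuantReg) on a simulated weekly realized-volatility series; 2- and 4-week lag windows stand in for the biweekly and monthly components, and the data-generating process is assumed.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
# Simulate a persistent weekly realized-volatility series (AR(1) in logs)
logrv = np.zeros(600)
for t in range(1, 600):
    logrv[t] = 0.9 * logrv[t - 1] + 0.3 * rng.standard_normal()
df = pd.DataFrame({"rv": np.exp(logrv)})

# HAR components: previous value plus short- and longer-horizon moving averages
df["rv_1"] = df["rv"].shift(1)
df["rv_2"] = df["rv"].shift(1).rolling(2).mean()  # biweekly component
df["rv_4"] = df["rv"].shift(1).rolling(4).mean()  # monthly component
df = df.dropna()

X = sm.add_constant(df[["rv_1", "rv_2", "rv_4"]])
for q in (0.1, 0.5, 0.9):  # quantiles of the conditional RV distribution
    res = sm.QuantReg(df["rv"], X).fit(q=q)
    print(f"q={q}:", res.params.round(3).to_dict())
```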
ABSTRACT
Short-term mobile monitoring campaigns are increasingly used to assess long-term air pollution exposure in epidemiology. Little is known about how monitoring network design features, including the number of stops and sampling temporality, impact exposure assessment models. We address this gap by leveraging an extensive mobile monitoring campaign conducted in the greater Seattle area over the course of a year during all days of the week and most hours. The campaign measured total particle number concentration (PNC; a proxy for ultrafine particle (UFP) number concentration), black carbon (BC), nitrogen dioxide (NO2), fine particulate matter (PM2.5), and carbon dioxide (CO2). In Monte Carlo sampling of 7327 total stops (278 sites × 26 visits each), we restricted the number of sites and visits used to estimate annual averages. Predictions from the all-data campaign performed well, with cross-validated R2s of 0.51-0.77. We found similar model performances (85% of the all-data campaign R2) with ~1000 to 3000 randomly selected stops for NO2, PNC, and BC, and ~4000 to 5000 stops for PM2.5 and CO2. Campaigns with additional temporal restrictions (e.g., business hours, rush hours, weekdays, or fewer seasons) had reduced model performance and different spatial surfaces. Mobile monitoring campaigns intended to assess long-term exposure should carefully consider their monitoring designs.
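The Monte Carlo design-restriction idea can be sketched as follows: simulate a 278-site × 26-visit campaign, repeatedly subsample visits, and compare restricted-campaign site averages with the all-data averages. The concentrations and noise levels are invented, and simple correlation replaces the paper's full prediction-model evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites, n_visits = 278, 26
site_truth = rng.normal(10, 3, n_sites)  # "true" annual site averages (arbitrary units)
visits = site_truth[:, None] + rng.normal(0, 4, (n_sites, n_visits))
all_data_avg = visits.mean(axis=1)       # the full-campaign estimate per site

def restricted_r2(visits_used, n_draws=200):
    """Mean R2 between restricted-campaign and all-data site averages."""
    r2 = []
    for _ in range(n_draws):
        cols = rng.choice(n_visits, visits_used, replace=False)
        est = visits[:, cols].mean(axis=1)
        r2.append(np.corrcoef(est, all_data_avg)[0, 1] ** 2)
    return float(np.mean(r2))

for k in (4, 8, 12, 26):
    print(f"{k} visits/site (~{k * n_sites} stops): R2 = {restricted_r2(k):.2f}")
```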
Subjects
Air Pollutants, Air Pollution, Air Pollutants/analysis, Nitrogen Dioxide/analysis, Carbon Dioxide, Environmental Monitoring, Air Pollution/analysis, Particulate Matter/analysis, Soot/analysis
ABSTRACT
Next generation risk assessment is defined as a knowledge-driven system that allows for cost-efficient assessment of human health risk related to chemical exposure, without animal experimentation. One of the key features of next generation risk assessment is to facilitate prioritization of chemical substances that need a more extensive toxicological evaluation, in order to address the need to assess an increasing number of substances. In this case study focusing on chemicals in food, we explored how exposure data combined with the Threshold of Toxicological Concern (TTC) concept could be used to prioritize chemicals, both for existing substances and new substances entering the market. Using a database of existing chemicals relevant for dietary exposure we calculated exposure estimates, followed by application of the TTC concept to identify substances of higher concern. Subsequently, a selected set of these priority substances was screened for toxicological potential using high-throughput screening (HTS) approaches. Remarkably, this approach resulted in alerts for a selection of substances that are already on the market and represent relevant exposure in consumers. Taken together, the case study provides proof-of-principle for the approach taken to identify substances of concern, and this approach can therefore be considered a supportive element to a next generation risk assessment strategy.
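A minimal sketch of TTC-based prioritization: comparing estimated dietary exposures against the commonly cited Cramer-class thresholds. The threshold values and the example substances should be treated as assumptions for illustration.

```python
# Commonly cited TTC thresholds in ug/person/day (Cramer classes I-III plus an
# alert level for substances with genotoxicity structural alerts); assumed values
TTC_UG_PER_DAY = {"I": 1800.0, "II": 540.0, "III": 90.0, "genotoxic_alert": 0.15}

substances = [  # (name, estimated dietary exposure ug/person/day, assigned class)
    ("flavouring A", 25.0, "III"),
    ("additive B", 700.0, "II"),
    ("contaminant C", 0.5, "genotoxic_alert"),
]

for name, exposure, cls in substances:
    flagged = exposure > TTC_UG_PER_DAY[cls]
    verdict = "prioritize for further evaluation" if flagged else "low concern"
    print(f"{name}: {exposure} ug/day vs TTC {TTC_UG_PER_DAY[cls]} ug/day -> {verdict}")
```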
ABSTRACT
The circular economy is a global trend and a promising strategy for the sustainable use of natural resources. In this context, waste-to-energy presents an effective response to the dual challenge of ever-increasing waste generation and energy demand. However, waste diversity makes waste management a serious challenge. Among waste categories, biomass waste valorization is an attractive energy solution given its low cost and the availability of raw materials. Nevertheless, knowledge of biomass waste characteristics, such as composition and energy content, is a necessity. In this research, new models are developed to estimate the higher heating value (HHV) of biomass wastes from the ultimate analysis, using linear regression and an artificial neural network (ANN). The quality of the two models on a new dataset was evaluated with statistical metrics such as the coefficient of correlation (R), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The methods developed in this work provided attractive accuracies compared with other literature models. Additionally, the ANN, as a machine learning method, was found to be the better model for biomass HHV prediction (R = 0.75377, RMSE = 1.17527, MAE = 0.93315, and MAPE = 5.73%). The obtained results can therefore be employed to design and optimize combustion reactors. The developed ANN software is a simple and accurate tool for HHV estimation based on the ultimate analysis, and the ANN is one of the most widely applied approaches in the field of waste-to-energy.
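A sketch of the ANN approach on synthetic data: ultimate-analysis inputs (C, H, N, S, O), targets generated from the widely used Channiwala-Parikh correlation plus noise (an assumption standing in for measured HHV), an sklearn MLPRegressor as the ANN, and the same four metrics.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

rng = np.random.default_rng(0)
n = 300
C, H = rng.uniform(35, 55, n), rng.uniform(4, 8, n)  # ultimate analysis, wt%
N, S, O = rng.uniform(0.1, 2, n), rng.uniform(0, 0.5, n), rng.uniform(30, 45, n)
# Synthetic HHV targets (MJ/kg) from the Channiwala-Parikh correlation plus noise
hhv = 0.3491 * C + 1.1783 * H + 0.1005 * S - 0.1034 * O - 0.0151 * N + rng.normal(0, 0.8, n)

X = np.column_stack([C, H, N, S, O])
X_tr, X_te, y_tr, y_te = train_test_split(X, hhv, random_state=0)
ann = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0))
pred = ann.fit(X_tr, y_tr).predict(X_te)

print("R   :", np.corrcoef(y_te, pred)[0, 1])
print("RMSE:", mean_squared_error(y_te, pred) ** 0.5)
print("MAE :", mean_absolute_error(y_te, pred))
print("MAPE:", 100 * np.mean(np.abs((y_te - pred) / y_te)), "%")
```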
Subjects
Heating, Neural Networks, Computer, Biomass, Linear Models, Physical Phenomena
ABSTRACT
PURPOSE: To develop a predictive model based on Danish administrative registers to facilitate automated identification of individuals at risk of any type of cancer. METHODS: A nationwide register-based cohort study covering all individuals in Denmark aged 20 years or older. The outcome was all-type cancer during 2017, excluding nonmelanoma skin cancer. Diagnoses, medication, and contacts with general practitioners in the exposure period (2007-2016) were considered for the predictive model. We applied backward selection by logistic regression to all variables to develop a risk model for cancer. We applied the models to the validation cohort, calculated the receiver operating characteristic curves, and estimated the corresponding areas under the curve (AUC). RESULTS: The study population consisted of 4.2 million persons; 32,447 (0.76%) were diagnosed with cancer in 2017. We identified 39 predictive risk factors in women and 42 in men, with age above 30 as the strongest predictor for cancer. Testing the model for cancer risk showed modest accuracy, with an AUC of 0.82 (95% CI 0.81-0.82) for men and 0.75 (95% CI 0.74-0.75) for women. CONCLUSION: We have developed and tested a model for identifying the individual risk of cancer through the use of administrative data. The models need to be further investigated before being applied in clinical practice.
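A minimal sketch of the modeling step: p-value-based backward selection with a logistic model on simulated data with a rare outcome, followed by AUC. The selection rule and the 0.05 threshold are assumptions, not the study's exact procedure.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, p = 5000, 8
X = rng.standard_normal((n, p))
beta = np.array([0.8, 0.5, 0.0, 0.0, 0.3, 0.0, 0.0, 0.0])  # only three predictors matter
y = (rng.random(n) < 1 / (1 + np.exp(-(X @ beta - 4.0)))).astype(int)  # rare outcome

cols = list(range(p))
while True:  # backward selection: repeatedly drop the least significant predictor
    fit = sm.Logit(y, sm.add_constant(X[:, cols])).fit(disp=0)
    pvals = np.asarray(fit.pvalues)[1:]  # skip the intercept
    worst = int(np.argmax(pvals))
    if pvals[worst] < 0.05:
        break
    cols.pop(worst)

print("retained predictors:", cols)
print("AUC:", roc_auc_score(y, fit.predict(sm.add_constant(X[:, cols]))))
```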
ABSTRACT
The aims of this study were (1) to develop a comprehensive risk-of-death and life expectancy (LE) model and (2) to provide data on the effects of multiple risk factors on LE. We used data for Canada from the Global Burden of Disease (GBD) Study. To create period life tables for males and females, we obtained age/sex-specific deaths rates for 270 diseases, population distributions for 51 risk factors, and relative risk functions for all disease-exposure pairs. We computed LE gains from eliminating each factor, LE values for different levels of exposure to each factor, and LE gains from simultaneous reductions in multiple risk factors at various ages. If all risk factors were eliminated, LE in Canada would increase by 6.26 years for males and 5.05 for females. The greatest benefit would come from eliminating smoking in males (2.45 years) and high blood pressure in females (1.42 years). For most risk factors, their dose-response relationships with LE were non-linear and depended on the presence of other factors. In individuals with high levels of risk, eliminating or reducing exposure to multiple factors could improve LE by several years, even at a relatively advanced age.
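The life-table computation at the core of such a study can be sketched as follows: a period life table from single-year death rates, and the LE gain from scaling mortality down by a population attributable fraction. The Gompertz schedule and the 15% PAF are invented placeholders.

```python
import numpy as np

def life_expectancy(mx):
    """Period life expectancy at the first age of the mx schedule (single-year death rates)."""
    qx = mx / (1 + 0.5 * mx)        # death probability, deaths assumed mid-interval
    qx[-1] = 1.0                    # everyone dies in the open-ended final interval
    lx = np.concatenate([[1.0], np.cumprod(1 - qx)])[:-1]  # survivors entering each age
    Lx = lx - 0.5 * lx * qx         # person-years lived per interval
    Lx[-1] = lx[-1] / mx[-1]        # person-years in the open-ended interval
    return Lx.sum() / lx[0]

ages = np.arange(20, 111)
mx = 0.0001 * np.exp(0.09 * (ages - 20))  # Gompertz-like mortality schedule (invented)
e20 = life_expectancy(mx)
# Eliminating a risk factor is approximated here by scaling mx by (1 - PAF); 15% is a placeholder
e20_reduced = life_expectancy(mx * 0.85)
print(f"e20 = {e20:.1f} years; gain from eliminating the factor = {e20_reduced - e20:.2f} years")
```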
Subjects
Global Burden of Disease, Life Expectancy, Female, Humans, Life Tables, Male, Risk Factors, Smoking
ABSTRACT
Rising incidences of cutaneous melanoma have fueled the development of statistical models that predict individual melanoma risk. Our aim was to assess the validity of published prediction models for incident cutaneous melanoma using a standardized procedure based on PROBAST (Prediction model Risk Of Bias ASsessment Tool). We included studies that were identified by a recent systematic review and updated the literature search to ensure that our PROBAST rating included all relevant studies. Six reviewers assessed the risk of bias (ROB) for each study using the published "PROBAST Assessment Form" that consists of four domains and an overall ROB rating. We further examined a temporal effect regarding changes in overall and domain-specific ROB rating distributions. Altogether, 42 studies were assessed, of which the vast majority (n = 34; 81%) was rated as having high ROB. Only one study was judged as having low ROB. The main reasons for high ROB ratings were the use of hospital controls in case-control studies and the omission of any validation of prediction models. However, our temporal analysis results showed a significant reduction in the number of studies with high ROB for the domain "analysis". Nevertheless, the evidence base of high-quality studies that can be used to draw conclusions on the prediction of incident cutaneous melanoma is currently much weaker than the high number of studies on this topic would suggest.
ABSTRACT
BACKGROUND: Residents receive a numeric performance rating (eg, on a 1-7 scoring scale) along with narrative (ie, qualitative) feedback based on their performance in each workplace-based assessment (WBA). Aggregated qualitative data from WBAs can be overwhelming to process and fairly adjudicate as part of a global decision about learner competence. Current approaches with qualitative data require a human rater to maintain attention and appropriately weigh various data inputs within the constraints of working memory before rendering a global judgment of performance. OBJECTIVE: This study explores natural language processing (NLP) and machine learning (ML) applications for identifying trainees at risk using a large WBA narrative comment dataset associated with numerical ratings. METHODS: NLP was performed retrospectively on a complete dataset of narrative comments (ie, text-based feedback to residents based on their performance on a task) derived from WBAs completed by faculty members from multiple hospitals associated with a single, large residency program at McMaster University, Canada. Narrative comments were vectorized to quantitative ratings using the bag-of-n-grams technique with 3 input types: unigrams, bigrams, and trigrams. Supervised ML models using linear regression were trained on the quantitative ratings to perform binary classification, predicting whether a resident fell into the at-risk or not-at-risk category. Sensitivity, specificity, and accuracy metrics are reported. RESULTS: The database comprised 7199 unique direct observation assessments, containing both narrative comments and a rating between 3 and 7 in an imbalanced distribution (scores 3-5: 726 ratings; scores 6-7: 4871 ratings). A total of 141 unique raters from 5 different hospitals and 45 unique residents participated over the course of 5 academic years. When comparing the 3 input types for diagnosing whether a trainee would be rated low (ie, 1-5) or high (ie, 6 or 7), accuracy was 87% for trigrams, 86% for bigrams, and 82% for unigrams. All 3 input types also had better prediction accuracy when using a bimodal cut (eg, lower or higher) than when predicting performance along the full 7-point rating scale (50%-52%). CONCLUSIONS: ML models can accurately identify underperforming residents via the narrative comments provided for WBAs. The words generated in WBAs can be a worthy dataset to augment human decisions for educators tasked with processing large volumes of narrative assessments.
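A minimal sketch of the described pipeline: bag-of-n-grams vectorization (unigrams through trigrams), linear regression on the rating, and a bimodal cut. The four comments are invented stand-ins for real WBA narratives.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LinearRegression

comments = ["excellent history taking and clear plan",
            "unsafe management plan needed constant supervision",
            "clear communication strong clinical reasoning",
            "disorganized assessment missed key findings"]
ratings = [7, 3, 6, 4]  # WBA scores paired with the narrative comments

# Vectorize narratives with bag-of-n-grams (unigrams through trigrams)
vec = CountVectorizer(ngram_range=(1, 3))
X = vec.fit_transform(comments)

# Mirror the abstract's approach: linear regression on the rating,
# then a bimodal cut (<6 = "at risk") for binary classification
reg = LinearRegression().fit(X, ratings)
pred = reg.predict(vec.transform(["missed findings and unsafe plan"]))
print("predicted rating %.1f -> %s" % (pred[0], "at risk" if pred[0] < 6 else "not at risk"))
```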
ABSTRACT
Transparent and accurate reporting is essential to evaluate the validity and applicability of risk prediction models. Our aim was to evaluate the reporting quality of studies developing and validating risk prediction models for melanoma according to the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) checklist. We included studies that were identified by a recent systematic review and updated the literature search to ensure that our TRIPOD rating included all relevant studies. Six reviewers assessed compliance with all 37 TRIPOD components for each study using the published "TRIPOD Adherence Assessment Form". We further examined a potential temporal effect on reporting quality. Altogether, 42 studies were assessed, including 35 studies reporting the development of a prediction model and seven reporting both development and validation. The median adherence to TRIPOD was 57% (range 29% to 78%). The study components least likely to be fully reported related to model specification, the title, and the abstract. Although reporting quality has slightly increased over the past 35 years, there is still much room for improvement. Adherence to reporting guidelines such as TRIPOD must become a matter of course in the publication of study results to achieve the level of reporting quality necessary to foster the use of prediction models in practice.
ABSTRACT
BACKGROUND: Postoperative morbidity places a considerable burden on health and resources. Thus, strategies to identify, predict, and reduce postoperative morbidity are needed. AIMS: To identify and explore existing preoperative risk assessment tools for morbidity after cardiac surgery. METHODS: Electronic databases (including MEDLINE, CINAHL, and Embase) were searched to December 2020 for preoperative risk assessment models for morbidity after adult cardiac surgery. Models exploring a single isolated postoperative morbidity and models for patients having heart transplantation or congenital surgery were excluded. Data extraction and quality assessments were undertaken by two authors. RESULTS: From 2251 identified papers, 22 models were found. The majority (54.5%) were developed in the USA or Canada, defined the morbidity outcome within the in-hospital period (90.9%), and focused on major morbidity. Considerable variation in morbidity definition was identified, with morbidity incidence between 4.3% and 52%. The majority (45.5%) defined morbidity and mortality separately but combined them to develop one model, while seven studies (33.3%) constructed a morbidity-specific model. Models contained between 5 and 50 variables. Commonly included variables were age, emergency surgery, left ventricular dysfunction, and reoperation/previous cardiac surgery, although definitions differed across studies. All models demonstrated at least reasonable discriminatory power [area under the receiver operating characteristic curve, 0.61-0.82]. CONCLUSION: Despite the methodological heterogeneity across models, all demonstrated at least reasonable discriminatory power and could be implemented depending on local preferences. Future strategies to identify, predict, and reduce morbidity after cardiac surgery should consider the ageing population and those with minor and/or multiple complex morbidities.
Subjects
Cardiac Surgical Procedures, Adult, Cardiac Surgical Procedures/adverse effects, Humans, Morbidity, Postoperative Complications/epidemiology, Postoperative Period, Reoperation, Risk Assessment
ABSTRACT
OBJECTIVE: Electronic health records have incomplete capture of patient outcomes. We consider the case when observability is differential across a predictor. Including such a predictor (sensitive variable) can lead to algorithmic bias, potentially exacerbating health inequities. MATERIALS AND METHODS: We define bias for a clinical prediction model (CPM) as the difference between the true and estimated risk, and differential bias as bias that differs across a sensitive variable. We illustrate the genesis of differential bias via a 2-stage process, where, conditional on having the outcome of interest, the outcome is differentially observed. We use simulations and a real-data example to demonstrate the possible impact of including a sensitive variable in a CPM. RESULTS: If there is differential observability based on a sensitive variable, including it in a CPM can induce differential bias. However, if the sensitive variable impacts the outcome but not observability, it is better to include it. When a sensitive variable impacts both observability and the outcome, no simple recommendation can be provided. We show that one cannot use observed data to detect differential bias. DISCUSSION: Our study furthers the literature on observability, showing that differential observability can lead to algorithmic bias. This highlights the importance of considering whether to include sensitive variables in CPMs. CONCLUSION: Including a sensitive variable in a CPM depends on whether it truly affects the outcome or just the observability of the outcome. Since this cannot be distinguished with observed data, observability is an implicit assumption of CPMs.
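The two-stage genesis of differential bias can be reproduced in a few lines: the outcome depends only on a clinical predictor, but whether it is recorded depends on the sensitive variable; a CPM trained on the observed outcome then shows group-dependent bias once the sensitive variable is included. All parameters below are assumed for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200_000
s = rng.integers(0, 2, n)             # sensitive variable (affects observability only)
x = rng.standard_normal(n)            # clinical predictor
true_p = 1 / (1 + np.exp(-(x - 1)))   # true risk depends on x only, not on s
y = rng.random(n) < true_p            # stage 1: the outcome occurs
capture = np.where(s == 1, 0.9, 0.5)  # stage 2: the outcome is recorded differentially
y_obs = y & (rng.random(n) < capture)

for name, X in [("without s", x[:, None]), ("with s", np.column_stack([x, s]))]:
    cpm = LogisticRegression().fit(X, y_obs)  # CPM trained on the *observed* outcome
    p_hat = cpm.predict_proba(X)[:, 1]
    for g in (0, 1):
        bias = (p_hat[s == g] - true_p[s == g]).mean()
        print(f"{name}, s={g}: mean bias {bias:+.3f}")
```

With the sensitive variable excluded, both groups are underestimated by a similar amount; with it included, the bias differs sharply by group, which is the differential bias the abstract describes.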
Subjects
Models, Statistical, Bias, Humans, Prognosis
ABSTRACT
Based on both new and previously utilized experimental data, the present study provides a comparative assessment of sensors and machine learning approaches for evaluating the microbiological spoilage of ready-to-eat leafy vegetables (baby spinach and rocket). Fourier-transform infrared (FTIR), near-infrared (NIR), and visible (VIS) spectroscopy and multispectral imaging (MSI) were used. Two data partitioning approaches and two algorithms, namely partial least squares regression (PLSR) and support vector regression (SVR), were evaluated. Concerning baby spinach, when model testing was performed on randomly selected samples, the performance was better than or similar to that attained when testing was based on dynamic-temperature data, depending on the applied analytical technology. The two algorithms yielded similar model performances for the majority of baby spinach cases. Regarding rocket, the random data partitioning approach produced considerably better results in almost all cases of sensor/algorithm combination. Furthermore, the SVR algorithm resulted in considerably or slightly better model performances for the FTIR, VIS, and NIR sensors, depending on the data partitioning approach, whereas the PLSR algorithm provided better models for the MSI sensor. Overall, the microbiological spoilage of baby spinach was better assessed by models derived mainly from the VIS sensor, while FTIR and MSI were more suitable for rocket. According to the findings of this study, a distinct sensor and computational analysis application is needed for each vegetable type, suggesting that there is no single combination of analytical approach and algorithm that could be applied successfully to all food products throughout the food supply chain.
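A sketch of the PLSR-versus-SVR comparison on synthetic spectra paired with microbial counts, under random partitioning. The spectra, counts, and hyperparameters are invented and stand in for the study's spectroscopy data.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n, wavelengths = 120, 200
spectra = rng.standard_normal((n, wavelengths)).cumsum(axis=1)  # smooth FTIR-like spectra
# Synthetic microbial counts (log CFU/g) tied to two spectral regions plus noise
counts = 2 + 0.01 * spectra[:, 50] - 0.008 * spectra[:, 150] + rng.normal(0, 0.3, n)

# "Random" partitioning: test samples drawn from the same conditions as training
X_tr, X_te, y_tr, y_te = train_test_split(spectra, counts, test_size=0.3, random_state=0)

for name, model in [("PLSR", PLSRegression(n_components=10)),
                    ("SVR", make_pipeline(StandardScaler(), SVR(C=10.0)))]:
    model.fit(X_tr, y_tr)
    print(name, "R2:", round(r2_score(y_te, np.ravel(model.predict(X_te))), 2))
```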