Results 1 - 20 of 468
1.
Cell Biosci ; 14(1): 88, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38956702

ABSTRACT

This study investigates NADPH oxidase 4 (NOX4) involvement in iron-mediated astrocyte cell death in Alzheimer's Disease (AD) using single-cell sequencing data and transcriptomes. We analyzed AD single-cell RNA sequencing data, identified astrocyte marker genes, and explored biological processes in astrocytes. We integrated AD-related chip data with ferroptosis-related genes, highlighting NOX4. We validated NOX4's role in ferroptosis and AD in vitro and in vivo. Astrocyte marker genes were enriched in AD, emphasizing their role. NOX4 emerged as a crucial player in astrocytic ferroptosis in AD. Silencing NOX4 mitigated ferroptosis, improved cognition, reduced Aβ and p-Tau levels, and alleviated mitochondrial abnormalities. NOX4 promotes astrocytic ferroptosis, underscoring its significance in AD progression.

2.
Sensors (Basel) ; 24(13), 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39001139

ABSTRACT

The paper "Using Absorption Models for Insulin and Carbohydrates and Deep Leaning to Improve Glucose Level Predictions" (Sensors 2021, 21, 5273) proposes a novel approach to predicting blood glucose levels for people with type 1 diabetes mellitus (T1DM). By building exponential models from raw carbohydrate and insulin data to simulate the absorption in the body, the authors reported a reduction in their model's root-mean-square error (RMSE) from 15.5 mg/dL (raw) to 9.2 mg/dL (exponential) when predicting blood glucose levels one hour into the future. In this comment, we demonstrate that the experimental techniques used in that paper are flawed, which invalidates its results and conclusions. Specifically, after reviewing the authors' code, we found that the model validation scheme was malformed, namely, the training and test data from the same time intervals were mixed. This means that the reported RMSE numbers in the referenced paper did not accurately measure the predictive capabilities of the approaches that were presented. We repaired the measurement technique by appropriately isolating the training and test data, and we discovered that their models actually performed dramatically worse than was reported in the paper. In fact, the models presented in that paper do not appear to perform any better than a naive model that predicts future glucose levels to be the same as the current ones.
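The two ideas in this abstract, keeping test data strictly later in time than training data and comparing against a naive "no change" forecast, can be illustrated with a minimal sketch. The function names, the glucose values, and the one-step horizon below are all hypothetical and chosen for brevity; this is not the authors' code, and the study itself forecasts one hour ahead.

```python
import math

def rmse(pred, actual):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def chronological_split(series, test_fraction=0.25):
    """Split a time-ordered series so every test sample is later than every training sample."""
    cut = int(len(series) * (1 - test_fraction))
    return series[:cut], series[cut:]

def persistence_forecast(series, horizon):
    """Naive baseline: the value `horizon` steps ahead is predicted to equal the current value."""
    preds = series[:-horizon]
    targets = series[horizon:]
    return preds, targets

# Hypothetical glucose readings in mg/dL (illustrative values only, not study data).
glucose = [110, 112, 115, 120, 126, 131, 135, 138, 140, 139, 137, 134, 130, 127, 125, 124]

# The last quarter of the series is held out; no shuffling, so no temporal leakage.
train, test = chronological_split(glucose, test_fraction=0.25)

# Score the naive baseline on the held-out tail only.
preds, targets = persistence_forecast(test, horizon=1)
baseline_rmse = rmse(preds, targets)
print(f"persistence RMSE on held-out tail: {baseline_rmse:.2f} mg/dL")
```

Any learned model would then be scored on the same held-out tail, and its RMSE read against `baseline_rmse` rather than in isolation.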


Subject(s)
Blood Glucose , Diabetes Mellitus, Type 1 , Insulin , Insulin/metabolism , Humans , Blood Glucose/metabolism , Blood Glucose/analysis , Diabetes Mellitus, Type 1/metabolism , Carbohydrates/chemistry , Models, Biological
3.
Heliyon ; 10(12): e33448, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-39027433

ABSTRACT

The Abbay River Basin faces the looming threat of extreme climate events, including prolonged droughts and erratic rainfall patterns, which can significantly affect soil health and fertility. This study aimed to explore the influence of extreme climate conditions on soil pH and exchangeable aluminum, with the goal of promoting sustainable agricultural practices in Ethiopia. The Africa Soil Information Service (ASIS) provided datasets on soil pH and exchangeable aluminum. The European Copernicus Climate Change Data Store was used to download historical and future datasets of extreme climatic indices from 1980 to 2010 and 2015-2050, respectively. The Coupled Model Intercomparison Project Phase 6 model ensemble was used to predict future climate impacts under three shared socioeconomic scenarios: SSP1-2.6, SSP2-4.5, and SSP5-8.5. Data extraction, quality control, and clustering were conducted before analysis, and the model was validated for its accuracy and reliability in predicting soil parameter changes. An artificial neural network model was utilized to predict the effects of extreme climate indices on soil pH and exchangeable aluminum concentrations. The model was designed to accurately and reliably predict changes in soil parameters. This study compared the changes in soil pH and aluminum concentrations using paired t-tests. The model's diagnostic results indicated a significant impact of extreme climate scenarios on soil pH and exchangeable aluminum. Extreme climate factors such as heavy precipitation and cooler nighttime temperatures significantly contribute to soil acidification and an increase in aluminum concentration. Under the SSP1-2.6 and SSP2-4.5 emission scenarios, soil pH levels are expected to increase by 8.38 % and 3.79 %, respectively. These changes in soil pH are expected to have significant impacts on the exchangeable aluminum content in the soil, with increases of 37 % and 5.38 %, respectively, under the same emission scenarios.
However, the SSP5-8.5 scenario predicted a 45 % increase in exchangeable aluminum and a 9.36 % decrease in soil pH. Therefore, this study significantly enhances our understanding of the influence of climate change on soil health. The development of strategies to mitigate climate change impacts on agriculture in the region must consider the effects of extreme climate indices.

4.
Sci Total Environ ; 949: 175111, 2024 Jul 28.
Article in English | MEDLINE | ID: mdl-39079631

ABSTRACT

Modeling of watershed Ecosystem Services (ES) processes has increased greatly in recent years, potentially improving environmental management and decision-making by describing the value of nature. ES models may be sensitive to different conditions and, therefore, should ideally be validated against observed data for their use as a decision-support instrument. However, outcomes from such ES modeling are rarely validated, making it difficult to assess uncertainties associated with the modeling and justify their actual usefulness to develop generalizable management recommendations. This study proposes a framework for the systematic validation of one such tool, the InVEST Nutrient Delivery Model (NDR) for nutrient retention estimates. The framework is divided into three stages: 1) running the NDR model inputs, processes, and outputs; 2) building a long-term reference dataset from open access water quality observations; and 3) using the reference data for model calibration and validation. We applied this framework to twenty watersheds in the Commonwealth of Puerto Rico, where data availability resembles that of watersheds across the United States. Long-term water quality data from monitoring stations facilitated model calibration and validation. Our framework provided a reproducible method for linking the vast monitoring network in the U.S. and its territories to evaluate the performance of the InVEST NDR model. Beyond the framework development, this study found that the InVEST NDR model explained 62.7 % and 79.3 % of the variance in the total nitrogen and total phosphorus between 2000 and 2022, respectively, supporting the suitability of the model for watershed scale ecosystem services assessment. The findings can also serve as a reference to support the use of InVEST for other locations in the tropics without publicly available monitoring data.

5.
Environ Monit Assess ; 196(8): 723, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38987411

ABSTRACT

A comprehensive seasonal assessment of groundwater vulnerability was conducted in the weathered hard rock aquifer of the upper Swarnrekha watershed in Ranchi district, India. Lineament density (Ld) and land use/land cover (LULC) were integrated into the conventional DRASTIC and Pesticide DRASTIC (P-DRASTIC) models and were extensively compared with six modified models, viz. DRASTIC-Ld, DRASTIC-Lu, DRASTIC-LdLu, P-DRASTIC-Ld, P-DRASTIC-Lu, and P-DRASTIC-LdLu, to identify the optimal model for vulnerability mapping in hard rock terrain of the region. Findings were geochemically validated using NO3- concentrations of 68 wells during pre-monsoon (Pre-M) and post-monsoon (Post-M) 2022. Irrespective of the applied model, groundwater vulnerability shows significant seasonal variation, with > 45% of the region classified as high to very high vulnerability in the pre-M, increasing to ~67% in post-M season, highlighting the importance of seasonal vulnerability assessments. The southern region, dominated by agriculture and industry, showed higher vulnerability, followed by regions with high Ld and thin weathered zone. Incorporating Ld and LULC parameters into DRASTIC-LdLu and P-DRASTIC-LdLu models increases the 'Very High' vulnerability zones to 17.4% and 17.6% for pre-M and 29.4% and 27.9% for post-M, respectively. Similarly, 'High' vulnerable zones increase from 32.5% and 25% in pre-M to 33.8% and 35.3% in post-M for respective models. Model output comparisons suggest that modified DRASTIC-LdLu and P-DRASTIC-LdLu perform better, with accurate estimations of 83.8% and 89.7% for pre-M and post-M, respectively. However, results of geochemical validation suggest that among all the applied modified models, DRASTIC-LdLu performs best, with accurate estimations of 34.4% and 20.6% for pre-M and post-M, respectively.


Subject(s)
Environmental Monitoring , Groundwater , Water Pollutants, Chemical , Groundwater/chemistry , Environmental Monitoring/methods , India , Water Pollutants, Chemical/analysis , Agriculture , Seasons , Water Pollution, Chemical/statistics & numerical data
6.
Article in English | MEDLINE | ID: mdl-38903904

ABSTRACT

The Additive Manufacturing Benchmark Series (AM Bench) is a NIST-led organization that provides a continuing series of additive manufacturing benchmark measurements, challenge problems, and conferences with the primary goal of enabling modelers to test their simulations against rigorous, highly controlled additive manufacturing benchmark measurement data. To this end, single-track (1D) and pad (2D) scans on bare plate nickel alloy 718 were completed with thermography, cross-sectional grain orientation and local chemical composition maps, and cross-sectional melt pool size measurements. The laser power, scan speed, and laser spot size were varied for single tracks, and the scan direction was varied for pads. This article focuses on the cross-sectional melt pool size measurements and presents the predictions from challenge problems. Single-track depth correlated with volumetric energy density while width did not (within the studied parameters). The melt pool size for pad scans was greater than single tracks due to heat buildup. Pad scan melt pool depth was reduced when the laser scan direction and gas flow direction were parallel. The melt pool size in pad scans showed little to no trend against position within the pads. Uncertainty budgets for cross-sectional melt pool size from optical micrographs are provided for the purpose of model validation.

7.
J Process Control ; 139, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38855126

ABSTRACT

Behavioral interventions (such as those developed to increase physical activity, achieve smoking cessation, or weight loss) can be represented as dynamic process systems incorporating a multitude of factors, ranging from cognitive (internal) to environmental (external) influences. This facilitates the application of system identification and control engineering methods to address questions such as: what drives individuals to improve health behaviors (such as engaging in physical activity)? In this paper, the goal is to efficiently estimate personalized, dynamic models which in turn will lead to control systems that can optimize this behavior. This problem is examined in system identification applied to the Just Walk study that aimed to increase walking behavior in sedentary adults. The paper presents a Discrete Simultaneous Perturbation Stochastic Approximation (DSPSA)-based modeling of the Goal Attainment construct estimated using AutoRegressive with eXogenous inputs (ARX) models. Feature selection of participants and ARX order selection is achieved through the DSPSA algorithm, which efficiently handles computationally expensive calculations. DSPSA can search over large sets of features as well as regressor structures in an informed, principled manner to model behavioral data within reasonable computational time. DSPSA estimation highlights the large individual variability in motivating factors among participants in Just Walk, thus emphasizing the importance of a personalized approach for optimized behavioral interventions.
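As a rough illustration of the ARX model class named in this abstract, the sketch below fits a first-order ARX model by ordinary least squares. It assumes the convention y[t] = a*y[t-1] + b*u[t-1] + e[t]; the function name and the synthetic data are hypothetical, and the DSPSA search over features and model orders described by the authors is not reproduced here.

```python
def fit_arx_1_1(y, u):
    """Least-squares fit of y[t] = a*y[t-1] + b*u[t-1] via the 2x2 normal equations."""
    rows = [(y[t - 1], u[t - 1]) for t in range(1, len(y))]
    targets = y[1:]
    # Entries of the Gram matrix X^T X and the vector X^T y.
    s11 = sum(r[0] * r[0] for r in rows)
    s12 = sum(r[0] * r[1] for r in rows)
    s22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * t for r, t in zip(rows, targets))
    b2 = sum(r[1] * t for r, t in zip(rows, targets))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("regressors are collinear; the fit is not unique")
    a = (b1 * s22 - b2 * s12) / det
    b = (s11 * b2 - s12 * b1) / det
    return a, b

# Synthetic, noise-free data generated from known parameters (illustrative only).
true_a, true_b = 0.5, 2.0
u = [1.0, 0.0, 2.0, 1.0, 0.0, 3.0, 1.0, 2.0]   # hypothetical input, e.g. a daily step goal
y = [0.0]                                      # hypothetical output, e.g. goal attainment
for t in range(1, len(u)):
    y.append(true_a * y[t - 1] + true_b * u[t - 1])

a_hat, b_hat = fit_arx_1_1(y, u)
print(f"estimated a={a_hat:.3f}, b={b_hat:.3f}")
```

With noise-free data the estimates recover the generating parameters; with real behavioral data, a separate fit per participant is what would expose the individual variability the abstract describes.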

8.
Eur J Cardiothorac Surg ; 66(1), 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38889265
9.
Mol Neurobiol ; 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38898199

ABSTRACT

Depression is one of the most common mental illnesses, affecting millions of people of all ages worldwide. Random mood changes, loss of interest in routine activities, and persistent unpleasant feelings often characterize this illness. Subjects with depressive disorders have a likelihood of developing cardiovascular complications, diabesity, and stroke. The exact genesis and pathogenesis of this disease are still unclear. A significant proportion of subjects with clinical depression display inadequate response to antidepressant therapies. Hence, clinicians often face challenges in predicting the treatment response. Emerging reports have indicated the association of depression with metabolic alterations. Metabolomics is one of the promising approaches that can offer fresh perspectives into the diagnosis, treatment, and prognosis of depression at the metabolic level. Despite numerous studies exploring metabolite profiles post-pharmacological interventions, a quantitative understanding of consistently altered metabolites is not yet established. The article gives a brief discussion on different biomarkers in depression and the degree to which biomarkers can improve treatment outcomes. In this review article, we have systematically reviewed the role of metabolomics in depression along with current challenges and future perspectives.

10.
JMIR Med Inform ; 12: e53625, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38842167

ABSTRACT

Background: Despite restrictive opioid management guidelines, opioid use disorder (OUD) remains a major public health concern. Machine learning (ML) offers a promising avenue for identifying and alerting clinicians about OUD, thus supporting better clinical decision-making regarding treatment. Objective: This study aimed to assess the clinical validity of an ML application designed to identify and alert clinicians of different levels of OUD risk by comparing it to a structured review of medical records by clinicians. Methods: The ML application generated OUD risk alerts on outpatient data for 649,504 patients from 2 medical centers between 2010 and 2013. A random sample of 60 patients was selected from 3 OUD risk level categories (n=180). An OUD risk classification scheme and standardized data extraction tool were developed to evaluate the validity of the alerts. Clinicians independently conducted a systematic and structured review of medical records and reached a consensus on a patient's OUD risk level, which was then compared to the ML application's risk assignments. Results: A total of 78,587 patients without cancer with at least 1 opioid prescription were identified as follows: not high risk (n=50,405, 64.1%), high risk (n=16,636, 21.2%), and suspected OUD or OUD (n=11,546, 14.7%). The sample of 180 patients was representative of the total population in terms of age, sex, and race. The interrater reliability between the ML application and clinicians had a weighted kappa coefficient of 0.62 (95% CI 0.53-0.71), indicating good agreement. Combining the high risk and suspected OUD or OUD categories and using the review of medical records as a gold standard, the ML application had a corrected sensitivity of 56.6% (95% CI 48.7%-64.5%) and a corrected specificity of 94.2% (95% CI 90.3%-98.1%). The positive and negative predictive values were 93.3% (95% CI 88.2%-96.3%) and 60.0% (95% CI 50.4%-68.9%), respectively. 
Key themes for disagreements between the ML application and clinician reviews were identified. Conclusions: A systematic comparison was conducted between an ML application and clinicians for identifying OUD risk. The ML application generated clinically valid and useful alerts about patients' different OUD risk levels. ML applications hold promise for identifying patients at differing levels of OUD risk and will likely complement traditional rule-based approaches to generating alerts about opioid safety issues.
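The four accuracy measures reported in this abstract all derive from a 2x2 table of application alerts against the clinician reference standard. The sketch below shows the uncorrected definitions only; the counts are hypothetical, and the sampling correction behind the "corrected" sensitivity and specificity in the abstract is not reproduced.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table (test result vs. reference standard)."""
    return {
        "sensitivity": tp / (tp + fn),  # share of reference-positive patients that were flagged
        "specificity": tn / (tn + fp),  # share of reference-negative patients that were not flagged
        "ppv": tp / (tp + fp),          # share of flagged patients that are reference-positive
        "npv": tn / (tn + fn),          # share of unflagged patients that are reference-negative
    }

# Hypothetical counts, not taken from the study.
metrics = diagnostic_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because predictive values depend on how common the condition is in the sample, values computed from a stratified sample like the one described would need reweighting before being read as population estimates.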

11.
Front Microbiol ; 15: 1373013, 2024.
Article in English | MEDLINE | ID: mdl-38835486

ABSTRACT

Background: This study aimed to clarify the relationship between the gut microbiota and osteoporosis combining Mendelian randomization (MR) analysis with animal experiments. Methods: We conducted an analysis on the relationship between differential bacteria and osteoporosis using open-access genome-wide association study (GWAS) data on gut microbe and osteoporosis obtained from public databases. The analysis was performed using two-sample MR analysis, and the causal relationship was examined through inverse variance weighting (IVW), MR Egger, weighted median, and weighted mode methods. Bilateral oophorectomy was employed to replicate the mouse osteoporosis model, which was assessed by micro computed tomography (CT), pathological tests, and bone transformation indexes. Additionally, 16S rDNA sequencing was conducted on fecal samples, while SIgA and indexes of IL-6, IL-1β, and TNF-α inflammatory factors were examined in colon samples. Through immunofluorescence and histopathology, expression levels of tight junction proteins, such as claudin-1, ZO-1, and occludin, were assessed, and correlation analysis of differential bacteria and related environmental factors was performed. Results: A positive correlation was observed between g_Ruminococcus1 and the risk of osteoporosis, while O_Burkholderiales showed a negative correlation with the risk of osteoporosis. Furthermore, there was no evidence of heterogeneity or pleiotropy. The successful replication of the mouse osteoporosis model was assessed, and it was found that the abundance of the O_Burkholderiales was significantly reduced, while the abundance of g_Ruminococcus was significantly increased in the ovariectomized (OVX)-mice. The intestinal SIgA level of OVX mice decreased, the expression level of inflammatory factors increased, barrier damage occurred, and the content of LPS in the colon and serum significantly increased.
The abundance level of O_Burkholderiales was strongly positively correlated with bone formation factors, gut barrier indicators, bone density, bone volume fraction, and trabecular bone quantity, whereas it was strongly negatively correlated with bone resorption factors and intestinal inflammatory factors. The abundance level of g_Ruminococcus showed a strong negative correlation with bone formation factors, gut barrier indicators, and bone volume fraction, and a strong positive correlation with bone resorption factors and intestinal inflammatory factors. Conclusion: O_Burkholderiales and g_Ruminococcus may regulate the development of osteoporosis through the microbiota-gut-bone axis.

12.
Sci Total Environ ; 941: 173555, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-38806120

ABSTRACT

A sound evaluation of the cadmium (Cd) mass balance in agricultural soils needs accurate data of Cd leaching. Reported Cd concentrations from in situ studies are often one order of magnitude lower than predicted by empirical models, which were calibrated to pore water data from stored soils. It is hypothesized that this discrepancy is related to the preferential flow of water (non-equilibrium) and/or artefacts caused by drying and rewetting soils prior to pore water analysis. These hypotheses were tested on multiple soils (n = 27) with contrasting properties. Pore waters were collected by soil centrifugation from field fresh soil samples and also after incubating the same soils (28 days, 20 °C), following two drying-rewetting cycles, the idea being that chemical equilibrium in the soil is reached after incubation. Incubation increased pore water Cd by a factor 4, on average, and up to a factor 16. That increase was statistically related to the decrease of pore water pH and the increase of nitrate, both mainly related to incubation-induced nitrification. After correcting for both factors, the Cd rise was also highest at higher pore water Ca. This suggests that higher Ca in soil enlarges Cd concentration gradients among pore classes in field fresh soils because high Ca promotes soil aggregation and separation of mobile from immobile water. Several empirical models were used to predict pore water Cd. Predictions exceeded observations up to a factor 30 for the fresh pore waters but matched well with those of incubated soils; again, deviations from the 1:1 line in field fresh soils were largest in high Ca (>0.8 mM) soils, suggesting that local equilibrium conditions in field fresh soils are not found at higher Ca. Our results demonstrate that empirical models need recalibration with field fresh pore water data to make accurate soil Cd mass balances in risk assessments.

13.
Sensors (Basel) ; 24(9), 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38732969

ABSTRACT

The recent scientific literature abounds in proposals of seizure forecasting methods that exploit machine learning to automatically analyze electroencephalogram (EEG) signals. Deep learning algorithms seem to achieve a particularly remarkable performance, suggesting that the implementation of clinical devices for seizure prediction might be within reach. However, most of the research evaluated the robustness of automatic forecasting methods through randomized cross-validation techniques, while clinical applications require much more stringent validation based on patient-independent testing. In this study, we show that automatic seizure forecasting can be performed, to some extent, even on independent patients who have never been seen during the training phase, thanks to the implementation of a simple calibration pipeline that can fine-tune deep learning models, even on a single epileptic event recorded from a new patient. We evaluate our calibration procedure using two datasets containing EEG signals recorded from a large cohort of epileptic subjects, demonstrating that the forecast accuracy of deep learning methods can increase on average by more than 20%, and that performance improves systematically in all independent patients. We further show that our calibration procedure works best for deep learning models, but can also be successfully applied to machine learning algorithms based on engineered signal features. Although our method still requires at least one epileptic event per patient to calibrate the forecasting model, we conclude that focusing on realistic validation methods allows different machine learning approaches for seizure prediction to be compared more reliably, enabling the implementation of robust and effective forecasting systems that can be used in daily healthcare practice.


Subject(s)
Algorithms , Deep Learning , Electroencephalography , Seizures , Humans , Electroencephalography/methods , Seizures/diagnosis , Seizures/physiopathology , Calibration , Signal Processing, Computer-Assisted , Epilepsy/diagnosis , Epilepsy/physiopathology , Machine Learning
14.
Stat Med ; 43(14): 2830-2852, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-38720592

ABSTRACT

INTRODUCTION: There is currently no guidance on how to assess the calibration of multistate models used for risk prediction. We introduce several techniques that can be used to produce calibration plots for the transition probabilities of a multistate model, before assessing their performance in the presence of random and independent censoring through a simulation. METHODS: We studied pseudo-values based on the Aalen-Johansen estimator, binary logistic regression with inverse probability of censoring weights (BLR-IPCW), and multinomial logistic regression with inverse probability of censoring weights (MLR-IPCW). The MLR-IPCW approach results in a calibration scatter plot, providing extra insight about the calibration. We simulated data with varying levels of censoring and evaluated the ability of each method to estimate the calibration curve for a set of predicted transition probabilities. We also developed and evaluated the calibration of a model predicting the incidence of cardiovascular disease, type 2 diabetes and chronic kidney disease among a cohort of patients derived from linked primary and secondary healthcare records. RESULTS: The pseudo-value, BLR-IPCW, and MLR-IPCW approaches give unbiased estimates of the calibration curves under random censoring. These methods remained predominantly unbiased in the presence of independent censoring, even if the censoring mechanism was strongly associated with the outcome, with bias concentrated in low-density regions of predicted transition probability. CONCLUSIONS: We recommend implementing either the pseudo-value or BLR-IPCW approaches to produce a calibration curve, combined with the MLR-IPCW approach to produce a calibration scatter plot. The methods have been incorporated into the "calibmsm" R package available on CRAN.


Subject(s)
Computer Simulation , Diabetes Mellitus, Type 2 , Models, Statistical , Humans , Diabetes Mellitus, Type 2/epidemiology , Risk Assessment/methods , Risk Assessment/statistics & numerical data , Logistic Models , Calibration , Cardiovascular Diseases/epidemiology , Renal Insufficiency, Chronic/epidemiology , Probability
15.
Pediatr Cardiol ; 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730015

ABSTRACT

Assessment of pulmonary regurgitation (PR) guides treatment for patients with congenital heart disease. Quantitative assessment of PR fraction (PRF) by echocardiography is limited. Cardiac MRI (cMRI) is the reference-standard for PRF quantification. We created an algorithm to predict cMRI-quantified PRF from echocardiography using machine learning (ML). We retrospectively performed echocardiographic measurements paired to cMRI within 3 months in patients with ≥ mild PR from 2009 to 2022. Model inputs were vena contracta ratio, PR index, PR pressure half-time, main and branch pulmonary artery diastolic flow reversal (BPAFR), and transannular patch repair. A gradient boosted trees ML algorithm was trained using k-fold cross-validation to predict cMRI PRF by phase contrast imaging as a continuous number and at > mild (PRF ≥ 20%) and severe (PRF ≥ 40%) thresholds. Regression performance was evaluated with mean absolute error (MAE), and at clinical thresholds with area-under-the-receiver-operating-characteristic curve (AUROC). Prediction accuracy was compared to historical clinician accuracy. We externally validated prior reported studies for comparison. We included 243 subjects (median age 21 years, 58% repaired tetralogy of Fallot). The regression MAE = 7.0%. For prediction of > mild PR, AUROC = 0.96, but BPAFR alone outperformed the ML model (sensitivity 94%, specificity 97%). The ML model detection of severe PR had AUROC = 0.86, but in the subgroup with BPAFR, performance dropped (AUROC = 0.73). Accuracy between clinicians and the ML model was similar (70% vs. 69%). There was decrement in performance of prior reported algorithms on external validation in our dataset. A novel ML model for echocardiographic quantification of PRF outperforms prior studies and has comparable overall accuracy to clinicians. BPAFR is an excellent marker for > mild PRF, and has moderate capacity to detect severe PR, but more work is required to distinguish moderate from severe PR. 
Poor external validation of prior works highlights reproducibility challenges.

16.
Water Res ; 258: 121806, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38796911

ABSTRACT

This work investigates the validation and application of a competitive model approach for full-scale wastewater treatment plants (WWTP) with external recirculation of partially loaded powdered activated carbon (PAC) for removal of organic micropollutants (OMP). It is based on the ideal adsorbed solution theory (IAST) for multisolute mixtures combined with calibration of fictive organic components and correction of single-solute model parameters for OMP by use of the tracer model (TRM). Adsorption kinetics are represented by a pseudo-first-order reaction (PFO) and compared to mass transfer calculated with the homogeneous surface diffusion model (HSDM). Model validation with operational data from two different WWTPs showed a strong dependency of model results on the batch sample quality used for model calibration. In contrast, the kinetic approach is of less importance for predicting full-scale OMP removal with long PAC sludge retention times. Further model application demonstrated that external PAC recirculation significantly improves the OMP removal with regard to both adsorption capacity and compensation of competitive effects of Dissolved Organic Carbon (DOC).


Subject(s)
Charcoal , Waste Disposal, Fluid , Wastewater , Water Pollutants, Chemical , Adsorption , Water Pollutants, Chemical/chemistry , Waste Disposal, Fluid/methods , Charcoal/chemistry , Wastewater/chemistry , Water Purification/methods , Kinetics , Models, Theoretical , Carbon/chemistry
17.
BMC Med Res Methodol ; 24(1): 115, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38760688

ABSTRACT

BACKGROUND: Nested case-control (NCC) designs are efficient for developing and validating prediction models that use expensive or difficult-to-obtain predictors, especially when the outcome is rare. Previous research has focused on how to develop prediction models in this sampling design, but little attention has been given to model validation in this context. We therefore aimed to systematically characterize the key elements for the correct evaluation of the performance of prediction models in NCC data. METHODS: We proposed how to correctly evaluate prediction models in NCC data, by adjusting performance metrics with sampling weights to account for the NCC sampling. We included in this study the C-index, threshold-based metrics, Observed-to-expected events ratio (O/E ratio), calibration slope, and decision curve analysis. We illustrated the proposed metrics with a validation of the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA version 5) in data from the population-based Rotterdam study. We compared the metrics obtained in the full cohort with those obtained in NCC datasets sampled from the Rotterdam study, with and without a matched design. RESULTS: Performance metrics without weight adjustment were biased: the unweighted C-index in NCC datasets was 0.61 (0.58-0.63) for the unmatched design, while the C-index in the full cohort and the weighted C-index in the NCC datasets were similar: 0.65 (0.62-0.69) and 0.65 (0.61-0.69), respectively. The unweighted O/E ratio was 18.38 (17.67-19.06) in the NCC datasets, while it was 1.69 (1.42-1.93) in the full cohort and its weighted version in the NCC datasets was 1.68 (1.53-1.84). Similarly, weighted adjustments of threshold-based metrics and net benefit for decision curves were unbiased estimates of the corresponding metrics in the full cohort, while the corresponding unweighted metrics were biased. 
In the matched design, the bias of the unweighted metrics was larger, but it could also be compensated by the weight adjustment. CONCLUSIONS: Nested case-control studies are an efficient solution for evaluating the performance of prediction models that use expensive or difficult-to-obtain biomarkers, especially when the outcome is rare, but the performance metrics need to be adjusted to the sampling procedure.
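The effect of weighting on the observed-to-expected ratio can be seen in a small sketch. It assumes each sampled subject carries a weight equal to the number of cohort members it represents; the function name, outcomes, risks, and weights below are hypothetical, and the authors' exact weighting scheme may differ.

```python
def oe_ratio(outcomes, predicted_risks, weights=None):
    """Observed-to-expected events ratio, optionally weighted by sampling weights."""
    if weights is None:
        weights = [1.0] * len(outcomes)
    observed = sum(w * y for w, y in zip(weights, outcomes))
    expected = sum(w * p for w, p in zip(weights, predicted_risks))
    return observed / expected

# Hypothetical nested case-control sample: every case is kept (weight 1),
# and each sampled control stands in for 10 cohort members (weight 10).
outcomes = [1, 1, 0, 0, 0, 0]
risks = [0.10, 0.08, 0.04, 0.05, 0.03, 0.04]
weights = [1, 1, 10, 10, 10, 10]

unweighted = oe_ratio(outcomes, risks)
weighted = oe_ratio(outcomes, risks, weights)
print(f"unweighted O/E: {unweighted:.2f}, weighted O/E: {weighted:.2f}")
```

Cases are over-represented in the sample, so the unweighted ratio overstates the observed event rate relative to the model's expectation; weighting restores the cohort proportions, which is the direction of the bias the abstract reports.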


Subject(s)
Algorithms , Humans , Case-Control Studies , Female , Models, Statistical , Breast Neoplasms , Ovarian Neoplasms , Middle Aged , Aged
18.
Heliyon ; 10(10): e31359, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38803864

ABSTRACT

Coking is regarded as a predominant source of air pollution. Despite the adoption of more environmentally friendly equipment, it remains worth examining whether the coking enterprises in the Beijing-Tianjin-Hebei (BTH) region still cause regional air pollution, a question that is essential for the control of coking enterprises in this area. To improve the prediction accuracy of large-scale air pollutant distribution, the air particle distribution in the BTH region was simulated via land use regression (LUR) combined with Bayesian maximum entropy (BME); the distribution was then correlated with the exhaust gas emitted from coking enterprises. Results indicated that the R² of the "LUR + BME" method reached 0.95, higher than the 0.82 obtained using LUR alone. The air quality distribution presented a pattern of "low in the northern mountains and high in the southern plains", similar to the distribution of coking enterprises in the BTH region. A significant correlation was found between exhaust emissions from coking enterprises and air quality in the BTH region, confirming the contribution of coking emissions to air pollution in this region and the need to continue strict control of coking enterprises in the BTH area.

19.
BMC Res Notes ; 17(1): 105, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622619

ABSTRACT

OBJECTIVE: To build and validate an early risk prediction model for gestational diabetes mellitus (GDM) based on first-trimester electronic medical records including maternal demographic and clinical risk factors. METHODS: To develop and validate a GDM prediction model, two datasets were used in this retrospective study. One included data of 14,015 pregnant women from Máxima Medical Center (MMC) in the Netherlands. The other was from the open-source database nuMoM2b and included data of 10,038 nulliparous pregnant women, collected in the USA. Widely used maternal demographic and clinical risk factors were considered for modeling. A GDM prediction model based on elastic net logistic regression was trained on a subset of the MMC data. Internal validation was performed on the remaining MMC data to evaluate the model performance. For external validation, the prediction model was tested on an external test set from the nuMoM2b dataset. RESULTS: An area under the receiver-operating-characteristic curve (AUC) of 0.81 was achieved for early prediction of GDM on the MMC test data, comparable to the performance reported in previous studies. The performance decreased markedly to an AUC of 0.69 when the MMC-based model was tested on the external nuMoM2b test data, close to the performance of a model trained and tested on the nuMoM2b dataset only (AUC = 0.70).


Subject(s)
Diabetes, Gestational , Pregnancy , Female , Humans , Diabetes, Gestational/diagnosis , Diabetes, Gestational/epidemiology , Retrospective Studies , Risk Factors , Pregnancy Trimester, First , Demography
20.
J Cheminform ; 16(1): 43, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622648

ABSTRACT

Multiple metrics are used when assessing and validating the performance of quantitative structure-activity relationship (QSAR) models. In the case of binary classification, balanced accuracy is a metric to assess the global performance of such models. In contrast to accuracy, balanced accuracy does not depend on the respective prevalence of the two categories in the test set that is used to validate a QSAR classifier. As such, balanced accuracy is used to overcome the effect of imbalanced test sets on the model's perceived accuracy. Matthews' correlation coefficient (MCC), an alternative global performance metric, is also known to mitigate the imbalance of the test set. However, in contrast to the balanced accuracy, MCC remains dependent on the respective prevalence of the predicted categories. For simplicity, the rest of this work is based on the positive prevalence. The MCC value may be underestimated at high or extremely low positive prevalence. This complicates comparisons between experiments that use test sets with different positive prevalences and may lead to incorrect interpretations. The concept of balanced metrics beyond balanced accuracy is, to the best of our knowledge, not yet described in the cheminformatics literature. Therefore, after describing the relevant literature, this manuscript will first formally define a confusion matrix, sensitivity and specificity and then present, with synthetic data, the danger of comparing performance metrics under nonconstant prevalence. Second, it will demonstrate that balanced accuracy is the performance metric accuracy calibrated to a test set with a positive prevalence of 50% (i.e., balanced test set). This concept of balanced accuracy will then be extended to the MCC after showing its dependency on the positive prevalence. Applying the same concept to any other performance metric and widening it to the concept of calibrated metrics will then be briefly discussed.
We will show that, like balanced accuracy, any balanced performance metric may be expressed as a function of the well-known values of sensitivity and specificity. Finally, a tale of two MCCs will exemplify the use of this concept of balanced MCC versus MCC with four use cases using synthetic data. SCIENTIFIC CONTRIBUTION: This work provides a formal, unified framework for understanding prevalence dependence in model validation metrics, deriving balanced metric expressions beyond balanced accuracy, and demonstrating their practical utility for common use cases. In contrast to prior literature, it introduces the derived confusion matrix to express metrics as functions of sensitivity, specificity and prevalence without needing additional coefficients. The manuscript extends the concept of balanced metrics to Matthews' correlation coefficient and other widely used performance indicators, enabling robust comparisons under prevalence shifts.
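
The idea of expressing metrics through sensitivity, specificity and prevalence can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: it assumes the "derived confusion matrix" holds cell proportions obtained by splitting the positive prevalence by sensitivity and the negative prevalence by specificity, and all function names are hypothetical.

```python
import math


def derived_confusion_matrix(sensitivity, specificity, prevalence):
    """Cell proportions (TP, FP, TN, FN) implied by sensitivity, specificity and positive prevalence."""
    tp = prevalence * sensitivity
    fn = prevalence * (1 - sensitivity)
    tn = (1 - prevalence) * specificity
    fp = (1 - prevalence) * (1 - specificity)
    return tp, fp, tn, fn


def accuracy(sensitivity, specificity, prevalence):
    tp, fp, tn, fn = derived_confusion_matrix(sensitivity, specificity, prevalence)
    return tp + tn


def mcc(sensitivity, specificity, prevalence):
    """Matthews' correlation coefficient computed from the derived confusion matrix."""
    tp, fp, tn, fn = derived_confusion_matrix(sensitivity, specificity, prevalence)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


def balanced_accuracy(sensitivity, specificity):
    """Accuracy calibrated to a positive prevalence of 50%."""
    return accuracy(sensitivity, specificity, 0.5)


def balanced_mcc(sensitivity, specificity):
    """MCC calibrated to a positive prevalence of 50%."""
    return mcc(sensitivity, specificity, 0.5)


# Same classifier (sensitivity 0.8, specificity 0.9) evaluated at two prevalences.
mcc_rare = mcc(0.8, 0.9, 0.01)
mcc_balanced = balanced_mcc(0.8, 0.9)
bacc = balanced_accuracy(0.8, 0.9)
```

With sensitivity and specificity held fixed, the MCC at 1% positive prevalence comes out well below the balanced MCC, which illustrates why comparing MCC values across test sets with different prevalences can mislead.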
