Results 1 - 20 of 48
1.
Comput Biol Med ; 182: 109093, 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39232407

ABSTRACT

The heightened prevalence of respiratory disorders, exacerbated by a significant upswing in fatalities due to the novel coronavirus, underscores the critical need for early detection and timely intervention, which have the potential to safeguard numerous lives. Chest radiography stands out as an essential and economically viable medical imaging approach for diagnosing and assessing the severity of diverse respiratory disorders. However, detecting these disorders in chest X-rays is a cumbersome task even for well-trained radiologists owing to low contrast, overlapping tissue structures, subjective variability, and the presence of noise. To address these issues, this work introduces a novel analytical model termed Exponential Pixelating Integral for the automatic detection of infections in chest X-rays. First, the Exponential Pixelating Integral enhances the pixel intensities to overcome the low-contrast issues. The enhanced images are then polar-transformed and represented using the locally invariant Mandelbrot and Julia fractal geometries for effective distinction of structural features. The collated features, labeled Exponential Pixelating Integral with dually characterized fractal features, are then classified by the non-parametric multivariate adaptive regression splines to establish an ensemble model between each pair of classes for effective diagnosis of diverse diseases. Rigorous analysis of the proposed classification framework on large benchmark medical datasets showcases its superiority over its peers, registering classification accuracies of 98.46-99.45% and F1 scores of 96.53-98.10%, making it a precise and interpretable automated system for diagnosing respiratory disorders.

2.
J Family Med Prim Care ; 13(7): 2683-2691, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39071025

ABSTRACT

Objectives: Coronavirus disease 2019 (COVID-19) emerged as a global pandemic during 2019 to 2022. The gold standard method of detecting this disease is reverse transcription-polymerase chain reaction (RT-PCR). However, RT-PCR has a number of shortcomings. Hence, the objective is to propose a cheap and effective method of detecting COVID-19 infection by using machine learning (ML) techniques, which encompasses five basic parameters as an alternative to the costly RT-PCR. Materials and Methods: Two machine learning-based predictive models, namely, Artificial Neural Network (ANN) and Multivariate Adaptive Regression Splines (MARS), are designed for predicting COVID-19 infection as a cheaper and simpler alternative to RT-PCR utilizing five basic parameters [i.e., age, total leucocyte count, red blood cell count, platelet count, C-reactive protein (CRP)]. Each of these parameters was studied, and correlation is drawn with COVID-19 diagnosis and progression. These laboratory parameters were evaluated in 171 patients who presented with symptoms suspicious of COVID-19 in a hospital at Kharagpur, India, from April to August 2022. Out of a total of 171 patients, 88 and 83 were found to be COVID-19-negative and COVID-19-positive, respectively. Results: The accuracies of the predicted class are found to be 97.06% and 91.18% for ANN and MARS, respectively. CRP is found to be the most significant input parameter. Finally, two predictive mathematical equations for each ML model are provided, which can be quite useful to detect the COVID-19 infection easily. Conclusion: It is expected that the present study will be useful to the medical practitioners for predicting the COVID-19 infection in patients based on only five very basic parameters.

3.
J Pers Med ; 14(1)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38276247

ABSTRACT

PURPOSE: The treatment of childhood myopia often involves the use of topical atropine, which has been demonstrated to be effective in decelerating the progression of myopia. It is crucial to monitor intraocular pressure (IOP) to ensure the safety of topical atropine. This study aims to identify the optimal machine learning IOP-monitoring module and establish a precise baseline IOP as a clinical safety reference for atropine medication. METHODS: Data from 1545 eyes of 1171 children receiving atropine for myopia were retrospectively analyzed. Nineteen variables including patient demographics, medical history, refractive error, and IOP measurements were considered. The data were analyzed using a multivariate adaptive regression spline (MARS) model to analyze the impact of different factors on the End IOP. RESULTS: The MARS model identified age, baseline IOP, End Spherical, duration of previous atropine treatment, and duration of current atropine treatment as the five most significant factors influencing the End IOP. The outcomes revealed that the baseline IOP had the most significant effect on final IOP, exhibiting a notable knot at 14 mmHg. When the baseline IOP was equal to or exceeded 14 mmHg, there was a positive correlation between atropine use and End IOP, suggesting that atropine may increase the End IOP in children with a baseline IOP greater than 14 mmHg. CONCLUSIONS: MARS model demonstrates a better ability to capture nonlinearity than classic multiple linear regression for predicting End IOP. It is crucial to acknowledge that administrating atropine may elevate intraocular pressure when the baseline IOP exceeds 14 mmHg. These findings offer valuable insights into factors affecting IOP in children undergoing atropine treatment for myopia, enabling clinicians to make informed decisions regarding treatment options.
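
The knot reported in this abstract can be pictured with the hinge functions that MARS uses as basis functions, max(0, x - t) and max(0, t - x). The sketch below is illustrative only: the intercept and slopes are made-up values, not coefficients from the study, and `end_iop_sketch` is a hypothetical name.

```python
def hinge(x, knot, direction=1):
    """MARS basis function: max(0, x - knot) for direction +1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def end_iop_sketch(baseline_iop, knot=14.0, intercept=15.0,
                   slope_above=0.6, slope_below=0.2):
    """Piecewise-linear response with a single knot; all coefficients are hypothetical."""
    return (intercept
            + slope_above * hinge(baseline_iop, knot, +1)
            - slope_below * hinge(baseline_iop, knot, -1))


# The slope changes at the knot (14 mmHg), which is what a MARS knot expresses.
for iop in (10.0, 14.0, 18.0):
    print(iop, round(end_iop_sketch(iop), 2))
```

Because each hinge is zero on one side of the knot, the fitted response can have a different slope below and above the threshold while staying continuous.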

4.
Front Neurol ; 14: 1283214, 2023.
Article in English | MEDLINE | ID: mdl-38156090

ABSTRACT

Predicting the length of hospital stay for myasthenia gravis (MG) patients is challenging due to the complex pathogenesis, high clinical variability, and non-linear relationships between variables. Considering the management of MG during hospitalization, it is important to conduct a risk assessment to predict the length of hospital stay. The present study aimed to successfully predict the length of hospital stay for MG based on an expandable data mining technique, multivariate adaptive regression splines (MARS). Data from 196 MG patients' hospitalization were analyzed, and the MARS model was compared with classical multiple linear regression (MLR) and three other machine learning (ML) algorithms. The average hospital stay duration was 12.3 days. The MARS model, leveraging its ability to capture non-linearity, identified four significant factors: disease duration, age at admission, MGFA clinical classification, and daily prednisolone dose. Cut-off points and correlation curves were determined for these risk factors. The MARS model outperformed the MLR and the other ML methods (including least absolute shrinkage and selection operator MLR, classification and regression tree, and random forest) in assessing hospital stay length. This is the first study to utilize data mining methods to explore factors influencing hospital stay in patients with MG. The results highlight the effectiveness of the MARS model in identifying the cut-off points and correlation for risk factors associated with MG hospitalization. Furthermore, a MARS-based formula was developed as a practical tool to assist in the measurement of hospital stay, which can be feasibly supported as an extension of clinical risk assessment.

5.
Heliyon ; 9(10): e20730, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37842586

ABSTRACT

The consumer price index (CPI) is one of the most important macroeconomic indicators for determining inflation, and accurate predictions of CPI changes are important for a country's economic development. This study uses multivariate linear regression (MLR), support vector regression (SVR), autoregressive distributed lag (ARDL), and multivariate adaptive regression splines (MARS) to predict the CPI of the United States. Data from January 2017 to February 2022 were randomly selected and divided into two stages: 80% for training and 20% for testing. The US CPI was modeled for the observed period and relied on a mix of elements, including crude oil price, world gold price, and federal fund effective rate. Evaluation metrics (mean absolute percentage value, mean absolute error, root mean square error, R-squared, and correlation of determination) were employed to evaluate the forecasted values. The MLR, SVR, ARDL, and MARS models all attained high accuracy, while the MARS algorithm generated more accurate US CPI forecasts than the others in the testing phase. These outputs could support the US government in overseeing economic policies, sectors, and social security, thereby boosting national economic development.
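
The evaluation metrics named in this abstract have standard definitions, sketched below in plain Python. This is a minimal illustration, assuming that "mean absolute percentage value" refers to the usual mean absolute percentage error; the numbers are toy values, not data from the study.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; assumes no true value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy index values, not data from the study.
observed = [100.0, 102.0, 104.0, 106.0]
predicted = [101.0, 101.0, 105.0, 105.0]
print(mae(observed, predicted), rmse(observed, predicted),
      mape(observed, predicted), r_squared(observed, predicted))
```

Comparing models on a held-out 20% test split, as the abstract describes, amounts to computing these quantities on the test rows only.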

6.
Heliyon ; 9(9): e19964, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37809827

ABSTRACT

Multivariate Adaptive Regression Splines (MARS) is a useful non-parametric regression analysis method that can be used for model selection in high-dimensional data. Since MARS can identify and model complex, non-linear relationships between the dependent variable and independent variables without requiring any assumptions, it has an advantage over simple linear regression techniques. Also, to simplify the model-building process and prevent overfitting, MARS can automatically select the variables to be included in the model, which is useful for datasets with many variables. Within the MARS framework, the generalized cross-validation (GCV) technique is used to avoid overfitting and to select the best model. The GCV criterion is widely used and can be effective in many situations; however, it has drawn some criticism: the value of the smoothing parameter used in the GCV criterion is arbitrary, and the models obtained using this criterion are high-dimensional. This paper aims to obtain the barest model that best explains the relationship between the dependent variable and independent variables by using alternative information criteria (Akaike information criterion (AIC), Schwarz Bayesian criterion (SBC) and information complexity criterion (ICOMP(IFIM)PEU)) instead of smoothing parameters, in order to address this criticism. To achieve this goal, a simulation study was first conducted with a data set composed of variables that do and do not contribute to the dependent variable to test the success of the information criteria. In this simulation, the variables that do not contribute to the dependent variable were not included in the regression model, which demonstrates the success of the criteria in model selection.
As a real data set, the reasons for loan defaults were investigated between the years 2005-2019 by utilizing data from 18 banks operating in Türkiye. The results obtained reveal the success of ICOMP(IFIM)PEU criterion in model selection.
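
The role of the smoothing parameter in GCV can be made concrete with a short sketch. It assumes one common formulation, the one documented for the R `earth` package, in which the effective number of parameters is the number of terms plus penalty × (terms − 1) / 2; the paper may use a different accounting. The AIC and SBC functions use the Gaussian forms up to an additive constant, and ICOMP is omitted. All inputs are hypothetical.

```python
import math


def gcv(rss, n_obs, n_terms, penalty=3.0):
    """Generalized cross-validation score under the assumed formulation.

    n_terms counts basis functions including the intercept; penalty is the
    smoothing parameter (values of 2 to 3 are conventional).
    """
    effective_params = n_terms + penalty * (n_terms - 1) / 2.0
    if effective_params >= n_obs:
        return float("inf")  # model too complex for the sample size
    return rss / (n_obs * (1.0 - effective_params / n_obs) ** 2)


def aic(rss, n_obs, n_params):
    """Gaussian-likelihood AIC, up to an additive constant."""
    return n_obs * math.log(rss / n_obs) + 2.0 * n_params


def sbc(rss, n_obs, n_params):
    """Schwarz Bayesian criterion (BIC), up to an additive constant."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)


# Hypothetical fits on 100 observations: a 5-term and a 9-term model.
print(gcv(50.0, 100, 5), gcv(46.0, 100, 9))
```

With these made-up numbers the 9-term model has the lower residual sum of squares, but the 5-term model has the lower GCV score, and changing `penalty` changes how strongly extra terms are punished. That dependence is the arbitrariness the abstract refers to.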

7.
Sensors (Basel) ; 23(13)2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37447814

ABSTRACT

The prediction of soil properties at different depths is an important research topic for promoting the conservation of black soils and the development of precision agriculture. Mid-infrared spectroscopy (MIR, 2500-25000 nm) has shown great potential in predicting soil properties. This study aimed to explore the ability of MIR to predict soil organic matter (OM) and total nitrogen (TN) at five different depths with the calibration from the whole depth (0-100 cm) or the shallow layers (0-40 cm) and compare its performance with visible and near-infrared spectroscopy (vis-NIR, 350-2500 nm). A total of 90 soil samples containing 450 subsamples (0-10 cm, 10-20 cm, 20-40 cm, 40-70 cm, and 70-100 cm depths) and their corresponding MIR and vis-NIR spectra were collected from a field of black soil in Northeast China. Multivariate adaptive regression splines (MARS) were used to build prediction models. The results showed that prediction models based on MIR (OM: RMSEp = 1.07-3.82 g/kg, RPD = 1.10-5.80; TN: RMSEp = 0.11-0.15 g/kg, RPD = 1.70-4.39) outperformed those based on vis-NIR (OM: RMSEp = 1.75-8.95 g/kg, RPD = 0.50-3.61; TN: RMSEp = 0.12-0.27 g/kg; RPD = 1.00-3.11) because of the higher number of characteristic bands. Prediction models based on the whole depth calibration (OM: RMSEp = 1.09-2.97 g/kg, RPD = 2.13-5.80; TN: RMSEp = 0.08-0.19 g/kg, RPD = 1.86-4.39) outperformed those based on the shallow layers (OM: RMSEp = 1.07-8.95 g/kg, RPD = 0.50-3.93; TN: RMSEp = 0.11-0.27 g/kg, RPD = 1.00-2.24) because the soil sample data of the whole depth had a larger and more representative sample size and a wider distribution. However, prediction models based on the whole depth calibration might provide lower accuracy in some shallow layers. Accordingly, it is suggested that the methods pertaining to soil property prediction based on the spectral library should be considered in future studies for an optimal approach to predicting soil properties at specific depths. 
This study verified the superiority of MIR for soil property prediction at specific depths and confirmed the advantage of modeling with the whole depth calibration, pointing out a possible optimal approach and providing a reference for predicting soil properties at specific depths.
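
The two figures of merit quoted in this abstract, RMSEp and RPD, can be computed as sketched below. The sketch assumes the usual definition of RPD as the standard deviation of the reference values divided by RMSEp, and uses the sample standard deviation; the study may use a different convention. The values are toy numbers, not data from the study.

```python
import math
import statistics


def rmsep(observed, predicted):
    """Root mean square error of prediction."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of reference values over RMSEp."""
    return statistics.stdev(observed) / rmsep(observed, predicted)


# Toy organic-matter values in g/kg, not data from the study.
om_observed = [10.0, 20.0, 30.0, 40.0, 50.0]
om_predicted = [12.0, 18.0, 31.0, 39.0, 52.0]
print(rmsep(om_observed, om_predicted), rpd(om_observed, om_predicted))
```

Because RPD scales the error by the spread of the reference values, a layer with little variation in OM or TN can show a low RPD even when its RMSEp is small.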


Subjects
Agriculture; Soil; Spectrophotometry, Infrared; Spectroscopy, Near-Infrared; Nitrogen/analysis; Soil/chemistry; Spectrophotometry, Infrared/standards; Spectroscopy, Near-Infrared/standards; Models, Theoretical; Agriculture/instrumentation; Agriculture/methods
8.
Article in English | MEDLINE | ID: mdl-37309763

ABSTRACT

BACKGROUND: Anthrapyrazoles are a new class of antitumor agents and successors to anthracyclines possessing a broad range of antitumor activity in various model tumors. OBJECTIVE: The present study introduces novel QSAR models for the prediction of antitumor activity of anthrapyrazole analogues. METHODS: The predictive performance of four machine learning algorithms, namely artificial neural networks, boosted trees, multivariate adaptive regression splines, and random forest, was studied in terms of variation of the observed and predicted data, internal validation, predictability, precision, and accuracy. RESULTS: ANN and boosted trees algorithms met the validation criteria. It means that these procedures may be able to forecast the anticancer effects of the anthrapyrazoles studied. Evaluation of validation metrics, calculated for each approach, indicated the artificial neural network (ANN) procedure as the algorithm of choice, especially with regard to the obtained predictability as well as the lowest value of mean absolute error. The designed multilayer perceptron (MLP)-15-7-1 network displayed a high correlation between the predicted and the experimental pIC50 value for the training, test, and validation set. A conducted sensitivity analysis enabled an indication of the most important structural features of the studied activity. CONCLUSION: The ANN strategy combines topographical and topological information and can be used for the design and development of novel anthrapyrazole analogues as anticancer molecules.

9.
Spectrochim Acta A Mol Biomol Spectrosc ; 300: 122944, 2023 Nov 05.
Article in English | MEDLINE | ID: mdl-37269660

ABSTRACT

Oxidative desulfurization (ODS) of diesel fuels has received attention in recent years due to mild working conditions and effective removal of the aromatic sulfur compounds. There is a need for rapid, accurate, and reproducible analytical tools to monitor the performance of ODS systems. During the ODS process, sulfur compounds are oxidized to their corresponding sulfones, which are easily removed by extraction in polar solvents. The amount of extracted sulfones is a reliable indicator of ODS performance, showing both oxidation and extraction efficiency. This article studies the ability of a non-parametric regression algorithm, principal component analysis-multivariate adaptive regression splines (PCA-MARS), as an alternative to back propagation artificial neural network (BP-ANN) to predict the concentration of sulfone removed during the ODS process. Using PCA, variables were compressed to identify principal components (PCs) that best described the data matrix, and the scores of such PCs were used as input variables for the MARS and ANN algorithms. The coefficient of determination in calibration (R2c), root mean square error of calibration (RMSEC) and root mean square error of prediction (RMSEP) were calculated for the PCA-BP-ANN (R2c = 0.9913, RMSEC = 2.4206 and RMSEP = 5.7124) and PCA-MARS (R2c = 0.9841, RMSEC = 2.7934 and RMSEP = 5.8476) models and compared with those of the genetic algorithm partial least squares (GA-PLS) model (R2c = 0.9472, RMSEC = 5.5226 and RMSEP = 9.6417). As the results reveal, both methods are better than GA-PLS in terms of prediction accuracy. The proposed PCA-MARS and PCA-BP-ANN models are robust models that provide similar predictions and can be effectively used to predict sulfone-containing samples. The MARS algorithm builds a flexible model using simpler linear regression and is computationally more efficient than BP-ANN due to data-driven stepwise search, addition, and pruning.


Subjects
Neural Networks, Computer; Sulfur Compounds; Spectroscopy, Fourier Transform Infrared; Principal Component Analysis; Least-Squares Analysis; Sulfones; Oxidative Stress
10.
World J Nucl Med ; 21(4): 276-282, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36398299

ABSTRACT

Objective: In the present study, we used a machine learning algorithm to accomplish the task of automated detection of poor-quality scintigraphic images. We validated the accuracy of our machine learning algorithm on 99mTc-methyl diphosphonate (99mTc-MDP) bone scan images. Materials and Methods: Ninety-nine patients underwent 99mTc-MDP bone scan acquisition twice at two different acquisition speeds, one at low speed and another at double the speed of the first scan, with the patient lying in the same position on the scan table. The low-speed acquisition resulted in good-quality images and the high-speed acquisition resulted in poor-quality images. Principal component analysis (PCA) of all the images was performed and the first 32 principal components (PCs) were retained as the feature vector of each image. These 32 features of each image were used for the classification of images into poor or good quality using a machine learning algorithm (multivariate adaptive regression splines [MARS]). The data were split into a training set and a test set in the ratio of 60:40. Hyperparameter tuning of the model was done with five-fold cross-validation. Receiver operating characteristic (ROC) analysis was used to select the optimal model using the largest value of area under the ROC curve. Sensitivity, specificity, and accuracy for the classification of poor- and good-quality images were taken as metrics for the performance of the algorithm. Result: Accuracy, sensitivity, and specificity of the model in classifying poor-quality and good-quality images were 93.22, 93.22, and 93.22%, respectively, for the training dataset and 86.88, 80, and 93.7%, respectively, for the test dataset. Conclusion: Machine learning algorithms can be used to classify poor- and good-quality images with good accuracy (86.88%) using 32 PCs as the feature vector and MARS as the classification model.
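
The three performance metrics reported here follow directly from the counts of a two-class confusion matrix. The sketch below is a minimal illustration with hypothetical counts, chosen only to show the arithmetic; they are not the study's counts, and treating a poor-quality image as the positive class is an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)   # share of positives correctly flagged
    specificity = 100.0 * tn / (tn + fp)   # share of negatives correctly passed
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts; "positive" means a poor-quality image in this sketch.
acc, sens, spec = classification_metrics(tp=32, fn=8, tn=36, fp=4)
print(acc, sens, spec)
```

When the two classes are the same size, accuracy is the mean of sensitivity and specificity, which is why the three figures can coincide, as they do for the training set above.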

11.
Comput Struct Biotechnol J ; 20: 5490-5499, 2022.
Article in English | MEDLINE | ID: mdl-36249559

ABSTRACT

Genome-wide selection (GWS) is one of the contributions of molecular genetics to breeding. Machine learning (ML) and artificial neural network (ANN) methods are non-parametric and can develop more accurate and parsimonious models for GWS analysis. Multivariate Adaptive Regression Splines (MARS) is considered one of the most flexible ML methods, automatically modeling nonlinearities and interactions of the predictor variables. This study aimed to evaluate and compare methods based on ANN, ML (including MARS), and G-BLUP through GWS. An F2 population of 1000 individuals genotyped for 4010 SNP markers was simulated, along with twelve traits from a model considering epistatic effects, with QTL numbers ranging from eight to 480 and heritability (h2) of 0.3, 0.5 or 0.8. Variation in heritability and number of QTL impacts the performance of the methods. For quantitative traits (40, 80, 120, 240, and 480 QTLs), the highest R2 was observed for the Radial Basis Function network (RBF) and G-BLUP, followed by Random Forest (RF), Bagging (BA), and Boosting (BO). RF and BA also showed better results for traits with h2 of 0.3, with R2 values of 16.51% and 16.30%, respectively, while MARS methods showed better results for oligogenic traits, with R2 values ranging from 39.12% to 43.20% for h2 of 0.5 and from 59.92% to 78.56% for h2 of 0.8. Non-additive MARS methods also showed high R2 for traits with high heritability and 240 QTLs or more. ANN and ML methods are powerful tools to predict genetic values in traits with epistatic effects, for different degrees of heritability and QTL numbers.

12.
Nutrients ; 14(9)2022 Apr 19.
Article in English | MEDLINE | ID: mdl-35565657

ABSTRACT

Some controversy remains on thresholds for deficiency or sufficiency of serum 25-hydroxyvitamin D (25(OH)D) levels. Moreover, 25(OH)D levels sufficient for bone health might differ from those required for cancer survival. This study aimed to explore these 25(OH)D threshold levels by applying the machine learning method of multivariable adaptive regression splines (MARS) in post hoc analyses using data from the AMATERASU trial, which randomly assigned Japanese patients with digestive tract cancer to receive vitamin D or placebo supplementation. Using MARS, threshold 25(OH)D levels were estimated as 17 ng/mL for calcium and 29 ng/mL for parathyroid hormone (PTH). Vitamin D supplementation increased calcium levels in patients with baseline 25(OH)D levels ≤17 ng/mL, suggesting deficiency for bone health, but not in those >17 ng/mL. Vitamin D supplementation improved 5-year relapse-free survival (RFS) compared with placebo in patients with intermediate 25(OH)D levels (18−28 ng/mL): vitamin D, 84% vs. placebo, 71%; hazard ratio, 0.49; 95% confidence interval, 0.25−0.96; p = 0.04. In contrast, vitamin D supplementation did not improve 5-year RFS among patients with low (≤17 ng/mL) or with high (≥29 ng/mL) 25(OH)D levels. MARS might be a reliable method with the potential to eliminate guesswork in the estimation of threshold values of biomarkers.


Subjects
Gastrointestinal Neoplasms; Vitamin D Deficiency; Calcium/therapeutic use; Dietary Supplements; Gastrointestinal Neoplasms/drug therapy; Humans; Machine Learning; Neoplasm Recurrence, Local/drug therapy; Parathyroid Hormone; Vitamin D/analogs & derivatives; Vitamin D Deficiency/drug therapy; Vitamins/therapeutic use
13.
Sensors (Basel) ; 22(7)2022 Mar 29.
Article in English | MEDLINE | ID: mdl-35408249

ABSTRACT

Linear dependence of variables is a commonly used assumption in most diagnostic systems for which many robust methodologies have been developed over the years. In case the system nonlinearities are relevant, fault diagnosis methods, relying on the assumption of linearity, might potentially provide unsatisfactory results in terms of false alarms and missed detections. In recent years, many authors have proposed machine learning (ML) techniques to improve fault diagnosis performance to mitigate this problem. Although very powerful, these techniques require faulty data samples that are representative of any fault scenario. Additionally, ML techniques suffer from issues related to overfitting and unpredictable performance in regions which are not fully explored in the training phase. This paper proposes a non-linear additive model to characterize the non-linear redundancy relationships among the system signals. Using the multivariate adaptive regression splines (MARS) algorithm, these relationships are identified directly from the data. Next, the non-linear redundancy relationships are linearized to derive a local time-dependent fault signature matrix. The faulty sensor can then be isolated by measuring the angular distance between the column vectors of the fault signature matrix and the primary residual vector. A quantitative analysis of fault isolation and fault estimation performance is performed by exploiting real data from multiple flights of a semi-autonomous aircraft, thus allowing a detailed quantitative comparison with state-of-the-art machine-learning-based fault diagnosis algorithms.
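
The isolation step described in this abstract, comparing the residual vector with each column of the fault signature matrix by angular distance, can be sketched as follows. The matrix and residual are hypothetical, the function names are illustrative, and the sketch uses the plain angle between vectors; the paper's method may treat fault sign and time dependence differently.

```python
import math


def angle_between(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.acos(cosine)


def isolate_fault(signature_columns, residual):
    """Return the index of the signature column closest in direction to the residual."""
    angles = [angle_between(col, residual) for col in signature_columns]
    return min(range(len(angles)), key=angles.__getitem__)


# Hypothetical signature matrix for three sensors, stored column by column.
signatures = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.5, 0.0, 1.0]]
residual = [0.1, 0.9, 0.4]
print(isolate_fault(signatures, residual))
```

The sensor whose signature column makes the smallest angle with the residual is taken as the faulty one, so the magnitude of the residual does not affect isolation, only its direction.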


Subjects
Machine Learning; Support Vector Machine; Aircraft; Algorithms
14.
Sci Total Environ ; 819: 152893, 2022 May 01.
Article in English | MEDLINE | ID: mdl-34995597

ABSTRACT

The demand for electricity affects the future climate through its effect on greenhouse gas emissions in the electricity generation process, but climate change also impacts electricity demand by changing the need for heating and cooling. Developing reliable temperature response functions (TRFs) that illustrate electricity demand as a function of temperature is key for decreasing uncertainty in future climate projections under a changing climate and for impact assessments of climate change on energy systems. However, this task is challenging because electricity demand is determined by multiple factors that interact in complicated ways because demand fluctuations represent timely human responses to given meteorological conditions. We propose a novel method to acquire reliable TRFs at a regional scale based on comprehensive modeling of electricity demand fluctuations. Six candidate algorithms were examined, and multivariate adaptive regression splines (MARS) was selected as the best algorithm with the dataset used. Using MARS, we constructed models with the capacity to precisely reproduce complex electricity demand patterns based on multiple predictors and simulated the impact of temperature on electricity demand while controlling for the effects of other factors. The temporal segments in TRFs are detected and parameters and functional forms of TRFs for 10 regions in Japan were presented.


Subjects
Electricity; Greenhouse Gases; Climate Change; Heating; Humans; Temperature
15.
Environ Sci Pollut Res Int ; 29(14): 20556-20570, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34739667

ABSTRACT

This study evaluates the potential of kriging-based (kriging and kriging-logistic) and machine learning models (MARS, GBRT, and ANN) in predicting the effluent arsenic concentration of a wastewater treatment plant. Two distinct input combination scenarios were established, using seven quantitative and qualitative independent influent variables. In the first scenario, all of the seven independent variables were taken into account for constructing the data-driven models. For the second input scenario, the forward selection k-fold cross-validation method was employed to select effective explanatory influent parameters. The results obtained from both input scenarios show that the kriging-logistic and machine learning models are effective and robust. However, using the feature selection procedure in the second scenario not only made the architecture of the model simpler and more effective, but also enhanced the performance of the developed models (e.g., around 7.8% performance enhancement of the RMSE). Although the standard kriging method provided the least good predictive results (RMSE = 0.18 µg/L and NSE = 0.75), it was revealed that the kriging-logistic method gave the best performance among the applied models (RMSE = 0.11 µg/L and NSE = 0.90).


Subjects
Arsenic; Water Purification; Machine Learning; Spatial Analysis
16.
Water Environ Res ; 93(11): 2360-2373, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34528328

ABSTRACT

Stream waters play a crucial role in catering to the world's needs with the required quality of water. Due to the discharges of wastewater from the various point and nonpoint sources, most of the watersheds are contaminated easily. The Upper Green River watershed in Kentucky, USA, is one such watershed that is contaminated over the years due to the runoff from rural areas and agricultural lands and combined sewer overflows (CSOs) from urban areas. Monitoring and characterizing the water quality status of streams in such watersheds has become of great importance, with multivariate statistical techniques such as regression, factor analysis, cluster analysis, and artificial intelligence methods such as artificial neural networks (ANNs). The water quality parameters, namely, fecal coliform (FC), turbidity, pH, and conductivity have been predicted quantitatively using ANNs to understand the water quality status of streams in the Upper Green River watershed elsewhere. In this study, a novel attempt has been made to predict the status of the quality of the Green River water with the predictive capabilities of a few decision tree (DT) algorithms such as classification and regression tree (CART) model, multivariate adaptive regression splines (MARS) model, random forest (RF) model, and extreme learning machine (ELM) model. The RF model's performance is better in predicting FC, turbidity, and pH than CART models in training and testing phases. Relatively, MARS and ELM models did better in testing though the performance is poorer in training. For example, we obtain the RMSE values of 2206, 2532, 1533, and 1969 using RF, CART, MARS, and ELM for FC in testing. A good correlation has been observed between conductivity and temperature, precipitation, and land-use factors for the MARS model. Overall, DT models are helpful in understanding, interpreting the outcomes, and visualizing the results compared with the other models. 
PRACTITIONER POINTS: The prediction of stream water quality parameters using decision trees is explored. The climate and land use parameters are used as input parameters to the modeling. The DT models of CART, MARS, RF, and ANNs such as ELM are explored to predict stream water quality. The RF model shows stable results compared with CART, MARS, and ELM for the data explored. Apart from the R2 value, RMSE and MAE indicate the effectiveness of DTs in prediction.


Subjects
Machine Learning; Rivers; Water Quality; Algorithms; Decision Trees; Environmental Monitoring
17.
Environ Sci Pollut Res Int ; 28(4): 4417-4429, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32944856

ABSTRACT

Despite their environmental impact, fossil-fuel power plants are still commonly used due to their high capacity and relatively low cost compared to renewable energy sources. The aim of this paper is to assess the performance of such energy systems as a key element within a fossil-fuel energy supply network. The methodology relies on fossil-fuel power plant modelling to define an optimal energy management level. However, it can be difficult to model the energy management of thermal power stations (TPS). Therefore, this paper presents an energy efficiency model founded on a new hybrid algorithm that combines multivariate adaptive regression splines (MARS) and differential evolution (DE) to estimate net annual electricity generation (NAEG) and carbon dioxide (CO2) emissions (CDE) from economic and performance variables in thermal power plants. This technique requires the DE optimisation of the MARS hyperparameters during the training process. In addition to successfully forecasting NAEG and CDE (coefficients of determination of 0.9803 and 0.9895, respectively), the mathematical model used in this work can determine the importance of each economic and energy parameter in characterizing the behaviour of thermal power stations.
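
The idea of tuning model hyperparameters with differential evolution can be sketched with the classic DE/rand/1/bin scheme. This is a generic illustration, not the authors' implementation: the objective is a toy quadratic standing in for a MARS validation error, and the parameter names `max_terms` and `penalty`, their bounds, and the DE settings are all assumptions.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """Minimise objective over box bounds with the DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for k, (lo, hi) in enumerate(bounds):
                if k == forced or rng.random() < cr:
                    value = pop[a][k] + f * (pop[b][k] - pop[c][k])
                    value = min(hi, max(lo, value))  # clip to the search box
                else:
                    value = pop[i][k]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_validation_error(params):
    """Stand-in for a cross-validated model error; minimum at (12, 3)."""
    max_terms, penalty = params
    return (max_terms - 12.0) ** 2 + (penalty - 3.0) ** 2


best_params, best_score = differential_evolution(
    toy_validation_error, [(2.0, 40.0), (0.0, 6.0)])
print(best_params, best_score)
```

In a real setting the objective would fit a MARS model for each candidate and return its validation error, and integer-valued hyperparameters such as the number of terms would be rounded before fitting.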


Subjects
Fossil Fuels , Power Plants , Carbon Dioxide/analysis , Electricity , Energy-Generating Resources
18.
Ann Palliat Med ; 10(2): 1296-1303, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33040556

ABSTRACT

BACKGROUND: Glycosylated hemoglobin (HbA1c) is directly proportional to the level of glucose in the blood, and it has been the gold standard for evaluating long-term blood glucose levels. Exploring the factors that lead to HbA1c improvement is beneficial for effectively controlling HbA1c levels. METHODS: Data collected from 52 hospitals in five cities in northern China were divided into training and test sets at a ratio of 7:3. The training set was used to build models, and the test set was used to evaluate the generalizability of the models. The performance of multivariate adaptive regression splines (MARS) models and logistic regression was evaluated in terms of accuracy, Youden's index, recall rate, G-mean, and area under the ROC curve (AUC) with 95% confidence intervals (CIs). RESULTS: The prevalence of improvements in HbA1c levels was 38.35%. Doses of insulin less than 13 U, more than 3 kinds of oral medicine, exercise frequency greater than once per week and 2 h postprandial blood glucose (2hPBG) less than 10.56 mmol/L were found to improve HbA1c. The following interactions were negatively associated with improvement in HbA1c levels: patients with relative complications and 2hPBG less than 10.56 mmol/L, and type 2 diabetes mellitus (T2DM) duration of more than 7 years and insulin dose less than 13 U. Compared to logistic regression, the MARS model performed better in the above aspects, except for accuracy. CONCLUSIONS: Given the interaction between factors affecting HbA1c improvement, medical staff should conduct comprehensive interventions to further reduce HbA1c levels in patients. In this study, the MARS model was superior to traditional logistic regression in modeling improvement in HbA1c levels. MARS had greater generalizability because it not only considered nonlinear relations in the process of model fitting but also adopted cross-validation. Nevertheless, more studies are needed to provide evidence for this result.
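
For readers unfamiliar with the evaluation measures named in the METHODS, the following minimal sketch computes accuracy, recall, Youden's index, and G-mean from a binary confusion matrix. The counts are invented for illustration and are not taken from the study. G-mean is assumed here to be the geometric mean of sensitivity and specificity, which is one common definition; the abstract does not say which definition the authors used.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return common classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall rate / true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    youden = sensitivity + specificity - 1.0       # Youden's index J
    g_mean = math.sqrt(sensitivity * specificity)  # assumed definition
    return {
        "accuracy": accuracy,
        "recall": sensitivity,
        "specificity": specificity,
        "youden": youden,
        "g_mean": g_mean,
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=40, fn=10, tn=30, fp=20)
print(metrics)
```

Youden's index and G-mean both combine sensitivity and specificity, so a model can score higher on them than a competitor while having lower overall accuracy when the classes are imbalanced.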


Subjects
Type 2 Diabetes Mellitus , Blood Glucose , China , Glycated Hemoglobin/analysis , Humans , Prevalence
19.
Environ Monit Assess ; 192(12): 752, 2020 Nov 07.
Article in English | MEDLINE | ID: mdl-33159587

ABSTRACT

The aim of this study was to model the surface water quality of the Broad River Basin, South Carolina. The two most suitable monitoring stations, USGS 02156500 (near Carlisle) and USGS 02160991 (near Jenkinsville), were selected because the river water temperature (WT), pH, and specific conductance (SC), as well as dissolved oxygen (DO) concentration, were simultaneously monitored and recorded at these sites. The monitoring period from September 2016 to August 2017 was used for the modeling studies. The electrical conductivity (EC) values corresponding to the river SC values were calculated. First, conventional regression analysis (CRA) was applied with three regression forms, i.e., linear, power, and exponential functions, to estimate the river DO concentration. Then, the multivariate adaptive regression splines (MARS) and TreeNet gradient boosting machine (TreeNet) techniques were employed. Three performance statistics, i.e., root mean square error (RMSE), mean absolute error (MAE), and Nash-Sutcliffe coefficient of efficiency (NS), were used to compare the estimation capabilities of these techniques. The TreeNet technique, which was used for the first time in the modeling of DO concentration, had higher estimation success with RMSE, MAE, and NS values of 0.182 mg/L, 0.123 mg/L, and 0.990, respectively, for the Carlisle station and 0.313 mg/L, 0.233 mg/L, and 0.965, respectively, for the Jenkinsville station in the training phase. The MARS technique, which has seen limited application in the modeling of DO concentration, had higher estimation success with RMSE, MAE, and NS values of 0.240 mg/L, 0.195 mg/L, and 0.981, respectively, for the Carlisle station and 0.527 mg/L, 0.432 mg/L, and 0.980, respectively, for the Jenkinsville station in the testing phase.
Considering the lower RMSE and MAE values, as well as the higher NS values, for the model with an input combination of WT, pH, and EC, the Carlisle station came into prominence. It was concluded that international researchers engaged in river water quality modeling studies can favor the MARS and TreeNet techniques without any hesitation and estimate the river DO concentration successfully. The models developed for the Carlisle station were tested with the data sets for the monitoring period from September 2017 to August 2018 at the same station. Similarly, the models developed for the Jenkinsville station were tested with the data sets for the monitoring period from September 2017 to August 2018 at the same station. It was concluded that the models could also estimate the river DO concentrations very close to in situ measurements at the same site for a different monitoring period. Furthermore, the models developed for the Carlisle station were tested with the data sets from the Jenkinsville station for the same monitoring period. Similarly, the models developed for the Jenkinsville station were tested with the data sets from the Carlisle station for the same monitoring period. It was also concluded that the developed models could estimate the river DO concentrations very close to in situ measurements at a different monitoring site on the same river for the same monitoring period. It can be asserted that the models developed for any monitoring site on a river can also be employed for another monitoring site on the same river, as in the case of the Broad River, South Carolina.
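
The three performance statistics used in this abstract can be computed as in the minimal sketch below, which uses the standard definitions of RMSE, MAE, and the Nash-Sutcliffe efficiency (1 minus the ratio of the squared-error sum to the variance sum of the observations). The observed and predicted DO values are invented for illustration and are not the study's data.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, MAE, and Nash-Sutcliffe efficiency for paired series."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    # NS compares model error with the error of always predicting the mean.
    ns = 1.0 - sum(e * e for e in errors) / sum((o - mean_obs) ** 2 for o in observed)
    return rmse, mae, ns


# Hypothetical DO concentrations in mg/L, for illustration only.
observed_do = [8.0, 9.0, 10.0, 11.0]
predicted_do = [8.5, 9.0, 9.5, 11.0]
rmse, mae, ns = regression_metrics(observed_do, predicted_do)
print(rmse, mae, ns)
```

RMSE and MAE carry the units of the target (here mg/L) and are better when lower, while NS is dimensionless and equals 1 for a perfect fit.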


Subjects
Rivers , Water Quality , Environmental Monitoring , Multivariate Analysis , Neural Networks (Computer) , Oxygen , Regression Analysis , South Carolina
20.
Accid Anal Prev ; 146: 105735, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32835954

ABSTRACT

This study develops bicycle-vehicle safety performance functions (SPFs) for five facilities in the Highway Safety Manual (HSM). These are urban two-lane undivided segments (U2U), urban four-lane divided/undivided segments (U4DU), rural two-lane undivided segments (R2U), urban four-leg and three-leg signalized intersections (USG), and urban four-leg and three-leg stop-controlled intersections (UST). Two modeling techniques were explored: the Conway-Maxwell-Poisson (COM-Poisson) model (to accommodate bicycle-vehicle crash under-dispersion) and a machine learning technique, multivariate adaptive regression splines (MARS). MARS is a non-black-box model and can effectively handle non-linear crash predictors and interactions. A total of 1,311 bicycle-vehicle crashes from 2011 through 2015 in Alabama were collected and their respective police reports were reviewed in detail. Results from the SPFs for roadway segments using COM-Poisson showed that bicycle-vehicle crash frequencies were reduced along curved and downgrade/upgrade stretches and under heavy traffic flow (along U2U segments). For urban signalized (USG) intersections, the absence of right-turn lanes on minor roads, the presence of bus stops, and the increase in the major road annual average daily traffic (AADT) were significant factors contributing to the increase in the number of bicycle-vehicle crashes. However, the presence of divided medians on major approaches was found to reduce bicycle-vehicle crashes at USG and UST intersections. MARS outperformed the corresponding COM-Poisson models for all five facilities based on mean absolute deviance (MAD), mean square prediction error (MSPE), and generalized R-square. MARS is recommended as a promising technique for effectively predicting bicycle-vehicle crashes on segments and intersections.
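
To illustrate why MARS is described as a non-black-box model that handles non-linear predictors and interactions, the sketch below evaluates a tiny MARS-style model by hand. A MARS model is a weighted sum of hinge functions, max(0, x - t) or max(0, t - x), and products of them. The intercept, coefficients, knots, and the two predictors here are hypothetical and were not fitted to any data from the study.

```python
def hinge(x, knot, direction):
    """Hinge basis: max(0, x - knot) if direction is +1, max(0, knot - x) if -1."""
    return max(0.0, direction * (x - knot))


def mars_predict(row, intercept, terms):
    """Evaluate intercept + sum(coef * product of hinges) for one input row.

    Each term is (coef, [(feature_index, knot, direction), ...]); a term with
    two or more hinges represents an interaction between predictors.
    """
    total = intercept
    for coef, hinges in terms:
        basis = 1.0
        for feature_index, knot, direction in hinges:
            basis *= hinge(row[feature_index], knot, direction)
        total += coef * basis
    return total


# Hypothetical model with one knot on feature 0 and one interaction term.
example_terms = [
    (2.0, [(0, 3.0, +1)]),                   # slope above the knot at 3
    (-0.5, [(0, 3.0, -1)]),                  # different slope below the knot
    (0.1, [(0, 3.0, +1), (1, 10.0, +1)]),    # interaction of features 0 and 1
]
high = mars_predict([5.0, 12.0], intercept=1.0, terms=example_terms)
low = mars_predict([1.0, 12.0], intercept=1.0, terms=example_terms)
print(high, low)
```

Because every term is an explicit piecewise-linear function with a visible knot and coefficient, the fitted model can be read directly, which is the sense in which it is interpretable.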


Subjects
Traffic Accidents/prevention & control , Bicycling/statistics & numerical data , Built Environment/statistics & numerical data , Traffic Accidents/statistics & numerical data , Alabama , Artificial Intelligence , Bicycling/injuries , Humans , Statistical Models , Rural Population