1.
Diagnostics (Basel) ; 14(15)2024 Jul 27.
Article in English | MEDLINE | ID: mdl-39125499

ABSTRACT

Type 2 diabetes mellitus (T2DM) is one of the most common metabolic diseases in the world and poses a significant public health challenge. Early detection and management of this metabolic disorder is crucial to prevent complications and improve outcomes. This paper aims to find core differences between male and female markers for detecting T2DM from clinical and anthropometric features, seeking ranges in the identified potential biomarkers that provide useful information as a pre-diagnostic tool, while excluding glucose-related biomarkers, using machine learning (ML) models. We used a dataset containing clinical and anthropometric variables from patients diagnosed with T2DM and patients without T2DM as controls. We applied feature selection with three different techniques to identify relevant biomarker models: an improved recursive feature elimination (RFE) evaluating each set from all the features to one feature with the Akaike information criterion (AIC) to find optimal outputs; Least Absolute Shrinkage and Selection Operator (LASSO) with glmnet; and Genetic Algorithms (GA) with GALGO and forward selection (FS) applied to GALGO output. We then used these for comparison with the AIC to measure the performance of each technique and collect the optimal set of global features. Then, an implementation and comparison of five different ML models was carried out to identify the most accurate and interpretable one, considering the following models: logistic regression (LR), artificial neural network (ANN), support vector machine (SVM), k-nearest neighbors (KNN), and nearest centroid (Nearcent). The models were then combined in an ensemble to provide a more robust approximation. The results showed that potential biomarkers such as systolic blood pressure (SBP) and triglycerides are together significantly associated with T2DM. This approach also identified triglycerides, cholesterol, and diastolic blood pressure as biomarkers with differences between male and female subjects that have not been previously reported in the literature. The most accurate ML model used feature selection with RFE, with random forest (RF) as the estimator improved with the AIC, and achieved an accuracy of 0.8820. In conclusion, this study demonstrates the potential of ML models in identifying potential biomarkers for early detection of T2DM, excluding glucose-related biomarkers, as well as differences between male and female anthropometric and clinical profiles. These findings may help to improve early detection and management of T2DM by accounting for differences between male and female subjects in terms of anthropometric and clinical profiles, potentially reducing healthcare costs and improving personalized patient attention. Further research is needed to validate these potential biomarker ranges in other populations and clinical settings.

2.
PeerJ ; 12: e16501, 2024.
Article in English | MEDLINE | ID: mdl-38223762

ABSTRACT

The occurrence of fungi is cosmopolitan, and while some mushroom species are beneficial to human health, others can be toxic and cause illness. This study aimed to analyze the organoleptic, ecological, and morphological characteristics of a group of fungal specimens and identify the most significant features to develop models for fungal toxicity classification using genetic algorithms and LASSO regression. The results of the study indicated that odor, spore print color, and habitat were the most significant characteristics identified by the genetic algorithm GALGO. Meanwhile, odor, gill size, stalk shape, and twelve other features were the relevant characteristics identified by LASSO regression. The importance score of the odor variable was 99.99%, gill size obtained 73.7%, stalk shape scored 39.9%, and the remaining variables did not score higher than 18%. Logistic regression, k-nearest neighbor (KNN), and XGBoost classification algorithms were used to develop models using the features selected by both GALGO and LASSO. The models were evaluated using sensitivity, specificity, and accuracy metrics. The models with the highest AUC values were XGBoost, with a maximum value of 0.99 using the features selected by LASSO, followed by KNN with a maximum value of 0.99. The GALGO selection resulted in a maximum AUC of 0.98 in KNN and XGBoost. The models developed in this study have the potential to aid in the accurate identification of toxic fungi, which can prevent health problems caused by their consumption.


Subjects
Agaricus , Humans , Agaricus/genetics , Algorithms , Benchmarking , Cluster Analysis , Machine Learning
3.
Sensors (Basel) ; 23(17)2023 Aug 31.
Article in English | MEDLINE | ID: mdl-37688015

ABSTRACT

In recent years, the application of artificial intelligence (AI) in the automotive industry has led to the development of intelligent systems focused on road safety, aiming to improve protection for drivers and pedestrians worldwide to reduce the number of accidents yearly. One of the most critical functions of these systems is pedestrian detection, as it is crucial for the safety of everyone involved in road traffic. However, pedestrian detection goes beyond the front of the vehicle; it is also essential to consider the vehicle's rear since pedestrian collisions occur when the car is in reverse drive. To contribute to the solution of this problem, this research proposes a model based on convolutional neural networks (CNN) using a proposed one-dimensional architecture and the Inception V3 architecture to fuse the information from the backup camera and the distance measured by the ultrasonic sensors, to detect pedestrians when the vehicle is reversing. In addition, specific data collection was performed to build a database for the research. The proposed model showed outstanding results with 99.85% accuracy and 99.86% correct classification performance, demonstrating that it is possible to achieve the goal of pedestrian detection using CNN by fusing two types of data.

4.
PeerJ ; 11: e14806, 2023.
Article in English | MEDLINE | ID: mdl-36945355

ABSTRACT

The gastrointestinal (GI) tract can be affected by different diseases or lesions such as esophagitis, ulcers, hemorrhoids, and polyps, among others. Some of them, such as polyps, can be precursors of cancer. Endoscopy is the standard procedure for the detection of these lesions. The main drawback of this procedure is that the diagnosis depends on the expertise of the doctor. This means that some important findings may be missed. In recent years, this problem has been addressed by deep learning (DL) techniques. Endoscopic studies use digital images. The most widely used DL technique for image processing is the convolutional neural network (CNN) due to its high accuracy for modeling complex phenomena. There are different CNNs that are characterized by their architecture. In this article, four architectures are compared: AlexNet, DenseNet-201, Inception-v3, and ResNet-101. To determine which architecture best classifies GI tract lesions, a set of metrics was used: accuracy, precision, sensitivity, specificity, F1-score, and area under the curve (AUC). These architectures were trained and tested on the HyperKvasir dataset. From this dataset, a total of 6,792 images corresponding to 10 findings were used. A transfer learning approach and a data augmentation technique were applied. The best performing architecture was DenseNet-201, whose results were 97.11% accuracy, 96.3% sensitivity, 99.67% specificity, and 95% AUC.


Subjects
Deep Learning , Neural Networks, Computer , Gastrointestinal Tract/diagnostic imaging , Endoscopy, Gastrointestinal , Diagnosis, Computer-Assisted/methods
5.
Sensors (Basel) ; 23(2)2023 Jan 10.
Article in English | MEDLINE | ID: mdl-36679580

ABSTRACT

Driver identification refers to the process whose primary purpose is identifying the person behind the steering wheel using collected information about the driver him/herself. The constant monitoring of drivers through sensors generates great benefits in advanced driver assistance systems (ADAS), to learn more about the behavior of road users. Currently, there are many research works that address the subject in search of creating intelligent models that help to identify vehicle users in an efficient and objective way. However, the different methodologies proposed to create these models are based on data generated from sensors that include different vehicle brands on routes established in real environments, which, although they provide very important information for different purposes, in the case of driver identification, there may be a certain degree of bias due to the different situations in which the route environment may change. The proposed method seeks to intelligently and objectively select the most outstanding statistical features from motor activity generated in the main elements of the vehicle with genetic algorithms for driver identification, this process being newer than those established by the state-of-the-art. The results obtained from the proposal were an accuracy of 90.74% to identify two drivers and 62% for four, using a Random Forest Classifier (RFC). With this, it can be concluded that a comprehensive selection of features can greatly optimize the identification of drivers.


Subjects
Automobile Driving , Humans , Male , Accidents, Traffic , Random Forest , Learning , Motor Activity
6.
Rev. mex. ing. bioméd ; 44(spe1): 38-52, Aug. 2023. tab, graf
Article in English | LILACS-Express | LILACS | ID: biblio-1565605

ABSTRACT

Abstract It is estimated that depression affects more than 300 million people worldwide. Unfortunately, the current method of psychiatric evaluation requires a great effort on the part of clinicians to collect complete information. The aim of this paper is to determine the optimal time intervals to detect depression using genetic algorithms and machine learning techniques, from motor activity readings of 55 participants during a week at one-minute intervals. The time intervals with the best performance in detecting depression in individuals were selected by applying Genetic Algorithms (GA). Methodology. 385 observations of the study participants were evaluated, obtaining an accuracy of 83.0 % with Logistic Regression (LR). Conclusion. There is a relationship between motor activity and people with depression since it is possible to detect it using machine learning techniques. However, the changes in the variables of the time intervals could be established as key factors since, at different times, they could give good or bad results because the motor activity in the patients could vary. Nevertheless, the results present a first approximation for developing tools that help the timely and objective diagnosis of depression.


Summary It is estimated that depression affects more than 300 million people worldwide. Unfortunately, the current method of psychiatric evaluation requires great effort on the part of clinicians to collect complete information. Objective. To determine the optimal time intervals for detecting depression using genetic algorithms and machine learning techniques, from the motor activity readings of 55 subjects during one week at one-minute intervals. The time intervals with the best performance in detecting depression in individuals were selected by applying genetic algorithms. Methodology. 385 observations of the study subjects were evaluated, obtaining an accuracy of 83.0 % with Logistic Regression (LR). Conclusion. There is a relationship between motor activity and people with depression, since it is possible to detect it using machine learning techniques. However, the changes in the variables of the time intervals could be established as key factors, since at different times they could give good or bad results because motor activity in the patients could vary. Nevertheless, the results present a first approximation for the development of tools that help the timely and objective diagnosis of depression.

7.
Rev Invest Clin ; 74(6): 314-327, 2022.
Article in English | MEDLINE | ID: mdl-36546894

ABSTRACT

Background: The coronavirus disease (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus and is responsible for nearly 6 million deaths worldwide in the past 2 years. Machine learning (ML) models could help physicians in identifying high-risk individuals. Objectives: To study the use of ML models for COVID-19 prediction outcomes using clinical data and a combination of clinical and metabolic data, measured in a metabolomics facility from a public university. Methods: A total of 154 patients were included in the study. "Basic profile" was considered with clinical and demographic variables (33 variables), whereas in the "extended profile," metabolomic and immunological variables were also considered (156 characteristics). A selection of features was carried out for each of the profiles with a genetic algorithm (GA) and random forest models were trained and tested to predict each of the stages of COVID-19. Results: The model based on extended profile was more useful in early stages of the disease. Models based on clinical data were preferred for predicting severe and critical illness and death. ML detected trimethylamine N-oxide, lipid mediators, and neutrophil/lymphocyte ratio as important variables. Conclusions: ML and GAs provided adequate models to predict COVID-19 outcomes in patients with different severity grades.


Subjects
COVID-19 , SARS-CoV-2 , Humans , COVID-19/diagnosis , Algorithms , Prognosis , Machine Learning
8.
Rev. invest. clín ; 74(6): 314-327, Nov.-Dec. 2022. tab, graf
Article in English | LILACS-Express | LILACS | ID: biblio-1431820

ABSTRACT

ABSTRACT Background: The coronavirus disease (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus and is responsible for nearly 6 million deaths worldwide in the past 2 years. Machine learning (ML) models could help physicians in identifying high-risk individuals. Objectives: To study the use of ML models for COVID-19 prediction outcomes using clinical data and a combination of clinical and metabolic data, measured in a metabolomics facility from a public university. Methods: A total of 154 patients were included in the study. "Basic profile" was considered with clinical and demographic variables (33 variables), whereas in the "extended profile," metabolomic and immunological variables were also considered (156 characteristics). A selection of features was carried out for each of the profiles with a genetic algorithm (GA) and random forest models were trained and tested to predict each of the stages of COVID-19. Results: The model based on extended profile was more useful in early stages of the disease. Models based on clinical data were preferred for predicting severe and critical illness and death. ML detected trimethylamine N-oxide, lipid mediators, and neutrophil/lymphocyte ratio as important variables. Conclusion: ML and GAs provided adequate models to predict COVID-19 outcomes in patients with different severity grades.

9.
Diagnostics (Basel) ; 12(11)2022 Nov 15.
Article in English | MEDLINE | ID: mdl-36428864

ABSTRACT

According to the World Health Organization (WHO), type 2 diabetes mellitus (T2DM) is a result of the inefficient use of insulin by the body. More than 95% of people with diabetes have T2DM, which is largely due to excess weight and physical inactivity. This study proposes an intelligent feature selection of metabolites related to different stages of diabetes, using genetic algorithms (GA) together with support vector machines (SVMs), K-Nearest Neighbors (KNNs) and Nearest Centroid (NEARCENT), on a dataset obtained from the Instituto Mexicano del Seguro Social under the protocol named "Análisis metabolómico y transcriptómico diferencial en orina y suero de pacientes pre diabéticos, diabéticos y con nefropatía diabética para identificar potenciales biomarcadores pronósticos de daño renal" (differential metabolomic and transcriptomic analyses in the urine and serum of pre-diabetic, diabetic and diabetic nephropathy patients to identify potential prognostic biomarkers of kidney damage). To analyze which machine learning (ML) model is best suited to classifying patients with some stage of T2DM, this work contributes a genetic algorithm approach that detects significant metabolites in each stage of progression. More than 100 metabolites were identified as significant between all stages; with the data analyzed, the average accuracies obtained in the five most accurate implementations of genetic algorithms ranged from 0.8214 to 0.9893, providing a precise tool for detection and supporting a diagnosis constructed entirely with metabolomics. The five potential biomarkers for progression, all extremely significant metabolites, are as follows: "Cer(d18:1/24:1) i2", "PC(20:3-OH/P-18:1)", "Ganoderic acid C2", "TG(16:0/17:1/18:1)" and "GPEtn(18:0/20:4)".

10.
Bioengineering (Basel) ; 9(9)2022 Sep 09.
Article in English | MEDLINE | ID: mdl-36135004

ABSTRACT

Depression is a common illness worldwide, affecting an estimated 3.8% of the population, including 5% of all adults and, in particular, 5.7% of adults over 60 years of age. Unfortunately, at present, the ways to evaluate different mental disorders, like the Montgomery-Åsberg depression rating scale (MADRS) and observations, require great effort on the part of specialists, given the limited availability of patients, to obtain the information needed to know their condition and to detect illnesses such as depression in an objective way. Based on data analysis and artificial intelligence techniques, like the Convolutional Neural Network (CNN), it is possible to classify a person, from the mental status examination, into two classes. Moreover, it is beneficial to observe how the data of these two classes are similar in different time intervals. In this study, a motor activity database was used, from which the readings of 55 study subjects (32 healthy and 23 with some degree of depression) were recorded with a small wrist-worn accelerometer to detect the peak amplitude of movement acceleration and generate a transient voltage signal proportional to the rate of acceleration. Motor activity data were selected per patient in time-lapses of one day for seven days (one week) in one-minute intervals. The data were pre-processed to be given to a two-dimensional convolutional network (2D-CNN), where each record of motor activity per minute was represented as a pixel of an image. The proposed model is capable of detecting depression in real time (if it is implemented in a mobile device such as a smartwatch) with low computational cost and an accuracy of 76.72%. In summary, the model shows promising abilities to detect possible cases of depression, providing a helpful resource to identify the condition and to take the appropriate follow-up for the patient.
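The minute-per-pixel representation described in this abstract can be illustrated with a minimal sketch. The abstract does not state the image shape or the scaling, so the layout below (one row per day, 1440 columns, values scaled to [0, 1] by the recording's peak) and the function name `activity_to_image` are assumptions made for illustration only.

```python
MINUTES_PER_DAY = 24 * 60  # 1440 one-minute readings per day


def activity_to_image(readings, days=7):
    """Arrange per-minute activity counts into a days x 1440 grid of pixels.

    Assumed layout: one image row per day, one pixel per minute,
    scaled to [0, 1] by the largest reading in the recording.
    """
    expected = days * MINUTES_PER_DAY
    if len(readings) != expected:
        raise ValueError(f"expected {expected} readings, got {len(readings)}")
    peak = max(readings) or 1  # avoid dividing by zero for an all-zero recording
    return [
        [value / peak for value in readings[d * MINUTES_PER_DAY:(d + 1) * MINUTES_PER_DAY]]
        for d in range(days)
    ]
```

A grid of this kind could then be passed to any 2D convolutional model; the network itself is outside the scope of this sketch.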

11.
Healthcare (Basel) ; 10(8)2022 Jul 22.
Article in English | MEDLINE | ID: mdl-35893185

ABSTRACT

Type 2 diabetes mellitus (T2DM) represents one of the biggest health problems in Mexico, and it is extremely important to detect this disease and its complications early. For noninvasive detection of T2DM, a machine learning (ML) approach can be used that relies on ensemble classification models with dichotomous output and is also fast and effective for early detection and prediction of T2DM. In this article, an ensemble technique by hard voting is designed and implemented using generalized linear regression (GLM), support vector machines (SVM) and artificial neural networks (ANN) for the classification of T2DM patients. As a first step, the data are balanced, standardized, imputed and fed into the three models to classify the patients in a dichotomous result. For the selection of features, an implementation of LASSO is developed with 10-fold cross-validation, and for the final validation, the Area Under the Curve (AUC) is used. The results in LASSO showed 12 features, which are used in the implemented models to obtain the best possible scenario in the developed ensemble model. The algorithm with the best performance of the three is SVM, which obtained an AUC of 92% ± 3%. The ensemble model built with GLM, SVM and ANN obtained an AUC of 90% ± 3%.
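Hard voting, as named in this abstract, can be sketched as a majority vote over the class labels predicted by each model. The snippet below is an illustrative sketch only: the function name `hard_vote` and the example predictions are hypothetical, and the three underlying models (GLM, SVM, ANN) are assumed to have already produced their dichotomous outputs.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label for each sample.

    predictions: list of per-model label lists, all of the same length.
    With three binary classifiers a tie cannot occur.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    # zip(*predictions) groups the votes cast for each sample across models.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Hypothetical dichotomous outputs (1 = T2DM, 0 = control) for four patients.
glm = [1, 0, 1, 0]
svm = [1, 1, 0, 0]
ann = [0, 1, 1, 0]
ensemble = hard_vote([glm, svm, ann])  # [1, 1, 1, 0]
```

Using an odd number of binary voters is what keeps the vote free of ties; with an even number a tie-breaking rule would be needed.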

12.
Healthcare (Basel) ; 10(7)2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35885784

ABSTRACT

Major depressive disorder (MDD) is the most recurrent mental illness globally, affecting approximately 5% of adults. Furthermore, according to the National Institute of Mental Health (NIMH) of the U.S., calculating an actual schizophrenia prevalence rate is challenging because of this illness's underdiagnosis. Still, most current global metrics hover between 0.33% and 0.75%. Machine-learning scientists use data from diverse sources to analyze, classify, or predict to improve the psychiatric attention, diagnosis, and treatment of MDD, schizophrenia, and other psychiatric conditions. Motor activity data are gaining popularity in mental illness diagnosis assistance because they are a cost-effective and noninvasive method. In the knowledge discovery in databases (KDD) framework, a model to classify depressive and schizophrenic patients from healthy controls is constructed using accelerometer data. Taking advantage of the multiple sleep disorders caused by mental disorders, the main objective is to increase the model's accuracy by employing only data from night-time activity. To compare the classification between the stages of the day and improve the accuracy of the classification, the total activity signal was cut into hourly time lapses and then grouped into subdatasets depending on the phases of the day: morning (06:00-11:59), afternoon (12:00-17:59), evening (18:00-23:59), and night (00:00-05:59). Random forest classifier (RFC) is the algorithm proposed for multiclass classification, and it uses accuracy, recall, precision, the Matthews correlation coefficient, and F1 score to measure its efficiency. The best model was night-featured data and RFC, with 98% accuracy for the classification of three classes. The effectiveness of this experiment leads to less monitoring time for patients, reducing stress and anxiety, producing more efficient models, using wearables, and increasing the amount of data.
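The split into phases of the day given in this abstract (morning 06:00-11:59, afternoon 12:00-17:59, evening 18:00-23:59, night 00:00-05:59) can be sketched as follows. The function names and the `(hour, value)` input format are assumptions for illustration; the abstract does not describe the data structures actually used.

```python
def day_phase(hour):
    """Map an hour of the day (0-23) to the phase boundaries quoted in the abstract."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if hour < 6:
        return "night"      # 00:00-05:59
    if hour < 12:
        return "morning"    # 06:00-11:59
    if hour < 18:
        return "afternoon"  # 12:00-17:59
    return "evening"        # 18:00-23:59


def group_by_phase(hourly_activity):
    """Group (hour, value) pairs into one sub-dataset per phase of the day."""
    groups = {"morning": [], "afternoon": [], "evening": [], "night": []}
    for hour, value in hourly_activity:
        groups[day_phase(hour)].append(value)
    return groups
```

Each resulting group could then be used to train and evaluate a separate classifier, which is how the phases can be compared against each other.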

13.
J Pers Med ; 11(12)2021 Dec 08.
Article in English | MEDLINE | ID: mdl-34945799

ABSTRACT

One of the main microvascular complications presented in the Mexican population is diabetic retinopathy (DR), which affects 27.50% of individuals with type 2 diabetes. Therefore, the purpose of this study is to construct a predictive model to find out the risk factors of this complication. The dataset contained a total of 298 subjects, including clinical and paraclinical features. An analysis was constructed using machine learning techniques including Boruta as a feature selection method, and random forest as classification algorithm. The model was evaluated through a statistical test based on sensitivity, specificity, area under the curve (AUC), and receiver operating characteristic (ROC) curve. The results present significant values, with the model obtaining an AUC of 69%. Moreover, a risk evaluation was incorporated to evaluate the impact of the predictors. The proposed method identifies creatinine, lipid treatment, glomerular filtration rate, waist hip ratio, total cholesterol, and high density lipoprotein as risk factors in Mexican subjects. The odds ratio increases by 3.5916 times for control patients who have high levels of cholesterol. It is possible to conclude that the proposed methodology is a preliminary computer-aided diagnosis tool for clinical decision support, helping to identify the diagnosis of DR.

14.
Sensors (Basel) ; 21(22)2021 Nov 21.
Article in English | MEDLINE | ID: mdl-34833826

ABSTRACT

Worldwide, motor vehicle accidents are one of the leading causes of death, with alcohol-related accidents playing a significant role, particularly in child death. Aiming to aid in the prevention of this type of accident, a novel non-invasive method capable of detecting the presence of alcohol inside a motor vehicle is presented. The proposed methodology uses a series of low-cost alcohol MQ3 sensors located inside the vehicle, whose signals are stored, standardized, time-adjusted, and transformed into 5 s window samples. Statistical features are extracted from each sample and a feature selection strategy is carried out using a genetic algorithm, and a forward selection and backwards elimination methodology. The four features derived from this process were used to construct an SVM classification model that detects the presence of alcohol. The experiments yielded 7200 samples, 80% of which were used to train the model. The rest were used to evaluate the performance of the model, which obtained an area under the ROC curve of 0.98 and a sensitivity of 0.979. These results suggest that the proposed methodology can be used to detect the presence of alcohol and enforce prevention actions.


Subjects
Automobile Driving , Driving Under the Influence , Accidents, Traffic/prevention & control , Algorithms , Child , Humans , Motor Vehicles
15.
Healthcare (Basel) ; 9(7)2021 Jul 13.
Article in English | MEDLINE | ID: mdl-34356262

ABSTRACT

Children's healthcare is a relevant issue, especially the prevention of domestic accidents, since it has even been defined as a global health problem. Children's activity classification generally uses sensors embedded in children's clothing, which can lead to erroneous measurements for possible damage or mishandling. Having a non-invasive data source for a children's activity classification model provides reliability to the monitoring system where it is applied. This work proposes the use of environmental sound as a data source for the generation of children's activity classification models, implementing feature selection methods and classification techniques based on Bayesian networks, focused on the recognition of potentially triggering activities of domestic accidents, applicable in child monitoring systems. Two feature selection techniques were used: the Akaike criterion and genetic algorithms. Likewise, models were generated using three classifiers: naive Bayes, semi-naive Bayes and tree-augmented naive Bayes. The generated models, combining the methods of feature selection and the classifiers used, present accuracy of greater than 97% for most of them, with which we can conclude the efficiency of the proposal of the present work in the recognition of potentially detonating activities of domestic accidents.

16.
Healthcare (Basel) ; 9(8)2021 Jul 31.
Article in English | MEDLINE | ID: mdl-34442108

ABSTRACT

Alzheimer's disease (AD) is a neurodegenerative disease that mainly affects older adults. Currently, AD is associated with certain hypometabolic biomarkers, beta-amyloid peptides, hyperphosphorylated tau protein, and changes in brain morphology. Accurate diagnosis of AD, as well as mild cognitive impairment (MCI) (prodromal stage of AD), is essential for early care of the disease. As a result, machine learning techniques have been used in recent years for the diagnosis of AD. In this research, we propose a novel methodology to generate a multivariate model that combines different types of features for the detection of AD. In order to obtain a robust biomarker, ADNI baseline data, clinical and neuropsychological assessments (1024 features) of 106 patients were used. The data were normalized, and a genetic algorithm was implemented for the selection of the most significant features. Subsequently, for the development and validation of the multivariate classification model, a support vector machine model was created, and a five-fold cross-validation with an AUC of 87.63% was used to measure model performance. Lastly, an independent blind test of our final model, using 20 patients not considered during the model construction, yielded an AUC of 100%.

17.
Healthcare (Basel) ; 9(3)2021 Mar 12.
Article in English | MEDLINE | ID: mdl-33809283

ABSTRACT

The main cause of death in Mexico and the world is heart disease, and it will continue to lead the death rate in the next decade according to data from the World Health Organization (WHO) and the National Institute of Statistics and Geography (INEGI). Therefore, the objective of this work is to implement, compare and evaluate machine learning algorithms that are capable of classifying normal and abnormal heart sounds. Three different sounds were analyzed in this study: normal heart sounds, heart murmur sounds and extra systolic sounds, which were labeled as healthy sounds (normal sounds) and unhealthy sounds (murmur and extra systolic sounds). From these sounds, fifty-two features were calculated to create a numerical dataset: thirty-six statistical features, eight Linear Predictive Coding (LPC) coefficients and eight Mel-Frequency Cepstral Coefficients (MFCC). From this dataset two more were created: one normalized and one standardized. These datasets were analyzed with six classifiers: k-Nearest Neighbors, Naive Bayes, Decision Trees, Logistic Regression, Support Vector Machine and Artificial Neural Networks. All of them were evaluated with six metrics: accuracy, specificity, sensitivity, ROC curve, precision and F1-score. The performances of all the models were statistically significant, but the models that performed best for this problem were logistic regression for the standardized data set, with a specificity of 0.7500 and a ROC curve of 0.8405; logistic regression for the normalized data set, with a specificity of 0.7083 and a ROC curve of 0.8407; and Support Vector Machine with a linear kernel for the non-normalized data, with a specificity of 0.6842 and a ROC curve of 0.7703. Both of these metrics are of utmost importance in evaluating the performance of computer-assisted diagnostic systems.
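The difference between the "normalized" and "standardized" datasets mentioned in this abstract can be illustrated with a short sketch. The abstract does not say which formulas were applied, so min-max scaling to [0, 1] and z-score scaling (population standard deviation) are assumed here as the most common readings of those two terms; the function names are hypothetical.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale one feature column to the [0, 1] range (assumed meaning of 'normalized')."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def standardize(values):
    """Rescale one feature column to zero mean and unit variance (assumed meaning of 'standardized')."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


column = [2, 4, 6]
normalized = min_max_normalize(column)  # [0.0, 0.5, 1.0]
standardized = standardize(column)      # symmetric around 0.0
```

Distance-based and gradient-based classifiers such as k-Nearest Neighbors, Support Vector Machines and neural networks are typically sensitive to feature scale, which is one reason to compare the raw, normalized and standardized versions of the same dataset.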

18.
Healthcare (Basel) ; 9(4)2021 Apr 06.
Article in English | MEDLINE | ID: mdl-33917300

ABSTRACT

Diabetes incidence has become a problem because, according to the World Health Organization and the International Diabetes Federation, the number of people with this disease is increasing very fast all over the world. Diabetes treatment is important to prevent the development of several complications, and lipid profile monitoring is also important. For that reason, the aim of this work is the implementation of machine learning algorithms that are able to classify cases, which correspond to patients diagnosed with diabetes who receive diabetes treatment, and controls, which refer to subjects who do not receive diabetes treatment although some of them have diabetes, based on lipid profile levels. Logistic regression, K-nearest neighbor, decision trees and random forest were implemented, and all of them were evaluated with accuracy, sensitivity, specificity and AUC-ROC curve metrics. The artificial neural network obtained an accuracy of 0.685 and an AUC value of 0.750; logistic regression achieved an accuracy of 0.729 and an AUC value of 0.795; K-nearest neighbor obtained an accuracy of 0.669 and an AUC value of 0.709; the decision tree reached an accuracy of 0.691 and an AUC value of 0.683; and random forest achieved an accuracy of 0.704 and an AUC value of 0.776. The performance of all models was statistically significant, but the best-performing model for this problem was logistic regression.

19.
Healthcare (Basel) ; 9(2)2021 Feb 01.
Article in English | MEDLINE | ID: mdl-33535510

ABSTRACT

The prevalence of diabetes mellitus is increasing worldwide, causing health and economic implications. One of the principal microvascular complications of type 2 diabetes is Distal Symmetric Polyneuropathy (DSPN), affecting 42.6% of the population in Mexico. Therefore, the purpose of this study was to find out the predictors of this complication. The dataset contained a total of 140 subjects, including clinical and paraclinical features. A multivariate analysis was constructed using Boruta as a feature selection method and Random Forest as a classification algorithm, applying the strategies of K-Folds Cross Validation and Leave One Out Cross Validation. Then, the models were evaluated through a statistical analysis based on sensitivity, specificity, area under the curve (AUC) and receiver operating characteristic (ROC) curve. The results present significant values obtained by the model with this approach, with an AUC of 67% using only three features as predictors. It is possible to conclude that the proposed methodology can classify patients with DSPN, providing a preliminary computer-aided diagnosis tool that helps the clinical area identify the diagnosis of DSPN.

20.
Diagnostics (Basel) ; 10(11)2020 Nov 02.
Article in English | MEDLINE | ID: mdl-33147746

ABSTRACT

Sudden infant death syndrome (SIDS) is defined as the death of a child under one year of age, during sleep, without apparent cause after exhaustive investigation, so it is a diagnosis of exclusion. SIDS is the principal cause of death in industrialized countries. Inborn errors of metabolism (IEM) have been related to SIDS. These errors are a group of conditions characterized by the accumulation of toxic substances, usually produced by an enzyme defect. There are thousands of them, including the disorders of the β-oxidation cycle, which can affect the metabolism of different types of fatty acid chains (among these, short chain fatty acids (SCFAs)). In this work, an analysis of postmortem SCFA profiles of children who died due to SIDS is proposed. Initially, a set of features containing SCFA information, obtained from the NIH Common Fund's National Metabolomics Data Repository (NMDR), is submitted to a univariate analysis, developing a model based on the relationship between each feature and the binary output (death due to SIDS or not), obtaining 11 univariate models. Then, each model is validated by calculating its receiver operating characteristic curve (ROC curve) and area under the ROC curve (AUC) value. For those features whose models presented an AUC value higher than 0.650, a new multivariate model is constructed, in order to validate its behavior in comparison to the univariate models. In addition, a comparison between this multivariate model and a model developed based on the whole set of features is finally performed. From the results, it can be observed that each SCFA comprising the SCFA profile has a relationship with SIDS and could help in risk identification.
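The univariate screening step in this abstract (keep each feature whose model exceeds an AUC of 0.650) can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' procedure: the raw feature value is used as the score instead of a fitted univariate model, the AUC is computed in-sample with the pairwise (Mann-Whitney) definition, and the feature names in the example are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_features(features, labels, threshold=0.650):
    """Keep features whose univariate AUC exceeds the threshold.

    Assumption: a fitted univariate model could use either direction of the
    feature, so the AUC is taken as max(auc, 1 - auc).
    """
    selected = {}
    for name, values in features.items():
        score = auc(values, labels)
        score = max(score, 1 - score)
        if score > threshold:
            selected[name] = score
    return selected


# Hypothetical data: 1 = SIDS, 0 = control; feature names are placeholders.
labels = [1, 1, 0, 0]
features = {"feature_a": [0.9, 0.8, 0.2, 0.1], "feature_b": [0.5, 0.1, 0.4, 0.2]}
kept = select_features(features, labels)  # only feature_a passes the threshold
```

The features kept by this screen would then feed the multivariate model; evaluating on held-out data rather than in-sample would give a less optimistic AUC.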
