Search | BVS Regional Portal

1.

Who's your data? Primary immune deficiency differential diagnosis prediction via machine learning and data mining of the USIDNET registry.

Méndez Barrera, Jose Alfredo; Rocha Guzmán, Samuel; Hierro Cascajares, Elisa; Garabedian, Elizabeth K; Fuleihan, Ramsay L; Sullivan, Kathleen E; Lugo Reyes, Saul O.

Clin Immunol ; 255: 109759, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37678719

ABSTRACT

PURPOSE: There are currently more than 480 primary immune deficiency (PID) diseases and about 7000 rare diseases that together afflict around 1 in every 17 humans. Computational aids based on data mining and machine learning might facilitate the diagnostic task by extracting rules from large datasets and making predictions when faced with new problem cases. In a proof-of-concept data mining study, we aimed to predict PID diagnoses using a supervised machine learning algorithm based on classification tree boosting. METHODS: Through a data query at the USIDNET registry we obtained a database of 2396 patients with common diagnoses of PID, including their clinical and laboratory features. We kept 286 features and all 12 diagnoses to include in the model. We used the XGBoost package with parallel tree boosting for the supervised classification model, and SHAP for variable importance interpretation, in Python v3.7. The patient database was split into training and testing subsets, and after boosting through gradient descent, the predictive model provides measures of diagnostic prediction accuracy and individual feature importance. After a baseline performance test, we used the class-weighting hyperparameter, scale_pos_weight, to correct for imbalanced classification. RESULTS: The twelve PID diagnoses were CVID (1098 patients), DiGeorge syndrome, Chronic granulomatous disease, Congenital agammaglobulinemia, PID not otherwise classified, Specific antibody deficiency, Complement deficiency, Hyper-IgM, Leukocyte adhesion deficiency, ectodermal dysplasia with immune deficiency, Severe combined immune deficiency, and Wiskott-Aldrich syndrome. For CVID, the model found an accuracy on the train sample of 0.80, with an area under the ROC curve (AUC) of 0.80, and a Gini coefficient of 0.60. In the test subset, accuracy was 0.76, AUC 0.75, and Gini 0.51. The positive feature value to predict CVID was highest for upper respiratory infections, asthma, autoimmunity and hypogammaglobulinemia. Features with the highest negative predictive value were high IgE, growth delay, abscess, lymphopenia, and congenital heart disease. For the rest of the diagnoses, accuracy stayed between 0.75 and 0.99, AUC 0.46-0.87, Gini 0.07-0.75, and LogLoss 0.09-8.55. DISCUSSION: Clinicians should remember to consider the negative predictive features together with the positives. We are calling this a proof-of-concept study to continue with our explorations. A good performance is encouraging, and feature importance might aid feature selection for future endeavors. In the meantime, we can learn from the rules derived by the model and build a user-friendly decision tree to generate differential diagnoses.
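The abstract reports AUC and Gini side by side and mentions scale_pos_weight without giving formulas. The sketch below is one way to read those numbers, under two assumptions that the abstract does not confirm: that Gini follows the conventional relation Gini = 2·AUC − 1, and that the class weight follows the common negatives-to-positives heuristic in a one-vs-rest setup. The function names and the toy labels and scores are hypothetical, not USIDNET data.

```python
# Illustrative sketch only: toy inputs, hypothetical helper names.

def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini_from_auc(auc):
    """Conventional relation between the Gini coefficient and AUC (assumed here)."""
    return 2.0 * auc - 1.0


def one_vs_rest_pos_weight(n_positive, n_total):
    """Heuristic class weight: negatives divided by positives (an assumption)."""
    return (n_total - n_positive) / n_positive


# Toy check of the AUC helper on made-up scores.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print("toy AUC:", roc_auc(toy_labels, toy_scores))

# Figures quoted in the abstract for CVID.
print("Gini for train AUC 0.80:", round(gini_from_auc(0.80), 2))
print("Gini for test AUC 0.75:", round(gini_from_auc(0.75), 2))
print("CVID weight:", round(one_vs_rest_pos_weight(1098, 2396), 2))
```

Under that relation, the train AUC of 0.80 gives the reported Gini of 0.60, and the test AUC of 0.75 gives 0.50, close to the reported 0.51; the small gap would be consistent with the AUC having been rounded before publication, though the abstract does not say so.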

Subject(s)

Primary Immunodeficiency Diseases , Wiskott-Aldrich Syndrome , Humans , Differential Diagnosis , Machine Learning , Data Mining

2.

Risk Factors Associated with COVID-19 Lethality: A Machine Learning Approach Using Mexico Database.

Carvantes-Barrera, Alejandro; Díaz-González, Lorena; Rosales-Rivera, Mauricio; Chávez-Almazán, Luis A.

J Med Syst ; 47(1): 90, 2023 Aug 19.

Article in English | MEDLINE | ID: mdl-37597034

ABSTRACT

Identifying risk factors associated with COVID-19 lethality is crucial in combating the ongoing pandemic. In this study, we developed lethality predictive models for each epidemiological wave and for the overall dataset using the Extreme Gradient Boosting technique and analyzed them using Shapley values to determine the contribution levels of various features, including demographics, comorbidities, medical units, and recent medical information from confirmed COVID-19 cases in Mexico between February 23, 2020, and April 15, 2022. The results showed that pneumonia and advanced age were the most important factors predicting patient death in all cohorts. Additionally, the medical unit where the patient received care acted as a risk or protective factor. IMSS medical units were identified as high-risk factors in all cohorts, except in wave four, while SSA medical units generally were moderate protective factors. We also found that intubation was a high-risk factor in the first epidemiological wave and a moderate-risk factor in the following waves. Female gender was a protective factor of moderate-high importance in all cohorts, while being between 18 and 29 years old was a moderate protective factor and being between 50 and 59 years old was a moderate risk factor. Additionally, diabetes (all cohorts), obesity (third wave), and hypertension (fourth wave) were identified as moderate risk factors. Finally, residing in municipalities with the lowest Human Development Index level represented a moderate risk factor. In conclusion, this study identified several significant risk factors associated with COVID-19 lethality in Mexico, which could aid policymakers in developing targeted interventions to reduce mortality rates.
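To make the idea of signed Shapley contributions (risk versus protective factors) concrete, here is a minimal brute-force sketch. It is not the authors' pipeline, which applied Shapley values to Extreme Gradient Boosting models; it enumerates every feature ordering for a tiny hand-written score. The `toy_risk` function, its coefficients, and the feature names are invented for illustration, and replacing absent features with a fixed baseline is a simplification of how SHAP treats missing features.

```python
from itertools import permutations
from math import factorial


def shapley_values(value_fn, instance, baseline):
    """Exact Shapley values by averaging marginal contributions over all orderings.

    Features not yet "switched on" keep their baseline value (a simplification).
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)
        previous = value_fn(current)
        for name in order:
            current[name] = instance[name]
            updated = value_fn(current)
            totals[name] += updated - previous
            previous = updated
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_risk(x):
    """Hypothetical lethality score; coefficients are made up for this sketch."""
    score = 0.05
    score += 0.30 * x["pneumonia"]
    score += 0.20 * x["age_60_plus"]
    score -= 0.05 * x["female"]
    score += 0.10 * x["pneumonia"] * x["age_60_plus"]  # interaction term
    return score


patient = {"pneumonia": 1, "age_60_plus": 1, "female": 1}
reference = {"pneumonia": 0, "age_60_plus": 0, "female": 0}
phi = shapley_values(toy_risk, patient, reference)
for feature, contribution in phi.items():
    print(feature, round(contribution, 3))
```

Positive values push the score up (risk factors) and negative values pull it down (protective factors); the contributions always sum to the difference between the patient's score and the baseline score. Exhaustive enumeration is only feasible for a handful of features, which is why tree-specific approximations are used in practice.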

Subject(s)

COVID-19 , Humans , Female , Adolescent , Young Adult , Adult , Middle Aged , COVID-19/epidemiology , Mexico/epidemiology , Risk Factors , Obesity , Machine Learning

3.

Machine learning and comorbidity network analysis for hospitalized patients with COVID-19 in a city in Southern Brazil.

Passarelli-Araujo, Hemanoel; Passarelli-Araujo, Hisrael; Urbano, Mariana R; Pescim, Rodrigo R.

Smart Health (Amst) ; 26: 100323, 2022 Dec.

Article in English | MEDLINE | ID: mdl-36159078

ABSTRACT

The large amount of data generated during the COVID-19 pandemic requires advanced tools for the long-term prediction of risk factors associated with COVID-19 mortality with higher accuracy. Machine learning (ML) methods directly address this topic and are essential tools to guide public health interventions. Here, we used ML to investigate the importance of demographic and clinical variables on COVID-19 mortality. We also analyzed how comorbidity networks are structured according to age groups. We conducted a retrospective study of COVID-19 mortality with hospitalized patients from Londrina, Parana, Brazil, registered in the database for severe acute respiratory infections (SIVEP-Gripe), from January 2021 to February 2022. We tested four ML models to predict the COVID-19 outcome: Logistic Regression, Support Vector Machine, Random Forest, and XGBoost. We also constructed a comorbidity network to investigate the impact of co-occurring comorbidities on COVID-19 mortality. Our study comprised 8358 hospitalized patients, of whom 2792 (33.40%) died. The XGBoost model achieved excellent performance (ROC-AUC = 0.90). Both permutation method and SHAP values highlighted the importance of age, ventilatory support status, and intensive care unit admission as key features in predicting COVID-19 outcomes. The comorbidity networks for old deceased patients are denser than those for young patients. In addition, the co-occurrence of heart disease and diabetes may be the most important combination to predict COVID-19 mortality, regardless of age and sex. This work presents a valuable combination of machine learning and comorbidity network analysis to predict COVID-19 outcomes. Reliable evidence on this topic is crucial for guiding the post-pandemic response and assisting in COVID-19 care planning and provision.
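The abstract does not say how the comorbidity networks were built or how "denser" was measured. One plausible reading, sketched below as an assumption, links two comorbidities whenever they co-occur in the same patient and measures density as the share of possible links that are present. The patient lists are invented toy data, not SIVEP-Gripe records, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(patients):
    """Build an undirected co-occurrence network from per-patient comorbidity lists.

    Returns the set of nodes and a Counter mapping each sorted pair to the
    number of patients in whom both conditions appear together.
    """
    nodes = set()
    edges = Counter()
    for conditions in patients:
        unique = sorted(set(conditions))
        nodes.update(unique)
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return nodes, edges


def density(nodes, edges):
    """Fraction of possible node pairs that are connected by at least one patient."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(edges) / (n * (n - 1) / 2)


# Invented toy cohorts for illustration only.
younger = [["obesity"], ["diabetes", "obesity"], ["asthma"]]
older = [
    ["heart disease", "diabetes"],
    ["heart disease", "diabetes", "hypertension"],
    ["hypertension", "diabetes"],
]

for label, cohort in (("younger", younger), ("older", older)):
    nodes, edges = cooccurrence_network(cohort)
    print(label, "density:", round(density(nodes, edges), 2))
    print(label, "most frequent pair:", edges.most_common(1))
```

In this toy example the older cohort forms a fully connected network while the younger one has a single link, which mirrors the kind of contrast the abstract describes without reproducing its data.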

4.

Spatial patterns and determinants of avocado frontier dynamics in Mexico.

Ramírez-Mejía, Diana; Levers, Christian; Mas, Jean-François.

Reg Environ Change ; 22(1): 28, 2022.

Article in English | MEDLINE | ID: mdl-35250377

ABSTRACT

The surging demand for commodity crops has led to rapid and severe agricultural frontier expansion globally and has put producing regions increasingly under pressure. However, knowledge about spatial patterns of agricultural frontier dynamics, their leading spatial determinants, and socio-ecological trade-offs is often lacking, hindering contextualized decision making towards more sustainable food systems. Here, we used inventory data to map frontier dynamics of avocado production, a cash crop of increasing importance in global diets, for Michoacán, Mexico, before and after the implementation of the North American Free Trade Agreement (NAFTA). We compiled a set of environmental, accessibility and social variables and identified the leading determinants of avocado frontier expansion and their interactions using extreme gradient boosting. We predicted potential expansion patterns and assessed their impacts on areas important for biodiversity conservation. Avocado frontiers expanded more than tenfold from 12,909 ha (1974) to 152,493 ha (2011), particularly after NAFTA. Annual precipitation, distance to settlements, and land tenure were key factors explaining avocado expansion. Under favorable climatic and accessibility conditions, most avocado expansion occurred on private lands. Conversely, under suboptimal conditions, most avocado expansion occurred on communal lands. Large areas suitable for further avocado expansion overlapped with priority sites for restoration, highlighting an imminent conflict between conservation and economic revenues. This is the first analysis of avocado frontier dynamics and their spatial determinants across a major production region and our results provide entry points to implement government-based strategies to support small-scale farmers, mostly those on communal lands, while trying to minimize the socio-environmental impacts of avocado production.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10113-022-01883-6.

5.

Design of Automatic Tool for Diagnosis of Pneumonia Using Boosting Techniques

Postalcioglu, Seda.

Braz. arch. biol. technol ; 65: e22210322, 2022. tab, graf

Article in English | LILACS-Express | LILACS | ID: biblio-1364443

ABSTRACT

COVID-19 is today's pandemic disease and can cause hospital overcrowding. It also affects the lungs and may cause pneumonia. The most popular technique for diagnosing pneumonia is the evaluation of X-ray images. However, a sufficient number of radiologists is needed to interpret the X-ray images. High rates of child deaths due to pneumonia have been encountered. Using this type of system, a diagnosis can be made quickly, and the treatment process can then be started rapidly. This study aims to diagnose pneumonia with an automatic tool based on boosting techniques. With this tool, the workload of the doctors/radiologists is reduced. Boosting techniques are a family of machine learning techniques. Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost) are used in the study. These techniques were chosen because of their simulation duration for modeling and their convenience for real-time applications. L2 normalization and feature selection are applied to the data before applying the techniques. A Random Forest classifier is used as the feature selection estimator. After modeling, the Categorical Boosting algorithm was observed to be faster than the other techniques, with a simulation duration of 0.7 seconds. With this automatic tool, the user can upload the desired X-ray image to the system and easily get the result on the screen without any radiologist/doctor.
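The two preprocessing steps named in the abstract can be sketched in a few lines. This is an illustration under assumptions the abstract does not confirm: L2 normalization is taken to mean scaling each sample's feature vector to unit Euclidean length, and feature selection is taken to mean keeping features whose importance exceeds the mean importance. The importance scores below are made-up numbers standing in for the output of a fitted Random Forest, and the function names are hypothetical.

```python
import math


def l2_normalize(row):
    """Scale one sample's feature vector to unit Euclidean (L2) length."""
    norm = math.sqrt(sum(value * value for value in row))
    if norm == 0.0:
        return list(row)  # leave all-zero rows unchanged
    return [value / norm for value in row]


def select_above_mean(importances):
    """Return indices of features whose importance exceeds the mean (assumed rule)."""
    mean_importance = sum(importances) / len(importances)
    return [i for i, score in enumerate(importances) if score > mean_importance]


def keep_columns(rows, indices):
    """Keep only the selected feature columns in every sample."""
    return [[row[i] for i in indices] for row in rows]


# Toy feature vectors and hypothetical importances (not from the study).
samples = [[3.0, 4.0, 0.0, 0.0], [1.0, 0.0, 1.0, 0.0]]
normalized = [l2_normalize(row) for row in samples]
toy_importances = [0.5, 0.1, 0.3, 0.1]

selected = select_above_mean(toy_importances)
print("selected feature indices:", selected)
print("reduced samples:", keep_columns(normalized, selected))
```

The reduced, normalized samples would then be passed to the boosting classifiers; that stage is omitted here because the abstract gives no model settings.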

6.

Automatic method for classifying COVID-19 patients based on chest X-ray images, using deep features and PSO-optimized XGBoost.

Dias Júnior, Domingos Alves; da Cruz, Luana Batista; Bandeira Diniz, João Otávio; França da Silva, Giovanni Lucca; Junior, Geraldo Braz; Silva, Aristófanes Corrêa; de Paiva, Anselmo Cardoso; Nunes, Rodolfo Acatauassú; Gattass, Marcelo.

Expert Syst Appl ; 183: 115452, 2021 Nov 30.

Article in English | MEDLINE | ID: mdl-34177133

ABSTRACT

The COVID-19 pandemic, which originated in December 2019 in the city of Wuhan, China, continues to have a devastating effect on the health and well-being of the global population. Currently, approximately 8.8 million people have already been infected and more than 465,740 people have died worldwide. An important step in combating COVID-19 is the screening of infected patients using chest X-ray (CXR) images. However, this task is extremely time-consuming and prone to variability among specialists owing to its heterogeneity. Therefore, the present study aims to assist specialists in identifying COVID-19 patients from their chest radiographs, using automated computational techniques. The proposed method has four main steps: (1) the acquisition of the dataset, from two public databases; (2) the standardization of images through preprocessing; (3) the extraction of features using a deep features-based approach implemented through the networks VGG19, Inception-v3, and ResNet50; (4) the classifying of images into COVID-19 groups, using eXtreme Gradient Boosting (XGBoost) optimized by particle swarm optimization (PSO). In the best-case scenario, the proposed method achieved an accuracy of 98.71%, a precision of 98.89%, a recall of 99.63%, and an F1-score of 99.25%. In our study, we demonstrated that the problem of classifying CXR images of patients under COVID-19 and non-COVID-19 conditions can be solved efficiently by combining a deep features-based approach with a robust classifier (XGBoost) optimized by an evolutionary algorithm (PSO). The proposed method offers considerable advantages for clinicians seeking to tackle the current COVID-19 pandemic.
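Step (4) pairs XGBoost with particle swarm optimization, but the abstract does not state which hyperparameters were tuned or with what swarm settings. The following is a minimal, generic PSO sketch under stated assumptions: the two tuned quantities (`learning_rate`, `max_depth`), their bounds, the swarm coefficients, and the `toy_validation_error` objective are all hypothetical stand-ins. In a real pipeline the objective would be a validation error of the classifier trained on the extracted deep features.

```python
import random


def pso_minimize(objective, bounds, n_particles=15, n_iters=40, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimal particle swarm optimizer over a box-bounded search space.

    Returns the best position, its loss, and the global-best loss per iteration.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_loss = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_loss[i])
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]
    history = [gbest_loss]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (pbest[i][d] - pos[i][d])
                             + c_global * r2 * (gbest[d] - pos[i][d]))
                # Clip the new position to the allowed range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            loss = objective(pos[i])
            if loss < pbest_loss[i]:
                pbest[i], pbest_loss[i] = pos[i][:], loss
                if loss < gbest_loss:
                    gbest, gbest_loss = pos[i][:], loss
        history.append(gbest_loss)
    return gbest, gbest_loss, history


def toy_validation_error(params):
    """Hypothetical smooth error surface with its minimum at (0.1, 6)."""
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + ((max_depth - 6.0) / 10.0) ** 2


search_space = [(0.01, 0.5), (2.0, 12.0)]  # assumed bounds
best, best_loss, trace = pso_minimize(toy_validation_error, search_space)
print("best parameters:", best)
print("best loss:", best_loss)
```

Because the global best is only replaced by a strictly better candidate, the recorded loss trace never increases. An integer-valued setting such as a tree depth would need rounding inside the objective before being handed to the classifier.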
