1.
Stud Health Technol Inform ; 315: 368-372, 2024 Jul 24.
Article in English | MEDLINE | ID: mdl-39049285

ABSTRACT

This paper explores the balance between fairness and performance in machine learning classification, predicting the likelihood of a patient receiving anti-microbial treatment using structured data in community nursing wound care electronic health records. The data includes two important predictors (gender and language) of the social determinants of health, which we used to evaluate the fairness of the classifiers. At the same time, the impact of various groupings of language codes on classifiers' performance and fairness is analyzed. Most common statistical learning-based classifiers are evaluated. The findings indicate that while K-Nearest Neighbors offers the best fairness metrics among different grouping settings, the performance of all classifiers is generally consistent across different language code groupings. Also, grouping more variables tends to improve the fairness metrics over all classifiers while maintaining their performance.
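The abstract does not name the fairness metrics used. As a minimal sketch, assuming one common choice (the demographic parity gap, that is, the spread in positive-prediction rates across groups), the effect of regrouping language codes could be checked as follows. The data, the `regroup` helper, and the group labels are hypothetical and only illustrate the mechanics.

```python
from collections import defaultdict

def positive_rates(y_pred, groups):
    """Share of positive predictions within each group."""
    totals, positives = defaultdict(int), defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}

def demographic_parity_gap(y_pred, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = positive_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())

def regroup(codes, mapping, default="other"):
    """Collapse fine-grained codes into coarser groups (hypothetical helper)."""
    return [mapping.get(code, default) for code in codes]

# Hypothetical toy predictions and language codes, not data from the paper.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
language = ["en", "en", "en", "en", "zh", "zh", "vi", "vi"]

gap_fine = demographic_parity_gap(y_pred, language)
gap_coarse = demographic_parity_gap(y_pred, regroup(language, {"en": "english"}))
```

On this toy input the coarser grouping happens to give a smaller gap; that is a property of the made-up data, not a general result.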


Subject(s)
Electronic Health Records , Health Equity , Machine Learning , Electronic Health Records/classification , Humans , Social Determinants of Health
2.
Biotechnol Bioeng ; 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39044472

ABSTRACT

In the burgeoning field of proteins, the effective analysis of intricate protein data remains a formidable challenge, necessitating advanced computational tools for data processing, feature extraction, and interpretation. This study introduces ProteinFlow, an innovative framework designed to revolutionize feature engineering in protein data analysis. ProteinFlow stands out by offering enhanced efficiency in data collection and preprocessing, along with advanced capabilities in feature extraction, directly addressing the complexities inherent in multidimensional protein data sets. Through a comparative analysis, ProteinFlow demonstrated a significant improvement over traditional methods, notably reducing data preprocessing time and expanding the scope of biologically significant features identified. The framework's parallel data processing strategy and advanced algorithms ensure not only rapid data handling but also the extraction of comprehensive, meaningful insights from protein sequences, structures, and interactions. Furthermore, ProteinFlow exhibits remarkable scalability, adeptly managing large-scale data sets without compromising performance, a crucial attribute in the era of big data.

3.
Adv Sci (Weinh) ; : e2405124, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39041889

ABSTRACT

Amid growing interest in the precise detection of volatile organic compounds (VOCs) in the industrial field, the demand for highly effective gas sensors is at an all-time high. However, traditional sensors, with their single-output signal and the bulky, complex integrated structure required to form an array, often involve complicated technology and high cost, limiting their widespread adoption. This study introduces a novel approach that employs an integrated YSZ-based (YSZ: yttria-stabilized zirconia) mixed potential sensor equipped with a triple-sensing electrode array to efficiently detect and differentiate six types of VOC gases. The sensor integrates NiSb2O6, CuSb2O6, and MgSb2O6 sensing electrodes (SEs), which are sensitive to pentane, isoprene, n-propanol, acetone, acetic acid, and formaldehyde. Feature engineering based on intuitive spike-based response values accentuates the distinct characteristics of each gas. Ultimately, an average classification accuracy of 98.8% and an overall coefficient of determination (R2) of 99.3% for concentration regression across the six target gases are achieved, showcasing the potential to quantitatively distinguish between hazardous industrial VOC gases.

4.
Carbohydr Res ; 542: 109189, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38971003

ABSTRACT

There has been a long-standing bottleneck in the quantitative analysis of the frequencies of homoblock polyads beyond triads using 1H and 13C NMR for linear polysaccharides, primarily because monosaccharides within a long homoblock share similar chemical environments due to identical neighboring units, resulting in indistinct NMR peaks. In this study, through rigorous mathematical induction, inequality relations were established that enabled the calculation of frequency ranges of homoblock polyads from historically reported NMR-derived frequency values of diads and/or triads of alginates, chitosans, homogalacturonans, and galactomannans. The calculated homoblock frequency ranges were then applied to evaluate three chain growth statistical models, including the Bernoulli chain, first-order Markov chain, and second-order Markov chain, for predicting homoblock frequencies in these polysaccharides. Furthermore, based on the mathematically derived inequality relations, a novel 2D array was constructed, enabling the graphical visualization of homoblock features in polysaccharides. It was demonstrated, as a proof of concept, that the novel 2D array, along with a 1D code generated from it, could serve as an effective feature engineering tool for polymer classification using machine learning algorithms.
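The three chain growth models named above give closed-form predictions for the frequency of a homoblock of length n. The sketch below uses the standard textbook forms of these models; it does not reproduce the paper's inequality relations or its 2D array. The function names and the numeric inputs are hypothetical.

```python
def bernoulli_homoblock(f_a, n):
    """Bernoulli chain: each unit is A independently with probability f_a."""
    return f_a ** n

def markov1_homoblock(f_a, f_aa, n):
    """First-order Markov chain: P(A | A) = F_AA / F_A."""
    p_a_given_a = f_aa / f_a
    return f_a * p_a_given_a ** (n - 1)

def markov2_homoblock(f_aa, f_aaa, n):
    """Second-order Markov chain (n >= 2): P(A | AA) = F_AAA / F_AA."""
    p_a_given_aa = f_aaa / f_aa
    return f_aa * p_a_given_aa ** (n - 2)

# Hypothetical monad, diad, and triad frequencies, not values from the paper.
f_a, f_aa, f_aaa = 0.6, 0.45, 0.36

triad_bernoulli = bernoulli_homoblock(f_a, 3)
triad_markov1 = markov1_homoblock(f_a, f_aa, 3)
tetrad_markov2 = markov2_homoblock(f_aa, f_aaa, 4)
```

Comparing such predictions against a frequency range derived from measured diads and triads is one way a model could be accepted or rejected.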


Subject(s)
Alginates , Magnetic Resonance Spectroscopy , Mannans , Mannans/chemistry , Alginates/chemistry , Galactose/chemistry , Galactose/analogs & derivatives , Pectins
5.
Article in English | MEDLINE | ID: mdl-39082872

ABSTRACT

Explorative data analysis (EDA) is a critical step in scientific projects, aiming to uncover valuable insights and patterns within data. Traditionally, EDA involves manual inspection, visualization, and various statistical methods. The advent of artificial intelligence (AI) and machine learning (ML) has the potential to improve EDA, offering more sophisticated approaches that enhance its efficacy. This review explores how AI and ML algorithms can improve feature engineering and selection during EDA, leading to more robust predictive models and data-driven decisions. Tree-based models, regularized regression, and clustering algorithms were identified as key techniques. These methods automate feature importance ranking, handle complex interactions, perform feature selection, reveal hidden groupings, and detect anomalies. Real-world applications include risk prediction in total hip arthroplasty and subgroup identification in scoliosis patients. Recent advances in explainable AI and EDA automation show potential for further improvement. The integration of AI and ML into EDA accelerates tasks and uncovers sophisticated insights. However, effective utilization requires a deep understanding of the algorithms, their assumptions, and limitations, along with domain knowledge for proper interpretation. As data continues to grow, AI will play an increasingly pivotal role in EDA when combined with human expertise, driving more informed, data-driven decision-making across various scientific domains. Level of Evidence: Level V - Expert opinion.

6.
Methods Mol Biol ; 2844: 33-44, 2024.
Article in English | MEDLINE | ID: mdl-39068330

ABSTRACT

Promoters are the genomic regions upstream of genes that RNA polymerase binds in order to initiate gene transcription. Understanding the regulation of gene expression depends on being able to identify promoters, because they are the most important component of gene expression. Agrobacterium tumefaciens (A. tumefaciens) strain C58 was the subject of this study with the goal of creating a machine learning-based model to predict promoters. In this study, nucleotide density (ND), k-mer, and one-hot were used to encode the promoter sequence. Support vector machine (SVM) on fivefold cross-validation with incremental feature selection (IFS) was used to optimize the generated features. These improved characteristics were then used to distinguish promoter sequences by feeding them into the random forest (RF) classifier. Tenfold cross-validation (CV) analysis revealed that the projected model has the ability to produce an accuracy of 84.22%.
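Two of the encodings mentioned above, one-hot and k-mer, can be sketched in a few lines. This is an illustrative version under simple assumptions (DNA alphabet ACGT, overlapping windows, frequencies normalized by window count); the paper's exact settings are not given in the abstract, and nucleotide density is left out because its definition is not stated there.

```python
from itertools import product

BASES = "ACGT"

def one_hot(seq):
    """Encode each base as a 4-element indicator vector in ACGT order."""
    return [[1 if base == b else 0 for b in BASES] for base in seq]

def kmer_frequencies(seq, k=2):
    """Relative frequency of every possible k-mer, in lexicographic order."""
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(seq) - k + 1
    if windows <= 0:
        return [0.0] * len(kmers)
    for i in range(windows):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
    return [counts[m] / windows for m in kmers]

# Hypothetical short sequence for illustration only.
seq = "ACGTAC"
encoded = one_hot(seq)
freqs = kmer_frequencies(seq, k=2)
```

The resulting vectors could then be concatenated and passed to a feature selection step and a classifier, as the abstract describes.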


Subject(s)
Agrobacterium tumefaciens , Artificial Intelligence , Promoter Regions, Genetic , Support Vector Machine , Agrobacterium tumefaciens/genetics , Computational Biology/methods , Algorithms
7.
JMIR Public Health Surveill ; 10: e52353, 2024 Jul 18.
Article in English | MEDLINE | ID: mdl-39024001

ABSTRACT

BACKGROUND: Multimorbidity is a significant public health concern, characterized by the coexistence and interaction of multiple preexisting medical conditions. This complex condition has been associated with an increased risk of COVID-19. Individuals with multimorbidity who contract COVID-19 often face a significant reduction in life expectancy. The postpandemic period has also highlighted an increase in frailty, emphasizing the importance of integrating existing multimorbidity details into epidemiological risk assessments. Managing clinical data that include medical histories presents significant challenges, particularly due to the sparsity of data arising from the rarity of multimorbidity conditions. Also, the complex enumeration of combinatorial multimorbidity features introduces challenges associated with combinatorial explosions. OBJECTIVE: This study aims to assess the severity of COVID-19 in individuals with multiple medical conditions, considering their demographic characteristics such as age and sex. We propose an evolutionary machine learning model designed to handle sparsity, analyzing preexisting multimorbidity profiles of patients hospitalized with COVID-19 based on their medical history. Our objective is to identify the optimal set of multimorbidity feature combinations strongly associated with COVID-19 severity. We also apply the Apriori algorithm to these evolutionarily derived predictive feature combinations to identify those with high support. METHODS: We used data from 3 administrative sources in Piedmont, Italy, involving 12,793 individuals aged 45-74 years who tested positive for COVID-19 between February and May 2020. From their 5-year pre-COVID-19 medical histories, we extracted multimorbidity features, including drug prescriptions, disease diagnoses, sex, and age. Focusing on COVID-19 hospitalization, we segmented the data into 4 cohorts based on age and sex. 
Addressing data imbalance through random resampling, we compared various machine learning algorithms to identify the optimal classification model for our evolutionary approach. Using 5-fold cross-validation, we evaluated each model's performance. Our evolutionary algorithm, utilizing a deep learning classifier, generated prediction-based fitness scores to pinpoint multimorbidity combinations associated with COVID-19 hospitalization risk. Eventually, the Apriori algorithm was applied to identify frequent combinations with high support. RESULTS: We identified multimorbidity predictors associated with COVID-19 hospitalization, indicating more severe COVID-19 outcomes. Frequently occurring morbidity features in the final evolved combinations were age>53, R03BA (glucocorticoid inhalants), and N03AX (other antiepileptics) in cohort 1; A10BA (biguanide or metformin) and N02BE (anilides) in cohort 2; N02AX (other opioids) and M04AA (preparations inhibiting uric acid production) in cohort 3; and G04CA (Alpha-adrenoreceptor antagonists) in cohort 4. CONCLUSIONS: When combined with other multimorbidity features, even less prevalent medical conditions show associations with the outcome. This study provides insights beyond COVID-19, demonstrating how repurposed administrative data can be adapted and contribute to enhanced risk assessment for vulnerable populations.
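The final step, finding feature combinations with high support, can be illustrated with a compact version of the Apriori algorithm. The patient records below are invented toy data that merely reuse some codes from the abstract as labels; the `min_support` value is also an assumption.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        # Keep only candidates that are frequent at this level.
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets, then prune by the Apriori property:
        # every (k-1)-subset of a candidate must itself be frequent.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent

# Hypothetical feature sets for four patients, not data from the study.
patients = [
    {"R03BA", "N03AX", "age>53"},
    {"R03BA", "N03AX"},
    {"R03BA", "age>53"},
    {"N02BE"},
]
frequent_sets = apriori(patients, min_support=0.5)
```

Here support is simply the fraction of records containing the whole combination, so rare combinations are filtered out early and never expanded.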


Subject(s)
COVID-19 , Hospitalization , Machine Learning , Multimorbidity , Humans , COVID-19/epidemiology , Italy/epidemiology , Male , Female , Aged , Hospitalization/statistics & numerical data , Middle Aged , Longitudinal Studies , Aged, 80 and over
8.
Sensors (Basel) ; 24(11)2024 May 23.
Article in English | MEDLINE | ID: mdl-38894140

ABSTRACT

Nocturnal enuresis (NE) is involuntary bedwetting during sleep, typically appearing in young children. Despite the potential benefits of the long-term home monitoring of NE patients for research and treatment enhancement, this area remains underexplored. To address this, we propose NEcare, an in-home monitoring system that utilizes wearable devices and machine learning techniques. NEcare collects sensor data from an electrocardiogram, body impedance (BI), a three-axis accelerometer, and a three-axis gyroscope to examine bladder volume (BV), heart rate (HR), and periodic limb movements in sleep (PLMS). Additionally, it analyzes the collected NE patient data and supports NE moment estimation using heuristic rules and deep learning techniques. To demonstrate the feasibility of in-home monitoring for NE patients using our wearable system, we used our datasets from 30 in-hospital patients and 4 in-home patients. The results show that NEcare captures expected trends associated with NE occurrences, including BV increase, HR increase, and PLMS appearance. In addition, we studied the machine learning-based NE moment estimation, which could help relieve the burdens of NE patients and their families. Finally, we address the limitations and outline future research directions for the development of wearable systems for NE patients.


Subject(s)
Nocturnal Enuresis , Wearable Electronic Devices , Humans , Nocturnal Enuresis/physiopathology , Monitoring, Physiologic/instrumentation , Monitoring, Physiologic/methods , Child , Heart Rate/physiology , Machine Learning , Male , Female , Electrocardiography/methods , Sleep/physiology , Monitoring, Ambulatory/instrumentation , Monitoring, Ambulatory/methods
9.
J Pathol Inform ; 15: 100382, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38840834

ABSTRACT

Knee osteoarthritis (OA) is a prevalent condition causing significant disability, particularly among the elderly, necessitating advancements in diagnostic methodologies to facilitate early detection and treatment. Traditional OA diagnosis, relying on radiography and physical exams, faces limitations in accuracy and objectivity. This underscores the need for more advanced diagnostic methods, such as machine learning (ML) and deep learning (DL), to improve OA detection and classification. This research introduces a novel ensemble learning approach for image data feature extraction that combines the strengths of 2 advanced ML models with a DL method to substantially improve the accuracy of OA detection from radiographic images. This strategy aims to address the limitations of traditional diagnostic tools by leveraging the enhanced sensitivity and specificity of combined ML and DL models. The methodology deployed in this study encompasses the application of 10 ML models to a comprehensive publicly available Kaggle dataset with a total of 3615 samples of knee X-ray images. We applied rigorous k-fold cross-validation and meticulous hyperparameter optimization, and used evaluation metrics such as accuracy, receiver operating characteristic, precision, recall, and F1-score to assess our models' performance. The proposed novel CDK (convolutional neural network, decision tree, K-nearest classifier) ensemble approach for feature extraction is designed to synergize the predictive capabilities of individual models, thereby significantly improving the detection accuracy of OA indicators within radiographic images. We applied several ML and DL approaches to the newly created feature set to evaluate performance. The CDK ensemble model outperformed state-of-the-art studies with a high-performance score of 99.72% accuracy. 
This remarkable achievement underscores the model's exceptional capability in the early detection of OA, highlighting its superiority in comparison to existing methods.

10.
Front Cell Infect Microbiol ; 14: 1385562, 2024.
Article in English | MEDLINE | ID: mdl-38846353

ABSTRACT

Background: Lower respiratory tract infections represent prevalent ailments. Nonetheless, current comprehension of the microbial ecosystems within the lower respiratory tract remains incomplete and necessitates further comprehensive assessment. Leveraging the advancements in metagenomic next-generation sequencing (mNGS) technology alongside the emergence of machine learning, it is now viable to compare the attributes of lower respiratory tract microbial communities among patients across diverse age groups, diseases, and infection types. Method: We collected bronchoalveolar lavage fluid samples from 138 patients diagnosed with lower respiratory tract infections and conducted mNGS to characterize the lung microbiota. Employing various machine learning algorithms, we investigated the correlation of key bacteria in patients with concurrent bronchiectasis and developed a predictive model for hospitalization duration based on these identified key bacteria. Result: We observed variations in microbial communities across different age groups, diseases, and infection types. In the elderly group, Pseudomonas aeruginosa exhibited the highest relative abundance, followed by Corynebacterium striatum and Acinetobacter baumannii. Methylobacterium and Prevotella emerged as the dominant genera at the genus level in the younger group, while Mycobacterium tuberculosis and Haemophilus influenzae were prevalent species. Within the bronchiectasis group, dominant bacteria included Pseudomonas aeruginosa, Haemophilus influenzae, and Klebsiella pneumoniae. Significant differences in the presence of Pseudomonas phage JBD93 were noted between the bronchiectasis group and the control group. In the group with concomitant fungal infections, the most abundant genera were Acinetobacter and Pseudomonas, with Acinetobacter baumannii and Pseudomonas aeruginosa as the predominant species. 
Notable differences were observed in the presence of Human gammaherpesvirus 4, Human betaherpesvirus 5, Candida albicans, Aspergillus oryzae, and Aspergillus fumigatus between the group with concomitant fungal infections and the bacterial group. Machine learning algorithms were utilized to select bacteria and clinical indicators associated with hospitalization duration, confirming the excellent performance of bacteria in predicting hospitalization time. Conclusion: Our study provided a comprehensive description of the microbial characteristics among patients with lower respiratory tract infections, offering insights from various perspectives. Additionally, we investigated the advanced predictive capability of microbial community features in determining the hospitalization duration of these patients.


Subject(s)
Bacteria , Bronchoalveolar Lavage Fluid , High-Throughput Nucleotide Sequencing , Machine Learning , Metagenomics , Microbiota , Respiratory Tract Infections , Humans , Metagenomics/methods , Middle Aged , Respiratory Tract Infections/microbiology , Respiratory Tract Infections/virology , Aged , Male , Female , Adult , Bacteria/classification , Bacteria/genetics , Bacteria/isolation & purification , Bronchoalveolar Lavage Fluid/microbiology , Microbiota/genetics , Young Adult , Bronchiectasis/microbiology , Aged, 80 and over , Metagenome , Adolescent , Lung/microbiology , Lung/virology , Hospitalization
11.
BMC Med Inform Decis Mak ; 24(1): 152, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38831432

ABSTRACT

BACKGROUND: Machine learning (ML) has emerged as the predominant computational paradigm for analyzing large-scale datasets across diverse domains. The assessment of dataset quality stands as a pivotal precursor to the successful deployment of ML models. In this study, we introduce DREAMER (Data REAdiness for MachinE learning Research), an algorithmic framework leveraging supervised and unsupervised machine learning techniques to autonomously evaluate the suitability of tabular datasets for ML model development. DREAMER is openly accessible as a tool on GitHub and Docker, facilitating its adoption and further refinement within the research community. RESULTS: The proposed model in this study was applied to three distinct tabular datasets, resulting in notable enhancements in their quality with respect to readiness for ML tasks, as assessed through established data quality metrics. Our findings demonstrate the efficacy of the framework in substantially augmenting the original dataset quality, achieved through the elimination of extraneous features and rows. This refinement yielded improved accuracy across both supervised and unsupervised learning methodologies. CONCLUSION: Our software presents an automated framework for data readiness, aimed at enhancing the integrity of raw datasets to facilitate robust utilization within ML pipelines. Through our proposed framework, we streamline the original dataset, resulting in enhanced accuracy and efficiency within the associated ML algorithms.


Subject(s)
Machine Learning , Humans , Datasets as Topic , Unsupervised Machine Learning , Algorithms , Supervised Machine Learning , Software
12.
Front Plant Sci ; 15: 1349569, 2024.
Article in English | MEDLINE | ID: mdl-38812738

ABSTRACT

Introduction: Because genomic selection (GS) is a predictive methodology, it needs to guarantee high prediction accuracy for practical implementations. However, since many factors affect the prediction performance of this methodology, its practical implementation still needs to be improved in many breeding programs. For this reason, many strategies have been explored to improve the prediction performance of this methodology. Methods: When environmental covariates are incorporated as inputs in the genomic prediction models, this information only sometimes helps increase prediction performance. For this reason, this investigation explores the use of feature engineering on the environmental covariates to enhance the prediction performance of genomic prediction models. Results and discussion: We found that across data sets, feature engineering helps reduce prediction error, relative to including the environmental covariates without feature engineering, by 761.625% across predictors. These results are very promising regarding the potential of feature engineering to enhance prediction accuracy. However, since a significant gain in prediction accuracy was observed in only some data sets, further research is required to guarantee a robust feature engineering strategy to incorporate the environmental covariates.

13.
Crit Care ; 28(1): 180, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38802973

ABSTRACT

BACKGROUND: Sepsis, an acute and potentially fatal systemic response to infection, significantly impacts global health by affecting millions annually. Prompt identification of sepsis is vital, as treatment delays lead to increased fatalities through progressive organ dysfunction. While recent studies have delved into leveraging Machine Learning (ML) for predicting sepsis, focusing on aspects such as prognosis, diagnosis, and clinical application, there remains a notable deficiency in the discourse regarding feature engineering. Specifically, the role of feature selection and extraction in enhancing model accuracy has been underexplored. OBJECTIVES: This scoping review aims to fulfill two primary objectives: to identify pivotal features for predicting sepsis across a variety of ML models, providing valuable insights for future model development, and to assess model efficacy through performance metrics including AUROC, sensitivity, and specificity. RESULTS: The analysis included 29 studies across diverse clinical settings such as Intensive Care Units (ICU), Emergency Departments, and others, encompassing 1,147,202 patients. The review highlighted the diversity in prediction strategies and timeframes. It was found that feature extraction techniques notably outperformed others in terms of sensitivity and AUROC values, thus indicating their critical role in improving sepsis prediction models. CONCLUSION: Key dynamic indicators, including vital signs and critical laboratory values, are instrumental in the early detection of sepsis. Applying feature selection methods significantly boosts model precision, with models like Random Forest and XGBoost showing promising results. Furthermore, Deep Learning (DL) models reveal unique insights, spotlighting the pivotal role of feature engineering in sepsis prediction, which could greatly benefit clinical practice.


Subject(s)
Machine Learning , Sepsis , Humans , Sepsis/diagnosis , Sepsis/therapy , Machine Learning/trends , Machine Learning/standards
14.
Med Biol Eng Comput ; 2024 May 03.
Article in English | MEDLINE | ID: mdl-38700613

ABSTRACT

Neurodegenerative diseases often exhibit a strong link with sleep disruption, highlighting the importance of effective sleep stage monitoring. In this light, automatic sleep stage classification (ASSC) plays a pivotal role, now more streamlined than ever due to the advancements in deep learning (DL). However, the opaque nature of DL models can be a barrier in their clinical adoption, due to trust concerns among medical practitioners. To bridge this gap, we introduce SleepBoost, a transparent multi-level tree-based ensemble model specifically designed for ASSC. Our approach includes a crafted feature engineering block (FEB) that extracts 41 time and frequency domain features, out of which 23 are selected based on their high mutual information score (> 0.23). Uniquely, SleepBoost integrates three fundamental linear models into a cohesive multi-level tree structure, further enhanced by a novel reward-based adaptive weight allocation mechanism. Tested on the Sleep-EDF-20 dataset, SleepBoost demonstrates superior performance with an accuracy of 86.3%, F1-score of 80.9%, and Cohen kappa score of 0.807, outperforming leading DL models in ASSC. An ablation study underscores the critical role of our selective feature extraction in enhancing model accuracy and interpretability, crucial for clinical settings. This innovative approach not only offers a more transparent alternative to traditional DL models but also extends potential implications for monitoring and understanding sleep patterns in the context of neurodegenerative disorders. The open-source availability of SleepBoost's implementation at https://github.com/akibzaman/SleepBoost can further facilitate its accessibility and potential for widespread clinical adoption.
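The selection step described above (keep features whose mutual information with the label exceeds 0.23) can be sketched for discrete features as follows. This is an assumption-laden illustration: the paper's features are time and frequency domain quantities whose estimator and units are not stated in the abstract, so the discrete plug-in estimate in nats, the feature names, and the toy data are all hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def select_features(features, labels, threshold=0.23):
    """Keep features whose MI with the labels is above the threshold."""
    scores = {name: mutual_information(col, labels) for name, col in features.items()}
    return {name: score for name, score in scores.items() if score > threshold}

# Hypothetical discretized features and sleep-stage labels.
labels = [0, 0, 1, 1]
features = {
    "informative": [0, 0, 1, 1],  # perfectly aligned with the labels
    "noise": [0, 1, 0, 1],        # independent of the labels
}
selected = select_features(features, labels)
```

A filter of this kind scores each feature independently, so it is cheap to compute but does not account for redundancy between the selected features.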

15.
J Affect Disord ; 356: 438-449, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38583596

ABSTRACT

BACKGROUND: General physicians misclassify depression in more than half of the cases. Researchers have explored the feasibility of leveraging passively collected data points, also called digital biomarkers, to provide more granular understanding of depression phenotypes as well as a more objective assessment of disease. METHOD: This paper provides a systematic review following the PRISMA guidelines (Page et al., 2021) to understand which digital biomarkers might be relevant for passive screening of depression. PubMed and PsycInfo were systematically searched for studies published from 2019 to early 2024, resulting in 161 records assessed for eligibility. Excluded were intervention studies, studies focusing on a different disease, and those lacking passive data collection. 74 studies remained for a quality assessment, after which 27 studies were included. RESULTS: The review shows that depressed participants' real-life behavior, such as reduced communication with others, can be tracked by passive data. Machine learning models for the classification of depression have shown accuracies up to 0.98, surpassing the quality of many standardized assessment methods. LIMITATIONS: Inconsistent outcome reporting in current studies does not allow for drawing statistical conclusions regarding the effectiveness of individual included features. The Covid-19 pandemic might have impacted the ongoing studies between 2020 and 2022. CONCLUSION: While digital biomarkers allow real-life tracking of participants' behavior and symptoms, further work is required to align the feature engineering of digital biomarkers. Given the high accuracies reported, connecting digital biomarkers with clinical practice can be a promising method of detecting symptoms of depression automatically.


Subject(s)
Biomarkers , Depression , Humans , Depression/diagnosis , Machine Learning , COVID-19 , Depressive Disorder/diagnosis
16.
Physiol Meas ; 45(5)2024 May 15.
Article in English | MEDLINE | ID: mdl-38663434

ABSTRACT

Objective. Electrocardiographic (ECG) lead misplacement can result in distorted waveforms and amplitudes, significantly impacting accurate interpretation. Although lead misplacement is a relatively low-probability event, with an incidence ranging from 0.4% to 4%, the large number of ECG records in clinical practice necessitates the development of an effective detection method. This paper aimed to address this gap by presenting a novel lead misplacement detection method based on deep learning models. Approach. We developed two novel lightweight deep learning models for limb and chest lead misplacement detection, respectively. For limb lead misplacement detection, two limb leads and V6 were used as inputs, while for chest lead misplacement detection, six chest leads were used as inputs. Our models were trained and validated using the Chapman database, with an 8:2 train-validation split, and evaluated on the PTB-XL, PTB, and LUDB databases. Additionally, we examined the model interpretability on the LUDB database. Limb lead misplacement simulations were performed using mathematical transformations, while chest lead misplacement scenarios were simulated by interchanging pairs of leads. The detection performance was assessed using metrics such as accuracy, precision, sensitivity, specificity, and Macro F1-score. Main results. Our experiments simulated three scenarios of limb lead misplacement and nine scenarios of chest lead misplacement. The proposed two models achieved Macro F1-scores ranging from 93.42% to 99.61% on two heterogeneous test sets, demonstrating their effectiveness in accurately detecting lead misplacement across various arrhythmias. Significance. The significance of this study lies in providing a reliable open-source algorithm for lead misplacement detection in ECG recordings. The source code is available at https://github.com/wjcai/ECG_lead_check.
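The two simulation strategies mentioned in the Approach can be illustrated with a short sketch. It is not taken from the linked repository: the function names and the toy samples are hypothetical, and the limb example shows only one well-known case (right arm/left arm reversal, where lead I is inverted and leads II and III trade places, using Einthoven's relation III = II - I).

```python
def swap_chest_leads(recording, lead_a, lead_b):
    """Simulate chest lead misplacement by interchanging two leads.

    `recording` maps lead names to sample lists; the input is left unmodified.
    """
    swapped = dict(recording)
    swapped[lead_a], swapped[lead_b] = recording[lead_b], recording[lead_a]
    return swapped

def ra_la_reversal(lead_i, lead_ii):
    """Simulate right arm/left arm electrode reversal from leads I and II.

    Returns the leads I and II that would be recorded after the swap:
    I' = -I and II' = III = II - I.
    """
    new_i = [-x for x in lead_i]
    new_ii = [b - a for a, b in zip(lead_i, lead_ii)]
    return new_i, new_ii

# Hypothetical two-sample recordings for illustration only.
chest = {"V1": [1, 2], "V2": [3, 4], "V3": [5, 6]}
chest_swapped = swap_chest_leads(chest, "V1", "V2")
new_i, new_ii = ra_la_reversal([1.0, 2.0], [3.0, 5.0])
```

Generating misplaced examples this way gives labeled training data without needing recordings where electrodes were actually misplaced.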


Subject(s)
Deep Learning , Electrocardiography , Humans , Signal Processing, Computer-Assisted , Thorax
17.
J Stroke Cerebrovasc Dis ; 33(6): 107714, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38636829

ABSTRACT

OBJECTIVES: We set out to develop a machine learning model capable of distinguishing patients presenting with ischemic stroke from a healthy cohort of subjects. The model relies on a 3-min resting electroencephalogram (EEG) recording from which features can be computed. MATERIALS AND METHODS: Using a large-scale, retrospective database of EEG recordings and matching clinical reports, we were able to construct a dataset of 1385 healthy subjects and 374 stroke patients. With subjects often producing more than one recording per session, the final dataset consisted of 2401 EEG recordings (63% healthy, 37% stroke). RESULTS: Using a rich set of features encompassing both the spectral and temporal domains, our model yielded an AUC of 0.95, with a sensitivity and specificity of 93% and 86%, respectively. Allowing for multiple recordings per subject in the training set boosted sensitivity by 7%, attributable to a more balanced dataset. CONCLUSIONS: Our work demonstrates strong potential for the use of EEG in conjunction with machine learning methods to distinguish stroke patients from healthy subjects. Our approach provides a solution that is not only timely (3-minutes recording time) but also highly precise and accurate (AUC: 0.95).
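For reference, sensitivity and specificity as reported above are computed from the confusion matrix of a binary classifier. A minimal sketch follows, assuming stroke is coded as 1 and healthy as 0; the labels and predictions are invented toy values.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)  # share of stroke recordings detected
    specificity = tn / (tn + fp)  # share of healthy recordings cleared
    return sensitivity, specificity

# Hypothetical labels and predictions, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Because several recordings can come from the same subject, a per-subject split would be needed to keep such estimates from being optimistic; the abstract does not say how the split was done.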


Subject(s)
Brain Waves , Databases, Factual , Electroencephalography , Ischemic Stroke , Machine Learning , Predictive Value of Tests , Humans , Retrospective Studies , Male , Female , Middle Aged , Aged , Ischemic Stroke/diagnosis , Ischemic Stroke/physiopathology , Case-Control Studies , Adult , Brain/physiopathology , Signal Processing, Computer-Assisted , Reproducibility of Results , Aged, 80 and over , Diagnosis, Differential , Diagnosis, Computer-Assisted , Time Factors
18.
Biomed Phys Eng Express ; 10(4)2024 May 07.
Article in English | MEDLINE | ID: mdl-38663368

ABSTRACT

The intricate nature of lung cancer treatment poses considerable challenges upon diagnosis. Early detection plays a pivotal role in mitigating its escalating global mortality rates. Consequently, there are pressing demands for robust and dependable early detection and diagnostic systems. However, the technological limitations and complexity of the disease make it challenging to implement an efficient lung cancer screening system. AI-based CT image analysis techniques are contributing significantly to the development of computer-assisted detection (CAD) systems for lung cancer screening. Various research groups are working on implementing CT image analysis systems for assessing and classifying lung cancer. However, the structures inside a CT image are highly complex, and the significant information they carry remains difficult to interpret even after applying advanced feature extraction and feature selection techniques. Traditional feature selection techniques may struggle to capture complex interdependencies between features. They may get stuck in local optima and sometimes require additional exploration strategies. They may also struggle with combinatorial optimization problems when applied to a large feature space. This paper proposes a methodology to overcome these challenges by applying feature extraction using a Vision Transformer (FexViT) and feature selection using a quantum computing based quadratic unconstrained binary optimization (QC-FSelQUBO) technique. The proposed methodology showed better performance than other existing techniques when evaluated on accuracy, area under the receiver operating characteristic (ROC) curve, precision, sensitivity, and specificity, which were 94.28%, 99.10%, 96.17%, 90.16%, and 97.46%, respectively. Further advancement of CAD systems is essential to meet the demand for more reliable detection and diagnosis of cancer, which can be addressed by advancing the proposed quantum computation approach alongside growing AI-based technology.
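
To show what casting feature selection as a quadratic unconstrained binary optimization (QUBO) problem can look like, here is a minimal Python sketch. It rests on several assumptions: the abstract does not give the authors' objective, so the common relevance-versus-redundancy formulation is used instead; the weighting parameter `alpha` and all function names are hypothetical; and an exhaustive classical search stands in for the quantum solver, which is feasible only for a handful of features.

```python
from itertools import product
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def build_qubo(relevance: Sequence[float], redundancy: Matrix,
               alpha: float = 0.5) -> Matrix:
    """Build an upper-triangular QUBO matrix (assumed formulation).

    Diagonal terms reward selecting a relevant feature; off-diagonal
    terms penalize selecting two redundant features together.
    """
    n = len(relevance)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        q[i][i] = -alpha * relevance[i]
        for j in range(i + 1, n):
            q[i][j] = (1.0 - alpha) * redundancy[i][j]
    return q


def qubo_energy(q: Matrix, x: Sequence[int]) -> float:
    """Evaluate x^T Q x for a binary selection vector x."""
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def solve_brute_force(q: Matrix) -> Tuple[int, ...]:
    """Classical stand-in for a quantum solver: try every bit string."""
    return min(product((0, 1), repeat=len(q)), key=lambda x: qubo_energy(q, x))
```

With relevance scores `[0.9, 0.8, 0.1]` and a high redundancy between the first two features, the minimizer keeps only one of the redundant pair. A quantum annealer or variational solver would take the same matrix `q` as input; only the search step changes.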


Subject(s)
Algorithms , Lung Neoplasms , Tomography, X-Ray Computed , Humans , Lung Neoplasms/diagnostic imaging , Lung Neoplasms/pathology , Tomography, X-Ray Computed/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Image Processing, Computer-Assisted/methods , Early Detection of Cancer/methods , ROC Curve , Quantum Theory
19.
Sensors (Basel) ; 24(7)2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38610506

ABSTRACT

Anonymous networks, which aim primarily to protect user identities, have gained prominence as tools for enhancing network security and anonymity. Nonetheless, these networks have become a platform for adversarial affairs and sources of suspicious attack traffic. To defend against unpredictable adversaries on the Internet, detecting anonymous network traffic has emerged as a necessity. Many supervised approaches to identify anonymous traffic have harnessed machine learning strategies. However, many require access to engineered datasets and complex architectures to extract the desired information. Due to the resistance of anonymous network traffic to traffic analysis and the scarcity of publicly available datasets, those approaches may need to improve their training efficiency and achieve a higher performance when it comes to anonymous traffic detection. This study utilizes feature engineering techniques to extract pattern information and rank the feature importance of the static traces of anonymous traffic. To leverage these pattern attributes effectively, we developed a reinforcement learning framework that encompasses four key components: states, actions, rewards, and state transitions. A lightweight system is devised to classify anonymous and non-anonymous network traffic. Subsequently, two fine-tuned thresholds are proposed to substitute the traditional labels in a binary classification system. The system will identify anonymous network traffic without reliance on labeled data. The experimental results underscore that the system can identify anonymous traffic with an accuracy rate exceeding 80% (when based on pattern information).

20.
PeerJ Comput Sci ; 10: e1982, 2024.
Article in English | MEDLINE | ID: mdl-38660162

ABSTRACT

Maternal healthcare is a critical aspect of public health that focuses on the well-being of pregnant women before, during, and after childbirth. It encompasses a range of services aimed at ensuring the optimal health of both the mother and the developing fetus. During pregnancy and in the postpartum period, the mother's health is susceptible to several complications and risks, and timely detection of such risks can play a vital role in women's safety. This study proposes an approach to predict risks associated with maternal health. The first step of the approach involves utilizing principal component analysis (PCA) to extract significant features from the dataset. Following that, this study employs a stacked ensemble voting classifier which combines one machine learning and one deep learning model to achieve high performance. The performance of the proposed approach is compared to six machine learning algorithms and one deep learning algorithm. Two scenarios are considered for the experiments: one utilizing all features and the other using PCA features. By utilizing PCA-based features, the proposed model achieves an accuracy of 98.25%, precision of 99.17%, recall of 99.16%, and an F1 score of 99.16%. The effectiveness of the proposed model is further confirmed by comparing it to existing state-of-the-art approaches.
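
The idea of combining one machine learning model and one deep learning model by voting can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' classifier: the abstract does not say which voting rule or which base models were used, so soft voting (a weighted average of class probabilities) is assumed, and the function name and `weight_a` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def soft_vote(prob_a: Sequence[float], prob_b: Sequence[float],
              weight_a: float = 0.5) -> Tuple[int, List[float]]:
    """Combine two models' class-probability vectors by weighted averaging.

    prob_a and prob_b hold one probability per class for a single sample,
    e.g. from a classical model and a neural network (assumed inputs).
    Returns the index of the winning class and the combined probabilities.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same set of classes")
    combined = [weight_a * a + (1.0 - weight_a) * b
                for a, b in zip(prob_a, prob_b)]
    label = max(range(len(combined)), key=combined.__getitem__)
    return label, combined
```

With equal weights, a sample scored `[0.6, 0.3, 0.1]` by one model and `[0.2, 0.7, 0.1]` by the other is assigned to the second class, because the averaged probability is highest there. In the PCA scenario described above, both base models would receive the principal-component features as input before their outputs are combined.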
