ABSTRACT
Ribonucleic acids (RNAs) play crucial roles in living organisms. Some, such as bacterial ribosomes and precursor messenger RNA, are targets of small-molecule drugs, whereas others, e.g. bacterial riboswitches and viral RNA motifs, are considered potential therapeutic targets. The continuous discovery of new functional RNAs thus increases the demand both for compounds targeting them and for methods to analyze RNA-small molecule interactions. We recently developed fingeRNAt, a software tool for detecting non-covalent bonds formed within complexes of nucleic acids with different types of ligands. The program detects several non-covalent interactions and encodes them as a structural interaction fingerprint (SIFt). Here, we present the application of SIFts, combined with machine learning methods, to predict the binding of small molecules to RNA. We show that SIFt-based models outperform classic, general-purpose scoring functions in virtual screening. We also employed Explainable Artificial Intelligence (XAI) methods, including SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), to help understand the decision-making process behind the predictive models. In a case study, we applied XAI to a predictive model of ligand binding to human immunodeficiency virus type 1 trans-activation response element RNA to distinguish between residues and interaction types important for binding. We also used XAI to indicate whether an interaction has a positive or negative effect on binding prediction and to quantify its impact. The results obtained with all XAI methods were consistent with literature data, demonstrating the utility and importance of XAI in medicinal chemistry and bioinformatics.
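As a minimal sketch of the Shapley-value idea behind SHAP, exact values can be computed by enumerating feature coalitions. The three interaction "bits" and their weights below are invented for illustration only; they are not fingeRNAt output or the study's model:

```python
from itertools import combinations
from math import factorial

# Toy "predictive model": scores a ligand from three hypothetical interaction
# bits of a structural interaction fingerprint (SIFt). Weights are made up.
FEATURES = ["HB:G26", "pi-stack:A22", "ionic:U23"]

def model(present):
    # present: set of feature indices switched "on" in the fingerprint
    score = 0.0
    if 0 in present: score += 0.5      # hydrogen bond helps binding
    if 1 in present: score += 0.3      # stacking helps
    if 2 in present: score -= 0.2      # this contact hurts (hypothetical)
    if 0 in present and 1 in present:  # interaction effect between the two
        score += 0.1
    return score

def shapley(n, f):
    """Exact Shapley values by enumerating all coalitions of n features."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for s in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += w * (f(set(s) | {i}) - f(set(s)))
    return phi

phi = shapley(3, model)
for name, v in zip(FEATURES, phi):
    print(f"{name}: {v:+.3f}")
```

Each value sums a feature's weighted marginal contributions over all coalitions, so the values add up to the score difference between the full and empty fingerprints; this is the property SHAP exploits to attribute a prediction to individual interactions.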
Subjects
Artificial Intelligence , RNA , Humans , Ligands , Machine Learning , RNA Precursors , RNA, Messenger
ABSTRACT
BACKGROUND: Though several nomograms exist, machine learning (ML) approaches might improve prediction of pathologic stage in patients with prostate cancer. Our objective was to develop ML models that predict pathologic stage and outperform existing nomograms using readily available clinicopathologic variables. METHODS: Patients with prostate adenocarcinoma who underwent surgery were identified in the National Cancer Database. Seven ML models were trained to predict organ-confined (OC) disease, extracapsular extension (ECE), seminal vesicle invasion (SVI), and lymph node involvement (LNI). Model performance was measured using area under the curve (AUC) on a holdout testing data set. Clinical utility was evaluated using decision curve analysis (DCA). Performance metrics were confirmed on an external validation data set. RESULTS: The ML-based extreme gradient boosted trees model achieved the best performance, with AUCs of 0.744, 0.749, 0.816, and 0.811 for the OC, ECE, SVI, and LNI models, respectively. The MSK nomograms achieved AUCs of 0.708, 0.742, 0.806, and 0.802, respectively. The ML models also performed best on DCA. Findings were consistent on both the holdout internal validation data set and an external validation data set. CONCLUSIONS: Our ML models predicted pathologic stage better than existing nomograms. Accurate prediction of pathologic stage can help oncologists and patients determine optimal definitive treatment options for patients with prostate cancer.
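Decision curve analysis reduces, per threshold probability, to a simple net-benefit formula. A sketch with hypothetical predicted probabilities (not the study's data):

```python
def net_benefit(y_true, y_prob, pt):
    """Net benefit of a model at threshold probability pt (decision curve
    analysis): NB = TP/n - FP/n * pt/(1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    return tp / n - fp / n * pt / (1 - pt)

# Toy example: hypothetical predicted probabilities of, say, non-OC disease
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.3, 0.8]
prev = sum(y_true) / len(y_true)
for pt in (0.1, 0.3, 0.5):
    nb_model = net_benefit(y_true, y_prob, pt)
    nb_all = prev - (1 - prev) * pt / (1 - pt)   # "treat everyone" baseline
    print(f"pt={pt}: model NB={nb_model:.3f}, treat-all NB={nb_all:.3f}")
```

A model is clinically useful at a threshold when its net benefit exceeds both the treat-all and treat-none (NB = 0) baselines.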
ABSTRACT
PURPOSE: Vestibular schwannomas (VSs) represent the most common cerebellopontine angle tumors, posing a challenge in preserving facial nerve (FN) function during surgery. We employed the Extreme Gradient Boosting machine learning classifier to predict long-term FN outcomes (classified as House-Brackmann grades 1-2 for good outcomes and 3-6 for bad outcomes) after VS surgery. METHODS: In a retrospective analysis of 256 patients, comprehensive pre-, intra-, and post-operative factors were examined. We applied the machine learning (ML) classifier Extreme Gradient Boosting (XGBoost) to the binary classification of good versus bad long-term FN outcome after VS surgery. To enhance the interpretability of our model, we utilized an explainable artificial intelligence approach. RESULTS: Short-term FN function (tau = 0.6) correlated with long-term FN function. The model exhibited an average accuracy of 0.83, a ROC AUC score of 0.91, and a Matthews correlation coefficient of 0.62. The most influential feature, identified through SHapley Additive exPlanations (SHAP), was short-term FN function. Conversely, large tumor volume and absence of preoperative auditory brainstem responses were associated with unfavorable outcomes. CONCLUSIONS: We introduce an effective ML model for classifying long-term FN outcomes following VS surgery. Short-term FN function was identified as the key predictor of long-term function. The model's excellent ability to differentiate good and bad outcomes makes it useful for evaluating patients and providing recommendations regarding FN dysfunction management.
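The Matthews correlation coefficient reported above is a balanced single-number summary of a binary confusion matrix. A minimal sketch with hypothetical good/bad outcome labels (1 = good), not the study's data:

```python
from math import sqrt

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1):
    (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical labels for illustration only
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(round(mcc(y_true, y_pred), 2))
```

Unlike accuracy, MCC stays near zero for chance-level predictions even on imbalanced outcome distributions, which is why it is often reported alongside AUC.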
ABSTRACT
BACKGROUND: Smoking is a critical risk factor responsible for over eight million deaths annually worldwide. It is essential to obtain information on smoking habits to advance research and implement preventive measures such as screening of high-risk individuals. In most countries, including Denmark, smoking habits are not systematically recorded and are at best documented within unstructured free-text segments of electronic health records (EHRs). Accessing them would require researchers and clinicians to manually navigate through extensive amounts of unstructured data, which is one of the main reasons that smoking habits are rarely integrated into larger studies. Our aim is to develop machine learning models to classify patients' smoking status from their EHRs. METHODS: This study proposes an efficient natural language processing (NLP) pipeline capable of classifying patients' smoking status and providing explanations for its decisions. The pipeline comprises four distinct components: (1) preprocessing techniques to address abbreviations, punctuation, and other textual irregularities; (2) four cutting-edge feature extraction techniques, i.e. Embedding, BERT, Word2Vec, and Count Vectorizer, employed to extract the optimal features; (3) a Stacking-based Ensemble (SE) model and a Convolutional Long Short-Term Memory Neural Network (CNN-LSTM) for the identification of smoking status; and (4) a local interpretable model-agnostic explanation to explain the decisions rendered by the detection models. The EHRs of 23,132 patients with suspected lung cancer were collected from the Region of Southern Denmark during the period 1/1/2009-31/12/2018. A medical professional annotated the data into 'Smoker' and 'Non-Smoker', with further classifications as 'Active-Smoker', 'Former-Smoker', and 'Never-Smoker'. Subsequently, the annotated dataset was used for the development of binary and multiclass classification models.
An extensive comparison was conducted of the detection performance across various model architectures. RESULTS: The results of experimental validation confirm the consistency among the models. However, for binary classification, the BERT method with the CNN-LSTM architecture outperformed the other models, achieving precision, recall, and F1-scores between 97% and 99% for both Never-Smokers and Active-Smokers. In multiclass classification, the Embedding technique with the CNN-LSTM architecture yielded the most favorable results in class-specific evaluations, with performance measures of 97% for Never-Smoker, 86-89% for Active-Smoker, and 91-92% for Former-Smoker. CONCLUSION: Our proposed NLP pipeline achieved a high level of classification performance. In addition, we presented the explanation of the decisions made by the best-performing detection model. Future work will expand the model's capabilities to analyze longer notes and a broader range of categories to maximize its utility in further research and screening applications.
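Of the four feature extraction techniques named above, the Count Vectorizer is simple enough to sketch in a few lines. The EHR snippets below are hypothetical, not drawn from the Danish dataset:

```python
import re
from collections import Counter

def count_vectorize(docs):
    """Minimal bag-of-words featurizer (Count Vectorizer sketch): lowercase,
    strip punctuation, build a sorted vocabulary, emit per-document counts."""
    tokenized = [re.findall(r"[a-z0-9]+", d.lower()) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in tokenized:
        v = [0] * len(vocab)
        for tok, c in Counter(doc).items():
            v[index[tok]] = c
        vectors.append(v)
    return vocab, vectors

# Hypothetical EHR-style snippets for illustration only
docs = ["Patient smokes 10 cigarettes daily.", "Never smoked. No tobacco use."]
vocab, X = count_vectorize(docs)
print(vocab)
print(X)
```

Production pipelines add stop-word handling, n-grams, and frequency cutoffs, but the core idea of mapping free text to fixed-length count vectors is the same.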
Subjects
Electronic Health Records , Natural Language Processing , Smoking , Humans , Denmark/epidemiology , Electronic Health Records/statistics & numerical data , Smoking/epidemiology , Machine Learning , Female , Male , Middle Aged , Neural Networks, Computer
ABSTRACT
Stream salinization is a global issue, yet few models can provide reliable salinity estimates for unmonitored locations at the time scales required for ecological exposure assessments. Machine learning approaches are presented that use spatially limited high-frequency monitoring and spatially distributed discrete samples to estimate the daily stream-specific conductance across a watershed. We compare the predictive performance of space- and time-unaware Random Forest models and space- and time-aware Recurrent Graph Convolution Neural Network models (KGE: 0.67 and 0.64, respectively) and use explainable artificial intelligence methods to interpret model predictions and understand salinization drivers. These models are applied to the Delaware River Basin, a developed watershed with diverse land uses that experiences anthropogenic salinization from winter deicer applications. These models capture seasonality for the winter first flush of deicers, and the streams with elevated predictions correspond well with indicators of deicer application. This result suggests that these models can be used to identify potential salinity-impaired streams for winter best management practices. Daily salinity predictions are driven primarily by land cover (urbanization) trends that may represent anthropogenic salinization processes and weather at time scales up to three months. Such modeling approaches are likely transferable to other watersheds and can be applied to further understand salinization risks and drivers.
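The KGE score used to compare the two model families combines correlation, variability bias, and mean bias in one number. A self-contained sketch with hypothetical daily specific-conductance values (not the Delaware River Basin data):

```python
from math import sqrt

def kge(sim, obs):
    """Kling-Gupta Efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2),
    where r is the Pearson correlation, alpha the ratio of standard
    deviations, and beta the ratio of means (simulated over observed)."""
    n = len(sim)
    ms, mo = sum(sim) / n, sum(obs) / n
    ss = sqrt(sum((x - ms) ** 2 for x in sim) / n)
    so = sqrt(sum((x - mo) ** 2 for x in obs) / n)
    r = sum((s - ms) * (o - mo) for s, o in zip(sim, obs)) / (n * ss * so)
    alpha, beta = ss / so, ms / mo
    return 1 - sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Hypothetical observed vs. predicted specific conductance (uS/cm)
obs = [210.0, 250.0, 300.0, 280.0, 230.0]
sim = [200.0, 260.0, 290.0, 285.0, 225.0]
print(round(kge(sim, obs), 3))
```

A perfect simulation scores 1; unlike Nash-Sutcliffe efficiency, KGE separates the correlation, variability, and bias components, which helps diagnose where a salinity model fails.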
Subjects
Machine Learning , Rivers , Salinity , Rivers/chemistry , Environmental Monitoring/methods , Seasons , Neural Networks, Computer
ABSTRACT
INTRODUCTION: Deep learning (DL) models offer improved performance in electrocardiogram (ECG)-based classification over rule-based methods. However, for widespread adoption by clinicians, explainability methods, like saliency maps, are essential. METHODS: On a subset of 100 ECGs from patients with chest pain, we generated saliency maps using a previously validated convolutional neural network for occlusion myocardial infarction (OMI) classification. Three clinicians reviewed ECG-saliency map dyads, first assessing the likelihood of OMI from standard ECGs and then evaluating the clinical relevance and helpfulness of the saliency maps, as well as their confidence in the model's predictions. Questions were answered on a Likert scale ranging from +3 (most useful/relevant) to -3 (least useful/relevant). RESULTS: The adjudicated accuracy of the three clinicians matched the DL model on area under the receiver operating characteristic curve (AUC) and F1 score (AUC 0.855 vs. 0.872; F1 score 0.789 vs. 0.747). On average, clinicians found saliency maps slightly clinically relevant (0.96 ± 0.92) and slightly helpful (0.66 ± 0.98) in identifying or ruling out OMI but had higher confidence in the model's predictions (1.71 ± 0.56). Clinicians noted that leads I and aVL were often emphasized, even when obvious ST changes were present in other leads. CONCLUSION: In this clinical usability study, clinicians deemed saliency maps somewhat helpful in enhancing the explainability of DL-based ECG models. The spatial convolutional layers across the 12 leads in these models appear to contribute to the discrepancy between the ECG segments considered most relevant by clinicians and the segments that drove DL model predictions.
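The abstract does not specify how its saliency maps were computed; one common, model-agnostic approach is occlusion, sketched here with a dummy 1-D "model" standing in for the CNN (everything below is illustrative, not the study's method):

```python
def occlusion_saliency(signal, predict, window=4):
    """Occlusion-style saliency: zero out a sliding window and record how much
    the model's score drops. `predict` is any callable scoring a signal."""
    base = predict(signal)
    saliency = [0.0] * len(signal)
    for start in range(0, len(signal) - window + 1):
        occluded = list(signal)
        for i in range(start, start + window):
            occluded[i] = 0.0
        drop = base - predict(occluded)
        for i in range(start, start + window):
            saliency[i] += drop / window   # spread the drop over the window
    return saliency

# Dummy "model": responds to amplitude in the middle of the trace
predict = lambda s: sum(abs(x) for x in s[8:12])
signal = [0.0] * 8 + [1.0] * 4 + [0.0] * 8
sal = occlusion_saliency(signal, predict)
print([round(v, 2) for v in sal])
```

Samples whose removal most reduces the score receive the highest saliency, which is exactly the region the dummy model attends to here.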
ABSTRACT
With the outbreak of COVID-19 in 2020, countries worldwide faced significant concerns and challenges. Various studies have emerged utilizing Artificial Intelligence (AI) and Data Science techniques for disease detection. Although COVID-19 cases have declined, there are still cases and deaths around the world. Therefore, early detection of COVID-19 before the onset of symptoms has become crucial in reducing its extensive impact. Fortunately, wearable devices such as smartwatches have proven to be valuable sources of physiological data, including Heart Rate (HR) and sleep quality, enabling the detection of inflammatory diseases. In this study, we utilize an existing dataset that includes individual step counts and heart rate data to predict the probability of COVID-19 infection before the onset of symptoms. We train three main model architectures, the Gradient Boosting classifier (GB), CatBoost trees, and the TabNet classifier, to analyze the physiological data and compare their respective performances. We also add an interpretability layer to our best-performing model, which clarifies prediction results and allows a detailed assessment of effectiveness. Moreover, we created a private dataset by gathering physiological data from Fitbit devices to guarantee reliability and avoid bias. The identical set of pre-trained models was then applied to this private dataset, and the results were documented. Using the CatBoost tree-based method, our best-performing model outperformed previous studies with an accuracy rate of 85% on the publicly available dataset. Furthermore, this identical pre-trained CatBoost model produced an accuracy of 81% when applied to the private dataset. The source code is available at: https://github.com/OpenUAE-LAB/Covid-19-detection-using-Wearable-data.git .
Subjects
Artificial Intelligence , COVID-19 , Early Diagnosis , Humans , COVID-19/diagnosis , Heart Rate/physiology , Wearable Electronic Devices
ABSTRACT
BACKGROUND: Worldwide, sepsis is the leading cause of death in hospitals. If mortality rates in patients with sepsis can be predicted early, medical resources can be allocated efficiently. We constructed machine learning (ML) models to predict the mortality of patients with sepsis in a hospital emergency department. METHODS: This study prospectively collected nationwide data from an ongoing multicenter cohort of patients with sepsis identified in the emergency department. Patients were enrolled from 19 hospitals between September 2019 and December 2020. For the acquired data from 3,657 survivors and 1,455 deaths, six ML models (logistic regression, support vector machine, random forest, extreme gradient boosting [XGBoost], light gradient boosting machine, and categorical boosting [CatBoost]) were constructed using fivefold cross-validation to predict mortality. Through these models, 44 clinical variables measured on the day of admission were compared with six Sequential Organ Failure Assessment (SOFA) components (PaO2/FIO2 [PF], platelets [PLT], bilirubin, cardiovascular, Glasgow Coma Scale score, and creatinine). Confidence intervals (CIs) were obtained by performing 10,000 repeated measurements via random sampling of the test dataset. All results were explained and interpreted using SHapley Additive exPlanations (SHAP). RESULTS: Across the 5,112 participants, CatBoost exhibited the highest area under the curve (AUC) of 0.800 (95% CI, 0.756-0.840) using clinical variables. Using the SOFA components for the same patients, XGBoost exhibited the highest AUC of 0.678 (95% CI, 0.626-0.730). As interpreted by SHAP, albumin, lactate, blood urea nitrogen, and international normalized ratio significantly affected the results. Additionally, PF and PLT among the SOFA components significantly influenced the prediction results. CONCLUSION: The newly established ML-based models achieved good prediction of mortality in patients with sepsis. Using several clinical variables acquired at baseline can provide more accurate early predictions than using SOFA components. Additionally, the impact of each variable was identified.
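The "10,000 repeated measurements via random sampling of the test dataset" corresponds to a percentile bootstrap. A sketch on a toy test set with plain accuracy standing in for AUC (labels below are hypothetical):

```python
import random

def bootstrap_ci(y_true, y_pred, metric, n_boot=10_000, alpha=0.05, seed=42):
    """Percentile bootstrap CI: resample the test set with replacement n_boot
    times, then take the empirical (alpha/2, 1-alpha/2) metric quantiles."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

accuracy = lambda t, p: sum(a == b for a, b in zip(t, p)) / len(t)
# Hypothetical test-set labels and predictions
y_true = [1,0,1,1,0,0,1,0,1,1,0,1,0,0,1,1,0,1,0,1]
y_pred = [1,0,1,0,0,0,1,1,1,1,0,1,0,1,1,1,0,0,0,1]
lo, hi = bootstrap_ci(y_true, y_pred, accuracy, n_boot=2000)
print(f"accuracy={accuracy(y_true, y_pred):.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

Substituting an AUC function for `accuracy` gives intervals like the 0.800 (0.756-0.840) reported above.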
Subjects
Emergency Service, Hospital , Sepsis , Humans , Albumins , Lactic Acid , Machine Learning , Sepsis/diagnosis
ABSTRACT
Machine learning (ML) has found widespread application in various domains, and ML-based techniques have been employed to address security issues in technology, with numerous studies showcasing their potential and effectiveness in tackling security problems. Over the years, ML methods for identifying malicious software have been developed across various security domains. However, recent research has highlighted the susceptibility of ML models to small input perturbations, known as adversarial examples, which can significantly alter model predictions. While prior studies on adversarial examples primarily focused on ML models for image processing, they have progressively extended to other applications, including security. Notably, adversarial attacks have proven particularly effective in the realm of malware classification. This study aims to explore the transparency of malware classification and to develop an explanation method for malware classifiers. The challenge is more complex than explainable AI for homogeneous data, because malware has a far more intricate data structure than traditional image datasets. The research revealed that existing explanations fall short in interpreting heterogeneous data. Our methods demonstrated that current malware detectors, despite high classification accuracy, may provide a misleading sense of security, and that measuring classification accuracy alone is insufficient for validating detectors.
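The "small input perturbations" idea can be illustrated with the Fast Gradient Sign Method against a hand-coded logistic "detector"; the features and weights below are invented for illustration and are not any study's model or attack:

```python
from math import exp

# Hand-coded logistic "malware detector" over hypothetical feature values
# (e.g. normalized byte-entropy, import counts); weights are illustrative.
W = [2.0, -1.0, 1.5]
B = -0.5

def predict(x):
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1 / (1 + exp(-z))          # P(malicious)

def fgsm(x, y, eps=0.2):
    """Fast Gradient Sign Method: for logistic loss, the input gradient is
    (p - y) * W, so each feature steps by eps times the sign of that."""
    p = predict(x)
    grad = [(p - y) * w for w in W]
    return [xi + eps * (1 if g > 0 else -1 if g < 0 else 0)
            for xi, g in zip(x, grad)]

x = [0.8, 0.1, 0.9]                   # initially classified as malicious
x_adv = fgsm(x, y=1)                  # small step that lowers P(malicious)
print(round(predict(x), 3), round(predict(x_adv), 3))
```

Even this tiny bounded perturbation noticeably lowers the detector's confidence; real malware attacks must additionally keep the perturbed binary functional, which is part of why the heterogeneous-data setting is harder than images.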
ABSTRACT
The early prediction of ocular disease is an essential concern in ophthalmic medicine. Although modern scientific discoveries have shown the potential to treat eye diseases using artificial intelligence (AI) and machine learning, explainable AI remains a crucial challenge confronting this area of research. Traditional methods, despite significant effort, cannot accurately predict ocular diseases. Moreover, incorporating AI into eye-disease diagnosis in healthcare complicates the situation, as the complexity of AI decision-making is a significant concern, especially in high-stakes areas like ocular disease prediction. A lack of transparency in AI models may undermine the confidence and trust of doctors and patients, as well as their perception of AI and its abilities. Accordingly, explainable AI is significant for ensuring trust in the technology, enhancing clinical decision-making, and deploying ocular disease detection. This research proposes an efficient transfer learning model for eye disease prediction, aiming to transform smart-vision potential in the healthcare sector and to meet the challenges of conventional approaches while integrating explainable artificial intelligence (XAI). The integration of XAI in the proposed model ensures the transparency of the decision-making process through the comprehensive provision of rationale. The proposed model provides promising results, with 95.74% accuracy, and illustrates the transformative potential of XAI in advancing ocular healthcare. This milestone underscores the effectiveness of the proposed model in accurately determining various types of ocular disease, and the model performs better than previously published methods.
Subjects
Artificial Intelligence , Eye Diseases , Humans , Eye Diseases/diagnosis , Machine Learning , Algorithms
ABSTRACT
Recently, explainability in machine and deep learning has become an important area of research and interest, owing both to the increasing use of artificial intelligence (AI) methods and to the need to understand the decisions made by models. Interest in explainable artificial intelligence (XAI) stems from growing demands in, among other things, data mining, error elimination, and assessing the learning performance of various AI algorithms. Moreover, XAI makes the decisions made by models more transparent as well as more effective. In this study, models from the 'glass box' group (including Decision Tree) and the 'black box' group (including Random Forest) were proposed for identifying selected types of currant powders. The learning process of these models was carried out to determine performance indicators such as accuracy, precision, recall, and F1-score, and was visualized using Local Interpretable Model-agnostic Explanations (LIME) to predict the effectiveness of identifying specific types of blackcurrant powders based on texture descriptors such as entropy, contrast, correlation, dissimilarity, and homogeneity. Bagging (Bagging_100), Decision Tree (DT0), and Random Forest (RF7_gini) proved to be the most effective models in the framework of currant powder interpretability. The classifier performance measures of accuracy, precision, recall, and F1-score all reached approximately 0.979 for Bagging_100, while DT0 reached values of 0.968, 0.972, 0.968, and 0.969, and RF7_gini reached values of 0.963, 0.964, 0.963, and 0.963. All of these models achieved classifier performance measures greater than 96%. In the future, XAI using agnostic models can be an additional important tool to help analyze data, including food products, even online.
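The texture descriptors named above (contrast, dissimilarity, homogeneity, entropy) are typically derived from a gray-level co-occurrence matrix (GLCM). A sketch on a toy 4-level image standing in for a powder micrograph (not the study's data or exact descriptor set):

```python
import math

def glcm(img, dx=1, dy=0, levels=4):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    P = [[0.0] * levels for _ in range(levels)]
    total = 0
    for y in range(len(img) - dy):
        for x in range(len(img[0]) - dx):
            P[img[y][x]][img[y + dy][x + dx]] += 1
            total += 1
    return [[v / total for v in row] for row in P]

def descriptors(P):
    """Haralick-style texture descriptors computed from a normalized GLCM."""
    d = {"contrast": 0.0, "dissimilarity": 0.0,
         "homogeneity": 0.0, "entropy": 0.0}
    for i, row in enumerate(P):
        for j, p in enumerate(row):
            d["contrast"] += p * (i - j) ** 2
            d["dissimilarity"] += p * abs(i - j)
            d["homogeneity"] += p / (1 + (i - j) ** 2)
            if p > 0:
                d["entropy"] -= p * math.log2(p)
    return d

# Tiny 4-level toy image
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [2, 2, 3, 3],
       [2, 2, 3, 3]]
d = descriptors(glcm(img))
print(d)
```

Vectors of such descriptors are exactly the kind of tabular input the Decision Tree, Random Forest, and Bagging classifiers above consume, and LIME can then attribute a prediction back to individual descriptors.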
Subjects
Algorithms , Artificial Intelligence , Machine Learning , Powders , Ribes , Powders/chemistry , Ribes/chemistry , Decision Trees
ABSTRACT
This study presents a novel approach to predicting price fluctuations for U.S. sector index ETFs. By leveraging information-theoretic measures like mutual information and transfer entropy, we constructed threshold networks highlighting nonlinear dependencies between log returns and trading volume rate changes. We derived centrality measures and node embeddings from these networks, offering unique insights into the ETFs' dynamics. By integrating these features into gradient-boosting algorithm-based models, we significantly enhanced the predictive accuracy. Our approach offers improved forecast performance for U.S. sector index futures and adds a layer of explainability to the existing literature.
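Of the information-theoretic measures mentioned, mutual information between discretized series is straightforward to sketch; the sign-of-return series below are toy inputs, not ETF data:

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """MI in bits between two discrete series:
    sum over (x, y) of p(x,y) * log2(p(x,y) / (p(x) * p(y)))."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in pxy.items():
        pj = c / n
        mi += pj * log2(pj / ((px[x] / n) * (py[y] / n)))
    return mi

# Toy sign-of-return series for hypothetical sector ETFs
a = [1, 1, 0, 0, 1, 0, 1, 0]
b = [1, 1, 0, 0, 1, 0, 1, 0]   # identical to a: maximal dependence
c = [1, 0, 1, 0, 1, 0, 1, 0]   # partially dependent on a
print(mutual_information(a, b), mutual_information(a, c))
```

Thresholding a matrix of such pairwise scores yields the kind of dependency network described above; transfer entropy extends the same counting idea with time-lagged conditioning to capture directionality.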
ABSTRACT
Background: Cardiovascular diseases (CVD) remain the predominant global cause of mortality, with both low and high temperatures increasing CVD-related mortalities. Climate change impacts human health directly through temperature fluctuations and indirectly via factors like disease vectors. Elevated and reduced temperatures have been linked to increases in CVD-related hospitalizations and mortality, with various studies worldwide confirming the significant health implications of temperature variations and air pollution on cardiovascular outcomes. Methods: A database of daily Emergency Room admissions at the Giovanni XIII Polyclinic in Bari (Southern Italy) was developed, spanning from 2013 to 2019, including weather and air quality data. A Random Forest (RF) supervised machine learning model was used to simulate the trend of hospital admissions for CVD. The Seasonal and Trend decomposition using Loess (STL) model separated the trend component, while cross-validation techniques were employed to prevent overfitting. Model performance was assessed using specific metrics and error analysis. Additionally, the SHapley Additive exPlanations (SHAP) method, a feature importance technique within the eXplainable Artificial Intelligence (XAI) framework, was used to identify the feature importance. Results: An R² of 0.97 and a Mean Absolute Error of 0.36 admissions were achieved by the model. Atmospheric pressure, minimum temperature, and carbon monoxide were found to collectively contribute about 74% to the model's predictive power, with atmospheric pressure being the dominant factor at 37%. Conclusions: This research underscores the significant influence of weather-climate variables on cardiovascular diseases. The identified key climate factors provide a practical framework for policymakers and healthcare professionals to mitigate the adverse effects of climate change on CVD and devise preventive strategies.
ABSTRACT
IoT devices have grown in popularity in recent years. Statistics show that the number of online IoT devices exceeded 35 billion in 2022. This rapid growth in adoption has made these devices an obvious target for malicious actors. Attacks such as botnets and malware injection usually start with a reconnaissance phase to gather information about the target IoT device before exploitation. In this paper, we introduce a machine-learning-based detection system for reconnaissance attacks, built on an explainable ensemble model. Our proposed system aims to detect the scanning and reconnaissance activity of IoT devices and counter these attacks at an early stage of the attack campaign. The proposed system is designed to be efficient and lightweight so that it can operate in severely resource-constrained environments. When tested, the implementation of the proposed system delivered an accuracy of 99%. Furthermore, the proposed system showed low false positive and false negative rates of 0.6% and 0.05%, respectively, while maintaining high efficiency and low resource consumption.
Subjects
Learning , Machine Learning
ABSTRACT
Despite several existing techniques for distributed sensing (temperature and strain) using standard Single-Mode optical Fiber (SMF), compensating or decoupling both effects is mandatory for many applications. Currently, most decoupling techniques require special optical fibers and are difficult to implement with high-spatial-resolution distributed techniques, such as OFDR. Therefore, this work's objective is to study the feasibility of decoupling temperature and strain out of the readouts of a phase and polarization analyzer OFDR (φ-PA-OFDR) taken over an SMF. For this purpose, the readouts will be subjected to a study using several machine learning algorithms, among them Deep Neural Networks. The motivation that underlies this target is the current blockage in the widespread use of Fiber Optic Sensors in situations where both strain and temperature change, due to the coupled dependence of currently developed sensing methods. Instead of using other types of sensors or even other interrogation methods, the objective of this work is to analyze the available information in order to develop a sensing method capable of providing information about strain and temperature simultaneously.
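In its simplest linear form, the temperature-strain decoupling problem is a 2x2 matrix inversion: two observables with distinct sensitivities to temperature and strain can be inverted to recover both. The sensitivity coefficients below are hypothetical, not calibrated φ-PA-OFDR values:

```python
# Hypothetical sensitivity matrix: each observable responds linearly to
# temperature change dT and strain change de with different coefficients.
K = [[10.0, 1.2],   # observable 1: dO1 = 10.0*dT + 1.2*de
     [0.8,  6.0]]   # observable 2: dO2 =  0.8*dT + 6.0*de

def decouple(o1, o2):
    """Invert the 2x2 sensitivity system to recover (dT, de)."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    dT = ( K[1][1] * o1 - K[0][1] * o2) / det
    de = (-K[1][0] * o1 + K[0][0] * o2) / det
    return dT, de

# Forward-simulate a known (dT, de), then recover it from the observables
dT_true, de_true = 2.5, 0.4
o1 = K[0][0] * dT_true + K[0][1] * de_true
o2 = K[1][0] * dT_true + K[1][1] * de_true
dT_est, de_est = decouple(o1, o2)
print(dT_est, de_est)
```

The machine-learning approach in the abstract generalizes this: when the sensor response is nonlinear or the sensitivity matrix is ill-conditioned, a learned model replaces the explicit inversion.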
Subjects
Algorithms , Neural Networks, Computer , Feasibility Studies , Temperature , Fiber Optic Technology
ABSTRACT
Terminal neurological conditions can affect millions of people worldwide and hinder them from doing their daily tasks and movements normally. Brain computer interface (BCI) is the best hope for many individuals with motor deficiencies. It will help many patients interact with the outside world and handle their daily tasks without assistance. Therefore, machine learning-based BCI systems have emerged as non-invasive techniques for reading out signals from the brain and interpreting them into commands to help those people to perform diverse limb motor tasks. This paper proposes an innovative and improved machine learning-based BCI system that analyzes EEG signals obtained from motor imagery to distinguish among various limb motor tasks based on BCI competition III dataset IVa. The proposed framework pipeline for EEG signal processing performs the following major steps. The first step uses a meta-heuristic optimization technique, called the whale optimization algorithm (WOA), to select the optimal features for discriminating between neural activity patterns. The pipeline then uses machine learning models such as LDA, k-NN, DT, RF, and LR to analyze the chosen features to enhance the precision of EEG signal analysis. The proposed BCI system, which merges the WOA as a feature selection method and the optimized k-NN classification model, demonstrated an overall accuracy of 98.6%, outperforming other machine learning models and previous techniques on the BCI competition III dataset IVa. Additionally, the EEG feature contribution in the ML classification model is reported using Explainable AI (XAI) tools, which provide insights into the individual contributions of the features in the predictions made by the model. By incorporating XAI techniques, the results of this study offer greater transparency and understanding of the relationship between the EEG features and the model's predictions. 
The proposed method shows potential for controlling diverse limb motor tasks, helping people with limb impairments and supporting them while enhancing their quality of life.
Subjects
Brain-Computer Interfaces , Quality of Life , Electroencephalography/methods , Algorithms , Machine Learning
ABSTRACT
Hepatocellular carcinoma (HCC) is one of the most common cancers worldwide, and the number of cases is constantly increasing. Early and accurate HCC diagnosis is crucial to improving the effectiveness of treatment. The aim of the study is to develop a supervised learning framework based on hierarchical community detection and artificial intelligence in order to classify patients and controls using publicly available microarray data. With our methodology, we identified 20 gene communities that discriminated between healthy and cancerous samples, with an accuracy exceeding 90%. We validated the performance of these communities on an independent dataset, and with two of them, we reached an accuracy exceeding 80%. Then, we focused on two communities, selected because they were enriched with relevant biological functions, and on these we applied an explainable artificial intelligence (XAI) approach to analyze the contribution of each gene to the classification task. In conclusion, the proposed framework provides an effective methodological and quantitative tool helping to find gene communities, which may uncover pivotal mechanisms responsible for HCC and thus discover new biomarkers.
Subjects
Carcinoma, Hepatocellular , Liver Neoplasms , Humans , Carcinoma, Hepatocellular/diagnosis , Carcinoma, Hepatocellular/genetics , Artificial Intelligence , Liver Neoplasms/diagnosis , Liver Neoplasms/genetics , Genetic Markers , Health Status
ABSTRACT
Deep learning networks powered by AI are essential predictive tools relying on image data availability and processing hardware advancements. However, little attention has been paid to explainable AI (XAI) in application fields, including environmental management. This study develops an explainability framework with a triadic structure focusing on input, AI model, and output. The framework provides three main contributions: (1) a context-based augmentation of input data to maximize generalizability and minimize overfitting; (2) direct monitoring of AI model layers and parameters to use leaner (lighter) networks suitable for edge-device deployment; and (3) an output explanation procedure focusing on the interpretability and robustness of predictive decisions by AI networks. These contributions significantly advance the state of the art in XAI for environmental management research, offering implications for improved understanding and utilization of AI networks in this field.
Subjects
Conservation of Natural Resources , Deep Learning
ABSTRACT
In this work, a computational scheme is proposed to identify the main combinations of handcrafted descriptors and deep-learned features capable of classifying histological images stained with hematoxylin and eosin. The handcrafted descriptors were representative of multiscale and multidimensional fractal techniques (fractal dimension, lacunarity and percolation) applied to quantify the histological images, with the corresponding representations via explainable artificial intelligence (xAI) approaches. The deep-learned features were obtained from different convolutional neural networks (DenseNet-121, EfficientNet-b2, Inception-V3, ResNet-50 and VGG-19). The descriptors were investigated through different associations. The most relevant combinations, defined through a ranking algorithm, were analyzed via a heterogeneous ensemble of classifiers with the support vector machine, naive Bayes, random forest and K-nearest neighbors algorithms. The proposed scheme was applied to histological samples representative of breast cancer, colorectal cancer, oral dysplasia and liver tissue. The best results were accuracy rates of 94.83% to 100%, with the identification of pattern ensembles for classifying multiple histological images. The computational scheme indicated solutions exploring a reduced number of features (a maximum of 25 descriptors) and with better performance values than those observed in the literature. The information presented in this study is useful for complementing and improving the development of computer-aided diagnosis focused on histological images.
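Of the fractal techniques named, the box-counting estimate of fractal dimension is easy to sketch: count occupied boxes at several scales and fit the log-log slope. The toy point set below is illustrative, not histology data; a filled square should recover a dimension close to 2:

```python
from math import log

def box_counting_dimension(points, sizes=(1, 2, 4, 8)):
    """Estimate fractal dimension: count occupied boxes N(s) at several box
    sizes s, then fit the slope of log N(s) vs log(1/s) by least squares."""
    xs, ys = [], []
    for s in sizes:
        boxes = {(x // s, y // s) for x, y in points}
        xs.append(log(1 / s))
        ys.append(log(len(boxes)))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# A filled 16x16 square of pixels: a plane-filling set of dimension 2
square = [(x, y) for x in range(16) for y in range(16)]
print(round(box_counting_dimension(square), 2))
```

Applied to a segmented tissue mask, irregular boundary structures yield non-integer dimensions, which is what makes the descriptor discriminative for the histology classes above.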
ABSTRACT
The outbreak of the COVID-19 pandemic spurred the global media into a gallop of reports and news on the novel Coronavirus. The intensity of the news chatter on various aspects of the pandemic, in conjunction with its sentiment, accounts for the uncertainty of investors linked to financial markets. In this research, Artificial Intelligence (AI)-driven frameworks are propounded to gauge the proliferation of COVID-19 news into Indian stock markets through the lens of predictive modelling. Two hybrid predictive frameworks, UMAP-LSTM and ISOMAP-GBR, have been constructed to accurately forecast the daily stock prices of 10 Indian companies across different industry verticals, using several systematic media chatter indices related to the COVID-19 pandemic alongside orthodox technical indicators and macroeconomic variables. The outcome of this rigorous predictive exercise demonstrates the utility of monitoring relevant media news worldwide and in India. Additional model interpretation using Explainable AI (XAI) methodologies indicates that a high quantum of overall media hype, media coverage, fake news, etc., leads to bearish market regimes.