Search | VHL Regional Portal

Intelligent personalized shopping recommendation using clustering and supervised machine learning algorithms.

Chabane, Nail; Bouaoune, Achraf; Tighilt, Reda; Abdar, Moloud; Boc, Alix; Lord, Etienne; Tahiri, Nadia; Mazoure, Bogdan; Acharya, U Rajendra; Makarenkov, Vladimir.

PLoS One ; 17(12): e0278364, 2022.

Article in English | MEDLINE | ID: mdl-36454766

ABSTRACT

Next basket recommendation is a critical task in market basket data analysis. It is particularly important in grocery shopping, where grocery lists are an essential part of shopping habits of many customers. In this work, we first present a new grocery Recommender System available on the MyGroceryTour platform. Our online system uses different traditional machine learning (ML) and deep learning (DL) algorithms, and provides recommendations to users in a real-time manner. It aims to help Canadian customers create their personalized intelligent weekly grocery lists based on their individual purchase histories, weekly specials offered in local stores, and product cost and availability information. We perform clustering analysis to partition given customer profiles into four non-overlapping clusters according to their grocery shopping habits. Then, we conduct computational experiments to compare several traditional ML algorithms and our new DL algorithm based on the use of a gated recurrent unit (GRU)-based recurrent neural network (RNN) architecture. Our DL algorithm can be viewed as an extension of DREAM (Dynamic REcurrent bAsket Model) adapted to multi-class (i.e. multi-store) classification, since a given user can purchase recommended products in different grocery stores in which these products are available. Among traditional ML algorithms, the highest average F-score of 0.516 for the considered data set of 831 customers was obtained using Random Forest, whereas our proposed DL algorithm yielded the average F-score of 0.559 for this data set. The main advantage of the presented Recommender System is that our intelligent recommendation is personalized, since a separate traditional ML or DL model is built for each customer considered. Such a personalized approach allows us to outperform the prediction results provided by general state-of-the-art DL models.

Subject(s)

Algorithms , Supervised Machine Learning , Canada , Cluster Analysis , Machine Learning
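
The abstract above reports average F-scores but does not say how they were computed. As a minimal sketch, assuming a set-based F-score per customer basket averaged over customers, the metric could be computed as follows. The function names `basket_f_score` and `average_f_score` are hypothetical and do not come from the paper or the MyGroceryTour platform.

```python
def basket_f_score(recommended, purchased):
    """F-score (harmonic mean of precision and recall) for one basket."""
    recommended, purchased = set(recommended), set(purchased)
    hits = len(recommended & purchased)
    if hits == 0:
        return 0.0  # no overlap: precision and recall are both zero
    precision = hits / len(recommended)
    recall = hits / len(purchased)
    return 2 * precision * recall / (precision + recall)


def average_f_score(pairs):
    """Mean F-score over (recommended, purchased) pairs, one per customer."""
    scores = [basket_f_score(rec, bought) for rec, bought in pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

Under this assumption, a customer who was recommended three products and bought two of them, along with two products that were not recommended, has precision 2/3 and recall 1/2.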

DUNEScan: a web server for uncertainty estimation in skin cancer detection with deep neural networks.

Mazoure, Bogdan; Mazoure, Alexander; Bédard, Jocelyn; Makarenkov, Vladimir.

Sci Rep ; 12(1): 179, 2022 01 07.

Article in English | MEDLINE | ID: mdl-34996997

ABSTRACT

Recent years have seen a steep rise in the number of skin cancer detection applications. While modern advances in deep learning have made it possible to reach new heights in classification accuracy, no publicly available skin cancer detection software provides confidence estimates for these predictions. We present DUNEScan (Deep Uncertainty Estimation for Skin Cancer), a web server that performs an intuitive in-depth analysis of uncertainty in commonly used skin cancer classification models based on convolutional neural networks (CNNs). DUNEScan allows users to upload a skin lesion image, and quickly compares the mean and the variance estimates provided by a number of new and traditional CNN models. Moreover, our web server uses the Grad-CAM and UMAP algorithms to visualize the classification manifold for the user's input, hence providing crucial information about its closeness to skin lesion images from the popular ISIC database. DUNEScan is freely available at: https://www.dunescan.org.

Subject(s)

Deep Learning , Diagnosis, Computer-Assisted , Image Interpretation, Computer-Assisted , Internet , Photography , Skin Neoplasms/pathology , Decision Support Techniques , Humans , Predictive Value of Tests , Reproducibility of Results , Skin Neoplasms/classification , Uncertainty

Uncertainty quantification in skin cancer classification using three-way decision-based Bayesian deep learning.

Abdar, Moloud; Samami, Maryam; Dehghani Mahmoodabad, Sajjad; Doan, Thang; Mazoure, Bogdan; Hashemifesharaki, Reza; Liu, Li; Khosravi, Abbas; Acharya, U Rajendra; Makarenkov, Vladimir; Nahavandi, Saeid.

Comput Biol Med ; 135: 104418, 2021 08.

Article in English | MEDLINE | ID: mdl-34052016

ABSTRACT

Accurate automated medical image recognition, including classification and segmentation, is one of the most challenging tasks in medical image analysis. Recently, deep learning methods have achieved remarkable success in medical image classification and segmentation, clearly becoming the state-of-the-art methods. However, most of these methods are unable to provide uncertainty quantification (UQ) for their output, often being overconfident, which can lead to disastrous consequences. Bayesian Deep Learning (BDL) methods can be used to quantify uncertainty of traditional deep learning methods, and thus address this issue. We apply three uncertainty quantification methods to deal with uncertainty during skin cancer image classification. They are as follows: Monte Carlo (MC) dropout, Ensemble MC (EMC) dropout and Deep Ensemble (DE). To further resolve the remaining uncertainty after applying the MC, EMC and DE methods, we describe a novel hybrid dynamic BDL model, taking into account uncertainty, based on the Three-Way Decision (TWD) theory. The proposed dynamic model enables us to use different UQ methods and different deep neural networks in distinct classification phases. So, the elements of each phase can be adjusted according to the dataset under consideration. In this study, two best UQ methods (i.e., DE and EMC) are applied in two classification phases (the first and second phases) to analyze two well-known skin cancer datasets, preventing one from making overconfident decisions when it comes to diagnosing the disease. The accuracy and the F1-score of our final solution are, respectively, 88.95% and 89.00% for the first dataset, and 90.96% and 91.00% for the second dataset. Our results suggest that the proposed TWDBDL model can be used effectively at different stages of medical image analysis.

Subject(s)

Deep Learning , Skin Neoplasms , Bayes Theorem , Humans , Neural Networks, Computer , Skin Neoplasms/diagnostic imaging , Uncertainty
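
The general idea behind Monte Carlo style uncertainty estimates combined with a three-way decision can be sketched in a few lines. This is an illustration under stated assumptions, not the TWDBDL model from the paper: the stochastic forward passes are given as plain lists of class probabilities, and the `accept` and `reject` thresholds are hypothetical values chosen for the example.

```python
import math


def mean_probs(passes):
    """Average class probabilities over several stochastic forward passes."""
    n = len(passes)
    return [sum(p[k] for p in passes) / n for k in range(len(passes[0]))]


def predictive_entropy(probs):
    """Shannon entropy (in nats) of a probability vector; higher means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def three_way_decision(p_positive, accept=0.8, reject=0.2):
    """Return 'accept', 'reject', or 'defer' using hypothetical thresholds."""
    if p_positive >= accept:
        return "accept"
    if p_positive <= reject:
        return "reject"
    return "defer"  # boundary region: leave the case for a later phase


# Three simulated passes for one image; class 0 is the positive class here.
passes = [[0.95, 0.05], [0.90, 0.10], [0.85, 0.15]]
avg = mean_probs(passes)
decision = three_way_decision(avg[0])
```

In a scheme of this kind, cases that land in the "defer" region could be handed to a second classification phase that uses a different uncertainty quantification method or network.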

Horizontal gene transfer and recombination analysis of SARS-CoV-2 genes helps discover its close relatives and shed light on its origin.

Makarenkov, Vladimir; Mazoure, Bogdan; Rabusseau, Guillaume; Legendre, Pierre.

BMC Ecol Evol ; 21(1): 5, 2021 01 21.

Article in English | MEDLINE | ID: mdl-33514319

ABSTRACT

BACKGROUND: The SARS-CoV-2 pandemic is one of the greatest global medical and social challenges that have emerged in recent history. Human coronavirus strains discovered during previous SARS outbreaks have been hypothesized to pass from bats to humans using intermediate hosts, e.g. civets for SARS-CoV and camels for MERS-CoV. The discovery of an intermediate host of SARS-CoV-2 and the identification of the specific mechanism of its emergence in humans are topics of primary evolutionary importance. In this study we investigate the evolutionary patterns of the 11 main genes of SARS-CoV-2. Previous studies suggested that the genome of SARS-CoV-2 is highly similar to the horseshoe bat coronavirus RaTG13 for most of the genes and to some Malayan pangolin coronavirus (CoV) strains for the receptor binding (RB) domain of the spike protein. RESULTS: We provide a detailed list of statistically significant horizontal gene transfer and recombination events (both intergenic and intragenic) inferred for each of the 11 main genes of the SARS-CoV-2 genome. Our analysis reveals that two continuous regions of genes S and N of SARS-CoV-2 may result from intragenic recombination between RaTG13 and Guangdong (GD) Pangolin CoVs. Statistically significant gene transfer-recombination events between RaTG13 and GD Pangolin CoV have been identified in region [1215-1425] of gene S and region [534-727] of gene N. Moreover, some statistically significant recombination events between the ancestors of SARS-CoV-2, RaTG13, GD Pangolin CoV and bat CoV ZC45-ZXC21 coronaviruses have been identified in genes ORF1ab, S, ORF3a, ORF7a, ORF8 and N. Furthermore, topology-based clustering of gene trees inferred for 25 CoV organisms revealed a three-way evolution of coronavirus genes, with gene phylogenies of ORF1ab, S and N forming the first cluster, gene phylogenies of ORF3a, E, M, ORF6, ORF7a, ORF7b and ORF8 forming the second cluster, and phylogeny of gene ORF10 forming the third cluster. CONCLUSIONS: The results of our horizontal gene transfer and recombination analysis suggest that SARS-CoV-2 could not only be a chimera virus resulting from recombination of the bat RaTG13 and Guangdong pangolin coronaviruses but also a close relative of the bat CoV ZC45 and ZXC21 strains. They also indicate that a GD pangolin may be an intermediate host of this dangerous virus.

Subject(s)

COVID-19 , SARS-CoV-2 , Animals , Evolution, Molecular , Gene Transfer, Horizontal , Genome, Viral/genetics , Humans

Identification and Correction of Additive and Multiplicative Spatial Biases in Experimental High-Throughput Screening.

Mazoure, Bogdan; Caraus, Iurie; Nadon, Robert; Makarenkov, Vladimir.

SLAS Discov ; 23(5): 448-458, 2018 06.

Article in English | MEDLINE | ID: mdl-29346010

ABSTRACT

Data generated by high-throughput screening (HTS) technologies are prone to spatial bias. Traditionally, bias correction methods used in HTS assume either a simple additive or, more recently, a simple multiplicative spatial bias model. These models do not, however, always provide an accurate correction of measurements in wells located at the intersection of rows and columns affected by spatial bias. The measurements in these wells depend on the nature of interaction between the involved biases. Here, we propose two novel additive and two novel multiplicative spatial bias models accounting for different types of bias interactions. We describe a statistical procedure that allows for detecting and removing different types of additive and multiplicative spatial biases from multiwell plates. We show how this procedure can be applied by analyzing data generated by the four HTS technologies (homogeneous, microorganism, cell-based, and gene expression HTS), the three high-content screening (HCS) technologies (area, intensity, and cell-count HCS), and the only small-molecule microarray technology available in the ChemBank small-molecule screening database. The proposed methods are included in the AssayCorrector program, implemented in R, and available on CRAN.

Subject(s)

High-Throughput Screening Assays/methods , Bias , Databases, Chemical , Drug Discovery/methods , Small Molecule Libraries/chemistry
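
The difference between additive and multiplicative spatial bias can be illustrated with a simple one-pass median correction. This is a sketch under stated assumptions, not the procedure implemented in AssayCorrector: it assumes row and column effects that either add to or scale the true signal, it requires positive, nonzero medians in the multiplicative case, and it does not test which model fits the data. The function names are hypothetical.

```python
from statistics import median


def _column(plate, j):
    """Values of column j of a plate given as a list of rows."""
    return [row[j] for row in plate]


def correct_additive(plate):
    """Remove row and column offsets, assuming x_ij = mu + r_i + c_j + noise."""
    mu = median(v for row in plate for v in row)
    rows = [[v - (median(row) - mu) for v in row] for row in plate]
    col_off = [median(_column(rows, j)) - mu for j in range(len(rows[0]))]
    return [[v - col_off[j] for j, v in enumerate(row)] for row in rows]


def correct_multiplicative(plate):
    """Remove row and column factors, assuming x_ij = mu * r_i * c_j * noise."""
    mu = median(v for row in plate for v in row)
    rows = [[v / (median(row) / mu) for v in row] for row in plate]
    col_fac = [median(_column(rows, j)) / mu for j in range(len(rows[0]))]
    return [[v / col_fac[j] for j, v in enumerate(row)] for row in rows]


# A flat plate whose middle row is shifted by +5 (additive bias) ...
additive_plate = [[10, 10, 10], [15, 15, 15], [10, 10, 10]]
# ... and one whose middle row is scaled by 2 (multiplicative bias).
multiplicative_plate = [[10, 10, 10], [20, 20, 20], [10, 10, 10]]

fixed_add = correct_additive(additive_plate)
fixed_mul = correct_multiplicative(multiplicative_plate)
```

Applying the wrong model, for example subtracting offsets from a plate whose bias is multiplicative, generally leaves a residual pattern, which is one reason the choice of model matters.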

Identification and correction of spatial bias are essential for obtaining quality data in high-throughput screening technologies.

Mazoure, Bogdan; Nadon, Robert; Makarenkov, Vladimir.

Sci Rep ; 7(1): 11921, 2017 09 20.

Article in English | MEDLINE | ID: mdl-28931934

ABSTRACT

Spatial bias continues to be a major challenge in high-throughput screening technologies. Its successful detection and elimination are critical for identifying the most promising drug candidates. Here, we examine experimental small molecule assays from the popular ChemBank database and show that screening data are widely affected by both assay-specific and plate-specific spatial biases. Importantly, the bias affecting screening data can fit an additive or multiplicative model. We show that the use of appropriate statistical methods is essential for improving the quality of experimental screening data. The presented methodology can be recommended for the analysis of current and next-generation screening data.

Detecting and removing multiplicative spatial bias in high-throughput screening technologies.

Caraus, Iurie; Mazoure, Bogdan; Nadon, Robert; Makarenkov, Vladimir.

Bioinformatics ; 33(20): 3258-3267, 2017 Oct 15.

Article in English | MEDLINE | ID: mdl-28633418

ABSTRACT

MOTIVATION: Considerable attention has been paid recently to improving data quality in high-throughput screening (HTS) and high-content screening (HCS) technologies widely used in drug development and chemical toxicity research. However, several environmentally and procedurally induced spatial biases in experimental HTS and HCS screens decrease measurement accuracy, leading to increased numbers of false positives and false negatives in hit selection. Although effective bias correction methods and software have been developed over the past decades, almost all of these tools have been designed to reduce the effect of additive bias only. Here, we address the case of multiplicative spatial bias. RESULTS: We introduce three new statistical methods meant to reduce multiplicative spatial bias in screening technologies. We assess the performance of the methods with synthetic and real data affected by multiplicative spatial bias, including comparisons with current bias correction methods. We also describe a wider data correction protocol that integrates methods for removing both assay and plate-specific spatial biases, which can be either additive or multiplicative. CONCLUSIONS: The methods for removing multiplicative spatial bias and the data correction protocol are effective in detecting and cleaning experimental data generated by screening technologies. As our protocol is of a general nature, it can be used by researchers analyzing current or next-generation high-throughput screens. AVAILABILITY AND IMPLEMENTATION: The AssayCorrector program, implemented in R, is available on CRAN. CONTACT: makarenkov.vladimir@uqam.ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Biological Assay/methods , Computational Biology/methods , High-Throughput Screening Assays/methods , Software , Bias , Drug Discovery/methods , HIV Infections/drug therapy , Humans , Toxicology/methods
