Results 1 - 8 of 8
1.
Anal Chem ; 93(47): 15633-15641, 2021 11 30.
Article in English | MEDLINE | ID: mdl-34780168

ABSTRACT

Machine learning is a popular technique to predict the retention times of molecules based on descriptors. Descriptors and associated labels (e.g., retention times) of a set of molecules can be used to train a machine learning algorithm. However, descriptors are fixed molecular features which are not necessarily optimized for the given machine learning problem (e.g., to predict retention times). Recent advances in molecular machine learning make use of so-called graph convolutional networks (GCNs) to learn molecular representations from atoms and their bonds to adjacent atoms to optimize the molecular representation for the given problem. In this study, two GCNs were implemented to predict the retention times of molecules for three different chromatographic data sets and compared to seven benchmarks (including two state-of-the-art machine learning models). Additionally, saliency maps were computed from trained GCNs to better interpret the importance of certain molecular sub-structures in the data sets. Based on the overall observations of this study, the GCNs performed better than all benchmarks, either significantly outperforming them (5-25% lower mean absolute error) or performing similarly to them (<5% difference). Saliency maps revealed a significant difference in molecular sub-structures that are important for predictions of different chromatographic data sets (reversed-phase liquid chromatography vs hydrophilic interaction liquid chromatography).
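To illustrate how a graph convolution builds a molecular representation from atoms and their bonded neighbors, here is a minimal sketch of a single layer in plain Python. It is not the architecture used in the study: the mean-aggregation rule, the ethanol example, the one-hot atom features and the identity weight matrix are all assumptions chosen for readability.

```python
# Minimal sketch of one graph-convolution step (mean aggregation + linear map + ReLU).
# The molecule, features and weights are illustrative assumptions, not from the study.

def graph_conv(features, neighbors, weights):
    """Average each atom's features with its neighbors', then apply weights and ReLU."""
    updated = []
    for atom, feat in enumerate(features):
        group = [feat] + [features[n] for n in neighbors[atom]]  # self-loop + bonded atoms
        mean = [sum(col) / len(group) for col in zip(*group)]
        # Row vector times weight matrix: out[j] = sum_i mean[i] * weights[i][j]
        out = [sum(m * w for m, w in zip(mean, column)) for column in zip(*weights)]
        updated.append([max(0.0, v) for v in out])  # ReLU
    return updated

def readout(features):
    """Sum atom vectors into one molecule-level vector."""
    return [sum(col) for col in zip(*features)]

# Ethanol heavy atoms C-C-O with one-hot features [is_carbon, is_oxygen].
atom_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonds = {0: [1], 1: [0, 2], 2: [1]}
identity_weights = [[1.0, 0.0], [0.0, 1.0]]  # placeholder; a trained GCN learns these

hidden = graph_conv(atom_features, bonds, identity_weights)
molecule_vector = readout(hidden)
print(hidden)
print(molecule_vector)
```

In a trained network the weights would be fitted so that the molecule-level vector is useful for the target (here, retention time); stacking several such layers lets each atom see more distant neighbors.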


Subject(s)
Chromatography, Reverse-Phase , Machine Learning , Algorithms , Chromatography, Liquid , Hydrophobic and Hydrophilic Interactions
2.
Euro Surveill ; 23(50)2018 Dec.
Article in English | MEDLINE | ID: mdl-30563591

ABSTRACT

Background: The recent global emergence and re-emergence of arboviruses has caused significant human disease. Common vectors, symptoms and geographical distribution make differential diagnosis both important and challenging. Aim: To investigate the feasibility of metagenomic sequencing for recovering whole genome sequences of chikungunya and dengue viruses from clinical samples. Methods: We performed metagenomic sequencing using both the Illumina MiSeq and the portable Oxford Nanopore MinION on clinical samples which were real-time reverse transcription-PCR (qRT-PCR) positive for chikungunya (CHIKV) or dengue virus (DENV), two of the most important arboviruses. A total of 26 samples with a range of representative clinical Ct values were included in the study. Results: Direct metagenomic sequencing of nucleic acid extracts from serum or plasma without viral enrichment allowed for virus identification, subtype determination and elucidated complete or near-complete genomes adequate for phylogenetic analysis. One PCR-positive CHIKV sample was also found to be coinfected with DENV. Conclusions: This work demonstrates that metagenomic whole genome sequencing is feasible for the majority of CHIKV and DENV PCR-positive patient serum or plasma samples. Additionally, it explores the use of Nanopore metagenomic sequencing for DENV and CHIKV, which can likely be applied to other RNA viruses, highlighting the applicability of this approach to front-line public health and potential portable applications using the MinION.


Subject(s)
Chikungunya virus/genetics , Dengue Virus/genetics , Real-Time Polymerase Chain Reaction/methods , Whole Genome Sequencing , Antibodies, Viral/blood , Antigens, Viral/blood , Chikungunya Fever/blood , Chikungunya Fever/diagnosis , Chikungunya virus/isolation & purification , Dengue/blood , Dengue/diagnosis , Dengue Virus/isolation & purification , Humans , Metagenomics , Nanopores , Serogroup
3.
Mol Inform ; 42(3): e2200232, 2023 03.
Article in English | MEDLINE | ID: mdl-36529710

ABSTRACT

Maximum common substructures (MCS) have received a lot of attention in the chemoinformatics community. They are typically used as a similarity measure between molecules, showing high predictive performance when used in classification tasks, while being easily explainable substructures. In the present work, we applied the Pairwise Maximum Common Subgraph Feature Generation (PMCSFG) algorithm to automatically detect toxicophores (structural alerts) and to compute fingerprints based on MCS. We present a comparison between our MCS-based fingerprints and 12 well-known chemical fingerprints when used as features in machine learning models. We provide an experimental evaluation and discuss the usefulness of the different methods on mutagenicity data. The features generated by the MCS method have a state-of-the-art performance when predicting mutagenicity, while they are more interpretable than the traditional chemical fingerprints.


Subject(s)
Algorithms , Mutagens , Mutagens/chemistry , Mutagenesis , Machine Learning
4.
J Chromatogr A ; 1672: 463005, 2022 Jun 07.
Article in English | MEDLINE | ID: mdl-35430477

ABSTRACT

Although commercially available software provides options for automatic peak detection, visual inspection and manual corrections are often needed. Peak detection algorithms commonly employed require carefully written rules and thresholds to increase true positive rates and decrease false positive rates. In this study, a deep learning model, specifically, a convolutional neural network (CNN), was implemented to perform automatic peak detection in reversed-phase liquid chromatography (RPLC). The model inputs a whole chromatogram and outputs predicted locations, probabilities, and areas of the peaks. The obtained results on a simulated validation set demonstrated that the model performed well (ROC-AUC of 0.996), and comparably or better than a derivative-based approach using the Savitzky-Golay algorithm for detecting peaks on experimental chromatograms (8.6% increase in true positives). In addition, predicted peak probabilities (typically between 0.5 and 1.0 for true positives) gave an indication of how confident the CNN model was in the peaks detected. The CNN model was trained entirely on simulated chromatograms (a training set of 1,000,000 chromatograms), and thus no effort had to be put into collecting and labeling chromatograms. A potential major drawback of this approach, namely training a CNN model on simulated chromatograms, is the risk of not capturing the actual "chromatogram space" well enough to perform accurate peak detection in real chromatograms.
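For context on the baseline mentioned above, the following is a minimal sketch of a derivative-based peak detector built on Savitzky-Golay filtering. It is not the implementation evaluated in the study: the 5-point quadratic window (standard tabulated coefficients), the `min_height` threshold and the synthetic two-peak chromatogram are assumptions for illustration. It also shows the kind of hand-written rule and threshold the abstract refers to.

```python
import math

# 5-point quadratic Savitzky-Golay coefficients (standard tabulated values).
SMOOTH = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
DERIV = [-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10]

def apply_filter(signal, coeffs):
    """Apply a centered filter; the two edge points on each side are left at 0."""
    half = len(coeffs) // 2
    out = [0.0] * len(signal)
    for i in range(half, len(signal) - half):
        out[i] = sum(c * signal[i + j - half] for j, c in enumerate(coeffs))
    return out

def detect_peaks(signal, min_height):
    """Report indices where the smoothed first derivative crosses from + to -."""
    smooth = apply_filter(signal, SMOOTH)
    deriv = apply_filter(signal, DERIV)
    peaks = []
    for i in range(1, len(signal)):
        # Hand-written rule: sign change of the derivative plus a height threshold.
        if deriv[i - 1] > 0 and deriv[i] <= 0 and smooth[i] >= min_height:
            peaks.append(i)
    return peaks

def gaussian(n, center, width, height):
    """Synthetic Gaussian peak sampled at n points (illustrative data only)."""
    return [height * math.exp(-0.5 * ((i - center) / width) ** 2) for i in range(n)]

chromatogram = [a + b for a, b in zip(gaussian(200, 60, 5, 1.0), gaussian(200, 140, 8, 0.5))]
peaks = detect_peaks(chromatogram, min_height=0.1)
print(peaks)
```

On noisy experimental data the window length and threshold typically need tuning per method, which is the manual effort a learned detector aims to avoid.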


Subject(s)
Chromatography, Reverse-Phase , Neural Networks, Computer , Algorithms , Software
5.
J Neural Eng ; 19(1)2022 02 28.
Article in English | MEDLINE | ID: mdl-35086076

ABSTRACT

Objective. Biosignal control is an interaction modality that allows users to interact with electronic devices by decoding the biological signals emanating from the movements or thoughts of the user. This manner of interaction with devices can enhance the sense of agency for users and enable persons suffering from a paralyzing condition to interact with everyday devices that would otherwise be challenging for them to use. It can also improve control of prosthetic devices and exoskeletons by making the interaction feel more natural and intuitive. However, with the current state of the art, several issues still need to be addressed to reliably decode user intent from biosignals and provide an improved user experience over other interaction modalities. One solution is to leverage advances in deep learning (DL) methods to provide more reliable decoding at the expense of added computational complexity. This scoping review introduces the basic concepts of DL and assists readers in deploying DL methods to a real-time control system that should operate under real-world conditions. Approach. The scope of this review covers any electronic device, but with an emphasis on robotic devices, as this is the most active area of research in biosignal control. We review the literature pertaining to the implementation and evaluation of control systems that incorporate DL to identify the main gaps and issues in the field, and formulate suggestions on how to mitigate them. Main results. The results highlight the main challenges in biosignal control with DL methods. Additionally, we were able to formulate guidelines on the best approach to designing, implementing and evaluating research prototypes that use DL in their biosignal control systems. Significance. This review should assist researchers that are new to the fields of biosignal control and DL in successfully deploying a full biosignal control system. Experts in their respective fields can use this article to identify possible avenues of research that would further advance the development of biosignal control with DL methods.


Subject(s)
Deep Learning , Computer Systems , Movement
6.
J Chromatogr A ; 1638: 461900, 2021 Feb 08.
Article in English | MEDLINE | ID: mdl-33485027

ABSTRACT

An important challenge in chromatography is the development of adequate separation methods. Accurate retention models can significantly simplify and expedite the development of adequate separation methods for complex mixtures. The purpose of this study was to introduce reinforcement learning to chromatographic method development, by training a double deep Q-learning algorithm to select optimal isocratic scouting runs to generate accurate retention models. These scouting runs were fit to the Neue-Kuss retention model, which was then used to predict retention factors both under isocratic and gradient conditions. The quality of these predictions was compared to experimental data points, by computing a mean relative percentage error (MRPE) between the predicted and actual retention factors. By providing the reinforcement learning algorithm with a reward whenever the scouting runs led to accurate retention models and a penalty when the analysis time of a selected scouting run was too high (> 1 h), it was hypothesized that the reinforcement learning algorithm should over time learn to select good scouting runs for compounds displaying a variety of characteristics. The reinforcement learning algorithm developed in this work was first trained on simulated data, and then evaluated on experimental data for 57 small molecules - each run at 10 different fractions of organic modifier (0.05 to 0.90) and four different linear gradients. The results showed that the MRPE of these retention models (3.77% for isocratic runs and 1.93% for gradient runs), mostly obtained via 3 isocratic scouting runs for each compound, were comparable in performance to retention models obtained by fitting the Neue-Kuss model to all (10) available isocratic datapoints (3.26% for isocratic runs and 4.97% for gradient runs) and retention models obtained via a "chromatographer's selection" of three scouting runs (3.86% for isocratic runs and 6.66% for gradient runs). It was therefore concluded that the reinforcement learning algorithm learned to select optimal scouting runs for retention modeling, by selecting 3 (out of 10) isocratic scouting runs per compound, that were informative enough to successfully capture the retention behavior of each compound.
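As a rough illustration of the evaluation metric, the sketch below computes an MRPE between predicted and measured retention factors. The abstract does not spell out the formula, so the definition used here (mean of |predicted - actual| / actual, times 100) is an assumption. The Neue-Kuss expression is given in the form it is commonly written, ln k = ln k0 + 2 ln(1 + S2·φ) - S1·φ / (1 + S2·φ); the parameter values and the "measured" data are invented for the example.

```python
import math

def neue_kuss(phi, k0, s1, s2):
    """Retention factor k at organic-modifier fraction phi (commonly used Neue-Kuss form)."""
    return k0 * (1.0 + s2 * phi) ** 2 * math.exp(-s1 * phi / (1.0 + s2 * phi))

def mrpe(predicted, actual):
    """Mean relative percentage error (assumed definition, see lead-in)."""
    errors = [abs(p - a) / a for p, a in zip(predicted, actual)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical model parameters and modifier fractions, for illustration only.
k0, s1, s2 = 50.0, 12.0, 1.5
fractions = [0.2, 0.4, 0.6]
predicted = [neue_kuss(phi, k0, s1, s2) for phi in fractions]

# Invented "measurements": the model values perturbed by a few percent.
measured = [k * f for k, f in zip(predicted, [1.02, 0.97, 1.01])]

print(round(mrpe(predicted, measured), 2))
```

A lower MRPE means the fitted retention model reproduces the measured retention factors more closely, which is what the reward in the reinforcement learning setup described above is tied to.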


Subject(s)
Chromatography, Liquid/methods , Algorithms , Models, Theoretical
7.
J Chromatogr A ; 1646: 462093, 2021 Jun 07.
Article in English | MEDLINE | ID: mdl-33853038

ABSTRACT

Enhancement of chromatograms, such as the reduction of baseline noise and baseline drift, is often essential to accurately detect and quantify analytes in a mixture. Current methods have been well studied and adopted for decades and have assisted researchers in obtaining reliable results. However, these methods rely on relatively simple statistics of the data (chromatograms) which in some cases result in significant information loss and inaccuracies. In this study, a deep one-dimensional convolutional autoencoder was developed that simultaneously removes baseline noise and baseline drift with minimal information loss, for a large number and great variety of chromatograms. To enable the autoencoder to denoise a chromatogram to be almost, or completely, noise-free, it was trained on data obtained from an implemented chromatogram simulator that generated 190,000 representative simulated chromatograms. The trained autoencoder was then tested and compared to some of the most widely used and well-established denoising methods on testing datasets of tens of thousands of simulated chromatograms, and then further tested and verified on real chromatograms. The results show that the developed autoencoder can successfully remove baseline noise and baseline drift simultaneously with minimal information loss, outperforming methods like Savitzky-Golay smoothing, Gaussian smoothing and wavelet smoothing for baseline noise reduction (root mean squared error of 1.094 mAU compared to 2.074 mAU, 2.394 mAU and 2.199 mAU) and Savitzky-Golay smoothing combined with asymmetric least-squares or polynomial fitting for baseline noise and baseline drift reduction (root mean absolute error of 1.171 mAU compared to 3.397 mAU and 4.923 mAU). Evidence is presented that autoencoders can be utilized to enhance and correct chromatograms and consequently improve and ease downstream data analysis, with the drawback of needing a carefully implemented simulator, that generates realistic chromatograms, to train the autoencoder.


Subject(s)
Chromatography/methods , Algorithms , Humans , Least-Squares Analysis , Neural Networks, Computer
8.
J Chromatogr A ; 1628: 461435, 2020 Sep 27.
Article in English | MEDLINE | ID: mdl-32822975

ABSTRACT

We report on the performance of three classes of evolutionary algorithms (genetic algorithms (GA), evolution strategies (ES) and covariance matrix adaptation evolution strategy (CMA-ES)) as a means to enhance searches in the method development spaces of 1D- and 2D-chromatography. After optimisation of the design parameters of the different algorithms, they were benchmarked against the performance of a plain grid search. It was found that all three classes significantly outperform the plain grid search, especially in terms of the number of search runs needed to achieve a given separation quality. As soon as more than 100 search runs are needed, the ES algorithm clearly outperforms the GA and CMA-ES algorithms, with the latter performing very well for short searches (<50 search runs) but being susceptible to convergence to local optima for longer searches. It was also found that the performance of the ES and GA algorithms, as well as the grid search, follows a hyperbolic law in the large search run number limit, such that the convergence rate parameter of this hyperbolic function can be used to quantify the difference in required number of search runs for these algorithms. In agreement with one's physical expectations, it was also found that the general advantage of the GA and ES algorithms over the grid search, as well as their mutual performance differences, grows with increasing difficulty of the separation problem.
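The comparison between an evolution strategy and a plain grid search can be sketched on a toy problem. Everything below is an assumption made for illustration: the two-parameter quadratic `separation_quality` function stands in for a real chromatographic quality criterion, and the (1 + λ) strategy with a decaying step size is only one simple ES variant, not the algorithm configuration benchmarked in the study.

```python
import random

def separation_quality(params):
    """Toy stand-in for a chromatographic quality score (higher is better)."""
    x, y = params
    return -((x - 0.3) ** 2 + (y - 0.7) ** 2)

def grid_search(steps):
    """Evaluate a regular steps x steps grid on the unit square."""
    points = [(i / (steps - 1), j / (steps - 1)) for i in range(steps) for j in range(steps)]
    best = max(points, key=separation_quality)
    return best, separation_quality(best), len(points)

def evolution_strategy(start, generations, offspring, sigma, rng):
    """(1 + lambda) ES: the parent survives unless the best mutant scores higher."""
    parent, parent_score = start, separation_quality(start)
    runs = 1
    for _ in range(generations):
        children = [
            tuple(min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in parent)
            for _ in range(offspring)
        ]
        runs += offspring
        best_child = max(children, key=separation_quality)
        child_score = separation_quality(best_child)
        if child_score > parent_score:
            parent, parent_score = best_child, child_score
        sigma *= 0.9  # shrink mutation step as the search narrows (assumed schedule)
    return parent, parent_score, runs

start = (0.9, 0.1)
grid_best, grid_score, grid_runs = grid_search(steps=5)
es_best, es_score, es_runs = evolution_strategy(start, 6, 4, 0.2, random.Random(0))
print(grid_best, grid_score, grid_runs)
print(es_best, es_score, es_runs)
```

Both searches here use 25 evaluations ("search runs"), so their best scores can be compared at equal cost; the grid is limited by its spacing, while the ES can keep refining around the best point found so far.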


Subject(s)
Algorithms , Chromatography/methods , Chromatography, Reverse-Phase , Computer Simulation