Search | VHL Regional Portal

Results 1 - 11 of 11

Critical evaluation of the NIST retention index database reliability with specific examples.

Matyushin, Dmitriy D; Karnaeva, Anastasia E; Sholokhova, Anastasia Yu.

Anal Bioanal Chem ; 2024 Sep 27.

Article in English | MEDLINE | ID: mdl-39331169

ABSTRACT

The NIST gas chromatographic retention index database is widely used in gas chromatography-mass spectrometry analysis. For many compounds, the NIST database contains many entries that are presumably obtained independently of each other. We showed with specific examples that there are cases in the NIST database where several entries exist for the same compound, and all of them are equally erroneous (an error of more than 100 units). In particular, we demonstrated that all retention index values for such an important compound as imidazole for non-polar stationary phases in the NIST database are erroneous. In addition to imidazole, a similar situation is observed for four more nitrogen-containing heterocyclic compounds. For certainty, measurements were performed under several conditions, using various temperature programs, and using two specimens of columns. The structures were confirmed using nuclear magnetic resonance and mass spectrometry. It was shown with specific examples that many values are not reliable: either data were obtained using standard samples of undescribed origin without confirmation (without even using mass spectrometry) or, in some cases, standard samples were not used at all, and the retention index was obtained for a mixture component identified using a mass spectral library search. Some "independent" values are not such but are repeated publications of the same data (secondary sources), or simply several values taken from the same source. In the work, an analysis was carried out and assumptions were made about how several equally incorrect retention index values could appear in the NIST database.

Large-scale statistical study of the dependence of retention index on heating rate in temperature-programmed gas chromatography.

Matyushin, Dmitriy D; Sholokhova, Anastasia Yu.

J Chromatogr A ; 1732: 465223, 2024 Sep 13.

Article in English | MEDLINE | ID: mdl-39111182

ABSTRACT

Retention indices are values that characterize the retention of a compound in gas chromatography. In practice, retention indices are often assumed to depend only on the structure of the molecule and the type of the stationary phase, but this approximation is incorrect. This study is devoted to studying the dependence of retention indices on the column heating rate in the linear temperature programming mode, using a large and diverse data set. In the NIST 20 database, most data records are recorded in this mode. For stationary phases based on poly(5%-diphenyl-95%-dimethyl)siloxane (5%-phenyl-PDMS), there is a high proportion of records with heating rates of 10-15 K/min. In practice, such a high heating rate is rarely used and the use of such data may cause errors. A search was made for groups of records that were taken from the same primary source, recorded for the same compound and the same stationary phase, but differing in a heating rate. For each of these groups, the value D, the angular coefficient (slope) of the dependence of the retention index on the heating rate, was calculated. This value can take both positive and negative values. The highest values and the greatest variation of D values are observed for polar stationary phases, but further consideration was performed for 5%-phenyl-PDMS due to its greater practical significance. For these stationary phases, the highest D values are observed for aromatic and polyaromatic molecules; oxygen-containing compounds, on the contrary, exhibit lower D values. Negative D values are observed for many trimethylsilyl derivatives. A data set of D values for 756 molecules was selected and published online. There is almost no correlation between D and the retention index, lipophilicity factor logP, and molecular weight. Significant correlations with the number of cycles, the number of rotatable bonds, and the number of aromatic atoms were observed. 
Linear equations quantitatively relating the molecular descriptors to the D value were constructed. The numbers of cycles and halogen atoms were shown to contribute positively to the D value, while the numbers of oxygen atoms and bonds subject to internal rotation contributed negatively. The strong influence of the values related to the conformational rigidity of molecules and the weak influence of polarity allow us to suppose that the entropic factor has a key influence on the D value. A simple empirical linear equation for estimating the value of D is derived and presented in this study. Several machine learning methods for predicting D are compared. The best results are shown by gradient boosting and a random forest. However, the random forest does not achieve high accuracy in predicting the retention indices themselves.
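
As an illustration of the D value described in this abstract, the following minimal sketch computes the slope of the retention index against the heating rate for one group of records. It assumes an ordinary least-squares fit, which the abstract does not confirm; the function name `heating_rate_slope` and the sample records are hypothetical and are not taken from the NIST 20 database.

```python
from statistics import mean


def heating_rate_slope(rates, indices):
    """Least-squares slope of retention index vs. heating rate (K/min).

    Assumption: D is estimated by ordinary least squares; the abstract
    only states that D is the slope of this dependence.
    """
    if len(rates) != len(indices) or len(rates) < 2:
        raise ValueError("need at least two (rate, index) pairs")
    r_mean, i_mean = mean(rates), mean(indices)
    sxx = sum((r - r_mean) ** 2 for r in rates)
    if sxx == 0:
        raise ValueError("heating rates must differ within a group")
    sxy = sum((r - r_mean) * (i - i_mean) for r, i in zip(rates, indices))
    return sxy / sxx


# Hypothetical records: same compound, same stationary phase, same source,
# differing only in heating rate.
rates = [2.0, 5.0, 10.0]           # K/min
indices = [1200.0, 1203.0, 1208.0]  # retention index units
d_value = heating_rate_slope(rates, indices)
```

A positive result means the retention index grows with the heating rate; a negative one, as the abstract reports for many trimethylsilyl derivatives, means it decreases.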

Subject(s)

Hot Temperature , Chromatography, Gas/methods , Temperature

In-Column Dehydration Benzyl Alcohols and Their Chromatographic Behavior on Pyridinium-Based Ionic Liquids as Gas Stationary Phases.

Sholokhova, Anastasia Yu; Borovikova, Svetlana A.

Molecules ; 29(16)2024 Aug 06.

Article in English | MEDLINE | ID: mdl-39202801

ABSTRACT

At present, stationary phases based on ionic liquids are promising and widely used in gas chromatography, yet they remain poorly studied. Unfortunately, testing of "new" stationary phases is often carried out on a limited set of test compounds (about 10 compounds) of relatively simple structures. This study represents the first investigation into the physicochemical patterns of retention of substituted (including polysubstituted) aromatic alcohols on two stationary phases of different polarities: one based on pyridinium-based ionic liquids and the other on a standard polar phase. The retention order of the studied compounds on the ionic liquid stationary phase was compared with that on the standard polar phase, polyethylene glycol (SH-Stabilwax). It was shown that the pyridinium-based ionic liquid stationary phase has a different selectivity from SH-Stabilwax. Using a quantitative structure-retention relationships (QSRR) study, the differences in selectivity of the two stationary phases were interpreted. Using CHERESHNYA software, the importance of descriptors on different stationary phases was evaluated for the same data set. Different selectivity of the stationary phases correlates with different contributions of descriptors for the analytes under study. For the first time, we show that in-column dehydration is observed for some compounds (mostly substituted benzyl alcohols). This effect is worthy of further investigation and requires attention when analyzing complex mixtures. It suggests that when testing "new" stationary phases, it is necessary to conduct tests on a large set of different classes of compounds. This is because, when ionic liquids are used as a stationary phase, a reaction between the analyte and the stationary phase is possible.

Quantitative structure-retention relationships for pyridinium-based ionic liquids used as gas chromatographic stationary phases: convenient software and assessment of reliability of the results.

Sholokhova, Anastasia Yu; Matyushin, Dmitriy D; Shashkov, Mikhail V.

J Chromatogr A ; 1730: 465144, 2024 Aug 16.

Article in English | MEDLINE | ID: mdl-38996513

ABSTRACT

Ionic liquids, i.e., organic salts with a low melting point, can be used as gas chromatographic liquid stationary phases. These stationary phases have some advantages such as peculiar selectivity, high polarity, and thermostability. Many previous works are devoted to such stationary phases. However, there are still no large enough retention data sets of structurally diverse compounds for them. Consequently, there are very few works devoted to quantitative structure-retention relationships (QSRR) for ionic liquid-based stationary phases. This work is aimed at closing this gap. Three ionic liquids with substituted pyridinium cations are considered. We provide large enough data sets (123-158 compounds) that can be used in further works devoted to QSRR and related methods. We provide a QSRR study using this data set and demonstrate the following. The retention index for a polyethylene glycol stationary phase (denoted as RI_PEG), predicted using another model, can be used as a molecular descriptor. This descriptor significantly improves the accuracy of the QSRR model. Both deep learning-based and linear models were considered for RI_PEG prediction. The ability to predict the retention indices for ionic liquid-based stationary phases with high accuracy is demonstrated. Particular attention is paid to the reproducibility and reliability of the QSRR study. It was demonstrated that adding/removing several compounds, small perturbations of the data set can considerably affect the results such as descriptor importance and model accuracy. These facts have to be considered in order to avoid misleading conclusions. For the QSRR research, we developed a software tool with a graphical user interface, which we called CHERESHNYA. It is intended to select molecular descriptors and construct linear equations connecting molecular descriptors with gas chromatographic retention indices for any stationary phase. 
The software allows the user to generate several hundred molecular descriptors (one-dimensional and two-dimensional). Among them, predicted retention indices for popular stationary phases such as polydimethylsiloxane and polyethylene glycol are used as molecular descriptors. Various methods for selecting (and assessing the importance of) molecular descriptors have been implemented, in particular the Boruta algorithm, partial least squares, genetic algorithms, L1-regularized regression (LASSO) and others. The software is free, open-source and available online: https://github.com/mtshn/chereshnya.

Subject(s)

Ionic Liquids , Pyridinium Compounds , Software , Ionic Liquids/chemistry , Chromatography, Gas/methods , Pyridinium Compounds/chemistry , Reproducibility of Results , Quantitative Structure-Activity Relationship , Linear Models , Polyethylene Glycols/chemistry

Validation of the identification reliability of known and assumed UDMH transformation products using gas chromatographic retention indices and machine learning.

Karnaeva, Anastasia E; Sholokhova, Anastasia Yu.

Chemosphere ; 362: 142679, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38909863

ABSTRACT

Thirty-two commercially available standards were used to determine chromatographic retention indices for three different stationary phases (non-polar, polar and mid-polar) commonly used in gas chromatography. The selected compounds were nitrogen-containing heterocycles and amides, which are referred to in the literature as unsymmetrical dimethylhydrazine (UDMH) transformation products or its assumed transformation products. UDMH is a highly toxic compound widely used in the space industry. It is a reactive substance that forms a large number of different compounds in the environment. Well-known transformation products may exceed UDMH itself in their toxicity, but most of the products are poorly investigated, while posing a huge environmental threat. Experimental retention indices for the three stationary phases, retention indices from the NIST database, and predicted retention indices are presented in this paper. It is shown that there are virtually no retention indices for UDMH transformation products in the NIST database. In addition, even among those compounds for which retention indices were known, inconsistencies were identified. Adding retention indices to the database and eliminating erroneous data would allow for more reliable identification when standards are not available. The discrepancies identified between experimental retention index values and predicted values will allow for adjustments to the machine learning models that are used for prediction. Compounds previously proposed as possible transformation products without the use of standards or NMR were confirmed.

Subject(s)

Machine Learning , Chromatography, Gas/methods , Hydrazines/analysis , Hydrazines/chemistry , Reproducibility of Results

Intelligent Workflow and Software for Non-Target Analysis of Complex Samples Using a Mixture of Toxic Transformation Products of Unsymmetrical Dimethylhydrazine as an Example.

Sholokhova, Anastasia Yu; Matyushin, Dmitriy D; Grinevich, Oksana I; Borovikova, Svetlana A; Buryak, Aleksey K.

Molecules ; 28(8)2023 Apr 12.

Article in English | MEDLINE | ID: mdl-37110641

ABSTRACT

Unsymmetrical dimethylhydrazine (UDMH) is a widely used rocket propellant. Entering the environment or being stored in uncontrolled conditions, UDMH easily forms an enormous variety (at least many dozens) of transformation products. Environmental pollution by UDMH and its transformation products is a major problem in many countries and across the Arctic region. Unfortunately, previous works often use only electron ionization mass spectrometry with a library search, or they consider only the molecular formula to propose the structures of new products. This is quite an unreliable approach. It was demonstrated that a newly proposed artificial intelligence-based workflow allows for the proposal of structures of UDMH transformation products with a greater degree of certainty. The presented free and open-source software with a convenient graphical user interface facilitates the non-target analysis of industrial samples. It has bundled machine learning models for the prediction of retention indices and mass spectra. A critical analysis of whether a combination of several methods of chromatography and mass spectrometry allows us to elucidate the structure of an unknown UDMH transformation product was provided. It was demonstrated that the use of gas chromatographic retention indices for two stationary phases (polar and non-polar) allows for the rejection of false candidates in many cases when only one retention index is not enough. The structures of five previously unknown UDMH transformation products were proposed, and four previously proposed structures were refined.

Machine learning-assisted non-target analysis of a highly complex mixture of possible toxic unsymmetrical dimethylhydrazine transformation products with chromatography-mass spectrometry.

Sholokhova, Anastasia Yu; Grinevich, Oksana I; Matyushin, Dmitriy D; Buryak, Aleksey K.

Chemosphere ; 307(Pt 1): 135764, 2022 Nov.

Article in English | MEDLINE | ID: mdl-35863423

ABSTRACT

Unsymmetrical dimethylhydrazine (UDMH) is a toxic and environmentally hostile compound that was massively introduced to the environment during previous decades due to its use in the space and rocket industry. The compound forms multiple transformation products, and many of them are as dangerous as UDMH or even more dangerous. The danger includes, but is not limited to, acute toxicity, chronic health hazards, carcinogenicity, and environmental damage. UDMH transformation products are poorly investigated. In this work, the mixture formed by long storage of the waste that contained UDMH was studied. Even a preliminary screening of such a mixture is a complex task. It consists of dozens of compounds, and most of them are missing in chemical and spectral databases. The complete preparative separation of such a mixture is very laborious. We applied several methods of gas chromatography-mass spectrometry and liquid chromatography-mass spectrometry, and several machine learning and chemoinformatics methods to make a preliminary but informative screening of the mixture. Machine learning allowed predicting retention indices and mass spectra of candidate structures. The combination of various ion sources and a comparison of the observed with the predicted spectra and retention was used to propose confident structures for 24 compounds. It was demonstrated that neither high-resolution mass spectrometry nor mass spectral library matching is enough to elucidate the structures of unknown UDMH transformation products. At the same time, the use of machine learning and a combination of methods significantly improves the identification power. Finally, machine learning was applied to estimate the acute toxicity of the discovered compounds. It was shown that many of them are comparable to or even more toxic than UDMH itself. 
Such an extremely wide and still underestimated variety of easily formed derivatives of UDMH can lead to a significant underestimation of the potential hazard of this compound.

Subject(s)

Complex Mixtures , Machine Learning , Dimethylhydrazines , Gas Chromatography-Mass Spectrometry , Mass Spectrometry/methods

Deep Learning Based Prediction of Gas Chromatographic Retention Indices for a Wide Variety of Polar and Mid-Polar Liquid Stationary Phases.

Matyushin, Dmitriy D; Sholokhova, Anastasia Yu; Buryak, Aleksey K.

Int J Mol Sci ; 22(17)2021 Aug 25.

Article in English | MEDLINE | ID: mdl-34502099

ABSTRACT

Prediction of gas chromatographic retention indices based on compound structure is an important task for analytical chemistry. The predicted retention indices can be used as a reference in a mass spectrometry library search despite the fact that their accuracy is worse in comparison with the experimental reference ones. In the last few years, deep learning was applied for this task. The use of deep learning drastically improved the accuracy of retention index prediction for non-polar stationary phases. In this work, we demonstrate for the first time the use of deep learning for retention index prediction on polar (e.g., polyethylene glycol, DB-WAX) and mid-polar (e.g., DB-624, DB-210, DB-1701, OV-17) stationary phases. The achieved accuracy lies in the range of 16-50 in terms of the mean absolute error for several stationary phases and test data sets. We also demonstrate that our approach can be directly applied to the prediction of the second dimension retention times (GC × GC) if a large enough data set is available. The achieved accuracy is considerably better compared with the previous results obtained using linear quantitative structure-retention relationships and ACD ChromGenius software. The source code and pre-trained models are available online.

Subject(s)

Chromatography, Gas/methods , Deep Learning , Chromatography, Gas/standards

Deep Learning Driven GC-MS Library Search and Its Application for Metabolomics.

Matyushin, Dmitriy D; Sholokhova, Anastasia Yu; Buryak, Aleksey K.

Anal Chem ; 92(17): 11818-11825, 2020 09 01.

Article in English | MEDLINE | ID: mdl-32867500

ABSTRACT

Preliminary compound identification and peak annotation in gas chromatography-mass spectrometry is usually made using mass spectral databases. There are a few algorithms that enable performing a search of a spectrum in a large mass spectral library. In many cases, a library search procedure returns a wrong answer even if a correct compound is contained in a library. In this work, we present a deep learning driven approach to a library search in order to reduce the probability of such cases. Machine learning ranking (learning to rank) is a class of machine learning and deep learning algorithms that perform a comparison (ranking) of objects. This work introduces the usage of deep learning ranking for small molecules identification using low-resolution electron ionization mass spectrometry. Instead of simple similarity measures for two spectra, such as the dot product or the Euclidean distance between vectors that represent spectra, a deep convolutional neural network is used. The deep learning ranking model outperforms other approaches and enables reducing a fraction of wrong answers (at rank-1) by 9-23% depending on the used data set. Spectra from the Golm Metabolome Database, Human Metabolome Database, and FiehnLib were used for testing the model.
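
For context, the simple similarity measures that this abstract contrasts with the deep learning ranking model can be sketched as follows. This is only an illustration of the baseline measures: the normalized (cosine) form of the dot product is an assumption, and the intensity vectors are hypothetical, not spectra from any of the named databases.

```python
import math


def dot_product_similarity(a, b):
    """Normalized dot product (cosine) of two intensity vectors.

    Assumption: vectors are aligned on the same m/z bins and the dot
    product is normalized; the abstract only says "dot product".
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def euclidean_distance(a, b):
    """Euclidean distance between two intensity vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical low-resolution spectra binned to four m/z values.
query = [0.0, 100.0, 50.0, 0.0]
reference = [0.0, 100.0, 50.0, 0.0]
other = [100.0, 0.0, 0.0, 50.0]

same_score = dot_product_similarity(query, reference)
different_score = dot_product_similarity(query, other)
```

A library search based on such a measure ranks candidates by score alone, which is the step the abstract replaces with a learned ranking model.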

Subject(s)

Deep Learning/standards , Gas Chromatography-Mass Spectrometry/methods , Machine Learning/standards , Metabolomics/methods , Humans

Analysis of light components in pyrolysis products by comprehensive two-dimensional gas chromatography with PLOT columns.

Sholokhova, Anastasia Yu; Patrushev, Yuri V; Sidelnikov, Vladimir N; Buryak, Aleksey K.

Talanta ; 209: 120448, 2020 Mar 01.

Article in English | MEDLINE | ID: mdl-31892031

ABSTRACT

The most successful method for pyrolysis liquids analysis is comprehensive two-dimensional gas chromatography. Columns with a stationary liquid phase are used for this purpose. However, when it is necessary to analyze a gas phase containing C3-C5 hydrocarbons over a liquid pyrolysis product, the use of columns with a liquid phase in GC*GC will not result in separation of light hydrocarbons. In this case, it is necessary to use PLOT columns with a porous layer of sorbents of various nature. To date, this approach with two PLOT columns in GC*GC has not been described, nor has its use for the analysis of light hydrocarbons resulting from pyrolysis. This paper describes an application of two PLOT columns in GC*GC mode. The following columns of different nature and different selectivity were used: Rt-Q-BOND, Rt-S-BOND, Rt-U-BOND (columns based on divinylbenzene styrene copolymer), a column with the sorbent poly(1-trimethylsilyl-1-propyne) (PTMSP) and an Agilent GASPRO silica column. The most suitable pair of columns was determined by evaluating their orthogonality. The numerical orthogonality data were obtained by studying the correlation coefficients between compound retention times on the first and second columns. It is shown that the best column combinations are PTMSP - GASPRO and Rt-Q-BOND - GASPRO; however, the first combination allows separation about twice as fast as the second under the same temperature conditions. Examples of the separation of C3-C8 hydrocarbons in the gas phase over pyrolysis mixtures of different origin are given.
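
The orthogonality criterion mentioned in this abstract can be illustrated with a Pearson correlation coefficient between retention times on two columns. This is a rough sketch under the assumption that Pearson's r is the coefficient meant and that a value nearer zero indicates a more orthogonal pair; the retention times below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical retention times (min) of four compounds.
first_column = [1.0, 2.0, 3.0, 4.0]
second_column_a = [2.0, 4.0, 6.0, 8.0]  # same elution pattern as first column
second_column_b = [3.0, 1.0, 4.0, 2.0]  # unrelated elution pattern

r_a = pearson_r(first_column, second_column_a)
r_b = pearson_r(first_column, second_column_b)
```

Under this reading, the pair with the smaller absolute correlation spreads peaks more evenly over the two-dimensional separation space.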

A deep convolutional neural network for the estimation of gas chromatographic retention indices.

Matyushin, Dmitriy D; Sholokhova, Anastasia Yu; Buryak, Aleksey K.

J Chromatogr A ; 1607: 460395, 2019 Dec 06.

Article in English | MEDLINE | ID: mdl-31405570

ABSTRACT

A deep convolutional neural network was used for the estimation of gas chromatographic retention indices on non-polar (polydimethylsiloxane and polydimethyl(5%-phenyl) siloxane) stationary phases. The neural network can be used for candidate ranking while searching a mass spectral database. A linear representation (SMILES notation) of the molecule structure was used as an input for the model. The input line was converted to a one-hot matrix and then directly processed by the neural network. The calculation of any common molecular descriptors is avoided, following the modern tendency in machine learning: to allow the neural network to find the most preferable features by itself instead of using hard-coded features. The model has two 1D-convolutional layers with 120 neurons each followed by a pooling layer and a fully-connected layer with 200 hidden neurons. The model was compared with state-of-the-art models for prediction of gas chromatographic indices based on molecular descriptors and on functional groups contributions. On different data sets better accuracy is shown together with greater versatility. The applicability to diverse sets of flavors and fragrances, essential oils, metabolites is shown. The possibility of using the model for improvement of mass spectral identification (without reference retention index) is demonstrated. The median absolute error and the median percentage error are in the range of 17.3 (0.93%) to 38.1 (2.15%) depending on used test data set. Ready-to-use neural network parameters are provided.
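
The input encoding described in this abstract, a SMILES string converted to a one-hot matrix, can be sketched as below. Character-level tokenization, the alphabet and the padded length are assumptions made for illustration; the abstract does not specify them, and multi-character atoms such as Cl or Br would need dedicated tokens in practice.

```python
def one_hot_smiles(smiles, alphabet, max_len):
    """Encode a SMILES string as a max_len x len(alphabet) one-hot matrix.

    Assumption: one token per character; rows past the end of the
    string are left as all-zero padding.
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    matrix = [[0] * len(alphabet) for _ in range(max_len)]
    for pos, ch in enumerate(smiles[:max_len]):
        if ch not in index:
            raise ValueError(f"character {ch!r} is not in the alphabet")
        matrix[pos][index[ch]] = 1
    return matrix


# Hypothetical alphabet; a real model would cover every SMILES symbol.
alphabet = "CNO()=1c"
encoded = one_hot_smiles("CC(=O)O", alphabet, max_len=10)  # acetic acid
```

A matrix of this shape is what a 1D-convolutional layer can process directly, without precomputed molecular descriptors.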

Subject(s)

Chromatography, Gas/methods , Neural Networks, Computer , Databases, Factual , Gas Chromatography-Mass Spectrometry , Regression Analysis
