Search | BVS Enfermagem Research Portal

Prediction Models of Retention Indices for Increased Confidence in Structural Elucidation during Complex Matrix Analysis: Application to Gas Chromatography Coupled with High-Resolution Mass Spectrometry.

Dossin, Eric; Martin, Elyette; Diana, Pierrick; Castellon, Antonio; Monge, Aurelien; Pospisil, Pavel; Bentley, Mark; Guy, Philippe A.

Anal Chem ; 88(15): 7539-47, 2016 Aug 02.

Article in English | MEDLINE | ID: mdl-27403731

ABSTRACT

Monitoring of volatile and semivolatile compounds was performed using gas chromatography (GC) coupled to high-resolution electron ionization mass spectrometry, using both headspace and liquid injection modes. A total of 560 reference compounds, including 8 odd n-alkanes, were analyzed and experimental linear retention indices (LRI) were determined. The remaining 552 reference compounds were randomly split into training (n = 401) and test (n = 151) sets. LRI for all 552 reference compounds were also calculated from computational Quantitative Structure-Property Relationship (QSPR) models, using two independent approaches: RapidMiner (coupled to Dragon) and ACD/ChromGenius software. Correlation coefficients for experimental versus predicted LRI values for the training and test set compounds were 0.966 and 0.949 for RapidMiner and 0.977 and 0.976 for ACD/ChromGenius, respectively. In addition, the cross-validation correlation from RapidMiner was 0.96 and the residual standard error obtained from ACD/ChromGenius was 53.635. These models were then used to predict LRI values for several thousand compounds reported present in tobacco and tobacco-related fractions, plus a range of specific flavor compounds. It was demonstrated that using the mean of the LRI values predicted by RapidMiner and ACD/ChromGenius, in combination with accurate mass data, could enhance the confidence level for compound identification from the analysis of complex matrixes, particularly when the two predicted LRI values for a compound were in close agreement. Application of this LRI modeling approach to matrixes with unknown composition has already enabled the confirmation of 23 postulated compounds, demonstrating its ability to facilitate compound identification in an analytical workflow. The goal is to reduce the list of putative candidates to a reasonably small, relevant number of compounds that can be obtained and measured for confirmation.
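
The workflow in the abstract above can be illustrated with a minimal Python sketch. It first computes an experimental LRI by linear interpolation between the two bracketing n-alkanes (the van den Dool and Kratz form, written so the alkanes need not be consecutive, as with an odd-only ladder), then shortlists candidates using the mean of two model predictions. The function names, the `tolerance` and `max_gap` thresholds, and the example numbers are assumptions made for illustration; they are not taken from the paper, RapidMiner, or ACD/ChromGenius.

```python
from statistics import mean


def linear_retention_index(t_x, alkanes):
    """Linear retention index by interpolation between bracketing n-alkanes.

    alkanes: list of (carbon_number, retention_time), sorted by retention time.
    The bracketing pair need not be consecutive (e.g. only odd n-alkanes).
    """
    for (n_lo, t_lo), (n_hi, t_hi) in zip(alkanes, alkanes[1:]):
        if t_lo <= t_x <= t_hi:
            return 100.0 * (n_lo + (n_hi - n_lo) * (t_x - t_lo) / (t_hi - t_lo))
    raise ValueError("retention time is outside the alkane ladder")


def consensus_lri(pred_a, pred_b, max_gap=50.0):
    """Return (mean of the two predictions, whether they agree within max_gap)."""
    return mean([pred_a, pred_b]), abs(pred_a - pred_b) <= max_gap


def shortlist(candidates, observed_lri, tolerance=100.0, max_gap=50.0):
    """Keep candidates whose two predictions agree and whose mean is near the observed LRI.

    candidates: list of (name, prediction_model_a, prediction_model_b).
    The thresholds are illustrative assumptions, not values from the paper.
    """
    kept = []
    for name, pred_a, pred_b in candidates:
        mean_lri, agree = consensus_lri(pred_a, pred_b, max_gap)
        if agree and abs(mean_lri - observed_lri) <= tolerance:
            kept.append((name, mean_lri))
    # Closest mean prediction first.
    return sorted(kept, key=lambda item: abs(item[1] - observed_lri))


if __name__ == "__main__":
    ladder = [(9, 10.0), (11, 14.0), (13, 18.0)]  # hypothetical odd n-alkane ladder
    observed = linear_retention_index(12.0, ladder)
    hits = shortlist(
        [("A", 1010.0, 990.0), ("B", 1200.0, 1190.0), ("C", 900.0, 1100.0)],
        observed,
    )
    print(observed, hits)
```

In practice the candidate list would already have been narrowed by accurate mass, and the thresholds would be tuned against the reported model errors rather than fixed by hand.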

Computer-assisted structure identification (CASI)--an automated platform for high-throughput identification of small molecules by two-dimensional gas chromatography coupled to mass spectrometry.

Knorr, Arno; Monge, Aurelien; Stueber, Markus; Stratmann, André; Arndt, Daniel; Martin, Elyette; Pospisil, Pavel.

Anal Chem ; 85(23): 11216-24, 2013 Dec 03.

Article in English | MEDLINE | ID: mdl-24160557

ABSTRACT

Compound identification is widely recognized as a major bottleneck for modern metabolomic approaches and high-throughput nontargeted characterization of complex matrices. To tackle this challenge, an automated platform entitled computer-assisted structure identification (CASI) was designed and developed in order to accelerate and standardize the identification of compound structures. In the first step of the process, CASI automatically searches mass spectral libraries for matches using a NIST MS Search algorithm, which proposes structural candidates for experimental spectra from two-dimensional gas chromatography with time-of-flight mass spectrometry (GC × GC-TOF-MS) measurements, each with an associated match factor. Next, quantitative structure-property relationship (QSPR) models implemented in CASI predict three specific parameters to enhance the confidence for correct compound identification: the Kovats Index (KI) for the first dimension (1D) separation, the relative retention time for the second dimension separation (2DrelRT), and the boiling point (BP). In order to reduce the impact of chromatographic variability on the second dimension retention time, a concept based upon hypothetical reference points from linear regressions of a deuterated n-alkanes reference system was introduced, providing a more stable relative retention time measurement. Predicted values for KI and 2DrelRT were calculated and matched with experimentally derived values. Boiling points derived from 1D separations were matched with predicted boiling points, calculated from the chemical structures of the candidates. As a last step, CASI combines the NIST MS Search match factors (NIST MF) with up to three predicted parameter matches from the QSPR models to generate a combined CASI Score representing the measure of confidence for the identification. Threshold values were applied to the CASI Scores assigned to proposed structures, which improved the accuracy for the classification of true/false positives and true/false negatives. Results for the identification of compounds have been validated, and it has been demonstrated that identification using CASI is more accurate than using NIST MS Search alone. CASI is an easily accessible web-interfaced software platform which represents an innovative, high-throughput system that allows fast and accurate identification of constituents in complex matrices, such as those requiring 2D separation techniques.

Subjects

Automation, Laboratory/methods , Computer-Aided Design , Gas Chromatography-Mass Spectrometry/methods , High-Throughput Screening Assays/methods , Smoke/analysis , Software
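
The CASI abstract says that a library match factor is combined with up to three predicted-parameter matches into one score, to which a threshold is applied, but it does not give the formula. The Python sketch below shows one plausible way such a combination could work. Everything in it is an assumption made for illustration: the linear error-to-match mapping, the equal weighting, the 0 to 999 match factor scale, the tolerances, and the threshold. It is not the published CASI Score.

```python
def parameter_match(predicted, experimental, tolerance):
    """Map a prediction error to a match in [0, 1].

    1.0 at zero error, falling linearly to 0.0 at or beyond `tolerance`.
    The linear shape is an assumption, not the CASI definition.
    """
    return max(0.0, 1.0 - abs(predicted - experimental) / tolerance)


def combined_score(match_factor, matches, mf_weight=0.5):
    """Blend a spectral match factor with zero to three parameter matches.

    match_factor: library match factor, assumed to be on a 0..999 scale.
    matches: list of values in [0, 1] (e.g. for KI, 2DrelRT, BP).
    """
    mf = match_factor / 999.0
    if not matches:
        return mf  # nothing to combine with: fall back to the spectral match
    return mf_weight * mf + (1.0 - mf_weight) * sum(matches) / len(matches)


def accept(score, threshold=0.8):
    """Classify a candidate as accepted if its score reaches the threshold."""
    return score >= threshold


if __name__ == "__main__":
    # Hypothetical candidate: predicted vs. experimental KI and boiling point.
    matches = [
        parameter_match(1210.0, 1200.0, tolerance=50.0),  # KI
        parameter_match(181.0, 185.0, tolerance=20.0),    # BP in degrees C
    ]
    score = combined_score(850, matches)
    print(round(score, 3), accept(score))
```

Passing an empty `matches` list reduces the score to the spectral match alone, which mirrors the "up to three" wording: a candidate lacking some predictions can still be scored.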

Building an R&D chemical registration system.

Martin, Elyette; Monge, Aurélien; Duret, Jacques-Antoine; Gualandi, Federico; Peitsch, Manuel C; Pospisil, Pavel.

J Cheminform ; 4(1): 11, 2012 May 31.

Article in English | MEDLINE | ID: mdl-22650418

ABSTRACT

Small molecule chemistry is of central importance to a number of R&D companies in diverse areas such as the pharmaceutical, nutraceutical, food flavoring, and cosmeceutical industries. In order to store and manage thousands of chemical compounds in such an environment, we have built a state-of-the-art master chemical database with unique structure identifiers. Here, we present the concept and methodology we used to build the system that we call the Unique Compound Database (UCD). In the UCD, each molecule is registered only once (uniqueness), structures with alternative representations are entered in a uniform way (normalization), and the chemical structure drawings are recognizable to chemists and to a cartridge. In brief, structural molecules are entered as neutral entities which can be associated with a salt. The salts are listed in a dictionary and bound to the molecule with the appropriate stoichiometric coefficient in an entity called "substance". The substances are associated with batches. Once a molecule is registered, some properties (e.g., ADMET prediction, IUPAC name, chemical properties) are calculated automatically. The UCD has both automated and manual data controls. Moreover, the UCD concept enables the management of user errors in the structure entry by reassigning or archiving the batches. It also allows updating of the records to include newly discovered properties of individual structures. As our research spans a wide variety of scientific fields, the database enables registration of mixtures of compounds, enantiomers, tautomers, and compounds with unknown stereochemistries.
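
The registration model described above (each neutral molecule registered once, a salt from a dictionary attached with a stoichiometric coefficient to form a substance, and batches attached to substances) can be sketched as a small Python data structure. This is a sketch under stated assumptions, not the UCD implementation: the class names, the `UCD-000001` identifier format, and the contents of `SALT_DICTIONARY` are hypothetical, and `structure_key` is assumed to be a canonical string for the already normalized neutral structure, produced elsewhere.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical salt dictionary; the real one is not described in the abstract.
SALT_DICTIONARY = {"HCl", "Na"}


@dataclass(frozen=True)
class Substance:
    """A neutral molecule plus an optional salt and its stoichiometric coefficient."""
    molecule_id: str
    salt: Optional[str] = None
    stoichiometry: float = 0.0


class Registry:
    def __init__(self):
        self.molecules = {}  # structure key -> molecule id
        self.batches = {}    # Substance -> list of batch ids

    def register_molecule(self, structure_key):
        """Register a neutral structure once; repeated keys return the same id."""
        if structure_key not in self.molecules:
            self.molecules[structure_key] = f"UCD-{len(self.molecules) + 1:06d}"
        return self.molecules[structure_key]

    def register_batch(self, structure_key, batch_id, salt=None, stoichiometry=0.0):
        """Attach a batch to the substance formed by a molecule and an optional salt."""
        if salt is not None and salt not in SALT_DICTIONARY:
            raise ValueError(f"unknown salt: {salt}")
        substance = Substance(self.register_molecule(structure_key), salt, stoichiometry)
        self.batches.setdefault(substance, []).append(batch_id)
        return substance


if __name__ == "__main__":
    registry = Registry()
    registry.register_batch("KEY-1", "B1", salt="HCl", stoichiometry=1.0)
    registry.register_batch("KEY-1", "B2", salt="HCl", stoichiometry=1.0)
    registry.register_batch("KEY-1", "B3")  # free form of the same molecule
    print(len(registry.molecules), len(registry.batches))
```

Keeping batches keyed by substance rather than by molecule is what makes correcting an entry error a matter of reassigning batches, as the abstract describes, without touching the molecule record.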

Managing, profiling and analyzing a library of 2.6 million compounds gathered from 32 chemical providers.

Monge, Aurélien; Arrault, Alban; Marot, Christophe; Morin-Allory, Luc.

Mol Divers ; 10(3): 389-403, 2006 Aug.

Article in English | MEDLINE | ID: mdl-17031540

ABSTRACT

The data for 3.8 million compounds from the structural databases of 32 providers were gathered and stored in a single chemical database. Duplicates were removed using the IUPAC International Chemical Identifier (InChI), after which 2.6 million compounds remained. Each provider database and the final merged database were studied in terms of uniqueness, diversity, frameworks, and 'drug-like' and 'lead-like' properties. The study shows that the database contains more than 87 000 frameworks and 2.1 million 'drug-like' molecules, of which more than one million are 'lead-like'. The study was carried out using 'ScreeningAssistant', software dedicated to chemical database management and screening set generation. Compounds are stored in a MySQL database and all operations on this database are carried out by Java code. Drug-likeness and lead-likeness are estimated with 'in-house' scores based on functions that estimate the suitability of each property; uniqueness is assessed using the InChI code, and diversity using molecular frameworks and fingerprints. The software was designed to make the database easy to update. 'ScreeningAssistant' is freely available under the GPL license.

Subjects

Combinatorial Chemistry Techniques , Computer-Aided Design , Databases, Factual , Drug Design , Pharmaceutical Preparations/classification , Information Systems , Molecular Structure , Pharmaceutical Preparations/chemistry , Software , Structure-Activity Relationship
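
The InChI-based duplicate removal described in the ScreeningAssistant abstract can be sketched in a few lines of Python. The sketch assumes each record already carries a standard InChI string and treats identical strings as the same compound. The function names and the definition of per-provider uniqueness used here (the share of a provider's distinct compounds that no other provider offers) are assumptions for illustration; the original software is written in Java on MySQL and may define these differently.

```python
from collections import defaultdict


def deduplicate(records):
    """Collapse (provider, inchi) records to one entry per InChI.

    Returns (unique, providers): `unique` maps each InChI to the first
    provider seen, `providers` maps each InChI to all providers offering it.
    """
    unique = {}
    providers = defaultdict(set)
    for provider, inchi in records:
        unique.setdefault(inchi, provider)
        providers[inchi].add(provider)
    return unique, providers


def provider_uniqueness(records):
    """Fraction of each provider's distinct compounds offered by no other provider."""
    _, providers = deduplicate(records)
    total = defaultdict(int)
    exclusive = defaultdict(int)
    for owners in providers.values():
        for provider in owners:
            total[provider] += 1
            if len(owners) == 1:
                exclusive[provider] += 1
    return {provider: exclusive[provider] / total[provider] for provider in total}


if __name__ == "__main__":
    ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
    methane = "InChI=1S/CH4/h1H4"
    water = "InChI=1S/H2O/h1H2"
    records = [
        ("P1", ethanol), ("P2", ethanol),  # same compound from two providers
        ("P1", methane),
        ("P2", water), ("P2", water),      # duplicate within one provider
    ]
    unique, _ = deduplicate(records)
    print(len(unique), provider_uniqueness(records))
```

Using the identifier string as a dictionary key keeps the merge linear in the number of records, which matters at the scale of millions of compounds mentioned in the abstract.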
