ABSTRACT
Transformation product (TP) information is essential to accurately evaluate the hazards compounds pose to human health and the environment. However, information about TPs is often limited, and existing data are frequently not fully Findable, Accessible, Interoperable, and Reusable (FAIR). FAIRifying existing TP knowledge is a relatively easy path toward improving access to data for identification workflows and for machine-learning-based algorithms. ShinyTPs was developed to curate existing transformation information derived from text-mined data within the PubChem database. The application (available as an R package) visualizes the text-mined chemical names to facilitate user validation of the automatically extracted reactions. ShinyTPs was applied to a case study using 436 tentatively identified compounds to prioritize TP retrieval. This resulted in the extraction of 645 reactions (associated with 496 compounds), of which 319 were not previously available in PubChem. The curated reactions were added to the PubChem Transformations library, which was used as a TP suspect list for identification of TPs using the open-source workflow patRoon. In total, 72 compounds from the library were tentatively identified, 18% of which were curated using ShinyTPs, showing that the app can help support TP identification in non-target analysis workflows.
ABSTRACT
The diversity of hundreds of thousands of potential organic pollutants and the lack of (publicly available) information about many of them are a huge challenge for environmental sciences, engineering, and regulation. Suspect screening based on high-resolution liquid chromatography-mass spectrometry (LC-HRMS) has enormous potential to help characterize the presence of these chemicals in our environment, enabling the detection of known and newly emerging pollutants, as well as their potential transformation products (TPs). Here, suspect list creation (focusing on pesticides relevant for Luxembourg, incorporating data sources in four languages) was coupled to an automated retrieval of related TPs from PubChem based on high confidence suspect hits, to screen for pesticides and their TPs in Luxembourgish river samples. A computational workflow was established to combine LC-HRMS analysis and pre-screening of the suspects (including automated quality control steps), with spectral annotation to determine which pesticides and, in a second step, their related TPs may be present in the samples. The data analysis with Shinyscreen (https://gitlab.lcsb.uni.lu/eci/shinyscreen/), open source software developed in house, coupled with custom-made scripts, revealed the presence of 162 potential pesticide masses and 96 potential TP masses in the samples. Further identification of these mass matches was performed using the open source approach MetFrag (https://msbi.ipb-halle.de/MetFrag/). Subsequent target analysis of 36 suspects resulted in 31 pesticides and TPs confirmed at Level-1 (highest confidence), and five pesticides and TPs not confirmed due to different retention times. Spatio-temporal analysis of the results showed that TPs and pesticides followed similar trends, with a maximum number of potential detections in July. The highest detections were in the rivers Alzette and Mess and the lowest in the Sûre and Eisch.
This study (a) added pesticides, classification information and related TPs into the open domain, (b) developed automated open source retrieval methods - both enhancing FAIRness (Findability, Accessibility, Interoperability and Reusability) of the data and methods; and (c) will directly support "L'Administration de la Gestion de l'Eau" on further monitoring steps in Luxembourg.
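The mass pre-screening step summarized in this abstract can be illustrated with a minimal sketch. This is not the Shinyscreen implementation: the function names, the 5 ppm tolerance, the restriction to the [M+H]+ adduct, and the approximate example masses are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Mass of a proton in Da, used to form the [M+H]+ ion from a neutral mass.
PROTON_MASS = 1.007276


@dataclass(frozen=True)
class Suspect:
    name: str
    monoisotopic_mass: float  # neutral monoisotopic mass in Da


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def match_suspects(measured_mz: float, suspects: list[Suspect],
                   tol_ppm: float = 5.0) -> list[str]:
    """Return names of suspects whose [M+H]+ m/z lies within tol_ppm.

    Hypothetical helper: only the [M+H]+ adduct is considered here,
    whereas real workflows typically check several adducts.
    """
    hits = []
    for suspect in suspects:
        theoretical_mz = suspect.monoisotopic_mass + PROTON_MASS
        if abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm:
            hits.append(suspect.name)
    return hits


# Illustrative suspect list with approximate monoisotopic masses.
suspects = [
    Suspect("atrazine", 215.0938),
    Suspect("metolachlor", 283.1339),
]

print(match_suspects(216.1011, suspects))
print(match_suspects(216.1100, suspects))
```

A mass match of this kind is only a candidate: as the abstract notes, further annotation (for example with MetFrag) and target analysis with reference standards are needed before a suspect is confirmed.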
Subjects
Pesticides , Chemical Water Pollutants , Cheminformatics , Luxembourg , Pesticides/analysis , Rivers , Chemical Water Pollutants/analysis
ABSTRACT
Background: The NORMAN Association (https://www.norman-network.com/) initiated the NORMAN Suspect List Exchange (NORMAN-SLE; https://www.norman-network.com/nds/SLE/) in 2015, following the NORMAN collaborative trial on non-target screening of environmental water samples by mass spectrometry. Since then, this exchange of information on chemicals that are expected to occur in the environment, along with the accompanying expert knowledge and references, has become a valuable knowledge base for "suspect screening" lists. The NORMAN-SLE now serves as a FAIR (Findable, Accessible, Interoperable, Reusable) chemical information resource worldwide. Results: The NORMAN-SLE contains 99 separate suspect list collections (as of May 2022) from over 70 contributors around the world, totalling over 100,000 unique substances. The substance classes include per- and polyfluoroalkyl substances (PFAS), pharmaceuticals, pesticides, natural toxins, high production volume substances covered under the European REACH regulation (EC: 1272/2008), priority contaminants of emerging concern (CECs) and regulatory lists from NORMAN partners. Several lists focus on transformation products (TPs) and complex features detected in the environment with various levels of provenance and structural information. Each list is available for separate download. The merged, curated collection is also available as the NORMAN Substance Database (NORMAN SusDat). Both the NORMAN-SLE and NORMAN SusDat are integrated within the NORMAN Database System (NDS). The individual NORMAN-SLE lists receive digital object identifiers (DOIs) and traceable versioning via a Zenodo community (https://zenodo.org/communities/norman-sle), with a total of > 40,000 unique views, > 50,000 unique downloads and 40 citations (May 2022). 
NORMAN-SLE content is progressively integrated into large open chemical databases such as PubChem (https://pubchem.ncbi.nlm.nih.gov/) and the US EPA's CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard/), enabling further access to these lists, along with the additional functionality and calculated properties these resources offer. PubChem has also integrated significant annotation content from the NORMAN-SLE, including a classification browser (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=101). Conclusions: The NORMAN-SLE offers a specialized service for hosting suspect screening lists of relevance for the environmental community in an open, FAIR manner that allows integration with other major chemical resources. These efforts foster the exchange of information between scientists and regulators, supporting the paradigm shift to the "one substance, one assessment" approach. New submissions are welcome via the contacts provided on the NORMAN-SLE website (https://www.norman-network.com/nds/SLE/). Supplementary Information: The online version contains supplementary material available at 10.1186/s12302-022-00680-6.
ABSTRACT
Pharmaceuticals and their transformation products (TPs) are continuously released into the aquatic environment via anthropogenic activity. To expand knowledge on the presence of pharmaceuticals and their known TPs in Luxembourgish rivers, 92 samples collected during routine monitoring events between 2019 and 2020 were investigated using nontarget analysis. Water samples were concentrated using solid-phase extraction and then analyzed using liquid chromatography coupled to a high-resolution mass spectrometer. Suspect screening was performed using several open source computational tools and resources including Shinyscreen (https://git-r3lab.uni.lu/eci/shinyscreen/), MetFrag (https://msbi.ipb-halle.de/MetFrag/), PubChemLite (https://zenodo.org/record/4432124), and MassBank (https://massbank.eu/MassBank/). A total of 94 pharmaceuticals were found to be present in Luxembourg rivers: 88 confirmed at level 1 confidence (86 of which could be quantified, while two were present at concentrations too low to be quantified) and six identified at level 2a. Pharmaceutical TPs (12) were also found at level 2a confidence. The pharmaceuticals were present at median concentrations up to 214 ng/L, with caffeine having a median concentration of 1424 ng/L. Antihypertensive drugs (15), psychoactive drugs (15), and antimicrobials (eight) were the most detected groups of pharmaceuticals. A spatiotemporal analysis of the data revealed areas with higher concentrations of the pharmaceuticals, as well as differences in pharmaceutical concentrations between 2019 and 2020. The results of this work will help guide activities for improving water management in the country and set baseline data for continuous monitoring and screening efforts, as well as for further open data and software developments.