ABSTRACT
The International Union of Pure and Applied Chemistry (IUPAC) has a long tradition of supporting the compilation of chemical data and their evaluation through direct projects, nomenclature and terminology work, and partnerships with international scientific bodies, government agencies and other organizations. The IUPAC Interdivisional Subcommittee on Critical Evaluation of Data (ISCED) has been established to provide guidance on issues related to the evaluation of chemical data. In this first report we define the general principles of the evaluation of scientific data and describe best practices and approaches to data evaluation in chemistry.
ABSTRACT
Substances of unknown or variable composition, complex reaction products, or biological materials (UVCBs) are over 70 000 "complex" chemical mixtures produced and used at significant levels worldwide. Due to their unknown or variable composition, applying chemical assessments originally developed for individual compounds to UVCBs is challenging, which impedes sound management of these substances. Across the analytical sciences, toxicology, cheminformatics, and regulatory practice, new approaches addressing specific aspects of UVCB assessment are being developed, albeit in a fragmented manner. This review attempts to convey the "big picture" of the state of the art in dealing with UVCBs by holistically examining UVCB characterization and chemical identity representation, as well as hazard, exposure, and risk assessment. Overall, information gaps on chemical identities underpin the fundamental challenges concerning UVCBs, and better reporting and substance characterization efforts are needed to support subsequent chemical assessments. To this end, an information level scheme for improved UVCB data collection and management within databases is proposed. The development of UVCB testing shows early progress, in line with three main methods: whole substance, known constituents, and fraction profiling. For toxicity assessment, one option is a whole-mixture testing approach. If the identities of (many) constituents are known, grouping, read-across, and mixture toxicity modeling represent complementary approaches to overcome data gaps in toxicity assessment. This review highlights continued needs for concerted efforts from all stakeholders to ensure proper assessment and sound management of UVCBs.
Subject(s)
Petroleum, Complex Mixtures, Petroleum/toxicity, Risk Assessment
ABSTRACT
Research data management (RDM) is needed to assist experimental advances and data collection in the chemical sciences. Many funders require RDM because experiments are often paid for by taxpayers and the resulting data should be deposited sustainably for posterity. However, paper notebooks are still common in laboratories and research data is often stored in proprietary and/or dead-end file formats without experimental context. Data must mature beyond a mere supplement to a research paper. Electronic lab notebooks (ELN) and laboratory information management systems (LIMS) allow researchers to manage data better and they simplify research and publication. Thus, an agreement is needed on minimum information standards for data handling to support structured approaches to data reporting. As digitalization becomes part of curricular teaching, future generations of digital native chemists will embrace RDM and ELN as an organic part of their research.
Subject(s)
Data Management, Laboratories
ABSTRACT
While cheminformatics skills necessary for dealing with an ever-increasing amount of chemical information are considered important for students pursuing STEM careers in the age of big data, many schools do not offer a cheminformatics course or alternative training opportunities. This paper presents the Cheminformatics Online Chemistry Course (OLCC), which is organized and run by the Committee on Computers in Chemical Education (CCCE) of the American Chemical Society (ACS)'s Division of Chemical Education (CHED). The Cheminformatics OLCC is a highly collaborative teaching project involving instructors at multiple schools who teamed up with external chemical information experts recruited across sectors, including government and industry. From 2015 to 2019, three Cheminformatics OLCCs were offered. In each program, the instructors at participating schools would meet face-to-face with the students of a class, while external content experts engaged through online discussions across campuses with both the instructors and students. All the material created in the course has been made available at the open education repositories of LibreTexts and CCCE Web sites for other institutions to adapt to their future needs.
ABSTRACT
We describe a file format that is designed to represent mixtures of compounds in a way that is fully machine readable. This Mixfile format is intended to fill the same role for substances that are composed of multiple components as the venerable Molfile does for specifying individual structures. This much-needed data structure is intended to replace current practices for communicating information about mixtures, which usually rely on human-readable text descriptions, drawing several species within a single molecular diagram, or mutually incompatible ad hoc solutions. We describe an open source software application for editing mixture files, which can also be used as a web-ready tool for manipulating the file format. We also present a corpus of mixture examples, which we have extracted from collections of text-based descriptions. Furthermore, we present an early look at the proposed IUPAC Mixtures InChI specification, instances of which can be automatically generated using the Mixfile format as a precursor.
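To make the idea of a machine-readable mixture record more concrete, the following is a minimal sketch of a hierarchical mixture description in Python. It is not the Mixfile specification: the field names (`name`, `smiles`, `quantity`, `units`, `contents`) and the helper functions are assumptions chosen only for illustration, and the example mixture is invented.

```python
import json

# Hypothetical nested record. The keys below are illustrative assumptions,
# not field names taken from the Mixfile specification.
mixture = {
    "name": "saline solution",
    "contents": [
        {
            "name": "sodium chloride",
            "smiles": "[Na+].[Cl-]",
            "quantity": 0.9,
            "units": "w/v%",
        },
        {"name": "water", "smiles": "O"},
    ],
}


def iter_leaves(node, path=()):
    """Yield (path, node) for every component that has no sub-components."""
    here = path + (node.get("name", "?"),)
    children = node.get("contents", [])
    if not children:
        yield here, node
        return
    for child in children:
        yield from iter_leaves(child, here)


def leaf_names(node):
    """Return the names of all leaf components, in document order."""
    return [leaf.get("name", "?") for _, leaf in iter_leaves(node)]


# A structured record survives serialization unchanged, unlike free text.
round_tripped = json.loads(json.dumps(mixture))
```

Because each component is a separate node with its own structure and quantity, software can walk the hierarchy (for example with `leaf_names(mixture)`) instead of parsing a free-text description.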