ABSTRACT
Data science is assuming a pivotal role in guiding reaction optimization and streamlining experimental workloads in the evolving landscape of synthetic chemistry. A discipline-wide goal is the development of workflows that integrate computational chemistry and data science tools with high-throughput experimentation, because such integration allows experimentalists to maximize success in expensive synthetic campaigns. Here, we report an end-to-end data-driven process to predict how structural features of coupling partners and ligands affect Cu-catalyzed C-N coupling reactions. The established workflow underscores the limitations posed by substrates and ligands while also providing a systematic ligand prediction tool that uses probability to assess when a ligand will be successful. This platform is designed to confront the intrinsic unpredictability frequently encountered in synthetic reaction deployment.
ABSTRACT
Hydrogen bond-based organocatalysts rely on networks of attractive noncovalent interactions (NCIs) to impart enantioselectivity. As a specific example, aryl pyrrolidine-substituted urea, thiourea, and squaramide organocatalysts function cooperatively through hydrogen bonding and NCIs that depend on the reaction partners and are difficult to predict. To uncover the synergistic effect of the structural components of this catalyst class, we applied data science tools to study various model reactions using a derivatized, aryl pyrrolidine-based, hydrogen-bond donor (HBD) catalyst library. Through a combination of experimentally collected data and data mined from previous reports, statistical models were constructed, illuminating the general features necessary for high enantioselectivity. A distinct dependence on the identity of the electrophilic reaction partner and HBD catalyst is observed, suggesting that a general interaction is conserved throughout the reactions analyzed. The resulting models also demonstrate predictive capability through the successful improvement of a previously reported reaction using out-of-sample reaction components. Overall, this study highlights the power of data science in exploring mechanistic hypotheses in asymmetric HBD catalysis and provides a prediction platform applicable to future reaction optimization.