Results 1 - 20 of 95
1.
bioRxiv ; 2024 Aug 26.
Article in English | MEDLINE | ID: mdl-39253507

ABSTRACT

The macrodomain contained in the SARS-CoV-2 non-structural protein 3 (NSP3) is required for viral pathogenesis and lethality. Inhibitors that block the macrodomain could be a new therapeutic strategy for viral suppression. We previously performed a large-scale X-ray crystallography-based fragment screen and discovered a sub-micromolar inhibitor by fragment linking. However, this carboxylic acid-containing lead had poor membrane permeability and other liabilities that made optimization difficult. Here, we developed a shape-based virtual screening pipeline - FrankenROCS - to identify new macrodomain inhibitors using fragment X-ray crystal structures. We used FrankenROCS to exhaustively screen the Enamine high-throughput screening (HTS) collection of 2.1 million compounds and selected 39 compounds for testing, with the most potent compound having an IC50 value equal to 130 µM. We then paired FrankenROCS with an active learning algorithm (Thompson sampling) to efficiently search the Enamine REAL database of 22 billion molecules, testing 32 compounds with the most potent having an IC50 equal to 220 µM. Further optimization led to analogs with IC50 values better than 10 µM, with X-ray crystal structures revealing diverse binding modes despite conserved chemical features. These analogs represent a new lead series with improved membrane permeability that is poised for optimization. In addition, the collection of 137 X-ray crystal structures with associated binding data will serve as a resource for the development of structure-based drug discovery methods. FrankenROCS may be a scalable method for fragment linking to exploit ever-growing synthesis-on-demand libraries.

2.
J Histotechnol ; : 1-4, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38648120

ABSTRACT

Hematoxylin and eosin staining can be hazardous, expensive, and prone to error and variability. To circumvent these issues, artificial intelligence/machine learning models such as generative adversarial networks (GANs) are being used to 'virtually' stain unstained tissue images so that they are indistinguishable from chemically stained tissue. Frameworks such as deep convolutional GANs (DCGAN) and conditional GANs (CGANs) have successfully generated highly reproducible 'stained' images. However, their utility may be limited by requiring registered, paired images, which can be difficult to obtain. To avoid these dataset requirements, we attempted to use an unsupervised CycleGAN pix2pix model (5,6) to turn unpaired, unstained bright-field images into pathologist-approved digitally 'stained' images. Using formalin-fixed-paraffin-embedded liver samples, 5µm section images (20x) were obtained before and after staining to create "stained" and "unstained" datasets. Model implementation was conducted using Ubuntu 20.04.4 LTS, 32 GB RAM, Intel Core i7-9750 CPU @2.6 GHz, Nvidia GeForce RTX 2070 Mobile, Python 3.7.11 and TensorFlow 2.9.1. The CycleGAN framework utilized a U-Net-based generator and discriminator from pix2pix, a CGAN. The CycleGAN used a modified loss function, a cycle-consistent loss that assumes unpaired images, so the loss was measured twice. To our knowledge, this is the first documented application of this architecture using unpaired bright-field images. Results and suggested improvements are discussed.
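The cycle-consistent loss mentioned in this abstract can be illustrated with a minimal sketch. It assumes that "measured twice" refers to the two cycle directions (unstained to stained and back, and stained to unstained and back), which is the usual CycleGAN formulation; the abstract does not spell this out. The function names and the toy list-based "images" are hypothetical and are not taken from the paper's implementation.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equal-length pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(unstained, stained, to_stained, to_unstained):
    """Sum of the two reconstruction errors (hypothetical helper, unweighted)."""
    # Forward cycle: unstained -> stained -> back to unstained.
    forward = mean_abs_diff(to_unstained(to_stained(unstained)), unstained)
    # Backward cycle: stained -> unstained -> back to stained.
    backward = mean_abs_diff(to_stained(to_unstained(stained)), stained)
    return forward + backward


# Toy stand-ins for the two generators; real generators would be neural networks.
def add_stain(img):
    return [p + 1.0 for p in img]


def remove_stain(img):
    return [p - 1.0 for p in img]


loss = cycle_consistency_loss([0.0, 1.0, 2.0], [3.0, 4.0], add_stain, remove_stain)
```

Because the two toy generators are exact inverses of each other, the loss here is zero; any mismatch between them would appear in both terms, and neither term needs a registered image pair.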

3.
J Chem Inf Model ; 64(4): 1158-1171, 2024 02 26.
Article in English | MEDLINE | ID: mdl-38316125

ABSTRACT

Over the last five years, virtual screening of ultralarge synthesis on-demand libraries has emerged as a powerful tool for hit identification in drug discovery programs. As these libraries have grown to tens of billions of molecules, we have reached a point where it is no longer cost-effective to screen every molecule virtually. To address these challenges, several groups have developed heuristic search methods to rapidly identify the best molecules on a virtual screen. This article describes the application of Thompson sampling (TS), an active learning approach that streamlines the virtual screening of large combinatorial libraries by performing a probabilistic search in the reagent space, thereby never requiring the full enumeration of the library. TS is a general technique that can be applied to various virtual screening modalities, including 2D and 3D similarity search, docking, and application of machine-learning models. In an illustrative example, we show that TS can identify more than half of the top 100 molecules from a docking-based virtual screen of 335 million molecules by evaluating 1% of the data set.
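The reagent-space search described in this abstract can be sketched roughly as follows. This is an illustrative toy for a two-component library, not the authors' implementation: the function name `thompson_search`, the Gaussian belief with a unit prior width, and the running-mean update are all assumptions made for the example.

```python
import random


def thompson_search(n_a, n_b, score, n_iter, seed=0):
    """Toy Thompson sampling over a two-component library (hypothetical sketch)."""
    rng = random.Random(seed)
    # Per-reagent running statistics: [sum of scores, number of observations].
    stats = [[[0.0, 0] for _ in range(n)] for n in (n_a, n_b)]
    evaluated = {}

    def draw(stat):
        total, count = stat
        mean = total / count if count else 0.0
        # The belief narrows as observations accumulate (assumed unit prior width).
        return rng.gauss(mean, 1.0 / (count + 1) ** 0.5)

    for _ in range(n_iter):
        # Sample one plausible value per reagent, then keep the best reagent
        # in each component; the full product library is never enumerated.
        pick = tuple(
            max(range(len(component)), key=lambda i: draw(component[i]))
            for component in stats
        )
        if pick not in evaluated:
            evaluated[pick] = score(*pick)
        value = evaluated[pick]
        # Update only the reagents that took part in this product.
        for component, idx in zip(stats, pick):
            component[idx][0] += value
            component[idx][1] += 1
    return evaluated
```

A call such as `thompson_search(50, 50, my_score, n_iter=200)` scores at most 200 of the 2,500 possible products, where `my_score` is a placeholder for any scoring function (docking, similarity, or a machine-learning model) in which higher is better.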


Subject(s)
Databases, Chemical , Drug Discovery , Drug Discovery/methods
4.
ACS Sens ; 8(8): 2945-2951, 2023 08 25.
Article in English | MEDLINE | ID: mdl-37581255

ABSTRACT

Chemical weapons continue to be an ongoing threat that necessitates the improvement of existing detection technologies where new technologies are absent. Lower limits of detection will facilitate early warning of exposure to chemical weapons and enable more rapid deployment of countermeasures. Here, we evaluate two colorimetric gas detection tubes, developed by Draeger Inc., for sarin and sulfur mustard chemical warfare agents and determine their limits of detection using active chemical agent. Because commercial companies can use only chemical agent simulants during sensor development, it is imperative to determine limits of detection using active agent. The limit of detection was determined based on the absence of a reasonably perceptible color response at incrementally lower concentrations. A chemical vapor generator was constructed to produce stable and quantifiable concentrations of chemical agent vapor, with the presence of chemical agent verified and monitored by a secondary detector. The limits of detection of the colorimetric gas detection tubes were determined to be 0.0046 ± 0.0002 and 2.1 ± 0.3 mg/m3 for sarin and sulfur mustard, respectively. The response of the sarin detection tube was readily observable with little issue. The sulfur mustard detection tube exhibited a weaker response to active agent compared to the simulant that was used during development, which will affect its concept of operations in real-world detection scenarios.


Subject(s)
Chemical Warfare Agents , Mustard Gas , Chemical Warfare Agents/analysis , Mustard Gas/analysis , Sarin , Limit of Detection , Colorimetry , Gases
6.
Sci Adv ; 8(36): eabq0279, 2022 Sep 09.
Article in English | MEDLINE | ID: mdl-36083906

ABSTRACT

Systematic development of accurate density functionals has been a decades-long challenge for scientists. Despite emerging applications of machine learning (ML) in approximating functionals, the resulting ML functionals usually contain more than tens of thousands of parameters, leading to a huge gap in the formulation with the conventional human-designed symbolic functionals. We propose a new framework, Symbolic Functional Evolutionary Search (SyFES), that automatically constructs accurate functionals in the symbolic form, which is more explainable to humans, cheaper to evaluate, and easier to integrate to existing codes than other ML functionals. We first show that, without prior knowledge, SyFES reconstructed a known functional from scratch. We then demonstrate that evolving from an existing functional ωB97M-V, SyFES found a new functional, GAS22 (Google Accelerated Science 22), that performs better for most of the molecular types in the test set of Main Group Chemistry Database (MGCDB84). Our framework opens a new direction in leveraging computing power for the systematic development of symbolic density functionals.

7.
Sci Rep ; 12(1): 14004, 2022 08 17.
Article in English | MEDLINE | ID: mdl-35978031

ABSTRACT

Breast cancer is the most commonly diagnosed female malignancy globally, with better survival rates if diagnosed early. Mammography is the gold standard in screening programmes for breast cancer, but despite technological advances, high error rates are still reported. Machine learning techniques, and in particular deep learning (DL), have been successfully used for breast cancer detection and classification. However, the added complexity that makes DL models so successful reduces their ability to explain which features are relevant to the model, or whether the model is biased. The main aim of this study is to propose a novel visualisation to help characterise breast cancer patients using Fisher Information Networks on features extracted from mammograms using a DL model. In the proposed visualisation, patients are mapped out according to their similarities and can be used to study new patients as a 'patient-like-me' approach. When applied to the CBIS-DDSM dataset, it was shown that it is a competitive methodology that can (i) facilitate the analysis and decision-making process in breast cancer diagnosis with the assistance of the FIN visualisations and 'patient-like-me' analysis, and (ii) help improve diagnostic accuracy and reduce overdiagnosis by identifying the most likely diagnosis based on clinical similarities with neighbouring patients.


Subject(s)
Breast Neoplasms , Deep Learning , Breast/pathology , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/pathology , Female , Humans , Information Services , Mammography/methods
8.
Nat Rev Chem ; 6(4): 287-295, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35783295

ABSTRACT

One aspirational goal of computational chemistry is to predict potent and drug-like binders for any protein, such that only those that bind are synthesized. In this Roadmap, we describe the launch of Critical Assessment of Computational Hit-finding Experiments (CACHE), a public benchmarking project to compare and improve small molecule hit-finding algorithms through cycles of prediction and experimental testing. Participants will predict small molecule binders for new and biologically relevant protein targets representing different prediction scenarios. Predicted compounds will be tested rigorously in an experimental hub, and all predicted binders as well as all experimental screening data, including the chemical structures of experimentally tested compounds, will be made publicly available, and not subject to any intellectual property restrictions. The ability of a range of computational approaches to find novel binders will be evaluated, compared, and openly published. CACHE will launch 3 new benchmarking exercises every year. The outcomes will be better prediction methods, new small molecule binders for target proteins of importance for fundamental biology or drug discovery, and a major technological step towards achieving the goal of Target 2035, a global initiative to identify pharmacological probes for all human proteins.

9.
J Med Chem ; 65(10): 7073-7087, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35511951

ABSTRACT

One application area of computational methods in drug discovery is the automated design of small molecules. Despite the large number of publications describing methods and their application in both retrospective and prospective studies, there is a lack of agreement on terminology and key attributes to distinguish these various systems. We introduce Automated Chemical Design (ACD) Levels to clearly define the level of autonomy along the axes of ideation and decision making. To fully illustrate this framework, we provide literature exemplars and place some notable methods and applications into the levels. The ACD framework provides a common language for describing automated small molecule design systems and enables medicinal chemists to better understand and evaluate such systems.


Subject(s)
Drug Discovery , Drug Discovery/methods , Prospective Studies , Retrospective Studies
10.
Brain Stimul ; 15(3): 586-597, 2022.
Article in English | MEDLINE | ID: mdl-35395424

ABSTRACT

BACKGROUND: Modulation of pathological neural circuit activity in the brain with a minimum of complications is an area of intense interest. OBJECTIVE: The goal of the study was to alter neurons' physiological states without apparent damage of cellular integrity using stereotactic radiosurgery (SRS). METHODS: We treated a 7.5 mm-diameter target on the visual cortex of Göttingen minipigs with doses of 40, 60, 80, and 100 Gy. Six months post-irradiation, the pigs were implanted with a 9 mm-wide, eight-shank multi-electrode probe, which spanned the radiation focus as well as the low-exposure neighboring areas. RESULTS: Doses of 40 Gy led to an increase of spontaneous firing rate, six months post-irradiation, while doses of 60 Gy and greater were associated with a decrease. Subjecting the animals to visual stimuli resulted in typical visual evoked potentials (VEP). At 40 Gy, a significant reduction of the P1 peak time, indicative of higher network excitability, was observed. At 80 Gy, P1 peak time was not affected, while a minor reduction at 60 Gy was seen. No distance-dependent effects on spontaneous firing rate, or on VEP were observed. Post-mortem histology revealed no evidence of necrosis at doses below 60 Gy. In an in vitro assay comprising iPS-derived human neuron-astrocyte co-cultures, we found a higher vulnerability of inhibitory neurons than excitatory neurons with respect to radiation, which might provide the cellular mechanism of the disinhibitory effect observed in vivo. CONCLUSION: We provide initial evidence for a rather circuit-wide, long-lasting disinhibitory effect of low sub-ablative doses of SRS.


Subject(s)
Evoked Potentials, Visual , Radiosurgery , Animals , Brain , Radiation, Ionizing , Radiosurgery/methods , Swine , Swine, Miniature
11.
Proc Natl Acad Sci U S A ; 118(37)2021 09 14.
Article in English | MEDLINE | ID: mdl-34508002

ABSTRACT

The quest to identify materials with tailored properties is increasingly expanding into high-order composition spaces, with a corresponding combinatorial explosion in the number of candidate materials. A key challenge is to discover regions in composition space where materials have novel properties. Traditional predictive models for material properties are not accurate enough to guide the search. Herein, we use high-throughput measurements of optical properties to identify novel regions in three-cation metal oxide composition spaces by identifying compositions whose optical trends cannot be explained by simple phase mixtures. We screen 376,752 distinct compositions from 108 three-cation oxide systems based on the cation elements Mg, Fe, Co, Ni, Cu, Y, In, Sn, Ce, and Ta. Data models for candidate phase diagrams and three-cation compositions with emergent optical properties guide the discovery of materials with complex phase-dependent properties, as demonstrated by the discovery of a Co-Ta-Sn substitutional alloy oxide with tunable transparency, catalytic activity, and stability in strong acid electrolytes. These results required close coupling of data validation to experiment design to generate a reliable end-to-end high-throughput workflow for accelerating scientific discovery.

12.
iScience ; 24(4): 102262, 2021 Apr 23.
Article in English | MEDLINE | ID: mdl-33817570

ABSTRACT

Autonomous experimentation (AE) accelerates research by combining automation and machine learning to perform experiments intelligently and rapidly in a sequential fashion. While AE systems are most needed to study properties that cannot be predicted analytically or computationally, even imperfect predictions can in principle be useful. Here, we investigate whether imperfect data from simulation can accelerate AE using a case study on the mechanics of additively manufactured structures. Initially, we study resilience, a property that is well-predicted by finite element analysis (FEA), and find that FEA can be used to build a Bayesian prior and experimental data can be integrated using discrepancy modeling to reduce the number of needed experiments ten-fold. Next, we study toughness, a property not well-predicted by FEA and find that FEA can still improve learning by transforming experimental data and guiding experiment selection. These results highlight multiple ways that simulation can improve AE through transfer learning.

13.
Phys Rev Lett ; 126(3): 036401, 2021 Jan 22.
Article in English | MEDLINE | ID: mdl-33543980

ABSTRACT

Including prior knowledge is important for effective machine learning models in physics and is usually achieved by explicitly adding loss terms or constraints on model architectures. Prior knowledge embedded in the physics computation itself rarely draws attention. We show that solving the Kohn-Sham equations when training neural networks for the exchange-correlation functional provides an implicit regularization that greatly improves generalization. Two separations suffice for learning the entire one-dimensional H2 dissociation curve within chemical accuracy, including the strongly correlated region. Our models also generalize to unseen types of molecules and overcome self-interaction error.

14.
Nat Biotechnol ; 39(6): 691-696, 2021 06.
Article in English | MEDLINE | ID: mdl-33574611

ABSTRACT

Modern experimental technologies can assay large numbers of biological sequences, but engineered protein libraries rarely exceed the sequence diversity of natural protein families. Machine learning (ML) models trained directly on experimental data without biophysical modeling provide one route to accessing the full potential diversity of engineered proteins. Here we apply deep learning to design highly diverse adeno-associated virus 2 (AAV2) capsid protein variants that remain viable for packaging of a DNA payload. Focusing on a 28-amino acid segment, we generated 201,426 variants of the AAV2 wild-type (WT) sequence yielding 110,689 viable engineered capsids, 57,348 of which surpass the average diversity of natural AAV serotype sequences, with 12-29 mutations across this region. Even when trained on limited data, deep neural network models accurately predict capsid viability across diverse variants. This approach unlocks vast areas of functional but previously unreachable sequence space, with many potential applications for the generation of improved viral vectors and protein therapeutics.


Subject(s)
Capsid Proteins/genetics , Dependovirus/genetics , Machine Learning , Genetic Vectors , HeLa Cells , Humans
16.
Sci Rep ; 10(1): 10478, 2020 Jun 23.
Article in English | MEDLINE | ID: mdl-32572065

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

17.
J Med Chem ; 63(16): 8857-8866, 2020 08 27.
Article in English | MEDLINE | ID: mdl-32525674

ABSTRACT

DNA-encoded small molecule libraries (DELs) have enabled discovery of novel inhibitors for many distinct protein targets of therapeutic value. We demonstrate a new approach applying machine learning to DEL selection data by identifying active molecules from large libraries of commercial and easily synthesizable compounds. We train models using only DEL selection data and apply automated or automatable filters to the predictions. We perform a large prospective study (∼2000 compounds) across three diverse protein targets: sEH (a hydrolase), ERα (a nuclear receptor), and c-KIT (a kinase). The approach is effective, with an overall hit rate of ∼30% at 30 µM and discovery of potent compounds (IC50 < 10 nM) for every target. The system makes useful predictions even for molecules dissimilar to the original DEL, and the compounds identified are diverse, predominantly drug-like, and different from known ligands. This work demonstrates a powerful new approach to hit-finding.


Subject(s)
DNA/chemistry , Drug Discovery/methods , Neural Networks, Computer , Small Molecule Libraries/chemistry , Epoxide Hydrolases/antagonists & inhibitors , Estrogen Receptor alpha/antagonists & inhibitors , Ligands , Protein Kinase Inhibitors/chemistry , Proto-Oncogene Proteins c-kit/antagonists & inhibitors
18.
Sci Adv ; 6(15): eaaz1708, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32300652

ABSTRACT

While additive manufacturing (AM) has facilitated the production of complex structures, it has also highlighted the immense challenge inherent in identifying the optimum AM structure for a given application. Numerical methods are important tools for optimization, but experiment remains the gold standard for studying nonlinear, but critical, mechanical properties such as toughness. To address the vastness of AM design space and the need for experiment, we develop a Bayesian experimental autonomous researcher (BEAR) that combines Bayesian optimization and high-throughput automated experimentation. In addition to rapidly performing experiments, the BEAR leverages iterative experimentation by selecting experiments based on all available results. Using the BEAR, we explore the toughness of a parametric family of structures and observe an almost 60-fold reduction in the number of experiments needed to identify high-performing structures relative to a grid-based search. These results show the value of machine learning in experimental fields where data are sparse.

19.
Nature ; 572(7767): 27-29, 2019 08.
Article in English | MEDLINE | ID: mdl-31363197
20.
PLoS One ; 14(8): e0220809, 2019.
Article in English | MEDLINE | ID: mdl-31415601

ABSTRACT

Glioblastoma is the most frequent malignant intra-cranial tumour. Magnetic resonance imaging is the modality of choice in diagnosis, aggressiveness assessment, and follow-up. However, there are examples where it lacks diagnostic accuracy. Magnetic resonance spectroscopy enables the identification of molecules present in the tissue, providing a precise metabolomic signature. Previous research shows that combining imaging and spectroscopy information results in more accurate outcomes and superior diagnostic value. This study proposes a method to combine them, which builds upon a previous methodology whose main objective is to guide the extraction of sources. To this aim, prior knowledge about class-specific information is integrated into the methodology by setting the metric of a latent variable space where Non-negative Matrix Factorisation is performed. The former methodology, which only used spectroscopy and involved combining spectra from different subjects, was adapted to use selected areas of interest that arise from segmenting the T2-weighted image. Results showed that embedding imaging information into the source extraction (the proposed semi-supervised analysis) improved the quality of the tumour delineation, as compared to those obtained without this information (unsupervised analysis). Both approaches were applied to pre-clinical data, involving thirteen brain tumour-bearing mice, and tested against histopathological data. On results of twenty-eight images, the proposed Semi-Supervised Source Extraction (SSSE) method greatly outperformed the unsupervised one, as well as an alternative semi-supervised approach from the literature, with differences being statistically significant. 
SSSE has proven successful in the delineation of the tumour, while bringing benefits such as 1) not constricting the metabolomic-based prediction to the image-segmented area, 2) ability to deal with signal-to-noise issues, 3) opportunity to answer specific questions by allowing researchers/radiologists to define areas of interest that guide the source extraction, 4) creation of an intra-subject model and avoiding contamination from inter-subject overlaps, and 5) extraction of meaningful, good-quality sources that add interpretability, conferring validation and better understanding of each case.


Subject(s)
Brain Neoplasms/diagnostic imaging , Brain/diagnostic imaging , Glioblastoma/diagnostic imaging , Magnetic Resonance Imaging/methods , Magnetic Resonance Spectroscopy/methods , Animals , Disease Models, Animal , Mice , Retrospective Studies