Results 1 - 17 of 17
1.
Nature ; 590(7844): 89-96, 2021 02.
Article in English | MEDLINE | ID: mdl-33536653

ABSTRACT

Reaction optimization is fundamental to synthetic chemistry, from optimizing the yield of industrial processes to selecting conditions for the preparation of medicinal candidates [1]. Likewise, parameter optimization is omnipresent in artificial intelligence, from tuning virtual personal assistants to training social media and product recommendation systems [2]. Owing to the high cost associated with carrying out experiments, scientists in both areas set numerous (hyper)parameter values by evaluating only a small subset of the possible configurations. Bayesian optimization, an iterative response surface-based global optimization algorithm, has demonstrated exceptional performance in the tuning of machine learning models [3]. Bayesian optimization has also been recently applied in chemistry [4-9]; however, its application and assessment for reaction optimization in synthetic chemistry has not been investigated. Here we report the development of a framework for Bayesian reaction optimization and an open-source software tool that allows chemists to easily integrate state-of-the-art optimization algorithms into their everyday laboratory practices. We collect a large benchmark dataset for a palladium-catalysed direct arylation reaction, perform a systematic study of Bayesian optimization compared to human decision-making in reaction optimization, and apply Bayesian optimization to two real-world optimization efforts (Mitsunobu and deoxyfluorination reactions). Benchmarking is accomplished via an online game that links the decisions made by expert chemists and engineers to real experiments run in the laboratory. Our findings demonstrate that Bayesian optimization outperforms human decision-making in both average optimization efficiency (number of experiments) and consistency (variance of outcome against initially available data). Overall, our studies suggest that adopting Bayesian optimization methods into everyday laboratory practices could facilitate more efficient synthesis of functional chemicals by enabling better-informed, data-driven decisions about which experiments to run.
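
Illustrative note: in Bayesian optimization, each iteration typically scores untested conditions with an acquisition function and runs the highest-scoring one next. The abstract does not name the acquisition function used, so the sketch below assumes expected improvement, a common choice. The candidate names, predicted means, and standard deviations are hypothetical placeholders for what a surrogate model would supply; they are not data from the paper.

```python
import math

def expected_improvement(mean, std, best_so_far, xi=0.0):
    """Expected improvement over the best observed value (maximization).

    mean, std: surrogate model prediction for one candidate (assumed inputs).
    xi: optional exploration margin (an assumption, defaults to none).
    """
    gain = mean - best_so_far - xi
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(gain, 0.0)
    z = gain / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return gain * cdf + std * pdf

# Hypothetical predictions (mean yield %, std) for three untested conditions.
candidates = {"A": (70.0, 2.0), "B": (65.0, 15.0), "C": (72.0, 0.5)}
best_yield = 71.0  # best yield observed so far (illustrative)

scores = {name: expected_improvement(m, s, best_yield)
          for name, (m, s) in candidates.items()}
next_experiment = max(scores, key=scores.get)
print(next_experiment, scores)
```

In this toy case the uncertain candidate "B" scores highest even though its predicted mean is lowest, which shows how the criterion trades exploitation against exploration.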


Subjects
Bayes Theorem , Chemistry Techniques, Synthetic/methods , Algorithms , Datasets as Topic , Decision Making , Halogenation , Palladium/chemistry , Reproducibility of Results
2.
J Am Chem Soc ; 144(43): 19999-20007, 2022 11 02.
Article in English | MEDLINE | ID: mdl-36260788

ABSTRACT

We report the development of an open-source experimental design via Bayesian optimization platform for multi-objective reaction optimization. Using high-throughput experimentation (HTE) and virtual screening data sets containing high-dimensional continuous and discrete variables, we optimized the performance of the platform by fine-tuning the algorithm components such as reaction encodings, surrogate model parameters, and initialization techniques. Having established the framework, we applied the optimizer to real-world test scenarios for the simultaneous optimization of the reaction yield and enantioselectivity in a Ni/photoredox-catalyzed enantioselective cross-electrophile coupling of styrene oxide with two different aryl iodide substrates. Starting with no previous experimental data, the Bayesian optimizer identified reaction conditions that surpassed the previous human-driven optimization campaigns within 15 and 24 experiments for the two substrates, respectively, among 1728 possible configurations available in each optimization. To make the platform more accessible to nonexperts, we developed a graphical user interface (GUI) that can be accessed online through a web-based application and incorporated features such as condition modification on the fly and data visualization. This web application does not require software installation, removing any programming barrier to use the platform, which enables chemists to integrate Bayesian optimization routines into their everyday laboratory practices.
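
Illustrative note: when two objectives such as yield and enantioselectivity are optimized together, results are often compared through their Pareto front, the set of outcomes that no other outcome beats on both objectives. The abstract does not state how the platform handles the trade-off, so the sketch below is only an assumption about one common way to summarize such results. The (yield, ee) pairs are made-up values.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Both objectives are maximized. A point q dominates p when q is at least
    as good on both objectives and strictly better on at least one.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Made-up (yield %, ee %) outcomes for five hypothetical experiments.
results = [(40, 90), (60, 85), (55, 70), (75, 60), (60, 80)]
print(pareto_front(results))
```

The quadratic scan is adequate for a search space of a few thousand configurations, such as the 1728 mentioned above.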


Subjects
Mobile Applications , Humans , Bayes Theorem , Software
3.
Soft Matter ; 16(32): 7524-7534, 2020 Aug 19.
Article in English | MEDLINE | ID: mdl-32700724

ABSTRACT

Cellular mechanical metamaterials are a special class of materials whose mechanical properties are primarily determined by their geometry. However, capturing the nonlinear mechanical behavior of these materials, especially those with complex geometries and under large deformation, can be challenging due to inherent computational complexity. In this work, we propose a data-driven multiscale computational scheme as a possible route to resolve this challenge. We use a neural network to approximate the effective strain energy density as a function of cellular geometry and overall deformation. The network is constructed by "learning" from the data generated by finite element calculation of a set of representative volume elements at cellular scales. This effective strain energy density is then used to predict the mechanical responses of cellular materials at larger scales. Compared with direct finite element simulation, the proposed scheme can reduce the computational time by up to two orders of magnitude. Potentially, this scheme can facilitate new optimization algorithms for designing cellular materials with highly specific mechanical properties.


Subjects
Algorithms , Computer Simulation , Finite Element Analysis , Stress, Mechanical
4.
Nat Mater ; 15(10): 1120-7, 2016 10.
Article in English | MEDLINE | ID: mdl-27500805

ABSTRACT

Virtual screening is becoming a ground-breaking tool for molecular discovery due to the exponential growth of available computer time and constant improvement of simulation and machine learning techniques. We report an integrated organic functional material design process that incorporates theoretical insight, quantum chemistry, cheminformatics, machine learning, industrial expertise, organic synthesis, molecular characterization, device fabrication and optoelectronic testing. After exploring a search space of 1.6 million molecules and screening over 400,000 of them using time-dependent density functional theory, we identified thousands of promising novel organic light-emitting diode molecules across the visible spectrum. Our team collaboratively selected the best candidates from this set. The experimentally determined external quantum efficiencies for these synthesized candidates were as large as 22%.

5.
Cogn Sci ; 47(4): e13262, 2023 04.
Article in English | MEDLINE | ID: mdl-37051879

ABSTRACT

Humans can learn complex functional relationships between variables from small amounts of data. In doing so, they draw on prior expectations about the form of these relationships. In three experiments, we show that people learn to adjust these expectations through experience, learning about the likely forms of the functions they will encounter. Previous work has used Gaussian processes (a statistical framework that extends Bayesian nonparametric approaches to regression) to model human function learning. We build on this work, modeling the process of learning to learn functions as a form of hierarchical Bayesian inference about the Gaussian process hyperparameters.
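
Illustrative note: a Gaussian process encodes expectations about function shape through its covariance kernel, and the kernel's hyperparameters control those expectations. The abstract does not say which kernel was used, so the sketch below assumes the widely used squared-exponential (RBF) kernel. The parameter names `length_scale` and `signal_variance` are labels chosen here for illustration.

```python
import math

def rbf_kernel(x1, x2, length_scale=1.0, signal_variance=1.0):
    """Squared-exponential covariance between function values at x1 and x2."""
    return signal_variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

# Covariance between points one unit apart under two prior expectations:
# a short length scale expects rapidly varying functions (low covariance),
# a long length scale expects smooth functions (high covariance).
short = rbf_kernel(0.0, 1.0, length_scale=0.5)
long_ = rbf_kernel(0.0, 1.0, length_scale=5.0)
print(short, long_)
```

Learning which length scales are typical across many functions is one way to picture inference about hyperparameters.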


Subjects
Learning , Models, Psychological , Humans , Bayes Theorem , Normal Distribution
6.
ACS Cent Sci ; 5(4): 700-708, 2019 Apr 24.
Article in English | MEDLINE | ID: mdl-31041390

ABSTRACT

When confronted with a substance of unknown identity, researchers often perform mass spectrometry on the sample and compare the observed spectrum to a library of previously collected spectra to identify the molecule. While popular, this approach will fail to identify molecules that are not in the existing library. In response, we propose to improve the library's coverage by augmenting it with synthetic spectra that are predicted from candidate molecules using machine learning. We contribute a lightweight neural network model that quickly predicts mass spectra for small molecules, averaging 5 ms per molecule with a recall-at-10 accuracy of 91.8%. Achieving high-accuracy predictions requires a novel neural network architecture that is designed to capture typical fragmentation patterns from electron ionization. We analyze the effects of our modeling innovations on library matching performance and compare our models to prior machine-learning-based work on spectrum prediction.
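
Illustrative note: library matching ranks reference spectra by similarity to the observed spectrum, and recall-at-k is the fraction of queries whose true molecule appears among the top k matches. The abstract does not specify the similarity measure, so the sketch below assumes plain cosine similarity on binned intensity vectors. The molecule names and intensities are toy values.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_library(query, library):
    """Library keys sorted from most to least similar to the query spectrum."""
    return sorted(library,
                  key=lambda name: cosine_similarity(query, library[name]),
                  reverse=True)

def recall_at_k(queries, library, k):
    """Fraction of (true_name, spectrum) queries matched within the top k."""
    hits = sum(1 for true_name, spectrum in queries
               if true_name in rank_library(spectrum, library)[:k])
    return hits / len(queries)

# Toy library of binned spectra (hypothetical molecules and intensities).
library = {
    "mol_a": [1.0, 0.0, 0.0, 0.5],
    "mol_b": [0.0, 1.0, 0.2, 0.0],
    "mol_c": [0.0, 0.0, 1.0, 1.0],
}
queries = [
    ("mol_a", [0.9, 0.1, 0.0, 0.4]),
    ("mol_b", [0.0, 0.8, 0.3, 0.1]),
    ("mol_c", [0.7, 0.0, 0.3, 0.6]),  # noisy spectrum that resembles mol_a
]
print(recall_at_k(queries, library, 1), recall_at_k(queries, library, 2))
```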

7.
ACS Cent Sci ; 4(2): 268-276, 2018 Feb 28.
Article in English | MEDLINE | ID: mdl-29532027

ABSTRACT

We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration and optimization through open-ended spaces of chemical compounds. A deep neural network was trained on hundreds of thousands of existing chemical structures to construct three coupled functions: an encoder, a decoder, and a predictor. The encoder converts the discrete representation of a molecule into a real-valued continuous vector, and the decoder converts these continuous vectors back to discrete molecular representations. The predictor estimates chemical properties from the latent continuous vector representation of the molecule. Continuous representations of molecules allow us to automatically generate novel chemical structures by performing simple operations in the latent space, such as decoding random vectors, perturbing known chemical structures, or interpolating between molecules. Continuous representations also allow the use of powerful gradient-based optimization to efficiently guide the search for optimized functional compounds. We demonstrate our method in the domain of drug-like molecules and also in a set of molecules with fewer than nine heavy atoms.
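
Illustrative note: interpolating between molecules in the latent space amounts to generating intermediate vectors between two encoded molecules and passing each one to the decoder. The sketch below shows only the interpolation step, assuming simple linear interpolation. The two-dimensional vectors are placeholders, and the trained encoder and decoder are not implemented here.

```python
def interpolate(z_start, z_end, steps):
    """Linearly interpolate between two latent vectors, endpoints included."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    return [
        [a + (b - a) * i / (steps - 1) for a, b in zip(z_start, z_end)]
        for i in range(steps)
    ]

# Placeholder latent vectors standing in for two encoded molecules.
z_a = [0.0, 2.0]
z_b = [1.0, 0.0]
path = interpolate(z_a, z_b, 3)
print(path)
# Each vector in `path` would then be passed to the trained decoder
# (not shown) to obtain a candidate molecular representation.
```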

8.
Nat Genet ; 50(10): 1483-1493, 2018 10.
Article in English | MEDLINE | ID: mdl-30177862

ABSTRACT

Biological interpretation of genome-wide association study data frequently involves assessing whether SNPs linked to a biological process, for example, binding of a transcription factor, show unsigned enrichment for disease signal. However, signed annotations quantifying whether each SNP allele promotes or hinders the biological process can enable stronger statements about disease mechanism. We introduce a method, signed linkage disequilibrium profile regression, for detecting genome-wide directional effects of signed functional annotations on disease risk. We validate the method via simulations and application to molecular quantitative trait loci in blood, recovering known transcriptional regulators. We apply the method to expression quantitative trait loci in 48 Genotype-Tissue Expression tissues, identifying 651 transcription factor-tissue associations including 30 with robust evidence of tissue specificity. We apply the method to 46 diseases and complex traits (average n = 290 K), identifying 77 annotation-trait associations representing 12 independent transcription factor-trait associations, and characterize the underlying transcriptional programs using gene-set enrichment analyses. Our results implicate new causal disease genes and new disease mechanisms.


Subjects
Disease/genetics , Genome-Wide Association Study , Multifactorial Inheritance/genetics , Quantitative Trait Loci , Transcription Factors/metabolism , Binding Sites/genetics , Blood Cells/metabolism , Blood Cells/pathology , Blood Chemical Analysis , Gene Expression Regulation , Genetic Predisposition to Disease , Humans , Linkage Disequilibrium , Phenotype , Polymorphism, Single Nucleotide , Protein Binding , Risk Factors
9.
Lab Chip ; 16(20): 3929-3939, 2016 10 05.
Article in English | MEDLINE | ID: mdl-27713998

ABSTRACT

Iron deficiency anemia (IDA) is a nutritional disorder that impacts over one billion people worldwide; it may cause permanent cognitive impairment in children, fatigue in adults, and suboptimal outcomes in pregnancy. IDA can be diagnosed by detection of red blood cells (RBCs) that are characteristically small (microcytic) and deficient in hemoglobin (hypochromic), typically by examining the results of a complete blood count performed by a hematology analyzer. These instruments are expensive, not portable, and require trained personnel; they are, therefore, unavailable in many low-resource settings. This paper describes a low-cost and rapid method to diagnose IDA using aqueous multiphase systems (AMPS): thermodynamically stable mixtures of biocompatible polymers and salt that spontaneously form discrete layers having sharp steps in density. AMPS are preloaded into a microhematocrit tube and used with a drop of blood from a fingerstick. After only two minutes in a low-cost centrifuge, the tests (n = 152) were read by eye with a sensitivity of 84% (72-93%) and a specificity of 78% (68-86%), corresponding to an area under the curve (AUC) of 0.89. The AMPS test outperforms diagnosis by hemoglobin alone (AUC = 0.73) and is comparable to methods used in clinics like reticulocyte hemoglobin concentration (AUC = 0.91). Standard machine learning tools were used to analyze images of the resulting tests captured by a standard desktop scanner to 1) slightly improve diagnosis of IDA (sensitivity of 90% (83-96%) and specificity of 77% (64-87%)), and 2) predict several important red blood cell parameters, such as mean corpuscular hemoglobin concentration. These results suggest that the use of AMPS combined with machine learning provides an approach to developing point-of-care hematology.
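
Illustrative note: sensitivity is the fraction of true cases that a test flags, and specificity is the fraction of non-cases that it clears. The sketch below computes both from confusion-matrix counts. The counts are invented round numbers chosen to make the arithmetic easy to follow; they are not the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased subjects that the test correctly flags."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of healthy subjects that the test correctly clears."""
    return true_neg / (true_neg + false_pos)

# Invented counts for a hypothetical evaluation of a diagnostic test.
tp, fn = 45, 5    # diseased subjects: detected / missed
tn, fp = 80, 20   # healthy subjects: cleared / falsely flagged

print(sensitivity(tp, fn), specificity(tn, fp))
```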


Subjects
Anemia, Iron-Deficiency/blood , Anemia, Iron-Deficiency/diagnosis , Cell Fractionation , Erythrocytes/pathology , Case-Control Studies , Cell Size , Humans , Machine Learning
10.
Neuron ; 88(6): 1121-1135, 2015 Dec 16.
Article in English | MEDLINE | ID: mdl-26687221

ABSTRACT

Complex animal behaviors are likely built from simpler modules, but their systematic identification in mammals remains a significant challenge. Here we use depth imaging to show that 3D mouse pose dynamics are structured at the sub-second timescale. Computational modeling of these fast dynamics effectively describes mouse behavior as a series of reused and stereotyped modules with defined transition probabilities. We demonstrate that this combined 3D imaging and machine learning method can be used to unmask potential strategies employed by the brain to adapt to the environment, to capture both predicted and previously hidden phenotypes caused by genetic or neural manipulations, and to systematically expose the global structure of behavior within an experiment. This work reveals that mouse body language is built from identifiable components and is organized in a predictable fashion; deciphering this language establishes an objective framework for characterizing the influence of environmental cues, genes and neural activity on behavior.
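
Illustrative note: once behavior has been segmented into a sequence of module labels, transition probabilities can be estimated by counting how often each module follows another. The sketch below assumes the segmentation is already available and uses simple normalized counts, which is a simplification rather than the paper's modeling procedure. The module names are hypothetical.

```python
from collections import Counter, defaultdict

def transition_probabilities(labels):
    """Estimate P(next module | current module) from a label sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(labels, labels[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(successors.values())
                for nxt, n in successors.items()}
        for state, successors in counts.items()
    }

# Hypothetical sequence of behavioral module labels.
sequence = ["walk", "rear", "walk", "groom", "walk", "rear"]
probs = transition_probabilities(sequence)
print(probs)
```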


Subjects
Behavior, Animal , Imaging, Three-Dimensional/methods , Kinesics , Machine Learning , Optogenetics/methods , Animals , Computer Simulation , Imaging, Three-Dimensional/instrumentation , Male , Mice , Mice, Inbred C57BL , Mice, Transgenic , Optogenetics/instrumentation
11.
IEEE J Biomed Health Inform ; 19(3): 1068-76, 2015 May.
Article in English | MEDLINE | ID: mdl-25014976

ABSTRACT

Cardiovascular variables such as heart rate (HR) and blood pressure (BP) are regulated by an underlying control system, and therefore, the time series of these vital signs exhibit rich dynamical patterns of interaction in response to external perturbations (e.g., drug administration), as well as pathological states (e.g., onset of sepsis and hypotension). A question of interest is whether "similar" dynamical patterns can be identified across a heterogeneous patient cohort, and be used for prognosis of patients' health and progress. In this paper, we used a switching vector autoregressive framework to systematically learn and identify a collection of vital sign time series dynamics, which are possibly recurrent within the same patient and may be shared across the entire cohort. We show that these dynamical behaviors can be used to characterize the physiological "state" of a patient. We validate our technique using simulated time series of the cardiovascular system, and human recordings of HR and BP time series from an orthostatic stress study with known postural states. Using the HR and BP dynamics of an intensive care unit (ICU) cohort of over 450 patients from the MIMIC II database, we demonstrate that the discovered cardiovascular dynamics are significantly associated with hospital mortality (dynamic modes 3 and 9, p=0.001, p=0.006 from logistic regression after adjusting for the APACHE scores). Combining the dynamics of BP time series and SAPS-I or APACHE-III provided a more accurate assessment of patient survival/mortality in the hospital than using SAPS-I and APACHE-III alone (p=0.005 and p=0.045). Our results suggest that the discovered dynamics of vital sign time series may contain additional prognostic value beyond that of the baseline acuity measures, and can potentially be used as an independent predictor of outcomes in the ICU.
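
Illustrative note: a switching autoregressive model assumes that a time series is generated by a small set of dynamical modes, with the active mode changing over time. The sketch below is a deliberately simplified scalar, noise-free stand-in: it generates a series from two AR(1) modes and then labels each step with the mode that best predicts it. The coefficients and mode sequence are invented, and the greedy labeling is not the inference procedure used in the paper.

```python
def simulate_switching_ar1(coeffs, mode_sequence, x0):
    """Noise-free series where x[t+1] = coeffs[mode[t]] * x[t]."""
    series = [x0]
    for mode in mode_sequence:
        series.append(coeffs[mode] * series[-1])
    return series

def assign_modes(series, coeffs):
    """Label each step with the mode whose one-step prediction error is smallest."""
    labels = []
    for prev, nxt in zip(series, series[1:]):
        errors = [abs(nxt - a * prev) for a in coeffs]
        labels.append(errors.index(min(errors)))
    return labels

# Two invented modes: a decaying regime (0.5) and a slowly growing regime (1.1).
coeffs = [0.5, 1.1]
true_modes = [0, 0, 1, 1, 0]
series = simulate_switching_ar1(coeffs, true_modes, x0=8.0)
recovered = assign_modes(series, coeffs)
print(series, recovered)
```

With measurement noise, unknown coefficients, and vector-valued signals such as joint HR and BP, the modes and their parameters have to be learned jointly, which is what makes the full problem hard.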


Subjects
Health Status Indicators , Models, Statistical , Monitoring, Physiologic/methods , Adult , Algorithms , Blood Pressure/physiology , Databases, Factual , Female , Heart Rate/physiology , Hospital Mortality , Humans , Intensive Care Units , Male , Medical Informatics , Prognosis , Reproducibility of Results , Tilt-Table Test
12.
Article in English | MEDLINE | ID: mdl-24111382

ABSTRACT

Model identification for physiological systems is complicated by changes between operating regimes and measurement artifacts. We present a solution to these problems by assuming that a cohort of physiological time series is generated by switching among a finite collection of physiologically-constrained dynamical models and artifactual segments. We model the resulting time series using the switching linear dynamical systems (SLDS) framework, and present a novel learning algorithm for the class of SLDS, with the objective of identifying time series dynamics that are predictive of physiological regimes or outcomes of interest. We present exploratory results based on a simulation study and a physiological classification example of decoding postural changes from heart rate and blood pressure. We demonstrate a significant improvement in classification over methods based on feature learning via expectation maximization. The proposed learning algorithm is general, and can be extended to other applications involving state-space formulations.


Subjects
Models, Theoretical , Algorithms , Blood Pressure/physiology , Heart Rate/physiology , Humans , Markov Chains , Postural Balance , Tilt-Table Test
13.
Article in English | MEDLINE | ID: mdl-24111374

ABSTRACT

Physiologic systems generate complex dynamics in their output signals that reflect the changing state of the underlying control systems. In this work, we used a switching vector autoregressive (switching VAR) framework to systematically learn and identify a collection of vital sign dynamics, which can possibly be recurrent within the same patient and shared across the entire cohort. We show that these dynamical behaviors can be used to characterize and elucidate the progression of patients' states of health over time. Using the mean arterial blood pressure time series of 337 ICU patients during the first 24 hours of their ICU stays, we demonstrated that the learned dynamics from as early as the first 8 hours of patients' ICU stays can achieve hospital mortality prediction performance similar to that of the well-known SAPS-I acuity scores, suggesting that the discovered latent dynamics structure may yield more timely insights into the progression of a patient's state of health than the traditional snapshot-based acuity scores.


Subjects
Critical Care/methods , Signal Processing, Computer-Assisted , Adult , Area Under Curve , Arterial Pressure , Disease Progression , Hospital Mortality , Humans , Monitoring, Physiologic , Regression Analysis
14.
Sci Transl Med ; 5(212): 212ra163, 2013 Nov 20.
Article in English | MEDLINE | ID: mdl-24259051

ABSTRACT

Biophysical characteristics of cells are attractive as potential diagnostic markers for cancer. Transformation of cell state or phenotype and the accompanying epigenetic, nuclear, and cytoplasmic modifications lead to measurable changes in cellular architecture. We recently introduced a technique called deformability cytometry (DC) that enables rapid mechanophenotyping of single cells in suspension at rates of 1000 cells/s, a throughput that is comparable to traditional flow cytometry. We applied this technique to diagnose malignant pleural effusions, in which disseminated tumor cells can be difficult to accurately identify by traditional cytology. An algorithmic diagnostic scoring system was developed on the basis of quantitative features of two-dimensional distributions of single-cell mechanophenotypes from 119 samples. The DC scoring system classified 63% of the samples into two high-confidence regimes with 100% positive predictive value or 100% negative predictive value, and achieved an area under the curve of 0.86. This performance is suitable for a prescreening role to focus cytopathologist analysis time on a smaller fraction of difficult samples. Diagnosis of samples that present a challenge to cytology was also improved. Samples labeled as "atypical cells," which require additional time and follow-up, were classified in high-confidence regimes in 8 of 15 cases. Further, 10 of 17 cytology-negative samples corresponding to patients with concurrent cancer were correctly classified as malignant or negative, in agreement with 6-month outcomes. This study lays the groundwork for broader validation of label-free quantitative biophysical markers for clinical diagnoses of cancer and inflammation, which could help to reduce laboratory workload and improve clinical decision-making.
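
Illustrative note: the area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it by direct pairwise comparison. The scores are invented and unrelated to the study's samples.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Invented diagnostic scores for malignant (positive) and benign (negative) samples.
malignant = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.2, 0.4]
print(auc(malignant, benign))
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of a few hundred samples.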


Subjects
Biomarkers, Tumor/analysis , Pleural Effusion, Malignant/diagnosis , Area Under Curve , Humans , Phenotype , Pleural Effusion, Malignant/pathology
15.
Article in English | MEDLINE | ID: mdl-23367281

ABSTRACT

Modern clinical databases include time series of vital signs, which are often recorded continuously during a hospital stay. Over several days, these recordings may yield many thousands of samples. In this work, we explore the feasibility of characterizing the "state of health" of a patient using the physiological dynamics inferred from these time series. The ultimate objective is to assist clinicians in allocating resources to high-risk patients. We hypothesize that "similar" patients exhibit similar dynamics, and that the properties and duration of these states are indicative of health and disease. We use Bayesian nonparametric machine learning methods to discover shared dynamics in patients' blood pressure (BP) time series. Each such "dynamic" captures a distinct pattern of evolution of BP and is possibly recurrent within the same time series and shared across multiple patients. Next, we examine the utility of this low-dimensional representation of BP time series for predicting mortality in patients. Our results are based on an intensive care unit (ICU) cohort of 480 patients (with 16% mortality) and indicate that the dynamics of time series of vital signs can be a useful independent predictor of outcome in the ICU.


Subjects
Intensive Care Units , Monitoring, Physiologic/methods , Signal Processing, Computer-Assisted , Bayes Theorem , Health Status , Humans
16.
Article in English | MEDLINE | ID: mdl-23367424

ABSTRACT

Cardiovascular variables such as heart rate (HR) and blood pressure (BP) are robustly regulated by an underlying control system. Time series of HR and BP exhibit distinct dynamical patterns of interaction in response to perturbations (e.g., drugs or exercise) as well as in pathological states (e.g., excessive sympathetic activation). A question of interest is whether "similar" dynamical patterns can be identified across a heterogeneous patient cohort. In this work, we present a technique based on switching linear dynamical systems for identification of shared dynamical patterns in the time series of HR and BP recorded from a patient cohort. The technique uses a mixture of linear dynamical systems, the components of which are shared across all patients, to capture both nonlinear dynamics and non-Gaussian perturbations. We present exploratory results based on a simulation study of the cardiovascular system, and real recordings from 10 healthy subjects undergoing a tilt-table test. These results demonstrate the ability of the proposed technique to identify similar dynamical patterns present across multiple time series.


Subjects
Cardiovascular System , Signal Processing, Computer-Assisted , Algorithms , Blood Pressure , Cohort Studies , Computer Simulation , Electronic Data Processing , Heart Rate , Humans , Linear Models , Models, Cardiovascular , Multivariate Analysis , Normal Distribution , Probability