1.
J Chem Inf Model ; 62(15): 3486-3502, 2022 08 08.
Article in English | MEDLINE | ID: mdl-35849793

ABSTRACT

The field of machine learning for drug discovery is witnessing an explosion of novel methods. These methods are often benchmarked on simple physicochemical properties such as solubility or general druglikeness, which can be readily computed. However, these properties are poor representatives of objective functions in drug design, mainly because they do not depend on the candidate compound's interaction with the target. By contrast, molecular docking is a widely applied method in drug discovery to estimate binding affinities. However, docking studies require a significant amount of domain knowledge to set up correctly, which hampers adoption. Here, we present dockstring, a bundle for meaningful and robust comparison of ML models using docking scores. dockstring consists of three components: (1) an open-source Python package for straightforward computation of docking scores, (2) an extensive dataset of docking scores and poses of more than 260,000 molecules for 58 medically relevant targets, and (3) a set of pharmaceutically relevant benchmark tasks such as virtual screening or de novo design of selective kinase inhibitors. The Python package implements a robust ligand and target preparation protocol that allows nonexperts to obtain meaningful docking scores. Our dataset is the first to include docking poses, as well as the first of its size that is a full matrix, thus facilitating experiments in multiobjective optimization and transfer learning. Overall, our results indicate that docking scores are a more realistic evaluation objective than simple physicochemical properties, yielding benchmark tasks that are more challenging and more closely related to real problems in drug discovery.


Subjects
Benchmarking , Proteins , Drug Design , Ligands , Molecular Docking Simulation , Protein Binding , Proteins/chemistry
2.
Eur J Nucl Med Mol Imaging ; 49(1): 125-136, 2021 12.
Article in English | MEDLINE | ID: mdl-34405276

ABSTRACT

PURPOSE: Positron emission tomography (PET) studies with radioligands for 18-kDa translocator protein (TSPO) have been instrumental in increasing our understanding of the complex role neuroinflammation plays in disorders affecting the brain. However, (R)-[11C]PK11195, the first and most widely used TSPO radioligand, has limitations, while the next-generation TSPO radioligands have suffered from high interindividual variability in binding due to a genetic polymorphism in the TSPO gene (rs6971). Herein, we present the biological evaluation of the two enantiomers of [18F]GE387, which we have previously shown to have low sensitivity to this polymorphism. METHODS: Dynamic PET scans were conducted in male Wistar rats and female rhesus macaques to investigate the in vivo behaviour of (S)-[18F]GE387 and (R)-[18F]GE387. The specific binding of (S)-[18F]GE387 to TSPO was investigated by pre-treatment with (R)-PK11195. (S)-[18F]GE387 was further evaluated in a rat model of lipopolysaccharide (LPS)-induced neuroinflammation. Sensitivity to polymorphism of (S)-GE387 was evaluated in genotyped human brain tissue. RESULTS: (S)-[18F]GE387 and (R)-[18F]GE387 entered the brain in both rats and rhesus macaques. (R)-PK11195 blocked the uptake of (S)-[18F]GE387 in healthy olfactory bulb and peripheral tissues constitutively expressing TSPO. A 2.7-fold higher uptake of (S)-[18F]GE387 was found in the inflamed striatum of LPS-treated rodents. In genotyped human brain tissue, (S)-GE387 was shown to bind similarly in low affinity binders (LABs) and high affinity binders (HABs) with a LAB to HAB ratio of 1.8. CONCLUSION: We established that (S)-[18F]GE387 has favourable kinetics in healthy rats and non-human primates and that it can distinguish inflamed from normal brain regions in the LPS model of neuroinflammation. Crucially, we have reconfirmed its low sensitivity to the TSPO polymorphism on genotyped human brain tissue. Based on these factors, we conclude that (S)-[18F]GE387 warrants further evaluation with studies on human subjects to assess its suitability as a TSPO PET radioligand for assessing neuroinflammation.


Subjects
Radiopharmaceuticals , Receptors, GABA , Animals , Brain/diagnostic imaging , Brain/metabolism , Carrier Proteins , Female , Humans , Macaca mulatta/genetics , Male , Polymorphism, Genetic , Positron-Emission Tomography , Rats , Rats, Wistar , Receptors, GABA/genetics , Receptors, GABA/metabolism , Receptors, GABA-A
4.
bioRxiv ; 2023 Oct 25.
Article in English | MEDLINE | ID: mdl-37961379

ABSTRACT

In metagenomics, the pool of uncharacterized microbial enzymes presents a challenge for functional annotation. Among these, carbohydrate-active enzymes (CAZymes) stand out due to their pivotal roles in various biological processes related to host health and nutrition. Here, we present CAZyLingua, the first tool that harnesses protein language model embeddings to build a deep learning framework that facilitates the annotation of CAZymes in metagenomic datasets. Our benchmarking results showed on average a higher F1 score (reflecting an average of precision and recall) on the annotated genomes of Bacteroides thetaiotaomicron, Eggerthella lenta and Ruminococcus gnavus compared to the traditional sequence homology-based method in dbCAN2. We applied our tool to a paired mother/infant longitudinal dataset and revealed unannotated CAZymes linked to microbial development during infancy. When applied to metagenomic datasets derived from patients affected by fibrosis-prone diseases such as Crohn's disease and IgG4-related disease, CAZyLingua uncovered CAZymes associated with disease and healthy states. In each of these metagenomic catalogs, CAZyLingua discovered new annotations that were previously overlooked by traditional sequence homology tools. Overall, the deep learning model CAZyLingua can be applied in combination with existing tools to unravel intricate CAZyme evolutionary profiles and patterns, contributing to a more comprehensive understanding of microbial metabolic dynamics.
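For readers unfamiliar with the metric, the F1 score used in the benchmark above is conventionally the harmonic mean of precision and recall. The following is a minimal sketch of that computation for a binary labelling task; the function name, the label strings and the toy data are illustrative assumptions and are not taken from CAZyLingua or dbCAN2.

```python
def f1_score(true_labels, predicted_labels, positive="CAZyme"):
    """Harmonic mean of precision and recall for one positive class.

    Illustrative helper only; label names are hypothetical.
    """
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up annotations, not benchmark data).
truth = ["CAZyme", "CAZyme", "other", "other", "CAZyme"]
predicted = ["CAZyme", "other", "CAZyme", "other", "CAZyme"]
score = f1_score(truth, predicted)
```

Because the harmonic mean is dominated by the smaller of the two values, a tool cannot reach a high F1 by trading recall for precision or the reverse.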

5.
Proc Natl Acad Sci U S A ; 106(47): 19765-9, 2009 Nov 24.
Article in English | MEDLINE | ID: mdl-19805023

ABSTRACT

Simulating the conformational dynamics of biomolecules is extremely difficult due to the rugged nature of their free energy landscapes and multiple long-lived, or metastable, states. Generalized ensemble (GE) algorithms, which have become popular in recent years, attempt to facilitate crossing between states at low temperatures by inducing a random walk in temperature space. Enthalpic barriers may be crossed more easily at high temperatures; however, entropic barriers will become more significant. This poses a problem because the dominant barriers to conformational change are entropic for many biological systems, such as the short RNA hairpin studied here. We present a new efficient algorithm for conformational sampling, called the adaptive seeding method (ASM), which uses nonequilibrium GE simulations to identify the metastable states, and seeds short simulations at constant temperature from each of them to quantitatively determine their equilibrium populations. Thus, the ASM takes advantage of the broad sampling possible with GE algorithms but generally crosses entropic barriers more efficiently during the seeding simulations at low temperature. We show that only local equilibrium is necessary for ASM, so very short seeding simulations may be used. Moreover, the ASM may be used to recover equilibrium properties from existing datasets that failed to converge, and is well suited to running on modern computer clusters.
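The final step described above, recovering equilibrium populations of metastable states from short seeded simulations, can be illustrated with a small sketch. It assumes that the seeding simulations have already been summarized as a row-stochastic transition matrix between states, and it uses plain power iteration to obtain the stationary distribution. This is a generic Markov-chain calculation under those assumptions, not the authors' implementation; the function name and the two-state matrix are hypothetical.

```python
def stationary_distribution(transition_matrix, iterations=1000):
    """Approximate the stationary distribution by power iteration.

    transition_matrix[i][j] is the probability of moving from state i
    to state j in one lag time; every row is assumed to sum to 1.
    """
    n = len(transition_matrix)
    populations = [1.0 / n] * n  # start from a uniform guess
    for _ in range(iterations):
        # One step of pi <- pi * T.
        populations = [
            sum(populations[i] * transition_matrix[i][j] for i in range(n))
            for j in range(n)
        ]
    return populations


# Hypothetical two-state example: state 0 is much longer-lived.
example_matrix = [
    [0.9, 0.1],
    [0.5, 0.5],
]
equilibrium = stationary_distribution(example_matrix)
```

The sketch shows why only local equilibrium within each state is needed: the populations depend on the transition probabilities between states, which short trajectories started inside each state can estimate.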


Subjects
Algorithms , Models, Molecular , Molecular Dynamics Simulation , Markov Chains , Thermodynamics
6.
Bayesian Anal ; 17(3): 685-709, 2022 Sep.
Article in English | MEDLINE | ID: mdl-37876627

ABSTRACT

The predictive probabilities of the hierarchical Pitman-Yor process are approximated through Monte Carlo algorithms that exploit the Chinese Restaurant Franchise (CRF) representation. However, in order to simulate the posterior distribution of the hierarchical Pitman-Yor process, a set of auxiliary variables representing the arrangement of customers in tables of the CRF must be sampled through Markov chain Monte Carlo. This paper develops a perfect sampler for these latent variables employing ideas from the Propp-Wilson algorithm and evaluates its average running time by extensive simulations. The simulations reveal a significant dependence of running time on the parameters of the model, which exhibits sharp transitions. The algorithm is compared to simpler Gibbs sampling procedures, as well as a procedure for unbiased Monte Carlo estimation proposed by Glynn and Rhee. We illustrate its use with an example in microbial genomics studies.
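To make the "customers in tables" picture concrete, the sketch below simulates the seating arrangement of a single (non-hierarchical) Pitman-Yor Chinese restaurant from the prior. With n customers seated at K tables, the next customer joins table k with weight (n_k - d) and opens a new table with weight (theta + d K), where d is the discount and theta the concentration. This is only a forward simulation of the standard seating rule; it is not the perfect sampler developed in the paper, and the function and parameter names are assumptions.

```python
import random


def pitman_yor_seating(num_customers, discount, concentration, rng):
    """Seat customers one by one under the Pitman-Yor seating rule.

    Returns the list of table sizes. Requires 0 <= discount < 1 and
    concentration > 0 (a simplifying assumption of this sketch).
    """
    if not 0 <= discount < 1 or concentration <= 0:
        raise ValueError("need 0 <= discount < 1 and concentration > 0")
    table_sizes = []
    for _ in range(num_customers):
        num_tables = len(table_sizes)
        # Unnormalized weights: existing tables first, new table last.
        weights = [size - discount for size in table_sizes]
        weights.append(concentration + discount * num_tables)
        choice = rng.choices(range(num_tables + 1), weights=weights)[0]
        if choice == num_tables:
            table_sizes.append(1)  # open a new table
        else:
            table_sizes[choice] += 1  # join an existing table
    return table_sizes


rng = random.Random(0)  # fixed seed so the run is repeatable
sizes = pitman_yor_seating(50, discount=0.5, concentration=1.0, rng=rng)
```

In the hierarchical setting each restaurant's new tables are themselves customers of a parent restaurant, which is what makes the table arrangement a latent variable that has to be sampled.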

7.
Nat Commun ; 12(1): 801, 2021 02 05.
Article in English | MEDLINE | ID: mdl-33547324

ABSTRACT

Most trials do not release interim summaries on efficacy and toxicity of the experimental treatments being tested, with this information only released to the public after the trial has ended. While early release of clinical trial data to physicians and patients can inform enrollment decision making, it may also affect key operating characteristics of the trial, statistical validity and trial duration. We investigate the public release of early efficacy and toxicity results, during ongoing clinical studies, to better inform patients about their enrollment options. We use simulation models of phase II glioblastoma (GBM) clinical trials in which early efficacy and toxicity estimates are periodically released according to a pre-specified protocol. Patients can use the reported interim efficacy and toxicity information, with the support of physicians, to decide which trial to enroll in. We describe potential effects on various operating characteristics, including the study duration, selection bias and power.


Subjects
Antineoplastic Agents/therapeutic use , Brain Neoplasms/psychology , Drugs, Investigational/therapeutic use , Glioblastoma/psychology , Information Dissemination/methods , Patient-Specific Modeling , Brain Neoplasms/drug therapy , Brain Neoplasms/mortality , Brain Neoplasms/pathology , Clinical Trials as Topic , Decision Making , Glioblastoma/drug therapy , Glioblastoma/mortality , Glioblastoma/pathology , Humans , Information Dissemination/ethics , Patient Safety , Patient Selection/ethics , Survival Analysis , Time Factors , Treatment Outcome
8.
J Chem Phys ; 131(4): 045106, 2009 Jul 28.
Article in English | MEDLINE | ID: mdl-19655927

ABSTRACT

Discrete-space Markov models are a convenient way of describing the kinetics of biomolecules. The most common strategies used to validate these models employ statistics from simulation data, such as the eigenvalue spectrum of the inferred rate matrix, which are often associated with large uncertainties. Here, we propose a Bayesian approach, which makes it possible to differentiate between models at a fixed lag time making use of short trajectories. The hierarchical definition of the models allows one to compare instances with any number of states. We apply a conjugate prior for reversible Markov chains, which was recently introduced in the statistics literature. The method is tested in two different systems, a Monte Carlo dynamics simulation of a two-dimensional model system and molecular dynamics simulations of the terminally blocked alanine dipeptide.
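As a point of reference for the models being compared, a discrete-space Markov model at a fixed lag time is usually built from transition counts observed in discretized trajectories. The sketch below counts transitions across several short trajectories and row-normalizes them into a maximum-likelihood transition matrix. It deliberately omits the Bayesian machinery of the paper (the conjugate prior for reversible chains and the model comparison); the function names and toy trajectories are hypothetical.

```python
def count_transitions(trajectories, num_states, lag=1):
    """Count i -> j transitions at the given lag over all trajectories."""
    counts = [[0] * num_states for _ in range(num_states)]
    for trajectory in trajectories:
        for start in range(len(trajectory) - lag):
            origin = trajectory[start]
            destination = trajectory[start + lag]
            counts[origin][destination] += 1
    return counts


def row_normalize(counts):
    """Turn a count matrix into a row-stochastic transition matrix."""
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observations are left as all zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Two short, made-up trajectories over states 0 and 1.
short_trajectories = [[0, 0, 1, 1, 0], [1, 1, 1, 0]]
counts = count_transitions(short_trajectories, num_states=2, lag=1)
transition_matrix = row_normalize(counts)
```

A Bayesian treatment would place a prior on the transition matrix and use the same count matrix as the sufficient statistic, which is why short trajectories can be pooled in this way.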


Subjects
Computer Simulation , DNA/chemistry , Markov Chains , Proteins/chemistry , Models, Molecular
10.
J Am Stat Assoc ; 112(520): 1430-1442, 2017.
Article in English | MEDLINE | ID: mdl-29430070

ABSTRACT

Human microbiome studies use sequencing technologies to measure the abundance of bacterial species or Operational Taxonomic Units (OTUs) in samples of biological material. Typically the data are organized in contingency tables with OTU counts across heterogeneous biological samples. In the microbial ecology community, ordination methods are frequently used to investigate latent factors or clusters that capture and describe variations of OTU counts across biological samples. It remains important to evaluate how uncertainty in estimates of each biological sample's microbial distribution propagates to ordination analyses, including visualization of clusters and projections of biological samples on low dimensional spaces. We propose a Bayesian analysis for dependent distributions to endow frequently used ordinations with estimates of uncertainty. A Bayesian nonparametric prior for dependent normalized random measures is constructed, which is marginally equivalent to the normalized generalized Gamma process, a well-known prior for nonparametric analyses. In our prior, the dependence and similarity between microbial distributions is represented by latent factors that concentrate in a low dimensional space. We use a shrinkage prior to tune the dimensionality of the latent factors. The resulting posterior samples of model parameters can be used to evaluate uncertainty in analyses routinely applied in microbiome studies. Specifically, by combining them with multivariate data analysis techniques we can visualize credible regions in ecological ordination plots. The characteristics of the proposed model are illustrated through a simulation study and applications in two microbiome datasets.

11.
Stat Comput ; 25(4): 797-808, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-26412947

ABSTRACT

Recent advances in Monte Carlo methods allow us to revisit work by de Finetti who suggested the use of approximate exchangeability in the analyses of contingency tables. This paper gives examples of computational implementations using Metropolis Hastings, Langevin and Hamiltonian Monte Carlo to compute posterior distributions for test statistics relevant for testing independence, reversible or three way models for discrete exponential families using polynomial priors and Gröbner bases.

12.
PLoS One ; 8(4): e58699, 2013.
Article in English | MEDLINE | ID: mdl-23565139

ABSTRACT

The large amount of molecular dynamics simulation data produced by modern computational models brings big opportunities and challenges to researchers. Clustering algorithms play an important role in understanding biomolecular kinetics from the simulation data, especially under the Markov state model framework. However, the ruggedness of the free energy landscape in a biomolecular system makes common clustering algorithms very sensitive to perturbations of the data. Here, we introduce a data-exploratory tool which provides an overview of the clustering structure under different parameters. The proposed Multi-Persistent Clustering analysis combines insights from recent studies on the dynamics of systems with dominant metastable states with the concept of multi-dimensional persistence in computational topology. We propose to explore the clustering structure of the data based on its persistence on scale and density. The analysis provides a systematic way to discover clusters that are robust to perturbations of the data. The dominant states of the system can be chosen with confidence. For the clusters on the borderline, the user can choose to do more simulation or make a decision based on their structural characteristics. Furthermore, our multi-resolution analysis gives users information about the relative potential of the clusters and their hierarchical relationship. The effectiveness of the proposed method is illustrated in three biomolecules: alanine dipeptide, Villin headpiece, and the FiP35 WW domain.


Subjects
Cluster Analysis , Molecular Conformation , Molecular Dynamics Simulation , Alanine/chemistry , Dipeptides/chemistry , Microfilament Proteins/chemistry
13.
Nat Biotechnol ; 26(5): 561-9, 2008 May.
Article in English | MEDLINE | ID: mdl-18438401

ABSTRACT

The safe and effective delivery of RNA interference (RNAi) therapeutics remains an important challenge for clinical development. The diversity of current delivery materials remains limited, in part because of their slow, multi-step syntheses. Here we describe a new class of lipid-like delivery molecules, termed lipidoids, as delivery agents for RNAi therapeutics. Chemical methods were developed to allow the rapid synthesis of a large library of over 1,200 structurally diverse lipidoids. From this library, we identified lipidoids that facilitate high levels of specific silencing of endogenous gene transcripts when formulated with either double-stranded small interfering RNA (siRNA) or single-stranded antisense 2'-O-methyl (2'-OMe) oligoribonucleotides targeting microRNA (miRNA). The safety and efficacy of lipidoids were evaluated in three animal models: mice, rats and nonhuman primates. The studies reported here suggest that these materials may have broad utility for both local and systemic delivery of RNA therapeutics.


Subjects
Combinatorial Chemistry Techniques/methods , Drug Carriers/chemistry , Drug Design , Lipids/chemistry , RNA Interference , RNA/administration & dosage , RNA/genetics