1.
Bioinformatics ; 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38905502

ABSTRACT

SUMMARY: The design of two overlapping genes in a microbial genome is an emerging technique for adding more reliable control mechanisms in engineered organisms for increased stability. The design of functional overlapping gene pairs is a challenging procedure and computational design tools are used to improve the efficiency to deploy successful designs in genetically engineered systems. GENTANGLE (Gene Tuples ArraNGed in overLapping Elements) is a high-performance containerized pipeline for the computational design of two overlapping genes translated in different reading frames of the genome. This new software package can be used to design and test gene entanglements for microbial engineering projects using arbitrary sets of user specified gene pairs. AVAILABILITY AND IMPLEMENTATION: The GENTANGLE source code and its submodules are freely available on GitHub at https://github.com/BiosecSFA/gentangle. The DATANGLE (DATA for genTANGLE) repository contains related data and results, and is freely available on GitHub at https://github.com/BiosecSFA/datangle. The GENTANGLE container is freely available on Singularity Cloud Library at https://cloud.sylabs.io/library/khyox/gentangle/gentangle.sif. The GENTANGLE repository wiki (https://github.com/BiosecSFA/gentangle/wiki), website (https://biosecsfa.github.io/gentangle/) and user manual contain detailed instructions on how to use the different components of software and data, including examples and reproducing the results. The code is licensed under the GNU Affero General Public License version 3 (https://www.gnu.org/licenses/agpl.html). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
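
To illustrate what "translated in different reading frames" means, the following minimal Python sketch translates one toy DNA sequence in each of the three forward frames. It is not part of GENTANGLE: the function name `translate` and the example sequence are hypothetical, and the sketch assumes the standard genetic code and the forward strand only.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate the forward strand of `dna`, starting at offset `frame` (0, 1 or 2).

    Trailing bases that do not form a complete codon are ignored.
    Stop codons are rendered as '*'.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    seq = dna.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical toy sequence: the same nucleotides encode different peptides
# depending on the reading frame.
toy = "ATGGCCTGA"
print(translate(toy, 0))  # MA*
print(translate(toy, 1))  # WP
print(translate(toy, 2))  # GL
```

Designing an entangled pair amounts to finding a single nucleotide sequence whose translations in two frames are both functional proteins; the sketch shows only the translation step, not any design or scoring procedure.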

2.
PLoS One ; 19(1): e0289198, 2024.
Article in English | MEDLINE | ID: mdl-38271318

ABSTRACT

Viral populations in natural infections can have a high degree of sequence diversity, which can directly impact immune escape. However, antibody potency is often tested in vitro with relatively clonal viral populations, such as laboratory virus or pseudotyped virus stocks, which may not accurately represent the genetic diversity of circulating viral genotypes. This can affect the validity of viral phenotype assays, such as antibody neutralization assays. To address this issue, we tested whether recombinant virus carrying SARS-CoV-2 spike (VSV-SARS-CoV-2-S) stocks could be made more genetically diverse by passage, and if a stock passaged under selective pressure was more capable of escaping monoclonal antibody (mAb) neutralization than unpassaged stock or than viral stock passaged without selective pressures. We passaged VSV-SARS-CoV-2-S four times concurrently in three cell lines and then six times with or without polyclonal antiserum selection pressure. All three of the monoclonal antibodies tested neutralized the viral population present in the unpassaged stock. The viral inoculum derived from serial passage without antiserum selection pressure was neutralized by two of the three mAbs. However, the viral inoculum derived from serial passage under antiserum selection pressure escaped neutralization by all three mAbs. Deep sequencing revealed the rapid acquisition of multiple mutations associated with antibody escape in the VSV-SARS-CoV-2-S that had been passaged in the presence of antiserum, including key mutations present in currently circulating Omicron subvariants. These data indicate that viral stock that was generated under polyclonal antiserum selection pressure better reflects the natural environment of the circulating virus and may yield more biologically relevant outcomes in phenotypic assays. Thus, mAb assessment assays that utilize a more genetically diverse, biologically relevant, virus stock may yield data that are relevant for prediction of mAb efficacy and for enhancing biosurveillance.


Subject(s)
Antibodies, Neutralizing , COVID-19 , Humans , SARS-CoV-2/genetics , Antibodies, Viral , Neutralization Tests , Immune Sera , Spike Glycoprotein, Coronavirus/genetics
3.
J Chem Inf Model ; 63(21): 6655-6666, 2023 11 13.
Article in English | MEDLINE | ID: mdl-37847557

ABSTRACT

Protein-ligand interactions are essential to drug discovery and drug development efforts. Desirable on-target or multitarget interactions are the first step in finding an effective therapeutic, while undesirable off-target interactions are the first step in assessing safety. In this work, we introduce a novel ligand-based featurization and mapping of human protein pockets to identify closely related protein targets and to project novel drugs into a hybrid protein-ligand feature space to identify their likely protein interactions. Using structure-based template matches from PDB, protein pockets are featured by the ligands that bind to their best co-complex template matches. The simplicity and interpretability of this approach provide a granular characterization of the human proteome at the protein-pocket level instead of the traditional protein-level characterization by family, function, or pathway. We demonstrate the power of this featurization method by clustering a subset of the human proteome and evaluating the predicted cluster associations of over 7000 compounds.


Subject(s)
Proteome , Humans , Protein Binding , Binding Sites , Protein Conformation , Ligands , Cluster Analysis
4.
Artif Intell Chem ; 1(1)2023 Jun.
Article in English | MEDLINE | ID: mdl-37583465

ABSTRACT

Neural Network (NN) models provide potential to speed up the drug discovery process and reduce its failure rates. The success of NN models requires uncertainty quantification (UQ) as drug discovery explores chemical space beyond the training data distribution. Standard NN models do not provide uncertainty information. Some methods require changing the NN architecture or training procedure, limiting the selection of NN models. Moreover, predictive uncertainty can come from different sources. It is important to have the ability to separately model different types of predictive uncertainty, as the model can take assorted actions depending on the source of uncertainty. In this paper, we examine UQ methods that estimate different sources of predictive uncertainty for NN models aiming at protein-ligand binding prediction. We use our prior knowledge on chemical compounds to design the experiments. By utilizing a visualization method we create non-overlapping and chemically diverse partitions from a collection of chemical compounds. These partitions are used as training and test set splits to explore NN model uncertainty. We demonstrate how the uncertainties estimated by the selected methods describe different sources of uncertainty under different partitions and featurization schemes and the relationship to prediction error.
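
One common way to separate sources of predictive uncertainty uses an ensemble in which each member predicts both a mean and a variance. The sketch below illustrates that idea only; it is not necessarily one of the methods examined in the paper, and the function name `decompose_uncertainty` and the numbers are hypothetical. By the law of total variance, the average predicted variance approximates data (aleatoric) noise, and the spread of the member means approximates model (epistemic) uncertainty.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(member_means, member_variances):
    """Split ensemble predictive variance for one input into two parts.

    member_means:     mean predicted by each ensemble member
    member_variances: variance predicted by each ensemble member
    Returns (aleatoric, epistemic, total).
    """
    aleatoric = fmean(member_variances)   # expected noise in the data
    epistemic = pvariance(member_means)   # disagreement between members
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical predictions from a three-member ensemble for one compound.
aleatoric, epistemic, total = decompose_uncertainty(
    member_means=[6.1, 6.4, 5.8],
    member_variances=[0.20, 0.25, 0.15],
)
print(f"aleatoric={aleatoric:.3f} epistemic={epistemic:.3f} total={total:.3f}")
```

Under this decomposition, a high epistemic term on a test partition that is chemically distant from the training partition suggests that more training data would help, whereas a high aleatoric term points to noise in the measurements themselves.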

5.
ACS Omega ; 8(24): 21871-21884, 2023 Jun 20.
Article in English | MEDLINE | ID: mdl-37309388

ABSTRACT

Minimizing the human and economic costs of the COVID-19 pandemic and future pandemics requires the ability to develop and deploy effective treatments for novel pathogens as soon as possible after they emerge. To this end, we introduce a new computational pipeline for the rapid identification and characterization of binding sites in viral proteins along with the key chemical features, which we call chemotypes, of the compounds predicted to interact with those same sites. The composition of source organisms for the structural models associated with an individual binding site is used to assess the site's degree of structural conservation across different species, including other viruses and humans. We propose a search strategy for novel therapeutics that involves the selection of molecules preferentially containing the most structurally rich chemotypes identified by our algorithm. While we demonstrate the pipeline on SARS-CoV-2, it is generalizable to any new virus, as long as either experimentally solved structures for its proteins are available or sufficiently accurate predicted structures can be constructed.

6.
Astrobiology ; 23(8): 897-907, 2023 08.
Article in English | MEDLINE | ID: mdl-37102710

ABSTRACT

Molecular biology methods and technologies have advanced substantially over the past decade. These new molecular methods should be incorporated among the standard tools of planetary protection (PP) and could be validated for incorporation by 2026. To address the feasibility of applying modern molecular techniques to such an application, NASA conducted a technology workshop with private industry partners, academics, and government agency stakeholders, along with NASA staff and contractors. The technical discussions and presentations of the Multi-Mission Metagenomics Technology Development Workshop focused on modernizing and supplementing the current PP assays. The goals of the workshop were to assess the state of metagenomics and other advanced molecular techniques in the context of providing a validated framework to supplement the bacterial endospore-based NASA Standard Assay and to identify knowledge and technology gaps. In particular, workshop participants were tasked with discussing metagenomics as a stand-alone technology to provide rapid and comprehensive analysis of total nucleic acids and viable microorganisms on spacecraft surfaces, thereby allowing for the development of tailored and cost-effective microbial reduction plans for each hardware item on a spacecraft. Workshop participants recommended metagenomics approaches as the only data source that can adequately feed into quantitative microbial risk assessment models for evaluating the risk of forward (exploring extraterrestrial planet) and back (Earth harmful biological) contamination. Participants were unanimous that a metagenomics workflow, in tandem with rapid targeted quantitative (digital) PCR, represents a revolutionary advance over existing methods for the assessment of microbial bioburden on spacecraft surfaces. The workshop highlighted low biomass sampling, reagent contamination, and inconsistent bioinformatics data analysis as key areas for technology development. 
Finally, it was concluded that implementing metagenomics as an additional workflow for addressing concerns of NASA's robotic mission will represent a dramatic improvement in technology advancement for PP and will benefit future missions where mission success is affected by backward and forward contamination.


Subject(s)
Planets , Space Flight , United States , Humans , Extraterrestrial Environment , Metagenomics , United States National Aeronautics and Space Administration , Spacecraft , Policy
7.
Viruses ; 14(12)2022 12 13.
Article in English | MEDLINE | ID: mdl-36560780

ABSTRACT

Genetic analysis of intra-host viral populations provides unique insight into pre-emergent mutations that may contribute to the genotype of future variants. Clinical samples positive for SARS-CoV-2 collected in California during the first months of the pandemic were sequenced to define the dynamics of mutation emergence as the virus became established in the state. Deep sequencing of 90 nasopharyngeal samples showed that many mutations associated with the establishment of SARS-CoV-2 globally were present at varying frequencies in a majority of the samples, even those collected as the virus was first detected in the US. A subset of mutations that emerged months later in consensus sequences were detected as subconsensus members of intra-host populations. Spike mutations P681H, H655Y, and V1104L were detected prior to emergence in variant genotypes, mutations were detected at multiple positions within the furin cleavage site, and pre-emergent mutations were identified in the nucleocapsid and the envelope genes. Because many of the samples had a very high depth of coverage, a bioinformatics pipeline, "Mappgene", was established that uses both iVar and LoFreq variant calling to enable identification of very low-frequency variants. This enabled detection of a spike protein deletion present in many samples at low frequency and associated with a variant of concern.


Subject(s)
COVID-19 , Humans , COVID-19/epidemiology , Pandemics , SARS-CoV-2/genetics , Mutation , Computational Biology , Spike Glycoprotein, Coronavirus/genetics
8.
NAR Genom Bioinform ; 4(4): lqac078, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36225529

ABSTRACT

We present a structure-based method for finding and evaluating structural similarities in protein regions relevant to ligand binding. PDBspheres comprises an exhaustive library of protein structure regions ('spheres') adjacent to complexed ligands derived from the Protein Data Bank (PDB), along with methods to find and evaluate structural matches between a protein of interest and spheres in the library. PDBspheres uses the LGA (Local-Global Alignment) structure alignment algorithm as the main engine for detecting structural similarities between the protein of interest and template spheres from the library, which currently contains >2 million spheres. To assess confidence in structural matches, an all-atom-based similarity metric takes side chain placement into account. Here, we describe the PDBspheres method, demonstrate its ability to detect and characterize binding sites in protein structures, show how PDBspheres, a strictly structure-based method, performs on a curated dataset of 2528 ligand-bound and ligand-free crystal structures, and use PDBspheres to cluster pockets and assess structural similarities among protein binding sites of 4876 structures in the 'refined set' of the PDBbind 2019 dataset.

9.
J Chem Inf Model ; 62(15): 3551-3564, 2022 08 08.
Article in English | MEDLINE | ID: mdl-35857932

ABSTRACT

The growing capabilities of synthetic biology and organic chemistry demand tools to guide syntheses toward useful molecules. Here, we present Molecular AutoenCoding Auto-Workaround (MACAW), a tool that uses a novel approach to generate molecules predicted to meet a desired property specification (e.g., a binding affinity of 50 nM or an octane number of 90). MACAW describes molecules by embedding them into a smooth multidimensional numerical space, avoiding uninformative dimensions that previous methods often introduce. The coordinates in this embedding provide a natural choice of features for accurately predicting molecular properties, which we demonstrate with examples for cetane and octane numbers, flash points, and histamine H1 receptor binding affinity. The approach is computationally efficient and well-suited to the small- and medium-size datasets commonly used in biosciences. We showcase the utility of MACAW for virtual screening by identifying molecules with high predicted binding affinity to the histamine H1 receptor and limited affinity to the muscarinic M2 receptor, which are targets of medicinal relevance. Combining these predictive capabilities with a novel generative algorithm for molecules allows us to recommend molecules with a desired property value (i.e., inverse molecular design). We demonstrate this capability by recommending molecules with predicted octane numbers of 40, 80, and 120, which is an important characteristic of biofuels. Thus, MACAW augments classical retrosynthesis tools by providing recommendations for molecules on specification.


Subject(s)
Octanes , Receptors, Histamine H1 , Algorithms , Protein Binding
10.
J Chem Theory Comput ; 18(7): 4047-4069, 2022 Jul 12.
Article in English | MEDLINE | ID: mdl-35710099

ABSTRACT

Atomistic Molecular Dynamics (MD) simulations provide researchers the ability to model biomolecular structures such as proteins and their interactions with drug-like small molecules with greater spatiotemporal resolution than is otherwise possible using experimental methods. MD simulations are notoriously expensive computational endeavors that have traditionally required massive investment in specialized hardware to access biologically relevant spatiotemporal scales. Our goal is to summarize the fundamental algorithms that are employed in the literature to then highlight the challenges that have affected accelerator implementations in practice. We consider three broad categories of accelerators: Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), and Application Specific Integrated Circuits (ASICs). These categories are comparatively studied to facilitate discussion of their relative trade-offs and to gain context for the current state of the art. We conclude by providing insights into the potential of emerging hardware platforms and algorithms for MD.
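
The abstract does not name the algorithms it surveys. As a hedged illustration of the kind of inner loop that MD accelerators target, the sketch below implements velocity Verlet, a widely used MD time integrator, for a single one-dimensional particle. The function name, the harmonic force, and all parameter values are assumptions chosen for the example.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of size dt.

    `force` is a callable returning the force at a given position.
    Each step uses a half-step velocity update, a full position update,
    a force re-evaluation, and a second half-step velocity update.
    """
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass
        x = x + dt * v_half
        f = force(x)  # the force evaluation dominates the cost in real MD
        v = v_half + 0.5 * dt * f / mass
    return x, v


# Toy system (hypothetical): unit-mass particle on a unit-stiffness spring.
x, v = velocity_verlet(x=1.0, v=0.0, force=lambda pos: -pos,
                       mass=1.0, dt=0.01, n_steps=1000)
energy = 0.5 * v * v + 0.5 * x * x
print(f"x={x:.4f} v={v:.4f} energy={energy:.6f}")
```

In a production MD code the force call sums bonded and non-bonded interactions over many atoms, which is why that step is the usual target for GPU, FPGA, or ASIC offload.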


Subject(s)
Algorithms , Molecular Dynamics Simulation , Computers
11.
J Chem Inf Model ; 62(10): 2301-2315, 2022 05 23.
Article in English | MEDLINE | ID: mdl-35447030

ABSTRACT

The identification of promising lead compounds showing pharmacological activities toward a biological target is essential in early stage drug discovery. With the recent increase in available small-molecule databases, virtual high-throughput screening using physics-based molecular docking has emerged as an essential tool in assisting fast and cost-efficient lead discovery and optimization. However, the best scored docking poses are often suboptimal, resulting in incorrect screening and chemical property calculation. We address the pose classification problem by leveraging data-driven machine learning approaches to identify correct docking poses from AutoDock Vina and Glide screens. To enable effective classification of docking poses, we present two convolutional neural network approaches: a three-dimensional convolutional neural network (3D-CNN) and an attention-based point cloud network (PCN) trained on the PDBbind refined set. We demonstrate the effectiveness of our proposed classifiers on multiple evaluation data sets including the standard PDBbind CASF-2016 benchmark data set and various compound libraries with structurally different protein targets including an ion channel data set extracted from Protein Data Bank (PDB) and an in-house KCa3.1 inhibitor data set. Our experiments show that excluding false positive docking poses using the proposed classifiers improves virtual high-throughput screening to identify novel molecules against each target protein compared to the initial screen based on the docking scores.


Subject(s)
Ion Channels , Neural Networks, Computer , Ligands , Molecular Docking Simulation , Protein Binding
12.
Front Mol Biosci ; 8: 678701, 2021.
Article in English | MEDLINE | ID: mdl-34327214

ABSTRACT

A rapid response is necessary to contain emergent biological outbreaks before they can become pandemics. The novel coronavirus (SARS-CoV-2) that causes COVID-19 was first reported in December of 2019 in Wuhan, China and reached most corners of the globe in less than two months. In just over a year since the initial infections, COVID-19 infected almost 100 million people worldwide. Although similar to SARS-CoV and MERS-CoV, SARS-CoV-2 has resisted treatments that are effective against other coronaviruses. Crystal structures of two SARS-CoV-2 proteins, spike protein and main protease, have been reported and can serve as targets for studies in neutralizing this threat. We have employed molecular docking, molecular dynamics simulations, and machine learning to identify from a library of 26 million molecules possible candidate compounds that may attenuate or neutralize the effects of this virus. The viability of selected candidate compounds against SARS-CoV-2 was determined experimentally by biolayer interferometry and FRET-based activity protein assays along with virus-based assays. In the pseudovirus assay, imatinib and lapatinib had IC50 values below 10 µM, while candesartan cilexetil had an IC50 value of approximately 67 µM against Mpro in a FRET-based activity assay. Comparatively, candesartan cilexetil had the highest selectivity index of all compounds tested as its half-maximal cytotoxicity concentration 50 (CC50) value was the only one greater than the limit of the assay (>100 µM).

13.
J Chem Inf Model ; 61(4): 1583-1592, 2021 04 26.
Article in English | MEDLINE | ID: mdl-33754707

ABSTRACT

Predicting accurate protein-ligand binding affinities is an important task in drug discovery but remains a challenge even with computationally expensive biophysics-based energy scoring methods and state-of-the-art deep learning approaches. Despite the recent advances in the application of deep convolutional and graph neural network-based approaches, it remains unclear what the relative advantages of each approach are and how they compare with physics-based methodologies that have found more mainstream success in virtual screening pipelines. We present fusion models that combine features and inference from complementary representations to improve binding affinity prediction. This, to our knowledge, is the first comprehensive study that uses a common series of evaluations to directly compare the performance of three-dimensional (3D)-convolutional neural networks (3D-CNNs), spatial graph neural networks (SG-CNNs), and their fusion. We use temporal and structure-based splits to assess performance on novel protein targets. To test the practical applicability of our models, we examine their performance in cases that assume that the crystal structure is not available. In these cases, binding free energies are predicted using docking pose coordinates as the inputs to each model. In addition, we compare these deep learning approaches to predictions based on docking scores and molecular mechanics/generalized Born surface area (MM/GBSA) calculations. Our results show that the fusion models make more accurate predictions than their constituent neural network models as well as docking scoring and MM/GBSA rescoring, with the benefit of greater computational efficiency than the MM/GBSA method. Finally, we provide the code to reproduce our results and the parameter files of the trained models used in this work. The software is available as open source at https://github.com/llnl/fast. Model parameter files are available at ftp://gdo-bioinformatics.ucllnl.org/fast/pdbbind2016_model_checkpoints/.


Subject(s)
Neural Networks, Computer , Proteins , Ligands , Protein Binding , Proteins/metabolism , Software
14.
J Chem Inf Model ; 61(2): 587-602, 2021 02 22.
Article in English | MEDLINE | ID: mdl-33502191

ABSTRACT

Cholestatic liver injury is frequently associated with drug inhibition of bile salt transporters, such as the bile salt export pump (BSEP). Reliable in silico models to predict BSEP inhibition directly from chemical structures would significantly reduce costs during drug discovery and could help avoid injury to patients. We report our development of classification and regression models for BSEP inhibition with substantially improved performance over previously published models. We assessed the performance effects of different methods of chemical featurization, data set partitioning, and class labeling and identified the methods producing models that generalized best to novel chemical entities.


Subject(s)
Chemical and Drug Induced Liver Injury , Cholestasis , ATP Binding Cassette Transporter, Subfamily B, Member 11 , ATP-Binding Cassette Transporters , Humans , Machine Learning
15.
J Chem Inf Model ; 60(11): 5375-5381, 2020 11 23.
Article in English | MEDLINE | ID: mdl-32794768

ABSTRACT

Accurately predicting small molecule partitioning and hydrophobicity is critical in the drug discovery process. There are many heterogeneous chemical environments within a cell and the entire human body. For example, drugs must be able to cross the hydrophobic cellular membrane to reach their intracellular targets, and hydrophobicity is an important driving force for drug-protein binding. Atomistic molecular dynamics (MD) simulations are routinely used to calculate free energies of small molecules binding to proteins, crossing lipid membranes, and solvation but are computationally expensive. Machine learning (ML) and empirical methods are also used throughout drug discovery but rely on experimental data, limiting the domain of applicability. We present atomistic MD simulations calculating 15,000 small molecule free energies of transfer from water to cyclohexane. This large data set is used to train ML models that predict the free energies of transfer. We show that a spatial graph neural network model achieves the highest accuracy, followed closely by a 3D-convolutional neural network, and shallow learning based on the chemical fingerprint is significantly less accurate. A mean absolute error of ∼4 kJ/mol compared to the MD calculations was achieved for our best ML model. We also show that including data from the MD simulation improves the predictions, test the transferability of each model to a diverse set of molecules, and show that multitask learning improves the predictions. This work provides insight into the hydrophobicity of small molecules and ML cheminformatics modeling, and our data set will be useful for designing and testing future ML cheminformatics methods.
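
To put the reported error of about 4 kJ/mol in more familiar terms, a transfer free energy can be converted to a log partition coefficient with the standard relation log10 P = -ΔG / (RT ln 10). The sketch below is illustrative only: the function name is hypothetical, and the temperature of 298.15 K is an assumption, since the abstract does not state the simulation temperature.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def delta_g_to_log_p(delta_g_kj_per_mol, temperature_k=298.15):
    """Convert a water-to-cyclohexane transfer free energy to log10 P.

    P is the cyclohexane/water partition coefficient; a negative free
    energy of transfer (favoring cyclohexane) gives a positive log P.
    """
    return -delta_g_kj_per_mol / (R_KJ_PER_MOL_K * temperature_k * math.log(10))


# Size of a 4 kJ/mol error expressed in log P units at the assumed temperature.
error_in_log_units = abs(delta_g_to_log_p(4.0))
print(f"{error_in_log_units:.2f} log units")  # about 0.70
```

At the assumed temperature, one log unit corresponds to roughly 5.7 kJ/mol, so an error of 4 kJ/mol is about 0.7 log units.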


Subject(s)
Deep Learning , Molecular Dynamics Simulation , Entropy , Humans , Hydrophobic and Hydrophilic Interactions , Thermodynamics
16.
J Chem Inf Model ; 60(6): 2766-2772, 2020 06 22.
Article in English | MEDLINE | ID: mdl-32338892

ABSTRACT

We present a new approach to estimate the binding affinity from given three-dimensional poses of protein-ligand complexes. In this scheme, every protein-ligand atom pair makes an additive free-energy contribution. The sum of these pairwise contributions then gives the total binding free energy or the logarithm of the dissociation constant. The pairwise contribution is calculated by a function implemented via a neural network that takes the properties of the two atoms and their distance as input. The pairwise function is trained using a portion of the PDBbind 2018 data set. The model achieves good accuracy for affinity predictions when evaluated with PDBbind 2018 and with the CASF-2016 benchmark, comparing favorably to many scoring functions such as that of AutoDock Vina. The framework here may be extended to incorporate other factors to further improve its accuracy and power.
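
The additive scheme described above can be sketched as a double loop over protein and ligand atoms. This is a minimal illustration, not the authors' code: the atom representation, the function names, and especially `toy_pair_energy`, which stands in for the trained neural network, are hypothetical.

```python
import math
from itertools import product


def binding_free_energy(protein_atoms, ligand_atoms, pair_energy):
    """Sum a pairwise contribution over every protein-ligand atom pair.

    Each atom is a dict with an 'element' string and an 'xyz' coordinate tuple.
    `pair_energy(protein_element, ligand_element, distance)` returns the
    contribution of one pair.
    """
    total = 0.0
    for p_atom, l_atom in product(protein_atoms, ligand_atoms):
        distance = math.dist(p_atom["xyz"], l_atom["xyz"])
        total += pair_energy(p_atom["element"], l_atom["element"], distance)
    return total


def toy_pair_energy(protein_element, ligand_element, distance):
    """Hypothetical placeholder for the learned pairwise function."""
    return -1.0 / distance


protein = [{"element": "N", "xyz": (0.0, 0.0, 0.0)},
           {"element": "O", "xyz": (3.0, 0.0, 0.0)}]
ligand = [{"element": "C", "xyz": (1.0, 0.0, 0.0)}]
print(binding_free_energy(protein, ligand, toy_pair_energy))  # -1.5
```

Because the total is a plain sum, each pair's contribution can be inspected on its own, so per-atom or per-pair attributions follow directly from the model's structure.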


Subject(s)
Drug Design , Neural Networks, Computer , Ligands , Molecular Docking Simulation , Protein Binding
17.
J Chem Inf Model ; 60(4): 1955-1968, 2020 04 27.
Article in English | MEDLINE | ID: mdl-32243153

ABSTRACT

One of the key requirements for incorporating machine learning (ML) into the drug discovery process is complete traceability and reproducibility of the model building and evaluation process. With this in mind, we have developed an end-to-end modular and extensible software pipeline for building and sharing ML models that predict key pharma-relevant parameters. The ATOM Modeling PipeLine, or AMPL, extends the functionality of the open source library DeepChem and supports an array of ML and molecular featurization tools. We have benchmarked AMPL on a large collection of pharmaceutical data sets covering a wide range of parameters. Our key findings indicate that traditional molecular fingerprints underperform other feature representation methods. We also find that data set size correlates directly with prediction performance, which points to the need to expand public data sets. Uncertainty quantification can help predict model error, but correlation with error varies considerably between data sets and model types. Our findings point to the need for an extensible pipeline that can be shared to make model building more widely accessible and reproducible. This software is open source and available at: https://github.com/ATOMconsortium/AMPL.


Subject(s)
Drug Discovery , Software , Machine Learning , Reproducibility of Results
18.
PLoS One ; 14(12): e0225699, 2019.
Article in English | MEDLINE | ID: mdl-31809512

ABSTRACT

The question of how Zika virus (ZIKV) changed from a seemingly mild virus to a human pathogen capable of causing microcephaly and sexual transmission remains unanswered. The unexpected emergence of ZIKV's pathogenicity and capacity for sexual transmission may be due to genetic changes, and future changes in phenotype may continue to occur as the virus expands its geographic range. Alternatively, the sheer size of the 2015-16 epidemic may have brought attention to a pre-existing virulent ZIKV phenotype in a highly susceptible population. Thus, it is important to identify patterns of genetic change that may yield a better understanding of ZIKV emergence and evolution. However, because ZIKV has an RNA genome and a polymerase incapable of proofreading, it undergoes rapid mutation which makes it difficult to identify combinations of mutations associated with viral emergence. As next generation sequencing technology has allowed whole genome consensus and variant sequence data to be generated for numerous virus samples, the task of analyzing these genomes for patterns of mutation has become more complex. However, understanding which combinations of mutations spread widely and become established in new geographic regions versus those that disappear relatively quickly is essential for defining the trajectory of an ongoing epidemic. In this study, multiscale analysis of the wealth of genomic data generated over the course of the epidemic combined with in vivo laboratory data allowed trends in mutations and outbreak trajectory to be assessed. Mutations were detected throughout the genome via deep sequencing, and many variants appeared in multiple samples and in some cases became consensus. Similarly, amino acids that were previously consensus in pre-outbreak samples were detected as low frequency variants in epidemic strains. Protein structural models indicate that most of the mutations associated with the epidemic transmission occur on the exposed surface of viral proteins. At the macroscale level, consensus data was organized into large and interactive databases to allow the spread of individual mutations and combinations of mutations to be visualized and assessed for temporal and geographical patterns. Thus, the use of multiscale modeling for identifying mutations or combinations of mutations that impact epidemic transmission and phenotypic impact can aid the formation of hypotheses which can then be tested using reverse genetics.


Subject(s)
Disease Outbreaks/prevention & control , Genome, Viral/genetics , Mutation Rate , Zika Virus Infection/prevention & control , Zika Virus/genetics , Databases, Genetic/statistics & numerical data , Datasets as Topic , Genotype , Geography , High-Throughput Nucleotide Sequencing , Humans , Models, Molecular , Phylogeny , RNA, Viral/genetics , RNA, Viral/isolation & purification , Spatio-Temporal Analysis , Viral Nonstructural Proteins/genetics , Viral Structural Proteins/genetics , Zika Virus/isolation & purification , Zika Virus/pathogenicity , Zika Virus Infection/epidemiology , Zika Virus Infection/transmission , Zika Virus Infection/virology
19.
PLoS One ; 13(12): e0209683, 2018.
Article in English | MEDLINE | ID: mdl-30592753

ABSTRACT

Kawasaki disease (KD), first identified in 1967, is a pediatric vasculitis of unknown etiology that has an increasing incidence in Japan and many other countries. KD can cause coronary artery aneurysms. Its epidemiological characteristics, such as seasonality and clinical picture of acute systemic inflammation with prodromal intestinal/respiratory symptoms, suggest an infectious etiology for KD. Interestingly, multiple host genotypes have been identified as predisposing factors for KD. To explore experimental methodology for identifying etiological agent(s) for KD and to optimize epidemiological study design (particularly the sample size) for future studies, we conducted a pilot study. For a 1-year period, we prospectively enrolled 11 patients with KD. To each KD patient, we assigned two control individuals (one with diarrhea and the other with respiratory infections), matched for age, sex, and season of diagnosis. During the acute phase of disease, we collected peripheral blood, nasopharyngeal aspirate, and feces. We also determined genotypes, to identify those that confer susceptibility to KD. There was no statistically significant difference in the frequency of the risk genotypes between KD patients and control subjects. We also used unbiased metagenomic sequencing to analyze these samples. Metagenomic sequencing and PCR detected torque teno virus 7 (TTV7) in two patients with KD (18%), but not in control subjects (P = 0.111). Sanger sequencing revealed that the TTV7 found in the two KD patients contained almost identical variants in nucleotide and identical changes in resulting amino acid, relative to the reference sequence. Additionally, we estimated the sample size that would be required to demonstrate a statistical correlation between TTV7 and KD. Future larger scale studies with carefully optimized metagenomic sequencing experiments and adequate sample size are warranted to further examine the association between KD and potential pathogens, including TTV7.


Subject(s)
DNA Virus Infections/complications , DNA Virus Infections/virology , Mucocutaneous Lymph Node Syndrome/etiology , Torque teno virus/physiology , Alleles , Biomarkers , Child, Preschool , Disease Susceptibility , Evolution, Molecular , Female , Genome, Viral , Genomics/methods , Genotype , Humans , Infant , Male , Metagenome , Metagenomics , Odds Ratio , Seasons
20.
BMC Bioinformatics ; 19(Suppl 18): 486, 2018 Dec 21.
Article in English | MEDLINE | ID: mdl-30577754

ABSTRACT

BACKGROUND: The National Cancer Institute drug pair screening effort against 60 well-characterized human tumor cell lines (NCI-60) presents an unprecedented resource for modeling combinational drug activity. RESULTS: We present a computational model for predicting cell line response to a subset of drug pairs in the NCI-ALMANAC database. Based on residual neural networks for encoding features as well as predicting tumor growth, our model explains 94% of the response variance. While our best result is achieved with a combination of molecular feature types (gene expression, microRNA and proteome), we show that most of the predictive power comes from drug descriptors. To further demonstrate value in detecting anticancer therapy, we rank the drug pairs for each cell line based on model predicted combination effect and recover 80% of the top pairs with enhanced activity. CONCLUSIONS: We present promising results in applying deep learning to predicting combinational drug response. Our feature analysis indicates screening data involving more cell lines are needed for the models to make better use of molecular features.
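
A statement such as "explains 94% of the response variance" is conventionally read as a coefficient of determination (R²) of 0.94. The sketch below shows how that quantity is computed; the function name and the numbers are hypothetical and are not taken from the NCI-ALMANAC data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_residual = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_total = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_residual / ss_total


# Hypothetical measured and predicted growth responses.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(f"R^2 = {r_squared(measured, predicted):.2f}")  # R^2 = 0.98
```

An R² of 1 means the predictions reproduce every measurement exactly, while 0 means the model does no better than always predicting the mean response.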


Subject(s)
Deep Learning/trends , Drug Evaluation, Preclinical/methods , Cell Line, Tumor , Humans , National Cancer Institute (U.S.) , Neural Networks, Computer , United States