Results 1 - 20 of 120
1.
Nucleic Acids Res ; 51(5): 2333-2344, 2023 03 21.
Article in English | MEDLINE | ID: mdl-36727449

ABSTRACT

The clustered regularly interspaced short palindromic repeats (CRISPR) Cas system is a powerful tool that has the potential to become a therapeutic gene editor in the near future. Cas9 is the best studied CRISPR system and has been shown to have problems that restrict its use in therapeutic applications. Chromatin structure is known to affect Cas9 targeting, but Cas9's efficacy when targeting chromatin-bound locations remains poorly characterized. To quantify at a single base pair resolution how chromatin inhibits on-target gene editing relative to off-target editing of exposed mismatching targets, we developed the gene editor mismatch nucleosome inhibition assay (GEMiNI-seq). GEMiNI-seq utilizes a library of nucleosome sequences to examine all target locations throughout nucleosomes in a single assay. The results from GEMiNI-seq revealed that the location of the protospacer-adjacent motif (PAM) sequence on the nucleosome edge drives the ability of Cas9 to access its target sequence. In addition, Cas9 had a higher affinity for exposed mismatched targets than on-target sequences within a nucleosome. Overall, our results show how chromatin structure impacts the fidelity of Cas9 to potential targets and highlight how targeting sequences with exposed PAMs could limit off-target gene editing, with such considerations improving Cas9 efficacy and resolving current limitations.


Subject(s)
CRISPR-Cas Systems , Nucleosomes , CRISPR-Cas Systems/genetics , Nucleosomes/genetics , Gene Editing/methods , Gene Library
2.
Clin Microbiol Rev ; 36(1): e0004022, 2023 03 23.
Article in English | MEDLINE | ID: mdl-36645300

ABSTRACT

Preventing and controlling influenza virus infection remains a global public health challenge, as it causes everything from seasonal epidemics to unexpected pandemics. These infections are responsible for high morbidity, mortality, and substantial economic impact. Vaccines are the mainstay of prophylaxis in the fight against influenza. However, vaccination fails to confer complete protection due to inadequate vaccination coverage, vaccine shortages, and mismatches with circulating strains. Antivirals represent an important prophylactic and therapeutic measure to reduce influenza-associated morbidity and mortality, particularly in high-risk populations. Here, we review current FDA-approved influenza antivirals with their mechanisms of action, and different viral- and host-directed influenza antiviral approaches, including immunomodulatory interventions in clinical development. Furthermore, we also illustrate the potential utility of machine learning in developing next-generation antivirals against influenza.


Subject(s)
Influenza Vaccines , Influenza, Human , Orthomyxoviridae Infections , Orthomyxoviridae , Humans , Influenza, Human/drug therapy , Influenza, Human/prevention & control , Antiviral Agents/pharmacology , Antiviral Agents/therapeutic use , Orthomyxoviridae Infections/drug therapy , Influenza Vaccines/therapeutic use
3.
Int J Mol Sci ; 24(2)2023 Jan 05.
Article in English | MEDLINE | ID: mdl-36674513

ABSTRACT

Pharmacogenomics is a rapidly growing field with the goal of providing personalized care to every patient. Previously, we developed the Computational Analysis of Novel Drug Opportunities (CANDO) platform for multiscale therapeutic discovery to screen optimal compounds for any indication/disease by performing analytics on their interactions using large protein libraries. We implemented a comprehensive precision medicine drug discovery pipeline within the CANDO platform to determine which drugs are most likely to be effective against mutant phenotypes of non-small cell lung cancer (NSCLC) based on the supposition that drugs with similar interaction profiles (or signatures) will have similar behavior and therefore show synergistic effects. CANDO predicted that osimertinib, an EGFR inhibitor, is most likely to synergize with four KRAS inhibitors. Validation studies with cellular toxicity assays confirmed that osimertinib in combination with ARS-1620, a KRAS G12C inhibitor, and BAY-293, a pan-KRAS inhibitor, showed a synergistic effect on decreasing cellular proliferation by acting on mutant KRAS. Gene expression studies revealed that MAPK expression is strongly correlated with decreased cellular proliferation following treatment with KRAS inhibitor BAY-293, but not treatment with ARS-1620 or osimertinib. These results indicate that our precision medicine pipeline may be used to identify compounds capable of synergizing with inhibitors of KRAS G12C, and to assess their likelihood of becoming drugs by understanding their behavior at the proteomic/interactomic scales.


Subject(s)
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Proto-Oncogene Proteins p21(ras)/genetics , Proteomics , Mutation , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/therapeutic use , Drug Combinations
4.
Molecules ; 27(9)2022 May 08.
Article in English | MEDLINE | ID: mdl-35566372

ABSTRACT

Humans are exposed to numerous compounds daily, some of which have adverse effects on health. Computational approaches for modeling toxicological data in conjunction with machine learning algorithms have gained popularity over the last few years. Machine learning approaches have been used to predict toxicity-related biological activities using chemical structure descriptors. However, toxicity-related proteomic features have not been fully investigated. In this study, we construct a computational pipeline using machine learning models for predicting the most important protein features responsible for the toxicity of compounds taken from the Tox21 dataset that is implemented within the multiscale Computational Analysis of Novel Drug Opportunities (CANDO) therapeutic discovery platform. Tox21 is a highly imbalanced dataset consisting of twelve in vitro assays, seven from the nuclear receptor (NR) signaling pathway and five from the stress response (SR) pathway, for more than 10,000 compounds. For the machine learning model, we employed a random forest with the combination of Synthetic Minority Oversampling Technique (SMOTE) and the Edited Nearest Neighbor (ENN) method (SMOTE+ENN), which is a resampling method to balance the activity class distribution. Within the NR and SR pathways, the activity of the aryl hydrocarbon receptor (NR-AhR) and the mitochondrial membrane potential (SR-MMP) were two of the top-performing twelve toxicity endpoints with AUCROCs of 0.90 and 0.92, respectively. The top extracted features for evaluating compound toxicity were analyzed for enrichment to highlight the implicated biological pathways and proteins. We validated our enrichment results for the activity of the AhR using a thorough literature search. Our case study showed that the selected enriched pathways and proteins from our computational pipeline are not only correlated with AhR toxicity but also form a cascading upstream/downstream arrangement. 
Our work elucidates significant relationships between protein and compound interactions computed using CANDO and the associated biological pathways to which the proteins belong for twelve toxicity endpoints. This novel study uses machine learning not only to predict and understand toxicity but also elucidates therapeutic mechanisms at a proteomic level for a variety of toxicity endpoints.


Subject(s)
Machine Learning , Proteomics , Algorithms , Drug Discovery/methods , Humans , Proteins
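The resampling idea named in the abstract above (SMOTE) can be illustrated with a minimal sketch. This is not the authors' pipeline: it shows only the SMOTE-style interpolation step, omits the ENN cleaning and the random forest entirely, and the function name `smote_oversample` and the toy 2-D vectors are assumptions made up for illustration.

```python
import math
import random

def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the chosen sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Toy, made-up descriptor vectors for the rare (active) class.
actives = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_oversample(actives, n_new=6)
print(len(actives) + len(new_points), "active samples after oversampling")
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the original actives occupy, which is the property that distinguishes this approach from simple duplication.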
5.
Molecules ; 26(9)2021 Apr 28.
Article in English | MEDLINE | ID: mdl-33925237

ABSTRACT

Drug repurposing, the practice of utilizing existing drugs for novel clinical indications, has tremendous potential for improving human health outcomes and increasing therapeutic development efficiency. The goal of multi-disease multitarget drug repurposing, also known as shotgun drug repurposing, is to develop platforms that assess the therapeutic potential of each existing drug for every clinical indication. Our Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun multitarget repurposing implements several pipelines for the large-scale modeling and simulation of interactions between comprehensive libraries of drugs/compounds and protein structures. In these pipelines, each drug is described by an interaction signature that is compared to all other signatures that are subsequently sorted and ranked based on similarity. Pipelines within the platform are benchmarked based on their ability to recover known drugs for all indications in our library, and predictions are generated based on the hypothesis that (novel) drugs with similar signatures may be repurposed for the same indication(s). The drug-protein interactions used to create the drug-proteome signatures may be determined by any screening or docking method, but the primary approach used thus far has been BANDOCK, our in-house bioanalytical or similarity docking protocol. In this study, we calculated drug-proteome interaction signatures using the publicly available molecular docking method Autodock Vina and created hybrid decision tree pipelines that combined our original bio- and chem-informatic approach with the goal of assessing and benchmarking their drug repurposing capabilities and performance. 
The hybrid decision tree pipeline outperformed the two docking-based pipelines from which it was synthesized, yielding an average indication accuracy of 13.3% at the top10 cutoff (the most stringent), relative to 10.9% and 7.1% for its constituent pipelines, and a random control accuracy of 2.2%. We demonstrate that docking-based virtual screening pipelines have unique performance characteristics and that the CANDO shotgun repurposing paradigm is not dependent on a specific docking method. Our results also provide further evidence that multiple CANDO pipelines can be synthesized to enhance drug repurposing predictive capability relative to their constituent pipelines. Overall, this study indicates that pipelines consisting of varied docking-based signature generation methods can capture unique and useful signals for accurate comparison of drug-proteome interaction signatures, leading to improvements in the benchmarking and predictive performance of the CANDO shotgun drug repurposing platform.


Subject(s)
Computational Biology/methods , Drug Discovery , Drug Repositioning , Molecular Docking Simulation , Molecular Dynamics Simulation , Drug Discovery/methods , Humans , Proteome , Proteomics/methods , Reproducibility of Results , Structure-Activity Relationship
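The benchmarking idea described in the abstract above (rank all drugs by signature similarity, then check whether a drug approved for the same indication is recovered within a top-k cutoff) can be sketched as follows. This is a simplified illustration, not the CANDO implementation: the use of a root-mean-square difference as the similarity measure, the pair-level averaging, and all drug names, indication names, and signature values are assumptions.

```python
import math

def rmsd(a, b):
    """Root-mean-square difference between two equal-length signatures."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def topk_accuracy(signatures, indications, k):
    """Fraction of drug-indication pairs for which another drug approved
    for the same indication appears among the k most similar drugs."""
    hits = total = 0
    for drug, sig in signatures.items():
        ranked = sorted(
            (d for d in signatures if d != drug),
            key=lambda d: rmsd(sig, signatures[d]),
        )[:k]
        for ind in indications[drug]:
            partners = {d for d in signatures
                        if d != drug and ind in indications[d]}
            if not partners:
                continue  # nothing to recover for a single-drug indication
            total += 1
            hits += bool(partners & set(ranked))
    return hits / total if total else 0.0

# Hypothetical interaction signatures (one score per protein).
signatures = {
    "drugA": [0.9, 0.1, 0.0, 0.7],
    "drugB": [0.8, 0.2, 0.1, 0.6],
    "drugC": [0.0, 0.9, 0.8, 0.1],
    "drugD": [0.1, 0.8, 0.9, 0.0],
}
indications = {
    "drugA": {"indication1"},
    "drugB": {"indication1"},
    "drugC": {"indication2"},
    "drugD": {"indication2"},
}
accuracy = topk_accuracy(signatures, indications, k=1)
print(f"top1 accuracy: {accuracy:.2f}")
```

Swapping `rmsd` for another distance changes only the `key` function, which mirrors the abstract's point that the paradigm does not depend on one specific scoring method.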
6.
J Chem Inf Model ; 60(9): 4131-4136, 2020 09 28.
Article in English | MEDLINE | ID: mdl-32515949

ABSTRACT

Traditional drug discovery methods focus on optimizing the efficacy of a drug against a single biological target of interest for a specific disease. However, evidence supports the multitarget theory, i.e., drugs work by exerting their therapeutic effects via interaction with multiple biological targets, which have multiple phenotypic effects. Analytics of drug-protein interactions on a large proteomic scale provides insight into disease systems while also allowing for prediction of putative therapeutics against specific indications. We present a Python package for analysis of drug-proteome and drug-disease relationships implementing the Computational Analysis of Novel Drug Opportunities (CANDO) platform. The CANDO package allows for rapid drug similarity assessment, most notably via an in-house interaction scoring protocol where billions of drug-protein interactions are rapidly scored and the similarity of drug-proteome interaction signatures is calculated. The package also implements a variety of benchmarking protocols for shotgun drug discovery and repurposing, i.e., to determine how every known drug is related to every other in the context of the indications/diseases for which they are approved. Drug predictions are generated through consensus scoring of the most similar compounds to drugs known to treat a particular indication. Support for comparing and ranking novel chemical entities, as well as machine learning modules for both benchmarking and putative drug candidate prediction is also available. The CANDO Python package is available on GitHub at https://github.com/ram-compbio/CANDO, through the Conda Python package installer, and at http://compbio.org/software/.


Subject(s)
Pharmaceutical Preparations , Proteomics , Drug Discovery , Proteome , Software
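The consensus scoring step mentioned in the abstract above can be sketched in a few lines. This sketch does not use or reproduce the CANDO package API; the function name `consensus_candidates`, the Euclidean distance, and the toy signatures are all assumptions chosen for illustration.

```python
from collections import Counter
import math

def consensus_candidates(signatures, known_drugs, k):
    """Rank compounds by how many known drugs list them among their
    k most similar neighbours (ties broken by name)."""
    votes = Counter()
    for drug in known_drugs:
        neighbours = sorted(
            (c for c in signatures if c not in known_drugs),
            key=lambda c: math.dist(signatures[drug], signatures[c]),
        )[:k]
        votes.update(neighbours)
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical signatures: three drugs known to treat one indication
# and three candidate compounds.
signatures = {
    "known1": (1.0, 0.0),
    "known2": (0.9, 0.1),
    "known3": (0.0, 1.0),
    "cand_x": (0.95, 0.05),
    "cand_y": (0.1, 0.9),
    "cand_z": (0.5, 0.5),
}
known = {"known1", "known2", "known3"}
ranking = consensus_candidates(signatures, known, k=2)
for compound, count in ranking:
    print(compound, count)
```

In this toy data the compound that is moderately similar to all three known drugs collects the most votes, which is the behaviour a consensus criterion is meant to reward.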
7.
J Chem Inf Model ; 60(3): 1509-1527, 2020 03 23.
Article in English | MEDLINE | ID: mdl-32069042

ABSTRACT

Small-molecule docking has proven to be invaluable for drug design and discovery. However, existing docking methods have several limitations such as improper treatment of the interactions of essential components in the chemical environment of the binding pocket (e.g., cofactors, metal ions, etc.), incomplete sampling of chemically relevant ligand conformational space, and the inability to consistently correlate docking scores of the best binding pose with experimental binding affinities. We present CANDOCK, a novel docking algorithm, that utilizes a hierarchical approach to reconstruct ligands from an atomic grid using graph theory and generalized statistical potential functions to sample biologically relevant ligand conformations. Our algorithm accounts for protein flexibility, solvent, metal ions, and cofactor interactions in the binding pocket that are traditionally ignored by current methods. We evaluate the algorithm on the PDBbind, Astex, and PINC proteins to show its ability to reproduce the binding mode of the ligands that is independent of the initial ligand conformation in these benchmarks. Finally, we identify the best selector and ranker potential functions such that the statistical score of the best selected docked pose correlates with the experimental binding affinities of the ligands for any given protein target. Our results indicate that CANDOCK is a generalized flexible docking method that addresses several limitations of current docking methods by considering all interactions in the chemical environment of a binding pocket for correlating the best-docked pose with biological activity. CANDOCK along with all structures and scripts used for benchmarking is available at https://github.com/chopralab/candock_benchmark.


Subject(s)
Algorithms , Proteins , Binding Sites , Drug Design , Ligands , Molecular Docking Simulation , Protein Binding , Protein Conformation , Proteins/metabolism
8.
Molecules ; 24(1)2019 Jan 04.
Article in English | MEDLINE | ID: mdl-30621144

ABSTRACT

Drug repurposing is a valuable tool for combating the slowing rates of novel therapeutic discovery. The Computational Analysis of Novel Drug Opportunities (CANDO) platform performs shotgun repurposing of 2030 indications/diseases using 3733 drugs/compounds to predict interactions with 46,784 proteins and relating them via proteomic interaction signatures. The accuracy is calculated by comparing interaction similarities of drugs approved for the same indications. We performed a unique subset analysis by breaking down the full protein library into smaller subsets and then recombining the best performing subsets into larger supersets. Up to 14% improvement in accuracy is seen upon benchmarking the supersets, representing a 100-1000-fold reduction in the number of proteins considered relative to the full library. Further analysis revealed that libraries comprised of proteins with more equitably diverse ligand interactions are important for describing compound behavior. Using one of these libraries to generate putative drug candidates against malaria, tuberculosis, and large cell carcinoma results in more drugs that could be validated in the biomedical literature compared to using those suggested by the full protein library. Our work elucidates the role of particular protein subsets and corresponding ligand interactions that play a role in drug repurposing, with implications for drug design and machine learning approaches to improve the CANDO platform.


Subject(s)
Computational Biology/methods , Drug Evaluation, Preclinical/methods , Proteins/chemistry , Proteomics , Drug Design , Humans , Machine Learning , Protein Binding , Proteins/antagonists & inhibitors , Proteins/classification
9.
Molecules ; 22(10)2017 Oct 20.
Article in English | MEDLINE | ID: mdl-29053626

ABSTRACT

Ebola virus disease (EVD) is a deadly global public health threat, with no currently approved treatments. Traditional drug discovery and development is too expensive and inefficient to react quickly to the threat. We review published research studies that utilize computational approaches to find or develop drugs that target the Ebola virus and synthesize their results. A variety of hypothesized and/or novel treatments are reported to have potential anti-Ebola activity. Approaches that utilize multi-targeting/polypharmacology have the most promise in treating EVD.


Subject(s)
Antiviral Agents/pharmacology , Drug Repositioning/methods , Hemorrhagic Fever, Ebola/drug therapy , Computational Biology/methods , Disease Outbreaks , Humans , Machine Learning
10.
Molecules ; 21(12)2016 Nov 25.
Article in English | MEDLINE | ID: mdl-27898018

ABSTRACT

Ebola virus disease (EVD) is extremely virulent with an estimated mortality rate of up to 90%. However, the state-of-the-art treatment for EVD is limited to quarantine and supportive care. The 2014 Ebola epidemic in West Africa, the largest in history, is believed to have caused more than 11,000 fatalities. The countries worst affected are also among the poorest in the world. Given the complexities, time, and resources required for a novel drug development, finding efficient drug discovery pathways is going to be crucial in the fight against future outbreaks. We have developed a Computational Analysis of Novel Drug Opportunities (CANDO) platform based on the hypothesis that drugs function by interacting with multiple protein targets to create a molecular interaction signature that can be exploited for rapid therapeutic repurposing and discovery. We used the CANDO platform to identify and rank FDA-approved drug candidates that bind and inhibit all proteins encoded by the genomes of five different Ebola virus strains. Top ranking drug candidates for EVD treatment generated by CANDO were compared to in vitro screening studies against Ebola virus-like particles (VLPs) by Kouznetsova et al. and genetically engineered Ebola virus and cell viability studies by Johansen et al. to identify drug overlaps between the in virtuale and in vitro studies as putative treatments for future EVD outbreaks. Our results indicate that integrating computational docking predictions on a proteomic scale with results from in vitro screening studies may be used to select and prioritize compounds for further in vivo and clinical testing. This approach will significantly reduce the lead time, risk, cost, and resources required to determine efficacious therapies against future EVD outbreaks.


Subject(s)
Antiviral Agents/therapeutic use , Hemorrhagic Fever, Ebola/drug therapy , Disease Outbreaks , Drug Approval/legislation & jurisprudence , Drug Discovery , Hemorrhagic Fever, Ebola/epidemiology , Humans , United States , United States Food and Drug Administration
11.
Bioinformatics ; 30(12): 1774-6, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-24532722

ABSTRACT

MOTIVATION: fast_protein_cluster is a fast, parallel and memory efficient package used to cluster 60 000 sets of protein models (with up to 550 000 models per set) generated by the Nutritious Rice for the World project. RESULTS: fast_protein_cluster is an optimized and extensible toolkit that supports Root Mean Square Deviation after optimal superposition (RMSD) and Template Modeling score (TM-score) as metrics. RMSD calculations using a laptop CPU are 60× faster than qcprot and 3× faster than current graphics processing unit (GPU) implementations. New GPU code further increases the speed of RMSD and TM-score calculations. fast_protein_cluster provides novel k-means and hierarchical clustering methods that are up to 250× and 2000× faster, respectively, than Clusco, and identify significantly more accurate models than Spicker and Clusco. AVAILABILITY AND IMPLEMENTATION: fast_protein_cluster is written in C++ using OpenMP for multi-threading support. Custom streaming Single Instruction Multiple Data (SIMD) extensions and advanced vector extension intrinsics code accelerate CPU calculations, and OpenCL kernels support AMD and Nvidia GPUs. fast_protein_cluster is available under the M.I.T. license. (http://software.compbio.washington.edu/fast_protein_cluster)


Subject(s)
Protein Conformation , Software , Algorithms , Cluster Analysis , Models, Molecular
12.
Indian J Microbiol ; 54(4): 403-12, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25320438

ABSTRACT

A total of 74 morphologically distinct bacterial colonies were selected during isolation of bacteria from different parts of tomato plant (rhizoplane, phylloplane and rhizosphere) as well as nearby bulk soil. The isolates were screened for plant growth promoting (PGP) traits such as production of indole acetic acid, siderophore, chitinase and hydrogen cyanide as well as phosphate solubilization. Seven isolates viz., NR4, NR6, RP3, PP1, RS4, RP6 and NR1 that exhibited multiple PGP traits were identified, based on morphological, biochemical and 16S rRNA gene sequence analysis, as species that belonged to four genera Aeromonas, Pseudomonas, Bacillus and Enterobacter. All the seven isolates were positive for 1-aminocyclopropane-1-carboxylate deaminase. Isolate NR6 was antagonistic to Fusarium solani and Fusarium moniliforme, and both PP1 and RP6 isolates were antagonistic to F. moniliforme. Except RP6, all isolates adhered significantly to glass surface suggestive of biofilm formation. Seed bacterization of tomato, groundnut, sorghum and chickpea with the seven bacterial isolates resulted in varied growth response in laboratory assay on half strength Murashige and Skoog medium. Most of the tomato isolates positively influenced tomato growth. The growth response was either neutral or negative with groundnut, sorghum and chickpea. Overall, the results suggested that bacteria with PGP traits do not positively influence the growth of all plants, and certain PGP bacteria may exhibit host-specificity. Among the isolates that positively influenced growth of tomato (NR1, RP3, PP1, RS4 and RP6) only RS4 was isolated from tomato rhizosphere. Therefore, the best PGP bacteria can also be isolated from zones other than rhizosphere or rhizoplane of a plant.

13.
Commun Biol ; 7(1): 529, 2024 May 04.
Article in English | MEDLINE | ID: mdl-38704509

ABSTRACT

Intra-organism biodiversity is thought to arise from epigenetic modification of constituent genes and post-translational modifications of translated proteins. Here, we show that post-transcriptional modifications, like RNA editing, may also contribute. RNA editing enzymes APOBEC3A and APOBEC3G catalyze the deamination of cytosine to uracil. RNAsee (RNA site editing evaluation) is a computational tool developed to predict the cytosines edited by these enzymes. We find that 4.5% of non-synonymous DNA single nucleotide polymorphisms that result in cytosine to uracil changes in RNA are probable sites for APOBEC3A/G RNA editing; the variant proteins created by such polymorphisms may also result from transient RNA editing. These polymorphisms are associated with over 20% of Medical Subject Headings across ten categories of disease, including nutritional and metabolic, neoplastic, cardiovascular, and nervous system diseases. Because RNA editing is transient and not organism-wide, future work is necessary to confirm the extent and effects of such editing in humans.


Subject(s)
APOBEC Deaminases , Cytidine Deaminase , RNA Editing , Humans , Cytidine Deaminase/metabolism , Cytidine Deaminase/genetics , Polymorphism, Single Nucleotide , Cytosine/metabolism , APOBEC-3G Deaminase/metabolism , APOBEC-3G Deaminase/genetics , Uracil/metabolism , Proteins/genetics , Proteins/metabolism , Cytosine Deaminase/genetics , Cytosine Deaminase/metabolism
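The coding consequence of a cytosine-to-uracil change, which the abstract above relies on when it speaks of non-synonymous changes, can be worked through with the standard genetic code. The sketch below is only an illustration of that classification; it does not predict editing sites and is unrelated to how RNAsee works. The function name `c_to_u_effect` is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def c_to_u_effect(codon, position):
    """Classify the effect of deaminating the C at `position` (0-2) of an RNA codon."""
    if codon[position] != "C":
        raise ValueError("no cytosine at that position")
    edited = codon[:position] + "U" + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[edited]
    if before == after:
        kind = "synonymous"
    elif after == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return edited, before, after, kind

for codon, pos in [("CCU", 0), ("CAA", 0), ("GCC", 2)]:
    print(codon, pos, c_to_u_effect(codon, pos))
```

Only the missense and nonsense outcomes alter the protein, so these are the cases in which an RNA edit could mimic a non-synonymous DNA polymorphism.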
14.
BMC Plant Biol ; 13: 43, 2013 Mar 15.
Article in English | MEDLINE | ID: mdl-23497186

ABSTRACT

BACKGROUND: The protein encoded by GmRLK18-1 (Glyma_18_02680 on chromosome 18) was a receptor like kinase (RLK) encoded within the soybean (Glycine max L. Merr.) Rhg1/Rfs2 locus. The locus underlies resistance to the soybean cyst nematode (SCN) Heterodera glycines (I.) and the causal agent of sudden death syndrome (SDS) Fusarium virguliforme (Aoki). Previously the leucine rich repeat (LRR) domain was expressed in Escherichia coli. RESULTS: The aims here were to evaluate the LRR domain's ability to homo-dimerize, bind larger proteins, and bind small peptides. Western analysis suggested homo-dimers could form after protein extraction from roots. The purified LRR domain, from residue 131-485, was seen to form a mixture of monomers and homo-dimers in vitro. Cross-linking experiments in vitro showed the H274N region was close (<11.1 Å) to the highly conserved cysteine residue C196 on the second homo-dimer subunit. Binding constants of 20-142 nM for peptides found in plant and nematode secretions were found. Effects on plant phenotypes including wilting, stem bending and resistance to infection by SCN were observed when roots were treated with 50 pM of the peptides. Far-Western analyses followed by MS showed methionine synthase and cyclophilin bound strongly to the LRR domain. A second LRR from GmRLK08-1 (Glyma_08_g11350) did not show these strong interactions. CONCLUSIONS: The LRR domain of the GmRLK18-1 protein formed both a monomer and a homo-dimer. The LRR domain bound avidly to 4 different CLE peptides, a cyclophilin and a methionine synthase. The CLE peptides GmTGIF, GmCLE34, GmCLE3 and HgCLE were previously reported to be involved in root growth inhibition but here GmTGIF and HgCLE were shown to alter stem morphology and resistance to SCN. One of several models from homology and ab-initio modeling was partially validated by cross-linking. 
The 3 amino acid replacements present among RLK allotypes, A87V, Q115K and H274N, were predicted to alter domain stability and function. Therefore, the LRR domain of GmRLK18-1 might underlie both root development and disease resistance in soybean and provide an avenue to develop new variants and ligands that might promote reduced losses to SCN.


Subject(s)
Fusarium/pathogenicity , Glycine max/metabolism , Nematoda/pathogenicity , Plant Diseases/microbiology , Plant Diseases/parasitology , Plant Proteins/chemistry , Plant Proteins/metabolism , Animals , Dimerization , Disease Resistance/genetics , Disease Resistance/physiology , Plant Proteins/genetics , Glycine max/genetics
15.
Bioinformatics ; 28(16): 2191-2, 2012 Aug 15.
Article in English | MEDLINE | ID: mdl-22718788

ABSTRACT

MOTIVATION: Accurate comparisons of different protein structures play important roles in structural biology, structure prediction and functional annotation. The root-mean-square-deviation (RMSD) after optimal superposition is the predominant measure of similarity due to the ease and speed of computation. However, global RMSD is dependent on the length of the protein and can be dominated by divergent loops that can obscure local regions of similarity. A more sophisticated measure of structure similarity, Template Modeling (TM)-score, avoids these problems, and it is one of the measures used by the community-wide experiments of critical assessment of protein structure prediction to compare predicted models with experimental structures. TM-score calculations are, however, much slower than RMSD calculations. We have therefore implemented a very fast version of TM-score for Graphical Processing Units (TM-score-GPU), using a novel hybrid Kabsch/quaternion method for calculating the optimal superposition and RMSD that is designed for parallel applications. This acceleration in speed allows TM-score to be used efficiently in computationally intensive applications such as for clustering of protein models and genome-wide comparisons of structure. RESULTS: TM-score-GPU was applied to six sets of models from Nutritious Rice for the World for a total of 3 million comparisons. TM-score-GPU is 68 times faster on an ATI 5870 GPU, on average, than the original CPU single-threaded implementation on an AMD Phenom II 810 quad-core processor. AVAILABILITY AND IMPLEMENTATION: The complete source, including the GPU code and the hybrid RMSD subroutine, can be downloaded and used without restriction at http://software.compbio.washington.edu/misc/downloads/tmscore/. The implementation is in C++/OpenCL.


Subject(s)
Computational Biology/methods , Protein Conformation , Proteins/chemistry , Software , Algorithms
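The contrast the abstract above draws between RMSD and TM-score can be made concrete with a small sketch. It evaluates both measures for one fixed superposition, given per-residue distances; the full TM-score additionally searches for the superposition that maximizes the value, and that search (the expensive part the abstract accelerates on the GPU) is not shown. The distance values are made up, and the sketch is in Python rather than the C++/OpenCL of the released code.

```python
import math

def tm_score(distances, target_length):
    """TM-score contribution for one fixed superposition.

    distances: distances (angstroms) between aligned residue pairs
    target_length: number of residues in the reference structure
    """
    # Length-dependent scale d0; only meaningful for chains well above 15 residues.
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("target_length too short for the d0 formula")
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

def rmsd(distances):
    """Root-mean-square deviation over the same aligned pairs."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

# Hypothetical 100-residue model: a well-fitting core plus a divergent loop.
core = [1.0] * 95
loop = [20.0] * 5
distances = core + loop

print(f"RMSD of core only: {rmsd(core):.2f}")
print(f"RMSD with loop:    {rmsd(distances):.2f}")
print(f"TM-score:          {tm_score(distances, 100):.3f}")
```

Each residue contributes at most 1/L to the TM-score, so a handful of badly placed loop residues lowers the score only slightly, whereas their squared distances dominate the RMSD.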
16.
Int J Mol Sci ; 14(7): 14892-907, 2013 Jul 17.
Article in English | MEDLINE | ID: mdl-23867606

ABSTRACT

Protein structure information is essential to understand protein function. Computational methods to accurately predict protein structure from the sequence have primarily been evaluated on protein sequences representing full-length native proteins. Here, we demonstrate that top-performing structure prediction methods can accurately predict the partial structures of proteins encoded by sequences that contain approximately 50% or more of the full-length protein sequence. We hypothesize that structure prediction may be useful for predicting functions of proteins whose corresponding genes are mapped expressed sequence tags (ESTs) that encode partial-length amino acid sequences. Additionally, we identify a confidence score representing the quality of a predicted structure as a useful means of predicting the likelihood that an arbitrary polypeptide sequence represents a portion of a foldable protein sequence ("foldability"). This work has ramifications for the prediction of protein structure with limited or noisy sequence information, as well as genome annotation.


Subject(s)
Proteins/chemistry , Databases, Protein , Expressed Sequence Tags , Protein Folding , Protein Structure, Tertiary , Proteins/metabolism , Software
17.
Front Pharmacol ; 14: 1113007, 2023.
Article in English | MEDLINE | ID: mdl-37180722

ABSTRACT

The two most common reasons for attrition in therapeutic clinical trials are efficacy and safety. We integrated heterogeneous data to create a human interactome network to comprehensively describe drug behavior in biological systems, with the goal of accurate therapeutic candidate generation. The Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun multiscale therapeutic discovery, repurposing, and design was enhanced by integrating drug side effects, protein pathways, protein-protein interactions, protein-disease associations, and the Gene Ontology, and complemented with its existing drug/compound, protein, and indication libraries. These integrated networks were reduced to a "multiscale interactomic signature" for each compound that describes its functional behavior as vectors of real values. These signatures are then used for relating compounds to each other with the hypothesis that similar signatures yield similar behavior. Our results indicated that there is significant biological information captured within our networks (particularly via side effects) which enhances the performance of our platform, as evaluated by performing all-against-all leave-one-out drug-indication association benchmarking as well as generating novel drug candidates for colon cancer and migraine disorders corroborated via literature search. Further, drug impacts on pathways derived from computed compound-protein interaction scores served as the features for a random forest machine learning model trained to predict drug-indication associations, with applications to mental disorders and cancer metastasis highlighted. This interactomic pipeline highlights the ability of Computational Analysis of Novel Drug Opportunities to accurately relate drugs in a multitarget and multiscale context, particularly for generating putative drug candidates using the information gleaned from indirect data such as side effect profiles and protein pathway information.

18.
Methods Mol Biol ; 2673: 111-122, 2023.
Article in English | MEDLINE | ID: mdl-37258909

ABSTRACT

Epitopes are the cornerstones for the development of rational vaccine design strategies. Conventionally, epitopes are used by chemically conjugating them to a carrier protein. This chapter describes our computational epitope grafting methodology to identify the preferential grafting site in a carrier protein/scaffold. We have used the mota epitope as an example, as it was already experimentally validated by an independent group. In this chapter, we have provided sufficient details to enable wet-lab experimentalists to employ this computational methodology in their own research. Scripts/programs are extensively described in this chapter and freely accessible through the provided link.


Subject(s)
Carrier Proteins , Computational Biology , Epitopes , Epitopes, T-Lymphocyte , Epitopes, B-Lymphocyte
19.
bioRxiv ; 2023 Jul 31.
Article in English | MEDLINE | ID: mdl-37577456

ABSTRACT

Intra-organism biodiversity is thought to arise from epigenetic modification of our constituent genes and post-translational modifications after mRNA is translated into proteins. We have found that post-transcriptional modification, also known as RNA editing, is also responsible for a significant amount of our biodiversity, substantively expanding this story. The APOBEC (apolipoprotein B mRNA editing catalytic polypeptide-like) family RNA editing enzymes APOBEC3A and APOBEC3G catalyze the deamination of cytosines to uracils (C>U) in specific stem-loop structures [1,2]. We used RNAsee (RNA site editing evaluation), a tool developed to predict the locations of APOBEC3A/G RNA editing sites, to determine whether known single nucleotide polymorphisms (SNPs) in DNA could be replicated in RNA via RNA editing. About 4.5% of non-synonymous SNPs which result in C>U changes in RNA, and about 5.4% of such SNPs labelled as pathogenic, were identified as probable sites for APOBEC3A/G editing. This suggests that the variant proteins created by these DNA mutations may also be created by transient RNA editing, with the potential to affect human health. Those SNPs identified as potential APOBEC3A/G-mediated RNA editing sites were disproportionately associated with cardiovascular diseases, digestive system diseases, and musculoskeletal diseases. Future work should focus on common sites of RNA editing, any variant proteins created by these RNA editing sites, and the effects of these variants on protein diversity and human health. Classically, our biodiversity is thought to come from our constitutive genetics, epigenetic phenomena, transcriptional differences, and post-translational modification of proteins. Here, we have shown evidence that RNA editing, often stimulated by environmental factors, could account for a significant degree of the protein biodiversity leading to human disease.
In an era where worries about our changing environment are ever increasing, from the warming of our climate to the emergence of new diseases to the infiltration of microplastics and pollutants into our bodies, understanding how environmentally sensitive mechanisms like RNA editing affect our own cells is essential.

20.
Biomacromolecules ; 13(11): 3494-502, 2012 Nov 12.
Article in English | MEDLINE | ID: mdl-22974364

ABSTRACT

Enamel matrix self-assembly has long been suggested as the driving force behind aligned nanofibrous hydroxyapatite formation. We tested if amelogenin, the main enamel matrix protein, can self-assemble into ribbon-like structures in physiologic solutions. Ribbons 17 nm wide were observed to grow several micrometers in length, requiring calcium, phosphate, and pH 4.0-6.0. The pH range suggests that the formation of ion bridges through protonated histidine residues is essential to self-assembly, supported by a statistical analysis of 212 phosphate-binding proteins predicting 12 phosphate-binding histidines. Thermophoretic analysis verified the importance of calcium and phosphate in self-assembly. X-ray scattering characterized amelogenin dimers with dimensions fitting the cross-section of the amelogenin ribbon, leading to the hypothesis that antiparallel dimers are the building blocks of the ribbons. Over 5-7 days, ribbons self-organized into bundles composed of aligned ribbons mimicking the structure of enamel crystallites in enamel rods. These observations confirm reports of filamentous organic components in developing enamel and provide a new model for matrix-templated enamel mineralization.


Subject(s)
Amelogenin/chemistry , Dental Enamel Proteins/chemistry , Protein Multimerization , Calcium/chemistry , Hydrogen-Ion Concentration , Microscopy, Atomic Force , Microscopy, Electron, Transmission , Nanotubes, Carbon , Phosphates/chemistry