Results 1 - 20 of 22
1.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36759333

ABSTRACT

Knowledge of the contacting residue pairs between interacting proteins is very useful for the structural characterization of protein-protein interactions (PPIs). However, accurately identifying the tens of contacting pairs among hundreds of thousands of inter-protein residue pairs is extremely challenging, and the performance of state-of-the-art inter-protein contact prediction methods is still quite limited. In this study, we developed a deep learning method for inter-protein contact prediction, referred to as DRN-1D2D_Inter. Specifically, we employed pretrained protein language models to generate structural information-enriched input features for residual networks formed by dimensional hybrid residual blocks to perform inter-protein contact prediction. Extensively benchmarking DRN-1D2D_Inter on multiple datasets, including both heteromeric and homomeric PPIs, we show that DRN-1D2D_Inter consistently and significantly outperformed two state-of-the-art inter-protein contact prediction methods, GLINTER and DeepHomo, even though both of these methods leveraged the native structures of the interacting proteins, whereas DRN-1D2D_Inter made its predictions purely from sequences. We further show that applying the predicted contacts as constraints for protein-protein docking can significantly improve its performance for protein complex structure prediction.


Subjects
Algorithms , Computational Biology , Computational Biology/methods , Proteins/chemistry
2.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35649388

ABSTRACT

AlphaFold2 can predict protein complex structures as long as a multiple sequence alignment (MSA) of the interologs of the target protein-protein interaction (PPI) can be provided. In this study, a simplified phylogeny-based approach was applied to generate the MSA of interologs, which was then used as the input to AlphaFold2 for protein complex structure prediction. Benchmarking this protocol extensively on a nonredundant PPI dataset comprising 107 bacterial PPIs and 442 eukaryotic PPIs, we show that the complex structures of 79.5% of the bacterial PPIs and 49.8% of the eukaryotic PPIs can be successfully predicted, a significantly better performance than that obtained with MSAs of interologs prepared by two existing approaches. Considering that PPIs may not be conserved across species separated by long evolutionary distances, we further restricted the interologs in the MSA to different taxonomic ranks of the species of the target PPI. We found that the success rates can be increased to 87.9% for the bacterial PPIs and 56.3% for the eukaryotic PPIs if the interologs in the MSA are restricted to a specific taxonomic rank of the species of each target PPI. Finally, we show that the optimal taxonomic rank for protein complex structure prediction can be selected using the predicted template modeling (TM) scores of the output models.


Subjects
Protein Interaction Mapping , Proteins , Phylogeny , Protein Interaction Mapping/methods , Proteins/chemistry , Sequence Alignment
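
The rank-restriction step described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the lineage dictionaries, the helper names `filter_by_rank` and `pick_best_rank`, and the example taxa are all hypothetical, and AlphaFold2 itself is not invoked.

```python
from typing import Dict, List, Optional

# Hypothetical record: each interolog row of the paired MSA carries the lineage
# of its source species as a {rank: taxon} mapping.
Lineage = Dict[str, str]


def filter_by_rank(target: Lineage, interologs: List[Lineage], rank: str) -> List[int]:
    """Return indices of interologs that share the target's taxon at `rank`."""
    taxon = target.get(rank)
    if taxon is None:
        return []
    return [i for i, lin in enumerate(interologs) if lin.get(rank) == taxon]


def pick_best_rank(predicted_tm: Dict[str, float]) -> Optional[str]:
    """Pick the rank whose output model has the highest predicted TM-score."""
    if not predicted_tm:
        return None
    return max(predicted_tm, key=predicted_tm.get)


# Toy data (assumed, for illustration only).
target = {"phylum": "Proteobacteria", "class": "Gammaproteobacteria"}
interologs = [
    {"phylum": "Proteobacteria", "class": "Gammaproteobacteria"},
    {"phylum": "Proteobacteria", "class": "Alphaproteobacteria"},
    {"phylum": "Firmicutes", "class": "Bacilli"},
]
kept_phylum = filter_by_rank(target, interologs, "phylum")  # rows kept at phylum level
kept_class = filter_by_rank(target, interologs, "class")    # rows kept at class level
```

In such a scheme, one model would be built per candidate rank and `pick_best_rank` would be called with the predicted TM-scores of those models to choose among them.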
3.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35037015

ABSTRACT

Direct coupling analysis (DCA) has been widely used to infer evolutionarily coupled residue pairs from the multiple sequence alignment (MSA) of homologous sequences. However, effectively selecting residue pairs with significant evolutionary couplings from the results of DCA is a non-trivial task. In this study, we developed a general statistical framework for significant evolutionary coupling detection, referred to as irreproducible discovery rate (IDR)-DCA, which is based on reproducibility analysis of the coupling scores obtained from DCA on manually created MSA replicates. IDR-DCA was applied to select residue pairs for contact prediction for monomeric proteins, protein-protein interactions and monomeric RNAs, in which three different versions of DCA were applied. We demonstrated that with the application of IDR-DCA, the residue pairs selected using a universal threshold always yielded stable performance for contact prediction. Compared with carefully tuned coupling score cutoffs, IDR-DCA always showed better performance. The robustness of IDR-DCA was also supported by MSA downsampling analysis. We further demonstrated the effectiveness of applying constraints obtained from residue pairs selected by IDR-DCA to assist RNA secondary structure prediction.


Subjects
Algorithms , Proteins , Protein Structure, Secondary , Proteins/chemistry , RNA , Reproducibility of Results , Sequence Alignment
4.
Nat Methods ; 17(8): 807-814, 2020 08.
Article in English | MEDLINE | ID: mdl-32737473

ABSTRACT

Enhancers are important non-coding elements, but they have traditionally been hard to characterize experimentally. The development of massively parallel assays allows the characterization of large numbers of enhancers for the first time. Here, we developed a framework using Drosophila STARR-seq to create shape-matching filters based on meta-profiles of epigenetic features. We integrated these features with supervised machine-learning algorithms to predict enhancers. We further demonstrated that our model could be transferred to predict enhancers in mammals. We comprehensively validated the predictions using a combination of in vivo and in vitro approaches, involving transgenic assays in mice and transduction-based reporter assays in human cell lines (153 enhancers in total). The results confirmed that our model can accurately predict enhancers in different species without re-parameterization. Finally, we examined the transcription factor binding patterns at predicted enhancers versus promoters. We demonstrated that these patterns enable the construction of a secondary model that effectively distinguishes enhancers and promoters.


Subjects
Epigenesis, Genetic/physiology , Pattern Recognition, Automated/methods , Animals , Cell Line , Drosophila , Histones/genetics , Histones/metabolism , Humans , Mice , Mice, Transgenic , Reproducibility of Results
5.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34448830

ABSTRACT

Deep residual learning has shown great success in protein contact prediction. In this study, a new deep residual learning-based protein contact prediction model was developed. Compared with previous models, a new type of residual block hybridizing 1D and 2D convolutions was designed to increase the effective receptive field of the residual network, and a new loss function emphasizing the easily misclassified residue pairs was proposed to enhance the model training. The developed protein contact prediction model, referred to as DRN-1D2D, was first evaluated on 105 CASP11 targets, 76 CAMEO hard targets and 398 membrane proteins together with two reference models developed in house, based on either the standard 2D residual block or the traditional BCE loss function. From this comparison we confirmed that both the dimensional hybrid residual block and the singularity enhanced loss function can be employed to improve the model performance for protein contact prediction. DRN-1D2D was further evaluated on 39 CASP13 and CASP14 free modeling targets together with the two reference models and six state-of-the-art protein contact prediction models: DeepCov, DeepCon, DeepConPred2, SPOT-Contact, RaptorX-Contact and TripleRes. The results show that DRN-1D2D consistently achieved the best performance among all these models.


Subjects
Carrier Proteins/chemistry , Computational Biology/methods , Deep Learning , Protein Interaction Mapping/methods , Proteins/chemistry , Carrier Proteins/metabolism , Protein Binding , Proteins/metabolism , Reproducibility of Results , Software
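
The abstract does not give the form of the singularity enhanced loss function, so the sketch below substitutes the well-known focal loss, which likewise down-weights easy examples relative to binary cross-entropy (BCE) so that poorly classified residue pairs dominate training. It is only an illustration of the general idea, not the loss used in DRN-1D2D; the function names and the value of `gamma` are assumptions.

```python
import math


def bce(p: float, y: int) -> float:
    """Binary cross-entropy for one residue pair; p is the predicted contact
    probability (assumed strictly between 0 and 1) and y the true label."""
    p_t = p if y == 1 else 1.0 - p   # probability assigned to the true class
    return -math.log(p_t)


def focal(p: float, y: int, gamma: float = 2.0) -> float:
    """Focal loss (stand-in): BCE scaled by (1 - p_t)**gamma, which shrinks
    the contribution of pairs that are already classified well."""
    p_t = p if y == 1 else 1.0 - p
    return (1.0 - p_t) ** gamma * bce(p, y)


easy = (0.95, 1)   # true contact predicted confidently
hard = (0.10, 1)   # true contact predicted as a non-contact

# How much more the hard pair weighs than the easy pair under each loss.
ratio_bce = bce(*hard) / bce(*easy)
ratio_focal = focal(*hard) / focal(*easy)
```

With `gamma = 0` the focal loss reduces to BCE; larger values of `gamma` shift more of the total loss onto the misclassified pairs.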
6.
J Fluoresc ; 33(4): 1593-1602, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36790631

ABSTRACT

Rosin-based fluorescent polyurethane emulsion (FPU) was prepared using isophorone diisocyanate, ester of acrylic rosin and glycidyl methacrylate, 1,5-dihydroxy naphthalene (1,5-DN), and 1,4-butanediol as the raw materials. Then, rosin-based fluorescent polyurethane microspheres (FPUMs) were successfully prepared by suspension polymerization method using FPU as the main material, azodiisobutyronitrile as the initiator, and gelatin as the dispersant. FPUMs were characterized by Fourier transform infrared spectra, thermogravimetric analysis, optical microscopy, scanning electron microscopy and fluorescence spectra, and the response performance of FPUMs to pH was studied. The results showed that FPUMs were successfully prepared. With the increase of the level of 1,5-DN, the particle size of FPUMs increased gradually, and the fluorescence intensity increased first and then decreased. When the level of 1,5-DN was 3 wt.%, the average particle size was 49.3 µm, the particle distribution index (PDI) was 1.05, and the fluorescence intensity was the largest (3662 a.u.). The fluorescence intensity of FPUMs increased linearly with the decrease of pH, which can be used for pH detection in solution. Furthermore, the FPUMs exhibited good thermal stability, anti-interference and recoverability.

7.
J Comput Chem ; 39(28): 2409-2413, 2018 10 30.
Article in English | MEDLINE | ID: mdl-30368849

ABSTRACT

Protein-peptide interactions play a crucial role in a variety of cellular processes. The protein-peptide complex structure is key to understanding the mechanisms underlying protein-peptide interactions and is critical for peptide therapeutic development. We present a user-friendly protein-peptide docking server, MDockPeP. Starting from a peptide sequence and a protein receptor structure, the MDockPeP Server globally docks the all-atom, flexible peptide to the protein receptor. The produced modes are then evaluated with a statistical potential-based scoring function, ITScorePeP. This method was systematically validated using the peptiDB benchmarking database. At least one near-native peptide binding mode was ranked among the top 10 (or top 500) in 59% (85%) of the bound cases, and in 40.6% (71.9%) of the challenging unbound cases. The server can be used for both protein-peptide complex structure prediction and initial-stage sampling of the protein-peptide binding modes for other docking or simulation methods. MDockPeP Server is freely available at http://zougrouptoolkit.missouri.edu/mdockpep. © 2018 Wiley Periodicals, Inc.


Subjects
Computers , Internet , Molecular Docking Simulation , Peptides/chemistry , Proteins/chemistry , Databases, Protein , Protein Binding , Protein Conformation
8.
Bioinformatics ; 33(14): 2199-2201, 2017 Jul 15.
Article in English | MEDLINE | ID: mdl-28369339

ABSTRACT

SUMMARY: Genome-wide proximity ligation based assays like Hi-C have opened a window to the 3D organization of the genome. In so doing, they present data structures that are different from conventional 1D signal tracks. To exploit the 2D nature of Hi-C contact maps, matrix techniques like spectral analysis are particularly useful. Here, we present HiC-spector, a collection of matrix-related functions for analyzing Hi-C contact maps. In particular, we introduce a novel reproducibility metric for quantifying the similarity between contact maps based on spectral decomposition. The metric successfully separates contact maps derived from Hi-C data coming from biological replicates, pseudo-replicates and different cell types. AVAILABILITY AND IMPLEMENTATION: Source code in Julia and Python and detailed documentation are available at https://github.com/gersteinlab/HiC-spector. CONTACT: koonkiu.yan@gmail.com or mark@gersteinlab.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chromosomes/chemistry , Genetic Techniques , Genome , Biotinylation , DNA/chemistry , Gene Library , Humans , Reproducibility of Results
9.
Proteins ; 85(3): 424-434, 2017 03.
Article in English | MEDLINE | ID: mdl-27802576

ABSTRACT

Protein-protein interactions are either through direct contacts between two binding partners or mediated by structural waters. Both direct contacts and water-mediated interactions are crucial to the formation of a protein-protein complex. During the recent CAPRI rounds, a novel parallel searching strategy for predicting water-mediated interactions is introduced into our protein-protein docking method, MDockPP. Briefly, a FFT-based docking algorithm is employed in generating putative binding modes, and an iteratively derived statistical potential-based scoring function, ITScorePP, in conjunction with biological information is used to assess and rank the binding modes. Up to 10 binding modes are selected as the initial protein-protein complex structures for MD simulations in explicit solvent. Water molecules near the interface are clustered based on the snapshots extracted from independent equilibrated trajectories. Then, protein-ligand docking is employed for a parallel search for water molecules near the protein-protein interface. The water molecules generated by ligand docking and the clustered water molecules generated by MD simulations are merged, referred to as the predicted structural water molecules. Here, we report the performance of this protocol for CAPRI rounds 28-29 and 31-35 containing 20 valid docking targets and 11 scoring targets. In the docking experiments, we predicted correct binding modes for nine targets, including one high-accuracy, two medium-accuracy, and six acceptable predictions. Regarding the two targets for the prediction of water-mediated interactions, we achieved models ranked as "excellent" in accordance with the CAPRI evaluation criteria; one of these two targets is considered as a difficult target for structural water prediction. Proteins 2017; 85:424-434. © 2016 Wiley Periodicals, Inc.


Subjects
Algorithms , Computational Biology/methods , Molecular Docking Simulation/methods , Proteins/chemistry , Water/chemistry , Benchmarking , Binding Sites , Molecular Dynamics Simulation , Protein Binding , Protein Conformation , Protein Interaction Mapping , Protein Multimerization , Research Design , Software , Structural Homology, Protein , Thermodynamics
10.
J Comput Aided Mol Des ; 31(8): 689-699, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28668990

ABSTRACT

The growing number of protein-ligand complex structures in the Protein Data Bank, particularly the structures of proteins co-bound with different ligands, helps us tackle two major challenges in molecular docking studies: protein flexibility and the scoring function. Here, we introduced a systematic strategy that uses the information embedded in the known protein-ligand complex structures to improve both binding mode and binding affinity predictions. Specifically, a ligand similarity calculation method was employed to search for a receptor structure whose bound ligand shares high similarity with the query ligand, and that structure was then used for docking. The strategy was applied to the two datasets (HSP90 and MAP4K4) in the recent D3R Grand Challenge 2015. In addition, for the HSP90 dataset, a system-specific scoring function (ITScore2_hsp90) was generated by recalibrating our statistical potential-based scoring function (ITScore2) using the known protein-ligand complex structures and the statistical mechanics-based iterative method. For the HSP90 dataset, better performance was achieved for both binding mode and binding affinity predictions compared with the original ITScore2 and with ensemble docking. For the MAP4K4 dataset, although there were only eight known protein-ligand complex structures, our docking strategy achieved performance comparable to ensemble docking. Our method for receptor conformational selection and our iterative method for developing system-specific statistical potential-based scoring functions can be easily applied to other protein targets that have a number of protein-ligand complex structures available, to improve binding predictions.


Subjects
HSP90 Heat-Shock Proteins/chemistry , Intracellular Signaling Peptides and Proteins/chemistry , Molecular Docking Simulation , Protein Serine-Threonine Kinases/chemistry , Binding Sites , Databases, Protein , Drug Design , Humans , Ligands , Protein Binding , Protein Conformation
11.
J Chem Inf Model ; 56(6): 1013-21, 2016 06 27.
Article in English | MEDLINE | ID: mdl-26389744

ABSTRACT

In this study, we developed two iterative knowledge-based scoring functions, ITScore_pdbbind(rigid) and ITScore_pdbbind(flex), using rigid decoy structures and flexible decoy structures, respectively, that were generated from the protein-ligand complexes in the refined set of PDBbind 2012. These two scoring functions were evaluated using the 2013 and 2014 CSAR benchmarks. The results were compared with the results of two other scoring functions, the Vina scoring function and ITScore, the scoring function that we previously developed from rigid decoy structures for a smaller set of protein-ligand complexes. A graph-based method was developed to evaluate the root-mean-square deviation between two conformations of the same ligand with different atom names and orders due to different file preparations, and the program is freely available. Our study showed that the two new scoring functions developed from the larger training set yielded significantly improved performance in binding mode predictions. For binding affinity predictions, all four scoring functions showed protein-dependent performance. We suggest the development of protein-family-dependent scoring functions for accurate binding affinity prediction.


Subjects
Drug Discovery/methods , Molecular Docking Simulation , Benchmarking , Ligands , Protein Binding , Protein Conformation , Proteins/chemistry , Proteins/metabolism , Structure-Activity Relationship
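
A graph-based RMSD of the kind mentioned in this abstract can be illustrated as below. This is a brute-force sketch suitable only for small ligands, and it is not the authors' freely available program: the function name `symmetry_rmsd`, its argument layout, and the three-atom toy example are assumptions. The idea is to search all atom mappings that preserve element types and bonds, and to report the smallest RMSD among them, so that atom naming and ordering no longer matter.

```python
import math
from typing import List, Sequence, Set, Tuple

Coord = Tuple[float, float, float]
Bonds = Set[Tuple[int, int]]


def symmetry_rmsd(
    elems_a: Sequence[str], bonds_a: Bonds, xyz_a: Sequence[Coord],
    elems_b: Sequence[str], bonds_b: Bonds, xyz_b: Sequence[Coord],
) -> float:
    """Minimum RMSD over all atom mappings that preserve elements and bonds.

    No superposition is applied: both poses are assumed to be in the same
    receptor frame, as when a docked pose is compared with a reference pose.
    """
    n = len(elems_a)
    if n == 0 or n != len(elems_b) or len(bonds_a) != len(bonds_b):
        raise ValueError("ligands must be non-empty with equal atom and bond counts")
    adj_a = {frozenset(b) for b in bonds_a}
    adj_b = {frozenset(b) for b in bonds_b}
    best = math.inf
    mapping: List[int] = []      # mapping[i] = atom of B assigned to atom i of A
    used = [False] * n

    def extend(i: int, sq_sum: float) -> None:
        nonlocal best
        if sq_sum >= best:       # prune: this branch cannot beat the best mapping
            return
        if i == n:
            best = sq_sum
            return
        for j in range(n):
            if used[j] or elems_a[i] != elems_b[j]:
                continue
            # Bonds from atom i to already-mapped atoms must agree in both graphs.
            if any((frozenset((i, k)) in adj_a) != (frozenset((j, mapping[k])) in adj_b)
                   for k in range(i)):
                continue
            used[j] = True
            mapping.append(j)
            extend(i + 1, sq_sum + sum((p - q) ** 2 for p, q in zip(xyz_a[i], xyz_b[j])))
            mapping.pop()
            used[j] = False

    extend(0, 0.0)
    if math.isinf(best):
        raise ValueError("no element- and bond-preserving mapping exists")
    return math.sqrt(best / n)


# Toy O-C-O fragment written in two different atom orders (hypothetical data).
elems_a, bonds_a = ["C", "O", "O"], {(0, 1), (0, 2)}
xyz_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
elems_b, bonds_b = ["O", "O", "C"], {(0, 2), (1, 2)}
xyz_b = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
same_pose = symmetry_rmsd(elems_a, bonds_a, xyz_a, elems_b, bonds_b, xyz_b)
```

The two toy files describe the same pose, so a naive index-by-index RMSD would be misleadingly large while the mapping search recovers the correspondence. The search is exponential in the worst case; a production tool would use a proper graph-isomorphism algorithm.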
12.
Plant J ; 77(2): 222-34, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24245741

ABSTRACT

Plant recognition of pathogen-associated molecular patterns (PAMPs) such as bacterial flagellin-derived flg22 triggers rapid activation of mitogen-activated protein kinases (MAPKs) and generation of reactive oxygen species (ROS). Arabidopsis has at least four PAMP/pathogen-responsive MAPKs: MPK3, MPK6, MPK4 and MPK11. It was speculated that these MAPKs may function downstream of ROS in plant immunity because of their activation by exogenously added H2O2. MPK3/MPK6 or their orthologs in other plant species have also been reported to be involved in the ROS burst from the plant respiratory burst oxidase homolog (Rboh) of the human neutrophil gp91phox. However, detailed genetic analysis is lacking. Using a chemical genetic approach, we generated a conditional loss-of-function mpk3 mpk6 double mutant. Consistent with results obtained using a conditionally rescued mpk3 mpk6 double mutant generated previously, the results obtained using the new conditional loss-of-function mpk3 mpk6 double mutant demonstrate that the flg22-triggered ROS burst is independent of MPK3/MPK6. In Arabidopsis mutants lacking a functional AtRbohD, the flg22-induced ROS burst was completely blocked. However, activation of MPK3/MPK6 was not affected. Based on these results, we conclude that the rapid ROS burst and MPK3/MPK6 activation are two independent early signaling events in plant immunity, downstream of FLS2. We also found that MPK4 negatively affects the flg22-induced ROS burst. In addition, salicylic acid pre-treatment enhances the AtRbohD-mediated ROS burst, which is again independent of MPK3/MPK6 based on analysis of the mpk3 mpk6 double mutant. The establishment of an mpk3 mpk6 double mutant system using a chemical genetic approach provides a powerful tool to investigate the function of MPK3/MPK6 in the plant defense signaling pathway.


Subjects
Arabidopsis Proteins/metabolism , Arabidopsis/immunology , Mitogen-Activated Protein Kinase Kinases/metabolism , Mitogen-Activated Protein Kinases/metabolism , NADPH Oxidases/metabolism , Respiratory Burst , Signal Transduction , Arabidopsis/enzymology , Arabidopsis/metabolism , Enzyme Activation , Reactive Oxygen Species/metabolism
13.
J Comput Chem ; 36(1): 49-61, 2015 Jan 05.
Article in English | MEDLINE | ID: mdl-25363279

ABSTRACT

Short peptides play important roles in cellular processes including signal transduction, immune response, and transcription regulation. Correct identification of the peptide binding site on a given protein surface is of great importance not only for mechanistic investigation of these biological processes but also for therapeutic development. In this study, we developed a novel computational approach, referred to as ACCLUSTER, for predicting the peptide binding sites on protein surfaces. Specifically, we use the 20 standard amino acids as probes to globally scan the protein surface. The poses forming good chemical interactions with the protein are identified, followed by clustering with the density-based spatial clustering of applications with noise (DBSCAN) technique. Finally, these clusters are ranked based on their sizes. The cluster with the largest size is predicted as the putative binding site. Assessment of ACCLUSTER was performed on a diverse test set of 251 nonredundant protein-peptide complexes. The results were compared with the performance of POCASA, a pocket detection method for ligand binding site prediction. Peptidb, another protein-peptide database that contains both bound structures and unbound or homologous structures, was used to test the robustness of ACCLUSTER. The performance of ACCLUSTER was also compared with that of PepSite2 and PeptiMap, two recently developed methods for identifying peptide binding sites. The results showed that ACCLUSTER is a promising method for peptide binding site prediction. Additionally, ACCLUSTER was also shown to be applicable to nonpeptide ligand binding site prediction.


Subjects
Computational Biology , Peptides/chemistry , Proteins/chemistry , Algorithms , Amino Acids/chemistry , Binding Sites , Cluster Analysis , Molecular Docking Simulation , Surface Properties
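
The cluster-then-rank step described in this abstract can be sketched with a small, self-contained density-based clustering routine. The code below is an illustrative re-implementation of the standard DBSCAN procedure in plain Python, not the ACCLUSTER code; the probe coordinates, `eps`, and `min_pts` are hypothetical values chosen only to make the example readable.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def dbscan(points: Sequence[Point], eps: float, min_pts: int) -> List[int]:
    """Minimal DBSCAN; returns one label per point, with -1 marking noise."""
    n = len(points)
    # Neighborhoods include the point itself, so min_pts counts it as well.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [-1] * n
    visited = [False] * n
    cluster = -1
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbors[i]) < min_pts:
            continue                     # not a core point; noise unless claimed later
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # border or core point joins this cluster
            if visited[j]:
                continue
            visited[j] = True
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])   # expand only through core points
    return labels


def rank_clusters(labels: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (cluster_id, size) pairs sorted by decreasing size, ignoring noise."""
    return Counter(label for label in labels if label != -1).most_common()


# Hypothetical probe positions: a 3-pose patch, a 4-pose patch, one stray pose.
poses = [
    (10.0, 10.0, 10.0), (10.5, 10.0, 10.0), (10.0, 10.5, 10.0),
    (0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.5, 0.5, 0.0),
    (30.0, 30.0, 30.0),
]
labels = dbscan(poses, eps=1.0, min_pts=3)
ranking = rank_clusters(labels)
```

The first entry of `ranking` identifies the largest cluster, which under the scheme in the abstract would be reported as the putative binding site.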
14.
Elife ; 122024 Apr 02.
Article in English | MEDLINE | ID: mdl-38564241

ABSTRACT

Accurate prediction of contacting residue pairs between interacting proteins is very useful for structural characterization of protein-protein interactions. Although inter-protein contact prediction has improved significantly in recent years, there is still considerable room for improving the prediction accuracy. Here we present a new deep learning method, referred to as PLMGraph-Inter, for inter-protein contact prediction. Specifically, we employ rotationally and translationally invariant geometric graphs obtained from structures of interacting proteins to integrate multiple protein language models, which are successively transformed by graph encoders formed by geometric vector perceptrons and residual networks formed by dimensional hybrid residual blocks to predict inter-protein contacts. Extensive evaluation on multiple test sets illustrates that PLMGraph-Inter outperforms five top inter-protein contact prediction methods, including DeepHomo, GLINTER, CDPred, DeepHomo2, and DRN-1D2D_Inter, by large margins. In addition, we show that the predictions of PLMGraph-Inter can complement the results of AlphaFold-Multimer. Finally, we show that leveraging the contacts predicted by PLMGraph-Inter as constraints for protein-protein docking can dramatically improve its performance for protein complex structure prediction.


Subjects
Language , Neural Networks, Computer
15.
Proteins ; 81(12): 2183-91, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24227686

ABSTRACT

Inclusion of entropy is important and challenging for protein-protein binding prediction. Here, we present a statistical mechanics-based approach to empirically consider the effect of orientational entropy. Specifically, we globally sample the possible binding orientations based on a simple shape-complementarity scoring function using an FFT-type docking method. Then, for each generated orientation, we calculate the probability through the partition function of the ensemble of accessible states, which are assumed to be represented by the set of nearby binding modes. For each mode, the interaction energy is calculated using our ITScorePP scoring function that was developed in our laboratory based on principles of statistical mechanics. Using the above protocol, we present the results of our participation in Rounds 22-27 of the Critical Assessment of PRedicted Interactions (CAPRI) experiment for 10 targets (T46-T58). Additional experimental information, such as low-resolution small-angle X-ray scattering data, was used when available. In the prediction (or docking) experiments of the 10 target complexes, we achieved correct binding modes for six targets: one with high accuracy (T47), two with medium accuracy (T48 and T57), and three with acceptable accuracy (T49, T50, and T58). In the scoring experiments of seven target complexes, we obtained correct binding modes for six targets: one with high accuracy (T47), two with medium accuracy (T49 and T50), and three with acceptable accuracy (T46, T51, and T53).


Subjects
Molecular Docking Simulation , Protein Interaction Maps , Proteins/chemistry , Software , Algorithms , Computational Biology , Crystallography, X-Ray , Databases, Protein , Entropy , Models, Molecular , Protein Binding , Protein Conformation
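
The probability calculation described in this abstract can be illustrated with a small Boltzmann-weighting sketch. The abstract does not specify how nearby binding modes are chosen or how the probabilities are normalized, so the neighbor lists, the `kT` value, and the normalization over all orientations below are assumptions made purely for illustration; the energies are made-up numbers, not ITScorePP scores.

```python
import math
from typing import Dict, List, Sequence


def orientation_probabilities(
    energies: Sequence[float], neighbors: Dict[int, List[int]], kT: float = 1.0
) -> List[float]:
    """Probability of each orientation from the Boltzmann weights of its nearby modes.

    energies[j] is the score of binding mode j (lower is better); neighbors[i]
    lists the modes taken to represent the accessible states of orientation i
    (including i itself).
    """
    e_min = min(energies)    # shift energies for numerical stability
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    # Local partition function: sum of weights over the accessible states.
    z_local = [sum(weights[j] for j in neighbors[i]) for i in range(len(energies))]
    z_total = sum(z_local)
    return [z / z_total for z in z_local]


# Modes 0-2 form a broad basin; mode 3 is the lowest in energy but isolated.
energies = [-10.0, -9.5, -9.8, -10.2]
neighbors = {0: [0, 1, 2], 1: [0, 1, 2], 2: [0, 1, 2], 3: [3]}
probs = orientation_probabilities(energies, neighbors)
```

In this toy case the orientations in the broad basin receive a higher probability than the isolated lowest-energy mode, which is the qualitative effect an orientational-entropy term is meant to capture.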
16.
J Chem Inf Model ; 53(8): 1905-14, 2013 Aug 26.
Article in English | MEDLINE | ID: mdl-23656179

ABSTRACT

In this study, we use the recently released 2012 Community Structure-Activity Resource (CSAR) data set to evaluate two knowledge-based scoring functions, ITScore and STScore, and a simple force-field-based potential (VDWScore). The CSAR data set contains 757 compounds, most with known affinities, and 57 crystal structures. With the help of the script files for docking preparation, we use the full CSAR data set to evaluate the performances of the scoring functions on binding affinity prediction and active/inactive compound discrimination. The CSAR subset that includes crystal structures is used as well, to evaluate the performances of the scoring functions on binding mode and affinity predictions. Within this structure subset, we investigate the importance of accurate ligand and protein conformational sampling and find that the binding affinity predictions are less sensitive to non-native ligand and protein conformations than the binding mode predictions. We also find the full CSAR data set to be more challenging in making binding mode predictions than the subset with structures. The script files used for preparing the CSAR data set for docking, including scripts for canonicalization of the ligand atoms, are offered freely to the academic community.


Subjects
Databases, Pharmaceutical , Electronic Data Processing , Molecular Docking Simulation/methods , Automation , Crystallography, X-Ray , Protein Conformation , Structure-Activity Relationship
17.
Genome Biol ; 20(1): 109, 2019 05 29.
Article in English | MEDLINE | ID: mdl-31142351

ABSTRACT

Data science allows the extraction of practical insights from large-scale data. Here, we contextualize it as an umbrella term, encompassing several disparate subdomains. We focus on how genomics fits as a specific application subdomain, in terms of well-known 3 V data and 4 M process frameworks (volume-velocity-variety and measurement-mining-modeling-manipulation, respectively). We further analyze the technical and cultural "exports" and "imports" between genomics and other data-science subdomains (e.g., astronomy). Finally, we discuss how data value, privacy, and ownership are pressing issues for data science applications, in general, and are especially relevant to genomics, due to the persistent nature of DNA.


Subjects
Data Science , Genomics
18.
Structure ; 27(9): 1469-1481.e3, 2019 09 03.
Article in English | MEDLINE | ID: mdl-31279629

ABSTRACT

A key issue in drug design is how population variation affects drug efficacy by altering binding affinity (BA) in different individuals, an essential consideration for government regulators. Ideally, we would like to evaluate the BA perturbations of millions of single-nucleotide variants (SNVs). However, only hundreds of protein-drug complexes with SNVs have experimentally characterized BAs, constituting too small a gold standard for straightforward statistical model training. Thus, we take a hybrid approach: using physically based calculations to bootstrap the parameterization of a full model. In particular, we do 3D structure-based docking on ∼10,000 SNVs modifying known protein-drug complexes to construct a pseudo gold standard. Then we use this augmented set of BAs to train a statistical model combining structure, ligand and sequence features and illustrate how it can be applied to millions of SNVs. Finally, we show that our model has good cross-validated performance (97% AUROC) and can also be validated by orthogonal ligand-binding data.


Subjects
Computational Biology/methods , Polymorphism, Single Nucleotide , Proteins/chemistry , Proteins/genetics , Databases, Protein , Drug Design , Humans , Ligands , Machine Learning , Models, Statistical , Molecular Docking Simulation , Protein Binding , Protein Conformation , Proteins/metabolism
19.
Science ; 362(6420)2018 12 14.
Article in English | MEDLINE | ID: mdl-30545857

ABSTRACT

Despite progress in defining genetic risk for psychiatric disorders, their molecular mechanisms remain elusive. Addressing this, the PsychENCODE Consortium has generated a comprehensive online resource for the adult brain across 1866 individuals. The PsychENCODE resource contains ~79,000 brain-active enhancers, sets of Hi-C linkages, and topologically associating domains; single-cell expression profiles for many cell types; expression quantitative-trait loci (QTLs); and further QTLs associated with chromatin, splicing, and cell-type proportions. Integration shows that varying cell-type proportions largely account for the cross-population variation in expression (with >88% reconstruction accuracy). It also allows building of a gene regulatory network, linking genome-wide association study variants to genes (e.g., 321 for schizophrenia). We embed this network into an interpretable deep-learning model, which improves disease prediction by ~6-fold versus polygenic risk scores and identifies key genes and pathways in psychiatric disorders.


Subjects
Brain/metabolism , Gene Expression Regulation , Mental Disorders/genetics , Datasets as Topic , Deep Learning , Enhancer Elements, Genetic , Epigenesis, Genetic , Epigenomics , Gene Regulatory Networks , Genome-Wide Association Study , Humans , Quantitative Trait Loci , Single-Cell Analysis , Transcriptome
20.
Methods Mol Biol ; 1561: 3-9, 2017.
Article in English | MEDLINE | ID: mdl-28236229

ABSTRACT

Peptides mediate up to 40% of protein-protein interactions in a variety of cellular processes and are also attractive drug candidates. Thus, predicting peptide binding sites on a given protein structure is of great importance for mechanistic investigation of protein-peptide interactions and for the development of peptide therapeutics. In this chapter, we describe the usage of our web server, referred to as ACCLUSTER, for peptide binding site prediction for a given protein structure. ACCLUSTER is freely available for users without registration at http://zougrouptoolkit.missouri.edu/accluster.


Subjects
Databases, Protein , Peptide Fragments/metabolism , Proteins/metabolism , Animals , Binding Sites , Eukaryota/chemistry , Eukaryota/metabolism , Humans , Molecular Docking Simulation , Peptide Fragments/chemistry , Proteins/chemistry , Software , Web Browser