Search | State Department of Health

1.

AbFlex: designing antibody complementarity determining regions with flexible CDR definition.

Jeon, Woosung; Kim, Dongsup.

Bioinformatics ; 40(3), 2024 03 04.

Article in English | MEDLINE | ID: mdl-38449295

ABSTRACT

MOTIVATION: Antibodies are proteins that the immune system produces in response to foreign pathogens. Designing antibodies that specifically bind to antigens is a key step in developing antibody therapeutics. The complementarity determining regions (CDRs) of the antibody are mainly responsible for binding to the target antigen, and therefore must be designed to recognize the antigen. RESULTS: We develop an antibody design model, AbFlex, that exhibits state-of-the-art performance in terms of structure prediction accuracy and amino acid recovery rate. Furthermore, >38% of newly designed antibody models are estimated to have better binding energies for their antigens than wild types. The effectiveness of the model is attributed to two different strategies that are developed to overcome the difficulty associated with the scarcity of antibody-antigen complex structure data. One strategy is to use an equivariant graph neural network model that is more data-efficient. More importantly, a new data augmentation strategy based on the flexible definition of CDRs significantly increases the performance of the CDR prediction model. AVAILABILITY AND IMPLEMENTATION: The source code and implementation are available at https://github.com/wsjeon92/AbFlex.

Subjects

Antigen-Antibody Complex, Complementarity Determining Regions, Complementarity Determining Regions/chemistry, Complementarity Determining Regions/metabolism, Amino Acid Sequence, Molecular Models, Antigen-Antibody Complex/chemistry, Antigens

2.

TSpred: a robust prediction framework for TCR-epitope interactions using paired chain TCR sequence data.

Kim, Ha Young; Kim, Sungsik; Park, Woong-Yang; Kim, Dongsup.

Bioinformatics ; 40(8), 2024 08 02.

Article in English | MEDLINE | ID: mdl-39052940

ABSTRACT

MOTIVATION: Prediction of T-cell receptor (TCR)-epitope interactions is important for many applications in biomedical research, such as cancer immunotherapy and vaccine design. The prediction of TCR-epitope interactions remains challenging especially for novel epitopes, due to the scarcity of available data. RESULTS: We propose TSpred, a new deep learning approach for the pan-specific prediction of TCR binding specificity based on paired chain TCR data. We develop a robust model that generalizes well to unseen epitopes by combining the predictive power of CNN and the attention mechanism. In particular, we design a reciprocal attention mechanism which focuses on extracting the patterns underlying TCR-epitope interactions. Upon a comprehensive evaluation of our model, we find that TSpred achieves state-of-the-art performances in both seen and unseen epitope specificity prediction tasks. Also, compared to other predictors, TSpred is more robust to bias related to peptide imbalance in the dataset. In addition, the reciprocal attention component of our model allows for model interpretability by capturing structurally important binding regions. Results indicate that TSpred is a robust and reliable method for the task of TCR-epitope binding prediction. AVAILABILITY AND IMPLEMENTATION: Source code is available at https://github.com/ha01994/TSpred.

Subjects

T-Cell Antigen Receptors, T-Cell Antigen Receptors/metabolism, T-Cell Antigen Receptors/chemistry, T-Cell Antigen Receptors/immunology, T-Lymphocyte Epitopes/immunology, T-Lymphocyte Epitopes/chemistry, T-Lymphocyte Epitopes/metabolism, Humans, Deep Learning, Computational Biology/methods, Software, Protein Binding

3.

DeepLUCIA: predicting tissue-specific chromatin loops using Deep Learning-based Universal Chromatin Interaction Annotator.

Yang, Dongchan; Chung, Taesu; Kim, Dongsup.

Bioinformatics ; 38(14): 3501-3512, 2022 07 11.

Article in English | MEDLINE | ID: mdl-35640981

ABSTRACT

MOTIVATION: The importance of chromatin loops in gene regulation is broadly accepted. There are mainly two approaches to predict chromatin loops: the transcription factor (TF) binding-dependent approach and the genomic variation-based approach. However, neither of these approaches provides an adequate understanding of gene regulation in human tissues. To address this issue, we developed a deep learning-based chromatin loop prediction model called Deep Learning-based Universal Chromatin Interaction Annotator (DeepLUCIA). RESULTS: Although DeepLUCIA does not use TF binding profile data which previous TF binding-dependent methods critically rely on, its prediction accuracies are comparable to those of the previous TF binding-dependent methods. More importantly, DeepLUCIA enables tissue-specific chromatin loop predictions from tissue-specific epigenomes that cannot be handled by the genomic variation-based approach. We demonstrated the utility of DeepLUCIA by predicting several novel target genes of SNPs identified in genome-wide association studies targeting Brugada syndrome, COVID-19 severity and age-related macular degeneration. AVAILABILITY AND IMPLEMENTATION: DeepLUCIA is freely available at https://github.com/bcbl-kaist/DeepLUCIA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

COVID-19, Deep Learning, Humans, Chromatin, Genome-Wide Association Study, Genomics/methods

4.

Rising of LOXHD1 as a signature causative gene of down-sloping hearing loss in people in their teens and 20s.

Kim, Bong Jik; Jeon, Hyoung Won; Jeon, Woosung; Han, Jin Hee; Oh, Jayoung; Yi, Nayoung; Kim, Min Young; Kim, Minah; Kim, Justin Namju; Kim, Bo Hye; Hyon, Joon Young; Kim, Dongsup; Koo, Ja-Won; Oh, Doo-Yi; Choi, Byung Yoon.

J Med Genet ; 59(5): 470-480, 2022 05.

Article in English | MEDLINE | ID: mdl-33753533

ABSTRACT

BACKGROUND: Down-sloping sensorineural hearing loss (SNHL) in people in their teens and 20s hampers efficient learning and communication and in-depth social interactions. Nonetheless, its aetiology remains largely unclear, with the exception of some potential causative genes, none of which stands out especially in people in their teens and 20s. Here, we examined the role and genotype-phenotype correlation of lipoxygenase homology domain 1 (LOXHD1) in down-sloping SNHL through a cohort study. METHODS: Based on the Seoul National University Bundang Hospital (SNUBH) genetic deafness cohort, in which the patients show varying degrees of deafness and different onset ages (n=1055), we have established the 'SNUBH Teenager-Young Adult Down-sloping SNHL' cohort (10-35 years old) (n=47), all of whom underwent exome sequencing. Three-dimensional molecular modelling, minigene splicing assay and short tandem repeat marker genotyping were performed, and medical records were reviewed. RESULTS: LOXHD1 accounted for 33.3% of all genetically diagnosed cases of down-sloping SNHL (n=18) and 12.8% of cases in the whole down-sloping SNHL cohort (n=47) of young adults. We identified a potential common founder allele, as well as an interesting genotype-phenotype correlation. We also showed that transcript 6 is necessary and probably sufficient for normal hearing. CONCLUSIONS: LOXHD1 exceeds other genes in its contribution to down-sloping SNHL in young adults, rising as a signature causative gene, and shows a potential but interesting genotype-phenotype correlation.

Subjects

Deafness, Sensorineural Hearing Loss, Hearing Loss, Adolescent, Adult, Carrier Proteins/genetics, Cohort Studies, Sensorineural Hearing Loss/diagnosis, Sensorineural Hearing Loss/epidemiology, Sensorineural Hearing Loss/genetics, Humans, Lipoxygenase, Young Adult

5.

Hypomorphic Mutations in TONSL Cause SPONASTRIME Dysplasia.

Chang, Hae Ryung; Cho, Sung Yoon; Lee, Jae Hoon; Lee, Eunkyung; Seo, Jieun; Lee, Hye Ran; Cavalcanti, Denise P; Mäkitie, Outi; Valta, Helena; Girisha, Katta M; Lee, Chung; Neethukrishna, Kausthubham; Bhavani, Gandham S; Shukla, Anju; Nampoothiri, Sheela; Phadke, Shubha R; Park, Mi Jung; Ikegawa, Shiro; Wang, Zheng; Higgs, Martin R; Stewart, Grant S; Jung, Eunyoung; Lee, Myeong-Sok; Park, Jong Hoon; Lee, Eun A; Kim, Hongtae; Myung, Kyungjae; Jeon, Woosung; Lee, Kyoungyeul; Kim, Dongsup; Kim, Ok-Hwa; Choi, Murim; Lee, Han-Woong; Kim, Yonghwan; Cho, Tae-Joon.

Am J Hum Genet ; 104(3): 439-453, 2019 03 07.

Article in English | MEDLINE | ID: mdl-30773278

ABSTRACT

SPONASTRIME dysplasia is a rare, recessive skeletal dysplasia characterized by short stature, facial dysmorphism, and aberrant radiographic findings of the spine and long bone metaphysis. No causative genetic alterations for SPONASTRIME dysplasia have yet been determined. Using whole-exome sequencing (WES), we identified bi-allelic TONSL mutations in 10 of 13 individuals with SPONASTRIME dysplasia. TONSL is a multi-domain scaffold protein that interacts with DNA replication and repair factors and which plays critical roles in resistance to replication stress and the maintenance of genome integrity. We show here that cellular defects in dermal fibroblasts from affected individuals are complemented by the expression of wild-type TONSL. In addition, in vitro cell-based assays and in silico analyses of TONSL structure support the pathogenicity of those TONSL variants. Intriguingly, a knock-in (KI) Tonsl mouse model leads to embryonic lethality, implying the physiological importance of TONSL. Overall, these findings indicate that genetic variants resulting in reduced function of TONSL cause SPONASTRIME dysplasia and highlight the importance of TONSL in embryonic development and postnatal growth.

Subjects

Fibroblasts/pathology, Lethal Genes, Mutation, NF-kappa B/genetics, Osteochondrodysplasias/pathology, Adolescent, Adult, Animals, Cultured Cells, Child, Preschool Child, DNA Damage, Dermis/metabolism, Dermis/pathology, Female, Fibroblasts/metabolism, Humans, Infant, Newborn Infant, Mice, Inbred C57BL Mice, Osteochondrodysplasias/genetics, Exome Sequencing/methods, Young Adult

6.

CRDS: Consensus Reverse Docking System for target fishing.

Lee, Aeri; Kim, Dongsup.

Bioinformatics ; 36(3): 959-960, 2020 02 01.

Article in English | MEDLINE | ID: mdl-31432077

ABSTRACT

MOTIVATION: Identification of putative drug targets is a critical step for explaining the mechanism of drug action against multiple targets, finding new therapeutic indications for existing drugs and unveiling the adverse drug reactions. One important approach is to use the molecular docking. However, its widespread utilization has been hindered by the lack of easy-to-use public servers. Therefore, it is vital to develop a streamlined computational tool for target prediction by molecular docking on a large scale. RESULTS: We present a fully automated web tool named Consensus Reverse Docking System (CRDS), which predicts potential interaction sites for a given drug. To improve hit rates, we developed a strategy of consensus scoring. CRDS carries out reverse docking against 5254 candidate protein structures using three different scoring functions (GoldScore, Vina and LeDock from GOLD version 5.7.1, AutoDock Vina version 1.1.2 and LeDock version 1.0, respectively), and those scores are combined into a single score named Consensus Docking Score (CDS). The web server provides the list of top 50 predicted interaction sites, docking conformations, 10 most significant pathways and the distribution of consensus scores. AVAILABILITY AND IMPLEMENTATION: The web server is available at http://pbil.kaist.ac.kr/CRDS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Computers, Proteins, Consensus, Ligands, Molecular Conformation, Molecular Docking Simulation
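
The CRDS abstract says that scores from three scoring functions are combined into a single Consensus Docking Score, but it does not give the formula. The following is a minimal sketch of one common way to build such a consensus, assuming each function's scores are standardized to z-scores across targets and then averaged. The function names `zscores` and `consensus_scores`, the `higher_is_better` flags, and the toy numbers are all hypothetical and are not taken from CRDS.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of scores to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def consensus_scores(score_table, higher_is_better):
    """Average sign-corrected z-scores over all scoring functions.

    score_table: {function_name: [score for each target]}
    higher_is_better: {function_name: bool}; assumed, not from the paper.
    """
    n_targets = len(next(iter(score_table.values())))
    totals = [0.0] * n_targets
    for name, scores in score_table.items():
        # Flip energy-like scores so that a larger value always means better.
        sign = 1.0 if higher_is_better[name] else -1.0
        for i, z in enumerate(zscores(scores)):
            totals[i] += sign * z
    return [t / len(score_table) for t in totals]


# Toy data for three hypothetical targets.
targets = ["T1", "T2", "T3"]
table = {
    "GoldScore": [60.0, 45.0, 30.0],  # fitness-like score
    "Vina": [-9.0, -7.0, -5.0],       # energy-like score
    "LeDock": [-8.0, -6.0, -4.0],     # energy-like score
}
flags = {"GoldScore": True, "Vina": False, "LeDock": False}

cds = consensus_scores(table, flags)
ranking = sorted(zip(targets, cds), key=lambda pair: pair[1], reverse=True)
print(ranking)
```

Standardizing before averaging keeps a scoring function with a wide numeric range from dominating the consensus. Other schemes, such as rank averaging, would serve the same purpose.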

7.

Prediction of mutation effects using a deep temporal convolutional network.

Kim, Ha Young; Kim, Dongsup.

Bioinformatics ; 36(7): 2047-2052, 2020 04 01.

Article in English | MEDLINE | ID: mdl-31746978

ABSTRACT

MOTIVATION: Accurate prediction of the effects of genetic variation is a major goal in biological research. Towards this goal, numerous machine learning models have been developed to learn information from evolutionary sequence data. The most effective method so far is a deep generative model based on the variational autoencoder (VAE) that models the distributions using a latent variable. In this study, we propose a deep autoregressive generative model named mutationTCN, which employs dilated causal convolutions and attention mechanism for the modeling of inter-residue correlations in a biological sequence. RESULTS: We show that this model is competitive with the VAE model when tested against a set of 42 high-throughput mutation scan experiments, with a mean improvement in Spearman rank correlation of ∼0.023. In particular, our model can more efficiently capture information from multiple sequence alignments with lower effective number of sequences, such as in viral sequence families, compared with the latent variable model. Also, we extend this architecture to a semi-supervised learning framework, which shows high prediction accuracy. We show that our model enables a direct optimization of the data likelihood and allows for a simple and stable training process. AVAILABILITY AND IMPLEMENTATION: Source code is available at https://github.com/ha01994/mutationTCN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Machine Learning, Computer Neural Networks, Mutation, Sequence Alignment, Software
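
The mutationTCN abstract names dilated causal convolutions as a building block. The sketch below illustrates that generic operation in plain Python: each output position depends only on the current and earlier input positions, spaced `dilation` steps apart. It is not the mutationTCN implementation. The function name `dilated_causal_conv1d` and the toy input are hypothetical, and the attention mechanism is left out.

```python
def dilated_causal_conv1d(x, kernel, dilation):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Positions before the start of the sequence are treated as zero,
    so y[t] never depends on x[t + 1] or later (causality).
    """
    y = []
    for t in range(len(x)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                total += weight * x[idx]
        y.append(total)
    return y


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = dilated_causal_conv1d(x, kernel=[1.0, 1.0], dilation=2)
print(y)

# Causality check: changing the last input leaves earlier outputs unchanged.
x_changed = x[:-1] + [100.0]
y_changed = dilated_causal_conv1d(x_changed, kernel=[1.0, 1.0], dilation=2)
print(y_changed[:-1] == y[:-1])
```

Stacking such layers with increasing dilation widens the receptive field quickly, which is the usual reason for choosing dilated convolutions in sequence models.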

8.

FP2VEC: a new molecular featurizer for learning molecular properties.

Jeon, Woosung; Kim, Dongsup.

Bioinformatics ; 35(23): 4979-4985, 2019 12 01.

Article in English | MEDLINE | ID: mdl-31070725

ABSTRACT

MOTIVATION: One of the most successful methods for predicting the properties of chemical compounds is the quantitative structure-activity relationship (QSAR) methods. The prediction accuracy of QSAR models has recently been greatly improved by employing deep learning technology. Especially, newly developed molecular featurizers based on graph convolution operations on molecular graphs significantly outperform the conventional extended connectivity fingerprints (ECFP) feature in both classification and regression tasks, indicating that it is critical to develop more effective new featurizers to fully realize the power of deep learning techniques. Motivated by the fact that there is a clear analogy between chemical compounds and natural languages, this work develops a new molecular featurizer, FP2VEC, which represents a chemical compound as a set of trainable embedding vectors. RESULTS: To implement and test our new featurizer, we build a QSAR model using a simple convolutional neural network (CNN) architecture that has been successfully used for natural language processing tasks such as sentence classification task. By testing our new method on several benchmark datasets, we demonstrate that the combination of FP2VEC and CNN model can achieve competitive results in many QSAR tasks, especially in classification tasks. We also demonstrate that the FP2VEC model is especially effective for multitask learning. AVAILABILITY AND IMPLEMENTATION: FP2VEC is available from https://github.com/wsjeon92/FP2VEC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Deep Learning, Natural Language Processing, Quantitative Structure-Activity Relationship, Software

9.

3DIV: A 3D-genome Interaction Viewer and database.

Yang, Dongchan; Jang, Insu; Choi, Jinhyuk; Kim, Min-Seo; Lee, Andrew J; Kim, Hyunwoong; Eom, Junghyun; Kim, Dongsup; Jung, Inkyung; Lee, Byungwook.

Nucleic Acids Res ; 46(D1): D52-D57, 2018 01 04.

Article in English | MEDLINE | ID: mdl-29106613

ABSTRACT

Three-dimensional (3D) chromatin structure is an emerging paradigm for understanding gene regulation mechanisms. Hi-C (high-throughput chromatin conformation capture), a method to detect long-range chromatin interactions, allows extensive genome-wide investigation of 3D chromatin structure. However, broad application of Hi-C data has been hindered by the level of complexity in processing Hi-C data and the large size of raw sequencing data. In order to overcome these limitations, we constructed a database named 3DIV (a 3D-genome Interaction Viewer and database) that provides a list of long-range chromatin interaction partners for the queried locus with genomic and epigenomic annotations. 3DIV is the first of its kind to collect all publicly available human Hi-C data to provide 66 billion uniformly processed raw Hi-C read pairs obtained from 80 different human cell/tissue types. In contrast to other databases, 3DIV uniquely provides normalized chromatin interaction frequencies against genomic distance dependent background signals and a dynamic browsing visualization tool for the listed interactions, which could greatly advance the interpretation of chromatin interactions. '3DIV' is available at http://kobic.kr/3div.

Subjects

Chromatin/genetics, Genetic Databases, Human Genome, Software, Chromatin/ultrastructure, Nucleic Acid Databases, Genetic Epigenesis, Genome-Wide Association Study, Genomics, High-Throughput Nucleotide Sequencing, Humans, Three-Dimensional Imaging, Internet, Molecular Sequence Annotation, Nucleic Acid Conformation, Single Nucleotide Polymorphism

10.

BGN Mutations in X-Linked Spondyloepimetaphyseal Dysplasia.

Cho, Sung Yoon; Bae, Jun-Seok; Kim, Nayoung K D; Forzano, Francesca; Girisha, Katta Mohan; Baldo, Chiara; Faravelli, Francesca; Cho, Tae-Joon; Kim, Dongsup; Lee, Kyoung Yeul; Ikegawa, Shiro; Shim, Jong Sup; Ko, Ah-Ra; Miyake, Noriko; Nishimura, Gen; Superti-Furga, Andrea; Spranger, Jürgen; Kim, Ok-Hwa; Park, Woong-Yang; Jin, Dong-Kyu.

Am J Hum Genet ; 98(6): 1243-1248, 2016 06 02.

Article in English | MEDLINE | ID: mdl-27236923

ABSTRACT

Spondyloepimetaphyseal dysplasias (SEMDs) comprise a heterogeneous group of autosomal-dominant and autosomal-recessive disorders. An apparent X-linked recessive (XLR) form of SEMD in a single Italian family was previously reported. We have been able to restudy this family together with a second family from Korea by segregating a severe SEMD in an X-linked pattern. Exome sequencing showed missense mutations in BGN c.439A>G (p.Lys147Glu) in the Korean family and c.776G>T (p.Gly259Val) in the Italian family; the c.439A>G (p.Lys147Glu) mutation was also identified in a further simplex SEMD case from India. Biglycan is an extracellular matrix proteoglycan that can bind transforming growth factor beta (TGF-β) and thus regulate its free concentration. In 3-dimensional simulation, both altered residues localized to the concave arc of leucine-rich repeat domains of biglycan that interact with TGF-β. The observation of recurrent BGN mutations in XLR SEMD individuals from different ethnic backgrounds allows us to define "XLR SEMD, BGN type" as a nosologic entity.

Subjects

Biglycan/genetics, X-Linked Genetic Diseases/genetics, Mutation/genetics, Osteochondrodysplasias/genetics, Adult, Aged, Amino Acid Sequence, Biglycan/chemistry, Biglycan/metabolism, Child, Preschool Child, Female, Humans, Infant, Newborn Infant, Male, Middle Aged, Pedigree, Protein Binding, Protein Conformation, Amino Acid Sequence Homology, Transforming Growth Factor beta/chemistry, Transforming Growth Factor beta/genetics, Transforming Growth Factor beta/metabolism

11.

Computational characterization of chromatin domain boundary-associated genomic elements.

Hong, Seungpyo; Kim, Dongsup.

Nucleic Acids Res ; 45(18): 10403-10414, 2017 Oct 13.

Article in English | MEDLINE | ID: mdl-28977568

ABSTRACT

Topologically associated domains (TADs) are 3D genomic structures with high internal interactions that play important roles in genome compaction and gene regulation. Their genomic locations and their association with CCCTC-binding factor (CTCF)-binding sites and transcription start sites (TSSs) were recently reported. However, the relationship between TADs and other genomic elements has not been systematically evaluated. This was addressed in the present study, with a focus on the enrichment of these genomic elements and their ability to predict the TAD boundary region. We found that consensus CTCF-binding sites were strongly associated with TAD boundaries as well as with the transcription factors (TFs) Zinc finger protein (ZNF)143 and Yin Yang (YY)1. TAD boundary-associated genomic elements include DNase I-hypersensitive sites, H3K36 trimethylation, TSSs, RNA polymerase II, and TFs such as Specificity protein 1, ZNF274 and SIX homeobox 5. Computational modeling with these genomic elements suggests that they have distinct roles in TAD boundary formation. We propose a structural model of TAD boundaries based on these findings that provides a basis for studying the mechanism of chromatin structure formation and gene regulation.

Subjects

Chromatin/genetics, Chromosome Mapping/methods, Computational Biology/methods, Human Genome, Transcriptional Regulatory Elements, Algorithms, Binding Sites, Chromatin/metabolism, Genetic Databases, Gene Expression Regulation, Genome Components, Humans, Genetic Promoter Regions

12.

Deep convolutional neural networks for pan-specific peptide-MHC class I binding prediction.

Han, Youngmahn; Kim, Dongsup.

BMC Bioinformatics ; 18(1): 585, 2017 12 28.

Article in English | MEDLINE | ID: mdl-29281985

ABSTRACT

BACKGROUND: Computational scanning of peptide candidates that bind to a specific major histocompatibility complex (MHC) can speed up the peptide-based vaccine development process and therefore various methods are being actively developed. Recently, machine-learning-based methods have generated successful results by training on large amounts of experimental data. However, many machine learning-based methods are generally less sensitive in recognizing locally-clustered interactions, which can synergistically stabilize peptide binding. Deep convolutional neural network (DCNN) is a deep learning method inspired by the visual recognition process of the animal brain and it is known to be able to capture meaningful local patterns from 2D images. Once the peptide-MHC interactions can be encoded into image-like array (ILA) data, DCNN can be employed to build a predictive model for peptide-MHC binding prediction. In this study, we demonstrated that DCNN is able to not only reliably predict peptide-MHC binding, but also sensitively detect locally-clustered interactions. RESULTS: Nonapeptide-HLA-A and -B binding data were encoded into ILA data. A DCNN, as a pan-specific prediction model, was trained on the ILA data. The DCNN showed higher performance than other prediction tools for the latest benchmark datasets, which consist of 43 datasets for 15 HLA-A alleles and 25 datasets for 10 HLA-B alleles. In particular, the DCNN outperformed other tools for alleles belonging to the HLA-A3 supertype. The F1 scores of the DCNN were 0.86, 0.94, and 0.67 for HLA-A*31:01, HLA-A*03:01, and HLA-A*68:01 alleles, respectively, which were significantly higher than those of other tools. We found that the DCNN was able to recognize locally-clustered interactions that could synergistically stabilize peptide binding. We developed ConvMHC, a web server to provide user-friendly web interfaces for peptide-MHC class I binding predictions using the DCNN. The ConvMHC web server is accessible via http://jumong.kaist.ac.kr:8080/convmhc. CONCLUSIONS: We developed a novel method for peptide-HLA-I binding predictions using DCNN trained on ILA data that encode peptide binding data and demonstrated the reliable performance of the DCNN in nonapeptide binding predictions through the independent evaluation on the latest IEDB benchmark datasets. Our approaches can be applied to characterize locally-clustered patterns in molecular interactions, such as protein/DNA, protein/RNA, and drug/protein interactions.

Subjects

Deep Learning, Histocompatibility Antigens Class I/metabolism, Peptides/metabolism, Alleles, Amino Acid Sequence, Animals, Histocompatibility Antigens Class I/immunology, Humans, Internet, Machine Learning, Peptides/chemistry, Protein Binding, Reproducibility of Results
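
The ConvMHC abstract reports per-allele F1 scores. As a reminder of what that metric measures, here is a minimal sketch that computes F1 from binary binder/non-binder labels as the harmonic mean of precision and recall. The function name `f1_score` and the toy labels are hypothetical and do not reproduce the paper's benchmark data.

```python
def f1_score(y_true, y_pred):
    """Return the F1 score for binary labels (1 = binder, 0 = non-binder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
f1 = f1_score(y_true, y_pred)
print(f1)
```

F1 is useful when binders are much rarer than non-binders, because plain accuracy can look high even when few binders are found.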

13.

Utilizing random Forest QSAR models with optimized parameters for target identification and its application to target-fishing server.

Lee, Kyoungyeul; Lee, Minho; Kim, Dongsup.

BMC Bioinformatics ; 18(Suppl 16): 567, 2017 12 28.

Article in English | MEDLINE | ID: mdl-29297315

ABSTRACT

BACKGROUND: The identification of target molecules is important for understanding the mechanism of "target deconvolution" in phenotypic screening and "polypharmacology" of drugs. Because conventional methods of identifying targets require time and cost, in-silico target identification has been considered an alternative solution. One of the well-known in-silico methods of identifying targets involves structure activity relationships (SARs). SARs have advantages such as low computational cost and high feasibility; however, the data dependency in the SAR approach causes imbalance of active data and ambiguity of inactive data throughout targets. RESULTS: We developed a ligand-based virtual screening model comprising 1121 target SAR models built using a random forest algorithm. The performance of each target model was tested by employing the ROC curve and the mean score using an internal five-fold cross validation. Moreover, recall rates for top-k targets were calculated to assess the performance of target ranking. A benchmark model using an optimized sampling method and parameters was examined via an external validation set. The result shows recall rates of 67.6% and 73.9% for top-11 (1% of the total targets) and top-33, respectively. We provide a publicly available website where users can search the top-k targets for query ligands at http://rfqsar.kaist.ac.kr. CONCLUSIONS: The target models that we built can be used for both predicting the activity of ligands toward each target and ranking candidate targets for a query ligand using a unified scoring scheme. The scores are additionally fitted to the probability so that users can estimate how likely a ligand-target interaction is active. The user interface of our website is user friendly and intuitive, offering useful information and cross references.

Subjects

Algorithms, Drug Delivery Systems, Theoretical Models, Quantitative Structure-Activity Relationship, Computer Simulation, Ligands, Probability, ROC Curve, Reproducibility of Results
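
The abstract above evaluates target ranking by recall rates for the top-k targets, with top-11 corresponding to about 1% of the 1121 target models. The sketch below shows one plausible reading of that metric, assuming recall is the fraction of known ligand-target pairs whose target appears among the k best-ranked targets for that ligand. The abstract does not spell out the exact definition, and the function name `top_k_recall` and the toy rankings are hypothetical.

```python
def top_k_recall(ranked_targets_by_ligand, true_targets_by_ligand, k):
    """Fraction of known ligand-target pairs recovered within the top k."""
    hits = 0
    total = 0
    for ligand, true_targets in true_targets_by_ligand.items():
        top_k = set(ranked_targets_by_ligand[ligand][:k])
        hits += len(top_k & set(true_targets))
        total += len(true_targets)
    return hits / total if total else 0.0


# Toy rankings over four hypothetical targets, best-scored first.
ranked = {
    "lig1": ["A", "B", "C", "D"],
    "lig2": ["C", "A", "D", "B"],
}
known = {
    "lig1": ["B"],
    "lig2": ["D", "C"],
}

recall_at_1 = top_k_recall(ranked, known, k=1)
recall_at_2 = top_k_recall(ranked, known, k=2)
recall_at_3 = top_k_recall(ranked, known, k=3)
print(recall_at_1, recall_at_2, recall_at_3)

# The abstract's top-11 cutoff matches 1% of 1121 targets after rounding.
print(round(0.01 * 1121))
```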

14.

Library of binding protein scaffolds (LibBP): a computational platform for selection of binding protein scaffolds.

Hong, Seungpyo; Kim, Dongsup.

Bioinformatics ; 32(11): 1709-15, 2016 06 01.

Article in English | MEDLINE | ID: mdl-26826717

ABSTRACT

MOTIVATION: Developments in biotechnology have enabled the in vitro evolution of binding proteins. The emerging limitations of antibodies in binding protein engineering have led to suggestions for other proteins as alternative binding protein scaffolds. Most of these proteins were selected based on human intuition rather than systematic analysis of the available data. To improve this strategy, we developed a computational framework for finding desirable binding protein scaffolds by utilizing protein structure and sequence information. RESULTS: For each protein, its structure and the sequences of evolutionarily-related proteins were analyzed, and spatially contiguous regions composed of highly variable residues were identified. A large number of proteins have these regions, but leucine rich repeats (LRRs), histidine kinase domains and immunoglobulin domains are predominant among them. The candidates suggested as new binding protein scaffolds include histidine kinase, LRR, titin and pentapeptide repeat protein. AVAILABILITY AND IMPLEMENTATION: The database and web-service are accessible via http://bcbl.kaist.ac.kr/LibBP. CONTACT: kds@kaist.ac.kr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Protein Binding, Amino Acid Sequence, Gene Library, Humans, Protein Engineering, Proteins

15.

Structure-based Markov random field model for representing evolutionary constraints on functional sites.

Jeong, Chan-Seok; Kim, Dongsup.

BMC Bioinformatics ; 17: 99, 2016 Feb 24.

Article in English | MEDLINE | ID: mdl-26911566

ABSTRACT

BACKGROUND: Elucidating the cooperative mechanism of interconnected residues is an important component toward understanding the biological function of a protein. Coevolution analysis has been developed to model the coevolutionary information reflecting structural and functional constraints. Recently, several methods have been developed based on a probabilistic graphical model called the Markov random field (MRF), which have led to significant improvements for coevolution analysis; however, thus far, the performance of these models has mainly been assessed by focusing on the aspect of protein structure. RESULTS: In this study, we built an MRF model whose graphical topology is determined by the residue proximity in the protein structure, and derived a novel positional coevolution estimate utilizing the node weight of the MRF model. This structure-based MRF method was evaluated for three data sets, each of which annotates catalytic site, allosteric site, and comprehensively determined functional site information. We demonstrate that the structure-based MRF architecture can encode the evolutionary information associated with biological function. Furthermore, we show that the node weight can more accurately represent positional coevolution information compared to the edge weight. Lastly, we demonstrate that the structure-based MRF model can be reliably built with only a few aligned sequences in linear time. CONCLUSIONS: The results show that adoption of a structure-based architecture could be an acceptable approximation for coevolution modeling with efficient computation complexity.

Subjects

Allosteric Site/genetics, Molecular Evolution, Proteins/metabolism, Biological Evolution, Sequence Alignment
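
The abstract above describes an MRF whose graph topology is determined by residue proximity in the protein structure. A minimal sketch of how such a topology could be derived is shown here, assuming one representative 3D coordinate per residue and an edge between any two residues closer than a distance cutoff. The 8.0 Å cutoff, the function name `contact_edges`, and the toy coordinates are assumptions for illustration and are not taken from the paper.

```python
from itertools import combinations
from math import dist


def contact_edges(coords, cutoff=8.0):
    """Return index pairs (i, j), i < j, of residues within `cutoff`.

    coords: one (x, y, z) tuple per residue, for example a C-alpha position.
    cutoff: distance threshold in the same units as coords (assumed value).
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(coords), 2)
        if dist(a, b) <= cutoff
    ]


# Toy chain: three residues spaced 3.8 units apart, plus one distant residue.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
edges = contact_edges(coords)
print(edges)
```

Restricting edges to spatial neighbors keeps the number of pairwise terms small compared with a fully connected model, which fits the abstract's emphasis on efficient computation.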

16.

Interaction between bound water molecules and local protein structures: A statistical analysis of the hydrogen bond structures around bound water molecules.

Hong, Seungpyo; Kim, Dongsup.

Proteins ; 84(1): 43-51, 2016 Jan.

Article in English | MEDLINE | ID: mdl-26518137

ABSTRACT

Water molecules play an important role in protein folding and protein interactions through their structural association with proteins. Examples of such structural association can be found in protein crystal structures, and can often explain protein functionality in the context of structure. We herein report the systematic analysis of the local structures of proteins interacting with water molecules, and the characterization of their geometric features. We first examined the interaction of water molecules with a large local interaction environment by comparing the preference of water molecules in three regions, namely, the protein-protein interaction (PPI) interfaces, the crystal contact (CC) interfaces, and the non-interfacial regions. High preference of water molecules to the PPI and CC interfaces was found. In addition, the bound water on the PPI interface was more favorably associated with the complex interaction structure, implying that such water-mediated structures may participate in the shaping of the PPI interface. The pairwise water-mediated interaction was then investigated, and the water-mediated residue-residue interaction potential was derived. Subsequently, the types of polar atoms surrounding the water molecules were analyzed, and the preference of the hydrogen bond acceptor was observed. Furthermore, the geometries of the structures interacting with water were analyzed, and it was found that the major structure on the protein surface exhibited planar geometry rather than tetrahedral geometry. Several previously undiscovered characteristics of water-protein interactions were unfolded in this study, and are expected to lead to a better understanding of protein structure and function.

Subjects

Proteins/chemistry , Water/chemistry , Cluster Analysis , X-Ray Crystallography , Protein Databases , Hydrogen Bonding , Molecular Models , Protein Conformation , Protein Folding , Protein Interaction Maps , Proteins/metabolism , Water/metabolism

17.

Genetic landscape of open chromatin in yeast.

Lee, Kibaick; Kim, Sang Cheol; Jung, Inkyung; Kim, Kwoneel; Seo, Jungmin; Lee, Heun-Sik; Bogu, Gireesh K; Kim, Dongsup; Lee, Sanghyuk; Lee, Byungwook; Choi, Jung Kyoon.

PLoS Genet ; 9(2): e1003229, 2013.

Article in English | MEDLINE | ID: mdl-23408895

ABSTRACT

Chromatin regulation underlies a variety of DNA metabolism processes, including transcription, recombination, repair, and replication. To perform a quantitative genetic analysis of chromatin accessibility, we obtained open chromatin profiles across 96 genetically different yeast strains by FAIRE (formaldehyde-assisted isolation of regulatory elements) assay followed by sequencing. While 5~10% of open chromatin regions (OCRs) were significantly affected by variations in their underlying DNA sequences, subtelomeric areas as well as gene-rich and gene-poor regions displayed high levels of sequence-independent variation. We performed quantitative trait loci (QTL) mapping using the FAIRE signal for each OCR as a quantitative trait. While individual OCRs were associated with a handful of specific genetic markers, gene expression levels were associated with many regulatory loci. We found multi-target trans-loci responsible for a very large number of OCRs, which seemed to reflect the widespread influence of certain chromatin regulators. Such regulatory hotspots were enriched for known regulatory functions, such as recombinational DNA repair, telomere replication, and general transcription control. The OCRs associated with these multi-target trans-loci coincided with recombination hotspots, telomeres, and gene-rich regions according to the function of the associated regulators. Our findings provide a global quantitative picture of the genetic architecture of chromatin regulation.

Subjects

Chromatin , Quantitative Trait Loci/genetics , Nucleic Acid Regulatory Sequences/genetics , Saccharomyces cerevisiae , Base Sequence , Binding Sites , Chromatin/genetics , Chromatin/metabolism , Chromosome Mapping , Fungal Gene Expression Regulation , Oligonucleotide Array Sequence Analysis , Phenotype , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Telomere/genetics , Transcription Factors/genetics , Transcription Factors/metabolism

18.

H2B monoubiquitylation is a 5'-enriched active transcription mark and correlates with exon-intron structure in human cells.

Jung, Inkyung; Kim, Seung-Kyoon; Kim, Mirang; Han, Yong-Mahn; Kim, Yong Sung; Kim, Dongsup; Lee, Daeyoup.

Genome Res ; 22(6): 1026-35, 2012 Jun.

Article in English | MEDLINE | ID: mdl-22421545

ABSTRACT

H2B monoubiquitylation (H2Bub1), which is required for multiple methylations of both H3K4 and H3K79, has been implicated in gene expression in numerous organisms ranging from yeast to human. However, the molecular crosstalk between H2Bub1 and other modifications, especially the methylations of H3K4 and H3K79, remains unclear in vertebrates. To better understand the functional role of H2Bub1, we measured genome-wide histone modification patterns in human cells. Our results suggest that H2Bub1 has dual roles, one that is H3 methylation dependent, and another that is H3 methylation independent. First, H2Bub1 is a 5'-enriched active transcription mark and co-occupies with H3K79 methylations in actively transcribed regions. Second, this study shows for the first time that H2Bub1 plays a histone H3 methylations-independent role in chromatin architecture. Furthermore, the results of this work indicate that H2Bub1 is largely positioned at the exon-intron boundaries of highly expressed exons, and it demonstrates increased occupancy in skipped exons compared with flanking exons in the human and mouse genomes. Our findings collectively suggest that a potentiating mechanism links H2Bub1 to both H3K79 methylations in actively transcribed regions and the exon-intron structure of highly expressed exons via the regulation of nucleosome dynamics during transcription elongation.

Subjects

Chromatin/genetics , Exons , Histones/metabolism , Introns , Genetic Transcription , Animals , Tumor Cell Line , Chromatin/metabolism , Chromatin Immunoprecipitation , Drosophila/genetics , Gene Expression Regulation , Human Genome , Histones/genetics , Humans , Methylation , Mice , Germ Cell and Embryonal Neoplasms/genetics , Ubiquitination

19.

A high-affinity protein binder that blocks the IL-6/STAT3 signaling pathway effectively suppresses non-small cell lung cancer.

Lee, Joong-Jae; Kim, Hyun Jung; Yang, Chul-Su; Kyeong, Hyun-Ho; Choi, Jung-Min; Hwang, Da-Eun; Yuk, Jae-Min; Park, Keunwan; Kim, Yu Jung; Lee, Seung-Goo; Kim, Dongsup; Jo, Eun-Kyeong; Cheong, Hae-Kap; Kim, Hak-Sung.

Mol Ther ; 22(7): 1254-1265, 2014 Jul.

Article in English | MEDLINE | ID: mdl-24682171

ABSTRACT

Interleukin-6 (IL-6) is a multifunctional cytokine that regulates immune responses for host defense and tumorigenic process. Upregulation of IL-6 is known to constitutively phosphorylate signal transducer and activator of transcription 3 (STAT3), leading to activation of multiple oncogene pathways and inflammatory cascade. Here, we present the development of a high-affinity protein binder, termed repebody, which effectively suppresses non-small cell lung cancer in vivo by blocking the IL-6/STAT3 signaling. We selected a repebody that prevents human IL-6 (hIL-6) from binding to its receptor by a competitive immunoassay, and modulated its binding affinity for hIL-6 up to a picomolar range by a modular approach that mimics the combinatorial assembly of diverse modules to form antigen-specific receptors in nature. The resulting repebody was highly specific for hIL-6, effectively inhibiting the STAT3 phosphorylation in a dose- and binding affinity-response manner in vitro. The repebody was shown to have a remarkable suppression effect on the growth of tumors and STAT3 phosphorylation in xenograft mice with non-small cell lung cancer by blocking the hIL-6/STAT3 signaling. Structural analysis of the repebody and IL-6 complex revealed that the repebody binds the site 2a of hIL-6, overlapping a number of epitope residues at site 2a with gp130, and consequently causes a steric hindrance to the formation of IL-6/IL-6Rα complex. Our results suggest that high-affinity repebody targeting the IL-6/STAT3 pathway can be developed as therapeutics for non-small cell lung cancer.

Subjects

Non-Small-Cell Lung Carcinoma/drug therapy , Interleukin-6/metabolism , STAT3 Transcription Factor/metabolism , Signal Transduction/drug effects , Animals , Antineoplastic Agents/therapeutic use , Non-Small-Cell Lung Carcinoma/metabolism , Female , Humans , Mice , Nude Mice , Xenograft Model Antitumor Assays

20.

Design of a binding scaffold based on variable lymphocyte receptors of jawless vertebrates by module engineering.

Lee, Sang-Chul; Park, Keunwan; Han, Jieun; Lee, Joong-jae; Kim, Hyun Jung; Hong, Seungpyo; Heu, Woosung; Kim, Yu Jung; Ha, Jae-Seok; Lee, Seung-Goo; Cheong, Hae-Kap; Jeon, Young Ho; Kim, Dongsup; Kim, Hak-Sung.

Proc Natl Acad Sci U S A ; 109(9): 3299-304, 2012 Feb 28.

Article in English | MEDLINE | ID: mdl-22328160

ABSTRACT

Repeat proteins have recently been of great interest as potential alternatives to immunoglobulin antibodies due to their unique structural and biophysical features. We here present the development of a binding scaffold based on variable lymphocyte receptors, which are nonimmunoglobulin antibodies composed of Leucine-rich repeat modules in jawless vertebrates, by module engineering. A template scaffold was first constructed by joining consensus repeat modules between the N- and C-capping motifs of variable lymphocyte receptors. The N-terminal domain of the template scaffold was redesigned based on the internalin-B cap by analyzing the modular similarity between the respective repeat units using a computational approach. The newly designed scaffold, termed "Repebody," showed a high level of soluble expression in bacteria, displaying high thermodynamic and pH stabilities. Ease of molecular engineering was shown by designing repebodies specific for myeloid differentiation protein-2 and hen egg lysozyme, respectively, by a rational approach. The crystal structures of designed repebodies were determined to elucidate the structural features and interaction interfaces. We demonstrate general applicability of the scaffold by selecting repebodies with different binding affinities for interleukin-6 using phage display.

Subjects

Peptide Fragments/chemistry , Protein Engineering , Immunologic Receptors/chemistry , Amino Acid Sequence , Animals , Circular Dichroism , Consensus Sequence , X-Ray Crystallography , Hagfishes/metabolism , Hydrogen-Ion Concentration , Lampreys/metabolism , Molecular Models , Molecular Sequence Data , Peptide Fragments/chemical synthesis , Peptide Fragments/metabolism , Peptide Library , Protein Binding , Protein Conformation , Protein Stability , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/metabolism , Substrate Specificity , Temperature

