Results 1 - 20 of 52
1.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38856168

ABSTRACT

Nucleic acid-binding proteins (NABPs), including DNA-binding proteins (DBPs) and RNA-binding proteins (RBPs), play important roles in essential biological processes. To facilitate functional annotation and accurate prediction of different types of NABPs, many machine learning-based computational approaches have been developed. However, the datasets used for training and testing as well as the prediction scopes in these studies have limited their applications. In this paper, we developed new strategies to overcome these limitations by generating more accurate and robust datasets and developing deep learning-based methods including both hierarchical and multi-class approaches to predict the types of NABPs for any given protein. The deep learning models employ two layers of convolutional neural network and one layer of long short-term memory. Our approaches outperform existing DBP and RBP predictors with a balanced prediction between DBPs and RBPs, and are more practically useful in identifying novel NABPs. The multi-class approach greatly improves the prediction accuracy of DBPs and RBPs, especially for the DBPs with ~12% improvement. Moreover, we explored the prediction accuracy of single-stranded DNA binding proteins and their effect on the overall prediction accuracy of NABP predictions.


Subjects
Computational Biology; DNA-Binding Proteins; Deep Learning; RNA-Binding Proteins; RNA-Binding Proteins/metabolism; DNA-Binding Proteins/metabolism; Computational Biology/methods; Neural Networks, Computer; Humans
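The abstract of record 1 contrasts hierarchical and multi-class prediction but does not spell out the decision logic. The following is a minimal sketch of one plausible reading, assuming the hierarchical approach first separates NABPs from non-NABPs and then separates DBPs from RBPs, while the multi-class approach picks the most probable class in a single step. The function names, the class labels, the 0.5 threshold, and the probability inputs are all hypothetical; the deep learning models themselves are not reproduced here.

```python
def hierarchical_predict(p_nabp, p_dbp_given_nabp, threshold=0.5):
    """Two-stage decision (assumed): NABP vs non-NABP, then DBP vs RBP.

    p_nabp            -- hypothetical stage-1 probability that the protein is an NABP
    p_dbp_given_nabp  -- hypothetical stage-2 probability that an NABP is a DBP
    """
    if p_nabp < threshold:
        return "non-NABP"
    return "DBP" if p_dbp_given_nabp >= threshold else "RBP"


def multiclass_predict(probabilities):
    """Single-stage decision: return the label with the highest probability.

    probabilities -- dict mapping hypothetical class labels to model scores
    """
    return max(probabilities, key=probabilities.get)
```

In the hierarchical variant an error at the first stage cannot be corrected by the second, which is one reason a single multi-class decision may behave differently on the same inputs.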
2.
Proteins ; 91(8): 1077-1088, 2023 08.
Article in English | MEDLINE | ID: mdl-36978156

ABSTRACT

Computational modeling of protein-DNA complex structures has important implications in biomedical applications such as structure-based, computer aided drug design. A key step in developing methods for accurate modeling of protein-DNA complexes is similarity assessment between models and their reference complex structures. Existing methods primarily rely on distance-based metrics and generally do not consider important functional features of the complexes, such as interface hydrogen bonds that are critical to specific protein-DNA interactions. Here, we present a new scoring function, ComparePD, which takes interface hydrogen bond energy and strength into account besides the distance-based metrics for accurate similarity measure of protein-DNA complexes. ComparePD was tested on two datasets of computational models of protein-DNA complexes generated using docking (classified as easy, intermediate, and difficult cases) and homology modeling methods. The results were compared with PDDockQ, a modified version of DockQ tailored for protein-DNA complexes, as well as the metrics employed by the community-wide experiment CAPRI (Critical Assessment of PRedicted Interactions). We demonstrated that ComparePD provides an improved similarity measure over PDDockQ and the CAPRI classification method by considering both conformational similarity and functional importance of the complex interface. ComparePD identified more meaningful models as compared to PDDockQ for all the cases having different top models between ComparePD and PDDockQ except for one intermediate docking case.


Subjects
Protein Interaction Mapping; Proteins; Protein Interaction Mapping/methods; Proteins/chemistry; Protein Binding; Protein Conformation; Hydrogen Bonding; Benchmarking; Algorithms; Computational Biology/methods; Software; Molecular Docking Simulation
3.
Proteins ; 90(6): 1303-1314, 2022 06.
Article in English | MEDLINE | ID: mdl-35122321

ABSTRACT

Hydrogen bonds play important roles in protein folding and protein-ligand interactions, particularly in specific protein-DNA recognition. However, the distributions of hydrogen bonds, especially of hydrogen bond energy (HBE), in different types of protein-ligand complexes are unknown. Here we performed a comparative analysis of hydrogen bonds among three non-redundant datasets of protein-protein, protein-peptide, and protein-DNA complexes. Besides comparing the number of hydrogen bonds in terms of types and locations, we investigated the distributions of HBE. Our results indicate that while there is no significant difference in hydrogen bonds within protein chains among the three types of complexes, interfacial hydrogen bonds are significantly more prevalent in protein-DNA complexes. More importantly, the interfacial hydrogen bonds in protein-DNA complexes displayed a unique energy distribution of strong and weak hydrogen bonds, whereas the majority of the interfacial hydrogen bonds in protein-protein and protein-peptide complexes are of high strength with low energy. Moreover, there is a significant difference in the energy distributions of minor groove hydrogen bonds between protein-DNA complexes with different binding specificity. Highly specific protein-DNA complexes contain more strong hydrogen bonds in the minor groove than multi-specific complexes, suggesting an important role of the minor groove in specific protein-DNA recognition. These results can help better understand protein-DNA interactions and have important implications in improving quality assessments of protein-DNA complex models.


Subjects
DNA; Proteins; DNA/chemistry; Hydrogen Bonding; Ligands; Proteins/chemistry
4.
Nucleic Acids Res ; 47(21): 11103-11113, 2019 12 02.
Article in English | MEDLINE | ID: mdl-31665426

ABSTRACT

Knowledge of protein-DNA binding specificity has important implications in understanding DNA metabolism, transcriptional regulation and developing therapeutic drugs. Previous studies demonstrated hydrogen bonds between amino acid side chains and DNA bases play major roles in specific protein-DNA interactions. In this paper, we investigated the roles of individual DNA strands and protein secondary structure types in specific protein-DNA recognition based on side chain-base hydrogen bonds. By comparing the contribution of each DNA strand to the overall binding specificity between DNA-binding proteins with different degrees of binding specificity, we found that highly specific DNA-binding proteins show balanced hydrogen bonding with each of the two DNA strands while multi-specific DNA binding proteins are generally biased towards one strand. Protein-base pair hydrogen bonds, in which both bases of a base pair are involved in forming hydrogen bonds with amino acid side chains, are more prevalent in the highly specific protein-DNA complexes than those in the multi-specific group. Amino acids involved in side chain-base hydrogen bonds favor strand and coil secondary structure types in highly specific DNA-binding proteins while multi-specific DNA-binding proteins prefer helices.


Subjects
DNA-Binding Proteins/chemistry; DNA/chemistry; Models, Molecular; Amino Acids/chemistry; Base Pairing; Binding Sites; Hydrogen Bonding; Nucleic Acid Conformation; Protein Structure, Secondary
5.
BMC Bioinformatics ; 19(Suppl 20): 506, 2018 Dec 21.
Article in English | MEDLINE | ID: mdl-30577740

ABSTRACT

BACKGROUND: Atomic details of protein-DNA complexes can provide insightful information for better understanding of the function and binding specificity of DNA binding proteins. In addition to experimental methods for solving protein-DNA complex structures, protein-DNA docking can be used to predict native or near-native complex models. A docking program typically generates a large number of complex conformations and predicts the complex model(s) based on interaction energies between protein and DNA. However, the prediction accuracy is hampered by current approaches to model assessment, especially when docking simulations fail to produce any near-native models. RESULTS: We present here a Support Vector Machine (SVM)-based approach for quality assessment of the predicted transcription factor (TF)-DNA complex models. Besides a knowledge-based protein-DNA interaction potential DDNA3, we applied several structural features that have been shown to play important roles in binding specificity between transcription factors and DNA molecules to quality assessment of complex models. To address the issue of unbalanced positive and negative cases in the training dataset, we applied hard-negative mining, an iterative training process that selects an initial training dataset by combining all of the positive cases and a random sample from the negative cases. Results show that the SVM model greatly improves prediction accuracy (84.2%) over two knowledge-based protein-DNA interaction potentials, orientation potential (60.8%) and DDNA3 (68.4%). The improvement is achieved through reducing the number of false positive predictions, especially for the hard docking cases, in which a docking algorithm fails to produce any near-native complex models. CONCLUSIONS: A learning-based SVM scoring model with structural features for specific protein-DNA binding and an atomic-level protein-DNA interaction potential DDNA3 significantly improves prediction accuracy of complex models by successfully identifying cases without near-native structural models.


Subjects
DNA/metabolism; Models, Molecular; Support Vector Machine; Transcription Factors/metabolism; Algorithms; DNA/chemistry; Protein Binding
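The hard-negative mining loop described in the abstract of record 5 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a toy one-dimensional threshold classifier stands in for the SVM, each case is reduced to a single hypothetical score, and the names `fit_threshold` and `hard_negative_mining` are invented for this example. Only the overall loop follows the abstract: start from all positives plus a random sample of negatives, then repeatedly add the negatives that the current model misclassifies as positive.

```python
import random


def fit_threshold(positives, negatives):
    """Toy stand-in for SVM training: midpoint between the lowest
    positive score and the highest training-negative score."""
    return (min(positives) + max(negatives)) / 2.0


def hard_negative_mining(positives, negatives, n_initial, max_rounds=10, seed=0):
    """Iteratively grow the negative training set with false positives.

    A case is predicted positive when its score exceeds the threshold.
    Returns the final threshold and the indices of negatives used for training.
    """
    rng = random.Random(seed)
    # Initial training set: all positives plus a random sample of negatives.
    train_idx = set(rng.sample(range(len(negatives)), n_initial))
    threshold = fit_threshold(positives, [negatives[i] for i in train_idx])
    for _ in range(max_rounds):
        # Hard negatives: unused negatives the current model calls positive.
        hard = {i for i, score in enumerate(negatives)
                if i not in train_idx and score > threshold}
        if not hard:
            break
        train_idx |= hard
        threshold = fit_threshold(positives, [negatives[i] for i in train_idx])
    return threshold, sorted(train_idx)
```

With a real SVM the retraining step would replace `fit_threshold`, and the stopping rule would more likely be a fixed number of rounds or a validation criterion than the absence of any false positive.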
6.
BMC Bioinformatics ; 18(1): 342, 2017 Jul 17.
Article in English | MEDLINE | ID: mdl-28715997

ABSTRACT

BACKGROUND: Gene expression is regulated by transcription factors binding to specific target DNA sites. Understanding how and where transcription factors bind at genome scale represents an essential step toward our understanding of gene regulation networks. Previously we developed a structure-based method for prediction of transcription factor binding sites using an integrative energy function that combines a knowledge-based multibody potential and two atomic energy terms. While the method performs well, it is not computationally efficient due to the exponential increase in the number of binding sequences to be evaluated for longer binding sites. In this paper, we present an efficient pentamer algorithm by splitting DNA binding sequences into overlapping fragments along with a simplified integrative energy function for transcription factor binding site prediction. RESULTS: A DNA binding sequence is split into overlapping pentamers (5 base pairs) for calculating transcription factor-pentamer interaction energy. To combine the results from overlapping pentamer scores, we developed two methods, Kmer-Sum and PWM (Position Weight Matrix) stacking, for full-length binding motif prediction. Our results show that both Kmer-Sum and PWM stacking in the new pentamer approach along with a simplified integrative energy function improved transcription factor binding site prediction accuracy and dramatically reduced computation time, especially for longer binding sites. CONCLUSION: Our new fragment-based pentamer algorithm and simplified energy function improve both efficiency and accuracy. To our knowledge, this is the first fragment-based method for structure-based transcription factor binding sites prediction.


Subjects
Algorithms; Sequence Analysis, DNA/methods; Transcription Factors/metabolism; Binding Sites; DNA/chemistry; DNA/metabolism; Nucleotide Motifs; Position-Specific Scoring Matrices; Protein Binding
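The fragment idea in the abstract of record 6 can be illustrated with a short sketch. It assumes that a full-length score is obtained by summing the scores of the overlapping pentamers, which is one plausible reading of "Kmer-Sum"; the abstract does not give the exact formula. The energy function here is a hypothetical placeholder, not the integrative energy function of the paper, and all function names are invented for this example.

```python
def split_into_kmers(seq, k=5):
    """Return the overlapping k-mers of seq, sliding one base at a time."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_sum_score(seq, energy, k=5):
    """Assumed Kmer-Sum: add up the energies of all overlapping k-mers."""
    return sum(energy(kmer) for kmer in split_into_kmers(seq, k))


def toy_pentamer_energy(kmer):
    """Hypothetical placeholder energy: one unit lower per G or C."""
    return -float(sum(base in "GC" for base in kmer))


def evaluation_counts(site_length, k=5):
    """Compare the number of energy evaluations needed by each strategy.

    Enumerating every full-length sequence needs 4**site_length evaluations;
    scoring every k-mer at every window needs (site_length - k + 1) * 4**k.
    """
    full = 4 ** site_length
    fragments = (site_length - k + 1) * 4 ** k
    return full, fragments
```

For a 10 bp site, `evaluation_counts(10)` gives 1,048,576 full-length sequences against 6,144 pentamer evaluations, which shows why the exponential growth mentioned in the abstract matters most for longer binding sites.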
7.
Bioinformatics ; 32(12): i306-i313, 2016 06 15.
Article in English | MEDLINE | ID: mdl-27307632

ABSTRACT

UNLABELLED: Transcription factors (TFs) regulate gene expression through binding to specific target DNA sites. Accurate annotation of transcription factor binding sites (TFBSs) at genome scale represents an essential step toward our understanding of gene regulation networks. In this article, we present a structure-based method for computational prediction of TFBSs using a novel, integrative energy (IE) function. The new energy function combines a multibody (MB) knowledge-based potential and two atomic energy terms (hydrogen bond and π interaction) that might not be accurately captured by the knowledge-based potential owing to the mean force nature and low count problem. We applied the new energy function to the TFBS prediction using a non-redundant dataset that consists of TFs from 12 different families. Our results show that the new IE function improves the prediction accuracy over the knowledge-based, statistical potentials, especially for homeodomain TFs, the second largest TF family in mammals. CONTACT: jguo4@uncc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Transcription Factors/chemistry; Animals; Binding Sites; Computational Biology; DNA-Binding Proteins; Gene Expression Regulation; Protein Binding
8.
Cladistics ; 33(1): 1-20, 2017 Feb.
Article in English | MEDLINE | ID: mdl-34724757

ABSTRACT

Zika virus was previously considered to cause only a benign infection in humans. Studies of recent outbreaks of Zika virus in the Pacific, South America, Mexico and the Caribbean have associated the virus with severe neuropathology. Viral evolution may be one factor contributing to an apparent change in Zika disease as it spread from Southeast Asia across the Pacific to the Americas. To address this possibility, we have employed computational tools to compare the phylogeny, geography, immunology and RNA structure of Zika virus isolates from Africa, Asia, the Pacific and the Americas. In doing so, we compare and contrast methods and results for tree search and rooting of Zika virus phylogenies. In some phylogenetic analyses we find support for the hypothesis that there is a deep common ancestor between African and Asian clades (the "Asia/Africa" hypothesis). In other phylogenetic analyses, we find that Asian lineages are descendent from African lineages (the "out of Africa" hypothesis). In addition, we identify and evaluate key mutations in viral envelope protein coding and untranslated terminal RNA regions. We find stepwise mutations that have altered both immunological motif sets and regulatory sequence elements. Both of these sets of changes distinguish viruses found in Africa from those in the emergent Asia-Pacific-Americas lineage. These findings support the working hypothesis that mutations acquired by Zika virus in the Pacific and Americas contribute to changes in pathology. These results can inform experiments required to elucidate the role of viral genetic evolution in changes in neuropathology, including microcephaly and other neurological and skeletomuscular issues in infants, and Guillain-Barré syndrome in adults.

9.
J Org Chem ; 82(4): 1888-1894, 2017 02 17.
Article in English | MEDLINE | ID: mdl-28107007

ABSTRACT

Natural pigment chlorophyll was used as a green photosensitizer for the first time in a visible-light photoredox catalysis for the efficient synthesis of tetrahydroquinolines from N,N-dimethylanilines and maleimides in an air atmosphere. The reaction involves direct cyclization via an sp3 C-H bond functionalization process to afford products in moderate to high yields (61-98%) from a wide range of substrates with a low loading of chlorophyll under mild conditions. This work demonstrates the potential benefits of chlorophyll as photosensitizer in visible light catalysis.

10.
Proteins ; 84(8): 1147-61, 2016 08.
Article in English | MEDLINE | ID: mdl-27147539

ABSTRACT

DNA-binding proteins play critical roles in biological processes including gene expression, DNA packaging and DNA repair. They bind to DNA target sequences with different degrees of binding specificity, ranging from highly specific (HS) to nonspecific (NS). Alterations of DNA-binding specificity, due to either genetic variation or somatic mutations, can lead to various diseases. In this study, a comparative analysis of protein-DNA complex structures was carried out to investigate the structural features that contribute to binding specificity. Protein-DNA complexes were grouped into three general classes based on degrees of binding specificity: HS, multispecific (MS), and NS. Our results show a clear trend of structural features among the three classes, including amino acid binding propensities, simple and complex hydrogen bonds, major/minor groove and base contacts, and DNA shape. We found that aspartate is enriched in HS DNA binding proteins and predominately binds to a cytosine through a single hydrogen bond or two consecutive cytosines through bidentate hydrogen bonds. Aromatic residues, histidine and tyrosine, are highly enriched in the HS and MS groups and may contribute to specific binding through different mechanisms. To further investigate the role of protein flexibility in specific protein-DNA recognition, we analyzed the conformational changes between the bound and unbound states of DNA-binding proteins and structural variations. The results indicate that HS and MS DNA-binding domains have larger conformational changes upon DNA-binding and larger degree of flexibility in both bound and unbound states. Proteins 2016; 84:1147-1161. © 2016 Wiley Periodicals, Inc.


Subjects
Amino Acids/chemistry; DNA-Binding Proteins/chemistry; DNA/chemistry; Binding Sites; Hydrogen Bonding; Hydrophobic and Hydrophilic Interactions; Models, Molecular; Nucleic Acid Conformation; Protein Binding; Protein Interaction Domains and Motifs; Protein Structure, Secondary; Static Electricity; Thermodynamics
11.
Nucleic Acids Res ; 42(7): 4375-90, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24500196

ABSTRACT

The newly developed transcription activator-like effector protein (TALE) and clustered regularly interspaced short palindromic repeats/Cas9 transcription factors (TF) offered a powerful and precise approach for modulating gene expression. In this article, we systematically investigated the potential of these new tools in activating the stringently silenced pluripotency gene Oct4 (Pou5f1) in mouse and human somatic cells. First, with a number of TALEs and sgRNAs targeting various regions in the mouse and human Oct4 promoters, we found that the most efficient TALE-VP64s bound around -120 to -80 bp, while highly effective sgRNAs targeted from -147 to -89-bp upstream of the transcription start sites to induce high activity of luciferase reporters. In addition, we observed significant transcriptional synergy when multiple TFs were applied simultaneously. Although individual TFs exhibited marginal activity to up-regulate endogenous gene expression, optimized combinations of TALE-VP64s could enhance endogenous Oct4 transcription up to 30-fold in mouse NIH3T3 cells and 20-fold in human HEK293T cells. More importantly, the enhancement of OCT4 transcription ultimately generated OCT4 proteins. Furthermore, examination of different epigenetic modifiers showed that histone acetyltransferase p300 could enhance both TALE-VP64 and sgRNA/dCas9-VP64 induced transcription of endogenous OCT4. Taken together, our study suggested that engineered TALE-TF and dCas9-TF are useful tools for modulating gene expression in mammalian cells.


Subjects
Octamer Transcription Factor-3/genetics; Transcription Factors/metabolism; Transcriptional Activation; Animals; Cells, Cultured; Gene Silencing; Humans; Mice; Recombinant Fusion Proteins/chemistry; Transcription Factors/genetics; p300-CBP Transcription Factors/metabolism; RNA, Small Untranslated
12.
Bioinformatics ; 29(3): 322-30, 2013 Feb 01.
Article in English | MEDLINE | ID: mdl-23220572

ABSTRACT

MOTIVATION: Computational modeling of protein-DNA complexes remains a challenging problem in structural bioinformatics. One of the key factors for a successful protein-DNA docking is a potential function that can accurately discriminate the near-native structures from decoy complexes and at the same time make conformational sampling more efficient. Here, we developed a novel orientation-dependent, knowledge-based, residue-level potential for improving transcription factor (TF)-DNA docking. RESULTS: We demonstrated the performance of this new potential in TF-DNA binding affinity prediction, discrimination of native protein-DNA complex from decoy structures, and most importantly in rigid TF-DNA docking. The rigid TF-DNA docking with the new orientation potential, on a benchmark of 38 complexes, successfully predicts 42% of the cases with root mean square deviations lower than 1 Å and 55% of the cases with root mean square deviations lower than 3 Å. The results suggest that docking with this new orientation-dependent, coarse-grained statistical potential can achieve high-docking accuracy and can serve as a crucial first step in multi-stage flexible protein-DNA docking. AVAILABILITY AND IMPLEMENTATION: The new potential is available at http://bioinfozen.uncc.edu/Protein_DNA_orientation_potential.tar.


Subjects
DNA/chemistry; Molecular Docking Simulation/methods; Transcription Factors/chemistry; DNA/metabolism; Knowledge Bases; Protein Binding; Transcription Factors/metabolism
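The docking results in the abstract of record 12 are reported as the share of cases whose root mean square deviation (RMSD) falls below a cutoff. A minimal sketch of that bookkeeping is given below. It assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the abstract does not say which atoms were used or how structures were aligned, and the function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    No superposition is performed; the inputs are assumed to be aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def success_rate(rmsd_values, cutoff):
    """Fraction of docking cases whose RMSD is strictly below the cutoff."""
    return sum(value < cutoff for value in rmsd_values) / len(rmsd_values)
```

Applied to one RMSD value per benchmark complex, `success_rate(values, 1.0)` and `success_rate(values, 3.0)` would yield the two kinds of percentages quoted in the abstract.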
13.
Food Chem Toxicol ; 190: 114814, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38876379

ABSTRACT

Lead (Pb) is a common environmental neurotoxicant that causes behavioral impairments in both rodents and humans. Isochlorogenic acid A (ICAA), a phenolic acid found in a variety of natural sources such as tea, fruits, vegetables, coffee, plant-based food products, and various medicinal plants, exerts multiple effects, including protective effects on the lungs, livers, and intestines. The objective of this study was to investigate the potential neuroprotective effects of ICAA against Pb-induced neurotoxicity in ICR mice. The results indicate that ICAA attenuates Pb-induced anxiety-like behaviors. ICAA reduced neuroinflammation, ferroptosis, and oxidative stress caused by Pb. ICAA successfully mitigated the Pb-induced deficits in the cholinergic system in the brain through the reduction of ACH levels and the enhancement of AChE and BChE activities. ICAA significantly reduced the levels of ferrous iron and MDA in the brain and prevented decreases in GSH, SOD, and GPx activity. Immunofluorescence analysis demonstrated that ICAA attenuated ferroptosis and upregulated GPx4 expression in the context of Pb-induced nerve damage. Additionally, ICAA downregulated TNF-α and IL-6 expression while concurrently enhancing the activations of Nrf2, HO-1, NQO1, BDNF, and CREB in the brains of mice. The inhibition of BDNF, Nrf2 and GPx4 reversed the protective effects of ICAA on Pb-induced ferroptosis in nerve cells. In general, ICAA ameliorates Pb-induced neuroinflammation, ferroptosis, oxidative stress, and anxiety-like behaviors through the activation of the BDNF/Nrf2/GPx4 pathways.


Subjects
Anxiety; Chlorogenic Acid; Ferroptosis; Lead; Neuroinflammatory Diseases; Signal Transduction; Animals; Male; Mice; Anxiety/drug therapy; Anxiety/chemically induced; Behavior, Animal/drug effects; Brain-Derived Neurotrophic Factor/metabolism; Chlorogenic Acid/pharmacology; Chlorogenic Acid/analogs & derivatives; Ferroptosis/drug effects; Glutathione Peroxidase/metabolism; Lead/toxicity; Mice, Inbred ICR; Neuroinflammatory Diseases/drug therapy; Neuroinflammatory Diseases/chemically induced; Neuroinflammatory Diseases/metabolism; NF-E2-Related Factor 2/metabolism; Oxidative Stress/drug effects; Signal Transduction/drug effects
14.
Toxicol Res (Camb) ; 13(3): tfae072, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38737339

ABSTRACT

Lead (Pb) is a nonessential heavy metal, which can cause many health problems. Isochlorogenic acid A (ICAA), a phenolic acid present in tea, fruits, vegetables, coffee, plant-based food products, and various medicinal plants, exerts multiple effects, including antioxidant, antiviral, anti-inflammatory and antifibrotic functions. Thus, the purpose of our study was to determine if ICAA could prevent Pb-induced hepatotoxicity in ICR mice. An evaluation was performed on oxidative stress, inflammation and fibrosis, and related signaling. The results indicate that ICAA attenuates Pb-induced abnormal liver function. ICAA reduced liver fibrosis, inflammation and oxidative stress caused by Pb. ICAA abated Pb-induced fibrosis and decreased inflammatory cytokines interleukin-1β (IL-1β) and tumor necrosis factor-alpha (TNF-α). ICAA abrogated reductions in activities of superoxide dismutase (SOD), catalase (CAT), and glutathione peroxidase (GPx). Masson staining revealed that ICAA reduced collagen fiber deposition in Pb-induced fibrotic livers. Western blot and immunohistochemistry analyses showed ICAA increased phosphorylated AMP-activated protein kinase (p-AMPK) expression. ICAA also reduced the expression of collagen I, α-smooth muscle actin (α-SMA), phosphorylated extracellular signal-regulated kinase (p-ERK), phosphorylated c-jun N-terminal kinase (p-JNK), p-p38, phosphorylated signal transducer and activator of transcription 3 (p-STAT3), transforming growth factor β1 (TGF-β1), and p-Smad2/3 in livers of mice. Overall, ICAA ameliorates Pb-induced hepatitis and fibrosis by inhibiting the AMPK/MAPKs/NF-κB and STAT3/TGF-β1/Smad2/3 pathways.

15.
Clin Immunol ; 146(1): 46-55, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23220404

ABSTRACT

V(H) replacement occurs through RAG-mediated secondary recombination to change unwanted IgH genes and diversify antibody repertoire. The biological significance of V(H) replacement remains to be explored. Here, we show that V(H) replacement products are highly enriched in IgH genes encoding anti-HIV antibodies, including anti-gp41, anti-V3 loop, anti-gp120, CD4i, and PGT antibodies. In particular, 73% of the CD4i antibodies and 100% of the PGT antibodies are encoded by potential V(H) replacement products. Such frequencies are significantly higher than those in IgH genes derived from HIV-infected individuals or autoimmune patients. The identified V(H) replacement products encoding anti-HIV antibodies are highly mutated; the V(H) replacement "footprints" within CD4i antibodies preferentially encode negatively charged amino acids within the IgH CDR3; many IgH encoding PGT antibodies are likely generated from multiple rounds of V(H) replacement. Taken together, these findings uncovered a potentially significant contribution of V(H) replacement products to the generation of anti-HIV antibodies.


Subjects
Antibody Diversity/immunology; HIV Antibodies/immunology; Immunoglobulin Heavy Chains/immunology; Immunoglobulin Variable Region/immunology; Amino Acid Sequence; Antibody Diversity/genetics; CD4 Antigens/chemistry; CD4 Antigens/immunology; Complementarity Determining Regions/chemistry; Complementarity Determining Regions/genetics; Complementarity Determining Regions/immunology; HIV Antibodies/chemistry; HIV Envelope Protein gp120/chemistry; HIV Envelope Protein gp120/immunology; Humans; Immunoglobulin Heavy Chains/chemistry; Immunoglobulin Heavy Chains/genetics; Immunoglobulin Variable Region/genetics; Models, Molecular; Molecular Sequence Data; Protein Structure, Tertiary
16.
BMC Bioinformatics ; 13: 220, 2012 Sep 03.
Article in English | MEDLINE | ID: mdl-22943312

ABSTRACT

BACKGROUND: One of the crucial steps in regulation of gene expression is the binding of transcription factor(s) to specific DNA sequences. Knowledge of the binding affinity and specificity at a structural level between transcription factors and their target sites has important implications in our understanding of the mechanism of gene regulation. Due to their unique functions and binding specificity, there is a need for a transcription factor-specific, structure-based database and corresponding web service to facilitate structural bioinformatics studies of transcription factor-DNA interactions, such as development of knowledge-based interaction potential, transcription factor-DNA docking, binding induced conformational changes, and the thermodynamics of protein-DNA interactions. DESCRIPTION: TFinDit is a relational database and a web search tool for studying transcription factor-DNA interactions. The database contains annotated transcription factor-DNA complex structures and related data, such as unbound protein structures, thermodynamic data, and binding sequences for the corresponding transcription factors in the complex structures. TFinDit also provides a user-friendly interface and allows users to either query individual entries or generate datasets through culling the database based on one or more search criteria. CONCLUSIONS: TFinDit is a specialized structural database with annotated transcription factor-DNA complex structures and other preprocessed data. We believe that this database/web service can facilitate the development and testing of TF-DNA interaction potentials and TF-DNA docking algorithms, and the study of protein-DNA recognition mechanisms.


Subjects
DNA/chemistry; Databases, Genetic; Transcription Factors/chemistry; Algorithms; Binding Sites; Computational Biology; DNA/metabolism; Internet; Software; Thermodynamics; Transcription Factors/metabolism
17.
Proteome Sci ; 10 Suppl 1: S17, 2012 Jun 21.
Article in English | MEDLINE | ID: mdl-22759575

ABSTRACT

BACKGROUND: Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications in a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. Protein-DNA docking is computationally demanding due to the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that the docking quality depends on the coverage of the conformational sampling space. It is therefore desirable to accelerate the computation of the docking algorithm, not only to reduce computing time, but also to improve docking quality. METHODS: In an attempt to accelerate the sampling process and to improve the docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte-Carlo simulation and a simulated annealing method to search through the conformational space. Algorithmic techniques were developed to improve the computation efficiency and scalability on GPU-based high performance computing systems. RESULTS: The effectiveness of our approach is tested on a non-redundant set of 75 TF-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improve the chance of finding near-native TF-DNA complex structures. This study also suggests that further improvement in protein-DNA docking research would require efforts from two integral aspects: improvement in computation efficiency and energy function design. CONCLUSIONS: We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding more near-native structures. To the best of our knowledge, this is the first ad hoc effort of applying GPU or GPU clusters to the protein-DNA docking problem.
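The search strategy named in the abstract of record 17, Monte Carlo sampling combined with simulated annealing, can be sketched on a toy problem. The version below runs on the CPU, optimizes a single number rather than a protein-DNA conformation, and uses a hypothetical quadratic energy in place of the potential-based energy function; the geometric cooling schedule, the step size, and all names are assumptions made for illustration.

```python
import math
import random


def simulated_annealing(energy, x0, steps=5000, t_start=1.0, t_end=1e-3,
                        step_size=0.5, seed=0):
    """Minimize energy(x) with Metropolis moves under a cooling temperature.

    Returns the best state found and its energy.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for i in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / max(steps - 1, 1))
        candidate = x + rng.uniform(-step_size, step_size)
        e_new = energy(candidate)
        delta = e_new - e
        # Metropolis criterion: always accept downhill moves, accept
        # uphill moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, e = candidate, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def toy_energy(x):
    """Hypothetical energy landscape with a single minimum at x = 3."""
    return (x - 3.0) ** 2
```

In a docking setting the state would be the relative position and orientation of the two molecules, and the many independent energy evaluations are the part that a GPU implementation would be expected to parallelize.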

18.
Biomolecules ; 12(9)2022 08 26.
Article in English | MEDLINE | ID: mdl-36139026

ABSTRACT

Single-stranded DNA (ssDNA) binding proteins (SSBs) are critical in maintaining genome stability by protecting transiently exposed ssDNA from damage during essential biological processes, such as DNA replication and gene transcription. The single-stranded region of telomeres also requires protection by ssDNA binding proteins so that it is not attacked after being wrongly recognized as an anomaly. In addition to their critical roles in genome stability and integrity, it has been demonstrated that ssDNA and SSB-ssDNA interactions play critical roles in transcriptional regulation in all three domains of life and in viruses. In this review, we present our current knowledge of the structure and function of SSBs and the structural features for SSB binding specificity. We then discuss the machine learning-based approaches that have been developed for distinguishing SSBs from double-stranded DNA (dsDNA) binding proteins (DSBs).


Subjects
DNA, Single-Stranded , DNA-Binding Proteins , DNA/chemistry , DNA-Binding Proteins/metabolism , Genomic Instability , Humans , Machine Learning , Protein Binding
19.
BMC Struct Biol ; 11: 13, 2011 Mar 03.
Article in English | MEDLINE | ID: mdl-21371326

ABSTRACT

BACKGROUND: Heme is an essential molecule and plays vital roles in many biological processes. The structural determination of a large number of heme proteins has made it possible to study the detailed chemical and structural properties of the heme binding environment. Knowledge of these characteristics can provide valuable guidelines for the design of novel heme proteins and help us predict unknown heme binding proteins. RESULTS: In this paper, we constructed a non-redundant dataset of 125 heme-binding protein chains and found that these heme proteins encompass at least 31 different structural folds, with the all-α class as the dominating scaffold. Heme binding pockets are enriched in aromatic and non-polar amino acids, with fewer charged residues. The differences between apo and holo forms of heme proteins in terms of the structure and the binding pockets have been investigated. In most cases the proteins undergo small conformational changes upon heme binding. We also examined the CP (cysteine-proline) heme regulatory motifs and demonstrated that the conserved dipeptide has structural implications in protein-heme interactions. CONCLUSIONS: Our analysis revealed that heme binding pockets show special features and that most of the heme proteins undergo small conformational changes after heme binding, suggesting that the apo structures can be used for structure-based heme protein prediction and as scaffolds for future heme protein design.
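
As a rough illustration of the pocket-composition finding reported in this abstract, the sketch below computes the fraction of aromatic, non-polar, and charged residues in a list of pocket residues. The residue groupings (F/W/Y as aromatic, A/V/L/I/M/P/G as non-polar, D/E/K/R as charged) are one common convention assumed here and may differ from the classification the authors used; the example pocket string is made up and does not come from the paper's dataset.

```python
# Assumed residue classes (one-letter codes); the paper may group them differently.
AROMATIC = set("FWY")
NONPOLAR = set("AVLIMPG")
CHARGED = set("DEKR")


def class_fractions(residues):
    """Return the fraction of residues falling in each assumed class."""
    residues = [r.upper() for r in residues]
    n = len(residues)
    if n == 0:
        raise ValueError("empty residue list")

    def frac(group):
        return sum(r in group for r in residues) / n

    return {
        "aromatic": frac(AROMATIC),
        "nonpolar": frac(NONPOLAR),
        "charged": frac(CHARGED),
    }


# Hypothetical pocket residues, for illustration only.
pocket = "FLVHYAMIWK"
fractions = class_fractions(pocket)
print(fractions)
```

Comparing such fractions for pocket residues against the same fractions over whole chains is one simple way to express "enrichment"; the abstract does not state which background the authors compared against.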


Subjects
Computational Biology/methods , Hemeproteins/chemistry , Protein Engineering/methods , Amino Acid Motifs , Animals , Apoproteins/chemistry , Apoproteins/genetics , Apoproteins/metabolism , Binding Sites , Heme/metabolism , Hemeproteins/genetics , Hemeproteins/metabolism , Humans , Models, Molecular , Protein Conformation
20.
BMC Struct Biol ; 11: 45, 2011 Nov 01.
Article in English | MEDLINE | ID: mdl-22044637

ABSTRACT

BACKGROUND: Structural insight from transcription factor-DNA (TF-DNA) complexes is of paramount importance to our understanding of the affinity and specificity of TF-DNA interaction, and to the development of structure-based prediction of TF binding sites. Yet the majority of the TF-DNA complexes remain unsolved despite the considerable experimental efforts being made. Computational docking represents a promising alternative to bridge the gap. To facilitate the study of TF-DNA docking, carefully designed benchmarks are needed for performance evaluation and identification of the strengths and weaknesses of docking algorithms. RESULTS: We constructed two benchmarks, for flexible and rigid TF-DNA docking respectively, using a unified non-redundant set of 38 test cases. The test cases encompass diverse fold families and are classified into easy and hard groups according to the expected degree of difficulty in TF-DNA docking. The major parameters used to classify expected docking difficulty in flexible docking are the conformational differences between bound and unbound TFs and the interaction strength between TFs and DNA. For rigid docking, in which the starting structure is a bound TF conformation, only interaction strength is considered. CONCLUSIONS: We believe these benchmarks are important for the development of better interaction potentials and TF-DNA docking algorithms, which has important implications for structure-based prediction of transcription factor binding sites and drug design.


Subjects
DNA/metabolism , Transcription Factors/metabolism , Algorithms , Binding Sites , Computer Simulation , Protein Binding