Results 1 - 20 of 35
1.
Proc Natl Acad Sci U S A ; 115(51): E11943-E11950, 2018 12 18.
Article in English | MEDLINE | ID: mdl-30504143

ABSTRACT

Abundant and essential motifs, such as phosphate-binding loops (P-loops), are presumed to be the seeds of modern enzymes. The Walker-A P-loop is absolutely essential in modern NTPase enzymes, mediating the binding and transfer of the terminal phosphate groups of NTPs. However, NTPase function depends on many additional active-site residues placed throughout the protein's scaffold. Can motifs such as P-loops confer function in a simpler context? We applied a phylogenetic analysis that yielded a sequence logo of the putative ancestral Walker-A P-loop element: a β-strand connected to an α-helix via the P-loop. Computational design incorporated this element into de novo designed β-α repeat proteins with relatively few sequence modifications. We obtained soluble, stable proteins that, unlike modern P-loop NTPases, bound ATP in a magnesium-independent manner. Foremost, these simple P-loop proteins avidly bound polynucleotides, RNA, and single-strand DNA, and mutations in the P-loop's key residues abolished binding. Binding appears to be facilitated by the structural plasticity of these proteins, including quaternary structure polymorphism that promotes a combined action of multiple P-loops. Accordingly, oligomerization enabled a 55-aa protein carrying a single P-loop to confer avid polynucleotide binding. Overall, our results show that the P-loop Walker-A motif can be implemented in small and simple β-α repeat proteins, primarily as a polynucleotide binding motif.


Subjects
Binding Sites, Phosphates/chemistry, Protein Interaction Domains and Motifs, Proteins/chemistry, Adenosine Triphosphate/chemistry, Amino Acid Sequence, Catalytic Domain, DNA, Molecular Evolution, Magnesium, Molecular Models, Mutation, Nucleoside-Triphosphatase/chemistry, Phylogeny, Polynucleotides, Protein Binding, Protein Conformation, RNA, RNA-Binding Proteins/chemistry, Sequence Alignment, Amino Acid Sequence Homology
2.
Brief Bioinform ; 19(6): 1085-1101, 2018 11 27.
Article in English | MEDLINE | ID: mdl-28498882

ABSTRACT

Cancer is a genetic disorder, meaning that a plethora of different mutations, whether somatic or germ line, underlie the etiology of the 'Emperor of Maladies'. Point mutations, chromosomal rearrangements and copy number changes, whether they have occurred spontaneously in predisposed individuals or have been induced by intrinsic or extrinsic (environmental) mutagens, lead to the activation of oncogenes and inactivation of tumor suppressor genes, thereby promoting malignancy. This scenario has now been recognized and experimentally confirmed in a wide range of different contexts. Over the past decade, a surge in available sequencing technologies has allowed the sequencing of whole genomes from liquid malignancies and solid tumors belonging to different types and stages of cancer, giving birth to the new field of cancer genomics. One of the most striking discoveries has been that cancer genomes are highly enriched with mutations of specific kinds. It has been suggested that these mutations can be classified into 'families' based on their mutational signatures. A mutational signature may be regarded as a type of base substitution (e.g. C:G to T:A) within a particular context of neighboring nucleotide sequence (the bases upstream and/or downstream of the mutation). These mutational signatures, supplemented by mutable motifs (a wider mutational context), promise to help us to understand the nature of the mutational processes that operate during tumor evolution because they represent the footprints of interactions between DNA, mutagens and the enzymes of the repair/replication/modification pathways.


Subjects
Genomics, Mutation, Neoplasms/genetics, DNA/genetics, DNA Methylation, Molecular Evolution, Gene Expression, Genetic Predisposition to Disease, Humans, Genetic Models, Mutagens/pharmacology, Oncogenes, Genetic Selection
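
The abstract above describes a mutational signature as a base substitution read within its neighboring nucleotides. As a minimal sketch of that idea, and not the authors' own procedure, the Python below labels each substitution by its immediate 5' and 3' neighbors and tallies the labels into a profile. Folding purine references onto the pyrimidine strand is a common convention assumed here; the function names and the label format are illustrative.

```python
from collections import Counter

# Complement table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mutation_type(upstream: str, ref: str, alt: str, downstream: str) -> str:
    """Label a substitution in its trinucleotide context, e.g. 'A[C>T]G'.

    Assumption: substitutions with a purine reference base (A or G) are
    described on the opposite strand, so that C:G to T:A is always
    written as C>T.
    """
    if ref in "AG":
        context = reverse_complement(upstream + ref + downstream)
        upstream, ref, downstream = context[0], context[1], context[2]
        alt = reverse_complement(alt)
    return f"{upstream}[{ref}>{alt}]{downstream}"


def mutational_profile(mutations):
    """Count context-labelled substitutions.

    `mutations` is an iterable of (upstream, ref, alt, downstream) tuples.
    """
    return Counter(mutation_type(*m) for m in mutations)


if __name__ == "__main__":
    # Hypothetical toy input: the same C:G to T:A change seen on both strands.
    toy = [("A", "C", "T", "G"), ("C", "G", "A", "T")]
    print(mutational_profile(toy))
```

Both toy records collapse onto one label because they describe the same change seen from opposite strands, which is why strand folding is usually applied before profiles are compared.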
3.
PLoS Comput Biol ; 15(4): e1006981, 2019 04.
Article in English | MEDLINE | ID: mdl-31034466

ABSTRACT

Identifying driver mutations in cancer is notoriously difficult. To date, recurrence of a mutation in patients remains one of the most reliable markers of mutation driver status. However, some mutations are more likely to occur than others due to differences in background mutation rates arising from various forms of infidelity of DNA replication and repair machinery, and from endogenous and exogenous mutagens. We calculated nucleotide and codon mutability to study the contribution of background processes in shaping the observed mutational spectrum in cancer. We developed and tested probabilistic pan-cancer and cancer-specific models that adjust the number of mutation recurrences in patients by background mutability in order to find mutations which may be under selection in cancer. We showed that mutations with higher mutability values had higher observed recurrence frequency, especially in tumor suppressor genes. This trend was prominent for nonsense and silent mutations or mutations with neutral functional impact. In oncogenes, however, highly recurring mutations were characterized by relatively low mutability, resulting in an inverted U-shaped trend. Mutations not yet observed in any tumor had relatively low mutability values, indicating that background mutability might limit mutation occurrence. We compiled a dataset of missense mutations from 58 genes with experimentally validated functional and transforming impacts from various studies. We found that mutability of driver mutations was lower than that of passengers and consequently adjusting mutation recurrence frequency by mutability significantly improved ranking of mutations and driver mutation prediction. Even though no training on existing data was involved, our approach performed similarly to or better than the state-of-the-art methods.


Subjects
Codon/genetics, DNA Replication/genetics, Mutation/genetics, Mutation/physiology, Neoplasms/genetics, Computational Biology, Humans, Oncogenes/genetics
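
The abstract above speaks of adjusting observed recurrence by background mutability but does not give the model. One simple way to illustrate the general idea, offered as an assumption rather than as the published method, is a binomial tail probability: given a per-patient background probability for a mutation, how surprising is it to see the mutation in at least k of n patients? The function name and all numbers below are hypothetical.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    Here n is the number of patients, k the number carrying the mutation,
    and p an assumed per-patient background probability (mutability).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Two hypothetical mutations, each seen in 5 of 1000 patients,
    # but with very different background mutability.
    high_mutability = binomial_tail(5, 1000, 1e-3)
    low_mutability = binomial_tail(5, 1000, 1e-5)
    # The same recurrence is far less expected at the low-mutability site.
    print(high_mutability > low_mutability)
```

Under this sketch, equal recurrence counts are ranked differently once mutability is taken into account, which matches the abstract's observation that drivers tend to have lower mutability than passengers.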
4.
Nucleic Acids Res ; 45(W1): W514-W522, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28472504

ABSTRACT

Much remains unknown about the progression and heterogeneity of mutational processes in different cancers and their diagnostic and clinical potential. A growing body of evidence supports mutation rate dependence on the local DNA sequence context for various types of mutations. We propose several tools for the analysis of cancer context-dependent mutations, which are implemented in an online computational framework MutaGene. The framework explores DNA context-dependent mutational patterns and underlying somatic cancer mutagenesis, analyzes mutational profiles of cancer samples, identifies the combinations of underlying mutagenic processes including those related to infidelity of DNA replication and repair machinery, and various other endogenous and exogenous mutagenic factors. As a result, the combination of mutagenic processes can be identified in any query sample with subsequent comparison to mutational profiles derived from malignant and benign samples. In addition, mutagen or cancer-specific mutational background models are applied to calculate expected DNA and protein site mutability to decouple relative contributions of mutagenesis and selection in carcinogenesis, thus elucidating the site-specific driving events in cancer. MutaGene is freely available at https://www.ncbi.nlm.nih.gov/projects/mutagene/.


Subjects
Mutation, Neoplasms/genetics, Software, Amino Acid Substitution, DNA Mutational Analysis, Humans, Internet, Mutagenesis
5.
Nucleic Acids Res ; 44(D1): D301-7, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26507856

ABSTRACT

The NBDB database describes protein motifs, elementary functional loops (EFLs), that are involved in binding of nucleotide-containing ligands and other biologically relevant cofactors/coenzymes, including ATP, AMP, ATP, GMP, GDP, GTP, CTP, PAP, PPS, FMN, FAD(H), NAD(H), NADP, cAMP, cGMP, c-di-AMP and c-di-GMP, ThPP, THD, F-420, ACO, CoA, PLP and SAM. The database is freely available online at http://nbdb.bii.a-star.edu.sg. In total, NBDB contains data on 249 motifs that work in interactions with 24 ligands. Sequence profiles of EFL motifs were derived de novo from nonredundant Uniprot proteome sequences. Conserved amino acid residues in the profiles interact specifically with distinct chemical parts of nucleotide-containing ligands, such as nitrogenous bases, phosphate groups, ribose, nicotinamide, and flavin moieties. Each EFL profile in the database is characterized by a pattern of corresponding ligand-protein interactions found in crystallized ligand-protein complexes. The NBDB database helps to explore the determinants of nucleotide and cofactor binding in different protein folds and families. NBDB can also detect fragments that match profiles of particular EFLs in a protein sequence provided by the user. Comprehensive information on sequences, structures, and interactions of EFLs with ligands provides a foundation for experimental and computational efforts on the design of required protein functions.


Subjects
Amino Acid Motifs, Protein Databases, Nucleotides/metabolism, Ligands, Protein Binding, Proteins/metabolism, Protein Sequence Analysis
6.
Nucleic Acids Res ; 44(W1): W494-501, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27150810

ABSTRACT

Proteins engage in highly selective interactions with their macromolecular partners. Sequence variants that alter protein binding affinity may cause significant perturbations or complete abolishment of function, potentially leading to diseases. There exists a persistent need to develop a mechanistic understanding of impacts of variants on proteins. To address this need we introduce a new computational method MutaBind to evaluate the effects of sequence variants and disease mutations on protein interactions and calculate the quantitative changes in binding affinity. The MutaBind method uses molecular mechanics force fields, statistical potentials and fast side-chain optimization algorithms. The MutaBind server maps mutations on a structural protein complex, calculates the associated changes in binding affinity, determines the deleterious effect of a mutation, estimates the confidence of this prediction and produces a mutant structural model for download. MutaBind can be applied to a large number of problems, including determination of potential driver mutations in cancer and other diseases, elucidation of the effects of sequence variants on protein fitness in evolution and protein design. MutaBind is available at http://www.ncbi.nlm.nih.gov/projects/mutabind/.


Subjects
Internet, Protein Interaction Mapping/methods, Protein Interaction Maps/genetics, Proteins/chemistry, Proteins/metabolism, Software, Algorithms, Binding Sites, Datasets as Topic, Molecular Evolution, Genetic Fitness, Humans, Molecular Dynamics Simulation, Neoplasms/genetics, Protein Binding/genetics, Proteins/genetics
7.
Int J Mol Sci ; 19(7)2018 Jul 20.
Article in English | MEDLINE | ID: mdl-30037003

ABSTRACT

Cancer is a complex disease that is driven by genetic alterations. There has been a rapid development of genome-wide techniques during the last decade along with a significant lowering of the cost of gene sequencing, which has generated widely available cancer genomic data. However, the interpretation of genomic data and the prediction of the association of genetic variations with cancer and disease phenotypes still require significant improvement. Missense mutations, which can render proteins non-functional and provide a selective growth advantage to cancer cells, are frequently detected in cancer. Effects caused by missense mutations can be pinpointed by in silico modeling, which makes it more feasible to find a treatment and reverse the effect. Specific human phenotypes are largely determined by stability, activity, and interactions between proteins and other biomolecules that work together to execute specific cellular functions. Therefore, analysis of missense mutations' effects on proteins and their complexes would provide important clues for identifying functionally important missense mutations, understanding the molecular mechanisms of cancer progression and facilitating treatment and prevention. Herein, we summarize the major computational approaches and tools that provide not only the classification of missense mutations as cancer drivers or passengers but also the molecular mechanisms induced by driver mutations. This review focuses on the discussion of annotation and prediction methods based on structural and biophysical data, analysis of somatic cancer missense mutations in 3D structures of proteins and their complexes, predictions of the effects of missense mutations on protein stability, protein-protein and protein-nucleic acid interactions, and assessment of conformational changes in proteins induced by mutations.


Subjects
Missense Mutation/genetics, Neoplasms/genetics, Animals, Computational Biology/methods, Humans, Neoplasms/metabolism, Protein Conformation, Protein Stability
8.
Nucleic Acids Res ; 42(5): 2879-92, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24371267

ABSTRACT

DNA, RNA and proteins are major biological macromolecules that coevolve and adapt to environments as components of one highly interconnected system. We explore here sequence/structure determinants of mechanisms of adaptation of these molecules, links between them, and results of their mutual evolution. We complemented statistical analysis of genomic and proteomic sequences with folding simulations of RNA molecules, unraveling causal relations between compositional and sequence biases reflecting molecular adaptation on DNA, RNA and protein levels. We found many compositional peculiarities related to environmental adaptation and the life style. Specifically, thermal adaptation of protein-coding sequences in Archaea is characterized by a stronger codon bias than in Bacteria. Guanine and cytosine load in the third codon position is important for supporting the aerobic life style, and it is highly pronounced in Bacteria. The third codon position also provides a tradeoff between arginine and lysine, which are favorable for thermal adaptation and aerobicity, respectively. Dinucleotide composition provides stability of nucleic acids via strong base-stacking in ApG dinucleotides. In relation to coevolution of nucleic acids and proteins, thermostability-related demands on the amino acid composition affect the nucleotide content in the second codon position in Archaea.


Subjects
Physiological Adaptation/genetics, DNA/chemistry, Molecular Evolution, Proteins/chemistry, RNA/chemistry, Aerobiosis, Base Composition, Base Sequence, Codon, Nucleotides/analysis, Messenger RNA/chemistry, Protein Sequence Analysis, Temperature
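
The abstract above refers to the guanine and cytosine load in the third codon position. A minimal sketch of how that quantity (often called GC3) can be computed for a single in-frame coding sequence is given below. The function name and the toy sequence are illustrative assumptions, and the sketch says nothing about how the authors aggregated values across genomes.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C among third codon positions of an in-frame sequence.

    Assumption: `cds` starts at the first base of the first codon. A trailing
    incomplete codon contributes no third position and is ignored.
    """
    third_positions = cds.upper()[2::3]
    if not third_positions:
        raise ValueError("sequence is shorter than one codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


if __name__ == "__main__":
    # Hypothetical toy sequence: codons ATG, GCC, AAA.
    print(gc3("ATGGCCAAA"))
```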
9.
Biophys J ; 109(6): 1295-306, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-26213149

ABSTRACT

Structures of protein complexes provide atomistic insights into protein interactions. Human proteins represent a quarter of all structures in the Protein Data Bank; however, available protein complexes cover less than 10% of the human proteome. Although it is theoretically possible to infer interactions in human proteins based on structures of homologous protein complexes, it is still unclear to what extent protein interactions and binding sites are conserved, and whether protein complexes from remotely related species can be used to infer interactions and binding sites. We considered biological units of protein complexes and clustered protein-protein binding sites into similarity groups based on their structure and sequence, which allowed us to identify unique binding sites. We showed that the growth rate of the number of unique binding sites in the Protein Data Bank was much slower than the growth rate of the number of structural complexes. Next, we investigated the evolutionary roots of unique binding sites and identified the major phyletic branches with the largest expansion in the number of novel binding sites. We found that many binding sites could be traced to the universal common ancestor of all cellular organisms, whereas relatively few binding sites emerged at the major evolutionary branching points. We analyzed the physicochemical properties of unique binding sites and found that the most ancient sites were the largest in size, involved many salt bridges, and were the most compact and least planar. In contrast, binding sites that appeared more recently in the evolution of eukaryotes were characterized by a larger fraction of polar and aromatic residues, and were less compact and more planar, possibly due to their more transient nature and roles in signaling processes.


Subjects
Binding Sites/genetics, Molecular Evolution, Protein Binding/genetics, Proteins/genetics, Proteins/metabolism, Animals, Humans, Molecular Models
10.
Phys Biol ; 12(4): 045002, 2015 Jun 09.
Article in English | MEDLINE | ID: mdl-26057563

ABSTRACT

The goal of this work is to learn from nature the rules that govern evolution and the design of protein function. The fundamental laws of physics lie at the foundation of protein structure and all stages of protein evolution, determining optimal sizes and shapes at different levels of structural hierarchy. We looked back to the very onset of protein evolution with the goal of finding elementary functions (EFs) that came from the prebiotic world and served as building blocks of the first enzymes. We defined the basic structural and functional units of biochemical reactions: elementary functional loops. The diversity of contemporary enzymes can be described via combinations of a limited number of elementary chemical reactions, many of which are performed by the descendants of primitive prebiotic peptides/proteins. By analyzing protein sequences we were able to identify EFs shared by seemingly unrelated protein superfamilies and folds and to unravel evolutionary relations between them. Binding and metabolic processing of the metal- and nucleotide-containing cofactors and ligands are among the most abundant ancient EFs that became indispensable in many natural enzymes. Highly designable folds provide structural scaffolds for many different biochemical reactions. We show that contemporary proteins are built from a limited number of EFs, making their analysis instrumental for establishing the rules for protein design. Evolutionary studies help us to accumulate the library of essential EFs and to establish intricate relations between different folds and functional superfamilies. Generalized sequence-structure descriptors of the EFs will become useful in future design and engineering of desired enzymatic functions.


Subjects
Archaea/genetics, Archaeal Proteins/genetics, Molecular Evolution, Genetic Models, Archaea/chemistry, Archaea/enzymology, Archaea/metabolism, Archaeal Proteins/chemistry, Archaeal Proteins/metabolism, Protein Conformation
11.
Nucleic Acids Res ; 41(Web Server issue): W266-72, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23737445

ABSTRACT

The SPACER server provides an interactive framework for exploring allosteric communication in proteins with different sizes, degrees of oligomerization and function. SPACER uses recently developed theoretical concepts based on the thermodynamic view of allostery. It proposes easily tractable and meaningful measures that allow users to analyze the effect of ligand binding on the intrinsic protein dynamics. The server shows potential allosteric sites and allows users to explore communication between the regulatory and functional sites. It is possible to explore, for instance, potential effector binding sites in a given structure as targets for allosteric drugs. As input, the server only requires a single structure. The server is freely available at http://allostery.bii.a-star.edu.sg/.


Subjects
Protein Conformation, Software, Allosteric Regulation, Allosteric Site, Internet, Molecular Models, Thermodynamics
12.
Curr Res Struct Biol ; 7: 100142, 2024.
Article in English | MEDLINE | ID: mdl-38655428

ABSTRACT

Binding of nucleotides and their derivatives is one of the most ancient elementary functions dating back to the Origin of Life. We review here the works considering one of the key elements in binding of (di)nucleotide-containing ligands: phosphate binding. We start from a brief discussion of major participants, conditions, and events in prebiotic evolution that resulted in the Origin of Life. Tracing back to the basic functions, including metal and phosphate binding, and, potentially, formation of primitive protein-protein interactions, we focus here on phosphate binding. Critically assessing works on the structural, functional, and evolutionary aspects of phosphate binding, we perform a simple computational experiment reconstructing its most ancient and generic sequence prototype. The profiles of the phosphate binding signatures have been derived in the form of position-specific scoring matrices (PSSMs), their peculiarities depending on the type of the ligands have been analyzed, and evolutionary connections between them have been delineated. Then, the apparent prototype that gave rise to all relevant phosphate-binding signatures was also reconstructed. We show that two major signatures of phosphate binding that discriminate between the binding of dinucleotide- and nucleotide-containing ligands are GxGxxG and GxxGxG, respectively. It appears that the signature archetypal for dinucleotide-containing ligands is more generic, and it can frequently bind phosphate groups in nucleotide-containing ligands as well. The reconstructed prototype's key signature GxGGxG underscores the role of glycine residues in providing flexibility and interactions necessary for binding the phosphate groups. The prototype also contains other ancient amino acids, valine and alanine, showing versatility towards evolutionary design and functional diversification.
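
The signatures named in the abstract above (GxGxxG, GxxGxG, GxGGxG) can be read as simple patterns in which `x` stands for any residue. As a minimal sketch, and only a crude stand-in for the PSSM profiles the abstract describes, the Python below converts such a signature into a regular expression and reports every position where it occurs in a sequence. The function names and the toy sequence are illustrative assumptions.

```python
import re


def signature_to_pattern(signature: str) -> str:
    """Translate a signature such as 'GxGxxG' into a regex pattern string.

    Assumption: lowercase 'x' means any single residue; every other
    character is matched literally.
    """
    return "".join("." if ch == "x" else re.escape(ch) for ch in signature)


def find_signature(sequence: str, signature: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead is used so that overlapping occurrences are not skipped.
    lookahead = re.compile(f"(?={signature_to_pattern(signature)})")
    return [m.start() for m in lookahead.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "AAGAGTAGAA"  # hypothetical peptide, for illustration only
    print(find_signature(toy_sequence, "GxGxxG"))
    print(find_signature(toy_sequence, "GxxGxG"))
```

Exact pattern matching ignores the position-specific preferences that a PSSM encodes, so a hit here indicates only that the glycine spacing is present.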

13.
J Biol Chem ; 287(35): 29348-61, 2012 Aug 24.
Article in English | MEDLINE | ID: mdl-22733820

ABSTRACT

Zinc is an essential mineral, and infants are particularly vulnerable to zinc deficiency as they require large amounts of zinc for their normal growth and development. We have recently described the first loss-of-function mutation (H54R) in the zinc transporter ZnT-2 (SLC30A2) in mothers with infants harboring transient neonatal zinc deficiency (TNZD). Here we identified and characterized a novel heterozygous G87R ZnT-2 mutation in two unrelated Ashkenazi Jewish mothers with infants displaying TNZD. Transient transfection of G87R ZnT-2 resulted in endoplasmic reticulum-Golgi retention, whereas the WT transporter properly localized to intracellular secretory vesicles in HC11 and MCF-7 cells. Consequently, G87R ZnT-2 showed decreased stability compared with WT ZnT-2 as revealed by Western blot analysis. Three-dimensional homology modeling based on the crystal structure of YiiP, a close zinc transporter homologue from Escherichia coli, revealed that the basic arginine residue of the mutant G87R points toward the membrane lipid core, suggesting misfolding and possible loss-of-function. Indeed, functional assays including vesicular zinc accumulation, zinc secretion, and cytoplasmic zinc pool assessment revealed markedly impaired zinc transport in G87R ZnT-2 transfectants. Moreover, co-transfection experiments with both mutant and WT transporters revealed a dominant negative effect of G87R ZnT-2 over the WT ZnT-2; this was associated with mislocalization, decreased stability, and loss of zinc transport activity of the WT ZnT-2 due to homodimerization observed upon immunoprecipitation experiments. These findings establish that inactivating ZnT-2 mutations are an underlying basis of TNZD and provide the first evidence for the dominant inheritance of heterozygous ZnT-2 mutations via negative dominance due to homodimer formation.


Subjects
Cation Transport Proteins, Newborn Infant Diseases, Molecular Models, Missense Mutation, Protein Folding, Protein Multimerization/genetics, Zinc/deficiency, Amino Acid Substitution, Cation Transport Proteins/chemistry, Cation Transport Proteins/genetics, Cation Transport Proteins/metabolism, Tumor Cell Line, Cytoplasm, Endoplasmic Reticulum/genetics, Endoplasmic Reticulum/metabolism, Escherichia coli, Escherichia coli Proteins, Female, Humans, Infant, Newborn Infant, Newborn Infant Diseases/genetics, Newborn Infant Diseases/metabolism, Judaism, Male, Membrane Transport Proteins, Inborn Errors of Metabolism/genetics, Inborn Errors of Metabolism/metabolism, Protein Stability, Protein Tertiary Structure, Protein Structural Homology
14.
Comput Struct Biotechnol J ; 21: 238-250, 2023.
Article in English | MEDLINE | ID: mdl-36544476

ABSTRACT

The process of designing biomolecules, in particular proteins, is witnessing a rapid change in available tooling and approaches, moving from design through physicochemical force fields to producing plausible, complex sequences fast via end-to-end differentiable statistical models. To achieve conditional and controllable protein design, researchers at the interface of artificial intelligence and biology leverage advances in natural language processing (NLP) and computer vision techniques, coupled with advances in computing hardware, to learn patterns from growing biological databases, curated annotations thereof, or both. Once learned, these patterns can be used to provide novel insights into mechanistic biology and the design of biomolecules. However, navigating and understanding the practical applications for the many recent protein design tools is complex. To facilitate this, we 1) document recent advances in deep learning (DL) assisted protein design from the last three years, 2) present a practical pipeline that allows one to go from de novo-generated sequences to their predicted properties and web-powered visualization within minutes, and 3) leverage it to suggest a generated protein sequence which might be used to engineer a biosynthetic gene cluster to produce a molecular glue-like compound. Lastly, we discuss challenges and highlight opportunities for the protein design field.

15.
BMC Evol Biol ; 12: 75, 2012 May 30.
Article in English | MEDLINE | ID: mdl-22646318

ABSTRACT

BACKGROUND: Despite recent progress in studies of the evolution of protein function, the questions of what the first functional protein domains were and what their basic building blocks were remain unresolved. Previously, we introduced the concept of elementary functional loops (EFLs), which are the functional units of enzymes that provide elementary reactions in biochemical transformations. They are presumably descendants of primordial catalytic peptides. RESULTS: We analyzed distant evolutionary connections between protein functions in Archaea based on the EFLs comprising them. We show examples of the involvement of EFLs in new functional domains, as well as reutilization of EFLs and functional domains in building multidomain structures and protein complexes. CONCLUSIONS: Our analysis of the archaeal superkingdom yields the dominating mechanisms in different periods of protein evolution, which resulted in several levels of the organization of biochemical function. First, functional domains emerged as combinations of prebiotic peptides with the very basic functions, such as nucleotide/phosphate and metal cofactor binding. Second, domain recombination brought multidomain proteins and complexes onto the evolutionary scene. Later, reutilization and de novo design of functional domains and elementary functional loops complemented the evolution of protein function.


Subjects
Archaea/genetics, Archaea/metabolism, Archaeal Proteins/genetics, Molecular Evolution, Aminoacyl-tRNA Synthetases/genetics, Archaea/chemistry, Archaea/enzymology, Archaeal Proteins/chemistry, Methane/biosynthesis, Molecular Models, Protein Folding, Protein Tertiary Structure, Proteome/chemistry, Proteome/genetics, Structure-Activity Relationship
16.
Bioinformatics ; 27(17): 2368-75, 2011 Sep 01.
Article in English | MEDLINE | ID: mdl-21724592

ABSTRACT

MOTIVATION: Enzymes are complex catalytic machines, which perform sequences of elementary chemical transformations resulting in biochemical function. The building blocks of enzymes, elementary functional loops (EFLs), possess distinct functional signatures and provide catalytic and binding amino acids to the enzyme's active sites. The goal of this work is to obtain primordial prototypes of EFLs that existed before the formation of enzymatic domains and served as their building blocks. RESULTS: We developed a computational strategy for reconstructing ancient prototypes of EFLs based on the comparison of sequence segments on the proteomic scale, which goes beyond detection of conserved functional motifs in homologous proteins. We illustrate the procedure by a CxxC-containing prototype with a very basic and ancient elementary function of metal/metal-containing cofactor binding and redox activity. Acquiring the prototypes of EFLs is necessary for revealing how the original set of protein folds with enzymatic functions emerged in predomain evolution. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: igor.berezovsky@uni.no.


Subjects
Metalloproteins/chemistry, Proteomics/methods, Computational Biology, Cysteine/chemistry, Enzymes/chemistry, Molecular Evolution, Protein Sequence Analysis
17.
Bioinformatics ; 26(18): i497-503, 2010 Sep 15.
Article in English | MEDLINE | ID: mdl-20823313

ABSTRACT

MOTIVATION: Earlier studies of protein structure revealed closed loops with a characteristic size of 25-30 residues and a ring-like shape as a basic universal structural element of globular proteins. Elementary functional loops (EFLs) have specific signatures and provide functional residues important for binding/activation and principal chemical transformation steps of the enzymatic reaction. The goal of this work is to show how these functional loops evolved from pre-domain peptides and to find a set of prototypes from which the EFLs of contemporary proteins originated. RESULTS: This article describes a computational method for deriving prototypes of EFLs based on the sequences of complete genomes. The procedure comprises the iterative derivation of sequence profiles followed by their hierarchical clustering. The scoring function takes into account information content on profile positions, thus preserving the signature. The statistical significance of scores is evaluated from the empirical distribution of scores of the background model. A set of prototypes of EFLs from archaeal proteomes is derived. This set delineates evolutionary connections between major functions and illuminates how folds and functions emerged in pre-domain evolution as a combination of prototypes.


Subjects
Amino Acid Motifs, Archaeal Proteins/chemistry, Computational Biology/methods, Molecular Evolution, Protein Folding, Amino Acid Motifs/genetics, Amino Acid Sequence, Archaeal Proteins/genetics, Base Sequence, Archaeal Genome, Molecular Models, Molecular Sequence Data, Proteome
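
The abstract above mentions a scoring function that takes the information content of profile positions into account, without stating the formula. As a hedged illustration of what per-position information content usually means, the sketch below computes the Shannon measure log2(20) minus the column entropy for each column of a set of aligned, equal-length segments. The uniform 20-letter background, the function names and the toy segments are assumptions, not the paper's scoring function.

```python
from collections import Counter
from math import log2


def column_information(column: str, alphabet_size: int = 20) -> float:
    """Information content (bits) of one alignment column.

    Assumption: a uniform background over `alphabet_size` residues, so the
    value is log2(alphabet_size) minus the Shannon entropy of the column.
    """
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * log2(n / total) for n in counts.values())
    return log2(alphabet_size) - entropy


def profile_information(aligned_segments: list[str]) -> list[float]:
    """Per-position information content for equal-length aligned segments."""
    return [column_information("".join(col)) for col in zip(*aligned_segments)]


if __name__ == "__main__":
    # Hypothetical aligned segments: positions 1 and 3 are fully conserved.
    for bits in profile_information(["GAG", "GCG"]):
        print(round(bits, 3))
```

Fully conserved positions reach the maximum of log2(20) bits, which is one way a score can weight signature positions more heavily than variable ones.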
18.
Front Bioinform ; 1: 657529, 2021.
Article in English | MEDLINE | ID: mdl-36303771

ABSTRACT

The rational design of proteins with desired functions requires a comprehensive description of the functional building blocks. The evolutionarily conserved functional units constitute nature's toolbox; however, they are not readily available to protein designers. This study focuses on protein units of subdomain size that possess structural properties and amino acid residues sufficient to carry out elementary reactions in the catalytic mechanisms. The interactions within such elementary functional loops (ELFs) and the interactions with the surrounding protein scaffolds constitute the descriptor of elementary function. The computational approach to deriving descriptors directly from protein sequences and structures and applying them in rational design was implemented in a proof-of-concept DEFINED-PROTEINS software package. Once the descriptor is obtained, the ELF can be fitted into existing or novel scaffolds to obtain the desired function. For instance, the descriptor may be used to determine the necessary spatial restraints in a fragment-based grafting protocol. We illustrated the approach by applying it to well-known cases of ELFs, including the phosphate-binding P-loop, the diphosphate-binding glycine-rich motif, and the calcium-binding EF-hand motif, which could be used to jumpstart templates for user applications. The DEFINED-PROTEINS package is available for free at https://github.com/MelvinYin/Defined_Proteins.

19.
J Mol Biol ; 433(6): 166684, 2021 03 19.
Article in English | MEDLINE | ID: mdl-33098859

ABSTRACT

To elucidate the properties of human histone interactions on a large scale, we perform a comprehensive mapping of human histone interaction networks using data from structural, chemical cross-linking, and various high-throughput studies. Histone interactomes derived from different data sources show limited overlap and complement each other. This motivated us to integrate these data into a combined global histone interaction network, which includes 5308 proteins and 10,330 interactions. The analysis of topological properties of the human histone interactome reveals its scale-free behavior and high modularity. Our study of histone binding interfaces uncovers a remarkably high number of residues involved in interactions between histones and non-histone proteins: 80-90% of residues in histones H3 and H4 have at least one binding partner. Two types of histone binding modes are detected: interfaces conserved in most histone variants and variant-specific interfaces. Finally, different types of chromatin factors recognize histones in nucleosomes via distinct binding modes, and many of these interfaces utilize acidic patches among other sites. Interaction networks are available at https://github.com/Panchenko-Lab/Human-histone-interactome.


Subjects
Chromosomal Proteins, Non-Histone/chemistry , DNA/chemistry , Histones/chemistry , Nucleosomes/ultrastructure , Protein Interaction Maps , Binding Sites , Chromosomal Proteins, Non-Histone/genetics , Chromosomal Proteins, Non-Histone/metabolism , DNA/genetics , DNA/metabolism , Databases, Protein , Histones/genetics , Histones/metabolism , Humans , Internet , Nucleic Acid Conformation , Nucleosomes/chemistry , Nucleosomes/metabolism , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Software
20.
Epigenetics ; 16(5): 537-553, 2021 05.
Article in English | MEDLINE | ID: mdl-32892676

ABSTRACT

Genomes of KhoeSan individuals of the Kalahari Desert provide the greatest understanding of single-nucleotide diversity in the human genome. Compared with individuals in industrialized environments, the KhoeSan have a unique foraging and hunting lifestyle. Given these dramatic environmental differences, and the responsiveness of the methylome to environmental exposures of many types, we hypothesized that DNA methylation patterns would differ between KhoeSan and neighbouring agropastoral and/or industrial Bantu. We analysed Illumina HumanMethylation 450k array data generated from blood samples from 38 KhoeSan, 42 Bantu, and 6 Europeans. After removing CpG positions associated with annotated and novel polymorphisms and controlling for white blood cell composition, sex, age, and technical variation, we identified 816 differentially methylated CpG loci, of which 133 had an absolute beta-value difference of at least 0.05. Notably, SLC39A4/ZIP4, which plays a role in zinc transport, was one of the most differentially methylated loci. Although the chronological ages of the KhoeSan are not formally recorded, we compared historically estimated ages to methylation-based calculations. This study demonstrates that the epigenetic profile of KhoeSan individuals reveals differences from other populations, and along with extensive genetic diversity, this community brings increased accessibility and understanding to the diversity of the human genome.
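
The effect-size cutoff mentioned in the abstract (an absolute beta-value difference of at least 0.05) can be illustrated with a short sketch. It assumes the per-group mean beta-values have already been computed for loci that passed the statistical test; the function name, the CpG identifiers, and the values below are hypothetical toy data, not results from the study.

```python
def large_effect_loci(mean_beta_a, mean_beta_b, min_abs_diff=0.05):
    """Return CpG IDs whose absolute difference in mean beta-value
    between two groups meets the threshold.

    Both arguments map a CpG identifier to a mean beta-value in [0, 1].
    Only loci present in both groups are compared.
    """
    shared = mean_beta_a.keys() & mean_beta_b.keys()
    return sorted(
        cpg
        for cpg in shared
        if abs(mean_beta_a[cpg] - mean_beta_b[cpg]) >= min_abs_diff
    )


# Hypothetical toy values for illustration only.
group_a = {"cg_toy_1": 0.80, "cg_toy_2": 0.40, "cg_toy_3": 0.10}
group_b = {"cg_toy_1": 0.70, "cg_toy_2": 0.42, "cg_toy_3": 0.10}
selected = large_effect_loci(group_a, group_b)
```

With these toy values only `cg_toy_1` differs by more than 0.05, so it is the only locus retained. A filter of this kind separates statistically detectable differences from those large enough to be of likely biological interest.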


Subjects
Black People/genetics , Cation Transport Proteins , CpG Islands , DNA Methylation , Epigenesis, Genetic , Botswana , Ethnicity , Genome, Human , Humans , White People