Search | PAHO/WHO Uruguay

1.

An atlas of protein homo-oligomerization across domains of life.

Schweke, Hugo; Pacesa, Martin; Levin, Tal; Goverde, Casper A; Kumar, Prasun; Duhoo, Yoan; Dornfeld, Lars J; Dubreuil, Benjamin; Georgeon, Sandrine; Ovchinnikov, Sergey; Woolfson, Derek N; Correia, Bruno E; Dey, Sucharita; Levy, Emmanuel D.

Cell ; 187(4): 999-1010.e15, 2024 Feb 15.

Article in English | MEDLINE | ID: mdl-38325366

ABSTRACT

Protein structures are essential to understanding cellular processes in molecular detail. While advances in artificial intelligence revealed the tertiary structure of proteins at scale, their quaternary structure remains mostly unknown. We devise a scalable strategy based on AlphaFold2 to predict homo-oligomeric assemblies across four proteomes spanning the tree of life. Our results suggest that approximately 45% of an archaeal proteome and a bacterial proteome and 20% of two eukaryotic proteomes form homomers. Our predictions accurately capture protein homo-oligomerization, recapitulate megadalton complexes, and unveil hundreds of homo-oligomer types, including three confirmed experimentally by structure determination. Integrating these datasets with omics information suggests that a majority of known protein complexes are symmetric. Finally, these datasets provide a structural context for interpreting disease mutations and reveal coiled-coil regions as major enablers of quaternary structure evolution in human. Our strategy is applicable to any organism and provides a comprehensive view of homo-oligomerization in proteomes.

Subject(s)

Artificial Intelligence , Proteins , Proteome , Humans , Proteins/chemistry , Proteins/genetics , Archaea/chemistry , Archaea/genetics , Eukaryota/chemistry , Eukaryota/genetics , Bacteria/chemistry , Bacteria/genetics

2.

Architectures of Lipid Transport Systems for the Bacterial Outer Membrane.

Ekiert, Damian C; Bhabha, Gira; Isom, Georgia L; Greenan, Garrett; Ovchinnikov, Sergey; Henderson, Ian R; Cox, Jeffery S; Vale, Ronald D.

Cell ; 169(2): 273-285.e17, 2017 04 06.

Article in English | MEDLINE | ID: mdl-28388411

ABSTRACT

How phospholipids are trafficked between the bacterial inner and outer membranes through the hydrophilic space of the periplasm is not known. We report that members of the mammalian cell entry (MCE) protein family form hexameric assemblies with a central channel capable of mediating lipid transport. The E. coli MCE protein, MlaD, forms a ring associated with an ABC transporter complex in the inner membrane. A soluble lipid-binding protein, MlaC, ferries lipids between MlaD and an outer membrane protein complex. In contrast, EM structures of two other E. coli MCE proteins show that YebT forms an elongated tube consisting of seven stacked MCE rings, and PqiB adopts a syringe-like architecture. Both YebT and PqiB create channels of sufficient length to span the periplasmic space. This work reveals diverse architectures of highly conserved protein-based channels implicated in the transport of lipids between the membranes of bacteria and some eukaryotic organelles.

Subject(s)

Escherichia coli Proteins/chemistry , Escherichia coli/chemistry , Membrane Proteins/chemistry , Cell Membrane/chemistry , Crystallography, X-Ray , Microscopy, Electron , Models, Molecular , Multiprotein Complexes/chemistry

3.

Predicting multiple conformations via sequence clustering and AlphaFold2.

Wayment-Steele, Hannah K; Ojoawo, Adedolapo; Otten, Renee; Apitz, Julia M; Pitsawong, Warintra; Hömberger, Marc; Ovchinnikov, Sergey; Colwell, Lucy; Kern, Dorothee.

Nature ; 625(7996): 832-839, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37956700

ABSTRACT

AlphaFold2 (ref. 1) has revolutionized structural biology by accurately predicting single structures of proteins. However, a protein's biological function often depends on multiple conformational substates (ref. 2), and disease-causing point mutations often cause population changes within these substates (refs. 3, 4). We demonstrate that clustering a multiple-sequence alignment by sequence similarity enables AlphaFold2 to sample alternative states of known metamorphic proteins with high confidence. Using this method, named AF-Cluster, we investigated the evolutionary distribution of predicted structures for the metamorphic protein KaiB (ref. 5) and found that predictions of both conformations were distributed in clusters across the KaiB family. We used nuclear magnetic resonance spectroscopy to confirm an AF-Cluster prediction: a cyanobacteria KaiB variant is stabilized in the opposite state compared with the more widely studied variant. To test AF-Cluster's sensitivity to point mutations, we designed and experimentally verified a set of three mutations predicted to flip KaiB from Rhodobacter sphaeroides from the ground to the fold-switched state. Finally, screening for alternative states in protein families without known fold switching identified a putative alternative state for the oxidoreductase Mpt53 in Mycobacterium tuberculosis. Further development of such bioinformatic methods in tandem with experiments will probably have a considerable impact on predicting protein energy landscapes, essential for illuminating biological function.

Subject(s)

Cluster Analysis , Machine Learning , Protein Conformation , Protein Folding , Proteins , Sequence Alignment , Mutation , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Rhodobacter sphaeroides , Bacterial Proteins/chemistry , Bacterial Proteins/metabolism
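
The abstract of this record describes clustering a multiple-sequence alignment by sequence similarity before structure prediction. The following is only a minimal sketch of that idea. The abstract does not name the clustering algorithm, so the greedy threshold scheme, the function names and the toy alignment are all assumptions made for illustration. The structure-prediction call that would consume each sub-alignment is not shown.

```python
def hamming_fraction(a: str, b: str) -> float:
    """Fraction of aligned columns at which two equal-length rows differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_msa(rows: list[str], max_dist: float) -> list[list[str]]:
    """Greedy stand-in clustering (an assumption, not the published method).

    Each row joins the first cluster whose representative (its first member)
    lies within max_dist; otherwise it starts a new cluster.
    """
    clusters: list[list[str]] = []
    for row in rows:
        for members in clusters:
            if hamming_fraction(row, members[0]) <= max_dist:
                members.append(row)
                break
        else:
            clusters.append([row])
    return clusters


# Toy alignment with two sequence families (hypothetical data).
msa = ["MKVLA-GT", "MKVLA-GS", "MRVLA-GT", "TQEIPWGT", "TQEIPWGA"]
sub_alignments = cluster_msa(msa, max_dist=0.25)
for index, members in enumerate(sub_alignments):
    print(index, members)
```

Each resulting sub-alignment would be given to the predictor on its own, so that different clusters have the chance to yield different conformations.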

4.

Computational design of soluble and functional membrane protein analogues.

Goverde, Casper A; Pacesa, Martin; Goldbach, Nicolas; Dornfeld, Lars J; Balbi, Petra E M; Georgeon, Sandrine; Rosset, Stéphane; Kapoor, Srajan; Choudhury, Jagrity; Dauparas, Justas; Schellhaas, Christian; Kozlov, Simon; Baker, David; Ovchinnikov, Sergey; Vecchio, Alex J; Correia, Bruno E.

Nature ; 2024 Jun 19.

Article in English | MEDLINE | ID: mdl-38898281

ABSTRACT

De novo design of complex protein folds using solely computational means remains a substantial challenge (ref. 1). Here we use a robust deep learning pipeline to design complex folds and soluble analogues of integral membrane proteins. Unique membrane topologies, such as those from G-protein-coupled receptors (ref. 2), are not found in the soluble proteome, and we demonstrate that their structural features can be recapitulated in solution. Biophysical analyses demonstrate the high thermal stability of the designs, and experimental structures show remarkable design accuracy. The soluble analogues were functionalized with native structural motifs, as a proof of concept for bringing membrane protein functions to the soluble proteome, potentially enabling new approaches in drug discovery. In summary, we have designed complex protein topologies and enriched them with functionalities from membrane proteins, with high experimental success rates, leading to a de facto expansion of the functional soluble fold space.

5.

Mega-scale experimental analysis of protein folding stability in biology and design.

Tsuboyama, Kotaro; Dauparas, Justas; Chen, Jonathan; Laine, Elodie; Mohseni Behbahani, Yasser; Weinstein, Jonathan J; Mangan, Niall M; Ovchinnikov, Sergey; Rocklin, Gabriel J.

Nature ; 620(7973): 434-444, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37468638

ABSTRACT

Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale (ref. 1). However, the energetics driving folding are invisible in these structures and remain largely unknown (ref. 2). The hidden thermodynamics of folding can drive disease (refs. 3, 4), shape protein evolution (refs. 5-7) and guide protein engineering (refs. 8-10), and new approaches are needed to reveal these thermodynamics for every sequence and structure. Here we present cDNA display proteolysis, a method for measuring thermodynamic folding stability for up to 900,000 protein domains in a one-week experiment. From 1.8 million measurements in total, we curated a set of around 776,000 high-quality folding stabilities covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains 40-72 amino acids in length. Using this extensive dataset, we quantified (1) environmental factors influencing amino acid fitness, (2) thermodynamic couplings (including unexpected interactions) between protein sites, and (3) the global divergence between evolutionary amino acid usage and protein folding stability. We also examined how our approach could identify stability determinants in designed proteins and evaluate design methods. The cDNA display proteolysis method is fast, accurate and uniquely scalable, and promises to reveal the quantitative rules for how amino acid sequences encode folding stability.

Subject(s)

Biology , Protein Engineering , Protein Folding , Proteins , Amino Acids/genetics , Amino Acids/metabolism , Biology/methods , DNA, Complementary/genetics , Protein Stability , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Thermodynamics , Proteolysis , Protein Engineering/methods , Protein Domains/genetics , Mutation

6.

Unraveling the functional dark matter through global metagenomics.

Pavlopoulos, Georgios A; Baltoumas, Fotis A; Liu, Sirui; Selvitopi, Oguz; Camargo, Antonio Pedro; Nayfach, Stephen; Azad, Ariful; Roux, Simon; Call, Lee; Ivanova, Natalia N; Chen, I Min; Paez-Espino, David; Karatzas, Evangelos; Iliopoulos, Ioannis; Konstantinidis, Konstantinos; Tiedje, James M; Pett-Ridge, Jennifer; Baker, David; Visel, Axel; Ouzounis, Christos A; Ovchinnikov, Sergey; Buluç, Aydin; Kyrpides, Nikos C.

Nature ; 622(7983): 594-602, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37821698

ABSTRACT

Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities (refs. 1, 2). Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database (ref. 3). Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.

Subject(s)

Metagenome , Metagenomics , Microbiology , Proteins , Cluster Analysis , Metagenome/genetics , Metagenomics/methods , Proteins/chemistry , Proteins/classification , Proteins/genetics , Databases, Protein , Protein Conformation
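
The abstract of this record mentions graph-based clustering of sequences followed by a cluster-size cutoff. As a rough illustration only, the sketch below groups nodes of a similarity graph into connected components with a union-find structure and then keeps the components above a size threshold. Connected components are just one simple form of graph clustering; the abstract does not state which algorithm was used, and the function names, edge list and threshold here are hypothetical.

```python
from collections import defaultdict


def connected_clusters(n_nodes, edges):
    """Group nodes 0..n_nodes-1 into connected components via union-find."""
    parent = list(range(n_nodes))

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_v] = root_u

    groups = defaultdict(list)
    for node in range(n_nodes):
        groups[find(node)].append(node)
    return list(groups.values())


def keep_large(clusters, min_members):
    """Discard clusters with fewer than min_members members."""
    return [c for c in clusters if len(c) >= min_members]


# Toy similarity graph: an edge means two sequences were judged similar.
edges = [(0, 1), (1, 2), (3, 4)]
clusters = connected_clusters(6, edges)
large = keep_large(clusters, min_members=3)
print(clusters)
print(large)
```

The toy threshold of 3 stands in for the much larger cutoff quoted in the abstract.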

7.

De novo design of protein structure and function with RFdiffusion.

Watson, Joseph L; Juergens, David; Bennett, Nathaniel R; Trippe, Brian L; Yim, Jason; Eisenach, Helen E; Ahern, Woody; Borst, Andrew J; Ragotte, Robert J; Milles, Lukas F; Wicky, Basile I M; Hanikel, Nikita; Pellock, Samuel J; Courbet, Alexis; Sheffler, William; Wang, Jue; Venkatesh, Preetham; Sappington, Isaac; Torres, Susana Vázquez; Lauko, Anna; De Bortoli, Valentin; Mathieu, Emile; Ovchinnikov, Sergey; Barzilay, Regina; Jaakkola, Tommi S; DiMaio, Frank; Baek, Minkyung; Baker, David.

Nature ; 620(7976): 1089-1100, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37433327

ABSTRACT

There has been considerable recent progress in designing new proteins using deep-learning methods (refs. 1-9). Despite this progress, a general deep-learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher-order symmetric architectures, has yet to be described. Diffusion models (refs. 10, 11) have had considerable success in image and language generative modelling but limited success when applied to protein modelling, probably due to the complexity of protein backbone geometry and sequence-structure relationships. Here we show that by fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of designed symmetric assemblies, metal-binding proteins and protein binders. The accuracy of RFdiffusion is confirmed by the cryogenic electron microscopy structure of a designed binder in complex with influenza haemagglutinin that is nearly identical to the design model. In a manner analogous to networks that produce images from user-specified inputs, RFdiffusion enables the design of diverse functional proteins from simple molecular specifications.

Subject(s)

Deep Learning , Proteins , Catalytic Domain , Cryoelectron Microscopy , Hemagglutinin Glycoproteins, Influenza Virus/chemistry , Hemagglutinin Glycoproteins, Influenza Virus/metabolism , Hemagglutinin Glycoproteins, Influenza Virus/ultrastructure , Protein Binding , Proteins/chemistry , Proteins/metabolism , Proteins/ultrastructure

8.

Advances in Chromatin and Chromosome Research: Perspectives from Multiple Fields.

Agbleke, Andrews Akwasi; Amitai, Assaf; Buenrostro, Jason D; Chakrabarti, Aditi; Chu, Lingluo; Hansen, Anders S; Koenig, Kristen M; Labade, Ajay S; Liu, Sirui; Nozaki, Tadasu; Ovchinnikov, Sergey; Seeber, Andrew; Shaban, Haitham A; Spille, Jan-Hendrik; Stephens, Andrew D; Su, Jun-Han; Wadduwage, Dushan.

Mol Cell ; 79(6): 881-901, 2020 09 17.

Article in English | MEDLINE | ID: mdl-32768408

ABSTRACT

Nucleosomes package genomic DNA into chromatin. By regulating DNA access for transcription, replication, DNA repair, and epigenetic modification, chromatin forms the nexus of most nuclear processes. In addition, dynamic organization of chromatin underlies both regulation of gene expression and evolution of chromosomes into individualized sister objects, which can segregate cleanly to different daughter cells at anaphase. This collaborative review shines a spotlight on technologies that will be crucial to interrogate key questions in chromatin and chromosome biology including state-of-the-art microscopy techniques, tools to physically manipulate chromatin, single-cell methods to measure chromatin accessibility, computational imaging with neural networks and analytical tools to interpret chromatin structure and dynamics. In addition, this review provides perspectives on how these tools can be applied to specific research fields such as genome stability and developmental biology and to test concepts such as phase separation of chromatin.

Subject(s)

Chromatin/genetics , Chromosomes/genetics , DNA/genetics , Nucleosomes/genetics , DNA Repair/genetics , DNA Replication/genetics , Epigenesis, Genetic/genetics , Humans

9.

De novo protein design by deep network hallucination.

Anishchenko, Ivan; Pellock, Samuel J; Chidyausiku, Tamuka M; Ramelot, Theresa A; Ovchinnikov, Sergey; Hao, Jingzhou; Bafna, Khushboo; Norn, Christoffer; Kang, Alex; Bera, Asim K; DiMaio, Frank; Carter, Lauren; Chow, Cameron M; Montelione, Gaetano T; Baker, David.

Nature ; 600(7889): 547-552, 2021 12.

Article in English | MEDLINE | ID: mdl-34853475

ABSTRACT

There has been considerable recent progress in protein structure prediction using deep neural networks to predict inter-residue distances from amino acid sequences (refs. 1-3). Here we investigate whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generate random amino acid sequences, and input them into the trRosetta structure prediction network to predict starting residue-residue distance maps, which, as expected, are quite featureless. We then carry out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (Kullback-Leibler divergence) between the inter-residue distance distributions predicted by the network and background distributions averaged over all proteins. Optimization from different random starting points resulted in novel proteins spanning a wide range of sequences and predicted structures. We obtained synthetic genes encoding 129 of the network-'hallucinated' sequences, and expressed and purified the proteins in Escherichia coli; 27 of the proteins yielded monodisperse species with circular dichroism spectra consistent with the hallucinated structures. We determined the three-dimensional structures of three of the hallucinated proteins, two by X-ray crystallography and one by NMR, and these closely matched the hallucinated models. Thus, deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute alongside traditional physics-based models to the de novo design of proteins with new functions.

Subject(s)

Neural Networks, Computer , Proteins , Amino Acid Sequence , Crystallography, X-Ray , Hallucinations , Humans , Protein Conformation , Proteins/chemistry , Proteins/genetics
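
The abstract of this record describes Monte Carlo sampling in sequence space that maximizes the Kullback-Leibler divergence between network-predicted distance distributions and a background. The sketch below shows only the shape of such a loop. The predictor here is a deliberately trivial toy function (`toy_predicted_bins`), not trRosetta, and the background, temperature and step count are assumptions chosen so the example runs with the standard library alone.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = [1 / 3, 1 / 3, 1 / 3]  # toy background over three distance bins


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def toy_predicted_bins(seq):
    """Hypothetical stand-in for a structure network (NOT trRosetta).

    The distribution simply gets sharper as the fraction of 'L' grows.
    """
    frac = seq.count("L") / len(seq)
    peak = 1 / 3 + 0.65 * frac
    rest = (1 - peak) / 2
    return [peak, rest, rest]


def score(seq):
    """Contrast between the toy prediction and the background."""
    return kl_divergence(toy_predicted_bins(seq), BACKGROUND)


def metropolis_step(seq, rng, temperature=0.05):
    """Propose one point mutation and accept it by the Metropolis rule."""
    pos = rng.randrange(len(seq))
    proposal = seq[:pos] + rng.choice(ALPHABET) + seq[pos + 1:]
    delta = score(proposal) - score(seq)
    if delta >= 0 or rng.random() < math.exp(delta / temperature):
        return proposal
    return seq


rng = random.Random(0)
start = "".join(rng.choice(ALPHABET) for _ in range(20))
current = best = start
for _ in range(500):
    current = metropolis_step(current, rng)
    if score(current) > score(best):
        best = current  # remember the highest-contrast sequence visited
print(start, round(score(start), 3))
print(best, round(score(best), 3))
```

With a real network in place of the toy function, the same accept/reject loop would push a random sequence toward one whose predicted distance map is sharply defined rather than featureless.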

10.

NMPFamsDB: a database of novel protein families from microbial metagenomes and metatranscriptomes.

Baltoumas, Fotis A; Karatzas, Evangelos; Liu, Sirui; Ovchinnikov, Sergey; Sofianatos, Yorgos; Chen, I-Min; Kyrpides, Nikos C; Pavlopoulos, Georgios A.

Nucleic Acids Res ; 52(D1): D502-D512, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37811892

ABSTRACT

The Novel Metagenome Protein Families Database (NMPFamsDB) is a database of metagenome- and metatranscriptome-derived protein families, whose members have no hits to proteins of reference genomes or Pfam domains. Each protein family is accompanied by multiple sequence alignments, Hidden Markov Models, taxonomic information, ecosystem and geolocation metadata, sequence and structure predictions, as well as 3D structure models predicted with AlphaFold2. In its current version, NMPFamsDB hosts over 100 000 protein families, each with at least 100 members. The reported protein families significantly expand (more than double) the number of known protein sequence clusters from reference genomes and reveal new insights into their habitat distribution, origins, functions and taxonomy. We expect NMPFamsDB to be a valuable resource for microbial proteome-wide analyses and for further discovery and characterization of novel functions. NMPFamsDB is publicly available at http://www.nmpfamsdb.org/ or https://bib.fleming.gr/NMPFamsDB.

Subject(s)

Databases, Protein , Metagenome , Proteins , Amino Acid Sequence , Databases, Factual , Ecosystem , Proteins/chemistry , Geography

11.

De novo design of small beta barrel proteins.

Kim, David E; Jensen, Davin R; Feldman, David; Tischer, Doug; Saleem, Ayesha; Chow, Cameron M; Li, Xinting; Carter, Lauren; Milles, Lukas; Nguyen, Hannah; Kang, Alex; Bera, Asim K; Peterson, Francis C; Volkman, Brian F; Ovchinnikov, Sergey; Baker, David.

Proc Natl Acad Sci U S A ; 120(11): e2207974120, 2023 03 14.

Article in English | MEDLINE | ID: mdl-36897987

ABSTRACT

Small beta barrel proteins are attractive targets for computational design because of their considerable functional diversity despite their very small size (<70 amino acids). However, there are considerable challenges to designing such structures, and there has been little success thus far. Because of the small size, the hydrophobic core stabilizing the fold is necessarily very small, and the conformational strain of barrel closure can oppose folding; also intermolecular aggregation through free beta strand edges can compete with proper monomer folding. Here, we explore the de novo design of small beta barrel topologies using both Rosetta energy-based methods and deep learning approaches to design four small beta barrel folds: Src homology 3 (SH3) and oligonucleotide/oligosaccharide-binding (OB) topologies found in nature and five and six up-and-down-stranded barrels rarely if ever seen in nature. Both approaches yielded successful designs with high thermal stability and experimentally determined structures with less than 2.4 Å rmsd from the designed models. Using deep learning for backbone generation and Rosetta for sequence design yielded higher design success rates and greater structural diversity than Rosetta alone. The ability to design a large and structurally diverse set of small beta barrel proteins greatly increases the protein shape space available for designing binders to protein targets of interest.

Subject(s)

Amino Acids , Proteins , Protein Structure, Secondary , Models, Molecular , Proteins/chemistry , Protein Conformation, beta-Strand , Protein Folding

12.

ColabFold: making protein folding accessible to all.

Mirdita, Milot; Schütze, Konstantin; Moriwaki, Yoshitaka; Heo, Lim; Ovchinnikov, Sergey; Steinegger, Martin.

Nat Methods ; 19(6): 679-682, 2022 06.

Article in English | MEDLINE | ID: mdl-35637307

ABSTRACT

ColabFold offers accelerated prediction of protein structures and complexes by combining the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold. ColabFold's 40-60-fold faster search and optimized model utilization enable prediction of close to 1,000 structures per day on a server with one graphics processing unit. Coupled with Google Colaboratory, ColabFold becomes a free and accessible platform for protein folding. ColabFold is open-source software available at https://github.com/sokrypton/ColabFold and its novel environmental databases are available at https://colabfold.mmseqs.com.

Subject(s)

Protein Folding , Software , Computers , Databases, Factual , Proteins

13.

Co-evolution-based prediction of metal-binding sites in proteomes by machine learning.

Cheng, Yao; Wang, Haobo; Xu, Hua; Liu, Yuan; Ma, Bin; Chen, Xuemin; Zeng, Xin; Wang, Xianghe; Wang, Bo; Shiau, Carina; Ovchinnikov, Sergey; Su, Xiao-Dong; Wang, Chu.

Nat Chem Biol ; 19(5): 548-555, 2023 05.

Article in English | MEDLINE | ID: mdl-36593274

ABSTRACT

Metal ions have various important biological roles in proteins, including structural maintenance, molecular recognition and catalysis. Previous methods of predicting metal-binding sites in proteomes were based on either sequence or structural motifs. Here we developed a co-evolution-based pipeline named 'MetalNet' to systematically predict metal-binding sites in proteomes. We applied MetalNet to proteomes of four representative prokaryotic species and predicted 4,849 potential metalloproteins, which substantially expands the currently annotated metalloproteomes. We biochemically and structurally validated previously unannotated metal-binding sites in several proteins, including apo-citrate lyase phosphoribosyl-dephospho-CoA transferase citX, an Escherichia coli enzyme lacking structural or sequence homology to any known metalloprotein (Protein Data Bank (PDB) codes: 7DCM and 7DCN). MetalNet also successfully recapitulated all known zinc-binding sites from the human spliceosome complex. The pipeline of MetalNet provides a unique and enabling tool for interrogating the hidden metalloproteome and studying metal biology.

Subject(s)

Metalloproteins , Proteome , Humans , Amino Acid Sequence , Proteome/chemistry , Metals/metabolism , Metalloproteins/metabolism , Binding Sites , Escherichia coli/metabolism , Machine Learning

14.

End-to-end learning of multiple sequence alignments with differentiable Smith-Waterman.

Petti, Samantha; Bhattacharya, Nicholas; Rao, Roshan; Dauparas, Justas; Thomas, Neil; Zhou, Juannan; Rush, Alexander M; Koo, Peter; Ovchinnikov, Sergey.

Bioinformatics ; 39(1), 2023 01 01.

Article in English | MEDLINE | ID: mdl-36355460

ABSTRACT

MOTIVATION: Multiple sequence alignments (MSAs) of homologous sequences contain information on structural and functional constraints and their evolutionary histories. Despite their importance for many downstream tasks, such as structure prediction, MSA generation is often treated as a separate pre-processing step, without any guidance from the application it will be used for. RESULTS: Here, we implement a smooth and differentiable version of the Smith-Waterman pairwise alignment algorithm that enables jointly learning an MSA and a downstream machine learning system in an end-to-end fashion. To demonstrate its utility, we introduce SMURF (Smooth Markov Unaligned Random Field), a new method that jointly learns an alignment and the parameters of a Markov Random Field for unsupervised contact prediction. We find that SMURF learns MSAs that mildly improve contact prediction on a diverse set of protein and RNA families. As a proof of concept, we demonstrate that by connecting our differentiable alignment module to AlphaFold2 and maximizing predicted confidence, we can learn MSAs that improve structure predictions over the initial MSAs. Interestingly, the alignments that improve AlphaFold predictions are self-inconsistent and can be viewed as adversarial. This work highlights the potential of differentiable dynamic programming to improve neural network pipelines that rely on an alignment and the potential dangers of optimizing predictions of protein sequences with methods that are not fully understood. AVAILABILITY AND IMPLEMENTATION: Our code and examples are available at: https://github.com/spetti/SMURF. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Algorithms , Proteins , Humans , Sequence Alignment , Proteins/chemistry , Neural Networks, Computer , Amino Acid Sequence
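
The abstract of this record refers to a smooth, differentiable version of the Smith-Waterman local alignment algorithm. One common way to smooth a max-based dynamic program is to replace every `max` with a temperature-scaled log-sum-exp, and the sketch below does exactly that. It is an illustration under that assumption, not necessarily the formulation used in the paper. The linear gap penalty, the scoring values and the function names are hypothetical choices.

```python
import math


def smooth_max(values, temperature):
    """Temperature-scaled log-sum-exp; approaches max(values) as temperature -> 0."""
    m = max(values)
    total = sum(math.exp((v - m) / temperature) for v in values)
    return m + temperature * math.log(total)


def smooth_smith_waterman(a, b, temperature=1.0, match=2.0, mismatch=-1.0, gap=1.0):
    """Smoothed local-alignment score with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0.0] * cols for _ in range(rows)]
    cells = [0.0]  # the empty alignment scores zero
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = smooth_max(
                [0.0, h[i - 1][j - 1] + s, h[i - 1][j] - gap, h[i][j - 1] - gap],
                temperature,
            )
            cells.append(h[i][j])
    # The hard algorithm takes the best cell; the smooth one soft-maximizes over all cells.
    return smooth_max(cells, temperature)


nearly_hard = smooth_smith_waterman("ACGT", "ACGT", temperature=0.01)
soft = smooth_smith_waterman("ACGT", "ACGT", temperature=1.0)
print(nearly_hard, soft)
```

Because every operation is smooth in the match, mismatch and gap scores, an automatic-differentiation framework could propagate gradients through such a recursion, which is what makes joint training with a downstream model conceivable. At low temperature the value approaches the ordinary Smith-Waterman score.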

15.

De novo design of a fluorescence-activating β-barrel.

Dou, Jiayi; Vorobieva, Anastassia A; Sheffler, William; Doyle, Lindsey A; Park, Hahnbeom; Bick, Matthew J; Mao, Binchen; Foight, Glenna W; Lee, Min Yen; Gagnon, Lauren A; Carter, Lauren; Sankaran, Banumathi; Ovchinnikov, Sergey; Marcos, Enrique; Huang, Po-Ssu; Vaughan, Joshua C; Stoddard, Barry L; Baker, David.

Nature ; 561(7724): 485-491, 2018 09.

Article in English | MEDLINE | ID: mdl-30209393

ABSTRACT

The regular arrangements of β-strands around a central axis in β-barrels and of α-helices in coiled coils contrast with the irregular tertiary structures of most globular proteins, and have fascinated structural biologists since they were first discovered. Simple parametric models have been used to design a wide range of α-helical coiled-coil structures, but to date there has been no success with β-barrels. Here we show that accurate de novo design of β-barrels requires considerable symmetry-breaking to achieve continuous hydrogen-bond connectivity and eliminate backbone strain. We then build ensembles of β-barrel backbone models with cavity shapes that match the fluorogenic compound DFHBI, and use a hierarchical grid-based search method to simultaneously optimize the rigid-body placement of DFHBI in these cavities and the identities of the surrounding amino acids to achieve high shape and chemical complementarity. The designs have high structural accuracy and bind and fluorescently activate DFHBI in vitro and in Escherichia coli, yeast and mammalian cells. This de novo design of small-molecule binding activity, using backbones custom-built to bind the ligand, should enable the design of increasingly sophisticated ligand-binding proteins, sensors and catalysts that are not limited by the backbone geometries available in known protein structures.

Subject(s)

Benzyl Compounds/chemistry , Fluorescence , Imidazolines/chemistry , Proteins/chemistry , Animals , Benzyl Compounds/analysis , COS Cells , Chlorocebus aethiops , Escherichia coli , Green Fluorescent Proteins/genetics , Green Fluorescent Proteins/metabolism , Hydrogen Bonding , Imidazolines/analysis , Ligands , Protein Binding , Protein Domains , Protein Folding , Protein Stability , Protein Structure, Secondary , Reproducibility of Results , Yeasts

16.

Protein sequence design by conformational landscape optimization.

Norn, Christoffer; Wicky, Basile I M; Juergens, David; Liu, Sirui; Kim, David; Tischer, Doug; Koepnick, Brian; Anishchenko, Ivan; Baker, David; Ovchinnikov, Sergey.

Proc Natl Acad Sci U S A ; 118(11), 2021 03 16.

Article in English | MEDLINE | ID: mdl-33712545

ABSTRACT

The protein design problem is to identify an amino acid sequence that folds to a desired structure. Given Anfinsen's thermodynamic hypothesis of folding, this can be recast as finding an amino acid sequence for which the desired structure is the lowest energy state. As this calculation involves not only all possible amino acid sequences but also, all possible structures, most current approaches focus instead on the more tractable problem of finding the lowest-energy amino acid sequence for the desired structure, often checking by protein structure prediction in a second step that the desired structure is indeed the lowest-energy conformation for the designed sequence, and typically discarding a large fraction of designed sequences for which this is not the case. Here, we show that by backpropagating gradients through the transform-restrained Rosetta (trRosetta) structure prediction network from the desired structure to the input amino acid sequence, we can directly optimize over all possible amino acid sequences and all possible structures in a single calculation. We find that trRosetta calculations, which consider the full conformational landscape, can be more effective than Rosetta single-point energy estimations in predicting folding and stability of de novo designed proteins. We compare sequence design by conformational landscape optimization with the standard energy-based sequence design methodology in Rosetta and show that the former can result in energy landscapes with fewer alternative energy minima. We show further that more funneled energy landscapes can be designed by combining the strengths of the two approaches: the low-resolution trRosetta model serves to disfavor alternative states, and the high-resolution Rosetta model serves to create a deep energy minimum at the design target structure.

Subject(s)

Neural Networks, Computer , Proteins/chemistry , Models, Molecular , Protein Conformation , Protein Folding , Thermodynamics

17.

Cryo-EM structure of the protein-conducting ERAD channel Hrd1 in complex with Hrd3.

Schoebel, Stefan; Mi, Wei; Stein, Alexander; Ovchinnikov, Sergey; Pavlovicz, Ryan; DiMaio, Frank; Baker, David; Chambers, Melissa G; Su, Huayou; Li, Dongsheng; Rapoport, Tom A; Liao, Maofu.

Nature ; 548(7667): 352-355, 2017 08 17.

Article in English | MEDLINE | ID: mdl-28682307

ABSTRACT

Misfolded endoplasmic reticulum proteins are retro-translocated through the membrane into the cytosol, where they are poly-ubiquitinated, extracted from the membrane, and degraded by the proteasome, a pathway termed endoplasmic reticulum-associated protein degradation (ERAD). Proteins with misfolded domains in the endoplasmic reticulum lumen or membrane are discarded through the ERAD-L and ERAD-M pathways, respectively. In Saccharomyces cerevisiae, both pathways require the ubiquitin ligase Hrd1, a multi-spanning membrane protein with a cytosolic RING finger domain. Hrd1 is the crucial membrane component for retro-translocation, but it is unclear whether it forms a protein-conducting channel. Here we present a cryo-electron microscopy structure of S. cerevisiae Hrd1 in complex with its endoplasmic reticulum luminal binding partner, Hrd3. Hrd1 forms a dimer within the membrane with one or two Hrd3 molecules associated at its luminal side. Each Hrd1 molecule has eight transmembrane segments, five of which form an aqueous cavity extending from the cytosol almost to the endoplasmic reticulum lumen, while a segment of the neighbouring Hrd1 molecule forms a lateral seal. The aqueous cavity and lateral gate are reminiscent of features of protein-conducting conduits that facilitate polypeptide movement in the opposite direction, from the cytosol into or across membranes. Our results suggest that Hrd1 forms a retro-translocation channel for the movement of misfolded polypeptides through the endoplasmic reticulum membrane.

Subject(s)

Cryoelectron Microscopy , Endoplasmic Reticulum-Associated Degradation , Membrane Glycoproteins/metabolism , Membrane Glycoproteins/ultrastructure , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae Proteins/ultrastructure , Saccharomyces cerevisiae/chemistry , Ubiquitin-Protein Ligases/metabolism , Ubiquitin-Protein Ligases/ultrastructure , Hydrophobic and Hydrophilic Interactions , Membrane Glycoproteins/chemistry , Molecular Models , Protein Conformation , Saccharomyces cerevisiae/ultrastructure , Saccharomyces cerevisiae Proteins/chemistry , Ubiquitin-Protein Ligases/chemistry

18.

Improved protein structure prediction using predicted interresidue orientations.

Yang, Jianyi; Anishchenko, Ivan; Park, Hahnbeom; Peng, Zhenling; Ovchinnikov, Sergey; Baker, David.

Proc Natl Acad Sci U S A ; 117(3): 1496-1503, 2020 Jan 21.

Article in English | MEDLINE | ID: mdl-31896580

ABSTRACT

The prediction of interresidue contacts and distances from coevolutionary data using deep learning has considerably advanced protein structure prediction. Here, we build on these advances by developing a deep residual network for predicting interresidue orientations, in addition to distances, and a Rosetta-constrained energy-minimization protocol for rapidly and accurately generating structure models guided by these restraints. In benchmark tests on 13th Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP13)- and Continuous Automated Model Evaluation (CAMEO)-derived sets, the method outperforms all previously described structure-prediction methods. Although trained entirely on native proteins, the network consistently assigns higher probability to de novo-designed proteins, identifying the key fold-determining residues and providing an independent quantitative measure of the "ideality" of a protein structure. The method promises to be useful for a broad range of protein structure prediction and design problems.

Subject(s)

Protein Conformation , Protein Sequence Analysis/methods , Software , Animals , Deep Learning , Humans

19.

State-of-the-Art Estimation of Protein Model Accuracy Using AlphaFold.

Roney, James P; Ovchinnikov, Sergey.

Phys Rev Lett ; 129(23): 238101, 2022 Dec 02.

Article in English | MEDLINE | ID: mdl-36563190

ABSTRACT

The problem of predicting a protein's 3D structure from its primary amino acid sequence is a longstanding challenge in structural biology. Recently, approaches like AlphaFold have achieved remarkable performance on this task by combining deep learning techniques with coevolutionary data from multiple sequence alignments of related protein sequences. The use of coevolutionary information is critical to these models' accuracy, and without it their predictive performance drops considerably. In living cells, however, the 3D structure of a protein is fully determined by its primary sequence and the biophysical laws that cause it to fold into a low-energy configuration. Thus, it should be possible to predict a protein's structure from only its primary sequence by learning an approximate biophysical energy function. We provide evidence that AlphaFold has learned such an energy function, and uses coevolution data to solve the global search problem of finding a low-energy conformation. We demonstrate that AlphaFold's learned energy function can be used to rank the quality of candidate protein structures with state-of-the-art accuracy, without using any coevolution data. Finally, we explore several applications of this energy function, including the prediction of protein structures without multiple sequence alignments.

Subject(s)

Algorithms , Proteins , Protein Conformation , Molecular Models , Proteins/chemistry , Amino Acid Sequence

20.

Temperature- and Field-Induced Transformation of the Magnetic State in Co2.5Ge0.5BO5.

Kazak, Natalia; Arauzo, Ana; Bartolomé, Juan; Belskaya, Nadezhda; Vasiliev, Alexander; Velikanov, Dmitry; Eremin, Evgeny; Gavrilkin, Sergey; Zhandun, Vyacheslav; Patrin, Gennadiy; Ovchinnikov, Sergey.

Inorg Chem ; 61(33): 13034-13046, 2022 Aug 22.

Article in English | MEDLINE | ID: mdl-35947773

ABSTRACT

A tetravalent-substituted cobalt ludwigite Co2.5Ge0.5BO5 has been synthesized using the flux method. The compound undergoes two magnetic transitions: a long-range antiferromagnetic transition at TN1 = 84 K and a metamagnetic one at TN2 = 36 K. The sample-oriented magnetization measurements revealed a fully compensated magnetic moment along the a- and c-axes and an uncompensated one along the b-axis leading to high uniaxial anisotropy. A field-induced enhancement of the ferromagnetic correlations at TN2 is observed in specific heat measurements. The DFT+GGA calculation predicts the spin configuration of (↑↓↓↑) as a ground state with a magnetic moment of 1.37 µB/f.u. The strong hybridization of Ge(4s, 4p) with O (2p) orbitals resulting from the high electronegativity of Ge4+ is assumed to cause an increase in the interlayer interaction, contributing to the long-range magnetic order. The effect of two super-superexchange pathways Co2+-O-B-O-Co2+ and Co2+-O-M4-O-Co2+ on the magnetic state is discussed.
