Search | Secretaria de Estado da Saúde (State Department of Health)

Uncovering new families and folds in the natural protein universe.

Durairaj, Janani; Waterhouse, Andrew M; Mets, Toomas; Brodiazhenko, Tetiana; Abdullah, Minhal; Studer, Gabriel; Tauriello, Gerardo; Akdel, Mehmet; Andreeva, Antonina; Bateman, Alex; Tenson, Tanel; Hauryliuk, Vasili; Schwede, Torsten; Pereira, Joana.

Nature ; 622(7983): 646-653, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37704037

ABSTRACT

We are now entering a new era in protein sequence and structure annotation, with hundreds of millions of predicted protein structures made available through the AlphaFold database¹. These models cover nearly all proteins that are known, including those challenging to annotate for function or putative biological role using standard homology-based approaches. In this study, we examine the extent to which the AlphaFold database has structurally illuminated this 'dark matter' of the natural protein universe at high predicted accuracy. We further describe the protein diversity that these models cover as an annotated interactive sequence similarity network, accessible at https://uniprot3d.org/atlas/AFDB90v4. By searching for novelties from sequence, structure and semantic perspectives, we uncovered the β-flower fold, added several protein families to Pfam database² and experimentally demonstrated that one of these belongs to a new superfamily of translation-targeting toxin-antitoxin systems, TumE-TumA. This work underscores the value of large-scale efforts in identifying, annotating and prioritizing new protein families. By leveraging the recent deep learning revolution in protein bioinformatics, we can now shed light into uncharted areas of the protein universe at an unprecedented scale, paving the way to innovations in life sciences and biotechnology.

Subjects

Protein Databases, Deep Learning, Molecular Sequence Annotation, Protein Folding, Proteins, Protein Structural Homology, Amino Acid Sequence, Internet, Proteins/chemistry, Proteins/classification, Proteins/metabolism

The structure assessment web server: for proteins, complexes and more.

Waterhouse, Andrew M; Studer, Gabriel; Robin, Xavier; Bienert, Stefan; Tauriello, Gerardo; Schwede, Torsten.

Nucleic Acids Res ; 52(W1): W318-W323, 2024 Jul 05.

Article in English | MEDLINE | ID: mdl-38634802

ABSTRACT

The 'structure assessment' web server is a one-stop shop for interactive evaluation and benchmarking of structural models of macromolecular complexes including proteins and nucleic acids. A user-friendly web dashboard links sequence with structure information and results from a variety of state-of-the-art tools, which facilitates the visual exploration and evaluation of structure models. The dashboard integrates stereochemistry information, secondary structure information, global and local model quality assessment of the tertiary structure of comparative protein models, as well as prediction of membrane location. In addition, a benchmarking mode is available where a model can be compared to a reference structure, providing easy access to scores that have been used in recent CASP experiments and CAMEO. The structure assessment web server is available at https://swissmodel.expasy.org/assess.

Subjects

Internet, Molecular Models, Software, Proteins/chemistry, Benchmarking, Protein Conformation

QMEANDisCo-distance constraints applied on model quality estimation.

Studer, Gabriel; Rempfer, Christine; Waterhouse, Andrew M; Gumienny, Rafal; Haas, Juergen; Schwede, Torsten.

Bioinformatics ; 36(6): 1765-1771, 2020 Mar 01.

Article in English | MEDLINE | ID: mdl-31697312

ABSTRACT

MOTIVATION: Methods that estimate the quality of a 3D protein structure model in the absence of an experimental reference structure are crucial to determine a model's utility and potential applications. Single model methods assess individual models whereas consensus methods require an ensemble of models as input. In this work, we extend the single model composite score QMEAN, which employs statistical potentials of mean force and agreement terms, by introducing a consensus-based distance constraint (DisCo) score. RESULTS: DisCo exploits distance distributions from experimentally determined protein structures that are homologous to the model being assessed. Feed-forward neural networks are trained to adaptively weigh contributions by the multi-template DisCo score and classical single model QMEAN parameters. The result is the composite score QMEANDisCo, which combines the accuracy of consensus methods with the broad applicability of single model approaches. We also demonstrate that, despite being the de facto standard for structure prediction benchmarking, CASP models are not the ideal data source to train predictive methods for model quality estimation. For performance assessment, QMEANDisCo is continuously benchmarked within the CAMEO project and participated in CASP13. For both, it ranks among the top performers and excels with low response times. AVAILABILITY AND IMPLEMENTATION: QMEANDisCo is available as a web server at https://swissmodel.expasy.org/qmean. The source code can be downloaded from https://git.scicore.unibas.ch/schwede/QMEAN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Proteins, Software, Molecular Models, Computer Neural Networks, Protein Conformation

QMEANDisCo-distance constraints applied on model quality estimation.

Studer, Gabriel; Rempfer, Christine; Waterhouse, Andrew M; Gumienny, Rafal; Haas, Juergen; Schwede, Torsten.

Bioinformatics ; 36(8): 2647, 2020 Apr 15.

Article in English | MEDLINE | ID: mdl-32048708

Jalview Version 2--a multiple sequence alignment editor and analysis workbench.

Waterhouse, Andrew M; Procter, James B; Martin, David M A; Clamp, Michèle; Barton, Geoffrey J.

Bioinformatics ; 25(9): 1189-91, 2009 May 01.

Article in English | MEDLINE | ID: mdl-19151095

ABSTRACT

Jalview Version 2 is a system for interactive WYSIWYG editing, analysis and annotation of multiple sequence alignments. Core features include keyboard and mouse-based editing, multiple views and alignment overviews, and linked structure display with Jmol. Jalview 2 is available in two forms: a lightweight Java applet for use in web applications, and a powerful desktop application that employs web services for sequence alignment, secondary structure prediction and the retrieval of alignments, sequences, annotation and structures from public databases and any DAS 1.53-compliant sequence or annotation server. AVAILABILITY: The Jalview 2 Desktop application and JalviewLite applet are made freely available under the GPL, and can be downloaded from www.jalview.org.

Subjects

Computational Biology/methods, Proteins/chemistry, Sequence Alignment/methods, Software, Protein Databases, Protein Sequence Analysis

FANTOM4 EdgeExpressDB: an integrated database of promoters, genes, microRNAs, expression dynamics and regulatory interactions.

Severin, Jessica; Waterhouse, Andrew M; Kawaji, Hideya; Lassmann, Timo; van Nimwegen, Erik; Balwierz, Piotr J; de Hoon, Michiel Jl; Hume, David A; Carninci, Piero; Hayashizaki, Yoshihide; Suzuki, Harukazu; Daub, Carsten O; Forrest, Alistair Rr.

Genome Biol ; 10(4): R39, 2009.

Article in English | MEDLINE | ID: mdl-19374773

ABSTRACT

EdgeExpressDB is a novel database and set of interfaces for interpreting biological networks and comparing large high-throughput expression datasets that requires minimal development for new data types and search patterns. The FANTOM4 EdgeExpress database (http://fantom.gsc.riken.jp/4/edgeexpress) summarizes gene expression patterns in the context of alternative promoter structures and regulatory transcription factors and microRNAs using intuitive gene-centric and sub-network views. This is an important resource for gene regulation in acute myeloid leukemia, monocyte/macrophage differentiation and human transcriptional networks.

Subjects

Genetic Databases, Gene Expression Profiling/statistics & numerical data, Gene Regulatory Networks/genetics, Acute Disease, Cell Differentiation/genetics, Computational Biology/methods, Computational Biology/statistics & numerical data, Gene Expression Profiling/methods, Genomics/methods, Genomics/statistics & numerical data, Humans, Internet, Myeloid Leukemia/genetics, MicroRNAs/genetics, Genetic Promoter Regions/genetics, Software
