Results 1 - 20 of 71
1.
Nucleic Acids Res ; 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38686797

ABSTRACT

Residue interaction networks (RINs) are a valuable approach for representing contacts in protein structures. RINs have been widely used in various research areas, including the analysis of mutation effects, domain-domain communication, catalytic activity, and molecular dynamics simulations. The RING server is a powerful tool to calculate non-covalent molecular interactions based on geometrical parameters, providing high-quality and reliable results. Here, we introduce RING 4.0, which includes significant enhancements for identifying both covalent and non-covalent bonds in protein structures. It now encompasses seven different interaction types, with the addition of π-hydrogen, halogen bonds and metal ion coordination sites. The definitions of all available bond types have also been refined and RING can now process the complete PDB chemical component dictionary (over 35000 different molecules) which provides atom names and covalent connectivity information for all known ligands. Optimization of the software has improved execution time by an order of magnitude. The RING web server has been redesigned to provide a more engaging and interactive user experience, incorporating new visualization tools. Users can now visualize all types of interactions simultaneously in the structure viewer and network component. The web server, including extensive help and tutorials, is available from URL: https://ring.biocomputingup.it/.
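As a rough illustration of what computing contacts from geometrical parameters can look like, the sketch below links two residues when any pair of their atoms lies within a distance cutoff. It is not RING's algorithm: the function name `contact_network`, the 4.0 Å cutoff, and the toy coordinates are assumptions made for this example, and RING's typed interactions rely on more specific geometric criteria than a single distance.

```python
from itertools import combinations
from math import dist

# Hypothetical toy input: residue id -> list of (x, y, z) atom coordinates in Å.
toy_residues = {
    "A1": [(0.0, 0.0, 0.0)],
    "A2": [(3.0, 0.0, 0.0)],
    "A3": [(10.0, 0.0, 0.0)],
}


def contact_network(residues, cutoff=4.0):
    """Return the set of residue pairs with at least one atom pair within `cutoff`.

    The cutoff is an illustrative assumption, not a value taken from RING.
    """
    edges = set()
    # Residue ids are unique, so sorting the items orders them by id only.
    for (res_a, atoms_a), (res_b, atoms_b) in combinations(sorted(residues.items()), 2):
        if any(dist(a, b) <= cutoff for a in atoms_a for b in atoms_b):
            edges.add((res_a, res_b))
    return edges


edges = contact_network(toy_residues)
```

The all-against-all comparison is quadratic in the number of atoms; a real implementation would normally use a spatial index to stay fast on large structures.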

2.
Database (Oxford) ; 2024, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38507044

ABSTRACT

The DisProt database is a resource containing manually curated data on experimentally validated intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) from the literature. Developed in 2005, its primary goal was to collect structural and functional information on proteins that lack a fixed three-dimensional structure. Today, DisProt has evolved into a major repository that not only collects experimental data but also contributes to our understanding of the roles of IDPs/IDRs in various biological processes, such as autophagy or the life cycle mechanisms of viruses, and of their involvement in diseases (such as cancer and neurodevelopmental disorders). DisProt offers detailed information on the structural states of IDPs/IDRs, including state transitions, interactions and their functions, all provided as curated annotations. One of the central activities of DisProt is the meticulous curation of experimental data from the literature. For this reason, to ensure that every expert and volunteer curator possesses the requisite knowledge for data evaluation, collection and integration, training courses and curation materials are available. However, biocuration guidelines concur on the importance of developing robust guidelines that not only provide critical information about data consistency but also ensure data acquisition. This guideline aims to provide both biocurators and external users with best practices for manually curating IDPs and IDRs in DisProt. It describes every step of the literature curation process and provides use cases of IDP curation within DisProt. Database URL: https://disprot.org/.


Subject(s)
Intrinsically Disordered Proteins , Humans , Intrinsically Disordered Proteins/genetics , Intrinsically Disordered Proteins/chemistry , Protein Conformation , Databases, Factual
3.
Bioinform Adv ; 4(1): vbae043, 2024.
Article in English | MEDLINE | ID: mdl-38545087

ABSTRACT

We present CAFA-evaluator, a powerful Python program designed to evaluate the performance of prediction methods on targets with hierarchical concept dependencies. It generalizes multi-label evaluation to modern ontologies where the prediction targets are drawn from a directed acyclic graph and achieves high efficiency by leveraging matrix computation and topological sorting. The program requirements include a small number of standard Python libraries, making CAFA-evaluator easy to maintain. The code replicates the Critical Assessment of protein Function Annotation (CAFA) benchmarking, which evaluates predictions of the consistent subgraphs in Gene Ontology. Owing to its reliability and accuracy, the organizers have selected CAFA-evaluator as the official CAFA evaluation software. Availability and implementation: https://pypi.org/project/cafaeval.
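To make the idea of evaluating predictions over a directed acyclic graph more concrete, here is a minimal sketch that propagates terms to their ancestors in topological order and then computes set-based precision and recall. It does not use or reproduce the cafaeval API: the toy ontology, the function names, and the plain set arithmetic (instead of the matrix computation mentioned in the abstract) are assumptions chosen for readability.

```python
from graphlib import TopologicalSorter

# Hypothetical toy ontology: term -> set of direct parents.
toy_parents = {
    "root": set(),
    "a": {"root"},
    "b": {"root"},
    "c": {"a", "b"},
    "d": {"a"},
}


def ancestor_closure(parents):
    """Map each term to itself plus all of its ancestors.

    TopologicalSorter yields every parent before its children, so the
    closure of each parent is already available when a child is visited.
    """
    closure = {}
    for term in TopologicalSorter(parents).static_order():
        closure[term] = {term}
        for parent in parents.get(term, ()):
            closure[term] |= closure[parent]
    return closure


def propagate(terms, closure):
    """Expand a set of terms into the consistent subgraph it implies."""
    expanded = set()
    for term in terms:
        expanded |= closure[term]
    return expanded


def precision_recall(predicted, truth, closure):
    """Set-based precision and recall after propagating both sides."""
    pred = propagate(predicted, closure)
    true = propagate(truth, closure)
    overlap = len(pred & true)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(true) if true else 0.0
    return precision, recall


closure = ancestor_closure(toy_parents)
precision, recall = precision_recall({"d"}, {"c"}, closure)
```

Propagating before scoring means that a prediction sharing only ancestors with the truth still earns partial credit, which is the point of evaluating on consistent subgraphs rather than on isolated labels.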

4.
Nucleic Acids Res ; 52(D1): D434-D441, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37904585

ABSTRACT

DisProt (URL: https://disprot.org) is the gold standard database for intrinsically disordered proteins and regions, providing valuable information about their functions. The latest version of DisProt brings significant advancements, including a broader representation of functions and an enhanced curation process. These improvements aim to increase both the quality of annotations and their coverage at the sequence level. Higher coverage has been achieved by adopting additional evidence codes. Quality of annotations has been improved by systematically applying Minimum Information About Disorder Experiments (MIADE) principles and reporting all the details of the experimental setup that could potentially influence the structural state of a protein. The DisProt database now includes new thematic datasets and has expanded the adoption of Gene Ontology terms, resulting in an extensive functional repertoire which is automatically propagated to UniProtKB. Finally, we show that DisProt's curated annotations strongly correlate with disorder predictions inferred from AlphaFold2 pLDDT (predicted Local Distance Difference Test) confidence scores. This comparison highlights the utility of DisProt in explaining apparent uncertainty of certain well-defined predicted structures, which often correspond to folding-upon-binding fragments. Overall, DisProt serves as a comprehensive resource, combining experimental evidence of disorder information to enhance our understanding of intrinsically disordered proteins and their functional implications.


Subject(s)
Databases, Protein , Intrinsically Disordered Proteins , Gene Ontology , Intrinsically Disordered Proteins/chemistry , Molecular Sequence Annotation
5.
Res Sq ; 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37577579

ABSTRACT

In the context of the Critical Assessment of the Genome Interpretation, 6th edition (CAGI6), the Genetics of Neurodevelopmental Disorders Lab in Padua proposed a new ID-challenge offering the opportunity to develop computational methods for predicting patients' phenotypes and causal variants. Eight research teams and 30 models had access to the phenotype details and real genetic data, based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by Neurodevelopmental Disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions, with onset in infancy. In this study we evaluate the ability and accuracy of computational methods to predict comorbid phenotypes based on clinical features described in each patient and causal variants. Finally, we asked participants to develop a method to find new possible genetic causes for patients without a genetic diagnosis. As in CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia) and variants (causative, putative pathogenic and contributing factors) were provided. Considering the overall clinical manifestation of our cohort, we provided the variant data and phenotypic traits of the 150 patients from the CAGI5 ID-Challenge as training and validation data for developing the prediction methods.

6.
Proteins ; 91(12): 1925-1934, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37621223

ABSTRACT

Protein intrinsic disorder (ID) is a complex and context-dependent phenomenon that covers a continuum between fully disordered states and folded states with long dynamic regions. The lack of a ground truth that fits all ID flavors and the potential for order-to-disorder transitions depending on specific conditions make ID prediction challenging. The CAID2 challenge aimed to evaluate the performance of different prediction methods across different benchmarks, leveraging the annotation provided by the DisProt database, which stores the coordinates of ID regions when there is experimental evidence in the literature. The CAID2 challenge demonstrated varying performance of different prediction methods across different benchmarks, highlighting the need for continued development of more versatile and efficient prediction software. Depending on the application, researchers may need to balance performance with execution time when selecting a predictor. Methods based on AlphaFold2 seem to be good ID predictors, but they are better at detecting absence of order than ID regions as defined in DisProt. The CAID2 predictors can be freely used through the CAID Prediction Portal, and CAID has been integrated into OpenEBench, which will become the official platform for running future CAID challenges.


Subject(s)
Intrinsically Disordered Proteins , Proteins , Software , Databases, Protein
7.
J Struct Biol ; 215(4): 108023, 2023 12.
Article in English | MEDLINE | ID: mdl-37652396

ABSTRACT

Tandem Repeat Proteins (TRPs) are a class of proteins with repetitive amino acid sequences that have been studied extensively for over two decades. Different features at the level of sequence, structure, function and evolution have been attributed to them by various authors. And yet many of their salient features appear only when looking at specific subclasses of protein tandem repeats. Here, we attempt to rationalize the existing knowledge on TRPs by pointing out several dichotomies. The emerging picture is more nuanced than generally assumed and allows us to draw some boundaries of what is not a "proper" TRP. We conclude with an operational definition of a specific subset, which we have denominated STRPs (Structural Tandem Repeat Proteins), which separates a subclass of tandem repeats with distinctive features from several other less well-defined types of repeats. We believe that this definition will help researchers in the field to better characterize the biological meaning of this large yet largely understudied group of proteins.


Subject(s)
Proteins , Tandem Repeat Sequences , Proteins/genetics , Proteins/chemistry , Tandem Repeat Sequences/genetics , Amino Acid Sequence
8.
Nat Methods ; 20(9): 1291-1303, 2023 09.
Article in English | MEDLINE | ID: mdl-37400558

ABSTRACT

An unambiguous description of an experiment, and the subsequent biological observation, is vital for accurate data interpretation. Minimum information guidelines define the fundamental complement of data that can support an unambiguous conclusion based on experimental observations. We present the Minimum Information About Disorder Experiments (MIADE) guidelines to define the parameters required for the wider scientific community to understand the findings of an experiment studying the structural properties of intrinsically disordered regions (IDRs). MIADE guidelines provide recommendations for data producers to describe the results of their experiments at source, for curators to annotate experimental data to community resources and for database developers maintaining community resources to disseminate the data. The MIADE guidelines will improve the interpretability of experimental results for data consumers, facilitate direct data submission, simplify data curation, improve data exchange among repositories and standardize the dissemination of the key metadata on an IDR experiment by IDR data sources.


Subject(s)
Intrinsically Disordered Proteins , Intrinsically Disordered Proteins/chemistry , Protein Conformation
9.
J Struct Biol ; 215(3): 108001, 2023 09.
Article in English | MEDLINE | ID: mdl-37467824

ABSTRACT

Structured tandem repeat proteins (STRPs) are a specific kind of tandem repeat proteins characterized by a modular and repetitive three-dimensional structure arrangement. The majority of STRPs adopt solenoid structures, but with the increasing availability of experimental structures and high-quality predicted structural models, more STRP folds can be characterized. Here, we describe "Box repeats", an overlooked STRP fold present in the DNA sliding clamp processivity factors, which has eluded classification although structural data has been available since the late 1990s. Each Box repeat is a βαβββ module of about 60 residues, which forms a class V "beads-on-a-string" type STRP. The number of repeats present in processivity factors is organism dependent. Monomers of PCNA proteins in both Archaea and Eukarya have 4 repeats, while the monomers of bacterial beta-sliding clamps have 6 repeats. This new repeat fold has been added to the RepeatsDB database, which now provides structural annotation for 66 Box repeat proteins belonging to different organisms, including viruses.


Subject(s)
Proteins , Tandem Repeat Sequences , Proteins/chemistry , Tandem Repeat Sequences/genetics , DNA/genetics
10.
Nucleic Acids Res ; 51(W1): W62-W69, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37246642

ABSTRACT

Intrinsic disorder (ID) in proteins is well-established in structural biology, with increasing evidence for its involvement in essential biological processes. As measuring dynamic ID behavior experimentally on a large scale remains difficult, scores of published ID predictors have tried to fill this gap. Unfortunately, their heterogeneity makes it difficult to compare performance, confounding biologists wanting to make an informed choice. To address this issue, the Critical Assessment of protein Intrinsic Disorder (CAID) benchmarks predictors for ID and binding regions as a community blind-test in a standardized computing environment. Here we present the CAID Prediction Portal, a web server executing all CAID methods on user-defined sequences. The server generates standardized output and facilitates comparison between methods, producing a consensus prediction highlighting high-confidence ID regions. The website contains extensive documentation explaining the meaning of different CAID statistics and providing a brief description of all methods. Predictor output is visualized in an interactive feature viewer and made available for download in a single table, with the option to recover previous sessions via a private dashboard. The CAID Prediction Portal is a valuable resource for researchers interested in studying ID in proteins. The server is available at the URL: https://caid.idpcentral.org.


Subject(s)
Molecular Biology , Proteins , Benchmarking , Consensus , Proteins/chemistry , Software , Intrinsically Disordered Proteins
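The CAID Prediction Portal abstract above mentions a consensus prediction built from several methods. The abstract does not state how that consensus is computed, so the following is only one plausible sketch: a per-residue vote over binary predictions. The function name `consensus`, the `min_fraction` parameter, and the strict-majority rule are assumptions for illustration.

```python
def consensus(predictions, min_fraction=0.5):
    """Combine binary per-residue predictions (1 = disordered) by voting.

    A residue is called disordered when the fraction of methods voting 1
    is strictly greater than `min_fraction`. This rule is an assumption,
    not the portal's documented procedure.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same residues")
    n_methods = len(predictions)
    return [
        1 if sum(p[i] for p in predictions) / n_methods > min_fraction else 0
        for i in range(length)
    ]


# Three hypothetical methods scoring a four-residue sequence.
toy_predictions = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
combined = consensus(toy_predictions)
```

Raising `min_fraction` keeps only residues on which most methods agree, which is one way to emphasise high-confidence regions.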
11.
Bioinformatics ; 39(5), 2023 05 04.
Article in English | MEDLINE | ID: mdl-37079739

ABSTRACT

RING-PyMOL is a plugin for PyMOL providing a set of analysis tools for structural ensembles and molecular dynamics simulations. RING-PyMOL combines residue interaction networks, as provided by the RING software, with structural clustering to enhance the analysis and visualization of conformational complexity. It combines precise calculation of non-covalent interactions with the power of PyMOL to manipulate and visualize protein structures. The plugin identifies and highlights correlating contacts and interaction patterns that can explain structural allostery, active sites, and structural heterogeneity connected with molecular function. It is easy to use and extremely fast, processing and rendering hundreds of models and long trajectories in seconds. RING-PyMOL generates a number of interactive plots and output files for use with external tools. The underlying RING software has been improved extensively. It is 10 times faster, can process mmCIF files and also identifies typed interactions for nucleic acids. AVAILABILITY AND IMPLEMENTATION: https://github.com/BioComputingUP/ring-pymol.


Subject(s)
Molecular Dynamics Simulation , Software , Proteins/chemistry , Cluster Analysis , Catalytic Domain
12.
Nucleic Acids Res ; 51(D1): D438-D444, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36416266

ABSTRACT

The MobiDB database (URL: https://mobidb.org/) is a knowledge base of intrinsically disordered proteins. MobiDB aggregates disorder annotations derived from the literature and from experimental evidence along with predictions for all known protein sequences. MobiDB generates new knowledge and captures the functional significance of disordered regions by processing and combining complementary sources of information. Since its first release 10 years ago, the MobiDB database has evolved in order to improve the quality and coverage of protein disorder annotations and its accessibility. MobiDB has now reached its maturity in terms of data standardization and visualization. Here, we present a new release which focuses on the optimization of user experience and database content. The major advances compared to the previous version are the integration of AlphaFoldDB predictions and the re-implementation of the homology transfer pipeline, which expands manually curated annotations by two orders of magnitude. Finally, the entry page has been restyled in order to provide an overview of the available annotations along with two separate views that highlight structural disorder evidence and functions associated with different binding modes.


Subject(s)
Intrinsically Disordered Proteins , Intrinsically Disordered Proteins/chemistry , Databases, Protein , Molecular Sequence Annotation , Amino Acid Sequence , Knowledge Bases , Protein Conformation
13.
Gigascience ; 11, 2022 11 30.
Article in English | MEDLINE | ID: mdl-36448847

ABSTRACT

While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis to identify what models can best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers and also from the Protein Data Bank.


Subject(s)
Metadata , Records , Amino Acid Sequence , Databases, Protein , Computer Simulation
14.
Protein Sci ; 31(11): e4466, 2022 11.
Article in English | MEDLINE | ID: mdl-36210722

ABSTRACT

Intrinsically disordered regions (IDRs) defying the traditional protein structure-function paradigm have been difficult to analyze. The availability of accurate structure predictions on a large scale in AlphaFoldDB offers a fresh perspective on IDR prediction. Here, we establish three baselines for IDR prediction from AlphaFoldDB models based on the recent CAID dataset. Surprisingly, AlphaFoldDB is highly competitive for predicting both IDRs and conditionally folded binding regions, demonstrating the plasticity of the disorder to structure continuum.


Subject(s)
Intrinsically Disordered Proteins , Protein Conformation , Intrinsically Disordered Proteins/chemistry , Protein Folding
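The AlphaFoldDB entry above refers to baselines for IDR prediction without spelling them out in the abstract. One simple way to derive a disorder signal from a predicted model, shown here purely as an assumed illustration, is to invert the per-residue pLDDT confidence and threshold it. The function name, the `1 - pLDDT/100` score, and the 0.5 threshold are assumptions and are not claimed to match the baselines of the paper.

```python
def disorder_from_plddt(plddt, threshold=0.5):
    """Turn per-residue pLDDT values (0-100) into disorder scores and regions.

    Score = 1 - pLDDT / 100, so low-confidence residues score high.
    Regions are returned as 1-based inclusive (start, end) tuples.
    The threshold is an illustrative assumption.
    """
    scores = [1.0 - value / 100.0 for value in plddt]
    regions = []
    start = None
    for index, score in enumerate(scores):
        if score >= threshold and start is None:
            start = index  # a disordered stretch opens here
        elif score < threshold and start is not None:
            regions.append((start + 1, index))  # stretch ended at the previous residue
            start = None
    if start is not None:
        regions.append((start + 1, len(scores)))  # stretch runs to the C-terminus
    return scores, regions


# Hypothetical pLDDT values for a seven-residue toy sequence.
toy_plddt = [95, 90, 40, 30, 35, 92, 20]
toy_scores, toy_regions = disorder_from_plddt(toy_plddt)
```

A low pLDDT can reflect modelling uncertainty as well as genuine disorder, so a score like this is better read as "absence of confident order" than as direct evidence of an IDR.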
15.
Curr Protoc ; 2(7): e484, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35789137

ABSTRACT

DisProt is the major repository of manually curated data for intrinsically disordered proteins collected from the literature. Although lacking a stable three-dimensional structure under physiological conditions, intrinsically disordered proteins carry out a plethora of biological functions, some of them directly arising from their flexible nature. A growing number of scientific studies have been published during the last few decades to shed light on their unstructured state, their binding modes, and their functions. DisProt makes use of a team of expert biocurators to provide up-to-date annotations of intrinsically disordered proteins from the literature, making them available to the scientific community. Here we present a comprehensive description of how to use DisProt in different contexts and provide a detailed explanation of how to explore and interpret manually curated annotations of intrinsically disordered proteins. We describe how to search DisProt annotations, both using the web interface and the API for programmatic access. Finally, we explain how to visualize and interpret a DisProt entry, the SARS-CoV-2 Nucleoprotein, characterized by the presence of unstructured N-terminal and C-terminal regions and a flexible linker. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC.

Basic Protocol 1: Performing a search in DisProt
Support Protocol 1: Downloading options
Support Protocol 2: Programmatic access with DisProt REST API
Basic Protocol 2: Exploring the DisProt Ontology page
Basic Protocol 3: Visualizing and interpreting DisProt entries: the SARS-CoV-2 Nucleoprotein use case


Subject(s)
COVID-19 , Intrinsically Disordered Proteins , Humans , Nucleoproteins , SARS-CoV-2
17.
Nucleic Acids Res ; 50(W1): W651-W656, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35554554

ABSTRACT

Residue interaction networks (RINs) are used to represent residue contacts in protein structures. Thanks to the advances in network theory, RINs have been proved effective as an alternative to coordinate data in the analysis of complex systems. The RING server calculates high quality and reliable non-covalent molecular interactions based on geometrical parameters. Here, we present the new RING 3.0 version extending the previous functionality in several ways. The underlying software library has been re-engineered to improve speed by an order of magnitude. RING now also supports the mmCIF format and provides typed interactions for the entire PDB chemical component dictionary, including nucleic acids. Moreover, RING now employs probabilistic graphs, where multiple conformations (e.g. NMR or molecular dynamics ensembles) are mapped as weighted edges, opening up new ways to analyze structural data. The web interface has been expanded to include a simultaneous view of the RIN alongside a structure viewer, with both synchronized and clickable. Contact evolution across models (or time) is displayed as a heatmap and can help in the discovery of correlating interaction patterns. The web server, together with an extensive help and tutorial, is available from URL: https://ring.biocomputingup.it/.


Subject(s)
Proteins , Software , Internet , Molecular Dynamics Simulation , Protein Conformation , Proteins/chemistry , Probability
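The RING 3.0 abstract above describes probabilistic graphs in which multiple conformations are mapped as weighted edges. A minimal sketch of that idea, assuming the weight is simply the fraction of models in which a contact is observed, could look as follows. The function name `weighted_edges` and the toy contact sets are hypothetical, and RING's own weighting may differ.

```python
from collections import Counter


def weighted_edges(contacts_per_model):
    """Map each contact to the fraction of models in which it occurs.

    `contacts_per_model` is a list with one collection of residue pairs
    per conformation (for example per NMR model or trajectory frame).
    """
    n_models = len(contacts_per_model)
    if n_models == 0:
        return {}
    # set() guards against a pair being listed twice within one model.
    counts = Counter(edge for contacts in contacts_per_model for edge in set(contacts))
    return {edge: count / n_models for edge, count in counts.items()}


# Four hypothetical conformations of the same toy structure.
toy_models = [
    {("A1", "A2"), ("A2", "A3")},
    {("A1", "A2")},
    {("A1", "A2"), ("A3", "A4")},
    set(),
]
weights = weighted_edges(toy_models)
```

Keeping the per-model contact sets alongside the weights also allows contact presence to be laid out model by model, in the spirit of the heatmap view the abstract describes.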
18.
Acta Crystallogr D Struct Biol ; 78(Pt 2): 144-151, 2022 Feb 01.
Article in English | MEDLINE | ID: mdl-35102880

ABSTRACT

Intrinsically disordered regions (IDRs) lacking a fixed three-dimensional protein structure are widespread and play a central role in cell regulation. Only a small fraction of IDRs have been functionally characterized, with heterogeneous experimental evidence that is largely buried in the literature. Predictions of IDRs are still difficult to estimate and are poorly characterized. Here, an overview of the publicly available knowledge about IDRs is reported, including manually curated resources, deposition databases and prediction repositories. The types, scopes and availability of the various resources are analyzed, and their complementarity and overlap are highlighted. The volume of information included and the relevance to the field of structural biology are compared.


Subject(s)
Intrinsically Disordered Proteins , Databases, Factual , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Protein Conformation
19.
Biomolecules ; 12(1), 2022 01 06.
Article in English | MEDLINE | ID: mdl-35053240

ABSTRACT

Biomolecular condensates challenge the classical concepts of molecular recognition. The variable composition and heterogeneous conformations of liquid-like protein droplets are bottlenecks for high-resolution structural studies. To obtain atomistic insights into the organization of these assemblies, here we have characterized the conformational ensembles of specific disordered complexes, including those of droplet-driving proteins. First, we found that these specific complexes exhibit a high degree of conformational heterogeneity. Second, we found that residues forming contacts at the interface also sample many conformations. Third, we found that different patterns of contacting residues form the specific interface. In addition, we observed a wide range of sequence motifs mediating disordered interactions, including charged, hydrophobic and polar contacts. These results demonstrate that selective recognition can be realized by variable patterns of weakly defined interaction motifs in many different binding configurations. We propose that these principles also play roles in determining the selectivity of biomolecular condensates.


Subject(s)
Intrinsically Disordered Proteins/chemistry , Protein Conformation
20.
Bioinformatics ; 38(4): 1129-1130, 2022 01 27.
Article in English | MEDLINE | ID: mdl-34788797

ABSTRACT

SUMMARY: Biological data is ever-increasing in amount and complexity. The mapping of this data to biological entities such as nucleotide and amino acid sequences supports biological data analysis, classification and prediction. Sequence alignments and comparison allow the transfer of knowledge to evolutionary-related entities, the mapping of functional domains, and the identification of binding and modification sites. To support these types of studies, we developed ProSeqViewer, a tool to visualize annotation on single sequences and multiple sequence alignments. This state-of-the-art multifunctional library was developed as a modular component to be integrated into static or dynamic web resources and support intuitive visualization of sequence features. ProSeqViewer is extremely lightweight, fast, interactive, dynamic, responsive and works at any screen size. It generates pure HTML which is compatible with any browser and operating system. ProSeqViewer can exchange events with other visualization components and is already used by multiple biological databases. AVAILABILITY AND IMPLEMENTATION: ProSeqViewer is an open-source TypeScript library compatible with state-of-the-art website environments. The source code and extensive documentation including use cases are available from the URL: https://github.com/BioComputingUP/ProSeqViewer.


Subject(s)
Software , Gene Library , Sequence Alignment , Amino Acid Sequence
...