1.
J Struct Biol ; 215(4): 108033, 2023 12.
Article in English | MEDLINE | ID: mdl-37797915

ABSTRACT

Identifying, classifying and curating tandem repeats in proteins is a complex process that requires manual work by experts, processing power and time. Recent advances in machine learning for protein structure prediction and repeat classification are useful for this process. However, no service brings together the databases and software needed to support research on repeat proteins. In this publication we present Daisy, an integrated repeat protein curation web service. The service can process Protein Data Bank (PDB) and AlphaFold Database entries to identify tandem repeats. In addition, it uses an algorithm to search a sequence against a library of Pfam hidden Markov models (HMMs). Repeat classifications are associated with the identified families through RepeatsDB. This prediction is used to improve the execution of the ReUPred algorithm and to speed up the identification of repeat units. The service can also process every PDB and AlphaFold structure associated with a UniProt proteome registry. Availability: The Daisy web service is freely accessible at daisy.bioinformatica.org.


Subject(s)
Algorithms , Software , Humans , Proteome , Protein Databases
2.
J Struct Biol ; 215(4): 108023, 2023 12.
Article in English | MEDLINE | ID: mdl-37652396

ABSTRACT

Tandem Repeat Proteins (TRPs) are a class of proteins with repetitive amino acid sequences that have been studied extensively for over two decades. Different features at the level of sequence, structure, function and evolution have been attributed to them by various authors. And yet many of their salient features appear only when looking at specific subclasses of protein tandem repeats. Here, we attempt to rationalize the existing knowledge on TRPs by pointing out several dichotomies. The emerging picture is more nuanced than generally assumed and allows us to draw some boundaries of what is not a "proper" TRP. We conclude with an operational definition of a specific subset, which we call STRPs (Structural Tandem Repeat Proteins), that separates a subclass of tandem repeats with distinctive features from several other less well-defined types of repeats. We believe that this definition will help researchers in the field to better characterize the biological meaning of this large yet largely understudied group of proteins.


Subject(s)
Proteins , Tandem Repeat Sequences , Proteins/genetics , Proteins/chemistry , Tandem Repeat Sequences/genetics , Amino Acid Sequence
3.
Bioinformatics ; 38(21): 4959-4961, 2022 10 31.
Article in English | MEDLINE | ID: mdl-36111870

ABSTRACT

SUMMARY: A collection of conformers that exist in a dynamical equilibrium defines the native state of a protein. The structural differences between them describe their conformational diversity, a defining characteristic of the protein with an essential role in multiple cellular processes. Since most proteins carry out their functions by assembling into complexes, we have developed CoDNaS-Q, the first online resource to explore conformational diversity in homooligomeric proteins. It features a curated collection of redundant protein structures with known quaternary structure. CoDNaS-Q integrates relevant annotations that allow researchers to identify and explore the extent of, and possible reasons for, conformational diversity in homooligomeric protein complexes. AVAILABILITY AND IMPLEMENTATION: CoDNaS-Q is freely accessible at http://ufq.unq.edu.ar/codnasq/ or https://codnas-q.bioinformatica.org/home. The data can be retrieved from the website. The source code of the database can be downloaded from https://github.com/SfrRonaldo/codnas-q.


Subject(s)
Proteins , Software , Proteins/chemistry , Protein Conformation , Factual Databases
4.
Bioinformatics ; 38(6): 1745-1748, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34954795

ABSTRACT

SUMMARY: Conformational changes in RNA native ensembles are central to fulfilling many of their biological roles. Systematic knowledge of the extent and possible modulators of this conformational diversity is desirable to better understand the relationship between RNA dynamics and function. We have developed CoDNaS-RNA as the first database of conformational diversity in RNA molecules. Known RNA structures are retrieved and clustered to identify alternative conformers of each molecule. Pairwise structural comparisons between all conformers within each cluster make it possible to measure the variability of the molecule. Additional annotations about structural features, molecular interactions and biological function are provided. All data in CoDNaS-RNA are free to download and available through a public website that may be of interest to researchers in computational biology and other life science disciplines. AVAILABILITY AND IMPLEMENTATION: The data underlying this article are available at http://ufq.unq.edu.ar/codnasrna or https://codnas-rna.bioinformatica.org/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
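
The pairwise comparison step described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each conformer is a list of (x, y, z) coordinates for the same atoms, already superimposed, and that RMSD is the comparison measure; the abstract does not name the measure or the alignment procedure, and the function names and toy coordinates below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def rmsd(a, b):
    """RMSD between two equally sized, already superimposed coordinate lists."""
    if len(a) != len(b) or not a:
        raise ValueError("conformers must have the same non-zero number of atoms")
    total = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return sqrt(total / len(a))


def pairwise_diversity(conformers):
    """Compare every pair of conformers in one cluster.

    Returns a dict {(id_i, id_j): rmsd} and the largest value found,
    used here as an assumed summary of the molecule's variability.
    """
    pairs = {
        (i, j): rmsd(conformers[i], conformers[j])
        for i, j in combinations(sorted(conformers), 2)
    }
    return pairs, max(pairs.values(), default=0.0)


# Toy cluster with three two-atom "conformers" (hypothetical data).
cluster = {
    "conf_A": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "conf_B": [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
    "conf_C": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
}
pairs, max_rmsd = pairwise_diversity(cluster)
```

A real pipeline would first superimpose the structures and match equivalent atoms; the sketch only shows the all-pairs loop and the reduction to a single variability value.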


Subject(s)
Computational Biology , RNA , Molecular Conformation , Software
5.
Nucleic Acids Res ; 49(D1): D452-D457, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33237313

ABSTRACT

The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and presents an extended classification scheme. The major conceptual change compared to the previous version is the hierarchical classification combining top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) requiring sequence similarity and describing repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.
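
The five-level hierarchy named in this abstract (Class > Topology > Fold > Clan > Family) can be modelled with a small sketch. The dotted code format such as "3.1.2.4.1" and both function names are assumptions made for illustration; they are not taken from the RepeatsDB interface.

```python
# Levels as listed in the abstract: three structural, two sequence-based.
STRUCTURAL_LEVELS = ("Class", "Topology", "Fold")
SEQUENCE_LEVELS = ("Clan", "Family")
LEVELS = STRUCTURAL_LEVELS + SEQUENCE_LEVELS


def parse_classification(code):
    """Map a dotted code (hypothetical format) onto the named levels.

    A code may stop early, e.g. "3.1.2" assigns only the structural levels.
    """
    parts = code.split(".")
    if not 1 <= len(parts) <= len(LEVELS) or not all(parts):
        raise ValueError(f"invalid classification code: {code!r}")
    return dict(zip(LEVELS, parts))


def shared_depth(code_a, code_b):
    """Name of the deepest level at which two codes still agree, or None."""
    deepest = None
    for level, a, b in zip(LEVELS, code_a.split("."), code_b.split(".")):
        if a != b:
            break
        deepest = level
    return deepest


entry = parse_classification("3.1.2")
common = shared_depth("3.1.2.4.1", "3.1.2.7.3")
```

Two entries that agree down to Fold but differ at Clan are structurally similar without the sequence similarity that the two lower levels require, which is the distinction the abstract draws.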


Subject(s)
Protein Databases , Proteins/chemistry , Amino Acid Repetitive Sequences , Tandem Repeat Sequences , Gene Ontology , HEK293 Cells , HeLa Cells , Humans , Reproducibility of Results , Statistics as Topic , User-Computer Interface
6.
Database (Oxford) ; 2020, 2020 01 01.
Article in English | MEDLINE | ID: mdl-32400867

ABSTRACT

Revenant is a database of resurrected proteins coming from extinct organisms. Currently, it contains a manually curated collection of 84 resurrected proteins derived from bibliographic data. Each protein is extensively annotated, including structural, biochemical and biophysical information. Revenant offers a browsing capability designed as a timeline from which the different proteins can be accessed. The oldest Revenant entries date from between 4200 and 3500 million years ago, while the youngest entries date from between 8.8 and 6.3 million years ago. These proteins have been resurrected using computational tools called ancestral sequence reconstruction techniques combined with wet-laboratory synthesis and expression. Resurrected proteins have been used increasingly in recent years to explore and test evolutionary hypotheses such as protein stability, to explore the origin of new functions, to gain biochemical insights into past metabolisms and to explore the specificity and promiscuous behaviour of ancient proteins.


Subject(s)
Protein Databases , Biological Extinction , Proteins , Molecular Evolution , Proteins/chemistry , Proteins/classification , Proteins/genetics , Proteins/metabolism
7.
Nucleic Acids Res ; 47(D1): D427-D432, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357350

ABSTRACT

The last few years have witnessed significant changes in Pfam (https://pfam.xfam.org). The number of families has grown substantially to a total of 17,929 in release 32.0. New additions have been coupled with efforts to improve existing families, including refinement of domain boundaries, their classification into Pfam clans, as well as their functional annotation. We recently began to collaborate with the RepeatsDB resource to improve the definition of tandem repeat families within Pfam. We carried out a significant comparison to the structural classification database, namely the Evolutionary Classification of Protein Domains (ECOD), that led to the creation of 825 new families based on their set of uncharacterized families (EUFs). Furthermore, we also connected Pfam entries to the Sequence Ontology (SO) through mapping of the Pfam type definitions to SO terms. Since Pfam has many community contributors, we recently enabled the linking of authorship of all Pfam entries to the corresponding authors' ORCID identifiers. This effectively permits authors to claim credit for their Pfam curation and link it to their ORCID record.


Subject(s)
Protein Databases , Proteins/classification , Molecular Sequence Annotation , Protein Domains , Proteins/chemistry , Amino Acid Repetitive Sequences
8.
Nucleic Acids Res ; 46(W1): W402-W407, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29746699

ABSTRACT

RepeatsDB-lite (http://protein.bio.unipd.it/repeatsdb-lite) is a web server for the prediction of repetitive structural elements and units in tandem repeat (TR) proteins. TRs are a widespread but poorly annotated class of non-globular proteins carrying heterogeneous functions. RepeatsDB-lite extends the prediction to all TR types and strongly improves the performance both in terms of computational time and accuracy over previous methods, with precision above 95% for solenoid structures. The algorithm exploits an improved TR unit library derived from the RepeatsDB database to perform an iterative structural search and assignment. The web interface provides tools for analyzing the evolutionary relationships between units and for manually refining the prediction by changing unit positions and protein classification. An all-against-all structure-based sequence similarity matrix is calculated and visualized in real time for every user edit. Reviewed predictions can be submitted to RepeatsDB for review and inclusion.


Subject(s)
Algorithms , Proteins/chemistry , Amino Acid Repetitive Sequences , Software , Protein Structural Homology , Binding Sites , Protein Databases , Humans , Internet , Ligands , Molecular Models , Molecular Sequence Annotation , Protein Binding , Protein Domains , Protein Secondary Structure , Time Factors
9.
J Struct Biol ; 201(2): 130-138, 2018 02.
Article in English | MEDLINE | ID: mdl-29017817

ABSTRACT

In recent years, a number of new protein structures that possess tandem repeats have emerged. Many of these proteins are composed of tandem arrays of β-hairpins. Today, the amount and variety of the data on these β-hairpin repeat (BHR) structures have reached a level that requires detailed analysis and further classification. In this paper, we classified the BHR proteins and compared their structures, repeat motif sequences, functions and distribution across the major taxonomic kingdoms of life and within organisms. As a result, we identified six different BHR folds in tandem repeat proteins of Class III (elongated structures) and one BHR fold (up-and-down β-barrel) in Class IV ("closed" structures). Our survey reveals the high incidence of the BHR proteins among bacteria and viruses and their possible relationship to the structures of amyloid fibrils. It indicates that BHR folds will be an attractive target for future structural studies, especially in the context of age-related amyloidosis and emerging infectious diseases. This work allowed us to update the RepeatsDB database, which contains annotated tandem repeat protein structures, and to construct sequence profiles based on BHR structural alignments.


Subject(s)
Protein Folding , Proteins/chemistry , Proteins/classification , Amino Acid Motifs , Amyloid/chemistry , Bacterial Proteins/chemistry , Protein Databases , Humans , Internet , Molecular Models , Prions/chemistry , Protein Conformation , Beta-Strand Protein Conformation , Amino Acid Repetitive Sequences , Viral Proteins/chemistry , Zinc/metabolism
11.
Nucleic Acids Res ; 45(D1): D308-D312, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899671

ABSTRACT

RepeatsDB 2.0 (URL: http://repeatsdb.bio.unipd.it/) is an update of the database of annotated tandem repeat protein structures. Repeat proteins are a widespread class of non-globular proteins carrying heterogeneous functions involved in several diseases. Here we provide a new version of RepeatsDB with an improved classification schema including high quality annotations for ∼5400 protein structures. RepeatsDB 2.0 features information on start and end positions for the repeat regions and units for all entries. The extensive growth of repeat unit characterization was possible by applying the novel ReUPred annotation method over the entire Protein Data Bank, with data quality guaranteed by an extensive manual validation for >60% of the entries. The updated web interface includes a new search engine for complex queries and a fully re-designed entry page for a better overview of structural data. It is now possible to compare unit positions, together with secondary structure, fold information and Pfam domains. Moreover, a new classification level has been introduced on top of the existing scheme as an independent layer for sequence similarity relationships at 40%, 60% and 90% identity.
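
The sequence similarity layer at 40%, 60% and 90% identity mentioned in this abstract can be sketched as follows. The identity definition used here (matches divided by the number of aligned, gap-free columns) and the function names are assumptions for illustration; the abstract does not state how identity is computed or how entries are grouped.

```python
# Identity thresholds named in the abstract, highest first.
THRESHOLDS = (90, 60, 40)


def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of two pre-aligned sequences.

    Assumes both strings come from the same pairwise alignment and use
    '-' for gaps; this denominator is one of several common conventions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def identity_levels(identity):
    """Thresholds that a pair with the given percent identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Hypothetical aligned fragments: 7 identical residues in 9 gap-free columns.
identity = percent_identity("ACDE-GHIKL", "ACDQFGHAKL")
levels = identity_levels(identity)
```

Under these assumptions the example pair would be related at the 40% and 60% levels but not at 90%.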


Subject(s)
Protein Databases , Amino Acid Repetitive Sequences , Animals , Protein Databases/statistics & numerical data , Humans , Proteins/classification , Software , Structure-Activity Relationship
12.
Amino Acids ; 48(6): 1391-400, 2016 06.
Article in English | MEDLINE | ID: mdl-26898549

ABSTRACT

Over the last decade, numerous studies have demonstrated the fundamental importance of tandem repeat (TR) proteins in many biological processes. A plethora of new repeat structures have also been solved. The recently published RepeatsDB provides information on TR proteins. However, a detailed structural characterization of repetitive elements is largely missing, as repeat unit annotation is manually curated and currently covers only 3 % of the bona fide TR proteins. Repeat Protein Unit Predictor (ReUPred) is a novel method for the fast automatic prediction of repeat units and repeat classification using an extensive Structure Repeat Unit Library (SRUL) derived from RepeatsDB. ReUPred uses an iterative structural search against the SRUL to find repetitive units. On a test set of solenoid proteins, ReUPred is able to correctly detect 92 % of the proteins. Unlike previous methods, it is also able to correctly classify solenoid repeats in 89 % of cases. It also outperforms two recent state-of-the-art methods for the repeat unit identification problem. The accurate prediction of repeat units increases the number of annotated repeat units by an order of magnitude compared to the sequence-based Pfam classification. ReUPred is implemented in Python for Linux and freely available from the URL: http://protein.bio.unipd.it/reupred/ .


Subject(s)
Peptide Library , Programming Languages , Amino Acid Repetitive Sequences/genetics , Protein Sequence Analysis/methods
13.
Bioinformatics ; 31(7): 1138-40, 2015 Apr 01.
Article in English | MEDLINE | ID: mdl-25414364

ABSTRACT

MOTIVATION: Protein sequence and structure representation and manipulation require dedicated software libraries to support methods of increasing complexity. Here, we describe the VIrtual Construction TOol for pRoteins (Victor) C++ library, an open source platform dedicated to enabling inexperienced users to develop advanced tools and to gathering contributions from the community. The provided application examples cover statistical energy potentials, profile-profile sequence alignments and ab initio loop modeling. Victor has been used over the last 15 years in several publications and has been optimized for efficiency. It is provided as a GitHub repository with source files and unit tests, plus extensive online documentation, including a Wiki with help files and tutorials, examples and Doxygen documentation. AVAILABILITY AND IMPLEMENTATION: The C++ library and online documentation, distributed under a GPL license, are available from URL: http://protein.bio.unipd.it/victor/.


Subject(s)
Protein Databases , Digital Libraries , Proteins/chemistry , Sequence Alignment/methods , Software , Computational Biology/methods , Humans , Protein Structural Homology