Results 1 - 5 of 5
1.
Nucleic Acids Res ; 43(Database issue): D1064-70, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25348399

ABSTRACT

HAMAP (High-quality Automated and Manual Annotation of Proteins, available at http://hamap.expasy.org/) is a system for the automatic classification and annotation of protein sequences. HAMAP provides annotation of the same quality and detail as UniProtKB/Swiss-Prot, using manually curated profiles for protein sequence family classification and expert curated rules for functional annotation of family members. HAMAP data and tools are made available through our website and as part of the UniRule pipeline of UniProt, providing annotation for millions of unreviewed sequences of UniProtKB/TrEMBL. Here we report on the growth of HAMAP and updates to the HAMAP system since our last report in the NAR Database Issue of 2013. We continue to augment HAMAP with new family profiles and annotation rules as new protein families are characterized and annotated in UniProtKB/Swiss-Prot; the latest version of HAMAP (as of 3 September 2014) contains 1983 family classification profiles and 1998 annotation rules (up from 1780 and 1720). We demonstrate how the complex logic of HAMAP rules allows for precise annotation of individual functional variants within large homologous protein families. We also describe improvements to our web-based tool HAMAP-Scan which simplify the classification and annotation of sequences, and the incorporation of an improved sequence-profile search algorithm.


Subject(s)
Databases, Protein ; Molecular Sequence Annotation ; Sequence Homology, Amino Acid ; Humans ; Internet ; Proteins/classification
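
The abstract above mentions rules whose conditional logic separates functional variants within one family. The following Python sketch illustrates that general idea only. It is not the HAMAP rule format: the `Protein`, `Case` and `Rule` classes, the family name `EXAMPLE_FAM` and all annotation values are hypothetical and invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Protein:
    """Hypothetical minimal protein record; not the UniProtKB data model."""
    accession: str
    sequence: str
    family: str   # family assigned by a profile match (assumed already computed)
    kingdom: str  # for example "Bacteria" or "Archaea"


@dataclass
class Case:
    """One conditional branch: if `condition` holds, add `annotations`."""
    condition: Callable[[Protein], bool]
    annotations: Dict[str, str]


@dataclass
class Rule:
    """Hypothetical rule: annotations shared by a family plus conditional cases."""
    family: str
    common: Dict[str, str]
    cases: List[Case] = field(default_factory=list)

    def annotate(self, protein: Protein) -> Dict[str, str]:
        # A rule only fires for members of its own family.
        if protein.family != self.family:
            return {}
        result = dict(self.common)
        for case in self.cases:
            if case.condition(protein):
                result.update(case.annotations)
        return result


def has_residue(position: int, residue: str) -> Callable[[Protein], bool]:
    """Build a condition that checks a 1-based sequence position."""
    def check(p: Protein) -> bool:
        return len(p.sequence) >= position and p.sequence[position - 1] == residue
    return check


# Illustrative rule: every value below is made up.
rule = Rule(
    family="EXAMPLE_FAM",
    common={"protein_name": "Example family protein"},
    cases=[
        Case(has_residue(3, "C"), {"active_site": "Cys-3"}),
        Case(lambda p: p.kingdom == "Archaea", {"subunit": "Homodimer"}),
    ],
)

variant_a = Protein("X00001", "MKCLV", "EXAMPLE_FAM", "Bacteria")
variant_b = Protein("X00002", "MKSLV", "EXAMPLE_FAM", "Archaea")
print(rule.annotate(variant_a))  # shared name plus the active-site annotation
print(rule.annotate(variant_b))  # shared name plus the subunit annotation
```

Both proteins belong to the same family and receive the shared annotation, but each picks up only the conditional annotation whose test it passes.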
2.
Nucleic Acids Res ; 41(Database issue): D344-7, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23161676

ABSTRACT

PROSITE (http://prosite.expasy.org/) consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. It is complemented by ProRule, a collection of rules, which increases the discriminatory power of these profiles and patterns by providing additional information about functionally and/or structurally critical amino acids. PROSITE signatures, together with ProRule, are used for the annotation of domains and features of UniProtKB/Swiss-Prot entries. Here, we describe recent developments that allow users to perform whole-proteome annotation as well as a number of filtering options that can be combined to perform powerful targeted searches for biological discovery. The latest version of PROSITE (release 20.85, of 30 August 2012) contains 1308 patterns, 1039 profiles and 1041 ProRules.


Subject(s)
Amino Acid Motifs ; Databases, Protein ; Protein Structure, Tertiary ; Sequence Analysis, Protein ; Amino Acid Sequence ; Conserved Sequence ; Internet ; Molecular Sequence Annotation ; Proteins/chemistry ; Proteins/classification ; Proteome/chemistry
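
PROSITE patterns, mentioned in the abstract above, are written in a compact syntax that maps closely onto regular expressions. The sketch below is a minimal, hedged translator covering the common elements (`x`, `[...]`, `{...}`, repetition counts, and `<` / `>` anchors at the pattern ends). It is not the official scanning tool and does not handle every corner of the syntax, such as `>` inside square brackets. The function names and the test sequence are assumptions; the pattern string `N-{P}-[ST]-{P}` is the form commonly given for the N-glycosylation site.

```python
import re
from typing import List, Tuple


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):   # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):     # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repetition such as (3) or (2,4).
        repeat = ""
        match = re.fullmatch(r"(.+?)\((\d+(?:,\d+)?)\)", element)
        if match:
            element, repeat = match.group(1), "{" + match.group(2) + "}"
        if element == "x":            # any residue
            core = "."
        elif element.startswith("{") and element.endswith("}"):
            core = "[^" + element[1:-1] + "]"   # excluded residues
        else:
            core = element            # single residue or [ABC] class
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_matches(pattern: str, sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start, matched text) pairs, including overlapping hits."""
    # A lookahead lets the search report matches that overlap each other.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


print(prosite_to_regex("N-{P}-[ST]-{P}"))
# N[^P][ST][^P]
print(find_matches("N-{P}-[ST]-{P}", "MNGTANPSQNLSA"))
# [(2, 'NGTA'), (10, 'NLSA')]
```

The asparagine at position 6 is not reported because it is followed by proline, which the `{P}` element excludes.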
3.
Nucleic Acids Res ; 41(Database issue): D584-9, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193261

ABSTRACT

HAMAP (High-quality Automated and Manual Annotation of Proteins, available at http://hamap.expasy.org/) is a system for the classification and annotation of protein sequences. It consists of a collection of manually curated family profiles for protein classification, and associated annotation rules that specify annotations that apply to family members. HAMAP was originally developed to support the manual curation of UniProtKB/Swiss-Prot records describing microbial proteins. Here we describe new developments in HAMAP, including the extension of HAMAP to eukaryotic proteins, the use of HAMAP in the automated annotation of UniProtKB/TrEMBL, providing high-quality annotation for millions of protein sequences, and the future integration of HAMAP into a unified system for UniProtKB annotation, UniRule. HAMAP is continuously updated by expert curators with new family profiles and annotation rules as new protein families are characterized. The collection of HAMAP family classification profiles and annotation rules can be browsed and viewed on the HAMAP website, which also provides an interface to scan user sequences against HAMAP profiles.


Subject(s)
Databases, Protein ; Molecular Sequence Annotation ; Proteins/classification ; Eukaryota/genetics ; Internet
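
The abstract above describes scanning user sequences against family profiles. As a rough illustration of profile-based classification, the sketch below scores every window of a sequence against an ungapped position-specific scoring matrix and assigns the family when the best score reaches a cutoff. This is a deliberate simplification and an assumption: real HAMAP profiles are more expressive (they allow gaps, for instance), and the profile `TOY_FAM`, its scores, the mismatch penalty and the cutoff are all invented.

```python
from typing import Dict, List, Tuple

# One dict of residue scores per profile column (hypothetical toy format).
Profile = List[Dict[str, float]]


def best_window(profile: Profile, sequence: str,
                default: float = -1.0) -> Tuple[float, int]:
    """Return (best score, 1-based start) over all ungapped windows."""
    width = len(profile)
    best = (float("-inf"), 0)
    for start in range(len(sequence) - width + 1):
        # Residues missing from a column receive the `default` penalty.
        score = sum(column.get(sequence[start + i], default)
                    for i, column in enumerate(profile))
        if score > best[0]:
            best = (score, start + 1)
    return best


def classify(profiles: Dict[str, Profile], cutoffs: Dict[str, float],
             sequence: str) -> List[str]:
    """List the families whose profile scores at or above its cutoff."""
    return [name for name, profile in profiles.items()
            if best_window(profile, sequence)[0] >= cutoffs[name]]


# Toy three-column profile; every number here is made up.
profiles = {"TOY_FAM": [{"G": 2.0}, {"K": 2.0, "R": 1.0}, {"S": 2.0, "T": 2.0}]}
cutoffs = {"TOY_FAM": 5.0}

print(best_window(profiles["TOY_FAM"], "MAGKTL"))  # (6.0, 3)
print(classify(profiles, cutoffs, "MAGKTL"))       # ['TOY_FAM']
print(classify(profiles, cutoffs, "MAGRAL"))       # []
```

The second sequence contains a partial match (G-R-A scores 2.0) that stays below the cutoff, so it is left unclassified rather than annotated on weak evidence.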
4.
Nucleic Acids Res ; 36(Database issue): D245-9, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18003654

ABSTRACT

PROSITE consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. It is complemented by ProRule, a collection of rules based on profiles and patterns, which increases the discriminatory power of profiles and patterns by providing additional information about functionally and/or structurally critical amino acids. In this article, we describe the implementation of a new method to assign a status to pattern matches, the new PROSITE web page and a new approach to improve the specificity and sensitivity of PROSITE methods. The latest version of PROSITE (release 20.19 of 11 September 2007) contains 1319 patterns, 745 profiles and 764 ProRules. Over the past 2 years, about 200 domains have been added, and now 53% of UniProtKB/Swiss-Prot entries (release 54.2 of 11 September 2007) have a PROSITE match. PROSITE is available on the web at: http://www.expasy.org/prosite/.


Subject(s)
Databases, Protein ; Protein Structure, Tertiary ; Proteins/classification ; Amino Acids/chemistry ; Bacterial Proteins/chemistry ; Bacterial Proteins/classification ; Databases, Protein/history ; History, 20th Century ; History, 21st Century ; Internet ; Proteins/chemistry ; Sequence Alignment ; Sequence Analysis, Protein ; Software ; User-Computer Interface
5.
Gigascience ; 9(2), 2020 Feb 01.
Article in English | MEDLINE | ID: mdl-32034905

ABSTRACT

BACKGROUND: Genome and proteome annotation pipelines are generally custom built and not easily reusable by other groups. This leads to duplication of effort, increased costs, and suboptimal annotation quality. One way to address these issues is to encourage the adoption of annotation standards and technological solutions that enable the sharing of biological knowledge and tools for genome and proteome annotation. RESULTS: Here we demonstrate one approach to generate portable genome and proteome annotation pipelines that users can run without recourse to custom software. This proof of concept uses our own rule-based annotation pipeline HAMAP, which provides functional annotation for protein sequences to the same depth and quality as UniProtKB/Swiss-Prot, and the World Wide Web Consortium (W3C) standards Resource Description Framework (RDF) and SPARQL (a recursive acronym for the SPARQL Protocol and RDF Query Language). We translate complex HAMAP rules into the W3C standard SPARQL 1.1 syntax, and then apply them to protein sequences in RDF format using freely available SPARQL engines. This approach supports the generation of annotation that is identical to that generated by our own in-house pipeline, using standard, off-the-shelf solutions, and is applicable to any genome or proteome annotation pipeline. CONCLUSIONS: HAMAP SPARQL rules are freely available for download from the HAMAP FTP site, ftp://ftp.expasy.org/databases/hamap/sparql/, under the CC-BY-ND 4.0 license. The annotations generated by the rules are under the CC-BY 4.0 license. A tutorial and supplementary code to use HAMAP as SPARQL are available on GitHub at https://github.com/sib-swiss/HAMAP-SPARQL, and general documentation about HAMAP can be found on the HAMAP website at https://hamap.expasy.org.


Subject(s)
Genomics/methods ; Molecular Sequence Annotation/methods ; Sequence Analysis, DNA/methods ; Sequence Analysis, Protein/methods ; Software/standards ; Animals ; Genomics/standards ; Humans ; Molecular Sequence Annotation/standards ; Sequence Analysis, DNA/standards ; Sequence Analysis, Protein/standards
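
The abstract above describes expressing annotation rules as SPARQL queries applied to protein data in RDF. To show the underlying idea without depending on a SPARQL engine, the sketch below imitates a CONSTRUCT-style rule in plain Python: a list of triple patterns acts as the WHERE clause and a template produces new annotation triples for each solution. It is not real SPARQL and not a HAMAP rule. The `ex:` names, the data and the helper functions are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Roughly equivalent SPARQL 1.1, with a made-up `ex:` vocabulary:
#   PREFIX ex: <http://example.org/>
#   CONSTRUCT { ?protein ex:function ex:ExampleFunction }
#   WHERE     { ?protein ex:inFamily ex:FAM_A ; ex:kingdom ex:Bacteria . }


def match_pattern(pattern: Triple, triple: Triple,
                  binding: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):             # variable: bind it or check it
            if new.setdefault(term, value) != value:
                return None
        elif term != value:                  # constant: must be identical
            return None
    return new


def solve(patterns: List[Triple], triples: Set[Triple]) -> List[Dict[str, str]]:
    """Find every variable binding that satisfies all patterns together."""
    bindings: List[Dict[str, str]] = [{}]
    for pattern in patterns:
        bindings = [extended
                    for binding in bindings
                    for triple in triples
                    if (extended := match_pattern(pattern, triple, binding))
                    is not None]
    return bindings


def construct(where: List[Triple], template: List[Triple],
              triples: Set[Triple]) -> Set[Triple]:
    """Instantiate `template` once for each solution of `where`."""
    produced: Set[Triple] = set()
    for binding in solve(where, triples):
        for s, p, o in template:
            produced.add((binding.get(s, s), binding.get(p, p), binding.get(o, o)))
    return produced


data = {
    ("ex:P1", "ex:inFamily", "ex:FAM_A"),
    ("ex:P1", "ex:kingdom", "ex:Bacteria"),
    ("ex:P2", "ex:inFamily", "ex:FAM_A"),
    ("ex:P2", "ex:kingdom", "ex:Archaea"),
}
where = [("?protein", "ex:inFamily", "ex:FAM_A"),
         ("?protein", "ex:kingdom", "ex:Bacteria")]
template = [("?protein", "ex:function", "ex:ExampleFunction")]

print(construct(where, template, data))
# {('ex:P1', 'ex:function', 'ex:ExampleFunction')}
```

Only `ex:P1` satisfies both conditions, so it alone receives the annotation triple. An off-the-shelf SPARQL engine performs the same kind of pattern matching, which is what makes rules in this form portable between groups.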