1.
Bioinformatics ; 29(9): 1120-6, 2013 May 01.
Article in English | MEDLINE | ID: mdl-23505299

ABSTRACT

MOTIVATION: Peptides play important roles in signalling, regulation and immunity within an organism. Many have successfully been used as therapeutic products, often mimicking naturally occurring peptides. Here we present PeptideLocator for the automated prediction of functional peptides in a protein sequence. RESULTS: We have trained a machine learning algorithm to predict bioactive peptides within protein sequences. PeptideLocator performs well on training data, achieving an area under the curve of 0.92 when tested in 5-fold cross-validation on a set of 2202 redundancy-reduced, peptide-containing protein sequences. It has predictive power when applied to antimicrobial peptides, cytokines, growth factors, peptide hormones, toxins, venoms and other peptides. It can be applied to refine the choice of experimental investigations in functional studies of proteins. AVAILABILITY AND IMPLEMENTATION: PeptideLocator is freely available for academic users at http://bioware.ucd.ie/.


Subject(s)
Algorithms; Peptides/chemistry; Sequence Analysis, Protein/methods; Antimicrobial Cationic Peptides/chemistry; Artificial Intelligence; Peptides/classification; Proteins/chemistry
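
The evaluation reported in the abstract above (area under the curve in 5-fold cross-validation) can be illustrated with a minimal sketch. This is not PeptideLocator's code: the function names `roc_auc` and `k_fold_indices` are hypothetical, the AUC is computed by plain pairwise comparison, and the folds are simple contiguous splits chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k contiguous folds; return (train, test) pairs."""
    folds = [list(range(i * n // k, (i + 1) * n // k)) for i in range(k)]
    return [
        ([i for j, f in enumerate(folds) if j != t for i in f], folds[t])
        for t in range(k)
    ]
```

In a cross-validation run, a model would be fitted on each `train` index list and scored on the matching `test` list, with `roc_auc` applied to the held-out scores.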
2.
Nucleic Acids Res ; 40(Database issue): D242-51, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22110040

ABSTRACT

Linear motifs are short, evolutionarily plastic components of regulatory proteins and provide low-affinity interaction interfaces. These compact modules play central roles in mediating every aspect of the regulatory functionality of the cell. They are particularly prominent in mediating cell signaling, controlling protein turnover and directing protein localization. Given their importance, our understanding of motifs is surprisingly limited, largely as a result of the difficulty of discovery, both experimentally and computationally. The Eukaryotic Linear Motif (ELM) resource at http://elm.eu.org provides the biological community with a comprehensive database of known experimentally validated motifs, and an exploratory tool to discover putative linear motifs in user-submitted protein sequences. The current update of the ELM database comprises 1800 annotated motif instances representing 170 distinct functional classes, including approximately 500 novel instances and 24 novel classes. Several older motif class entries have also been revisited, improving annotation and adding novel instances. Furthermore, the addition of full-text search capabilities, an enhanced interface and simplified batch download has improved the overall accessibility of the ELM data. The motif discovery portion of the ELM resource has added conservation, and structural attributes have been incorporated to help users discriminate biologically relevant motifs from stochastically occurring non-functional instances.


Subject(s)
Amino Acid Motifs; Databases, Protein; Computer Graphics; Disease/genetics; Eukaryota; Sequence Analysis, Protein; User-Computer Interface; Viral Proteins/chemistry
3.
BMC Bioinformatics ; 14: 224, 2013 Jul 15.
Article in English | MEDLINE | ID: mdl-23855714

ABSTRACT

BACKGROUND: Computational protein short linear motif discovery can use protein interaction information to search for motifs among proteins which share a common interactor. Cytoscape provides a visual interface for protein networks, but there is no streamlined way to rapidly visualize motifs in a network of proteins, or to integrate computational discovery with such visualizations. RESULTS: We present SLiMScape, a Cytoscape plugin, which enables both de novo motif discovery and searches for instances of known motifs. Data is presented using Cytoscape's visualization features, thus providing an intuitive interface for interpreting results. The distribution of discovered or user-defined motifs may be selectively displayed and the distribution of protein domains may be viewed simultaneously. To facilitate this, SLiMScape automatically retrieves domains for each protein. CONCLUSION: SLiMScape provides a platform for performing short linear motif analyses of protein interaction networks by integrating motif discovery and search tools in a network visualization environment. This significantly aids in the discovery of novel short linear motifs and in visualizing the distribution of known motifs.


Subject(s)
Amino Acid Motifs; Software; Protein Interaction Maps; Sequence Analysis, Protein
4.
Nucleic Acids Res ; 39(Web Server issue): W56-60, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21622654

ABSTRACT

Short, linear motifs (SLiMs) play a critical role in many biological processes. The SLiMSearch 2.0 (Short, Linear Motif Search) web server allows researchers to identify occurrences of a user-defined SLiM in a proteome, using conservation and protein disorder context statistics to rank occurrences. User-friendly output and visualizations of motif context allow the user to quickly gain insight into the validity of a putatively functional motif occurrence. For each motif occurrence, overlapping UniProt features and annotated SLiMs are displayed. Visualization also includes annotated multiple sequence alignments surrounding each occurrence, showing conservation and protein disorder statistics in addition to known and predicted SLiMs, protein domains and known post-translational modifications. In addition, enrichment of Gene Ontology terms and protein interaction partners is provided as an indicator of possible motif function. All web server results are available for download. Users can search motifs against the human proteome or a subset thereof defined by UniProt accession numbers or GO term. The SLiMSearch server is available at: http://bioware.ucd.ie/slimsearch2.html.


Subject(s)
Amino Acid Motifs; Software; Algorithms; Humans; Internet; Proteomics; User-Computer Interface
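
The first step described in the SLiMSearch abstract above, finding occurrences of a user-defined motif in a set of protein sequences, can be sketched with a regular expression scan. This is only an illustration and not the server's implementation: the function name `find_motif` and the dictionary input format are assumptions, and the conservation and disorder statistics used for ranking are not modelled here.

```python
import re
from typing import Dict, List, Tuple


def find_motif(pattern: str, proteome: Dict[str, str]) -> List[Tuple[str, int, str]]:
    """Return (protein_id, 1-based start, matched text) for every occurrence.

    The pattern is wrapped in a lookahead so overlapping occurrences are reported too.
    """
    lookahead = re.compile(f"(?=({pattern}))")
    hits = []
    for pid, seq in proteome.items():
        for m in lookahead.finditer(seq):
            hits.append((pid, m.start() + 1, m.group(1)))
    return hits
```

For example, `find_motif("P..P", {"protA": "MAPSSPAPTTP"})` reports the matches `PSSP` at position 3 and `PTTP` at position 8 of the toy sequence.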
5.
BMC Bioinformatics ; 13: 104, 2012 May 18.
Article in English | MEDLINE | ID: mdl-22607209

ABSTRACT

BACKGROUND: Short linear protein motifs are attracting increasing attention as functionally independent sites, typically 3-10 amino acids in length, that are enriched in disordered regions of proteins. Multiple methods have recently been proposed to discover over-represented motifs within a set of proteins based on simple regular expressions. Here, we extend these approaches to profile-based methods, which provide a richer motif representation. RESULTS: The profile motif discovery method MEME performed relatively poorly for motifs in disordered regions of proteins. However, when we applied evolutionary weighting to account for redundancy amongst homologous proteins, and masked out poorly conserved regions of disordered proteins, the performance of MEME was equivalent to that of regular expression methods. However, the two approaches returned different subsets within both a benchmark dataset and a more realistic discovery dataset. CONCLUSIONS: Profile-based motif discovery methods complement regular expression based methods. Whilst profile-based methods are computationally more intensive, they are likely to discover motifs currently overlooked by regular expression methods.


Subject(s)
Amino Acid Motifs; Sequence Analysis, Protein/methods; Conserved Sequence; Databases, Protein; Humans; Protein Structure, Tertiary; Proteins/chemistry
6.
Mol Cell Proteomics ; 9(1): 1-10, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19674966

ABSTRACT

Protein affinity reagents (PARs), most commonly antibodies, are essential reagents for protein characterization in basic research, biotechnology, and diagnostics as well as the fastest growing class of therapeutics. Large numbers of PARs are available commercially; however, their quality is often uncertain. In addition, currently available PARs cover only a fraction of the human proteome, and their cost is prohibitive for proteome scale applications. This situation has triggered several initiatives involving large scale generation and validation of antibodies, for example the Swedish Human Protein Atlas and the German Antibody Factory. Antibodies targeting specific subproteomes are being pursued by members of the Human Proteome Organisation (plasma and liver proteome projects) and the United States National Cancer Institute (cancer-associated antigens). ProteomeBinders, a European consortium, aims to set up a resource of consistently quality-controlled protein-binding reagents for the whole human proteome. An ultimate PAR database resource would allow consumers to visit one on-line warehouse and find all available affinity reagents from different providers together with documentation that facilitates easy comparison of their cost and quality. However, in contrast to, for example, nucleotide databases among which data are synchronized between the major data providers, current PAR producers, quality control centers, and commercial companies all use incompatible formats, hindering data exchange. Here we propose Proteomics Standards Initiative (PSI)-PAR as a global community standard format for the representation and exchange of protein affinity reagent data. The PSI-PAR format is maintained by the Human Proteome Organisation PSI and was developed within the context of ProteomeBinders by building on a mature proteomics standard format, PSI-molecular interaction, which is a widely accepted and established community standard for molecular interaction data. Further information and documentation are available on the PSI-PAR web site.


Subject(s)
Databases, Protein/standards; Proteome/analysis; Database Management Systems/standards; Humans; International Cooperation; Proteomics/methods; Terminology as Topic
7.
Nucleic Acids Res ; 38(Web Server issue): W534-9, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20497999

ABSTRACT

Short, linear motifs (SLiMs) play a critical role in many biological processes, particularly in protein-protein interactions. The Short, Linear Motif Finder (SLiMFinder) web server is a de novo motif discovery tool that identifies statistically over-represented motifs in a set of protein sequences, accounting for the evolutionary relationships between them. Motifs are returned with an intuitive P-value that greatly reduces the problem of false positives and is accessible to biologists of all disciplines. Input can be uploaded by the user or extracted directly from UniProt. Numerous masking options give the user great control over the contextual information to be included in the analyses. The SLiMFinder server combines these with user-friendly output and visualizations of motif context to allow the user to quickly gain insight into the validity of a putatively functional motif. These visualizations include alignments of motif occurrences, alignments of motifs and their homologues and a visual schematic of the top-ranked motifs. Returned motifs can also be compared with known SLiMs from the literature using CompariMotif. All results are available for download. The SLiMFinder server is available at: http://bioware.ucd.ie/slimfinder.html.


Subject(s)
Amino Acid Motifs; Software; Algorithms; Internet; Sequence Alignment; Sequence Analysis, Protein; User-Computer Interface
8.
Nucleic Acids Res ; 38(Database issue): D167-80, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19920119

ABSTRACT

Linear motifs are short segments of multidomain proteins that provide regulatory functions independently of protein tertiary structure. Much of intracellular signalling passes through protein modifications at linear motifs. Many thousands of linear motif instances, most notably phosphorylation sites, have now been reported. Although clearly very abundant, linear motifs are difficult to predict de novo in protein sequences due to the difficulty of obtaining robust statistical assessments. The ELM resource at http://elm.eu.org/ provides an expanding knowledge base, currently covering 146 known motifs, with annotation that includes >1300 experimentally reported instances. ELM is also an exploratory tool for suggesting new candidates of known linear motifs in proteins of interest. Information about protein domains, protein structure and native disorder, cellular and taxonomic contexts is used to reduce or deprecate false positive matches. Results are graphically displayed in a 'Bar Code' format, which also displays known instances from homologous proteins through a novel 'Instance Mapper' protocol based on PHI-BLAST. ELM server output provides links to the ELM annotation as well as to a number of remote resources. Using the links, researchers can explore the motifs, proteins, complex structures and associated literature to evaluate whether candidate motifs might be worth experimental investigation.


Subject(s)
Amino Acid Motifs/genetics; Computational Biology/methods; Databases, Genetic; Databases, Nucleic Acid; Eukaryotic Cells/chemistry; Amino Acid Sequence; Animals; Computational Biology/trends; Databases, Protein; Humans; Information Storage and Retrieval/methods; Internet; Molecular Sequence Data; Protein Structure, Tertiary; Sequence Homology, Amino Acid; Software
9.
J Proteome Res ; 9(7): 3759-63, 2010 Jul 02.
Article in English | MEDLINE | ID: mdl-20496950

ABSTRACT

Antibodies are a primary research tool for a diverse range of experiments in biology, from development to pathology. Their utility is derived from their ability to specifically identify proteins at a high level of sensitivity. This diversity of experimental requirements stretches the capabilities of these key research reagents. However, antibodies seem well placed to answer the challenges of the forthcoming proteome-scale biology. Their use in such a wide variety of experimental requirements impacts on the choice of epitope used to raise the antibody. Understanding the constraints imposed by the experimental configuration is crucial to developing well-characterized affinity reagents. Their application to a wide range of biological fields and relatively low cost of manufacture have ensured that the demand for a resource of well-characterized antibodies will remain high and that they will be an important biological resource for the foreseeable future. This demand will only increase as the number of therapeutic targets continues to grow. Current tools to aid in the production of affinity reagents are disparate and not freely available. We present a freely available Web resource (http://epic.embl.de) for the proteomics community: the Epitope Choice Resource (EpiC) for the selection of epitopes and characterization of the target protein. It provides the community with a single Web-based portal for the exploration of epitopes on a target protein and connects over the Internet to a wide range of bioinformatic tools, ensuring that data being presented are up to date.


Subject(s)
Antibodies; Databases, Protein; Epitopes; Immunologic Techniques; Proteomics/methods; Software; Antibody Affinity; Consensus Sequence; Humans; Internet
10.
J Phys Condens Matter ; 21(3): 034106, 2009 Jan 21.
Article in English | MEDLINE | ID: mdl-21817251

ABSTRACT

We present the theory of thermal equivalence in the framework of the Peyrard-Bishop model and some of its anharmonic variants. The thermal equivalence gives rise to a melting index τ which maps closely the experimental DNA melting temperatures for short DNA sequences. We show that the efficient calculation of the melting index can be used to analyse the parameters of the Peyrard-Bishop model and propose an improved set of Morse potential parameters. With this new set we are able to calculate some of the experimental melting temperatures to ± 1.2 °C. We review some of the concepts of sequencing probe design and show how to use the melting index to explore the possibilities of gene coverage by tuning the model parameters.
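
For orientation, the Peyrard-Bishop model named in the abstract above is commonly written as a one-dimensional Hamiltonian in which each base pair n has a stretching coordinate y_n, an on-site Morse potential, and a harmonic stacking coupling to its neighbour. The form below is the usual textbook statement rather than something taken from the abstract; the symbols m, D_n, a_n and k are the conventional ones and are assumptions here, and the definition of the melting index τ is not reproduced because the abstract does not give it.

```latex
H = \sum_{n} \left[ \frac{p_n^{2}}{2m}
      + D_n \left( e^{-a_n y_n} - 1 \right)^{2}
      + \frac{k}{2} \left( y_n - y_{n-1} \right)^{2} \right]
```

The Morse depth D_n and width a_n are the sequence-dependent parameters (typically different for AT and CG base pairs), which is the sense in which the abstract speaks of a set of Morse potential parameters.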

11.
Front Biosci ; 13: 6580-603, 2008 May 01.
Article in English | MEDLINE | ID: mdl-18508681

ABSTRACT

It is now clear that a detailed picture of cell regulation requires a comprehensive understanding of the abundant short protein motifs through which signaling is channeled. The current body of knowledge has slowly accumulated through piecemeal experimental investigation of individual motifs in signaling. Computational methods contributed little to this process. A new generation of bioinformatics tools will aid the future investigation of motifs in regulatory proteins, and the disordered polypeptide regions in which they frequently reside. Allied to high throughput methods such as phosphoproteomics, signaling networks are becoming amenable to experimental deconstruction. In this review, we summarise the current state of linear motif biology, which uses low affinity interactions to create cooperative, combinatorial and highly dynamic regulatory protein complexes. The discrete deterministic properties implicit to these assemblies suggest that models for cell regulatory networks in systems biology should neither be overly dependent on stochastic nor on smooth deterministic approximations.


Subject(s)
Cell Physiological Phenomena; Signal Transduction; Animals; Endoplasmic Reticulum/physiology; Homeostasis; Mammals; Models, Biological; Proteins/physiology; Reproducibility of Results
12.
Nucleic Acids Res ; 33(19): e171, 2005 Nov 07.
Article in English | MEDLINE | ID: mdl-16275781

ABSTRACT

Several methods for ultra high-throughput DNA sequencing are currently under investigation. Many of these methods yield very short blocks of sequence information (reads). Here we report on an analysis showing the level of genome sequencing possible as a function of read length. It is shown that re-sequencing and de novo sequencing of the majority of a bacterial genome is possible with read lengths of 20-30 nt, and that reads of 50 nt can provide reconstructed contigs (a contiguous fragment of sequence data) of 1000 nt and greater that cover 80% of human chromosome 1.


Subject(s)
Genomics/methods; Sequence Analysis, DNA/methods; Chromosomes, Human, Pair 1; Feasibility Studies; Genome, Bacterial; Genome, Human; Genome, Viral; Humans
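
The dependence on read length discussed in the abstract above can be illustrated by counting how many reads of a given length occur exactly once in a sequence. This is a simplified sketch, not the analysis performed in the paper: the function name `unique_read_fraction` is hypothetical, only the forward strand is considered, and no contig reconstruction is attempted.

```python
from collections import Counter


def unique_read_fraction(genome: str, read_len: int) -> float:
    """Fraction of start positions whose read of length read_len occurs exactly once."""
    n_reads = len(genome) - read_len + 1
    if n_reads <= 0:
        raise ValueError("read length exceeds genome length")
    # Count every overlapping substring of the requested length.
    counts = Counter(genome[i:i + read_len] for i in range(n_reads))
    unique = sum(1 for c in counts.values() if c == 1)
    return unique / n_reads
```

On a toy sequence such as `"ACGTACGA"`, the fraction rises from 3/7 at length 2 to 1.0 at length 4, mirroring the general trend that longer reads are more likely to be unique.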
13.
PeerJ ; 2: e315, 2014.
Article in English | MEDLINE | ID: mdl-24711967

ABSTRACT

Many protein domains bind to short peptide sequences, called linear motifs. Data on their sequence specificities are sparse, which is why biologists usually resort to basic pattern searches to identify new putative binding sites for experimental follow-up. Most motifs have poor specificity, and prioritization of the matches is thus crucial when scanning a full proteome with a pattern. Here we present a generic method to prioritize motif occurrence predictions by using cellular contextual information. We take two parameters as input: the motif occurrences and one or more of the interacting domains. The potential hits are ranked based on how strongly the context network associates them with a protein containing one of the specified domains, which leads to an increased predictive performance. The method is available through a web interface at doremi.jensenlab.org, which allows for an easy application of the method. We show that this approach leads to improved predictions of binding partners for PDZ domains and the SUMO binding domain. This is consistent with the earlier observation that coupling sequence motifs with network information improves kinase-specific substrate predictions.

14.
Sci Signal ; 5(243): pe40, 2012 Sep 25.
Article in English | MEDLINE | ID: mdl-23012652

ABSTRACT

Interactions between short peptides within proteins and peptide-binding domains can trigger many important cell signaling processes, and their interactions are typically of modest affinity. A study showed that this modest affinity appears to be favored by evolution. The authors used phage display selection to discover "superbinder" Src Homology 2 (SH2) domains, which bound peptides with much stronger affinity than naturally occurring SH2 domains. These superbinder domains had strong biological effects, such as blocking cell signaling. Although the superbinders had higher affinity, this did not appear to reduce their specificity. In contrast, SH2-binding peptides from bacterial pathogens have evolved to exhibit promiscuity of binding to multiple SH2 domains, carried within effector proteins that subvert signaling upon entry into the mammalian cell. Because there are many potential peptide binders of the SH2 domain found in numerous human proteins, modest affinity not only may optimize transient signaling mediated by reversible interactions but also may minimize off-target deleterious binding effects. The stage is set for a more thorough evaluation of the specificity and off-target impact of both naturally occurring and artificial domains and peptides. This may help define both targets and reagents for therapeutic intervention in key signaling processes mediated by short peptides.


Subject(s)
Evolution, Molecular; Models, Biological; Peptides/metabolism; Protein Interaction Domains and Motifs/physiology; Signal Transduction/physiology; Animals; Humans; Protein Binding/physiology; src Homology Domains
15.
J Mol Biol ; 415(1): 193-204, 2012 Jan 06.
Article in English | MEDLINE | ID: mdl-22079048

ABSTRACT

Short linear motifs in proteins (typically 3-12 residues in length) play key roles in protein-protein interactions by frequently binding specifically to peptide binding domains within interacting proteins. Their tendency to be found in disordered segments of proteins has meant that they have often been overlooked. Here we present SLiMPred (short linear motif predictor), the first general de novo method designed to computationally predict such regions in protein primary sequences independent of experimentally defined homologs and interactors. The method applies machine learning techniques to predict new motifs based on annotated instances from the Eukaryotic Linear Motif database, as well as structural, biophysical, and biochemical features derived from the protein primary sequence. We have integrated these data sources and benchmarked the predictive accuracy of the method, and found that it performs equivalently to a predictor of protein binding regions in disordered regions, in addition to having predictive power for other classes of motif sites such as polyproline II helix motifs and short linear motifs lying in ordered regions. It will be useful in predicting peptides involved in potential protein associations and will aid in the functional characterization of proteins, especially of proteins lacking experimental information on structures and interactions. We conclude that, despite the diversity of motif sequences and structures, SLiMPred is a valuable tool for prioritizing potential interaction motifs in proteins.


Subject(s)
Amino Acid Motifs; Protein Interaction Domains and Motifs; Proteins/chemistry; Proteins/metabolism; Amino Acid Sequence; Artificial Intelligence; Binding Sites; Databases, Protein; Humans; Protein Binding; Protein Structure, Secondary; Protein Structure, Tertiary; Proteome/chemistry; Proteome/genetics; Sequence Alignment/methods; Sequence Analysis, Protein/methods
16.
PLoS One ; 7(10): e45012, 2012.
Article in English | MEDLINE | ID: mdl-23056189

ABSTRACT

The conventional wisdom is that certain classes of bioactive peptides have specific structural features that endow their particular functions. Accordingly, predictions of bioactivity have focused on particular subgroups, such as antimicrobial peptides. We hypothesized that bioactive peptides may share more general features, and assessed this by contrasting the predictive power of existing antimicrobial predictors as well as a novel general predictor, PeptideRanker, across different classes of peptides. We observed that existing antimicrobial predictors had reasonable predictive power to identify peptides of certain other classes, i.e. toxin and venom peptides. We trained two general predictors of peptide bioactivity, one focused on short peptides (4-20 amino acids) and one focused on long peptides (> 20 amino acids). These general predictors had performance that was typically as good as, or better than, that of specific predictors. We noted some striking differences in the features of short peptide and long peptide predictions; in particular, high scoring short peptides favour phenylalanine. This is consistent with the hypothesis that short and long peptides have different functional constraints, perhaps reflecting the difficulty for typical short peptides in supporting independent tertiary structure. We conclude that there are general shared features of bioactive peptides across different functional classes, indicating that computational prediction may accelerate the discovery of novel bioactive peptides and aid in the improved design of existing peptides, across many functional classes. An implementation of the predictive method, PeptideRanker, may be used to identify among a set of peptides those that may be more likely to be bioactive.


Subject(s)
Algorithms; Computational Biology/methods; Drug Design; Peptides/chemistry; Amino Acid Sequence; Animals; Anti-Infective Agents/chemistry; Databases, Factual; Humans; Peptide Hormones/chemistry; Reproducibility of Results; Toxins, Biological/chemistry
17.
PLoS One ; 3(6): e2500, 2008 Jun 18.
Article in English | MEDLINE | ID: mdl-18563203

ABSTRACT

BACKGROUND: Sequencing by hybridisation (SBH) is an effective method for obtaining large amounts of DNA sequence information at low cost. The efficiency of SBH depends on the design of the probe library to provide the maximum information for minimum cost. Long probes provide a higher probability of non-repeated sequences but lead to an increase in the number of probes required, whereas short probes may not provide unique sequence information due to repeated sequences. We have investigated the effect of probe length, use of reference sequences, and thermal filtering on the design of probe libraries for several highly variable target DNA sequences. RESULTS: We designed overlapping probe libraries for a range of highly variable drug target genes based on known sequence information and developed a formal terminology to describe probe library design. We find that for some targets these libraries can provide good coverage of a previously unseen target whereas for others the coverage is less than 30%. The optimal probe length varies from as short as 12 nt to as long as 19 nt and depends on the sequence, its variability, and the stringency of thermal filtering. It cannot be determined from inspection of an example gene sequence. CONCLUSIONS: Optimal probe length and the optimal number of reference sequences used to design a probe library are highly target specific for highly variable sequencing targets. The optimum design cannot be determined simply by inspection of input sequences or of alignments but only by detailed analysis of each specific target. For highly variable sequences, shorter probes can in some cases provide better information than longer probes. Probe library design would benefit from a general purpose tool for analysing these issues. The formal terminology developed here and the analysis approaches it is used to describe will contribute to the development of such tools.


Subject(s)
Molecular Probes; HIV/genetics; Hepacivirus/genetics; Orthomyxoviridae/genetics; Polymorphism, Single Nucleotide; Sequence Analysis, DNA
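
The idea of an overlapping probe library and its coverage of a previously unseen target, as described in the abstract above, can be sketched as follows. This is an illustrative simplification and not the authors' method or terminology: the function names `probe_library` and `coverage` are hypothetical, probes are taken at every offset of the reference sequences, matching is exact, and thermal filtering is not modelled.

```python
from typing import Iterable, Set


def probe_library(references: Iterable[str], probe_len: int) -> Set[str]:
    """All overlapping probes (offset step 1) of length probe_len from the references."""
    return {
        ref[i:i + probe_len]
        for ref in references
        for i in range(len(ref) - probe_len + 1)
    }


def coverage(target: str, library: Set[str], probe_len: int) -> float:
    """Fraction of target positions covered by at least one exactly matching probe."""
    covered = [False] * len(target)
    for i in range(len(target) - probe_len + 1):
        if target[i:i + probe_len] in library:
            # Mark every position spanned by this matching probe.
            for j in range(i, i + probe_len):
                covered[j] = True
    return sum(covered) / len(target) if target else 0.0
```

Repeating the `coverage` calculation over a range of `probe_len` values and different sets of reference sequences gives a rough feel for why the best probe length is target specific.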