Results 1 - 14 of 14
1.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36519835

ABSTRACT

SUMMARY: Sequence homology is a basic concept in protein evolution, structure and function studies. However, few tools and services for homology searches are sensitive, accurate and fast at the same time. We present a new web server for protein analysis based on COMER2, a sequence alignment and homology search method that exhibits these characteristics. COMER2 has been upgraded since its last publication to improve its alignment quality and ease of use. We demonstrate how the user can benefit from using it by providing examples of extensive annotation of proteins of unknown function. Among the distinctive features of the web server is the user's ability to submit multiple queries with one click of a button. This and other features allow for transparently running homology searches (in a command-line, programmatic or graphical environment) across multiple databases with multiple queries. They also promote extensive simultaneous protein analysis at the sequence, structure and function levels. AVAILABILITY AND IMPLEMENTATION: The COMER web server is available at https://bioinformatics.lt/comer. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Protein Sequence Analysis, Software, Protein Sequence Analysis/methods, Computers, Proteins/chemistry, Sequence Alignment, Internet
2.
Bioinformatics ; 36(11): 3570-3572, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32167522

ABSTRACT

SUMMARY: Searching for homology in the vast amount of sequence data places particular emphasis on speed. We present a completely rewritten version of the sensitive homology search method COMER based on alignment of protein sequence profiles, which is capable of searching big databases even on a lightweight laptop. By harnessing the power of CUDA-enabled graphics processing units, it is up to 20 times faster than HHsearch, a state-of-the-art method using vectorized instructions on modern CPUs. AVAILABILITY AND IMPLEMENTATION: COMER2 is cross-platform open-source software available at https://sourceforge.net/projects/comer2 and https://github.com/minmarg/comer2. It can be easily installed from source code or using stand-alone installers. CONTACT: mindaugas.margelevicius@bti.vu.lt. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software, Amino Acid Sequence, Sequence Alignment
3.
BMC Bioinformatics ; 20(1): 419, 2019 Aug 13.
Article in English | MEDLINE | ID: mdl-31409275

ABSTRACT

BACKGROUND: Alignment of sequence families described by profiles provides a sensitive means for establishing homology between proteins and is important in protein evolutionary, structural, and functional studies. In the context of a steadily growing amount of sequence data, estimating the statistical significance of alignments, including profile-profile alignments, plays a key role in alignment-based homology search algorithms. Still, it is an open question as to what and whether one type of distribution governs profile-profile alignment score, especially when profile-profile substitution scores involve such terms as secondary structure predictions. RESULTS: This study presents a methodology for estimating the statistical significance of this type of alignments. The methodology rests on a new algorithm developed for generating random profiles such that their alignment scores are distributed similarly to those obtained for real unrelated profiles. We show that improvements in statistical accuracy and sensitivity and high-quality alignment rate result from statistically characterizing alignments by establishing the dependence of statistical parameters on various measures associated with both individual and pairwise profile characteristics. Implemented in the COMER software, the proposed methodology yielded an increase of up to 34.2% in the number of true positives and up to 61.8% in the number of high-quality alignments with respect to the previous version of the COMER method. CONCLUSIONS: The more accurate estimation of statistical significance is implemented in the COMER method, which is now more sensitive and provides an increased rate of high-quality profile-profile alignments. The results of the present study also suggest directions for future research.


Subject(s)
Theoretical Models, Proteins/chemistry, Algorithms, Amino Acid Sequence, Protein Conformation, Sequence Alignment
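
The idea of estimating statistical significance from the scores of random (unrelated) profiles can be illustrated with a minimal sketch. It assumes, purely for illustration, that null scores follow a Gumbel (extreme value) distribution fitted by the method of moments; the abstract itself treats the governing distribution as an open question, so this is not the COMER procedure. The function names and the toy null scores are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel(scores):
    """Method-of-moments fit of a Gumbel distribution to null scores."""
    mean = statistics.fmean(scores)
    stdev = statistics.stdev(scores)
    beta = stdev * math.sqrt(6) / math.pi  # scale parameter
    mu = mean - EULER_GAMMA * beta         # location parameter
    return mu, beta


def p_value(score, mu, beta):
    """P(S >= score) under the fitted Gumbel distribution."""
    z = (score - mu) / beta
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at high scores
    return -math.expm1(-math.exp(-z))


def e_value(score, mu, beta, n_comparisons):
    """Expected number of chance hits at or above `score` in a search."""
    return n_comparisons * p_value(score, mu, beta)


# Toy null model (an assumption): each "alignment score" is the maximum
# of 50 Gaussian draws, standing in for scores of unrelated profile pairs.
rng = random.Random(0)
null_scores = [max(rng.gauss(0, 1) for _ in range(50)) for _ in range(2000)]
mu, beta = fit_gumbel(null_scores)

print(e_value(mu + 5 * beta, mu, beta, n_comparisons=10_000))
```

In practice the fitted parameters would depend on profile characteristics such as length, which is the dependence the study sets out to establish.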
4.
Bioinformatics ; 34(12): 2037-2045, 2018 06 15.
Article in English | MEDLINE | ID: mdl-29390109

ABSTRACT

Motivation: Protein sequence alignment forms the basis for comparative modeling, the most reliable approach to protein structure prediction, among many other applications. Alignment between sequence families, or profile-profile alignment, represents one of the most, if not the most, sensitive means for homology detection but still necessitates improvement. We aim at improving the quality of profile-profile alignments and the sensitivity induced by them by refining profile-profile substitution scores. Results: We have developed a new score that represents an additional component of profile-profile substitution scores. A comprehensive evaluation shows that the new add-on score statistically significantly improves both the sensitivity and the alignment quality of the COMER method. We discuss why the score leads to the improvement and its almost optimal computational complexity that makes it easily implementable in any profile-profile alignment method. Availability and implementation: An implementation of the add-on score in the open-source COMER software and data are available at https://sourceforge.net/projects/comer. The COMER software is also available on Github at https://github.com/minmarg/comer and as a Docker image (minmar/comer). Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Protein Conformation, Sequence Alignment/methods, Protein Sequence Analysis/methods, Software, Amino Acid Sequence, Computational Biology/methods, Data Accuracy, Sensitivity and Specificity
5.
Bioinformatics ; 33(6): 935-937, 2017 03 15.
Article in English | MEDLINE | ID: mdl-28011769

ABSTRACT

Summary: The PPI3D web server is focused on searching and analyzing the structural data on protein-protein interactions. Reducing the data redundancy by clustering and analyzing the properties of interaction interfaces using Voronoi tessellation makes this software a highly effective tool for addressing different questions related to protein interactions. Availability and Implementation: The server is freely accessible at http://bioinformatics.lt/software/ppi3d/. Contact: ceslovas.venclovas@bti.vu.lt. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Molecular Models, Proteins/chemistry, Software, Internet, Protein Binding, Protein Conformation, Proteins/metabolism
6.
Bioinformatics ; 32(18): 2744-52, 2016 09 15.
Article in English | MEDLINE | ID: mdl-27153649

ABSTRACT

MOTIVATION: Wide application of modeling of three-dimensional protein structures in biomedical research motivates developing protein sequence alignment computer tools featuring high alignment accuracy and sensitivity to remotely homologous proteins. In this paper, we aim at improving the quality of alignments between sequence profiles, encoded multiple sequence alignments. Modeling profile contexts, fixed-length profile fragments, is engaged to achieve this goal. RESULTS: We develop a hierarchical Dirichlet process mixture model to describe the distribution of profile contexts, which is able to capture dependencies between amino acids in each context position. The model represents an attempt at modeling profile fragments at several hierarchical levels, within the profile and among profiles. Even modeling unit-length contexts leads to greater improvements than processing 13-length contexts previously. We develop a new profile comparison method, called COMER, integrating the model. A benchmark with three other profile-to-profile comparison methods shows an increase in both sensitivity and alignment quality. AVAILABILITY AND IMPLEMENTATION: COMER is open-source software licensed under the GNU GPLv3, available at https://sourceforge.net/projects/comer. CONTACT: mindaugas.margelevicius@bti.vu.lt. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Bayes Theorem, Proteins, Sequence Alignment, Amino Acid Sequence Homology, Algorithms, Amino Acid Sequence, Molecular Models, Protein Sequence Analysis, Software
7.
Nucleic Acids Res ; 39(4): 1187-96, 2011 Mar.
Article in English | MEDLINE | ID: mdl-20961958

ABSTRACT

PD-(D/E)XK nucleases, initially represented by only Type II restriction enzymes, now comprise a large and extremely diverse superfamily of proteins. They participate in many different nucleic acids transactions including DNA degradation, recombination, repair and RNA processing. Different PD-(D/E)XK families, although sharing a structurally conserved core, typically display little or no detectable sequence similarity except for the active site motifs. This makes the identification of new superfamily members using standard homology search techniques challenging. To tackle this problem, we developed a method for the detection of PD-(D/E)XK families based on the binary classification of profile-profile alignments using support vector machines (SVMs). Using a number of both superfamily-specific and general features, SVMs were trained to identify true positive alignments of PD-(D/E)XK representatives. With this method we identified several PFAM families of uncharacterized proteins as putative new members of the PD-(D/E)XK superfamily. In addition, we assigned several unclassified restriction enzymes to the PD-(D/E)XK type. Results show that the new method is able to make confident assignments even for alignments that have statistically insignificant scores. We also implemented the method as a freely accessible web server at http://www.ibt.lt/bioinformatics/software/pdexk/.


Subject(s)
Artificial Intelligence, Endonucleases/classification, Sequence Alignment/methods, Amino Acid Sequence, Catalytic Domain, Conserved Sequence, DNA Restriction Enzymes/chemistry, DNA Restriction Enzymes/classification, Endonucleases/chemistry, Exonucleases/classification, Holliday Junction Resolvases/chemistry, Molecular Sequence Data, Tertiary Protein Structure, Amino Acid Sequence Homology, Software
8.
Bioinformatics ; 27(5): 723-4, 2011 Mar 01.
Article in English | MEDLINE | ID: mdl-21186248

ABSTRACT

UNLABELLED: We present Voroprot, an interactive cross-platform software tool that provides a unique set of capabilities for exploring geometric features of protein structure. Voroprot allows the construction and visualization of the Apollonius diagram (also known as the additively weighted Voronoi diagram), the Apollonius graph, protein alpha shapes, interatomic contact surfaces, solvent accessible surfaces, pockets and cavities inside protein structure. AVAILABILITY: Voroprot is available for Windows, Linux and Mac OS X operating systems and can be downloaded from http://www.ibt.lt/bioinformatics/voroprot/.


Subject(s)
Three-Dimensional Imaging/methods, Molecular Models, Proteins/chemistry, Software, Computer Graphics, Computer Simulation, Protein Conformation
9.
Bioinformatics ; 26(15): 1905-6, 2010 Aug 01.
Article in English | MEDLINE | ID: mdl-20529888

ABSTRACT

SUMMARY: Detection of distant homology is a widely used computational approach for studying protein evolution, structure and function. Here, we report a homology search web server based on sequence profile-profile comparison. The user may perform searches in one of several regularly updated profile databases using either a single sequence or a multiple sequence alignment as an input. The same profile databases can also be downloaded for local use. The capabilities of the server are illustrated with the identification of new members of the highly diverse PD-(D/E)XK nuclease superfamily. AVAILABILITY: http://www.ibt.lt/bioinformatics/coma/


Subject(s)
Computational Biology/methods, Internet, Proteins/chemistry, Protein Sequence Analysis/methods, Sequence Alignment
10.
BMC Bioinformatics ; 11: 89, 2010 Feb 17.
Article in English | MEDLINE | ID: mdl-20158924

ABSTRACT

BACKGROUND: Detection of common evolutionary origin (homology) is a primary means of inferring protein structure and function. At present, comparison of protein families represented as sequence profiles is arguably the most effective homology detection strategy. However, finding the best way to represent evolutionary information of a protein sequence family in the profile, to compare profiles and to estimate the biological significance of such comparisons, remains an active area of research. RESULTS: Here, we present a new homology detection method based on sequence profile-profile comparison. The method has a number of new features including position-dependent gap penalties and a global score system. Position-dependent gap penalties provide a more biologically relevant way to represent and align protein families as sequence profiles. The global score system enables an analytical solution of the statistical parameters needed to estimate the statistical significance of profile-profile similarities. The new method, together with other state-of-the-art profile-based methods (HHsearch, COMPASS and PSI-BLAST), is benchmarked in all-against-all comparison of a challenging set of SCOP domains that share at most 20% sequence identity. For benchmarking, we use a reference ("gold standard") free model-based evaluation framework. Evaluation results show that at the level of protein domains our method compares favorably to all other tested methods. We also provide examples of the new method outperforming structure-based similarity detection and alignment. The implementation of the new method both as a standalone software package and as a web server is available at http://www.ibt.lt/bioinformatics/coma. CONCLUSION: Due to a number of developments, the new profile-profile comparison method shows an improved ability to match distantly related protein domains. Therefore, the method should be useful for annotation and homology modeling of uncharacterized proteins.


Subject(s)
Molecular Evolution, Proteins/chemistry, Amino Acid Sequence, Protein Databases, Tertiary Protein Structure, Sequence Alignment, Protein Sequence Analysis
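
The notion of position-dependent gap penalties can be sketched with a small dynamic-programming example. This is a simplified global alignment with linear, per-position gap costs, written under stated assumptions: the function name, the precomputed column-score matrix `sub`, and the penalty vectors `gap_a` and `gap_b` are hypothetical and do not reproduce the scoring scheme of the method described in the abstract.

```python
def align_score(sub, gap_a, gap_b):
    """Global alignment score of two profiles A and B.

    sub[i][j] : score for aligning position i of A with position j of B
    gap_a[i]  : penalty for leaving position i of A unaligned
    gap_b[j]  : penalty for leaving position j of B unaligned
    """
    n, m = len(gap_a), len(gap_b)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]

    # Borders: a prefix of one profile aligned entirely against gaps.
    for i in range(1, n + 1):
        H[i][0] = H[i - 1][0] - gap_a[i - 1]
    for j in range(1, m + 1):
        H[0][j] = H[0][j - 1] - gap_b[j - 1]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(
                H[i - 1][j - 1] + sub[i - 1][j - 1],  # align i with j
                H[i - 1][j] - gap_a[i - 1],           # skip position i of A
                H[i][j - 1] - gap_b[j - 1],           # skip position j of B
            )
    return H[n][m]


# Toy profiles: A has 3 positions, B has 2. Position 1 of A is a
# variable, loop-like position, so skipping it is made cheap.
sub = [[5, -1], [-1, -1], [-1, 5]]
cheap_loop = align_score(sub, gap_a=[4, 1, 4], gap_b=[4, 4])
uniform = align_score(sub, gap_a=[4, 4, 4], gap_b=[4, 4])
print(cheap_loop, uniform)
```

Lowering the penalty at the variable position raises the score of the alignment that skips it, which is the intuition behind making gap costs depend on where in the profile a gap falls.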
11.
Proteins ; 77 Suppl 9: 81-8, 2009.
Article in English | MEDLINE | ID: mdl-19639635

ABSTRACT

Here, we describe our template-based protein modeling approach and its performance during the eighth community-wide experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP8, http://predictioncenter.org/casp8). In CASP8, our modeling approach was supplemented by the newly developed distant homology detection method based on sequence profile-profile comparison. Detection of structural homologs that could be used as modeling templates was largely achieved by automated profile-based searches. However, the other two major steps in template-based modeling (TBM) (selection of the best template(s) and construction of the optimal sequence-structure alignment) to a large degree relied on the combination of automatic tools and manual input. The analysis of 64 domains categorized by CASP8 assessors as TBM domains revealed that we missed correct structural templates for only four of them. The use of multiple templates or their fragments enabled us to improve over the structure of the single best PDB template in about 1/3 of our models for TBM domains. Our results for sequence-structure alignments are mixed. Although many models have optimal or near optimal sequence mapping, a large fraction contains one or more misaligned regions. Strikingly, in spite of this, our TBM models have the best overall alignment accuracy scores. This clearly suggests that the correct mapping of protein sequence onto three-dimensional structure remains one of the big challenges in protein structure prediction.


Subject(s)
Computational Biology/methods, Molecular Models, Proteins/chemistry, Protein Sequence Analysis/methods, Software, Humans, Protein Conformation, Sequence Alignment
12.
BMC Bioinformatics ; 9: 296, 2008 Jun 27.
Article in English | MEDLINE | ID: mdl-18588692

ABSTRACT

BACKGROUND: Sequence searches are routinely employed to detect and annotate related proteins. However, a rapid growth of databases necessitates a frequent repetition of sequence searches and subsequent analysis of obtained results. Although there are several automatic systems available for executing periodical sequence searches and reporting results, they all suffer either from a lack of sensitivity, restrictive database choice or limited flexibility in setting up search strategies. Here, a new sequence search and reporting software package designed to address these shortcomings is described. RESULTS: Re-searcher is an open-source highly configurable system for recurrent detection and reporting of new homologs for the sequence of interest in specified protein sequence databases. Searches are performed using PSI-BLAST at desired time intervals either within NCBI or local databases. In addition to searches against individual databases, the system can perform "PDB-BLAST"-like combined searches, when PSI-BLAST profile generated during search against the first database is used to search the second database. The system supports multiple users enabling each to separately keep track of multiple queries and query-specific results. CONCLUSIONS: Re-searcher features a large number of options enabling automatic periodic detection of both close and distant homologs. At the same time it has a simple and intuitive interface, making the analysis of results even for a large number of queries a straightforward task.


Subject(s)
Proteins/chemistry, Protein Sequence Analysis, Software, Animals, Protein Databases, Humans, Proteins/genetics, Sequence Homology
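
The core of recurrent homolog reporting, comparing a fresh search against what was already seen, can be sketched as follows. This is a hypothetical illustration only: the function name, the `(identifier, E-value)` hit format and the cutoff are assumptions, not part of Re-searcher's actual interface.

```python
def new_hits(previous_ids, current_hits, evalue_cutoff=1e-3):
    """Return hits from the latest search that were not reported before.

    previous_ids : identifiers reported in earlier runs
    current_hits : iterable of (identifier, e_value) from the latest run
    """
    seen = set(previous_ids)
    return [
        (hit_id, e_value)
        for hit_id, e_value in current_hits
        if e_value <= evalue_cutoff and hit_id not in seen
    ]


# Toy identifiers, invented for the example.
previous = {"P1", "P2"}
latest = [("P1", 1e-30), ("P3", 1e-5), ("P4", 0.5)]
print(new_hits(previous, latest))
```

A scheduler would rerun the search at the desired interval and add each newly reported identifier to the stored set for that query.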
13.
Proteins ; 61 Suppl 7: 99-105, 2005.
Article in English | MEDLINE | ID: mdl-16187350

ABSTRACT

Along with over 150 other groups we have tested our template-based protein structure prediction approach by submitting models for 30 target proteins to the sixth round of the Critical Assessment of Protein Structure Prediction Methods (CASP6, http://predictioncenter.org). Most of our modeled proteins fall into the comparative or homology modeling (CM) category, and some are fold recognition (FR) targets. The key feature of our structure prediction strategy in CASP6 was an attempt to optimally select structural templates and to make accurate sequence-structure alignments. Template selection was based mainly on consensus results of multiple sequence searches. Likewise, the consensus of multiple alignment variants (or lack of it) was used to initially delineate reliable and unreliable alignment regions. Structure evaluation approaches were then used to identify the correct sequence-structure mapping. Our results suggest that in many cases use of multiple templates is advantageous. Selecting correct alignments even within the context of a three-dimensional structure remains a challenge. Together with more effective energy evaluation methods the simultaneous relaxation/refinement of a "frozen" backbone inherited from the template is likely needed to see a clear progress in tackling this problem. Our analysis also suggests that human input has little to contribute to automatic methods in modeling high homology targets. On the other hand, human expertise can be very valuable in modeling distantly related proteins and critical in cases of unexpected evolutionary changes in protein structure.


Subject(s)
Computational Biology/methods, Proteomics/methods, Algorithms, Computer Simulation, Computers, Statistical Data Interpretation, Protein Databases, Molecular Evolution, Molecular Models, Monte Carlo Method, Protein Conformation, Protein Folding, Secondary Protein Structure, Tertiary Protein Structure, Proteins/chemistry, Reproducibility of Results, Sequence Alignment, Software
14.
BMC Bioinformatics ; 6: 185, 2005 Jul 21.
Article in English | MEDLINE | ID: mdl-16033659

ABSTRACT

BACKGROUND: Protein sequence alignments have become indispensable for virtually any evolutionary, structural or functional study involving proteins. Modern sequence search and comparison methods combined with rapidly increasing sequence data often can reliably match even distantly related proteins that share little sequence similarity. However, even highly significant matches generally may have incorrectly aligned regions. Therefore when exact residue correspondence is used to transfer biological information from one aligned sequence to another, it is critical to know which alignment regions are reliable and which may contain alignment errors. RESULTS: PSI-BLAST-ISS is a standalone Unix-based tool designed to delineate reliable regions of sequence alignments as well as to suggest potential variants in unreliable regions. The region-specific reliability is assessed by producing multiple sequence alignments in different sequence contexts followed by the analysis of the consistency of alignment variants. The PSI-BLAST-ISS output enables the user to simultaneously analyze alignment reliability between query and multiple homologous sequences. In addition, PSI-BLAST-ISS can be used to detect distantly related homologous proteins. The software is freely available at: http://www.ibt.lt/bioinformatics/iss. CONCLUSION: PSI-BLAST-ISS is an effective reliability assessment tool that can be useful in applications such as comparative modelling or analysis of individual sequence regions. It favorably compares with the existing similar software both in the performance and functional features.


Subject(s)
Information Storage and Retrieval/methods, Information Systems, Sequence Alignment/standards, Software, Internet, Reproducibility of Results, User-Computer Interface
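
Assessing region-specific reliability from the consistency of alignment variants can be sketched in a few lines. The representation below, each variant as a mapping from query position to aligned target position, and both function names are assumptions made for illustration; they are not the PSI-BLAST-ISS data format or algorithm.

```python
from collections import Counter


def position_reliability(variants, query_length):
    """For each query position, return (most common partner, agreement).

    variants : list of dicts mapping query position -> target position;
               a missing key means the position is unaligned (None).
    """
    result = []
    for pos in range(query_length):
        partners = Counter(variant.get(pos) for variant in variants)
        partner, count = partners.most_common(1)[0]
        result.append((partner, count / len(variants)))
    return result


def reliable_regions(reliability, threshold=1.0):
    """Maximal runs [start, end) of aligned positions with enough agreement."""
    regions, start = [], None
    for pos, (partner, agreement) in enumerate(reliability):
        ok = partner is not None and agreement >= threshold
        if ok and start is None:
            start = pos
        elif not ok and start is not None:
            regions.append((start, pos))
            start = None
    if start is not None:
        regions.append((start, len(reliability)))
    return regions


# Three toy alignment variants of a 4-residue query.
variants = [
    {0: 0, 1: 1, 2: 2, 3: 3},
    {0: 0, 1: 1, 2: 3, 3: 4},
    {0: 0, 1: 1, 2: 2, 3: 4},
]
reliability = position_reliability(variants, query_length=4)
print(reliable_regions(reliability))
```

Positions where the variants disagree are the ones for which alternative residue correspondences would be worth inspecting before transferring annotation.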