Search | Virtual Health Library

Generalized Bootstrap Supports for Phylogenetic Analyses of Protein Sequences Incorporating Alignment Uncertainty.

Chatzou, Maria; Floden, Evan W; Di Tommaso, Paolo; Gascuel, Olivier; Notredame, Cedric.

Syst Biol ; 67(6): 997-1009, 2018 11 01.

Article in English | MEDLINE | ID: mdl-30295908

ABSTRACT

Phylogenetic reconstructions are essential in genomics data analyses and depend on accurate multiple sequence alignment (MSA) models. We show that all currently available large-scale progressive multiple alignment methods are numerically unstable when dealing with amino-acid sequences. They produce significantly different output when changing sequence input order. We used the HOMFAM protein sequences dataset to show that on datasets larger than 100 sequences, this instability affects on average 21.5% of the aligned residues. The resulting Maximum Likelihood (ML) trees estimated from these MSAs are equally unstable with over 38% of the branches being sensitive to the sequence input order. We established that about two-thirds of this uncertainty stems from the unordered nature of children nodes within the guide trees used to estimate MSAs. To quantify this uncertainty we developed unistrap, a novel approach that estimates the combined effect of alignment uncertainty and site sampling on phylogenetic tree branch supports. Compared with the regular bootstrap procedure, unistrap provides branch support estimates that take into account a larger fraction of the parameters impacting tree instability when processing datasets containing a large number of sequences.
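The abstract above names two sources of variation: the order in which sequences are fed to the aligner, and the sampling of alignment sites. The following is a minimal Python sketch of how replicates combining both could be generated. It is not the unistrap implementation; the function names (`shuffle_order`, `resample_columns`, `replicates`) and the `align` callable are hypothetical, and `toy_align` is only a stand-in for a real progressive aligner so that the example runs on its own.

```python
import random


def shuffle_order(records, rng):
    """Return the same (name, sequence) records in a random input order."""
    shuffled = list(records)
    rng.shuffle(shuffled)
    return shuffled


def resample_columns(alignment, rng):
    """Site bootstrap: draw alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def toy_align(records):
    """Placeholder aligner (assumption): pads sequences with gaps to equal length.

    A real study would call an external MSA program here instead.
    """
    width = max(len(seq) for _, seq in records)
    return {name: seq.ljust(width, "-") for name, seq in records}


def replicates(records, align, n, seed=0):
    """Yield n replicates: shuffle input order, align, then resample sites."""
    rng = random.Random(seed)
    for _ in range(n):
        alignment = align(shuffle_order(records, rng))
        yield resample_columns(alignment, rng)


if __name__ == "__main__":
    seqs = [("seqA", "MKVLA"), ("seqB", "MKV"), ("seqC", "MKVL")]
    for rep in replicates(seqs, toy_align, n=3, seed=1):
        print(rep)
```

In this sketch each replicate would then be passed to a tree-building program, and branch supports would be counted across the resulting trees, as in a regular bootstrap. With an order-sensitive aligner in place of `toy_align`, the shuffling step is what lets alignment instability contribute to the support values.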

Subjects

Classification/methods, Genetic Models, Phylogeny, Proteins/genetics, Proteins/chemistry, Sequence Alignment, Software, Uncertainty

Nextflow enables reproducible computational workflows.

Di Tommaso, Paolo; Chatzou, Maria; Floden, Evan W; Barja, Pablo Prieto; Palumbo, Emilio; Notredame, Cedric.

Nat Biotechnol ; 35(4): 316-319, 2017 04 11.

Article in English | MEDLINE | ID: mdl-28398311

Subjects

Computational Biology/methods, Electronic Data Processing, Genomics, Software, Workflow

PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

Floden, Evan W; Di Tommaso, Paolo; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming.

Nucleic Acids Res ; 44(W1): W339-43, 2016 07 08.

Article in English | MEDLINE | ID: mdl-27106060

ABSTRACT

The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee.

Subjects

Algorithms, Membrane Proteins/chemistry, Protein Sequence Analysis/statistics & numerical data, User-Computer Interface, Amino Acid Sequence, Computer Graphics, Protein Databases, Information Storage and Retrieval, Internet, Membrane Proteins/genetics, Protein Domains, Protein Secondary Structure, Sequence Alignment, Amino Acid Sequence Homology

Multiple sequence alignment modeling: methods and applications.

Chatzou, Maria; Magis, Cedrik; Chang, Jia-Ming; Kemena, Carsten; Bussotti, Giovanni; Erb, Ionas; Notredame, Cedric.

Brief Bioinform ; 17(6): 1009-1023, 2016 11.

Article in English | MEDLINE | ID: mdl-26615024

ABSTRACT

This review provides an overview on the development of Multiple sequence alignment (MSA) methods and their main applications. It is focused on progress made over the past decade. The three first sections review recent algorithmic developments for protein, RNA/DNA and genomic alignments. The fourth section deals with benchmarks and explores the relationship between empirical and simulated data, along with the impact on method developments. The last part of the review gives an overview on available MSA local reliability estimators and their dependence on various algorithmic properties of available methods.

Subjects

Sequence Alignment, Algorithms, DNA, Genomics, Proteins, Reproducibility of Results

The impact of Docker containers on the performance of genomic pipelines.

Di Tommaso, Paolo; Palumbo, Emilio; Chatzou, Maria; Prieto, Pablo; Heuer, Michael L; Notredame, Cedric.

PeerJ ; 3: e1273, 2015.

Article in English | MEDLINE | ID: mdl-26421241

ABSTRACT

Genomic pipelines consist of several pieces of third party software and, because of their experimental nature, frequent changes and updates are commonly necessary thus raising serious deployment and reproducibility issues. Docker containers are emerging as a possible solution for many of these problems, as they allow the packaging of pipelines in an isolated and self-contained manner. This makes it easy to distribute and execute pipelines in a portable manner across a wide range of computing platforms. Thus, the question that arises is to what extent the use of Docker containers might affect the performance of these pipelines. Here we address this question and conclude that Docker containers have only a minor impact on the performance of common genomic pipelines, which is negligible when the executed jobs are long in terms of computational time.

SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments.

Di Tommaso, Paolo; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric.

Nucleic Acids Res ; 42(Web Server issue): W356-60, 2014 Jul.

Article in English | MEDLINE | ID: mdl-24972831

ABSTRACT

This article introduces the SARA-Coffee web server; a service allowing the online computation of 3D structure based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee.

Subjects

RNA/chemistry, Sequence Alignment/methods, RNA Sequence Analysis/methods, Software, Algorithms, Internet, Nucleic Acid Conformation
