Results 1 - 19 of 19
1.
J Mol Evol ; 90(6): 418-428, 2022 12.
Article in English | MEDLINE | ID: mdl-36181519

ABSTRACT

Vertebrate blood coagulation is controlled by a cascade containing more than 20 proteins. The cascade proteins are found in the blood in their zymogen forms and when the cascade is triggered by tissue damage, zymogens are activated and in turn activate their downstream proteins by serine protease activity. In this study, we examined proteomes of 21 chordates, of which 18 are vertebrates, to reveal the modular evolution of the blood coagulation cascade. Additionally, two Arthropoda species were used to compare domain arrangements of the proteins belonging to the hemolymph clotting and the blood coagulation cascades. Within the vertebrate coagulation protein set, almost half of the studied proteins are shared with jawless vertebrates. Domain similarity analyses revealed that there are multiple possible evolutionary trajectories for each coagulation protein. During the evolution of higher vertebrate clades, gene and genome duplications led to the formation of other coagulation cascade proteins.


Subjects
Blood Coagulation Factors, Chordata, Animals, Blood Coagulation Factors/genetics, Blood Coagulation Factors/metabolism, Vertebrates/genetics, Blood Coagulation/genetics, Chordata/genetics, Genome
2.
Nucleic Acids Res ; 47(W1): W507-W510, 2019 07 02.
Article in English | MEDLINE | ID: mdl-31076763

ABSTRACT

Even in the era of next generation sequencing, in which bioinformatics tools abound, annotating transcriptomes and proteomes remains a challenge. This can have major implications for the reliability of studies based on these datasets. Therefore, quality assessment represents a crucial step prior to downstream analyses on novel transcriptomes and proteomes. DOGMA allows such a quality assessment to be carried out. The data of interest are evaluated based on a comparison with a core set of conserved protein domains and domain arrangements. Depending on the studied species, DOGMA offers precomputed core sets for different phylogenetic clades. We now developed a web server for the DOGMA software, offering a user-friendly, simple to use interface. Additionally, the server provides a graphical representation of the analysis results and their placement in comparison to publicly available data. The server is freely available under https://domainworld-services.uni-muenster.de/dogma/. Additionally, for large scale analyses the software can be downloaded free of charge from https://domainworld.uni-muenster.de.


Subjects
Protein Domains, Proteome, Software, Transcriptome, Genome, Internet, Molecular Sequence Annotation
3.
BMC Evol Biol ; 20(1): 30, 2020 02 14.
Article in English | MEDLINE | ID: mdl-32059645

ABSTRACT

BACKGROUND: Modularity is important for evolutionary innovation. The recombination of existing units to form larger complexes with new functionalities spares the need to create novel elements from scratch. In proteins, this principle can be observed at the level of protein domains, functional subunits which are regularly rearranged to acquire new functions. RESULTS: In this study we analyse the mechanisms leading to new domain arrangements in five major eukaryotic clades (vertebrates, insects, fungi, monocots and eudicots) at unprecedented depth and breadth. This allows, for the first time, to directly compare rates of rearrangements between different clades and identify both lineage specific and general patterns of evolution in the context of domain rearrangements. We analyse arrangement changes along phylogenetic trees by reconstructing ancestral domain content in combination with feasible single step events, such as fusion or fission. Using this approach we explain up to 70% of all rearrangements by tracing them back to their precursors. We find that rates in general and the ratio between these rates for a given clade in particular, are highly consistent across all clades. In agreement with previous studies, fusions are the most frequent event leading to new domain arrangements. A lineage specific pattern in fungi reveals exceptionally high loss rates compared to other clades, supporting recent studies highlighting the importance of loss for evolutionary innovation. Furthermore, our methodology allows us to link domain emergences at specific nodes in the phylogenetic tree to important functional developments, such as the origin of hair in mammals. CONCLUSIONS: Our results demonstrate that domain rearrangements are based on a canonical set of mutational events with rates which lie within a relatively narrow and consistent range. 
In addition, gained knowledge about these rates provides a basis for advanced domain-based methodologies for phylogenetics and homology analysis which complement current sequence-based methods.
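
The single-step events mentioned in this abstract can be illustrated with a small sketch. It is not the authors' method: it assumes domain arrangements are represented as ordered tuples of domain names, uses a hypothetical function name (`classify_event`), and cannot tell a fission from a terminal loss, because both leave a prefix or suffix of an ancestral arrangement.

```python
def classify_event(new, ancestors):
    """Guess which single-step event could explain a new domain arrangement.

    new: sequence of domain names; ancestors: iterable of such sequences.
    Returns one of "unchanged", "fusion", "fission_or_terminal_loss",
    "unexplained". All labels are illustrative, not taken from the paper.
    """
    new = tuple(new)
    if not new:
        raise ValueError("the new arrangement must contain at least one domain")
    pool = {tuple(a) for a in ancestors}

    if new in pool:
        return "unchanged"

    # Fusion: the new arrangement is the concatenation of two ancestral ones.
    for cut in range(1, len(new)):
        if new[:cut] in pool and new[cut:] in pool:
            return "fusion"

    # Fission or terminal loss: the new arrangement is a proper prefix
    # or suffix of a longer ancestral arrangement.
    for anc in pool:
        if len(anc) > len(new) and (anc[:len(new)] == new or anc[-len(new):] == new):
            return "fission_or_terminal_loss"

    return "unexplained"


# Placeholder domain names, chosen only for illustration.
ancestral = [("A", "B"), ("C",), ("D", "E", "F")]
print(classify_event(("A", "B", "C"), ancestral))
print(classify_event(("D", "E"), ancestral))
print(classify_event(("X", "Y"), ancestral))
```

Arrangements that fall into the "unexplained" bucket correspond loosely to the share of rearrangements that cannot be traced back to a precursor by a single step.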


Subjects
Eukaryota, Molecular Evolution, Tertiary Protein Structure/genetics, Proteins/chemistry, Proteins/genetics, Animals, Bees/physiology, Disease Resistance/genetics, Eukaryota/genetics, Eukaryota/metabolism, Eukaryotic Cells/metabolism, Fungi/classification, Fungi/genetics, Gene Ontology, Mutation/physiology, Phylogeny, Plant Diseases/microbiology, Social Behavior, Vertebrates/classification, Vertebrates/genetics, Vertebrates/metabolism
4.
Brief Bioinform ; 17(6): 1009-1023, 2016 11.
Article in English | MEDLINE | ID: mdl-26615024

ABSTRACT

This review provides an overview on the development of Multiple sequence alignment (MSA) methods and their main applications. It is focused on progress made over the past decade. The three first sections review recent algorithmic developments for protein, RNA/DNA and genomic alignments. The fourth section deals with benchmarks and explores the relationship between empirical and simulated data, along with the impact on method developments. The last part of the review gives an overview on available MSA local reliability estimators and their dependence on various algorithmic properties of available methods.


Subjects
Sequence Alignment, Algorithms, DNA, Genomics, Proteins, Reproducibility of Results
5.
J Exp Zool B Mol Dev Evol ; 330(5): 296-304, 2018 07.
Article in English | MEDLINE | ID: mdl-29845724

ABSTRACT

The evolution of division of labor between sterile and fertile individuals represents one of the major transitions in biological complexity. A fascinating gradient in eusociality evolved among the ancient hemimetabolous insects, ranging from noneusocial cockroaches through the primitively social lower termites-where workers retain the ability to reproduce-to the higher termites, characterized by lifetime commitment to worker sterility. Juvenile hormone (JH) is a prime candidate for the regulation of reproductive division of labor in termites, as it plays a key role in insect postembryonic development and reproduction. We compared the expression of JH pathway genes between workers and queens in two lower termites (Zootermopsis nevadensis and Cryptotermes secundus) and a higher termite (Macrotermes natalensis) to that of analogous nymphs and adult females of the noneusocial cockroach Blattella germanica. JH biosynthesis and metabolism genes ranged from reproductive female-biased expression in the cockroach to predominantly worker-biased expression in the lower termites. Remarkably, the expression profile of JH pathway genes sets the higher termite apart from the two lower termites, as well as the cockroach, indicating that JH signaling has undergone major changes in this eusocial termite. These changes go beyond mere shifts in gene expression between the different castes, as we find evidence for positive selection in several termite JH pathway genes. Thus, remodeling of the JH pathway may have played a major role in termite social evolution, representing a striking case of convergent molecular evolution between the termites and the distantly related social hymenoptera.


Subjects
Developmental Gene Expression Regulation, Isoptera/genetics, Juvenile Hormones/genetics, Animals, Blattellidae/genetics, Blattellidae/growth & development, Molecular Evolution, Female, Juvenile Hormones/biosynthesis, Juvenile Hormones/metabolism, Nymph, Social Behavior
6.
Nature ; 490(7421): 535-8, 2012 Oct 25.
Article in English | MEDLINE | ID: mdl-23064225

ABSTRACT

The main forces directing long-term molecular evolution remain obscure. A sizable fraction of amino-acid substitutions seem to be fixed by positive selection, but it is unclear to what degree long-term protein evolution is constrained by epistasis, that is, instances when substitutions that are accepted in one genotype are deleterious in another. Here we obtain a quantitative estimate of the prevalence of epistasis in long-term protein evolution by relating data on amino-acid usage in 14 organelle proteins and 2 nuclear-encoded proteins to their rates of short-term evolution. We studied multiple alignments of at least 1,000 orthologues for each of these 16 proteins from species from a diverse phylogenetic background and found that an average site contained approximately eight different amino acids. Thus, without epistasis an average site should accept two-fifths of all possible amino acids, and the average rate of amino-acid substitutions should therefore be about three-fifths lower than the rate of neutral evolution. However, we found that the measured rate of amino-acid substitution in recent evolution is 20 times lower than the rate of neutral evolution and an order of magnitude lower than that expected in the absence of epistasis. These data indicate that epistasis is pervasive throughout protein evolution: about 90 per cent of all amino-acid substitutions have a neutral or beneficial impact only in the genetic backgrounds in which they occur, and must therefore be deleterious in a different background of other species. Our findings show that most amino-acid substitutions have different fitness effects in different species and that epistasis provides the primary conceptual framework to describe the tempo and mode of long-term protein evolution.
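
The arithmetic in this abstract can be retraced in a few lines. The sketch below is only a back-of-the-envelope check, under the simplifying assumption that without epistasis the substitution rate relative to neutral equals the fraction of the 20 amino acids accepted at an average site. The variable names are illustrative and do not come from the paper.

```python
from fractions import Fraction

AMINO_ACIDS = 20        # size of the standard amino-acid alphabet
observed_per_site = 8   # approximate number of amino acids seen at an average site

# Share of all amino acids an average site would accept without epistasis.
accepted = Fraction(observed_per_site, AMINO_ACIDS)

# Under the stated assumption this share is also the expected rate
# relative to neutral evolution, so the expected reduction is 1 minus it.
expected_relative_rate = accepted
reduction = 1 - expected_relative_rate

# Measured rate reported in the abstract: 20 times lower than neutral.
measured_relative_rate = Fraction(1, 20)

# How far the measurement falls below the no-epistasis expectation.
shortfall = expected_relative_rate / measured_relative_rate

print(accepted, reduction, shortfall)  # prints: 2/5 3/5 8
```

The resulting factor of 8 is roughly the order-of-magnitude gap the abstract describes between the measured rate and the rate expected without epistasis.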


Subjects
Genetic Epistasis/genetics, Molecular Evolution, Amino Acid Substitution/genetics, Animals, Cell Nucleus/genetics, Computational Biology, Genetic Fitness, Genotype, Genetic Models, Mutation, Organelles/genetics, Phylogeny, Proteins/chemistry, Proteins/genetics, Sequence Alignment, Species Specificity
7.
Genome Res ; 24(12): 2077-89, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25273068

ABSTRACT

Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.


Subjects
Genome, Genomics/methods, Sequence Alignment/methods, Software, Animals, Computational Biology/methods, Computer Simulation, Datasets as Topic, Genome-Wide Association Study, Humans, Mammals/genetics, Phylogeny, Reproducibility of Results
8.
Bioinformatics ; 32(17): 2577-81, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27153665

ABSTRACT

MOTIVATION: Genome studies have become cheaper and easier than ever before, due to the decreased costs of high-throughput sequencing and the free availability of analysis software. However, the quality of genome or transcriptome assemblies can vary a lot. Therefore, quality assessment of assemblies and annotations are crucial aspects of genome analysis pipelines. RESULTS: We developed DOGMA, a program for fast and easy quality assessment of transcriptome and proteome data based on conserved protein domains. DOGMA measures the completeness of a given transcriptome or proteome and provides information about domain content for further analysis. DOGMA provides a very fast way to do quality assessment within seconds. AVAILABILITY AND IMPLEMENTATION: DOGMA is implemented in Python and published under GNU GPL v.3 license. The source code is available on https://ebbgit.uni-muenster.de/domainWorld/DOGMA/ CONTACTS: e.dohmen@wwu.de or c.kemena@wwu.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
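
The idea of measuring completeness against conserved domains can be sketched as follows. This is a deliberately simplified illustration and not DOGMA's actual scoring or interface: the function name `domain_completeness` and the placeholder domain names are assumptions, and completeness is taken here to mean simply the fraction of a core domain set that appears in the annotation.

```python
def domain_completeness(core_domains, annotated_domains):
    """Return (fraction of core domains found, sorted list of missing ones)."""
    core = set(core_domains)
    if not core:
        raise ValueError("the core set must not be empty")
    found = core & set(annotated_domains)
    return len(found) / len(core), sorted(core - found)


# Placeholder names: a four-domain core set and a toy annotation in which
# one domain occurs twice and one domain is outside the core set.
core_set = ["A", "B", "C", "D"]
annotation = ["A", "C", "C", "X"]

completeness, missing = domain_completeness(core_set, annotation)
print(completeness, missing)  # prints: 0.5 ['B', 'D']
```

Using sets makes repeated occurrences of a domain count only once, which fits a presence/absence view of completeness. A tool that also scores conserved domain arrangements would need an ordered representation instead.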


Subjects
Proteome, Software, Transcriptome, Computational Biology, Genome
9.
Mol Biol Evol ; 32(11): 2919-31, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26226984

ABSTRACT

A central goal of biology is to uncover the genetic basis for the origin of new phenotypes. A particularly effective approach is to examine the genomic architecture of species that have secondarily lost a phenotype with respect to their close relatives. In the eusocial Hymenoptera, queens and workers have divergent phenotypes that may be produced via either expression of alternative sets of caste-specific genes and pathways or differences in expression patterns of a shared set of multifunctional genes. To distinguish between these two hypotheses, we investigated how secondary loss of the worker phenotype in workerless ant social parasites impacted genome evolution across two independent origins of social parasitism in the ant genera Pogonomyrmex and Vollenhovia. We sequenced the genomes of three social parasites and their most-closely related eusocial host species and compared gene losses in social parasites with gene expression differences between host queens and workers. Virtually all annotated genes were expressed to some degree in both castes of the host, with most shifting in queen-worker bias across developmental stages. As a result, despite >1 My of divergence from the last common ancestor that had workers, the social parasites showed strikingly little evidence of gene loss, damaging mutations, or shifts in selection regime resulting from loss of the worker caste. This suggests that regulatory changes within a multifunctional genome, rather than sequence differences, have played a predominant role in the evolution of social parasitism, and perhaps also in the many gains and losses of phenotypes in the social insects.


Subjects
Ants/classification, Ants/genetics, Animal Behavior/physiology, Social Behavior, Animals, Biological Evolution, Female, Gene Expression Profiling, Insect Genes, Genetic Association Studies, Genome Components, Male, Reproduction/genetics, Genetic Selection, Transcriptome
10.
Nucleic Acids Res ; 42(Web Server issue): W356-60, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24972831

ABSTRACT

This article introduces the SARA-Coffee web server; a service allowing the online computation of 3D structure based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee.


Subjects
RNA/chemistry, Sequence Alignment/methods, RNA Sequence Analysis/methods, Software, Algorithms, Internet, Nucleic Acid Conformation
11.
BMC Bioinformatics ; 16: 19, 2015 Jan 28.
Article in English | MEDLINE | ID: mdl-25626688

ABSTRACT

BACKGROUND: Proteins are composed of domains, protein segments that fold independently from the rest of the protein and have a specific function. During evolution the arrangement of domains can change: domains are gained, lost or their order is rearranged. To facilitate the analysis of these changes we propose the use of multiple domain alignments. RESULTS: We developed an alignment program, called MDAT, which aligns multiple domain arrangements. MDAT extends earlier programs which perform pairwise alignments of domain arrangements. MDAT uses a domain similarity matrix to score domain pairs and aligns the domain arrangements using a consistency supported progressive alignment method. CONCLUSION: MDAT will be useful for analysing changes in domain arrangements within and between protein families and will thus provide valuable insights into the evolution of proteins and their domains. MDAT is coded in C++, and the source code is freely available for download at http://www.bornberglab.org/pages/mdat .
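
The pairwise building block that such a method extends can be sketched with a standard Needleman-Wunsch global alignment applied to domain names instead of residues. This is not MDAT's code and does not include its consistency-supported progressive step. The scoring function `identity_score`, the gap penalty, and the domain names are assumptions made for illustration; a real domain similarity matrix would replace the scoring function.

```python
def align_arrangements(a, b, score, gap=-1):
    """Globally align two domain arrangements (Needleman-Wunsch).

    Returns (alignment score, aligned a, aligned b), with "-" marking gaps.
    """
    n, m = len(a), len(b)
    # dp[i][j] holds the best score for aligning a[:i] with b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # pair two domains
                dp[i - 1][j] + gap,                            # gap in b
                dp[i][j - 1] + gap,                            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return dp[n][m], aligned_a[::-1], aligned_b[::-1]


def identity_score(x, y):
    """Toy stand-in for a domain similarity matrix: reward identical domains."""
    return 2 if x == y else -2


result = align_arrangements(["SH3", "SH2", "Kinase"], ["SH2", "Kinase"], identity_score)
print(result)  # prints: (3, ['SH3', 'SH2', 'Kinase'], ['-', 'SH2', 'Kinase'])
```

The gap opposite `SH3` shows how an alignment exposes a domain gain or loss between two otherwise identical arrangements.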


Subjects
Algorithms, Proteins/chemistry, Sequence Alignment/methods, Protein Sequence Analysis/methods, Software, Humans, Programming Languages, Tertiary Protein Structure
12.
BMC Bioinformatics ; 16: 154, 2015 May 13.
Article in English | MEDLINE | ID: mdl-25968113

ABSTRACT

BACKGROUND: Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows in a quadratic order. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. RESULTS: We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. CONCLUSION: We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. The implementation of porthoDom is released using python and C++ languages and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda .
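
A cosine score on domain content, and the effect of pre-grouping on the number of pairwise comparisons, can be sketched as follows. This is not porthoDom's implementation: it assumes each protein is reduced to a list of domain identifiers, weights every domain equally, and uses hypothetical function names. The identifiers in the example are placeholders in a Pfam-like style.

```python
import math
from collections import Counter


def domain_cosine(arrangement_a, arrangement_b):
    """Cosine similarity between the domain-count vectors of two proteins."""
    counts_a, counts_b = Counter(arrangement_a), Counter(arrangement_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[d] * counts_b[d] for d in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a protein without annotated domains matches nothing
    return dot / (norm_a * norm_b)


def pairwise_comparisons(n):
    """Number of unordered pairs among n sequences (quadratic growth)."""
    return n * (n - 1) // 2


protein_1 = ["PF00069", "PF00017"]
protein_2 = ["PF00069", "PF00017", "PF00017"]
print(domain_cosine(protein_1, protein_2))

# All-against-all on 1000 proteins versus 10 pre-formed groups of 100 each.
print(pairwise_comparisons(1000))      # prints: 499500
print(10 * pairwise_comparisons(100))  # prints: 49500
```

In this toy case, grouping proteins before the sequence comparison cuts the number of pairs roughly tenfold. The actual saving depends on how evenly the proteins fall into groups.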


Subjects
Computational Biology/methods, Protein Interaction Domains and Motifs, Proteins/chemistry, Amino Acid Sequence Homology, Software, Humans
13.
Bioinformatics ; 29(9): 1112-9, 2013 May 01.
Article in English | MEDLINE | ID: mdl-23449094

ABSTRACT

MOTIVATION: Aligning RNAs is useful to search for homologous genes, study evolutionary relationships, detect conserved regions and identify any patterns that may be of biological relevance. Poor levels of conservation among homologs, however, make it difficult to compare RNA sequences, even when considering closely evolutionary related sequences. RESULTS: We describe SARA-Coffee, a tertiary structure-based multiple RNA aligner, which has been validated using BRAliDARTS, a new benchmark framework designed for evaluating tertiary structure-based multiple RNA aligners. We provide two methods to measure the capacity of alignments to match corresponding secondary and tertiary structure features. On this benchmark, SARA-Coffee outperforms both regular aligners and those using secondary structure information. Furthermore, we show that on sequences in which <60% of the nucleotides form base pairs, primary sequence methods usually perform better than secondary-structure aware aligners. AVAILABILITY AND IMPLEMENTATION: The package and the datasets are available from http://www.tcoffee.org/Projects/saracoffee and http://structure.biofold.org/sara/.


Subjects
RNA/chemistry, Sequence Alignment/methods, RNA Sequence Analysis/methods, Software, Algorithms, Nucleic Acid Conformation
14.
Bioinformatics ; 27(24): 3385-91, 2011 Dec 15.
Article in English | MEDLINE | ID: mdl-22039207

ABSTRACT

MOTIVATION: Evaluating alternative multiple protein sequence alignments is an important unsolved problem in Biology. The most accurate way of doing this is to use structural information. Unfortunately, most methods require at least two structures to be embedded in the alignment, a condition rarely met when dealing with standard datasets. RESULT: We developed STRIKE, a method that determines the relative accuracy of two alternative alignments of the same sequences using a single structure. We validated our methodology on three commonly used reference datasets (BAliBASE, Homestrad and Prefab). Given two alignments, STRIKE manages to identify the most accurate one in 70% of the cases on average. This figure increases to 79% when considering very challenging datasets like the RV11 category of BAliBASE. This discrimination capacity is significantly higher than that reported for other metrics such as Contact Accepted mutation or Blosum. We show that this increased performance results both from a refined definition of the contacts and from the use of an improved contact substitution score. CONTACT: cedric.notredame@crg.eu AVAILABILITY: STRIKE is an open source freeware available from www.tcoffee.org SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Proteins/chemistry, Sequence Alignment/methods, Computational Biology/methods, Internet, Software
15.
Bioinformatics ; 25(19): 2455-65, 2009 Oct 01.
Article in English | MEDLINE | ID: mdl-19648142

ABSTRACT

This review focuses on recent trends in multiple sequence alignment tools. It describes the latest algorithmic improvements including the extension of consistency-based methods to the problem of template-based multiple sequence alignments. Some results are presented suggesting that template-based methods are significantly more accurate than simpler alternative methods. The validation of existing methods is also discussed at length with the detailed description of recent results and some suggestions for future validation strategies. The last part of the review addresses future challenges for multiple sequence alignment methods in the genomic era, most notably the need to cope with very large sequences, the need to integrate large amounts of experimental data, the need to accurately align non-coding and non-transcribed sequences and finally, the need to integrate many alternative methods and approaches.


Subjects
Algorithms, Computational Biology/methods, Sequence Alignment/methods, Amino Acid Sequence, Genome, Molecular Sequence Data, Phylogeny, Protein Sequence Analysis
16.
Methods Mol Biol ; 1851: 287-300, 2019.
Article in English | MEDLINE | ID: mdl-30298404

ABSTRACT

Protein domains are reusable segments of proteins and play an important role in protein evolution. By combining the elements from a relatively small set of domains into unique arrangements, a large number of distinct proteins can be generated. Since domains often have specific functions, changes in their arrangement usually affect the overall protein function. Furthermore, domains are well amenable to computational representations, e.g., by Hidden Markov Models (HMMs), and these HMMs are widely represented in various databases. Therefore, domains can be efficiently used for proteomic analyses. Here, we describe how domains are annotated using different domain databases and then how to assess the annotation quality of proteomes. We next show how functional annotations of domains in large-scale data such as whole genomes or transcriptomes can be used to analyze molecular differences between species. Furthermore, we describe methods to analyze the changes in domain content of proteins which significantly helps to characterize and reconstruct the modular evolution of proteins. Altogether, domain-based methods offer a computationally highly effective approach to analyze large amounts of proteomic data in an evolutionary setting.


Subjects
Protein Databases, Proteomics/methods, Computational Biology, Molecular Sequence Annotation, Tertiary Protein Structure
17.
Nat Ecol Evol ; 2(3): 557-566, 2018 03.
Article in English | MEDLINE | ID: mdl-29403074

ABSTRACT

Around 150 million years ago, eusocial termites evolved from within the cockroaches, 50 million years before eusocial Hymenoptera, such as bees and ants, appeared. Here, we report the 2-Gb genome of the German cockroach, Blattella germanica, and the 1.3-Gb genome of the drywood termite Cryptotermes secundus. We show evolutionary signatures of termite eusociality by comparing the genomes and transcriptomes of three termites and the cockroach against the background of 16 other eusocial and non-eusocial insects. Dramatic adaptive changes in genes underlying the production and perception of pheromones confirm the importance of chemical communication in the termites. These are accompanied by major changes in gene regulation and the molecular evolution of caste determination. Many of these results parallel molecular mechanisms of eusocial evolution in Hymenoptera. However, the specific solutions are remarkably different, thus revealing a striking case of convergence in one of the major evolutionary transitions in biological complexity.


Subjects
Blattellidae/genetics, Molecular Evolution, Genome, Isoptera/genetics, Social Behavior, Animals, Biological Evolution, Blattellidae/physiology, Isoptera/physiology, Phylogeny
18.
Nat Commun ; 5: 5495, 2014 Dec 16.
Article in English | MEDLINE | ID: mdl-25510865

ABSTRACT

Adaptation requires genetic variation, but founder populations are generally genetically depleted. Here we sequence two populations of an inbred ant that diverge in phenotype to determine how variability is generated. Cardiocondyla obscurior has the smallest of the sequenced ant genomes and its structure suggests a fundamental role of transposable elements (TEs) in adaptive evolution. Accumulations of TEs (TE islands) comprising 7.18% of the genome evolve faster than other regions with regard to single-nucleotide variants, gene/exon duplications and deletions and gene homology. A non-random distribution of gene families, larvae/adult specific gene expression and signs of differential methylation in TE islands indicate intragenomic differences in regulation, evolutionary rates and coalescent effective population size. Our study reveals a tripartite interplay between TEs, life history and adaptation in an invasive species.


Subjects
Ants/genetics, DNA Transposable Elements, Insect Genes, Insect Genome, Genomic Islands, Introduced Species, Physiological Adaptation, Animals, Biological Evolution, Brazil, DNA Methylation, Exons, Gene Deletion, Gene Duplication, Japan, Phylogeography, Single Nucleotide Polymorphism
19.
Nat Protoc ; 6(11): 1669-82, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21979275

ABSTRACT

T-Coffee (Tree-based consistency objective function for alignment evaluation) is a versatile multiple sequence alignment (MSA) method suitable for aligning most types of biological sequences. The main strength of T-Coffee is its ability to combine third party aligners and to integrate structural (or homology) information when building MSAs. The series of protocols presented here show how the package can be used to multiply align proteins, RNA and DNA sequences. The protein section shows how users can select the most suitable T-Coffee mode for their data set. Detailed protocols include T-Coffee, the default mode, M-Coffee, a meta version able to combine several third party aligners into one, PSI (position-specific iterated)-Coffee, the homology extended mode suitable for remote homologs and Expresso, the structure-based multiple aligner. We then also show how the T-RMSD (tree based on root mean square deviation) option can be used to produce a functionally informative structure-based clustering. RNA alignment procedures are described for using R-Coffee, a mode able to use predicted RNA secondary structures when aligning RNA sequences. DNA alignments are illustrated with Pro-Coffee, a multiple aligner specific of promoter regions. We also present some of the many reformatting utilities bundled with T-Coffee. The package is an open-source freeware available from http://www.tcoffee.org/.


Subjects
DNA/chemistry, Nucleic Acid Conformation, Proteins/chemistry, RNA/chemistry, Sequence Alignment/methods, Algorithms, Amino Acid Sequence, Base Sequence, Molecular Models, Molecular Sequence Data, Software