Bios2mds: an R package for comparing orthologous protein families by metric multidimensional scaling.
Pelé, Julien; Bécu, Jean-Michel; Abdi, Hervé; Chabbert, Marie.
Affiliation
  • Pelé J; CNRS UMR 6214 - INSERM 1083, Faculté de Médecine, 3 rue Haute de Reculée, Angers, 49045, France.
BMC Bioinformatics ; 13: 133, 2012 Jun 15.
Article in English | MEDLINE | ID: mdl-22702410
ABSTRACT

BACKGROUND:

The distance matrix computed from multiple alignments of homologous sequences is widely used by distance-based phylogenetic methods to provide information on the evolution of protein families. This matrix can also be visualized in a low-dimensional space by metric multidimensional scaling (MDS). Applied to protein families, MDS provides information complementary to that derived from tree-based methods. Moreover, MDS gives a unique opportunity to compare orthologous sequence sets because it can add supplementary elements to a reference space.
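
As an illustration of how a distance matrix can be represented in a low-dimensional space, the following is a minimal sketch of classical (Torgerson) metric MDS. It is written in Python with NumPy and is not the bios2mds implementation; the function name `classical_mds` and the toy distance matrix are assumptions made for this example only.

```python
import numpy as np

def classical_mds(dist, n_components=2):
    """Classical metric MDS of a symmetric distance matrix (illustrative sketch)."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-center the squared distances to obtain a cross-product matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    cross_product = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order, so keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(cross_product)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Coordinates are eigenvectors scaled by the square roots of the eigenvalues;
    # negative eigenvalues (non-Euclidean distances) are clipped to zero here.
    coords = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return coords, eigvals

# Hypothetical distances between four elements (corners of a 3 x 4 rectangle).
toy = np.array([
    [0.0, 3.0, 4.0, 5.0],
    [3.0, 0.0, 5.0, 4.0],
    [4.0, 5.0, 0.0, 3.0],
    [5.0, 4.0, 3.0, 0.0],
])

coords, eigvals = classical_mds(toy, n_components=2)
print(coords)
print(eigvals)
```

Because the toy distances are exactly Euclidean in two dimensions, the two retained components reproduce them without loss. Distances derived from real alignments generally are not, so the leading eigenvalues indicate how much of the structure the low-dimensional view retains.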

RESULTS:

The R package bios2mds (from BIOlogical Sequences to MultiDimensional Scaling) has been designed to analyze multiple sequence alignments by MDS. Bios2mds starts with a sequence alignment, builds a matrix of distances between the aligned sequences, and represents this matrix by MDS to visualize a sequence space. The package also supports K-means clustering in the MDS-derived sequence space. Most importantly, bios2mds includes a function that projects supplementary elements (a.k.a. "out of sample" elements) onto the space defined by reference or "active" elements. Orthologous sequence sets can thus be compared in a straightforward way. The data analysis and visualization tools have been specifically designed for easy monitoring of the evolutionary drift of protein sub-families.
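
The projection of supplementary elements onto a reference space can be sketched as follows. This is a Python/NumPy illustration of the standard out-of-sample formula for classical MDS, not the bios2mds API; the function names `fit_reference_space` and `project_supplementary` and the example points are hypothetical, and the sketch assumes that every retained eigenvalue is strictly positive.

```python
import numpy as np

def fit_reference_space(dist_active, n_components=2):
    """Classical MDS of the active elements; returns what projection needs."""
    d2_active = np.asarray(dist_active, dtype=float) ** 2
    n = d2_active.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    eigvals, eigvecs = np.linalg.eigh(-0.5 * centering @ d2_active @ centering)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if np.any(eigvals <= 0):
        raise ValueError("retained eigenvalues must be positive for projection")
    coords = eigvecs * np.sqrt(eigvals)
    return coords, eigvals, d2_active

def project_supplementary(dist_sup, coords, eigvals, d2_active):
    """Project m supplementary elements given their (m, n) distances to the
    n active elements, without changing the reference space."""
    d2_sup = np.atleast_2d(np.asarray(dist_sup, dtype=float)) ** 2
    n = d2_active.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Center the supplementary squared distances with the active-set means.
    mean_d2 = d2_active.mean(axis=1)
    cross = -0.5 * (d2_sup - mean_d2) @ centering
    # Supplementary coordinates: cross-products times active coordinates,
    # rescaled by the inverse eigenvalues.
    return cross @ coords / eigvals

def pairwise(a, b):
    """Euclidean distances between the rows of a and the rows of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

# Hypothetical active (reference) and supplementary points in the plane.
active = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
supplementary = np.array([[1.5, 2.0], [6.0, 8.0]])

coords, eigvals, d2_active = fit_reference_space(pairwise(active, active))
dist_sup = pairwise(supplementary, active)
sup_coords = project_supplementary(dist_sup, coords, eigvals, d2_active)
print(sup_coords)
```

The reference space is computed from the active elements alone, so adding supplementary elements does not move the active ones. In this toy case the supplementary points lie in the same plane as the active points, so their projected positions preserve the distances to the active elements exactly; with real sequence data the projection is an approximation.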

CONCLUSIONS:

The bios2mds package provides the tools for a complete integrated pipeline aimed at the MDS analysis of multiple sets of orthologous sequences in the R statistical environment. In addition, because the analysis can be carried out from user-provided matrices, the projection function can be applied to any kind of data.
Subjects

Full text: 1 Database: MEDLINE Main subject: Software / Sequence Analysis, Protein / Receptors, G-Protein-Coupled Language: En Publication year: 2012 Document type: Article
