Search | Portal Regional da BVS (VHL Regional Portal)

MS4--Multi-Scale Selector of Sequence Signatures: an alignment-free method for classification of biological sequences.

Corel, Eduardo; Pitschi, Florian; Laprevotte, Ivan; Grasseau, Gilles; Didier, Gilles; Devauchelle, Claudine.

BMC Bioinformatics ; 11: 406, 2010 Jul 30.

Article in English | MEDLINE | ID: mdl-20673356

ABSTRACT

BACKGROUND: Multiple alignment is the usual first step in classifying biological sequences, but alignment-free methods are increasingly used as alternatives when multiple alignments fail. Subword-based combinatorial methods are popular for their low algorithmic complexity (suffix trees ...) or their exhaustiveness (motif search), generally with a fixed word length and/or a fixed number of mismatches. We previously developed a method to detect local similarities (the N-local decoding) based on the occurrences of repeated subwords of fixed length, which does not impose a fixed number of mismatches. The resulting similarities are, for some "good" values of N, sufficiently relevant to form the basis of a reliable alignment-free classification. The aim of this paper is to develop a method that uses the similarities detected by N-local decoding without imposing a fixed value of N. We present a procedure that selects an adaptive value of N for every position in the sequences, and we implement it as the MS4 classification tool.

RESULTS: Among the equivalence classes produced by the N-local decodings for all N, we select a (relatively) small number of "relevant" classes corresponding to variable-length subwords that carry enough information to perform the classification. The parameter N, whose correct values are data-dependent and thus hard to guess, is replaced here by the average repetitivity kappa of the sequences. We show that our approach yields classifications of several sets of HIV/SIV sequences that agree with the accepted taxonomy, even on repetitive regions that are usually discarded (such as the non-coding part of the LTR).

CONCLUSIONS: MS4 satisfactorily classifies a set of sequences that are notoriously hard to align. This suggests that our approach forms the basis of a reliable alignment-free classification tool. MS4's only parameter, kappa, seems to give reasonable results even at its default value, which can be a great advantage for sequence sets about which little information is available.
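The fixed-length subword idea that the abstract starts from can be illustrated with a minimal Python sketch. It is not MS4 and not the N-local decoding: it only indexes the length-N subwords that occur more than once in a set of sequences and derives a simple Jaccard-style dissimilarity from shared subwords. The function names `repeated_words` and `shared_word_dissimilarity` are hypothetical, chosen for this illustration.

```python
from collections import defaultdict


def repeated_words(seqs, n):
    """Map every length-n subword occurring at least twice (within or across
    sequences) to the list of its occurrences as (sequence index, start)."""
    occurrences = defaultdict(list)
    for seq_id, seq in enumerate(seqs):
        for start in range(len(seq) - n + 1):
            occurrences[seq[start:start + n]].append((seq_id, start))
    return {word: pos for word, pos in occurrences.items() if len(pos) > 1}


def shared_word_dissimilarity(a, b, n):
    """Return 1 minus the Jaccard index of the length-n subword sets of a and b.
    This is a generic alignment-free measure, assumed here for illustration."""
    words_a = {a[i:i + n] for i in range(len(a) - n + 1)}
    words_b = {b[i:i + n] for i in range(len(b) - n + 1)}
    union = words_a | words_b
    if not union:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(union)


if __name__ == "__main__":
    sequences = ["ACGTACG", "TACGA"]
    print(repeated_words(sequences, 3))
    print(shared_word_dissimilarity(sequences[0], sequences[1], 3))
```

The result depends strongly on the choice of N, which is the difficulty the abstract describes: a word length that is too short makes almost everything repeat, while one that is too long leaves almost nothing shared.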

Subjects

Classification/methods, Computational biology/methods, Software, Algorithms, Amino acid sequence, Base sequence, nef genes, Viral genome, HIV/classification, HIV/genetics, HIV long terminal repeat, Simian immunodeficiency virus/classification, Simian immunodeficiency virus/genetics

Complex molecular assemblies at hand via interactive simulations.

Delalande, Olivier; Férey, Nicolas; Grasseau, Gilles; Baaden, Marc.

J Comput Chem ; 30(15): 2375-87, 2009 Nov 30.

Article in English | MEDLINE | ID: mdl-19353597

ABSTRACT

Studying complex molecular assemblies interactively is becoming an increasingly appealing approach to molecular modeling. Here we focus on interactive molecular dynamics (IMD) as a textbook example of interactive simulation methods. Such simulations can be useful in exploring and generating hypotheses about the structural and mechanical aspects of biomolecular interactions. For the first time, we carry out low-resolution coarse-grain IMD simulations. Such simplified modeling methods currently appear to be more suitable for interactive experiments and represent a well-balanced compromise between a substantial gain in computational speed and a moderate loss in modeling accuracy compared to higher-resolution all-atom simulations. This is particularly useful for initial exploration and hypothesis development for rare molecular interaction events. We evaluate which applications are currently feasible using molecular assemblies from 1,900 to over 300,000 particles. Three biochemical systems are discussed: the guanylate kinase (GK) enzyme, the outer membrane protease T, and the soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) complex involved in membrane fusion. We induce large conformational changes, carry out interactive docking experiments, probe lipid-protein interactions, and are able to sense the mechanical properties of a molecular model. Furthermore, such interactive simulations facilitate the exploration of modeling parameters for method improvement. For the purpose of these simulations, we have developed a freely available software library called MDDriver. It uses the IMD protocol from NAMD and facilitates the implementation and application of interactive simulations. With MDDriver it becomes very easy to render any particle-based molecular simulation engine interactive. Here we use its implementation in the Gromacs software as an example.
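The core idea of an interactive simulation, a user-supplied force added to the model forces at every integration step, can be shown with a toy Python sketch. It does not use MDDriver, NAMD, or Gromacs and makes no claim about their APIs. The system (one mobile bead tied to a fixed bead by a harmonic bond, integrated in one dimension with velocity Verlet) and all names and parameter values are assumptions made for illustration.

```python
def simulate(steps, user_force, dt=0.01, k=10.0, r0=1.0, mass=1.0):
    """Integrate one mobile bead bonded harmonically to a bead fixed at x = 0.

    user_force(step) stands in for the interactive input: it returns the
    extra force applied to the mobile bead at that step.
    Returns the final position of the mobile bead.
    """
    x, v = r0, 0.0  # start at the bond's rest length, at rest

    def total_force(position, step):
        # model force (harmonic bond) plus the externally applied force
        return -k * (position - r0) + user_force(step)

    f = total_force(x, 0)
    for step in range(steps):
        v += 0.5 * dt * f / mass      # first half kick
        x += dt * v                   # drift
        f = total_force(x, step + 1)  # forces at the new position
        v += 0.5 * dt * f / mass      # second half kick
    return x


if __name__ == "__main__":
    print(simulate(100, lambda step: 0.0))  # no pulling: bead stays at rest
    print(simulate(100, lambda step: 5.0))  # constant pull: bond stretches
```

In a real interactive setup the callback would be fed by an input device or a visualization client rather than by a fixed function, and the response of the model to the pull is what lets the user sense its mechanical properties.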

Subjects

Bacterial outer membrane proteins/chemistry, Computer simulation, Escherichia coli proteins/chemistry, Guanylate kinases/chemistry, Chemical models, Peptide hydrolases/chemistry, SNARE proteins/chemistry, Guanylate kinases/metabolism, Molecular models, Software

SIMoNe: Statistical Inference for MOdular NEtworks.

Chiquet, Julien; Smith, Alexander; Grasseau, Gilles; Matias, Catherine; Ambroise, Christophe.

Bioinformatics ; 25(3): 417-8, 2009 Feb 01.

Article in English | MEDLINE | ID: mdl-19073589

ABSTRACT

SUMMARY: The R package SIMoNe (Statistical Inference for MOdular NEtworks) enables inference of gene-regulatory networks based on partial correlation coefficients from microarray experiments. Modelling gene expression data with a Gaussian graphical model (hereafter GGM), the algorithm estimates non-zero entries of the concentration matrix, in a sparse and possibly high-dimensional setting. Its originality lies in the fact that it searches for a latent modular structure to drive the inference procedure through adaptive penalization of the concentration matrix. AVAILABILITY: Under the GNU General Public Licence at http://cran.r-project.org/web/packages/simone/
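The link between the concentration matrix and the inferred network can be made concrete with a small Python sketch. In a Gaussian graphical model, the partial correlation between variables i and j given all others is -K[i][j] / sqrt(K[i][i] * K[j][j]), where K is the concentration (inverse covariance) matrix, and an edge is present exactly where K[i][j] is non-zero. The sketch assumes K is already known; it does not reproduce SIMoNe's penalized estimation or its search for modular structure, and the function names are hypothetical.

```python
import math


def partial_correlations(K):
    """Return the matrix of partial correlations implied by the
    concentration matrix K (a list of lists, assumed symmetric with a
    positive diagonal)."""
    p = len(K)
    return [
        [1.0 if i == j else -K[i][j] / math.sqrt(K[i][i] * K[j][j])
         for j in range(p)]
        for i in range(p)
    ]


def network_edges(K, tol=1e-12):
    """List the pairs (i, j), i < j, whose concentration entry is non-zero
    up to the tolerance tol; these are the edges of the graph."""
    p = len(K)
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(K[i][j]) > tol]


if __name__ == "__main__":
    # Toy chain 0 - 1 - 2: genes 0 and 2 are conditionally independent.
    K = [[2.0, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]
    print(partial_correlations(K))
    print(network_edges(K))
```

In practice K has to be estimated from expression data, and in the sparse, high-dimensional setting the abstract mentions, that estimation step is where the penalization matters.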

Subjects

Algorithms, Gene regulatory networks, Software, Computer simulation, Genetic databases, Gene expression profiling
