Search | Virtual Health Library

Results 1 - 11 of 11

Cophylogeny reconstruction via an approximate Bayesian computation.

Baudet, C; Donati, B; Sinaimeri, B; Crescenzi, P; Gautier, C; Matias, C; Sagot, M-F.

Syst Biol ; 64(3): 416-31, 2015 May.

Article in English | MEDLINE | ID: mdl-25540454

ABSTRACT

Despite an increasingly vast literature on cophylogenetic reconstructions for studying host-parasite associations, understanding the common evolutionary history of such systems remains a problem that is far from being solved. Most algorithms for host-parasite reconciliation use an event-based model, where the events include in general (a subset of) cospeciation, duplication, loss, and host switch. All known parsimonious event-based methods then assign a cost to each type of event in order to find a reconstruction of minimum cost. The main problem with this approach is that the cost of the events strongly influences the reconciliation obtained. Some earlier approaches attempt to avoid this problem by finding a Pareto set of solutions and hence by considering event costs under some minimization constraints. To deal with this problem, we developed an algorithm, called Coala, for estimating the frequency of the events based on an approximate Bayesian computation approach. The benefits of this method are 2-fold: (i) it provides more confidence in the set of costs to be used in a reconciliation, and (ii) it allows estimation of the frequency of the events in cases where the data set consists of trees with a large number of taxa. We evaluate our method on simulated and on biological data sets. We show that in both cases, for the same pair of host and parasite trees, different sets of frequencies for the events lead to equally probable solutions. Moreover, often these solutions differ greatly in terms of the number of inferred events. It appears crucial to take this into account before attempting any further biological interpretation of such reconciliations. More generally, we also show that the set of frequencies can vary widely depending on the input host and parasite trees. Indiscriminately applying a standard vector of costs may thus not be a good strategy.

Subjects

Algorithms , Classification/methods , Phylogeny , Animals , Arthropods/classification , Arthropods/microbiology , Bayes Theorem , Host-Parasite Interactions , Wolbachia/classification , Wolbachia/physiology
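
The approximate Bayesian computation idea behind the abstract above can be sketched with a plain rejection sampler. This is only a toy illustration and not Coala's implementation: the flat Dirichlet prior, the independent-event simulator, the L1 distance, the tolerance, and the observed counts are all assumptions made for the example (a real cophylogeny method would simulate and compare trees rather than event counts).

```python
import random

# Event types named in the abstract; everything else here is hypothetical.
EVENTS = ("cospeciation", "duplication", "loss", "host_switch")


def sample_prior(rng):
    """Draw an event-frequency vector from a flat Dirichlet prior."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in EVENTS]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_counts(freqs, n_events, rng):
    """Toy simulator: draw n_events independent events with these frequencies."""
    picks = rng.choices(range(len(EVENTS)), weights=freqs, k=n_events)
    return [picks.count(i) for i in range(len(EVENTS))]


def distance(counts_a, counts_b):
    """L1 distance between two count vectors after normalising to frequencies."""
    na, nb = sum(counts_a), sum(counts_b)
    return sum(abs(a / na - b / nb) for a, b in zip(counts_a, counts_b))


def abc_rejection(observed, n_trials=5000, tolerance=0.3, seed=0):
    """Keep the prior draws whose simulated data fall close to the observed data."""
    rng = random.Random(seed)
    n_events = sum(observed)
    accepted = []
    for _ in range(n_trials):
        freqs = sample_prior(rng)
        simulated = simulate_counts(freqs, n_events, rng)
        if distance(simulated, observed) <= tolerance:
            accepted.append(freqs)
    return accepted


def posterior_mean(accepted):
    """Average the accepted frequency vectors component by component."""
    n = len(accepted)
    return [sum(v[i] for v in accepted) / n for i in range(len(EVENTS))]


if __name__ == "__main__":
    observed = [60, 20, 15, 5]  # hypothetical event counts, for illustration only
    kept = abc_rejection(observed)
    print(len(kept), [round(x, 2) for x in posterior_mean(kept)])
```

The accepted vectors approximate a posterior over event frequencies; a wide spread among them would echo the abstract's point that quite different frequency sets can explain the same data about equally well.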

Mitochondrial respiration and genomic analysis provide insight into the influence of the symbiotic bacterium on host trypanosomatid oxygen consumption.

Azevedo-Martins, A C; Machado, A C L; Klein, C C; Ciapina, L; Gonzaga, L; Vasconcelos, A T R; Sagot, M F; de Souza, W; Einicker-Lamas, M; Galina, A; Motta, M C M.

Parasitology ; 142(2): 352-62, 2015 Feb.

Article in English | MEDLINE | ID: mdl-25160925

ABSTRACT

Certain trypanosomatids co-evolve with an endosymbiotic bacterium in a mutualistic relationship that is characterized by intense metabolic exchanges. Symbionts were able to respire for up to 4 h after isolation from Angomonas deanei. FCCP (carbonyl cyanide-4-(trifluoromethoxy)phenylhydrazone) similarly increased respiration in wild-type and aposymbiotic protozoa, though a higher maximal O2 consumption capacity was observed in the symbiont-containing cells. Rotenone, a complex I inhibitor, did not affect A. deanei respiration, whereas TTFA (thenoyltrifluoroacetone), a complex II activity inhibitor, completely blocked respiration in both strains. Antimycin A and cyanide, inhibitors of complexes III and IV, respectively, abolished O2 consumption, but the aposymbiotic protozoa were more sensitive to both compounds. Oligomycin did not affect cell respiration, whereas carboxyatractyloside (CAT), an inhibitor of the ADP-ATP translocator, slightly reduced O2 consumption. In the A. deanei genome, sequences encoding most proteins of the respiratory chain are present. The symbiont genome lost part of the electron transport system (ETS), but complex I, a cytochrome d oxidase, and FoF1-ATP synthase remain. In conclusion, this work suggests that the symbiont influences the mitochondrial respiration of the host protozoan.

Subjects

Bacteria/classification , Mitochondria/metabolism , Oxygen Consumption/physiology , Symbiosis/physiology , Trypanosomatina/microbiology , Trypanosomatina/physiology , Bacteria/metabolism , Biological Evolution , Electron Transport/genetics , Electron Transport/physiology , Gene Expression Regulation , Trypanosomatina/genetics

Current tools for the identification of miRNA genes and their targets.

Mendes, N D; Freitas, A T; Sagot, M-F.

Nucleic Acids Res ; 37(8): 2419-33, 2009 May.

Article in English | MEDLINE | ID: mdl-19295136

ABSTRACT

The discovery of microRNAs (miRNAs), almost 10 years ago, changed dramatically our perspective on eukaryotic gene expression regulation. However, the broad and important functions of these regulators are only now becoming apparent. The expansion of our catalogue of miRNA genes and the identification of the genes they regulate owe much to the development of sophisticated computational tools that have helped either to focus or interpret experimental assays. In this article, we review the methods for miRNA gene finding and target identification that have been proposed in the last few years. We identify some problems that current approaches have not yet been able to overcome and we offer some perspectives on the next generation of computational methods.

Subjects

Computational Biology/methods , MicroRNAs/genetics , MicroRNAs/metabolism , RNA Interference , Messenger RNA/chemistry , Animals , Genes , MicroRNAs/biosynthesis , Messenger RNA/metabolism

Inferring regulatory elements from a whole genome. An analysis of Helicobacter pylori sigma(80) family of promoter signals.

Vanet, A; Marsan, L; Labigne, A; Sagot, M F.

J Mol Biol ; 297(2): 335-53, 2000 Mar 24.

Article in English | MEDLINE | ID: mdl-10715205

ABSTRACT

Helicobacter pylori is adapted to life in a unique niche, the gastric epithelium of primates. Its promoters may therefore be different from those of other bacteria. Here, we determine motifs possibly involved in the recognition of such promoter sequences by the RNA polymerase using a new motif identification method. An important feature of this method is that the motifs are sought with the fewest possible assumptions about what they may look like. The method starts by considering the whole genome of H. pylori and attempts to infer directly from it a description for a family of promoters. Thus, this approach differs from searching for such promoters with a previously established description. The two algorithms are based on the idea of inferring motifs by flexibly comparing words in the sequences with an external object, instead of between themselves. The first algorithm infers single motifs, the second a combination of two motifs separated from one another by strictly defined, sterically constrained distances. Besides independently finding motifs known to be present in other bacteria, such as the Shine-Dalgarno sequence and the TATA-box, this approach suggests the existence in H. pylori of a new, combined motif, TTAAGC, followed optimally 21 bp downstream by TATAAT. Between these two motifs, there is in some cases another, TTTTAA or, less frequently, a repetition of TTAAGC separated optimally from the TATA-box by 12 bp. The combined motif TTAAGCx(21±2)TATAAT is present with no errors immediately upstream from the only two copies of the ribosomal 23S-5S RNA genes in H. pylori, and with one error upstream from the only two copies of the ribosomal 16S RNA genes. The operons of both ribosomal RNA molecules are strongly expressed, representing an encouraging sign of the pertinence of the motifs found by the algorithms. In 25 cases out of a possible 30, the combined motif is found with no more than three substitutions immediately upstream from ribosomal proteins, or operons containing a ribosomal protein. This is roughly the same frequency of occurrence as for TTGACAx(15-19)TATAAT (with the same maximum number of substitutions allowed) described as being the sigma(70) promoter sequence consensus in Bacillus subtilis and Escherichia coli. The frequency of occurrence of the new motif obtained, TTAAGCx(19-23)TATAAT, remains high when all protein genes in H. pylori are considered, as is the case for the TTGACAx(15-19)TATAAT motif in B. subtilis but not in E. coli.

Subjects

Consensus Sequence/genetics , DNA-Directed RNA Polymerases/metabolism , Bacterial Genome , Helicobacter pylori/genetics , Genetic Promoter Regions/genetics , Response Elements/genetics , Sigma Factor/metabolism , Algorithms , Bacillus subtilis/genetics , Bacterial Proteins/genetics , Base Sequence , Initiator Codon/genetics , Computational Biology/methods , Conserved Sequence/genetics , Bacterial DNA/genetics , Bacterial DNA/metabolism , Escherichia coli/genetics , Bacterial Genes/genetics , rRNA Genes/genetics , Operon/genetics , Reproducibility of Results , Ribosomal Proteins/genetics , Statistics as Topic , TATA Box/genetics
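
To make the combined motif notation concrete, the following sketch scans a sequence for TTAAGCx(19-23)TATAAT while tolerating a few substitutions. It is a naive search for an already known motif, not the inference algorithm described in the abstract. The function name, the return format, and the choice to share a single substitution budget across both boxes are assumptions for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_combined_motif(seq, box1="TTAAGC", box2="TATAAT",
                        min_gap=19, max_gap=23, max_subs=3):
    """Return (start, gap, substitutions) for every match of box1 x(gap) box2.

    The substitution budget max_subs is shared by both boxes, which is an
    assumption; the abstract only states a maximum number of substitutions.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - len(box1) + 1):
        subs1 = hamming(seq[start:start + len(box1)], box1)
        if subs1 > max_subs:
            continue
        for gap in range(min_gap, max_gap + 1):
            start2 = start + len(box1) + gap
            window = seq[start2:start2 + len(box2)]
            if len(window) < len(box2):
                break  # longer gaps would run past the end of the sequence
            total = subs1 + hamming(window, box2)
            if total <= max_subs:
                hits.append((start, gap, total))
    return hits


if __name__ == "__main__":
    # Synthetic sequence built for the example, not taken from H. pylori.
    demo = "GG" + "TTAAGC" + "A" * 21 + "TATAAT" + "CC"
    print(find_combined_motif(demo, max_subs=0))
```

Applied to the regions immediately upstream of a gene set, a count of the sequences with at least one hit would give the kind of frequency the abstract reports, under the stated assumptions.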

Identifying satellites and periodic repetitions in biological sequences.

Sagot, M F; Myers, E W.

J Comput Biol ; 5(3): 539-53, 1998.

Article in English | MEDLINE | ID: mdl-9773349

ABSTRACT

We present in this paper an algorithm for identifying satellites in DNA sequences. Satellites (simple, micro, or mini) are repeats in number between 30 and as many as 1,000,000 whose lengths vary between 2 and hundreds of base pairs and that appear, with some mutations, in tandem along the sequence. We concentrate here on short to moderately long (up to 30-40 base pairs) approximate tandem repeats where copies may differ up to epsilon = 15-20% from a consensus model of the repeating unit (implying individual units may vary by 2 epsilon from each other). The algorithm is composed of two parts. The first one consists of a filter that basically eliminates all regions whose probability of containing a satellite is less than one in 10^4 when epsilon = 10%. The second part realizes an exhaustive exploration of the space of all possible models for the repeating units present in the sequence. It therefore has the advantage over previous work of being able to report a consensus model, say m, of the repeated unit as well as the span of the satellite. The first phase was designed for efficiency and takes only O(n) time where n is the length of the sequence. The second phase was designed for sensitivity and takes time O(n · N(e, k)) in the worst case where k is the length of the repeating unit m, e = [epsilon k] is the number of differences allowed between each repeat unit and the model m, and N(e, k) is the maximum number of words that are not more than e differences from another word of length k. That is, N(e, k) is the maximum size of an e-neighborhood of a string of length k. Experiments reveal the second phase to be considerably faster in practice than the worst-case complexity bound suggests. Finally, the present algorithm is easily adapted to finding tandem repeats in protein sequences, as well as extended to identifying mixed direct-inverse tandem repeats.

Subjects

Algorithms , Satellite DNA/analysis , Tandem Repeat Sequences , Base Sequence , Fungal Chromosomes , Fungal DNA , Molecular Sequence Data , Saccharomyces cerevisiae/genetics
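
The quantity N(e, k) in the complexity bound above can be made tangible with a small calculation. The sketch below assumes that "differences" means substitutions only (Hamming distance), in which case every word of length k has the same neighborhood size; if insertions and deletions are also allowed, the count differs and this formula no longer applies. The function names are hypothetical.

```python
from itertools import product
from math import comb


def neighborhood_size(k, e, alphabet_size=4):
    """Words within e substitutions of a fixed word of length k.

    Choose which i positions differ, then one of the (alphabet_size - 1)
    other symbols at each of them, for i = 0..e.
    """
    return sum(comb(k, i) * (alphabet_size - 1) ** i
               for i in range(min(e, k) + 1))


def neighborhood_size_brute_force(word, e, alphabet="ACGT"):
    """Same count by enumerating every word of the same length (small k only)."""
    return sum(
        1
        for cand in product(alphabet, repeat=len(word))
        if sum(a != b for a, b in zip(cand, word)) <= e
    )


if __name__ == "__main__":
    print(neighborhood_size(6, 2), neighborhood_size_brute_force("ACGTAC", 2))
```

Under this substitution-only assumption the neighborhood grows quickly with e, which is consistent with the second phase being the costly one in the worst case.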

Algorithms for extracting structured motifs using a suffix tree with an application to promoter and regulatory site consensus identification.

Marsan, L; Sagot, M F.

J Comput Biol ; 7(3-4): 345-62, 2000.

Article in English | MEDLINE | ID: mdl-11108467

ABSTRACT

This paper introduces two exact algorithms for extracting conserved structured motifs from a set of DNA sequences. Structured motifs may be described as an ordered collection of p ≥ 1 "boxes" (each box corresponding to one part of the structured motif), p substitution rates (one for each box) and p - 1 intervals of distance (one for each pair of successive boxes in the collection). The contents of the boxes (that is, the motifs themselves) are unknown at the start of the algorithm. This is precisely what the algorithms are meant to find. A suffix tree is used for finding such motifs. The algorithms are efficient enough to be able to infer site consensi, such as, for instance, promoter sequences or regulatory sites, from a set of unaligned sequences corresponding to the noncoding regions upstream from all genes of a genome. In particular, the time complexity of both algorithms scales linearly with N^2 n, where n is the average length of the sequences and N their number. An application to the identification of promoter and regulatory consensus sequences in bacterial genomes is shown.

Subjects

Algorithms , DNA Sequence Analysis/statistics & numerical data , Binding Sites/genetics , Computational Biology , Consensus Sequence , Bacterial DNA/genetics , Bacterial DNA/metabolism , Regulator Genes , Bacterial Genome , Genetic Models , Genetic Promoter Regions

Occurrence probability of structured motifs in random sequences.

Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S.

J Comput Biol ; 9(6): 761-73, 2002.

Article in English | MEDLINE | ID: mdl-12614545

ABSTRACT

The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.

Subjects

Base Sequence , Computational Biology , Genetic Transcription , Algorithms , Bacillus subtilis/genetics , Computer Simulation , Escherichia coli/genetics , Genetic Models , Probability , Genetic Promoter Regions
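
The abstract above proposes analytic approximations and checks them by simulation. A simulation baseline of that kind might look like the sketch below, which estimates by Monte Carlo the probability that a two-box structured motif occurs at least once in a random sequence. The uniform, independent-letter sequence model, the per-box substitution limits, and all function and parameter names are assumptions for the example and are not taken from the paper.

```python
import random


def has_occurrence(seq, box1, box2, min_gap, max_gap, subs1, subs2):
    """True if box1 x(min_gap..max_gap) box2 occurs with at most
    subs1 / subs2 substitutions in the respective boxes."""
    for start in range(len(seq) - len(box1) + 1):
        first = seq[start:start + len(box1)]
        if sum(a != b for a, b in zip(first, box1)) > subs1:
            continue
        for gap in range(min_gap, max_gap + 1):
            s2 = start + len(box1) + gap
            window = seq[s2:s2 + len(box2)]
            if (len(window) == len(box2)
                    and sum(a != b for a, b in zip(window, box2)) <= subs2):
                return True
    return False


def estimate_occurrence_probability(box1, box2, min_gap, max_gap, subs1, subs2,
                                    seq_len, n_sims=2000, alphabet="ACGT",
                                    seed=0):
    """Fraction of simulated random sequences containing the motif."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Assumed null model: independent, uniformly distributed letters.
        seq = "".join(rng.choice(alphabet) for _ in range(seq_len))
        hits += has_occurrence(seq, box1, box2, min_gap, max_gap, subs1, subs2)
    return hits / n_sims


if __name__ == "__main__":
    p = estimate_occurrence_probability("TTGACA", "TATAAT", 15, 19, 1, 1,
                                        seq_len=200)
    print(p)
```

An estimate like this is only as good as the null model; a sequence model that reflects base composition or short-range dependence would be the natural refinement.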

Promoter sequences and algorithmical methods for identifying them.

Vanet, A; Marsan, L; Sagot, M F.

Res Microbiol ; 150(9-10): 779-99, 1999.

Article in English | MEDLINE | ID: mdl-10673015

ABSTRACT

This paper presents a survey of currently available mathematical models and algorithmical methods for trying to identify promoter sequences. The methods concern both searching in a genome for a previously defined consensus and extracting a consensus from a set of sequences. Such methods were often tailored for either eukaryotes or prokaryotes although this does not preclude use of the same method for both types of organisms. The survey therefore covers all methods; however, emphasis is placed on prokaryotic promoter sequence identification. Illustrative applications of the main extracting algorithms are given for three bacteria.

Subjects

Algorithms , Prokaryotic Cells/chemistry , Genetic Promoter Regions/genetics , Bacteria/genetics , Base Composition , Base Sequence , Consensus Sequence , Statistical Models , Molecular Sequence Data , Sequence Analysis

Wolbachia detection: an assessment of standard PCR protocols.

Simões, P M; Mialdea, G; Reiss, D; Sagot, M-F; Charlat, S.

Mol Ecol Resour ; 11(3): 567-72, 2011 May.

Article in English | MEDLINE | ID: mdl-21481216

ABSTRACT

Wolbachia is a large monophyletic genus of intracellular bacteria, traditionally detected using PCR assays. Its considerable phylogenetic diversity and impact on arthropods and nematodes make it urgent to assess the efficiency of these screening protocols. The sensitivity and range of commonly used PCR primers and of a new set of 16S primers were evaluated on a wide range of hosts and Wolbachia strains. We show that certain primer sets are significantly more efficient than others but that no single protocol can ensure the specific detection of all known Wolbachia infections.

Subjects

Entomology/methods , Polymerase Chain Reaction/methods , Wolbachia/isolation & purification , Animals , Arthropods/microbiology , DNA Primers/genetics , Nematoda/microbiology , 16S Ribosomal RNA/genetics , Sensitivity and Specificity , Wolbachia/genetics

Finding flexible patterns in a text: an application to three-dimensional molecular matching.

Sagot, M F; Viari, A; Pothier, J; Soldano, H.

Comput Appl Biosci ; 11(1): 59-70, 1995 Feb.

Article in English | MEDLINE | ID: mdl-7796276

ABSTRACT

Finding certain regularities in a text is an important problem in many areas, e.g. in the analysis of biological molecules such as nucleic acids or proteins. In the latter case, the text may be sequences of amino acids or a linear coding of three-dimensional structures, and the regularities then correspond to lexical or structural motifs common to two, or more, proteins. We first recall an earlier algorithm that found these regularities in a flexible way. Then we introduce a generalized version of this algorithm designed for the particular case of protein three-dimensional structures, since these structures present a few peculiarities that make them computationally harder to process. Finally, we give some applications of our new algorithm on concrete examples.

Subjects

Algorithms , Automated Pattern Recognition , Proteins/chemistry , Cytochrome P-450 Enzyme System/chemistry , Factual Databases , Molecular Models , Statistical Models , Molecular Structure , Protein Conformation , Sequence Alignment/methods , Sequence Alignment/statistics & numerical data

A distance-based block searching algorithm.

Sagot, M F; Viari, A; Soldano, H.

Proc Int Conf Intell Syst Mol Biol ; 3: 322-31, 1995.

Article in English | MEDLINE | ID: mdl-7584455

ABSTRACT

We present in this paper an algorithm for the multiple comparison of a set of protein sequences. Our approach is that of peptide matching and consists of looking for all the words that occur approximately in at least q of the sequences in the set, where q is a parameter. Words are compared by using a reference object called a model, that is itself a word over the alphabet of the amino acids, and the comparison between a model and a word is based on w-length words instead of single symbols. This idea is similar to the one used in the Blast program in the case of pairwise comparisons. Two w-length words are considered to be related if an alignment without gaps of the two using a similarity matrix has a score greater than a certain threshold value t. In our case, we say that a k-length word u is an occurrence of a model m of the same length if every w-length subword of u is related to the corresponding subword of m in the sense given above. If a model m has occurrences in at least q of the sequences of the set, m is said to occur in the set. In percentage terms, the value of q may correspond to something as small as 5% of the sequences (search for recurrent words in a set of non-homologous proteins) or as high as 70-100% (establishment of a list of all similar words as a first step in a multiple alignment program). The algorithm presented here is an efficient and exact way of looking for all the models, of a fixed length k or of the greatest possible length kmax, that occur in a set of sequences. It can work with any kind of scoring matrix and an extension of the algorithm allows for the introduction of gaps between a model and its occurrences.

Subjects

Algorithms , Proteins/chemistry , Amino Acid Sequence Homology , Amino Acid Sequence , Animals , Computer Simulation , Humans , Theoretical Models , Molecular Sequence Data , Software
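
The occurrence definition in the abstract above translates almost directly into code. The sketch below checks whether a word is an occurrence of a model, and whether a model occurs in at least q sequences. It only tests a given model and does not reproduce the paper's efficient search over all models. The match/mismatch scoring stands in for a real similarity matrix, and the function names are hypothetical.

```python
def score(a, b, match=2, mismatch=-1):
    """Gapless alignment score of two equal-length words.

    A toy stand-in for a similarity matrix such as those used for proteins.
    """
    return sum(match if x == y else mismatch for x, y in zip(a, b))


def is_occurrence(u, m, w, t):
    """u is an occurrence of model m if every w-length subword of u scores
    strictly above t against the corresponding subword of m."""
    if len(u) != len(m) or len(m) < w:
        return False
    return all(score(u[i:i + w], m[i:i + w]) > t
               for i in range(len(m) - w + 1))


def model_occurs(m, sequences, w, t, q):
    """True if m has an occurrence in at least q of the sequences
    (q is a count here, not a percentage)."""
    k = len(m)
    supporting = 0
    for seq in sequences:
        if any(is_occurrence(seq[i:i + k], m, w, t)
               for i in range(len(seq) - k + 1)):
            supporting += 1
    return supporting >= q


if __name__ == "__main__":
    seqs = ["GGACDEGG", "ACDFKK", "WWWW"]  # made-up sequences for the example
    print(model_occurs("ACDE", seqs, w=2, t=0, q=2))
```

Requiring every w-length subword to pass the threshold, rather than the whole word, prevents a strongly matching region from compensating for a poorly matching one.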
