Results 1 - 10 of 10
1.
BMC Bioinformatics ; 9: 456, 2008 Oct 27.
Article in English | MEDLINE | ID: mdl-18954438

ABSTRACT

BACKGROUND: The increasing availability of fungal genome sequences provides large numbers of proteins for evolutionary and phylogenetic analyses. However, the heterogeneity of data, including the quality of genome annotation and the difficulty of retrieving true orthologs, makes such investigations challenging. The aim of this study was to provide a reliable and integrated resource of orthologous gene families to perform comparative and phylogenetic analyses in fungi. DESCRIPTION: FUNYBASE is a database dedicated to the analysis of fungal single-copy genes extracted from available fungal genome sequences, their classification into reliable clusters of orthologs, and the assessment of their informative value for phylogenetic reconstruction based on amino acid sequences. The current release of FUNYBASE contains two types of protein data: (i) a complete set of protein sequences extracted from 30 public fungal genomes and classified into clusters of orthologs using a robust automated procedure, and (ii) a subset of 246 reliable ortholog clusters present as single-copy genes in 21 fungal genomes. For each of these 246 ortholog clusters, phylogenetic trees were reconstructed based on their amino acid sequences. To assess the informative value of each ortholog cluster, each was compared to a reference species tree constructed using a concatenation of roughly half of the 246 sequences that are best approximated by the WAG evolutionary model. The orthologs were classified according to a topological score, which measures their ability to recover the same topology as the reference species tree. The full results of these analyses are available online with a user-friendly interface that allows for searches to be performed by species name, the ortholog cluster, various keywords, or using the BLAST algorithm. Examples of fruitful utilization of FUNYBASE for investigation of fungal phylogenetics are also presented.
CONCLUSION: FUNYBASE constitutes a novel and useful resource for two types of analyses: (i) comparative studies can be greatly facilitated by reliable clusters of orthologs across sets of user-defined fungal genomes, and (ii) phylogenetic reconstruction can be improved by identifying genes with the highest informative value at the desired taxonomic level.


Subjects
Databases, Genetic , Genome, Fungal , Genomics/methods , Information Storage and Retrieval/methods , Phylogeny , Algorithms , Databases, Protein , Evolution, Molecular , Fungi/genetics , Genes, Fungal
2.
Theor Popul Biol ; 73(2): 289-99, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18190938

ABSTRACT

This paper provides a theoretical description of the chromosome architecture resulting from a given number of generations in a back-cross. It is worth considering chromosome architecture as being dependent on a marked point process, whose properties themselves depend on the crossing-over model used. The resulting architecture is presented here for two different models, one without interference, the other with complete interference. Exact distributions, with easy-to-compute formulae, are derived for quantities of interest, such as the lengths of donor or receiver fragments, for any chromosome length and for both crossing-over models. Examples are presented to illustrate the use of these distributions in introgression programs.
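As a rough illustration of the no-interference case, the sketch below simulates crossover positions for a single meiosis. It assumes that "without interference" means the usual Poisson crossover process with one expected crossover per Morgan, which the abstract does not spell out. The function names are hypothetical, and the multi-generation back-cross distributions derived in the paper are not reproduced here.

```python
import random


def crossover_positions(length_morgans, rng):
    """Crossover positions on one chromosome for a single meiosis.

    Assumption: no interference, modelled as a Poisson process with
    rate 1 per Morgan, so gaps between crossovers are exponential.
    """
    positions = []
    x = rng.expovariate(1.0)
    while x < length_morgans:
        positions.append(x)
        x += rng.expovariate(1.0)
    return positions


def fragment_lengths(length_morgans, positions):
    """Lengths of the fragments delimited by the crossover positions."""
    cuts = [0.0] + positions + [length_morgans]
    return [b - a for a, b in zip(cuts, cuts[1:])]


rng = random.Random(0)  # fixed seed so the run is repeatable
positions = crossover_positions(2.0, rng)
print(fragment_lengths(2.0, positions))
```

Generating exponential gaps avoids drawing a Poisson count first and keeps the positions sorted by construction.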


Subjects
Chromosomes/genetics , Crossing Over, Genetic/genetics , Models, Genetic , Plants/genetics , France , Models, Statistical , Poisson Distribution
3.
BMC Genomics ; 8: 272, 2007 Aug 10.
Article in English | MEDLINE | ID: mdl-17692127

ABSTRACT

BACKGROUND: The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. RESULTS: A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% showed weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. CONCLUSION: This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics.


Subjects
Expressed Sequence Tags , Genes, Fungal/genetics , Genes, Mating Type, Fungal/genetics , Virulence/genetics , Base Sequence , DNA, Fungal , Databases, Nucleic Acid , Fungi , Gene Library , Plant Diseases/microbiology
4.
BMC Genomics ; 7: 194, 2006 Aug 01.
Article in English | MEDLINE | ID: mdl-16882342

ABSTRACT

BACKGROUND: Comparative mapping provides new insights into the evolutionary history of genomes. In particular, recent studies in mammals have suggested a role for segmental duplication in genome evolution. In some species such as Drosophila or maize, transposable elements (TEs) have been shown to be involved in chromosomal rearrangements. In this work, we have explored the presence of interspersed repeats in regions of chromosomal rearrangements, using an updated high-resolution integrated comparative map among cattle, man and mouse. RESULTS: The bovine, human and mouse comparative autosomal map has been constructed using data from bovine genetic and physical maps and from FISH-mapping studies. We confirm most previous results but also reveal some discrepancies. A total of 211 conserved segments have been identified between cattle and man, of which 33 are new segments and 72 correspond to extended, previously known segments. The resulting map covers 91% and 90% of the human and bovine genomes, respectively. Analysis of breakpoint regions revealed a high density of species-specific interspersed repeats in the human and mouse genomes. CONCLUSION: Analysis of the breakpoint regions has revealed specific repeat density patterns, suggesting that TEs may have played a significant role in chromosome evolution and genome plasticity. However, we cannot rule out that repeats and breakpoints accumulate independently in the same few regions where modifications are better tolerated. Likewise, we cannot ascertain whether increased TE density is the cause or the consequence of chromosome rearrangements. Nevertheless, the identification of high-density repeat clusters combined with a well-documented repeat phylogeny should highlight probable breakpoints, and permit their precise dating. Combining new statistical models taking the present information into account should help reconstruct ancestral karyotypes.


Subjects
Chromosome Mapping/methods , Evolution, Molecular , Genome/genetics , Animals , Cattle , Chromosome Breakage , Humans , Mice , Repetitive Sequences, Nucleic Acid , Translocation, Genetic
5.
BMC Struct Biol ; 6: 25, 2006 Dec 13.
Article in English | MEDLINE | ID: mdl-17166267

ABSTRACT

BACKGROUND: Secondary structure prediction is a useful first step toward 3D structure prediction. A number of successful secondary structure prediction methods use neural networks, but unfortunately, neural networks are not intuitively interpretable. In contrast, hidden Markov models are graphical, interpretable models. Moreover, they have been successfully used in many bioinformatic applications. Because they offer a strong statistical background and allow model interpretation, we propose a method based on hidden Markov models. RESULTS: Our HMM is designed without prior knowledge. It is chosen within a collection of models of increasing size, using statistical and accuracy criteria. The resulting model has 36 hidden states: 15 that model alpha-helices, 12 that model coil and 9 that model beta-strands. Connections between hidden states and state emission probabilities reflect the organization of protein structures into secondary structure segments. We start by analyzing the model features and see how it offers a new vision of local structures. We then use it for secondary structure prediction. Our model appears to be very efficient on single sequences, with a Q3 score of 68.8%, more than one point above PSIPRED prediction on single sequences. A straightforward extension of the method allows the use of multiple sequence alignments, raising the Q3 score to 75.5%. CONCLUSION: The hidden Markov model presented here achieves valuable prediction results using only a limited number of parameters. It provides an interpretable framework for protein secondary structure architecture. Furthermore, it can be used as a tool for generating protein sequences with a given secondary structure content.
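For readers unfamiliar with the Q3 score quoted above, the sketch below shows how it is commonly computed: the percentage of residues whose predicted three-state label (helix, strand or coil) matches the observed one. The function name and the H/E/C letter coding are illustrative assumptions, not taken from the paper.

```python
def q3_score(predicted, observed):
    """Percentage of residues whose three-state label matches.

    Both arguments are strings over an assumed alphabet such as
    'H' (helix), 'E' (strand) and 'C' (coil), one letter per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Five of the six residues agree in this toy example.
print(q3_score("HHHEEC", "HHHECC"))
```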


Subjects
Computational Biology/methods , Markov Chains , Models, Chemical , Protein Structure, Secondary , Proteins/chemistry
6.
Nucleic Acids Res ; 30(6): 1418-26, 2002 Mar 15.
Article in English | MEDLINE | ID: mdl-11884641

ABSTRACT

We present here the use of a new statistical segmentation method on the Bacillus subtilis chromosome sequence. Maximum likelihood parameter estimation of a hidden Markov model, based on the expectation-maximization algorithm, enables one to segment the DNA sequence according to its local composition. This approach is not based on sliding windows; it enables different compositional classes to be separated without prior knowledge of their content, size and localization. We compared these compositional classes, obtained from the sequence, with the annotated DNA physical map, sequence homologies and repeat regions. The first heterogeneity revealed discriminates between the two coding strands and the non-coding regions. Other main heterogeneities arise; some are related to horizontal gene transfer, some to T-enriched composition of hydrophobic protein coding strands, and others to the codon usage fitness of highly expressed genes. Concerning potential and established gene transfers, we found 9 of the 10 known prophages, plus 14 new regions of atypical composition. Some of them are surrounded by repeats; most of their genes have unknown function or possess homology to genes involved in secondary catabolism, metal and antibiotic resistance. Surprisingly, we notice that all of these detected regions are richer in A+T than the host genome, raising the question of their remote sources.
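To make the idea of composition-based segmentation concrete, here is a minimal sketch of hidden Markov model decoding on a toy DNA string. It is not the authors' method: the paper estimates model parameters by expectation-maximization, whereas this example uses hand-picked, purely illustrative parameters for two hypothetical states ("AT" and "GC") and applies the standard Viterbi algorithm to label each base.

```python
import math


def viterbi(seq, states, log_init, log_trans, log_emit):
    """Most probable hidden-state path for seq (log-space Viterbi)."""
    # score[s]: best log-probability of any path ending in state s
    score = {s: log_init[s] + log_emit[s][seq[0]] for s in states}
    back = []
    for symbol in seq[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log_trans[r][s])
            score[s] = prev[best] + log_trans[best][s] + log_emit[s][symbol]
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative parameters only (assumed, not estimated from data).
STATES = ["AT", "GC"]
INIT = {"AT": 0.5, "GC": 0.5}
TRANS = {"AT": {"AT": 0.9, "GC": 0.1}, "GC": {"AT": 0.1, "GC": 0.9}}
EMIT = {
    "AT": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
    "GC": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4},
}


def segment(seq):
    """Label each base of seq with its most probable compositional class."""
    log = math.log
    return viterbi(
        seq,
        STATES,
        {s: log(p) for s, p in INIT.items()},
        {r: {s: log(p) for s, p in row.items()} for r, row in TRANS.items()},
        {s: {b: log(p) for b, p in row.items()} for s, row in EMIT.items()},
    )


print(segment("AAAATTTTGGGGCCCC"))
```

Because the state path is decoded globally, the boundary between classes is placed without choosing a window size, which is the property the abstract contrasts with sliding-window approaches.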


Subjects
Bacillus subtilis/genetics , Chromosomes, Bacterial , Markov Chains , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , DNA, Bacterial/classification , Gene Transfer, Horizontal , Genetic Variation , Hydrophobic and Hydrophilic Interactions , Likelihood Functions , Lysogeny , RNA, Bacterial/genetics , Repetitive Sequences, Nucleic Acid , Sequence Homology, Nucleic Acid
7.
BMC Bioinformatics ; 6: 150, 2005 Jun 16.
Article in English | MEDLINE | ID: mdl-15960854

ABSTRACT

BACKGROUND: Analysis of variance is a powerful approach to identify differentially expressed genes in a complex experimental design for microarray and macroarray data. The advantage of the ANOVA model is that it can evaluate multiple sources of variation in an experiment. RESULTS: AnovArray is a package implementing ANOVA for gene expression data using SAS statistical software. The originality of the package is 1) to quantify the different sources of variation on all genes together, 2) to provide a quality control of the model, 3) to propose two models for a gene's variance estimation and to perform a correction for multiple comparisons. CONCLUSION: AnovArray is freely available at http://www-mig.jouy.inra.fr/stat/AnovArray and requires only SAS statistical software.
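As a simplified illustration of the underlying statistic, the sketch below computes a one-way ANOVA F statistic for a single gene measured under several conditions. It is not AnovArray, which is a SAS package handling multi-factor designs, variance models and multiple-comparison correction. The function name and the toy data are assumptions made for this example.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA.

    groups: list of lists, one list of expression values per condition.
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by the condition (between groups)
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Residual variation (within groups)
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy expression values for one gene under three conditions.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

A large F value indicates that the condition explains much more variation than the residual noise; turning it into a p-value requires the F distribution with the two degrees of freedom computed above.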


Subjects
Computational Biology/methods , Gene Expression Regulation , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Analysis of Variance , Animals , Cattle , Computer Simulation , Data Interpretation, Statistical , Gene Expression , Gene Expression Profiling , Gene Library , Internet , Models, Statistical , Programming Languages , Quality Control , Reproducibility of Results , Sample Size , Sensitivity and Specificity , Sequence Analysis, DNA , Software , Tissue Distribution
8.
Genome Biol Evol ; 7(10): 2896-912, 2015 Oct 09.
Article in English | MEDLINE | ID: mdl-26454013

ABSTRACT

Deciphering the genetic bases of pathogen adaptation to its host is a key question in ecology and evolution. To understand how the fungus Magnaporthe oryzae adapts to different plants, we sequenced eight M. oryzae isolates differing in host specificity (rice, foxtail millet, wheat, and goosegrass), and one Magnaporthe grisea isolate specific to crabgrass. Analysis of Magnaporthe genomes revealed small variation in genome sizes (39-43 Mb) and gene content (12,283-14,781 genes) between isolates. The whole set of Magnaporthe genes comprised 14,966 shared families, 63% of which included genes present in all nine M. oryzae genomes. The evolutionary relationships among Magnaporthe isolates were inferred using 6,878 single-copy orthologs. The resulting genealogy was mostly bifurcating among the different host-specific lineages, but was reticulate inside the rice lineage. We detected traces of introgression from a nonrice genome in the rice reference 70-15 genome. Among M. oryzae isolates and host-specific lineages, the genome composition in terms of frequencies of genes putatively involved in pathogenicity (effectors, secondary metabolism, cazome) was conserved. However, 529 shared families were found only in nonrice lineages, whereas the rice lineage possessed 86 specific families absent from the nonrice genomes. Our results confirmed that the host specificity of M. oryzae isolates was associated with a divergence between lineages without major gene flow and that, despite the strong conservation of gene families between lineages, adaptation to different hosts, especially to rice, was associated with the presence of a small number of specific gene families. All information was gathered in a public database (http://genome.jouy.inra.fr/gemo).


Subjects
Evolution, Molecular , Genome, Fungal , Magnaporthe/genetics , Adaptation, Biological , Base Sequence , Biological Evolution , Burkholderia/genetics , Burkholderia/isolation & purification , DNA Transposable Elements , Digitaria/microbiology , Fungal Proteins/genetics , Genes, Fungal , Genetic Variation , Magnaporthe/isolation & purification , Oryza/microbiology , Plant Diseases/microbiology , Sequence Analysis, DNA
9.
J Comput Biol ; 19(1): 13-29, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22149633

ABSTRACT

We present a general method for assessing threading score significance. The threading score of a protein sequence, threaded onto a given structure, should be compared with the threading score distribution of a random amino-acid sequence, of the same length, threaded onto the same structure; small p-values indicate significantly high scores. We claim that, due to general protein contact map properties, this reference distribution is a Weibull extreme value distribution whose parameters depend on the threading method, the structure, the length of the query and the random sequence simulation model used. These parameters can be estimated off-line with simulated sequence samples, for different sequence lengths. They can further be interpolated at the exact length of a query, enabling the quick computation of the p-value.
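The interpolate-then-evaluate step described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard two-parameter Weibull survival function exp(-(x/scale)^shape) and linear interpolation, whereas the exact parameterization and interpolation scheme used in the paper are not given in the abstract. All function names and the tabulated values are hypothetical.

```python
import math
from bisect import bisect_left


def interpolate(lengths, values, n):
    """Linearly interpolate a tabulated parameter at sequence length n.

    lengths must be sorted; values outside the table are clamped.
    """
    if n <= lengths[0]:
        return values[0]
    if n >= lengths[-1]:
        return values[-1]
    j = bisect_left(lengths, n)
    x0, x1 = lengths[j - 1], lengths[j]
    t = (n - x0) / (x1 - x0)
    return values[j - 1] + t * (values[j] - values[j - 1])


def weibull_upper_tail(score, shape, scale):
    """P(S >= score) for a two-parameter Weibull (assumed form)."""
    if score <= 0:
        return 1.0
    return math.exp(-((score / scale) ** shape))


def threading_p_value(score, query_length, lengths, shapes, scales):
    """p-value of a score, using parameters pre-estimated per length."""
    shape = interpolate(lengths, shapes, query_length)
    scale = interpolate(lengths, scales, query_length)
    return weibull_upper_tail(score, shape, scale)


# Hypothetical parameters estimated off-line at lengths 100 and 200.
print(threading_p_value(15.0, 150, [100, 200], [2.0, 2.0], [10.0, 20.0]))
```

Because the expensive simulation is done once per tabulated length, evaluating a new query only costs one interpolation and one exponential.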


Subjects
Models, Statistical , Sequence Alignment/methods , Sequence Analysis/methods , Statistical Distributions , Algorithms , Amino Acid Sequence , Computational Biology/methods , Computer Simulation , Markov Chains , Protein Conformation , Proteins/chemistry
10.
Infect Genet Evol ; 12(5): 987-96, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22406010

ABSTRACT

The rapid evolution of particular genes is essential for the adaptation of pathogens to new hosts and new environments. Powerful methods have been developed for detecting targets of selection in the genome. Here we used divergence data to compare genes among four closely related fungal pathogens adapted to different hosts to elucidate the functions putatively involved in adaptive processes. For this goal, ESTs were sequenced in the specialist fungal pathogens Botrytis tulipae and Botrytis ficariarum, and compared with genome sequences of Botrytis cinerea and Sclerotinia sclerotiorum, responsible for diseases on over 200 plant species. A maximum likelihood-based analysis of 642 predicted orthologs detected 21 genes showing footprints of positive selection. These results were validated by resequencing nine of these genes in additional Botrytis species, showing they have also been rapidly evolving in other related species. Twenty of the 21 genes had not previously been identified as pathogenicity factors in B. cinerea, but some had functions related to plant-fungus interactions. The putative functions were involved in respiratory and energy metabolism, protein and RNA metabolism, signal transduction or virulence, similar to what was detected in previous studies using the same approach in other pathogens. Mutants of B. cinerea were generated for four of these genes as a first attempt to elucidate their functions.


Subjects
Botrytis/genetics , Evolution, Molecular , Genes, Fungal , Cell Line , Cluster Analysis , Computer Simulation , Genome, Fungal , Solanum lycopersicum/microbiology , Reproducibility of Results , Selection, Genetic , Sequence Analysis, DNA