1.
Nucleic Acids Res ; 33(Database issue): D226-9, 2005 Jan 01.
Article in English | MEDLINE | ID: mdl-15608183

ABSTRACT

The SYSTERS project aims to provide a meaningful partitioning of the whole protein sequence space by a fully automatic procedure. A refined two-step algorithm assigns each protein to a family and a superfamily. The sequence data underlying SYSTERS release 4 now comprise several protein sequence databases derived from completely sequenced genomes (ENSEMBL, TAIR, SGD and GeneDB), in addition to the comprehensive Swiss-Prot/TrEMBL databases. The SYSTERS web server (http://systers.molgen.mpg.de) provides access to 158 153 SYSTERS protein families. To augment the automatically derived results, information from external databases such as Pfam and Gene Ontology is added to the web server. Furthermore, users can retrieve pre-processed analyses of families, such as multiple alignments and phylogenetic trees. New query options comprise a batch retrieval tool for functional inference about families based on automatic keyword extraction from sequence annotations. A new access point, PhyloMatrix, allows the retrieval of phylogenetic profiles of SYSTERS families across organisms with completely sequenced genomes.


Subjects
Databases, Protein; Proteins/classification; Sequence Analysis, Protein; Algorithms; Phylogeny; Software; User-Computer Interface
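
The phylogenetic profiles mentioned in the abstract above can be pictured as presence/absence vectors of a protein family across a fixed list of genomes. The following is a minimal sketch under that assumption only; the function names, family identifiers and organism list are hypothetical and do not reflect the actual PhyloMatrix interface or data.

```python
from typing import Dict, List, Set


def phylogenetic_profiles(
    family_members: Dict[str, Set[str]], genomes: List[str]
) -> Dict[str, List[int]]:
    """Return one 0/1 presence vector per family, ordered like `genomes`."""
    return {
        family: [1 if genome in organisms else 0 for genome in genomes]
        for family, organisms in family_members.items()
    }


def families_with_profile(
    profiles: Dict[str, List[int]], pattern: List[int]
) -> List[str]:
    """List the families whose profile equals `pattern` exactly."""
    return sorted(f for f, profile in profiles.items() if profile == pattern)


# Hypothetical toy data: which organisms contribute members to each family.
genomes = ["human", "mouse", "yeast"]
family_members = {
    "FAM_A": {"human", "mouse", "yeast"},
    "FAM_B": {"human", "mouse"},
    "FAM_C": {"yeast"},
}
profiles = phylogenetic_profiles(family_members, genomes)
```

Querying by pattern, for example `families_with_profile(profiles, [1, 1, 0])`, then selects families present in the two mammals but absent from yeast in this toy data.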
2.
Nucleic Acids Res ; 30(1): 299-300, 2002 Jan 01.
Article in English | MEDLINE | ID: mdl-11752319

ABSTRACT

We have integrated the protein families from SYSTERS and the expressed sequence tag (EST) clusters from our database GeneNest with SpliceNest, a new database mapping EST contigs into genomic DNA. The SYSTERS protein sequence cluster set provides an automatically generated classification of all sequences of the SWISS-PROT, TrEMBL and PIR databases into disjoint protein family and superfamily clusters. GeneNest is a database and software package for producing and visualizing gene indices from ESTs and mRNAs. Currently, the database comprises gene indices of human, mouse, Arabidopsis thaliana and zebrafish. SpliceNest is a web-based graphical tool to explore gene structure, including alternative splicing, based on a mapping of the EST consensus sequences from GeneNest to the complete human genome. The integration of SYSTERS, GeneNest and SpliceNest into one framework now permits an overall exploration of the whole sequence space covering protein, mRNA and EST sequences, as well as genomic DNA. The databases are available for querying and browsing at http://cmb.molgen.mpg.de.


Subjects
Alternative Splicing; Databases, Genetic; Databases, Nucleic Acid; Databases, Protein; Genome; Animals; Arabidopsis/genetics; Chromosome Mapping; Consensus Sequence; Expressed Sequence Tags; Humans; Information Storage and Retrieval; Internet; Mice; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; RNA, Messenger/genetics; Sequence Alignment; Systems Integration; Zebrafish/genetics
3.
BMC Bioinformatics ; 6: 15, 2005 Jan 22.
Article in English | MEDLINE | ID: mdl-15663796

ABSTRACT

BACKGROUND: Searching a biological sequence database with a query sequence looking for homologues has become a routine operation in computational biology. In spite of the high degree of sophistication of currently available search routines it is still virtually impossible to identify quickly and clearly a group of sequences that a given query sequence belongs to. RESULTS: We report on our developments in grouping all known protein sequences hierarchically into superfamily and family clusters. Our graph-based algorithms take into account the topology of the sequence space induced by the data itself to construct a biologically meaningful partitioning. We have applied our clustering procedures to a non-redundant set of about 1,000,000 sequences resulting in a hierarchical clustering which is being made available for querying and browsing at http://systers.molgen.mpg.de/. CONCLUSIONS: Comparisons with other widely used clustering methods on various data sets show the abilities and strengths of our clustering methods in producing a biologically meaningful grouping of protein sequences.


Subjects
Computational Biology/methods; Proteins/chemistry; Proteomics/methods; Algorithms; Cluster Analysis; Databases, Factual; Databases, Genetic; Databases, Nucleic Acid; Databases, Protein; Fungal Proteins/chemistry; Genetic Linkage; Genome; Information Storage and Retrieval; Models, Biological; Multigene Family; Phylogeny; Protein Structure, Tertiary; Reproducibility of Results; Sequence Alignment; Sequence Analysis, Protein; Software
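
To make the idea of graph-based grouping in the abstract above concrete, here is a deliberately simplified stand-in: single-linkage clustering, computed as connected components of a similarity graph with a score cutoff. It is not the algorithm of the paper, which additionally exploits the topology of the sequence space; the sequence identifiers, scores and thresholds are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def single_linkage_clusters(
    ids: Iterable[str],
    scored_pairs: Iterable[Tuple[str, str, float]],
    min_score: float,
) -> List[List[str]]:
    """Group ids connected by at least one chain of pairs scoring >= min_score."""
    ids = list(ids)
    parent: Dict[str, str] = {i: i for i in ids}

    def find(x: str) -> str:
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_pairs:
        if score >= min_score:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two components

    clusters: Dict[str, List[str]] = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise similarity scores between five sequences.
ids = ["p1", "p2", "p3", "p4", "p5"]
pairs = [("p1", "p2", 90.0), ("p2", "p3", 75.0), ("p4", "p5", 80.0), ("p3", "p4", 20.0)]
strict = single_linkage_clusters(ids, pairs, min_score=80.0)
loose = single_linkage_clusters(ids, pairs, min_score=50.0)
```

Running the same routine at a strict and a looser cutoff gives a crude two-level hierarchy, loosely analogous to families nested inside superfamilies.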
4.
Evol Bioinform Online ; 8: 489-525, 2012.
Article in English | MEDLINE | ID: mdl-22915837

ABSTRACT

In the last two decades, a large number of whole-genome phylogenies have been inferred to reconstruct the Tree of Life (ToL). Underlying data models range from gene or functionality content in species to phylogenetic gene family trees and multiple sequence alignments of concatenated protein sequences. Diversity in data models together with the use of different tree reconstruction techniques, disruptive biological effects and the steadily increasing number of genomes have led to a huge diversity in published phylogenies. Comparison of those and, moreover, identification of the impact of inference properties (underlying data model, inference technique) on particular reconstructions is almost impossible. In this work, we introduce tree topology profiling as a method to compare already published whole-genome phylogenies. This method requires visual determination of the particular topology in a drawn whole-genome phylogeny for a set of particular bacterial clans. For each clan, neighborhoods to other bacteria are collected into a catalogue of generalized alternative topologies. Particular topology alternatives found for an ordered list of bacterial clans reveal a topology profile that represents the analyzed phylogeny. To simulate the inhomogeneity of published gene content phylogenies we generate a set of seven phylogenies using different inference techniques and the SYSTERS-PhyloMatrix data model. After tree topology profiling on a total of 54 selected published and newly inferred phylogenies, we separate artefactual from biologically meaningful phylogenies and associate particular inference results (phylogenies) with inference background (inference techniques as well as data models). Topological relationships of particular bacterial species groups are presented. With this work we introduce tree topology profiling into the scientific field of comparative phylogenomics.

5.
Article in English | MEDLINE | ID: mdl-21576757

ABSTRACT

Characterization of the kinetic and conformational properties of channel proteins is a crucial element in the integrative study of congenital cardiac diseases. The proteins of the ion channels of cardiomyocytes represent an important family of biological components determining the physiology of the heart. Some computational studies aiming to understand the mechanisms of the ion channels of cardiomyocytes have concentrated on Markovian stochastic approaches. Mathematically, these approaches employ Chapman-Kolmogorov equations coupled with partial differential equations. As the scale and complexity of such subcellular and cellular models increase, the balance between efficiency and accuracy of algorithms becomes critical. We have developed a novel two-stage splitting algorithm to address efficiency and accuracy issues arising in such modeling and simulation scenarios. Numerical experiments were performed based on the incorporation of our newly developed conformational kinetic model for the rapid delayed rectifier potassium channel into the dynamic models of human ventricular myocytes. Our results show that the new algorithm significantly outperforms commonly adopted adaptive Runge-Kutta methods. Furthermore, our parallel simulations with coupled algorithms for multicellular cardiac tissue demonstrate a high linearity in the speedup of large-scale cardiac simulations.


Subjects
Computational Biology/methods; Ion Channels/metabolism; Models, Biological; Myocytes, Cardiac/metabolism; Algorithms; Humans; Ion Channels/chemistry; Kinetics; Markov Chains
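
As a small illustration of the Markovian channel models referred to in the abstract above, the sketch below advances the simplest possible case: a two-state (closed/open) channel whose open probability p obeys dp/dt = alpha*(1 - p) - beta*p. It assumes the rates stay constant within a step, so the step can be solved exactly. This is not the two-stage splitting algorithm of the paper; the rate values and step size are hypothetical.

```python
import math
from typing import List


def step_open_probability(p_open: float, alpha: float, beta: float, dt: float) -> float:
    """Exact update of dp/dt = alpha*(1 - p) - beta*p over dt for constant rates."""
    total_rate = alpha + beta
    p_steady = alpha / total_rate  # open probability as t -> infinity
    return p_steady + (p_open - p_steady) * math.exp(-total_rate * dt)


def simulate(p0: float, alpha: float, beta: float, dt: float, n_steps: int) -> List[float]:
    """Return the open probability after each of n_steps updates, starting at p0."""
    trace = [p0]
    p = p0
    for _ in range(n_steps):
        p = step_open_probability(p, alpha, beta, dt)
        trace.append(p)
    return trace


# Hypothetical opening and closing rates (per ms) and time step (ms).
trace = simulate(p0=0.0, alpha=2.0, beta=1.0, dt=0.1, n_steps=200)
```

In a full cell model the rates depend on the membrane voltage, which is why schemes of this kind usually update the channel states and the voltage in separate sub-steps.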
6.
Bioinformatics ; 18 Suppl 2: S84-90, 2002.
Article in English | MEDLINE | ID: mdl-12385988

ABSTRACT

Non-coding DNA segments that are conserved between the human and mouse genomic sequence are good indicators of possible regulatory sequences. Here we report on a systematic approach to delineate such conserved elements from upstream regions of orthologous gene pairs from man and mouse. We focus on orthologous genes in order to maximize our chances to find functionally similar regulatory elements. The identification of conserved elements is effected using the Waterman-Eggert local suboptimal alignment algorithm. We have modified an implementation of this algorithm such that it integrates the determination of statistical significance for the local suboptimal alignments. This has the effect of outputting a dynamically determined number of suboptimal alignments that are deemed statistically significant. Comparison with experimentally determined annotation shows a striking enrichment of regulatory sites among the conserved regions. Furthermore, the conserved regions tend to cover the promoter region described in the EPD database.


Subjects
Algorithms; Chromosome Mapping/methods; DNA/genetics; Genes, Regulator/genetics; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Animals; Conserved Sequence/genetics; Databases, Nucleic Acid; Genome, Human; Humans; Mice; Sequence Homology, Nucleic Acid; Species Specificity
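
The Waterman-Eggert procedure named in the abstract above builds on the standard dynamic-programming recurrence for local alignment. The sketch below computes only the best local alignment score (the Smith-Waterman step) with a linear gap penalty. It does not produce the suboptimal alignments or the significance estimates described in the abstract, and the scoring values are hypothetical defaults.

```python
def local_alignment_score(
    a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2
) -> int:
    """Best local alignment score of a and b under a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Roughly speaking, Waterman-Eggert repeats this computation after excluding the cells used by each reported alignment, so that successive alignments do not intersect. For example, `local_alignment_score("TTTACGTTT", "GGACGGG")` scores the shared segment `ACG`.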
7.
Genome Res ; 13(6A): 1056-66, 2003 Jun.
Article in English | MEDLINE | ID: mdl-12799346

ABSTRACT

The 2R hypothesis predicting two genome duplications at the origin of vertebrates is highly controversial. Studies published so far include limited sequence data from organisms close to the hypothesized genome duplications. Through the comparison of a gene catalog from amphioxus, the closest living invertebrate relative of vertebrates, to 3453 single-copy genes orthologous between Caenorhabditis elegans (C), Drosophila melanogaster (D), and Saccharomyces cerevisiae (Y), and to Ciona intestinalis ESTs, mouse, and human genes, we show with a large number of genes that the gene duplication activity is significantly higher after the separation of amphioxus and the vertebrate lineages, which we estimate at 650 million years (Myr). The majority of human orthologs of 195 CDY groups that could be dated by the molecular clock appear to be duplicated between 300 and 680 Myr with a mean at 488 million years ago (Mya). We detected 485 duplicated chromosomal segments in the human genome containing CDY orthologs, 331 of which are found duplicated in the mouse genome and within regions syntenic between human and mouse, indicating that these were generated earlier than the human-mouse split. Model-based calculations of the codon substitution rate of the human genes included in these segments agree with the molecular clock duplication time-scale prediction. Our results favor at least one large duplication event at the origin of vertebrates, followed by smaller scale duplication closer to the bird-mammalian split.


Subjects
Chordata, Nonvertebrate/genetics; Evolution, Molecular; Gene Duplication; Genes/genetics; Genome; Vertebrates/genetics; Animals; Ciona intestinalis/genetics; Genome, Human; Humans; Mice; Molecular Sequence Data; Multigene Family/genetics; Phylogeny
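
Dating duplications "by the molecular clock", as in the abstract above, rests in its simplest form on the relation T = d / (2r): two copies separated by distance d diverged T time units ago if each lineage accumulates substitutions at a constant rate r. The sketch below assumes such a strict clock and a single calibration point; the paper's actual dating procedure is not described in the abstract, and all numbers here are hypothetical.

```python
def substitution_rate(calibration_distance: float, calibration_age_myr: float) -> float:
    """Rate per site per Myr per lineage, from one split of known age."""
    return calibration_distance / (2.0 * calibration_age_myr)


def divergence_time_myr(distance: float, rate: float) -> float:
    """Age in Myr of a split with pairwise distance `distance` under a strict clock."""
    return distance / (2.0 * rate)


# Hypothetical calibration: 0.9 substitutions per site across a 450 Myr old split.
rate = substitution_rate(calibration_distance=0.9, calibration_age_myr=450.0)

# Hypothetical pair of duplicated genes separated by 0.6 substitutions per site.
age = divergence_time_myr(distance=0.6, rate=rate)
```

The factor of 2 reflects that substitutions accumulate independently along both lineages since the split.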