Search | BVS CLAP/SMR-OPAS/OMS

1.

Gerstein, M; Sonnhammer, E L; Chothia, C.

J Mol Biol ; 236(4): 1067-78, 1994 Mar 04.

Article in English | MEDLINE | ID: mdl-8120887

ABSTRACT

We have determined the variations in volume that occur during evolution in the buried core of three different families of proteins. The variation of the whole core is very small (approximately 2.5%) compared to the variation at individual sites (approximately 13%). However, by comparing our results to those expected from random sequences with no correlations between sites, we show that the small variation observed may simply be a manifestation of the statistical "law of large numbers" and not reflect any compensating changes in, or global constraints upon, protein sequences. We have also analysed in detail the volume variations at individual sites, both in the core and on the surface, and compared these variations with those expected from random sequences. Individual sites on the surface have nearly the same variation as random sequences (24% versus 28% variation). However, individual sites in the core have about half the variation of random sequences (13% versus 30%). Roughly, half of these core sites strongly conserve their volume (0 to 10% variation); one quarter have moderate variation (10 to 20%); and the remaining quarter vary randomly (20 to 40%). Our results have clear implications for the relationship between protein sequence and structure. For our analysis, we have developed a new and simple method for weighting protein sequences to correct for unequal representation, which we describe in an Appendix.

Subjects

Biological Evolution; Proteins/chemistry; Proteins/genetics; Algorithms; Amino Acid Sequence; Animals; Azurin/chemistry; Azurin/genetics; Binding Sites/genetics; Genetic Variation; Globins/chemistry; Globins/genetics; Humans; Models, Chemical; Molecular Sequence Data; Molecular Structure; Plastocyanin/chemistry; Plastocyanin/genetics; Protein Conformation; Protein Folding; Tetrahydrofolate Dehydrogenase/chemistry; Tetrahydrofolate Dehydrogenase/genetics
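
The law-of-large-numbers argument in the abstract above can be illustrated with a short simulation. This is only a sketch under assumptions that do not come from the paper: a hypothetical core of 30 sites, each drawn independently from a normal distribution with a mean volume of 100 and a 13% standard deviation. For independent sites with equal means, the relative variation of the summed volume should shrink by roughly 1/sqrt(N).

```python
import random
import statistics


def relative_sd(values):
    """Standard deviation expressed as a fraction of the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)


def simulate_core(n_sites=30, site_sd=0.13, n_sequences=5000, seed=0):
    """Draw independent site volumes; return (per-site, whole-core) relative SD.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    mean_volume = 100.0
    cores = [
        [rng.gauss(mean_volume, site_sd * mean_volume) for _ in range(n_sites)]
        for _ in range(n_sequences)
    ]
    per_site = relative_sd([core[0] for core in cores])
    whole_core = relative_sd([sum(core) for core in cores])
    return per_site, whole_core


per_site, whole_core = simulate_core()
print(f"per-site variation:   {per_site:.1%}")
print(f"whole-core variation: {whole_core:.1%}")
```

With 30 uncorrelated sites, a 13% per-site variation corresponds to about 13%/sqrt(30), or roughly 2.4%, for the whole core, which is close to the 2.5% quoted in the abstract. The site count is chosen for illustration only.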

2.

Automatic clustering of orthologs and in-paralogs from pairwise species comparisons.

Remm, M; Storm, C E; Sonnhammer, E L.

J Mol Biol ; 314(5): 1041-52, 2001 Dec 14.

Article in English | MEDLINE | ID: mdl-11743721

ABSTRACT

Orthologs are genes in different species that originate from a single gene in the last common ancestor of these species. Such genes have often retained identical biological roles in the present-day organisms. It is hence important to identify orthologs for transferring functional information between genes in different organisms with a high degree of reliability. For example, orthologs of human proteins are often functionally characterized in model organisms. Unfortunately, orthology analysis between human and e.g. invertebrates is often complex because of large numbers of paralogs within protein families. Paralogs that predate the species split, which we call out-paralogs, can easily be confused with true orthologs. Paralogs that arose after the species split, which we call in-paralogs, however, are bona fide orthologs by definition. Orthologs and in-paralogs are typically detected with phylogenetic methods, but these are slow and difficult to automate. Automatic clustering methods based on two-way best genome-wide matches on the other hand, have so far not separated in-paralogs from out-paralogs effectively. We present a fully automatic method for finding orthologs and in-paralogs from two species. Ortholog clusters are seeded with a two-way best pairwise match, after which an algorithm for adding in-paralogs is applied. The method bypasses multiple alignments and phylogenetic trees, which can be slow and error-prone steps in classical ortholog detection. Still, it robustly detects complex orthologous relationships and assigns confidence values for both orthologs and in-paralogs. The program, called INPARANOID, was tested on all completely sequenced eukaryotic genomes. To assess the quality of INPARANOID results, ortholog clusters were generated from a dataset of worm and mammalian transmembrane proteins, and were compared to clusters derived by manual tree-based ortholog detection methods. 
This study led to the identification with a high degree of confidence of over a dozen novel worm-mammalian ortholog assignments that were previously undetected because of shortcomings of phylogenetic methods. A WWW server that allows searching for orthologs between human and several fully sequenced genomes is installed at http://www.cgb.ki.se/inparanoid/. This is the first comprehensive resource with orthologs of all fully sequenced eukaryotic genomes. Programs and tables of orthology assignments are available from the same location.

Subjects

Caenorhabditis elegans/genetics; Computational Biology/methods; Drosophila melanogaster/genetics; Evolution, Molecular; Genome; Genomics/methods; Sequence Homology; Algorithms; Animals; Automation/methods; Caenorhabditis elegans Proteins/genetics; Cluster Analysis; Databases, Genetic; Drosophila Proteins/genetics; Eukaryotic Cells/metabolism; Humans; Phylogeny; Software; Species Specificity
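
The seeding step mentioned in the abstract above, a two-way best pairwise match, can be sketched as follows. This is not the INPARANOID implementation: the score tables and gene names are hypothetical, and the in-paralog addition step and the confidence values are left out.

```python
def best_hits(scores):
    """Map each query to its highest-scoring target.

    `scores` maps (query, target) pairs to a similarity score.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical scores between species A (a1, a2) and species B (b1, b2).
a_vs_b = {("a1", "b1"): 250, ("a1", "b2"): 90, ("a2", "b1"): 200, ("a2", "b2"): 80}
b_vs_a = {("b1", "a1"): 250, ("b1", "a2"): 200, ("b2", "a1"): 90, ("b2", "a2"): 80}

seeds = reciprocal_best_hits(a_vs_b, b_vs_a)
print(seeds)  # [('a1', 'b1')]
```

Here a2 also matches b1 best, but the match is not reciprocal, so it does not seed a cluster. Whether such a gene would later join the cluster as an in-paralog depends on the second step described in the abstract, which this sketch does not model.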

3.

Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes.

Krogh, A; Larsson, B; von Heijne, G; Sonnhammer, E L.

J Mol Biol ; 305(3): 567-80, 2001 Jan 19.

Article in English | MEDLINE | ID: mdl-11152613

ABSTRACT

We describe and validate a new membrane protein topology prediction method, TMHMM, based on a hidden Markov model. We present a detailed analysis of TMHMM's performance, and show that it correctly predicts 97-98 % of the transmembrane helices. Additionally, TMHMM can discriminate between soluble and membrane proteins with both specificity and sensitivity better than 99 %, although the accuracy drops when signal peptides are present. This high degree of accuracy allowed us to predict reliably integral membrane proteins in a large collection of genomes. Based on these predictions, we estimate that 20-30 % of all genes in most genomes encode membrane proteins, which is in agreement with previous estimates. We further discovered that proteins with N(in)-C(in) topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N(out)-C(in) topologies. We discuss the possible relevance of this finding for our understanding of membrane protein assembly mechanisms. A TMHMM prediction service is available at http://www.cbs.dtu.dk/services/TMHMM/.

Subjects

Computational Biology/methods; Genome; Markov Chains; Membrane Proteins/chemistry; Animals; Bacterial Proteins/chemistry; Databases as Topic; Fungal Proteins/chemistry; Internet; Plant Proteins/chemistry; Porins/chemistry; Protein Sorting Signals; Protein Structure, Secondary; Reproducibility of Results; Research Design; Sensitivity and Specificity; Software; Solubility

4.

Sequence of the human immunoglobulin diversity (D) segment locus: a systematic analysis provides no evidence for the use of DIR segments, inverted D segments, "minor" D segments or D-D recombination.

Corbett, S J; Tomlinson, I M; Sonnhammer, E L; Buck, D; Winter, G.

J Mol Biol ; 270(4): 587-97, 1997 Jul 25.

Article in English | MEDLINE | ID: mdl-9245589

ABSTRACT

We have determined the complete nucleotide sequence of the human immunoglobulin D segment locus on chromosome 14q32.3 and identified a total of 27 D segments, of which nine are new. Comparison with a database of rearranged heavy chain sequences indicates that the human antibody repertoire is created by VDJ recombination involving 25 of these 27 D segments, extensive processing at the V-D and D-J junctions and use of multiple reading frames. We could find no evidence for the proposed use of DIR segments, inverted D segments, "minor" D segments or D-D recombination. Conventional VDJ recombination, which obeys the 12/23 rule, is therefore sufficient to explain the wealth of lengths and sequences for the third hypervariable loop of human heavy chains.

Subjects

Chromosomes, Human, Pair 14; Immunoglobulin D/genetics; Recombination, Genetic; Base Sequence; Chromosome Mapping; Evolution, Molecular; Germ Cells; Humans; Immunoglobulin Joining Region/genetics; Immunoglobulin Variable Region/genetics; Molecular Sequence Data; Open Reading Frames

5.

The imprint of somatic hypermutation on the repertoire of human germline V genes.

Tomlinson, I M; Walter, G; Jones, P T; Dear, P H; Sonnhammer, E L; Winter, G.

J Mol Biol ; 256(5): 813-17, 1996 Mar 15.

Article in English | MEDLINE | ID: mdl-8601832

ABSTRACT

In the human immune system, antibodies with high affinities for antigen are created in two stages. A diverse primary repertoire of antibody structures is produced by the combinatorial rearrangement of germline V gene segments and antibodies are selected from this repertoire by binding to the antigen. Their affinities are then improved by somatic hypermutation and further rounds of selection. We have dissected the sequence diversity created at each stage in response to a wide range of antigens. In the primary repertoire, diversity is focused at the centre of the binding site. With somatic hypermutation, diversity spreads to regions at the periphery of the binding site that are highly conserved in the primary repertoire. We propose that evolution has favoured this complementarity as an efficient strategy for searching sequence space and that the germline V gene families evolved to exploit the diversity created by somatic hypermutation.

Subjects

Antibody Diversity; Genes, Immunoglobulin; Immunoglobulin Variable Region/genetics; Mutation; Binding Sites, Antibody/genetics; Biological Evolution; Humans; Immunoglobulin Variable Region/chemistry; Immunoglobulin Variable Region/ultrastructure; Models, Genetic; Models, Molecular

6.

Modular arrangement of proteins as inferred from analysis of homology.

Sonnhammer, E L; Kahn, D.

Protein Sci ; 3(3): 482-92, 1994 Mar.

Article in English | MEDLINE | ID: mdl-8019419

ABSTRACT

The structure of many proteins consists of a combination of discrete modules that have been shuffled during evolution. Such modules can frequently be recognized from the analysis of homology. Here we present a systematic analysis of the modular organization of all sequenced proteins. To achieve this we have developed an automatic method to identify protein domains from sequence comparisons. Homologous domains can then be clustered into consistent families. The method was applied to all 21,098 nonfragment protein sequences in SWISS-PROT 21.0, which was automatically reorganized into a comprehensive protein domain database, ProDom. We have constructed multiple sequence alignments for each domain family in ProDom, from which consensus sequences were generated. These nonredundant domain consensuses are useful for fast homology searches. Domain organization in ProDom is exemplified for proteins of the phosphoenolpyruvate:sugar phosphotransferase system (PEP:PTS) and for bacterial 2-component regulators. We provide 2 examples of previously unrecognized domain arrangements discovered with the help of ProDom.

Subjects

Databases, Factual; Proteins/chemistry; Proteins/genetics; Software; Amino Acid Sequence; Bacterial Proteins/chemistry; Bacterial Proteins/genetics; Biological Evolution; Molecular Sequence Data; Molecular Structure; Phosphoenolpyruvate Sugar Phosphotransferase System/genetics; Proteins/classification; Sequence Alignment; Sequence Homology, Amino Acid; Software Design

7.

A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis.

Sonnhammer, E L; Durbin, R.

Gene ; 167(1-2): GC1-10, 1995 Dec 29.

Article in English | MEDLINE | ID: mdl-8566757

ABSTRACT

Graphical dot-matrix plots can provide the most complete and detailed comparison of two sequences. Presented here is DOTTER, a dot-plot program for X-windows which can compare DNA or protein sequences, and also DNA versus protein. The main novel feature of DOTTER is that the user can vary the stringency cutoffs interactively, so that the dot-matrix only needs to be calculated once. This is possible thanks to a 'Greyramp tool' that was developed to change the displayed stringency of the matrix by dynamically changing the greyscale rendering of the dots. The Greyramp tool allows the user to interactively change the lower and upper score limit for the greyscale rendering. This allows exploration of the separation between signal and noise, and fine-grained visualisation of different score levels in the dot-matrix. Other useful features are dot-matrix compression, mouse-controlled zooming, sequence alignment display and saving/loading of dot-matrices. Since the matrix only has to be calculated once and since the algorithm is fast and linear in space, DOTTER is practical to use even for sequences as long as cosmids. DOTTER was integrated in the gene-modelling module of the genomic database system ACEDB. This was done via the homology viewer BLIXEM in a way that also allows segments from the BLAST suite of searching programs to be superimposed on top of the full dot-matrix. This feature can also be used for very quick finding of the strongest matches. As examples, we analyse a Caenorhabditis elegans cosmid with several tandem repeat families, and illustrate how DOTTER can improve gene modelling.

Subjects

Sequence Analysis/methods; Sequence Homology, Amino Acid; Sequence Homology, Nucleic Acid; Software; Amino Acid Sequence; Data Display; Molecular Sequence Data
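
The idea in the abstract above, calculating the dot-matrix once and changing only its greyscale rendering, can be sketched in a few lines. This is not the DOTTER code: the identity-count window score is a simplified stand-in for its scoring, and the function names `dot_matrix` and `grey_ramp` are hypothetical.

```python
def dot_matrix(seq_a, seq_b, window=3):
    """Score each cell as the number of identities in a diagonal window."""
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    return [
        [sum(seq_a[i + k] == seq_b[j + k] for k in range(window)) for j in range(cols)]
        for i in range(rows)
    ]


def grey_ramp(matrix, low, high):
    """Map scores to grey levels 0..255; requires high > low.

    Scores at or below `low` become 0, scores at or above `high` become 255,
    and scores in between are scaled linearly.
    """
    span = high - low
    return [
        [round(255 * min(max((score - low) / span, 0.0), 1.0)) for score in row]
        for row in matrix
    ]


matrix = dot_matrix("GATTACA", "GATCACA")  # calculated once
loose = grey_ramp(matrix, low=0, high=3)   # partial matches stay visible
strict = grey_ramp(matrix, low=2, high=3)  # only perfect windows stay visible
```

Moving `low` and `high` changes only the rendering pass over the stored scores, which is why the stringency can be varied without recomputing the comparison.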

8.

Dynamic contact maps of protein structures.

Sonnhammer, E L; Wootton, J C.

J Mol Graph Model ; 16(1): 1-5, 33, 1998 Feb.

Article in English | MEDLINE | ID: mdl-9783253

ABSTRACT

The two-dimensional contact map of interresidue distances is a visual analysis technique for protein structures. We present two standalone software tools designed to be used in combination to increase the versatility of this simple yet powerful technique. First, the program Structer calculates contact maps from three-dimensional molecular structural data. The contact map matrix can then be viewed in the graphical matrix-visualization program Dotter. Instead of using a predefined distance cutoff, we exploit Dotter's dynamic rendering control, allowing interactive exploration at varying distance cutoffs after calculating the matrix once. Structer can use a number of distance measures, can incorporate multiple chains in one contact map, and allows masking of user-defined residue sets. It works either directly with PDB files, or can use the MMDB network API for reading structures.

Subjects

Computer Simulation; Models, Molecular; Protein Conformation; Software; Diphtheria Toxin/chemistry; Histocompatibility Antigens Class II/chemistry; Humans
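
A contact map of the kind described in the abstract above can be sketched by storing the full matrix of interresidue distances and applying the distance cutoff afterwards. This is not Structer's code: the coordinates are hypothetical, one point per residue is assumed, and plain Euclidean distance stands in for whichever distance measure is chosen.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def contacts(distances, cutoff):
    """Residue index pairs (i < j) that lie closer than `cutoff`."""
    n = len(distances)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if distances[i][j] < cutoff
    ]


# Hypothetical coordinates (in angstroms) for four residues.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 5.0, 0.0)]
distances = distance_matrix(coords)  # calculated once

print(contacts(distances, cutoff=4.0))  # [(0, 1), (1, 2)]
print(len(contacts(distances, cutoff=8.0)))  # 6
```

Because the distances are kept rather than a thresholded map, any cutoff can be explored later without going back to the structure file.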

9.

FAT: a novel domain in PIK-related kinases.

Bosotti, R; Isacchi, A; Sonnhammer, E L.

Trends Biochem Sci ; 25(5): 225-7, 2000 May.

Article in English | MEDLINE | ID: mdl-10782091

Subjects

1-Phosphatidylinositol 4-Kinase/genetics; Phosphatidylinositol 3-Kinases/genetics; 1-Phosphatidylinositol 4-Kinase/chemistry; 1-Phosphatidylinositol 4-Kinase/metabolism; Amino Acid Sequence; Animals; Humans; Molecular Sequence Data; Phosphatidylinositol 3-Kinases/chemistry; Phosphatidylinositol 3-Kinases/metabolism; Sequence Homology, Amino Acid

10.

Widespread eukaryotic sequences, highly similar to bacterial DNA polymerase I, looking for functions.

Sonnhammer, E L; Wootton, J C.

Curr Biol ; 7(8): R463-5, 1997 Aug 01.

Article in English | MEDLINE | ID: mdl-9259570

Subjects

DNA Polymerase I/genetics; DNA Polymerase I/physiology; Amino Acid Sequence; Animals; Caenorhabditis elegans/enzymology; Caenorhabditis elegans/genetics; Eukaryotic Cells; Genes, Helminth; Humans; Molecular Sequence Data; Sequence Homology, Amino Acid

11.

Comparison of the human germline and rearranged VH repertoire reveals complementarity between germline variability and somatic mutation.

Walter, G; Tomlinson, I M; Dear, P H; Sonnhammer, E L; Cook, G P; Winter, G.

Ann N Y Acad Sci ; 764: 180-2, 1995 Sep 29.

Article in English | MEDLINE | ID: mdl-7486518

Subjects

Gene Rearrangement, B-Lymphocyte, Heavy Chain; Genes, Immunoglobulin; Immunoglobulin Heavy Chains/genetics; Immunoglobulin Variable Region/genetics; Mutation; Sequence Homology, Amino Acid; DNA, Complementary/genetics; Humans; Immunoglobulin Class Switching

12.

An expert system for processing sequence homology data.

Sonnhammer, E L; Durbin, R.

Proc Int Conf Intell Syst Mol Biol ; 2: 363-8, 1994.

Article in English | MEDLINE | ID: mdl-7584413

ABSTRACT

When confronted with the task of finding homology to large numbers of sequences, database searching tools such as Blast and Fasta generate prohibitively large amounts of information. An automatic way of making most of the decisions a trained sequence analyst would make was developed by means of a rule-based expert system combined with an algorithm to avoid non-informative biased residue composition matches. The results found relevant by the system are presented in a very concise and clear way, so that the homology can be assessed with minimum effort. The expert system, HSPcrunch, was implemented to process the output to the programs in the BLAST suite. HSPcrunch embodies rules on detecting distant similarities when pairs of weak matches are consistent with a larger gapped alignment, i.e. when Blast has broken a longer gapped alignment up into smaller ungapped ones. This way, more distant similarities can be detected with no or little side-effects of more spurious matches. The rules for how small the gaps must be to be considered significant have been derived empirically. Currently a set of rules are used that operate on two different scoring levels, one for very weak matches that have very small gaps and one for medium weak matches that have slightly larger gaps. This set of rules proved to be robust for most cases and gives high fidelity separation between real homologies and spurious matches. One of the most important rules for reducing the amount of output is to limit the number of overlapping matches to the same region of the query sequence.(ABSTRACT TRUNCATED AT 250 WORDS)

Subjects

Expert Systems; Sequence Homology; Software; Amino Acid Sequence; Animals; Humans; Molecular Sequence Data; Sequence Analysis

13.

A workbench for large-scale sequence homology analysis.

Sonnhammer, E L; Durbin, R.

Comput Appl Biosci ; 10(3): 301-7, 1994 Jun.

Article in English | MEDLINE | ID: mdl-7922687

ABSTRACT

When routinely analysing very long stretches of DNA sequences produced by genome sequencing projects, detailed analysis of database search results becomes exceedingly time consuming. To reduce the tedious browsing of large quantities of protein similarities, two programs, MSPcrunch and Blixem, were developed, which assist in processing the results from the database search programs in the BLAST suite. MSPcrunch removes biased composition and redundant matches while keeping weak matches that are consistent with a larger gapped alignment. This makes BLAST searching in practice more sensitive and reduces the risk of overlooking distant similarities. Blixem is a multiple sequence alignment viewer for X-windows which makes it significantly easier to scan and evaluate the matches ratified by MSPcrunch. In Blixem, matches to the translated DNA query sequence are simultaneously aligned in three frames. Also, the distribution of matches over the whole DNA query is displayed. Examples of usage are drawn from 36 C. elegans cosmid clones totalling 1.2 megabases, to which these tools were applied.

Subjects

DNA/analysis; Sequence Homology, Nucleic Acid; Software; Algorithms; Amino Acid Sequence; Base Sequence; Exons; Molecular Sequence Data; Sequence Alignment/methods; Software Design

14.

A comparison of sequence and structure protein domain families as a basis for structural genomics.

Elofsson, A; Sonnhammer, E L.

Bioinformatics ; 15(6): 480-500, 1999 Jun.

Article in English | MEDLINE | ID: mdl-10383473

ABSTRACT

MOTIVATION: Protein families can be defined based on structure or sequence similarity. We wanted to compare two protein family databases, one based on structural and one on sequence similarity, to investigate to what extent they overlap, the similarity in definition of corresponding families, and to create a list of large protein families with unknown structure as a resource for structural genomics. We also wanted to increase the sensitivity of fold assignment by exploiting protein family HMMs. RESULTS: We compared Pfam, a protein family database based on sequence similarity, to Scop, which is based on structural similarity. We found that 70% of the Scop families exist in Pfam while 57% of the Pfam families exist in Scop. Most families that occur in both databases correspond well to each other, but in some cases they are different. Such cases highlight situations in which structure and sequence approaches differ significantly. The comparison enabled us to compile a list of the largest families that do not occur in Scop; these are suitable targets for structure prediction and determination, and may be useful to guide projects in structural genomics. It can be noted that 13 out of the 20 largest protein families without a known structure are likely transmembrane proteins. We also exploited Pfam to increase the sensitivity of detecting homologs of proteins with known structure, by comparing query sequences to Pfam HMMs that correspond to Scop families. For SWISSPROT+TREMBL, this yielded an increase in fold assignment from 31% to 42% compared to using FASTA only. This method assigned a structure to 22% of the proteins in Saccharomyces cerevisiae, 24% in Escherichia coli, and 16% in Methanococcus jannaschii.

Subjects

Databases, Factual; Proteins/chemistry; Proteins/genetics; Computational Biology; Genome; Protein Folding; Proteins/classification; Sequence Alignment; Sequence Homology, Amino Acid

15.

Integrated graphical analysis of protein sequence features predicted from sequence composition.

Sonnhammer, E L; Wootton, J C.

Proteins ; 45(3): 262-73, 2001 Nov 15.

Article in English | MEDLINE | ID: mdl-11599029

ABSTRACT

Several protein sequence analysis algorithms are based on properties of amino acid composition and repetitiveness. These include methods for prediction of secondary structure elements, coiled-coils, transmembrane segments or signal peptides, and for assignment of low-complexity, nonglobular, or intrinsically unstructured regions. The quality of such analyses can be greatly enhanced by graphical software tools that present predicted sequence features together in context and allow judgment to be focused simultaneously on several different types of supporting information. For these purposes, we describe the SFINX package, which allows many different sets of segmental or continuous-curve sequence feature data, generated by individual external programs, to be viewed in combination alongside a sequence dot-plot or a multiple alignment of database matches. The implementation is currently based on extensions to the graphical viewers Dotter and Blixem and scripts that convert data from external programs to a simple generic data definition format called SFS. We describe applications in which dot-plots and flanking database matches provide valuable contextual information for analyses based on compositional and repetitive sequence features. The system is also useful for comparing results from algorithms run with a range of parameters to determine appropriate values for defaults or cutoffs for large-scale genomic analyses.

Subjects

Amino Acids/chemistry; Proteins/chemistry; Sequence Analysis, Protein/methods; Amino Acid Motifs; Amino Acid Sequence; Data Display; Databases, Protein; Internet; Membrane Proteins/chemistry; Protein Structure, Tertiary; Repetitive Sequences, Amino Acid; Software

16.

Identification of motifs in protein sequences.

Sonnhammer, E L; Wolfsberg, T G.

Curr Protoc Cell Biol ; Appendix 1: Appendix 1C, 2001 May.

Article in English | MEDLINE | ID: mdl-18228275

ABSTRACT

This brief appendix serves as a guide for the analysis of functional motifs in proteins. Several database search engines that can be accessed via the World Wide Web are described. Such computerized searches have become the preferred method to scan large sequence and motif databases, as the searches are efficient and the databases are updated frequently. A short list of sorting signals is also included, since these motifs often cannot be predicted reliably by a computer search.

Subjects

Amino Acid Motifs/genetics; Databases, Protein; Molecular Biology/methods; Proteins/chemistry; Proteins/genetics; Proteomics/methods; Animals; Computational Biology/methods; Humans; Information Storage and Retrieval/methods; Internet/organization & administration; Sequence Analysis, Protein/methods

17.

Analysis of protein domain families in Caenorhabditis elegans.

Sonnhammer, E L; Durbin, R.

Genomics ; 46(2): 200-16, 1997 Dec 01.

Article in English | MEDLINE | ID: mdl-9417907

ABSTRACT

The Caenorhabditis elegans genome sequencing project has completed over half of this nematode's 100-Mb genome. Proteins predicted in the finished sequence have been compiled and released in the database Wormpep. Presented here is a comprehensive analysis of protein domain families in Wormpep 11, which comprises 7299 proteins. The relative abundance of common protein domain families was counted by comparing all Wormpep proteins to the Pfam collection of protein families, which is based on recognition by hidden Markov models. This analysis also identified a number of previously unannotated domains. To investigate new apparently nematode-specific protein families, Wormpep was clustered into domain families on the basis of sequence similarity using the Domainer program. The largest clusters that lacked clear homology to proteins outside Nematoda were analyzed in further detail, after which some could be assigned a putative function. We compared all proteins in Wormpep 11 to proteins in the human, Saccharomyces cerevisiae, and Haemophilus influenzae genomes. Among the results are the estimation that over two-thirds of the currently known human proteins are likely to have a homologue in the whole C. elegans genome and that a significant number of proteins are well conserved between C. elegans and H. influenzae, that are not found in S. cerevisiae.

Subjects

Caenorhabditis elegans/genetics; Helminth Proteins/genetics; Helminth Proteins/metabolism; Proteins/genetics; Sequence Homology, Amino Acid; Amino Acid Sequence; Animals; Bacterial Proteins/genetics; Databases, Factual; Haemophilus influenzae/genetics; Humans; Molecular Sequence Data; Saccharomyces cerevisiae/genetics

18.

MEDUSA: large scale automatic selection and visual assessment of PCR primer pairs.

Podowski, R M; Sonnhammer, E L.

Bioinformatics ; 17(7): 656-7, 2001 Jul.

Article in English | MEDLINE | ID: mdl-11448885

ABSTRACT

MEDUSA is a tool for automatic selection and visual assessment of PCR primer pairs, developed to assist large scale gene expression analysis projects. The system allows specification of constraints of the location and distances between the primers in a pair. For instance, primers in coding, non-coding, exon/intron-spanning regions might be selected. MEDUSA applies these constraints as a filter to primers predicted by three external programs, and displays the resulting primer pairs graphically in the Blixem (Sonnhammer and Durbin, Comput. Appl. Biosci. 10, 301-307, 1994; http://www.cgr.ki.se/cgr/groups/sonnhammer/Blixem.html) viewer. AVAILABILITY: The MEDUSA web server is available at http://www.cgr.ki.se/cgr/MEDUSA. The source code and user information are available at ftp://ftp.cgr.ki.se/pub/prog/medusa.

Subjects

DNA Primers; Polymerase Chain Reaction/statistics & numerical data; Software; Base Sequence; Computational Biology; DNA Primers/genetics; Genomics; Molecular Sequence Data

19.

NIFAS: visual analysis of domain evolution in proteins.

Storm, C E; Sonnhammer, E L.

Bioinformatics ; 17(4): 343-8, 2001 Apr.

Article in English | MEDLINE | ID: mdl-11301303

ABSTRACT

MOTIVATION: Multi-domain proteins have evolved by insertions or deletions of distinct protein domains. Tracing the history of a certain domain combination can be important for functional annotation of multi-domain proteins, and for understanding the function of individual domains. In order to analyze the evolutionary history of the domains in modular proteins it is desirable to inspect a phylogenetic tree based on sequence divergence with the modular architecture of the sequences superimposed on the tree. RESULT: A Java applet, NIFAS, that integrates graphical domain schematics for each sequence in an evolutionary tree was developed. NIFAS retrieves domain information from the Pfam database and uses CLUSTAL W to calculate a tree for a given Pfam domain. The tree can be displayed with symbolic bootstrap values, and to allow the user to focus on a part of the tree, the layout can be altered by swapping nodes, changing the outgroup, and showing/collapsing subtrees. NIFAS is integrated with the Pfam database and is accessible over the internet (http://www.cgr.ki.se/Pfam). As an example, we use NIFAS to analyze the evolution of domains in Protein Kinases C.

Subjects

Evolution, Molecular; Image Processing, Computer-Assisted; Protein Kinase C/chemistry; Protein Structure, Tertiary; Proteins/chemistry; Software; Computer Graphics; Humans; Protein Kinase C/classification; Proteins/classification

20.

Pfam: a comprehensive database of protein domain families based on seed alignments.

Sonnhammer, E L; Eddy, S R; Durbin, R.

Proteins ; 28(3): 405-20, 1997 Jul.

Article in English | MEDLINE | ID: mdl-9223186

ABSTRACT

Databases of multiple sequence alignments are a valuable aid to protein sequence classification and analysis. One of the main challenges when constructing such a database is to simultaneously satisfy the conflicting demands of completeness on the one hand and quality of alignment and domain definitions on the other. The latter properties are best dealt with by manual approaches, whereas completeness in practice is only amenable to automatic methods. Herein we present a database based on hidden Markov model profiles (HMMs), which combines high quality and completeness. Our database, Pfam, consists of parts A and B. Pfam-A is curated and contains well-characterized protein domain families with high quality alignments, which are maintained by using manually checked seed alignments and HMMs to find and align all members. Pfam-B contains sequence families that were generated automatically by applying the Domainer algorithm to cluster and align the remaining protein sequences after removal of Pfam-A domains. By using Pfam, a large number of previously unannotated proteins from the Caenorhabditis elegans genome project were classified. We have also identified many novel family memberships in known proteins, including new kazal, Fibronectin type III, and response regulator receiver domains. Pfam-A families have permanent accession numbers and form a library of HMMs available for searching and automatic annotation of new protein sequences.

Subjects

Amino Acid Sequence; Databases, Factual; Plant Proteins/chemistry; Protein Structure, Tertiary; Sequence Alignment; Models, Chemical; Molecular Sequence Data; Multigene Family; Seeds/chemistry; Sequence Homology, Amino Acid
