Search | VHL Regional Portal (BVS)

1.

eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised orthologous groups, species and functional annotations.

Muller, J; Szklarczyk, D; Julien, P; Letunic, I; Roth, A; Kuhn, M; Powell, S; von Mering, C; Doerks, T; Jensen, L J; Bork, P.

Nucleic Acids Res ; 38(Database issue): D190-5, 2010 Jan.

Article in English | MEDLINE | ID: mdl-19900971

ABSTRACT

The identification of orthologous relationships forms the basis for most comparative genomics studies. Here, we present the second version of the eggNOG database, which contains orthologous groups (OGs) constructed through identification of reciprocal best BLAST matches and triangular linkage clustering. We applied this procedure to 630 complete genomes (529 bacteria, 46 archaea and 55 eukaryotes), which is a 2-fold increase relative to the previous version. The pipeline yielded 224,847 OGs, including 9,724 extended versions of the original COG and KOG. We computed OGs for different levels of the tree of life; in addition to the species groups included in our first release (i.e. fungi, metazoa, insects, vertebrates and mammals), we have now constructed OGs for archaea, fishes, rodents and primates. We automatically annotate the non-supervised orthologous groups (NOGs) with functional descriptions, protein domains, and functional categories as defined initially for the COG/KOG database. In-depth analysis is facilitated by precomputed high-quality multiple sequence alignments and maximum-likelihood trees for each of the available OGs. Altogether, eggNOG covers 2,242,035 proteins (built from 2,590,259 proteins) and provides a broad functional description for at least 1,966,709 (88%) of them. Users can access the complete set of orthologous groups via a web interface at: http://eggnog.embl.de.

Subjects

Amino Acid Motifs/genetics , Computational Biology/methods , Genetic Databases , Nucleic Acid Databases , Animals , Archaea , Computational Biology/trends , Protein Databases , Fishes , Bacterial Genome , Humans , Information Storage and Retrieval/methods , Internet , Primates , Tertiary Protein Structure , Rats , Software
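
The abstract of entry 1 names reciprocal best BLAST matches as the first step in building orthologous groups. The following is a minimal sketch of that one step only, assuming hits are already available as (query, subject, bit score) tuples. The function names, the tuple layout, and the toy identifiers are illustrative assumptions rather than part of eggNOG, and the triangular linkage clustering step is not shown.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed hit layout for this sketch: (query_id, subject_id, bit_score).
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first one wins on ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy data (hypothetical protein identifiers and scores).
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b2", 90.0)]
b_vs_a = [("b1", "a1", 198.0), ("b2", "a1", 140.0)]
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In the toy data, a2 points to b2 but b2 points back to a1, so only the a1/b1 pair is reciprocal.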

2.

Quantitative assessment of protein function prediction from metagenomics shotgun sequences.

Harrington, E D; Singh, A H; Doerks, T; Letunic, I; von Mering, C; Jensen, L J; Raes, J; Bork, P.

Proc Natl Acad Sci U S A ; 104(35): 13913-8, 2007 Aug 28.

Article in English | MEDLINE | ID: mdl-17717083

ABSTRACT

To assess the potential of protein function prediction in environmental genomics data, we analyzed shotgun sequences from four diverse and complex habitats. Using homology searches as well as customized gene neighborhood methods that incorporate intergenic and evolutionary distances, we inferred specific functions for 76% of the 1.4 million predicted ORFs in these samples (83% when nonspecific functions are considered). Surprisingly, these fractions are only slightly smaller than the corresponding ones in completely sequenced genomes (83% and 86%, respectively, by using the same methodology) and considerably higher than previously thought. For as many as 75,448 ORFs (5% of the total), only neighborhood methods can assign functions, illustrated here by a previously undescribed gene associated with the well characterized heme biosynthesis operon and a potential transcription factor that might regulate a coupling between fatty acid biosynthesis and degradation. Our results further suggest that, although functions can be inferred for most proteins on earth, many functions remain to be discovered in numerous small, rare protein families.

Subjects

Bacterial Genome , Genome , Genomic Library , Proteins/genetics , Animals , Biofilms , Factual Databases , Genetic Variation , Genetic Models , Open Reading Frames , Proteins/metabolism , Amino Acid Sequence Homology

3.

Quantitative phylogenetic assessment of microbial communities in diverse environments.

von Mering, C; Hugenholtz, P; Raes, J; Tringe, S G; Doerks, T; Jensen, L J; Ward, N; Bork, P.

Science ; 315(5815): 1126-30, 2007 Feb 23.

Article in English | MEDLINE | ID: mdl-17272687

ABSTRACT

The taxonomic composition of environmental communities is an important indicator of their ecology and function. We used a set of protein-coding marker genes, extracted from large-scale environmental shotgun sequencing data, to provide a more direct, quantitative, and accurate picture of community composition than that provided by traditional ribosomal RNA-based approaches depending on the polymerase chain reaction. Mapping marker genes from four diverse environmental data sets onto a reference species phylogeny shows that certain communities evolve faster than others. The method also enables determination of preferred habitats for entire microbial clades and provides evidence that such habitat preferences are often remarkably stable over time.

Subjects

Bacteria/classification , Ecosystem , Environmental Microbiology , Genomics , Phylogeny , Animals , Bacteria/genetics , Biological Evolution , Bone and Bones/microbiology , Bacterial Genes , rRNA Genes , Genetic Markers , Likelihood Functions , Mining , Seawater/microbiology , Soil Microbiology , Water Microbiology , Whales/microbiology

4.

Proteome analysis based on motif statistics.

Nicodème, P; Doerks, T; Vingron, M.

Bioinformatics ; 18 Suppl 2: S161-71, 2002.

Article in English | MEDLINE | ID: mdl-12385999

ABSTRACT

MOTIVATION: Even for the amino acid motifs collected in the Prosite database there may be chance occurrences as opposed to those occurrences where the motif is involved in fold or function of a protein. With recent mathematical advances in assessing the significance of observing such a motif a particular number of times, we can now study the over- or under-representation of particular motifs in a complete genome and attempt to make functional deductions. RESULTS: We demonstrate that statistical over- or under-representation of motifs in complete proteomes may be an indicator of whether, in that organism, we are looking at chance occurrences of the motif or whether the occurrences are sufficiently numerous to suggest a systematic, and thus functionally important occurrence. This has important implications on databank annotations. AVAILABILITY: The complete dataset comprising the plotted statistics of 266 Prosite motifs on 42 proteomes is available at http://algo.inria.fr/nicodeme/proteomes/proteocomp.html. The software used to compute this data has been described by Nicodème (2000, 2001). They are available either by web access as mentioned in these articles or by direct request from Pierre Nicodème.

Subjects

Chromosome Mapping/methods , Protein Databases , Chemical Models , Proteome/analysis , Proteome/chemistry , Protein Sequence Analysis/methods , Amino Acid Motifs , Amino Acid Sequence , Computer Simulation , Statistical Data Interpretation , Genetic Models , Statistical Models , Molecular Sequence Data , Proteome/genetics , Amino Acid Sequence Homology

5.

The Spir actin organizers are involved in vesicle transport processes.

Kerkhoff, E; Simpson, J C; Leberfinger, C B; Otto, I M; Doerks, T; Bork, P; Rapp, U R; Raabe, T; Pepperkok, R.

Curr Biol ; 11(24): 1963-8, 2001 Dec 11.

Article in English | MEDLINE | ID: mdl-11747823

ABSTRACT

The p150-Spir protein, which was discovered as a phosphorylation target of the Jun N-terminal kinase, is an essential regulator of the polarization of the Drosophila oocyte. Spir proteins are highly conserved between species and belong to the family of Wiskott-Aldrich homology region 2 (WH2) proteins involved in actin organization. The C-terminal region of Spir encodes a zinc finger structure highly homologous to FYVE motifs. A region with high homology between the Spir family proteins is located adjacent (N-terminal) to the modified FYVE domain and is designated as "Spir-box." The Spir-box has sequence similarity to a region of rabphilin-3A, which mediates interaction with the small GTPase Rab3A. Coexpression of p150-Spir and green fluorescent protein-tagged Rab GTPases in NIH 3T3 cells revealed that the Spir protein colocalized specifically with the Rab11 GTPase, which is localized at the trans-Golgi network (TGN), post-Golgi vesicles, and the recycling endosome. The distinct Spir localization pattern was dependent on the integrity of the modified FYVE finger motif and the Spir-box. Overexpression of a mouse Spir-1 dominant interfering mutant strongly inhibited the transport of the vesicular stomatitis virus G (VSV G) protein to the plasma membrane. The viral protein was arrested in membrane structures, largely colocalizing with the TGN marker TGN46. Our findings that the Spir actin organizer is targeted to intracellular membrane structures by its modified FYVE zinc finger and is involved in vesicle transport processes provide a novel link between actin organization and intracellular transport.

Subjects

Actins/metabolism , Drosophila Proteins , Microfilament Proteins/metabolism , 3T3 Cells , Actins/chemistry , Amino Acid Sequence , Animals , Biological Transport , Drosophila , Mice , Microfilament Proteins/chemistry , Molecular Sequence Data , Amino Acid Sequence Homology

6.

DDT -- a novel domain in different transcription and chromosome remodeling factors.

Doerks, T; Copley, R; Bork, P.

Trends Biochem Sci ; 26(3): 145-6, 2001 Mar.

Article in English | MEDLINE | ID: mdl-11246006

ABSTRACT

Homology-based sequence analyses have revealed the presence of a novel domain (DDT) in bromodomain PHD finger transcription factors (BPTFs), chromatin remodeling factors of the BAZ-family and other putative nuclear proteins. This domain is characterized by a number of conserved aromatic and charged residues and is predicted to consist of three alpha helices. Recent studies indicate a likely DNA-binding function for the DDT domain.

Subjects

Chromosomes , Transcription Factors/chemistry , Amino Acid Sequence , Homeodomain Proteins/chemistry , Molecular Sequence Data , Amino Acid Sequence Homology

7.

GRAM, a novel domain in glucosyltransferases, myotubularins and other putative membrane-associated proteins.

Doerks, T; Strauss, M; Brendel, M; Bork, P.

Trends Biochem Sci ; 25(10): 483-5, 2000 Oct.

Article in English | MEDLINE | ID: mdl-11050430

Subjects

Amino Acid Motifs , Glucosyltransferases/metabolism , Membrane Proteins/metabolism , Protein Tyrosine Phosphatases/metabolism , Amino Acid Sequence , Glucosyltransferases/chemistry , Membrane Proteins/chemistry , Molecular Sequence Data , Protein Tyrosine Phosphatases/chemistry , Non-Receptor Protein Tyrosine Phosphatases , Amino Acid Sequence Homology

8.

Re-annotating the Mycoplasma pneumoniae genome sequence: adding value, function and reading frames.

Dandekar, T; Huynen, M; Regula, J T; Ueberle, B; Zimmermann, C U; Andrade, M A; Doerks, T; Sánchez-Pulido, L; Snel, B; Suyama, M; Yuan, Y P; Herrmann, R; Bork, P.

Nucleic Acids Res ; 28(17): 3278-88, 2000 Sep 01.

Article in English | MEDLINE | ID: mdl-10954595

ABSTRACT

Four years after the original sequence submission, we have re-annotated the genome of Mycoplasma pneumoniae to incorporate novel data. The total number of ORFs has been increased from 677 to 688 (10 new proteins were predicted in intergenic regions, two further were newly identified by mass spectrometry and one protein ORF was dismissed) and the number of RNAs from 39 to 42 genes. For 19 of the now 35 tRNAs and for six other functional RNAs the exact genome positions were re-annotated and two new tRNA(Leu) and a small 200 nt RNA were identified. Sixteen protein reading frames were extended and eight shortened. For each ORF a consistent annotation vocabulary has been introduced. Annotation reasoning, annotation categories and comparisons to other published data on M. pneumoniae functional assignments are given. Experimental evidence includes 2-dimensional gel electrophoresis in combination with mass spectrometry as well as gene expression data from this study. Compared to the original annotation, we increased the number of proteins with predicted functional features from 349 to 458. The increase includes 36 new predictions and 73 protein assignments confirmed by the published literature. Furthermore, there are 23 reductions and 30 additions with respect to the previous annotation. mRNA expression data support transcription of 184 of the functionally unassigned reading frames.

Subjects

Bacterial Genes/genetics , Bacterial Genome , Mycoplasma pneumoniae/genetics , Open Reading Frames/genetics , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Computational Biology , Mass Spectrometry , Molecular Sequence Data , Mycoplasma pneumoniae/chemistry , Oligonucleotide Array Sequence Analysis , Phylogeny , Bacterial RNA/analysis , Bacterial RNA/genetics , Messenger RNA/analysis , Messenger RNA/genetics , Sequence Alignment
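
The counts quoted in the abstract of entry 8 can be cross-checked with simple arithmetic. The short sketch below uses only figures stated in that abstract; the variable names are illustrative.

```python
# ORF count: 677 originally, plus 10 intergenic predictions,
# plus 2 found by mass spectrometry, minus 1 dismissed ORF.
orfs_before = 677
orfs_after = orfs_before + 10 + 2 - 1

# Proteins with predicted functional features: 349 originally,
# plus 36 new predictions and 73 literature-confirmed assignments.
annotated_before = 349
annotated_after = annotated_before + 36 + 73

print(orfs_after, annotated_after)
```

Both totals agree with the values reported in the abstract (688 ORFs and 458 proteins with predicted functional features).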

9.

More than 1,000 putative new human signalling proteins revealed by EST data mining.

Schultz, J; Doerks, T; Ponting, C P; Copley, R R; Bork, P.

Nat Genet ; 25(2): 201-4, 2000 Jun.

Article in English | MEDLINE | ID: mdl-10835637

ABSTRACT

Cloning procedures aided by homology searches of EST databases have accelerated the pace of discovery of new genes, but EST database searching remains an involved and onerous task. More than 1.6 million human EST sequences have been deposited in public databases, making it difficult to identify ESTs that represent new genes. Compounding the problems of scale are difficulties in detection associated with a high sequencing error rate and low sequence similarity between distant homologues. We have developed a new method, coupling BLAST-based searches with a domain identification protocol, that filters candidate homologues. Application of this method in a large-scale analysis of 100 signalling domain families has led to the identification of ESTs representing more than 1,000 novel human signalling genes. The 4,206 publicly available ESTs representing these genes are a valuable resource for rapid cloning of novel human signalling proteins. For example, we were able to identify ESTs of at least 106 new small GTPases, of which 6 are likely to belong to new subfamilies. In some cases, further analyses of genomic DNA led to the discovery of previously unidentified full-length protein sequences. This is exemplified by the in silico cloning (prediction of a gene product sequence using only genomic and EST sequence data) of a new type of GTPase with two catalytic domains.

Subjects

Computational Biology/methods , Expressed Sequence Tags , Proteins/genetics , Proteins/metabolism , Signal Transduction , Amino Acid Sequence , Automation , Catalytic Domain , Molecular Cloning/methods , Factual Databases , Human Genome , Humans , Internet , Molecular Sequence Data , Monomeric GTP-Binding Proteins/chemistry , Monomeric GTP-Binding Proteins/genetics , Monomeric GTP-Binding Proteins/metabolism , Tertiary Protein Structure , Proteins/chemistry , Sequence Alignment , Amino Acid Sequence Homology , Software

10.

L27, a novel heterodimerization domain in receptor targeting proteins Lin-2 and Lin-7.

Doerks, T; Bork, P; Kamberov, E; Makarova, O; Muecke, S; Margolis, B.

Trends Biochem Sci ; 25(7): 317-8, 2000 Jul.

Article in English | MEDLINE | ID: mdl-10871881

Subjects

Caenorhabditis elegans Proteins , Helminth Proteins/chemistry , Helminth Proteins/metabolism , Membrane Proteins/chemistry , Membrane Proteins/metabolism , Nucleoside-Phosphate Kinase/chemistry , Nucleoside-Phosphate Kinase/metabolism , Amino Acid Motifs , Amino Acid Sequence , Animals , Caenorhabditis elegans/enzymology , Conserved Sequence , Dimerization , Guanylate Kinases , Molecular Sequence Data , Protein Binding , Tertiary Protein Structure , Sequence Alignment

11.

REF, an evolutionary conserved family of hnRNP-like proteins, interacts with TAP/Mex67p and participates in mRNA nuclear export.

Stutz, F; Bachi, A; Doerks, T; Braun, I C; Séraphin, B; Wilm, M; Bork, P; Izaurralde, E.

RNA ; 6(4): 638-50, 2000 Apr.

Article in English | MEDLINE | ID: mdl-10786854

ABSTRACT

Vertebrate TAP and its yeast ortholog Mex67p are involved in the export of messenger RNAs from the nucleus. TAP has also been implicated in the export of simian type D viral RNAs bearing the constitutive transport element (CTE). Although TAP directly interacts with CTE-bearing RNAs, the mode of interaction of TAP/Mex67p with cellular mRNAs is different from that with the CTE RNA and is likely to be mediated by protein-protein interactions. Here we show that Mex67p directly interacts with Yra1p, an essential yeast hnRNP-like protein. This interaction is evolutionarily conserved as Yra1p also interacts with TAP. Conditional expression in yeast cells implicates Yra1p in the export of cellular mRNAs. Database searches revealed that Yra1p belongs to an evolutionarily conserved family of hnRNP-like proteins having more than one member in Mus musculus, Xenopus laevis, Caenorhabditis elegans, and Schizosaccharomyces pombe and at least one member in several species including plants. The murine members of the family directly interact with TAP. Because members of this protein family are characterized by the presence of one RNP-motif RNA-binding domain and exhibit RNA-binding activity, we called these proteins REF-bps for RNA and export factor binding proteins. Thus, Yra1p and members of the REF family of hnRNP-like proteins may facilitate the interaction of TAP/Mex67p with cellular mRNAs.

Subjects

Conserved Sequence/genetics , Fungal Proteins/metabolism , Hyaluronan Receptors , Membrane Glycoproteins , Nuclear Proteins/metabolism , Nucleocytoplasmic Transport Proteins , Messenger RNA/metabolism , RNA-Binding Proteins/metabolism , Complement Receptors/metabolism , Ribonucleoproteins/chemistry , Saccharomyces cerevisiae Proteins , Transcription Factors/metabolism , Amino Acid Sequence , Animals , Biological Transport , Carrier Proteins , Cell Nucleus/chemistry , Cell Nucleus/genetics , Cell Nucleus/metabolism , Molecular Cloning , Cytoplasm/chemistry , Cytoplasm/genetics , Cytoplasm/metabolism , Fungal Proteins/chemistry , Fungal Proteins/genetics , Fungal Genes , Heterogeneous-Nuclear Ribonucleoproteins , Humans , Mice , Mitochondrial Proteins , Molecular Sequence Data , Multigene Family , Nuclear Proteins/chemistry , Nuclear Proteins/genetics , Protein Binding , Messenger RNA/genetics , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/genetics , Complement Receptors/chemistry , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/metabolism , Ribonucleoproteins/genetics , Ribonucleoproteins/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/growth & development , Saccharomyces cerevisiae/metabolism , Sequence Alignment , Transcription Factors/chemistry , Transcription Factors/genetics

12.

Prediction of structural domains of TAP reveals details of its interaction with p15 and nucleoporins.

Suyama, M; Doerks, T; Braun, I C; Sattler, M; Izaurralde, E; Bork, P.

EMBO Rep ; 1(1): 53-8, 2000 Jul.

Article in English | MEDLINE | ID: mdl-11256625

ABSTRACT

Vertebrate TAP is a nuclear mRNA export factor homologous to yeast Mex67p. The middle domain of TAP binds directly to p15, a protein related to the nuclear transport factor 2 (NTF2), whereas its C-terminal domain interacts with various nucleoporins, the components of the nuclear pore complex (NPC). Here, we report that the middle domain of TAP is also similar to NTF2, as well as to regions in Ras-GAP SH3 domain binding protein (G3BP) and some plant protein kinases. Based on the known three-dimensional structure of NTF2 homodimer, a heterodimerization model of TAP and p15 could be inferred. This model was confirmed by site-directed mutagenesis of residues located at the dimer interface. Furthermore, the C-terminus of TAP was found to contain a ubiquitin-associated (UBA) domain. By site-directed mutagenesis we show that a conserved loop in this domain plays an essential role in mediating TAP-nucleoporin interaction.

Subjects

Carrier Proteins/metabolism , Membrane Proteins/metabolism , Nuclear Proteins/chemistry , Nuclear Proteins/metabolism , Nucleocytoplasmic Transport Proteins , RNA-Binding Proteins/chemistry , Amino Acid Sequence , Animals , Carrier Proteins/genetics , Dimerization , Humans , Molecular Models , Molecular Sequence Data , Site-Directed Mutagenesis , Nuclear Proteins/genetics , Tertiary Protein Structure , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/metabolism , Sequence Alignment

13.

SMART: a web-based tool for the study of genetically mobile domains.

Schultz, J; Copley, R R; Doerks, T; Ponting, C P; Bork, P.

Nucleic Acids Res ; 28(1): 231-4, 2000 Jan 01.

Article in English | MEDLINE | ID: mdl-10592234

ABSTRACT

SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures (http://SMART.embl-heidelberg.de). More than 400 domain families found in signalling, extra-cellular and chromatin-associated proteins are detectable. These domains are extensively annotated with respect to phyletic distributions, functional class, tertiary structures and functionally important residues. Each domain found in a non-redundant protein database as well as search parameters and taxonomic information are stored in a relational database system. User interfaces to this database allow searches for proteins containing specific combinations of domains in defined taxa.

Subjects

Database Management Systems , Internet , Sequence Alignment , Information Storage and Retrieval , Proteins/chemistry

14.

Domains in plexins: links to integrins and transcription factors.

Bork, P; Doerks, T; Springer, T A; Snel, B.

Trends Biochem Sci ; 24(7): 261-3, 1999 Jul.

Article in English | MEDLINE | ID: mdl-10390613

Subjects

Cell Adhesion Molecules/chemistry , Integrins/chemistry , Nerve Tissue Proteins/chemistry , Transcription Factors/chemistry , Amino Acid Sequence , Animals , Cell Adhesion Molecules/genetics , Humans , Integrins/genetics , Mice , Molecular Sequence Data , Nerve Tissue Proteins/genetics , Amino Acid Sequence Homology , Transcription Factors/genetics

15.

Homology-based fold predictions for Mycoplasma genitalium proteins.

Huynen, M; Doerks, T; Eisenhaber, F; Orengo, C; Sunyaev, S; Yuan, Y; Bork, P.

J Mol Biol ; 280(3): 323-6, 1998 Jul 17.

Article in English | MEDLINE | ID: mdl-9665839

ABSTRACT

Homology search techniques based on the iterative PSI-BLAST method in combination with various filters for low sequence complexity are applied to assign folds to all Mycoplasma genitalium proteins. The resulting procedure (implemented as a web server) is able to predict at least one domain in 37% of these proteins automatically, with an estimated accuracy higher than 98%. Taking structural features such as coiled coil or transmembrane regions aside, folds can be assigned to more than half of the globular proteins in a bacterium just by iterative sequence comparison.

Subjects

Bacterial Proteins/chemistry , Mycoplasma/chemistry , Protein Folding , Protein Conformation , Sequence Homology

16.

Protein annotation: detective work for function prediction.

Doerks, T; Bairoch, A; Bork, P.

Trends Genet ; 14(6): 248-50, 1998 Jun.

Article in English | MEDLINE | ID: mdl-9635409

Subjects

Proteins/physiology , Animals , Factual Databases , Humans , Phylogeny , Software , Structure-Activity Relationship
