1.
Bioinformatics ; 33(21): 3454-3460, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-29036270

ABSTRACT

MOTIVATION: Biological knowledgebases, such as UniProtKB/Swiss-Prot, constitute an essential component of daily scientific research by offering distilled, summarized and computable knowledge extracted from the literature by expert curators. While knowledgebases play an increasingly important role in the scientific community, their ability to keep up with the growth of biomedical literature is under scrutiny. Using UniProtKB/Swiss-Prot as a case study, we address this concern via multiple literature triage approaches. RESULTS: With the assistance of the PubTator text-mining tool, we tagged more than 10 000 articles to assess the ratio of papers relevant for curation. We first show that curators read and evaluate many more papers than they curate, and that measuring the number of curated publications is insufficient to provide a complete picture as demonstrated by the fact that 8000-10 000 papers are curated in UniProt each year while curators evaluate 50 000-70 000 papers per year. We show that 90% of the papers in PubMed are out of the scope of UniProt, that a maximum of 2-3% of the papers indexed in PubMed each year are relevant for UniProt curation, and that, despite appearances, expert curation in UniProt is scalable. AVAILABILITY AND IMPLEMENTATION: UniProt is freely available at http://www.uniprot.org/. CONTACT: sylvain.poux@sib.swiss. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
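
The gap between evaluated and curated papers can be made concrete with the figures quoted in the abstract above. The following is a minimal sketch, not part of the paper: the function name `curated_fraction_bounds` is hypothetical, and the bounds simply combine the extremes of the two reported yearly ranges, a calculation the abstract itself does not report.

```python
def curated_fraction_bounds(curated_range, evaluated_range):
    """Return the (lowest, highest) share of evaluated papers that end up curated.

    Both arguments are (min, max) tuples of papers per year.
    """
    curated_lo, curated_hi = curated_range
    evaluated_lo, evaluated_hi = evaluated_range
    # Lowest share: fewest curated papers out of the most evaluated papers.
    lowest = curated_lo / evaluated_hi
    # Highest share: most curated papers out of the fewest evaluated papers.
    highest = curated_hi / evaluated_lo
    return lowest, highest


# Yearly ranges quoted in the abstract: 8000-10 000 curated, 50 000-70 000 evaluated.
low, high = curated_fraction_bounds((8000, 10000), (50000, 70000))
print(f"{low:.1%} to {high:.1%}")  # prints "11.4% to 20.0%"
```

Under these assumptions, roughly one in five to one in nine evaluated papers is curated, which is the sense in which counting curated publications alone understates curator workload.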


Subjects
Data Curation , Databases, Protein , Data Curation/statistics & numerical data , Data Mining , Databases, Protein/statistics & numerical data , Humans , Knowledge Bases , PubMed/statistics & numerical data , Review Literature as Topic , Statistics as Topic
2.
Plant Physiol ; 165(4): 1709-1722, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24920445

ABSTRACT

CASPARIAN STRIP MEMBRANE DOMAIN PROTEINS (CASPs) are four-membrane-span proteins that mediate the deposition of Casparian strips in the endodermis by recruiting the lignin polymerization machinery. CASPs show high stability in their membrane domain, which presents all the hallmarks of a membrane scaffold. Here, we characterized the large family of CASP-like (CASPL) proteins. CASPLs were found in all major divisions of land plants as well as in green algae; homologs outside of the plant kingdom were identified as members of the MARVEL protein family. When ectopically expressed in the endodermis, most CASPLs were able to integrate the CASP membrane domain, which suggests that CASPLs share with CASPs the propensity to form transmembrane scaffolds. Extracellular loops are not necessary for generating the scaffold, since CASP1 was still able to localize correctly when either one of the extracellular loops was deleted. The CASP first extracellular loop was found conserved in euphyllophytes but absent in plants lacking Casparian strips, an observation that may contribute to the study of Casparian strip and root evolution. In Arabidopsis (Arabidopsis thaliana), CASPL showed specific expression in a variety of cell types, such as trichomes, abscission zone cells, peripheral root cap cells, and xylem pole pericycle cells.

3.
Nucleic Acids Res ; 40(Database issue): D565-70, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22123736

ABSTRACT

The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidence-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360,000 taxa, this resource has increased 2-fold over the last 2 years. It has benefited from a wealth of checks to improve annotation correctness and consistency, and it now supplies greater information content, enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set.


Subjects
Databases, Protein , Molecular Sequence Annotation , Vocabulary, Controlled , Molecular Sequence Annotation/standards
4.
Database (Oxford) ; 2022, 2022 Apr 12.
Article in English | MEDLINE | ID: mdl-35411389

ABSTRACT

SwissBioPics (www.swissbiopics.org) is a freely available resource of interactive, high-resolution cell images designed for the visualization of subcellular location data. SwissBioPics provides images describing cell types from all kingdoms of life, from the specialized muscle, neuronal and epithelial cells of animals to the rods, cocci, clubs and spirals of prokaryotes. All cell images in SwissBioPics are drawn in Scalable Vector Graphics (SVG), with each subcellular location tagged with a unique identifier from the controlled vocabulary of subcellular locations and organelles of UniProt (https://www.uniprot.org/locations/). Users can search and explore SwissBioPics cell images through our website, which provides a platform for users to learn more about how cells are organized. A web component allows developers to embed SwissBioPics images in their own websites, using the associated JavaScript and a styling template, and to highlight subcellular locations and organelles by simply providing the web component with the appropriate identifier(s) from the UniProt-controlled vocabulary or the 'Cellular Component' branch of the Gene Ontology (www.geneontology.org), as well as an organism identifier from the National Center for Biotechnology Information taxonomy (https://www.ncbi.nlm.nih.gov/taxonomy). The UniProt website now uses SwissBioPics to visualize the subcellular locations and organelles where proteins function. SwissBioPics is freely available for anyone to use under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. DATABASE URL: www.swissbiopics.org.


Subjects
Proteins , Vocabulary, Controlled , Animals
5.
Metabolites ; 11(1), 2021 Jan 12.
Article in English | MEDLINE | ID: mdl-33445429

ABSTRACT

The UniProt Knowledgebase (UniProtKB) is a comprehensive, high-quality, and freely accessible resource of protein sequences and functional annotation that covers genomes and proteomes from tens of thousands of taxa, including a broad range of plants and microorganisms producing natural products of medical, nutritional, and agronomical interest. Here we describe work that enhances the utility of UniProtKB as a support for both the study of natural products and their discovery. The foundation of this work is an improved representation of natural product metabolism in UniProtKB using Rhea, an expert-curated knowledgebase of biochemical reactions that is built on the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Knowledge of natural products and precursors is captured in ChEBI, enzyme-catalyzed reactions in Rhea, and enzymes in UniProtKB/Swiss-Prot, thereby linking chemical structure data directly to protein knowledge. We provide a practical demonstration of how users can search UniProtKB for protein knowledge relevant to natural products through interactive or programmatic queries using metabolite names and synonyms, chemical identifiers, chemical classes, and chemical structures, and show how to federate UniProtKB with other data and knowledge resources and tools using semantic web technologies such as RDF and SPARQL. All UniProtKB data are freely available for download in a broad range of formats for users to further mine or exploit as an annotation source, to enrich other natural product datasets and databases.

6.
Methods Mol Biol ; 406: 89-112, 2007.
Article in English | MEDLINE | ID: mdl-18287689

ABSTRACT

The Swiss Institute of Bioinformatics (SIB), the European Bioinformatics Institute (EBI), and the Protein Information Resource (PIR) form the Universal Protein Resource (UniProt) consortium. Its main goal is to provide the scientific community with a central resource for protein sequences and functional information. The UniProt consortium maintains the UniProt KnowledgeBase (UniProtKB) and several supplementary databases including the UniProt Reference Clusters (UniRef) and the UniProt Archive (UniParc). (1) UniProtKB is a comprehensive protein sequence knowledgebase that consists of two sections: UniProtKB/Swiss-Prot, which contains manually annotated entries, and UniProtKB/TrEMBL, which contains computer-annotated entries. UniProtKB/Swiss-Prot entries contain information curated by biologists and provide users with cross-links to about 100 external databases and with access to additional information or tools. (2) The UniRef databases (UniRef100, UniRef90, and UniRef50) define clusters of protein sequences that share 100, 90, or 50% identity. (3) The UniParc database stores and maps all publicly available protein sequence data, including obsolete data excluded from UniProtKB. The UniProt databases can be accessed online (http://www.uniprot.org/) or downloaded in several formats (ftp://ftp.uniprot.org/pub). New releases are published every 2 weeks. The purpose of this chapter is to present a guided tour of a UniProtKB/Swiss-Prot entry, paying particular attention to the specificities of plant protein annotation. We will also present some of the tools and databases that are linked to each entry.
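
The identity thresholds behind UniRef100, UniRef90, and UniRef50 can be illustrated with a small sketch. It assumes two sequences that are already aligned to equal length (gaps written as `-`) and uses one simple definition of percent identity; the helper names `percent_identity` and `uniref_levels` are hypothetical, and the actual UniRef clustering procedure may apply additional criteria that the abstract does not describe.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns (ignoring gap-only columns) with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # skip columns where both sequences have a gap
        compared += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def uniref_levels(identity):
    """List the UniRef levels whose identity threshold (100, 90, 50%) is met."""
    thresholds = [("UniRef100", 100.0), ("UniRef90", 90.0), ("UniRef50", 50.0)]
    return [name for name, cutoff in thresholds if identity >= cutoff]


# Toy sequences (invented for illustration): 9 of 10 residues are identical.
identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQL")
print(identity, uniref_levels(identity))
```

In this toy case the pair differs at one of ten positions, so it would meet the 90% and 50% thresholds but not the 100% one.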


Subjects
Databases, Protein , Proteins/genetics , Amino Acid Sequence , Information Storage and Retrieval , Molecular Sequence Data , Proteins/classification , Sequence Alignment/methods , Sequence Homology, Amino Acid , User-Computer Interface
7.
Database (Oxford) ; 2017, 2017 Jan 01.
Article in English | MEDLINE | ID: mdl-29220476

ABSTRACT

UniProt Knowledgebase (UniProtKB) is a publicly available database with access to a vast amount of protein sequence and functional information. To widen the scope of the publications associated with a protein entry, UniProt has introduced the computationally mapped additional bibliography section, which includes literature collected from external sources. In this article, we describe a text mining system, eGenPub, which selects articles that are 'about' specific proteins and allows automatic identification of additional bibliography for given UniProt protein entries. Focusing on plant proteins initially, eGenPub utilizes a gene normalization tool called pGenN and a trained support vector machine model, which achieves a precision of 95.3%, to predict whether an article, based on its abstract, should be linked to a given UniProt entry. We have conducted full-scale PubMed processing using eGenPub for eight common plant species. Altogether, 9025 articles are identified as relevant bibliography for 4752 UniProt entries, among which 5252 are additional papers not in the existing publication section. This newly computationally mapped additional bibliography from eGenPub is being integrated into the UniProt production pipeline and can be accessed via the UniProtKB protein entry publication view.
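
The two headline numbers in this abstract are simple ratios, sketched below. The precision formula is the standard definition (true positives over all predicted positives); the confusion counts passed to `precision` are invented purely to show the formula reproducing a value of 95.3% and are not taken from the paper. The share of additional papers uses the article counts quoted in the abstract.

```python
def precision(true_positives, false_positives):
    """Standard precision: correct positive predictions over all positive predictions."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        raise ValueError("no positive predictions")
    return true_positives / predicted_positives


# Hypothetical counts for illustration only (NOT from the paper).
example_precision = precision(true_positives=953, false_positives=47)
print(f"{example_precision:.1%}")  # prints "95.3%"

# Counts quoted in the abstract: 5252 of the 9025 identified articles were
# not already in the existing publication section.
additional_share = 5252 / 9025
print(f"{additional_share:.1%}")  # prints "58.2%"
```

Read this way, a little over half of the articles eGenPub links to entries are new to those entries.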


Subjects
Data Mining , Databases, Bibliographic , Databases, Protein , Plant Proteins/genetics , Plant Proteins/metabolism , Plants , Plants/genetics , Plants/metabolism
8.
Methods Mol Biol ; 1374: 23-54, 2016.
Article in English | MEDLINE | ID: mdl-26519399

ABSTRACT

The Universal Protein Resource (UniProt, http://www.uniprot.org) consortium is an initiative of the SIB Swiss Institute of Bioinformatics (SIB), the European Bioinformatics Institute (EBI) and the Protein Information Resource (PIR) to provide the scientific community with a central resource for protein sequences and functional information. The UniProt consortium maintains the UniProt KnowledgeBase (UniProtKB), updated every 4 weeks, and several supplementary databases including the UniProt Reference Clusters (UniRef) and the UniProt Archive (UniParc). The Swiss-Prot section of the UniProt KnowledgeBase (UniProtKB/Swiss-Prot) contains publicly available, expertly and manually annotated protein sequences obtained from a broad spectrum of organisms. Plant protein entries are produced in the frame of the Plant Proteome Annotation Program (PPAP), with an emphasis on characterized proteins of Arabidopsis thaliana and Oryza sativa. High-level annotations provided by UniProtKB/Swiss-Prot are widely used to predict annotation of newly available proteins through automatic pipelines. The purpose of this chapter is to present a guided tour of a UniProtKB/Swiss-Prot entry. We will also present some of the tools and databases that are linked to each entry.


Subjects
Computational Biology/methods , Databases, Protein , Animals , Humans , Web Browser
9.
Mol Plant Microbe Interact ; 16(10): 851-8, 2003 Oct.
Article in English | MEDLINE | ID: mdl-14558686

ABSTRACT

Root inoculation of Arabidopsis thaliana ecotype Columbia with Pseudomonas fluorescens CHA0r partially protected leaves from the oomycete Peronospora parasitica. The molecular determinants of Pseudomonas fluorescens CHA0r for this induced systemic resistance (ISR) were investigated, using mutants derived from strain CHA0: CHA400 (pyoverdine deficient), CHA805 (exoprotease deficient), CHA77 (HCN deficient), CHA660 (pyoluteorin deficient), CHA631 (2,4-diacetylphloroglucinol [DAPG] deficient), and CHA89 (HCN, DAPG- and pyoluteorin deficient). Only mutations interfering with DAPG production led to a significant decrease in ISR to Peronospora parasitica. Thus, DAPG production in Pseudomonas fluorescens is required for the induction of ISR to Peronospora parasitica. DAPG is known for its antibiotic activity; however, our data indicate that one action of DAPG could be due to an effect on the physiology of the plant. DAPG at 10 to 100 microM applied to roots of Arabidopsis mimicked the ISR effect. CHA0r-mediated ISR was also tested in various Arabidopsis mutants and transgenic plants: NahG (transgenic line degrading salicylic acid [SA]), sid2-1 (nonproducing SA), npr1-1 (non-expressing NPR1 protein), jar1-1 (insensitive to jasmonic acid and methyl jasmonic acid), ein2-1 (insensitive to ethylene), etr1-1 (insensitive to ethylene), eir1-1 (insensitive to ethylene in roots), and pad2-1 (phytoalexin deficient). Only jar1-1, eir1-1, and npr1-1 mutants were unable to undergo ISR. Sensitivity to jasmonic acid and functional NPR1 and EIR1 proteins were required for full expression of CHA0r-mediated ISR. The requirements for ISR observed in this study in Peronospora parasitica induced by Pseudomonas fluorescens CHA0r only partially overlap with those published so far for Peronospora parasitica, indicating a great degree of flexibility in the molecular processes leading to ISR.


Subjects
Arabidopsis/microbiology , Pseudomonas fluorescens/physiology , Anti-Bacterial Agents/metabolism , Anti-Bacterial Agents/pharmacology , Arabidopsis/genetics , Arabidopsis/physiology , Cyclopentanes/metabolism , Ethylenes/metabolism , Genes, Plant , Mutation , Oxylipins , Peronospora/pathogenicity , Phloroglucinol/analogs & derivatives , Phloroglucinol/metabolism , Phloroglucinol/pharmacology , Plant Diseases/microbiology , Plant Growth Regulators/metabolism , Plant Roots/microbiology , Plants, Genetically Modified , Salicylic Acid/metabolism
10.
J Proteomics ; 72(3): 567-73, 2009 Apr 13.
Article in English | MEDLINE | ID: mdl-19084081

ABSTRACT

The UniProt knowledgebase, UniProtKB, is the main product of the UniProt consortium. It consists of two sections, UniProtKB/Swiss-Prot, the manually curated section, and UniProtKB/TrEMBL, the computer translation of the EMBL/GenBank/DDBJ nucleotide sequence database. Taken together, these two sections cover all the proteins characterized or inferred from all publicly available nucleotide sequences. The Plant Proteome Annotation Program (PPAP) of UniProtKB/Swiss-Prot focuses on the manual annotation of plant-specific proteins and protein families. Our major effort is currently directed towards the two model plants Arabidopsis thaliana and Oryza sativa. In UniProtKB/Swiss-Prot, redundancy is minimized by merging all data from different sources in a single entry. The proposed protein sequence is frequently modified after comparison with ESTs, full length transcripts or homologous proteins from other species. The information present in manually curated entries allows the reconstruction of all described isoforms. The annotation also includes proteomics data such as PTM and protein identification MS experimental results. UniProtKB and the other products of the UniProt consortium are accessible online at www.uniprot.org.


Subjects
Abstracting and Indexing , Databases, Protein , Knowledge Bases , Plant Proteins/analysis , Plant Proteins/classification , Proteome/analysis , Proteome/classification , Arabidopsis/chemistry , Internet , Mass Spectrometry , Oryza/chemistry , Plant Proteins/chemistry , Proteome/chemistry
11.
Plant Physiol ; 133(4): 1893-910, 2003 Dec.
Article in English | MEDLINE | ID: mdl-14630957

ABSTRACT

To study the role of LecRK (lectin-like receptor kinase) genes in the legume-rhizobia symbiosis, we have characterized the four Medicago truncatula Gaertn. LecRK genes that are most highly expressed in roots. Three of these genes, MtLecRK7;1, MtLecRK7;2, and MtLecRK7;3, encode proteins most closely related to the Class A LecRKs of Arabidopsis, whereas the protein encoded by the fourth gene, MtLecRK1;1, is most similar to a Class B Arabidopsis LecRK. All four genes show a strongly enhanced root expression, and detailed studies on MtLecRK1;1 and MtLecRK7;2 revealed that the levels of their mRNAs are increased by nitrogen starvation and transiently repressed after either rhizobial inoculation or addition of lipochitooligosaccharidic Nod factors. Studies of the MtLecRK1;1 and MtLecRK7;2 proteins, using green fluorescent protein fusions in transgenic M. truncatula roots, revealed that they are located in the plasma membrane and that their central transmembrane-spanning helix is required for correct sorting. Moreover, their lectin-like domains appear to be highly glycosylated. Of the four proteins, only MtLecRK1;1 shows a high conservation of key residues implicated in monosaccharide binding, and molecular modeling revealed that this protein may be capable of interacting with Nod factors. However, no increase in Nod factor binding was found in roots overexpressing a fusion in which the kinase domain of this protein had been replaced with green fluorescent protein. Roots expressing this fusion protein, however, showed an increase in nodule number, suggesting that expression of MtLecRK1;1 influences nodulation. The potential role of LecRKs in the legume-rhizobia symbiosis is discussed.


Subjects
Gene Expression Regulation, Plant/genetics , Medicago/enzymology , Plant Lectins/genetics , Plant Roots/enzymology , Protein Kinases/genetics , Sinorhizobium meliloti/physiology , Amino Acid Sequence , Binding Sites , Conserved Sequence , Gene Expression Regulation, Enzymologic/genetics , Medicago/classification , Medicago/genetics , Medicago/physiology , Models, Molecular , Molecular Sequence Data , Phylogeny , Plant Lectins/chemistry , Protein Conformation , Protein Kinases/chemistry , Sequence Alignment , Sequence Homology, Amino Acid , Symbiosis