Search | BVS IEC

1.

DisProt in 2022: improved quality and accessibility of protein intrinsic disorder annotation.

Quaglia, Federica; Mészáros, Bálint; Salladini, Edoardo; Hatos, András; Pancsa, Rita; Chemes, Lucía B; Pajkos, Mátyás; Lazar, Tamas; Peña-Díaz, Samuel; Santos, Jaime; Ács, Veronika; Farahi, Nazanin; Fichó, Erzsébet; Aspromonte, Maria Cristina; Bassot, Claudio; Chasapi, Anastasia; Davey, Norman E; Davidovic, Radoslav; Dobson, Laszlo; Elofsson, Arne; Erdos, Gábor; Gaudet, Pascale; Giglio, Michelle; Glavina, Juliana; Iserte, Javier; Iglesias, Valentín; Kálmán, Zsófia; Lambrughi, Matteo; Leonardi, Emanuela; Longhi, Sonia; Macedo-Ribeiro, Sandra; Maiani, Emiliano; Marchetti, Julia; Marino-Buslje, Cristina; Mészáros, Attila; Monzon, Alexander Miguel; Minervini, Giovanni; Nadendla, Suvarna; Nilsson, Juliet F; Novotný, Marian; Ouzounis, Christos A; Palopoli, Nicolás; Papaleo, Elena; Pereira, Pedro José Barbosa; Pozzati, Gabriele; Promponas, Vasilis J; Pujols, Jordi; Rocha, Alma Carolina Sanchez; Salas, Martin; Sawicki, Luciana Rodriguez.

Nucleic Acids Res ; 50(D1): D480-D487, 2022 01 07.

Article in English | MEDLINE | ID: mdl-34850135

ABSTRACT

The Database of Intrinsically Disordered Proteins (DisProt, URL: https://disprot.org) is the major repository of manually curated annotations of intrinsically disordered proteins and regions from the literature. We report here recent updates of DisProt version 9, including a restyled web interface, refactored Intrinsically Disordered Proteins Ontology (IDPO), improvements in the curation process and significant content growth of around 30%. Higher quality and consistency of annotations is provided by a newly implemented reviewing process and training of curators. The increased curation capacity is fostered by the integration of DisProt with APICURON, a dedicated resource for the proper attribution and recognition of biocuration efforts. Better interoperability is provided through the adoption of the Minimum Information About Disorder (MIADE) standard, an active collaboration with the Gene Ontology (GO) and Evidence and Conclusion Ontology (ECO) consortia and the support of the ELIXIR infrastructure.

Subjects

Databases, Protein; Intrinsically Disordered Proteins/metabolism; Molecular Sequence Annotation; Software; Amino Acid Sequence; DNA/genetics; DNA/metabolism; Datasets as Topic; Gene Ontology; Humans; Internet; Intrinsically Disordered Proteins/chemistry; Intrinsically Disordered Proteins/genetics; Protein Binding; RNA/genetics; RNA/metabolism

2.

The Minimum Information about a Molecular Interaction CAusal STatement (MI2CAST).

Touré, Vasundra; Vercruysse, Steven; Acencio, Marcio Luis; Lovering, Ruth C; Orchard, Sandra; Bradley, Glyn; Casals-Casas, Cristina; Chaouiya, Claudine; Del-Toro, Noemi; Flobak, Åsmund; Gaudet, Pascale; Hermjakob, Henning; Hoyt, Charles Tapley; Licata, Luana; Lægreid, Astrid; Mungall, Christopher J; Niknejad, Anne; Panni, Simona; Perfetto, Livia; Porras, Pablo; Pratt, Dexter; Saez-Rodriguez, Julio; Thieffry, Denis; Thomas, Paul D; Türei, Dénes; Kuiper, Martin.

Bioinformatics ; 36(24): 5712-5718, 2021 04 05.

Article in English | MEDLINE | ID: mdl-32637990

ABSTRACT

MOTIVATION: A large variety of molecular interactions occurs between biomolecular components in cells. When a molecular interaction results in a regulatory effect, exerted by one component onto a downstream component, a so-called 'causal interaction' takes place. Causal interactions constitute the building blocks in our understanding of larger regulatory networks in cells. These causal interactions and the biological processes they enable (e.g. gene regulation) need to be described with a careful appreciation of the underlying molecular reactions. A proper description of this information enables archiving, sharing and reuse by humans and for automated computational processing. Various representations of causal relationships between biological components are currently used in a variety of resources. RESULTS: Here, we propose a checklist that accommodates current representations, called the Minimum Information about a Molecular Interaction CAusal STatement (MI2CAST). This checklist defines both the required core information, as well as a comprehensive set of other contextual details valuable to the end user and relevant for reusing and reproducing causal molecular interaction information. The MI2CAST checklist can be used as reporting guidelines when annotating and curating causal statements, while fostering uniformity and interoperability of the data across resources. AVAILABILITY AND IMPLEMENTATION: The checklist together with examples is accessible at https://github.com/MI2CAST/MI2CAST. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Software; Causality; Humans

3.

The neXtProt knowledgebase in 2020: data, tools and usability improvements.

Zahn-Zabal, Monique; Michel, Pierre-André; Gateau, Alain; Nikitin, Frédéric; Schaeffer, Mathieu; Audot, Estelle; Gaudet, Pascale; Duek, Paula D; Teixeira, Daniel; Rech de Laval, Valentine; Samarasinghe, Kasun; Bairoch, Amos; Lane, Lydie.

Nucleic Acids Res ; 48(D1): D328-D334, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31724716

ABSTRACT

The neXtProt knowledgebase (https://www.nextprot.org) is an integrative resource providing both data on human proteins and the tools to explore them. In order to provide comprehensive and up-to-date data, we evaluate and add new data sets. We describe the incorporation of three new data sets that provide expression, function, protein-protein binary interaction, post-translational modifications (PTM) and variant information. New SPARQL query examples illustrating uses of the new data were added. neXtProt has continued to develop tools for proteomics. We have improved the peptide uniqueness checker and have implemented a new protein digestion tool. Together, these tools make it possible to determine which proteases can be used to identify trypsin-resistant proteins by mass spectrometry. In terms of usability, we have finished revamping our web interface and completely rewritten our API. Our SPARQL endpoint now supports federated queries. All the neXtProt data are available via our user interface, API, SPARQL endpoint and FTP site, including the new PEFF 1.0 format files. Finally, the data on our FTP site are now licensed under CC BY 4.0 to promote their reuse.
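The idea behind combining a protein digestion tool with a peptide uniqueness check can be illustrated with a minimal sketch. It is not the neXtProt implementation: it assumes the common textbook trypsin rule (cleave after K or R unless the next residue is P), compares only fully cleaved peptides, and uses made-up toy sequences. The function names, the accessions `TOY1` and `TOY2`, and the `min_length` threshold are all hypothetical.

```python
import re


def trypsin_digest(sequence: str) -> list[str]:
    """Split a protein sequence after K or R, except when followed by P.

    This is the simplified textbook trypsin rule, with no missed cleavages.
    """
    # Zero-width split points: preceded by K/R and not followed by P.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def unique_peptides(proteins: dict[str, str], min_length: int = 7) -> dict[str, set[str]]:
    """Return, per protein, the tryptic peptides found in no other protein.

    Only peptides of at least `min_length` residues are considered
    (an assumed threshold, chosen here for illustration).
    """
    owners: dict[str, set[str]] = {}
    for accession, sequence in proteins.items():
        for peptide in set(trypsin_digest(sequence)):
            if len(peptide) >= min_length:
                owners.setdefault(peptide, set()).add(accession)

    unique: dict[str, set[str]] = {accession: set() for accession in proteins}
    for peptide, accessions in owners.items():
        if len(accessions) == 1:
            unique[next(iter(accessions))].add(peptide)
    return unique


# Toy sequences (not real proteins) sharing one tryptic peptide.
toy_proteins = {
    "TOY1": "MKAAAAAAAKCCCCCCCR",
    "TOY2": "MKAAAAAAAKDDDDDDDR",
}
result = unique_peptides(toy_proteins)
```

A protein whose entry in `result` is empty would have no unique tryptic peptide under these assumptions, which is the situation in which trying a different protease becomes interesting.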

Subjects

Databases, Protein; Knowledge Bases; Humans; Internet; Mass Spectrometry; Peptides/chemistry; Protein Kinases/chemistry; Protein Kinases/metabolism; Protein Processing, Post-Translational; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; Sequence Analysis, RNA; Software; Trypsin; User-Computer Interface

4.

The Feature-Viewer: a visualization tool for positional annotations on a sequence.

Paladin, Lisanna; Schaeffer, Mathieu; Gaudet, Pascale; Zahn-Zabal, Monique; Michel, Pierre-André; Piovesan, Damiano; Tosatto, Silvio C E; Bairoch, Amos.

Bioinformatics ; 36(10): 3244-3245, 2020 05 01.

Article in English | MEDLINE | ID: mdl-31985787

ABSTRACT

SUMMARY: The Feature-Viewer is a lightweight library for the visualization of biological data mapped to a protein or nucleotide sequence. It is designed for ease of use while allowing full customization. The library is already used by several biological data resources and allows intuitive visual mapping of a full spectrum of sequence features for different usages. AVAILABILITY AND IMPLEMENTATION: The Feature-Viewer is open source, compatible with state-of-the-art development technologies and responsive, also for mobile viewing. Documentation and usage examples are available online.

Subjects

Computers; Software

5.

ECO, the Evidence & Conclusion Ontology: community standard for evidence information.

Giglio, Michelle; Tauber, Rebecca; Nadendla, Suvarna; Munro, James; Olley, Dustin; Ball, Shoshannah; Mitraka, Elvira; Schriml, Lynn M; Gaudet, Pascale; Hobbs, Elizabeth T; Erill, Ivan; Siegele, Deborah A; Hu, James C; Mungall, Chris; Chibucos, Marcus C.

Nucleic Acids Res ; 47(D1): D1186-D1194, 2019 01 08.

Article in English | MEDLINE | ID: mdl-30407590

ABSTRACT

The Evidence and Conclusion Ontology (ECO) contains terms (classes) that describe types of evidence and assertion methods. ECO terms are used in the process of biocuration to capture the evidence that supports biological assertions (e.g. gene product X has function Y as supported by evidence Z). Capture of this information allows tracking of annotation provenance, establishment of quality control measures and query of evidence. ECO contains over 1500 terms and is in use by many leading biological resources including the Gene Ontology, UniProt and several model organism databases. ECO is continually being expanded and revised based on the needs of the biocuration community. The ontology is freely available for download from GitHub (https://github.com/evidenceontology/) or the project's website (http://evidenceontology.org/). Users can request new terms or changes to existing terms through the project's GitHub site. ECO is released into the public domain under CC0 1.0 Universal.

Subjects

Computational Biology/methods; Databases, Genetic; Gene Ontology; Proteins/genetics; Animals; Humans; Information Storage and Retrieval/methods; Internet; Proteins/metabolism; Sequence Analysis, Protein; User-Computer Interface

6.

A new bioinformatics tool to help assess the significance of BRCA1 variants.

Cusin, Isabelle; Teixeira, Daniel; Zahn-Zabal, Monique; Rech de Laval, Valentine; Gleizes, Anne; Viassolo, Valeria; Chappuis, Pierre O; Hutter, Pierre; Bairoch, Amos; Gaudet, Pascale.

Hum Genomics ; 12(1): 36, 2018 07 11.

Article in English | MEDLINE | ID: mdl-29996917

ABSTRACT

BACKGROUND: Germline pathogenic variants in the breast cancer type 1 susceptibility gene BRCA1 are associated with a 60% lifetime risk for breast and ovarian cancer. This overall risk estimate is for all BRCA1 variants; obviously, not all variants confer the same risk of developing a disease. In cancer patients, loss of BRCA1 function in tumor tissue has been associated with an increased sensitivity to platinum agents and to poly-(ADP-ribose) polymerase (PARP) inhibitors. For clinical management of both at-risk individuals and cancer patients, it would be important that each identified genetic variant be associated with clinical significance. Unfortunately, for the vast majority of variants, the clinical impact is unknown. The availability of results from studies assessing the impact of variants on protein function may provide insight of crucial importance. RESULTS AND CONCLUSION: We have collected, curated, and structured the molecular and cellular phenotypic impact of 3654 distinct BRCA1 variants. The data was modeled in triple format, using the variant as a subject, the studied function as the object, and a predicate describing the relation between the two. Each annotation is supported by fully traceable evidence. The data was captured using standard ontologies to ensure consistency, and enhance searchability and interoperability. We have assessed the extent to which functional defects at the molecular and cellular levels correlate with the clinical interpretation of variants by ClinVar submitters. Approximately 30% of the ClinVar BRCA1 missense variants have some molecular or cellular assay available in the literature. Pathogenic variants (as assigned by ClinVar) have at least some significant functional defect in 94% of testable cases. Among the ClinVar benign variants for which the neXtProt Cancer variant portal has data, 77% show either no or mild experimental functional defects. While this does not provide evidence for clinical interpretation of variants, it may provide some guidance for variants of unknown significance, in the absence of more reliable data. The neXtProt Cancer variant portal (https://www.nextprot.org/portals/breast-cancer) contains over 6300 observations at the molecular and/or cellular level for BRCA1 variants.
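The triple format described in this abstract (variant as subject, a predicate, the studied function as object, plus supporting evidence) can be pictured with a small data structure. This is only a sketch: the class name, field names, predicate strings, and every example value are hypothetical placeholders and are not taken from the neXtProt data model or from real BRCA1 annotations.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One annotation: subject (variant), predicate, object (function), evidence."""
    variant: str    # subject of the triple
    predicate: str  # relation between variant and function
    function: str   # object of the triple
    evidence: str   # traceable source supporting the statement


def group_by_variant(observations):
    """Collect (predicate, function) pairs under each variant."""
    grouped = defaultdict(list)
    for obs in observations:
        grouped[obs.variant].append((obs.predicate, obs.function))
    return dict(grouped)


# Placeholder observations for illustration only; not real data.
observations = [
    Observation("BRCA1-variant-A", "decreases", "function-1", "reference-1"),
    Observation("BRCA1-variant-A", "has-no-impact-on", "function-2", "reference-2"),
    Observation("BRCA1-variant-B", "decreases", "function-1", "reference-3"),
]
by_variant = group_by_variant(observations)
```

Keeping the evidence on every observation, rather than on the variant, is what lets each statement be traced back to its source independently.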

Subjects

BRCA1 Protein/genetics; Breast Neoplasms/genetics; Genetic Predisposition to Disease; Ovarian Neoplasms/genetics; Adult; Aged; BRCA1 Protein/chemistry; Breast Neoplasms/pathology; Computational Biology; Female; Genetic Variation; Germ-Line Mutation/genetics; Humans; Middle Aged; Ovarian Neoplasms/pathology; Protein Conformation

7.

The neXtProt knowledgebase on human proteins: 2017 update.

Gaudet, Pascale; Michel, Pierre-André; Zahn-Zabal, Monique; Britan, Aurore; Cusin, Isabelle; Domagalski, Marcin; Duek, Paula D; Gateau, Alain; Gleizes, Anne; Hinard, Valérie; Rech de Laval, Valentine; Lin, JinJin; Nikitin, Frederic; Schaeffer, Mathieu; Teixeira, Daniel; Lane, Lydie; Bairoch, Amos.

Nucleic Acids Res ; 45(D1): D177-D182, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899619

ABSTRACT

The neXtProt human protein knowledgebase (https://www.nextprot.org) continues to add new content and tools, with a focus on proteomics and genetic variation data. neXtProt now has proteomics data for over 85% of the human proteins, as well as new tools tailored to the proteomics community. Moreover, the neXtProt release 2016-08-25 includes over 8000 phenotypic observations for over 4000 variations in a number of genes involved in hereditary cancers and channelopathies. These changes are presented in the current neXtProt update. All of the neXtProt data are available via our user interface and FTP site. We also provide API access and a SPARQL endpoint for more technical applications.

Subjects

Databases, Protein; Proteomics; Genetic Association Studies; Genetic Variation; Humans; Internet; Phenotype; Proteomics/methods; Software; Web Browser

8.

Annotation of functional impact of voltage-gated sodium channel mutations.

Hinard, Valérie; Britan, Aurore; Schaeffer, Mathieu; Zahn-Zabal, Monique; Thomet, Urs; Rougier, Jean-Sébastien; Bairoch, Amos; Abriel, Hugues; Gaudet, Pascale.

Hum Mutat ; 38(5): 485-493, 2017 05.

Article in English | MEDLINE | ID: mdl-28168870

ABSTRACT

Voltage-gated sodium channels are pore-forming transmembrane proteins that selectively allow sodium ions to flow across the plasma membrane according to the electro-chemical gradient thus mediating the rising phase of action potentials in excitable cells and playing key roles in physiological processes such as neurotransmission, skeletal muscle contraction, heart rhythm, and pain sensation. Genetic variations in the nine human genes encoding these channels are known to cause a large range of diseases affecting the nervous and cardiac systems. Understanding the molecular effect of genetic variations is critical for elucidating the pathologic mechanisms of known variations and in predicting the effect of newly discovered ones. To this end, we have created a Web-based tool, the Ion Channels Variants Portal, which compiles all variants characterized functionally in the human sodium channel genes. This portal describes 672 variants each associated with at least one molecular or clinical phenotypic impact, for a total of 4,658 observations extracted from 264 different research articles. These data were captured as structured annotations using standardized vocabularies and ontologies, such as the Gene Ontology and the Ion Channel ElectroPhysiology Ontology. All these data are available to the scientific community via neXtProt at https://www.nextprot.org/portals/navmut.

Subjects

Computational Biology; Databases, Genetic; Mutation; Voltage-Gated Sodium Channels/genetics; Voltage-Gated Sodium Channels/metabolism; Animals; Computational Biology/methods; Electrophysiological Phenomena/genetics; Genetic Association Studies; Genetic Predisposition to Disease; Genotype; Humans; Molecular Sequence Annotation; Phenotype; Protein Domains; Severity of Illness Index; Software; Voltage-Gated Sodium Channels/chemistry; Web Browser

9.

Diabetogenic milieus induce specific changes in mitochondrial transcriptome and differentiation of human pancreatic islets.

Brun, Thierry; Li, Ning; Jourdain, Alexis A; Gaudet, Pascale; Duhamel, Dominique; Meyer, Jérémy; Bosco, Domenico; Maechler, Pierre.

Hum Mol Genet ; 24(18): 5270-84, 2015 Sep 15.

Article in English | MEDLINE | ID: mdl-26123492

ABSTRACT

In pancreatic β-cells, mitochondria play a central role in coupling glucose metabolism to insulin secretion. Chronic exposure of β-cells to metabolic stresses impairs their function and potentially induces apoptosis. Little is known about mitochondrial adaptation to metabolic stresses, i.e. high glucose, fatty acids or oxidative stress, all of which have been highlighted in the pathogenesis of type 2 diabetes. Here, human islets were exposed for 3 days to 25 mM glucose, 0.4 mM palmitate, 0.4 mM oleate and transiently to H2O2. Culture at physiological 5.6 mM glucose served as no-stress control. Expression of mitochondrion-associated genes was quantified, including the transcriptome of mitochondrial inner membrane carriers. Targets of interest were further evaluated at the protein level. Three days after acute oxidative stress, no significant alteration in β-cell function or apoptosis was detected in human islets. Palmitate specifically increased expression of the pyruvate carriers MPC1 and MPC2, whereas the glutamate carrier GC1 and the aspartate/glutamate carrier AGC1 were down-regulated by palmitate and oleate, respectively. High glucose decreased mRNA levels of key transcription factors (HNF4A, IPF1, PPARA and TFAM) and energy-sensor SIRT1. High glucose also reduced expression of 11 mtDNA-encoded respiratory chain subunits. Interestingly, transcript levels of the carriers for aspartate/glutamate AGC2, malate DIC and malate/oxaloacetate/aspartate UCP2 were increased by high glucose, a profile suggesting important mitochondrial anaplerotic/cataplerotic activities and NADPH-generating shuttles. Chronic exposure to high glucose impaired glucose-stimulated insulin secretion, decreased insulin content, promoted caspase-3 cleavage and cell death, revealing glucotoxicity. Overall, the expression profile of mitochondrion-associated genes was selectively modified by glucose, delineating a glucotoxic-specific signature.

Subjects

Cell Differentiation/genetics; Diabetes Mellitus/genetics; Islets of Langerhans/cytology; Islets of Langerhans/metabolism; Mitochondria/genetics; Transcriptome; Apoptosis/genetics; Cell Line; Cell Survival/genetics; DNA, Mitochondrial/genetics; Diabetes Mellitus/metabolism; Electron Transport; Gene Expression; Glucose/metabolism; Humans; Insulin/metabolism; Mitochondria/metabolism; Proton Pumps/metabolism; Superoxides/metabolism

10.

The neXtProt knowledgebase on human proteins: current status.

Gaudet, Pascale; Michel, Pierre-André; Zahn-Zabal, Monique; Cusin, Isabelle; Duek, Paula D; Evalet, Olivier; Gateau, Alain; Gleizes, Anne; Pereira, Mario; Teixeira, Daniel; Zhang, Ying; Lane, Lydie; Bairoch, Amos.

Nucleic Acids Res ; 43(Database issue): D764-70, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25593349

ABSTRACT

neXtProt (http://www.nextprot.org) is a human protein-centric knowledgebase developed at the SIB Swiss Institute of Bioinformatics. Focused solely on human proteins, neXtProt aims to provide a state of the art resource for the representation of human biology by capturing a wide range of data, precise annotations, fully traceable data provenance and a web interface which enables researchers to find and view information in a comprehensive manner. Since the introductory neXtProt publication, significant advances have been made on three main aspects: the representation of proteomics data, an extended representation of human variants and the development of an advanced search capability built around semantic technologies. These changes are presented in the current neXtProt update.

Subjects

Databases, Protein; Genetic Variation; Proteins/genetics; Proteomics; Cell Line; Disease/genetics; Humans; Internet; Proteome

11.

GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

Attwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G.

PLoS Comput Biol ; 11(4): e1004143, 2015 Apr.

Article in English | MEDLINE | ID: mdl-25856076

ABSTRACT

In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all.

Subjects

Computational Biology/education; Computational Biology/organization & administration; Curriculum; Interinstitutional Relations; Internationality; Teaching/organization & administration

12.

Target discovery from protein databases: challenges for curation.

Chichester, Christine; Gaudet, Pascale.

Drug Discov Today Technol ; 14: 11-6, 2015 Jul.

Article in English | MEDLINE | ID: mdl-26194582

ABSTRACT

Protein databases are a gold mine of potential new drug targets. The ready access to a complete overview of all aspects of protein biology provides the most benefit at the outset of drug discovery pipelines. Ideally, curation strategies used to move from the raw data to the validated knowledge should contain the checks and balances necessary for accuracy. The neXtProt human protein knowledgebase is used here as an example to give insight into these methods.

Subjects

Databases, Protein; Drug Discovery; Data Mining; Humans; Protein Conformation

13.

Metrics for the Human Proteome Project 2013-2014 and strategies for finding missing proteins.

Lane, Lydie; Bairoch, Amos; Beavis, Ronald C; Deutsch, Eric W; Gaudet, Pascale; Lundberg, Emma; Omenn, Gilbert S.

J Proteome Res ; 13(1): 15-20, 2014 Jan 03.

Article in English | MEDLINE | ID: mdl-24364385

ABSTRACT

One year ago the Human Proteome Project (HPP) leadership designated the baseline metrics for the Human Proteome Project to be based on neXtProt with a total of 13,664 proteins validated at protein evidence level 1 (PE1) by mass spectrometry, antibody-capture, Edman sequencing, or 3D structures. Corresponding chromosome-specific data were provided from PeptideAtlas, GPMdb, and Human Protein Atlas. This year, the neXtProt total is 15,646 and the other resources, which are inputs to neXtProt, have high-quality identifications and additional annotations for 14,012 in PeptideAtlas, 14,869 in GPMdb, and 10,976 in HPA. We propose to remove 638 genes from the denominator that are "uncertain" or "dubious" in Ensembl, UniProt/SwissProt, and neXtProt. That leaves 3844 "missing proteins", currently having no or inadequate documentation, to be found from a new denominator of 19,490 protein-coding genes. We present those tabulations and web links and discuss current strategies to find the missing proteins.
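The counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only figures stated above; the variable names are illustrative, and reading the 3844 missing proteins as the new denominator minus the current neXtProt total is an inference from those figures rather than a formula given in the abstract.

```python
# Figures quoted in the abstract (Human Proteome Project metrics, 2013-2014).
denominator = 19_490    # protein-coding genes after removing uncertain/dubious entries
pe1_this_year = 15_646  # neXtProt total reported this year
pe1_last_year = 13_664  # neXtProt PE1 total reported one year earlier

# Proteins still lacking adequate documentation (assumed: denominator minus current total).
missing = denominator - pe1_this_year

# Share of the denominator that is still missing, as a percentage.
missing_percent = 100 * missing / denominator

# Year-over-year increase in validated proteins.
gained = pe1_this_year - pe1_last_year

print(f"missing proteins: {missing}")
print(f"missing share: {missing_percent:.1f}%")
print(f"newly validated since last year: {gained}")
```

The subtraction reproduces the 3844 figure stated in the abstract, which supports the reading that the missing-protein count is derived this way.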

Subjects

Proteome; Chromosomes, Human; Humans; Mass Spectrometry

14.

neXtProt: a knowledge platform for human proteins.

Lane, Lydie; Argoud-Puy, Ghislaine; Britan, Aurore; Cusin, Isabelle; Duek, Paula D; Evalet, Olivier; Gateau, Alain; Gaudet, Pascale; Gleizes, Anne; Masselot, Alexandre; Zwahlen, Catherine; Bairoch, Amos.

Nucleic Acids Res ; 40(Database issue): D76-83, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22139911

ABSTRACT

neXtProt (http://www.nextprot.org/) is a new human protein-centric knowledge platform. Developed at the Swiss Institute of Bioinformatics (SIB), it aims to help researchers answer questions relevant to human proteins. To achieve this goal, neXtProt is built on a corpus containing both curated knowledge originating from the UniProtKB/Swiss-Prot knowledgebase and carefully selected and filtered high-throughput data pertinent to human proteins. This article presents an overview of the database and the data integration process. We also lay out the key future directions of neXtProt that we consider the necessary steps to make neXtProt the one-stop-shop for all research projects focusing on human proteins.

Subjects

Databases, Protein; Humans; Knowledge Bases; Proteins/genetics; Proteins/metabolism; User-Computer Interface

15.

Interpreting Gene Ontology Annotations Derived from Sequence Homology Methods.

Feuermann, Marc; Gaudet, Pascale.

Methods Mol Biol ; 2836: 285-298, 2024.

Article in English | MEDLINE | ID: mdl-38995546

ABSTRACT

The Gene Ontology (GO) project describes the functions of the gene products of organisms from all kingdoms of life in a standardized way, enabling powerful analyses of experiments involving genome-wide analysis. The scientific literature is used to convert experimental results into GO annotations that systematically classify gene products' functions. However, to address the fact that only a minor fraction of all genes has been characterized experimentally, multiple predictive methods to assign GO annotations have been developed since the inception of GO. Sequence homologies between novel genes and genes with known functions help to approximate the roles of these non-characterized genes. Here we describe the main sequence homology methods to produce annotations: pairwise comparison (BLAST), protein profile models (InterPro), and phylogenetic-based annotation (PAINT). Some of these methods can be implemented with genome analysis pipelines (BLAST and InterPro2GO), while PAINT is curated by the GO consortium.

Subjects

Computational Biology; Gene Ontology; Molecular Sequence Annotation; Molecular Sequence Annotation/methods; Computational Biology/methods; Phylogeny; Software; Sequence Homology; Databases, Genetic; Humans

16.

neXtProt: organizing protein knowledge in the context of human proteome projects.

Gaudet, Pascale; Argoud-Puy, Ghislaine; Cusin, Isabelle; Duek, Paula; Evalet, Olivier; Gateau, Alain; Gleizes, Anne; Pereira, Mario; Zahn-Zabal, Monique; Zwahlen, Catherine; Bairoch, Amos; Lane, Lydie.

J Proteome Res ; 12(1): 293-8, 2013 Jan 04.

Article in English | MEDLINE | ID: mdl-23205526

ABSTRACT

About 5000 (25%) of the ~20400 human protein-coding genes currently lack any experimental evidence at the protein level. For many others, there is only little information relative to their abundance, distribution, subcellular localization, interactions, or cellular functions. The aim of the HUPO Human Proteome Project (HPP, www.thehpp.org) is to collect this information for every human protein. HPP is based on three major pillars: mass spectrometry (MS), antibody/affinity capture reagents (Ab), and bioinformatics-driven knowledge base (KB). To meet this objective, the Chromosome-Centric Human Proteome Project (C-HPP) proposes to build this catalog chromosome-by-chromosome (www.c-hpp.org) by focusing primarily on proteins that currently lack MS evidence or Ab detection. These are termed "missing proteins" by the HPP consortium. The lack of observation of a protein can be due to various factors including incorrect and incomplete gene annotation, low or restricted expression, or instability. neXtProt (www.nextprot.org) is a new web-based knowledge platform specific for human proteins that aims to complement UniProtKB/Swiss-Prot (www.uniprot.org) with detailed information obtained from carefully selected high-throughput experiments on genomic variation, post-translational modifications, as well as protein expression in tissues and cells. This article describes how neXtProt contributes to prioritize C-HPP efforts and integrates C-HPP results with other research efforts to create a complete human proteome catalog.

Subjects

Databases, Protein; Proteins; Proteome; Chromosomes, Human; Computational Biology; Genome, Human; Humans; Internet; Knowledge Bases; Mass Spectrometry; Protein Processing, Post-Translational; Proteins/genetics; Proteins/metabolism

17.

A chromosome-centric human proteome project (C-HPP) to characterize the sets of proteins encoded in chromosome 17.

Liu, Suli; Im, Hogune; Bairoch, Amos; Cristofanilli, Massimo; Chen, Rui; Deutsch, Eric W; Dalton, Stephen; Fenyo, David; Fanayan, Susan; Gates, Chris; Gaudet, Pascale; Hincapie, Marina; Hanash, Samir; Kim, Hoguen; Jeong, Seul-Ki; Lundberg, Emma; Mias, George; Menon, Rajasree; Mu, Zhaomei; Nice, Edouard; Paik, Young-Ki; Uhlen, Mathias; Wells, Lance; Wu, Shiaw-Lin; Yan, Fangfei; Zhang, Fan; Zhang, Yue; Snyder, Michael; Omenn, Gilbert S; Beavis, Ronald C; Hancock, William S.

J Proteome Res ; 12(1): 45-57, 2013 Jan 04.

Article in English | MEDLINE | ID: mdl-23259914

ABSTRACT

We report progress assembling the parts list for chromosome 17 and illustrate the various processes that we have developed to integrate available data from diverse genomic and proteomic knowledge bases. As primary resources, we have used GPMDB, neXtProt, PeptideAtlas, Human Protein Atlas (HPA), and GeneCards. All sites share the common resource of Ensembl for the genome modeling information. We have defined the chromosome 17 parts list with the following information: 1169 protein-coding genes, the numbers of proteins confidently identified by various experimental approaches as documented in GPMDB, neXtProt, PeptideAtlas, and HPA, examples of typical data sets obtained by RNASeq and proteomic studies of epithelial-derived tumor cell lines (disease proteome) and a normal proteome (peripheral mononuclear cells), reported evidence of post-translational modifications, and examples of alternative splice variants (ASVs). We have constructed a list of the 59 "missing" proteins as well as 201 proteins that have inconclusive mass spectrometric (MS) identifications. In this report we have defined a process to establish a baseline for the incorporation of new evidence on protein identification and characterization as well as related information from transcriptome analyses. This initial list of "missing" proteins will guide the selection of appropriate samples for discovery studies as well as antibody reagents. We have also illustrated the significant diversity of protein variants (including post-translational modifications, PTMs) using regions on chromosome 17 that contain important oncogenes. We emphasize the need for mandated deposition of proteomics data in public databases, the further development of improved PTM, ASV, and single nucleotide variant (SNV) databases, and the construction of Web sites that can integrate and regularly update such information. In addition, we describe the distribution of both clustered and scattered sets of protein families on the chromosome. Since chromosome 17 is rich in cancer-associated genes, we have focused on the clustering of cancer-associated genes in such genomic regions and have used the ERBB2 amplicon as an example of the value of a proteogenomic approach in which one integrates transcriptomic with proteomic information and captures evidence of coexpression through coordinated regulation.

Subjects

Chromosomes, Human, Pair 17; Genome, Human; Proteins; Proteomics; Amino Acid Sequence; Chromosomes, Human, Pair 17/genetics; Chromosomes, Human, Pair 17/metabolism; Databases, Protein; Gene Expression; Human Genome Project; Humans; Proteins/classification; Proteins/genetics; Proteins/metabolism

18.

Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium.

Gaudet, Pascale; Livstone, Michael S; Lewis, Suzanna E; Thomas, Paul D.

Brief Bioinform ; 12(5): 449-62, 2011 Sep.

Article in English | MEDLINE | ID: mdl-21873635

ABSTRACT

The goal of the Gene Ontology (GO) project is to provide a uniform way to describe the functions of gene products from organisms across all kingdoms of life and thereby enable analysis of genomic data. Protein annotations are either based on experiments or predicted from protein sequences. Since most sequences have not been experimentally characterized, most available annotations need to be based on predictions. To make inferences as accurate as possible, the GO Consortium's Reference Genome Project is using an explicit evolutionary framework to infer annotations of proteins from a broad set of genomes from experimental annotations in a semi-automated manner. Most components in the pipeline, such as selection of sequences, building multiple sequence alignments and phylogenetic trees, retrieving experimental annotations and depositing inferred annotations, are fully automated. However, the most crucial step in our pipeline relies on software-assisted curation by an expert biologist. This curation tool, Phylogenetic Annotation and INference Tool (PAINT), helps curators to infer annotations among members of a protein family. PAINT allows curators to make precise assertions as to when functions were gained and lost during evolution and record the evidence (e.g. experimentally supported GO annotations and phylogenetic information including orthology) for those assertions. In this article, we describe how we use PAINT to infer protein function in a phylogenetic context with emphasis on its strengths, limitations and guidelines. We also discuss specific examples showing how PAINT annotations compare with those generated by other highly used homology-based methods.

Subjects

Genomics/methods; Molecular Sequence Annotation/methods; Phylogeny; Proteins/chemistry; Databases, Genetic; Genome; Proteins/genetics

19.

dictyBase update 2011: web 2.0 functionality and the initial steps towards a genome portal for the Amoebozoa.

Gaudet, Pascale; Fey, Petra; Basu, Siddhartha; Bushmanova, Yulia A; Dodson, Robert; Sheppard, Kerry A; Just, Eric M; Kibbe, Warren A; Chisholm, Rex L.

Nucleic Acids Res ; 39(Database issue): D620-4, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21087999

ABSTRACT

dictyBase (http://www.dictybase.org), the model organism database for Dictyostelium, aims to provide the broad biomedical research community with well integrated, high quality data and tools for Dictyostelium discoideum and related species. dictyBase houses the complete genome sequence, ESTs, and the entire body of literature relevant to Dictyostelium. This information is curated to provide accurate gene models and functional annotations, with the goal of fully annotating the genome to provide a 'reference genome' in the Amoebozoa clade. We highlight several new features in the present update: (i) new annotations; (ii) improved interface with web 2.0 functionality; (iii) the initial steps towards a genome portal for the Amoebozoa; (iv) ortholog display; and (v) the complete integration of the Dicty Stock Center with dictyBase.

Subjects

Databases, Genetic; Dictyostelium/genetics; Amoebozoa/genetics; Genome, Protozoan; Internet; Molecular Sequence Annotation; Protozoan Proteins/chemistry; Protozoan Proteins/genetics; Systems Integration

20.

Towards BioDBcore: a community-defined information specification for biological databases.

Gaudet, Pascale; Bairoch, Amos; Field, Dawn; Sansone, Susanna-Assunta; Taylor, Chris; Attwood, Teresa K; Bateman, Alex; Blake, Judith A; Bult, Carol J; Cherry, J Michael; Chisholm, Rex L; Cochrane, Guy; Cook, Charles E; Eppig, Janan T; Galperin, Michael Y; Gentleman, Robert; Goble, Carole A; Gojobori, Takashi; Hancock, John M; Howe, Douglas G; Imanishi, Tadashi; Kelso, Janet; Landsman, David; Lewis, Suzanna E; Mizrachi, Ilene Karsch; Orchard, Sandra; Ouellette, B F Francis; Ranganathan, Shoba; Richardson, Lorna; Rocca-Serra, Philippe; Schofield, Paul N; Smedley, Damian; Southan, Christopher; Tan, Tin Wee; Tatusova, Tatiana; Whetzel, Patricia L; White, Owen; Yamasaki, Chisato.

Nucleic Acids Res ; 39(Database issue): D7-10, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21097465

ABSTRACT

The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.

Subjects

Databases, Factual/standards; Information Dissemination
