Results 1 - 11 of 11
1.
Hum Genomics ; 12(1): 36, 2018 07 11.
Article in English | MEDLINE | ID: mdl-29996917

ABSTRACT

BACKGROUND: Germline pathogenic variants in the breast cancer type 1 susceptibility gene BRCA1 are associated with a 60% lifetime risk for breast and ovarian cancer. This overall risk estimate is for all BRCA1 variants; not all variants confer the same risk of developing disease. In cancer patients, loss of BRCA1 function in tumor tissue has been associated with an increased sensitivity to platinum agents and to poly-(ADP-ribose) polymerase (PARP) inhibitors. For clinical management of both at-risk individuals and cancer patients, it would be important that each identified genetic variant be associated with clinical significance. Unfortunately, for the vast majority of variants, the clinical impact is unknown. The availability of results from studies assessing the impact of variants on protein function may provide insight of crucial importance. RESULTS AND CONCLUSION: We have collected, curated, and structured the molecular and cellular phenotypic impact of 3654 distinct BRCA1 variants. The data were modeled in triple format, using the variant as the subject, the studied function as the object, and a predicate describing the relation between the two. Each annotation is supported by fully traceable evidence. The data were captured using standard ontologies to ensure consistency and to enhance searchability and interoperability. We have assessed the extent to which functional defects at the molecular and cellular levels correlate with the clinical interpretation of variants by ClinVar submitters. Approximately 30% of the ClinVar BRCA1 missense variants have some molecular or cellular assay available in the literature. Pathogenic variants (as assigned by ClinVar) have at least some significant functional defect in 94% of testable cases. Of the ClinVar benign variants for which the neXtProt Cancer variant portal has data, 77% show either no or mild experimental functional defects. While this does not provide evidence for clinical interpretation of variants, it may provide some guidance for variants of unknown significance, in the absence of more reliable data. The neXtProt Cancer variant portal ( https://www.nextprot.org/portals/breast-cancer ) contains over 6300 observations at the molecular and/or cellular level for BRCA1 variants.
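
The triple format described in this abstract (variant as subject, studied function as object, a predicate relating the two, plus traceable evidence) can be illustrated with a minimal Python sketch. The class name `Annotation`, the helper `by_subject`, and every value below are hypothetical placeholders for illustration; they are not the neXtProt data model or real observations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One observation in triple form, plus the evidence that supports it."""
    subject: str    # the variant (placeholder label here)
    predicate: str  # relation between the variant and the function
    obj: str        # the studied function
    evidence: str   # reference that makes the annotation traceable


def by_subject(annotations, subject):
    """Return all annotations recorded for one variant."""
    return [a for a in annotations if a.subject == subject]


# Placeholder data only; none of these are real variants or results.
annotations = [
    Annotation("variant-A", "decreases", "function-X", "ref-1"),
    Annotation("variant-A", "has-no-impact-on", "function-Y", "ref-2"),
    Annotation("variant-B", "decreases", "function-X", "ref-3"),
]

for a in by_subject(annotations, "variant-A"):
    print(a.subject, a.predicate, a.obj, "(evidence:", a.evidence + ")")
```

Keeping the evidence next to each triple is one simple way to make every statement traceable; a real knowledgebase would presumably use ontology identifiers rather than free-text labels.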


Subjects
BRCA1 Protein/genetics , Breast Neoplasms/genetics , Genetic Predisposition to Disease , Ovarian Neoplasms/genetics , Adult , Aged , BRCA1 Protein/chemistry , Breast Neoplasms/pathology , Computational Biology , Female , Genetic Variation , Germ-Line Mutation/genetics , Humans , Middle Aged , Ovarian Neoplasms/pathology , Protein Conformation
2.
Nucleic Acids Res ; 45(D1): D177-D182, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899619

ABSTRACT

The neXtProt human protein knowledgebase (https://www.nextprot.org) continues to add new content and tools, with a focus on proteomics and genetic variation data. neXtProt now has proteomics data for over 85% of the human proteins, as well as new tools tailored to the proteomics community. Moreover, the neXtProt release 2016-08-25 includes over 8000 phenotypic observations for over 4000 variations in a number of genes involved in hereditary cancers and channelopathies. These changes are presented in the current neXtProt update. All of the neXtProt data are available via our user interface and FTP site. We also provide API access and a SPARQL endpoint for more technical applications.


Subjects
Databases, Protein , Proteomics , Genetic Association Studies , Genetic Variation , Humans , Internet , Phenotype , Proteomics/methods , Software , Web Browser
3.
Nucleic Acids Res ; 43(Database issue): D764-70, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25593349

ABSTRACT

neXtProt (http://www.nextprot.org) is a human protein-centric knowledgebase developed at the SIB Swiss Institute of Bioinformatics. Focused solely on human proteins, neXtProt aims to provide a state of the art resource for the representation of human biology by capturing a wide range of data, precise annotations, fully traceable data provenance and a web interface which enables researchers to find and view information in a comprehensive manner. Since the introductory neXtProt publication, significant advances have been made on three main aspects: the representation of proteomics data, an extended representation of human variants and the development of an advanced search capability built around semantic technologies. These changes are presented in the current neXtProt update.


Subjects
Databases, Protein , Genetic Variation , Proteins/genetics , Proteomics , Cell Line , Disease/genetics , Humans , Internet , Proteome
4.
Nucleic Acids Res ; 40(Database issue): D76-83, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22139911

ABSTRACT

neXtProt (http://www.nextprot.org/) is a new human protein-centric knowledge platform. Developed at the Swiss Institute of Bioinformatics (SIB), it aims to help researchers answer questions relevant to human proteins. To achieve this goal, neXtProt is built on a corpus containing both curated knowledge originating from the UniProtKB/Swiss-Prot knowledgebase and carefully selected and filtered high-throughput data pertinent to human proteins. This article presents an overview of the database and the data integration process. We also lay out the key future directions of neXtProt that we consider the necessary steps to make neXtProt the one-stop-shop for all research projects focusing on human proteins.


Subjects
Databases, Protein , Humans , Knowledge Bases , Proteins/genetics , Proteins/metabolism , User-Computer Interface
5.
J Proteome Res ; 12(1): 293-8, 2013 Jan 04.
Article in English | MEDLINE | ID: mdl-23205526

ABSTRACT

About 5000 (25%) of the ~20400 human protein-coding genes currently lack any experimental evidence at the protein level. For many others, there is little information about their abundance, distribution, subcellular localization, interactions, or cellular functions. The aim of the HUPO Human Proteome Project (HPP, www.thehpp.org ) is to collect this information for every human protein. HPP is based on three major pillars: mass spectrometry (MS), antibody/affinity capture reagents (Ab), and a bioinformatics-driven knowledge base (KB). To meet this objective, the Chromosome-Centric Human Proteome Project (C-HPP) proposes to build this catalog chromosome-by-chromosome ( www.c-hpp.org ) by focusing primarily on proteins that currently lack MS evidence or Ab detection. These are termed "missing proteins" by the HPP consortium. The lack of observation of a protein can be due to various factors, including incorrect or incomplete gene annotation, low or restricted expression, or instability. neXtProt ( www.nextprot.org ) is a new web-based knowledge platform specific for human proteins that aims to complement UniProtKB/Swiss-Prot ( www.uniprot.org ) with detailed information obtained from carefully selected high-throughput experiments on genomic variation, post-translational modifications, as well as protein expression in tissues and cells. This article describes how neXtProt helps prioritize C-HPP efforts and integrates C-HPP results with other research efforts to create a complete human proteome catalog.


Subjects
Databases, Protein , Proteins , Proteome , Chromosomes, Human , Computational Biology , Genome, Human , Humans , Internet , Knowledge Bases , Mass Spectrometry , Protein Processing, Post-Translational , Proteins/genetics , Proteins/metabolism
6.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30576492

ABSTRACT

The development of efficient text-mining tools promises to boost the curation workflow by significantly reducing the time needed to process the literature into biological databases. We have developed a curation support tool, neXtA5, that provides a search engine coupled with an annotation system directly integrated into a biocuration workflow. neXtA5 assists curation with modules optimized for the various curation tasks: document triage, entity recognition and information extraction. Here, we describe the evaluation of neXtA5 by expert curators. We first assessed the annotations of two independent curators to provide a baseline for comparison. To evaluate the performance of neXtA5, we submitted requests and compared the neXtA5 results with the manual curation. The analysis focuses on the usability of neXtA5 to support the curation of two types of data: biological processes (BPs) and diseases (Ds). We evaluated the relevance of the papers proposed as well as the recall and precision of the suggested annotations. The evaluation of the precision of document triage by neXtA5 showed that both curators agree with neXtA5 for 67% (BP) and 63% (D) of abstracts, while curators agree on accepting or rejecting an abstract ~80% of the time. Hence, the precision of the triage system is satisfactory. For concept extraction, curators approved 35% (BP) and 25% (D) of the neXtA5 annotations. Conversely, neXtA5 successfully annotated up to 36% (BP) and 68% (D) of the terms identified by curators. The user feedback obtained in these tests highlighted the need for improvement in the ranking function of neXtA5 annotations. Therefore, we transformed the information extraction component into an annotation ranking system. This improvement results in a top precision (precision at first rank) of 59% (D) and 63% (BP). These results suggest that when considering only the first extracted entity, the current system achieves a precision comparable with expert biocurators.
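
The precision, recall and precision-at-first-rank figures quoted in this abstract follow standard definitions, which the Python sketch below illustrates. The function names and the toy inputs are assumptions made for illustration; the abstract does not describe the authors' actual evaluation code.

```python
def precision_recall(suggested, reference):
    """Precision and recall of suggested annotations against curator annotations."""
    suggested, reference = set(suggested), set(reference)
    true_positives = len(suggested & reference)
    precision = true_positives / len(suggested) if suggested else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def precision_at_first_rank(ranked_lists, references):
    """Fraction of queries whose top-ranked suggestion is in the reference set."""
    if not ranked_lists:
        return 0.0
    hits = sum(
        1
        for ranked, reference in zip(ranked_lists, references)
        if ranked and ranked[0] in reference
    )
    return hits / len(ranked_lists)


# Toy example with placeholder term labels.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
print(p, r)
print(precision_at_first_rank([["a", "b"], ["c"]], [{"a"}, {"d"}]))
```

In the abstract's terms, the share of tool annotations approved by curators corresponds to precision, and the share of curator-identified terms that the tool also found corresponds to recall.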


Subjects
Data Curation/methods , Data Mining/methods , Databases, Factual , Software , Humans
7.
Article in English | MEDLINE | ID: mdl-27374119

ABSTRACT

The rapid increase in the number of published articles poses a challenge for curated databases to remain up-to-date. To help the scientific community and database curators deal with this issue, we have developed an application, neXtA5, which prioritizes the literature for specific curation requirements. Our system, neXtA5, is a curation service composed of three main elements. The first component is a named-entity recognition module, which annotates MEDLINE over some predefined axes. This report focuses on three axes: Diseases, and the Molecular Function and Biological Process sub-ontologies of the Gene Ontology (GO). The automatic annotations are then stored in a local database, BioMed, for each annotation axis. Additional entities such as species and chemical compounds are also identified. The second component is an existing search engine, which retrieves the most relevant MEDLINE records for any given query. The third component uses the content of BioMed to generate an axis-specific ranking, which takes into account the density of named entities as stored in the BioMed database. The two ranked lists are ultimately merged using a linear combination, which has been specifically tuned to support the annotation of each axis. The fine-tuning of the coefficients is formally reported for each axis-driven search. Compared with PubMed, which is the system used by most curators, the improvement is the following: +231% for Diseases, +236% for Molecular Functions and +3153% for Biological Process when measuring the precision of the top-returned PMID (P0 or mean reciprocal rank). The current search methods significantly improve the search effectiveness of curators for three important curation axes. Further experiments are being performed to extend the curation types, in particular protein-protein interactions, which require specific relationship extraction capabilities. In parallel, user-friendly interfaces powered with a set of JSON web services are currently being implemented into the neXtProt annotation pipeline. Available at: http://babar.unige.ch:8082/neXtA5 Database URL: http://babar.unige.ch:8082/neXtA5/fetcher.jsp.
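
Merging two ranked lists with a linear combination, and scoring the result with mean reciprocal rank, can be sketched in Python as follows. This is an illustration under stated assumptions, not the neXtA5 implementation: the function names, the single weight `alpha`, the assumption that both score sets are already on a comparable scale, and the placeholder record identifiers are all hypothetical.

```python
def merge_rankings(search_scores, density_scores, alpha=0.5):
    """Rank record ids by alpha * search score + (1 - alpha) * density score.

    A record missing from one list contributes 0.0 for that list.
    Ties are broken by id so the output is deterministic.
    """
    ids = set(search_scores) | set(density_scores)
    combined = {
        rid: alpha * search_scores.get(rid, 0.0)
        + (1 - alpha) * density_scores.get(rid, 0.0)
        for rid in ids
    }
    return sorted(combined, key=lambda rid: (-combined[rid], rid))


def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1 / rank of the first relevant record (0 if none is found)."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for position, rid in enumerate(ranked, start=1):
            if rid in relevant:
                total += 1.0 / position
                break
    return total / len(ranked_lists)


# Placeholder scores for three hypothetical records.
search = {"p1": 0.9, "p2": 0.4}
density = {"p2": 1.0, "p3": 0.5}
print(merge_rankings(search, density, alpha=0.5))
print(mean_reciprocal_rank([["a", "b"], ["c", "d"]], [{"b"}, {"c"}]))
```

Tuning `alpha` separately for each annotation axis would mirror the per-axis coefficient tuning that the abstract mentions.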


Subjects
Data Curation/methods , Data Mining/methods , Electronic Data Processing/methods , MEDLINE , Search Engine/methods
8.
J Proteome Res ; 4(1): 167-74, 2005.
Article in English | MEDLINE | ID: mdl-15707372

ABSTRACT

There is growing interest in using mass spectrometry data to search genome sequences directly. Previous work by other authors demonstrated that this approach is able to correct and complement available genome annotations. We discuss the practical difficulty of searching large eukaryotic genomes with peptide ion trap tandem mass spectra of small proteins (<40 kDa). The challenging problem of automatically identifying peptides that span exon/intron boundaries is explored for the first time by using experimental data. In a human genome search, we find that roughly 30% of the peptides are missed, for various reasons, compared to a Swiss-Prot search. We show that this percentage is significantly reduced with improved parent mass accuracy. We finally provide several examples of predicted gene structures that could be improved by proteomics data, in particular by peptides spanning exon/intron boundaries.


Subjects
Genomics/methods , Proteomics/methods , Adult , Amino Acid Sequence , Databases, Protein , Exons , Female , Humans , Introns , Mass Spectrometry/methods , Peptides , Pregnancy
9.
Proteomics ; 4(7): 1977-84, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15221758

ABSTRACT

In a previous paper we introduced a novel model-based approach (OLAV) to the problem of identifying peptides via tandem mass spectrometry, for which early implementations showed promising performance. We recently further improved this performance to a remarkable level (1-2% false positive rate at 95% true positive rate) and characterized key properties of OLAV such as robustness and training set size. We present these results in a synthetic and coherent way along with detailed performance comparisons, a new scoring component making use of peptide amino acid composition, and new developments such as automatic parameter learning. Finally, we discuss the impact of OLAV on the automation of proteomics projects.
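
A figure such as "false positive rate at 95% true positive rate" is read off a score threshold, and the Python sketch below shows one generic way to compute it. This is not the OLAV scoring model: the function name `fpr_at_tpr`, the convention that higher scores mean better matches, and the toy scores are assumptions for illustration only.

```python
import math


def fpr_at_tpr(pos_scores, neg_scores, target_tpr=0.95):
    """Return (threshold, false positive rate) at a target true positive rate.

    The threshold is the highest score that still accepts at least
    target_tpr of the correct matches. Matches scoring at or above the
    threshold are accepted.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    pos_sorted = sorted(pos_scores, reverse=True)
    # Number of correct matches that must be accepted to reach the target.
    needed = max(1, math.ceil(target_tpr * len(pos_sorted)))
    threshold = pos_sorted[needed - 1]
    false_positives = sum(1 for s in neg_scores if s >= threshold)
    return threshold, false_positives / len(neg_scores)


# Toy scores: correct matches (pos) and incorrect matches (neg).
pos = [0.9, 0.8, 0.7, 0.6]
neg = [0.85, 0.5, 0.3, 0.1]
print(fpr_at_tpr(pos, neg, target_tpr=0.5))
```

Sweeping the target rate from 0 to 1 would trace out the ROC curve that this kind of performance comparison is usually summarized with.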


Subjects
Peptides/chemistry , Proteomics/methods , Algorithms , Automation , False Positive Reactions , Humans , Mass Spectrometry/methods , ROC Curve , Spectrometry, Mass, Electrospray Ionization/methods , Time Factors
10.
Proteomics ; 4(7): 2125-50, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15221774

ABSTRACT

Human blood plasma is a useful source of proteins associated with both health and disease. Analysis of human blood plasma is a challenge due to the large number of peptides and proteins present and the very wide range of concentrations. In order to identify as many proteins as possible for subsequent comparative studies, we developed an industrial-scale (2.5 liter) approach involving sample pooling for the analysis of smaller proteins (M(r) generally < ca. 40 000 and some fragments of very large proteins). Plasma from healthy males was depleted of abundant proteins (albumin and IgG), then smaller proteins and polypeptides were separated into 12 960 fractions by chromatographic techniques. Analysis of proteins and polypeptides was performed by mass spectrometry prior to and after enzymatic digestion. Thousands of peptide identifications were made, permitting the identification of 502 different proteins and polypeptides from a single pool, 405 of which are listed here. The numbers refer to chromatographically separable polypeptide entities present prior to digestion. Combining results from studies with other plasma pools we have identified over 700 different proteins and polypeptides in plasma. Relatively low abundance proteins such as leptin and ghrelin and peptides such as bradykinin, all invisible to two-dimensional gel technology, were clearly identified. Proteins of interest were synthesized by chemical methods for bioassays. We believe that this is the first time that the small proteins in human blood plasma have been separated and analyzed so extensively.


Subjects
Blood Chemical Analysis/methods , Blood Proteins/metabolism , Plasma/metabolism , Proteomics/methods , Amino Acid Sequence , Chromatography , Chromatography, Gel , Chromatography, High Pressure Liquid , Chromatography, Ion Exchange , Computational Biology , Databases as Topic , Electrophoresis, Gel, Two-Dimensional/methods , Humans , Immunoglobulin G/chemistry , Mass Spectrometry , Molecular Sequence Data , Peptides/chemistry , Proteome , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Subcellular Fractions , Time Factors , Trypsin/pharmacology
11.
Proteomics ; 4(8): 2333-51, 2004 Aug.
Article in English | MEDLINE | ID: mdl-15274127

ABSTRACT

We present an integrated proteomics platform designed for performing differential analyses. Since reproducible results are essential for comparative studies, we explain how we improved reproducibility at every step of our laboratory processes, e.g. by taking advantage of the powerful laboratory information management system we developed. The differential capacity of our platform is validated by detecting known markers in a real sample and by a spiking experiment. We introduce an innovative two-dimensional (2-D) plot for displaying identification results combined with chromatographic data. This 2-D plot is very convenient for detecting differential proteins. We also adapt standard multivariate statistical techniques to show that peptide identification scores can be used for reliable and sensitive differential studies. The interest of the protein separation approach we generally apply is justified by numerous statistics, complemented by a comparison with a simple shotgun analysis performed on a small volume sample. By introducing an automatic integration step after mass spectrometry data identification, we are able to search numerous databases systematically, including the human genome and expressed sequence tags. Finally, we explain how rigorous data processing can be combined with the work of human experts to set high quality standards, and hence obtain reliable (false positive < 0.35%) and nonredundant protein identifications.


Subjects
Body Fluids/chemistry , Gene Expression Profiling , Information Management/methods , Proteins/analysis , Proteins/chemistry , Proteomics/methods , Chromatography/instrumentation , Chromatography/methods , Computational Biology , Databases, Factual , Humans , Information Management/instrumentation , Mass Spectrometry/instrumentation , Mass Spectrometry/methods , Peptides/analysis , Proteins/genetics , Proteins/metabolism , Reproducibility of Results , User-Computer Interface