Results 1 - 20 of 61
1.
Proc Natl Acad Sci U S A ; 121(5): e2308776121, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38252831

ABSTRACT

We present a drug design strategy based on structural knowledge of protein-protein interfaces selected through virus-host coevolution and translated into highly potential small molecules. This approach is grounded on Vinland, the most comprehensive atlas of virus-human protein-protein interactions with annotation of interacting domains. From this inspiration, we identified small viral protein domains responsible for interaction with human proteins. These peptides form a library of new chemical entities used to screen for replication modulators of several pathogens. As a proof of concept, a peptide from a KSHV protein, identified as an inhibitor of influenza virus replication, was translated into a small molecule series with low nanomolar antiviral activity. By targeting the NEET proteins, these molecules turn out to be of therapeutic interest in a nonalcoholic steatohepatitis mouse model with kidney lesions. This study provides a biomimetic framework to design original chemistries targeting cellular proteins, with indications going far beyond infectious diseases.


Subject(s)
Influenza, Human , Viruses , Animals , Mice , Humans , Proteome , Peptides/pharmacology , Drug Discovery
2.
J Proteome Res ; 23(2): 532-549, 2024 02 02.
Article in English | MEDLINE | ID: mdl-38232391

ABSTRACT

Since 2010, the Human Proteome Project (HPP), the flagship initiative of the Human Proteome Organization (HUPO), has pursued two goals: (1) to credibly identify the protein parts list and (2) to make proteomics an integral part of multiomics studies of human health and disease. The HPP relies on international collaboration, data sharing, standardized reanalysis of MS data sets by PeptideAtlas and MassIVE-KB using HPP Guidelines for quality assurance, integration and curation of MS and non-MS protein data by neXtProt, plus extensive use of antibody profiling carried out by the Human Protein Atlas. According to the neXtProt release 2023-04-18, protein expression has now been credibly detected (PE1) for 18,397 of the 19,778 neXtProt predicted proteins coded in the human genome (93%). Of these PE1 proteins, 17,453 were detected with mass spectrometry (MS) in accordance with HPP Guidelines and 944 by a variety of non-MS methods. The number of neXtProt PE2, PE3, and PE4 missing proteins now stands at 1381. Achieving the unambiguous identification of 93% of predicted proteins encoded from across all chromosomes represents remarkable experimental progress on the Human Proteome parts list. Meanwhile, there are several categories of predicted proteins that have proved resistant to detection regardless of protein-based methods used. Additionally there are some PE1-4 proteins that probably should be reclassified to PE5, specifically 21 LINC entries and ∼30 HERV entries; these are being addressed in the present year. Applying proteomics in a wide array of biological and clinical studies ensures integration with other omics platforms as reported by the Biology and Disease-driven HPP teams and the antibody and pathology resource pillars. Current progress has positioned the HPP to transition to its Grand Challenge Project focused on determining the primary function(s) of every protein itself and in networks and pathways within the context of human health and disease.


Subject(s)
Antibodies , Proteome , Humans , Proteome/genetics , Proteome/analysis , Databases, Protein , Mass Spectrometry/methods , Proteomics/methods
3.
J Proteome Res ; 22(4): 1024-1042, 2023 04 07.
Article in English | MEDLINE | ID: mdl-36318223

ABSTRACT

The 2022 Metrics of the Human Proteome from the HUPO Human Proteome Project (HPP) show that protein expression has now been credibly detected (neXtProt PE1 level) for 18 407 (93.2%) of the 19 750 predicted proteins coded in the human genome, a net gain of 50 since 2021 from data sets generated around the world and reanalyzed by the HPP. Conversely, the number of neXtProt PE2, PE3, and PE4 missing proteins has been reduced by 78 from 1421 to 1343. This represents continuing experimental progress on the human proteome parts list across all the chromosomes, as well as significant reclassifications. Meanwhile, applying proteomics in a vast array of biological and clinical studies continues to yield significant findings and growing integration with other omics platforms. We present highlights from the Chromosome-Centric HPP, Biology and Disease-driven HPP, and HPP Resource Pillars, compare features of mass spectrometry and Olink and Somalogic platforms, note the emergence of translation products from ribosome profiling of small open reading frames, and discuss the launch of the initial HPP Grand Challenge Project, "A Function for Each Protein".


Subject(s)
Proteome , Proteomics , Humans , Proteome/genetics , Proteome/analysis , Databases, Protein , Mass Spectrometry/methods , Open Reading Frames , Proteomics/methods
4.
J Proteome Res ; 22(4): 1148-1158, 2023 04 07.
Article in English | MEDLINE | ID: mdl-36445260

ABSTRACT

The Chromosome-centric Human Proteome Project (C-HPP) aims at identifying the proteins as gene products encoded by the human genome, characterizing their isoforms and functions. The existence of products has now been confirmed for 93.2% of the genes at the protein level. The remaining mostly correspond to proteins of low abundance or difficult to access. Over the past years, we have significantly contributed to the identification of missing proteins in the human spermatozoa. We pursue our search in the reproductive sphere with a focus on early human embryonic development. Pluripotent cells, developing into the fetus, and trophoblast cells, giving rise to the placenta, emerge during the first weeks. This emergence is a focus of scientists working in the field of reproduction, placentation and regenerative medicine. Most knowledge has been harnessed by transcriptomic analysis. Interestingly, some genes are uniquely expressed in those cells, giving the opportunity to uncover new proteins that might play a crucial role in setting up the molecular events underlying early embryonic development. Here, we analyzed naive pluripotent and trophoblastic stem cells and discovered 4 new missing proteins, thus contributing to the C-HPP. The mass spectrometry proteomics data was deposited on ProteomeXchange under the data set identifier PXD035768.


Subject(s)
Proteome , Trophoblasts , Male , Humans , Proteome/genetics , Proteome/analysis , Mass Spectrometry , Chromosomes/chemistry , Cell Line
5.
J Proteome Res ; 20(12): 5227-5240, 2021 12 03.
Article in English | MEDLINE | ID: mdl-34670092

ABSTRACT

The 2021 Metrics of the HUPO Human Proteome Project (HPP) show that protein expression has now been credibly detected (neXtProt PE1 level) for 18 357 (92.8%) of the 19 778 predicted proteins coded in the human genome, a gain of 483 since 2020 from reports throughout the world reanalyzed by the HPP. Conversely, the number of neXtProt PE2, PE3, and PE4 missing proteins has been reduced by 478 to 1421. This represents remarkable progress on the proteome parts list. The utilization of proteomics in a broad array of biological and clinical studies likewise continues to expand with many important findings and effective integration with other omics platforms. We present highlights from the Immunopeptidomics, Glycoproteomics, Infectious Disease, Cardiovascular, Musculo-Skeletal, Liver, and Cancers B/D-HPP teams and from the Knowledgebase, Mass Spectrometry, Antibody Profiling, and Pathology resource pillars, as well as ethical considerations important to the clinical utilization of proteomics and protein biomarkers.


Subject(s)
Benchmarking , Proteome , Databases, Protein , Humans , Mass Spectrometry/methods , Proteome/analysis , Proteome/genetics , Proteomics/methods
6.
Database (Oxford) ; 2021, 2021 07 28.
Article in English | MEDLINE | ID: mdl-34318869

ABSTRACT

About 10% of human proteins have no annotated function in protein knowledge bases. A workflow to generate hypotheses for the function of these uncharacterized proteins has been developed, based on predicted and experimental information on protein properties, interactions, tissular expression, subcellular localization, conservation in other organisms, as well as phenotypic data in mutant model organisms. This workflow has been applied to seven uncharacterized human proteins (C6orf118, C7orf25, CXorf58, RSRP1, SMLR1, TMEM53 and TMEM232) in the frame of a course-based undergraduate research experience named Functionathon organized at the University of Geneva to teach undergraduate students how to use biological databases and bioinformatics tools and interpret the results. C6orf118, CXorf58 and TMEM232 were proposed to be involved in cilia-related functions; TMEM53 and SMLR1 were proposed to be involved in lipid metabolism and C7orf25 and RSRP1 were proposed to be involved in RNA metabolism and gene expression. Experimental strategies to test these hypotheses were also discussed. The results of this manual data mining study may contribute to the project recently launched by the Human Proteome Organization (HUPO) Human Proteome Project aiming to fill gaps in the functional annotation of human proteins. Database URL: http://www.nextprot.org.


Subject(s)
Data Mining , Proteome , Databases, Protein , Humans , Students , Workflow
7.
Front Cell Neurosci ; 15: 653075, 2021.
Article in English | MEDLINE | ID: mdl-33796011

ABSTRACT

Neuropathological diseases of the central nervous system (CNS) are frequently associated with impaired differentiation of the oligodendroglial cell lineage and subsequent alterations in white matter structure and dynamics. Down syndrome (DS), or trisomy 21, is the most common genetic cause for cognitive impairments and intellectual disability (ID) and is associated with a reduction in the number of neurons and oligodendrocytes, as well as with hypomyelination and astrogliosis. Recent studies mainly focused on neuronal development in DS and underestimated the role of glial cells as pathogenic players. This also relates to C21ORF91, a protein considered a key modulator of aberrant CNS development in DS. We investigated the role of C21orf91 ortholog in terms of oligodendrogenesis and myelination using database information as well as through cultured primary oligodendroglial precursor cells (OPCs). Upon modulation of C21orf91 gene expression, we found this factor to be important for accurate oligodendroglial differentiation, influencing their capacity to mature and to myelinate axons. Interestingly, C21orf91 overexpression initiates a cell population coexpressing astroglial- and oligodendroglial markers indicating that elevated C21orf91 expression levels induce a gliogenic shift towards the astrocytic lineage reflecting non-equilibrated glial cell populations in DS brains.

8.
J Proteome Res ; 19(12): 4782-4794, 2020 12 04.
Article in English | MEDLINE | ID: mdl-33064489

ABSTRACT

In the context of the Human Proteome Project, we built an inventory of 412 functionally unannotated human proteins for which experimental evidence at the protein level exists (uPE1) and which are highly expressed in tissues involved in human male reproduction. We implemented a strategy combining literature mining, bioinformatics tools to collate annotation and experimental information from specific molecular public resources, and efficient visualization tools to put these unknown proteins into their biological context (protein complexes, tissue and subcellular location, expression pattern). The gathered knowledge allowed pinpointing five uPE1 for which a function has recently been proposed and which should be updated in protein knowledge bases. Furthermore, this bioinformatics strategy allowed to build new functional hypotheses for five other uPE1s in link with phenotypic traits that are specific to male reproductive function such as ciliogenesis/flagellum formation in germ cells (CCDC112 and TEX9), chromatin remodeling (C3orf62) and spermatozoon maturation (CCDC183). We also discussed the enigmatic case of MAGEB proteins, a poorly documented cancer/testis antigen subtype. Tools used and computational outputs produced during this study are freely accessible via ProteoRE (http://www.proteore.org), a Galaxy-based instance, for reuse purposes. We propose these five uPE1s should be investigated in priority by expert laboratories and hope that this inventory and shared resources will stimulate the interest of the community of reproductive biology.


Subject(s)
Proteome , Proteomics , Computational Biology , Humans , Knowledge Bases , Male , Proteome/genetics , Reproduction
9.
Nat Commun ; 11(1): 5301, 2020 10 16.
Article in English | MEDLINE | ID: mdl-33067450

ABSTRACT

The Human Proteome Organization (HUPO) launched the Human Proteome Project (HPP) in 2010, creating an international framework for global collaboration, data sharing, quality assurance and enhancing accurate annotation of the genome-encoded proteome. During the subsequent decade, the HPP established collaborations, developed guidelines and metrics, and undertook reanalysis of previously deposited community data, continuously increasing the coverage of the human proteome. On the occasion of the HPP's tenth anniversary, we here report a 90.4% complete high-stringency human proteome blueprint. This knowledge is essential for discerning molecular processes in health and disease, as we demonstrate by highlighting potential roles the human proteome plays in our understanding, diagnosis and treatment of cancers, cardiovascular and infectious diseases.


Subject(s)
Disease/genetics , Proteome/genetics , Human Genome Project , Humans , Proteome/chemistry , Proteome/metabolism , Proteomics
10.
J Proteome Res ; 19(12): 4735-4746, 2020 12 04.
Article in English | MEDLINE | ID: mdl-32931287

ABSTRACT

According to the 2020 Metrics of the HUPO Human Proteome Project (HPP), expression has now been detected at the protein level for >90% of the 19 773 predicted proteins coded in the human genome. The HPP annually reports on progress made throughout the world toward credibly identifying and characterizing the complete human protein parts list and promoting proteomics as an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2020-01 classified 17 874 proteins as PE1, having strong protein-level evidence, up 180 from 17 694 one year earlier. These represent 90.4% of the 19 773 predicted coding genes (all PE1,2,3,4 proteins in neXtProt). Conversely, the number of neXtProt PE2,3,4 proteins, termed the "missing proteins" (MPs), was reduced by 230 from 2129 to 1899 since the neXtProt 2019-01 release. PeptideAtlas is the primary source of uniform reanalysis of raw mass spectrometry data for neXtProt, supplemented this year with extensive data from MassIVE. PeptideAtlas 2020-01 added 362 canonical proteins between 2019 and 2020 and MassIVE contributed 84 more, many of which converted PE1 entries based on non-MS evidence to the MS-based subgroup. The 19 Biology and Disease-driven B/D-HPP teams continue to pursue the identification of driver proteins that underlie disease states, the characterization of regulatory mechanisms controlling the functions of these proteins, their proteoforms, and their interactions, and the progression of transitions from correlation to coexpression to causal networks after system perturbations. And the Human Protein Atlas published Blood, Brain, and Metabolic Atlases.


Subject(s)
Proteome , Proteomics , Databases, Protein , Genome, Human , Humans , Mass Spectrometry , Proteome/genetics
11.
Nat Commun ; 11(1): 1312, 2020 03 11.
Article in English | MEDLINE | ID: mdl-32161263

ABSTRACT

The emergence of small open reading frame (sORF)-encoded peptides (SEPs) is rapidly expanding the known proteome at the lower end of the size distribution. Here, we show that the mitochondrial proteome, particularly the respiratory chain, is enriched for small proteins. Using a prediction and validation pipeline for SEPs, we report the discovery of 16 endogenous nuclear encoded, mitochondrial-localized SEPs (mito-SEPs). Through functional prediction, proteomics, metabolomics and metabolic flux modeling, we demonstrate that BRAWNIN, a 71 a.a. peptide encoded by C12orf73, is essential for respiratory chain complex III (CIII) assembly. In human cells, BRAWNIN is induced by the energy-sensing AMPK pathway, and its depletion impairs mitochondrial ATP production. In zebrafish, Brawnin deletion causes complete CIII loss, resulting in severe growth retardation, lactic acidosis and early death. Our findings demonstrate that BRAWNIN is essential for vertebrate oxidative phosphorylation. We propose that mito-SEPs are an untapped resource for essential regulators of oxidative metabolism.


Subject(s)
Electron Transport Complex III/metabolism , Mitochondria/metabolism , Mitochondrial Proteins/metabolism , Oxidative Phosphorylation , Peptides/metabolism , Zebrafish Proteins/metabolism , Acidosis, Lactic/genetics , Animals , Animals, Genetically Modified , Disease Models, Animal , Female , Gene Knockdown Techniques , Growth Disorders/genetics , Humans , Male , Metabolomics , Mitochondrial Proteins/genetics , Models, Animal , Models, Biological , Open Reading Frames/genetics , Peptides/genetics , Proteomics , Zebrafish/genetics , Zebrafish/growth & development , Zebrafish Proteins/genetics
13.
Nucleic Acids Res ; 48(D1): D328-D334, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31724716

ABSTRACT

The neXtProt knowledgebase (https://www.nextprot.org) is an integrative resource providing both data on human protein and the tools to explore these. In order to provide comprehensive and up-to-date data, we evaluate and add new data sets. We describe the incorporation of three new data sets that provide expression, function, protein-protein binary interaction, post-translational modifications (PTM) and variant information. New SPARQL query examples illustrating uses of the new data were added. neXtProt has continued to develop tools for proteomics. We have improved the peptide uniqueness checker and have implemented a new protein digestion tool. Together, these tools make it possible to determine which proteases can be used to identify trypsin-resistant proteins by mass spectrometry. In terms of usability, we have finished revamping our web interface and completely rewritten our API. Our SPARQL endpoint now supports federated queries. All the neXtProt data are available via our user interface, API, SPARQL endpoint and FTP site, including the new PEFF 1.0 format files. Finally, the data on our FTP site is now CC BY 4.0 to promote its reuse.


Subject(s)
Databases, Protein , Knowledge Bases , Humans , Internet , Mass Spectrometry , Peptides/chemistry , Protein Kinases/chemistry , Protein Kinases/metabolism , Protein Processing, Post-Translational , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Sequence Analysis, RNA , Software , Trypsin , User-Computer Interface
15.
J Proteome Res ; 18(12): 4154-4166, 2019 12 06.
Article in English | MEDLINE | ID: mdl-31581775

ABSTRACT

In 2018, we reported a hybrid pipeline that predicts protein structures with I-TASSER and function with COFACTOR. I-TASSER/COFACTOR achieved Gene Ontology (GO) high prediction accuracies of Fmax = 0.69 and 0.57 for molecular function (MF) and biological process (BP), respectively, on 100 comprehensively annotated proteins. Now we report blinded analyses of newly annotated proteins in the critical assessment of function annotation (CAFA) three function prediction challenge and in neXtProt. For CAFA3 results released in May 2019, our predictions on 267 and 912 human proteins with newly annotated MF and BP terms achieved Fmax = 0.50 and 0.42, respectively, on "No Knowledge" proteins, and 0.51 and 0.74, respectively, on "Limited Knowledge" proteins. While COFACTOR consistently outperforms simple homology-based analysis, its accuracy still depends on template availability. Meanwhile, in neXtProt 2019-01, 25 proteins acquired new function annotation through literature curation at UniProt/Swiss-Prot. Before the release of these curated results, we submitted to neXtProt blinded predictions of free-text function annotation based on predicted GO terms. For 10 of the 25, a good match of free-text or GO term annotation was obtained. These blind tests represent rigorous assessments of I-TASSER/COFACTOR. neXtProt now provides links to precomputed I-TASSER/COFACTOR predictions for proteins without function annotation to facilitate experimental planning on "dark proteins".


Subject(s)
Databases, Protein , Molecular Sequence Annotation/methods , Proteins/chemistry , Proteins/metabolism , Computational Biology/methods , Humans
16.
J Proteome Res ; 18(12): 4108-4116, 2019 12 06.
Article in English | MEDLINE | ID: mdl-31599596

ABSTRACT

The Human Proteome Organization's (HUPO) Human Proteome Project (HPP) developed Mass Spectrometry (MS) Data Interpretation Guidelines that have been applied since 2016. These guidelines have helped ensure that the emerging draft of the complete human proteome is highly accurate and with low numbers of false-positive protein identifications. Here, we describe an update to these guidelines based on consensus-reaching discussions with the wider HPP community over the past year. The revised 3.0 guidelines address several major and minor identified gaps. We have added guidelines for emerging data independent acquisition (DIA) MS workflows and for use of the new Universal Spectrum Identifier (USI) system being developed by the HUPO Proteomics Standards Initiative (PSI). In addition, we discuss updates to the standard HPP pipeline for collecting MS evidence for all proteins in the HPP, including refinements to minimum evidence. We present a new plan for incorporating MassIVE-KB into the HPP pipeline for the next (HPP 2020) cycle in order to obtain more comprehensive coverage of public MS data sets. The main checklist has been reorganized under headings and subitems, and related guidelines have been grouped. In sum, Version 2.1 of the HPP MS Data Interpretation Guidelines has served well, and this timely update to version 3.0 will aid the HPP as it approaches its goal of collecting and curating MS evidence of translation and expression for all predicted ∼20 000 human proteins encoded by the human genome.


Subject(s)
Guidelines as Topic , Mass Spectrometry/methods , Proteome , Signal Processing, Computer-Assisted , Humans , Proteomics , Societies, Scientific
17.
J Proteome Res ; 18(12): 4143-4153, 2019 12 06.
Article in English | MEDLINE | ID: mdl-31517492

ABSTRACT

Using neXtProt release 2019-01-11, we manually curated a list of 1837 functionally uncharacterized human proteins. Using OrthoList 2, we found that 270 of them have homologues in Caenorhabditis elegans, including 60 with a one-to-one orthology relationship. According to annotations extracted from WormBase, the vast majority of these 60 worm genes have RNAi experimental data or mutant alleles, but manual inspection shows that only 15% have phenotypes that could be interpreted in terms of a specific function. One third of the worm orthologs have protein-protein interaction data, and two of these interactions are conserved in humans. The combination of phenotypic, protein-protein interaction, and gene expression data provides functional hypotheses for 8 uncharacterized human proteins. Experimental validation in human or orthologs is necessary before they can be considered for annotation.


Subject(s)
Caenorhabditis elegans Proteins , Databases, Protein , Proteins/metabolism , Animals , Gene Expression , Humans , Membrane Proteins/genetics , Membrane Proteins/metabolism , Mice , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Phenotype , Protein Interaction Maps , Proteins/genetics , RNA Interference , Sequence Homology, Amino Acid
18.
J Proteome Res ; 18(12): 4098-4107, 2019 12 06.
Article in English | MEDLINE | ID: mdl-31430157

ABSTRACT

The Human Proteome Project (HPP) annually reports on progress made throughout the field in credibly identifying and characterizing the complete human protein parts list and making proteomics an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2019-01-11 contains 17 694 proteins with strong protein-level evidence (PE1), compliant with HPP Guidelines for Interpretation of MS Data v2.1; these represent 89% of all 19 823 neXtProt predicted coding genes (all PE1,2,3,4 proteins), up from 17 470 one year earlier. Conversely, the number of neXtProt PE2,3,4 proteins, termed the "missing proteins" (MPs), has been reduced from 2949 to 2129 since 2016 through efforts throughout the community, including the chromosome-centric HPP. PeptideAtlas is the source of uniformly reanalyzed raw mass spectrometry data for neXtProt; PeptideAtlas added 495 canonical proteins between 2018 and 2019, especially from studies designed to detect hard-to-identify proteins. Meanwhile, the Human Protein Atlas has released version 18.1 with immunohistochemical evidence of expression of 17 000 proteins and survival plots as part of the Pathology Atlas. Many investigators apply multiplexed SRM-targeted proteomics for quantitation of organ-specific popular proteins in studies of various human diseases. The 19 teams of the Biology and Disease-driven B/D-HPP published a total of 160 publications in 2018, bringing proteomics to a broad array of biomedical research.


Subject(s)
Databases, Protein , Proteins/metabolism , Proteome , Chromosomes, Human , Guidelines as Topic , Humans , Mass Spectrometry , Proteins/chemistry , Proteins/genetics , Proteome/genetics
19.
J Proteome Res ; 18(6): 2686-2692, 2019 06 07.
Article in English | MEDLINE | ID: mdl-31081335

ABSTRACT

Mass-spectrometry-based proteomics enables the high-throughput identification and quantification of proteins, including sequence variants and post-translational modifications (PTMs) in biological samples. However, most workflows require that such variations be included in the search space used to analyze the data, and doing so remains challenging with most analysis tools. In order to facilitate the search for known sequence variants and PTMs, the Proteomics Standards Initiative (PSI) has designed and implemented the PSI extended FASTA format (PEFF). PEFF is based on the very popular FASTA format but adds a uniform mechanism for encoding substantially more metadata about the sequence collection as well as individual entries, including support for encoding known sequence variants, PTMs, and proteoforms. The format is very nearly backward compatible, and as such, existing FASTA parsers will require little or no changes to be able to read PEFF files as FASTA files, although without supporting any of the extra capabilities of PEFF. PEFF is defined by a full specification document, controlled vocabulary terms, a set of example files, software libraries, and a file validator. Popular software and resources are starting to support PEFF, including the sequence search engine Comet and the knowledge bases neXtProt and UniProtKB. Widespread implementation of PEFF is expected to further enable proteogenomics and top-down proteomics applications by providing a standardized mechanism for encoding protein sequences and their known variations. All the related documentation, including the detailed file format specification and example files, are available at http://www.psidev.info/peff .


Subject(s)
Proteomics/standards , Humans , Information Storage and Retrieval , Mass Spectrometry , Software