1.
Cell ; 182(1): 226-244.e17, 2020 07 09.
Article in English | MEDLINE | ID: mdl-32649875

ABSTRACT

Lung cancer in East Asia is characterized by a high percentage of never-smokers, early onset and predominant EGFR mutations. To illuminate the molecular phenotype of this demographically distinct disease, we performed a deep comprehensive proteogenomic study on a prospectively collected cohort in Taiwan, representing early stage, predominantly female, non-smoking lung adenocarcinoma. Integrated genomic, proteomic, and phosphoproteomic analysis delineated the demographically distinct molecular attributes and hallmarks of tumor progression. Mutational signature analysis revealed age- and gender-related mutagenesis mechanisms, characterized by high prevalence of APOBEC mutational signature in younger females and over-representation of environmental carcinogen-like mutational signatures in older females. A proteomics-informed classification distinguished the clinical characteristics of early stage patients with EGFR mutations. Furthermore, integrated protein network analysis revealed the cellular remodeling underpinning clinical trajectories and nominated candidate biomarkers for patient stratification and therapeutic intervention. This multi-omic molecular architecture may help develop strategies for management of early stage never-smoker lung adenocarcinoma.


Subjects
Disease Progression , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Proteogenomics , Smoking/genetics , Adenocarcinoma of Lung/genetics , Adenocarcinoma of Lung/pathology , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Carcinogens/toxicity , Cohort Studies , Cytosine Deaminase/metabolism , Asia, Eastern , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Genome, Human , Humans , Matrix Metalloproteinases/metabolism , Mutation/genetics , Principal Component Analysis
2.
J Proteome Res ; 18(12): 4124-4132, 2019 12 06.
Article in English | MEDLINE | ID: mdl-31429573

ABSTRACT

When conducting proteomics experiments to detect missing proteins and protein isoforms in the human proteome, it is desirable to use a protease that can yield more unique peptides with properties amenable for mass spectrometry analysis. Though trypsin is currently the most widely used protease, some proteins can yield only a limited number of unique peptides by trypsin digestion. Other proteases and multiple proteases have been applied in reported studies to increase the number of identified proteins and protein sequence coverage. To facilitate the selection of proteases, we developed a web-based resource, called in silico Human Proteome Digestion Map (iHPDM), which contains a comprehensive proteolytic peptide database constructed from human proteins, including isoforms, in neXtProt digested by 15 protease combinations of one or two proteases. iHPDM provides convenient functions and graphical visualizations for users to examine and compare the digestion results of different proteases. Notably, it also supports users to input filtering criteria on digested peptides, e.g., peptide length and uniqueness, to select suitable proteases. iHPDM can facilitate protease selection for shotgun proteomics experiments to identify missing proteins, protein isoforms, and single amino acid variant peptides.
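The comparison described above (digest each protein in silico, filter peptides by length, keep those that map to a single protein) can be sketched in a few lines. This is a minimal illustration only, not the iHPDM implementation: the cleavage rules are the textbook ones for trypsin and Lys-C with no missed cleavages, and the function names, the toy proteome, and the length limits are assumptions made for the example.

```python
import re

# Textbook cleavage rules (assumption: full digestion, no missed cleavages).
CLEAVAGE_RULES = {
    "trypsin": r"(?<=[KR])(?!P)",  # after K or R, but not before P
    "lysc": r"(?<=K)",             # after K
}

def digest(sequence, protease="trypsin"):
    """Return the fully digested peptides of `sequence`."""
    pieces = re.split(CLEAVAGE_RULES[protease], sequence)
    return [p for p in pieces if p]

def unique_peptides(proteome, protease, min_len=7, max_len=40):
    """Map each protein to the peptides that occur in no other protein."""
    owners = {}
    for name, seq in proteome.items():
        for pep in set(digest(seq, protease)):
            if min_len <= len(pep) <= max_len:
                owners.setdefault(pep, set()).add(name)
    result = {name: [] for name in proteome}
    for pep, names in owners.items():
        if len(names) == 1:
            result[next(iter(names))].append(pep)
    return result

# Hypothetical two-protein "proteome" that differs in one region only.
toy = {"A": "AAAKCCCCRGGGGK", "B": "AAAKDDDDRGGGGK"}
for protease in CLEAVAGE_RULES:
    print(protease, unique_peptides(toy, protease, min_len=4))
```

In the toy data, Lys-C yields longer unique peptides than trypsin because it does not cleave after R, which is the kind of difference a protease-selection resource is meant to expose.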


Subjects
Peptide Hydrolases/metabolism , Peptide Mapping/methods , Proteome/metabolism , Computer Graphics , Computer Simulation , Data Visualization , Databases, Factual , ErbB Receptors/metabolism , Humans , Internet , MAP Kinase Kinase 1/metabolism , N-Acetylhexosaminyltransferases/metabolism , Protein Isoforms/metabolism , Proteomics/methods , Receptors, Odorant/metabolism , User-Computer Interface , gamma-Glutamyltransferase/metabolism
3.
Anal Chem ; 91(15): 9403-9406, 2019 08 06.
Article in English | MEDLINE | ID: mdl-31305071

ABSTRACT

Protein and peptide identification and quantitation are essential tasks in proteomics research and involve a series of steps in analyzing mass spectrometry data. Trans-Proteomic Pipeline (TPP) provides a wide range of useful tools through its web interfaces for analyses such as sequence database search, statistical validation, and quantitation. To utilize the powerful functionality of TPP without the need for manual intervention to launch each step, we developed a software tool, called WinProphet, to create and automatically execute a pipeline for proteomic analyses. It seamlessly integrates with TPP and other external command-line programs, supporting various functionalities, including database search for protein and peptide identification, spectral library construction and search, data-independent acquisition (DIA) data analysis, and isobaric labeling and label-free quantitation. WinProphet is a standalone, installation-free tool with graphical interfaces for users to configure, manage, and automatically execute pipelines. The constructed pipelines can be exported as XML files with all of the parameter settings for reusability and portability. The executable files, user manual, and sample data sets of WinProphet are freely available at  http://ms.iis.sinica.edu.tw/COmics/Software_WinProphet.html .


Subjects
Data Analysis , Proteomics/methods , Software , User-Computer Interface , Workflow
4.
J Proteome Res ; 17(9): 2937-2952, 2018 09 07.
Article in English | MEDLINE | ID: mdl-30088773

ABSTRACT

In proteogenomic studies, many genome-annotated events, for example, single amino acid variation (SAAV) and short INDEL, are often unobserved in shotgun proteomics. Therefore, we propose an analysis pipeline called LeTE-fusion (Le, peptide length; T, theoretical values; E, experimental data) to first investigate whether peptides with certain lengths are observed more often in mass spectrometry (MS)-based proteomics, which may hinder peptide identification causing difficulty in detecting genome-annotated events. By applying LeTE-fusion on different MS-based proteome data sets, we found peptides within 7-20 amino acids are more frequently identified, possibly attributed to MS-related factors instead of proteases. We then further extended the usage of LeTE-fusion on four variant-containing-sequence data sets (SAAV-only) with various sample complexity up to the whole human proteome scale, which yields theoretically ∼70% variants observable in an ideal shotgun proteomics. However, only ∼40% of variants might be detectable in real shotgun proteomic experiments when LeTE-fusion utilizes the experimentally observed variant-site-containing wild-type peptides in PeptideAtlas to estimate the expected observable coverage of variants. Finally, we conducted a case study on HEK293 cell line with variants reported at genomic level that were also identified in shotgun proteomics to demonstrate the efficacy of LeTE-fusion on estimating expected observable coverage of variants. To the best of our knowledge, this is the first study to systematically investigate the detection limits of genome-annotated events via shotgun proteomics using such analysis pipeline.
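To make the length argument concrete, the sketch below estimates what fraction of variant sites falls inside a fully tryptic peptide of 7-20 residues, the window the abstract reports as most frequently identified. It is a simplified stand-in rather than the LeTE-fusion pipeline: it ignores missed cleavages and the PeptideAtlas-based correction, and the sequence, the variant positions, and the function names are hypothetical.

```python
import re

def tryptic_spans(sequence):
    """Return half-open (start, end) spans of fully tryptic peptides."""
    spans, start = [], 0
    # Zero-width match after K/R that is not followed by P.
    for m in re.finditer(r"(?<=[KR])(?!P)", sequence):
        if m.start() > start:
            spans.append((start, m.start()))
            start = m.start()
    if start < len(sequence):
        spans.append((start, len(sequence)))
    return spans

def observable_fraction(sequence, variant_positions, min_len=7, max_len=20):
    """Fraction of 0-based variant positions covered by a peptide in the window."""
    spans = tryptic_spans(sequence)
    hits = 0
    for pos in variant_positions:
        for s, e in spans:
            if s <= pos < e:
                if min_len <= e - s <= max_len:
                    hits += 1
                break
    return hits / len(variant_positions)

# Hypothetical protein: a 3-mer, a 9-mer and a 26-mer tryptic peptide.
seq = "MAK" + "ACDEFGHIK" + "L" * 25 + "R"
print(observable_fraction(seq, [1, 5, 20]))
```

Only the variant at position 5 sits in a peptide inside the window, so one of the three hypothetical sites counts as theoretically observable.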


Subjects
Genome, Human , Peptides/analysis , Proteogenomics/methods , Proteome/analysis , Amino Acid Sequence , Databases, Protein , Datasets as Topic , HEK293 Cells , Humans , Kidney/chemistry , Kidney/metabolism , Peptides/chemistry , Proteolysis , Proteome/genetics , Proteome/metabolism
5.
J Proteome Res ; 17(12): 4138-4151, 2018 12 07.
Article in English | MEDLINE | ID: mdl-30203655

ABSTRACT

Human embryonic stem cells (hESCs) have the capacity for self-renewal and multilineage differentiation, which are of clinical importance for regeneration medicine. Despite the significant progress of hESC study, the complete hESC proteome atlas, especially the surface protein composition, awaits delineation. According to the latest release of neXtProt database (January 17, 2018; 19 658 PE1, 2, 3, and 4 human proteins), membrane proteins present the major category (1047; 48%) among all 2186 missing proteins (MPs). We conducted a deep subcellular proteomics analysis of hESCs to identify the nuclear, cytoplasmic, and membrane proteins in hESCs and to mine missing membrane proteins in the very early cell status. To our knowledge, our study achieved the largest data set with confident identification of 11 970 unique proteins (1% false discovery rate at peptide, protein, and PSM levels), including the most-comprehensive description of 6 138 annotated membrane proteins in hESCs. Following the HPP guideline, we identified 26 gold (neXtProt PE2, 3, and 4 MPs) and 87 silver (potential MP candidates with a single unique peptide detected) MPs, of which 69 were membrane proteins, and the expression of 21 gold MPs was further verified either by multiple reaction monitoring mass spectrometry or by matching synthetic peptides in the Peptide Atlas database. Functional analysis of the MPs revealed their potential roles in the pluripotency-related pathways and the lineage- and tissue-specific differentiation processes. Our proteome map of hESCs may provide a rich resource not only for the identification of MPs in the human proteome but also for the investigation on self-renewal and differentiation of hESC. All mass spectrometry data were deposited in ProteomeXchange via jPOST with identifier PXD009840.


Subjects
Human Embryonic Stem Cells/chemistry , Membrane Proteins/analysis , Proteome/analysis , Cell Differentiation , Cell Lineage , Humans , Intracellular Membranes/chemistry , Proteomics/methods
6.
Nucleic Acids Res ; 44(W1): W575-80, 2016 Jul 08.
Article in English | MEDLINE | ID: mdl-27084943

ABSTRACT

MAGIC-web is the first web server, to the best of our knowledge, that performs both untargeted and targeted analyses of mass spectrometry-based glycoproteomics data for site-specific N-linked glycoprotein identification. The first two modules, MAGIC and MAGIC+, are designed for untargeted and targeted analysis, respectively. MAGIC is implemented with our previously proposed novel Y1-ion pattern matching method, which adequately detects Y1- and Y0-ion without prior information of proteins and glycans, and then generates in silico MS(2) spectra that serve as input to a database search engine (e.g. Mascot) to search against a large-scale protein sequence database. On top of that, the newly implemented MAGIC+ allows users to determine glycopeptide sequences using their own protein sequence file. The third module, Reports Integrator, provides the service of combining protein identification results from Mascot and glycan-related information from MAGIC-web to generate a complete site-specific protein-glycan summary report. The last module, Glycan Search, is designed for the users who are interested in finding possible glycan structures with specific numbers and types of monosaccharides. The results from MAGIC, MAGIC+ and Reports Integrator can be downloaded via provided links whereas the annotated spectra and glycan structures can be visualized in the browser. MAGIC-web is accessible from http://ms.iis.sinica.edu.tw/MAGIC-web/index.html.


Subjects
Glycoproteins/analysis , Glycoproteins/chemistry , Internet , Polysaccharides/analysis , Polysaccharides/chemistry , Software , Computer Simulation , Databases, Protein , Glycopeptides/analysis , Glycopeptides/chemistry , Humans , Mass Spectrometry , Proteomics , Search Engine , User-Computer Interface , Web Browser
7.
J Proteome Res ; 16(12): 4415-4424, 2017 12 01.
Article in English | MEDLINE | ID: mdl-28929764

ABSTRACT

To confirm the existence of missing proteins, we need to identify at least two unique peptides with length of 9-40 amino acids of a missing protein in bottom-up mass-spectrometry-based proteomic experiments. However, an identified unique peptide of the missing protein, even identified with high level of confidence, could possibly coincide with a peptide of a commonly observed protein due to isobaric substitutions, mass modifications, alternative splice isoforms, or single amino acid variants (SAAVs). Besides unique peptides of missing proteins, identified variant peptides (SAAV-containing peptides) could also alternatively map to peptides of other proteins due to the aforementioned issues. Therefore, we conducted a thorough comparative analysis on data sets in PeptideAtlas Tiered Human Integrated Search Proteome (THISP, 2017-03 release), including neXtProt (2017-01 release), to systematically investigate the possibility of unique peptides in missing proteins (PE2-4), unique peptides in dubious proteins, and variant peptides affected by isobaric substitutions, causing doubtful identification results. In this study, we considered 11 isobaric substitutions. From our analysis, we found <5% of the unique peptides of missing proteins and >6% of variant peptides became shared with peptides of PE1 proteins after isobaric substitutions.
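A short sketch can show why isobaric substitutions make a "unique" peptide ambiguous. The abstract does not list the 11 substitutions that were considered, so the code uses only three widely known cases as an assumption: I/L, N versus GG, and Q versus AG or GA. The residue masses are standard monoisotopic values; the function names and the example peptide are hypothetical.

```python
# Monoisotopic residue masses (Da) for the residues used below.
MONO = {
    "G": 57.02146, "A": 71.03711, "N": 114.04293, "Q": 128.05858,
    "I": 113.08406, "L": 113.08406, "K": 128.09496,
}

def mass(residues):
    """Sum of residue masses for a string of one-letter codes."""
    return sum(MONO[r] for r in residues)

def isobaric(a, b, tol=1e-3):
    """True when two residue strings have the same mass within `tol` Da."""
    return abs(mass(a) - mass(b)) <= tol

# Assumed example substitutions, applied in both directions.
SUBS = [("I", "L"), ("L", "I"), ("N", "GG"), ("GG", "N"),
        ("Q", "AG"), ("Q", "GA"), ("AG", "Q"), ("GA", "Q")]

def isobaric_variants(peptide):
    """All sequences reachable from `peptide` by one isobaric substitution."""
    out = set()
    for old, new in SUBS:
        i = peptide.find(old)
        while i != -1:
            out.add(peptide[:i] + new + peptide[i + len(old):])
            i = peptide.find(old, i + 1)
    return out

print(isobaric("N", "GG"), isobaric("Q", "AG"), isobaric("Q", "K"))
print(sorted(isobaric_variants("AGLK")))
```

If any variant produced this way matches a peptide of a commonly observed protein, precursor mass alone cannot tell the two apart, which is the situation the study quantifies. Q and K differ by about 0.036 Da, so they are separable only at sufficient mass accuracy and are not treated as isobaric here.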


Subjects
Peptides/analysis , Proteome/analysis , Amino Acid Sequence , Databases, Protein , Humans , Protein Isoforms , Tandem Mass Spectrometry
8.
Anal Chem ; 89(24): 13128-13136, 2017 12 19.
Article in English | MEDLINE | ID: mdl-29165996

ABSTRACT

Top-down proteomics using liquid chromatography coupled with mass spectrometry has been increasingly applied for analyzing intact proteins to study genetic variation, alternative splicing, and post-translational modifications (PTMs) of the proteins (proteoforms). However, only a few tools have been developed for charge state deconvolution, monoisotopic/average molecular weight determination and quantitation of proteoforms from LC-MS1 spectra. Though Decon2LS and MASH Suite Pro have been available to provide intraspectrum charge state deconvolution and quantitation, manual processing is still required to quantify proteoforms across multiple MS1 spectra. An automated tool for interspectrum quantitation is a pressing need. Thus, in this paper, we present a user-friendly tool, called iTop-Q (intelligent Top-down Proteomics Quantitation), that automatically performs large-scale proteoform quantitation based on interspectrum abundance in top-down proteomics. Instead of utilizing a single spectrum for proteoform quantitation, iTop-Q constructs extracted ion chromatograms (XICs) of possible proteoform peaks across adjacent MS1 spectra to calculate abundances for accurate quantitation. Notably, iTop-Q is implemented with a newly proposed algorithm, called DYAMOND, using dynamic programming for charge state deconvolution. In addition, iTop-Q performs proteoform alignment to support quantitation analysis across replicates/samples. The performance evaluations on an in-house standard data set and a public large-scale yeast lysate data set show that iTop-Q achieves highly accurate quantitation that is more consistent than intraspectrum quantitation. Furthermore, the DYAMOND algorithm is suitable for high charge state deconvolution and can distinguish shared peaks in coeluting proteoforms. iTop-Q is publicly available for download at http://ms.iis.sinica.edu.tw/COmics/Software_iTop-Q .
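For readers unfamiliar with charge state deconvolution, the arithmetic for the simplest case is sketched below. It is not the DYAMOND algorithm, whose details are not given in the abstract. It only applies the standard relation m/z = (M + z·p)/z to two peaks assumed to come from the same proteoform at adjacent charge states z and z+1; the mass value and the function names are hypothetical.

```python
PROTON = 1.007276  # mass of a proton in Da

def mz(neutral_mass, z):
    """Expected m/z of a species carrying z protons."""
    return (neutral_mass + z * PROTON) / z

def deconvolve_pair(mz_hi, mz_lo):
    """Infer charge and neutral mass from peaks at charge z (mz_hi) and z+1 (mz_lo).

    From mz_hi - mz_lo = M / (z (z + 1)) and mz_lo - p = M / (z + 1),
    the ratio (mz_lo - p) / (mz_hi - mz_lo) equals z.
    """
    z = round((mz_lo - PROTON) / (mz_hi - mz_lo))
    return z, z * (mz_hi - PROTON)

# Hypothetical proteoform mass, observed at charge states 10 and 11.
M = 16951.0
z, mass = deconvolve_pair(mz(M, 10), mz(M, 11))
print(z, round(mass, 3))
```

Real spectra contain many overlapping charge envelopes and noise, which is why a dedicated algorithm is needed to choose consistent peak sets before this formula can be applied.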


Subjects
Algorithms , Proteins/analysis , Proteomics , Chromatography, Liquid , Mass Spectrometry
9.
J Proteome Res ; 14(12): 5396-407, 2015 Dec 04.
Article in English | MEDLINE | ID: mdl-26549055

ABSTRACT

Experimental evidence at the protein level from mass spectrometry and antibody experiments is essential to characterize the human proteome. neXtProt (2014-09 release) reported 20 055 human proteins, including 16 491 proteins identified at protein level and 3564 proteins unidentified. Excluding 616 proteins at the uncertain level, 2948 proteins were regarded as missing proteins. Missing proteins were unidentified partially due to MS limitations and intrinsic properties of proteins, for example, only appearing in specific diseases or tissues. Despite such reasons, it is desirable to explore issues affecting validation of missing proteins from an "ideal" shotgun analysis of the human proteome. We thus performed in silico digestions on the human proteins to generate all in silico fully digested peptides. With these presumed peptides, we investigated the identification of proteins without any unique peptide, the effect of sequence variants on protein identification, difficulties in identifying olfactory receptors, and highly similar proteins. Among all proteins with evidence at transcript level, G protein-coupled receptors and olfactory receptors, based on InterPro classification, were the largest families of proteins and exhibited more frequent variants. To identify missing proteins, the above analyses suggested including sequence variants in protein FASTA for database searching. Furthermore, evidence of unique peptides identified from MS experiments would be crucial for experimentally validating missing proteins.


Subjects
Proteomics/methods , Amino Acid Sequence , Annexins/chemistry , Annexins/genetics , Computational Biology/methods , Computer Simulation , Databases, Protein , Genetic Variation , Humans , Hydrophobic and Hydrophilic Interactions , Mass Spectrometry , Molecular Sequence Annotation , Molecular Sequence Data , Peptide Fragments/chemistry , Peptide Fragments/genetics , Peptide Fragments/isolation & purification , Proteolysis , Proteome/chemistry , Proteome/genetics , Proteome/isolation & purification , Proteomics/statistics & numerical data , Receptors, Odorant/chemistry , Receptors, Odorant/genetics , Receptors, Odorant/isolation & purification
10.
J Proteome Res ; 14(9): 3658-69, 2015 Sep 04.
Article in English | MEDLINE | ID: mdl-26202522

ABSTRACT

Despite significant efforts in the past decade toward complete mapping of the human proteome, 3564 proteins (neXtProt, 09-2014) are still "missing proteins". Over one-third of these missing proteins are annotated as membrane proteins, owing to their relatively challenging accessibility with standard shotgun proteomics. Using nonsmall cell lung cancer (NSCLC) as a model study, we aim to mine missing proteins from disease-associated membrane proteome, which may be still largely under-represented. To increase identification coverage, we employed Hp-RP StageTip prefractionation of membrane-enriched samples from 11 NSCLC cell lines. Analysis of membrane samples from 20 pairs of tumor and adjacent normal lung tissue was incorporated to include physiologically expressed membrane proteins. Using multiple search engines (X!Tandem, Comet, and Mascot) and stringent evaluation of FDR (MAYU and PeptideShaker), we identified 7702 proteins (66% membrane proteins) and 178 missing proteins (74 membrane proteins) with PSM-, peptide-, and protein-level FDR of 1%. Through multiple reaction monitoring using synthetic peptides, we provided additional evidence of eight missing proteins including seven with transmembrane helix domains. This study demonstrates that mining missing proteins focused on cancer membrane subproteome can greatly contribute to map the whole human proteome. All data were deposited into ProteomeXchange with the identifier PXD002224.


Subjects
Membrane Proteins/chemistry , Tandem Mass Spectrometry/methods , Amino Acid Sequence , Cell Line, Tumor , Chromatography, Liquid/methods , Humans , Hydrogen-Ion Concentration , Molecular Sequence Data , Proteome
11.
J Proteome Res ; 14(9): 3415-31, 2015 Sep 04.
Article in English | MEDLINE | ID: mdl-26076068

ABSTRACT

This paper summarizes the recent activities of the Chromosome-Centric Human Proteome Project (C-HPP) consortium, which develops new technologies to identify yet-to-be annotated proteins (termed "missing proteins") in biological samples that lack sufficient experimental evidence at the protein level for confident protein identification. The C-HPP also aims to identify new protein forms that may be caused by genetic variability, post-translational modifications, and alternative splicing. Proteogenomic data integration forms the basis of the C-HPP's activities; therefore, we have summarized some of the key approaches and their roles in the project. We present new analytical technologies that improve the chemical space and lower detection limits coupled to bioinformatics tools and some publicly available resources that can be used to improve data analysis or support the development of analytical assays. Most of this paper's content has been compiled from posters, slides, and discussions presented in the series of C-HPP workshops held during 2014. All data (posters, presentations) used are available at the C-HPP Wiki (http://c-hpp.webhosting.rug.nl/) and in the Supporting Information.


Subjects
Chromosome Mapping , Proteins/genetics , Proteome , Chromatography, Liquid , Genomics , Humans , Proteins/chemistry , Tandem Mass Spectrometry
12.
Anal Chem ; 87(24): 12016-23, 2015 Dec 15.
Article in English | MEDLINE | ID: mdl-26554430

ABSTRACT

Membrane proteins are crucial targets for cancer biomarker discovery and drug development. However, in addition to the inherent challenges of hydrophobicity and low abundance, complete membrane proteome coverage of clinical specimen is usually hindered by the requirement of large amount of starting materials. Toward comprehensive membrane proteomic profiling for small amounts of samples (10 µg), we developed high-pH reverse phase (Hp-RP) combined with stop-and-go extraction tip (StageTip) technique, as a fast (∼15 min.), sensitive, reproducible, high-resolution and multiplexed fractionation method suitable for accurate quantification of the membrane proteome. This approach provided almost 2-fold enhanced detection of peptides encompassing transmembrane helix (TMH) domain, as compared with strong anion exchange (SAX) and strong cation exchange (SCX) StageTip techniques. Almost 5000 proteins (∼60% membrane proteins) can be identified in only 10 µg of membrane protein digests, showing the superior sensitivity of the Hp-RP StageTip approach. The method allowed up to 9- and 6-fold increase in the identification of unique hydrophobic and hydrophilic peptides, respectively. The Hp-RP StageTip method enabled in-depth membrane proteome profiling of 11 lung cancer cell lines harboring different EGFR mutation status, which resulted in the identification of 3983 annotated membrane proteins. This provides the largest collection of reference peptide spectral data for lung cancer membrane subproteome. Finally, relative quantification of membrane proteins between Gefitinib-resistant and -sensitive lung cancer cell lines revealed several up-regulated membrane proteins with key roles in lung cancer progression.


Subjects
Membrane Proteins/analysis , Membrane Proteins/isolation & purification , Proteomics/methods , Cell Line, Tumor , Humans , Limit of Detection , Lung Neoplasms/physiopathology , Membrane Proteins/chemistry , Membrane Proteins/genetics , Models, Biological , Mutation , Time Factors
13.
Anal Chem ; 87(4): 2143-51, 2015 Feb 17.
Article in English | MEDLINE | ID: mdl-25543920

ABSTRACT

Metabolite identification remains a bottleneck in mass spectrometry (MS)-based metabolomics. Currently, this process relies heavily on tandem mass spectrometry (MS/MS) spectra generated separately for peaks of interest identified from previous MS runs. Such a delayed and labor-intensive procedure creates a barrier to automation. Further, information embedded in MS data has not been used to its full extent for metabolite identification. Multimers, adducts, multiply charged ions, and fragments of given metabolites occupy a substantial proportion (40-80%) of the peaks of a quantitation result. However, extensive information on these derivatives, especially fragments, may facilitate metabolite identification. We propose a procedure with automation capability to group and annotate peaks associated with the same metabolite in the quantitation results of opposite modes and to integrate this information for metabolite identification. In addition to the conventional mass and isotope ratio matches, we would match annotated fragments with low-energy MS/MS spectra in public databases. For identification of metabolites without accessible MS/MS spectra, we have developed characteristic fragment and common substructure matches. The accuracy and effectiveness of the procedure were evaluated using one public and two in-house liquid chromatography-mass spectrometry (LC-MS) data sets. The procedure accurately identified 89% of 28 standard metabolites with derivative ions in the data sets. With respect to effectiveness, the procedure confidently identified the correct chemical formula of at least 42% of metabolites with derivative ions via MS/MS spectrum, characteristic fragment, and common substructure matches. The confidence level was determined according to the fulfilled identification criteria of various matches and relative retention time.


Subjects
Metabolomics/methods , Tandem Mass Spectrometry/methods , Animals , Chromatography, Liquid/methods , Diabetes Mellitus, Experimental/metabolism , Diet , Ions/analysis , Ions/metabolism , Metabolome , Mice , Rats
14.
Anal Chem ; 87(4): 2466-73, 2015 Feb 17.
Article in English | MEDLINE | ID: mdl-25629585

ABSTRACT

Glycosylation is a highly complex modification influencing the functions and activities of proteins. Interpretation of intact glycopeptide spectra is crucial but challenging. In this paper, we present a mass spectrometry-based automated glycopeptide identification platform (MAGIC) to identify peptide sequences and glycan compositions directly from intact N-linked glycopeptide collision-induced-dissociation spectra. The identification of the Y1 (peptideY0 + GlcNAc) ion is critical for the correct analysis of unknown glycoproteins, especially without prior knowledge of the proteins and glycans present in the sample. To ensure accurate Y1-ion assignment, we propose a novel algorithm called Trident that detects a triplet pattern corresponding to [Y0, Y1, Y2] or [Y0-NH3, Y0, Y1] from the fragmentation of the common trimannosyl core of N-linked glycopeptides. To facilitate the subsequent peptide sequence identification by common database search engines, MAGIC generates in silico spectra by overwriting the original precursor with the naked peptide m/z and removing all of the glycan-related ions. Finally, MAGIC computes the glycan compositions and ranks them. For the model glycoprotein horseradish peroxidase (HRP) and a 5-glycoprotein mixture, a 2- to 31-fold increase in the relative intensities of the peptide fragments was achieved, which led to the identification of 7 tryptic glycopeptides from HRP and 16 glycopeptides from the mixture via Mascot. In the HeLa cell proteome data set, MAGIC processed over a thousand MS(2) spectra in 3 min on a PC and reported 36 glycopeptides from 26 glycoproteins. Finally, a remarkable false discovery rate of 0 was achieved on the N-glycosylation-free Escherichia coli data set. MAGIC is available at http://ms.iis.sinica.edu.tw/COmics/Software_MAGIC.html .
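The triplet idea can be illustrated with a small peak-matching sketch. It is not the published Trident algorithm. It assumes singly charged, centroided peaks, takes Y1 = Y0 + GlcNAc as stated in the abstract, and additionally assumes Y2 = Y1 + GlcNAc, which the abstract does not spell out. The HexNAc residue mass (203.07937 Da) and the NH3 mass (17.02655 Da) are standard monoisotopic values; the tolerance, the peak lists, and the function name are hypothetical.

```python
HEXNAC = 203.07937  # monoisotopic mass of a GlcNAc (HexNAc) residue, Da
NH3 = 17.02655      # monoisotopic mass of ammonia, Da

def find_y1_candidates(peaks, tol=0.02):
    """Return m/z values supported as Y1 by a [Y0, Y1, Y2] or [Y0-NH3, Y0, Y1] triplet.

    Assumes singly charged fragment ions, so mass differences equal m/z differences.
    """
    def has(target):
        return any(abs(p - target) <= tol for p in peaks)

    candidates = set()
    for y0 in peaks:
        y1 = y0 + HEXNAC
        if not has(y1):
            continue
        # Either pattern confirms the Y0/Y1 pair.
        if has(y1 + HEXNAC) or has(y0 - NH3):
            candidates.add(round(y1, 4))
    return sorted(candidates)

# Hypothetical peak list with Y0 = 1000.50 plus two unrelated peaks.
peaks_a = [500.0, 1000.50, 1000.50 + HEXNAC, 1000.50 + 2 * HEXNAC, 1500.0]
# Hypothetical peak list showing the ammonia-loss pattern around Y0 = 800.0.
peaks_b = [800.0 - NH3, 800.0, 800.0 + HEXNAC]
print(find_y1_candidates(peaks_a), find_y1_candidates(peaks_b))
```

Requiring a third supporting peak is what keeps a chance 203 Da spacing between two unrelated ions from being reported as a Y0/Y1 pair.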


Subjects
Algorithms , Computational Biology , Glycopeptides/analysis , Software , Automation , Databases, Factual , Escherichia coli/chemistry , Glycopeptides/chemistry , HeLa Cells , Humans
15.
J Proteome Res ; 13(7): 3160-5, 2014 Jul 03.
Article in English | MEDLINE | ID: mdl-24831074

ABSTRACT

Following an official announcement of the Chromosome-centric Human Proteome Project (C-HPP), the Chromosome 12 (Ch12) Consortium has been established by five representative teams from five Asian countries including Thailand (Siriraj Hospital, Mahidol University), Singapore (National University of Singapore), Taiwan (Academia Sinica), Hong Kong (The Chinese University of Hong Kong), and India (Institute of Bioinformatics). We have worked closely together to extensively and systematically analyze all missing and known proteins encoded by Ch12 for their tissue/cellular/subcellular localizations. The target organs/tissues/cells include kidney, brain, gastrointestinal tissues, blood/immune cells, and stem cells. In the later phase, post-translational modifications and functional significance of Ch12-encoded proteins as well as their associations with human diseases (i.e., immune diseases, metabolic disorders, and cancers) will be defined. We have collaborated with other chromosome teams, Human Kidney and Urine Proteome Project (HKUPP), AOHUPO Membrane Proteomics Initiative, and other existing HUPO initiatives in the Biology/Disease-Based Human Proteome Project (B/D-HPP) to delineate functional roles and medical implications of Ch12-encoded proteins. The data set to be obtained from this multicountry consortium will be an important piece of the jigsaw puzzle to fulfill the missions and goals of the C-HPP and the global Human Proteome Project (HPP).


Subjects
Chromosomes, Human, Pair 12/genetics , Proteome/genetics , Chromosomes, Human, Pair 12/metabolism , Humans , Metabolic Diseases/genetics , Metabolic Diseases/metabolism , Neoplasms/genetics , Neoplasms/metabolism , Organ Specificity , Proteome/metabolism , Research Design
16.
Anal Chem ; 86(1): 685-93, 2014 Jan 07.
Article in English | MEDLINE | ID: mdl-24313913

ABSTRACT

Methodologies to enrich heterogeneous types of phosphopeptides are critical for comprehensive mapping of the under-explored phosphoproteome. Taking advantage of the distinct binding affinities of Ga(3+) and Fe(3+) for phosphopeptides, we designed a metal-directed immobilized metal ion affinity chromatography for the sequential enrichment of phosphopeptides. In Raji B cells, the sequential Ga(3+)-Fe(3+)-immobilized metal affinity chromatography (IMAC) strategy displayed a 1.5-3.5-fold superior phosphoproteomic coverage compared to single IMAC (Fe(3+), Ti(4+), Ga(3+), and Al(3+)). In addition, up to 92% of the 6283 phosphopeptides were uniquely enriched in either the first Ga(3+)-IMAC (41%) or second Fe(3+)-IMAC (51%). The complementary properties of Ga(3+) and Fe(3+) were further demonstrated through the exclusive enrichment of almost all of 1214 multiply phosphorylated peptides (99.4%) in the Ga(3+)-IMAC, whereas only 10% of 5069 monophosphorylated phosphopeptides were commonly enriched in both fractions. The application of sequential Ga(3+)-Fe(3+)-IMAC to human lung cancer tissue allowed the identification of 2560 unique phosphopeptides with only 8% overlap. In addition to the above-mentioned mono- and multiply phosphorylated peptides, this fractionation ability was also demonstrated on the basic and acidic phosphopeptides: acidophilic phosphorylation sites were predominately enriched in the first Ga(3+)-IMAC (72%), while Pro-directed (85%) and basophilic (79%) phosphorylation sites were enriched in the second Fe(3+)-IMAC. This strategy provided complementary mapping of different kinase substrates in multiple cellular pathways related to cancer invasion and metastasis of lung cancer. Given the fractionation ability and ease of tip preparation of this Ga(3+)-Fe(3+)-IMAC, we propose that this strategy allows more comprehensive characterization of the phosphoproteome both in vitro and in vivo.


Subjects
Chromatography, Affinity/methods , Metals/chemistry , Proteomics/methods , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Cell Line, Tumor , Cells, Immobilized , Humans
17.
Mol Cell Proteomics ; 11(10): 901-15, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22761399

ABSTRACT

Mutational activation of KRAS promotes various malignancies, including lung adenocarcinoma. Knowledge of the molecular targets mediating the downstream effects of activated KRAS is limited. Here, we provide the KRAS target proteins and N-glycoproteins using human bronchial epithelial cells with and without the expression of activated KRAS (KRAS(V12)). Using an OFFGEL peptide fractionation and hydrazide method combined with subsequent LTQ-Orbitrap analysis, we identified 5713 proteins and 608 N-glycosites on 317 proteins in human bronchial epithelial cells. Label-free quantitation of 3058 proteins (≥2 peptides; coefficient of variation (CV) ≤ 20%) and 297 N-glycoproteins (CV ≤ 20%) revealed the differential regulation of 23 proteins and 14 N-glycoproteins caused by activated KRAS, including 84% novel ones. An informatics-assisted IPA-Biomarker® filter analysis prioritized some of the differentially regulated proteins (ALDH3A1, CA2, CTSD, DST, EPHA2, and VIM) and N-glycoproteins (ALCAM, ITGA3, and TIMP-1) as cancer biomarkers. Further, integrated in silico analysis of microarray repository data of lung adenocarcinoma clinical samples and cell lines containing KRAS mutations showed positive mRNA fold changes (p < 0.05) for 61% of the KRAS-regulated proteins, including biomarker proteins, CA2 and CTSD. The most significant discovery of the integrated validation is the down-regulation of FABP5 and PDCD4. A few validated proteins, including tumor suppressor PDCD4, were further confirmed as KRAS targets by shRNA-based knockdown experiments. Finally, the studies on KRAS-regulated N-glycoproteins revealed structural alterations in the core N-glycans of SEMA4B in KRAS-activated human bronchial epithelial cells and functional role of N-glycosylation of TIMP-1 in the regulation of lung adenocarcinoma A549 cell invasion. 
Together, our study represents the largest proteome and N-glycoproteome data sets for HBECs, which we used to identify several novel potential targets of activated KRAS that may provide insights into KRAS-induced adenocarcinoma and have implications for both lung cancer therapy and diagnosis.
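The quantitation filter described in the abstract (at least 2 peptides and CV ≤ 20%) can be illustrated with a minimal sketch. The abstract gives the thresholds but not the computation, so the use of the sample standard deviation, the function names, and the input layout below are all assumptions made for illustration, and the example values are invented.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV (sample standard deviation / mean) as a percentage."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


def filter_quantifiable(proteins, max_cv=20.0, min_peptides=2):
    """Keep proteins that meet both thresholds.

    `proteins` is a hypothetical mapping:
        name -> (number of peptides, list of replicate intensities)
    """
    kept = {}
    for name, (n_peptides, intensities) in proteins.items():
        if n_peptides < min_peptides:
            continue  # too few peptides to quantify reliably
        if coefficient_of_variation(intensities) <= max_cv:
            kept[name] = (n_peptides, intensities)
    return kept


# Invented example values, for illustration only.
example = {
    "P1": (3, [100.0, 110.0, 90.0]),   # CV = 10%, kept
    "P2": (1, [100.0, 101.0, 99.0]),   # only 1 peptide, dropped
    "P3": (4, [100.0, 200.0, 50.0]),   # CV well above 20%, dropped
}
print(sorted(filter_quantifiable(example)))
```

Applying the peptide-count check before the CV check avoids computing a variance for proteins that would be excluded anyway.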


Subjects
Adenocarcinoma/genetics , Apoptosis Regulatory Proteins/genetics , Bronchi/metabolism , Epithelial Cells/metabolism , Fatty Acid-Binding Proteins/genetics , Gene Expression Regulation, Neoplastic , Lung Neoplasms/genetics , Proto-Oncogene Proteins/genetics , RNA-Binding Proteins/genetics , ras Proteins/genetics , Adenocarcinoma/metabolism , Adenocarcinoma/pathology , Adenocarcinoma of Lung , Apoptosis Regulatory Proteins/metabolism , Biomarkers, Tumor , Bronchi/pathology , Cell Line, Tumor , Epithelial Cells/pathology , Fatty Acid-Binding Proteins/metabolism , Glycoproteins/genetics , Glycoproteins/metabolism , Glycosylation , Humans , Lung Neoplasms/metabolism , Lung Neoplasms/pathology , Proteome/genetics , Proteome/metabolism , Proteomics , Proto-Oncogene Proteins/metabolism , Proto-Oncogene Proteins p21(ras) , RNA, Small Interfering , RNA-Binding Proteins/metabolism , Semaphorins/genetics , Semaphorins/metabolism , Tissue Inhibitor of Metalloproteinase-1/genetics , Tissue Inhibitor of Metalloproteinase-1/metabolism , ras Proteins/metabolism
18.
Int J Biol Macromol ; 259(Pt 1): 129074, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38163507

ABSTRACT

The overexpression of dual-specificity tyrosine phosphorylation-regulated kinase 1A (DYRK1A), commonly observed in neurodegenerative diseases like Alzheimer's disease (AD) and Down syndrome (DS), can induce the formation of neurofibrillary tangles (NFTs) and amyloid plaques. Hence, designing a selective DYRK1A inhibitor would result in a promising small molecule for treating neurodegenerative diseases. Developing selective inhibitors for DYRK1A has been a difficult challenge due to the highly conserved ATP-binding site of protein kinases. In this study, we employed a structure-based virtual screening (SBVS) campaign targeting DYRK1A from a database containing 1.6 million compounds. Enzymatic assays were utilized to verify inhibitory properties, confirming that Y020-3945 and Y020-3957 showed inhibitory activity towards DYRK1A. In particular, the compounds exhibited high selectivity for DYRK1A over a panel of 120 kinases, reduced the phosphorylation of tau, and reversed the tubulin polymerization for microtubule stability. Additionally, treatment with the compounds significantly reduced the secretion of inflammatory cytokines IL-6 and TNF-α activated by DYRK1A-assisted NFTs and Aβ oligomers. These identified inhibitors possess promising therapeutic potential for conditions associated with DYRK1A in neurodegenerative diseases. The results showed that Y020-3945 and Y020-3957 demonstrated structural novelty compared to known DYRK1A inhibitors, making them a valuable addition to developing potential treatments for neurodegenerative diseases.


Subjects
Alzheimer Disease , Neurodegenerative Diseases , Humans , Phosphorylation , Protein-Tyrosine Kinases/metabolism , Protein Serine-Threonine Kinases/metabolism , Alzheimer Disease/drug therapy , Alzheimer Disease/metabolism , Neurodegenerative Diseases/metabolism , Microtubules/metabolism , Tyrosine/metabolism , tau Proteins/metabolism , Protein Kinase Inhibitors/metabolism
19.
BMC Bioinformatics ; 14: 304, 2013 Oct 11.
Article in English | MEDLINE | ID: mdl-24112406

ABSTRACT

BACKGROUND: Since membrane protein structures are challenging to crystallize, computational approaches are essential for elucidating the sequence-to-structure relationships. Structural modeling of membrane proteins requires a multidimensional approach, and one critical geometric parameter is the rotational angle of transmembrane helices. Rotational angles of transmembrane helices are characterized by their folded structures and could be inferred by the hydrophobic moment; however, the folding mechanism of membrane proteins is not yet fully understood. The rotational angle of a transmembrane helix is related to the exposed surface of a transmembrane helix, since lipid exposure gives the degree of accessibility of each residue in lipid environment. To the best of our knowledge, there have been few advances in investigating whether an environment descriptor of lipid exposure could infer a geometric parameter of rotational angle. RESULTS: Here, we present an analysis of the relationship between rotational angles and lipid exposure and a support-vector-machine method, called TMexpo, for predicting both structural features from sequences. First, we observed from the development set of 89 protein chains that the lipid exposure, i.e., the relative accessible surface area (rASA) of residues in the lipid environment, generated from high-resolution protein structures could infer the rotational angles with a mean absolute angular error (MAAE) of 46.32°. More importantly, the predicted rASA from TMexpo achieved an MAAE of 51.05°, which is better than 71.47° obtained by the best of the compared hydrophobicity scales. Lastly, TMexpo outperformed the compared methods in rASA prediction on the independent test set of 21 protein chains and achieved an overall Matthews correlation coefficient, accuracy, sensitivity, specificity, and precision of 0.51, 75.26%, 81.30%, 69.15%, and 72.73%, respectively. TMexpo is publicly available at http://bio-cluster.iis.sinica.edu.tw/TMexpo.
CONCLUSIONS: TMexpo can better predict rASA and rotational angles than the compared methods. When rotational angles can be accurately predicted, free modeling of transmembrane protein structures may in turn benefit from reduced complexity, with significantly fewer packing arrangements in the ensembles. Furthermore, sequence-based prediction of both rotational angle and lipid exposure can provide essential information when high-resolution structures are unavailable and contribute to experimental design to elucidate transmembrane protein functions.
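The two evaluation measures named in this abstract can be sketched briefly. The Matthews correlation coefficient below follows its standard confusion-matrix definition. The abstract does not spell out how the mean absolute angular error is computed, so wrapping each difference onto the range 0° to 180° is an assumption, as are the function names.

```python
import math


def angular_error(pred_deg, true_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(pred_deg - true_deg) % 360.0
    return min(d, 360.0 - d)


def mean_absolute_angular_error(pred, true):
    """Average the wrapped angular error over paired predictions (assumed definition)."""
    if len(pred) != len(true) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(angular_error(p, t) for p, t in zip(pred, true)) / len(pred)


def matthews_corrcoef(tp, tn, fp, fn):
    """Standard MCC from confusion-matrix counts; returns 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Invented values, for illustration only.
print(mean_absolute_angular_error([350.0, 90.0], [10.0, 100.0]))
print(matthews_corrcoef(tp=50, tn=50, fp=0, fn=0))
```

Wrapping matters because angles of 350° and 10° are 20° apart on the circle, not 340°.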


Subjects
Computational Biology/methods , Membrane Lipids/chemistry , Membrane Proteins/chemistry , Amino Acid Sequence , Hydrophobic and Hydrophilic Interactions , Membrane Lipids/metabolism , Membrane Proteins/metabolism , Molecular Sequence Data , Protein Structure, Secondary , Support Vector Machine
20.
J Proteome Res ; 12(5): 2305-10, 2013 May 03.
Article in English | MEDLINE | ID: mdl-23560440

ABSTRACT

As spectral library searching has received increasing attention for peptide identification, constructing good decoy spectra from the target spectra is the key to correctly estimating the false discovery rate in searching against the concatenated target-decoy spectral library. Several methods have been proposed to construct decoy spectral libraries. Most of them construct decoy peptide sequences and then generate theoretical spectra accordingly. In this paper, we propose a method, called precursor-swap, which constructs decoy spectral libraries directly at the "spectrum level" without generating decoy peptide sequences by swapping the precursors of two spectra selected according to a very simple rule. Our spectrum-based method does not require additional efforts to deal with ion types (e.g., a, b or c ions), fragmentation mechanism (e.g., CID or ETD), or unannotated peaks, but preserves many spectral properties. The precursor-swap method is evaluated on different spectral libraries and the results of obtained decoy ratios show that it is comparable to other methods. Notably, it is efficient in time and memory usage for constructing decoy libraries. A software tool called Precursor-Swap-Decoy-Generation (PSDG) is publicly available for download at http://ms.iis.sinica.edu.tw/PSDG/.
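The general idea of swapping precursors at the spectrum level can be sketched as follows. The abstract does not state the rule used to select the two spectra, so the pairing here (sort by precursor m/z and pair each spectrum in the lower half with its counterpart in the upper half) is a stand-in assumption and not the PSDG rule. The `Spectrum` type and all names are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Spectrum:
    name: str
    precursor_mz: float
    peaks: Tuple[Tuple[float, float], ...]  # (fragment m/z, intensity) pairs


def precursor_swap(library: List[Spectrum]) -> List[Spectrum]:
    """Build decoys by exchanging precursor m/z between paired spectra.

    Pairing rule (an assumption for illustration): sort by precursor m/z and
    pair item i with item i + n // 2. With an odd number of spectra, the last
    one stays unpaired and produces no decoy. Peak lists are left untouched.
    """
    ordered = sorted(library, key=lambda s: s.precursor_mz)
    half = len(ordered) // 2
    decoys = []
    for i in range(half):
        a, b = ordered[i], ordered[i + half]
        decoys.append(replace(a, name="DECOY_" + a.name, precursor_mz=b.precursor_mz))
        decoys.append(replace(b, name="DECOY_" + b.name, precursor_mz=a.precursor_mz))
    return decoys


# Invented toy library, for illustration only.
targets = [
    Spectrum("s1", 400.0, ((100.0, 1.0),)),
    Spectrum("s2", 500.0, ((150.0, 2.0),)),
    Spectrum("s3", 600.0, ((200.0, 3.0),)),
    Spectrum("s4", 700.0, ((250.0, 4.0),)),
]
for decoy in precursor_swap(targets):
    print(decoy.name, decoy.precursor_mz)
```

Because only the precursor value changes, each decoy keeps its original fragment peaks, which is consistent with the abstract's point that no ion-type or fragmentation-specific handling is needed.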


Subjects
Peptide Fragments/chemistry , Software , Animals , Databases, Protein , Humans , Molecular Sequence Annotation/methods , Peptide Library , Peptide Mapping , Sequence Analysis, Protein , Tandem Mass Spectrometry