Results 1 - 20 of 43
1.
Cell ; 181(2): 460-474.e14, 2020 04 16.
Article in English | MEDLINE | ID: mdl-32191846

ABSTRACT

Plants are foundational for global ecological and economic systems, but most plant proteins remain uncharacterized. Protein interaction networks often suggest protein functions and open new avenues to characterize genes and proteins. We therefore systematically determined protein complexes from 13 plant species of scientific and agricultural importance, greatly expanding the known repertoire of stable protein complexes in plants. By using co-fractionation mass spectrometry, we recovered known complexes, confirmed complexes predicted to occur in plants, and identified previously unknown interactions conserved over 1.1 billion years of green plant evolution. Several novel complexes are involved in vernalization and pathogen defense, traits critical for agriculture. We also observed plant analogs of animal complexes with distinct molecular assemblies, including a megadalton-scale tRNA multi-synthetase complex. The resulting map offers a cross-species view of conserved, stable protein assemblies shared across plant cells and provides a mechanistic, biochemical framework for interpreting plant genetics and mutant phenotypes.


Subject(s)
Plant Proteins/genetics , Plant Proteins/metabolism , Protein Interaction Maps/physiology , Mass Spectrometry/methods , Plants/genetics , Plants/metabolism , Protein Interaction Mapping/methods , Proteomics/methods
2.
Cell ; 150(5): 1068-81, 2012 Aug 31.
Article in English | MEDLINE | ID: mdl-22939629

ABSTRACT

Cellular processes often depend on stable physical associations between proteins. Despite recent progress, knowledge of the composition of human protein complexes remains limited. To close this gap, we applied an integrative global proteomic profiling approach, based on chromatographic separation of cultured human cell extracts into more than one thousand biochemical fractions that were subsequently analyzed by quantitative tandem mass spectrometry, to systematically identify a network of 13,993 high-confidence physical interactions among 3,006 stably associated soluble human proteins. Most of the 622 putative protein complexes we report are linked to core biological processes and encompass both candidate disease genes and unannotated proteins to inform on mechanism. Strikingly, whereas larger multiprotein assemblies tend to be more extensively annotated and evolutionarily conserved, human protein complexes with five or fewer subunits are far more likely to be functionally unannotated or restricted to vertebrates, suggesting more recent functional innovations.


Subject(s)
Multiprotein Complexes/analysis , Protein Interaction Maps , Proteins/chemistry , Proteomics/methods , Humans , Tandem Mass Spectrometry
3.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38600664

ABSTRACT

Small open reading frames (smORFs) are known to play varied roles in essential biological pathways and to affect human health in conditions ranging from diabetes to tumorigenesis. Predicting smORFs in silico is a prerequisite for processing omics data. Here, we propose a smORF-coding-potential-predicting framework, sOCP, which provides functions for constructing a model that predicts novel smORFs in a given species. The sOCP model constructed for human was based on in-frame features and the nucleotide bias around the start codon; this small feature subset proved sufficient and avoided the overfitting problems of more complicated models. It achieved better prediction metrics than previous methods and correlated closely with experimental evidence in a heterogeneous dataset. The model was also applied to Rattus norvegicus, where it exhibited satisfactory performance. We then scanned the human genome for smORFs with ATG and non-ATG start codons and generated a database containing about a million novel smORFs with coding potential. Around 72 000 smORFs are located in lncRNA regions of the genome. The smORF-encoded peptides may be involved in biological pathways rare for canonical proteins, including glucocorticoid catabolic process and the prokaryotic defense system. Our work provides a model and database for human smORF investigation and a convenient tool for further smORF prediction in other species.


Subject(s)
Genome, Human , Peptides , Animals , Humans , Rats , Open Reading Frames , Peptides/genetics , Proteins/genetics
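
Illustrative note on entry 3: the genome scan for smORFs with ATG and non-ATG start codons can be pictured with a minimal sketch. This is not the sOCP implementation. The function name `find_smorfs`, the set of near-cognate start codons, and the length limits are all assumptions chosen for illustration, and only the forward strand is scanned.

```python
# Hypothetical sketch: enumerate candidate small ORFs on the forward strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumed start codons (ATG plus a few near-cognate codons); not taken from the paper.
START_CODONS = {"ATG", "CTG", "GTG", "TTG"}


def find_smorfs(seq, min_codons=10, max_codons=100, starts=START_CODONS):
    """Return (start, end, orf) tuples for ORFs within the codon-length limits.

    Length counts coding codons and excludes the stop codon. Every in-frame
    start codon is reported, so nested ORFs sharing a stop codon may appear.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] not in starts:
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                n_codons = (j - i) // 3
                if min_codons <= n_codons <= max_codons:
                    hits.append((i, j + 3, seq[i:j + 3]))
                break
    return hits


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 12 + "TAA" + "GG"
    for start, end, orf in find_smorfs(toy):
        print(start, end, orf)
```

A real pipeline would also scan the reverse complement and then score each candidate for coding potential, which is the step the abstract attributes to the sOCP model.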
4.
PLoS Comput Biol ; 20(6): e1012208, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38900844

ABSTRACT

The apicomplexan intracellular parasite Toxoplasma gondii is a major food borne pathogen that is highly prevalent in the global population. The majority of the T. gondii proteome remains uncharacterized and the organization of proteins into complexes is unclear. To overcome this knowledge gap, we used a biochemical fractionation strategy to predict interactions by correlation profiling. To overcome the deficit of high-quality training data in non-model organisms, we complemented a supervised machine learning strategy, with an unsupervised approach, based on similarity network fusion. The resulting combined high confidence network, ToxoNet, comprises 2,063 interactions connecting 652 proteins. Clustering identifies 93 protein complexes. We identified clusters enriched in mitochondrial machinery that include previously uncharacterized proteins that likely represent novel adaptations to oxidative phosphorylation. Furthermore, complexes enriched in proteins localized to secretory organelles and the inner membrane complex, predict additional novel components representing novel targets for detailed functional characterization. We present ToxoNet as a publicly available resource with the expectation that it will help drive future hypotheses within the research community.


Subject(s)
Protein Interaction Maps , Protozoan Proteins , Toxoplasma , Toxoplasma/metabolism , Protozoan Proteins/metabolism , Protozoan Proteins/chemistry , Protein Interaction Maps/physiology , Computational Biology , Protein Interaction Mapping/methods , Proteome/metabolism , Databases, Protein , Machine Learning , Cluster Analysis
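
Illustrative note on entry 4: correlation profiling rests on the idea that proteins in the same complex co-elute across biochemical fractions. The sketch below shows only that core idea, using a plain Pearson correlation with a fixed cutoff. It is a deliberate simplification and not the ToxoNet method, which the abstract describes as supervised machine learning combined with similarity network fusion. The names `pearson` and `coelution_pairs`, the threshold, and the toy profiles are assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-elution signal
    return cov / sqrt(var_x * var_y)


def coelution_pairs(profiles, threshold=0.9):
    """Return (protein_a, protein_b, r) for pairs whose profiles correlate.

    `profiles` maps a protein ID to its abundance in each fraction.
    The threshold is an arbitrary illustrative cutoff.
    """
    pairs = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r >= threshold:
            pairs.append((a, b, r))
    return pairs


if __name__ == "__main__":
    # Toy elution profiles over six fractions (made-up values).
    toy = {
        "P1": [0, 1, 5, 1, 0, 0],
        "P2": [0, 2, 10, 2, 0, 0],
        "P3": [0, 0, 0, 1, 5, 1],
    }
    for a, b, r in coelution_pairs(toy):
        print(a, b, round(r, 3))
```

In practice a single correlation score is a weak predictor on its own, which is why studies like this one combine several similarity measures and train or fuse them against reference complexes.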
5.
J Proteome Res ; 23(1): 368-376, 2024 01 05.
Article in English | MEDLINE | ID: mdl-38006349

ABSTRACT

The low-molecular-weight proteins (LMWP) in serum and plasma are related to various human diseases and can be valuable biomarkers. A small open reading frame-encoded peptide (SEP) is one kind of LMWP, which has been found to function in many bioprocesses and has also been found in human blood, making it a potential biomarker. The detection of LMWP by a mass spectrometry (MS)-based proteomic assay is often inhibited by the wide dynamic range of serum/plasma protein abundance. Nanoparticle protein coronas are a newly emerging protein enrichment method. To analyze SEPs in human serum, we have developed a protocol integrated with nanoparticle protein coronas and liquid chromatography (LC)/MS/MS. With three nanoparticles, TiO2, Fe3O4@SiO2, and Fe3O4@SiO2@TiO2, we identified 164 new SEPs in the human serum sample. Fe3O4@SiO2 and a nanoparticle mixture obtained the maximum number and the largest proportion of identified SEPs, respectively. Compared with acetonitrile-based extraction, nanoparticle protein coronas can cover more small proteins and SEPs. The magnetic nanoparticle is also fit for high-throughput parallel protein separation before LC/MS. This method is fast, efficient, reproducible, and easy to operate in 96-well plates and centrifuge tubes, which will benefit the research on SEPs and biomarkers.


Subject(s)
Nanoparticles , Protein Corona , Humans , Proteomics/methods , Tandem Mass Spectrometry , Open Reading Frames , Silicon Dioxide , Peptides/analysis , Blood Proteins/chemistry , Biomarkers
6.
Plant Physiol ; 191(3): 1535-1545, 2023 03 17.
Article in English | MEDLINE | ID: mdl-36548962

ABSTRACT

Cyanobacteria are among the essential life forms in the biosphere, and research on them has grown remarkably for decades. Biological functions in organisms are often accomplished through protein-protein interactions (PPIs), which help to regulate interacting proteins or organize them into an integral machine. However, the study of PPIs in cyanobacteria lags far behind that in mammals, and the available data have not been integrated for ease of use. Thus, we built CyanoMapDB (http://www.cyanomapdb.msbio.pro/), a database providing cyanobacterial PPIs with experimental evidence, consisting of 52,304 PPIs among 6,789 proteins from 23 cyanobacterial species. We collected available data from UniProt, STRING, and IntAct, and mined numerous PPIs from co-fractionation MS data in cyanobacteria. The integrated data are accessible in CyanoMapDB, enabling users to easily query proteins of interest, investigate interacting proteins with evidence from different sources, and obtain a visual network of the target protein. We believe that CyanoMapDB will promote research on cyanobacteria and plants.


Subject(s)
Cyanobacteria , Protein Interaction Mapping , Animals , Databases, Protein , Proteins/metabolism , Cyanobacteria/genetics , Cyanobacteria/metabolism , Mammals/metabolism
7.
Mol Cell Proteomics ; 21(4): 100224, 2022 04.
Article in English | MEDLINE | ID: mdl-35288331

ABSTRACT

The filamentous cyanobacterium Anabaena sp. PCC 7120 can differentiate into heterocysts to fix atmospheric nitrogen. During cell differentiation, cellular morphology and gene expression undergo a series of significant changes. To uncover the mechanisms responsible for these alterations, we built protein-protein interaction (PPI) networks for these two cell types by cofractionation coupled with mass spectrometry. We predicted 280 and 215 protein complexes, with 6322 and 2791 high-confidence PPIs in vegetative cells and heterocysts, respectively. Most of the proteins in both types of cells presented similar elution profiles, whereas the elution peaks of 438 proteins showed significant changes. We observed that some well-known complexes recruited new members in heterocysts, such as ribosomes, diflavin flavoprotein, and cytochrome c oxidase. Photosynthetic complexes, including photosystem I, photosystem II, and phycobilisome, remained in both vegetative cells and heterocysts for electron transfer and energy generation. Besides that, PPI data also reveal new functions of proteins. For example, the hypothetical protein Alr4359 was found to interact with FraH and Alr4119 in heterocysts and was located on heterocyst poles, thereby influencing the diazotrophic growth of filaments. The overexpression of Alr4359 suspended heterocyst formation and altered the pigment composition and filament length. This work demonstrates the differences in protein assemblies and provides insight into physiological regulation during cell differentiation.


Subject(s)
Anabaena , Gene Expression Regulation, Bacterial , Anabaena/genetics , Anabaena/metabolism , Bacterial Proteins/metabolism , Biology , Cell Differentiation
8.
Proc Natl Acad Sci U S A ; 118(17)2021 04 27.
Article in English | MEDLINE | ID: mdl-33875586

ABSTRACT

Coordinated beating is crucial for the function of multiple cilia. However, the molecular mechanism is poorly understood. Here, we characterize a conserved ciliary protein CYB5D1 with a heme-binding domain and a cordon-bleu ubiquitin-like domain. Mutation or knockdown of Cyb5d1 in zebrafish impaired coordinated ciliary beating in the otic vesicle and olfactory epithelium. Similarly, the two flagella of an insertional mutant of the CYB5D1 ortholog in Chlamydomonas (Crcyb5d1) showed an uncoordinated pattern due to a defect in the cis-flagellum. Biochemical analyses revealed that CrCYB5D1 is a radial spoke stalk protein that binds heme only under oxidizing conditions. Lack of CrCYB5D1 resulted in a reductive shift in flagellar redox state and slowing down of the phototactic response. Treatment of Crcyb5d1 with oxidants restored coordinated flagellar beating. Taken together, these data suggest that CrCYB5D1 may integrate environmental and intraciliary signals and regulate the redox state of cilia, which is crucial for the coordinated beating of multiple cilia.


Subject(s)
Cilia/metabolism , Cilia/physiology , Cytochromes b5/metabolism , Animals , Axoneme/metabolism , Chlamydomonas/metabolism , Chlamydomonas/physiology , Cytochromes b5/physiology , Dyneins/metabolism , Flagella/metabolism , Flagella/physiology , Heme-Binding Proteins/metabolism , Heme-Binding Proteins/physiology , Microtubules/metabolism , Mutation , Zebrafish/metabolism
9.
Proteomics ; 23(12): e2200473, 2023 06.
Article in English | MEDLINE | ID: mdl-36947710

ABSTRACT

Nostoc flagelliforme, a terrestrial cyanobacterium widespread in arid and semi-arid areas, has long been known for its outstanding adaptability to extremely dry conditions. This microorganism is able to recover biological activity within hours after months in an anhydrobiotic state, which has attracted investigation through proteomic analysis. Beyond the canonical proteome, microproteins encoded by small ORFs (smORFs) have recently been regarded as indispensable participants in metabolic processes. However, the involvement of smORFs in N. flagelliforme remains unknown. Here we first constructed a smORF database for N. flagelliforme using bioinformatic prediction, resulting in 6072 novel smORFs. LC-MS/MS analysis was then applied to identify the expression patterns of microproteins and to find smORFs whose encoded microproteins play a role during rehydration. In total, 18 novel microproteins were mined based on a smORF searching strategy combined with three proteomic assays, of which five were annotated as ribosomal proteins, one as an RNA polymerase subunit, and one as acetohydroxy acid isomeroreductase. We also suggested possible functions of smORFs according to their expression patterns and discovered two neighboring, homologous smORFs. These results expand our knowledge of smORF-encoded microproteins and their relation to the stress response of extremophilic microorganisms.


Subject(s)
Nostoc , Proteomics , Humans , Open Reading Frames , Tandem Mass Spectrometry , Nostoc/genetics , Nostoc/metabolism , Fluid Therapy , Micropeptides
10.
J Proteome Res ; 22(9): 2814-2826, 2023 Sep 01.
Article in English | MEDLINE | ID: mdl-37500539

ABSTRACT

The early development of zebrafish (Danio rerio) is a complex and dynamic physiological process involving cell division, differentiation, and movement. Genomic and transcriptomic techniques have been widely used to study the embryonic development of zebrafish. However, proteomic research, which examines the proteins that directly execute functions, remains relatively sparse. In this work, we applied label-free quantitative proteomics to explore protein profiles during zebrafish embryogenesis, and a total of 5961 proteins were identified at 10 stages of early zebrafish development. The identified proteins were divided into 11 modules according to weighted gene coexpression network analysis (WGCNA), and the characteristics of the modules differed significantly. For example, mitochondria-related functions were enriched in early development. Primordial germ cell-related proteins were identified at the 4-cell stage, while eye development dominated at 5 days post fertilization (dpf). By combining our data with published transcriptomics data, we discovered some proteins that may be involved in activating zygotic genes. In addition, 137 novel proteins were identified. This study comprehensively analyzed the dynamic processes in the embryonic development of zebrafish from the perspective of proteomics and provides solid data support for further understanding of the molecular mechanisms of its development.

11.
J Proteome Res ; 22(4): 1172-1180, 2023 04 07.
Article in English | MEDLINE | ID: mdl-36924315

ABSTRACT

The incidence rate of atrial fibrillation (AF) has remained high in recent years. Despite intensive efforts to study the pathologic changes of AF, the molecular mechanism of disease development remains unclear. Microproteins are ribosomally translated gene products of small open reading frames (sORFs) and have been found to perform crucial biological functions, yet they have received little attention and remain poorly characterized in AF research. In this work, we recruited 65 AF patients and 65 healthy subjects for microproteomic profiling. By differential analysis and cross-validation between independent datasets, a total of 4 microproteins were identified as significantly different, including 3 annotated ones and 1 novel one. Additionally, we established diagnostic models with either microproteins or global proteins by machine learning methods and found that the model with microproteins achieved excellent performance comparable to that of the model with global proteins. Our results confirm the abnormal expression of microproteins in AF and may provide new perspectives for studying the mechanism of AF.


Subject(s)
Atrial Fibrillation , Humans , Proteins/genetics , RNA , Micropeptides
12.
Genomics ; 114(5): 110444, 2022 09.
Article in English | MEDLINE | ID: mdl-35933072

ABSTRACT

Small open reading frames (smORFs) have been acknowledged as important contributors to organism function from bacteria to higher eukaryotes. However, smORFs in green algae have been little investigated, despite the importance of these organisms in ecology and evolution. We applied bioinformatic analysis, ribosome profiling, and small peptide proteomics to provide a genome-wide, high-confidence smORF database for the model green alga Chlamydomonas reinhardtii. The whole genome was first screened to mine potential coding smORFs. Conservation analysis, ribosome profiling, and proteomics data were then processed to identify conserved smORFs and generate translation evidence. The combination of procedures resulted in 2014 smORFs that might exist in the C. reinhardtii genome. The expression of smORFs under Cd treatment suggested that two smORFs might participate in redox reactions, three in inorganic phosphate transport, and one in DNA repair under stress. Our study built a genome-wide database for C. reinhardtii, providing target smORFs for further research.


Subject(s)
Chlamydomonas reinhardtii , Cadmium , Chlamydomonas reinhardtii/genetics , Open Reading Frames , Peptides/genetics , Phosphates
13.
Proteomics ; 22(15-16): e2100312, 2022 08.
Article in English | MEDLINE | ID: mdl-35384297

ABSTRACT

Accumulating evidence has shown that a large number of short open reading frames (sORFs) also have the ability to encode proteins. The discovery of sORFs opens up a new research area, leading to the identification and functional study of sORF encoded peptides (SEPs) at the omics level. Besides bioinformatics prediction and ribosomal profiling, mass spectrometry (MS) has become a significant tool as it directly detects the sequence of SEPs. Though MS-based proteomics methods have proved to be effective for qualitative and quantitative analysis of SEPs, the detection of SEPs is still a great challenge due to their low abundance and short sequence. To illustrate the progress in method development, we described and discussed the main steps of large-scale proteomics identification of SEPs, including SEP extraction and enrichment, MS detection, data processing and quality control, quantification, and function prediction and validation methods.


Subject(s)
Peptides , Proteomics , Computational Biology , Open Reading Frames , Peptides/analysis , Proteins , Proteomics/methods
14.
J Proteome Res ; 21(4): 1052-1060, 2022 04 01.
Article in English | MEDLINE | ID: mdl-35199523

ABSTRACT

Microproteins are generated from small open reading frames and perform various vital biological functions. As an essential biological event of eukaryotic cells, the cell cycle is involved in cell replication and division. Despite this tight regulation, the microproteins associated with cell cycle regulation have remained unclear. Utilizing a combination of bottom-up and top-down proteomics, we analyzed microproteins at specific cell cycle stages of Hep3B cells. A total of 657 microproteins were identified under three cell cycle stages, including 151 in the G0/G1 stage, 163 in the S stage, and 132 in the G2/M stage. The annotation of these microproteins showed their cell cycle-specific functions, such as translation, nuclear assembly, chromatin organization, and the G2/M transition of the mitotic cell cycle. Meanwhile, more than 50% of identified microproteins were ncRNA-encoded. These nonannotated novel microproteins contain several functional domains, such as the nucleoside diphosphate kinase domain, the high mobility group domain, and the DNA-binding domain, suggesting potential functions of these novel microproteins in specific cell cycle stages. This study presents a large-scale profile of microproteins at different cell cycle stages in Hep3B cells and may provide new perspectives on the regulatory mechanism of the cell cycle. Liquid chromatography-mass spectrometry data were deposited to ProteomeXchange using the identifier PXD030286.


Subject(s)
Proteomics , Cell Cycle , Chromatography, Liquid , Humans , Mass Spectrometry , Open Reading Frames , Proteomics/methods
15.
J Proteome Res ; 21(4): 1114-1123, 2022 04 01.
Article in English | MEDLINE | ID: mdl-35227063

ABSTRACT

Short open reading frame-encoded peptides (SEPs) are microproteins with less than 100 amino acids that play an essential role in the growth and development of organisms. There are plenty of short open reading frames in Drosophila melanogaster that potentially code polypeptides. We chose 11 time points during the life cycle of Drosophila to investigate microproteins, particularly those related to development. Finally, we identified a total of 410 microproteins, of which 27 were noncoding RNA-encoded proteins. Of the 410 microproteins, 74 were expressed in all stages from embryo to adults, whereas 300 microproteins were only found in one or two time points. Approximately, one-third of the microproteins were not reported previously and 44 were obtained from de novo sequencing, validated by synthetic peptides. These microproteins are related to the main bioprocesses of growth and development, such as multicellular organism reproduction, postmating behavior, and oviposition. Over half of the microproteins have predicted functional domains and are conserved across species, suggesting that these microproteins have critical functions in fly development. This work enriches the D. melanogaster proteome and provides a significant data resource for growth and development research.


Subject(s)
Drosophila melanogaster , Peptides , Amino Acids , Animals , Drosophila melanogaster/genetics , Open Reading Frames , Peptides/genetics , Proteome/genetics
16.
J Proteome Res ; 21(8): 1939-1947, 2022 08 05.
Article in English | MEDLINE | ID: mdl-35838590

ABSTRACT

Small open reading frame-encoded peptides (SEPs) are microproteins with a length of 100 amino acids or less, which may play a critical role in maintaining cell homeostasis under stress. Therefore, we used mass spectrometry-based proteomics to explore microproteins potentially involved in cellular stress responses in Saccharomyces cerevisiae. A total of 225 microproteins with 1920 unique peptides were identified under six culture conditions: normal, oxidation, starvation, ultraviolet radiation, heat shock, and heat shock with starvation. Among these microproteins, we found 70 SEPs with 75 unique peptides. The annotated microproteins are involved in stress-related processes, such as cell redox reactions, cell wall modification, protein folding and degradation, and DNA damage repair. It suggests that SEPs may also play similar functions under stress conditions. For example, SEP IP_008057, translated from a short coding sequence of YJL159W, may play a role in heat shock. This study identified stress-responsive SEPs in S. cerevisiae and provided valuable information to determine the functions of these proteins, which enrich the genome and proteome of S. cerevisiae and show clues to improving the stress tolerance of S. cerevisiae.


Subject(s)
Saccharomyces cerevisiae Proteins , Saccharomyces cerevisiae , Open Reading Frames , Peptides/chemistry , Proteome/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics , Ultraviolet Rays
17.
Nat Methods ; 16(8): 737-742, 2019 08.
Article in English | MEDLINE | ID: mdl-31308550

ABSTRACT

Protein complexes are key macromolecular machines of the cell, but their description remains incomplete. We and others previously reported an experimental strategy for global characterization of native protein assemblies based on chromatographic fractionation of biological extracts coupled to precision mass spectrometry analysis (chromatographic fractionation-mass spectrometry, CF-MS), but the resulting data are challenging to process and interpret. Here, we describe EPIC (elution profile-based inference of complexes), a software toolkit for automated scoring of large-scale CF-MS data to define high-confidence multi-component macromolecules from diverse biological specimens. As a case study, we used EPIC to map the global interactome of Caenorhabditis elegans, defining 612 putative worm protein complexes linked to diverse biological processes. These included novel subunits and assemblies unique to nematodes that we validated using orthogonal methods. The open source EPIC software is freely available as a Jupyter notebook packaged in a Docker container (https://hub.docker.com/r/baderlab/bio-epic/).


Subject(s)
Caenorhabditis elegans Proteins/metabolism , Caenorhabditis elegans/metabolism , Multiprotein Complexes/isolation & purification , Multiprotein Complexes/metabolism , Protein Interaction Mapping , Proteome/analysis , Software , Animals , Caenorhabditis elegans Proteins/isolation & purification
18.
Nature ; 525(7569): 339-44, 2015 Sep 17.
Article in English | MEDLINE | ID: mdl-26344197

ABSTRACT

Macromolecular complexes are essential to conserved biological processes, but their prevalence across animals is unclear. By combining extensive biochemical fractionation with quantitative mass spectrometry, here we directly examined the composition of soluble multiprotein complexes among diverse metazoan models. Using an integrative approach, we generated a draft conservation map consisting of more than one million putative high-confidence co-complex interactions for species with fully sequenced genomes that encompasses functional modules present broadly across all extant animals. Clustering reveals a spectrum of conservation, ranging from ancient eukaryotic assemblies that have probably served cellular housekeeping roles for at least one billion years, ancestral complexes that have accrued contemporary components, and rarer metazoan innovations linked to multicellularity. We validated these projections by independent co-fractionation experiments in evolutionarily distant species, affinity purification and functional analyses. The comprehensiveness, centrality and modularity of these reconstructed interactomes reflect their fundamental mechanistic importance and adaptive value to animal cell systems.


Subject(s)
Evolution, Molecular , Multiprotein Complexes/chemistry , Multiprotein Complexes/metabolism , Protein Interaction Maps , Animals , Datasets as Topic , Humans , Protein Interaction Mapping , Reproducibility of Results , Systems Biology , Tandem Mass Spectrometry
19.
Biochem J ; 477(19): 3833-3838, 2020 10 16.
Article in English | MEDLINE | ID: mdl-32969463

ABSTRACT

Post-translational modifications play important roles in mediating protein functions in a wide variety of cellular events in vivo. HEMK2-TRMT112 heterodimer has been reported to be responsible for both histone lysine methylation and eukaryotic release factor 1 (eRF1) glutamine methylation. However, how HEMK2-TRMT112 complex recognizes and catalyzes eRF1 glutamine methylation is largely unknown. Here, we present two structures of HEMK2-TRMT112, with one bound to SAM and the other bound with SAH and methylglutamine (Qme). Structural analyses of the post-catalytic complex, complemented by mass spectrometry experiments, indicate that the HEMK2 utilizes a specific pocket to accommodate the substrate glutamine and catalyzes the subsequent methylation. Therefore, our work not only throws light on the protein glutamine methylation mechanism, but also reveals the dual activity of HEMK2 by catalyzing the methylation of both Lys and Gln residues.


Subject(s)
Glutamine/chemistry , Methyltransferases/chemistry , Site-Specific DNA-Methyltransferase (Adenine-Specific)/chemistry , Glutamine/metabolism , Humans , Methylation , Methyltransferases/metabolism , Protein Structure, Quaternary , Site-Specific DNA-Methyltransferase (Adenine-Specific)/metabolism
20.
Int J Mol Sci ; 22(11)2021 May 22.
Article in English | MEDLINE | ID: mdl-34067398

ABSTRACT

Small open reading frames (sORFs) have translational potential to produce peptides that play essential roles in various biological processes. Nevertheless, many sORF-encoded peptides (SEPs) are still on the prediction level. Here, we construct a strategy to analyze SEPs by combining top-down and de novo sequencing to improve SEP identification and sequence coverage. With de novo sequencing, we identified 1682 peptides mapping to 2544 human sORFs, which were all first characterized in this work. Two-thirds of these new sORFs have reading frame shifts and use a non-ATG start codon. The top-down approach identified 241 human SEPs, with high sequence coverage. The average length of the peptides from the bottom-up database search was 19 amino acids (AA); from de novo sequencing, it was 9 AA; and from the top-down approach, it was 25 AA. The longer peptide positively boosts the sequence coverage, more efficiently distinguishing SEPs from the known gene coding sequence. Top-down has the advantage of identifying peptides with sequential K/R or high K/R content, which is unfavorable in the bottom-up approach. Our method can explore new coding sORFs and obtain highly accurate sequences of their SEPs, which can also benefit future function research.


Subject(s)
Open Reading Frames/genetics , Peptides/genetics , Amino Acid Sequence , Amino Acids/genetics , Cell Line, Tumor , Codon, Initiator/genetics , Humans , Proteomics/methods