1.
Lab Invest ; 104(6): 102069, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38670317

ABSTRACT

Tissue gene expression studies are impacted by biological and technical sources of variation, which can be broadly classified into wanted and unwanted variation. The latter, if not addressed, results in misleading biological conclusions. Methods have been proposed to reduce unwanted variation, such as normalization and batch correction. A more accurate understanding of all causes of variation could significantly improve the ability of these methods to remove unwanted variation while retaining variation corresponding to the biological question of interest. We used 17,282 samples from 49 human tissues in the Genotype-Tissue Expression data set (v8) to investigate patterns and causes of expression variation. Transcript expression was transformed to z-scores, and only the most variable 2% of transcripts were evaluated and clustered based on coexpression patterns. Clustered gene sets were assigned to different biological or technical causes based on histologic appearances and metadata elements. We identified 522 variable transcript clusters (median: 11 per tissue) among the samples. Of these, 63% were confidently explained, 16% were likely explained, 7% were low confidence explanations, and 14% had no clear cause. Histologic analysis annotated 46 clusters. Other common causes of variability included sex, sequencing contamination, immunoglobulin diversity, and compositional tissue differences. Less common biological causes included death interval (Hardy score), disease status, and age. Technical causes included blood draw timing and harvesting differences. Many of the causes of variation in bulk tissue expression were identifiable in the Tabula Sapiens data set of single-cell expression. This is among the largest explorations of the underlying sources of tissue expression variation. It uncovered expected and unexpected causes of variable gene expression and demonstrated the utility of matched histologic specimens. 
It further demonstrated the value of acquiring meaningful tissue harvesting metadata elements to use for improved normalization, batch correction, and analysis of both bulk and single-cell RNA-seq data.
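
The z-score transformation and variable-transcript selection described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the use of population variance on the untransformed values as the variability measure, and the input layout (a dict mapping transcript name to per-sample expression values) are all assumptions made for illustration. The abstract does not state how variability was ranked.

```python
from statistics import mean, pstdev, pvariance


def zscore_rows(expr):
    """Convert each transcript's expression values to z-scores across samples."""
    out = {}
    for transcript, values in expr.items():
        mu = mean(values)
        sd = pstdev(values)
        # A transcript with no variation has an undefined z-score; use zeros.
        out[transcript] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return out


def top_variable(expr, fraction=0.02):
    """Return the names of the most variable transcripts (top `fraction`).

    Assumption: variability is ranked by population variance of the
    untransformed values, because z-scored rows all have unit variance.
    """
    ranked = sorted(expr, key=lambda t: pvariance(expr[t]), reverse=True)
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]
```

Ranking is done before the z-score step in this sketch; clustering by coexpression would then operate on the z-scored rows of the selected transcripts.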

2.
NAR Genom Bioinform ; 6(1): lqad112, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38213836

ABSTRACT

Altered open chromatin regions, impacting gene expression, are a feature of some human disorders. We discovered it is possible to detect global changes in genomically related adjacent gene co-expression within single-cell RNA sequencing (scRNA-seq) data. We built a software package to generate and test non-randomness using 'Brooklyn plots' to identify the percent of genes significantly co-expressed from the same chromosome in ∼10 MB intervals across the genome. These plots establish an expected low baseline of co-expression in scRNA-seq from most cell types, but, as seen in dilated cardiomyopathy cardiomyocytes, altered patterns of open chromatin appear. These may relate to larger regions of transcriptional bursting, observable in single-cell but not bulk datasets.
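
The quantity plotted in this abstract, the percent of significantly co-expressed genes that lie on the same chromosome within roughly 10 megabases, can be sketched in Python as follows. This is an illustration only, not the published package: the function name, the `positions` mapping of gene name to (chromosome, start coordinate), and the assumption that the list of significantly co-expressed partners has already been computed elsewhere are all hypothetical.

```python
def percent_adjacent_partners(gene, partners, positions, window=10_000_000):
    """Percent of a gene's co-expressed partners located nearby on its chromosome.

    `partners` is assumed to be a precomputed list of genes significantly
    co-expressed with `gene`; `positions` maps gene -> (chromosome, start_bp).
    """
    if not partners:
        return 0.0
    chrom, start = positions[gene]
    nearby = sum(
        1
        for p in partners
        if positions[p][0] == chrom and abs(positions[p][1] - start) <= window
    )
    return 100.0 * nearby / len(partners)
```

Computing this value for every gene in genomic order and plotting the results would give a genome-wide profile; the statistical test for non-randomness mentioned in the abstract is not reproduced here.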

4.
Gigascience ; 11, 2022 08 25.
Article in English | MEDLINE | ID: mdl-36007182

ABSTRACT

BACKGROUND: An incomplete picture of the expression distribution of microRNAs (miRNAs) across human cell types has long hindered our understanding of this important regulatory class of RNA. With the continued increase in available public small RNA sequencing datasets, there is an opportunity to more fully understand the general distribution of miRNAs at the cell level. RESULTS: From the NCBI Sequence Read Archive, we obtained 6,054 human primary cell datasets and processed 4,184 of them through the miRge3.0 small RNA sequencing alignment software. This dataset was curated down, through shared miRNA expression patterns, to 2,077 samples from 196 unique cell types derived from 175 separate studies. Of 2,731 putative miRNAs listed in miRBase (v22.1), 2,452 (89.8%) were detected. Among reasonably expressed miRNAs, 108 were designated as cell specific/near specific, 59 as infrequent, 52 as frequent, 54 as near ubiquitous, and 50 as ubiquitous. The complexity of cellular microRNA expression estimates recapitulates tissue expression patterns and informs on the miRNA composition of plasma. CONCLUSIONS: This study represents the most complete reference, to date, of miRNA expression patterns by primary cell type. The data are available through the human cellular microRNAome track at the UCSC Genome Browser (https://genome.ucsc.edu/cgi-bin/hgHubConnect) and an R/Bioconductor package (https://bioconductor.org/packages/microRNAome/).


Subject(s)
MicroRNAs , Software , Genome , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Sequence Alignment , Sequence Analysis, RNA
5.
ACS Omega ; 7(10): 8246-8257, 2022 Mar 15.
Article in English | MEDLINE | ID: mdl-35309442

ABSTRACT

Malaria is a vector-borne disease. It is caused by Plasmodium parasites. Plasmodium yoelii is a rodent model parasite, primarily used for studying parasite development in liver cells and vectors. To better understand parasite biology, we carried out a high-throughput-based proteomic analysis of P. yoelii. From the same mass spectrometry (MS)/MS data set, we also captured several post-translationally modified peptides by following a bioinformatics analysis without any prior enrichment. Further, we carried out a proteogenomic analysis, which resulted in improvements to some of the existing gene models along with the identification of several novel genes. Analysis of proteome and post-translational modifications (PTMs) together resulted in the identification of 3124 proteins. The identified PTMs were found to be enriched in mitochondrial metabolic pathways. Subsequent bioinformatics analysis provided an insight into proteins associated with metabolic regulatory mechanisms. Among these, the tricarboxylic acid (TCA) cycle and the isoprenoid synthesis pathway are found to be essential for parasite survival and drug resistance. The proteogenomic analysis discovered 43 novel protein-coding genes. The availability of an in-depth proteomic landscape of a malaria pathogen model will likely facilitate further molecular-level investigations on pre-erythrocytic stages of malaria.

7.
OMICS ; 25(8): 525-536, 2021 08.
Article in English | MEDLINE | ID: mdl-34255573

ABSTRACT

Alzheimer's disease (AD) is a leading cause of dementia and a neurodegenerative disease. Proteomics and post-translational modification (PTM) analyses offer new opportunities for a comprehensive understanding of the pathophysiology of the brain in AD. We report here multiple PTMs in patients with AD, harnessing publicly available proteomics data from nine brain regions and at three different Braak stages of disease progression. Specifically, we identified 7190 peptides with PTMs, corresponding to 2545 proteins from brain regions with intermediate tangles, and 6864 peptides with PTMs corresponding to 2465 proteins from brain regions with severe tangles. A total of 103 proteins with PTMs were expressed uniquely in regions with intermediate and severe tangles compared to those with no tangles. Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis suggested the association of these proteins with AD progression through platelet activation. These modified proteins were also found to be enriched for the tricarboxylic acid (TCA) cycle, respiratory electron cycle, and detoxification of reactive oxygen species. The multi-PTM data reported here contribute to our understanding of the neurobiology of AD and highlight the prospects of omics systems science research in neurodegenerative diseases. The present study provides a region-wise classification for the proteins with PTMs along with their differential expression patterns, providing insights into the localization of these proteins upon modification. The catalog of multi-PTMs identified in the context of AD from different brain regions provides a unique platform for generating newer hypotheses in understanding the putative role of specific PTMs in AD pathogenesis.


Subject(s)
Alzheimer Disease , Neurodegenerative Diseases , Alzheimer Disease/genetics , Brain , Data Mining , Humans , Protein Processing, Post-Translational , Proteomics
8.
NAR Genom Bioinform ; 3(3): lqab068, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34308351

ABSTRACT

MicroRNAs and tRFs are classes of small non-coding RNAs, known for their roles in translational regulation of genes. Advances in next-generation sequencing (NGS) have enabled high-throughput small RNA-seq studies, which require robust alignment pipelines. Our laboratory previously developed miRge and miRge2.0 as flexible tools to process sequencing data for annotation of miRNAs and other small-RNA species and further predict novel miRNAs using a support vector machine approach. Although miRge2.0 is a leading analysis tool in terms of speed with unique quantifying and annotation features, it has a few limitations. We present miRge3.0, which provides additional features along with compatibility with newer versions of Cutadapt and Python. The revisions of the tool include the ability to process Unique Molecular Identifiers (UMIs) to account for PCR duplicates while quantifying miRNAs in the datasets, the correction of erroneous single base substitutions in miRNAs with miREC, and an accurate mirGFF3-formatted isomiR tool. miRge3.0 also has speed improvements benchmarked to miRge2.0, Chimira and sRNAbench. Finally, miRge3.0 output integrates into other packages for a streamlined analysis process and provides a cross-platform Graphical User Interface (GUI). In conclusion, miRge3.0 is our third generation small RNA-seq aligner with improvements in speed, versatility and functionality over earlier iterations.
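
The idea of using UMIs to account for PCR duplicates, mentioned in this abstract, can be illustrated with a short Python sketch. It is a simplified illustration and not miRge3.0's implementation: the function name and the input layout (an iterable of (UMI, read sequence) pairs) are hypothetical, and UMIs are compared by exact match with no tolerance for sequencing errors.

```python
from collections import defaultdict


def umi_collapsed_counts(reads):
    """Count reads per sequence, counting each distinct UMI only once.

    Reads that share both sequence and UMI are treated as PCR duplicates
    of a single original molecule (exact-match assumption).
    """
    umis_by_seq = defaultdict(set)
    for umi, seq in reads:
        umis_by_seq[seq].add(umi)
    return {seq: len(umis) for seq, umis in umis_by_seq.items()}
```

With this scheme, three reads of the same sequence carrying only two distinct UMIs contribute a count of two rather than three.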

9.
Skelet Muscle ; 11(1): 13, 2021 05 17.
Article in English | MEDLINE | ID: mdl-34001262

ABSTRACT

BACKGROUND: Skeletal muscle myofibers can be separated into functionally distinct cell types that differ in gene and protein expression. Current single cell expression data is generally based upon single nucleus RNA, rather than whole myofiber material. We examined if a whole-cell flow sorting approach could be applied to perform single cell RNA-seq (scRNA-seq) in a single muscle type. METHODS: We performed deep, whole cell, scRNA-seq on intact and fragmented skeletal myofibers from the mouse fast-twitch flexor digitorum brevis muscle utilizing a flow-gated method of large cell isolation. We performed deep sequencing of 763 intact and fragmented myofibers. RESULTS: Quality control metrics across the different gates indicated only 171 of these cells were optimal, with a median read count of 239,252 and an average of 12,098 transcripts per cell. scRNA-seq identified three clusters of myofibers (a slow/fast 2A cluster and two fast 2X clusters). Comparison to a public skeletal nuclear RNA-seq dataset demonstrated a diversity in transcript abundance by method. RISH validated multiple genes across fast and slow twitch skeletal muscle types. CONCLUSION: This study introduces and validates a method to isolate intact skeletal muscle myofibers to generate deep expression patterns and expands the known repertoire of fiber-type-specific genes.


Subject(s)
Muscle, Skeletal , Muscular Diseases , Animals , Cell Separation , Foot , Mice , Sequence Analysis, RNA
10.
J Proteome Res ; 20(1): 888-894, 2021 01 01.
Article in English | MEDLINE | ID: mdl-33251806

ABSTRACT

Skeletal muscle myofibers have differential protein expression resulting in functionally distinct slow- and fast-twitch types. While certain protein classes are well-characterized, the depth of all proteins involved in this process is unknown. We utilized the Human Protein Atlas (HPA) and the HPASubC tool to classify mosaic expression patterns of staining across 49,600 unique tissue microarray (TMA) images using a visual proteomic approach. We identified 2164 proteins with potential mosaic expression, of which 1605 were categorized as "likely" or "real." This list included both well-known fiber-type-specific and novel proteins. A comparison of the 1605 mosaic proteins with a mass spectrometry (MS)-derived proteomic dataset of single human muscle fibers led to the assignment of 111 proteins to fiber types. We additionally used a multiplexed immunohistochemistry approach, a multiplexed RNA-ISH approach, and STRING v11 to further assign or suggest fiber types of newly characterized mosaic proteins. This visual proteomic analysis of mature skeletal muscle myofibers greatly expands the known repertoire of twitch-type-specific proteins.


Subject(s)
Muscle Fibers, Slow-Twitch , Muscular Diseases , Humans , Muscle Fibers, Fast-Twitch , Muscle, Skeletal , Proteomics
11.
OMICS ; 24(12): 743-755, 2020 12.
Article in English | MEDLINE | ID: mdl-33275529

ABSTRACT

Plant omics is an emerging field of systems science and offers the prospects of evidence-based evaluation of traditional herbal medicines in human diseases. To this end, the powdered root of Yashtimadhu (Glycyrrhiza glabra L.), commonly known as liquorice, is frequently used in Indian Ayurvedic medicine with an eye to neuroprotection but its target proteins, mechanisms of action, and metabolites remain to be determined. Using a metabolomics and network pharmacology approach, we identified 98,097 spectra from positive and negative polarities that matched to ∼1600 known metabolites. These metabolites belong to terpenoids, alkaloids, and flavonoids, including both novel and previously reported active metabolites such as glycyrrhizin, glabridin, liquiritin, and other terpenoid saponins. Novel metabolites were also identified such as quercetin glucosides, coumarin derivatives, beta-carotene, and asiatic acid, which were previously not reported in relation to liquorice. Metabolite-protein interaction-based network pharmacology analyses enriched 107 human proteins, which included dopamine, serotonin, and acetylcholine neurotransmitter receptors among other regulatory proteins. Pathway analysis highlighted the regulation of signaling kinases, growth factor receptors, cell cycle, and inflammatory pathways. In vitro validation confirmed the regulation of cell cycle, MAPK1/3, PI3K/AKT pathways by liquorice. The present data-driven, metabolomics and network pharmacology study paves the way for further translational clinical research on neuropharmacology of liquorice and other traditional medicines.


Subject(s)
Glycyrrhiza/metabolism , Metabolomics , Plants, Medicinal/metabolism , Plants/metabolism , Computational Biology/methods , Metabolome , Metabolomics/methods , Plant Extracts/chemistry , Plant Extracts/metabolism , Plant Extracts/pharmacology
12.
F1000Res ; 9: 344, 2020.
Article in English | MEDLINE | ID: mdl-33274046

ABSTRACT

Cancer genome sequencing studies have revealed a number of variants in coding regions of several genes. Some of these coding variants play an important role in activating specific pathways that drive proliferation. Coding variants present on cancer cell surfaces by the major histocompatibility complex serve as neo-antigens and result in immune activation. The success of immune therapy in patients is attributed to neo-antigen load on cancer cell surfaces. However, which coding variants are expressed at the protein level cannot be predicted based on genomic data. Complementing genomic data with proteomic data can potentially reveal coding variants that are expressed at the protein level. However, identification of variant peptides using mass spectrometry data is still a challenging task due to the lack of an appropriate tool that integrates genomic and proteomic data analysis pipelines. To overcome this problem, and for the ease of the biologists, we have developed a graphical user interface (GUI)-based tool called CusVarDB. We integrated a variant calling pipeline to generate a sample-specific variant protein database from next-generation sequencing datasets. We validated the tool with triple negative breast cancer cell line datasets and identified 423, 408, 386 and 361 variant peptides from BT474, MDMAB157, MFM223 and HCC38 datasets, respectively.


Subject(s)
Computational Biology , Databases, Protein , High-Throughput Nucleotide Sequencing , Software , Humans , Proteomics
13.
Sci Rep ; 9(1): 18793, 2019 12 11.
Article in English | MEDLINE | ID: mdl-31827134

ABSTRACT

Epidermal growth factor receptor (EGFR) targeted therapies have shown limited efficacy in head and neck squamous cell carcinoma (HNSCC) patients despite EGFR overexpression. Identifying molecular mechanisms associated with acquired resistance to EGFR-TKIs such as erlotinib remains an unmet need and a therapeutic challenge. In this study, we employed an integrated multi-omics approach to delineate mechanisms associated with acquired resistance to erlotinib by carrying out whole exome sequencing, quantitative proteomic and phosphoproteomic profiling. We observed amplification of several genes including AXL kinase and transcription factor YAP1 resulting in protein overexpression. We also observed expression of constitutively active mutant MAP2K1 (p.K57E) in erlotinib resistant SCC-R cells. An integrated analysis of genomic, proteomic and phosphoproteomic data revealed alterations in the MAPK pathway and its downstream targets in SCC-R cells. We demonstrate that erlotinib-resistant cells are sensitive to MAPK pathway inhibition. This study revealed multiple genetic, proteomic and phosphoproteomic alterations associated with erlotinib resistant SCC-R cells. Our data indicate that therapeutic targeting of the MAPK pathway is an effective strategy for treating erlotinib-resistant HNSCC tumors.


Subject(s)
Antineoplastic Agents/therapeutic use , Erlotinib Hydrochloride/therapeutic use , MAP Kinase Kinase 1/antagonists & inhibitors , Protein Kinase Inhibitors/therapeutic use , Squamous Cell Carcinoma of Head and Neck/drug therapy , Cell Line, Tumor , Datasets as Topic , Drug Delivery Systems , Drug Resistance, Neoplasm/genetics , Epithelial-Mesenchymal Transition , Genomics , Humans , Metabolic Networks and Pathways , Phenotype , Proteomics , Squamous Cell Carcinoma of Head and Neck/enzymology , Whole Genome Sequencing
14.
Indian J Pathol Microbiol ; 62(4): 529-536, 2019.
Article in English | MEDLINE | ID: mdl-31611435

ABSTRACT

BACKGROUND: In recent years, high-throughput omics technologies have been widely used globally to identify potential biomarkers and therapeutic targets in various cancers. However, apart from large consortiums such as The Cancer Genome Atlas, limited attempts have been made to mine existing datasets pertaining to cancers. METHODS AND RESULTS: In the current study, we used an omics data analysis approach wherein publicly available protein expression data were integrated to identify functionally important proteins that revealed consistent dysregulated expression in head and neck squamous cell carcinomas. Our analysis revealed members of the integrin family of proteins to be consistently altered in expression across disparate datasets. Additionally, through association evidence and network analysis, we also identified members of the laminin family to be significantly altered in head and neck cancers. Members of both integrin and laminin families are known to be involved in cell-extracellular matrix adhesion and have been implicated in tumor metastatic processes in several cancers. To this end, we carried out immunohistochemical analyses to validate the findings in a cohort (n = 50) of oral cancer cases. Laminin-111 expression (composed of LAMA1, LAMB1, and LAMC1) was found to correlate with cell differentiation in oral cancer, showing a gradual decrease from well differentiated to poorly differentiated cases. CONCLUSION: This study serves as a proof-of-principle for the mining of multiple omics datasets coupled with selection of a functionally important group of molecules to provide novel insights into tumorigenesis and cancer progression.


Subject(s)
Carcinoma, Squamous Cell/genetics , Cell Differentiation , Data Mining , Integrins/genetics , Laminin/genetics , Signal Transduction , Adult , Biomarkers, Tumor/genetics , Carcinoma, Squamous Cell/pathology , Cohort Studies , Computational Biology , Databases, Protein , Head and Neck Neoplasms/genetics , Head and Neck Neoplasms/pathology , Humans , Immunohistochemistry , Integrins/metabolism , Laminin/metabolism , Middle Aged , Proof of Concept Study
15.
OMICS ; 23(7): 350-361, 2019 07.
Article in English | MEDLINE | ID: mdl-31225774

ABSTRACT

Alzheimer's disease (AD) is a common complex disease and a major public health burden in both developed and developing countries. Postgenomic technologies such as proteomics and intelligent mining of multi-omics Big Data offer new prospects for diagnostics and therapeutics innovation for AD. In this context, it is noteworthy that mass spectrometry (MS) data are often searched against proteomics databases to unravel the identity of protein biomarkers. However, only a fraction of the MS data can be matched to known proteins, while a large portion of such raw data remains underutilized. Furthermore, the spectral data can be mined for multiple high-confidence post-translational modifications (PTMs) without a priori enrichment. Thus, AD research stands to gain by greater attention to the biological mechanisms regulated by PTMs. Protein modifications may serve as diagnostic biomarkers or as novel molecular targets for drug discovery. We report here novel PTMs discovered in relation to AD from MS/MS-based proteomic datasets. Publicly available label-free proteomics data were searched for select PTMs using SEQUEST-HT. Only high-confidence PTMs were analyzed using bioinformatics analysis. We identified 4961 unique modified peptides corresponding to 1856 proteins from AD datasets. Of these, 52 proteins were known to be involved in Alzheimer's pathway. Importantly, 3164 PTMs reported in this study are novel in the context of AD. Furthermore, protein quantification revealed expression of 13 high-abundant secretory proteins across multiple studies, which can be potentially harnessed in the future to develop biomarkers. In summary, this study identifies novel PTMs which might help develop new insights on the molecular substrates of AD and thus inform future development of novel diagnostics and treatments for this highly prevalent disease.


Subject(s)
Alzheimer Disease/metabolism , Proteome , Proteomics , Alzheimer Disease/etiology , Biomarkers , Computational Biology/methods , Databases, Protein , Gene Ontology , Humans , Neurodegenerative Diseases/etiology , Neurodegenerative Diseases/metabolism , Peptides , Protein Processing, Post-Translational , Proteomics/methods , Signal Transduction , Tandem Mass Spectrometry
16.
OMICS ; 23(6): 318-326, 2019 06.
Article in English | MEDLINE | ID: mdl-31120389

ABSTRACT

Elizabethkingia meningoseptica is a Gram-negative, rod-shaped opportunistic bacterial pathogen increasingly reported in hospital-acquired outbreaks. This bacterium is well known to thrive in the hospital environment. One of the leading causes of meningitis in pediatric and immune-compromised patients, E. meningoseptica has been noted as a "pathogen of interest" in the context of nosocomial diseases associated with device-related infections in particular. This pathogen's multidrug-resistant phenotype and attendant lack of adequate molecular mechanistic data limit the current approaches for its effective management in hospitals and public health settings. This study provides the global proteome of E. meningoseptica. The reference strain E. meningoseptica ATCC 13253 was used for proteomic analysis using high-resolution Fourier transform mass spectrometry. The study provided translational evidence for 2506 proteins of E. meningoseptica. We identified multiple metallo-β-lactamases, transcriptional regulators, and efflux transporter proteins associated with multidrug resistance. A protein Car D, which is an enzyme of the carbapenem synthesis pathway, was also discovered in E. meningoseptica. Further, the proteomics data were harnessed for refining the genome annotation. We discovered 39 novel protein-coding genes and corrected four existing translations using a proteogenomic workflow. Novel translations reported in this study enhance the molecular data on this organism, thus improving current databases. We believe that the in-depth proteomic data presented in this study offer a platform for accelerated research on this pathogen. The identification of multiple proteins, particularly those involved in drug resistance, offers new future opportunities to design novel and specific antibiotics against infections caused by E. meningoseptica.


Subject(s)
Chryseobacterium/drug effects , Chryseobacterium/metabolism , Communicable Diseases/metabolism , Proteomics/methods , Anti-Bacterial Agents/pharmacology , Humans , Microbial Sensitivity Tests
17.
J Cell Commun Signal ; 13(2): 163-177, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30666556

ABSTRACT

Gallbladder cancer (GBC) is a rare malignancy, associated with poor disease prognosis with a 5-year survival of only 20%. This has been attributed to late presentation of the disease, lack of early diagnostic markers and limited efficacy of therapeutic interventions. Elucidation of molecular events in GBC can contribute to better management of the disease by aiding in the identification of therapeutic targets. To identify aberrantly activated signaling events in GBC, tandem mass tag-based quantitative phosphoproteomic analysis of five GBC cell lines was carried out. Proline-rich Akt substrate 40 kDa (PRAS40) was one of the proteins found to be hyperphosphorylated in all the invasive GBC cell lines. Tissue microarray-based immunohistochemical labeling of phospho-PRAS40 (T246) revealed moderate to strong staining in 77% of the primary gallbladder adenocarcinoma cases. Regulation of PRAS40 activity by inhibiting its upstream kinase PIM1 resulted in a significant decrease in cell proliferation, colony forming and invasive ability of GBC cells. Our results support the role of PRAS40 phosphorylation in GBC cell survival and aggressiveness. This study also elucidates phospho-PRAS40 as a clinical marker in GBC and the role of PIM1 as a therapeutic target in GBC.

18.
Data Brief ; 20: 723-731, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30211266

ABSTRACT

This article contains data on the proteins expressed in the ovaries of Anopheles stephensi, a major vector of malaria in India. Data acquisition was performed using a high-resolution Orbitrap-Velos mass spectrometer. The acquired MS/MS data were searched against an An. stephensi protein database comprising 11,789 sequences. Overall, 4407 proteins were identified; functional analysis was performed for the identified proteins, and a protein-protein interaction map was predicted. The data provided here are also related to a published article - "Integrating transcriptomics and proteomics data for accurate assembly and annotation of genomes" (Prasad et al., 2017) [1].

19.
OMICS ; 22(8): 544-552, 2018 08.
Article in English | MEDLINE | ID: mdl-30106353

ABSTRACT

Candida tropicalis belongs to the non-albicans group of Candida, and causes epidermal, mucosal, or systemic candidiasis in immunocompromised individuals. Although the prevalence of candidiasis has increased worldwide and non-albicans Candida (NAC) are becoming more significant, there are very few studies that focus on the NAC biology. Proteins and their post-translational modifications (PTMs) are an integral aspect in the pathobiology of such medically important fungi. Previously, we had reported the largest proteomic catalog of C. tropicalis. Notably, PTMs can be identified from proteomics data without a priori enrichment for a particular PTM, thus allowing broad-scale omics analyses. In this study, we developed "PTM-Pro," a graphical user interface-based tool for identification and summary of high-confidence PTM sites based on a statistical threshold of the user's choice. We mined available proteomic data of C. tropicalis, and using PTM-Pro identified nearly 600 high-confidence PTM sites. The PTMs identified include phosphorylation of serine, threonine, and tyrosine, and acetylation, crotonylation, methylation, and succinylation of lysine. These PTMs reside on biologically significant molecules, including histones, enzymes, and transcription factors. To our knowledge, this is the first report of PTMs in C. tropicalis and lays a foundation for future investigations of C. tropicalis PTMs. In addition, PTM-Pro offers a graphical user interface tool for research on PTM sites in the field of proteomics.


Subject(s)
Candida/metabolism , Proteome/metabolism , Candida/genetics , Candida tropicalis/genetics , Candida tropicalis/metabolism , Phosphorylation , Protein Processing, Post-Translational
20.
Front Microbiol ; 9: 1314, 2018.
Article in English | MEDLINE | ID: mdl-29971057

ABSTRACT

H37Ra is a virulence-attenuated strain of Mycobacterium tuberculosis widely employed as a model to investigate virulence mechanisms. Comparative high-throughput studies have earlier correlated its avirulence to the presence of specific mutations or absence of certain proteins. However, a recent sequencing study of H37Ra has disproved several genomic differences earlier reported to be associated with virulence. This warrants further investigations on the H37Ra proteome as well. In this study, we carried out an integrated analysis of the genome, transcriptome, and proteome of H37Ra. In addition to confirming single nucleotide variations (SNVs) and insertion-deletions that were reported earlier, our study provides novel insights into the mutation spectrum in the promoter regions of 7 genes. We also provide transcriptional and proteomic evidence for 3,900 genes representing ~80% of the total predicted gene count, including 408 proteins that have not been identified previously. We identified 9 genes whose coding potential was hitherto reported to be absent in H37Ra. These include 2 putative virulence factors belonging to the ESAT-6 like family of proteins. Furthermore, proteogenomic analysis enabled us to identify 63 novel protein-coding genes and correct 25 existing gene models in the H37Ra genome. A majority of these were found to be conserved in the virulent strain H37Rv as well as in other mycobacterial species, suggesting that the differences between the virulent and avirulent strains of M. tuberculosis are not entirely dependent on the expression of certain proteins or their absence but may possibly be attributed to functional changes.
