ABSTRACT
INTRODUCTION: Ultradense peptide binding arrays that can probe millions of linear peptides comprising the entire proteomes of human or mouse, or hundreds of thousands of microbes, are powerful tools for studying the antibody repertoire in serum samples to understand adaptive immune responses. MOTIVATION: There are few tools for exploring high-dimensional, significant and reproducible antibody targets for ultradense peptide binding arrays at the linear peptide, epitope (grouping of adjacent peptides), and protein level across multiple samples/subjects (i.e. epitope spread or immunogenic regions of proteins) for understanding the heterogeneity of immune responses. RESULTS: We developed HERON (Hierarchical antibody binding Epitopes and pROteins from liNear peptides), an R package, which identifies immunogenic epitopes, using meta-analyses and spatial clustering techniques to explore antibody targets at various resolution and confidence levels, that can be found consistently across a specified number of samples through the entire proteome to study antibody responses for diagnostics or treatment. Our approach estimates significance values at the linear peptide (probe), epitope, and protein level to identify top candidates for validation. We test the performance of predictions on all three levels using correlation between technical replicates and comparison of epitope calls on two datasets, which shows HERON's competitiveness in estimating false discovery rates and finding general and sample-level regions of interest for antibody binding. AVAILABILITY: The HERON R package is available at Bioconductor https://bioconductor.org/packages/release/bioc/html/HERON.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
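The abstract above describes rolling probe-level significance values up to the epitope and protein level through meta-analysis, then estimating false discovery rates. A minimal Python sketch of that general idea follows. It is not HERON's code or API. Fisher's method and the Benjamini-Hochberg adjustment are assumed here only for illustration, since the abstract does not name a specific meta-analysis or FDR procedure, and the function names and example p-values are hypothetical.

```python
import math
from typing import Dict, List


def fisher_combined_pvalue(pvalues: List[float]) -> float:
    """Combine independent p-values (each in (0, 1]) with Fisher's method.

    Under the null, X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom. For even degrees of freedom the survival
    function has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    half = -sum(math.log(p) for p in pvalues)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical probe-level p-values grouped by the epitope they fall in.
probe_pvalues: Dict[str, List[float]] = {
    "epitope_A": [0.05, 0.05],
    "epitope_B": [0.5],
}

# Epitope-level p-values, then an FDR adjustment across epitopes.
epitope_p = {name: fisher_combined_pvalue(ps) for name, ps in probe_pvalues.items()}
adjusted = benjamini_hochberg(list(epitope_p.values()))
```

Fisher's method assumes independent tests, which overlapping adjacent peptides may not satisfy, so other combination rules could be more appropriate in practice.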
ABSTRACT
Estrogen receptor (ER)-positive breast cancer is characterized by late recurrences following initial treatment. The epithelial cell fate transcription factor Grainyhead-like protein 2 (GRHL2) is overexpressed in ER-positive breast cancers and is linked to poorer prognosis as compared to ER-negative breast cancers. To understand how GRHL2 contributes to progression, GRHL2 was overexpressed in ER-positive cells. We demonstrated that elevated GRHL2 imparts plasticity with stem cell- and dormancy-associated traits. RNA sequencing and immunocytochemistry revealed that high GRHL2 not only strengthens the epithelial identity but supports a hybrid epithelial to mesenchymal transition (EMT). Proliferation and tumor studies exhibited a decrease in growth and an upregulation of dormancy markers, such as NR2F1 and CDKN1B. Mammosphere assays and flow cytometry revealed enrichment of stem cell markers CD44 and ALDH1, and increased self-renewal capacity. Cistrome analyses revealed a change in transcription factor motifs near GRHL2 sites from developmental factors to those associated with disease progression. Together, these data support the idea that the plasticity and properties induced by elevated GRHL2 may provide a selective advantage to explain the association between GRHL2 and breast cancer progression.
ABSTRACT
Introduction: Before they can produce their own antibodies, newborns are protected from infections by transplacental transfer of maternal IgG antibodies and after birth through breast milk IgA antibodies. Rhinovirus (RV) infections are extremely common in early childhood, and while RV infections often result in only mild upper respiratory illnesses, they can also cause severe lower respiratory illnesses such as bronchiolitis and pneumonia. Methods: We used high-density peptide arrays to profile infant and maternal antibody reactivity to capsid and full proteome sequences of three human RVs - A16, B52, and C11. Results: Numerous plasma IgG and breast milk IgA RV epitopes were identified that localized to regions of the RV capsid surface and interior, and also to several non-structural proteins. While most epitopes were bound by both IgG and IgA, there were several instances where isotype-specific and RV-specific binding were observed. We also profiled 62 unique RV-C protein loop sequences characteristic of this species' capsid VP1 protein. Discussion: Many of the RV-C loop sequences were highly bound by IgG from one-year-old infants, indicating recent or ongoing active infections, or alternatively, a level of cross-reactivity among homologous RV-C sites.
Subject(s)
Viral Antibodies , Immunoglobulin G , Human Milk , Rhinovirus , Humans , Human Milk/immunology , Viral Antibodies/immunology , Viral Antibodies/blood , Female , Immunoglobulin G/immunology , Immunoglobulin G/blood , Infant , Rhinovirus/immunology , Immunoglobulin A/immunology , Immunoglobulin A/blood , Picornaviridae Infections/immunology , Newborn Infant , Epitopes/immunology , Capsid Proteins/immunology , Adult
ABSTRACT
ABSTRACT: Yin Yang 1 (YY1) and structural maintenance of chromosomes 3 (SMC3) are 2 critical chromatin structural factors that mediate long-distance enhancer-promoter interactions and promote developmentally regulated changes in chromatin architecture in hematopoietic stem/progenitor cells (HSPCs). Although YY1 has critical functions in promoting hematopoietic stem cell (HSC) self-renewal and maintaining HSC quiescence, SMC3 is required for proper myeloid lineage differentiation. However, many questions remain unanswered regarding how YY1 and SMC3 interact with each other and affect hematopoiesis. We found that YY1 physically interacts with SMC3 and cooccupies with SMC3 at a large cohort of promoters genome wide, and YY1 deficiency deregulates the genetic network governing cell metabolism. YY1 occupies the Smc3 promoter and represses SMC3 expression in HSPCs. Although deletion of 1 Smc3 allele partially restores HSC numbers and quiescence in YY1 knockout mice, Yy1-/-Smc3+/- HSCs fail to reconstitute blood after bone marrow transplant. YY1 regulates HSC metabolic pathways and maintains proper intracellular reactive oxygen species levels in HSCs, and this regulation is independent of the YY1-SMC3 axis. Our results establish a distinct YY1-SMC3 axis and its impact on HSC quiescence and metabolism.
Subject(s)
Cell Cycle Proteins , Non-Histone Chromosomal Proteins , Hematopoietic Stem Cells , YY1 Transcription Factor , Animals , Mice , Cell Cycle Proteins/metabolism , Cell Cycle Proteins/genetics , Non-Histone Chromosomal Proteins/metabolism , Non-Histone Chromosomal Proteins/genetics , Cohesins , Gene Expression Regulation , Hematopoiesis , Hematopoietic Stem Cells/metabolism , Hematopoietic Stem Cells/cytology , Knockout Mice , Genetic Promoter Regions , YY1 Transcription Factor/metabolism , YY1 Transcription Factor/genetics
ABSTRACT
Sera from immune mice with long-term immunological memory, previously cured of melanoma through a combined regimen of 12 Gy of external beam radiation and intratumoral administration of an immunocytokine (anti-GD2 mAb coupled to IL-2), showed strong antibody binding against melanoma tumor cell lines by flow cytometric analysis. Using a high-density whole-proteome peptide array (of 6,090,593 unique peptides), we assessed potential protein targets for antibodies found in immune sera. Sera from 6 of these cured mice were analyzed with this array to determine specific antibody-binding sites and their linear peptide sequences. We identified thousands of peptides that were targeted by these 6 mice and exhibited strong antibody binding only by immune sera (after successful cure and rechallenge), not naïve sera (before tumor implantation), and we developed a robust method to detect these differentially targeted peptides. Confirmatory studies were done to validate these results using 2 separate systems: a peptide ELISA and a smaller-scale peptide array utilizing a slightly different technology. To the best of our knowledge, this is the first study of the full set of germline-encoded linear peptide-based proteome epitopes that are recognized by immune sera from mice cured of cancer via radio-immunotherapy. We furthermore found that although the generation of the B-cell repertoire in immune development is highly variable, and numerous epitopes are identified uniquely by immune serum from each of the 6 immune mice evaluated, there are still several epitopes and proteins that are commonly recognized by at least half of the mice studied. This suggests that every mouse has a unique set of antibodies produced in response to the curative therapy, creating an individual "fingerprint." Additionally, certain epitopes and proteins stand out as more immunogenic, as they are recognized by multiple mice in the immune group.
Subject(s)
Melanoma , Animals , Mice , Proteome , Inbred C57BL Mice , Immunotherapy , Peptides , Epitopes , Immune Sera
ABSTRACT
MOTIVATION: Native top-down proteomics (nTDP) integrates native mass spectrometry (nMS) with top-down proteomics (TDP) to provide comprehensive analysis of protein complexes together with proteoform identification and characterization. Despite significant advances in nMS and TDP software developments, a unified and user-friendly software package for analysis of nTDP data remains lacking. RESULTS: We have developed MASH Native to provide a unified solution for nTDP to process complex datasets with database searching capabilities in a user-friendly interface. MASH Native supports various data formats and incorporates multiple options for deconvolution, database searching, and spectral summing to provide a "one-stop shop" for characterizing both native protein complexes and proteoforms. AVAILABILITY AND IMPLEMENTATION: The MASH Native app, video tutorials, written tutorials, and additional documentation are freely available for download at https://labs.wisc.edu/gelab/MASH_Explorer/MASHSoftware.php. All data files shown in user tutorials are included with the MASH Native software in the download .zip file.
Subject(s)
Proteomics , Software , Factual Databases , DNA-Binding Proteins , Mass Spectrometry , Proteomics/methods
ABSTRACT
Single-cell proteomics has emerged as a powerful method to characterize cellular phenotypic heterogeneity and the cell-specific functional networks underlying biological processes. However, significant challenges remain in single-cell proteomics for the analysis of proteoforms arising from genetic mutations, alternative splicing, and post-translational modifications. Herein, we have developed a highly sensitive functionally integrated top-down proteomics method for the comprehensive analysis of proteoforms from single cells. We applied this method to single muscle fibers (SMFs) to resolve their heterogeneous functional and proteomic properties at the single-cell level. Notably, we have detected single-cell heterogeneity in large proteoforms (>200 kDa) from the SMFs. Using SMFs obtained from three functionally distinct muscles, we found fiber-to-fiber heterogeneity among the sarcomeric proteoforms which can be related to the functional heterogeneity. Importantly, we detected multiple isoforms of myosin heavy chain (~223 kDa), a motor protein that drives muscle contraction, with high reproducibility to enable the classification of individual fiber types. This study reveals single muscle cell heterogeneity in large proteoforms and establishes a direct relationship between sarcomeric proteoforms and muscle fiber types, highlighting the potential of top-down proteomics for uncovering the molecular underpinnings of cell-to-cell variation in complex systems.
Subject(s)
Post-Translational Protein Processing , Proteomics , Proteomics/methods , Reproducibility of Results , Protein Isoforms/metabolism , Skeletal Muscle Fibers/metabolism , Proteome/metabolism
ABSTRACT
Ultradense peptide binding arrays that can probe millions of linear peptides comprising the entire proteomes or immunomes of human or mouse, or numerous microbes, are powerful tools for studying the abundance of different antibody repertoires in serum samples to understand adaptive immune responses. There are few statistical analysis tools for exploring high-dimensional, significant and reproducible antibody targets for ultradense peptide binding arrays at the linear peptide, epitope (grouping of adjacent peptides), and protein level across multiple samples/subjects (i.e., epitope spread or immunogenic regions within each protein) for understanding the heterogeneity of immune responses. We developed HERON (Hierarchical antibody binding Epitopes and pROteins from liNear peptides), an R package that allows users to identify immunogenic epitopes using meta-analyses and spatial clustering techniques. It explores antibody targets at various resolution and confidence levels that can be found consistently across a specified number of samples through the entire proteome, to study antibody responses for diagnostics or treatment. Our approach estimates significance values at the linear peptide (probe), epitope, and protein level to identify top candidates for validation. We test the performance of predictions on all three levels using correlation between technical replicates and comparison of epitope calls on 2 datasets, which shows HERON's competitiveness in estimating false discovery rates and finding general and sample-level regions of interest for antibody binding. The code is available as an R package downloadable from http://github.com/Ong-Research/HERON.
ABSTRACT
An important paradigm in allogeneic hematopoietic cell transplantations (allo-HCTs) is the prevention of graft-versus-host disease (GVHD) while preserving the graft-versus-leukemia (GVL) activity of donor T cells. From an observational clinical study of adult allo-HCT recipients, we identified a CD4+/CD8+ double-positive T cell (DPT) population, not present in starting grafts, whose presence was predictive of ≥ grade 2 GVHD. Using an established xenogeneic transplant model, we reveal that the DPT population develops from antigen-stimulated CD8 T cells, which become transcriptionally, metabolically, and phenotypically distinct from single-positive CD4 and CD8 T cells. Isolated DPTs were sufficient to mediate xeno-GVHD pathology when retransplanted into naïve mice but provided no survival benefit when mice were challenged with a human B-ALL cell line. Overall, this study reveals human DPTs as a T cell population directly involved with GVHD pathology.
Subject(s)
Graft vs Host Disease , Hematopoietic Stem Cell Transplantation , Humans , Mice , Animals , CD4-Positive T-Lymphocytes , CD8-Positive T-Lymphocytes/pathology
ABSTRACT
Native top-down proteomics (nTDP) integrates native mass spectrometry (nMS) with top-down proteomics (TDP) to provide comprehensive analysis of protein complexes together with proteoform identification and characterization. Despite significant advances in nMS and TDP software developments, a unified and user-friendly software package for analysis of nTDP data remains lacking. Herein, we have developed MASH Native to provide a unified solution for nTDP to process complex datasets with database searching capabilities in a user-friendly interface. MASH Native supports various data formats and incorporates multiple options for deconvolution, database searching, and spectral summing to provide a one-stop shop for characterizing both native protein complexes and proteoforms. The MASH Native app, video tutorials, written tutorials and additional documentation are freely available for download at https://labs.wisc.edu/gelab/MASH_Explorer/MASHNativeSoftware.php . All data files shown in user tutorials are included with the MASH Native software in the download .zip file.
ABSTRACT
BACKGROUND: Humans with inactivating mutations in growth hormone receptor (GHR) have lower rates of cancer, including prostate cancer. Similarly, mice with inactivating Ghr mutations are protected from prostatic intraepithelial neoplasia in the C3(1)/TAg prostate cancer model. However, gaps in clinical relevance in those models persist. The current study addresses these gaps and the ongoing role of Ghr in prostate cancer using loss-of-function and gain-of-function models. METHODS: Conditional Ghr inactivation was achieved in the C3(1)/TAg model by employing a tamoxifen-inducible Cre and a prostate-specific Cre. In parallel, a transgenic GH antagonist was also used. Pathology, proliferation, and gene expression of 6-month-old mouse prostates were assessed. Analysis of The Cancer Genome Atlas data was conducted to identify GHR overexpression in a subset of human prostate cancers. Ghr overexpression was modeled in PTEN-P2 and TRAMP-C2 mouse prostate cancer cells using stable transfectants. The growth, proliferation, and gene expression effects of Ghr overexpression were assessed in vitro and in vivo. RESULTS: Loss-of-function for Ghr globally or in prostatic epithelial cells reduced proliferation and stratification of the prostatic epithelium in the C3(1)/TAg model. Genes and gene sets involved in the immune system and tumorigenesis, for example, were dysregulated upon global Ghr disruption. Analysis of The Cancer Genome Atlas revealed higher GHR expression in human prostate cancers with ERG-fusion genes or ETV1-fusion genes. Modeling the GHR overexpression observed in these human prostate cancers by overexpressing Ghr in mouse prostate cancer cells with mutant Pten or T-antigen driver genes increased proliferation of prostate cancer cells in vitro and in vivo. Ghr overexpression regulated the expression of multiple genes oppositely to Ghr loss-of-function models. CONCLUSIONS: Loss-of-function and gain-of-function Ghr models, including prostatic epithelial cell-specific alterations in Ghr, altered proliferation and gene expression. These data suggest that changes in GHR activity in human prostatic epithelial cells play a role in proliferation and gene regulation in prostate cancer, and that disrupting GH signaling, for example with the FDA-approved GH antagonist pegvisomant, may be beneficial in treating prostate cancer.
Subject(s)
Prostatic Neoplasms , Somatotropin Receptors , Animals , Humans , Infant , Male , Mice , Gene Expression Regulation , Growth Hormone/genetics , Growth Hormone/metabolism , Prostate/pathology , Prostatic Neoplasms/pathology , Somatotropin Receptors/genetics , Somatotropin Receptors/metabolism
ABSTRACT
Before they can produce their own antibodies, newborns are protected from infections by transplacental transfer of maternal IgG antibodies and after birth through breast milk IgA antibodies. Rhinovirus (RV) infections are extremely common in early childhood, and while RV infections often result in only mild upper respiratory illnesses, they can also cause severe lower respiratory illnesses such as bronchiolitis and pneumonia. We used high-density peptide arrays to profile infant and maternal antibody reactivity to capsid and full proteome sequences of three human RVs - A16, B52, and C11. Numerous plasma IgG and breast milk IgA RV epitopes were identified that localized to regions of the RV capsid surface and interior, and also to several non-structural proteins. While most epitopes were bound by both IgG and IgA, there were several instances where isotype-specific and RV-specific binding were observed. We also profiled 62 unique RV-C dominant protein loop sequences characteristic of this species' capsid VP1 protein. Many of these RV-C sites were highly bound by IgG from one-year-old infants, indicating recent or ongoing active infections, or alternatively, a level of cross-reactivity among homologous RV-C sites.
ABSTRACT
Previous studies investigating the effects of blocking the growth hormone (GH)/insulin-like growth factor-1 (IGF-1) axis in prostate cancer found no effects of the growth hormone receptor (GHR) antagonist, pegvisomant, on the growth of grafted human prostate cancer cells in vivo. However, human GHR is not activated by mouse GH, so direct actions of GH on prostate cancer cells were not evaluated in this context. The present study addresses the species specificity of GH-GHR activity by investigating GH actions in prostate cancer cell lines derived from a mouse Pten-deletion model. In vitro cell growth was stimulated by GH and reduced by pegvisomant. These in vitro GH effects were mediated at least in part by the activation of JAK2 and STAT5. When Pten-mutant cells were grown as xenografts in mice, pegvisomant treatment dramatically reduced xenograft size, and this was accompanied by decreased proliferation and increased apoptosis. RNA sequencing of xenografts identified 1765 genes upregulated and 953 genes downregulated in response to pegvisomant, including many genes previously implicated as cancer drivers. Further evaluation of a selected subset of these genes via quantitative reverse transcription-polymerase chain reaction determined that some genes exhibited similar regulation by pegvisomant in prostate cancer cells whether treatment was in vivo or in vitro, indicating direct regulation by GH via GHR activation in prostate cancer cells, whereas other genes responded to pegvisomant only in vivo, suggesting indirect regulation by pegvisomant effects on the host endocrine environment. Similar results were observed for a prostate cancer cell line derived from the mouse transgenic adenocarcinoma of the mouse prostate (TRAMP) model.
Subject(s)
Human Growth Hormone , Prostatic Neoplasms , Animals , Apoptosis/genetics , Cell Proliferation/genetics , Gene Expression , Growth Hormone/genetics , Human Growth Hormone/genetics , Human Growth Hormone/pharmacology , Humans , Insulin-Like Growth Factor I/metabolism , Male , Mice , Prostate/metabolism , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism , Somatotropin Receptors/genetics , Somatotropin Receptors/metabolism
ABSTRACT
The search for potential antibody-based diagnostics, vaccines, and therapeutics for pandemic severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has focused almost exclusively on the spike (S) and nucleocapsid (N) proteins. Coronavirus membrane (M), ORF3a, and ORF8 proteins are humoral immunogens in other coronaviruses (CoVs) but remain largely uninvestigated for SARS-CoV-2. Here, we use ultradense peptide microarray mapping to show that SARS-CoV-2 infection induces robust antibody responses to epitopes throughout the SARS-CoV-2 proteome, particularly in M, in which 1 epitope achieved excellent diagnostic accuracy. We map 79 B cell epitopes throughout the SARS-CoV-2 proteome and demonstrate that antibodies that develop in response to SARS-CoV-2 infection bind homologous peptide sequences in the 6 other known human CoVs. We also confirm reactivity against 4 of our top-ranking epitopes by enzyme-linked immunosorbent assay (ELISA). Illness severity correlated with increased reactivity to 9 SARS-CoV-2 epitopes in S, M, N, and ORF3a in our population. Our results demonstrate previously unknown, highly reactive B cell epitopes throughout the full proteome of SARS-CoV-2 and other CoV proteins.
Subject(s)
Viral Antibodies/immunology , COVID-19/immunology , SARS-CoV-2/immunology , Viral Proteins/immunology , Viral Antibodies/blood , COVID-19/pathology , Coronavirus/immunology , Cross Reactions , B-Lymphocyte Epitopes , Humans , Immunodominant Epitopes , Immunoglobulin G/blood , Immunoglobulin G/immunology , Proteome/immunology , Severity of Illness Index
ABSTRACT
Three-dimensional (3D) human induced pluripotent stem cell-derived engineered cardiac tissues (hiPSC-ECTs) have emerged as a promising alternative to two-dimensional hiPSC-cardiomyocyte monolayer systems because hiPSC-ECTs are a closer representation of endogenous cardiac tissues and more faithfully reflect the relevant cardiac pathophysiology. The ability to perform functional and molecular assessments using the same hiPSC-ECT construct would allow for more reliable correlation between observed functional performance and underlying molecular events, and thus is critically needed. Herein, for the first time, we have established an integrated method that permits sequential assessment of functional properties and top-down proteomics from the same single hiPSC-ECT construct. We quantitatively determined the differences in isometric twitch force and the sarcomeric proteoforms between two groups of hiPSC-ECTs that differed in the duration of time of 3D-ECT culture. Importantly, by using this integrated method we discovered a new and strong correlation between the measured contractile parameters and the phosphorylation levels of alpha-tropomyosin between the two groups of hiPSC-ECTs. The integration of functional assessments together with molecular characterization by top-down proteomics in the same hiPSC-ECT construct enables a holistic analysis of hiPSC-ECTs to accelerate their applications in disease modeling, cardiotoxicity, and drug discovery. Data are available via ProteomeXchange with identifier PXD022814.
Subject(s)
Induced Pluripotent Stem Cells , Cardiotoxicity , Cell Differentiation , Humans , Cardiac Myocytes , Proteomics , Tissue Engineering
ABSTRACT
The search for potential antibody-based diagnostics, vaccines, and therapeutics for pandemic severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has focused almost exclusively on the spike (S) and nucleocapsid (N) proteins. Coronavirus membrane (M), ORF3a, and ORF8 proteins are humoral immunogens in other coronaviruses (CoVs) but remain largely uninvestigated for SARS-CoV-2. Here we use ultradense peptide microarray mapping to show that SARS-CoV-2 infection induces robust antibody responses to epitopes throughout the SARS-CoV-2 proteome, particularly in M, in which one epitope achieved excellent diagnostic accuracy. We map 79 B cell epitopes throughout the SARS-CoV-2 proteome and demonstrate that antibodies that develop in response to SARS-CoV-2 infection bind homologous peptide sequences in the six other known human CoVs. We also confirm reactivity against four of our top-ranking epitopes by enzyme-linked immunosorbent assay (ELISA). Illness severity correlated with increased reactivity to nine SARS-CoV-2 epitopes in S, M, N, and ORF3a in our population. Our results demonstrate previously unknown, highly reactive B cell epitopes throughout the full proteome of SARS-CoV-2 and other CoV proteins.
ABSTRACT
Hypertrophic cardiomyopathy (HCM) is the most common heritable heart disease. Although the genetic cause of HCM has been linked to mutations in genes encoding sarcomeric proteins, the ability to predict clinical outcomes based on specific mutations in HCM patients is limited. Moreover, how mutations in different sarcomeric proteins can result in highly similar clinical phenotypes remains unknown. Posttranslational modifications (PTMs) and alternative splicing regulate the function of sarcomeric proteins; hence, it is critical to study HCM at the level of proteoforms to gain insights into the mechanisms underlying HCM. Herein, we employed high-resolution mass spectrometry-based top-down proteomics to comprehensively characterize sarcomeric proteoforms in septal myectomy tissues from HCM patients exhibiting severe outflow tract obstruction (n = 16) compared to nonfailing donor hearts (n = 16). We observed a complex landscape of sarcomeric proteoforms arising from combinatorial PTMs, alternative splicing, and genetic variation in HCM. A coordinated decrease of phosphorylation in important myofilament and Z-disk proteins with a linear correlation suggests PTM cross-talk in the sarcomere and dysregulation of protein kinase A pathways in HCM. Strikingly, we discovered that the sarcomeric proteoform alterations in the myocardium of HCM patients undergoing septal myectomy were remarkably consistent, regardless of the underlying HCM-causing mutations. This study suggests that the manifestation of severe HCM coalesces at the proteoform level despite distinct genotypes, which underscores the importance of molecular characterization of the HCM phenotype and presents an opportunity to identify broad-spectrum treatments to mitigate the most severe manifestations of this genetically heterogeneous disease.
Subject(s)
Hypertrophic Cardiomyopathy/genetics , Proteins/genetics , Sarcomeres/metabolism , Hypertrophic Cardiomyopathy/metabolism , Genotype , Humans , Mass Spectrometry , Myocardium/metabolism , Proteins/chemistry , Proteins/metabolism , Proteomics , Sarcomeres/genetics , Signal Transduction
ABSTRACT
Top-down mass spectrometry (MS)-based proteomics enables a comprehensive analysis of proteoforms with molecular specificity to achieve a proteome-wide understanding of protein functions. However, the lack of universal software for top-down proteomics is becoming increasingly recognized as a major barrier, especially for newcomers. Here, we have developed MASH Explorer, a universal, comprehensive, and user-friendly software environment for top-down proteomics. MASH Explorer integrates, for the first time, multiple spectral deconvolution and database search algorithms into a single, universal platform that can process top-down proteomics data from various vendor formats. It addresses the urgent need in the rapidly growing top-down proteomics community and is freely available to all users worldwide. With the critical need and tremendous support from the community, we envision that this MASH Explorer software package will play an integral role in advancing top-down proteomics to realize its full potential for biomedical research.
Subject(s)
Proteomics , Software , Algorithms , Mass Spectrometry , Proteome
ABSTRACT
Top-down mass spectrometry (MS) is a powerful tool for the identification and comprehensive characterization of proteoforms arising from alternative splicing, sequence variation, and post-translational modifications. However, the complex data set generated from top-down MS experiments requires multiple sequential data processing steps to successfully interpret the data for identifying and characterizing proteoforms. One critical step is the deconvolution of the complex isotopic distribution that arises from naturally occurring isotopes. Multiple algorithms are currently available to deconvolute top-down mass spectra, resulting in different deconvoluted peak lists with varied accuracy compared to true positive annotations. In this study, we have designed a machine learning strategy that can process and combine the peak lists from different deconvolution results. By optimizing clustering results, deconvolution results from THRASH, TopFD, MS-Deconv, and SNAP algorithms were combined into consensus peak lists at various thresholds using either a simple voting ensemble method or a random forest machine learning algorithm. For the random forest algorithm, which had better predictive performance, the consensus peak lists on average could achieve a recall value (true positive rate) of 0.60 and a precision value (positive predictive value) of 0.78. It outperforms the single best algorithm, which achieved a recall value of only 0.47 and a precision value of 0.58. This machine learning strategy enhanced the accuracy and confidence in protein identification during database searches by accelerating the detection of true positive peaks while filtering out false positive peaks. Thus, this method shows promise in enhancing proteoform identification and characterization for high-throughput data analysis in top-down proteomics.
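The simple voting ensemble and the recall and precision metrics mentioned above can be illustrated with a short Python sketch. This is not the published implementation. The clustering rule (grouping sorted masses whose gap is within a tolerance), the tolerance value, the vote threshold, and all peak masses below are assumptions made up for illustration, and the random forest variant is not shown.

```python
from typing import Dict, List, Tuple


def vote_consensus(peak_lists: Dict[str, List[float]], tol: float,
                   min_votes: int) -> List[float]:
    """Cluster peaks from several deconvolution outputs and keep clusters
    reported by at least `min_votes` distinct algorithms."""
    tagged = sorted((mass, name) for name, masses in peak_lists.items()
                    for mass in masses)
    clusters: List[List[Tuple[float, str]]] = []
    for mass, name in tagged:
        # Extend the current cluster while the gap to its last peak is small.
        if clusters and mass - clusters[-1][-1][0] <= tol:
            clusters[-1].append((mass, name))
        else:
            clusters.append([(mass, name)])
    # One consensus peak (mean mass) per sufficiently supported cluster.
    return [sum(m for m, _ in c) / len(c) for c in clusters
            if len({n for _, n in c}) >= min_votes]


def precision_recall(predicted: List[float], truth: List[float],
                     tol: float) -> Tuple[float, float]:
    """Match each predicted peak to at most one unmatched true peak."""
    matched = set()
    true_positives = 0
    for p in predicted:
        hit = next((i for i, t in enumerate(truth)
                    if i not in matched and abs(p - t) <= tol), None)
        if hit is not None:
            matched.add(hit)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical monoisotopic masses reported by each algorithm.
peak_lists = {
    "THRASH": [1000.00, 1500.02, 2000.00],
    "TopFD": [1000.01, 1500.00, 2500.00],
    "MS-Deconv": [1000.02, 1750.00],
    "SNAP": [1500.01, 2000.01],
}
truth = [1000.0, 1500.0, 1750.0, 2250.0]  # hypothetical annotated peaks

consensus = vote_consensus(peak_lists, tol=0.05, min_votes=2)
precision, recall = precision_recall(consensus, truth, tol=0.05)
```

Raising `min_votes` in this sketch trades recall for precision, which is the kind of threshold sweep the abstract refers to when it mentions consensus peak lists at various thresholds.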