2.
J Biol Chem ; 295(28): 9335-9348, 2020 07 10.
Article in English | MEDLINE | ID: mdl-32393580

ABSTRACT

The oncogene RAS is one of the most widely studied proteins in cancer biology, and mutant active RAS is a driver in many types of solid tumors and hematological malignancies. Yet the biological effects of different RAS mutations and the tissue-specific clinical implications are complex and nuanced. Here, we identified an internal tandem duplication (ITD) in the switch II domain of NRAS from a patient with extremely aggressive colorectal carcinoma. Results of whole-exome DNA sequencing of primary and metastatic tumors indicated that this mutation was present in all analyzed metastases and excluded the presence of any other clear oncogenic driver mutations. Biochemical analysis revealed increased interaction of the RAS ITD with Raf proto-oncogene Ser/Thr kinase (RAF), leading to increased phosphorylation of downstream MAPK/ERK kinase (MEK)/extracellular signal-regulated kinase (ERK). The ITD prevented interaction with neurofibromin 1 (NF1)-GTPase-activating protein (GAP), providing a mechanism for sustained activity of the RAS ITD protein. We present the first crystal structures of NRAS and KRAS ITD at 1.65-1.75 Å resolution, respectively, providing insight into the physical interactions of this class of RAS variants with its regulatory and effector proteins. Our in-depth bedside-to-bench analysis uncovers the molecular mechanism underlying a case of highly aggressive colorectal cancer and illustrates the importance of robust biochemical and biophysical approaches in the implementation of individualized medicine.


Subject(s)
Colorectal Neoplasms , GTP Phosphohydrolases , MAP Kinase Signaling System , Membrane Proteins , Mutation , Proto-Oncogene Proteins p21(ras) , Colorectal Neoplasms/enzymology , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Crystallography, X-Ray , GTP Phosphohydrolases/chemistry , GTP Phosphohydrolases/genetics , GTP Phosphohydrolases/metabolism , HEK293 Cells , Humans , Membrane Proteins/chemistry , Membrane Proteins/genetics , Membrane Proteins/metabolism , Protein Domains , Proto-Oncogene Mas , Proto-Oncogene Proteins p21(ras)/chemistry , Proto-Oncogene Proteins p21(ras)/genetics , Proto-Oncogene Proteins p21(ras)/metabolism , Exome Sequencing , raf Kinases/genetics , raf Kinases/metabolism
3.
Mol Genet Metab Rep ; 19: 100464, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30891420

ABSTRACT

Clinical laboratories have adopted next generation sequencing (NGS) as a gold standard for the diagnosis of hereditary disorders because of its analytic accuracy, high throughput, and potential for cost-effectiveness. We describe the implementation of a single broad-based NGS sequencing assay to meet the genetic testing needs at the University of Minnesota. A single hybrid capture library preparation was used for each test ordered, data was informatically blinded to clinically-ordered genes, and identified variants were reviewed and classified by genetic counselors and molecular pathologists. We performed 2509 sequencing tests from August 2012 till December 2017. The diagnostic yield has remained steady at 25%, but the number of variants of uncertain significance (VUS) included in a patient report decreased over time with 50% of the patient reports including at least one VUS in 2012 and only 22% of the patient reports reporting a VUS in 2017 (p = .002). Among the various clinical specialties, the diagnostic yield was highest in dermatology (60% diagnostic yield) and ophthalmology (42% diagnostic yield) while the diagnostic yield was lowest in gastrointestinal diseases and pulmonary diseases (10% detection yield in both specialties). Deletion/duplication analysis was also implemented in a subset of panels ordered, with 9% of samples having a diagnostic finding using the deletion/duplication analysis. We have demonstrated the feasibility of this broad-based NGS platform to meet the needs of our academic institution by aggregating a sufficient sample volume from many individually rare tests and providing a flexible ordering for custom, patient-specific panels.

4.
Cancer Res ; 77(21): e43-e46, 2017 11 01.
Article in English | MEDLINE | ID: mdl-29092937

ABSTRACT

Proteogenomics has emerged as a valuable approach in cancer research, which integrates genomic and transcriptomic data with mass spectrometry-based proteomics data to directly identify expressed, variant protein sequences that may have functional roles in cancer. This approach is computationally intensive, requiring integration of disparate software tools into sophisticated workflows, challenging its adoption by nonexpert, bench scientists. To address this need, we have developed an extensible, Galaxy-based resource aimed at providing more researchers access to, and training in, proteogenomic informatics. Our resource brings together software from several leading research groups to address two foundational aspects of proteogenomics: (i) generation of customized, annotated protein sequence databases from RNA-Seq data; and (ii) accurate matching of tandem mass spectrometry data to putative variants, followed by filtering to confirm their novelty. Directions for accessing software tools and workflows, along with instructional documentation, can be found at z.umn.edu/canresgithub. Cancer Res; 77(21); e43-46. ©2017 AACR.


Subject(s)
Computational Biology/methods , Genomics/methods , Neoplasms/genetics , Software , Genome, Human , Humans , Proteomics/methods , Tandem Mass Spectrometry , Transcriptome/genetics
5.
J Mol Diagn ; 18(6): 872-881, 2016 11.
Article in English | MEDLINE | ID: mdl-27597741

ABSTRACT

Simultaneous detection of small copy number variations (CNVs) (<0.5 kb) and single-nucleotide variants in clinically significant genes is of great interest for clinical laboratories. The analytical variability in next-generation sequencing (NGS) and artifacts in coverage data because of issues with mappability along with lack of robust bioinformatics tools for CNV detection have limited the utility of targeted NGS data to identify CNVs. We describe the development and implementation of a bioinformatics algorithm, copy number variation-random forest (CNV-RF), that incorporates a machine learning component to identify CNVs from targeted NGS data. Using CNV-RF, we identified 12 of 13 deletions in samples with known CNVs, two cases with duplications, and identified novel deletions in 22 additional cases. Furthermore, no CNVs were identified among 60 genes in 14 cases with normal copy number and no CNVs were identified in another 104 patients with clinical suspicion of CNVs. All positive deletions and duplications were confirmed using a quantitative PCR method. CNV-RF also detected heterozygous deletions and duplications with a specificity of 50% across 4813 genes. The ability of CNV-RF to detect clinically relevant CNVs with a high degree of sensitivity along with confirmation using a low-cost quantitative PCR method provides a framework for providing comprehensive NGS-based CNV/single-nucleotide variant detection in a clinical molecular diagnostics laboratory.


Subject(s)
DNA Copy Number Variations , Genetic Testing , High-Throughput Nucleotide Sequencing , Algorithms , Computational Biology/methods , Female , Gene Deletion , Gene Duplication , Genetic Markers , Genetic Testing/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Male , Real-Time Polymerase Chain Reaction , Reproducibility of Results , Sensitivity and Specificity
6.
Biotechnol J ; 11(9): 1151-7, 2016 Sep.
Article in English | MEDLINE | ID: mdl-27374913

ABSTRACT

Chinese hamster ovary (CHO) cell lines are the dominant industrial workhorses for therapeutic recombinant protein production. The availability of the genome sequences of the Chinese hamster and CHO cells will spur further genome and RNA sequencing of producing cell lines. However, mammalian genomes assembled using shotgun sequencing data still contain regions of uncertain quality due to assembly errors. Identifying high-confidence regions in the assembled genome will facilitate its use for cell engineering and genome engineering. We assembled two independent drafts of the Chinese hamster genome, one by de novo assembly from shotgun sequencing reads and one by re-scaffolding and gap-filling the draft genome from NCBI for improved scaffold lengths and gap fractions. We then used the two independent assemblies to identify high-confidence regions using two different approaches. First, the two independent assemblies were compared at the sequence level to identify their consensus regions as "high confidence regions", which account for at least 78% of the assembled genome. Further, a genome-wide comparison of the Chinese hamster scaffolds with mouse chromosomes revealed scaffolds with large blocks of collinearity, which were also compiled as high-quality scaffolds. Genome-scale collinearity was complemented with EST-based synteny, which also revealed conserved gene order compared to mouse. As cell line sequencing becomes more commonly practiced, the approaches reported here are useful for assessing the quality of assembly and can potentially facilitate the engineering of cell lines.


Subject(s)
Chromosome Mapping/methods , Genome , Sequence Analysis, DNA/methods , Animals , CHO Cells , Cricetinae , Cricetulus , Expressed Sequence Tags , Mice
7.
Arch Pathol Lab Med ; 139(2): 204-10, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25611102

ABSTRACT

CONTEXT: Although next-generation sequencing (NGS) can revolutionize molecular diagnostics, several hurdles remain in the implementation of this technology in clinical laboratories. OBJECTIVES: To validate and implement an NGS panel for genetic diagnosis of more than 100 inherited diseases, such as neurologic conditions, congenital hearing loss and eye disorders, developmental disorders, nonmalignant diseases treated by hematopoietic cell transplantation, familial cancers, connective tissue disorders, metabolic disorders, disorders of sexual development, and cardiac disorders. The diagnostic gene panels ranged from 1 to 54 genes with most of panels containing 10 genes or fewer. DESIGN: We used a liquid hybridization-based, target-enrichment strategy to enrich 10 067 exons in 568 genes, followed by NGS with a HiSeq 2000 sequencing system (Illumina, San Diego, California). RESULTS: We successfully sequenced 97.6% (9825 of 10 067) of the targeted exons to obtain a minimum coverage of 20× at all bases. We demonstrated 100% concordance in detecting 19 pathogenic single-nucleotide variations and 11 pathogenic insertion-deletion mutations ranging in size from 1 to 18 base pairs across 18 samples that were previously characterized by Sanger sequencing. Using 4 pairs of blinded, duplicate samples, we demonstrated a high degree of concordance (>99%) among the blinded, duplicate pairs. CONCLUSIONS: We have successfully demonstrated the feasibility of using the NGS platform to multiplex genetic tests for several rare diseases and the use of cloud computing for bioinformatics analysis as a relatively low-cost solution for implementing NGS in clinical laboratories.


Subject(s)
Genetic Diseases, Inborn/diagnosis , Genetic Testing/methods , High-Throughput Nucleotide Sequencing/methods , Rare Diseases/diagnosis , Computational Biology , DNA Copy Number Variations , DNA Mutational Analysis , Exons/genetics , Feasibility Studies , Genetic Diseases, Inborn/genetics , Genetic Predisposition to Disease , Genetic Testing/standards , Genetic Variation , Genotype , Humans , Mutation , Rare Diseases/genetics , Sequence Analysis, DNA
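
The coverage figure quoted in the RESULTS of the abstract above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the two exon counts come from the abstract, while the function and variable names are made up for this example.

```python
# Exon counts quoted in the abstract above.
targeted_exons = 10_067
covered_exons = 9_825  # exons reaching at least 20x coverage at all bases


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


coverage_pct = percent(covered_exons, targeted_exons)
missed_exons = targeted_exons - covered_exons

# Prints: 97.6% of targeted exons covered; 242 exons fell short
print(f"{coverage_pct:.1f}% of targeted exons covered; "
      f"{missed_exons} exons fell short")
```

The rounded result agrees with the 97.6% reported in the abstract.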
8.
J Proteome Res ; 13(12): 5898-908, 2014 Dec 05.
Article in English | MEDLINE | ID: mdl-25301683

ABSTRACT

Proteogenomics combines large-scale genomic and transcriptomic data with mass-spectrometry-based proteomic data to discover novel protein sequence variants and improve genome annotation. In contrast with conventional proteomic applications, proteogenomic analysis requires a number of additional data processing steps. Ideally, these required steps would be integrated and automated via a single software platform offering accessibility for wet-bench researchers as well as flexibility for user-specific customization and integration of new software tools as they emerge. Toward this end, we have extended the Galaxy bioinformatics framework to facilitate proteogenomic analysis. Using analysis of whole human saliva as an example, we demonstrate Galaxy's flexibility through the creation of a modular workflow incorporating both established and customized software tools that improve depth and quality of proteogenomic results. Our customized Galaxy-based software includes automated, batch-mode BLASTP searching and a Peptide Sequence Match Evaluator tool, both useful for evaluating the veracity of putative novel peptide identifications. Our complex workflow (approximately 140 steps) can be easily shared using built-in Galaxy functions, enabling their use and customization by others. Our results provide a blueprint for the establishment of the Galaxy framework as an ideal solution for the emerging field of proteogenomics.


Subject(s)
Proteome/metabolism , Saliva/metabolism , DNA, Complementary/genetics , Gene Library , Humans , Proteome/genetics , Proteomics , Reading Frames , Software
9.
BMC Genomics ; 15: 703, 2014 Aug 22.
Article in English | MEDLINE | ID: mdl-25149441

ABSTRACT

BACKGROUND: Current practice in mass spectrometry (MS)-based proteomics is to identify peptides by comparison of experimental mass spectra with theoretical mass spectra derived from a reference protein database; however, this strategy necessarily fails to detect peptide and protein sequences that are absent from the database. We and others have recently shown that customized proteomic databases derived from RNA-Seq data can be employed for MS-searching to both improve MS analysis and identify novel peptides. While this general strategy constitutes a significant advance for the discovery of novel protein variations, it has not been readily transferable to other laboratories due to the need for many specialized software tools. To address this problem, we have implemented readily accessible, modifiable, and extensible workflows within Galaxy-P, short for Galaxy for Proteomics, a web-based bioinformatic extension of the Galaxy framework for the analysis of multi-omics (e.g. genomics, transcriptomics, proteomics) data. RESULTS: We present three bioinformatic workflows that allow the user to upload raw RNA sequencing reads and convert the data into high-quality customized proteomic databases suitable for MS searching. We show the utility of these workflows on human and mouse samples, identifying 544 peptides containing single amino acid polymorphisms (SAPs) and 187 peptides corresponding to unannotated splice junction peptides, correlating protein and transcript expression levels, and providing the option to incorporate transcript abundance measures within the MS database search process (reduced databases, incorporation of transcript abundance for protein identification score calculations, etc.). CONCLUSIONS: Using RNA-Seq data to enhance MS analysis is a promising strategy to discover novel peptides specific to a sample and, more generally, to improve proteomics results. The main bottleneck for widespread adoption of this strategy has been the lack of easily used and modifiable computational tools. We provide a solution to this problem by introducing a set of workflows within the Galaxy-P framework that converts raw RNA-Seq data into customized proteomic databases.


Subject(s)
Sequence Analysis, RNA , Software , Alternative Splicing , Amino Acid Substitution , Animals , Databases, Nucleic Acid , High-Throughput Nucleotide Sequencing , Humans , Jurkat Cells , Mice , Polymorphism, Single Nucleotide , Protein Isoforms/chemistry , Protein Isoforms/genetics , Proteome/chemistry , Proteome/genetics
10.
BMC Res Notes ; 7: 314, 2014 May 23.
Article in English | MEDLINE | ID: mdl-24885806

ABSTRACT

BACKGROUND: The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, though several challenges still limit the widespread adoption of NGS testing into clinical practice. One such difficulty is the development of a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires a substantial level of computing power that is often cost-prohibitive to most clinical diagnostics laboratories. FINDINGS: To address this challenge, our institution has developed a Galaxy-based data analysis pipeline which relies on a web-based, cloud-computing infrastructure to process NGS data and identify genetic variants. It provides additional flexibility, needed to control storage costs, resulting in a pipeline that is cost-effective on a per-sample basis. It does not require the usage of EBS disk to run a sample. CONCLUSIONS: We demonstrate the validation and feasibility of implementing this bioinformatics pipeline in a molecular diagnostics laboratory. Four samples were analyzed in duplicate pairs and showed 100% concordance in mutations identified. This pipeline is currently being used in the clinic, and all identified pathogenic variants are confirmed using Sanger sequencing, further validating the software.


Subject(s)
Clinical Laboratory Techniques , High-Throughput Nucleotide Sequencing/methods , Internet , Sequence Analysis, DNA/methods , Statistics as Topic , Clinical Laboratory Techniques/economics , High-Throughput Nucleotide Sequencing/economics , Humans , Internet/economics , Reproducibility of Results , Sequence Analysis, DNA/economics
11.
Clin Proteomics ; 6(3): 75-82, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20930922

ABSTRACT

INTRODUCTION: Tumors lack normal drainage of secreted fluids and consequently build up tumor interstitial fluid (TIF). Unlike other bodily fluids, TIF likely contains a high proportion of tumor-specific proteins with potential as biomarkers. METHODS: Here, we evaluated a novel technique using a unique ultrafiltration catheter for in situ collection of TIF and used it to generate the first catalog of TIF proteins from a head and neck squamous cell carcinoma (HNSCC). To maximize proteomic coverage, TIF was immunodepleted for high abundance proteins and digested with trypsin, and peptides were fractionated in three dimensions prior to mass spectrometry. RESULTS: We identified 525 proteins with high confidence. The HNSCC TIF proteome was distinct compared to proteomes of other bodily fluids. It contained a relatively high proportion of proteins annotated by Gene Ontology as "extracellular" compared to other secreted fluid and cellular proteomes, indicating minimal cell lysis from our in situ collection technique. Several proteins identified are putative biomarkers of HNSCC, supporting our catalog's value as a source of potential biomarkers. CONCLUSIONS: In all, we demonstrate a reliable new technique for in situ TIF collection and provide the first HNSCC TIF protein catalog with value as a guide for others seeking to develop tumor biomarkers. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s12014-010-9050-3) contains supplementary material, which is available to authorized users.

12.
Proteomics ; 10(19): 3533-8, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20821806

ABSTRACT

Pulsed Q dissociation enables combining LTQ ion trap instruments with isobaric peptide tagging. Unfortunately, this combination lacks a technique which accurately reports protein abundance ratios and is implemented in a freely available, flexible software pipeline. We developed and implemented a technique assigning collective reporter ion intensity-based weights to each peptide abundance ratio and calculating a protein's weighted average abundance ratio and p-value. Using an iTRAQ-labeled standard mixture, we compared our technique's performance to the commercial software MASCOT, finding that it performed better than MASCOT's nonweighted averaging and median peptide ratio techniques, and equal to its weighted averaging technique. We also compared performance of the LTQ-Orbitrap plus our technique to 4800 MALDI TOF/TOF plus Protein Pilot, by analyzing an iTRAQ-labeled stem cell lysate. We found highly correlated protein abundance ratios, indicating that the LTQ-Orbitrap plus our technique yields results comparable to the current standard. We implemented our technique in a freely available, automated software pipeline, called LTQ-iQuant, which is mzXML-compatible; supports iTRAQ 4-plex and 8-plex LTQ data; and can be modified for and have weights trained to a user's LTQ and other isobaric peptide tagging methods. LTQ-iQuant should make LTQ instruments and isobaric peptide tagging accessible to more proteomic researchers.


Subject(s)
Proteins/analysis , Proteomics/methods , Software , Peptides/analysis , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods
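
The idea of a weighted average protein abundance ratio described in the abstract above can be sketched as follows. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: each peptide ratio is taken to carry a non-negative weight (for instance, one derived from its summed reporter ion intensity), the averaging is done in log2 space, and the function name `weighted_protein_ratio` is hypothetical. This is not the LTQ-iQuant implementation, and the p-value step is not covered.

```python
import math


def weighted_protein_ratio(peptides):
    """Combine peptide abundance ratios into one protein-level ratio.

    `peptides` is a list of (ratio, weight) pairs. The weights are an
    assumed stand-in for reporter ion intensity-based weights; the
    averaging is done on log2 ratios so that 2-fold up and 2-fold down
    cancel symmetrically.
    """
    total_weight = sum(weight for _, weight in peptides)
    if total_weight <= 0:
        raise ValueError("at least one peptide must have a positive weight")
    mean_log2 = sum(weight * math.log2(ratio)
                    for ratio, weight in peptides) / total_weight
    return 2.0 ** mean_log2


# Hypothetical peptides: a strong 2-fold-up signal and a weak 2-fold-down one.
example = [(2.0, 3.0), (0.5, 1.0)]
protein_ratio = weighted_protein_ratio(example)
print(f"weighted protein ratio: {protein_ratio:.3f}")
```

With equal weights the two example peptides would cancel to a ratio of 1; giving the first peptide three times the weight pulls the protein-level ratio toward its value.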
13.
PLoS One ; 5(6): e11148, 2010 Jun 17.
Article in English | MEDLINE | ID: mdl-20567502

ABSTRACT

BACKGROUND: Oral cancer survival rates increase significantly when it is detected and treated early. Unfortunately, clinicians now lack tests which easily and reliably distinguish pre-malignant oral lesions from those already transitioned to malignancy. A test for proteins, ones found in non-invasively-collected whole saliva and whose abundances distinguish these lesion types, would meet this critical need. METHODOLOGY/PRINCIPAL FINDINGS: To discover such proteins, in a first-of-its-kind study we used advanced mass spectrometry-based quantitative proteomics analysis of the pooled soluble fraction of whole saliva from four subjects with pre-malignant lesions and four with malignant lesions. We prioritized candidate biomarkers via bioinformatics and validated selected proteins by western blotting. Bioinformatic analysis of differentially abundant proteins and initial western blotting revealed increased abundance of myosin and actin in patients with malignant lesions. We validated those results by additional western blotting of individual whole saliva samples from twelve other subjects with pre-malignant oral lesions and twelve with malignant oral lesions. Sensitivity/specificity values for distinguishing between different lesion types were 100%/75% (p = 0.002) for actin, and 67%/83% (p<0.00001) for myosin in soluble saliva. Exfoliated epithelial cells from subjects' saliva also showed increased myosin and actin abundance in those with malignant lesions, linking our observations in soluble saliva to abundance differences between pre-malignant and malignant cells. CONCLUSIONS/SIGNIFICANCE: Salivary actin and myosin abundances distinguish oral lesion types with sensitivity and specificity rivaling other non-invasive oral cancer tests. Our findings provide a promising starting point for the development of non-invasive and inexpensive salivary tests to reliably detect oral cancer early.


Subject(s)
Actins/metabolism , Biomarkers/metabolism , Mouth Neoplasms/metabolism , Myosins/metabolism , Precancerous Conditions/metabolism , Proteomics , Chromatography, High Pressure Liquid , Humans , Mouth Neoplasms/diagnosis , Precancerous Conditions/diagnosis , Sensitivity and Specificity , Tandem Mass Spectrometry
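
For readers less familiar with the metrics quoted in the abstract above, sensitivity and specificity are computed from counts of correct and incorrect calls. The abstract reports only percentages, so the counts below are hypothetical: they were chosen to be consistent with the reported values for twelve malignant and twelve pre-malignant subjects and should not be read as the study's data.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of malignant cases correctly called malignant."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of pre-malignant cases correctly called pre-malignant."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (12 malignant, 12 pre-malignant subjects per marker).
markers = {
    "actin": {"tp": 12, "fn": 0, "tn": 9, "fp": 3},
    "myosin": {"tp": 8, "fn": 4, "tn": 10, "fp": 2},
}

for name, c in markers.items():
    sens = 100 * sensitivity(c["tp"], c["fn"])
    spec = 100 * specificity(c["tn"], c["fp"])
    print(f"{name}: sensitivity {sens:.0f}%, specificity {spec:.0f}%")
```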
14.
Mol Cell Proteomics ; 9(2): 403-14, 2010 Feb.
Article in English | MEDLINE | ID: mdl-19955088

ABSTRACT

Cellular nutritional and energy status regulates a wide range of nuclear processes important for cell growth, survival, and metabolic homeostasis. Mammalian target of rapamycin (mTOR) plays a key role in the cellular responses to nutrients. However, the nuclear processes governed by mTOR have not been clearly defined. Using isobaric peptide tagging coupled with linear ion trap mass spectrometry, we performed quantitative proteomics analysis to identify nuclear processes in human cells under control of mTOR. Within 3 h of inhibiting mTOR with rapamycin in HeLa cells, we observed down-regulation of nuclear abundance of many proteins involved in translation and RNA modification. Unexpectedly, mTOR inhibition also down-regulated several proteins functioning in chromosomal integrity and up-regulated those involved in DNA damage responses (DDRs) such as 53BP1. Consistent with these proteomic changes and DDR activation, mTOR inhibition enhanced interaction between 53BP1 and p53 and increased phosphorylation of ataxia telangiectasia mutated (ATM) kinase substrates. ATM substrate phosphorylation was also induced by inhibiting protein synthesis and suppressed by inhibiting proteasomal activity, suggesting that mTOR inhibition reduces steady-state (abundance) levels of proteins that function in cellular pathways of DDR activation. Finally, rapamycin-induced changes led to increased survival after radiation exposure in HeLa cells. These findings reveal a novel functional link between mTOR and DDR pathways in the nucleus potentially operating as a survival mechanism against unfavorable growth conditions.


Subject(s)
Cell Nucleus/metabolism , DNA Damage , Intracellular Signaling Peptides and Proteins/metabolism , Protein Serine-Threonine Kinases/metabolism , Proteomics/methods , Cell Nucleus/drug effects , Cell Nucleus/radiation effects , Cycloheximide/pharmacology , HeLa Cells , Humans , Intracellular Signaling Peptides and Proteins/antagonists & inhibitors , Isotope Labeling , Proteasome Endopeptidase Complex/metabolism , Proteasome Inhibitors , Protein Biosynthesis/drug effects , Protein Biosynthesis/radiation effects , Protein Serine-Threonine Kinases/antagonists & inhibitors , Proteome/metabolism , Radiation Tolerance/drug effects , Radiation Tolerance/radiation effects , Radiation, Ionizing , Reproducibility of Results , Signal Transduction/drug effects , Signal Transduction/radiation effects , Sirolimus/pharmacology , TOR Serine-Threonine Kinases
15.
J Proteome Res ; 8(12): 5590-600, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19813771

ABSTRACT

Comprehensive identification of proteins in whole human saliva is critical for appreciating its full diagnostic potential. However, this is challenged by the large dynamic range of protein abundance within the fluid. To address this problem, we used an analysis platform that coupled hexapeptide libraries for dynamic range compression (DRC) with three-dimensional (3D) peptide fractionation. Our approach identified 2340 proteins in whole saliva and represents the largest saliva proteomic dataset generated using a single analysis platform. Three-dimensional peptide fractionation involving sequential steps of preparative isoelectric focusing (IEF), strong cation exchange, and capillary reversed-phase liquid chromatography was essential for maximizing gains from DRC. Compared to saliva not treated with hexapeptide libraries, DRC substantially increased identified proteins across physicochemical and functional categories. Approximately 20% of total salivary proteins are also seen in plasma, and proteins in both fluids show comparable functional diversity and disease-linkage. However, for a subset of diseases, saliva has higher apparent diagnostic potential. These results expand the potential for whole saliva in health monitoring/diagnostics and provide a general platform for improving proteomic coverage of complex biological samples.


Subject(s)
Molecular Diagnostic Techniques , Proteome/analysis , Saliva/chemistry , Blood Proteins/analysis , Databases, Protein , Humans , Peptide Library , Proteins/analysis , Proteomics/methods
16.
Mol Cell Proteomics ; 7(3): 486-98, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18045803

ABSTRACT

Whole human saliva possesses tremendous potential in clinical diagnostics, particularly for conditions within the oral cavity such as oral cancer. Although many have studied the soluble fraction of whole saliva, few have taken advantage of the diagnostic potential of the cells present in saliva, and none have taken advantage of proteomics capabilities for their study. We report on a novel proteomics method with which we characterized for the first time cells contained in whole saliva from patients diagnosed with oral squamous cell carcinoma. Our method uses three dimensions of peptide fractionation, combining the following steps: preparative IEF using free flow electrophoresis, strong cation exchange step gradient chromatography, and microcapillary reverse-phase liquid chromatography. We determined that the whole saliva samples contained enough cells, mostly exfoliated epithelial cells, providing adequate amounts of total protein for proteomics analysis. From a mixture of four oral cancer patient samples, the analysis resulted in a catalogue of over 1000 human proteins, each identified from at least two peptides, including numerous proteins with a role in oral squamous cell carcinoma signaling and tumorigenesis pathways. Additionally, proteins from over 30 different bacteria were identified, some of which putatively contribute to cancer development. The combination of preparative IEF followed by strong cation exchange chromatography effectively fractionated the complex peptide mixtures despite the closely related physicochemical peptide properties of these separations (pI and solution phase charge, respectively). Furthermore, compared with our two-step method combining preparative IEF and reverse-phase liquid chromatography, our three-step method identified significantly more cellular proteins while retaining higher confidence protein identification enabled by peptide pI information gained through IEF. Thus, for detecting salivary markers of oral cancer and possibly other conditions of the oral cavity, the results confirm both the potential of analyzing the cells in whole saliva and doing so with our proteomics method.


Subject(s)
Chemical Fractionation/methods , Mouth Neoplasms/pathology , Peptides/chemistry , Proteomics/methods , Saliva/cytology , Tandem Mass Spectrometry , Bacterial Proteins , Disease Progression , Glycosylphosphatidylinositols , Humans , Neoplasm Proteins/analysis
17.
Bioinformatics ; 20(18): 3442-54, 2004 Dec 12.
Article in English | MEDLINE | ID: mdl-15271779

ABSTRACT

MOTIVATION: To improve the ability of biologists (both researchers and students) to ask biologically interesting questions of the Gene Ontology (GO) database and to explore the ontologies by seeing large portions of the ontology graphs in context, along with details of individual terms in the ontologies. RESULTS: GoGet and GoView are two new tools built as part of an extensible web application system based on Java 2 Enterprise Edition technology. GoGet has a user interface that enables users to ask biologically interesting questions, such as (1) What are the DNA binding proteins involved in DNA repair, but not in DNA replication? and (2) Of the terms containing the word triphosphatase, which have associated gene products from mouse, but not fruit fly? The results of such queries can be viewed in a collapsed tabular format that eases the burden of getting through large tables of data. GoView enables users to explore the large directed acyclic graph structure of the ontologies in the GO database. The two tools are coordinated, so that results from queries in GoGet can be visualized in GoView in the ontology in which they appear, and explorations started from GoView can request details of gene product associations to appear in a result table in GoGet. AVAILABILITY: Free access to the GoGet query tool and free download of the GoView ontology viewer are provided to all users at http://db.math.macalester.edu/goproject. In addition, source code for the GoView tool is also available from this site, along with a user manual for both tools.


Subject(s)
Algorithms , Database Management Systems , Databases, Genetic , Information Storage and Retrieval/methods , Natural Language Processing , Software , User-Computer Interface , Documentation/methods , Internet
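
The first example question in the abstract above ("DNA binding proteins involved in DNA repair, but not in DNA replication") is, logically, a set intersection followed by a set difference. The sketch below illustrates that logic only; it does not reflect how GoGet is implemented. The annotation table, the protein identifiers `P1` to `P6`, and the function name `query_terms` are all invented for this example.

```python
# Toy annotation table: GO term name -> set of (hypothetical) gene products.
annotations = {
    "DNA binding": {"P1", "P2", "P3", "P4"},
    "DNA repair": {"P2", "P3", "P5"},
    "DNA replication": {"P3", "P6"},
}


def query_terms(table, all_of, none_of=()):
    """Return products annotated to every term in `all_of` and to no
    term in `none_of`."""
    result = set.intersection(*(table[term] for term in all_of))
    for term in none_of:
        result -= table[term]
    return result


hits = query_terms(
    annotations,
    all_of=["DNA binding", "DNA repair"],
    none_of=["DNA replication"],
)
print(sorted(hits))
```

A real query would also need to account for the ontology's graph structure, since a product annotated to a child term is implicitly annotated to its ancestors; that step is omitted here.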