1.
PLoS Genet ; 11(11): e1005425, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26587833

ABSTRACT

Changes in the locations and boundaries of heterochromatin are critical during development, and de novo assembly of silent chromatin in budding yeast is a well-studied model for how new sites of heterochromatin assemble. De novo assembly cannot occur in the G1 phase of the cell cycle, and one to two divisions are needed for complete silent chromatin assembly and transcriptional repression. Mutations of DOT1, the histone H3 lysine 79 (K79) methyltransferase, and SET1, the histone H3 lysine 4 (K4) methyltransferase, speed de novo assembly. These observations have led to the model that regulated demethylation of histones may be a mechanism by which cells control the establishment of heterochromatin. We found that the abundance of Sir4, a protein required for the assembly of silent chromatin, decreases dramatically during a G1 arrest and therefore tested whether changing the levels of Sir4 would also alter the speed of de novo establishment. Halving the level of Sir4 slows heterochromatin establishment, while increasing Sir4 speeds establishment. yku70Δ and ubp10Δ cells also speed de novo assembly and, like dot1Δ cells, have defects in subtelomeric silencing, suggesting that these mutants may indirectly speed de novo establishment by liberating Sir4 from telomeres. Deleting RIF1 and RIF2, which suppresses the subtelomeric silencing defects in these mutants, rescues the advanced de novo establishment in yku70Δ and ubp10Δ cells, but not in dot1Δ cells, suggesting that YKU70 and UBP10 regulate Sir4 availability by modulating subtelomeric silencing, while DOT1 functions directly to regulate establishment. Our data support a model whereby the demethylation of histone H3 K79 and changes in Sir4 abundance and availability define two rate-limiting steps that regulate de novo assembly of heterochromatin.


Subject(s)
Gene Silencing , Heterochromatin/genetics , Saccharomyces cerevisiae/genetics , Silent Information Regulator Proteins, Saccharomyces cerevisiae/physiology , DNA-Binding Proteins/genetics , Epistasis, Genetic , G1 Phase , Gene Deletion , Mutation , Nuclear Proteins/genetics , Repressor Proteins/genetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae Proteins/genetics , Telomere , Telomere-Binding Proteins/genetics , Ubiquitin Thiolesterase/genetics
2.
EBioMedicine ; 2(9): 1160-8, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26501113

ABSTRACT

Biomarkers for active tuberculosis (TB) are urgently needed to improve rapid TB diagnosis. The objective of this study was to identify serum protein expression changes associated with TB but not with latent Mycobacterium tuberculosis infection (LTBI), uninfected states, or respiratory diseases other than TB (ORD). Serum samples from 209 HIV-uninfected (HIV(-)) and co-infected (HIV(+)) individuals were studied. In the discovery phase, samples were analyzed via liquid chromatography and mass spectrometry, and in the verification phase, biologically independent samples were analyzed via a multiplex multiple reaction monitoring mass spectrometry (MRM-MS) assay. Compared to LTBI and ORD, host proteins were significantly differentially expressed in TB and were involved in the immune response, tissue repair, and lipid metabolism. Biomarker panels, whose composition differed according to HIV status and consisted of 8 host proteins in HIV(-) individuals (CD14, SEPP1, SELL, TNXB, LUM, PEPD, QSOX1, COMP, APOC1) or 10 host proteins in HIV(+) individuals (CD14, SEPP1, PGLYRP2, PFN1, VASN, CPN2, TAGLN2, IGFBP6), distinguished TB from ORD with excellent accuracy (AUC = 0.96 for HIV(-) TB, 0.95 for HIV(+) TB). These results warrant validation in larger studies but provide promise that host protein biomarkers could be the basis for a rapid, blood-based test for TB.
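The AUC values above summarize how well a multi-protein panel separates TB from ORD. The following is a minimal, hypothetical sketch of how such a panel can be scored, assuming synthetic MRM-style protein abundances: the proteins are combined with logistic regression and the panel's discrimination is estimated as a cross-validated ROC AUC. The sample sizes, effect sizes, and data are invented for illustration and do not reproduce the study's panels or results.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(2)

# Simulated abundances for a handful of panel proteins in TB vs. other respiratory disease.
# (Synthetic data for illustration only; effect sizes and sample sizes are assumptions.)
proteins = ["CD14", "SEPP1", "SELL", "TNXB", "LUM", "PEPD", "QSOX1", "COMP", "APOC1"]
n_tb, n_ord = 60, 80
y = np.concatenate([np.ones(n_tb), np.zeros(n_ord)])
shift = rng.normal(0.8, 0.3, len(proteins))            # assumed per-protein TB-associated shift
X = rng.normal(0, 1, (n_tb + n_ord, len(proteins)))
X[:n_tb] += shift

# Combine the proteins into a single panel score with logistic regression,
# and estimate discrimination with a cross-validated ROC AUC.
scores = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=5,
                           method="predict_proba")[:, 1]
print("cross-validated panel AUC:", round(roc_auc_score(y, scores), 3))
```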


Subject(s)
Biomarkers/blood , Coinfection/complications , HIV Infections/complications , Tuberculosis/complications , Adult , Area Under Curve , Blood Proteins/metabolism , Female , Humans , Male , Middle Aged , Tuberculosis/blood
3.
Stat Appl Genet Mol Biol ; 9: Article23, 2010.
Article in English | MEDLINE | ID: mdl-20597849

ABSTRACT

Research on analyzing microarray data has focused on the problem of identifying differentially expressed genes to the neglect of the problem of how to integrate evidence that a gene is differentially expressed with information on the extent of its differential expression. Consequently, researchers currently prioritize genes for further study either on the basis of volcano plots or, more commonly, according to simple estimates of the fold change after filtering the genes with an arbitrary statistical significance threshold. While the subjective and informal nature of the former practice precludes quantification of its reliability, the latter practice is equivalent to using a hard-threshold estimator of the expression ratio that is not known to perform well in terms of mean-squared error, the sum of estimator variance and squared estimator bias. On the basis of two distinct simulation studies and data from different microarray studies, we systematically compared the performance of several estimators representing both current practice and shrinkage. We find that the threshold-based estimators usually perform worse than the maximum-likelihood estimator (MLE) and they often perform far worse as quantified by estimated mean-squared risk. By contrast, the shrinkage estimators tend to perform as well as or better than the MLE and never much worse than the MLE, as expected from what is known about shrinkage. However, a Bayesian measure of performance based on the prior information that few genes are differentially expressed indicates that hard-threshold estimators perform about as well as the local false discovery rate (FDR), the best of the shrinkage estimators studied. Based on the ability of the latter to leverage information across genes, we conclude that the use of the local-FDR estimator of the fold change instead of informal or threshold-based combinations of statistical tests and non-shrinkage estimators can be expected to substantially improve the reliability of gene prioritization at very little risk of doing so less reliably. Since the proposed replacement of post-selection estimates with shrunken estimates applies as well to other types of high-dimensional data, it could also improve the analysis of SNP data from genome-wide association studies.
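The comparison described above turns on mean-squared error, the sum of estimator variance and squared bias. The sketch below, assuming a simple simulated-data setup, contrasts a hard-threshold estimator (keep the observed fold change only for genes passing an arbitrary significance cutoff) with the unshrunken MLE and a crude shrinkage rule; the shrinkage rule is an illustrative stand-in, not the local-FDR estimator studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate log2 fold changes: most genes null, a few truly changed (assumed proportions).
n_genes, n_reps, prop_de, sigma = 5000, 4, 0.05, 1.0
true_lfc = np.where(rng.random(n_genes) < prop_de, rng.normal(0, 2, n_genes), 0.0)
obs = true_lfc[:, None] + rng.normal(0, sigma, (n_genes, n_reps))  # replicate measurements

mean_lfc = obs.mean(axis=1)                 # MLE of the log fold change
se = obs.std(axis=1, ddof=1) / np.sqrt(n_reps)
t = mean_lfc / se

# Hard-threshold estimator: keep the MLE only for "significant" genes (|t| > 3, arbitrary cutoff).
hard = np.where(np.abs(t) > 3, mean_lfc, 0.0)

# Simple shrinkage estimator: pull each MLE toward zero by an amount tied to its noise level
# (a crude stand-in for the local-FDR-weighted estimator discussed in the abstract).
shrink_factor = np.clip(1 - se**2 / (mean_lfc**2 + se**2), 0, 1)
shrunk = shrink_factor * mean_lfc

for name, est in [("MLE", mean_lfc), ("hard threshold", hard), ("shrinkage", shrunk)]:
    mse = np.mean((est - true_lfc) ** 2)
    print(f"{name:15s} mean-squared error: {mse:.4f}")
```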


Subject(s)
Gene Expression Profiling/statistics & numerical data , Likelihood Functions , Models, Statistical , False Positive Reactions , Gene Expression Regulation , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Polymorphism, Single Nucleotide
4.
BMC Bioinformatics ; 11: 63, 2010 Jan 28.
Article in English | MEDLINE | ID: mdl-20109217

ABSTRACT

BACKGROUND: Sustained research on the problem of determining which genes are differentially expressed on the basis of microarray data has yielded a plethora of statistical algorithms, each justified by theory, simulation, or ad hoc validation and yet differing in practical results from equally justified algorithms. Recently, a concordance method that measures agreement among gene lists has been introduced to assess various aspects of differential gene expression detection. This method has the advantage of basing its assessment solely on the results of real data analyses, but because it requires examining gene lists of given sizes, it may be unstable. RESULTS: Two methodologies for assessing predictive error are described: a cross-validation method and a posterior predictive method. As a nonparametric method of estimating prediction error from observed expression levels, cross validation provides an empirical approach to assessing algorithms for detecting differential gene expression that is fully justified for large numbers of biological replicates. Because it leverages the knowledge that only a small portion of genes are differentially expressed, the posterior predictive method is expected to provide more reliable estimates of algorithm performance, allaying concerns about limited biological replication. In practice, the posterior predictive method can assess when its approximations are valid and when they are inaccurate. Under conditions in which its approximations are valid, it corroborates the results of cross validation. Both comparison methodologies are applicable to both single-channel and dual-channel microarrays. For the data sets considered, estimating prediction error by cross validation demonstrates that empirical Bayes methods based on hierarchical models tend to outperform algorithms based on selecting genes by their fold changes or by non-hierarchical model-selection criteria. (The latter two approaches have comparable performance.) The posterior predictive assessment corroborates these findings. CONCLUSIONS: Algorithms for detecting differential gene expression may be compared by estimating each algorithm's error in predicting expression ratios, whether such ratios are defined across microarray channels or between two independent groups. According to two distinct estimators of prediction error, algorithms using hierarchical models outperform the other algorithms of the study. The fact that fold-change shrinkage performed as well as conventional model-selection criteria calls for investigating algorithms that combine the strengths of significance testing and fold-change estimation.
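As a concrete illustration of the cross-validation methodology described above, the hypothetical sketch below leaves one replicate out at a time, asks two toy "algorithms" (a plain fold change and a shrunken fold change) to predict the held-out expression ratios, and compares their prediction errors. The data generation and the fixed-weight shrinkage are assumptions, not the paper's empirical Bayes models.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-group expression data on the log2 scale (values are illustrative).
n_genes, n_reps = 2000, 6
true_lfc = np.where(rng.random(n_genes) < 0.1, rng.normal(0, 1.5, n_genes), 0.0)
group_a = true_lfc[:, None] + rng.normal(0, 1, (n_genes, n_reps))
group_b = rng.normal(0, 1, (n_genes, n_reps))

def fold_change(a, b):
    return a.mean(axis=1) - b.mean(axis=1)

def shrunken_fold_change(a, b, prior_weight=3.0):
    # Shrink toward zero with a fixed prior weight (an assumed, simplistic empirical-Bayes stand-in).
    return fold_change(a, b) * (a.shape[1] / (a.shape[1] + prior_weight))

def cv_prediction_error(estimator):
    errors = []
    for k in range(n_reps):  # leave one replicate pair out at a time
        keep = [i for i in range(n_reps) if i != k]
        est = estimator(group_a[:, keep], group_b[:, keep])
        observed = group_a[:, k] - group_b[:, k]       # held-out "expression ratio"
        errors.append(np.mean((est - observed) ** 2))
    return np.mean(errors)

print("CV error, fold change:         ", round(cv_prediction_error(fold_change), 3))
print("CV error, shrunken fold change:", round(cv_prediction_error(shrunken_fold_change), 3))
```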


Subject(s)
Algorithms , Data Interpretation, Statistical , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods
5.
Bioinformatics ; 26(1): 98-103, 2010 Jan 01.
Article in English | MEDLINE | ID: mdl-19892804

ABSTRACT

MOTIVATION: Labeling techniques are increasingly used to estimate relative protein abundances in quantitative proteomic studies. These techniques require accurate measurement of correspondingly labeled peptide peak intensities to produce high-quality estimates of differential expression ratios. In mass spectrometers with counting detectors, the measurement noise varies with intensity, and accuracy increases with the number of ions detected; consequently, the relative variability of peptide intensity measurements varies with intensity. This effect must be accounted for when combining information from multiple peptides to estimate relative protein abundance. RESULTS: We examined a variety of algorithms that estimate protein differential expression ratios from multiple peptide intensity measurements. Algorithms that account for the variation of measurement error with intensity provided the most accurate estimates of differential abundance. Of all algorithms tested, a simple Sum-of-Intensities algorithm provided the best estimates of true protein ratios.
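A minimal sketch of the Sum-of-Intensities idea, using hypothetical heavy/light peptide intensities: summing intensities across a protein's peptides before taking the ratio weights each peptide by its ion count, so high-intensity peptides with lower relative counting noise dominate the estimate, in contrast to a naive mean of per-peptide ratios.

```python
import numpy as np

def sum_of_intensities_ratio(heavy, light):
    """Estimate a protein's differential expression ratio from per-peptide peak intensities.

    Summing intensities before dividing weights each peptide by its ion count, so peptides
    measured with more ions (lower relative counting noise) dominate the estimate.
    """
    heavy, light = np.asarray(heavy, float), np.asarray(light, float)
    return heavy.sum() / light.sum()

def mean_of_ratios(heavy, light):
    # Naive alternative: average the per-peptide ratios, ignoring intensity-dependent noise.
    return float(np.mean(np.asarray(heavy, float) / np.asarray(light, float)))

# Hypothetical intensities for five peptides of one protein (true ratio around 2).
heavy = [20400, 11800, 950, 310, 87]
light = [10100,  6000, 520, 130, 20]

print("sum-of-intensities estimate:", round(sum_of_intensities_ratio(heavy, light), 2))
print("mean-of-ratios estimate:    ", round(mean_of_ratios(heavy, light), 2))
```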


Subject(s)
Algorithms , Isotope Labeling/methods , Peptide Mapping/methods , Proteins/analysis , Proteins/chemistry , Amino Acid Sequence , Molecular Sequence Data , Sensitivity and Specificity
6.
Article in English | MEDLINE | ID: mdl-19163533

ABSTRACT

In high-throughput proteomics, one promising approach presently being explored is the Accurate Mass and Time (AMT) tag approach, in which reversed-phase liquid chromatography coupled to high-accuracy mass spectrometry provides measurements of both the masses and the chromatographic retention times of tryptic peptides in complex mixtures. These measurements are matched to the masses and predicted retention times of peptides in a library. There are two varieties of peptides in the library: peptides whose retention time predictions are derived from previous peptide identifications and are therefore of high precision, and peptides whose retention time predictions are derived from a sequence-based model and therefore have lower precision. We present a Bayesian statistical model that provides probability estimates for the correctness of each match by separately modeling the data distributions of correct matches and incorrect matches. For matches to peptides with high-precision retention time predictions, the model distinguishes correct matches from incorrect matches with high confidence. For matches to peptides with low-precision retention time predictions, match probabilities do not approach certainty; however, even moderate-probability matches may provide biologically interesting findings, motivating further investigation.
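The sketch below illustrates the general two-component idea of separately modeling correct and incorrect matches, assuming Gaussian mass and retention-time errors for correct matches and roughly uniform errors for incorrect matches within the search windows. The distributions, parameter values, and prior are illustrative assumptions, not the fitted model from the paper.

```python
import numpy as np
from scipy.stats import norm

def match_probability(mass_err_ppm, rt_err_min,
                      sigma_mass=2.0, sigma_rt=1.0,
                      mass_window=20.0, rt_window=20.0,
                      prior_correct=0.3):
    """Posterior probability that an AMT-tag match is correct.

    Correct matches: Gaussian mass and retention-time errors (sigmas are assumed values;
    a peptide with a high-precision retention-time prediction would get a smaller sigma_rt).
    Incorrect matches: errors treated as roughly uniform over the +/- search windows.
    """
    like_correct = norm.pdf(mass_err_ppm, 0, sigma_mass) * norm.pdf(rt_err_min, 0, sigma_rt)
    like_incorrect = (1 / (2 * mass_window)) * (1 / (2 * rt_window))
    num = prior_correct * like_correct
    return num / (num + (1 - prior_correct) * like_incorrect)

# A close match and a marginal one (numbers are hypothetical).
print(round(match_probability(0.8, 0.4), 3))   # small errors -> probability near 1
print(round(match_probability(4.5, 6.0), 3))   # larger errors -> much lower probability
```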


Subject(s)
Chromatography, Liquid/methods , Mass Spectrometry/methods , Proteomics/methods , Algorithms , Bayes Theorem , Electronic Data Processing , Humans , Monte Carlo Method , Peptides/chemistry , Probability , Reproducibility of Results , Time Factors , Trypsin/chemistry
7.
Article in English | MEDLINE | ID: mdl-18002183

ABSTRACT

In high-throughput mass spectrometry-based proteomics, it is necessary to employ separations to reduce sample complexity prior to mass spectrometric peptide identification. Interest has begun to focus on using information from these separations to aid peptide identification. One of the most common separations is reversed-phase liquid chromatography, in which peptides are separated on the basis of their chromatographic retention times. We apply a sequence-based model of peptide hydrophobicity to the problem of predicting peptide retention times, first fitting the model parameters using a large set of peptide identifications and then testing its predictions on a completely different set of peptide identifications. We demonstrate that the model not only provides reasonably accurate predictions but also quantifies the uncertainty of its predictions. The model may therefore be used to check future tentative peptide identifications, even when the peptide species in question has never been observed before.
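A minimal sketch of a sequence-based retention-time model of the kind described: each residue contributes an additive coefficient, the coefficients are fit by least squares to a set of identified peptides, and the spread of the training residuals supplies a rough prediction uncertainty. The training data here are simulated, and the model form is a generic additive composition model rather than the paper's.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
rng = np.random.default_rng(7)

def composition(peptide):
    # Feature vector: counts of each residue plus an intercept term.
    return np.array([peptide.count(a) for a in AMINO_ACIDS] + [1.0])

def fit_retention_model(peptides, observed_rt):
    """Least-squares fit of additive per-residue retention coefficients."""
    X = np.vstack([composition(p) for p in peptides])
    y = np.asarray(observed_rt, float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual_sd = float(np.std(y - X @ coef, ddof=1))
    return coef, residual_sd

def predict_rt(peptide, coef, residual_sd):
    # Point prediction plus a crude uncertainty taken from the training residual spread.
    return float(composition(peptide) @ coef), residual_sd

# Simulated "identified" peptides: random tryptic-like sequences with retention times
# generated from hidden per-residue coefficients plus noise (all values illustrative).
true_coef = rng.normal(2.0, 1.5, len(AMINO_ACIDS))
train = ["".join(rng.choice(list(AMINO_ACIDS), rng.integers(7, 20))) + "K" for _ in range(300)]
train_rt = [composition(p)[:-1] @ true_coef + 10 + rng.normal(0, 1.5) for p in train]

coef, sd = fit_retention_model(train, train_rt)
rt, err = predict_rt("LVTDLTKAGER", coef, sd)
print(f"predicted retention time: {rt:.1f} min (+/- ~{err:.1f} min)")
```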


Subject(s)
Algorithms , Chromatography, High Pressure Liquid/methods , Mass Spectrometry/methods , Peptide Mapping/methods , Peptides/chemistry , Proteome/chemistry , Proteomics/methods , Amino Acid Sequence , Molecular Sequence Data , Reproducibility of Results , Sensitivity and Specificity , Sequence Analysis, Protein/methods
8.
Proteome Sci ; 5: 3, 2007 Jan 16.
Article in English | MEDLINE | ID: mdl-17227583

ABSTRACT

BACKGROUND: Tandem mass spectrometry followed by database search is currently the predominant technology for peptide sequencing in shotgun proteomics experiments. Most methods compare experimentally observed spectra to the theoretical spectra predicted from the sequences in protein databases. There is growing interest, however, in comparing unknown experimental spectra to a library of previously identified spectra. This approach has the advantage of taking into account instrument-dependent factors and peptide-specific differences in fragmentation probabilities, and it is computationally more efficient for high-throughput proteomics studies. RESULTS: This paper investigates computational issues related to this spectral comparison approach. Different methods were empirically evaluated over several large sets of spectra. First, we illustrate that the peak intensities follow a Poisson distribution, which implies that applying a square root transform will optimally stabilize the peak intensity variance. Our results show that the square root transform did indeed outperform other transforms, resulting in improved accuracy of spectral matching. Second, we compared different measures of spectral similarity and found the correlation coefficient to be the most robust. Finally, we examined how to assemble multiple spectra associated with the same peptide into a synthetic reference spectrum. Ensemble averaging is shown to provide the best combination of accuracy and efficiency. CONCLUSION: Our results demonstrate that, when combined, these methods can boost the sensitivity and specificity of spectral comparison and are therefore capable of enhancing and complementing existing tools for consistent and accurate peptide identification.
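A toy sketch of the three steps evaluated above: square-root transformation of peak intensities (variance stabilization for Poisson-distributed ion counts), the correlation coefficient as the spectral similarity score, and ensemble averaging of replicate spectra into a consensus reference. The binning scheme and the spectra are invented for illustration.

```python
import numpy as np

def binned(spectrum, mz_min=100.0, mz_max=1000.0, bin_width=1.0):
    """Convert (m/z, intensity) peaks into a fixed-length intensity vector."""
    edges = np.arange(mz_min, mz_max + bin_width, bin_width)
    vec = np.zeros(len(edges) - 1)
    for mz, inten in spectrum:
        idx = int((mz - mz_min) // bin_width)
        if 0 <= idx < len(vec):
            vec[idx] += inten
    return vec

def similarity(query, reference):
    # Square root stabilizes the variance of Poisson-distributed ion counts,
    # then the Pearson correlation coefficient scores the match.
    q, r = np.sqrt(query), np.sqrt(reference)
    return float(np.corrcoef(q, r)[0, 1])

def consensus(spectra):
    # Ensemble average of replicate library spectra for the same peptide.
    return np.mean(np.vstack(spectra), axis=0)

# Toy replicate library spectra and a query for the "same" peptide (values hypothetical).
rng = np.random.default_rng(3)
peaks = [(175.1, 120), (262.1, 300), (375.2, 80), (504.3, 450), (617.4, 220)]
replicates = [binned([(mz, rng.poisson(i)) for mz, i in peaks]) for _ in range(5)]
library = consensus(replicates)
query = binned([(mz, rng.poisson(i)) for mz, i in peaks])
decoy = binned([(mz + 13.0, rng.poisson(i)) for mz, i in peaks])

print("score vs. consensus:", round(similarity(query, library), 3))
print("score vs. decoy:    ", round(similarity(decoy, library), 3))
```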

9.
Anal Chem ; 77(22): 7246-54, 2005 Nov 15.
Article in English | MEDLINE | ID: mdl-16285672

ABSTRACT

In high-throughput proteomics, a promising current approach is the use of liquid chromatography coupled to Fourier transform ion cyclotron resonance mass spectrometry (LC-FTICR-MS) of tryptic peptides from complex mixtures of proteins. To apply this method, it is necessary to account for any systematic measurement error, and it is useful to have an estimate of the random error expected in the measured masses. Here, we analyze by LC-FTICR-MS a complex mixture of peptides derived from a sample previously characterized by LC-QTOF-MS. Application of a Bayesian probability model of the data and partial knowledge of the composition of the sample suffice to estimate both the systematic and random errors in measured masses.
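A hedged sketch of the error-estimation idea, assuming matched-peptide mass errors in ppm: a simple Gaussian-plus-uniform mixture fit by EM separates likely correct matches from likely false ones and returns the systematic error (calibration offset) and random error. This is a crude stand-in for the Bayesian probability model described in the abstract.

```python
import numpy as np
from scipy.stats import norm

def estimate_mass_errors(ppm_errors, window=20.0, n_iter=50):
    """Estimate systematic (mean) and random (sd) mass error from matched-peptide ppm errors.

    A simple EM fit of a Gaussian (correct matches) plus uniform (false matches) mixture.
    """
    ppm = np.asarray(ppm_errors, float)
    mu, sd, w = np.median(ppm), np.std(ppm), 0.7
    for _ in range(n_iter):
        g = w * norm.pdf(ppm, mu, sd)
        u = (1 - w) / (2 * window)
        resp = g / (g + u)                      # probability each match is correct
        mu = np.sum(resp * ppm) / np.sum(resp)  # systematic error (calibration offset)
        sd = np.sqrt(np.sum(resp * (ppm - mu) ** 2) / np.sum(resp))  # random error
        w = resp.mean()
    return mu, sd

# Simulated ppm errors: mostly correct matches offset by +1.5 ppm, plus scattered false matches.
rng = np.random.default_rng(5)
errors = np.concatenate([rng.normal(1.5, 0.8, 400), rng.uniform(-20, 20, 100)])
mu, sd = estimate_mass_errors(errors)
print(f"systematic error: {mu:+.2f} ppm, random error: {sd:.2f} ppm")
```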


Subject(s)
Chromatography, Liquid/methods , Mass Spectrometry/methods , Peptides/analysis , Peptides/chemistry , Spectroscopy, Fourier Transform Infrared/methods , Amino Acid Sequence , Animals , Calibration , Molecular Sequence Data , Peptide Library , Rats
10.
J Am Soc Mass Spectrom ; 16(11): 1818-26, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16198121

ABSTRACT

Comprehensive proteomic studies that employ MS-directed peptide sequencing are limited by the quality of peptide separation and by the MS and tandem MS data acquisition routines. To identify optimal parameters for data acquisition, we developed a system that models the automatic function-switching behavior of a mass spectrometer using an MS-only dataset. Simulations were conducted to characterize the number and the quality of simulated fragmentation events as a function of the data acquisition routines and were used to construct operating curves relating tandem mass spectrum quality to the number of peptides fragmented. The results demonstrated that one can optimize for quality or quantity, with the number of peptides fragmented decreasing as quality increases. The predicted optimal operating curve indicated that significant improvements can be realized by selecting appropriate data acquisition parameters. The simulation results were confirmed experimentally by testing 10 LC-MS/MS data acquisition parameter sets on an LC-Q-TOF-MS. Database matching of the experimental fragmentation spectra returned peptide scores consistent with the predictions of the model. The simulations of mass spectrometer data acquisition routines reveal an inverse relationship between the quality and the quantity of peptide identifications and predict an optimal operating curve that can be used to select a data acquisition parameter set for a given (or any) sample.
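A toy version of the simulation concept, assuming an MS-only list of precursor intensities and an invented intensity-to-quality relationship: sweeping the intensity threshold used to select precursors for MS/MS traces the quality-versus-quantity operating curve described above.

```python
import numpy as np

rng = np.random.default_rng(9)

# MS-only survey data: precursor intensities for candidate peptides (log-normal, illustrative).
precursor_intensity = rng.lognormal(mean=8.0, sigma=1.5, size=2000)

def expected_quality(intensity):
    # Assumed relationship: MS/MS spectral quality rises with precursor ion count and saturates.
    return intensity / (intensity + 5000.0)

print(f"{'threshold':>10} {'peptides fragmented':>20} {'mean quality':>13}")
for threshold in [500, 1000, 2000, 5000, 10000, 20000, 50000]:
    selected = precursor_intensity[precursor_intensity >= threshold]
    n = len(selected)
    quality = expected_quality(selected).mean() if n else float("nan")
    # Higher thresholds -> fewer peptides fragmented but higher-quality spectra:
    # the points trace the quality-vs-quantity operating curve described in the abstract.
    print(f"{threshold:>10} {n:>20} {quality:>13.2f}")
```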


Subject(s)
Algorithms , Gene Expression Profiling/methods , Mass Spectrometry/methods , Microsomes, Liver/metabolism , Peptide Mapping/methods , Peptides/chemistry , Sequence Analysis, Protein/methods , Animals , Artificial Intelligence , Cells, Cultured , Peptides/analysis , Rats