ABSTRACT
Alignment of peaks across samples is a difficult but unavoidable step in the data analysis for all analytical techniques containing a separation step, such as chromatography. Important application areas are metabolomics and proteomics. Parametric time warping (PTW) has already been shown to be very useful in these fields because the highly restricted form of its warping functions avoids overfitting. Here, we describe a new formulation of PTW that works on peak-picked features rather than on complete profiles. Not only does this allow for much smoother integration into existing pipelines, it also speeds up the algorithm (already among the fastest) by orders of magnitude. Using two publicly available datasets, we show the potential of the new approach. The first is an LC-DAD dataset of grape samples, and the second an LC-MS dataset of apple extracts. AVAILABILITY AND IMPLEMENTATION: Parametric time warping of peak lists is implemented in the ptw package, version 1.9.1 and onwards, available from GitHub (https://github.com/rwehrens/ptw) and CRAN (http://cran.r-project.org). The package also contains a vignette, providing more theoretical details and scripts to reproduce the results below. CONTACT: ron.wehrens@wur.nl.
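The idea of warping peak lists instead of full profiles can be illustrated with a minimal Python sketch. It is not the ptw package's implementation: the Gaussian overlap criterion, the grid search, the restriction to a linear warping function, and all names (`warp`, `overlap`, `fit_linear_warp`) are assumptions chosen for illustration, and the peak lists are made-up numbers.

```python
import math
from itertools import product


def warp(t, coef):
    """Polynomial warping function w(t) = c0 + c1*t + c2*t**2 + ..."""
    return sum(c * t ** k for k, c in enumerate(coef))


def overlap(ref_peaks, sample_peaks, coef, sigma=0.5):
    """Similarity between warped sample peaks and reference peaks.

    Each peak is a (retention_time, intensity) pair. Every sample/reference
    pair contributes a Gaussian-weighted product of intensities, so only
    peaks that end up close together after warping add to the score.
    """
    total = 0.0
    for ts, hs in sample_peaks:
        tw = warp(ts, coef)
        for tr, hr in ref_peaks:
            total += hs * hr * math.exp(-((tw - tr) ** 2) / (2 * sigma ** 2))
    return total


def fit_linear_warp(ref_peaks, sample_peaks, shifts, stretches, sigma=0.5):
    """Grid search for the (shift, stretch) pair that maximizes the overlap."""
    return max(
        product(shifts, stretches),
        key=lambda coef: overlap(ref_peaks, sample_peaks, coef, sigma),
    )


# Hypothetical reference peak list: (retention time, intensity).
ref = [(2.0, 10.0), (5.0, 4.0), (9.0, 7.0), (14.0, 3.0)]

# Hypothetical sample: the same peaks, distorted so that
# w(t) = 0.3 + 1.02 * t maps them back onto the reference.
sample = [((t - 0.3) / 1.02, h) for t, h in ref]

shifts = [i / 10 for i in range(-5, 6)]        # -0.5 ... 0.5
stretches = [s / 100 for s in range(98, 103)]  # 0.98 ... 1.02

best = fit_linear_warp(ref, sample, shifts, stretches)
print(best)
```

Because the criterion is evaluated on a handful of peaks rather than on thousands of profile points, each evaluation is cheap, which is the intuition behind the speed-up claimed in the abstract. A real implementation would use a proper optimizer and allow higher-order polynomials.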
Subjects
Algorithms , Carotenoids/analysis , Liquid Chromatography/methods , Mass Spectrometry/methods , Vitis/chemistry , Metabolomics/methods , Proteomics/methods
ABSTRACT
An important prerequisite for the development and benchmarking of novel analysis methods is a well-designed, comprehensive LC-MS/MS data set. Here, we present our data set consisting of 59 LC-MS/MS analyses of 50 protein samples extracted individually from Escherichia coli K12 and spiked with different concentrations of bovine carbonic anhydrase II and/or chicken ovalbumin, according to a 2 × 3 full factorial design. Using the well-annotated and commonly used E. coli proteome as the sample background ensures that the complexity of the data is on a par with most current proteomic analyses. Data were acquired over a 2-month period using multiple reversed-phase columns and instrument calibrations to include real-life challenges faced when analyzing large proteomics data sets. Moreover, so-called "ground truth" data, consisting of LC-MS/MS measurements of the pure spikes, are included in the data set. The current manuscript describes this comprehensive benchmark data set for future development and evaluation of analysis methods and software.
Subjects
Liquid Chromatography/methods , Protein Databases , Proteome/chemistry , Proteomics/methods , Tandem Mass Spectrometry/methods , Animals , Carbonic Anhydrase II/chemistry , Cattle , Chickens , Escherichia coli Proteins/chemistry , Ovalbumin/chemistry , Peptide Fragments/chemistry
ABSTRACT
The identification of differential patterns in data originating from combined measurement techniques such as LC/MS is pivotal to proteomics. Although "shotgun proteomics" has been employed successfully to this end, this method also has severe drawbacks, because of its dependence on largely untargeted MS/MS sequencing and databases for statistical analyses. Alternatively, several MS-signal-based (MS/MS-independent) methods have been published that are mainly based on (univariate) Student's t-tests. Here, we present a more robust multivariate alternative employing linear discriminant analysis. Like the t-test-based methods, it is applied directly to LC/MS data, instead of using MS/MS measurements. We demonstrate the method on a number of simulated data sets, as well as on a spike-in LC/MS data set, and show its superior performance over t-tests.
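Why a multivariate discriminant can outperform feature-by-feature tests can be seen in a small Python sketch. It is an illustration under stated assumptions, not the method of the paper: the data are invented two-feature "intensities", and `fisher_direction` is a hypothetical helper that computes the classical two-class Fisher discriminant direction.

```python
def mean(values):
    return sum(values) / len(values)


def fisher_direction(class_a, class_b):
    """Fisher discriminant direction w = S^-1 (mean_a - mean_b) for 2-D points.

    S is the pooled within-class scatter matrix; its 2x2 inverse is written
    out in closed form to keep the example dependency-free.
    """
    ma = [mean([p[k] for p in class_a]) for k in range(2)]
    mb = [mean([p[k] for p in class_b]) for k in range(2)]
    sxx = syy = sxy = 0.0
    for points, m in ((class_a, ma), (class_b, mb)):
        for x, y in points:
            dx, dy = x - m[0], y - m[1]
            sxx += dx * dx
            syy += dy * dy
            sxy += dx * dy
    det = sxx * syy - sxy * sxy
    d0, d1 = ma[0] - mb[0], ma[1] - mb[1]
    return ((syy * d0 - sxy * d1) / det, (sxx * d1 - sxy * d0) / det)


def project(points, w):
    """Project each 2-D point onto the discriminant direction w."""
    return [w[0] * x + w[1] * y for x, y in points]


# Invented data: two strongly correlated features. Feature 1 has the same
# mean in both classes, and the class ranges of feature 2 overlap, so
# neither feature separates the classes on its own.
class_a = [(1.0, 1.6), (2.0, 2.4), (3.0, 3.6), (4.0, 4.4)]
class_b = [(1.0, 0.4), (2.0, 1.6), (3.0, 2.4), (4.0, 3.6)]

w = fisher_direction(class_a, class_b)
scores_a = project(class_a, w)
scores_b = project(class_b, w)
print(w, scores_a, scores_b)
```

In this toy case the discriminant direction effectively uses the difference between the two features, so the projected scores of the two classes no longer overlap, whereas a univariate test on either feature alone would see little or no difference.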
Subjects
Biomarkers/metabolism , Discriminant Analysis , Proteomics , Liquid Chromatography , Humans , Mass Spectrometry
ABSTRACT
The crystal structure of the environmentally friendly flame retardant melaminium polyphosphate (MPoly), (2,4,6-triamino-1,3,5-triazinium·PO3)n, was determined by a direct-space global optimization technique from X-ray powder diffraction data. Solid-state NMR was used to corroborate the proposed hydrogen-bonding model and to determine the average degree of polymerization (n > 100). An analysis of the crystal structure of MPoly reveals aspects of molecular geometry and packing that are characteristic of melamine-containing compounds and polyphosphate salts. A comparison of MPoly with the crystal structures of its precursors melaminium orthophosphate (MP) and melaminium dihydrogenpyrophosphate (MPy) provides insight into the mechanism of the endothermic dehydration processes that take place along the reaction path MP → MPy → MPoly. Solid-state NMR characterization of various samples from the same batch showed inhomogeneities in the MPoly composition. Various quantities of orthophosphates were found, which cannot be assigned to MP.
ABSTRACT
Warping methods are an important class of methods that can correct for misalignments in, among others, chemical measurements. Their use in preprocessing of chromatographic, spectroscopic and spectrometric data has grown rapidly over the last decade. This tutorial review aims to give a critical introduction to the most important warping methods, the place of warping in preprocessing and current views on the related matters of reference selection, optimization, and evaluation. Some pitfalls in warping, notably for liquid chromatography-mass spectrometry (LC-MS) and similar data, will be discussed. Examples will be given of the application of a number of freely available warping methods to a nuclear magnetic resonance (NMR) spectroscopic dataset and a chromatographic dataset. As part of the Supporting Information, we provide a number of programming scripts in Matlab and R, allowing the reader to work through the extended examples in detail and to reproduce the figures in this paper.
ABSTRACT
In processive catalysis, a catalyst binds to a substrate and remains bound as it performs several consecutive reactions, as exemplified by DNA polymerases. Processivity is essential in nature and is often mediated by a clamp-like structure that physically tethers the catalyst to its (polymeric) template. In the case of the bacteriophage T4 replisome, a dedicated clamp protein acts as a processivity mediator by encircling DNA and subsequently recruiting its polymerase. Here we use this DNA-binding protein to construct a biohybrid catalyst. Conjugation of the clamp protein to a chemical catalyst with sequence-specific oxidation behaviour formed a catalytic clamp that can be loaded onto a DNA plasmid. The catalytic activity of the biohybrid catalyst was visualized using a procedure based on an atomic force microscopy method that detects and spatially locates oxidized sites in DNA. Varying the experimental conditions enabled switching between processive and distributive catalysis and influencing the sliding direction of this rotaxane-like catalyst.
Subjects
Coordination Complexes/chemistry , DNA/chemistry , Oligopeptides/chemistry , Base Sequence , Catalysis , DNA Damage , Atomic Force Microscopy , Molecular Models , Oxidation-Reduction
ABSTRACT
The peaks of magnetic resonance (MR) spectra can be shifted due to variations in physiological and experimental conditions, and correcting for misaligned peaks is an important part of data processing prior to multivariate analysis. In this paper, five warping algorithms (icoshift, COW, fastpa, VPdtw and PTW) are compared for their feasibility in aligning spectral peaks in three sets of high resolution magic angle spinning (HR-MAS) MR spectra with different degrees of misalignments, and their merits are discussed. In addition, extraction of information that might be present in the shifts is examined, both for simulated data and the real MR spectra. The generic evaluation methodology employs a number of frequently used quality criteria for evaluation of the alignments, together with PLS-DA to assess the influence of alignment on the classification outcome. Peak alignment greatly improved the internal similarity of the data sets. Especially icoshift and COW seem suitable for aligning HR-MAS MR spectra, possibly because they perform alignment segment-wise. The choice of reference spectrum can influence the alignment result, and it is advisable to test several references. Information from the peak shifts was extracted, and in one case cancer samples were successfully discriminated from normal tissue based on shift information only. Based on these findings, general recommendations for alignment of HR-MAS MRS data are presented. Where possible, observations are generalized to other data types (e.g. chromatographic data).
Subjects
Breast Neoplasms/chemistry , Breast Neoplasms/diagnosis , Colonic Neoplasms/chemistry , Colonic Neoplasms/diagnosis , Computer-Assisted Diagnosis/methods , Magnetic Resonance Spectroscopy/methods , Sensitivity and Specificity , Uterine Cervical Neoplasms/chemistry , Uterine Cervical Neoplasms/diagnosis , Algorithms , Female , Humans , Multivariate Analysis , Reproducibility of Results , Statistics as Topic
ABSTRACT
An unprecedented rhodium-catalyzed stereoselective polymerization of "carbenes" from ethyl diazoacetate (EDA) to give high molecular mass poly(ethyl 2-ylidene-acetate) is described. The mononuclear, neutral [(N,O-ligand)M(I)(cod)] (M = Rh, Ir) catalytic precursors for this reaction are characterized by, among other techniques, single-crystal X-ray diffraction. These species mediate formation of a new type of polymer from EDA: carbon-chain polymers functionalized with a polar substituent at each carbon of the polymer backbone. The polymers are obtained as white powders with surprisingly sharp NMR resonances. Solution and solid-state NMR data for these new polymers reveal a highly stereoregular polymer with a high degree of crystallinity. The polymer is likely syndiotactic. Its material properties are very different from those of atactic poly(diethyl fumarate) obtained by radical polymerization of diethyl fumarate. Other diazoacetates are also polymerized. Further studies are underway to reveal possible applications of these new materials.