Results 1 - 20 of 46
1.
Mol Cell Proteomics ; 22(10): 100644, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37689310

ABSTRACT

Cullin-RING finger ligases represent the largest family of ubiquitin ligases. They are responsible for the ubiquitination of ∼20% of cellular proteins degraded through the proteasome, by catalyzing the transfer of E2-loaded ubiquitin to a substrate. Seven cullins are described in vertebrates. Among them, cullin 4 (CUL4) associates with DNA damage-binding protein 1 (DDB1) to form the CUL4-DDB1 ubiquitin ligase complex, which is involved in protein ubiquitination and in the regulation of many cellular processes. Substrate recognition adaptors named DDB1/CUL4-associated factors (DCAFs) mediate the specificity of CUL4-DDB1 and have a short structural motif of approximately forty amino acids terminating in tryptophan (W)-aspartic acid (D) dipeptide, called the WD40 domain. Using different approaches (bioinformatics/structural analyses), independent studies suggested that at least sixty WD40-containing proteins could act as adaptors for the DDB1/CUL4 complex. To better define this association and classification, the interaction of each DCAF with DDB1 was determined, and new partners and potential substrates were identified. Using BioID and affinity purification-mass spectrometry approaches, we demonstrated that seven WD40 proteins can be considered DCAFs with a high confidence level. Identifying protein interactions does not always lead to identifying protein substrates for E3-ubiquitin ligases, so we measured changes in protein stability or degradation by pulse-stable isotope labeling with amino acids in cell culture to identify changes in protein degradation, following the expression of each DCAF. In conclusion, these results provide new insights into the roles of DCAFs in regulating the activity of the DDB1-CUL4 complex and in protein targeting, and they characterize the cellular processes involved.

2.
Proteomics ; 23(18): e2200406, 2023 09.
Article in English | MEDLINE | ID: mdl-37357151

ABSTRACT

In discovery proteomics, as well as many other "omic" approaches, the possibility to test for the differential abundance of hundreds (or thousands) of features simultaneously is appealing, despite requiring specific statistical safeguards, among which controlling for the false discovery rate (FDR) has become standard. Moreover, when more than two biological conditions or group treatments are considered, it has become customary to rely on the one-way analysis of variance (ANOVA) framework, where a first global differential abundance landscape provided by an omnibus test can be subsequently refined using various post-hoc tests (PHTs). However, the interactions between the FDR control procedures and the PHTs are complex, because both correspond to different types of multiple test corrections (MTCs). This article surveys various ways to orchestrate them in a data processing workflow and discusses their pros and cons.


Subject(s)
Proteomics , Proteomics/methods , Analysis of Variance
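
As a rough illustration of the omnibus test mentioned in the abstract above, the one-way ANOVA F statistic for a single feature can be computed as below. This is a minimal Python sketch, not code from the article: the function name `one_way_anova_f` and the toy abundances are hypothetical, and converting F into a p-value (and then applying FDR control across features) is deliberately left out.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of measurements."""
    k = len(groups)                          # number of conditions
    n = sum(len(g) for g in groups)          # total number of measurements
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of the measurements around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical log-abundances of one protein in three conditions.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F only says that at least one condition differs; identifying which one is the job of the post-hoc tests, which is where the interplay with FDR control discussed in the abstract arises.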
3.
J Proteome Res ; 21(7): 1783-1786, 2022 07 01.
Article in English | MEDLINE | ID: mdl-35687673

ABSTRACT

In their recent review (J. Proteome Res. 2022, 21 (4), 849-864), Crook et al. diligently discuss the basics (and the less basic aspects) of Bayesian modeling, survey its various applications to proteomics, and highlight its potential for the improvement of computational proteomic tools. Although the review is interesting and comprehensive on these aspects, it hardly introduces proteomic investigators to the pitfalls and risks of Bayesian approaches. Among them, one is sufficiently important to be brought to attention: namely, the possibility that priors introduced at an early stage of the computational investigations detrimentally influence the final statistical significance.


Subject(s)
Artificial Intelligence , Proteomics , Bayes Theorem , Computational Biology , Proteome/genetics
4.
J Proteome Res ; 21(12): 2840-2845, 2022 12 02.
Article in English | MEDLINE | ID: mdl-36305797

ABSTRACT

In their recent article, Madej et al. (Madej, D.; Wu, L.; Lam, H. Common Decoy Distributions Simplify False Discovery Rate Estimation in Shotgun Proteomics. J. Proteome Res. 2022, 21 (2), 339-348) proposed an original way to solve the recurrent issue of controlling for the false discovery rate (FDR) in peptide-spectrum-match (PSM) validation. Briefly, they proposed to derive a single precise distribution of decoy matches termed the Common Decoy Distribution (CDD) and to use it to control for FDR during a target-only search. Conceptually, this approach is appealing as it takes the best of two worlds, i.e., decoy-based approaches (which leverage a large-scale collection of empirical mismatches) and decoy-free approaches (which are not subject to the randomness of decoy generation while sparing an additional database search). Interestingly, CDD also corresponds to a middle-of-the-road approach in statistics with respect to the two main families of FDR control procedures: Although historically based on estimating the false-positive distribution, FDR control has recently been demonstrated to be possible thanks to competition between the original variables (in proteomics, target sequences) and their fictional counterparts (in proteomics, decoys). Discriminating between these two theoretical trends is of prime importance for computational proteomics. In addition to highlighting why proteomics was a source of inspiration for theoretical biostatistics, it provides practical insights into the improvements that can be made to FDR control methods used in proteomics, including CDD.


Subject(s)
Algorithms , Tandem Mass Spectrometry , Databases, Protein , Tandem Mass Spectrometry/methods , Proteomics/methods , Peptides
5.
Bioinformatics ; 37(17): 2770-2771, 2021 Sep 09.
Article in English | MEDLINE | ID: mdl-33538793

ABSTRACT

SUMMARY: Many factors can influence results in clinical research, in particular bias in the distribution of samples prior to biochemical preparation. Well Plate Maker is a user-friendly application to design single- or multiple-well plate assays. It allows multiple group experiments to be randomized and therefore helps to reduce possible batch effects. Although primarily designed to optimize the design of clinical sample analysis by high throughput mass spectrometry (e.g. proteomics or metabolomics), it includes multiple options to limit edge-of-plate effects, to incorporate control samples or to limit cross-contamination. It thus fits the constraints of many experimental fields. AVAILABILITY AND IMPLEMENTATION: Well Plate Maker is implemented in R and available from the Bioconductor repository (https://bioconductor.org/packages/wpm) under the open source Artistic 2.0 license. In addition to classical scripting, it can be used through a graphical user interface, developed with Shiny technology.
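
To make the idea of a randomized plate layout concrete, here is a minimal Python sketch. It is not Well Plate Maker's algorithm or API (the tool itself is an R package): the function `randomize_plate`, its parameters, and the 8 x 12 plate default are assumptions made for illustration, and only plain random assignment with optional exclusion of edge wells is shown.

```python
import random


def randomize_plate(sample_ids, n_rows=8, n_cols=12, avoid_edges=True, seed=None):
    """Assign each sample to a distinct, randomly chosen well label such as 'B3'."""
    rng = random.Random(seed)  # a seed makes the layout reproducible
    wells = []
    for r in range(n_rows):
        for c in range(n_cols):
            on_edge = r in (0, n_rows - 1) or c in (0, n_cols - 1)
            if avoid_edges and on_edge:
                continue  # skip border wells to limit edge-of-plate effects
            wells.append(f"{chr(ord('A') + r)}{c + 1}")
    if len(sample_ids) > len(wells):
        raise ValueError("more samples than usable wells")
    rng.shuffle(wells)
    return dict(zip(sample_ids, wells))


# Hypothetical identifiers for 40 samples on a 96-well plate.
layout = randomize_plate([f"S{i}" for i in range(40)], seed=1)
```

Because the wells are shuffled independently of the sample order, group membership is not tied to plate position, which is the property that helps against batch effects.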

6.
BMC Bioinformatics ; 22(1): 68, 2021 Feb 12.
Article in English | MEDLINE | ID: mdl-33579189

ABSTRACT

BACKGROUND: The clustering of data produced by liquid chromatography coupled to mass spectrometry analyses (LC-MS data) has recently gained interest to extract meaningful chemical or biological patterns. However, recent instrumental pipelines deliver data whose size, dimensionality and expected number of clusters are too large to be processed by classical machine learning algorithms, so that most of the state-of-the-art relies on single pass linkage-based algorithms. RESULTS: We propose a clustering algorithm that solves the powerful but computationally demanding kernel k-means objective function in a scalable way. As a result, it can process LC-MS data in an acceptable time on a multicore machine. To do so, we combine three essential features: a compressive data representation, Nyström approximation and a hierarchical strategy. In addition, we propose new kernels based on optimal transport, which can be interpreted as intuitive similarity measures between chromatographic elution profiles. CONCLUSIONS: Our method, referred to as CHICKN, is evaluated on proteomics data produced in our lab, as well as on benchmark data coming from the literature. From a computational viewpoint, it is particularly efficient on raw LC-MS data. From a data analysis viewpoint, it provides clusters which differ from those resulting from state-of-the-art methods, while achieving similar performance. This highlights the complementarity of algorithms based on different principles to extract the best from complex LC-MS data.


Subject(s)
Algorithms , Cluster Analysis , Peptides , Proteomics , Chromatography, Liquid , Data Compression , Mass Spectrometry , Peptides/chemistry , Proteomics/methods
7.
Anal Chem ; 92(22): 14898-14906, 2020 11 17.
Article in English | MEDLINE | ID: mdl-32970414

ABSTRACT

In bottom-up discovery proteomics, target-decoy competition (TDC) is the most popular method for false discovery rate (FDR) control. Despite unquestionable statistical foundations, this method has drawbacks, including its hitherto unknown intrinsic lack of stability vis-à-vis practical conditions of application. Although some consequences of this instability have already been empirically described, they may have been misinterpreted. This article provides evidence that TDC has become less reliable as the accuracy of modern mass spectrometers improved. We therefore propose to replace TDC by a totally different method to control the FDR at the spectrum, peptide, and protein levels, while benefiting from the theoretical guarantees of the Benjamini-Hochberg framework. As this method is simpler to use, faster to compute, and more stable than TDC, we argue that it is better adapted to the standardization and throughput constraints of current proteomic platforms.


Subject(s)
Mass Spectrometry , Peptides/metabolism , Proteins/metabolism , Proteomics/methods , Reproducibility of Results
8.
Biostatistics ; 20(4): 632-647, 2019 10 01.
Article in English | MEDLINE | ID: mdl-29917055

ABSTRACT

We propose a new hypothesis test for the differential abundance of proteins in mass-spectrometry based relative quantification. An important feature of this type of high-throughput analyses is that it involves an enzymatic digestion of the sample proteins into peptides prior to identification and quantification. Because of numerous homologous sequences, different proteins can lead to peptides with identical amino acid chains, so that their parent protein is ambiguous. These so-called shared peptides make the protein-level statistical analysis a challenge and are often not accounted for. In this article, we use a linear model describing peptide-protein relationships to build a likelihood ratio test of differential abundance for proteins. We show that the likelihood ratio statistic can be computed in time linear in the number of peptides. We also provide the asymptotic null distribution of a regularized version of our statistic. Experiments on both real and simulated datasets show that our procedures outperform state-of-the-art methods. The procedures are available via the pepa.test function of the DAPAR Bioconductor R package.


Subject(s)
Biostatistics/methods , Models, Statistical , Peptides , Proteomics/methods , Humans
9.
J Proteome Res ; 18(1): 571-573, 2019 01 04.
Article in English | MEDLINE | ID: mdl-30394750

ABSTRACT

The term "spectral clustering" is sometimes used to refer to the clustering of mass spectrometry data. However, it also classically refers to a family of popular clustering algorithms. To avoid confusion, a more specific term could advantageously be coined.


Subject(s)
Cluster Analysis , Mass Spectrometry/methods , Terminology as Topic , Algorithms , Proteomics/methods
10.
J Proteome Res ; 17(1): 12-22, 2018 01 05.
Article in English | MEDLINE | ID: mdl-29067805

ABSTRACT

The vocabulary of theoretical statistics can be difficult to embrace from the viewpoint of computational proteomics research, even though the notions it conveys are essential to publication guidelines. For example, "adjusted p-values", "q-values", and "false discovery rates" are essentially similar concepts, whereas "false discovery rate" and "false discovery proportion" must not be confused, even though "rate" and "proportion" are related in everyday language. In the interdisciplinary context of proteomics, such subtleties may cause misunderstandings. This article aims to provide an easy-to-understand explanation of these four notions (and a few other related ones). Their statistical foundations are dealt with from a perspective that largely relies on intuition, addressing mainly protein quantification but also, to some extent, peptide identification. In addition, a clear distinction is made between concepts that define an individual property (i.e., related to a peptide or a protein) and those that define a set property (i.e., related to a list of peptides or proteins).


Subject(s)
False Positive Reactions , Proteomics/statistics & numerical data , Vocabulary
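
The distinction drawn in the abstract above between an individual property (an adjusted p-value attached to one protein) and a set property (the proportion of false discoveries in one concrete list) can be sketched in Python. The code below uses the standard Benjamini-Hochberg adjustment; the function names and the toy values are hypothetical, and the set of true nulls is of course only known in a simulation.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def false_discovery_proportion(selected, true_nulls):
    """Share of the selected items that are actually true nulls (a set property)."""
    if not selected:
        return 0.0
    return sum(1 for i in selected if i in true_nulls) / len(selected)


adjusted = bh_adjust([0.01, 0.04, 0.03, 0.005])                 # one value per protein
fdp = false_discovery_proportion({0, 2, 3}, true_nulls={2})     # one value per list
```

Selecting every protein whose adjusted p-value is below 5% controls the false discovery rate, that is, the expected value of the false discovery proportion; the proportion realized in any single list can be higher or lower.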
11.
Bioinformatics ; 33(1): 135-136, 2017 01 01.
Article in English | MEDLINE | ID: mdl-27605098

ABSTRACT

DAPAR and ProStaR are software tools to perform the statistical analysis of label-free XIC-based quantitative discovery proteomics experiments. DAPAR contains procedures to filter, normalize, impute missing values, aggregate peptide intensities, perform null hypothesis significance tests and select the most likely differentially abundant proteins with a corresponding false discovery rate. ProStaR is a graphical user interface that allows friendly access to the DAPAR functionalities through a web browser. AVAILABILITY AND IMPLEMENTATION: DAPAR and ProStaR are implemented in the R language and are available on the website of the Bioconductor project (http://www.bioconductor.org/). A complete tutorial and a toy dataset accompany the packages. CONTACT: samuel.wieczorek@cea.fr, florence.combes@cea.fr, thomas.burger@cea.fr.


Subject(s)
Peptides/chemistry , Proteins/chemistry , Proteomics/methods , Software
14.
Proteomics ; 16(14): 1955-60, 2016 07.
Article in English | MEDLINE | ID: mdl-27272648

ABSTRACT

Selecting proteins with significant differential abundance is the cornerstone of many relative quantitative proteomics experiments. To do so, a trade-off between p-value thresholding and fold-change thresholding can be performed thanks to a specific parameter, called the fudge factor and classically denoted s0. We have observed that this fudge factor is routinely diverted from its original (and statistically valid) use, leading to important distortion in the distribution of p-values, jeopardizing the protein differential analysis, as well as the subsequent biological conclusion. In this article, we provide a comprehensive viewpoint on this issue, as well as some guidelines to circumvent it.


Subject(s)
Algorithms , Proteins/isolation & purification , Proteome/isolation & purification , Proteomics/statistics & numerical data , Analysis of Variance , Data Interpretation, Statistical , Datasets as Topic , Protein Folding , Proteomics/methods
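
To show where a fudge factor enters a test statistic, here is a small Python sketch of a two-sample statistic whose denominator is inflated by s0, in the style of SAM-like moderated statistics. This is an assumption about the general form only, not the article's code; the function name `moderated_statistic` and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def moderated_statistic(group_a, group_b, s0=0.0):
    """Two-sample statistic (mean difference) / (standard error + s0)."""
    na, nb = len(group_a), len(group_b)
    # Pooled variance, as in the classical equal-variance Student t-test.
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    std_err = sqrt(pooled_var * (1 / na + 1 / nb))
    # With s0 = 0 this is the Student t statistic; s0 > 0 shrinks it toward 0.
    return (mean(group_a) - mean(group_b)) / (std_err + s0)


t_plain = moderated_statistic([1, 2, 3], [4, 5, 6])           # Student t
t_fudged = moderated_statistic([1, 2, 3], [4, 5, 6], s0=1.0)  # shrunk statistic
```

With s0 > 0 the statistic no longer follows a Student distribution under the null hypothesis, so reading p-values from that distribution anyway is one way the p-value distribution can become distorted, in line with the concern expressed in the abstract.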
15.
Proteomics ; 16(1): 29-32, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26572953

ABSTRACT

In MS-based quantitative proteomics, the FDR control (i.e. the limitation of the number of proteins that are wrongly claimed as differentially abundant between several conditions) is a major postanalysis step. It is classically achieved thanks to a specific statistical procedure that computes the adjusted p-values of the putative differentially abundant proteins. Unfortunately, such adjustment is conservative only if the p-values are well-calibrated; the false discovery control being spuriously underestimated otherwise. However, well-calibration is a property that can be violated in some practical cases. To overcome this limitation, we propose a graphical method to straightforwardly and visually assess the p-value well-calibration, as well as the R codes to embed it in any pipeline. All MS data have been deposited in the ProteomeXchange with identifier PXD002370 (http://proteomecentral.proteomexchange.org/dataset/PXD002370).


Subject(s)
Mass Spectrometry/methods , Proteomics/methods , Calibration , Computer Graphics , Proteins/chemistry
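
A very simple way to eyeball p-value calibration is a histogram: p-values of true nulls are uniformly distributed when the test is well calibrated, so the histogram should look roughly flat away from zero. The Python sketch below only counts p-values per bin; it is not the graphical method proposed in the article, and the function name `pvalue_histogram` is hypothetical.

```python
def pvalue_histogram(pvalues, n_bins=10):
    """Count p-values falling in each of n_bins equal-width bins over [0, 1]."""
    counts = [0] * n_bins
    for p in pvalues:
        # p == 1.0 would index one past the end, so clamp it into the last bin.
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts


# Hypothetical p-values: a peak near 0 suggests differentially abundant proteins.
counts = pvalue_histogram([0.001, 0.004, 0.02, 0.31, 0.48, 0.55, 0.72, 0.93])
for b, c in enumerate(counts):
    print(f"[{b / 10:.1f}, {(b + 1) / 10:.1f}) " + "#" * c)
```

A histogram that rises toward 1, or shows bumps in the middle, is a warning sign that the p-values are not well calibrated and that the adjusted p-values may not be trustworthy.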
16.
J Proteome Res ; 15(4): 1116-25, 2016 Apr 01.
Article in English | MEDLINE | ID: mdl-26906401

ABSTRACT

Missing values are a genuine issue in label-free quantitative proteomics. Recent works have surveyed the different statistical methods to conduct imputation and have compared them on real or simulated data sets and recommended a list of missing value imputation methods for proteomics application. Although insightful, these comparisons do not account for two important facts: (i) depending on the proteomics data set, the missingness mechanism may be of different natures and (ii) each imputation method is devoted to a specific type of missingness mechanism. As a result, we believe that the question at stake is not to find the most accurate imputation method in general but instead the most appropriate one. We describe a series of comparisons that support our views: For instance, we show that a supposedly "under-performing" method (i.e., giving baseline average results), if applied at the "appropriate" time in the data-processing pipeline (before or after peptide aggregation) on a data set with the "appropriate" nature of missing values, can outperform a blindly applied, supposedly "better-performing" method (i.e., the reference method from the state-of-the-art). This leads us to formulate a few practical guidelines regarding the choice and the application of an imputation method in a proteomics context.


Subject(s)
Adenocarcinoma/chemistry , Carcinoma, Non-Small-Cell Lung/chemistry , Lung Neoplasms/chemistry , Neoplasm Proteins/analysis , Peptides/analysis , Proteomics/statistics & numerical data , Adenocarcinoma/diagnosis , Adenocarcinoma/metabolism , Algorithms , Carcinoma, Non-Small-Cell Lung/diagnosis , Carcinoma, Non-Small-Cell Lung/metabolism , Computer Simulation , Data Interpretation, Statistical , Datasets as Topic , Humans , Lung Neoplasms/diagnosis , Lung Neoplasms/metabolism , Mass Spectrometry/statistics & numerical data , Neoplasm Proteins/metabolism , Peptides/metabolism
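
The point that each imputation method suits one missingness mechanism can be illustrated with a deliberately naive Python sketch. It assumes the two mechanisms usually distinguished in the statistics literature, missing completely at random (MCAR) and missing not at random (MNAR, for example intensities below the detection limit); the abstract itself does not name them. The function `impute` and the fill rules are hypothetical simplifications, not the methods compared in the article.

```python
def impute(intensities, mechanism):
    """Fill None entries with a value suited to the assumed missingness mechanism."""
    observed = [v for v in intensities if v is not None]
    if not observed:
        raise ValueError("no observed value to impute from")
    if mechanism == "MCAR":
        # Values are missing at random: a central value is a reasonable guess.
        fill = sum(observed) / len(observed)
    elif mechanism == "MNAR":
        # Values are missing because they are low: use a low observed value.
        fill = min(observed)
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    return [fill if v is None else v for v in intensities]


# Hypothetical log-intensities of one peptide across four samples.
as_mcar = impute([20.0, None, 22.0, 24.0], "MCAR")
as_mnar = impute([20.0, None, 22.0, 24.0], "MNAR")
```

The same missing entry receives a different value under each assumption, so applying the rule for the wrong mechanism biases the data in a predictable direction.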
17.
Mol Cell Proteomics ; 13(8): 1937-52, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24846987

ABSTRACT

Quantitative mass-spectrometry-based spatial proteomics involves elaborate, expensive, and time-consuming experimental procedures, and considerable effort is invested in the generation of such data. Multiple research groups have described a variety of approaches for establishing high-quality proteome-wide datasets. However, data analysis is as critical as data production for reliable and insightful biological interpretation, and no consistent and robust solutions have been offered to the community so far. Here, we introduce the requirements for rigorous spatial proteomics data analysis, as well as the statistical machine learning methodologies needed to address them, including supervised and semi-supervised machine learning, clustering, and novelty detection. We present freely available software solutions that implement innovative state-of-the-art analysis pipelines and illustrate the use of these tools through several case studies involving multiple organisms, experimental designs, mass spectrometry platforms, and quantitation techniques. We also propose sound analysis strategies for identifying dynamic changes in subcellular localization by comparing and contrasting data describing different biological conditions. We conclude by discussing future needs and developments in spatial proteomics data analysis.


Subject(s)
Data Interpretation, Statistical , Proteomics/methods , Artificial Intelligence , Mass Spectrometry , Software , Sound
18.
Mol Cell Proteomics ; 13(8): 2147-67, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24872594

ABSTRACT

Photosynthesis has shaped atmospheric and ocean chemistries and probably changed the climate as well, as oxygen is released from water as part of the photosynthetic process. In photosynthetic eukaryotes, this process occurs in the chloroplast, an organelle containing the most abundant biological membrane, the thylakoids. The thylakoids of plants and some green algae are structurally inhomogeneous, consisting of two main domains: the grana, which are piles of membranes gathered by stacking forces, and the stroma-lamellae, which are unstacked thylakoids connecting the grana. The major photosynthetic complexes are unevenly distributed within these compartments because of steric and electrostatic constraints. Although proteomic analysis of thylakoids has been instrumental in defining their protein components, no extensive proteomic study of subthylakoid localization of proteins in the BBY (grana) and the stroma-lamellae fractions has been achieved so far. To fill this gap, we performed a complete survey of the protein composition of these thylakoid subcompartments using thylakoid membrane fractionations. We employed semiquantitative proteomics coupled with a data analysis pipeline and manual annotation to differentiate genuine BBY and stroma-lamellae proteins from possible contaminants. About 300 thylakoid (or potentially thylakoid) proteins were shown to be enriched in either the BBY or the stroma-lamellae fractions. Overall, present findings corroborate previous observations obtained for photosynthetic proteins that used nonproteomic approaches. The originality of the present proteomic study lies in the identification of photosynthetic proteins whose differential distribution in the thylakoid subcompartments might explain already observed phenomena such as LHCII docking. Besides, from the present localization results we can suggest new molecular actors for photosynthesis-linked activities. For instance, most PsbP-like subunits being differently localized in stroma-lamellae, these proteins could be linked to the PSI-NDH complex in the context of cyclic electron flow around PSI. In addition, we could identify about a hundred new likely minor thylakoid (or chloroplast) proteins, some of them being potential regulators of the chloroplast physiology.


Subject(s)
Arabidopsis/metabolism , Mass Spectrometry/methods , Thylakoids/metabolism , Photosynthesis , Plant Proteins/isolation & purification , Proteomics/methods
19.
Bioinformatics ; 30(9): 1322-4, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24413670

ABSTRACT

MOTIVATION: Experimental spatial proteomics, i.e. the high-throughput assignment of proteins to sub-cellular compartments based on quantitative proteomics data, promises to shed new light on many biological processes given adequate computational tools. RESULTS: Here we present pRoloc, a complete infrastructure to support and guide the sound analysis of quantitative mass-spectrometry-based spatial proteomics data. It provides functionality for unsupervised and supervised machine learning for data exploration and protein classification and novelty detection to identify new putative sub-cellular clusters. The software builds upon existing infrastructure for data management and data processing.


Subject(s)
Mass Spectrometry/methods , Proteins/chemistry , Proteomics/methods , Algorithms , Cluster Analysis , Software
20.
Mol Phylogenet Evol ; 76: 241-53, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24685498

ABSTRACT

Cattle ticks of the subgenus Rhipicephalus (Boophilus) are major agricultural pests worldwide, causing billions of dollars in losses annually. Rhipicephalus (Boophilus) annulatus and R. microplus are the most well-known and widespread species, and a third species, R. australis, was recently reinstated for 'R. microplus' from Australia and parts of Southeast Asia. We use mitochondrial genome sequences to address the phylogenetic relationships among the species of the subgenus Boophilus. We sequenced the complete or partial mitochondrial genomes of R. annulatus, R. australis, R. kohlsi, R. geigyi, and of three geographically disparate specimens of R. microplus from Brazil, Cambodia and China. Phylogenetic analyses of mitochondrial genomes, as well as cox1 and 16S rRNA sequences, reveal a species complex of R. annulatus, R. australis, and two clades of R. microplus, which we call the R. microplus complex. We show that cattle ticks morphologically identified as R. microplus from Southern China and Northern India (R. microplus clade B) are more closely related to R. annulatus than other specimens of R. microplus s.s. from Asia, South America and Africa (R. microplus clade A). Our analysis suggests that ticks reported as R. microplus from Southern China and Northern India are a cryptic species. This highlights the need for further molecular, morphological and crossbreeding studies of the R. microplus complex, with emphasis on specimens from China and India. We found that cox1 and, to a lesser extent, 16S rRNA were far more successful in resolving the phylogenetic relationships within the R. microplus complex than 12S rRNA or the nuclear marker ITS2. We suggest that future molecular studies of the R. microplus complex should focus on cox1, supplemented by 16S rRNA, and develop nuclear markers alternative to ITS2 to complement the mitochondrial data.


Subject(s)
Genome, Mitochondrial/genetics , Phylogeny , Rhipicephalus/classification , Rhipicephalus/genetics , Animals , Brazil , Cambodia , Cattle , China , DNA, Ribosomal Spacer/genetics , India , RNA, Ribosomal/genetics , Rhipicephalus/cytology , Sequence Analysis, DNA