1.
Nature ; 579(7799): 409-414, 2020 03.
Article in English | MEDLINE | ID: mdl-32188942

ABSTRACT

Plants are essential for life and are extremely diverse organisms with unique molecular capabilities. Here we present a quantitative atlas of the transcriptomes, proteomes and phosphoproteomes of 30 tissues of the model plant Arabidopsis thaliana. Our analysis provides initial answers to how many genes exist as proteins (more than 18,000), where they are expressed, in which approximate quantities (a dynamic range of more than six orders of magnitude) and to what extent they are phosphorylated (over 43,000 sites). We present examples of how the data may be used, such as to discover proteins that are translated from short open-reading frames, to uncover sequence motifs that are involved in the regulation of protein production, and to identify tissue-specific protein complexes or phosphorylation-mediated signalling events. Interactive access to this resource for the plant community is provided by the ProteomicsDB and ATHENA databases, which include powerful bioinformatics tools to explore and characterize Arabidopsis proteins, their modifications and interactions.


Subjects
Arabidopsis Proteins/analysis , Arabidopsis Proteins/chemistry , Arabidopsis/chemistry , Mass Spectrometry , Proteome/analysis , Proteome/chemistry , Proteomics , Amino Acid Motifs , Arabidopsis/anatomy & histology , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/biosynthesis , Arabidopsis Proteins/genetics , Databases, Protein , Datasets as Topic , Gene Expression Regulation, Plant , Molecular Sequence Annotation , Open Reading Frames , Organ Specificity , Phosphoproteins/analysis , Phosphoproteins/chemistry , Phosphoproteins/genetics , Phosphorylation , Proteome/biosynthesis , Proteome/genetics , RNA, Messenger/analysis , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , Transcriptome
2.
Nat Methods ; 19(7): 803-811, 2022 07.
Article in English | MEDLINE | ID: mdl-35710609

ABSTRACT

The laboratory mouse ranks among the most important experimental systems for biomedical research and molecular reference maps of such models are essential informational tools. Here, we present a quantitative draft of the mouse proteome and phosphoproteome constructed from 41 healthy tissues and several lines of analyses exemplify which insights can be gleaned from the data. For instance, tissue- and cell-type resolved profiles provide protein evidence for the expression of 17,000 genes, thousands of isoforms and 50,000 phosphorylation sites in vivo. Proteogenomic comparison of mouse, human and Arabidopsis reveals common and distinct mechanisms of gene expression regulation and, despite many similarities, numerous differentially abundant orthologs that likely serve species-specific functions. We leverage the mouse proteome by integrating phenotypic drug (n > 400) and radiation response data with the proteomes of 66 pancreatic ductal adenocarcinoma (PDAC) cell lines to reveal molecular markers for sensitivity and resistance. This unique atlas complements other molecular resources for the mouse and can be explored online via ProteomicsDB and PACiFIC.


Subjects
Arabidopsis , Carcinoma, Pancreatic Ductal , Pancreatic Neoplasms , Animals , Arabidopsis/genetics , Carcinoma, Pancreatic Ductal/metabolism , Mass Spectrometry , Mice , Pancreatic Neoplasms/genetics , Proteome/analysis
3.
Nat Chem Biol ; 2023 Oct 30.
Article in English | MEDLINE | ID: mdl-37904048

ABSTRACT

Medicinal chemistry has discovered thousands of potent protein and lipid kinase inhibitors. These may be developed into therapeutic drugs or chemical probes to study kinase biology. Because of polypharmacology, a large part of the human kinome currently lacks selective chemical probes. To discover such probes, we profiled 1,183 compounds from drug discovery projects in lysates of cancer cell lines using Kinobeads. The resulting 500,000 compound-target interactions are available in ProteomicsDB and we exemplify how this molecular resource may be used. For instance, the data revealed several hundred reasonably selective compounds for 72 kinases. Cellular assays validated GSK986310C as a candidate SYK (spleen tyrosine kinase) probe and X-ray crystallography uncovered the structural basis for the observed selectivity of the CK2 inhibitor GW869516X. Compounds targeting PKN3 were discovered and phosphoproteomics identified substrates that indicate target engagement in cells. We anticipate that this molecular resource will aid research in drug discovery and chemical biology.

4.
Nat Methods ; 18(6): 604-617, 2021 06.
Article in English | MEDLINE | ID: mdl-34099939

ABSTRACT

Single-cell profiling methods have had a profound impact on the understanding of cellular heterogeneity. While genomes and transcriptomes can be explored at the single-cell level, single-cell profiling of proteomes is not yet established. Here we describe new single-molecule protein sequencing and identification technologies alongside innovations in mass spectrometry that will eventually enable broad sequence coverage in single-cell profiling. These technologies will in turn facilitate biological discovery and open new avenues for ultrasensitive disease diagnostics.


Subjects
Sequence Analysis, Protein/methods , Single Molecule Imaging/methods , Mass Spectrometry/methods , Nanotechnology , Proteins/chemistry , Proteomics/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
5.
Mol Cell Proteomics ; 21(12): 100437, 2022 12.
Article in English | MEDLINE | ID: mdl-36328188

ABSTRACT

Estimating false discovery rates (FDRs) of protein identification continues to be an important topic in mass spectrometry-based proteomics, particularly when analyzing very large datasets. One performant method for this purpose is the Picked Protein FDR approach which is based on a target-decoy competition strategy on the protein level that ensures that FDRs scale to large datasets. Here, we present an extension to this method that can also deal with protein groups, that is, proteins that share common peptides such as protein isoforms of the same gene. To obtain well-calibrated FDR estimates that preserve protein identification sensitivity, we introduce two novel ideas. First, the picked group target-decoy and second, the rescued subset grouping strategies. Using entrapment searches and simulated data for validation, we demonstrate that the new Picked Protein Group FDR method produces accurate protein group-level FDR estimates regardless of the size of the data set. The validation analysis also uncovered that applying the commonly used Occam's razor principle leads to anticonservative FDR estimates for large datasets. This is not the case for the Picked Protein Group FDR method. Reanalysis of deep proteomes of 29 human tissues showed that the new method identified up to 4% more protein groups than MaxQuant. Applying the method to the reanalysis of the entire human section of ProteomicsDB led to the identification of 18,000 protein groups at 1% protein group-level FDR. The analysis also showed that about 1250 genes were represented by ≥2 identified protein groups. To make the method accessible to the proteomics community, we provide a software tool including a graphical user interface that enables merging results from multiple MaxQuant searches into a single list of identified and quantified protein groups.
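
The protein-level target-decoy competition that the abstract describes can be sketched roughly as follows. This is a minimal illustration of the general "picked" idea, not the authors' software: the function name `picked_protein_fdr`, the `decoy_` prefix convention, the tie-breaking rule and the toy scores are all assumptions, and the protein-group extensions (picked group target-decoy, rescued subset grouping) are not modelled here.

```python
def picked_protein_fdr(scores, decoy_prefix="decoy_"):
    """Run a picked target-decoy competition and estimate FDR per rank.

    scores: dict mapping protein id -> best score (higher is better).
    Decoys are assumed (hypothetically) to be named decoy_prefix + target id.
    Returns a list of (protein, score, is_decoy, fdr) tuples, best score first.
    """
    # Step 1: each target competes with its own decoy; only the winner survives.
    winners = {}
    for protein, score in scores.items():
        is_decoy = protein.startswith(decoy_prefix)
        base = protein[len(decoy_prefix):] if is_decoy else protein
        best = winners.get(base)
        # Assumption: on equal scores the target wins the pair.
        if best is None or score > best[1] or (score == best[1] and not is_decoy):
            winners[base] = (protein, score, is_decoy)

    # Step 2: rank the winners and estimate FDR as decoys / targets so far.
    ranked = sorted(winners.values(), key=lambda w: w[1], reverse=True)
    results = []
    targets = decoys = 0
    for protein, score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        results.append((protein, score, is_decoy, decoys / max(targets, 1)))
    return results


toy_scores = {"P1": 10.0, "decoy_P1": 3.0, "P2": 2.0, "decoy_P2": 5.0, "P3": 8.0}
for row in picked_protein_fdr(toy_scores):
    print(row)
```

Because only one member of each target-decoy pair can enter the ranked list, the number of surviving decoys does not grow with the number of low-scoring random matches, which is the property the abstract refers to when it says the estimate scales to large datasets. A production implementation would typically also convert the running FDR into monotone q-values.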


Subjects
Peptides , Tandem Mass Spectrometry , Humans , Tandem Mass Spectrometry/methods , Databases, Protein , Software , Proteome , Algorithms
6.
Nucleic Acids Res ; 50(D1): D1541-D1552, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34791421

ABSTRACT

ProteomicsDB (https://www.ProteomicsDB.org) is a multi-omics and multi-organism resource for life science research. In this update, we present our efforts to continuously develop and expand ProteomicsDB. The major focus over the last two years was improving the findability, accessibility, interoperability and reusability (FAIR) of the data as well as its implementation. For this purpose, we release a new application programming interface (API) that provides systematic access to essentially all data in ProteomicsDB. Second, we release a new open-source user interface (UI) and show the advantages the scientific community gains from such software. With the new interface, two new visualizations of protein primary, secondary and tertiary structure as well as an updated spectrum viewer were added. Furthermore, we integrated ProteomicsDB with our deep-neural-network Prosit that can predict the fragmentation characteristics and retention time of peptides. The result is an automatic processing pipeline that can be used to reevaluate database search engine results stored in ProteomicsDB. In addition, we extended the data content with experiments investigating different human biology as well as a newly supported organism.


Subjects
Databases, Protein , Proteins/classification , Proteomics/classification , Software , Biological Science Disciplines , Humans , Neural Networks, Computer , Proteins/chemistry
7.
Nat Methods ; 17(5): 495-503, 2020 05.
Article in English | MEDLINE | ID: mdl-32284610

ABSTRACT

We have used a mass spectrometry-based proteomic approach to compile an atlas of the thermal stability of 48,000 proteins across 13 species ranging from archaea to humans and covering melting temperatures of 30-90 °C. Protein sequence, composition and size affect thermal stability in prokaryotes and eukaryotic proteins show a nonlinear relationship between the degree of disordered protein structure and thermal stability. The data indicate that evolutionary conservation of protein complexes is reflected by similar thermal stability of their proteins, and we show examples in which genomic alterations can affect thermal stability. Proteins of the respiratory chain were found to be very stable in many organisms, and human mitochondria showed close to normal respiration at 46 °C. We also noted cell-type-specific effects that can affect protein stability or the efficacy of drugs. This meltome atlas broadly defines the proteome amenable to thermal profiling in biology and drug discovery and can be explored online at http://meltomeatlas.proteomics.wzw.tum.de:5003/ and http://www.proteomicsdb.org.


Subjects
Gene Expression Regulation , Prokaryotic Cells/metabolism , Proteins/chemistry , Proteins/metabolism , Proteome/analysis , Transition Temperature , Animals , Electron Transport Chain Complex Proteins/metabolism , Humans , Mitochondria/metabolism , Protein Stability , Software , Species Specificity
8.
Nat Methods ; 16(6): 509-518, 2019 06.
Article in English | MEDLINE | ID: mdl-31133760

ABSTRACT

In mass-spectrometry-based proteomics, the identification and quantification of peptides and proteins heavily rely on sequence database searching or spectral library matching. The lack of accurate predictive models for fragment ion intensities impairs the realization of the full potential of these approaches. Here, we extended the ProteomeTools synthetic peptide library to 550,000 tryptic peptides and 21 million high-quality tandem mass spectra. We trained a deep neural network, termed Prosit, resulting in chromatographic retention time and fragment ion intensity predictions that exceed the quality of the experimental data. Integrating Prosit into database search pipelines led to more identifications at >10× lower false discovery rates. We show the general applicability of Prosit by predicting spectra for proteases other than trypsin, generating spectral libraries for data-independent acquisition and improving the analysis of metaproteomes. Prosit is integrated into ProteomicsDB, allowing search result re-scoring and custom spectral library generation for any organism on the basis of peptide sequence alone.


Subjects
Deep Learning , Neural Networks, Computer , Peptide Fragments/analysis , Peptide Library , Proteome/analysis , Software , Tandem Mass Spectrometry/methods , Animals , Caenorhabditis elegans/metabolism , Databases, Protein , Drosophila melanogaster/metabolism , HEK293 Cells , Humans , Peptide Fragments/metabolism , Proteome/metabolism , Saccharomyces cerevisiae/metabolism
9.
Nucleic Acids Res ; 48(D1): D1153-D1163, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31665479

ABSTRACT

ProteomicsDB (https://www.ProteomicsDB.org) started as a protein-centric in-memory database for the exploration of large collections of quantitative mass spectrometry-based proteomics data. The data types and contents grew over time to include RNA-Seq expression data, drug-target interactions and cell line viability data. In this manuscript, we summarize new developments since the previous update that was published in Nucleic Acids Research in 2017. Over the past two years, we have enriched the data content by additional datasets and extended the platform to support protein turnover data. Another important new addition is that ProteomicsDB now supports the storage and visualization of data collected from other organisms, exemplified by Arabidopsis thaliana. Due to the generic design of ProteomicsDB, all analytical features available for the original human resource seamlessly transfer to other organisms. Furthermore, we introduce a new service in ProteomicsDB which allows users to upload their own expression datasets and analyze them alongside data stored in ProteomicsDB. Initially, users will be able to make use of this feature in the interactive heat map functionality as well as the drug sensitivity prediction, but ultimately will be able to use all analytical features of ProteomicsDB in this way.


Subjects
Biological Science Disciplines , Computational Biology/methods , Databases, Protein , Proteomics/methods , Research , Drug Discovery , Software , User-Computer Interface , Web Browser
10.
J Proteome Res ; 20(6): 3388-3394, 2021 06 04.
Article in English | MEDLINE | ID: mdl-33970638

ABSTRACT

Here, we present the Universal Spectrum Explorer (USE), a web-based tool based on IPSA for cross-resource (peptide) spectrum visualization and comparison (https://www.proteomicsdb.org/use/). Mass spectra under investigation can be either provided manually by the user (table format) or automatically retrieved from online repositories supporting access to spectral data via the universal spectrum identifier (USI), or requested from other resources and services implementing a newly designed REST interface. As a proof of principle, we implemented such an interface in ProteomicsDB thereby allowing the retrieval of spectra acquired within the ProteomeTools project or real-time prediction of tandem mass spectra from the deep learning framework Prosit. Annotated mirror spectrum plots can be exported from the USE as editable scalable high-quality vector graphics. The USE was designed and implemented with minimal external dependencies allowing local usage and integration into other web sites (https://github.com/kusterlab/universal_spectrum_explorer).


Subjects
Software , Tandem Mass Spectrometry , Internet , Peptides
11.
Mol Cell Proteomics ; 17(5): 974-992, 2018 05.
Article in English | MEDLINE | ID: mdl-29414762

ABSTRACT

The coordination of protein synthesis and degradation regulating protein abundance is a fundamental process in cellular homeostasis. Today, mass spectrometry-based technologies allow determination of endogenous protein turnover on a proteome-wide scale. However, standard dynamic SILAC (Stable Isotope Labeling in Cell Culture) approaches can suffer from missing data across pulse time-points limiting the accuracy of such analysis. This issue is of particular relevance when studying protein stability at the level of proteoforms because often only single peptides distinguish between different protein products of the same gene. To address this shortcoming, we evaluated the merits of combining dynamic SILAC and tandem mass tag (TMT)-labeling of ten pulse time-points in a single experiment. Although the comparison to the standard dynamic SILAC method showed a high concordance of protein turnover rates, the pulsed SILAC-TMT approach yielded more comprehensive data (6000 proteins on average) without missing values. Replicate analysis further established that the same reproducibility of turnover rate determination can be obtained for peptides and proteins facilitating proteoform resolved investigation of protein stability. We provide several examples of differentially turned over splice variants and show that post-translational modifications can affect cellular protein half-lives. For example, N-terminally processed peptides exhibited both faster and slower turnover behavior compared with other peptides of the same protein. In addition, the suspected proteolytic processing of the fusion protein FAU was substantiated by measuring vastly different stabilities of the cleavage products. Furthermore, differential peptide turnover suggested a previously unknown mechanism of activity regulation by post-translational destabilization of cathepsin D as well as the DNA helicase BLM. Finally, our comprehensive data set facilitated a detailed evaluation of the impact of protein properties and functions on protein stability in steady-state cells and uncovered that the high turnover of respiratory chain complex I proteins might be explained by oxidative stress.
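
To make the link between a measured turnover rate and a protein half-life concrete, the sketch below assumes simple first-order kinetics, in which the fraction of pre-existing (unlabeled) protein decays as exp(-k·t) over the pulse time-points. The abstract does not state the fitting model, so the model, the function names `fit_turnover_rate` and `half_life`, and the toy values are all assumptions; effects such as dilution by cell division are ignored.

```python
import math


def fit_turnover_rate(times_h, fraction_old):
    """Estimate a first-order rate constant k (per hour).

    Fits -ln(fraction_old) = k * t by least squares through the origin.
    fraction_old values must lie in (0, 1].
    """
    numerator = sum(t * -math.log(f) for t, f in zip(times_h, fraction_old))
    denominator = sum(t * t for t in times_h)
    return numerator / denominator


def half_life(k):
    """Half-life for first-order decay: ln(2) / k."""
    return math.log(2) / k


# Hypothetical pulse time-points (hours) and remaining old-protein fractions.
times = [1, 2, 4, 8]
fractions = [math.exp(-0.1 * t) for t in times]
k = fit_turnover_rate(times, fractions)
print(round(k, 3), round(half_life(k), 2))
```

With ten pulse time-points measured in a single TMT experiment, every peptide contributes a complete set of points to such a fit, which is why the absence of missing values matters for peptide-level (proteoform-resolved) estimates.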


Subjects
Peptides/metabolism , Proteome/metabolism , Proteomics/methods , Enzyme Stability , Half-Life , HeLa Cells , Humans , Isotope Labeling , NADH Dehydrogenase/metabolism , Oxidative Stress/drug effects , Protein Biosynthesis , Proteolysis , Reproducibility of Results
12.
Nucleic Acids Res ; 46(D1): D1271-D1281, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29106664

ABSTRACT

ProteomicsDB (https://www.ProteomicsDB.org) is a protein-centric in-memory database for the exploration of large collections of quantitative mass spectrometry-based proteomics data. ProteomicsDB was first released in 2014 to enable the interactive exploration of the first draft of the human proteome. To date, it contains quantitative data from 78 projects totalling over 19k LC-MS/MS experiments. A standardized analysis pipeline enables comparisons between multiple datasets to facilitate the exploration of protein expression across hundreds of tissues, body fluids and cell lines. We recently extended the data model to enable the storage and integrated visualization of other quantitative omics data. This includes transcriptomics data from e.g. NCBI GEO, protein-protein interaction information from STRING, functional annotations from KEGG, drug-sensitivity/selectivity data from several public sources and reference mass spectra from the ProteomeTools project. The extended functionality transforms ProteomicsDB into a multi-purpose resource connecting quantification and meta-data for each protein. The rich user interface helps researchers to navigate all data sources in either a protein-centric or multi-protein-centric manner. Several options are available to download data manually, while our application programming interface enables accessing quantitative data systematically.


Subjects
Databases, Protein , Tandem Mass Spectrometry , Cell Survival , Data Display , Humans , Internet , Pharmaceutical Preparations/metabolism , Protein Interaction Maps , Proteins/chemistry , Proteins/metabolism , Proteomics
13.
Science ; 380(6640): 93-101, 2023 04 07.
Article in English | MEDLINE | ID: mdl-36926954

ABSTRACT

Although most cancer drugs modulate the activities of cellular pathways by changing posttranslational modifications (PTMs), little is known regarding the extent and the time- and dose-response characteristics of drug-regulated PTMs. In this work, we introduce a proteomic assay called decryptM that quantifies drug-PTM modulation for thousands of PTMs in cells to shed light on target engagement and drug mechanism of action. Examples range from detecting DNA damage by chemotherapeutics, to identifying drug-specific PTM signatures of kinase inhibitors, to demonstrating that rituximab kills CD20-positive B cells by overactivating B cell receptor signaling. DecryptM profiling of 31 cancer drugs in 13 cell lines demonstrates the broad applicability of the approach. The resulting 1.8 million dose-response curves are provided as an interactive molecular resource in ProteomicsDB.
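
As a rough illustration of what a single dose-response curve for one PTM site might look like, the sketch below evaluates a four-parameter log-logistic model, a form commonly used for dose-response data. The abstract does not say which model decryptM fits, so the model choice, the function name `four_param_logistic` and all parameter values are assumptions.

```python
def four_param_logistic(dose, top, bottom, ec50, slope):
    """Response at a given dose under a four-parameter log-logistic model.

    top:    response without drug (dose -> 0)
    bottom: response at saturating dose
    ec50:   dose giving the half-maximal effect
    slope:  steepness of the transition (assumed positive)
    """
    return bottom + (top - bottom) / (1.0 + (dose / ec50) ** slope)


# Hypothetical PTM site whose relative abundance drops from 1.0 to 0.2.
for dose_nm in (0.0, 10.0, 100.0, 1000.0, 10000.0):
    print(dose_nm, round(four_param_logistic(dose_nm, 1.0, 0.2, 100.0, 1.5), 3))
```

In this parameterization the EC50 reports potency and the distance between `top` and `bottom` reports effect size, the two quantities one would compare across sites to separate direct target engagement from downstream pathway effects.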


Subjects
Antineoplastic Agents , Apoptosis , Protein Processing, Post-Translational , Proteomics , Antigens, CD20/metabolism , Antineoplastic Agents/pharmacology , Apoptosis/drug effects , B-Lymphocytes/drug effects , Cell Line, Tumor , DNA Damage , Protein Processing, Post-Translational/drug effects , Proteomics/methods , Receptors, Antigen, B-Cell/metabolism , Signal Transduction , Humans
14.
Cell Rep ; 38(13): 110604, 2022 03 29.
Article in English | MEDLINE | ID: mdl-35354033

ABSTRACT

Primary human hepatocytes are widely used to evaluate liver toxicity of drugs, but they are scarce and demanding to culture. Stem cell-derived hepatocytes are increasingly discussed as alternatives. To obtain a better appreciation of the molecular processes during the differentiation of induced pluripotent stem cells into hepatocytes, we employ a quantitative proteomic approach to follow the expression of 9,000 proteins, 12,000 phosphorylation sites, and 800 acetylation sites over time. The analysis reveals stage-specific markers, a major molecular switch between hepatic endoderm versus immature hepatocyte-like cells impacting, e.g., metabolism, the cell cycle, kinase activity, and the expression of drug transporters. Comparing the proteomes of two- (2D) and three-dimensional (3D)-derived hepatocytes with fetal and adult liver indicates a fetal-like status of the in vitro models and lower expression of important ADME/Tox proteins. The collective data enable constructing a molecular roadmap of hepatocyte development that serves as a valuable resource for future research.


Subjects
Induced Pluripotent Stem Cells , Proteome , Adult , Cell Differentiation , Hepatocytes/metabolism , Humans , Induced Pluripotent Stem Cells/metabolism , Proteome/metabolism , Proteomics
15.
Nat Commun ; 12(1): 3346, 2021 06 07.
Article in English | MEDLINE | ID: mdl-34099720

ABSTRACT

Characterizing the human leukocyte antigen (HLA) bound ligandome by mass spectrometry (MS) holds great promise for developing vaccines and drugs for immune-oncology. Still, the identification of non-tryptic peptides presents substantial computational challenges. To address these, we synthesized and analyzed >300,000 peptides by multi-modal LC-MS/MS within the ProteomeTools project representing HLA class I & II ligands and products of the proteases AspN and LysN. The resulting data enabled training of a single model using the deep learning framework Prosit, allowing the accurate prediction of fragment ion spectra for tryptic and non-tryptic peptides. Applying Prosit demonstrates that the identification of HLA peptides can be improved up to 7-fold, that 87% of the proposed proteasomally spliced HLA peptides may be incorrect and that dozens of additional immunogenic neo-epitopes can be identified from patient tumors in published data. Together, the provided peptides, spectra and computational tools substantially expand the analytical depth of immunopeptidomics workflows.


Subjects
Deep Learning , Peptides/immunology , Tandem Mass Spectrometry/methods , Cell Line , Epitopes , Extracellular Matrix Proteins/metabolism , HLA Antigens/immunology , Histocompatibility Antigens Class I/metabolism , Histocompatibility Antigens Class II/metabolism , Humans , Ligands , Mass Spectrometry , Molecular Medicine , Peptides/metabolism , Proteomics
16.
Nat Commun ; 12(1): 5854, 2021 10 06.
Article in English | MEDLINE | ID: mdl-34615866

ABSTRACT

The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics sample metadata. We implement MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets. We also describe tools and libraries to validate and submit sample metadata-related information to the PRIDE repository. We expect that these developments will improve the reproducibility and facilitate the reanalysis and integration of public proteomics datasets.


Subjects
Data Analysis , Databases, Protein , Metadata , Proteomics , Big Data , Humans , Reproducibility of Results , Software , Transcriptome
17.
Sci Data ; 7(1): 334, 2020 10 09.
Article in English | MEDLINE | ID: mdl-33037224

ABSTRACT

Plant growth and development are regulated by a tightly controlled interplay between cell division, cell expansion and cell differentiation during the entire plant life cycle from seed germination to maturity and seed propagation. To explore some of the underlying molecular mechanisms in more detail, we selected different aerial tissue types of the model plant Arabidopsis thaliana, namely rosette leaf, flower and silique/seed and performed proteomic, phosphoproteomic and transcriptomic analyses of sequential growth stages using tandem mass tag-based mass spectrometry and RNA sequencing. With this exploratory multi-omics dataset, development dynamics of photosynthetic tissues can be investigated from different angles. As expected, we found progressive global expression changes between growth stages for all three omics types and often but not always corresponding expression patterns for individual genes on transcript, protein and phosphorylation site level. The biggest difference between proteomic- and transcriptomic-based expression information could be observed for seed samples. Proteomic and transcriptomic data is available via ProteomeXchange and ArrayExpress with the respective identifiers PXD018814 and E-MTAB-7978.


Subjects
Arabidopsis , Proteome , Arabidopsis/genetics , Gene Expression Profiling , Proteome/genetics , Proteomics , Transcriptome
18.
Nat Commun ; 11(1): 3639, 2020 07 20.
Article in English | MEDLINE | ID: mdl-32686665

ABSTRACT

Integrated analysis of genomes, transcriptomes, proteomes and drug responses of cancer cell lines (CCLs) is an emerging approach to uncover molecular mechanisms of drug action. We extend this paradigm to measuring proteome activity landscapes by acquiring and integrating quantitative data for 10,000 proteins and 55,000 phosphorylation sites (p-sites) from 125 CCLs. These data are used to contextualize proteins and p-sites and predict drug sensitivity. For example, we find that Progesterone Receptor (PGR) phosphorylation is associated with sensitivity to drugs modulating estrogen signaling such as Raloxifene. We also demonstrate that Adenylate kinase isoenzyme 1 (AK1) inactivates antimetabolites like Cytarabine. Consequently, high AK1 levels correlate with poor survival of Cytarabine-treated acute myeloid leukemia patients, qualifying AK1 as a patient stratification marker and possibly as a drug target. We provide an interactive web application termed ATLANTiC (http://atlantic.proteomics.wzw.tum.de), which enables the community to explore the thousands of novel functional associations generated by this work.


Subjects
Antineoplastic Agents/pharmacology , Neoplasms/drug therapy , Proteome/metabolism , Adenylate Kinase/metabolism , Animals , Breast Neoplasms/drug therapy , Breast Neoplasms/metabolism , Cell Line, Tumor , Computational Biology , Computer Simulation , Cytarabine/metabolism , Cytarabine/pharmacology , Drug Development , Genomics , Humans , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/metabolism , Neoplasms/metabolism , Proteome/genetics , Proteomics , Raloxifene Hydrochloride/metabolism , Raloxifene Hydrochloride/pharmacology , Receptors, Progesterone/metabolism , Signal Transduction/genetics , Signal Transduction/physiology
19.
Proteomes ; 7(1)2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30626002

ABSTRACT

The microbiome has a strong impact on human health and disease and is, therefore, increasingly studied in a clinical context. Metaproteomics is also attracting considerable attention, and such data can be efficiently generated today owing to improvements in mass spectrometry-based proteomics. As we will discuss in this study, there are still major challenges notably in data analysis that need to be overcome. Here, we analyzed 212 fecal samples from 56 hospitalized acute leukemia patients with multidrug-resistant Enterobacteriaceae (MRE) gut colonization using metagenomics and metaproteomics. This is one of the largest clinical metaproteomic studies to date, and the first metaproteomic study addressing the gut microbiome in MRE colonized acute leukemia patients. Based on this substantial data set, we discuss major current limitations in clinical metaproteomic data analysis to provide guidance to researchers in the field. Notably, the results show that public metagenome databases are incomplete and that sample-specific metagenomes improve results. Furthermore, biological variation is tremendous which challenges clinical study designs and argues that longitudinal measurements of individual patients are a valuable future addition to the analysis of patient cohorts.

20.
Comput Biol Med ; 90: 146-154, 2017 11 01.
Article in English | MEDLINE | ID: mdl-28992453

ABSTRACT

BACKGROUND AND OBJECTIVE: Single Nucleotide Polymorphisms (SNPs) are becoming the marker of choice for biological analyses involving a wide range of applications with great medical, biological, economic and environmental interest. Classification tasks, i.e. the assignment of individuals to groups of origin based on their (multi-locus) genotypes, are performed in many fields such as forensic investigations, discrimination between wild and/or farmed populations and others. These tasks should be performed with a small number of loci, for computational as well as biological reasons. Thus, feature selection should precede classification tasks, especially for SNP datasets, where the number of features can amount to hundreds of thousands or millions. METHODS: In this paper, we present a novel data mining approach, called FIFS - Frequent Item Feature Selection, based on the use of frequent items for selection of the most informative markers from population genomic data. It is a modular method, consisting of two main components. The first one identifies the most frequent and unique genotypes for each sampled population. The second one selects the most appropriate among them, in order to create the informative SNP subsets to be returned. RESULTS: The proposed method (FIFS) was tested on a real dataset, which comprised a comprehensive coverage of pig breed types present in Britain. This dataset consisted of 446 individuals divided into 14 sub-populations, genotyped at 59,436 SNPs. Our method outperforms the state-of-the-art and baseline methods in every case. More specifically, our method surpassed the assignment accuracy threshold of 95% needing only half the number of SNPs selected by other methods (FIFS: 28 SNPs, Delta: 70 SNPs, Pairwise FST: 70 SNPs, In: 100 SNPs). CONCLUSION: Our approach successfully deals with the problem of informative marker selection in high dimensional genomic datasets. It offers better results compared to existing approaches and can aid biologists in selecting the most informative markers with maximum discrimination power for optimization of cost-effective panels with applications related to e.g. species identification, wildlife management, and forensics.
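
The two components named in the METHODS section can be illustrated with a deliberately simplified sketch. It is one possible reading of "most frequent and unique genotypes", not the published FIFS algorithm: the function name `select_informative_snps`, the uniqueness rule, the ranking by within-population frequency and the toy data are all assumptions.

```python
from collections import Counter


def select_informative_snps(genotypes, populations, per_population=1):
    """Pick SNPs whose dominant genotype is specific to one population.

    genotypes:   list of dicts (SNP id -> genotype string), one per individual
    populations: list of population labels in the same order
    Returns a sorted list of selected SNP ids.
    """
    by_pop = {}
    for individual, pop in zip(genotypes, populations):
        by_pop.setdefault(pop, []).append(individual)
    snps = sorted(genotypes[0])

    # Component 1: most frequent genotype (and its frequency) per population and SNP.
    top = {}
    for pop, members in by_pop.items():
        for snp in snps:
            genotype, count = Counter(m[snp] for m in members).most_common(1)[0]
            top[(pop, snp)] = (genotype, count / len(members))

    # Component 2: keep genotypes that are unique to a population, best first.
    selected = set()
    for pop in by_pop:
        candidates = []
        for snp in snps:
            genotype, freq = top[(pop, snp)]
            others = (top[(other, snp)][0] for other in by_pop if other != pop)
            if all(g != genotype for g in others):
                candidates.append((freq, snp))
        candidates.sort(key=lambda c: (-c[0], c[1]))
        selected.update(snp for _, snp in candidates[:per_population])
    return sorted(selected)


toy_genotypes = [
    {"s1": "AA", "s2": "AG"}, {"s1": "AA", "s2": "AG"},  # population A
    {"s1": "GG", "s2": "AG"}, {"s1": "GG", "s2": "AG"},  # population B
]
print(select_informative_snps(toy_genotypes, ["A", "A", "B", "B"]))
```

In the toy data only `s1` separates the two populations, so it is the only SNP selected; `s2` has the same dominant genotype everywhere and carries no assignment information.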


Subjects
Data Mining/methods , Databases, Nucleic Acid , Genomics , Models, Genetic , Polymorphism, Single Nucleotide , Genetic Markers , Humans