Results 1 - 5 of 5
1.
Mol Cell Proteomics ; 18(9): 1880-1892, 2019 09.
Article in English | MEDLINE | ID: mdl-31235637

ABSTRACT

Mass spectrometry-based proteomics is the method of choice for quantifying genome-wide differential changes of protein expression in a wide range of biological and biomedical applications. Protein expression changes need to be reliably derived from many measured peptide intensities and their corresponding peptide fold changes. These peptide fold changes vary considerably for a given protein. Numerous instrumental setups aim to reduce this variability, whereas current computational methods only implicitly account for this problem. We introduce a new method, MS-EmpiRe, which explicitly accounts for the noise underlying peptide fold changes. We derive data set-specific, intensity-dependent empirical error fold change distributions, which are used for individual weighting of peptide fold changes to detect differentially expressed proteins (DEPs). In a recently published proteome-wide benchmarking data set, MS-EmpiRe doubles the number of correctly identified DEPs at an estimated FDR cutoff compared with state-of-the-art tools. We additionally confirm the superior performance of MS-EmpiRe on simulated data. MS-EmpiRe requires only peptide intensities mapped to proteins and, thus, can be applied to any common quantitative proteomics setup. We apply our method to diverse MS data sets and observe consistent increases in sensitivity, with more than 1000 additional significant proteins in deep data sets, including a clinical study over multiple patients. At the same time, we observe that even the proteins classified as most insignificant by other methods but significant by MS-EmpiRe show very clear regulation on the peptide intensity level. MS-EmpiRe provides rapid processing (< 2 min for 6 LC-MS/MS runs (3 h gradients)) and is publicly available at github.com/zimmerlab/MS-EmpiRe with a manual including examples.
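The idea of an empirical error distribution for fold changes can be illustrated with a minimal sketch. This is not the MS-EmpiRe implementation: it assumes a single intensity bin (the abstract describes intensity-dependent distributions), takes replicate pairs from the same condition as the noise model, and uses hypothetical function names and made-up intensities.

```python
import math
from bisect import bisect_left


def log2_fold_change(a, b):
    """log2 ratio of two positive intensities."""
    return math.log2(a / b)


def empirical_null(replicate_pairs):
    """Sorted absolute log2 fold changes between replicates of one condition.

    Replicates should not differ biologically, so these values
    approximate the measurement noise of a peptide fold change.
    """
    return sorted(abs(log2_fold_change(a, b)) for a, b in replicate_pairs)


def empirical_p_value(observed_fc, null_abs_fcs):
    """Share of null |fold changes| at least as large as |observed_fc|.

    The +1 terms keep the estimate away from exactly zero.
    """
    n = len(null_abs_fcs)
    first_ge = bisect_left(null_abs_fcs, abs(observed_fc))
    return (n - first_ge + 1) / (n + 1)


# Hypothetical peptide intensities (replicate A, replicate B).
replicate_pairs = [(100.0, 110.0), (200.0, 190.0), (50.0, 55.0), (400.0, 380.0)]
null = empirical_null(replicate_pairs)

p_small = empirical_p_value(log2_fold_change(105.0, 100.0), null)
p_large = empirical_p_value(log2_fold_change(400.0, 100.0), null)
```

With only four null values the smallest attainable p-value is 1/5, which shows why a data set-wide noise model built from many peptides is needed in practice.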


Subject(s)
Mass Spectrometry/methods , Peptides/analysis , Proteome/analysis , Proteomics/methods , Software , Alzheimer Disease/metabolism , Benchmarking , Databases, Factual , Francisella/metabolism , Fungal Proteins/analysis , HeLa Cells , Humans , Parkinson Disease/metabolism , Plant Proteins/analysis , Reproducibility of Results , Signal-To-Noise Ratio
2.
Bioinformatics ; 35(18): 3412-3420, 2019 09 15.
Article in English | MEDLINE | ID: mdl-30759193

ABSTRACT

MOTIVATION: Several gene expression-based risk scores and subtype classifiers for breast cancer were developed to distinguish high- and low-risk patients. Evaluating the performance of these classifiers helps to decide which classifiers should be used in clinical practice for personal therapeutic recommendations. So far, studies that compared multiple classifiers in large independent patient cohorts mostly used microarray measurements. qPCR-based classifiers were not included in the comparison or had to be adapted to the different experimental platforms. RESULTS: We used a prospective study of 726 early breast cancer patients from seven certified German breast cancer centers. Patients were treated according to national guidelines and the expression of 94 selected genes was measured by the mid-throughput qPCR platform Fluidigm. Clinical and pathological data, including outcome over five years, are available. Using these data, we could compare the performance of six classifiers (scmgene and research versions of PAM50, ROR-S, recurrence score, EndoPredict and GGI). In line with other studies, we found similar or even higher concordance between most of the classifiers, and most were also able to differentiate high- and low-risk patients. The classifiers that were originally developed for microarray data still performed similarly using the Fluidigm data. Therefore, Fluidigm can be used to measure the gene expression levels needed by several classifiers for a large cohort with little effort. In addition, we provide an interactive report of the results, which enables a transparent, in-depth comparison of classifiers and their prediction of individual patients. AVAILABILITY AND IMPLEMENTATION: https://services.bio.ifi.lmu.de/pia/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
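Concordance between two classifiers can be quantified in several ways. The abstract does not state which measure the authors used, so the sketch below assumes Cohen's kappa as one common choice; the function name and the patient labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label lists of equal length."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of patients given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both classifiers labelled patients independently.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both classifiers always emit the same single label
    return (observed - expected) / (1.0 - expected)


# Hypothetical risk calls for four patients by two classifiers.
classifier_1 = ["high", "high", "low", "low"]
classifier_2 = ["high", "low", "low", "low"]
kappa = cohens_kappa(classifier_1, classifier_2)
```

Here the two classifiers agree on three of four patients, half of which would be expected by chance, giving a kappa of 0.5.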


Subject(s)
Breast Neoplasms , Humans , Neoplasm Recurrence, Local , Prospective Studies , Real-Time Polymerase Chain Reaction , Risk
3.
Bioinformatics ; 33(12): 1837-1844, 2017 Jun 15.
Article in English | MEDLINE | ID: mdl-28165113

ABSTRACT

MOTIVATION: The goal of many genome-wide experiments is to explain the changes between the analyzed conditions. Typically, the analysis starts with a set of differential genes DG, and the first step is to identify the set of relevant biological processes BP. Current enrichment methods identify the involved biological process via statistically significant overrepresentation of differential genes in predefined sets, but do not further explain how the differential genes interact with each other or which other genes might be important for the enriched process. Other network-based methods determine subnetworks of interacting genes containing many differential genes, but do not employ process knowledge for a more focused analysis. RESULTS: RelExplain is a method to analyze a given biological process bp (e.g. identified by enrichment) in more detail by computing an explanation using the measured DG and a given network. An explanation is a subnetwork that contains the differential genes in the process bp and connects them in the best way given the experimental data, also using genes that are not differential or not in bp. RelExplain takes into account the functional annotations of nodes and the edge consistency of the measurements. Explanations are compact networks of the relevant part of the bp and additional nodes that might be important for the bp. Our evaluation showed that RelExplain is better suited to retrieve manually curated subnetworks from unspecific networks than other algorithms. The interactive RelExplain tool allows users to compute and inspect sub-optimal and alternative optimal explanations. AVAILABILITY AND IMPLEMENTATION: A webserver is available at https://services.bio.ifi.lmu.de/relexplain. CONTACT: berchtold@bio.ifi.lmu.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
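The enrichment step that the abstract takes as its starting point (overrepresentation of differential genes in a predefined set) is commonly computed with a one-sided hypergeometric test. The sketch below shows that standard test only, not RelExplain itself; the function name and the gene counts are hypothetical.

```python
from math import comb


def overrepresentation_p_value(n_genes, n_in_process, n_differential, n_overlap):
    """P(X >= n_overlap) for X ~ Hypergeometric.

    n_genes:        all measured genes
    n_in_process:   genes annotated to the biological process bp
    n_differential: size of the differential gene set DG
    n_overlap:      differential genes that are also in bp
    """
    total = comb(n_genes, n_differential)
    upper = min(n_in_process, n_differential)
    favourable = sum(
        comb(n_in_process, i) * comb(n_genes - n_in_process, n_differential - i)
        for i in range(n_overlap, upper + 1)
    )
    return favourable / total


# Hypothetical toy numbers: 10 genes, 3 in the process, 3 differential,
# and all 3 differential genes belong to the process.
p_enriched = overrepresentation_p_value(10, 3, 3, 3)
p_trivial = overrepresentation_p_value(10, 3, 3, 0)
```

In this toy case only one of the 120 possible gene triples lies entirely inside the process, so the p-value is 1/120. Such a test flags the process but says nothing about how the genes interact, which is the gap the abstract describes.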


Subject(s)
Computational Biology/methods , Metabolic Networks and Pathways , Software , Algorithms , Biological Phenomena , Breast Neoplasms/metabolism , Humans , Molecular Sequence Annotation/methods
4.
PLoS One ; 8(9): e73071, 2013.
Article in English | MEDLINE | ID: mdl-24019895

ABSTRACT

RNA sequencing (RNA-seq) provides novel opportunities for transcriptomic studies at nucleotide resolution, including transcriptomics of viruses or microbes infecting a cell. However, standard approaches for mapping the resulting sequencing reads generally ignore alternative sources of expression other than the host cell and are poorly equipped to address the problems arising from redundancies and gaps among sequenced microbe and virus genomes. We show that screening of sequencing reads for contaminations and infections can be performed easily using ContextMap, our recently developed mapping software. Based on mapping-derived statistics, mapping confidence, similarities and misidentifications (e.g. due to missing genome sequences) of species/strains can be assessed. Performance of our approach is evaluated on three real-life sequencing data sets and compared to state-of-the-art metagenomics tools. In particular, ContextMap vastly outperformed GASiC and GRAMMy in terms of runtime. In contrast to MEGAN4, it was capable of providing individual read mappings to species and resolving non-unique mappings, thus allowing the identification of misalignments caused by sequence similarities between genomes and missing genome sequences. Our study illustrates the importance and potential of routinely mining RNA-seq experiments for infections or contaminations by microbes and viruses. By using ContextMap, gene expression of infecting agents can be analyzed and novel insights into infection processes and tumorigenesis can be obtained.


Subject(s)
Data Mining , Infections/genetics , Sequence Analysis, RNA , Colorectal Neoplasms/genetics , Colorectal Neoplasms/microbiology , HeLa Cells , Humans , Microbiota
5.
Bioinformatics ; 27(13): i366-73, 2011 Jul 01.
Article in English | MEDLINE | ID: mdl-21685094

ABSTRACT

MOTIVATION: Current gene set enrichment approaches do not take interactions and associations between set members into account. Mutual activation and inhibition causing positive and negative correlation among set members are thus neglected. As a consequence, inconsistent regulations and contextless expression changes are reported and, thus, the biological interpretation of the result is impeded. RESULTS: We analyzed established gene set enrichment methods and their result sets in a large-scale investigation of 1000 expression datasets. The reported statistically significant gene sets exhibit only average consistency between the observed patterns of differential expression and known regulatory interactions. We present Gene Graph Enrichment Analysis (GGEA) to detect consistently and coherently enriched gene sets, based on prior knowledge derived from directed gene regulatory networks. Firstly, GGEA improves the concordance of pairwise regulation with individual expression changes in respective pairs of regulating and regulated genes, compared with set enrichment methods. Secondly, GGEA yields result sets where a large fraction of relevant expression changes can be explained by nearby regulators, such as transcription factors, again improving on set-based methods. Thirdly, we demonstrate in additional case studies that GGEA can be applied to human regulatory pathways, where it sensitively detects very specific regulation processes, which are altered in tumors of the central nervous system. GGEA significantly increases the detection of gene sets where measured positively or negatively correlated expression patterns coincide with directed inducing or repressing relationships, thus facilitating further interpretation of gene expression data. AVAILABILITY: The method and accompanying visualization capabilities have been bundled into an R package and tied to a graphical user interface, the Galaxy workflow environment, that is running as a web server.
CONTACT: Ludwig.Geistlinger@bio.ifi.lmu.de; Ralf.Zimmer@bio.ifi.lmu.de.
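The notion of consistency between expression changes and directed regulatory edges can be made concrete with a deliberately simplified sketch. It is not the GGEA scoring scheme: it assumes that only the sign of each fold change matters, encodes activation as +1 and inhibition as -1, and uses hypothetical gene names and values.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def edge_consistency(edges, fold_changes):
    """Mean sign agreement over regulatory edges with measured endpoints.

    edges:        iterable of (regulator, target, effect), effect = +1
                  for activation and -1 for inhibition
    fold_changes: dict mapping gene name to log fold change
    An edge scores +1 when regulator and target change as the edge
    predicts, -1 when they contradict it, and 0 when either is unchanged.
    """
    scores = []
    for regulator, target, effect in edges:
        if regulator not in fold_changes or target not in fold_changes:
            continue  # skip edges without measurements for both genes
        scores.append(
            sign(fold_changes[regulator]) * sign(fold_changes[target]) * effect
        )
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical network and log fold changes.
edges = [("TF1", "G1", +1), ("TF1", "G2", -1), ("TF2", "G3", +1)]
fold_changes = {"TF1": 1.5, "G1": 2.0, "G2": 0.8, "TF2": -1.0, "G3": -0.5}
score = edge_consistency(edges, fold_changes)
```

Two edges agree with the data and one (the inhibition of G2 by an up-regulated TF1, although G2 is also up) contradicts it, so the score is 1/3. A gene set can be overrepresented among differential genes and still score poorly on such a measure.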


Subject(s)
Gene Expression Profiling , Neoplasms, Nerve Tissue/genetics , Neoplasms, Nerve Tissue/metabolism , Software , Algorithms , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Humans , Proteins/genetics , Signal Transduction