Results 1 - 20 of 27
1.
FEBS Lett ; 598(6): 635-657, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38366111

ABSTRACT

The response to proteotoxic stresses such as heat shock allows organisms to maintain protein homeostasis under changing environmental conditions. We asked what happens if an organism can no longer react to cytosolic proteotoxic stress. To test this, we deleted or depleted, either individually or in combination, the stress-responsive transcription factors Msn2, Msn4, and Hsf1 in Saccharomyces cerevisiae. Our study reveals a combination of survival strategies, which together protect essential proteins. Msn2 and 4 broadly reprogram transcription, triggering the response to oxidative stress, as well as biosynthesis of the protective sugar trehalose and glycolytic enzymes, while Hsf1 mainly induces the synthesis of molecular chaperones and reverses the transcriptional response upon prolonged mild heat stress (adaptation).


Subjects
Saccharomyces cerevisiae Proteins , Transcription Factors , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Heat Shock Transcription Factors/genetics , Heat Shock Transcription Factors/metabolism , Heat-Shock Proteins/genetics , Heat-Shock Proteins/metabolism , Heat-Shock Response/genetics , Proteotoxic Stress , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Transcription Factors/metabolism
2.
Blood ; 141(6): 645-658, 2023 02 09.
Article in English | MEDLINE | ID: mdl-36223592

ABSTRACT

The mechanisms of coordinated changes in proteome composition and their relevance for the differentiation of neutrophil granulocytes are not well studied. Here, we discover 2 novel human genetic defects in signal recognition particle receptor alpha (SRPRA) and SRP19, constituents of the mammalian cotranslational targeting machinery, and characterize their roles in neutrophil granulocyte differentiation. We systematically study the proteome of neutrophil granulocytes from patients with variants in the SRP genes, HAX1, and ELANE, and identify global as well as specific proteome aberrations. Using in vitro differentiation of human induced pluripotent stem cells and in vivo zebrafish models, we study the effects of SRP deficiency on neutrophil granulocyte development. In a heterologous cell-based inducible protein expression system, we validate the effects conferred by SRP dysfunction for selected proteins that we identified in our proteome screen. Thus, SRP-dependent protein processing, intracellular trafficking, and homeostasis are critically important for the differentiation of neutrophil granulocytes.


Subjects
Induced Pluripotent Stem Cells , Proteome , Animals , Humans , Zebrafish , Human Genetics , Mammals , Adaptor Proteins, Signal Transducing
3.
Brief Bioinform ; 22(1): 545-556, 2021 01 18.
Article in English | MEDLINE | ID: mdl-32026945

ABSTRACT

MOTIVATION: Although gene set enrichment analysis has become an integral part of high-throughput gene expression data analysis, the assessment of enrichment methods remains rudimentary and ad hoc. In the absence of suitable gold standards, evaluations are commonly restricted to selected datasets and biological reasoning on the relevance of resulting enriched gene sets. RESULTS: We develop an extensible framework for reproducible benchmarking of enrichment methods based on defined criteria for applicability, gene set prioritization and detection of relevant processes. This framework incorporates a curated compendium of 75 expression datasets investigating 42 human diseases. The compendium features microarray and RNA-seq measurements, and each dataset is associated with a precompiled GO/KEGG relevance ranking for the corresponding disease under investigation. We perform a comprehensive assessment of 10 major enrichment methods, identifying significant differences in runtime and applicability to RNA-seq data, fraction of enriched gene sets depending on the null hypothesis tested and recovery of the predefined relevance rankings. We make practical recommendations on how methods originally developed for microarray data can efficiently be applied to RNA-seq data, how to interpret results depending on the type of gene set test conducted and which methods are best suited to effectively prioritize gene sets with high phenotype relevance. AVAILABILITY: http://bioconductor.org/packages/GSEABenchmarkeR. CONTACT: ludwig.geistlinger@sph.cuny.edu.


Subjects
Gene Expression Profiling/methods , Genomics/methods , RNA-Seq/methods , Animals , Benchmarking , Databases, Genetic/standards , Gene Expression Profiling/standards , Genomics/standards , Humans , RNA-Seq/standards , Software
4.
Cell Rep ; 29(13): 4593-4607.e8, 2019 12 24.
Article in English | MEDLINE | ID: mdl-31875563

ABSTRACT

Life is resilient because living systems are able to respond to elevated temperatures with an ancient gene expression program called the heat shock response (HSR). In yeast, the transcription of hundreds of genes is upregulated at stress temperatures. Besides stress protection conferred by chaperones, the function of the majority of the upregulated genes under stress has remained enigmatic. We show that those genes are required to directly counterbalance increased protein turnover at stress temperatures and to maintain the metabolism. This anaplerotic reaction together with molecular chaperones allows yeast to efficiently buffer proteotoxic stress. When the capacity of this system is exhausted at extreme temperatures, aggregation processes stop translation and growth pauses. The emerging concept is that the HSR is modular with distinct programs dependent on the severity of the stress.


Subjects
Heat-Shock Response , Molecular Chaperones/metabolism , Proteostasis , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Gene Expression Regulation, Fungal , Heat-Shock Response/genetics , Kinetics , Models, Genetic , Protein Aggregates , Protein Biosynthesis , Proteolysis , Proteome/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomes/metabolism , Saccharomyces cerevisiae/genetics , Transcriptome/genetics
5.
Biotechnol Biofuels ; 12: 243, 2019.
Article in English | MEDLINE | ID: mdl-31636702

ABSTRACT

BACKGROUND: One of the main obstacles preventing solventogenic clostridia from achieving higher yields in biofuel production is the toxicity of produced solvents. Unfortunately, regulatory mechanisms responsible for the shock response are poorly described on the transcriptomic level. Although the strain Clostridium beijerinckii NRRL B-598, a promising butanol producer, has been studied under different conditions in the past, its transcriptional response to a shock caused by butanol in the cultivation medium remains unknown. RESULTS: In this paper, we present a transcriptional response of the strain during a butanol challenge, caused by the addition of butanol to the cultivation medium at the very end of the acidogenic phase, using RNA-Seq. We resequenced and reassembled the genome sequence of the strain and prepared novel genome and gene ontology annotation to provide the most accurate results. When compared to samples under standard cultivation conditions, samples gathered during butanol shock represented a well-distinguished group. Using reference samples gathered directly before the addition of butanol, we identified genes that were differentially expressed in butanol challenge samples. We determined clusters of 293 down-regulated and 301 up-regulated genes whose expression was affected by the cultivation conditions. The enriched term "RNA binding" among down-regulated genes corresponded to the downturn of translation, and the cluster contained a group of small acid-soluble spore proteins. This explained the phenotype of the culture that had not sporulated. On the other hand, up-regulated genes were characterized by the term "protein binding", which corresponded to activation of heat-shock proteins that were identified within this cluster. CONCLUSIONS: We provided an overall transcriptional response of the strain C. beijerinckii NRRL B-598 to butanol shock, supplemented by auxiliary technologies, including high-pressure liquid chromatography and flow cytometry, to capture the corresponding phenotypic response. We identified genes whose regulation was affected by the addition of butanol to the cultivation medium and inferred related molecular functions that were significantly influenced. Additionally, using high-quality genome assembly and custom-made gene ontology annotation, we demonstrated that this settled terminology, widely used for the analysis of model organisms, could also be applied to non-model organisms and for research in the field of biofuels.

6.
Mol Cell Proteomics ; 18(9): 1880-1892, 2019 09.
Article in English | MEDLINE | ID: mdl-31235637

ABSTRACT

Mass spectrometry based proteomics is the method of choice for quantifying genome-wide differential changes of protein expression in a wide range of biological and biomedical applications. Protein expression changes need to be reliably derived from many measured peptide intensities and their corresponding peptide fold changes. These peptide fold changes vary considerably for a given protein. Numerous instrumental setups aim to reduce this variability, whereas current computational methods only implicitly account for this problem. We introduce a new method, MS-EmpiRe, which explicitly accounts for the noise underlying peptide fold changes. We derive data set-specific, intensity-dependent empirical error fold change distributions, which are used for individual weighing of peptide fold changes to detect differentially expressed proteins (DEPs). In a recently published proteome-wide benchmarking data set, MS-EmpiRe doubles the number of correctly identified DEPs at an estimated FDR cutoff compared with state-of-the-art tools. We additionally confirm the superior performance of MS-EmpiRe on simulated data. MS-EmpiRe requires only peptide intensities mapped to proteins and, thus, can be applied to any common quantitative proteomics setup. We apply our method to diverse MS data sets and observe consistent increases in sensitivity with more than 1000 additional significant proteins in deep data sets, including a clinical study over multiple patients. At the same time, we observe that even the proteins classified as most insignificant by other methods but significant by MS-EmpiRe show very clear regulation on the peptide intensity level. MS-EmpiRe provides rapid processing (< 2 min for 6 LC-MS/MS runs (3 h gradients)) and is publicly available under github.com/zimmerlab/MS-EmpiRe with a manual including examples.


Subjects
Mass Spectrometry/methods , Peptides/analysis , Proteome/analysis , Proteomics/methods , Software , Alzheimer Disease/metabolism , Benchmarking , Databases, Factual , Francisella/metabolism , Fungal Proteins/analysis , HeLa Cells , Humans , Parkinson Disease/metabolism , Plant Proteins/analysis , Reproducibility of Results , Signal-To-Noise Ratio
7.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30821814

ABSTRACT

The stress response in the model organism Saccharomyces cerevisiae is a well-studied system for which many data sets are available. Already in 2000, it was discovered that yeast cells trigger a similar transcriptional response when different types of stress are applied. However, the exact regulatory mechanisms and differences between the different types of stress are still not understood. Here, we present the Yeast Environmental Stress database (YESdb), a database containing all high-throughput experiments measuring various kinds of stress in yeast. The goal of the database is to allow the user to execute complex, integrative analyses of selected data sets, e.g. the comparison of measurements of the same stress using different platforms or differences between strains, stress strengths or types of stress. The analyses can be visualized in various ways and can be compiled into interactive reports to summarize and communicate the results. The data sets are available as differential conditions (typically stressed vs control), which are grouped to time or concentration series when multiple measurements over time or concentrations are done in one experiment. An annotation ontology has been constructed to annotate the data sets with the type, duration and strength of the applied stress, the used strain and experimental platform as well as the publication date. These annotations can easily be combined to select all relevant data sets for an analysis. YESdb allows users to construct and execute Petri net-based workflows to perform predefined and custom analyses. For example, to compare two types of stress (e.g. salt vs oxidative stress), the corresponding data sets are selected from the database, the consistently changed genes are defined and combined and the shared genes are characterized by enrichment analysis. A broad collection of visualizations is available, most of which are also interactive. The results of all analyses can be summarized in an interactive report. Visualizations of individual steps (transitions) of YESdb workflows can be automatically added to this report or customized visualizations as well as interpretive text can manually be added to the report. Overall, YESdb aims at making all published data sets on yeast stress immediately available and comparable for integrated analysis of data sets and sets of genes in order to identify and assess hypotheses and mechanisms.


Subjects
Databases, Factual , Environment , Saccharomyces cerevisiae/physiology , Stress, Physiological , Data Curation , Internet , User-Computer Interface
8.
Bioinformatics ; 35(18): 3412-3420, 2019 09 15.
Article in English | MEDLINE | ID: mdl-30759193

ABSTRACT

MOTIVATION: Several gene expression-based risk scores and subtype classifiers for breast cancer were developed to distinguish high- and low-risk patients. Evaluating the performance of these classifiers helps to decide which classifiers should be used in clinical practice for personal therapeutic recommendations. So far, studies that compared multiple classifiers in large independent patient cohorts mostly used microarray measurements. qPCR-based classifiers were not included in the comparison or had to be adapted to the different experimental platforms. RESULTS: We used a prospective study of 726 early breast cancer patients from seven certified German breast cancer centers. Patients were treated according to national guidelines and the expressions of 94 selected genes were measured by the mid-throughput qPCR platform Fluidigm. Clinical and pathological data including outcome over five years is available. Using these data, we could compare the performance of six classifiers (scmgene and research versions of PAM50, ROR-S, recurrence score, EndoPredict and GGI). Similar to other studies, we found a similar or even higher concordance between most of the classifiers and most were also able to differentiate high- and low-risk patients. The classifiers that were originally developed for microarray data still performed similarly using the Fluidigm data. Therefore, Fluidigm can be used to measure the gene expressions needed by several classifiers for a large cohort with little effort. In addition, we provide an interactive report of the results, which enables a transparent, in-depth comparison of classifiers and their prediction of individual patients. AVAILABILITY AND IMPLEMENTATION: https://services.bio.ifi.lmu.de/pia/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Breast Neoplasms , Humans , Neoplasm Recurrence, Local , Prospective Studies , Real-Time Polymerase Chain Reaction , Risk
9.
J Proteome Res ; 18(4): 1553-1566, 2019 04 05.
Article in English | MEDLINE | ID: mdl-30793903

ABSTRACT

Spectral libraries play a central role in the analysis of data-independent-acquisition (DIA) proteomics experiments. A main assumption in current spectral library tools is that a single characteristic intensity pattern (CIP) suffices to describe the fragmentation of a peptide in a particular charge state (peptide charge pair). However, we find that this is often not the case. We carry out a systematic evaluation of spectral variability over public repositories and in-house data sets. We show that spectral variability is widespread and partly occurs under fixed experimental conditions. Using clustering of preprocessed spectra, we derive a limited number of multiple characteristic intensity patterns (MCIPs) for each peptide charge pair, which allow almost complete coverage of our heterogeneous data set without affecting the false discovery rate. We show that a MCIP library derived from public repositories performs in most cases similar to a "custom-made" spectral library, which has been acquired under identical experimental conditions as the query spectra. We apply the MCIP approach to a DIA data set and observe a significant increase in peptide recognition. We propose the MCIP approach as an easy-to-implement addition to current spectral library search engines and as a new way to utilize the data stored in spectral repositories.


Subjects
Chromatography, Liquid , Databases, Protein , Peptide Library , Proteomics/methods , Tandem Mass Spectrometry , Algorithms , Peptide Fragments/chemistry , Peptide Fragments/genetics
10.
Nat Commun ; 9(1): 2645, 2018 07 06.
Article in English | MEDLINE | ID: mdl-29980665

ABSTRACT

Blood flow at arterial bifurcations and curvatures is naturally disturbed. Endothelial cells (ECs) fail to adapt to disturbed flow, which transcriptionally directs ECs toward a maladapted phenotype, characterized by chronic regeneration of injured ECs. MicroRNAs (miRNAs) can regulate EC maladaptation through targeting of protein-coding RNAs. However, long noncoding RNAs (lncRNAs), known epigenetic regulators of biological processes, can also be miRNA targets, but their contribution to EC maladaptation is unclear. Here we show that hyperlipidemia- and oxLDL-induced upregulation of miR-103 inhibits EC proliferation and promotes endothelial DNA damage through targeting of novel lncWDR59. MiR-103 impedes lncWDR59 interaction with Notch1-inhibitor Numb, therefore affecting Notch1-induced EC proliferation. Moreover, miR-103 increases the susceptibility of proliferating ECs to oxLDL-induced mitotic aberrations, characterized by an increased micronucleic formation and DNA damage accumulation, by affecting Notch1-related β-catenin co-activation. Collectively, these data indicate that miR-103 programs ECs toward a maladapted phenotype through targeting of lncWDR59, which may promote atherosclerosis.


Subjects
Endothelial Cells/metabolism , MicroRNAs/metabolism , RNA, Long Noncoding/metabolism , Animals , Atherosclerosis/genetics , Atherosclerosis/pathology , Base Sequence , Cell Proliferation , DNA Damage , Gene Expression Regulation , HMGB Proteins/metabolism , Humans , Lipoproteins, LDL , Membrane Proteins/metabolism , Mice , MicroRNAs/genetics , Micronuclei, Chromosome-Defective , Nerve Tissue Proteins/metabolism , RNA, Long Noncoding/genetics , Receptors, Notch/metabolism , Ribonuclease III/metabolism , SOXF Transcription Factors/metabolism , Signal Transduction , beta Catenin/metabolism
11.
Bioinformatics ; 33(12): 1837-1844, 2017 Jun 15.
Article in English | MEDLINE | ID: mdl-28165113

ABSTRACT

MOTIVATION: The goal of many genome-wide experiments is to explain the changes between the analyzed conditions. Typically, the analysis is started with a set of differential genes DG and the first step is to identify the set of relevant biological processes BP. Current enrichment methods identify the involved biological process via statistically significant overrepresentation of differential genes in predefined sets, but do not further explain how the differential genes interact with each other or which other genes might be important for the enriched process. Other network-based methods determine subnetworks of interacting genes containing many differential genes, but do not employ process knowledge for a more focused analysis. RESULTS: RelExplain is a method to analyze a given biological process bp (e.g. identified by enrichment) in more detail by computing an explanation using the measured DG and a given network. An explanation is a subnetwork that contains the differential genes in the process bp and connects them in the best way given the experimental data using also genes that are not differential or not in bp. RelExplain takes into account the functional annotations of nodes and the edge consistency of the measurements. Explanations are compact networks of the relevant part of the bp and additional nodes that might be important for the bp. Our evaluation showed that RelExplain is better suited to retrieve manually curated subnetworks from unspecific networks than other algorithms. The interactive RelExplain tool allows users to compute and inspect sub-optimal and alternative optimal explanations. AVAILABILITY AND IMPLEMENTATION: A webserver is available at https://services.bio.ifi.lmu.de/relexplain. CONTACT: berchtold@bio.ifi.lmu.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Metabolic Networks and Pathways , Software , Algorithms , Biological Phenomena , Breast Neoplasms/metabolism , Humans , Molecular Sequence Annotation/methods
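
The overrepresentation testing that the abstract above attributes to current enrichment methods is commonly carried out as a one-sided hypergeometric test. The following is a minimal sketch of that general idea only; it is not taken from RelExplain, and the function name, parameter names and toy numbers are assumptions made for illustration.

```python
from math import comb


def enrichment_p_value(universe_size, set_size, n_differential, overlap):
    """One-sided hypergeometric p-value for overrepresentation.

    Probability of seeing at least `overlap` differential genes in a gene
    set of `set_size` genes when `n_differential` genes are drawn at
    random from a universe of `universe_size` genes.
    """
    # The overlap can never exceed the smaller of the two sets.
    upper = min(set_size, n_differential)
    total = comb(universe_size, n_differential)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, n_differential - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 measured genes, a 5-gene process,
# 5 differential genes, 3 of which fall into the process.
p = enrichment_p_value(universe_size=20, set_size=5, n_differential=5, overlap=3)
print(f"p = {p:.4f}")
```

A test of this kind only reports that a process is enriched; as the abstract notes, it does not explain how the differential genes in the process are connected.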
12.
PLoS One ; 11(10): e0164513, 2016.
Article in English | MEDLINE | ID: mdl-27723775

ABSTRACT

Several methods predict activity changes of transcription factors (TFs) from a given regulatory network and measured expression data. But available gene regulatory networks are incomplete and contain many condition-dependent regulations that are not relevant for the specific expression measurement. It is not known which combination of active TFs is needed to cause a change in the expression of a target gene. A method to systematically evaluate the inferred activity changes is missing. We present such an evaluation strategy that indicates for how many target genes the observed expression changes can be explained by a given set of active TFs. To overcome the problem that the exact combination of active TFs needed to activate a gene is typically not known, we assume a gene to be explained if there exists any combination for which the predicted active TFs can possibly explain the observed change of the gene. We introduce the i-score (inconsistency score), which quantifies how many genes could not be explained by the set of activity changes of TFs. We observe that, even for these minimal requirements, published methods yield many unexplained target genes, i.e. large i-scores. This holds for all methods and all expression datasets we evaluated. We provide new optimization methods to calculate the best possible (minimal) i-score given the network and measured expression data. The evaluation of this optimized i-score on a large data compendium yields many unexplained target genes for almost every case. This indicates that currently available regulatory networks are still far from being complete. Both the presented Act-SAT and Act-A* methods produce optimal sets of TF activity changes, which can be used to investigate the difficult interplay of expression and network data. A web server and a command line tool to calculate our i-score and to find the active TFs associated with the minimal i-score is available from https://services.bio.ifi.lmu.de/i-score.


Subjects
Databases, Genetic , Gene Expression Profiling/methods , Gene Expression Regulation , Models, Genetic , Transcription Factors/metabolism , Animals , Humans , Transcription Factors/genetics
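
The i-score described in the abstract above counts target genes whose observed expression change cannot be explained by any predicted TF activity change. The sketch below is a simplified reading of that definition, not the published Act-SAT or Act-A* implementation: it assumes a signed network (+1 for activation, -1 for repression) and treats a gene as explained if at least one regulator's activity change, multiplied by the edge sign, matches the observed direction. All names and data structures are hypothetical.

```python
def i_score(network, tf_activity, gene_change):
    """Count changed genes that no predicted TF activity change can explain.

    network:      gene -> list of (tf, sign), sign +1 (activator) or -1 (repressor)
    tf_activity:  tf -> +1 (more active), -1 (less active); missing means unchanged
    gene_change:  gene -> +1 (up), -1 (down), 0 (unchanged)
    """
    unexplained = 0
    for gene, change in gene_change.items():
        if change == 0:
            continue  # only changed genes need an explanation
        explained = any(
            tf_activity.get(tf, 0) * sign == change
            for tf, sign in network.get(gene, [])
        )
        if not explained:
            unexplained += 1
    return unexplained


# Hypothetical toy example.
network = {"g1": [("TF1", +1)], "g2": [("TF1", -1)], "g3": [("TF2", +1)]}
tf_activity = {"TF1": +1}
gene_change = {"g1": +1, "g2": +1, "g3": -1}
print(i_score(network, tf_activity, gene_change))
```

In this toy case g1 is explained by the activated TF1, whereas g2 (repressed by an activated TF1 but observed up) and g3 (its only regulator is unchanged) remain unexplained.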
13.
Circ Res ; 119(9): 1030-1038, 2016 Oct 14.
Article in English | MEDLINE | ID: mdl-27531933

ABSTRACT

RATIONALE: Atheroprogression is a consequence of nonresolved inflammation, and currently a comprehensive overview of the mechanisms preventing resolution is missing. However, in acute inflammation, resolution is known to be orchestrated by a switch from inflammatory to resolving lipid mediators. Therefore, we hypothesized that lesional lipid mediator imbalance favors atheroprogression. OBJECTIVE: To understand the lipid mediator balance during atheroprogression and to establish an interventional strategy based on the delivery of resolving lipid mediators. METHODS AND RESULTS: Aortic lipid mediator profiling of aortas from Apoe-/- mice fed a high-fat diet for 4 weeks, 8 weeks, or 4 months revealed an expansion of inflammatory lipid mediators, Leukotriene B4 and Prostaglandin E2, and a concomitant decrease of resolving lipid mediators, Resolvin D2 (RvD2) and Maresin 1 (MaR1), during advanced atherosclerosis. Functionally, aortic Leukotriene B4 and Prostaglandin E2 levels correlated with traits of plaque instability, whereas RvD2 and MaR1 levels correlated with the signs of plaque stability. In a therapeutic context, repetitive RvD2 and MaR1 delivery prevented atheroprogression as characterized by halted expansion of the necrotic core and accumulation of macrophages along with increased fibrous cap thickness and smooth muscle cell numbers. Mechanistically, RvD2 and MaR1 induced a shift in macrophage profile toward a reparative phenotype, which secondarily stimulated collagen synthesis in smooth muscle cells. CONCLUSIONS: We present evidence for the imbalance between inflammatory and resolving lipid mediators during atheroprogression. Delivery of RvD2 and MaR1 successfully prevented atheroprogression, suggesting that resolving lipid mediators potentially represent an innovative strategy to resolve arterial inflammation.


Subjects
Atherosclerosis/metabolism , Atherosclerosis/prevention & control , Docosahexaenoic Acids/metabolism , Inflammation Mediators/metabolism , Lipid Metabolism/physiology , Animals , Atherosclerosis/etiology , Cells, Cultured , Diet, High-Fat/adverse effects , Disease Progression , Docosahexaenoic Acids/administration & dosage , Drug Delivery Systems/methods , Lipid Metabolism/drug effects , Mice , Mice, Inbred C57BL , Mice, Knockout
14.
J Mol Biol ; 428(8): 1544-57, 2016 Apr 24.
Article in English | MEDLINE | ID: mdl-26953259

ABSTRACT

Alternative splicing often affects structured and highly conserved regions of proteins, generating so-called non-trivial splicing variants of unknown structure and cellular function. The human small G-protein Rab1A is involved in the regulation of the vesicle transfer from the ER to Golgi. A conserved non-trivial splice variant lacks nearly 40% of the sequence of the native Rab1A, including most of the regulatory interaction sites. We show that this variant of Rab1A represents a stable and folded protein, which is still able to bind nucleotides and co-localizes with membranes. Nevertheless, it should be mentioned that compared to other wild-type Rab GTPases, the measured nucleotide binding affinities are dramatically reduced in the variant studied. Furthermore, the Rab1A variant forms hetero-dimers with wild-type Rab1A and its presence in the cell enhances the efficiency of alkaline phosphatase secretion. However, this variant shows no specificity for GXP nucleotides, a constantly enhanced GTP hydrolysis activity and is no longer controlled by GEF or GAP proteins, indicating a new regulatory mechanism for the Rab1A cycle via alternative non-trivial splicing.


Subjects
rab1 GTP-Binding Proteins/chemistry , Alternative Splicing , Cell Membrane/metabolism , Evolution, Molecular , Guanosine Diphosphate/chemistry , Guanosine Triphosphate/chemistry , Humans , Hydrolysis , Nucleotides/chemistry , Protein Binding , Protein Folding , Protein Isoforms/chemistry , Protein Multimerization , Protein Structure, Tertiary , Proteome , rab GTP-Binding Proteins/chemistry
15.
BMC Bioinformatics ; 17: 45, 2016 Jan 20.
Article in English | MEDLINE | ID: mdl-26791995

ABSTRACT

BACKGROUND: Enrichment analysis of gene expression data is essential to find functional groups of genes whose interplay can explain experimental observations. Numerous methods have been published that either ignore (set-based) or incorporate (network-based) known interactions between genes. However, the often subtle benefits and disadvantages of the individual methods are confusing for most biological end users and there is currently no convenient way to combine methods for an enhanced result interpretation. RESULTS: We present the EnrichmentBrowser package as an easily applicable software that enables (1) the application of the most frequently used set-based and network-based enrichment methods, (2) their straightforward combination, and (3) a detailed and interactive visualization and exploration of the results. The package is available from the Bioconductor repository and implements additional support for standardized expression data preprocessing, differential expression analysis, and definition of suitable input gene sets and networks. CONCLUSION: The EnrichmentBrowser package implements essential functionality for the enrichment analysis of gene expression data. It combines the advantages of set-based and network-based enrichment analysis in order to derive high-confidence gene sets and biological pathways that are differentially regulated in the expression data under investigation. Besides, the package facilitates the visualization and exploration of such sets and pathways.


Subjects
Gene Regulatory Networks , Microarray Analysis/methods , Software , Databases, Factual , Gene Expression Profiling , Sequence Analysis, RNA
16.
PLoS One ; 10(10): e0140487, 2015.
Article in English | MEDLINE | ID: mdl-26469855

ABSTRACT

mRNA splicing is required in about 4% of protein coding genes in Saccharomyces cerevisiae. The gene structure of those genes is simple, generally comprising two exons and one intron. In order to characterize the impact of alternative splicing on the S. cerevisiae transcriptome, we perform a systematic analysis of mRNA sequencing data. We find evidence of a pervasive use of alternative splice sites and detect several novel introns both within and outside protein coding regions. We also find a predominance of alternative splicing on the 3' side of introns, a finding which is consistent with existing knowledge on conservation of exon-intron boundaries in S. cerevisiae. Some of the alternatively spliced transcripts allow for a translation into different protein products.


Subjects
Alternative Splicing , Saccharomyces cerevisiae/genetics , Transcriptome , High-Throughput Nucleotide Sequencing , Introns , Sequence Analysis, RNA
17.
BMC Bioinformatics ; 16: 122, 2015 Apr 17.
Article in English | MEDLINE | ID: mdl-25928589

ABSTRACT

BACKGROUND: Mapping of short sequencing reads is a crucial step in the analysis of RNA sequencing (RNA-seq) data. ContextMap is an RNA-seq mapping algorithm that uses a context-based approach to identify the best alignment for each read and allows parallel mapping against several reference genomes. RESULTS: In this article, we present ContextMap 2, a new and improved version of ContextMap. Its key novel features are: (i) a plug-in structure that allows easily integrating novel short read alignment programs with improved accuracy and runtime; (ii) context-based identification of insertions and deletions (indels); (iii) mapping of reads spanning an arbitrary number of exons and indels. ContextMap 2 using Bowtie, Bowtie 2 or BWA was evaluated on both simulated and real-life data from the recently published RGASP study. CONCLUSIONS: We show that ContextMap 2 generally combines similar or higher recall compared to other state-of-the-art approaches with significantly higher precision in read placement and junction and indel prediction. Furthermore, runtime was significantly lower than for the best competing approaches. ContextMap 2 is freely available at http://www.bio.ifi.lmu.de/ContextMap.


Subjects
Algorithms , Human Genome , High-Throughput Nucleotide Sequencing/methods , Messenger RNA/genetics , RNA Sequence Analysis/methods , Exons/genetics , Humans , INDEL Mutation/genetics , Transcriptome
18.
PLoS One ; 8(9): e73071, 2013.
Article in English | MEDLINE | ID: mdl-24019895

ABSTRACT

RNA sequencing (RNA-seq) provides novel opportunities for transcriptomic studies at nucleotide resolution, including transcriptomics of viruses or microbes infecting a cell. However, standard approaches for mapping the resulting sequencing reads generally ignore alternative sources of expression other than the host cell and are poorly equipped to address the problems arising from redundancies and gaps among sequenced microbe and virus genomes. We show that screening of sequencing reads for contaminations and infections can be performed easily using ContextMap, our recently developed mapping software. Based on mapping-derived statistics, mapping confidence, similarities and misidentifications (e.g. due to missing genome sequences) of species/strains can be assessed. Performance of our approach is evaluated on three real-life sequencing data sets and compared to state-of-the-art metagenomics tools. In particular, ContextMap vastly outperformed GASiC and GRAMMy in terms of runtime. In contrast to MEGAN4, it was capable of providing individual read mappings to species and resolving non-unique mappings, thus allowing the identification of misalignments caused by sequence similarities between genomes and missing genome sequences. Our study illustrates the importance and potential of routinely mining RNA-seq experiments for infections or contaminations by microbes and viruses. By using ContextMap, gene expression of infecting agents can be analyzed and novel insights into infection processes and tumorigenesis can be obtained.


Subjects
Data Mining , Infections/genetics , RNA Sequence Analysis , Colorectal Neoplasms/genetics , Colorectal Neoplasms/microbiology , HeLa Cells , Humans , Microbiota
19.
Nucleic Acids Res ; 41(18): 8452-63, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23873954

ABSTRACT

Existing machine-readable resources for large-scale gene regulatory networks usually do not provide context information characterizing the activating conditions for a regulation and how targeted genes are affected. Although this information is essentially required for data interpretation, available networks are often restricted to condition-independent, non-quantitative, plain binary interactions as derived from high-throughput screens. In this article, we present a comprehensive Petri net-based regulatory network that controls the diauxic shift in Saccharomyces cerevisiae. For 100 specific enzymatic genes, we collected regulations from public databases as well as identified and manually curated >400 relevant scientific articles. The resulting network consists of >300 multi-input regulatory interactions providing (i) activating conditions for the regulators; (ii) semi-quantitative effects on their targets; and (iii) classification of the experimental evidence. The diauxic shift network compiles widely distributed regulatory information and is available in an easy-to-use machine-readable form. Additionally, we developed a browsable system organizing the network into pathway maps, which allows users to inspect and trace the evidence for each annotated regulation in the model.


Subjects
Fungal Gene Expression Regulation , Gene Regulatory Networks , Saccharomyces cerevisiae/genetics , Citric Acid Cycle/genetics , Fatty Acids/metabolism , Gluconeogenesis/genetics , Genetic Models , Phosphoenolpyruvate Carboxykinase (ATP)/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics
20.
BMC Bioinformatics ; 13 Suppl 6: S9, 2012 Apr 19.
Article in English | MEDLINE | ID: mdl-22537048

ABSTRACT

BACKGROUND: Sequencing of mRNA (RNA-seq) by next generation sequencing technologies is widely used for analyzing the transcriptomic state of a cell. Here, one of the main challenges is the mapping of a sequenced read to its transcriptomic origin. As a simple alignment to the genome will fail to identify reads crossing splice junctions and a transcriptome alignment will miss novel splice sites, several approaches have been developed for this purpose. Most of these approaches have two drawbacks. First, each read is assigned to a location independent of whether the corresponding gene is expressed or not, i.e. information from other reads is not taken into account. Second, in the case of multiple possible mappings, the mapping with the fewest mismatches is usually chosen, which may lead to wrong assignments due to sequencing errors. RESULTS: To address these problems, we developed ContextMap, which efficiently uses information on the context of a read, i.e. reads mapping to the same expressed region. The context information is used to resolve possible ambiguities and, thus, a much larger degree of ambiguity can be allowed in the initial stage in order to detect all possible candidate positions. Although ContextMap can be used as a stand-alone version using either a genome or transcriptome as input, the version presented in this article is focused on refining initial mappings provided by other mapping algorithms. Evaluation results on simulated sequencing reads showed that the application of ContextMap to either TopHat or MapSplice mappings improved the mapping accuracy of both initial mappings considerably. CONCLUSIONS: In this article, we show that the context of reads mapping to nearby locations provides valuable information for identifying the best unique mapping for a read. Using our method, mappings provided by other state-of-the-art methods can be refined and alignment accuracy can be further improved. AVAILABILITY: http://www.bio.ifi.lmu.de/ContextMap.


Subjects
Algorithms , Messenger RNA/genetics , RNA Sequence Analysis/methods , Animals , Genome , Humans , Mice , RNA Splicing , Transcriptome
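
The idea summarized in record 20 (choosing among several candidate mappings of a read by looking at other reads in the same expressed region, rather than by fewest mismatches alone) can be illustrated with a small sketch. This is not ContextMap's actual implementation: the `Candidate` type, the `resolve_by_context` function, the fixed-size `window` used to define a "region", and the scoring rule are all assumptions made for illustration only.

```python
from collections import Counter
from typing import NamedTuple


class Candidate(NamedTuple):
    """One possible alignment of a read (hypothetical structure)."""
    read_id: str
    chrom: str
    pos: int
    mismatches: int


def resolve_by_context(candidates, window=1000):
    """Pick one candidate per read, preferring well-supported regions.

    A region is approximated here by a fixed-size genomic window, which is
    an assumption of this sketch and not the published method.
    """
    def region(c):
        return (c.chrom, c.pos // window)

    # Count the distinct reads that have at least one candidate per region.
    support = Counter()
    for reg, _read_id in {(region(c), c.read_id) for c in candidates}:
        support[reg] += 1

    # For each read, keep the candidate in the best-supported region;
    # the mismatch count is used only to break ties.
    best = {}
    for c in candidates:
        score = (support[region(c)], -c.mismatches)
        if c.read_id not in best or score > best[c.read_id][0]:
            best[c.read_id] = (score, c)
    return {read_id: cand for read_id, (_score, cand) in best.items()}
```

In this sketch, a read with one candidate carrying a single mismatch in a region covered by other reads and another mismatch-free candidate in an otherwise empty region is assigned to the covered region, which is the opposite of what a fewest-mismatches rule would choose.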