Results 1 - 20 of 30
1.
Sci Rep ; 9(1): 13949, 2019 09 27.
Article in English | MEDLINE | ID: mdl-31562339

ABSTRACT

Alternative splicing produces multiple mRNA isoforms of genes, which have important and diverse roles in the regulation of gene expression, human heritable diseases, and the response to environmental stresses. However, little has been done to assign functions at the mRNA isoform level. Functional networks, in which interactions are quantified by the probability that two nodes are involved in the same biological process, are typically generated at the gene level. We use a diverse array of tissue-specific RNA-seq datasets and sequence information to train random forest models that predict the functional networks. Since there is no mRNA isoform-level gold standard, we use single-isoform genes co-annotated to Gene Ontology biological process annotations, Kyoto Encyclopedia of Genes and Genomes pathways, BioCyc pathways and protein-protein interactions as functionally related (positive pairs). To generate the non-functional pairs (negative pairs), we use the Gene Ontology annotations tagged with the "NOT" qualifier. We describe 17 Tissue-spEcific mrNa iSoform functIOnal Networks (TENSION) following a leave-one-tissue-out strategy, in addition to an organism-level reference functional network for mouse. We validate our predictions by comparing their performance with previous methods, randomized positive and negative class labels, updated Gene Ontology annotations, and literature evidence. We demonstrate the ability of our networks to reveal tissue-specific functional differences among the isoforms of the same genes. All scripts and data from TENSION are available at: https://doi.org/10.25380/iastate.c.4275191 .


Subject(s)
Gene Regulatory Networks/physiology , RNA Isoforms/metabolism , Messenger RNA/metabolism , Algorithms , Alternative Splicing , Animals , Mice , Genetic Models , Organ Specificity , RNA Isoforms/genetics , Messenger RNA/genetics
2.
PLoS One ; 12(11): e0187091, 2017.
Article in English | MEDLINE | ID: mdl-29121073

ABSTRACT

Identification of central genes and proteins in biomolecular networks provides credible candidates for pathway analysis, functional analysis, and essentiality prediction. The DiffSLC centrality measure predicts central and essential genes and proteins using a protein-protein interaction network. Network centrality measures prioritize nodes and edges based on their importance to the network topology. These measures have helped identify critical genes and proteins in biomolecular networks. The proposed centrality measure, DiffSLC, combines the number of interactions of a protein with the coexpression values of the genes from which those proteins were translated, used as a weighting factor to bias the identification of essential proteins in a protein interaction network. Potentially essential proteins with low node degree are promoted through eigenvector centrality. Thus, the gene coexpression values are used in conjunction with the eigenvector of the network's adjacency matrix and the edge clustering coefficient to improve essentiality prediction. The outcome of this prediction is shown using three variations: (1) inclusion or exclusion of gene coexpression data, (2) impact of different coexpression measures, and (3) impact of different gene expression data sets. For a total of seven networks, DiffSLC is compared to other centrality measures using Saccharomyces cerevisiae protein interaction networks and gene expression data. Comparisons are also performed for the top-ranked proteins against the known essential genes from the Saccharomyces Gene Deletion Project, which show that DiffSLC detects more essential proteins and has a higher area under the ROC curve than the other compared methods. This makes DiffSLC a stronger alternative to other centrality methods for detecting essential genes using a protein-protein interaction network that obeys the centrality-lethality principle. DiffSLC is implemented using the igraph package in R and the networkx package in Python. The Python package can be obtained from git.io/diffslcpy. The R implementation and the code to reproduce the analysis are available via git.io/diffslc.
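
One ingredient named in the abstract above, the edge clustering coefficient, can be illustrated with a minimal sketch. It assumes the common definition from the essential-protein literature (triangles through an edge divided by the smaller of the two endpoint degrees minus one); whether DiffSLC uses exactly this variant, and how it is combined with coexpression and eigenvector centrality, is not stated in the abstract. The toy network and the function name are hypothetical.

```python
def edge_clustering_coefficient(adj, u, v):
    """ECC(u, v): triangles through edge (u, v) / min(deg(u) - 1, deg(v) - 1).

    This is an assumed, commonly used definition, not necessarily DiffSLC's.
    """
    if v not in adj[u]:
        raise ValueError("u and v are not connected")
    triangles = len(adj[u] & adj[v])  # shared neighbours close a triangle
    max_possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return triangles / max_possible if max_possible > 0 else 0.0


# Hypothetical toy interaction network stored as adjacency sets.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "E"},
    "C": {"A", "B"},
    "D": {"A"},
    "E": {"B"},
}

for u, v in [("A", "B"), ("A", "C"), ("A", "D")]:
    print(u, v, edge_clustering_coefficient(adj, u, v))
```

An edge that sits inside many triangles scores close to 1, while an edge leading to a degree-one node scores 0, which is why such a coefficient can down-weight peripheral interactions.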


Subject(s)
Protein Interaction Maps , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Fungal Gene Expression Regulation , ROC Curve , Saccharomyces cerevisiae/genetics , Software
3.
BMC Syst Biol ; 10(1): 129, 2016 11 29.
Article in English | MEDLINE | ID: mdl-27899149

ABSTRACT

BACKGROUND: As metabolic pathway resources become more commonly available, researchers have unprecedented access to information about their organism of interest. Despite efforts to ensure consistency between various resources, information content and quality can vary widely. Two maize metabolic pathway resources for the B73 inbred line, CornCyc 4.0 and MaizeCyc 2.2, are based on the same gene model set and were developed using Pathway Tools software. These resources differ in their initial enzymatic function assignments and in the extent of manual curation. We present an in-depth comparison between CornCyc and MaizeCyc to demonstrate the effect of initial computational enzymatic function assignments on the quality and content of metabolic pathway resources. RESULTS: These two resources are different in their content. MaizeCyc contains GO annotations for over 21,000 genes that CornCyc is missing. CornCyc contains on average 1.6 transcripts per gene, while MaizeCyc contains almost no alternate splicing. MaizeCyc also does not match CornCyc's breadth in representing the metabolic domain; MaizeCyc has fewer compounds, reactions, and pathways than CornCyc. CornCyc's computational predictions are more accurate than those in MaizeCyc when compared to experimentally determined function assignments, demonstrating the relative strength of the enzymatic function assignment pipeline used to generate CornCyc. CONCLUSIONS: Our results show that the quality of initial enzymatic function assignments primarily determines the quality of the final metabolic pathway resource. Therefore, biologists should pay close attention to the methods and information sources used to develop a metabolic pathway resource to gauge the utility of using such functional assignments to construct hypotheses for experimental studies.


Subject(s)
Computational Biology , Zea mays/metabolism , Molecular Sequence Annotation , Plant Proteins/metabolism , Zea mays/enzymology
4.
Mol Plant Microbe Interact ; 28(9): 968-83, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25938194

ABSTRACT

The interaction of barley, Hordeum vulgare L., with the powdery mildew fungus Blumeria graminis f. sp. hordei is a well-developed model to investigate resistance and susceptibility to obligate biotrophic pathogens. The 130-Mb Blumeria genome encodes approximately 540 predicted effectors that are hypothesized to suppress or induce host processes to promote colonization. Blumeria effector candidate (BEC)1019, a single-copy gene encoding a putative, secreted metalloprotease, is expressed in haustorial feeding structures, and host-induced gene silencing of BEC1019 restricts haustorial development in compatible interactions. Here, we show that Barley stripe mosaic virus-induced gene silencing of BEC1019 significantly reduces fungal colonization of barley epidermal cells, demonstrating that BEC1019 plays a central role in virulence. In addition, delivery of BEC1019 to the host cytoplasm via Xanthomonas type III secretion suppresses cultivar nonspecific hypersensitive reaction (HR) induced by Xanthomonas oryzae pv. oryzicola, as well as cultivar-specific HR induced by AvrPphB from Pseudomonas syringae pv. phaseolicola. BEC1019 homologs are present in 96 of 241 sequenced fungal genomes, including plant pathogens, human pathogens, and free-living nonpathogens. Comparative analysis revealed variation at several amino acid positions that correlate with fungal lifestyle and several highly conserved, noncorrelated motifs. Site-directed mutagenesis of one of these, ETVIC, compromises the HR-suppressing activity of BEC1019. We postulate that BEC1019 represents an ancient, broadly important fungal protein family, members of which have evolved to function as effectors in plant and animal hosts.


Subject(s)
Ascomycota/pathogenicity , Hordeum/microbiology , Plant Diseases/microbiology , Amino Acid Sequence , Ascomycota/genetics , Ascomycota/metabolism , Conserved Sequence , Fungal Gene Expression Regulation/physiology , Gene Silencing , Molecular Sequence Data , Phylogeny , Plant Leaves , Plant Viruses , Virulence , Xanthomonas/metabolism
5.
BMC Bioinformatics ; 15: 364, 2014 Dec 16.
Article in English | MEDLINE | ID: mdl-25511303

ABSTRACT

BACKGROUND: Detecting alternative splicing (AS), a post-transcriptional regulation mechanism, is an important application of RNA-seq studies in eukaryotes. A number of software packages and computational methods have been developed for detecting AS. Most of these methods, however, are designed and tested on animal data, such as human and mouse. Plant genes differ from those of animals in many ways, e.g., the average intron size and preferred AS types. These differences may require different computational approaches and raise questions about the effectiveness of existing methods on plant data. The goal of this paper is to benchmark existing computational differential splicing (or transcription) detection methods so that biologists can choose the most suitable tools to accomplish their goals. RESULTS: This study compares eight popular, publicly available software packages for differential splicing analysis using both simulated and real Arabidopsis thaliana RNA-seq data. All of the software is freely available. The study examines the effect of varying AS ratio, read depth, dispersion pattern, AS types, and sample sizes, as well as the influence of annotation. Using real data, the study looks at the consistency between the packages and verifies a subset of the detected AS events using PCR studies. CONCLUSIONS: No single method performs best in all situations. The accuracy of annotation has a major impact on which method should be chosen for AS analysis. DEXSeq performs well on the simulated data when the AS signal is relatively strong and the annotation is accurate. Cufflinks achieves a better tradeoff between precision and recall and turns out to be the best method when incomplete annotation is provided. Some methods perform inconsistently for different AS types. Complex AS events that combine several simple AS events pose problems for most methods, especially for MATS. MATS stands out in the analysis of real RNA-seq data when all the AS events being evaluated are simple AS events.
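
The precision and recall tradeoff mentioned in the conclusions above can be made concrete with a short sketch. It is not the benchmark code from the study; the event labels and the function name `precision_recall` are hypothetical and serve only to show how the two quantities are computed against a simulated ground truth.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) of predicted events against a truth set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical simulated AS events (gene:event-type labels are made up).
truth = {"g1:ES", "g2:IR", "g3:A5SS", "g4:ES"}
predicted = {"g1:ES", "g2:IR", "g5:ES"}

p, r = precision_recall(predicted, truth)
print(f"precision={p:.2f} recall={r:.2f}")
```

A tool that reports few events tends toward high precision and low recall, while a permissive tool does the opposite, which is why the comparison reports both.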


Subject(s)
Alternative Splicing/genetics , Arabidopsis/genetics , Computational Biology/methods , High-Throughput Nucleotide Sequencing , Plant RNA/genetics , RNA Sequence Analysis/methods , Software , Animals , Plant Genome , Humans , Introns/genetics , Mice , Polymerase Chain Reaction
6.
BMC Syst Biol ; 8: 115, 2014 Oct 12.
Article in English | MEDLINE | ID: mdl-25304126

ABSTRACT

BACKGROUND: BioCyc databases are an important resource for information on biological pathways and genomic data. Such databases represent the accumulation of biological data, some of which has been manually curated from literature. An essential feature of these databases is the continuing data integration as new knowledge is discovered. As functional annotations are improved, scalable methods are needed for curators to manage annotations without detailed knowledge of the specific design of the BioCyc database. RESULTS: We have developed CycTools, a software tool which allows curators to maintain functional annotations in a model organism database. This tool builds on existing software to improve and simplify annotation data imports of user provided data into BioCyc databases. Additionally, CycTools automatically resolves synonyms and alternate identifiers contained within the database into the appropriate internal identifiers. CONCLUSIONS: Automating steps in the manual data entry process can improve curation efforts for major biological databases. The functionality of CycTools is demonstrated by transferring GO term annotations from MaizeCyc to matching proteins in CornCyc, both maize metabolic pathway databases available at MaizeGDB, and by creating strain specific databases for metabolic engineering.


Subject(s)
Computational Biology/methods , Data Curation/methods , Databases as Topic , Software
7.
Comput Biol Med ; 47: 66-75, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24561345

ABSTRACT

Identifying key biomarkers for different cancer types can improve diagnostic accuracy and treatment. Gene expression data can help differentiate between cancer subtypes. However, having a small number of samples relative to the much larger number of genes represented in a dataset leads to overfitting of classification models. Feature selection methods can help select the most distinguishing feature sets for classifying different cancers. A new class-dependent feature selection approach integrates the F-statistic, Maximum Relevance Binary Particle Swarm Optimization (MRBPSO) and a Class Dependent Multi-category Classification (CDMC) system. This feature selection method combines filter- and wrapper-based methods. A set of highly differentially expressed genes (features) is pre-selected using the F-statistic for each dataset as a filter for selecting the most meaningful features. MRBPSO and CDMC function as a wrapper to select desirable feature subsets for each class and classify the samples using those chosen class-dependent feature subsets. The performance of the proposed methods is evaluated on eight real cancer datasets. The results indicate that the class-dependent approaches can effectively identify biomarkers related to each cancer type and improve classification accuracy compared to class-independent feature selection methods.
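
The F-statistic filter step described above can be sketched as follows. This is only an illustration of ranking genes by a one-way ANOVA F-statistic; the toy expression values, gene names, and helper names are hypothetical, and the MRBPSO and CDMC wrapper stages are not reproduced here.

```python
from statistics import mean


def f_statistic(values, labels):
    """One-way ANOVA F-statistic of one feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-class variance: spread of class means around the grand mean.
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values()) / (k - 1)
    # Within-class variance: spread of samples around their own class mean.
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    return between / within if within > 0 else float("inf")


def top_features(expression, labels, n_keep):
    """Rank features (genes) by F-statistic and keep the n_keep highest."""
    scores = {gene: f_statistic(vals, labels) for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n_keep]


# Toy data: three hypothetical genes measured in six samples from two classes.
labels = ["A", "A", "A", "B", "B", "B"]
expression = {
    "gene1": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],  # separates the classes well
    "gene2": [1.0, 5.0, 9.0, 1.0, 5.0, 9.0],  # identical in both classes
    "gene3": [2.0, 3.0, 4.0, 4.0, 5.0, 6.0],  # weak separation
}
print(top_features(expression, labels, 2))
```

A filter of this kind is cheap because each gene is scored independently, which makes it a natural way to shrink the search space before a more expensive wrapper search.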


Subject(s)
Tumor Biomarkers/classification , Computational Biology/methods , Neoplasms/genetics , Neoplasms/metabolism , Tumor Biomarkers/genetics , Tumor Biomarkers/metabolism , Factual Databases , Fuzzy Logic , Humans , Statistical Models , Support Vector Machine
8.
Int J Data Min Bioinform ; 6(2): 130-43, 2012.
Article in English | MEDLINE | ID: mdl-22724294

ABSTRACT

Knowledge of protein subcellular locations can help decipher a protein's biological function. This work proposes three new features: one sequence-based feature, Hybrid Amino Acid Pair (HAAP), and two structure-based features, Secondary Structural Element Composition (SSEC) and solvent accessibility state frequency. A multi-class Support Vector Machine is developed to predict the locations. Testing on two established data sets yields better prediction accuracies than the best available systems. Comparisons with existing methods show comparable results to ESLPred2. When the resulting system, StruLocPred, is applied to the entire Arabidopsis proteome, over 77% of proteins with known locations match the prediction results. An implementation of this system is at http://wgzhou.ece.iastate.edu/StruLocPred/.


Subject(s)
Proteins/chemistry , Software , Cellular Structures/metabolism , Protein Databases , Protein Conformation , Proteins/analysis , Proteins/metabolism , Support Vector Machine
9.
Front Plant Sci ; 3: 15, 2012.
Article in English | MEDLINE | ID: mdl-22645570

ABSTRACT

Metabolomics is the methodology that identifies and measures global pools of small molecules (of less than about 1,000 Da) of a biological sample, which are collectively called the metabolome. Metabolomics can therefore reveal the metabolic outcome of a genetic or environmental perturbation of a metabolic regulatory network, and thus provide insights into the structure and regulation of that network. Because of the chemical complexity of the metabolome and limitations associated with individual analytical platforms for determining the metabolome, it is currently difficult to capture the complete metabolome of an organism or tissue, which is in contrast to genomics and transcriptomics. This paper describes the analysis of Arabidopsis metabolomics data sets acquired by a consortium that includes five analytical laboratories, bioinformaticists, and biostatisticians, which aims to develop and validate metabolomics as a hypothesis-generating functional genomics tool. The consortium is determining the metabolomes of Arabidopsis T-DNA mutant stocks, grown in standardized controlled environment optimized to minimize environmental impacts on the metabolomes. Metabolomics data were generated with seven analytical platforms, and the combined data is being provided to the research community to formulate initial hypotheses about genes of unknown function (GUFs). A public database (www.PlantMetabolomics.org) has been developed to provide the scientific community with access to the data along with tools to allow for its interactive analysis. Exemplary datasets are discussed to validate the approach, which illustrate how initial hypotheses can be generated from the consortium-produced metabolomics data, integrated with prior knowledge to provide a testable hypothesis concerning the functionality of GUFs.

10.
BMC Syst Biol ; 6: 19, 2012 Mar 16.
Article in English | MEDLINE | ID: mdl-22423977

ABSTRACT

BACKGROUND: Network motifs, recurring subnetwork patterns, provide significant insight into the biological networks which are believed to govern cellular processes. METHODS: We present a comparative network motif experimental approach, which helps to explain complex biological phenomena and increases the understanding of biological functions at the molecular level by exploring evolutionary design principles of network motifs. RESULTS: Using this framework to analyze the SM (Sec1/Munc18)-SNARE (N-ethylmaleimide-sensitive factor activating protein receptor) system in exocytic membrane fusion in yeast and neurons, we find that the SM-SNARE network motifs of yeast and neurons show distinct dynamical behaviors. We identify the closed binding mode of neuronal SM (Munc18-1) and SNARE (syntaxin-1) as the key factor leading to mechanistic divergence of membrane fusion systems in yeast and neurons. We also predict that it underlies the conflicting observations in SM overexpression experiments. Furthermore, hypothesis-driven lipid mixing assays validated the prediction. CONCLUSION: Therefore this study provides a new method to solve the discrepancies and to generalize the functional role of SM proteins.


Subject(s)
Exocytosis , Biological Models , Munc18 Proteins/metabolism , SNARE Proteins/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Cell Membrane/metabolism , Neurons/cytology , Neurons/metabolism , Qa-SNARE Proteins/metabolism , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/metabolism , Synapses/metabolism
11.
Bioinformatics ; 28(7): 947-54, 2012 Apr 01.
Article in English | MEDLINE | ID: mdl-22308149

ABSTRACT

MOTIVATION: Analysis of omics experiments generates lists of entities (genes, metabolites, etc.) selected based on specific behavior, such as changes in response to stress or other signals. Functional interpretation of these lists often uses category enrichment tests using functional annotations like Gene Ontology terms and pathway membership. This approach does not consider the connected structure of biochemical pathways or the causal directionality of events. RESULTS: The Omics Response Group (ORG) method, described in this work, interprets omics lists in the context of metabolic pathway and regulatory networks using a statistical model for flow within the networks. Statistical results for all response groups are visualized in a novel Pathway Flow plot. The statistical tests are based on the Erlang distribution model under the assumption of independent and identically exponentially distributed random walk flows through pathways. As a proof of concept, we applied our method to an Escherichia coli transcriptomics dataset where we confirmed common knowledge of the E. coli transcriptional response to Lipid A deprivation. The main response is related to osmotic stress, and we were also able to detect novel responses that are supported by the literature. We also applied our method to an Arabidopsis thaliana expression dataset from an abscisic acid study. In both cases, conventional pathway enrichment tests detected nothing, while our approach discovered biological processes beyond the original studies. AVAILABILITY: We created a prototype for an interactive ORG web tool at http://ecoserver.vrac.iastate.edu/pathwayflow (source code is available from https://subversion.vrac.iastate.edu/Subversion/jlv/public/jlv/pathwayflow). The prototype is described along with additional figures and tables in Supplementary Material. CONTACT: julied@iastate.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
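
The link between exponential steps and the Erlang distribution mentioned above can be sketched briefly: the sum of k independent Exponential(rate) variables follows an Erlang(k, rate) distribution, whose upper-tail probability has a closed form. The abstract does not say which test statistic ORG feeds into this distribution, so the example values below are purely hypothetical.

```python
import math


def erlang_sf(t, k, rate):
    """P(T > t) for T ~ Erlang(k, rate), the sum of k i.i.d. Exponential(rate) steps."""
    x = rate * t
    # Closed form: exp(-x) * sum_{n=0}^{k-1} x^n / n!
    return math.exp(-x) * sum(x ** n / math.factorial(n) for n in range(k))


# Hypothetical example: a path of 3 steps with unit rate and an observed total of 6.0.
tail = erlang_sf(6.0, k=3, rate=1.0)
print(f"upper-tail probability: {tail:.4f}")
```

With k = 1 the formula reduces to the ordinary exponential tail exp(-rate * t), which is a quick sanity check on the implementation.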


Subject(s)
Computational Biology/methods , Gene Regulatory Networks , Metabolic Networks and Pathways , Statistical Models , Arabidopsis/genetics , Arabidopsis/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Software
12.
Nucleic Acids Res ; 40(Database issue): D1194-201, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22084198

ABSTRACT

PLEXdb (http://www.plexdb.org), in partnership with community databases, supports comparisons of gene expression across multiple plant and pathogen species, encouraging individuals and/or consortia to upload genome-scale data sets and contrast them with previously archived data. These analyses facilitate the interpretation of structure, function and regulation of genes in economically important plants. A list of Gene Atlas experiments highlights data sets that give responses across different developmental stages, conditions and tissues. Tools at PLEXdb allow users to perform complex analyses quickly and easily. The Model Genome Interrogator (MGI) tool supports mapping gene lists onto corresponding genes from model plant organisms, including rice and Arabidopsis. MGI predicts homologies and displays gene structures and supporting information for annotated genes and full-length cDNAs. The gene list-processing wizard guides users through PLEXdb functions for creating, analyzing, annotating and managing gene lists. Users can upload their own lists or create them from the output of PLEXdb tools, and then apply diverse higher-level analyses, such as ANOVA and clustering. PLEXdb also provides methods for users to track how gene expression changes across many different experiments using the Gene OscilloScope. This tool can identify interesting expression patterns, such as up-regulation under diverse conditions, or check any gene's suitability as a steady-state control.


Subject(s)
Genetic Databases , Gene Expression Profiling , Plant Genes , Plant Genome , Molecular Sequence Annotation , Software , Transcriptome
13.
Nucleic Acids Res ; 40(Database issue): D1216-20, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22080512

ABSTRACT

The PlantMetabolomics (PM) database (http://www.plantmetabolomics.org) contains comprehensive targeted and untargeted mass spectrum metabolomics data for Arabidopsis mutants across a variety of metabolomics platforms. The database allows users to generate hypotheses about the changes in metabolism for mutants with genes of unknown function. Version 2.0 of PlantMetabolomics.org currently contains data for 140 mutant lines along with the morphological data. A web-based data analysis wizard allows researchers to select preprocessing and data-mining procedures to discover differences between mutants. This community resource enables researchers to formulate models of the metabolic network of Arabidopsis and enhances the research community's ability to formulate testable hypotheses concerning gene functions. PM features new web-based tools for data-mining analysis, visualization tools and enhanced cross links to other databases. The database is publicly available. PM aims to provide a hypothesis building platform for the researchers interested in any of the mutant lines or metabolites.


Subject(s)
Arabidopsis/metabolism , Factual Databases , Mass Spectrometry , Metabolome , Arabidopsis/anatomy & histology , Arabidopsis/genetics , Cluster Analysis , Computer Graphics , Metabolome/genetics , Metabolomics , Mutation , Principal Component Analysis , Software
14.
BMC Bioinformatics ; 12: 233, 2011 Jun 13.
Article in English | MEDLINE | ID: mdl-21668997

ABSTRACT

BACKGROUND: Gene regulatory networks play essential roles in living organisms to control growth, keep internal metabolism running and respond to external environmental changes. Understanding the connections and the activity levels of regulators is important for research on gene regulatory networks. While relevance score-based algorithms that reconstruct gene regulatory networks from transcriptome data can infer genome-wide gene regulatory networks, they are unfortunately prone to false positive results. Transcription factor activities (TFAs) quantitatively reflect the ability of a transcription factor to regulate target genes. However, classic relevance score-based gene regulatory network reconstruction algorithms use models that do not include the TFA layer, thus missing a key regulatory element. RESULTS: This work integrates TFA prediction algorithms with relevance score-based network reconstruction algorithms to reconstruct gene regulatory networks with improved accuracy over classic relevance score-based algorithms. This method is called Gene expression and Transcription factor activity based Relevance Network (GTRNetwork). Different combinations of TFA prediction algorithms and relevance score functions have been applied to find the most efficient combination. When the integrated GTRNetwork method was applied to E. coli data, the reconstructed genome-wide gene regulatory network predicted 381 new regulatory links. This reconstructed gene regulatory network, including the predicted new regulatory links, shows promising biological significance. Many of the new links are verified by known TF binding site information, and many other links can be verified from the literature and databases such as EcoCyc. The reconstructed gene regulatory network is applied to a recent transcriptome analysis of E. coli during isobutanol stress. In addition to the 16 significantly changed TFAs detected in the original paper, another 7 significantly changed TFAs have been detected by using our reconstructed network. CONCLUSIONS: The GTRNetwork algorithm introduces the hidden layer TFA into classic relevance score-based gene regulatory network reconstruction processes. Integrating the TFA biological information with regulatory network reconstruction algorithms significantly improves detection of new links and reduces the rate of false positives. The application of GTRNetwork to E. coli gene transcriptome data gives a set of potential regulatory links with promising biological significance for isobutanol stress and other conditions.


Subject(s)
Escherichia coli Proteins/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Regulatory Networks , Transcription Factors/metabolism , Algorithms , Gene Expression Profiling/methods , Bacterial Genome
15.
Bioinformatics ; 27(11): 1578-80, 2011 Jun 01.
Article in English | MEDLINE | ID: mdl-21511714

ABSTRACT

SUMMARY: CytoModeler is an open-source Java application based on the Cytoscape platform. It integrates large-scale network analysis and quantitative modeling by combining omics analysis on the Cytoscape platform, access to deterministic and stochastic simulators, and static and dynamic network context visualizations of simulation results. AVAILABILITY: Implemented in Java, CytoModeler runs with Cytoscape 2.6 and 2.7. Binaries, documentation and video walkthroughs are freely available at http://vrac.iastate.edu/~jlv/cytomodeler/.


Subject(s)
Biological Models , Software , Computer Simulation , Systems Biology/methods
16.
Comput Methods Programs Biomed ; 101(1): 80-6, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20541280

ABSTRACT

Statistical tests are often performed to discover which experimental variables are reacting to specific treatments. Time-series statistical models usually require the researcher to make assumptions with respect to the distribution of measured responses which may not hold. Randomization tests can be applied to data in order to generate null distributions non-parametrically. However, large numbers of randomizations are required for the precise p-values needed to control false discovery rates. When testing tens of thousands of variables (genes, chemical compounds, or otherwise), significant q-value cutoffs can be extremely small (on the order of 10^-5 to 10^-8). This requires high-precision p-values, which in turn require large numbers of randomizations. The NVIDIA® Compute Unified Device Architecture® (CUDA®) platform for General Programming on the Graphics Processing Unit (GPGPU) was used to implement an application which performs high-precision randomization tests via Monte Carlo sampling for quickly screening custom test statistics for experiments with large numbers of variables, such as microarrays, Next-Generation sequencing read counts, chromatographical signals, or other abundance measurements. The software has been shown to achieve a speedup of more than 12-fold on a Graphics Processing Unit (GPU) when compared to a powerful Central Processing Unit (CPU). The main limitation is concurrent random access of shared memory on the GPU. The software is available from the authors.
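
The relationship between the number of randomizations and p-value precision described above can be sketched with a small CPU-only example. This is not the CUDA implementation from the paper: the test statistic (absolute difference in means), the function name, and the data are assumptions chosen for illustration.

```python
import random


def randomization_p_value(group_a, group_b, n_rand=10_000, seed=0):
    """Two-sided Monte Carlo randomization p-value for a difference in means."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    def stat(xs):
        a, b = xs[:n_a], xs[n_a:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(pooled)
    hits = 0
    for _ in range(n_rand):
        rng.shuffle(pooled)  # random relabelling of the samples
        if stat(pooled) >= observed:
            hits += 1
    # Add-one correction: the smallest attainable p-value is 1 / (n_rand + 1).
    return (hits + 1) / (n_rand + 1)


# Hypothetical abundance measurements for one variable under two treatments.
treated = [10.0, 11.0, 12.0, 13.0, 14.0]
control = [0.0, 1.0, 2.0, 3.0, 4.0]
print(randomization_p_value(treated, control, n_rand=2000))
```

Because the estimate cannot drop below 1 / (n_rand + 1), resolving p-values near 10^-8 needs on the order of 10^8 randomizations per variable, which is the workload that motivates a GPU implementation.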


Subject(s)
Monte Carlo Method , Software , Algorithms , Computer Graphics , Computer Simulation , Factual Databases , Statistical Models
17.
Bioinformatics ; 26(23): 2995-6, 2010 Dec 01.
Article in English | MEDLINE | ID: mdl-20947524

ABSTRACT

SUMMARY: OmicsAnalyzer is a Cytoscape plug-in for visual omics-based network analysis that (i) integrates hetero-omics data for one or more species; (ii) performs statistical tests on the integrated datasets; and (iii) visualizes results in a network context. AVAILABILITY: Implemented in Java, OmicsAnalyzer runs with Cytoscape 2.6 and 2.7. Binaries, documentation and video walkthroughs are freely available at http://vrac.iastate.edu/~jlv/omicsanalyzer/ CONTACT: julied@iastate.edu; netscape@iastate.edu


Subject(s)
Gene Regulatory Networks , Software , Statistical Data Interpretation , Gene Expression Profiling , Biological Models , Genetic Models
18.
BMC Bioinformatics ; 11: 469, 2010 Sep 17.
Article in English | MEDLINE | ID: mdl-20849585

ABSTRACT

BACKGROUND: Linking high-throughput experimental data with biological networks is a key step for understanding complex biological systems. Currently, visualization tools for large metabolic networks often result in a dense web of connections that is difficult to interpret biologically. The MetNetGE application organizes and visualizes biological networks in a meaningful way to improve performance and biological interpretability. RESULTS: MetNetGE is an interactive visualization tool based on the Google Earth platform. MetNetGE features novel visualization techniques for pathway and ontology information display. Instead of simply showing hundreds of pathways in a complex graph, MetNetGE gives an overview of the network based on the hierarchical pathway ontology, using a novel layout called the Enhanced Radial Space-Filling (ERSF) approach that allows the network to be summarized compactly. The non-tree edges in the pathway or gene ontology, which represent pathways or genes that belong to multiple categories, are linked using orbital connections in a third dimension. Biologists can easily identify highly activated pathways or gene ontology categories by mapping summary experiment statistics, such as coefficient of variation and overrepresentation values, onto the visualization. After identifying such pathways, biologists can focus on the corresponding region to explore detailed pathway structure and experimental data in an aligned 3D tiered layout. In this paper, the use of MetNetGE is illustrated with pathway diagrams and data from E. coli and Arabidopsis. CONCLUSIONS: MetNetGE is a visualization tool that organizes biological networks according to a hierarchical ontology structure. The ERSF technique assigns attributes in 3D space, such as color, height, and transparency, to any ontological structure. For hierarchical data, the novel ERSF layout enables the user to identify pathways or categories that are differentially regulated in particular experiments. MetNetGE also displays complex biological pathways in an aligned 3D tiered layout for exploration.


Subject(s)
Information Storage and Retrieval/methods , Metabolic Networks and Pathways , Software , Algorithms , Internet , User-Computer Interface
19.
Bioinformatics ; 26(18): 2345-6, 2010 Sep 15.
Article in English | MEDLINE | ID: mdl-20647521

ABSTRACT

CellDesigner provides a user-friendly interface for graphical biochemical pathway description. Many pathway databases are not directly exportable to CellDesigner models. PathwayAccess is an extensible suite of CellDesigner plugins, which connect CellDesigner directly to pathway databases using their respective Java application programming interfaces. The process for creating new PathwayAccess plugins for specific pathway databases is streamlined. Three PathwayAccess plugins, MetNetAccess, BioCycAccess and ReactomeAccess, directly connect CellDesigner to the pathway databases MetNetDB, BioCyc and Reactome. PathwayAccess plugins enable CellDesigner users to expose pathway data to analytical CellDesigner functions, curate their pathway databases and visually integrate pathway data from different databases using standard Systems Biology Markup Language and Systems Biology Graphical Notation. AVAILABILITY: Implemented in Java, PathwayAccess plugins run with CellDesigner version 4.0.1 and were tested on Ubuntu Linux, Windows XP and 7, and MacOSX. Source code, binaries, documentation and video walkthroughs are freely available at http://vrac.iastate.edu/~jlv.


Subject(s)
Metabolic Networks and Pathways , Software , Databases, Factual
20.
Plant Physiol ; 152(4): 1807-16, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20147492

ABSTRACT

PlantMetabolomics.org (PM) is a web portal and database for exploring, visualizing, and downloading plant metabolomics data. Widespread public access to well-annotated metabolomics datasets is essential for establishing metabolomics as a functional genomics tool. PM integrates metabolomics data generated on different analytical platforms by multiple laboratories, along with key visualization tools such as ratio and error plots. Visualization tools can quickly show how one condition compares to another and which analytical platforms show the largest changes. The database tries to capture a complete annotation of the experiment metadata along with the metabolite abundance data, based on the evolving Metabolomics Standards Initiative. PM can be used as a platform for deriving hypotheses by enabling metabolomic comparisons between genetically unique Arabidopsis (Arabidopsis thaliana) populations subjected to different environmental conditions. Each metabolite is linked to relevant experimental data and information from various annotation databases. The portal also provides detailed protocols and tutorials on conducting plant metabolomics experiments to promote metabolomics in the community. PM currently houses Arabidopsis metabolomics data generated by a consortium of laboratories utilizing metabolomics to help elucidate the functions of uncharacterized genes. PM is publicly available at http://www.plantmetabolomics.org.


Subject(s)
Arabidopsis/metabolism , Internet , Metabolomics