Results 1 - 20 of 37
1.
Nucleic Acids Res ; 51(D1): D753-D759, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36477304

ABSTRACT

The MGnify platform (https://www.ebi.ac.uk/metagenomics) facilitates the assembly, analysis and archiving of microbiome-derived nucleic acid sequences. The platform provides access to taxonomic assignments and functional annotations for nearly half a million analyses covering metabarcoding, metatranscriptomic, and metagenomic datasets, which are derived from a wide range of different environments. Over the past 3 years, MGnify has not only grown in terms of the number of datasets contained but also increased the breadth of analyses provided, such as the analysis of long-read sequences. The MGnify protein database now exceeds 2.4 billion non-redundant sequences predicted from metagenomic assemblies. This collection is now organised into a relational database making it possible to understand the genomic context of the protein through navigation back to the source assembly and sample metadata, marking a major improvement. To extend beyond the functional annotations already provided in MGnify, we have applied deep learning-based annotation methods. The technology underlying MGnify's Application Programming Interface (API) and website has been upgraded, and we have enabled the ability to perform downstream analysis of the MGnify data through the introduction of a coupled Jupyter Lab environment.
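As a rough illustration of the programmatic access mentioned in this abstract, the sketch below builds a listing URL and pulls record identifiers out of a JSON:API-style response. The base URL, the `studies` resource name, the `page_size` parameter and the `data`/`id` response layout are assumptions made for illustration and should be checked against the current MGnify API documentation; `build_url`, `extract_ids` and `fetch` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the MGnify REST API; verify against the official docs.
API_BASE = "https://www.ebi.ac.uk/metagenomics/api/v1"


def build_url(resource, **params):
    """Build a listing URL such as <API_BASE>/studies?page_size=5."""
    query = urlencode(sorted(params.items()))
    return f"{API_BASE}/{resource}" + (f"?{query}" if query else "")


def extract_ids(payload):
    """Return the 'id' of every record in a JSON:API-style payload (assumed layout)."""
    return [record["id"] for record in payload.get("data", [])]


def fetch(resource, **params):
    """Fetch one page of results; needs network access, so it is not called here."""
    with urlopen(build_url(resource, **params), timeout=30) as response:
        return json.load(response)
```

A call such as `extract_ids(fetch("studies", page_size=5))` would then list a few study accessions, provided the endpoint and response shape match the assumptions above.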


Subjects
Microbiota, Sequence Analysis, Genomics/methods, Metagenome, Metagenomics/methods, Microbiota/genetics, Software, Sequence Analysis/methods
2.
J Integr Bioinform ; 17(2-3)2020 Jul 20.
Article in English | MEDLINE | ID: mdl-32750035

ABSTRACT

Biological models often contain elements that have inexact numerical values, since they are based on values that are stochastic in nature or data that contains uncertainty. The Systems Biology Markup Language (SBML) Level 3 Core specification does not include an explicit mechanism to include inexact or stochastic values in a model, but it does provide a mechanism for SBML packages to extend the Core specification and add additional syntactic constructs. The SBML Distributions package for SBML Level 3 adds the necessary features to allow models to encode information about the distribution and uncertainty of values underlying a quantity.


Subjects
Programming Languages, Systems Biology, Documentation, Language, Biological Models, Software
3.
Mol Syst Biol ; 16(8): e9110, 2020 08.
Article in English | MEDLINE | ID: mdl-32845085

ABSTRACT

Systems biology has experienced dramatic growth in the number, size, and complexity of computational models. To reproduce simulation results and reuse models, researchers must exchange unambiguous model descriptions. We review the latest edition of the Systems Biology Markup Language (SBML), a format designed for this purpose. A community of modelers and software authors developed SBML Level 3 over the past decade. Its modular form consists of a core suited to representing reaction-based models and packages that extend the core with features suited to other model types including constraint-based models, reaction-diffusion models, logical network models, and rule-based models. The format leverages two decades of SBML and a rich software ecosystem that transformed how systems biologists build and interact with models. More recently, the rise of multiscale models of whole cells and organs, and new data sources such as single-cell measurements and live imaging, has precipitated new ways of integrating data with models. We provide our perspectives on the challenges presented by these developments and how SBML Level 3 provides the foundation needed to support this evolution.


Subjects
Systems Biology/methods, Animals, Humans, Logistic Models, Biological Models, Software
4.
J Integr Bioinform ; 16(2)2019 Jun 20.
Article in English | MEDLINE | ID: mdl-31219795

ABSTRACT

Computational models can help researchers to interpret data, understand biological functions, and make quantitative predictions. The Systems Biology Markup Language (SBML) is a file format for representing computational models in a declarative form that different software systems can exchange. SBML is oriented towards describing biological processes of the sort common in research on a number of topics, including metabolic pathways, cell signaling pathways, and many others. By supporting SBML as an input/output format, different tools can all operate on an identical representation of a model, removing opportunities for translation errors and assuring a common starting point for analyses and simulations. This document provides the specification for Release 2 of Version 2 of SBML Level 3 Core. The specification defines the data structures prescribed by SBML as well as their encoding in XML, the eXtensible Markup Language. Release 2 corrects some errors and clarifies some ambiguities discovered in Release 1. This specification also defines validation rules that determine the validity of an SBML document, and provides many examples of models in SBML form. Other materials and software are available from the SBML project website at http://sbml.org/.
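To make the "encoding in XML" concrete, here is a minimal sketch that writes a skeletal SBML Level 3 Version 2 document with the Python standard library. The model, compartment and species identifiers are hypothetical, the function name `minimal_sbml` is illustrative, and the output has not been checked against the specification's validation rules; a dedicated SBML library would normally be used for real models.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 2 Core.
SBML_NS = "http://www.sbml.org/sbml/level3/version2/core"


def minimal_sbml(model_id="example_model"):
    """Return a skeletal SBML document with one compartment and one species."""
    ET.register_namespace("", SBML_NS)  # serialise SBML as the default namespace
    tag = lambda name: f"{{{SBML_NS}}}{name}"

    sbml = ET.Element(tag("sbml"), {"level": "3", "version": "2"})
    model = ET.SubElement(sbml, tag("model"), {"id": model_id})

    compartments = ET.SubElement(model, tag("listOfCompartments"))
    ET.SubElement(compartments, tag("compartment"),
                  {"id": "cell", "constant": "true"})

    species = ET.SubElement(model, tag("listOfSpecies"))
    ET.SubElement(species, tag("species"), {
        "id": "S1",
        "compartment": "cell",
        "hasOnlySubstanceUnits": "false",
        "boundaryCondition": "false",
        "constant": "false",
    })
    return ET.tostring(sbml, encoding="unicode")


print(minimal_sbml())
```

Because the document is plain XML, any tool that reads SBML can operate on the same file, which is the interoperability argument the abstract makes.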


Subjects
Computer Simulation, Biological Models, Programming Languages, Systems Biology
5.
PLoS One ; 13(4): e0195484, 2018.
Article in English | MEDLINE | ID: mdl-29649240

ABSTRACT

We investigate the feasibility of using a surrogate-based method to emulate the deformation and detachment behaviour of a biofilm in response to hydrodynamic shear stress. The influence of shear force, growth rate and viscoelastic parameters on the patterns of growth, structure and resulting shape of microbial biofilms was examined. We develop a statistical modelling approach to this problem, using a combination of Bayesian Poisson regression and dynamic linear models for the emulation. We observe that the hydrodynamic shear force affects biofilm deformation in line with some literature. Sensitivity results also showed that the expected number of shear events, shear flow, yield coefficient for heterotrophic bacteria and extracellular polymeric substance (EPS) stiffness per unit EPS mass are the four principal mechanisms governing bacterial detachment in this study. The sensitivity of the model parameters is temporally dynamic, emphasising the significance of conducting the sensitivity analysis across multiple time points. The surrogate models are shown to perform well, and produced an approximately 480-fold increase in computational efficiency. We conclude that a surrogate-based approach is effective, and that the resulting biofilm structure is determined primarily by a balance between bacterial growth, viscoelastic parameters and applied shear stress.


Subjects
Biofilms, Hydrodynamics, Statistical Models, Shear Strength, Mechanical Stress, Bayes Theorem, Poisson Distribution, Wastewater/microbiology
6.
J Integr Bioinform ; 15(1)2018 Mar 09.
Article in English | MEDLINE | ID: mdl-29522418

ABSTRACT

Computational models can help researchers to interpret data, understand biological functions, and make quantitative predictions. The Systems Biology Markup Language (SBML) is a file format for representing computational models in a declarative form that different software systems can exchange. SBML is oriented towards describing biological processes of the sort common in research on a number of topics, including metabolic pathways, cell signaling pathways, and many others. By supporting SBML as an input/output format, different tools can all operate on an identical representation of a model, removing opportunities for translation errors and assuring a common starting point for analyses and simulations. This document provides the specification for Version 2 of SBML Level 3 Core. The specification defines the data structures prescribed by SBML, their encoding in XML (the eXtensible Markup Language), validation rules that determine the validity of an SBML document, and examples of models in SBML form. The design of Version 2 differs from Version 1 principally in allowing new MathML constructs, making more child elements optional, and adding identifiers to all SBML elements instead of only selected elements. Other materials and software are available from the SBML project website at http://sbml.org/.


Subjects
Documentation/standards, Information Storage and Retrieval/standards, Biological Models, Programming Languages, Software, Systems Biology/standards, Animals, Computer Simulation, Guidelines as Topic, Humans, Signal Transduction
7.
Stat Comput ; 28(4): 891-904, 2018.
Article in English | MEDLINE | ID: mdl-31983814

ABSTRACT

A statistical model assuming a preferential attachment network, which is generated by adding nodes sequentially according to a few simple rules, usually describes real-life networks better than a model assuming, for example, a Bernoulli random graph, in which any two nodes have the same probability of being connected. Therefore, to study the propagation of "infection" across a social network, we propose a network epidemic model by combining a stochastic epidemic model and a preferential attachment model. A simulation study based on the subsequent Markov Chain Monte Carlo algorithm reveals an identifiability issue with the model parameters. Finally, the network epidemic model is applied to a set of online commissioning data.
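The contrast between the two network models can be sketched as follows. This is a generic textbook construction, not the specific attachment rules or epidemic model of the paper: each new node attaches to a single existing node chosen with probability proportional to its degree, while the Bernoulli graph connects every pair independently with probability `p`. All function names are illustrative.

```python
import random


def preferential_attachment(n, seed=None):
    """Grow a network of n nodes; each new node links to one existing node
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    endpoints = [0, 1]  # every node appears once per incident edge
    for new in range(2, n):
        target = rng.choice(endpoints)  # degree-proportional choice
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges


def bernoulli_graph(n, p, seed=None):
    """Connect every pair of nodes independently with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]


def degrees(n, edges):
    """Degree of each node 0..n-1 given an edge list."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


pa = degrees(200, preferential_attachment(200, seed=1))
er = degrees(200, bernoulli_graph(200, 0.01, seed=1))
print("max degree, preferential attachment:", max(pa))
print("max degree, Bernoulli graph:", max(er))
```

Comparing the two degree sequences typically shows a few highly connected hubs under preferential attachment, which is the feature that makes it a more plausible description of many real networks.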

8.
Nucleic Acids Res ; 46(D1): D726-D735, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29069476

ABSTRACT

EBI metagenomics (http://www.ebi.ac.uk/metagenomics) provides a free to use platform for the analysis and archiving of sequence data derived from the microbial populations found in a particular environment. Over the past two years, EBI metagenomics has increased the number of datasets analysed 10-fold. In addition to increased throughput, the underlying analysis pipeline has been overhauled to include both new or updated tools and reference databases. Of particular note is a new workflow for taxonomic assignments that has been extended to include assignments based on both the large and small subunit RNA marker genes and to encompass all cellular micro-organisms. We also describe the addition of metagenomic assembly as a new analysis service. Our pilot studies have produced over 2400 assemblies from datasets in the public domain. From these assemblies, we have produced a searchable, non-redundant protein database of over 50 million sequences. To provide improved access to the data stored within the resource, we have developed a programmatic interface that provides access to the analysis results and associated sample metadata. Finally, we have integrated the results of a series of statistical analyses that provide estimations of diversity and sample comparisons.


Subjects
Genetic Databases, Metagenomics, Microbiota, Algorithms, Base Sequence, Classification/methods, Datasets as Topic, Metagenomics/methods, Archaeal RNA/genetics, Bacterial RNA/genetics, Viral RNA/genetics, Ribotyping, Software, Transcriptome, User-Computer Interface, Web Browser, Workflow
9.
J R Stat Soc Ser C Appl Stat ; 65(3): 367-393, 2016 04.
Article in English | MEDLINE | ID: mdl-27134314

ABSTRACT

Quantitative fitness analysis (QFA) is a high throughput experimental and computational methodology for measuring the growth of microbial populations. QFA screens can be used to compare the health of cell populations with and without a mutation in a query gene to infer genetic interaction strengths genomewide, examining thousands of separate genotypes. We introduce Bayesian hierarchical models of population growth rates and genetic interactions that better reflect QFA experimental design than current approaches. Our new approach models population dynamics and genetic interaction simultaneously, thereby avoiding passing information between models via a univariate fitness summary. Matching experimental structure more closely, Bayesian hierarchical approaches use data more efficiently and find new evidence for genes which interact with yeast telomeres within a published data set.

10.
J Integr Bioinform ; 12(2): 266, 2015 Sep 04.
Article in English | MEDLINE | ID: mdl-26528564

ABSTRACT

Computational models can help researchers to interpret data, understand biological function, and make quantitative predictions. The Systems Biology Markup Language (SBML) is a file format for representing computational models in a declarative form that can be exchanged between different software systems. SBML is oriented towards describing biological processes of the sort common in research on a number of topics, including metabolic pathways, cell signaling pathways, and many others. By supporting SBML as an input/output format, different tools can all operate on an identical representation of a model, removing opportunities for translation errors and assuring a common starting point for analyses and simulations. This document provides the specification for Version 1 of SBML Level 3 Core. The specification defines the data structures prescribed by SBML as well as their encoding in XML, the eXtensible Markup Language. This specification also defines validation rules that determine the validity of an SBML document, and provides many examples of models in SBML form. Other materials and software are available from the SBML project web site, http://sbml.org/.


Subjects
Computer Graphics/standards, Biological Models, Programming Languages, Proteome/metabolism, Signal Transduction/physiology, Systems Biology/standards, Animals, Biological Ontologies, Datasets as Topic/standards, Documentation/standards, Guidelines as Topic/standards, Humans, Information Storage and Retrieval/standards, Internationality
11.
J Integr Bioinform ; 12(2): 271, 2015 Sep 04.
Article in English | MEDLINE | ID: mdl-26528569

ABSTRACT

Computational models can help researchers to interpret data, understand biological function, and make quantitative predictions. The Systems Biology Markup Language (SBML) is a file format for representing computational models in a declarative form that can be exchanged between different software systems. SBML is oriented towards describing biological processes of the sort common in research on a number of topics, including metabolic pathways, cell signaling pathways, and many others. By supporting SBML as an input/output format, different tools can all operate on an identical representation of a model, removing opportunities for translation errors and assuring a common starting point for analyses and simulations. This document provides the specification for Version 5 of SBML Level 2. The specification defines the data structures prescribed by SBML as well as their encoding in XML, the eXtensible Markup Language. This specification also defines validation rules that determine the validity of an SBML document, and provides many examples of models in SBML form. Other materials and software are available from the SBML project web site, http://sbml.org.


Subjects
Computer Graphics/standards, Biological Models, Programming Languages, Proteome/metabolism, Signal Transduction/physiology, Systems Biology/standards, Animals, Biological Ontologies, Datasets as Topic/standards, Documentation/standards, Guidelines as Topic/standards, Humans, Information Storage and Retrieval/standards, Internationality
12.
PLoS One ; 10(7): e0132240, 2015.
Article in English | MEDLINE | ID: mdl-26168240

ABSTRACT

Synthetic genetic array (SGA) has been successfully used to identify genetic interactions in S. cerevisiae and S. pombe. In S. pombe, SGA methods use either cycloheximide (C) or heat shock (HS) to select double mutants before measuring colony size as a surrogate for fitness. Quantitative Fitness Analysis (QFA) is a different method for determining fitness of microbial strains. In QFA, liquid cultures are spotted onto solid agar and growth curves determined for each spot by photography and model fitting. Here, we compared the two S. pombe SGA methods and found that the HS method was more reproducible for us. We also developed a QFA procedure for S. pombe. We used QFA to identify genetic interactions affecting two temperature sensitive, telomere associated query mutations (taz1Δ and pot1-1). We identify exo1∆ and other gene deletions as suppressors or enhancers of S. pombe telomere defects. Our study identifies known and novel gene deletions affecting the fitness of strains with telomere defects. The interactions we identify may be relevant in human cells.


Subjects
Genetic Fitness/physiology, Schizosaccharomyces/genetics, Telomere/genetics, Genetic Enhancer Elements/physiology, Gene Deletion, Suppressor Genes/physiology, Mutation, Oligonucleotide Array Sequence Analysis, Nucleic Acid Regulatory Sequences/physiology, Schizosaccharomyces/physiology, Telomere/physiology
13.
Stat Appl Genet Mol Biol ; 14(2): 169-88, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25720091

ABSTRACT

In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis.
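The "straightforward" forward simulation that the abstract takes as its starting point can be sketched with the Gillespie direct method for a Lotka-Volterra Markov jump process. This is only the unconditioned simulator, not the conditioned-hazard or sequential Monte Carlo constructs the paper proposes; the reaction set, rate constants and initial counts below are illustrative assumptions.

```python
import random


def gillespie_lotka_volterra(x0, rates, t_max, seed=None):
    """Simulate (prey, predator) counts up to time t_max.

    Assumed reactions and hazards:
      prey birth      prey -> 2 prey            c1 * prey
      predation       prey + pred -> 2 pred     c2 * prey * pred
      predator death  pred -> nothing           c3 * pred
    """
    rng = random.Random(seed)
    c1, c2, c3 = rates
    prey, pred = x0
    t = 0.0
    path = [(t, prey, pred)]
    while True:
        hazards = [c1 * prey, c2 * prey * pred, c3 * pred]
        total = sum(hazards)
        if total == 0.0:          # both species extinct: nothing can happen
            break
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_max:
            break
        u = rng.random() * total     # pick a reaction in proportion to its hazard
        if u < hazards[0]:
            prey += 1
        elif u < hazards[0] + hazards[1]:
            prey -= 1
            pred += 1
        else:
            pred -= 1
        path.append((t, prey, pred))
    return path


trajectory = gillespie_lotka_volterra((50, 100), (1.0, 0.005, 0.6), 5.0, seed=42)
print(len(trajectory), "events; final state:", trajectory[-1])
```

When observations are very informative, few of these unconditioned trajectories pass close to the next data point, which is the inefficiency that motivates the conditioned simulation schemes described in the abstract.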


Subjects
Bayes Theorem, Markov Chains, Algorithms, Computer Simulation, Kinetics, Biological Models, Monte Carlo Method, Probability
14.
Stat Appl Genet Mol Biol ; 14(2): 189-209, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25720092

ABSTRACT

Approaches to Bayesian inference for problems with intractable likelihoods have become increasingly important in recent years. Approximate Bayesian computation (ABC) and "likelihood free" Markov chain Monte Carlo techniques are popular methods for tackling inference in these scenarios but such techniques are computationally expensive. In this paper we compare the two approaches to inference, with a particular focus on parameter inference for stochastic kinetic models, widely used in systems biology. Discrete time transition kernels for models of this type are intractable for all but the most trivial systems yet forward simulation is usually straightforward. We discuss the relative merits and drawbacks of each approach whilst considering the computational cost implications and efficiency of these techniques. In order to explore the properties of each approach we examine a range of observation regimes using two example models. We use a Lotka-Volterra predator-prey model to explore the impact of full or partial species observations using various time course observations under the assumption of known and unknown measurement error. Further investigation into the impact of observation error is then made using a Schlögl system, a test case which exhibits bi-modal state stability in some regions of parameter space.
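The basic idea of ABC can be sketched with a rejection sampler on a deliberately simple model. The pure-death process, the uniform prior on the rate, the absolute-difference distance and the tolerance `eps` are all assumptions chosen for illustration; they are not the models or algorithms compared in the paper.

```python
import math
import random


def simulate_death_process(theta, x0, t, rng):
    """Survivors at time t when each of x0 individuals dies at rate theta."""
    p_survive = math.exp(-theta * t)
    return sum(rng.random() < p_survive for _ in range(x0))


def abc_rejection(observed, x0, t, eps, n_draws, seed=None):
    """Keep prior draws whose simulated data lie within eps of the observation."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 2.0)  # assumed prior: Uniform(0, 2)
        simulated = simulate_death_process(theta, x0, t, rng)
        if abs(simulated - observed) <= eps:
            accepted.append(theta)
    return accepted


samples = abc_rejection(observed=37, x0=100, t=1.0, eps=2, n_draws=2000, seed=0)
print(len(samples), "accepted draws; mean:", sum(samples) / len(samples))
```

Only forward simulation is needed, never the likelihood itself. The cost is that most draws are rejected when `eps` is small, which is the computational expense the abstract refers to.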


Subjects
Bayes Theorem, Likelihood Functions, Markov Chains, Biological Models, Monte Carlo Method, Algorithms, Computer Simulation, Kinetics, Systems Biology
15.
Stat Appl Genet Mol Biol ; 13(5): 531-51, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25153608

ABSTRACT

In this paper we develop a Bayesian statistical inference approach to the unified analysis of isobaric labelled MS/MS proteomic data across multiple experiments. An explicit probabilistic model of the log-intensity of the isobaric labels' reporter ions across multiple pre-defined groups and experiments is developed. This is then used to develop a full Bayesian statistical methodology for the identification of differentially expressed proteins, with respect to a control group, across multiple groups and experiments. This methodology is implemented and then evaluated on simulated data and on two model experimental datasets (for which the differentially expressed proteins are known) that use a TMT labelling protocol.


Subjects
Bayes Theorem, Proteins/chemistry, Tandem Mass Spectrometry/methods, Theoretical Models, Proteomics
16.
Biosystems ; 122: 55-72, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24906175

ABSTRACT

The transition density of a stochastic, logistic population growth model with multiplicative intrinsic noise is analytically intractable. Inferring model parameter values by fitting such stochastic differential equation (SDE) models to data therefore requires relatively slow numerical simulation. Where such simulation is prohibitively slow, an alternative is to use model approximations which do have an analytically tractable transition density, enabling fast inference. We introduce two such approximations, with either multiplicative or additive intrinsic noise, each derived from the linear noise approximation (LNA) of a logistic growth SDE. After Bayesian inference we find that our fast LNA models, using Kalman filter recursion for computation of marginal likelihoods, give similar posterior distributions to slow, arbitrarily exact models. We also demonstrate that simulations from our LNA models better describe the characteristics of the stochastic logistic growth models than a related approach. Finally, we demonstrate that our LNA model with additive intrinsic noise and measurement error best describes an example set of longitudinal observations of microbial population size taken from a typical, genome-wide screening experiment.
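The "relatively slow numerical simulation" can be illustrated with an Euler-Maruyama scheme for one common form of the stochastic logistic model, dX = rX(1 - X/K)dt + σX dW. The abstract does not state the exact SDE, so this form, the parameter values and the function name are assumptions; the LNA approximations themselves are not reproduced here.

```python
import math
import random


def simulate_logistic_sde(x0, r, K, sigma, t_max, n_steps, seed=None):
    """Euler-Maruyama path of dX = r*X*(1 - X/K) dt + sigma*X dW (assumed form)."""
    rng = random.Random(seed)
    dt = t_max / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x = x + r * x * (1.0 - x / K) * dt + sigma * x * dw
        x = max(x, 0.0)  # crude guard: a population size cannot be negative
        path.append(x)
    return path


path = simulate_logistic_sde(x0=0.01, r=1.0, K=1.0, sigma=0.1,
                             t_max=20.0, n_steps=2000, seed=1)
print("final population size:", path[-1])
```

Likelihood-based inference with such a simulator requires many fine-grained paths per parameter value, whereas an approximation with a tractable transition density lets the marginal likelihood be computed directly, for example by Kalman filter recursion as the abstract describes.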


Subjects
Computational Biology/methods, Logistic Models, Population Growth, Bayes Theorem, Computer Simulation, Signal-to-Noise Ratio, Stochastic Processes
17.
Trends Ecol Evol ; 28(10): 578-83, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23827437

ABSTRACT

Modellers of biological, ecological, and environmental systems cannot take for granted the maxim 'simple means general means good'. We argue here that viewing simple models as the main way to achieve generality may be an obstacle to the progress of ecological research. We show how complex models can be both desirable and general, and how simple and complex models can be linked together to produce broad-scale and predictive understanding of biological systems.


Subjects
Ecology/methods, Ecosystem, Biological Models
18.
PLoS One ; 8(6): e66242, 2013.
Article in English | MEDLINE | ID: mdl-23776642

ABSTRACT

G-quadruplexes form in guanine-rich regions of DNA and the presence of these structures at telomeres prevents the activity of telomerase in vitro. Ligands such as the cationic porphyrin TMPyP4 stabilise G-quadruplexes and are therefore under investigation for their potential use as anti-cancer drugs. In order to investigate the mechanism of action of TMPyP4 in vivo, we carried out a genome-wide screen in the budding yeast Saccharomyces cerevisiae. We found that deletion of key pentose phosphate pathway (PPP) genes increased the sensitivity of yeast to the presence of TMPyP4. The PPP plays an important role in the oxidative stress response and sensitivity to TMPyP4 also increased when genes involved in the oxidative stress response, CCS1 and YAP1, were deleted. For comparison we also report genome-wide screens using hydrogen peroxide, which causes oxidative stress; RHPS4, another G-quadruplex binder; and hydroxyurea, an S phase poison. We found that a number of TMPyP4-sensitive strains are also sensitive to hydrogen peroxide in a genome-wide screen. Overall our results suggest that treatment with TMPyP4 results in a light-dependent oxidative stress response in budding yeast, and that this, rather than G-quadruplex binding, is the major route to cytotoxicity. Our results have implications for the usefulness and mechanism of action of TMPyP4.


Subjects
G-Quadruplexes, Genetic Fitness/drug effects, Oxidative Stress/drug effects, Pentose Phosphate Pathway/physiology, Porphyrins/pharmacology, Saccharomyces cerevisiae/growth & development, Acridines/pharmacology, Drug Discovery, Gene Deletion, Genetic Fitness/radiation effects, Genome-Wide Association Study, Hydrogen Peroxide/pharmacology, Light, Microbial Sensitivity Tests, Oxidative Stress/radiation effects, Pentose Phosphate Pathway/genetics, Saccharomyces cerevisiae/drug effects, Saccharomyces cerevisiae/radiation effects
19.
BMC Syst Biol ; 6: 130, 2012 Oct 10.
Article in English | MEDLINE | ID: mdl-23046614

ABSTRACT

BACKGROUND: Global demographic changes have stimulated marked interest in the process of ageing. There has been, and will continue to be, an unrelenting rise in the number of the oldest old (>85 years of age). Together with an ageing population there comes an increase in the prevalence of age-related disease. Of the diseases of ageing, cardiovascular disease (CVD) has by far the highest prevalence. It is thought that a finely tuned lipid profile may help to prevent CVD, as there is a long-established relationship between alterations to lipid metabolism and CVD risk. In fact, elevated plasma cholesterol, particularly Low Density Lipoprotein Cholesterol (LDL-C), has consistently stood out as a risk factor for having a cardiovascular event. Moreover, it is widely acknowledged that LDL-C may rise with age in both sexes in a wide variety of groups. The aim of this work was to use a whole-body mathematical model to investigate why LDL-C rises with age, and to test the hypothesis that mechanistic changes to cholesterol absorption and LDL-C removal from the plasma are responsible for the rise. The whole-body mechanistic nature of the model differs from previous models of cholesterol metabolism, which have either focused on intracellular cholesterol homeostasis or have concentrated on an isolated area of lipoprotein dynamics. The model integrates both current and previously published data relating to molecular biology, physiology, ageing and nutrition. RESULTS: The model was used to test the hypothesis that alterations to the rate of cholesterol absorption and changes to the rate of removal of LDL-C from the plasma are integral to understanding why LDL-C rises with age. The model demonstrates that increasing the rate of intestinal cholesterol absorption from 50% to 80% by age 65 years can result in an increase of LDL-C by as much as 34 mg/dL in a hypothetical male subject. The model also shows that decreasing the rate of hepatic clearance of LDL-C gradually to 50% by age 65 years can result in an increase of LDL-C by as much as 116 mg/dL. CONCLUSIONS: Our model clearly demonstrates that, of the two putative mechanisms that have been implicated in the dysregulation of cholesterol metabolism with age, alterations to the removal rate of plasma LDL-C have the most significant impact on cholesterol metabolism, and small changes to the number of hepatic LDL receptors can result in a significant rise in LDL-C. This first whole-body systems-based model of cholesterol balance could potentially be used as a tool to further improve our understanding of whole-body cholesterol metabolism and its dysregulation with age. Furthermore, given further fine tuning, the model may help to investigate potential dietary and lifestyle regimes that have the potential to mitigate the effects ageing has on cholesterol metabolism.


Subjects
Aging/metabolism, Cholesterol/metabolism, Biological Models, Systems Biology/methods, Adult, Aged, Aging/blood, Cholesterol/blood, LDL Cholesterol/blood, LDL Cholesterol/metabolism, Dietary Fats/metabolism, Female, Humans, Intestinal Absorption, Male, Middle Aged, Young Adult
20.
Bioinformatics ; 28(11): 1495-500, 2012 Jun 01.
Article in English | MEDLINE | ID: mdl-22492647

ABSTRACT

MOTIVATION: Biological experiments give insight into networks of processes inside a cell, but are subject to error and uncertainty. However, due to the overlap between the large number of experiments reported in public databases it is possible to assess the chances of individual observations being correct. In order to do so, existing methods rely on high-quality 'gold standard' reference networks, but such reference networks are not always available. RESULTS: We present a novel algorithm for computing the probability of network interactions that operates without gold standard reference data. We show that our algorithm outperforms existing gold standard-based methods. Finally, we apply the new algorithm to a large collection of genetic interaction and protein-protein interaction experiments. AVAILABILITY: The integrated dataset and a reference implementation of the algorithm as a plug-in for the Ondex data integration framework are available for download at http://bio-nexus.ncl.ac.uk/projects/nogold/


Subjects
Algorithms, Bayes Theorem, Genetic Epistasis, Protein Interaction Mapping/standards, Likelihood Functions, Protein Interaction Mapping/methods, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism