Results 1 - 20 of 47
1.
Proc Natl Acad Sci U S A ; 120(50): e2303887120, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-38060555

ABSTRACT

Complex networked systems often exhibit higher-order interactions, beyond dyadic interactions, which can dramatically alter their observed behavior. Consequently, understanding hypergraphs from a structural perspective has become increasingly important. Statistical, group-based inference approaches are well suited for unveiling the underlying community structure and predicting unobserved interactions. However, these approaches often rely on two key assumptions: that the same groups can explain hyperedges of any order and that interactions are assortative, meaning that edges are formed by nodes with the same group memberships. To test these assumptions, we propose a group-based generative model for hypergraphs that does not impose an assortative mechanism to explain observed higher-order interactions, unlike current approaches. Our model allows us to explore the validity of the assumptions. Our results indicate that the first assumption appears to hold true for real networks. However, the second assumption is not necessarily accurate; we find that a combination of general statistical mechanisms can explain observed hyperedges. Finally, with our approach, we are also able to determine the importance of lower- and higher-order interactions for predicting unobserved interactions. Our research challenges the conventional assumptions of group-based inference methodologies and broadens our understanding of the underlying structure of hypergraphs.

2.
Bioinformatics ; 35(20): 4089-4097, 2019 10 15.
Article in English | MEDLINE | ID: mdl-30903689

ABSTRACT

MOTIVATION: The analysis of biological samples in untargeted metabolomic studies using LC-MS yields tens of thousands of ion signals. Annotating these features is of the utmost importance for answering questions as fundamental as, e.g. how many metabolites are there in a given sample. RESULTS: Here, we introduce CliqueMS, a new algorithm for annotating in-source LC-MS1 data. CliqueMS is based on the similarity between coelution profiles and therefore, as opposed to most methods, allows for the annotation of a single spectrum. Furthermore, CliqueMS improves upon the state of the art in several dimensions: (i) it uses a more discriminatory feature similarity metric; (ii) it treats the similarities between features in a transparent way by means of a simple generative model; (iii) it uses a well-grounded maximum likelihood inference approach to group features; (iv) it uses empirical adduct frequencies to identify the parental mass and (v) it deals more flexibly with the identification of the parental mass by proposing and ranking alternative annotations. We validate our approach with simple mixtures of standards and with real complex biological samples. CliqueMS reduces the thousands of features typically obtained in complex samples to hundreds of metabolites, and it is able to correctly annotate more metabolites and adducts from a single spectrum than available tools. AVAILABILITY AND IMPLEMENTATION: https://CRAN.R-project.org/package=cliqueMS and https://github.com/osenan/cliqueMS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software , Tandem Mass Spectrometry , Liquid Chromatography , Ions , Metabolomics , Computer Neural Networks
3.
Phys Rev Lett ; 124(8): 084503, 2020 Feb 28.
Article in English | MEDLINE | ID: mdl-32167370

ABSTRACT

Ever since Nikuradse's experiments on turbulent friction in 1933, there have been theoretical attempts to describe his measurements by collapsing the data into single-variable functions. However, this approach, which is common in other areas of physics and in other fields, is limited by the lack of rigorous quantitative methods to compare alternative data collapses. Here, we address this limitation by using an unsupervised method to find analytic functions that optimally describe each of the data collapses for the Nikuradse dataset. By descaling these analytic functions, we show that a low dispersion of the scaled data does not guarantee that a data collapse is a good description of the original data. In fact, we find that, out of all the proposed data collapses, the original one proposed by Prandtl and Nikuradse over 80 years ago provides the best description of the data so far, and that it also agrees well with recent experimental data, provided that some model parameters are allowed to vary across experiments.

4.
Proc Natl Acad Sci U S A ; 113(50): 14207-14212, 2016 12 13.
Article in English | MEDLINE | ID: mdl-27911773

ABSTRACT

With increasing amounts of information available, modeling and predicting user preferences (for books or articles, for example) are becoming more important. We present a collaborative filtering model, with an associated scalable algorithm, that makes accurate predictions of users' ratings. Like previous approaches, we assume that there are groups of users and of items and that the rating a user gives an item is determined by their respective group memberships. However, we allow each user and each item to belong simultaneously to mixtures of different groups and, unlike many popular approaches such as matrix factorization, we do not assume that users in each group prefer a single group of items. In particular, we do not assume that ratings depend linearly on a measure of similarity, but allow probability distributions of ratings to depend freely on the user's and item's groups. The resulting overlapping groups and predicted ratings can be inferred with an expectation-maximization algorithm whose running time scales linearly with the number of observed ratings. Our approach enables us to predict user preferences in large datasets and is considerably more accurate than the current algorithms for such large datasets.
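
The prediction rule this abstract describes can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: each user has a membership vector over user groups, each item has a membership vector over item groups, and every pair of groups carries its own probability distribution over ratings. The names `theta_u`, `eta_i`, and `p`, and all numbers, are hypothetical; the expectation-maximization fitting step is not shown.

```python
def rating_distribution(theta_u, eta_i, p):
    """Probability of each rating for one (user, item) pair.

    theta_u[k]: user's membership weight in user group k (sums to 1).
    eta_i[l]:   item's membership weight in item group l (sums to 1).
    p[k][l][r]: probability that group pair (k, l) yields rating index r.
    All names and values are illustrative assumptions.
    """
    n_ratings = len(p[0][0])
    dist = [0.0] * n_ratings
    # Mix the group-pair rating distributions by the membership weights.
    for k, t in enumerate(theta_u):
        for l, e in enumerate(eta_i):
            for r in range(n_ratings):
                dist[r] += t * e * p[k][l][r]
    return dist


# Toy example: 2 user groups, 2 item groups, ratings {low, high}.
theta_u = [0.5, 0.5]          # user split evenly between both user groups
eta_i = [1.0, 0.0]            # item fully in item group 0
p = [
    [[0.25, 0.75], [0.5, 0.5]],   # user group 0 vs item groups 0 and 1
    [[0.75, 0.25], [0.5, 0.5]],   # user group 1 vs item groups 0 and 1
]
dist = rating_distribution(theta_u, eta_i, p)
```

Because the rating distribution is stored per group pair rather than derived from a similarity score, nothing in this sketch forces ratings to depend linearly on how alike a user and an item are.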

5.
Proc Natl Acad Sci U S A ; 117(41): 25195-25197, 2020 10 13.
Article in English | MEDLINE | ID: mdl-32989129
6.
Anal Chem ; 89(6): 3474-3482, 2017 03 21.
Article in English | MEDLINE | ID: mdl-28221024

ABSTRACT

Structural annotation of metabolites relies mainly on tandem mass spectrometry (MS/MS) analysis. However, approximately 90% of the known metabolites reported in metabolomic databases do not have annotated spectral data from standards. This situation has fostered the development of computational tools that predict fragmentation patterns in silico and compare these to experimental MS/MS spectra. However, because such methods require the molecular structure of the detected compound to be available for the algorithm, the identification of novel metabolites in organisms relevant for biotechnological and medical applications remains a challenge. Here, we present iMet, a computational tool that facilitates structural annotation of metabolites not described in databases. iMet uses MS/MS spectra and the exact mass of an unknown metabolite to identify metabolites in a reference database that are structurally similar to the unknown metabolite. The algorithm also suggests the chemical transformation that converts the known metabolites into the unknown one. As a proxy for the structural annotation of novel metabolites, we tested 148 metabolites following a leave-one-out cross-validation procedure or by using MS/MS spectra experimentally obtained in our laboratory. We show that for 89% of the 148 metabolites at least one of the top four matches identified by iMet enables the proper annotation of the unknown metabolites. To further validate iMet, we tested 31 metabolites proposed in the 2012-16 CASMI challenges. iMet is freely available at http://imet.seeslab.net .


Subjects
Algorithms , Glucose-6-Phosphate/metabolism , Glucose/metabolism , Factual Databases , Glucose/chemistry , Glucose-6-Phosphate/biosynthesis , Glucose-6-Phosphate/chemistry , Molecular Structure , Phosphorylation , Tandem Mass Spectrometry
7.
Proc Natl Acad Sci U S A ; 111(43): 15322-7, 2014 Oct 28.
Article in English | MEDLINE | ID: mdl-25288755

ABSTRACT

Tens of millions of individuals around the world use decentralized content distribution systems, a fact of growing social, economic, and technological importance. These sharing systems are poorly understood because, unlike in other technosocial systems, it is difficult to gather large-scale data about user behavior. Here, we investigate user activity patterns and the socioeconomic factors that could explain the behavior. Our analysis reveals that (i) the ecosystem is heterogeneous at several levels: content types are heterogeneous, users specialize in a few content types, and countries are heterogeneous in user profiles; and (ii) there is a strong correlation between socioeconomic indicators of a country and user behavior. Our findings open a research area on the dynamics of decentralized sharing ecosystems and the socioeconomic factors affecting them, and may have implications for the design of algorithms and for policymaking.


Subjects
Behavior , Cooperative Behavior , Ecosystem , Politics , Humans , Socioeconomic Factors
8.
PLoS Comput Biol ; 9(12): e1003374, 2013.
Article in English | MEDLINE | ID: mdl-24339767

ABSTRACT

Characterizing interactions between drugs is important to avoid potentially harmful combinations, to reduce off-target effects of treatments and to fight antibiotic resistant pathogens, among others. Here we present a network inference algorithm to predict uncharacterized drug-drug interactions. Our algorithm takes, as its only input, sets of previously reported interactions, and does not require any pharmacological or biochemical information about the drugs, their targets or their mechanisms of action. Because the models we use are abstract, our approach can deal with adverse interactions, synergistic/antagonistic/suppressing interactions, or any other type of drug interaction. We show that our method is able to accurately predict interactions, both in exhaustive pairwise interaction data between small sets of drugs, and in large-scale databases. We also demonstrate that our algorithm can be used efficiently to discover interactions of new drugs as part of the drug discovery process.


Subjects
Drug Interactions , Theoretical Models , Algorithms , Database Management Systems , Drug Discovery
9.
Proc Natl Acad Sci U S A ; 108(18): 7647-52, 2011 May 03.
Article in English | MEDLINE | ID: mdl-21502521

ABSTRACT

In this study, we investigated on a systems level how complex protein interactions underlying cell polarity in yeast determine the dynamic association of proteins with the polar cortical domain (PCD) where they localize and perform morphogenetic functions. We constructed a network of physical interactions among >100 proteins localized to the PCD. This network was further divided into five robust modules correlating with distinct subprocesses associated with cell polarity. Based on this reconstructed network, we proposed a simple model that approximates a PCD protein's molecular residence time as the sum of the characteristic time constants of the functional modules with which it interacts, weighted by the number of edges forming these interactions. Regression analyses showed excellent fitting of the model with experimentally measured residence times for a large subset of the PCD proteins. The model is able to predict residence times using small training sets. Our analysis also revealed a scaffold protein that imposes a local constraint of dynamics for certain interacting proteins.
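
The residence-time model in this abstract is simple enough to write down directly. The sketch below is a hedged illustration only: the abstract does not say whether the edge weights are normalized, so plain edge counts are assumed, and the module names and time constants are made up (in the study they would come from regression against measured residence times).

```python
def predicted_residence_time(edges_to_module, module_time_constant):
    """Sum of module time constants, weighted by the number of edges.

    edges_to_module:      {module: number of edges linking the protein to it}
    module_time_constant: {module: characteristic time constant of the module}
    Unnormalized edge counts are an assumption of this sketch.
    """
    return sum(
        n_edges * module_time_constant[module]
        for module, n_edges in edges_to_module.items()
    )


# Hypothetical protein with 2 edges into module "A" and 1 edge into module "B".
tau = {"A": 1.5, "B": 4.0}    # made-up time constants, arbitrary units
edges = {"A": 2, "B": 1}
t_res = predicted_residence_time(edges, tau)
```

With the module time constants as the only free parameters, a fit of this form needs few training proteins, which is consistent with the abstract's remark about small training sets.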


Subjects
Cell Polarity/physiology , Fungal Proteins/physiology , Biological Models , Protein Interaction Mapping , Saccharomycetales/physiology , Analysis of Variance , Fluorescence Recovery After Photobleaching , Fluorescence Resonance Energy Transfer , Regression Analysis , Systems Biology , Time Factors
10.
J Theor Biol ; 334: 35-44, 2013 Oct 07.
Article in English | MEDLINE | ID: mdl-23770400

ABSTRACT

In food webs, the degree of intervality of consumers' diets is an indicator of the number of dimensions that are necessary to determine the niche of a species. Previous studies modeling food-web structure have shown that real networks are compatible with a high degree of diet contiguity. However, current models are also compatible with the opposite, namely that species' diets have relatively low contiguity. This is particularly true when one takes species' body size as a proxy for niche value, in which case the indeterminacy of diet contiguities provided by current models can be large. We propose a model that enables us to narrow down the range of possible values of diet contiguity. According to this model, we find that diet contiguity not only can be high, but must be high when species are ranked in ascending order of body size.


Subjects
Body Size/physiology , Feeding Behavior/physiology , Food Chain , Biological Models , Algorithms , Animals , Computer Simulation , Diet , Ecosystem , Predatory Behavior/physiology , Species Specificity
11.
PLoS Comput Biol ; 8(11): e1002762, 2012.
Article in English | MEDLINE | ID: mdl-23133365

ABSTRACT

The ability of microbial species to consume compounds found in the environment to generate commercially-valuable products has long been exploited by humanity. The untapped, staggering diversity of microbial organisms offers a wealth of potential resources for tackling medical, environmental, and energy challenges. Understanding microbial metabolism will be crucial to many of these potential applications. Thermodynamically-feasible metabolic reconstructions can be used, under some conditions, to predict the growth rate of certain microbes using constraint-based methods. While these reconstructions are powerful, they are still cumbersome to build and, because of the complexity of metabolic networks, it is hard for researchers to gain from these reconstructions an understanding of why a certain nutrient yields a given growth rate for a given microbe. Here, we present a simple model of biomass production that accurately reproduces the predictions of thermodynamically-feasible metabolic reconstructions. Our model makes use of only: i) a nutrient's structure and function, ii) the presence of a small number of enzymes in the organism, and iii) the carbon flow in pathways that catabolize nutrients. When applied to test organisms, our model allows us to predict whether a nutrient can be a carbon source with an accuracy of about 90% with respect to in silico experiments. In addition, our model provides excellent predictions of whether a medium will produce more or less growth than another (p < 10^-6) and good predictions of the actual value of the in silico biomass production.


Subjects
Bacteria/metabolism , Biological Models , Saccharomyces cerevisiae/metabolism , Systems Biology/methods , Biomass , Carbon/metabolism , Carbon Cycle , Computer Simulation , Metabolism , Reproducibility of Results
12.
Nat Commun ; 14(1): 1043, 2023 Feb 24.
Article in English | MEDLINE | ID: mdl-36823107

ABSTRACT

Given a finite and noisy dataset generated with a closed-form mathematical model, when is it possible to learn the true generating model from the data alone? This is the question we investigate here. We show that this model-learning problem displays a transition from a low-noise phase in which the true model can be learned, to a phase in which the observation noise is too high for the true model to be learned by any method. Both in the low-noise phase and in the high-noise phase, probabilistic model selection leads to optimal generalization to unseen data. This is in contrast to standard machine learning approaches, including artificial neural networks, which in this particular problem are limited, in the low-noise phase, by their ability to interpolate. In the transition region between the learnable and unlearnable phases, generalization is hard for all approaches including probabilistic model selection.

13.
Proc Natl Acad Sci U S A ; 106(52): 22073-8, 2009 Dec 29.
Article in English | MEDLINE | ID: mdl-20018705

ABSTRACT

Network analysis is currently used in a myriad of contexts, from identifying potential drug targets to predicting the spread of epidemics and designing vaccination strategies and from finding friends to uncovering criminal activity. Despite the promise of the network approach, the reliability of network data is a source of great concern in all fields where complex networks are studied. Here, we present a general mathematical and computational framework to deal with the problem of data reliability in complex networks. In particular, we are able to reliably identify both missing and spurious interactions in noisy network observations. Remarkably, our approach also enables us to obtain, from those noisy observations, network reconstructions that yield estimates of the true network properties that are more accurate than those provided by the observations themselves. Our approach has the potential to guide experiments, to better characterize network data sets, and to drive new discoveries.


Subjects
Bioengineering/statistics & numerical data , Factual Databases , Algorithms , Animals , Computational Biology , Humans , Information Systems , Statistical Models , Computer Neural Networks , Protein Interaction Mapping/statistics & numerical data , Research Design , Stochastic Processes
14.
PNAS Nexus ; 1(3): pgac055, 2022 Jul.
Article in English | MEDLINE | ID: mdl-36741465

ABSTRACT

A key question in human gut microbiome research is what are the robust structural patterns underlying its taxonomic composition. Herein, we use whole metagenomic datasets from healthy human guts to show that such robust patterns do exist, albeit not in the conventional enterotype sense. We first introduce the concept of mixed-membership enterotypes using a network inference approach based on stochastic block models. We find that gut microbiomes across a group of people (hosts) display a nested structure, which has been observed in a number of ecological systems. This finding led us to designate distinct ecological roles to both microbes and hosts: generalists and specialists. Specifically, generalist hosts have microbiomes with most microbial species, while specialist hosts only have generalist microbes. Moreover, specialist microbes are only present in generalist hosts. From the nested structure of microbial taxonomies, we show that these ecological roles of microbes are generally conserved across datasets. Our results show that the taxonomic composition of healthy human gut microbiomes is associated with robustly structured combinations of generalist and specialist species.

15.
ACS Omega ; 7(45): 41147-41164, 2022 Nov 15.
Article in English | MEDLINE | ID: mdl-36406548

ABSTRACT

Process modeling has become a fundamental tool to guide experimental work. Unfortunately, process models based on first principles can be expensive to develop and evaluate, and hard to use, particularly when convergence issues arise. This work proves that Bayesian symbolic learning can be applied to derive simple closed-form expressions from rigorous process simulations, streamlining the process modeling task and making process models more accessible to experimental groups. Compared to conventional surrogate models, our approach provides analytical expressions that are easier to communicate and manipulate algebraically to get insights into the process. We apply this method to synthetic data obtained from two basic CO2 capture processes simulated in Aspen HYSYS, identifying accurate simplified interpretable equations for key variables dictating the process economic and environmental performance. We then use these expressions to analyze the process variables' elasticities and benchmark an emerging CO2 capture process against the business as usual technology.

16.
iScience ; 25(1): 103663, 2022 Jan 21.
Article in English | MEDLINE | ID: mdl-35036864

ABSTRACT

We design a "wisdom-of-the-crowds" gene regulatory network (GRN) inference pipeline and couple it to complex network analysis to understand the organizational principles governing gene regulation in long-lived glp-1/Notch Caenorhabditis elegans. The GRN has three layers (input, core, and output) and is topologically equivalent to bow-tie/hourglass structures prevalent among metabolic networks. To assess the functional importance of structural layers, we screened 80% of regulators and discovered 50 new aging genes, 86% with human orthologues. Genes essential for longevity, including ones involved in insulin-like signaling (ILS), are at the core, indicating that the GRN's structure is predictive of functionality. We used in vivo reporters and a novel functional network covering 5,497 genetic interactions to make mechanistic predictions. We used genetic epistasis to test some of these predictions, uncovering a novel transcriptional regulator, sup-37, that works alongside DAF-16/FOXO. We present a framework with predictive power that can accelerate discovery in C. elegans and potentially humans.

17.
Redox Biol ; 54: 102353, 2022 08.
Article in English | MEDLINE | ID: mdl-35777200

ABSTRACT

Metabolic plasticity is the ability of a biological system to adapt its metabolic phenotype to different environmental stressors. We used a whole-body and tissue-specific phenotypic, functional, proteomic, metabolomic and transcriptomic approach to systematically assess metabolic plasticity in diet-induced obese mice after a combined nutritional and exercise intervention. Although most obesity and overnutrition-related pathological features were successfully reverted, we observed a high degree of metabolic dysfunction in visceral white adipose tissue, characterized by abnormal mitochondrial morphology and functionality. Despite two sequential therapeutic interventions and an apparent global healthy phenotype, obesity triggered a cascade of events in visceral adipose tissue progressing from mitochondrial metabolic and proteostatic alterations to widespread cellular stress, which compromises its biosynthetic and recycling capacity. In humans, weight loss after bariatric surgery showed a transcriptional signature in visceral adipose tissue similar to our mouse model of obesity reversion. Overall, our data indicate that obesity prompts a lasting metabolic fingerprint that leads to a progressive breakdown of metabolic plasticity in visceral adipose tissue.


Subjects
Insulin Resistance , Adipose Tissue/metabolism , Animals , Homeostasis , Intra-Abdominal Fat/metabolism , Mice , Obesity/genetics , Obesity/metabolism , Proteomics
18.
Nature ; 433(7028): 895-900, 2005 Feb 24.
Article in English | MEDLINE | ID: mdl-15729348

ABSTRACT

High-throughput techniques are leading to an explosive growth in the size of biological databases and creating the opportunity to revolutionize our understanding of life and disease. Interpretation of these data remains, however, a major scientific challenge. Here, we propose a methodology that enables us to extract and display information contained in complex networks. Specifically, we demonstrate that we can find functional modules in complex networks, and classify nodes into universal roles according to their pattern of intra- and inter-module connections. The method thus yields a 'cartographic representation' of complex networks. Metabolic networks are among the most challenging biological networks and, arguably, the ones with most potential for immediate applicability. We use our method to analyse the metabolic networks of twelve organisms from three different superkingdoms. We find that, typically, 80% of the nodes are only connected to other nodes within their respective modules, and that nodes with different roles are affected by different evolutionary constraints and pressures. Remarkably, we find that metabolites that participate in only a few reactions but that connect different modules are more conserved than hubs whose links are mostly within a single module.
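
The abstract does not spell out how a node's "pattern of intra- and inter-module connections" is quantified. Two quantities commonly used for this purpose are the within-module degree z-score and the participation coefficient, P = 1 - sum over modules of (k_s / k)^2, where k_s is the number of a node's links into module s and k is its total degree. The sketch below computes both for a toy network. It assumes the module assignment is already known, and the function name, variable names, and data are all hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def node_role_metrics(adjacency, module_of):
    """Return {node: (z, P)} for an undirected network.

    z: within-module degree z-score (0.0 if the module has no spread).
    P: participation coefficient, 1 - sum_s (k_s / k)^2.
    """
    # Number of links each node has into its own module.
    within = {
        n: sum(1 for nb in nbrs if module_of[nb] == module_of[n])
        for n, nbrs in adjacency.items()
    }
    metrics = {}
    for n, nbrs in adjacency.items():
        # Within-module degrees of all nodes sharing n's module.
        peers = [within[m] for m in adjacency if module_of[m] == module_of[n]]
        sd = pstdev(peers)
        z = (within[n] - mean(peers)) / sd if sd > 0 else 0.0
        k = len(nbrs)
        per_module = Counter(module_of[nb] for nb in nbrs)
        P = 1.0 - sum((c / k) ** 2 for c in per_module.values()) if k else 0.0
        metrics[n] = (z, P)
    return metrics


# Toy network: two triangles (modules 0 and 1) joined by the edge c-d.
adjacency = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
module_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
metrics = node_role_metrics(adjacency, module_of)
```

In this toy network only the two bridge nodes, `c` and `d`, have a nonzero participation coefficient, which is the kind of node the abstract singles out as connecting different modules.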


Subjects
Archaea/metabolism , Bacteria/metabolism , Computational Biology/methods , Computer Simulation , Eukaryotic Cells/metabolism , Biological Models , Adenosine Triphosphate/metabolism , Algorithms , Animals , Factual Databases , Escherichia coli/metabolism , Humans
19.
Sci Adv ; 6(5): eaav6971, 2020 Jan.
Article in English | MEDLINE | ID: mdl-32064326

ABSTRACT

Closed-form, interpretable mathematical models have been instrumental for advancing our understanding of the world; with the data revolution, we may now be in a position to uncover new such models for many systems from physics to the social sciences. However, to deal with increasing amounts of data, we need "machine scientists" that are able to extract these models automatically from data. Here, we introduce a Bayesian machine scientist, which establishes the plausibility of models using explicit approximations to the exact marginal posterior over models and establishes its prior expectations about models by learning from a large empirical corpus of mathematical expressions. It explores the space of models using Markov chain Monte Carlo. We show that this approach uncovers accurate models for synthetic and real data and provides out-of-sample predictions that are more accurate than those of existing approaches and of other nonparametric methods.

20.
Phys Rev E ; 99(3-1): 032307, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30999447

ABSTRACT

Many real-world complex systems are well represented as multilayer networks; predicting interactions in those systems is one of the most pressing problems in predictive network science. To address this challenge, we introduce two stochastic block models for multilayer and temporal networks; one of them uses nodes as its fundamental unit, whereas the other focuses on links. We also develop scalable algorithms for inferring the parameters of these models. Because our models describe all layers simultaneously, our approach takes full advantage of the information contained in the whole network when making predictions about any particular layer. We illustrate the potential of our approach by analyzing two empirical data sets: a temporal network of e-mail communications, and a network of drug interactions for treating different cancer types. We find that multilayer models consistently outperform their single-layer counterparts, but that the most predictive model depends on the data set under consideration; whereas the node-based model is more appropriate for predicting drug interactions, the link-based model is more appropriate for predicting e-mail communication.
