Results 1 - 20 of 45
1.
Cell ; 162(6): 1286-98, 2015 Sep 10.
Article in English | MEDLINE | ID: mdl-26359986

ABSTRACT

Heat causes protein misfolding and aggregation and, in eukaryotic cells, triggers aggregation of proteins and RNA into stress granules. We have carried out extensive proteomic studies to quantify heat-triggered aggregation and subsequent disaggregation in budding yeast, identifying >170 endogenous proteins aggregating within minutes of heat shock in multiple subcellular compartments. We demonstrate that these aggregated proteins are not misfolded and destined for degradation. Stable-isotope labeling reveals that even severely aggregated endogenous proteins are disaggregated without degradation during recovery from shock, contrasting with the rapid degradation observed for many exogenous thermolabile proteins. Although aggregation likely inactivates many cellular proteins, in the case of a heterotrimeric aminoacyl-tRNA synthetase complex, the aggregated proteins remain active with unaltered fidelity. We propose that most heat-induced aggregation of mature proteins reflects the operation of an adaptive, autoregulatory process of functionally significant aggregate assembly and disassembly that aids cellular adaptation to thermal stress.


Subject(s)
Heat-Shock Response , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/physiology , Cycloheximide/pharmacology , Cytoplasmic Granules/metabolism , Protein Aggregates , Protein Biosynthesis/drug effects , Protein Synthesis Inhibitors/pharmacology , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/metabolism
2.
Proc Natl Acad Sci U S A ; 119(44): e2208975119, 2022 11.
Article in English | MEDLINE | ID: mdl-36279463

ABSTRACT

Randomized experiments are widely used to estimate the causal effects of a proposed treatment in many areas of science, from medicine and healthcare to the physical and biological sciences, from the social sciences to engineering, and from public policy to the technology industry. Here we consider situations where classical methods for estimating the total treatment effect on a target population are considerably biased due to confounding network effects, i.e., the fact that the treatment of an individual may impact its neighbors' outcomes, an issue referred to as network interference or as nonindividualized treatment response. A key challenge in these situations is that the network is often unknown and difficult or costly to measure. We assume a potential outcomes model with heterogeneous additive network effects, encompassing a broad class of network interference sources, including spillover, peer effects, and contagion. First, we characterize the limitations in estimating the total treatment effect without knowledge of the network that drives interference. By contrast, we subsequently develop a simple estimator and efficient randomized design that outputs an unbiased estimate with low variance in situations where one is given access to average historical baseline measurements prior to the experiment. Our solution does not require knowledge of the underlying network structure, and it comes with statistical guarantees for a broad class of models. Due to their ease of interpretation and implementation, and their theoretical guarantees, we believe our results will have significant impact on the design of randomized experiments.


Subject(s)
Randomized Controlled Trials as Topic , Causality
3.
Mol Cell ; 63(1): 60-71, 2016 07 07.
Article in English | MEDLINE | ID: mdl-27320198

ABSTRACT

Despite its eponymous association with the heat shock response, yeast heat shock factor 1 (Hsf1) is essential even at low temperatures. Here we show that engineered nuclear export of Hsf1 results in cytotoxicity associated with massive protein aggregation. Genome-wide analysis revealed that Hsf1 nuclear export immediately decreased basal transcription and mRNA expression of 18 genes, which predominately encode chaperones. Strikingly, rescuing basal expression of Hsp70 and Hsp90 chaperones enabled robust cell growth in the complete absence of Hsf1. With the exception of chaperone gene induction, the vast majority of the heat shock response was Hsf1 independent. By comparative analysis of mammalian cell lines, we found that only heat shock-induced but not basal expression of chaperones is dependent on the mammalian Hsf1 homolog (HSF1). Our work reveals that yeast chaperone gene expression is an essential housekeeping mechanism and provides a roadmap for defining the function of HSF1 as a driver of oncogenesis.


Subject(s)
DNA-Binding Proteins/metabolism , Heat-Shock Proteins/metabolism , Heat-Shock Response , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Transcription Factors/metabolism , Transcription, Genetic , Animals , CRISPR-Cas Systems , Cell Line , DNA-Binding Proteins/genetics , Embryonic Stem Cells/metabolism , Fibroblasts/metabolism , Gene Expression Regulation, Fungal , Gene Regulatory Networks , HSP70 Heat-Shock Proteins/metabolism , HSP90 Heat-Shock Proteins/metabolism , Heat Shock Transcription Factors , Heat-Shock Proteins/genetics , Homeostasis , Mice, 129 Strain , Mice, Inbred CBA , Protein Aggregates , Protein Interaction Maps , RNA, Fungal/genetics , RNA, Fungal/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics , Time Factors , Transcription Factors/genetics , Transfection
4.
Proc Natl Acad Sci U S A ; 117(32): 19045-19053, 2020 08 11.
Article in English | MEDLINE | ID: mdl-32723822

ABSTRACT

Data analyses typically rely upon assumptions about the missingness mechanisms that lead to observed versus missing data, assumptions that are typically unassessable. We explore an approach where the joint distribution of observed data and missing data is specified in a nonstandard way. In this formulation, which traces back to a representation of the joint distribution of the data and missingness mechanism, apparently first proposed by J. W. Tukey, the modeling assumptions about the distributions are either assessable or are designed to allow relatively easy incorporation of substantive knowledge about the problem at hand, thereby offering a possibly realistic portrayal of the data, both observed and missing. We develop Tukey's representation for exponential-family models, propose a computationally tractable approach to inference in this class of models, and offer some general theoretical comments. We then illustrate the utility of this approach with an example in systems biology.

5.
Proc Natl Acad Sci U S A ; 117(38): 23393-23400, 2020 09 22.
Article in English | MEDLINE | ID: mdl-32887799

ABSTRACT

Most real-world networks are incompletely observed. Algorithms that can accurately predict which links are missing can dramatically speed up network data collection and improve network model validation. Many algorithms now exist for predicting missing links, given a partially observed network, but it has remained unknown whether a single best predictor exists, how link predictability varies across methods and networks from different domains, and how close to optimality current methods are. We answer these questions by systematically evaluating 203 individual link predictor algorithms, representing three popular families of methods, applied to a large corpus of 550 structurally diverse networks from six scientific domains. We first show that individual algorithms exhibit a broad diversity of prediction errors, such that no one predictor or family is best, or worst, across all realistic inputs. We then exploit this diversity using network-based metalearning to construct a series of "stacked" models that combine predictors into a single algorithm. Applied to a broad range of synthetic networks, for which we may analytically calculate optimal performance, these stacked models achieve optimal or nearly optimal levels of accuracy. Applied to real-world networks, stacked models are superior, but their accuracy varies strongly by domain, suggesting that link prediction may be fundamentally easier in social networks than in biological or technological networks. These results indicate that the state of the art for link prediction comes from combining individual algorithms, which can achieve nearly optimal predictions. We close with a brief discussion of limitations and opportunities for further improvements.
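As a rough illustration of what an individual topological link predictor looks like, the sketch below scores unobserved node pairs in a small network with two standard heuristics, common neighbors and the Jaccard coefficient. The toy network, the function names, and the choice of these two heuristics are assumptions made for illustration; the abstract does not list the 203 predictors evaluated, and the stacking step (feeding such scores as features to a supervised learner) is not shown.

```python
from itertools import combinations


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def rank_candidate_links(adj, score):
    """Score every unobserved pair and sort from most to least likely."""
    pairs = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
    return sorted(pairs, key=lambda p: score(adj, *p), reverse=True)


# Hypothetical undirected toy network, stored as neighbor sets.
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

top_by_cn = rank_candidate_links(toy_adj, common_neighbors)[0]
top_by_jaccard = rank_candidate_links(toy_adj, jaccard)[0]
```

In this toy network both heuristics rank the pair (a, d) first, because a and d share two neighbors. On real networks different predictors often disagree, which is the diversity a stacked model can exploit.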


Subject(s)
Machine Learning , Neural Networks, Computer , Humans , Machine Learning/standards , Models, Statistical , Predictive Value of Tests , Social Networking
7.
Proc Natl Acad Sci U S A ; 112(21): 6595-600, 2015 May 26.
Article in English | MEDLINE | ID: mdl-25964337

ABSTRACT

Social networks affect many aspects of life, including the spread of diseases, the diffusion of information, workers' productivity, and consumers' behavior. Little is known, however, about how these networks form and change. Estimating causal effects and mechanisms that drive social network formation and dynamics is challenging because of the complexity of engineering social relations in a controlled environment, endogeneity between network structure and individual characteristics, and the lack of time-resolved data about individuals' behavior. We leverage data from a sample of 1.5 million college students on Facebook, who wrote more than 630 million messages and 590 million posts over 4 years, to design a long-term natural experiment of friendship formation and social dynamics in the aftermath of a natural disaster. The analysis shows that affected individuals are more likely to strengthen interactions, while maintaining the same number of friends as unaffected individuals. Our findings suggest that the formation of social relationships may serve as a coping mechanism to deal with high-stress situations and build resilience in communities.


Subject(s)
Social Media , Social Networking , Cyclonic Storms , Disasters , Humans , Internet , Interpersonal Relations , Students , United States , Universities
8.
Proc Natl Acad Sci U S A ; 112(18): 5643-8, 2015 May 05.
Article in English | MEDLINE | ID: mdl-25902504

ABSTRACT

Public transportation systems are an essential component of major cities. The widespread use of smart cards for automated fare collection in these systems offers a unique opportunity to understand passenger behavior at a massive scale. In this study, we use network-wide data obtained from smart cards in the London transport system to predict future traffic volumes, and to estimate the effects of disruptions due to unplanned closures of stations or lines. Disruptions, or shocks, force passengers to make different decisions concerning which stations to enter or exit. We describe how these changes in passenger behavior lead to possible overcrowding and model how stations will be affected by given disruptions. This information can then be used to mitigate the effects of these shocks because transport authorities may prepare in advance alternative solutions such as additional buses near the most affected stations. We describe statistical methods that leverage the large amount of smart-card data collected under the natural state of the system, where no shocks take place, as variables that are indicative of behavior under disruptions. We find that features extracted from the natural regime data can be successfully exploited to describe different disruption regimes, and that our framework can be used as a general tool for any similar complex transportation system.


Subject(s)
Accidents, Traffic/statistics & numerical data , Cities , Motor Vehicles/statistics & numerical data , Transportation/statistics & numerical data , Accidents, Traffic/prevention & control , Accidents, Traffic/trends , Algorithms , City Planning/methods , City Planning/statistics & numerical data , City Planning/trends , Environment Design/statistics & numerical data , Environment Design/trends , Forecasting , Humans , London , Maps as Topic , Models, Theoretical , Transportation/methods
9.
PLoS Genet ; 11(5): e1005206, 2015 May.
Article in English | MEDLINE | ID: mdl-25950722

ABSTRACT

Cells respond to their environment by modulating protein levels through mRNA transcription and post-transcriptional control. Modest observed correlations between global steady-state mRNA and protein measurements have been interpreted as evidence that mRNA levels determine roughly 40% of the variation in protein levels, indicating dominant post-transcriptional effects. However, the techniques underlying these conclusions, such as correlation and regression, yield biased results when data are noisy, missing systematically, and collinear (properties of mRNA and protein measurements), which motivated us to revisit this subject. Noise-robust analyses of 24 studies of budding yeast reveal that mRNA levels explain more than 85% of the variation in steady-state protein levels. Protein levels are not proportional to mRNA levels, but rise much more rapidly. Regulation of translation suffices to explain this nonlinear effect, revealing post-transcriptional amplification of, rather than competition with, transcriptional signals. These results substantially revise widely credited models of protein-level regulation, and introduce multiple noise-aware approaches essential for proper analysis of many biological phenomena.
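To make the point about noise-induced bias concrete, here is a minimal sketch of one classical noise-aware adjustment, Spearman's correction for attenuation, which rescales an observed correlation by the reliabilities of the two measurements. The abstract does not say which noise-robust analyses were used, so this is only an illustration; the function name and the numbers in the usage lines are hypothetical.

```python
import math


def disattenuated_correlation(r_observed, reliability_x, reliability_y):
    """Spearman's correction: estimate the correlation between true values
    from the correlation between noisy measurements.

    Each reliability is the fraction of a measurement's variance that is
    signal rather than noise, and must lie in (0, 1].
    """
    for rel in (reliability_x, reliability_y):
        if not 0 < rel <= 1:
            raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Hypothetical numbers: an observed mRNA-protein correlation of 0.6 with
# measurement reliabilities of 0.81 and 0.64.
r_true = disattenuated_correlation(0.6, 0.81, 0.64)
variance_explained_observed = 0.6 ** 2
variance_explained_corrected = r_true ** 2
```

With these made-up inputs the corrected correlation is 0.6 / 0.72, so the estimated share of variance explained rises from 0.36 to roughly 0.69. The direction of the adjustment, not the specific values, is what matters here.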


Subject(s)
Gene Expression Regulation, Fungal , RNA Processing, Post-Transcriptional , RNA, Messenger/genetics , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/genetics , Models, Genetic , RNA, Messenger/metabolism , Reproducibility of Results , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Transcription, Genetic
10.
Bioinformatics ; 31(14): 2400-2, 2015 Jul 15.
Article in English | MEDLINE | ID: mdl-25617416

ABSTRACT

MOTIVATION: Analysis of RNA sequencing (RNA-Seq) data revealed that the vast majority of human genes express multiple mRNA isoforms, produced by alternative pre-mRNA splicing and other mechanisms, and that most alternative isoforms vary in expression between human tissues. As RNA-Seq datasets grow in size, it remains challenging to visualize isoform expression across multiple samples. RESULTS: To help address this problem, we present Sashimi plots, a quantitative visualization of aligned RNA-Seq reads that enables quantitative comparison of exon usage across samples or experimental conditions. Sashimi plots can be made using the Broad Integrated Genome Viewer or with a stand-alone command line program. AVAILABILITY AND IMPLEMENTATION: Software code and documentation freely available here: http://miso.readthedocs.org/en/fastmiso/sashimi.html


Subject(s)
Alternative Splicing , Exons , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Computer Graphics , Humans , RNA Isoforms/chemistry , RNA Isoforms/metabolism , Sequence Alignment
11.
Nature ; 462(7271): 358-62, 2009 Nov 19.
Article in English | MEDLINE | ID: mdl-19924215

ABSTRACT

Molecular regulation of embryonic stem cell (ESC) fate involves a coordinated interaction between epigenetic, transcriptional and translational mechanisms. It is unclear how these different molecular regulatory mechanisms interact to regulate changes in stem cell fate. Here we present a dynamic systems-level study of cell fate change in murine ESCs following a well-defined perturbation. Global changes in histone acetylation, chromatin-bound RNA polymerase II, messenger RNA (mRNA), and nuclear protein levels were measured over 5 days after downregulation of Nanog, a key pluripotency regulator. Our data demonstrate how a single genetic perturbation leads to progressive widespread changes in several molecular regulatory layers, and provide a dynamic view of information flow in the epigenome, transcriptome and proteome. We observe that a large proportion of changes in nuclear protein levels are not accompanied by concordant changes in the expression of corresponding mRNAs, indicating important roles for translational and post-translational regulation of ESC fate. Gene-ontology analysis across different molecular layers indicates that although chromatin reconfiguration is important for altering cell fate, it is preceded by transcription-factor-mediated regulatory events. The temporal order of gene expression alterations shows the order of the regulatory network reconfiguration and offers further insight into the gene regulatory network. Our studies extend the conventional systems biology approach to include many molecular species, regulatory layers and temporal series, and underscore the complexity of the multilayer regulatory mechanisms responsible for changes in protein expression that determine stem cell fate.


Subject(s)
Cell Differentiation , Embryonic Stem Cells/cytology , Embryonic Stem Cells/metabolism , Animals , Epigenesis, Genetic , Gene Expression Profiling , Gene Expression Regulation, Developmental , Mice , Proteome , Time Factors
12.
Mol Biol Evol ; 30(6): 1438-53, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23493257

ABSTRACT

A key goal in molecular evolution is to extract mechanistic insights from signatures of selection. A case study is codon usage, where despite many recent advances and hypotheses, two longstanding problems remain: the relative contribution of selection and mutation in determining codon frequencies and the relative contribution of translational speed and accuracy to selection. The relevant targets of selection (the rate of translation and of mistranslation of a codon per unit time in the cell) can only be related to mechanistic properties of the translational apparatus if the number of transcripts per cell is known, requiring use of gene expression measurements. Perhaps surprisingly, different gene-expression data sets yield markedly different estimates of selection. We show that this is largely due to measurement noise, notably due to differences between studies rather than instrument error or biological variability. We develop an analytical framework that explicitly models noise in expression in the context of the population-genetic model. Estimates of mutation and selection strength in budding yeast produced by this method are robust to the expression data set used and are substantially higher than estimates using a noise-blind approach. We introduce per-gene selection estimates that correlate well with previous scoring systems, such as the codon adaptation index, while now carrying an evolutionary interpretation. On average, selection for codon usage in budding yeast is weak, yet our estimates show that genes range from virtually unselected to average per-codon selection coefficients above the inverse population size. Our analytical framework may be generally useful for distinguishing biological signals from measurement noise in other applications that depend upon measurements of gene expression.


Subject(s)
Codon , Gene Expression , Models, Genetic , Selection, Genetic , Evolution, Molecular , Mutation , Protein Biosynthesis , Ribosomes/genetics , Ribosomes/metabolism , Saccharomycetales/genetics , Saccharomycetales/metabolism
13.
Proc Natl Acad Sci U S A ; 108(41): 16916-21, 2011 Oct 11.
Article in English | MEDLINE | ID: mdl-21949369

ABSTRACT

The goal of dimensionality reduction is to embed high-dimensional data in a low-dimensional space while preserving structure in the data relevant to exploratory data analysis such as clusters. However, existing dimensionality reduction methods often either fail to separate clusters due to the crowding problem or can only separate clusters at a single resolution. We develop a new approach to dimensionality reduction: tree preserving embedding. Our approach uses the topological notion of connectedness to separate clusters at all resolutions. We provide a formal guarantee of cluster separation for our approach that holds for finite samples. Our approach requires no parameters and can handle general types of data, making it easy to use in practice and suggesting new strategies for robust data visualization.


Subject(s)
Data Interpretation, Statistical , Algorithms , Cluster Analysis , Handwriting , Models, Statistical , Radar , Sequence Analysis, Protein/statistics & numerical data
14.
Science ; 384(6695): eadi5147, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38696582

ABSTRACT

Certain people occupy topological positions within social networks that enhance their effectiveness at inducing spillovers. We mapped face-to-face networks among 24,702 people in 176 isolated villages in Honduras and randomly assigned villages to targeting methods, varying the fraction of households receiving a 22-month health education package and the method by which households were chosen (randomly versus using the friendship-nomination algorithm). We assessed 117 diverse knowledge, attitude, and practice outcomes. Friendship-nomination targeting reduced the number of households needed to attain specified levels of village-wide uptake. Knowledge spread more readily than behavior, and spillovers extended to two degrees of separation. Outcomes that were intrinsically easier to adopt also manifested greater spillovers. Network targeting using friendship nomination effectively promotes population-wide improvements in welfare through social contagion.

15.
BMJ Open ; 14(6): e060784, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38858139

ABSTRACT

OBJECTIVES: To assess the efficacy of a sustained educational intervention to affect diverse outcomes across the pregnancy and infancy timeline. SETTING: A multi-arm cluster-randomised controlled trial in 99 villages in Honduras' Copán region, involving 16 301 people in 5633 households from October 2015 to December 2019. PARTICIPANTS: Residents aged 12 and older were eligible. A photographic census involved 93% of the population, with 13 881 and 10 263 individuals completing baseline and endline surveys, respectively. INTERVENTION: 22-month household-based counselling intervention aiming to improve practices, knowledge and attitudes related to maternal, neonatal and child health. PRIMARY AND SECONDARY OUTCOME MEASURES: Primary outcomes were prenatal/postnatal care behaviours, facility births, exclusive breast feeding, parental involvement, treatment of diarrhoea and respiratory illness, reproductive health, and gender/reproductive norms. Secondary outcomes were knowledge and attitudes related to the primary outcomes. RESULTS: Parents targeted for the intervention were 16.4% (95% CI 3.1%-29.8%, p=0.016) more likely to have their newborn's health checked in a health facility within 3 days of birth; 19.6% (95% CI 4.2%-35.1%, p=0.013) more likely to not wrap a fajero around the umbilical cord in the first week after birth; and 8.9% (95% CI 0.3%-17.5%, p=0.043) more likely to report that the mother breast fed immediately after birth. Changes in knowledge and attitudes related to these primary outcomes were also observed. We found no significant effect on various other practices. CONCLUSION: A sustained counselling intervention delivered in the home setting by community health workers can meaningfully change practices, knowledge and attitudes related to proper newborn care following birth, including professional care-seeking, umbilical cord care and breast feeding. TRIAL REGISTRATION NUMBER: NCT02694679.


Subject(s)
Health Knowledge, Attitudes, Practice , Humans , Honduras , Female , Adult , Pregnancy , Infant, Newborn , Male , Health Promotion/methods , Child , Breast Feeding , Counseling/methods , Infant , Adolescent , Child Health , Young Adult , Prenatal Care/methods , Postnatal Care/methods
16.
Nat Methods ; 7(12): 1009-15, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21057496

ABSTRACT

Through alternative splicing, most human genes express multiple isoforms that often differ in function. To infer isoform regulation from high-throughput sequencing of cDNA fragments (RNA-seq), we developed the mixture-of-isoforms (MISO) model, a statistical model that estimates expression of alternatively spliced exons and isoforms and assesses confidence in these estimates. Incorporation of mRNA fragment length distribution in paired-end RNA-seq greatly improved estimation of alternative-splicing levels. MISO also detects differentially regulated exons or isoforms. Application of MISO implicated the RNA splicing factor hnRNP H1 in the regulation of alternative cleavage and polyadenylation, a role that was supported by UV cross-linking-immunoprecipitation sequencing (CLIP-seq) analysis in human cells. Our results provide a probabilistic framework for RNA-seq analysis, give functional insights into pre-mRNA processing and yield guidelines for the optimal design of RNA-seq experiments for studies of gene and isoform expression.


Subject(s)
RNA/chemistry , Sequence Analysis, RNA/methods , Alternative Splicing , Base Sequence , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Exons/genetics , Heterogeneous-Nuclear Ribonucleoproteins/chemistry , Humans , Introns/genetics , NFATC Transcription Factors/genetics , Protein Isoforms/chemistry , Protein Isoforms/genetics , RNA/genetics , RNA, Messenger/chemistry , RNA, Messenger/genetics , Reverse Transcriptase Polymerase Chain Reaction/methods
17.
Proc Natl Acad Sci U S A ; 107(49): 20899-904, 2010 Dec 07.
Article in English | MEDLINE | ID: mdl-21078953

ABSTRACT

PNAS article classification is rooted in long-standing disciplinary divisions that do not necessarily reflect the structure of modern scientific research. We reevaluate that structure using latent pattern models from statistical machine learning, also known as mixed-membership models, that identify semantic structure in co-occurrence of words in the abstracts and references. Our findings suggest that the latent dimensionality of patterns underlying PNAS research articles in the Biological Sciences is only slightly larger than the number of categories currently in use, but it differs substantially in the content of the categories. Further, the number of articles that are listed under multiple categories is only a small fraction of what it should be. These findings together with the sensitivity analyses suggest ways to reconceptualize the organization of papers published in PNAS.


Subject(s)
Periodicals as Topic/classification , Publications/classification , Classification , Methods , National Academy of Sciences, U.S. , Statistics as Topic , United States
18.
Bioinformatics ; 27(13): i374-82, 2011 Jul 01.
Article in English | MEDLINE | ID: mdl-21685095

ABSTRACT

MOTIVATION: Proteins and protein complexes coordinate their activity to execute cellular functions. In a number of experimental settings, including synthetic genetic arrays, genetic perturbations and RNAi screens, scientists identify a small set of protein interactions of interest. A working hypothesis is often that these interactions are the observable phenotypes of some functional process, which is not directly observable. Confirmatory analysis requires finding other pairs of proteins whose interaction may be additional phenotypical evidence about the same functional process. Extant methods for finding additional protein interactions rely heavily on the information in the newly identified set of interactions. For instance, these methods leverage the attributes of the individual proteins directly, in a supervised setting, in order to find relevant protein pairs. A small set of protein interactions provides a small sample to train parameters of prediction methods, thus leading to low confidence. RESULTS: We develop RBSets, a computational approach to ranking protein interactions rooted in analogical reasoning; that is, the ability to learn and generalize relations between objects. Our approach is tailored to situations where the training set of protein interactions is small, and leverages the attributes of the individual proteins indirectly, in a Bayesian ranking setting that is perhaps closest to propensity scoring in mathematical psychology. We find that RBSets leads to good performance in identifying additional interactions starting from a small evidence set of interacting proteins, for which an underlying biological logic in terms of functional processes and signaling pathways can be established with some confidence. Our approach is scalable and can be applied to large databases with minimal computational overhead. Our results suggest that analogical reasoning within a Bayesian ranking problem is a promising new approach for real-time biological discovery. AVAILABILITY: Java code is available at: www.gatsby.ucl.ac.uk/~rbas. CONTACT: airoldi@fas.harvard.edu; kheller@mit.edu; ricardo@stats.ucl.ac.uk.


Subject(s)
Bayes Theorem , Computational Biology/methods , Proteins/metabolism , Saccharomycetales/metabolism , Signal Transduction
19.
PLoS Comput Biol ; 6(12): e1001034, 2010 Dec 16.
Article in English | MEDLINE | ID: mdl-21187909

ABSTRACT

Embryonic stem cells (ESC) have the potential to self-renew indefinitely and to differentiate into any of the three germ layers. The molecular mechanisms for self-renewal, maintenance of pluripotency and lineage specification are poorly understood, but recent results point to a key role for epigenetic mechanisms. In this study, we focus on quantifying the impact of histone 3 acetylation (H3K9,14ac) on gene expression in murine embryonic stem cells. We analyze genome-wide histone acetylation patterns and gene expression profiles measured over the first five days of cell differentiation triggered by silencing Nanog, a key transcription factor in ESC regulation. We explore the temporal and spatial dynamics of histone acetylation data and its correlation with gene expression using supervised and unsupervised statistical models. On a genome-wide scale, changes in acetylation are significantly correlated to changes in mRNA expression and, surprisingly, this coherence increases over time. We quantify the predictive power of histone acetylation for gene expression changes in a balanced cross-validation procedure. In an in-depth study we focus on genes central to the regulatory network of Mouse ESC, including those identified in a recent genome-wide RNAi screen and in the PluriNet, a computationally derived stem cell signature. We find that compared to the rest of the genome, ESC-specific genes show significantly more acetylation signal and a much stronger decrease in acetylation over time, which is often not reflected in a concordant expression change. These results shed light on the complexity of the relationship between histone acetylation and gene expression and are a step forward to dissect the multilayer regulatory mechanisms that determine stem cell fate.


Subject(s)
Acetylation , Embryonic Stem Cells/metabolism , Gene Expression Profiling/methods , Histones/metabolism , Homeodomain Proteins/genetics , Animals , Cluster Analysis , Computational Biology , Gene Expression Regulation , Gene Silencing , Genome-Wide Association Study , Histones/chemistry , Homeodomain Proteins/metabolism , Mice , Nanog Homeobox Protein , Phenotype
20.
Decis Support Syst ; 51(1): 10-20, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21647242

ABSTRACT

We live in an increasingly mobile world, which leads to the duplication of information across domains. Though organizations attempt to obscure the identities of their constituents when sharing information for worthwhile purposes, such as basic research, the uncoordinated nature of such an environment can lead to privacy vulnerabilities. For instance, disparate healthcare providers can collect information on the same patient. Federal policy requires that such providers share "de-identified" sensitive data, such as biomedical (e.g., clinical and genomic) records. But at the same time, such providers can share identified information, devoid of sensitive biomedical data, for administrative functions. On a provider-by-provider basis, the biomedical and identified records appear unrelated; however, links can be established when multiple providers' databases are studied jointly. The problem, known as trail disclosure, is a generalized phenomenon and occurs because an individual's location access pattern can be matched across the shared databases. Due to technical and legal constraints, it is often difficult to coordinate between providers and thus it is critical to assess the disclosure risk in distributed environments, so that we can develop techniques to mitigate such risks. Research on privacy protection has so far focused on developing technologies to suppress or encrypt identifiers associated with sensitive information. There is a growing body of work on the formal assessment of the disclosure risk of database entries in publicly shared databases, but less attention has been paid to the distributed setting. In this research, we review the trail disclosure problem in several domains with known vulnerabilities and show that disclosure risk is influenced by the distribution of how people visit service providers. Based on empirical evidence, we propose an entropy metric for assessing such risk in shared databases prior to their release. This metric assesses risk by leveraging the statistical characteristics of a visit distribution, as opposed to person-level data. It is computationally efficient and superior to existing risk assessment methods, which rely on ad hoc assessments that are often computationally expensive and unreliable. We evaluate our approach on a range of location access patterns in simulated environments. Our results demonstrate that the approach is effective at estimating trail disclosure risks and that the amount of self-information contained in a distributed system is one of the main driving factors.
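The abstract does not give the exact form of the proposed entropy metric, so the following is only a sketch of the general idea under a simplifying assumption: it computes the plain Shannon entropy, in bits, of an empirical visit distribution over providers. The function name and the provider labels are hypothetical.

```python
import math
from collections import Counter


def visit_entropy(visits):
    """Shannon entropy (in bits) of the empirical distribution of visits
    across service providers. Only aggregate visit counts are used, not
    person-level records."""
    counts = Counter(visits)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical visit logs: one provider label per recorded visit.
evenly_spread = ["A", "B", "A", "B"]
concentrated = ["A", "A", "A", "A"]

h_even = visit_entropy(evenly_spread)
h_conc = visit_entropy(concentrated)
```

Visits split evenly between two providers give an entropy of 1 bit, while visits concentrated at a single provider give 0 bits. How such a quantity maps onto disclosure risk is the subject of the paper and is not reproduced here.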
