1.
Mol Cell ; 65(2): 285-295, 2017 Jan 19.
Article in English | MEDLINE | ID: mdl-27989441

ABSTRACT

Eukaryotic cell division is known to be controlled by the cyclin/cyclin dependent kinase (CDK) machinery. However, eukaryotes have evolved prior to CDKs, and cells can divide in the absence of major cyclin/CDK components. We hypothesized that an autonomous metabolic oscillator provides dynamic triggers for cell-cycle initiation and progression. Using microfluidics, cell-cycle reporters, and single-cell metabolite measurements, we found that metabolism of budding yeast is a CDK-independent oscillator that oscillates across different growth conditions, both in synchrony with and also in the absence of the cell cycle. Using environmental perturbations and dynamic single-protein depletion experiments, we found that the metabolic oscillator and the cell cycle form a system of coupled oscillators, with the metabolic oscillator separately gating and maintaining synchrony with the early and late cell cycle. Establishing metabolism as a dynamic component within the cell-cycle network opens new avenues for cell-cycle research and therapeutic interventions for proliferative disorders.


Subject(s)
Cell Cycle , Cyclin-Dependent Kinases/metabolism , Energy Metabolism , Periodicity , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Adenosine Triphosphate/metabolism , Cyclin-Dependent Kinases/genetics , Genotype , Fluorescence Microscopy , Video Microscopy , Biological Models , Mutation , NADP/metabolism , Oscillometry , Phenotype , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/growth & development , Saccharomyces cerevisiae Proteins/genetics , Time Factors
2.
Bioinformatics ; 39(10)2023 10 03.
Article in English | MEDLINE | ID: mdl-37774002

ABSTRACT

MOTIVATION: Investigating cell differentiation under a genetic disorder offers the potential for improving current gene therapy strategies. Clonal tracking provides a basis for mathematical modelling of population stem cell dynamics that sustain the blood cell formation, a process known as haematopoiesis. However, many clonal tracking protocols rely on a subset of cell types for the characterization of the stem cell output, and the data generated are subject to measurement errors and noise. RESULTS: We propose a stochastic framework to infer dynamic models of cell differentiation from clonal tracking data. A state-space formulation combines a stochastic quasi-reaction network, describing cell differentiation, with a Gaussian measurement model accounting for data errors and noise. We developed an inference algorithm based on an extended Kalman filter, a nonlinear optimization, and a Rauch-Tung-Striebel smoother. Simulations show that our proposed method outperforms the state-of-the-art and scales to complex structures of cell differentiations in terms of nodes size and network depth. The application of our method to five in vivo gene therapy studies reveals different dynamics of cell differentiation. Our tool can provide statistical support to biologists and clinicians to better understand cell differentiation and haematopoietic reconstitution after a gene therapy treatment. The equations of the state-space model can be modified to infer other dynamics besides cell differentiation. AVAILABILITY AND IMPLEMENTATION: The stochastic framework is implemented in the R package Karen which is available for download at https://cran.r-project.org/package=Karen. The code that supports the findings of this study is openly available at https://github.com/delcore-luca/CellDifferentiationNetworks.


Subject(s)
Algorithms , Theoretical Models , Cell Differentiation , Hematopoiesis/genetics , Gene Regulatory Networks
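
The abstract above builds its inference on an extended Kalman filter. As a rough illustration of the predict/update cycle that such filters share, here is a minimal sketch of one step of a scalar, linear Kalman filter. It is not the Karen implementation: the function name `kalman_step` and the parameters `a` (state transition), `q` (process noise variance) and `r` (measurement noise variance) are assumptions chosen for this example, and the extended filter in the paper additionally linearizes a nonlinear model.

```python
def kalman_step(mean, var, obs, a=1.0, q=0.1, r=0.5):
    """One predict/update step of a scalar linear Kalman filter.

    Hypothetical illustration only; a, q and r are assumed model parameters.
    """
    # Predict: propagate the state estimate and its variance through the model.
    pred_mean = a * mean
    pred_var = a * a * var + q

    # Update: blend the prediction with the noisy observation.
    gain = pred_var / (pred_var + r)  # Kalman gain, between 0 and 1
    new_mean = pred_mean + gain * (obs - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


if __name__ == "__main__":
    mean, var = 0.0, 1.0
    for obs in [1.2, 0.9, 1.1]:
        mean, var = kalman_step(mean, var, obs)
        print(round(mean, 3), round(var, 3))
```

The gain weights the observation more heavily when the predicted variance is large relative to the measurement noise, which is how the Gaussian measurement model absorbs data errors.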
3.
BMC Bioinformatics ; 24(1): 228, 2023 Jun 02.
Article in English | MEDLINE | ID: mdl-37268887

ABSTRACT

BACKGROUND: Mathematical models of haematopoiesis can provide insights on abnormal cell expansions (clonal dominance), and in turn can guide safety monitoring in gene therapy clinical applications. Clonal tracking is a recent high-throughput technology that can be used to quantify cells arising from a single haematopoietic stem cell ancestor after a gene therapy treatment. Thus, clonal tracking data can be used to calibrate the stochastic differential equations describing clonal population dynamics and hierarchical relationships in vivo. RESULTS: In this work we propose a random-effects stochastic framework for investigating events of clonal dominance in high-dimensional clonal tracking data. Our framework combines stochastic reaction networks with mixed-effects generalized linear models. Starting from the Kramers-Moyal approximated Master equation, the dynamics of cell duplication, death and differentiation at the clonal level can be described by a local linear approximation. The parameters of this formulation, which are inferred using a maximum likelihood approach, are assumed to be shared across the clones and are not sufficient to describe situations in which clones exhibit heterogeneity in their fitness that can lead to clonal dominance. To overcome this limitation, we extend the base model by introducing random effects for the clonal parameters. This extended formulation is calibrated to the clonal data using a tailor-made expectation-maximization algorithm. We also provide the companion R package RestoreNet, publicly available for download at https://cran.r-project.org/package=RestoreNet. CONCLUSIONS: Simulation studies show that our proposed method outperforms the state-of-the-art. The application of our method in two in vivo studies unveils the dynamics of clonal dominance. Our tool can provide statistical support to biologists in gene therapy safety analyses.


Subject(s)
Algorithms , Theoretical Models , Likelihood Functions , Computer Simulation , Clone Cells , Stochastic Processes
4.
PLoS Comput Biol ; 17(8): e1009259, 2021 08.
Article in English | MEDLINE | ID: mdl-34383741

ABSTRACT

In this study we demonstrated through analytic considerations and numerical studies that the mitochondrial fatty-acid β-oxidation can exhibit bistable-hysteresis behavior. In an experimentally validated computational model we identified a specific region in the parameter space in which two distinct stable and one unstable steady state could be attained with different fluxes. The two stable states were referred to as low-flux (disease) and high-flux (healthy) state. By a modular kinetic approach we traced the origin and causes of the bistability back to the distributive kinetics and the conservation of CoA, in particular in the last rounds of the β-oxidation. We then extended the model to investigate various interventions that may confer health benefits by activating the pathway, including (i) activation of the last enzyme MCKAT via its endogenous regulator p46Shc protein, (ii) addition of a thioesterase (an acyl-CoA hydrolysing enzyme) as a safety valve, and (iii) concomitant activation of a number of upstream and downstream enzymes by short-chain fatty-acids (SCFA), metabolites that are produced from nutritional fibers in the gut. A high concentration of SCFAs, thioesterase activity, and inhibition of the p46Shc protein led to a disappearance of the bistability, leaving only the high-flux state. A better understanding of the switch behavior of the mitochondrial fatty-acid oxidation process between a low- and a high-flux state may lead to dietary and pharmacological intervention in the treatment or prevention of obesity and/or non-alcoholic fatty-liver disease.


Subject(s)
Fatty Acids/metabolism , Biological Models , Acetyl-CoA C-Acyltransferase/antagonists & inhibitors , Acetyl-CoA C-Acyltransferase/metabolism , Animals , Computational Biology , Computer Simulation , Enzyme Stability , Fatty Acids/chemistry , Humans , Kinetics , Metabolic Networks and Pathways , Mitochondria/metabolism , Non-alcoholic Fatty Liver Disease/etiology , Non-alcoholic Fatty Liver Disease/metabolism , Obesity/etiology , Obesity/metabolism
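
The abstract above describes a regime with two stable steady states separated by one unstable state. The following toy sketch shows what that means numerically. It is not the β-oxidation model from the paper: the cubic rate law, the function names and the `threshold` parameter are all assumptions made purely for illustration. The system has stable states at 0 (a "low" state) and 1 (a "high" state) and an unstable state at `threshold`, so the initial condition decides which state is reached.

```python
def rate(x, threshold=0.3):
    """Toy rate law dx/dt with steady states at 0, threshold and 1 (assumed form)."""
    return x * (1.0 - x) * (x - threshold)


def integrate(x0, dt=0.01, steps=5000, threshold=0.3):
    """Forward-Euler integration of the toy model starting from x0."""
    x = x0
    for _ in range(steps):
        x += dt * rate(x, threshold)
    return x


if __name__ == "__main__":
    # Starting just below or just above the unstable state gives different outcomes.
    print(integrate(0.2))  # settles near the low state
    print(integrate(0.4))  # settles near the high state
```

In a model like the one in the abstract, interventions that remove the bistability would correspond to parameter changes under which only one of the stable states survives.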
5.
Biostatistics ; 21(2): e131-e147, 2020 04 01.
Article in English | MEDLINE | ID: mdl-30380025

ABSTRACT

Clinical studies where patients are routinely screened for many genomic features are becoming more routine. In principle, this holds the promise of being able to find genomic signatures for a particular disease. In particular, cancer survival is thought to be closely linked to the genomic constitution of the tumor. Discovering such signatures will be useful in the diagnosis of the patient, may be used for treatment decisions and, perhaps, even the development of new treatments. However, genomic data are typically noisy and high-dimensional, not rarely outstripping the number of patients included in the study. Regularized survival models have been proposed to deal with such scenarios. These methods typically induce sparsity by means of a coincidental match of the geometry of the convex likelihood and a (near) non-convex regularizer. The disadvantages of such methods are that they are typically non-invariant to scale changes of the covariates, they struggle with highly correlated covariates, and they have a practical problem of determining the amount of regularization. In this article, we propose an extension of the differential geometric least angle regression method for sparse inference in relative risk regression models. A software implementation of our method is available on github (https://github.com/LuigiAugugliaro/dgcox).


Subject(s)
Biostatistics/methods , Statistical Models , Risk Assessment/methods , Survival Analysis , Computer Simulation , Humans , Neoplasms/genetics , Neoplasms/mortality , Regression Analysis
6.
Bioinformatics ; 35(7): 1083-1093, 2019 04 01.
Article in English | MEDLINE | ID: mdl-30184062

ABSTRACT

MOTIVATION: Linkage maps are used to identify the location of genes responsible for traits and diseases. New sequencing techniques have created opportunities to substantially increase the density of genetic markers. Such revolutionary advances in technology have given rise to new challenges, such as creating high-density linkage maps. Current multiple testing approaches based on pairwise recombination fractions are underpowered in the high-dimensional setting and do not extend easily to polyploid species. To remedy these issues, we propose to construct linkage maps using graphical models either via a sparse Gaussian copula or a non-paranormal skeptic approach. RESULTS: We determine linkage groups, typically chromosomes, and the order of markers in each linkage group by inferring the conditional independence relationships among large numbers of markers in the genome. Through simulations, we illustrate the utility of our map construction method and compare its performance with other available methods, both when the data are clean and contain no missing observations and when data contain genotyping errors. Our comprehensive map construction method makes full use of the dosage SNP data to reconstruct linkage map for any bi-parental diploid and polyploid species. We apply the proposed method to three genotype datasets: barley, peanut and potato from diploid and polyploid populations. AVAILABILITY AND IMPLEMENTATION: The method is implemented in the R package netgwas which is freely available at https://cran.r-project.org/web/packages/netgwas. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Single Nucleotide Polymorphism , Polyploidy , Chromosome Mapping , Genetic Linkage , Genotype
7.
J Pediatr Gastroenterol Nutr ; 69(1): 131-136, 2019 07.
Article in English | MEDLINE | ID: mdl-31058782

ABSTRACT

OBJECTIVE: Antibiotic treatment in early life appears to increase the risk for childhood overweight and obesity. So far, the association between antibiotics administered specifically during the first week of life and growth has not been studied. Therefore, we studied the association between growth and antibiotics given in the first week of life, as well as antibiotic courses later in the first year of life. METHOD: A prospective observational birth cohort of 436 term infants, with 151 receiving broad-spectrum antibiotics for suspected neonatal infection (AB+) and 285 healthy controls (AB-), was followed during their first year. Weight, height, and additional antibiotic courses were recorded monthly. A generalized-additive-mixed-effects model was used to fit the growth data. Growth curve estimation was controlled for differences in sex, gestational age, delivery mode, exclusive breast-feeding, tobacco exposure, presence of siblings, and additional antibiotic courses. RESULTS: Weight-for-age and length-for-age increase was lower in AB+ compared with AB- (P < 0.0001), resulting in a lower weight and length increase of 6.26 kg (standard error [SE] 0.07 kg) and 25.4 cm (SE 0.27 cm) versus 6.47 kg (SE 0.06 kg) and 26.4 cm (SE 0.21 cm) (P < 0.05 and P < 0.005, respectively) in the first year of life. Approximately 30% of the children in both groups received additional antibiotic course(s) in their first year, after which an additional weight gain of 76 g per course was observed (P = 0.0285). CONCLUSIONS: Decreased growth was observed after antibiotics in the first week of life, whereas increased growth was observed after later antibiotic course(s) in term-born infants in the first year of life. Therefore, timing of antibiotics may determine the association with growth.


Subject(s)
Anti-Bacterial Agents/administration & dosage , Body Height/drug effects , Body Weight/drug effects , Growth/drug effects , Anti-Bacterial Agents/adverse effects , Anti-Bacterial Agents/pharmacology , Case-Control Studies , Female , Humans , Infant , Newborn Infant , Male , Pediatric Obesity/etiology , Prospective Studies
8.
Twin Res Hum Genet ; 22(1): 4-13, 2019 02.
Article in English | MEDLINE | ID: mdl-30944055

ABSTRACT

Large multigenerational cohort studies offer powerful ways to study the hereditary effects on various health outcomes. However, accounting for complex kinship relations in big data structures can be methodologically challenging. The traditional kinship model is computationally infeasible when considering thousands of individuals. In this article, we propose a computationally efficient alternative that employs fractional relatedness of family members through a series of founding members. The primary goal of this study is to investigate whether the effect of determinants on health outcome variables differs with and without accounting for family structure. We compare a fixed-effects model without familial effects with several variance components models that account for heritability and shared environment structure. Our secondary goal is to apply the fractional relatedness model in a realistic setting. Lifelines is a three-generation cohort study investigating the biological, behavioral, and environmental determinants of healthy aging. We analyzed a sample of 89,353 participants from 32,452 reconstructed families. Our primary conclusion is that the effect of determinants on health outcome variables does not differ with and without accounting for family structure. However, accounting for family structure through fractional relatedness allows for estimating heritability in a computationally efficient way, showing some interesting differences between physical and mental quality of life heritability. We have shown through simulations that the proposed fractional relatedness model performs better than the standard kinship model, not only in terms of computational time and convenience of fitting using standard functions in R, but also in terms of bias of heritability estimates and coverage.


Subject(s)
Aging/genetics , Big Data , Genetic Databases , Family , Gene-Environment Interaction , Genetic Models , Female , Humans , Male
9.
Stat Appl Genet Mol Biol ; 15(3): 193-212, 2016 06 01.
Article in English | MEDLINE | ID: mdl-27023322

ABSTRACT

Factorial Gaussian graphical Models (fGGMs) have recently been proposed for inferring dynamic gene regulatory networks from genomic high-throughput data. In the search for true regulatory relationships amongst the vast space of possible networks, these models allow the imposition of certain restrictions on the dynamic nature of these relationships, such as Markov dependencies of low order - some entries of the precision matrix are a priori zeros - or equal dependency strengths across time lags - some entries of the precision matrix are assumed to be equal. The precision matrix is then estimated by l1-penalized maximum likelihood, imposing a further constraint on the absolute value of its entries, which results in sparse networks. Selecting the optimal sparsity level is a major challenge for this type of approaches. In this paper, we evaluate the performance of a number of model selection criteria for fGGMs by means of two simulated regulatory networks from realistic biological processes. The analysis reveals a good performance of fGGMs in comparison with other methods for inferring dynamic networks and of the KLCV criterion in particular for model selection. Finally, we present an application on a high-resolution time-course microarray data from the Neisseria meningitidis bacterium, a causative agent of life-threatening infections such as meningitis. The methodology described in this paper is implemented in the R package sglasso, freely available at CRAN, http://CRAN.R-project.org/package=sglasso.


Subject(s)
Gene Regulatory Networks , Genetic Models , Algorithms , Computer Simulation , Neisseria/genetics , Normal Distribution , Probability
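
For readers unfamiliar with the l1-penalized maximum likelihood estimator mentioned in the abstract above, its standard unstructured form (the graphical lasso) can be written as below. The notation is an assumption introduced here for illustration: S is the sample covariance matrix, Θ = (θ_ij) is the precision matrix, and ρ ≥ 0 is the tuning parameter that sets the sparsity level. Some formulations also penalize the diagonal entries.

```latex
\hat{\Theta}
  = \arg\max_{\Theta \succ 0}
    \left\{ \log\det\Theta - \operatorname{tr}(S\,\Theta)
            - \rho \sum_{i \neq j} \lvert \theta_{ij} \rvert \right\}
```

In the factorial setting described in the abstract, this maximization would additionally be restricted by the structural constraints it names: some entries θ_ij fixed at zero a priori, and some groups of entries constrained to be equal. Larger values of ρ set more entries exactly to zero, which is why choosing ρ amounts to model selection.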
10.
Proc Natl Acad Sci U S A ; 111(32): 11727-31, 2014 Aug 12.
Article in English | MEDLINE | ID: mdl-25071164

ABSTRACT

Calorie restriction (CR) is often described as the most robust manner to extend lifespan in a large variety of organisms. Hence, considerable research effort is directed toward understanding the mechanisms underlying CR, especially in the yeast Saccharomyces cerevisiae. However, the effect of CR on lifespan has never been systematically reviewed in this organism. Here, we performed a meta-analysis of replicative lifespan (RLS) data published in more than 40 different papers. Our analysis revealed that there is significant variation in the reported RLS data, which appears to be mainly due to the low number of cells analyzed per experiment. Furthermore, we found that the RLS measured at 2% (wt/vol) glucose in CR experiments is partly biased toward shorter lifespans compared with identical lifespan measurements from other studies. Excluding the 2% (wt/vol) glucose experiments from CR experiments, we determined that the average RLS of the yeast strains BY4741 and BY4742 is 25.9 buds at 2% (wt/vol) glucose and 30.2 buds under CR conditions. RLS measurements with a microfluidic dissection platform produced identical RLS data at 2% (wt/vol) glucose. However, CR conditions did not induce lifespan extension. As we excluded obvious methodological differences, such as temperature and medium, as causes, we conclude that subtle method-specific factors are crucial to induce lifespan extension under CR conditions in S. cerevisiae.


Subject(s)
Saccharomyces cerevisiae/physiology , Animals , Caloric Restriction , Culture Media , Glucose/metabolism , Longevity/physiology , Microfluidic Analytical Techniques , Biological Models , Species Specificity , Time Factors
11.
Biom J ; 59(6): 1301-1316, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28664629

ABSTRACT

Model-based clustering is a technique widely used to group a collection of units into mutually exclusive groups. There are, however, situations in which an observation could in principle belong to more than one cluster. In the context of next-generation sequencing (NGS) experiments, for example, the signal observed in the data might be produced by two (or more) different biological processes operating together and a gene could participate in both (or all) of them. We propose a novel approach to cluster NGS discrete data, coming from a ChIP-Seq experiment, with a mixture model, allowing each unit to belong potentially to more than one group: these multiple allocation clusters can be flexibly defined via a function combining the features of the original groups without introducing new parameters. The formulation naturally gives rise to a 'zero-inflation group' in which values close to zero can be allocated, acting as a correction for the abundance of zeros that manifest in this type of data. We take into account the spatial dependency between observations, which is described through a latent conditional autoregressive process that can reflect different dependency patterns. We assess the performance of our model within a simulation environment and then we apply it to ChIP-seq real data.


Subject(s)
Chromatin Immunoprecipitation , High-Throughput Nucleotide Sequencing , Statistical Models , DNA Sequence Analysis , Cluster Analysis , E1A-Associated p300 Protein/genetics , Humans
12.
BMC Bioinformatics ; 17(1): 352, 2016 Sep 05.
Article in English | MEDLINE | ID: mdl-27597310

ABSTRACT

BACKGROUND: Network enrichment analysis is a powerful method that integrates gene enrichment analysis with the information on relationships between genes that is provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks, can be computationally slow, and are based on normality assumptions. RESULTS: We propose NEAT, a test for network enrichment analysis. The test is based on the hypergeometric distribution, which naturally arises as the null distribution in this context. NEAT can be applied not only to undirected, but to directed and partially directed networks as well. Our simulations indicate that NEAT is considerably faster than alternative resampling-based methods, and that its capacity to detect enrichments is at least as good as that of alternative tests. We discuss applications of NEAT to network analyses in yeast by testing for enrichment of the Environmental Stress Response target gene set with GO Slim and KEGG functional gene sets, and also by inspecting associations between functional sets themselves. CONCLUSIONS: NEAT is a flexible and efficient test for network enrichment analysis that aims to overcome some limitations of existing resampling-based tests. The method is implemented in the R package neat, which can be freely downloaded from CRAN ( https://cran.r-project.org/package=neat ).


Subject(s)
Gene Regulatory Networks , Saccharomyces cerevisiae/genetics , Software , Computer Simulation , Gene Expression Profiling/methods , Gene Ontology , Fungal Genes , Physiological Stress/genetics
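
The abstract above rests on the hypergeometric distribution as a null model. As a generic illustration of how an enrichment p-value is obtained from that distribution, here is a minimal sketch using only the Python standard library. It is not the `neat` package: the function names are hypothetical, and the parameters are the textbook ones (a population of `N` items of which `K` are marked, with `n` drawn without replacement). How NEAT maps network quantities onto these parameters is not stated in the abstract and is not assumed here.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k) when n items are drawn without replacement from N, K of them marked."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_pvalue(k, N, K, n):
    """Upper-tail probability P(X >= k): chance of seeing at least k marked items."""
    return sum(hypergeom_pmf(x, N, K, n) for x in range(k, min(n, K) + 1))


if __name__ == "__main__":
    # 10 items, 3 marked, 4 drawn: how surprising is it to draw at least 2 marked?
    print(enrichment_pvalue(2, N=10, K=3, n=4))
```

Because the tail probability is a finite sum of closed-form terms, no resampling is needed, which is consistent with the speed advantage the abstract reports over resampling-based tests.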
13.
BMC Bioinformatics ; 17: 254, 2016 Jun 24.
Article in English | MEDLINE | ID: mdl-27342572

ABSTRACT

BACKGROUND: Sparse Gaussian graphical models are popular for inferring biological networks, such as gene regulatory networks. In this paper, we investigate the consistency of these models across different data platforms, such as microarray and next generation sequencing, on the basis of a rich dataset containing samples that are profiled under both techniques as well as a large set of independent samples. RESULTS: Our analysis shows that individual node variances can have a remarkable effect on the connectivity of the resulting network. Their inconsistency across platforms, and the fact that the variability level of a node may not be linked to its regulatory role, mean that failing to scale the data prior to the network analysis leads to networks that are not reproducible across different platforms and that may be misleading. Moreover, we show that the reproducibility of networks across different platforms is significantly higher if networks are summarised in terms of enrichment amongst functional groups of interest, such as pathways, rather than at the level of individual edges. CONCLUSIONS: Careful pre-processing of transcriptional data and summaries of networks beyond individual edges can improve the consistency of network inference across platforms. However, caution is needed at this stage in the (over)interpretation of gene regulatory networks inferred from biological data.


Subject(s)
Gene Regulatory Networks , High-Throughput Nucleotide Sequencing , Oligonucleotide Array Sequence Analysis , RNA Sequence Analysis , Anxiety/genetics , Depression/genetics , Female , Humans , Male , Reproducibility of Results , Twin Studies as Topic
14.
BMC Bioinformatics ; 16 Suppl 6: S5, 2015.
Article in English | MEDLINE | ID: mdl-25917062

ABSTRACT

Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting these random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has focused on static networks, most time-course experiments are designed to tease out temporal changes in the underlying network. It is typically reasonable to assume that changes in genomic networks are few, because biological systems tend to be stable. We introduce a new model for estimating slow changes in dynamic gene-regulatory networks, which is suitable for high-dimensional data, e.g. time-course microarray data. Our aim is to estimate a dynamically changing genomic network based on temporal activity measurements of the genes in the network. Our method is based on a penalized likelihood with an l1-norm penalty that penalizes conditional dependencies between genes as well as differences between conditional independence elements across time points. We also present a heuristic search strategy to find optimal tuning parameters. We rewrite the penalized maximum likelihood problem as a standard convex optimization problem subject to linear equality constraints. We show that our method performs well in simulation studies. Finally, we apply the proposed model to a time-course T-cell dataset.


Subject(s)
Algorithms , Gene Regulatory Networks , Statistical Models , T-Lymphocytes/metabolism , Computer Simulation , Humans , Lymphocyte Activation , Microarray Analysis
15.
BMC Med Res Methodol ; 15: 88, 2015 Oct 15.
Article in English | MEDLINE | ID: mdl-26471992

ABSTRACT

BACKGROUND: Heterogeneity of psychopathological concepts such as depression hampers progress in research and clinical practice. Latent Variable Models (LVMs) have been widely used to reduce this problem by identification of more homogeneous factors or subgroups. However, heterogeneity exists at multiple levels (persons, symptoms, time) and LVMs cannot capture all these levels and their interactions simultaneously, which leads to incomplete models. Our objective is to briefly review the most widely used LVMs in depression research, illustrating their use and incompatibility in real data, and to consider an alternative, statistical approach, namely multimode principal component analysis (MPCA). METHODS: We applied LVMs to data from 147 patients, who filled out the Quick Inventory of Depressive Symptomatology (QIDS) at 9 time points. Compatibility of the results and suitability of the LVMs to capture the heterogeneity of the data were evaluated. Alternatively, MPCA was used to simultaneously decompose depression on the person-, symptom- and time-level and to investigate the interactions between these levels. RESULTS: QIDS-data could be decomposed on the person-level (2 classes), symptom-level (2 factors) and time-level (2 trajectory-classes). However, these results could not be integrated into a single model. Instead, MPCA allowed for decomposition of the data at the person- (3 components), symptom- (2 components) and time-level (2 components) and for the investigation of these components' interactions. CONCLUSIONS: Traditional LVMs have limited use when trying to define an integrated model of depression heterogeneity at the person, symptom and time level. More integrative statistical techniques such as MPCA can be used to address these relatively complex data patterns and could be used in future attempts to identify empirically-based subtypes/phenotypes of depression.


Subject(s)
Depression/psychology , Major Depressive Disorder/psychology , Personality Assessment , Principal Component Analysis/methods , Adult , Depression/diagnosis , Major Depressive Disorder/diagnosis , Female , Humans , Male , Statistical Models , Severity of Illness Index , Surveys and Questionnaires
16.
Psychometrika ; 89(1): 151-171, 2024 03.
Article in English | MEDLINE | ID: mdl-38446394

ABSTRACT

Temporal network data is often encoded as time-stamped interaction events between senders and receivers, such as co-authoring scientific articles or communication via email. A number of relational event frameworks have been proposed to address specific issues raised by complex temporal dependencies. These models attempt to quantify how individual behaviour, endogenous and exogenous factors, as well as interactions with other individuals modify the network dynamics over time. It is often of interest to determine whether changes in the network can be attributed to endogenous mechanisms reflecting natural relational tendencies, such as reciprocity or triadic effects. The propensity to form or receive ties can also, at least partially, be related to actor attributes. Nodal heterogeneity in the network is often modelled by including actor-specific or dyadic covariates. However, comprehensively capturing all personality traits is difficult in practice, if not impossible. A failure to account for heterogeneity may confound the substantive effect of key variables of interest. This work shows that failing to account for node-level sender and receiver effects can induce ghost triadic effects. We propose a random-effect extension of the relational event model to deal with these problems. We show that it is often more effective than more traditional approaches, such as in-degree and out-degree statistics. These results show that the violation of the hierarchy principle due to insufficient information about nodal heterogeneity can be resolved by including random effects in the relational event model as a standard.


Subject(s)
Interpersonal Relations , Humans , Psychometrics , Statistical Models
17.
Bioinformatics ; 28(15): 1980-9, 2012 Aug 01.
Article in English | MEDLINE | ID: mdl-22668791

ABSTRACT

MOTIVATION: Cancer biology is a field where the complexity of the phenomena battles against the availability of data. Often only a few observations per signal source, i.e. genes, are available. Such scenarios are becoming increasingly more relevant as modern sensing technologies generally have no trouble in measuring lots of channels, but where the number of subjects, such as patients or samples, is limited. In statistics, this problem falls under the heading 'large p, small n'. Moreover, in such situations the use of asymptotic analytical results should generally be mistrusted. RESULTS: We consider two cancer datasets, with the aim to mine the activity of functional groups of genes. We propose a hierarchical model with two layers in which the individual signals share a common variance component. A likelihood ratio test is defined for the difference between two collections of corresponding signals. The small number of observations requires a careful consideration of the bias of the statistic, which is corrected through an explicit Bartlett correction. The test is validated on Monte Carlo simulations, which show improved detection of differences compared with other methods. In a leukaemia study and a cancerous fibroblast cell line, we find that the method also works better in practice, i.e. it gives a richer picture of the underlying biology. AVAILABILITY: The MATLAB code is available from the authors or on http://www.math.rug.nl/stat/Software. CONTACT: e.c.wit@rug.nl d.bakewell@liv.ac.uk.


Subject(s)
Computational Biology/methods , Neoplasms/genetics , Software , Cell Line, Tumor , Computer Simulation , Data Mining , Humans , Leukemia/genetics , Likelihood Functions , Models, Statistical , Monte Carlo Method
18.
BMC Genet ; 14: 125, 2013 Dec 30.
Article in English | MEDLINE | ID: mdl-24378210

ABSTRACT

BACKGROUND: Many QTL studies have two common features: (1) marker information is often missing, and (2) among the many markers involved in the biological process only a few are causal. In statistics, the second issue falls under the headings "sparsity" and "causal inference". The goal of this work is to develop a two-step statistical methodology for QTL mapping for markers with binary genotypes. The first step introduces a novel imputation method for missing genotypes. The outcomes of the proposed imputation method are probabilities, which serve as weights in the second step, a weighted lasso. Sparse phenotype inference is employed to select a set of predictive markers for the trait of interest. RESULTS: Simulation studies validate the proposed methodology under a wide range of realistic settings. Furthermore, the methodology outperforms alternative imputation and variable selection methods in such studies. The methodology was applied to an Arabidopsis experiment containing 69 markers for 165 recombinant inbred lines of an F8 generation. The results confirm previously identified regions; however, several new markers are also found. On the basis of the inferred ROC behavior, these markers show good potential for being real, especially for the germination trait Gmax. CONCLUSIONS: Our imputation method shows higher accuracy in terms of sensitivity and specificity compared to alternative imputation methods. Also, the proposed weighted lasso outperforms commonly practiced multiple regression as well as the traditional lasso and the adaptive lasso with three weighting schemes. This means that under realistic missing data settings this methodology can be used for QTL identification.


Subject(s)
Models, Genetic , Quantitative Trait Loci , Arabidopsis/genetics , Arabidopsis/growth & development , Area Under Curve , Chromosomes/chemistry , Chromosomes/metabolism , Chromosomes, Plant/chemistry , Chromosomes, Plant/metabolism , Genetic Markers , Genotype , Germination/genetics , Models, Statistical , Phenotype , Probability , ROC Curve
19.
PLoS One ; 18(3): e0283247, 2023.
Article in English | MEDLINE | ID: mdl-36940211

ABSTRACT

The citation network of patents citing prior art arises from the legal obligation of patent applicants to properly disclose their invention. One way to study the relationship between current patents and their antecedents is by analyzing the similarity between the textual elements of patents. Many patent similarity indicators have shown a constant decrease since the mid-70s. Although several explanations have been proposed, more comprehensive analyses of this phenomenon have been rare. In this paper, we use a computationally efficient measure of patent similarity scores that leverages state-of-the-art Natural Language Processing tools, to investigate potential drivers of this apparent similarity decrease. This is achieved by modeling patent similarity scores by means of generalized additive models. We found that non-linear modeling specifications are able to distinguish between distinct, temporally varying drivers of the patent similarity levels that explain more variation in the data (R² ∼ 18%) compared to previous methods. Moreover, the model reveals an underlying trend in similarity scores that is fundamentally different from the one presented previously.


Subject(s)
Inventions , Natural Language Processing
20.
Schizophr Res ; 239: 95-102, 2022 01.
Article in English | MEDLINE | ID: mdl-34871996

ABSTRACT

The clinical staging model distinguishes different stages of mental illness. Early stages are suggested to be milder, more diffuse and more volatile in terms of expression of psychopathology than later stages. This study aimed to compare individual transdiagnostic symptom networks based on intensive longitudinal data between individuals in different early clinical stages for psychosis. It was hypothesized that with increasing clinical stage (i) density of symptom networks would increase and (ii) psychotic experiences would be more central in the symptom networks. Data came from a 90-day diary study, resulting in 8640 observations within N = 96 individuals, divided over four subgroups representing different early clinical stages (n1 = 25, n2 = 27, n3 = 24, n4 = 20). Sparse Time Series Chain Graphical Models were used to create individual contemporaneous and temporal symptom networks based on 10 items concerning symptoms of depression, anxiety, psychosis, non-specific and vulnerability domains. Network density and symptom centrality (strength) were calculated individually and compared between and within the four subgroups. Level of psychopathology increased with clinical stage. The symptom networks showed large between-individual variation, but neither network density nor psychotic symptom strength differed between the subgroups in the contemporaneous (p_density = 0.59, p_strength > 0.51) and temporal (p_density = 0.75, p_strength > 0.35) networks. No support was found for our hypothesis that higher clinical stage comes with higher symptom network density or a more central role for psychotic symptoms. Based on the high inter-individual variability, our results highlight the importance of individualized assessment of symptom networks.


Subject(s)
Psychotic Disorders , Anxiety , Anxiety Disorders , Humans , Psychopathology , Psychotic Disorders/diagnosis