
A New Regression Model for the Analysis of Overdispersed and Zero-Modified Count Data.

Bertoli, Wesley; Conceição, Katiane S; Andrade, Marinho G; Louzada, Francisco.

Entropy (Basel) ; 23(6), 2021 May 21.

Article in English | MEDLINE | ID: mdl-34064281

ABSTRACT

Count datasets are traditionally analyzed using the ordinary Poisson distribution. However, this model has limited applicability, as it can be too restrictive for handling specific data structures. Hence, the need arises for alternative models that accommodate, for example, overdispersion and zero modification (inflation or deflation in the frequency of zeros). In practical terms, these are the most prevalent structures governing discrete phenomena nowadays. This paper's primary goal was therefore to jointly address these issues by deriving a fixed-effects regression model based on the hurdle version of the Poisson-Sujatha distribution. In this framework, the zero modification is incorporated by considering that a binary probability model determines which outcomes are zero-valued, and a zero-truncated process is responsible for generating positive observations. Posterior inferences for the model parameters were obtained from a fully Bayesian approach based on the g-prior method. Intensive Monte Carlo simulation studies were performed to assess the empirical properties of the Bayesian estimators, and the results are discussed. The proposed model was applied to a real dataset, and its competitiveness against some well-established fixed-effects models for count data was evaluated. A sensitivity analysis to detect observations that may impact parameter estimates was performed based on standard divergence measures. The Bayesian p-value and the randomized quantile residuals were used for model validation.
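The hurdle construction described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: an ordinary Poisson pmf stands in for the Poisson-Sujatha base distribution (whose pmf is not given in the abstract), and the names `hurdle_pmf`, `p_zero`, and `base_pmf` are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson pmf, used here only as a stand-in base distribution."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def hurdle_pmf(k, p_zero, base_pmf):
    """Hurdle pmf: a binary model sets P(Y = 0) = p_zero, and positive
    counts follow the zero-truncated version of the base distribution."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * base_pmf(k) / (1.0 - base_pmf(0))


def base(k):
    # Assumed base distribution: Poisson with rate 2 (illustrative value).
    return poisson_pmf(k, 2.0)


# p_zero above base(0) gives zero inflation; below base(0), zero deflation.
probs_inflated = [hurdle_pmf(k, 0.40, base) for k in range(60)]
probs_deflated = [hurdle_pmf(k, 0.05, base) for k in range(60)]
```

Because the zero probability is a free parameter, the same construction covers both inflation and deflation relative to the base distribution, which is the sense of "zero modification" above.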

A discrete analog of Gumbel distribution: properties, parameter estimation and applications.

Chakraborty, Subrata; Chakravarty, Dhrubajyoti; Mazucheli, Josmar; Bertoli, Wesley.

J Appl Stat ; 48(4): 712-737, 2021.

Article in English | MEDLINE | ID: mdl-35706987

ABSTRACT

A discrete version of the Gumbel distribution (Type-I Extreme Value distribution) has been derived by using the general approach of discretization of a continuous distribution. Important distributional and reliability properties have been explored. It has been shown that, depending on the choice of parameters, the proposed distribution can be positively or negatively skewed and can possess long tail(s). Log-concavity of the distribution and consequent results have been established. Estimation of parameters by the method of maximum likelihood, the method of moments, and the method of proportions has been discussed. A method of checking model adequacy and regression-type estimation based on the empirical survival function has also been examined. A simulation study has been carried out to compare and check the efficacy of the three methods of estimation. The distribution has been applied to model three real count data sets from diverse application areas, namely survival times in days, maximum annual flood data from Brazil, and goal differences in the English Premier League, and the results show the relevance of the proposed distribution.
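As a rough illustration of discretizing a continuous distribution, the sketch below assigns to each integer y the Gumbel probability of the interval [y, y + 1). This interval convention and the names `gumbel_cdf`, `discrete_gumbel_pmf`, `mu`, and `sigma` are assumptions made for the example; the abstract does not state the exact parameterization used in the paper.

```python
import math


def gumbel_cdf(x, mu, sigma):
    """CDF of the continuous Gumbel (Type-I Extreme Value) distribution."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def discrete_gumbel_pmf(y, mu, sigma):
    """P(Y = y) = F(y + 1) - F(y): the mass the continuous law puts on [y, y + 1).
    Assumed convention for this sketch, not necessarily the paper's."""
    return gumbel_cdf(y + 1, mu, sigma) - gumbel_cdf(y, mu, sigma)


# The support is all integers; a finite window captures essentially all mass
# for these illustrative parameter values.
support = range(-50, 200)
pmf = [discrete_gumbel_pmf(y, 0.0, 2.0) for y in support]
total = sum(pmf)
```

Since the support includes negative integers, a distribution of this kind can model quantities such as goal differences as well as nonnegative counts.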

A new mixed-effects regression model for the analysis of zero-modified hierarchical count data.

Bertoli, Wesley; Conceição, Katiane S; Andrade, Marinho G; Louzada, Francisco.

Biom J ; 63(1): 81-104, 2021 01.

Article in English | MEDLINE | ID: mdl-33073871

ABSTRACT

Count data sets are traditionally analyzed using the ordinary Poisson distribution. However, this model has limited applicability, as it can be too restrictive for handling specific data structures. Hence, the need arises for alternative models that accommodate, for example, (a) zero-modification (inflation or deflation in the frequency of zeros), (b) overdispersion, and (c) individual heterogeneity arising from clustering or repeated (correlated) measurements made on the same subject. Cases (a)-(b) and (b)-(c) are often treated together in the statistical literature with several practical applications, but models supporting all three at once are less common. This paper's primary goal was therefore to jointly address these issues by deriving a mixed-effects regression model based on the hurdle version of the Poisson-Lindley distribution. In this framework, the zero-modification is incorporated by assuming that a binary probability model determines which outcomes are zero-valued, and a zero-truncated process is responsible for generating positive observations. Approximate posterior inferences for the model parameters were obtained from a fully Bayesian approach based on the Adaptive Metropolis algorithm. Intensive Monte Carlo simulation studies were performed to assess the empirical properties of the Bayesian estimators. The proposed model was applied to a real data set, and its competitiveness against some well-established mixed-effects models for count data was evaluated. A sensitivity analysis to detect observations that may impact parameter estimates was performed based on standard divergence measures. The Bayesian p-value and the randomized quantile residuals were considered for model diagnostics.

Subject(s)

Models, Statistical; Bayes Theorem; Cluster Analysis; Computer Simulation; Monte Carlo Method; Poisson Distribution
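To make the overdispersion of the Poisson-Lindley base distribution in the preceding abstract concrete, the sketch below evaluates its pmf in the standard one-parameter form, P(X = k) = θ²(k + θ + 2)/(θ + 1)^(k + 3), and compares mean and variance numerically. The regression and mixed-effects structure of the paper is not reproduced; `theta = 1.0` and the truncation point of the sums are illustrative assumptions.

```python
def poisson_lindley_pmf(k, theta):
    """Standard one-parameter Poisson-Lindley pmf on k = 0, 1, 2, ..."""
    return theta**2 * (k + theta + 2) / (theta + 1) ** (k + 3)


def zero_truncated_pmf(k, theta):
    """Zero-truncated version (k >= 1), the positive part of a hurdle model."""
    return poisson_lindley_pmf(k, theta) / (1.0 - poisson_lindley_pmf(0, theta))


theta = 1.0          # illustrative value, not taken from the paper
ks = range(400)      # the tail beyond this point is negligible for theta = 1

total = sum(poisson_lindley_pmf(k, theta) for k in ks)
mean = sum(k * poisson_lindley_pmf(k, theta) for k in ks)
var = sum((k - mean) ** 2 * poisson_lindley_pmf(k, theta) for k in ks)
zt_total = sum(zero_truncated_pmf(k, theta) for k in range(1, 400))
```

For this value of `theta` the variance exceeds the mean, unlike the ordinary Poisson distribution, where the two are equal.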

Whole transcriptomic network analysis using Co-expression Differential Network Analysis (CoDiNA).

Morselli Gysi, Deisy; de Miranda Fragoso, Tiago; Zebardast, Fatemeh; Bertoli, Wesley; Busskamp, Volker; Almaas, Eivind; Nowick, Katja.

PLoS One ; 15(10): e0240523, 2020.

Article in English | MEDLINE | ID: mdl-33057419

ABSTRACT

Biological and medical sciences are increasingly acknowledging the significance of gene co-expression networks for investigating complex systems, phenotypes, or diseases. Typically, complex phenotypes are investigated under varying conditions. While approaches for comparing nodes and links in two networks exist, almost no methods for the comparison of multiple networks are available and, to the best of our knowledge, no comparative method allows for whole transcriptomic network analysis. However, it is the aim of many studies to compare networks of different conditions, for example, tissues, diseases, treatments, time points, or species. Here we present a method for the systematic comparison of an unlimited number of networks, with an unlimited number of transcripts: Co-expression Differential Network Analysis (CoDiNA). In particular, CoDiNA detects links and nodes that are common, specific, or different among the networks. We developed a statistical framework to normalize between these different categories of common or changed network links and nodes, resulting in a comprehensive network analysis method, more sophisticated than simply comparing the presence or absence of network nodes. Applying CoDiNA to a neurogenesis study, we identified candidate genes involved in neuronal differentiation. We experimentally validated one candidate, demonstrating that its overexpression resulted in a significant disturbance in the underlying gene regulatory network of neurogenesis. Using clinical studies, we compared whole transcriptome co-expression networks from individuals with or without HIV and active tuberculosis (TB) and detected signature genes specific to HIV. Furthermore, analyzing multiple cancer transcription factor (TF) networks, we identified common and distinct features for particular cancer types. These CoDiNA applications demonstrate the successful detection of genes associated with specific phenotypes. Moreover, CoDiNA can also be used for comparing other types of undirected networks, for example, metabolic, protein-protein interaction, ecological, and psychometric networks. CoDiNA is publicly available as an R package on CRAN (https://CRAN.R-project.org/package=CoDiNA).

Subject(s)

Gene Expression Regulation; Gene Regulatory Networks; HIV Infections/genetics; Neoplasms/genetics; Neurons/metabolism; Software; Transcriptome; Algorithms; HIV/isolation & purification; HIV Infections/virology; Humans; Neurogenesis; Neurons/cytology; Phenotype
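The idea of sorting links into common, specific, or different across several networks can be illustrated with a toy rule. This is not the CoDiNA package's algorithm or API (the package is written in R and includes a normalization step not shown here); the function `classify_link`, the sign encoding, and the gene names are all hypothetical.

```python
def classify_link(signs):
    """Classify one link from its sign in each network.

    signs: sequence of -1, 0, or +1 (negative, absent, or positive
    co-expression) with one entry per network. Assumed toy rule:
      - "common":    present in every network with the same sign
      - "different": present in every network, but signs disagree
      - "specific":  present in some networks and absent in others
    """
    present = [s for s in signs if s != 0]
    if not present:
        return "absent"
    if len(present) < len(signs):
        return "specific"
    if len(set(present)) == 1:
        return "common"
    return "different"


# Hypothetical links observed in three networks (e.g., three conditions).
links = {
    ("geneA", "geneB"): (1, 1, 1),
    ("geneA", "geneC"): (1, -1, 1),
    ("geneB", "geneD"): (0, 1, 0),
}
categories = {pair: classify_link(signs) for pair, signs in links.items()}
```

The rule takes a sign per network, so it extends to any number of networks, in line with the multi-network comparison the abstract describes.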
