ABSTRACT
The article proposes a new regression model based on the generalized odd log-logistic family for interval-censored data. For this type of data, the survival times are not observed exactly, and the event of interest is only known to occur within some random interval. This family is suitable for interval-censored modeling because it generalizes some popular lifetime distributions and can accommodate various shapes of the hazard function. The parameters are estimated by classical and Bayesian methods. We examine the behavior of the estimates for several sample sizes and censoring percentages. Selection criteria, likelihood ratio tests, residual analysis, and graphical techniques are used to assess the goodness of fit of the fitted models. The usefulness of the proposed models is shown by means of two real data sets.
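As a rough illustration of how interval censoring enters the likelihood, the sketch below evaluates a log-likelihood in which each observation contributes F(U) - F(L) for its interval (L, U]. It assumes the two-parameter form F(t) = G(t)^(alpha*theta) / [G(t)^(alpha*theta) + (1 - G(t)^theta)^alpha] for the generalized odd log-logistic family and a Weibull baseline G; both choices, and all function names, are illustrative assumptions rather than the article's specification.

```python
import math

def weibull_cdf(t, shape, scale):
    """Baseline CDF G(t). The Weibull baseline is an illustrative assumption."""
    if t <= 0.0:
        return 0.0
    if math.isinf(t):
        return 1.0
    return 1.0 - math.exp(-((t / scale) ** shape))

def goll_cdf(t, alpha, theta, shape, scale):
    """Assumed generalized odd log-logistic CDF built on the Weibull baseline."""
    g = weibull_cdf(t, shape, scale)
    num = g ** (alpha * theta)
    den = num + (1.0 - g ** theta) ** alpha
    return num / den

def interval_loglik(intervals, alpha, theta, shape, scale):
    """Sum of log[F(upper) - F(lower)] over the observed intervals.

    An upper limit of math.inf represents a right-censored observation.
    """
    total = 0.0
    for lower, upper in intervals:
        p = (goll_cdf(upper, alpha, theta, shape, scale)
             - goll_cdf(lower, alpha, theta, shape, scale))
        total += math.log(p)
    return total

# Hypothetical intervals (not data from the article).
example = [(0.5, 1.5), (1.0, 2.0), (2.0, math.inf)]
value = interval_loglik(example, alpha=1.2, theta=0.8, shape=1.5, scale=2.0)
```

With alpha = theta = 1 the assumed family reduces to the baseline, which gives a simple sanity check on the implementation.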
ABSTRACT
Several statistical models have been proposed in recent years, among them semiparametric regression. In medicine, there are many situations in which linear regression is impracticable for statistical modeling, especially when the data contain explanatory variables that have a nonlinear relationship with the response variable. Another common situation is when the response variable does not have a unimodal shape, and it is not possible to adopt distributions belonging to the symmetric or asymmetric classes. In this context, a semiparametric heteroskedastic regression is proposed based on an extension of the normal distribution. We then show the usefulness of this model for analyzing the cost of prostate cancer surgery. The predictor variables refer to two groups of patients: one group receives a multimodal local anesthetic solution (Preemptive Target Anesthetic Solution) and the second group is treated with neuraxial blockade (spinal anesthesia/traditional standard). Other relevant predictor variables are also evaluated, allowing an in-depth interpretation of the predictor variables that have a nonlinear effect on the dependent variable, cost. The penalized maximum likelihood method is adopted to estimate the model parameters. The new regression is a useful statistical tool for analyzing medical data.
ABSTRACT
The future values of expected claims are very important for insurance companies, which must avoid large losses under the uncertainty produced by future claims. In this paper, we define a new size-of-loss distribution for negatively skewed insurance claims data. Four key risk indicators are defined and analyzed under four estimation methods: maximum likelihood, ordinary least squares, weighted least squares, and Anderson-Darling. The insurance claims data are modeled using many competitive models, and a comprehensive comparison is performed under nine statistical tests. An autoregressive model is proposed to analyze the insurance claims data and estimate the future values of the expected claims. The value-at-risk estimation and the peaks-over-random-threshold mean-of-order-p methodology are considered.
ABSTRACT
We investigate the use of the Probabilistic Incremental Programming Evolution (PIPE) algorithm as a tool to construct continuous cumulative distribution functions to model given data sets. The PIPE algorithm can generate several candidate functions to fit the empirical distribution of data. These candidates are generated by following a set of probability rules. The set of rules is then evolved over a number of iterations to generate better candidates according to some optimality criteria. This approach rivals that of generated distributions, which are obtained by adding parameters to existing probability distributions. The method has two main advantages. The first is that it is possible to explicitly control the complexity of the candidate functions by specifying which mathematical functions and operators can be used and how long the mathematical expression of the candidate can be. The second is that this approach deals with model selection and estimation at the same time. The overall performance on both simulated and real data was very satisfactory. For the real data applications, the PIPE algorithm obtained better likelihoods than existing models, with remarkably simpler mathematical expressions.
Subjects
Algorithms, Probability, Statistical Distributions
ABSTRACT
In many practical situations, there is an interest in modeling bounded random variables in the interval (0, 1), such as rates, proportions, and indexes. It is important to provide new continuous models to deal with the uncertainty involved in variables of this type. This paper proposes a new quantile regression model based on an alternative parameterization of the unit Burr XII (UBXII) distribution. For the UBXII distribution and its associated regression, we obtain score functions and observed information matrices. We use the maximum likelihood method to estimate the parameters of the regression model, and conduct a Monte Carlo study to evaluate the performance of its estimates in finite samples. Furthermore, we present general diagnostic analysis and model selection techniques for the regression model. We empirically show its importance and flexibility through an application to an actual data set, in which the dropout proportion of Brazilian undergraduate animal sciences courses is analyzed. We use a statistical learning method to compare the proposed model with the beta, Kumaraswamy, and unit-Weibull regressions. The results show that the UBXII regression provides the best fit and the most accurate predictions. Therefore, it is a valuable and competitive alternative to the well-known regressions for modeling double-bounded variables in the unit interval.
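The idea of a quantile parameterization can be sketched as follows. The code assumes the UBXII CDF F(x) = (1 + (-log x)^beta)^(-alpha) on (0, 1), which arises from X = exp(-Y) with Y Burr XII, and replaces alpha by the tau-th quantile mu through F(mu) = tau. The CDF form, the logit link, and all names are assumptions made for illustration, not details taken from the paper.

```python
import math

def alpha_from_quantile(mu, beta, tau=0.5):
    """Solve F(mu) = tau for alpha under the assumed UBXII CDF."""
    return -math.log(tau) / math.log1p((-math.log(mu)) ** beta)

def ubxii_cdf(x, mu, beta, tau=0.5):
    """Assumed UBXII CDF, indexed by its tau-th quantile mu."""
    alpha = alpha_from_quantile(mu, beta, tau)
    return (1.0 + (-math.log(x)) ** beta) ** (-alpha)

def ubxii_quantile(u, mu, beta, tau=0.5):
    """Inverse of ubxii_cdf in its first argument."""
    alpha = alpha_from_quantile(mu, beta, tau)
    return math.exp(-((u ** (-1.0 / alpha) - 1.0) ** (1.0 / beta)))

def logit_link_inverse(eta):
    """Map a linear predictor to (0, 1); the logit link is an assumption."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical covariates and coefficients for one observation.
x_row = [1.0, 0.4]
coef = [-0.5, 1.2]
mu_i = logit_link_inverse(sum(b * x for b, x in zip(coef, x_row)))
median_check = ubxii_cdf(mu_i, mu_i, beta=2.0, tau=0.5)  # equals tau by construction
```

Under this parameterization the regression coefficients act directly on the chosen quantile, which is what makes the model a quantile regression.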
Subjects
Regression Analysis, Animals, Brazil, Monte Carlo Method, Uncertainty
ABSTRACT
The work proposes a new family of survival models called the odd log-logistic generalized Neyman type A long-term family. We consider different activation schemes in which the number of factors M has the Neyman type A distribution and the time of occurrence of an event follows the odd log-logistic generalized family. The parameters are estimated by classical and Bayesian methods. We investigate the mean estimates, biases, and root mean square errors under different activation schemes using Monte Carlo simulations. Residual analysis under the frequentist approach is used to verify the model assumptions. We illustrate the applicability of the proposed model with data on patients with gastric adenocarcinoma. These data were chosen because the disease is responsible for most cases of stomach tumors. The estimated cured proportion of patients under chemoradiotherapy is higher than that of patients undergoing surgery only. The estimated hazard function for the chemoradiotherapy level tends to decrease as time increases. More information about the data is given in the application section.
ABSTRACT
We define two new flexible families of continuous distributions to fit real data by compounding the Marshall-Olkin class and the power series distribution. These families are very competitive with the popular beta and Kumaraswamy generators. Their densities have linear representations in terms of exponentiated densities. Since the main properties of thirty-five exponentiated distributions are well known, we can easily obtain several properties of about three hundred and fifty distributions using the references of this article and five special cases of the power series distribution. We provide a package implemented in R that shows numerically the precision of one of the linear representations. This package is useful for calculating numerical values of some statistical measures of the generated distributions. We estimate the parameters by maximum likelihood. We define a regression based on one of the two families. The usefulness of a generated distribution and the associated regression is shown empirically.
Subjects
Statistical Distributions
ABSTRACT
We introduce a new distribution called the power-modified Kies-exponential (PMKE) distribution and derive some of its mathematical properties. Its hazard function can be bathtub-shaped, increasing, or decreasing. Its parameters are estimated by seven classical methods. Further, Bayesian estimation under squared error, general entropy, and LINEX loss functions is adopted to estimate the parameters. Simulation results are provided to investigate the behavior of these estimators. The estimation methods are ranked, based on partial and overall ranks, to determine the best estimation approach for the model parameters. The proposed distribution is used to model a real-life turbocharger data set and is compared with 24 extensions of the exponential distribution.
ABSTRACT
We propose a new flexible generalized family (NFGF) for constructing many families of distributions. The importance of the NFGF is that any baseline distribution can be chosen and that it does not involve any additional parameters. Some useful statistical properties of the NFGF are determined, such as a linear representation for the family density, analytical shapes of the density and hazard rate, random variable generation, moments, and generating function. Further, the structural properties of a special model, named the new flexible Kumaraswamy (NFKw) distribution, are investigated, and the model parameters are estimated by the maximum likelihood method. A simulation study is carried out to assess the performance of the estimates. The usefulness of the NFKw model is shown empirically by means of three real-life data sets. In fact, the two-parameter NFKw model performs better than the three-parameter transmuted-Kumaraswamy, the three-parameter exponentiated-Kumaraswamy, and the well-known two-parameter Kumaraswamy models.
ABSTRACT
In this study, a new one-parameter discrete distribution obtained by compounding the Poisson and xgamma distributions is proposed. Some statistical properties of the new distribution are obtained including moments and probability and moment generating functions. Two methods are used for the estimation of the unknown parameter: the maximum likelihood method and the method of moments. Additionally, the count regression model and integer-valued autoregressive process of the proposed distribution are introduced. Some possible applications of the introduced models are considered and discussed.
ABSTRACT
We study a five-parameter model called the Weibull Burr XII (WBXII) distribution, which extends several models, including new ones. This model is quite flexible in terms of the hazard function, which exhibits increasing, decreasing, upside-down bathtub, and bathtub shapes. Its density function allows different forms such as left-skewed, right-skewed, reversed-J, and bimodal. We aim to provide some general mathematical quantities for the proposed distribution, which can be useful for real data analysis. We develop a Shiny application to provide interactive illustrations of the WBXII density and hazard functions. We estimate the model parameters using maximum likelihood and derive a profile log-likelihood for all members of the Weibull-G family. The survival analysis application reveals that the WBXII model is suitable for accommodating left-skewed tails, which are very common when the variable of interest is the time to failure of a product. The income application is related to player salaries within a professional sports league, and it is peculiar because the mean of the players' salaries is much higher than for most professions. Both applications illustrate that the new distribution provides much better fits than other models with the same or fewer parameters.
Subjects
Statistical Distributions, Likelihood Functions, Survival Analysis
ABSTRACT
The article presents some aspects related to the COVID-19 pandemic in Brazil including public health, challenges facing healthcare workers and adverse impacts on the country's economy. Its main contribution is the availability of two web applications for online monitoring of the evolution of the pandemic in Brazil and South America. The applications provide the possibility to download data in different formats, view interactive maps and graphs of the cumulative confirmed cases, deaths and lethality rates, in addition to presenting plots of moving averages for states and municipalities. The predictions about new cases and new deaths caused by COVID-19, in states and regions of Brazil, are also reported using GAMLSS models. The forecasts can be easily used by public managers for effective decision-making.
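The quantities plotted by such monitoring tools can be computed in a few lines. The sketch below derives daily new counts from a cumulative series, a trailing moving average, and the lethality rate taken as cumulative deaths divided by cumulative confirmed cases. The 7-day window, the function names, and the numbers are illustrative assumptions, not values or code from the applications described.

```python
def daily_new(cumulative):
    """Convert a cumulative series into daily increments."""
    if not cumulative:
        return []
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]

def moving_average(values, window=7):
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            out.append(running / window)
    return out

def lethality_rate(cum_deaths, cum_cases):
    """Cumulative deaths over cumulative confirmed cases, day by day."""
    return [d / c if c > 0 else 0.0 for d, c in zip(cum_deaths, cum_cases)]

# Hypothetical cumulative counts for ten days (not real data).
cases = [10, 25, 45, 80, 130, 200, 290, 400, 520, 660]
deaths = [0, 1, 1, 3, 5, 8, 12, 17, 23, 30]
smoothed_new_cases = moving_average(daily_new(cases), window=7)
rates = lethality_rate(deaths, cases)
```

A trailing window is used here so that each smoothed value depends only on past days, which suits day-by-day monitoring.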
ABSTRACT
The main contribution of this article is to report general statistics about COVID-19 in Brazil, based on analysis of accumulated series of confirmed cases, deaths and lethality rates, in addition to presenting graphs of moving averages for states and municipalities. The data show that the pandemic in Brazil has grown rapidly since February 25th (date of the first reported case). Furthermore, the lethality rate of COVID-19 in Brazil is greater than in many other Latin American countries (Chile, Argentina, Uruguay and Paraguay). However, the number of new confirmed cases in Brazil has little statistical relevance because only a small part of the population has been tested. In relation to Brazilian municipalities, we highlight the 10 states with the highest lethality rates, ranked from highest to lowest. Also, predictions about the increase or decrease in new cases and deaths for states and capital cities are presented. These results can help managers and researchers to better guide their decisions regarding COVID-19.
Subjects
COVID-19/epidemiology, COVID-19/mortality, Pandemics, Brazil/epidemiology, COVID-19/virology, Health Policy/trends, Humans, Mortality, Population Surveillance/methods, Public Health/standards, Public Health/trends, SARS-CoV-2/physiology
ABSTRACT
The transmuted family of distributions has been receiving increased attention over the last few years. In this paper, we generalize the Marshall-Olkin extended Lomax distribution using the quadratic rank transmutation map to obtain the transmuted Marshall-Olkin extended Lomax distribution. Several properties of the new distribution are discussed, including the hazard rate function, ordinary and incomplete moments, characteristic function, and order statistics. We provide an estimation procedure by the maximum likelihood method and a simulation study to assess the performance of the new distribution. We show empirically the flexibility of the new model by means of an application to a real data set. It is superior to other three- and four-parameter lifetime distributions.
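The construction can be sketched in code. The quadratic rank transmutation map turns a baseline CDF G into F = (1 + lambda) G - lambda G^2 with |lambda| <= 1, and the Marshall-Olkin extension of a survival function S is alpha S / (1 - (1 - alpha) S). The Lomax parameterization S(x) = (1 + x/scale)^(-shape) and every function name below are assumptions for illustration; the paper may use a different parameterization.

```python
import math

def lomax_sf(x, shape, scale):
    """Lomax survival function under an assumed (shape, scale) parameterization."""
    return (1.0 + x / scale) ** (-shape)

def moe_lomax_cdf(x, alpha, shape, scale):
    """Marshall-Olkin extended Lomax CDF."""
    s = lomax_sf(x, shape, scale)
    return 1.0 - alpha * s / (1.0 - (1.0 - alpha) * s)

def tmoel_cdf(x, lam, alpha, shape, scale):
    """Quadratic rank transmutation applied to the MOE Lomax CDF."""
    g = moe_lomax_cdf(x, alpha, shape, scale)
    return (1.0 + lam) * g - lam * g * g

def tmoel_quantile(u, lam, alpha, shape, scale):
    """Invert the transmutation map, then the MOE step, then the Lomax step."""
    if abs(lam) < 1e-12:
        g = u
    else:
        disc = (1.0 + lam) ** 2 - 4.0 * lam * u
        g = ((1.0 + lam) - math.sqrt(disc)) / (2.0 * lam)
    s = 1.0 - g                                # MOE Lomax survival value
    base_sf = s / (alpha + (1.0 - alpha) * s)  # Lomax survival value
    return scale * (base_sf ** (-1.0 / shape) - 1.0)

# Inverse-transform draw for a hypothetical parameter set and uniform value 0.37.
draw = tmoel_quantile(0.37, lam=0.5, alpha=2.0, shape=3.0, scale=1.5)
```

Setting lambda = 0 recovers the Marshall-Olkin extended Lomax model, so the transmuted model nests it.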
Subjects
Biometry, Statistical Models, Probability
ABSTRACT
We introduce a new class of continuous distributions called the generalized odd Lindley-G family. Four special models of the new family are provided. Some explicit expressions for the quantile and generating functions, ordinary and incomplete moments, order statistics and Rényi and Shannon entropies are derived. The maximum likelihood method is used for estimating the model parameters. The flexibility of the generated family is illustrated by means of two applications to real data sets.
Subjects
Life Tables, Statistical Models, Statistical Distributions, Computer Simulation, Entropy, Likelihood Functions
ABSTRACT
Several lifetime distributions have played an important role in fitting survival data. However, for some of these models, the computation of maximum likelihood estimators is quite difficult due to the presence of flat regions in the search space, among other factors. Several well-known derivative-based optimization tools are unsuitable for obtaining such estimates. To circumvent this problem, we introduce the AdequacyModel computational library version 2.0.0 for the R statistical environment with two major contributions: a general optimization technique based on the Particle Swarm Optimization (PSO) method (with a minor modification of the original algorithm) and a set of statistical measures for assessing the adequacy of the fitted model. This library is very useful for researchers in probability and statistics and has been cited in various papers in these areas. It serves as the basis for the Newdistns library (version 2.1), published in a high-impact journal in the area of computational statistics, see https://CRAN.R-project.org/package=Newdistns. It is also the basis of the Wrapped library (version 2.0), see https://CRAN.R-project.org/package=Wrapped. A third package making use of the AdequacyModel library can be found at https://CRAN.R-project.org/package=sglg. In addition, the proposed library has proved to be very useful for maximizing log-likelihood functions with complex search regions. The library provides greater control of the optimization process by introducing a stop criterion based on a minimum number of iterations and the variance of a given proportion of optimal values. We emphasize that the new library can be used not only in statistics but also in physics and mathematics, as shown in several examples throughout the paper.
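To make the optimization idea concrete, here is a minimal textbook-style PSO that maximizes a log-likelihood without derivatives. It is written in Python and does not reproduce the AdequacyModel interface or the paper's modified algorithm. The stopping rule (a minimum number of iterations, then a small variance of the most recent best values) is only one possible reading of the criterion described above, and all names and tuning constants are assumptions.

```python
import math
import random

def pso_maximize(objective, bounds, n_particles=30, max_iter=500, min_iter=100,
                 window=20, tol=1e-20, inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic particle swarm maximization over box constraints."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]             # personal best positions
    best_val = [objective(p) for p in pos]     # personal best values
    g = max(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    history = []
    for it in range(max_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]
        history.append(g_val)
        if it + 1 >= min_iter and len(history) >= window:
            recent = history[-window:]
            mean = sum(recent) / window
            if sum((v - mean) ** 2 for v in recent) / window < tol:
                break
    return g_pos, g_val

# Hypothetical data; the exponential model has MLE rate = 1 / sample mean.
data = [0.5, 1.0, 1.5, 2.0]

def exp_loglik(params):
    rate = params[0]
    return len(data) * math.log(rate) - rate * sum(data)

estimate, loglik = pso_maximize(exp_loglik, bounds=[(0.01, 10.0)])
```

The exponential example has a closed-form maximum likelihood estimate, which makes it convenient for checking that the swarm converges before trying harder likelihoods.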
Subjects
Probability, Software, Algorithms, Computer Simulation, Monte Carlo Method
ABSTRACT
The normal distribution has a central place in distribution theory and statistics. We propose the log-odd normal generalized (LONG) family of distributions based on log-odds and obtain some of its mathematical properties, including a useful linear representation for the new family. We investigate, as a special model, the log-odd normal power-Cauchy (LONPC) distribution. Some structural properties of the LONPC distribution are obtained, including the quantile function, ordinary and incomplete moments, generating function, and some asymptotics. We estimate the model parameters using the maximum likelihood method. The usefulness of the proposed family is shown empirically by means of a real air pollution data set.
ABSTRACT
We define a new lifetime model based on compounding the Lindley and Nadarajah-Haghighi distributions. The proposed distribution is very competitive with other lifetime models. Some of its mathematical properties are investigated, including the generating function, mean residual life, moments, Bonferroni and Lorenz curves, and mean deviations. We discuss the estimation of the model parameters by maximum likelihood. We provide a simulation study and two applications to real data for illustrative purposes. We show empirically that the new distribution yields good fits to both data sets, and that it can be a useful alternative to other classical lifetime models.
Subjects
Statistical Models, Survival Analysis, Algorithms, Monte Carlo Method, Probability, Reference Standards, Reproducibility of Results, Time Factors
ABSTRACT
In this paper, we introduce a new three-parameter distribution by compounding the Nadarajah-Haghighi and geometric distributions, which can be interpreted as a truncated Marshall-Olkin extended Weibull. The compounding procedure is based on the work by Marshall and Olkin (1997). We prove that the new distribution can be obtained as a compound model with an exponential mixing distribution. It can have decreasing, increasing, upside-down bathtub, bathtub-shaped, constant, and decreasing-increasing-decreasing failure rate functions depending on the values of the parameters. Some mathematical properties of the new distribution are studied, including moments and the quantile function. The maximum likelihood estimation procedure is discussed, and a particle swarm optimization algorithm is provided for estimating the model parameters. The flexibility of the new model is illustrated with an application to a real data set.
Subjects
Algorithms, Confidence Intervals, Likelihood Functions, Monte Carlo Method, Reference Standards, Time Factors
ABSTRACT
We propose a new survival model for lifetime data in the presence of a surviving fraction and obtain some of its properties. Its genesis is based on extensions of the promotion time cure model, where an extra parameter controls the heterogeneity or dependence of an unobserved number of lifetimes. We construct a regression model to evaluate the effects of covariates on the cured fraction. We discuss inference for the proposed model under a classical approach, where some maximum likelihood tools are explored. Further, an expectation-maximization algorithm is developed to calculate the maximum likelihood estimates of the model parameters. We also perform an empirical study of the likelihood ratio test in order to compare the promotion time cure model and the proposed model. We illustrate the usefulness of the new model by means of a colorectal cancer data set.
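For orientation, the standard promotion time cure model that the proposal extends can be sketched as follows. Its population survival function is S_pop(t) = exp(-theta F(t)), so the cured fraction is exp(-theta), the limit as t grows. The Weibull choice for F, the log link theta = exp(x'b), and all names are illustrative assumptions, and the extra heterogeneity parameter of the proposed model is not represented here.

```python
import math

def weibull_cdf(t, shape, scale):
    """Latent event-time CDF F(t); the Weibull form is an assumption."""
    if t <= 0.0:
        return 0.0
    if math.isinf(t):
        return 1.0
    return 1.0 - math.exp(-((t / scale) ** shape))

def promotion_time_survival(t, theta, shape, scale):
    """Population survival S_pop(t) = exp(-theta * F(t))."""
    return math.exp(-theta * weibull_cdf(t, shape, scale))

def cured_fraction(theta):
    """Limit of S_pop(t) as t goes to infinity."""
    return math.exp(-theta)

def theta_from_covariates(x_row, coef):
    """Log link theta = exp(x'b), an assumed way to relate covariates to theta."""
    return math.exp(sum(b * x for b, x in zip(coef, x_row)))

# Hypothetical covariate vector (intercept, treatment indicator) and coefficients.
theta_treated = theta_from_covariates([1.0, 1.0], [0.3, -0.8])
theta_control = theta_from_covariates([1.0, 0.0], [0.3, -0.8])
cure_treated = cured_fraction(theta_treated)
cure_control = cured_fraction(theta_control)
```

Because the cured fraction decreases in theta, a negative coefficient in this sketch corresponds to a larger cured fraction for the treated group.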