1.
Stat Med ; 42(13): 2162-2178, 2023 06 15.
Article in English | MEDLINE | ID: mdl-36973919

ABSTRACT

Informative cluster size (ICS) arises in situations with clustered data where a latent relationship exists between the number of participants in a cluster and the outcome measures. Although this phenomenon has been sporadically reported in the statistical literature for nearly two decades now, further exploration is needed in certain statistical methodologies to avoid potentially misleading inferences. For inference about population quantities without covariates, inverse cluster size reweightings are often employed to adjust for ICS. Further, to study the effect of covariates on disease progression described by a multistate model, the pseudo-value regression technique has gained popularity in time-to-event data analysis. We seek to answer the question: "How to apply pseudo-value regression to clustered time-to-event data when cluster size is informative?" ICS adjustment by the reweighting method can be performed in two steps: estimation of marginal functions of the multistate model and fitting the estimating equations based on pseudo-value responses, leading to four possible strategies. We present theoretical arguments and thorough simulation experiments to ascertain the correct strategy for adjusting for ICS. A further extension of our methodology is implemented to include informativeness induced by the intracluster group size. We demonstrate the methods in two real-world applications: (i) to determine predictors of tooth survival in a periodontal study and (ii) to identify indicators of ambulatory recovery in spinal cord injury patients who participated in locomotor-training rehabilitation.


Subjects
Models, Statistical; Tooth; Humans; Cluster Analysis; Computer Simulation; Regression Analysis
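
The pseudo-value idea mentioned in the abstract above can be illustrated with the generic jackknife construction, in which observation i receives the pseudo-value n * theta_hat - (n - 1) * theta_hat_without_i. The sketch below is only that generic construction applied to a toy estimator; the function names and the data are hypothetical, and it does not reproduce the paper's multistate estimators or its cluster-size reweighting.

```python
def jackknife_pseudo_values(data, estimator):
    """Return one pseudo-value per observation:
    n * estimator(all data) - (n - 1) * estimator(data without observation i)."""
    n = len(data)
    full_estimate = estimator(data)
    pseudo = []
    for i in range(n):
        leave_one_out = data[:i] + data[i + 1:]
        pseudo.append(n * full_estimate - (n - 1) * estimator(leave_one_out))
    return pseudo


def mean(xs):
    """Plain sample mean, used here as the simplest possible estimator."""
    return sum(xs) / len(xs)


# Hypothetical, fully observed event times (no censoring in this toy example).
times = [2.0, 3.5, 5.0, 7.5]
pseudo_values = jackknife_pseudo_values(times, mean)
```

For the sample mean, each pseudo-value equals the corresponding observation, which is a convenient sanity check. With a censored-data estimator in place of `mean`, the pseudo-values would instead act as stand-in responses for a regression model.
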
2.
Stat Med ; 2022 Dec 27.
Article in English | MEDLINE | ID: mdl-36574753

ABSTRACT

We propose a Bayesian hurdle mixed-effects model to analyze longitudinal ordinal data under a complex multilevel structure. This research was motivated by the dataset gathered from the Iowa Fluoride Study (IFS) in order to establish the relationships between fluorosis status and potential risk/protective factors. Dental fluorosis is characterized by spots on tooth enamel and is caused by excessive fluoride ingestion during enamel formation. Observations are collected from multiple surface zones on each tooth and on all available teeth of children from the studied cohort, which are longitudinally observed at ages 9, 13, and 17. The data not only exhibit a complex hierarchical structure, but also have a large proportion of zero values that are likely to follow different statistical patterns from non-zero categories. Therefore, we develop a hurdle model to consider the zero category separately, while a proportional odds model is used for the positive categories. The estimated parameters are obtained from a Gibbs sampler implemented by the OpenBUGS software. Our model is compared with two popular methods for ordinal data: the proportional odds model and the partial proportional odds model. We perform a comprehensive analysis of the IFS data and evaluate the accuracy and effectiveness of our methodology through simulation studies. Our discoveries provide novel insights to statisticians and dental practitioners about the associations between patient and clinical characteristics and dental fluorosis.

3.
Stat Med ; 40(6): 1336-1356, 2021 03 15.
Article in English | MEDLINE | ID: mdl-33368533

ABSTRACT

Dental caries (i.e., cavities) is one of the most common chronic childhood diseases and may continue to progress throughout a person's lifetime. The Iowa Fluoride Study (IFS) was designed to investigate the effects of various fluoride, dietary and nondietary factors on the progression of dental caries among a cohort of Iowa school children. We develop a mixed effects model to perform a comprehensive analysis of the longitudinal clustered data of IFS at ages 5, 9, 13, and 17. We combine a Bayesian hurdle framework with the Conway-Maxwell-Poisson regression model, which can account for both excessive zeros and various levels of dispersion. A hierarchical shrinkage prior distribution is used to share the temporal information for predictors in the fixed-effects model. The dependence among teeth of each individual child is modeled through a sparse covariance structure of the random effects across time. Moreover, we obtain the parameter estimates and credible intervals from a Gibbs sampler. Simulation studies are conducted to assess the accuracy and effectiveness of our statistical methodology. The results of this article provide novel tools to statistical practitioners and offer fresh insights to dental researchers on effects of various risk and protective factors on caries progression.


Subjects
Dental Caries; Adolescent; Bayes Theorem; Child; Child, Preschool; Computer Simulation; Dental Caries/epidemiology; Humans; Iowa/epidemiology; Poisson Distribution
4.
Biom J ; 63(4): 761-786, 2021 04.
Article in English | MEDLINE | ID: mdl-33393147

ABSTRACT

Biological and medical researchers often collect count data in clusters at multiple time points. The data can exhibit excessive zeros and a wide range of dispersion levels. In particular, our research was motivated by a dental dataset with such complex data features: the Iowa Fluoride Study (IFS). The study was designed to investigate the effects of various dietary and nondietary factors on the caries development of a cohort of Iowa school children at the ages of 5, 9, and 13. To analyze the multiyear IFS data, we propose a novel longitudinal method of a generalized estimating equations based marginal regression model. We use a zero-inflated model with a Conway-Maxwell-Poisson (CMP) distribution, which has the flexibility to account for all levels of dispersion. The parameters of interest are estimated through a modified expectation-solution algorithm to account for the clustered and temporal correlation structure. We fit the proposed zero-inflated CMP model and perform a comprehensive secondary analysis of the IFS dataset. It resulted in a number of notable conclusions that also make clinical sense. Additionally, we demonstrated the superiority of this modeling approach over two other popular competing models: the zero-inflated Poisson and negative binomial models. In the simulation studies, we further evaluate the performance of our point estimators, the variance estimators, and that of the large sample confidence intervals for the parameters of interest. It is also demonstrated that our longitudinal CMP model can correctly identify the time-varying dispersion patterns.


Subjects
Fluorides; Models, Statistical; Child; Computer Simulation; Humans; Iowa; Poisson Distribution
5.
Stat Med ; 37(30): 4807-4822, 2018 12 30.
Article in English | MEDLINE | ID: mdl-30232808

ABSTRACT

There have been numerous attempts to extend the Wilcoxon rank-sum test to clustered data. Recently, one such rank-sum test (Dutta & Datta, 2016, Biometrics 72, 432-440) was developed to compare the group-specific marginal distributions of outcomes in clustered data where the conditional distributions of outcomes depend on the number of observations from that group in a given cluster, a phenomenon referred to as informative intra-cluster group (ICG) size. However, comparison of group-specific marginal distributions may not be sufficient in the presence of some potentially useful covariables that are observed in the study. In addition, not accounting for the effect of these covariates can lead to biased and misleading inference for the group comparisons. Thus, the purpose of this article is twofold. First, we develop a method to estimate the covariate effects using rank-based weighted estimating equations that are appropriate when the ICG size is informative. Second, we construct an aligned rank-sum test based on the covariate adjusted outcomes. Asymptotic distributions of the R-estimators and the test statistic are provided. Through simulation studies, we show the importance of selecting proper weights in constructing the estimating equations when informativeness is present through the cluster or ICG sizes. We also demonstrate the superiority and the robustness of our method in comparison to regular parametric linear mixed models in clustered data. We apply our method to analyze different real-life data sets, including data on the birthweights of rat pups in different litters and dental data on tooth attachment loss.


Subjects
Cluster Analysis; Sample Size; Statistics, Nonparametric; Aged; Animals; Birth Weight; Data Interpretation, Statistical; Humans; Linear Models; Models, Statistical; Periodontal Attachment Loss/epidemiology; Rats
6.
Stat Med ; 37(5): 801-812, 2018 02 28.
Article in English | MEDLINE | ID: mdl-29108124

ABSTRACT

In practice, count data may exhibit varying dispersion patterns and excessive zero values; additionally, they may appear in groups or clusters sharing a common source of variation. We present a novel Bayesian approach for analyzing such data. To model these features, we combine the Conway-Maxwell-Poisson distribution, which allows both overdispersion and underdispersion, with a hurdle component for the zeros and random effects for clustering. We propose an efficient Markov chain Monte Carlo sampling scheme to obtain posterior inference from our model. Through simulation studies, we compare our hurdle Conway-Maxwell-Poisson model with a hurdle Poisson model to demonstrate the effectiveness of our Conway-Maxwell-Poisson approach. Furthermore, we apply our model to analyze an illustrative dataset containing information on the number and types of carious lesions on each tooth in a population of 9-year-olds from the Iowa Fluoride Study, which is an ongoing longitudinal study on a cohort of Iowa children that began in 1991.


Subjects
Bayes Theorem; Cluster Analysis; Poisson Distribution; Child; Computer Simulation; Data Interpretation, Statistical; Dental Caries/prevention & control; Fluorides/therapeutic use; Humans; Iowa; Longitudinal Studies; Markov Chains; Monte Carlo Method
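
As a rough illustration of the distributions named in the abstract above, the sketch below evaluates a Conway-Maxwell-Poisson (CMP) probability mass function, P(Y = y) proportional to lam**y / (y!)**nu, together with hurdle and zero-inflated variants. The function names, the parameter names (`lam`, `nu`, `p0`, `pi0`), and the truncation of the normalizing sum at 200 terms are all assumptions made for this example; the truncation is adequate only when the terms decay quickly, and no random effects or Bayesian sampling are included.

```python
import math


def cmp_pmf(y, lam, nu, terms=200):
    """CMP probability of count y, with the normalizing constant
    approximated by a sum truncated at `terms` (an assumption of this sketch)."""
    def log_term(k):
        # log of lam**k / (k!)**nu, kept on the log scale to avoid overflow
        return k * math.log(lam) - nu * math.lgamma(k + 1)

    log_terms = [log_term(k) for k in range(terms)]
    largest = max(log_terms)
    log_z = largest + math.log(sum(math.exp(t - largest) for t in log_terms))
    return math.exp(log_term(y) - log_z)


def hurdle_cmp_pmf(y, lam, nu, p0):
    """Hurdle variant: zero occurs with probability p0; positive counts
    follow a zero-truncated CMP distribution."""
    if y == 0:
        return p0
    return (1 - p0) * cmp_pmf(y, lam, nu) / (1 - cmp_pmf(0, lam, nu))


def zicmp_pmf(y, lam, nu, pi0):
    """Zero-inflated variant: a point mass pi0 at zero mixed with a full CMP."""
    base = (1 - pi0) * cmp_pmf(y, lam, nu)
    return pi0 + base if y == 0 else base
```

Setting `nu = 1` recovers the ordinary Poisson distribution, while `nu < 1` and `nu > 1` correspond to overdispersion and underdispersion respectively.
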
7.
Stat Med ; 36(16): 2630-2640, 2017 Jul 20.
Article in English | MEDLINE | ID: mdl-28324913

ABSTRACT

Clustered data are often encountered in biomedical studies, and to date, a number of approaches have been proposed to analyze such data. However, the phenomenon of informative cluster size (ICS) is a challenging problem, and its presence has an impact on the choice of a correct analysis methodology. For example, Dutta and Datta (2015, Biometrics) presented a number of marginal distributions that could be tested. Depending on the nature and degree of informativeness of the cluster size, these marginal distributions may differ, as do the choices of the appropriate test. In particular, they applied their new test to a periodontal data set where the plausibility of the informativeness was mentioned, but no formal test for the same was conducted. We propose bootstrap tests for testing the presence of ICS. A balanced bootstrap method is developed to successfully estimate the null distribution by merging the re-sampled observations with closely matching counterparts. Relying on the assumption of exchangeability within clusters, the proposed procedure performs well in simulations even with a small number of clusters, at different distributions and against different alternative hypotheses, thus making it an omnibus test. We also explain how to extend the ICS test to a regression setting, thereby enhancing its practical utility. The methodologies are illustrated using the periodontal data set mentioned earlier. Copyright © 2017 John Wiley & Sons, Ltd.


Subjects
Cluster Analysis; Models, Statistical; Biostatistics; Computer Simulation; Data Interpretation, Statistical; Dental Health Surveys; Humans; Monte Carlo Method; Periodontal Diseases/diagnosis; Regression Analysis; Sample Size
8.
Stat Neerl ; 71(1): 31-57, 2017 Jan.
Article in English | MEDLINE | ID: mdl-28798498

ABSTRACT

Datasets examining periodontal disease (PD) record current (disease) status information of tooth-sites, whose stochastic behavior can be attributed to a multistate system with state occupation determined at a single inspection time. In addition, the tooth-sites remain clustered within a subject, and the number of available tooth-sites may be representative of the true PD status of that subject, leading to an 'informative cluster size' scenario. To provide insulation against incorrect model assumptions, we propose a nonparametric regression framework to estimate state occupation probabilities at a given time and state exit/entry distributions, utilizing weighted monotonic regression and smoothing techniques. We demonstrate the superior performance of our proposed weighted estimators over the un-weighted counterparts via a simulation study, and illustrate the methodology using a dataset on periodontal disease.

9.
Biometrics ; 72(2): 432-40, 2016 06.
Article in English | MEDLINE | ID: mdl-26575695

ABSTRACT

The Wilcoxon rank-sum test is a popular nonparametric test for comparing two independent populations (groups). In recent years, there have been renewed attempts in extending the Wilcoxon rank sum test for clustered data, one of which (Datta and Satten, 2005, Journal of the American Statistical Association 100, 908-915) addresses the issue of informative cluster size, i.e., when the outcomes and the cluster size are correlated. We are faced with a situation where the group specific marginal distribution in a cluster depends on the number of observations in that group (i.e., the intra-cluster group size). We develop a novel extension of the rank-sum test for handling this situation. We compare the performance of our test with the Datta-Satten test, as well as the naive Wilcoxon rank sum test. Using a naturally occurring simulation model of informative intra-cluster group size, we show that only our test maintains the correct size. We also compare our test with a classical signed rank test based on averages of the outcome values in each group paired by the cluster membership. While this test maintains the size, it has lower power than our test. Extensions to multiple group comparisons and the case of clusters not having samples from all groups are also discussed. We apply our test to determine whether there are differences in the attachment loss between the upper and lower teeth and between mesial and buccal sites of periodontal patients.


Subjects
Biometry; Cluster Analysis; Data Interpretation, Statistical; Statistics, Nonparametric; Computer Simulation; Humans; Periodontal Attachment Loss; Sample Size; Tooth
10.
Biometrics ; 72(2): 606-18, 2016 06.
Article in English | MEDLINE | ID: mdl-26575079

ABSTRACT

Community water fluoridation is an important public health measure to prevent dental caries, but it continues to be somewhat controversial. The Iowa Fluoride Study (IFS) is a longitudinal study on a cohort of Iowa children that began in 1991. The main purposes of this study (http://www.dentistry.uiowa.edu/preventive-fluoride-study) were to quantify fluoride exposures from both dietary and nondietary sources and to associate longitudinal fluoride exposures with dental fluorosis (spots on teeth) and dental caries (cavities). We analyze a subset of the IFS data by a marginal regression model with a zero-inflated version of the Conway-Maxwell-Poisson distribution for count data exhibiting excessive zeros and a wide range of dispersion patterns. In general, we introduce two estimation methods for fitting a ZICMP marginal regression model. Finite sample behaviors of the estimators and the resulting confidence intervals are studied using extensive simulation studies. We apply our methodologies to the dental caries data. Our novel modeling incorporating zero inflation, clustering, and overdispersion sheds some new light on the effect of community water fluoridation and other factors. We also include a second application of our methodology to a genomic (next-generation sequencing) dataset that exhibits underdispersion.


Subjects
Data Interpretation, Statistical; Models, Statistical; Regression Analysis; Biometry/methods; Cluster Analysis; Computer Simulation; Confidence Intervals; Drinking Water; Fluoridation; Genomics; High-Throughput Nucleotide Sequencing; Humans; Poisson Distribution
11.
Biometrics ; 72(2): 441-51, 2016 06.
Article in English | MEDLINE | ID: mdl-26682911

ABSTRACT

Ignorance of the mechanisms responsible for the availability of information presents an unusual problem for analysts. It is often the case that the availability of information is dependent on the outcome. In the analysis of cluster data we say that a condition for informative cluster size (ICS) exists when the inference drawn from analysis of hypothetical balanced data varies from that of inference drawn on observed data. Much work has been done in order to address the analysis of clustered data with informative cluster size; examples include Inverse Probability Weighting (IPW), Cluster Weighted Generalized Estimating Equations (CWGEE), and Doubly Weighted Generalized Estimating Equations (DWGEE). When cluster size changes with time, i.e., the data set possesses temporally varying cluster sizes (TVCS), these methods may produce biased inference for the underlying marginal distribution of interest. We propose a new marginalization that may be appropriate for addressing clustered longitudinal data with TVCS. The principal motivation for our present work is to analyze the periodontal data collected by Beck et al. (1997, Journal of Periodontal Research 6, 497-505). Longitudinal periodontal data often exhibit both ICS and TVCS as the number of teeth possessed by participants at the onset of the study is not constant and teeth as well as individuals may be displaced throughout the study.


Subjects
Cluster Analysis; Longitudinal Studies; Models, Statistical; Regression Analysis; Computer Simulation; Humans; Periodontics; Tooth
12.
Comput Stat Data Anal ; 85: 54-66, 2015 May 01.
Article in English | MEDLINE | ID: mdl-25620827

ABSTRACT

Use of zero-inflated count data models is common in applications where the number of zero counts exceeds that predicted from a traditional count data model such as Poisson or negative binomial. When count data exhibiting inflated zero counts are correlated among subjects, a natural approach will be to fit a marginal model with the help of generalized estimating equations (GEE) that can incorporate subject-to-subject correlations. A GEE based zero-inflated negative binomial (ZINB) model is proposed to fit clustered counts with excessive zeros. However, the corresponding sandwich variance estimator appears to underestimate the true variance. The theoretical reasons for its failure are explained and a correction under additional modeling assumptions is offered. In addition, a clustered resampling (bootstrap) procedure is proposed to estimate the variance and it is shown that the bootstrap procedure captures the correct variance under no additional model assumptions. Utility of this marginal GEE based ZINB model over two other competing models has been assessed using a thorough simulation study. The resulting inference procedure is applied to study the association between the dental caries and fluoride exposures using a dataset extracted from the Iowa Fluoride Study. A number of risk factors of clinical significance are reliably identified using the proposed model.
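
The clustered resampling idea in the abstract above can be sketched as follows: whole clusters are drawn with replacement, the estimator is recomputed on each resample, and the spread of the recomputed estimates serves as a variance estimate. This is a minimal sketch under stated assumptions; the function names, the pooled-mean estimator, and the toy counts are hypothetical and stand in for the ZINB marginal model fitted in the paper.

```python
import random


def cluster_bootstrap_variance(clusters, estimator, replicates=1000, seed=0):
    """Estimate the variance of `estimator` by resampling whole clusters
    with replacement, which keeps within-cluster correlation intact."""
    rng = random.Random(seed)
    k = len(clusters)
    estimates = []
    for _ in range(replicates):
        resample = [clusters[rng.randrange(k)] for _ in range(k)]
        estimates.append(estimator(resample))
    centre = sum(estimates) / replicates
    return sum((e - centre) ** 2 for e in estimates) / (replicates - 1)


def pooled_mean(clusters):
    """Mean of all observations across clusters (a stand-in estimator)."""
    values = [y for cluster in clusters for y in cluster]
    return sum(values) / len(values)


# Hypothetical zero-heavy counts, one inner list per subject.
counts = [[0, 0, 1], [0, 2], [0, 0, 0, 3], [1, 1], [0, 4, 0]]
variance_estimate = cluster_bootstrap_variance(counts, pooled_mean)
```

Resampling clusters rather than individual observations is the design choice that matters here: drawing single observations would treat correlated counts as independent.
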

13.
Stat Methods Med Res ; 33(7): 1264-1277, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38767219

ABSTRACT

In many cluster-correlated data analyses, informative cluster size poses a challenge that can potentially introduce bias in statistical analyses. Different methodologies have been introduced in statistical literature to address this bias. In this study, we consider a complex form of informativeness where the number of observations corresponding to latent levels of a unit-level continuous covariate within a cluster is associated with the response variable. This type of informativeness has not been explored in prior research. We present a novel test statistic designed to evaluate the effect of the continuous covariate while accounting for the presence of informativeness. The covariate induces a continuum of latent subgroups within the clusters, and our test statistic is formulated by aggregating values from an established statistic that accounts for informative subgroup sizes when comparing group-specific marginal distributions. Through carefully designed simulations, we compare our test with four traditional methods commonly employed in the analysis of cluster-correlated data. Only our test maintains the size across all data-generating scenarios with informativeness. We illustrate the proposed method to test for marginal associations in periodontal data with this distinctive form of informativeness.


Subjects
Models, Statistical; Humans; Cluster Analysis; Computer Simulation; Data Interpretation, Statistical; Sample Size; Bias; Periodontal Diseases
14.
Stat Methods Med Res ; 32(8): 1494-1510, 2023 08.
Article in English | MEDLINE | ID: mdl-37323013

ABSTRACT

Multistate current status data presents a more severe form of censoring due to the single observation of study participants transitioning through a sequence of well-defined disease states at random inspection times. Moreover, these data may be clustered within specified groups, and informativeness of the cluster sizes may arise due to the existing latent relationship between the transition outcomes and the cluster sizes. Failure to adjust for this informativeness may lead to a biased inference. Motivated by a clinical study of periodontal disease, we propose an extension of the pseudo-value approach to estimate covariate effects on the state occupation probabilities for these clustered multistate current status data with informative cluster or intra-cluster group sizes. In our approach, the proposed pseudo-value technique initially computes marginal estimators of the state occupation probabilities utilizing nonparametric regression. Next, the estimating equations based on the corresponding pseudo-values are reweighted by functions of the cluster sizes to adjust for informativeness. We perform a variety of simulation studies to study the properties of our pseudo-value regression based on the nonparametric marginal estimators under different scenarios of informativeness. For illustration, the method is applied to the motivating periodontal disease dataset, which encapsulates the complex data-generation mechanism.


Subjects
Models, Statistical; Periodontal Diseases; Humans; Cluster Analysis; Computer Simulation; Periodontal Diseases/epidemiology; Sample Size
15.
Stat Methods Med Res ; 27(6): 1806-1817, 2018 06.
Article in English | MEDLINE | ID: mdl-27655806

ABSTRACT

In the marginal analysis of clustered data, where the marginal distribution of interest is that of a typical observation within a typical cluster, analysis by reweighting has been introduced as a useful tool for estimating parameters of these marginal distributions. Such reweighting methods have foundation in within-cluster resampling schemes that marginalize potential informativeness due to cluster size or within-cluster covariate distribution, to which reweighting methods are asymptotically equivalent. In this paper, we introduce a reweighting scheme for the marginal analysis of clustered data that generalizes prior reweighting methods, with a particular application to measuring bivariate correlation in unpaired clustered data, in which observations of two random variables are not naturally paired at the within-cluster level. We develop unpaired clustered data analogs of well-known product moment correlation coefficients (Pearson, Spearman, phi), as well as the polyserial coefficient for measuring correlation between one discrete and one continuous variable. We evaluate the performance of these coefficients via a simulation study and demonstrate their use by finding no statistically significant association between dental caries at an early age and dental fluorosis at age 13 using a large dental dataset.


Subjects
Data Interpretation, Statistical; Dental Caries/complications; Fluorosis, Dental/etiology; Adolescent; Algorithms; Child; Cluster Analysis; Databases, Factual; Dental Caries/epidemiology; Humans; Observational Studies as Topic; United States/epidemiology
16.
Stat Methods Med Res ; 20(4): 347-67, 2011 Aug.
Article in English | MEDLINE | ID: mdl-20223781

ABSTRACT

Clustered longitudinal data are often collected as repeated measures on subjects arising in clusters. Examples include periodontal disease studies, where the measurements related to the disease status of each tooth are collected over time for each patient, who can be considered as a cluster. For such applications, the number of teeth for each patient may be related to the overall oral health of the individual and hence may influence the distribution of the outcome measure of interest, leading to an informative cluster size. Under such situations, generalised estimating equations (GEE) may lead to invalid inferences. In this article, we investigate the performance of three competing proposals of fitting marginal linear models to clustered longitudinal data, namely, GEE, within-cluster resampling (WCR) and cluster-weighted generalised estimating equations (CWGEE). We show by simulations and theoretical calculations that, when the cluster size is informative, GEE provides biased estimators, while both WCR and CWGEE achieve unbiasedness under a variety of 'working' correlation structures for temporal measurements within each subject. Statistical properties of confidence intervals have been investigated using the probability-probability plots. Overall, CWGEE appears to be the recommended choice for marginal parametric inference with clustered longitudinal data that achieves similar parameter estimates and test statistics as WCR while avoiding Monte Carlo computation. The corresponding Wald tests have desirable power properties as well. We illustrate our analysis using a temporal data set on periodontal disease, which clearly demonstrates the need for CWGEE over GEE.


Subjects
Cluster Analysis; Linear Models; Longitudinal Studies/statistics & numerical data; Aged; Aged, 80 and over; Bias; Computer Simulation; Data Interpretation, Statistical; Female; Humans; Male; Models, Statistical; Periodontal Diseases/epidemiology
17.
Biometrics ; 59(1): 36-42, 2003 Mar.
Article in English | MEDLINE | ID: mdl-12762439

ABSTRACT

We propose a new approach to fitting marginal models to clustered data when cluster size is informative. This approach uses a generalized estimating equation (GEE) that is weighted inversely with the cluster size. We show that our approach is asymptotically equivalent to within-cluster resampling (WCR; Hoffman, Sen, and Weinberg, 2001, Biometrika 88, 1121-1134), a computationally intensive approach in which replicate data sets containing a randomly selected observation from each cluster are analyzed, and the resulting estimates averaged. Using simulated data and an example involving dental health, we show the superior performance of our approach compared to unweighted GEE, the equivalence of our approach with WCR for large sample sizes, and the superior performance of our approach compared with WCR when sample sizes are small.


Subjects
Cluster Analysis; Data Interpretation, Statistical; Computer Simulation; Humans; Periodontitis/epidemiology; Sample Size
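
The relationship between inverse cluster size weighting and within-cluster resampling described in the abstract above can be seen in the simplest marginal model, an intercept-only mean. The sketch below compares an unweighted mean, a mean with each observation weighted by the reciprocal of its cluster size, and a Monte Carlo WCR average. All function names and the toy data are hypothetical, and the example covers only a mean, not a full GEE with covariates.

```python
import random


def unweighted_mean(clusters):
    """Mean over all observations, so large clusters dominate."""
    values = [y for cluster in clusters for y in cluster]
    return sum(values) / len(values)


def cluster_weighted_mean(clusters):
    """Weight each observation by 1 / (size of its cluster),
    so that every cluster contributes equally."""
    numerator = sum(y / len(cluster) for cluster in clusters for y in cluster)
    denominator = sum(1 / len(cluster) for cluster in clusters for _ in cluster)
    return numerator / denominator


def wcr_mean(clusters, replicates=5000, seed=0):
    """Within-cluster resampling: repeatedly draw one observation per
    cluster, average the draws, then average over replicates."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        draws = [rng.choice(cluster) for cluster in clusters]
        total += sum(draws) / len(draws)
    return total / replicates


# Toy data in which larger clusters tend to have larger outcomes,
# i.e. cluster size is informative.
clusters = [[1.0], [1.0, 2.0], [4.0, 5.0, 6.0, 5.0, 5.0, 5.0]]
```

On this toy data the weighted mean equals the average of the three cluster means, and the WCR average approaches the same value as the number of replicates grows, whereas the unweighted mean is pulled toward the largest cluster.
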