Results 1 - 20 of 136
1.
Stat Med ; 2024 Sep 15.
Article in English | MEDLINE | ID: mdl-39278641

ABSTRACT

Trivariate joint modeling for longitudinal count data, recurrent events, and a terminal event for family data has attracted increasing interest in medical studies. For example, families with Lynch syndrome (LS) are at high risk of developing colorectal cancer (CRC), where the number of polyps and the frequency of colonoscopy screening visits are highly associated with the risk of CRC among individuals and families. To assess how screening visits influence polyp detection, which in turn influences time to CRC, we propose a clustered trivariate joint model. The proposed model accommodates longitudinal count data that are zero-inflated and over-dispersed and invokes individual-specific and family-specific random effects to account for dependence among individuals and families. We formulate our proposed model as a latent Gaussian model to use the Bayesian estimation approach with the integrated nested Laplace approximation algorithm and evaluate its performance using simulation studies. Our trivariate joint model is applied to a series of 18 families from Newfoundland, with the occurrence of CRC taken as the terminal event, the colonoscopy screening visits as recurrent events, and the number of polyps detected at each visit as zero-inflated count data with overdispersion. We showed that our trivariate model fits better than alternative bivariate models and that the cluster effects should not be ignored when analyzing family data. Finally, the proposed model enables us to quantify heterogeneity across families and individuals in polyp detection and CRC risk, thus helping to identify individuals and families who would benefit from more intensive screening visits.

2.
J Indian Soc Probab Stat ; 25: 17-45, 2024 Jun.
Article in English | MEDLINE | ID: mdl-39070705

ABSTRACT

Studies/trials assessing status and progression of periodontal disease (PD) usually focus on quantifying the relationship between the clustered (tooth within subjects) bivariate endpoints, such as probed pocket depth (PPD), and clinical attachment level (CAL) with the covariates. Although assumptions of multivariate normality can be invoked for the random terms (random effects and errors) under a linear mixed model (LMM) framework, violations of those assumptions may lead to imprecise inference. Furthermore, the response-covariate relationship may not be linear, as assumed under a LMM fit, and the regression estimates obtained therein do not provide an overall summary of the risk of PD, as obtained from the covariates. Motivated by a PD study on Gullah-speaking African-American Type-2 diabetics, we cast the asymmetric clustered bivariate (PPD and CAL) responses into a non-linear mixed model framework, where both random terms follow the multivariate asymmetric Laplace distribution (ALD). In order to provide a one-number risk summary, the possible non-linearity in the relationship is modeled via a single-index model, powered by polynomial spline approximations for index functions, and the normal mixture expression for ALD. To proceed with a maximum-likelihood inferential setup, we devise an elegant EM-type algorithm. Moreover, the large sample theoretical properties are established under some mild conditions. Simulation studies using synthetic data generated under a variety of scenarios were used to study the finite-sample properties of our estimators, and demonstrate that our proposed model and estimation algorithm can efficiently handle asymmetric, heavy-tailed data, with outliers. Finally, we illustrate our proposed methodology via application to the motivating PD study.

3.
Stat Med ; 43(17): 3264-3279, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-38822699

ABSTRACT

Researchers often estimate the association between the hazard of a time-to-event outcome and the characteristics of individuals and the clusters in which individuals are nested. Lin and Wei's robust variance estimator is often used with a Cox regression model fit to clustered data. Recently, alternative variance estimators have been proposed: the Fay-Graubard estimator, the Kauermann-Carroll estimator, and the Mancl-DeRouen estimator. Using Monte Carlo simulations, we found that, when fitting a marginal Cox regression model with both individual-level and cluster-level covariates: (i) in the presence of weak to moderate within-cluster homogeneity of outcomes, the Lin-Wei variance estimator can result in estimates of the SE with moderate bias when the number of clusters is fewer than 20-30, while in the presence of strong within-cluster homogeneity, it can result in biased estimation even when the number of clusters is as large as 100; (ii) when the number of clusters was less than approximately 20, the Fay-Graubard variance estimator tended to result in estimates of SE with the lowest bias; (iii) when the number of clusters exceeded approximately 20, the Mancl-DeRouen estimator tended to result in estimated standard errors with the lowest bias; (iv) the Mancl-DeRouen estimator used with a t-distribution tended to result in 95% confidence intervals that had the best performance of the estimators; (v) when the magnitude of within-cluster homogeneity in outcomes was strong or very strong, all methods resulted in confidence intervals with lower than advertised coverage rates even when the number of clusters was very large.


Subjects
Computer Simulation, Monte Carlo Method, Observational Studies as Topic, Proportional Hazards Models, Humans, Cluster Analysis, Observational Studies as Topic/statistics & numerical data, Bias, Multivariate Analysis, Statistical Data Interpretation
4.
Stat Med ; 43(12): 2332-2358, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38558286

ABSTRACT

In a clustered observational study, a treatment is assigned to groups and all units within the group are exposed to the treatment. We develop a new method for statistical adjustment in clustered observational studies using approximate balancing weights, a generalization of inverse propensity score weights that solve a convex optimization problem to find a set of weights that directly minimize a measure of covariate imbalance, subject to an additional penalty on the variance of the weights. We tailor the approximate balancing weights optimization problem to the clustered observational study setting by deriving an upper bound on the mean square error and finding weights that minimize this upper bound, linking the level of covariate balance to a bound on the bias. We implement the procedure by specializing the bound to a random cluster-level effects model, leading to a variance penalty that incorporates the signal-to-noise ratio and penalizes the weight on individuals and the total weight on groups differently according to the intra-class correlation.


Subjects
Statistical Models, Observational Studies as Topic, Propensity Score, Humans, Observational Studies as Topic/methods, Cluster Analysis, Computer Simulation, Bias, Research Design, Signal-to-Noise Ratio
5.
Glob Health Action ; 17(1): 2331291, 2024 Dec 31.
Article in English | MEDLINE | ID: mdl-38666727

ABSTRACT

BACKGROUND: There is a lack of empirical data on design effects (DEFF) for mortality rate for highly clustered data such as with Ebola virus disease (EVD), along with a lack of documentation of methodological limitations and operational utility of mortality estimated from cluster-sampled studies when the DEFF is high. OBJECTIVES: The objectives of this paper are to report EVD mortality rate and DEFF estimates, and discuss the methodological limitations of cluster surveys when data are highly clustered such as during an EVD outbreak. METHODS: We analysed the outputs of two independent population-based surveys conducted at the end of the 2014-2016 EVD outbreak in Bo District, Sierra Leone, in urban and rural areas. In each area, 35 clusters of 14 households were selected with probability proportional to population size. We collected information on morbidity, mortality and changes in household composition during the recall period (May 2014 to April 2015). Rates were calculated for all-cause, all-age, under-5 and EVD-specific mortality, respectively, by areas and overall. Crude and adjusted mortality rates were estimated using Poisson regression, accounting for the surveys' sample weights and the clustered design. RESULTS: Overall 980 households and 6,522 individuals participated in both surveys. A total of 64 deaths were reported, of which 20 were attributed to EVD. The crude and EVD-specific mortality rates were 0.35/10,000 person-days (95%CI: 0.23-0.52) and 0.12/10,000 person-days (95%CI: 0.05-0.32), respectively. The DEFF for EVD mortality was 5.53, and for non-EVD mortality, it was 1.53. DEFF for EVD-specific mortality was 6.18 in the rural area and 0.58 in the urban area. DEFF for non-EVD-specific mortality was 1.87 in the rural area and 0.44 in the urban area. CONCLUSION: Our findings demonstrate a high degree of clustering; this contributed to imprecise mortality estimates, which have limited utility when assessing the impact of disease. We provide DEFF estimates that can inform future cluster surveys and discuss design improvements to mitigate the limitations of surveys for highly clustered data.
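To make the role of the design effect concrete, the following is a minimal sketch, not the survey's own analysis. It assumes the common Kish approximation DEFF = 1 + (m - 1) * ICC for equal cluster sizes and reads the DEFF as the factor by which clustering shrinks the effective sample size. The function names are hypothetical, and applying the reported DEFF of 5.53 to all 6,522 participants is an illustrative simplification.

```python
def design_effect(mean_cluster_size, icc):
    """Kish approximation: DEFF = 1 + (m - 1) * ICC (assumes equal cluster sizes)."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n, deff):
    """Size of a simple random sample that would give the same precision."""
    return n / deff


if __name__ == "__main__":
    # Illustrative only: the reported EVD-mortality DEFF applied to all participants.
    print(round(effective_sample_size(6522, 5.53)))
```

Under these assumptions, a DEFF above 5 means the clustered sample carries roughly the information of a simple random sample less than one fifth its size, which is consistent with the wide confidence intervals reported for EVD-specific mortality.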


Main findings: For humanitarian organizations it is imperative to document the methodological limitations of cluster surveys and discuss the utility.
Added knowledge: This paper adds new knowledge on cluster surveys for highly clustered data such as in Ebola virus disease.
Global health impact of policy and action: We provided empirical estimates and discuss design improvements to inform future study.


Subjects
Disease Outbreaks, Ebola Virus Disease, Humans, Sierra Leone/epidemiology, Ebola Virus Disease/mortality, Ebola Virus Disease/epidemiology, Retrospective Studies, Adult, Female, Adolescent, Preschool Child, Male, Middle Aged, Young Adult, Cluster Analysis, Child, Infant, Rural Population/statistics & numerical data, Urban Population, Surveys and Questionnaires
6.
Biometrics ; 80(1)2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38488465

ABSTRACT

Age-related hearing loss has a complex etiology. Researchers have made efforts to classify relevant audiometric phenotypes, aiming to enhance medical interventions and improve hearing health. We leveraged existing pattern analyses of age-related hearing loss and implemented the phenotype classification via quadratic discriminant analysis (QDA). We herein propose a method for analyzing the exposure effects on the soft classification probabilities of the phenotypes via estimating equations. Under reasonable assumptions, the estimating equations are unbiased and lead to consistent estimators. The resulting estimator had good finite sample performances in simulation studies. As an illustrative example, we applied our proposed methods to assess the association between a dietary intake pattern, assessed as adherence scores for the dietary approaches to stop hypertension diet calculated using validated food-frequency questionnaires, and audiometric phenotypes (older-normal, metabolic, sensory, and metabolic plus sensory), determined based on data obtained in the Nurses' Health Study II Conservation of Hearing Study, the Audiology Assessment Arm. Our findings suggested that participants with a more healthful dietary pattern were less likely to develop the metabolic plus sensory phenotype of age-related hearing loss.


Subjects
Hearing Loss, Humans, Causality, Regression Analysis, Hearing Loss/diagnosis, Hearing Loss/etiology, Phenotype
7.
Stats (Basel) ; 6(2): 526-538, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37920864

ABSTRACT

The area under the true ROC curve (AUC) is routinely used to determine how strongly a given model discriminates between the levels of a binary outcome. Standard inference with the AUC requires that outcomes be independent of each other. To overcome this limitation, a method was developed for the estimation of the variance of the AUC in the setting of two-level hierarchical data using probit-transformed prediction scores generated from generalized estimating equation models, thereby allowing for the application of inferential methods. This manuscript presents an extension of this approach so that inference for the AUC may be performed in a three-level hierarchical data setting (e.g., eyes nested within persons and persons nested within families). A method that accounts for the effect of tied prediction scores on inference is also described. The performance of 95% confidence intervals around the AUC was assessed through the simulation of three-level clustered data in multiple settings, including ones with tied data and variable cluster sizes. Across all settings, the actual 95% confidence interval coverage varied from 0.943 to 0.958, and the ratio of the theoretical variance to the empirical variance of the AUC varied from 0.920 to 1.013. The results are better than those from existing methods. Two examples of applying the proposed methodology are presented.

8.
Psychometrika ; 88(4): 1171-1196, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37874510

ABSTRACT

Optimal treatment regimes (OTRs) have been widely employed in computer science and personalized medicine to provide data-driven, optimal recommendations to individuals. However, previous research on OTRs has primarily focused on settings that are independent and identically distributed, with little attention given to the unique characteristics of educational settings, where students are nested within schools and there are hierarchical dependencies. The goal of this study is to propose a framework for designing OTRs from multisite randomized trials, a commonly used experimental design in education and psychology to evaluate educational programs. We investigate modifications to popular OTR methods, specifically Q-learning and weighting methods, in order to improve their performance in multisite randomized trials. A total of 12 modifications, 6 for Q-learning and 6 for weighting, are proposed by utilizing different multilevel models, moderators, and augmentations. Simulation studies reveal that all Q-learning modifications improve performance in multisite randomized trials and the modifications that incorporate random treatment effects show the most promise in handling cluster-level moderators. Among weighting methods, the modification that incorporates cluster dummies into moderator variables and augmentation terms performs best across simulation conditions. The proposed modifications are demonstrated through an application to estimate an OTR of conditional cash transfer programs using a multisite randomized trial in Colombia to maximize educational attainment.


Subjects
Policy, Research Design, Humans, Psychometrics, Randomized Controlled Trials as Topic, Computer Simulation
9.
Biometrika ; 110(3): 645-662, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37711671

ABSTRACT

The micro-randomized trial (MRT) is a sequential randomized experimental design to empirically evaluate the effectiveness of mobile health (mHealth) intervention components that may be delivered at hundreds or thousands of decision points. MRTs have motivated a new class of causal estimands, termed "causal excursion effects", for which semiparametric inference can be conducted via a weighted, centered least squares criterion (Boruvka et al., 2018). Existing methods assume between-subject independence and non-interference. Deviations from these assumptions often occur. In this paper, causal excursion effects are revisited under potential cluster-level treatment effect heterogeneity and interference, where the treatment effect of interest may depend on cluster-level moderators. Utility of the proposed methods is shown by analyzing data from a multi-institution cohort of first year medical residents in the United States.

10.
Stat Med ; 42(24): 4333-4348, 2023 Oct 30.
Article in English | MEDLINE | ID: mdl-37548059

ABSTRACT

Clustered data are common in biomedical research. Observations in the same cluster are often more similar to each other than to observations from other clusters. The intraclass correlation coefficient (ICC), first introduced by R. A. Fisher, is frequently used to measure this degree of similarity. However, the ICC is sensitive to extreme values and skewed distributions, and depends on the scale of the data. It is also not applicable to ordered categorical data. We define the rank ICC as a natural extension of Fisher's ICC to the rank scale, and describe its corresponding population parameter. The rank ICC is simply interpreted as the rank correlation between a random pair of observations from the same cluster. We also extend the definition when the underlying distribution has more than two hierarchies. We describe estimation and inference procedures, show the asymptotic properties of our estimator, conduct simulations to evaluate its performance, and illustrate our method in three real data examples with skewed data, count data, and three-level ordered categorical data.
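The interpretation of the rank ICC as "the rank correlation between a random pair of observations from the same cluster" can be illustrated with a naive plug-in calculation. The sketch below is not the authors' estimator: it is an assumption-laden simplification that assigns mid-ranks across all observations, forms every ordered within-cluster pair, weights all pairs equally, and computes the Pearson correlation of the paired ranks. The function names are hypothetical.

```python
from itertools import permutations


def midranks(values):
    """Ranks from 1..n, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def naive_rank_icc(clusters):
    """Correlation of ranks over all ordered pairs drawn from the same cluster."""
    flat = [x for cluster in clusters for x in cluster]
    ranks = midranks(flat)
    pairs, start = [], 0
    for cluster in clusters:
        cluster_ranks = ranks[start:start + len(cluster)]
        start += len(cluster)
        pairs.extend(permutations(cluster_ranks, 2))  # singletons add no pairs
    # Ordered pairs are symmetric, so both coordinates share one mean and variance.
    mean = sum(a for a, _ in pairs) / len(pairs)
    cov = sum((a - mean) * (b - mean) for a, b in pairs) / len(pairs)
    var = sum((a - mean) ** 2 for a, _ in pairs) / len(pairs)
    return cov / var


if __name__ == "__main__":
    print(naive_rank_icc([[1, 2], [3, 4]]))
```

Because only ranks enter the calculation, the result is unchanged by any monotone transformation of the data, which is the scale-invariance property the abstract highlights. The sketch fails if no cluster has two or more observations or if all paired ranks are identical.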

11.
Stat Med ; 42(21): 3745-3763, 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37593802

ABSTRACT

Hierarchical data arise when observations are clustered into groups. Multilevel models are practically useful in these settings, but these models are elusive in the context of hierarchical data with mixed multivariate outcomes. In this article, we consider binary and survival outcomes and assume the hierarchical structure is induced by clustering of both outcomes within patients and clustering of patients within hospitals, which frequently occurs in multicenter studies. We introduce a multilevel joint frailty model that analyzes the outcomes simultaneously to jointly estimate their regression parameters and explicitly model within-patient correlation between the outcomes and within-hospital correlation separately for each outcome. Estimation is facilitated by a computationally efficient residual maximum likelihood method that further predicts cluster-specific frailties for both outcomes and circumvents the formidable challenges induced by multidimensional integration that complicates the underlying likelihood. The performance of the model and estimation procedure is investigated via extensive simulation studies. The practical utility of the model is illustrated through simultaneous modeling of disease-free survival and binary endpoint of platelet recovery in a multicenter allogeneic bone marrow transplantation dataset that motivates this study.


Subjects
Bone Marrow Transplantation, Frailty, Joints, Cluster Analysis, Humans, Computer Simulation, Disease-Free Survival
12.
Biometrics ; 79(4): 3764-3777, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37459181

ABSTRACT

Continuous response data are regularly transformed to meet regression modeling assumptions. However, approaches taken to identify the appropriate transformation can be ad hoc and can increase model uncertainty. Further, the resulting transformations often vary across studies leading to difficulties with synthesizing and interpreting results. When a continuous response variable is measured repeatedly within individuals or when continuous responses arise from clusters, analyses have the additional challenge caused by within-individual or within-cluster correlations. We extend a widely used ordinal regression model, the cumulative probability model (CPM), to fit clustered, continuous response data using generalized estimating equations for ordinal responses. With the proposed approach, estimates of marginal model parameters, cumulative distribution functions, expectations, and quantiles conditional on covariates can be obtained without pretransformation of the response data. While computational challenges arise with large numbers of distinct values of the continuous response variable, we propose feasible and computationally efficient approaches to fit CPMs under commonly used working correlation structures. We study finite sample operating characteristics of the estimators via simulation and illustrate their implementation with two data examples. One studies predictors of CD4:CD8 ratios in a cohort living with HIV, and the other investigates the association of a single nucleotide polymorphism and lung function decline in a cohort with early chronic obstructive pulmonary disease.


Subjects
Statistical Models, Humans, Computer Simulation, Probability, Uncertainty
13.
J Appl Stat ; 50(10): 2228-2245, 2023.
Article in English | MEDLINE | ID: mdl-37434628

ABSTRACT

Group testing study designs have been used since the 1940s to reduce screening costs for uncommon diseases; for rare diseases, all cases are identifiable with substantially fewer tests than the population size. Substantial research has identified efficient designs under this paradigm. However, little work has focused on the important problem of disease screening among clustered data, such as geographic heterogeneity in HIV prevalence. We evaluated designs where we first estimate disease prevalence and then apply efficient group testing algorithms using these estimates. Specifically, we evaluate prevalence using individual testing on a fixed-size subset of each cluster and use these prevalence estimates to choose group sizes that minimize the corresponding estimated average number of tests per subject. We compare designs where we estimate cluster-specific prevalences as well as a common prevalence across clusters, use different group testing algorithms, construct groups from individuals within and in different clusters, and consider misclassification. For diseases with low prevalence, our results suggest that accounting for clustering is unnecessary. However, for diseases with higher prevalence and sizeable between-cluster heterogeneity, accounting for clustering in study design and implementation improves efficiency. We consider the practical aspects of our design recommendations with two examples with strong clustering effects: (1) Identification of HIV carriers in the US population and (2) Laboratory screening of anti-cancer compounds using cell lines.
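The step of choosing a group size that minimizes the estimated average number of tests per subject can be sketched for one classical algorithm. The example below assumes two-stage Dorfman testing with a perfect assay (no misclassification), for which the expected number of tests per subject with group size k and prevalence p is 1/k + 1 - (1 - p)^k. This is only one of the algorithms the abstract alludes to, and the function names and the search limit `max_size` are hypothetical.

```python
def dorfman_tests_per_subject(prevalence, group_size):
    """Expected tests per subject under two-stage Dorfman testing, perfect assay."""
    if group_size == 1:
        return 1.0  # individual testing: exactly one test per subject
    pooled_test = 1.0 / group_size
    retest_probability = 1.0 - (1.0 - prevalence) ** group_size
    return pooled_test + retest_probability


def best_group_size(prevalence, max_size=50):
    """Group size in 1..max_size with the fewest expected tests per subject."""
    candidates = (
        (k, dorfman_tests_per_subject(prevalence, k)) for k in range(1, max_size + 1)
    )
    return min(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    for estimated_prevalence in (0.01, 0.05, 0.50):
        print(estimated_prevalence, best_group_size(estimated_prevalence))
```

In a clustered design of the kind described, `best_group_size` could be called once with a common prevalence estimate or separately with each cluster-specific estimate. At high prevalence the search returns a group size of 1, meaning pooling no longer saves tests under these assumptions.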

14.
Stat Methods Med Res ; 32(7): 1284-1299, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37303120

ABSTRACT

Real-world data sources offer opportunities to compare the effectiveness of treatments in practical clinical settings. However, relevant outcomes are often recorded selectively and collected at irregular measurement times. It is therefore common to convert the available visits to a standardized schedule with equally spaced visits. Although more advanced imputation methods exist, they are not designed to recover longitudinal outcome trajectories and typically assume that missingness is non-informative. We, therefore, propose an extension of multilevel multiple imputation methods to facilitate the analysis of real-world outcome data that is collected at irregular observation times. We illustrate multilevel multiple imputation in a case study evaluating two disease-modifying therapies for multiple sclerosis in terms of time to confirmed disability progression. This survival outcome is derived from repeated measurements of the Expanded Disability Status Scale, which is collected when patients come to the healthcare center for a clinical visit and for which longitudinal trajectories can be estimated. Subsequently, we perform a simulation study to compare the performance of multilevel multiple imputation to commonly used single imputation methods. Results indicate that multilevel multiple imputation leads to less biased treatment effect estimates and improves the coverage of confidence intervals, even when outcomes are missing not at random.


Subjects
Multiple Sclerosis, Humans, Multiple Sclerosis/drug therapy, Research Design, Statistical Data Interpretation, Computer Simulation
15.
Stat Med ; 42(19): 3443-3466, 2023 Aug 30.
Article in English | MEDLINE | ID: mdl-37308115

ABSTRACT

Across research disciplines, cluster randomized trials (CRTs) are commonly implemented to evaluate interventions delivered to groups of participants, such as communities and clinics. Despite advances in the design and analysis of CRTs, several challenges remain. First, there are many possible ways to specify the causal effect of interest (eg, at the individual-level or at the cluster-level). Second, the theoretical and practical performance of common methods for CRT analysis remain poorly understood. Here, we present a general framework to formally define an array of causal effects in terms of summary measures of counterfactual outcomes. Next, we provide a comprehensive overview of CRT estimators, including the t-test, generalized estimating equations (GEE), augmented-GEE, and targeted maximum likelihood estimation (TMLE). Using finite sample simulations, we illustrate the practical performance of these estimators for different causal effects and when, as commonly occurs, there are limited numbers of clusters of different sizes. Finally, our application to data from the Preterm Birth Initiative (PTBi) study demonstrates the real-world impact of varying cluster sizes and targeting effects at the cluster-level or at the individual-level. Specifically, the relative effect of the PTBi intervention was 0.81 at the cluster-level, corresponding to a 19% reduction in outcome incidence, and was 0.66 at the individual-level, corresponding to a 34% reduction in outcome risk. Given its flexibility to estimate a variety of user-specified effects and ability to adaptively adjust for covariates for precision gains while maintaining Type-I error control, we conclude TMLE is a promising tool for CRT analysis.
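The distinction between cluster-level and individual-level effects can be made concrete with unadjusted summaries. The sketch below is not TMLE, GEE, or any estimator from the paper: it simply contrasts the mean of cluster-specific proportions (each cluster weighted equally) with the pooled proportion (each participant weighted equally). The data and function names are hypothetical.

```python
def cluster_level_risk(clusters):
    """Mean of cluster-specific proportions: every cluster counts equally."""
    return sum(events / size for size, events in clusters) / len(clusters)


def individual_level_risk(clusters):
    """Pooled proportion: every participant counts equally."""
    total_events = sum(events for _, events in clusters)
    total_size = sum(size for size, _ in clusters)
    return total_events / total_size


def relative_effect(treated, control, risk):
    """Ratio of summary risks, intervention arm over control arm."""
    return risk(treated) / risk(control)


if __name__ == "__main__":
    # Hypothetical arms, each cluster given as (cluster size, number of events).
    treated = [(10, 2), (100, 50)]
    control = [(10, 4), (100, 50)]
    print(relative_effect(treated, control, cluster_level_risk))
    print(relative_effect(treated, control, individual_level_risk))
```

With these made-up numbers the intervention helps only in the small cluster, so the cluster-level ratio moves further from 1 than the individual-level ratio. When cluster sizes vary and effects differ by size, the two targets can diverge in either direction, as in the PTBi results quoted above.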


Subjects
Premature Birth, Newborn Infant, Female, Humans, Computer Simulation, Randomized Controlled Trials as Topic, Sample Size, Causality, Cluster Analysis
16.
Entropy (Basel) ; 25(6)2023 May 28.
Article in English | MEDLINE | ID: mdl-37372207

ABSTRACT

Multilevel semicontinuous data occur frequently in medical, environmental, insurance and financial studies. Such data are often measured with covariates at different levels; however, these data have traditionally been modelled with covariate-independent random effects. Ignoring dependence of cluster-specific random effects and cluster-specific covariates in these traditional approaches may lead to ecological fallacy and result in misleading results. In this paper, we propose a Tweedie compound Poisson model with covariate-dependent random effects to analyze multilevel semicontinuous data where covariates at different levels are incorporated at relevant levels. The estimation of our models has been developed based on the orthodox best linear unbiased predictor of random effect. Explicit expressions of random effects predictors facilitate computation and interpretation of our models. Our approach is illustrated through the analysis of the basic symptoms inventory study data where 409 adolescents from 269 families were observed a varying number of times, from 1 to 17. The performance of the proposed methodology was also examined through simulation studies.

17.
Stat Methods Med Res ; 32(8): 1494-1510, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37323013

ABSTRACT

Multistate current status data presents a more severe form of censoring due to the single observation of study participants transitioning through a sequence of well-defined disease states at random inspection times. Moreover, these data may be clustered within specified groups, and informativeness of the cluster sizes may arise due to the existing latent relationship between the transition outcomes and the cluster sizes. Failure to adjust for this informativeness may lead to a biased inference. Motivated by a clinical study of periodontal disease, we propose an extension of the pseudo-value approach to estimate covariate effects on the state occupation probabilities for these clustered multistate current status data with informative cluster or intra-cluster group sizes. In our approach, the proposed pseudo-value technique initially computes marginal estimators of the state occupation probabilities utilizing nonparametric regression. Next, the estimating equations based on the corresponding pseudo-values are reweighted by functions of the cluster sizes to adjust for informativeness. We perform a variety of simulation studies to study the properties of our pseudo-value regression based on the nonparametric marginal estimators under different scenarios of informativeness. For illustration, the method is applied to the motivating periodontal disease dataset, which encapsulates the complex data-generation mechanism.


Subjects
Statistical Models, Periodontal Diseases, Humans, Cluster Analysis, Computer Simulation, Periodontal Diseases/epidemiology, Sample Size
18.
J Appl Stat ; 50(8): 1836-1852, 2023.
Article in English | MEDLINE | ID: mdl-37260471

ABSTRACT

Although under-five mortality (U5M) rates have declined worldwide, many countries in sub-Saharan Africa still have much higher rates. Detection of subnational areas with unusually high U5M rates could support targeted high impact child health interventions. We propose a novel group outlier detection statistic for identifying areas with extreme U5M rates under a multivariate survival data model. The performance of the proposed statistic was evaluated through a simulation study. We applied the proposed method to an analysis of child survival data in Malawi to identify sub-districts with unusually high or low U5M rates. The simulation study showed that the proposed outlier statistic can detect unusually high or low mortality groups with a high accuracy of at least 90%, for datasets with at least 50 clusters of size 80 or more. In the application, at most 7 U5M outlier sub-districts were identified, based on the best fitting model as measured by the Akaike information criterion (AIC).

19.
Biom J ; 65(8): e2300123, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37377083

ABSTRACT

The formula of Fleiss and Cuzick (1979) for estimating the intraclass correlation coefficient is applied to simplify sample size calculation for clustered data with a binary outcome. It is demonstrated that this approach reduces the complexity of sample size calculation to specifying the null and alternative hypotheses and quantifying how belonging to the same cluster influences the probability of therapy success.
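For orientation, the Fleiss and Cuzick estimator of the intraclass correlation for binary outcomes can be sketched as follows. This uses the form in which the estimator is commonly stated, 1 - [sum_i Y_i (n_i - Y_i) / n_i] / [(N - k) * p * (1 - p)], where Y_i is the number of successes in cluster i of size n_i, N is the total sample size, k is the number of clusters, and p is the pooled success proportion. The sample size procedure built on it in the paper is not reproduced here, and the function name is hypothetical.

```python
def fleiss_cuzick_icc(clusters):
    """ICC estimate for binary data; clusters is a list of (successes, cluster_size)."""
    k = len(clusters)
    total_n = sum(size for _, size in clusters)
    pooled_p = sum(successes for successes, _ in clusters) / total_n
    # Within-cluster disagreement: zero when every cluster is all-success or all-failure.
    within = sum(y * (n - y) / n for y, n in clusters)
    return 1.0 - within / ((total_n - k) * pooled_p * (1.0 - pooled_p))


if __name__ == "__main__":
    # Hypothetical data: successes and sizes for four clusters.
    print(fleiss_cuzick_icc([(4, 5), (1, 5), (5, 5), (0, 5)]))
```

The estimate equals 1 when outcomes are identical within every cluster. It is undefined when the pooled proportion is 0 or 1, or when every cluster has a single member, so those cases would need separate handling in practice.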


Subjects
Research Design, Sample Size, Probability, Cluster Analysis
20.
J Appl Stat ; 50(9): 1921-1941, 2023.
Article in English | MEDLINE | ID: mdl-37378268

ABSTRACT

Clustered current status data are frequently encountered in biomedical research and other areas that require survival analysis. This paper proposes graphical and formal model assessment procedures to evaluate the goodness of fit of the additive hazards model to clustered current status data. The test statistics proposed are based on sums of martingale-based residuals. Relevant asymptotic properties are established, and empirical distributions of the test statistics can be simulated utilizing Gaussian multipliers. Extensive simulation studies confirmed that the proposed test procedures work well for practical scenarios. This proposed method applies when failure times within the same cluster are correlated, and in particular, when cluster sizes can be informative about intra-cluster correlations. The method is applied to analyze clustered current status data from a lung tumorigenicity study.
