ABSTRACT
Controlled clinical trials in neuropsychopharmacology, as in numerous other clinical research domains, tend to employ a conventional parallel-groups design with repeated measurements. The hypothesis of primary interest in these relatively short-term, double-blind trials concerns the difference between patterns or magnitudes of change from baseline. A simple two-stage approach to the analysis of such data involves calculating an index or coefficient of change in stage 1 and testing the significance of the difference between group means on the derived measure of change in stage 2. This article introduces formulas and a computer program for sample size and/or power calculations for such two-stage analyses involving each of three definitions of change, with or without baseline scores entered as a covariate, in the presence of homogeneous or heterogeneous (autoregressive) patterns of correlation among the repeated measurements. The computer program also provides empirical adjustments of sample size for projected dropout rates.
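The two-stage approach described above can be sketched as follows. This is a minimal illustration only, not the authors' program: it assumes endpoint-minus-baseline as the change index (the abstract does not spell out its three definitions of change), uses a pooled-variance t statistic for stage 2, and the data values and function names are invented.

```python
from math import sqrt
from statistics import mean, variance

def endpoint_change(scores):
    """Stage 1 (assumed definition): last measurement minus baseline."""
    return scores[-1] - scores[0]

def pooled_t(a, b):
    """Stage 2: pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical repeated measurements (baseline first) for two small groups.
drug = [[20, 18, 15, 12], [22, 20, 19, 15], [19, 17, 14, 13], [21, 18, 16, 12]]
placebo = [[20, 19, 19, 18], [21, 21, 20, 18], [19, 18, 18, 17], [22, 21, 20, 20]]

drug_change = [endpoint_change(s) for s in drug]
placebo_change = [endpoint_change(s) for s in placebo]
t_stat, df = pooled_t(drug_change, placebo_change)
```

Entering baseline as a covariate would turn stage 2 into an ANCOVA; that step is omitted here to keep the sketch short.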
Subjects
Controlled Clinical Trials as Topic/statistics & numerical data, Software, Humans, Sample Size, Schizophrenia/drug therapy
ABSTRACT
A project that originated with the aim of documenting the implications of dropouts for tests of significance based on general linear mixed model procedures resulted in recognition of problems in the use of SAS PROC.MIXED for this purpose. In responding to suggestions and criticisms, we have further analyzed simulated clinical trial data with realistic autoregressive structure, using alternative error model formulations, different approaches to the use of covariates to model dropout patterns, and different ways to include the critical time variable in the mixed model. Results emphasize the sensitivity of the PROC.MIXED tests of significance for GROUP and TIME x GROUP equal slopes hypothesis to less than optimal modeling of the error covariance structure. Even with the authoritatively recommended best available modeling of the error structure, model formulations that made use of the REPEATED statement did not maintain conservative test sizes when covariates were required to model dropout data patterns. Random coefficients models that employed the RANDOM statement did permit appropriate covariate controls, but the tests of significance for treatment effects were lacking in power. After examining a variety of alternative PROC.MIXED model formulations, it is concluded that none provided both Type I error protection and power comparable to that of simple two-stage analysis of covariance (ANCOVA) procedures for confirming the presence of true treatment effects in controlled clinical trials. Other issues examined in this article concern treating baseline scores as both covariate and initial repeated measurement to which a linear means model is fitted, failure to take advantage of the regression of repeated measurements on time in modeling time as an unordered categorical variable, and fitting linear regression models to nonlinear response patterns.
Subjects
Controlled Clinical Trials as Topic/statistics & numerical data, Algorithms, Computer Simulation, Statistical Models, Nonlinear Dynamics, Patient Dropouts, Research Design, Software, Treatment Outcome
ABSTRACT
The power of univariate and multivariate tests of significance is compared in relation to linear and nonlinear patterns of treatment effects in a repeated measurement design. Bonferroni correction was used to control the experiment-wise error rate in combining results from univariate tests of significance accomplished separately on average level, linear, quadratic, and cubic trend components. Multivariate tests on these same components of the overall treatment effect, as well as a multivariate test for between-groups difference on the original repeated measurements, were also evaluated for power against the same representative patterns of treatment effects. Results emphasize the advantage of parsimony that is achieved by transforming multiple repeated measurements into a reduced set of meaningful composite variables representing average levels and rates of change. The Bonferroni correction applied to the separate univariate tests provided experiment-wise protection against Type I error, produced slightly greater experiment-wise power than a multivariate test applied to the same components of the data patterns, and provided substantially greater power than a multivariate test on the complete set of original repeated measurements. The separate univariate tests provide interpretive advantage regarding locus of the treatment effects.
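As an illustration of the data reduction described above, the sketch below transforms one subject's repeated measurements into average-level, linear, quadratic, and cubic components and computes a Bonferroni-adjusted per-test alpha. It assumes four equally spaced measurements and the standard tabled orthogonal polynomial coefficients for that case; the function names are hypothetical and the abstract does not state the design actually simulated.

```python
# Orthogonal polynomial contrast coefficients for 4 equally spaced
# measurements (standard tabled values; the design size is an assumption).
CONTRASTS = {
    "linear": (-3, -1, 1, 3),
    "quadratic": (1, -1, -1, 1),
    "cubic": (-1, 3, -3, 1),
}

def trend_components(scores):
    """Reduce one subject's 4 repeated measurements to composite scores."""
    if len(scores) != 4:
        raise ValueError("this sketch assumes exactly 4 measurements")
    components = {"average": sum(scores) / 4}
    for name, coefs in CONTRASTS.items():
        components[name] = sum(c * y for c, y in zip(coefs, scores))
    return components

def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha that keeps the experiment-wise rate at or below alpha."""
    return alpha / n_tests

example = trend_components([1, 2, 3, 4])   # a purely linear profile
per_test_alpha = bonferroni_alpha(0.05, 4)  # four univariate tests
```

Each component would then be compared between groups with its own univariate test at the adjusted alpha.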
Subjects
Controlled Clinical Trials as Topic/statistics & numerical data, Analysis of Variance, Humans, Statistical Models, Monte Carlo Method, Multivariate Analysis, Randomized Controlled Trials as Topic/statistics & numerical data, Research Design
ABSTRACT
The work reported in this article was undertaken to evaluate the utility of SAS PROC.MIXED for testing hypotheses concerning GROUP and TIME x GROUP effects in repeated measurements designs with drop-outs. If dropouts are not completely at random, covariate control over informative individual differences on which dropout data patterns depend is widely recognized to be important. However, the inclusion of baseline scores and time-in-study as between-subject covariates in an otherwise well formulated SAS PROC.MIXED model resulted in inadequate control over type I error in simulated data with or without drop-outs present. The inadequate model formulations and resulting deviant test sizes are presented here as a warning for others who might be guided by the same information sources to employ similar model specifications when analyzing data from actual clinical trials. It is important that the complete model specification be provided in detail when reporting applications of the general linear mixed-model procedure. A single random-coefficients model produced appropriate test sizes, but it provided inferior power when informative covariates were added in the attempt to adjust for dropouts. As an alternative, the incorporation of covariate controls in simpler two-stage endpoint or random regression analyses is documented to be effective in dealing with dropouts under specifiable conditions.
Subjects
Statistical Models, Software, Biopharmaceutics/statistics & numerical data, Controlled Clinical Trials as Topic/methods, Controlled Clinical Trials as Topic/statistics & numerical data, Humans, Linear Models, Monte Carlo Method, Multivariate Analysis, Psychopharmacology/statistics & numerical data, Regression Analysis
ABSTRACT
A two-stage mixed model analysis of repeated measurement calculates participant-specific regression slopes relating change in available measurements to associated assessment times, and then the difference between mean regression slopes in two or more treatment groups is tested for significance against the within-groups variability of the participant-specific regression slopes. It is not necessary that all participants have the same schedule or number of repeated measurements. However, when dropouts are included in an "intent to treat" analysis, the shortened treatment exposures for the dropouts substantially increase variability and reduce power of tests for differences in rates of change. Previous work has suggested that normalizing the time scale to unit length for all participants prior to fitting the individual regression equations materially reduces the power attenuation produced by dropouts. This article reports a more detailed evaluation of the enhanced robustness against dropouts that is achieved by rescaling the time dimension. The robust analysis is recognized to be equivalent to weighting ordinary least squares regression on the original time scale by the duration of treatment for each participant. Slope coefficients calculated across a shortened time span for dropouts are less stable, so they are given less weight in defining the (linear) treatment effects.
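The effect of rescaling time can be checked numerically. In the sketch below (illustrative data and hypothetical function names), an ordinary least squares slope fitted on a time scale normalized to unit length equals the slope on the original scale multiplied by the participant's duration in treatment, which is the sense in which short-exposure dropouts receive less weight.

```python
from math import isclose

def ols_slope(times, scores):
    """Ordinary least squares slope of scores regressed on times."""
    t_bar = sum(times) / len(times)
    y_bar = sum(scores) / len(scores)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, scores))
    sxx = sum((t - t_bar) ** 2 for t in times)
    return sxy / sxx

def normalized_slope(times, scores):
    """Slope after rescaling this participant's time axis to unit length."""
    duration = times[-1] - times[0]
    unit_times = [(t - times[0]) / duration for t in times]
    return ols_slope(unit_times, scores)

# Hypothetical dropout observed at weeks 0, 1, 2 (duration of 2 weeks).
weeks = [0, 1, 2]
ratings = [10, 8, 6]
raw = ols_slope(weeks, ratings)            # change per week
scaled = normalized_slope(weeks, ratings)  # change per unit of normalized time
duration = weeks[-1] - weeks[0]
```

Stage 2 would compare group means of the normalized slopes, with any covariates the analyst chooses to add.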
Subjects
Outcome and Process Assessment (Health Care)/statistics & numerical data, Patient Dropouts/statistics & numerical data, Psychotherapy/statistics & numerical data, Bias, Humans, Mathematical Computing, Psychometrics, Regression Analysis
ABSTRACT
Statistical models for calculating sample sizes for controlled clinical trials often fail to take into account the negative impact that dropouts have on the power of intent-to-treat analyses. Empirically defined dropout correction coefficients are proposed to adjust sample sizes for endpoint analysis of variance (ANOVA) and analysis of covariance (ANCOVA) that have been initially calculated assuming complete data. The implications of type of analysis (change-score ANOVA or ANCOVA), correlational structure of the repeated measurements (compound symmetry or autoregressive), and percentage of dropouts (20% or 30%) are considered, together with other less influential design and data parameters. We recommend the use of ANCOVA to correct for baseline differences and for time-in-study if there is a nonspecific change across time. Given a realistic autoregressive (order 1) correlational structure for the repeated measurements and a proposed endpoint ANCOVA, the empirical results support the common practice of increasing calculated sample size by the anticipated number of dropouts. The previous rationale has been to retain a requisite number of "completers" on which to base statistical inferences. We believe the present results provide the first documentation of the relevance of that strategy for intent-to-treat analyses in which the incomplete data for dropouts must be included. Based on comparative power analyses, the strategy also seems appropriate for maintaining the power of mixed-model regression analyses, simple regression on a normalized time scale, and analyses of trends fitted to imputed scores for dropouts.
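One common reading of "increasing calculated sample size by the anticipated number of dropouts" is to enroll enough participants that the expected number of completers equals the complete-data sample size. The sketch below implements that reading only; it is an assumption, not the empirically defined correction coefficients proposed in the abstract, and the function name is hypothetical.

```python
def inflate_for_dropouts(n_complete, dropout_percent):
    """Smallest enrollment whose expected completers reach n_complete.

    Assumed rule: n_enrolled = n_complete / (1 - dropout rate), rounded up.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    if not 0 <= dropout_percent < 100:
        raise ValueError("dropout_percent must be in [0, 100)")
    remaining = 100 - dropout_percent
    return -(-n_complete * 100 // remaining)  # ceiling division

# Hypothetical complete-data requirement of 64 per group.
n_at_20 = inflate_for_dropouts(64, 20)
n_at_30 = inflate_for_dropouts(64, 30)
```

With these inputs the enrollment targets are 80 and 92 per group for 20% and 30% dropouts, respectively.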
Subjects
Clinical Trials as Topic/methods, Clinical Trials as Topic/statistics & numerical data, Research Design, Sampling Studies
ABSTRACT
Two equations for calculating sample sizes that are required for power in testing differences in rates of change in repeated measurement designs have been presented by different authors. One equation provides support for the conclusion that increased frequency of measurements across a treatment period of fixed duration enhances power of the tests. The other equation supports the counterintuitive conclusion that increased frequency of measurements actually tends to decrease power in the presence of realistic serial dependencies in the data. Monte Carlo methods confirm that the equation providing support for the latter conclusion is accurate, whereas the alternative equation tends to underestimate sample sizes required for power in testing differences in slopes of regression lines fitted to changes in the repeated measurements across time when symmetry is absent from the covariance structure.
Subjects
Clinical Trials as Topic/statistics & numerical data, Statistical Models, Sample Size, Humans, Monte Carlo Method, Reproducibility of Results
ABSTRACT
Changes in GABA function have been postulated to be involved in alcohol tolerance, withdrawal and addiction. In this study we measured regional brain metabolic responses to lorazepam, to indirectly assess GABA function (benzodiazepines facilitate GABAergic neurotransmission), in alcoholics during early and late withdrawal. Brain metabolism was measured using PET and 2-deoxy-2[18F]fluoro-D-glucose after placebo (baseline) and after lorazepam (30 micrograms/kg intravenously) in 10 alcoholics and 16 controls. In the alcoholics, evaluations were performed 2 to 3 weeks after detoxification and were repeated 6 to 8 weeks later. Controls were also evaluated twice, at a 6- to 8-week interval. During the initial evaluation, metabolism was significantly lower for most brain regions in the alcoholics than in controls, whereas in the repeated evaluation the only significant differences were in cingulate and orbitofrontal cortex. Lorazepam-induced decrements in metabolism did not change with protracted alcohol withdrawal, and the magnitude of these changes was similar in controls and alcoholics except for a trend towards a blunted response to lorazepam in orbitofrontal cortex in alcoholics during the second evaluation. Abnormalities in orbitofrontal cortex and cingulate gyrus in alcoholics are unlikely to be due to withdrawal since they persist 8 to 11 weeks after detoxification. The fact that there was only a trend of significance for an abnormal response to lorazepam in orbitofrontal cortex indicates that mechanisms other than GABA are involved in the brain metabolic abnormalities observed in alcoholic subjects.
Subjects
Alcohol Withdrawal Delirium/rehabilitation, Alcoholism/rehabilitation, Brain/drug effects, Energy Metabolism/drug effects, GABA Modulators/pharmacology, Lorazepam/pharmacology, Adult, Alcohol Withdrawal Delirium/physiopathology, Alcoholism/physiopathology, Blood Glucose/metabolism, Brain/physiopathology, Brain Mapping, Energy Metabolism/physiology, Fluorodeoxyglucose F18/metabolism, Frontal Lobe/drug effects, Frontal Lobe/physiopathology, Gyrus Cinguli/drug effects, Gyrus Cinguli/physiopathology, Humans, Male, Middle Aged, Neuropsychological Tests, Prefrontal Cortex/drug effects, Prefrontal Cortex/physiopathology, Reference Values, Emission-Computed Tomography
ABSTRACT
The implications of drop-outs for power of random regression model (RRM) tests of significance for differences in the rate of change produced by two treatments in a randomized parallel-groups design were investigated by Monte Carlo simulation methods. The two-stage RRM fitted a least squares linear regression equation to all of the available data for each individual, and then ANOVA or ANCOVA tests of significance were applied to the resulting slope coefficients. The tests of significance were adequately protected against type I error, but power was seriously eroded by the presence of drop-outs. Simple endpoint analyses with baseline and time-in-treatment covaried proved more robust against the power degradations.
Subjects
Statistical Models, Patient Dropouts, Randomized Controlled Trials as Topic/methods, Regression Analysis, Analysis of Variance, Computer Simulation, Humans, Monte Carlo Method, Multivariate Analysis
ABSTRACT
OBJECTIVE: To report results from a long-term prospective study of safety of haloperidol treatment and prevalence of haloperidol-related dyskinesias. METHOD: Subjects were children with autism requiring pharmacotherapy for target symptoms. After baseline assessments, children received haloperidol treatment; responders requiring further treatment were considered for enrollment into the present study. Six-month haloperidol treatment periods were followed by a 4-week placebo period. The procedure was repeated if further haloperidol treatment was required. At specified times children were evaluated by using multiple instruments. RESULTS: Between 1979 and 1994, 118 children aged 2.3 to 8.2 years participated in the study. The mean dose of haloperidol was 1.75 mg/day. Mainly withdrawal dyskinesias (WD) developed in 40 (33.9%) children; 20 had more than one dyskinetic episode. A subgroup that remained significantly longer in the study and had a significantly higher cumulative dose of haloperidol evidenced a significantly higher incidence of WD. Occurrence rates of tardive dyskinesia (TD) and multiple episodes of TD/WD were higher among girls. CONCLUSION: Female gender and pre- and perinatal complications may be involved in susceptibility to dyskinesias; greater cumulative haloperidol dose and/or longer exposure to haloperidol may increase the risk.
Subjects
Antipsychotic Agents/adverse effects, Autistic Disorder/drug therapy, Drug-Induced Dyskinesia, Haloperidol/adverse effects, Child, Preschool Child, Female, Humans, Longitudinal Studies, Male, Prospective Studies
ABSTRACT
There exists a current interest in the application of survival analysis methodology to evaluate differences in latencies of response to psychological or psychopharmacological treatment modalities. However, unreliability in the measurement of treatment responses in such research poses a problem. Two methods of defining the "discrete endpoint" that is required for survival analysis are compared regarding power of tests of significance for differences in survival curves. Discrete endpoints defined by regression equations fitted to all available data for each subject provided greater power when entered into survival analysis than did endpoints dependent only on individual measurements. While this may not surprise statisticians, no examples of the use of regression estimates for survival analysis endpoints have been identified in reports of previous clinical trials nor in discussions concerning potential applications of survival analysis methodology in psychiatric research.
Subjects
Mental Disorders/drug therapy, Psychotropic Drugs/therapeutic use, Randomized Controlled Trials as Topic/statistics & numerical data, Survival Analysis, Analysis of Variance, Humans, Statistical Models, Monte Carlo Method, Psychological Tests/statistics & numerical data, Regression Analysis, Reproducibility of Results, Treatment Outcome
ABSTRACT
Simulated data for a two-group repeated measurements design were generated with different numbers of equally spaced measurements interposed between baseline and the end of the study. A standard repeated measurements ANOVA for a split-plot design was used to test the significance of the between-groups main effect, the Geisser-Greenhouse corrected groups x times interaction, and the difference in linear trends across time. The analyses were repeated with and without baseline measurements entered as a covariate in the model. Monte Carlo results confirmed that increasing the number of repeated measurements across a fixed treatment period generally had negative or neutral implications for power of the tests of significance in the presence of serial dependencies that produced heterogeneous correlations among the repeated measurements.
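The contrast between homogeneous and heterogeneous correlation patterns can be made concrete with a short sketch. It assumes a first-order autoregressive, AR(1), structure as the source of serial dependency and compound symmetry as the homogeneous alternative; the function names and parameter values are illustrative, and the abstract does not give the structures actually simulated.

```python
def compound_symmetry(k, rho):
    """Homogeneous pattern: every pair of occasions correlates rho."""
    return [[1.0 if i == j else rho for j in range(k)] for i in range(k)]

def ar1(k, rho):
    """Heterogeneous pattern: correlation decays as rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(k)] for i in range(k)]

# Correlations of the baseline with each later occasion, k = 4, rho = 0.5.
cs_row = compound_symmetry(4, 0.5)[0]
ar1_row = ar1(4, 0.5)[0]
```

Under AR(1), measurements close together in time are more highly correlated than those far apart, which is consistent with the abstract's point that extra intermediate measurements need not add power.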
Subjects
Clinical Trials as Topic/statistics & numerical data, Statistical Data Interpretation, Outcome Assessment (Health Care)/statistics & numerical data, Analysis of Variance, Humans, Monte Carlo Method, Probability
ABSTRACT
OBJECTIVE: To assess critically the short-term efficacy and safety of carbamazepine in the reduction of aggressiveness in children with diagnosed conduct disorder. METHOD: Subjects were children aged 5 to 12 years who were hospitalized for treatment-resistant aggressiveness and explosiveness and who had diagnosed conduct disorder. The study was double-blind and placebo-controlled, using a parallel-groups design. Following a 2-week placebo baseline period, children who met the aggression criteria were randomly assigned to treatments for 6 weeks; the study ended with a 1-week posttreatment placebo period. Multiple raters rated the children independently, using multiple rating scales under four conditions. The main outcome measures included the Overt Aggression Scale, the Global Clinical Judgments (Consensus) Scale, and the Children's Psychiatric Rating Scale. RESULTS: Twenty-two children, aged 5.33 to 11.7 years, completed the study. Carbamazepine was not superior to placebo at optimal daily doses ranging from 400 to 800 mg, mean 683 mg, at serum levels of 4.98 to 9.1 micrograms/mL. Untoward effects associated with administration of carbamazepine were common. CONCLUSIONS: In this modest sample of children, the superiority of carbamazepine over placebo in reducing aggressive behavior was not demonstrated.
Subjects
Aggression/drug effects, Carbamazepine/therapeutic use, Child Behavior Disorders/drug therapy, Carbamazepine/pharmacology, Child, Double-Blind Method, Female, Humans, Male, Psychiatric Status Rating Scales
ABSTRACT
Failure to recognize the serious implications of heterogeneous correlations and disregard of the multiple test problem in interpreting the results from repeated measurements ANOVA of any single primary outcome measure can produce false-positive error rates that are more than five times the alpha level that is reported. Alternative analyses that do not depend on the symmetry assumption, together with a Bonferroni correction of the multiple tests of significance that are routinely accomplished by the repeated measurements ANOVA, appropriately control the probability of statistical support for a false-positive claim. The magnitudes of error inflation and appropriate procedures for error control are examined in this article using simulated clinical trials data.
Subjects
Analysis of Variance, False Positive Reactions, Research Design, Humans, Statistical Models, Population
ABSTRACT
The random regression model (RRM) has been advocated as a potential solution to problems of statistical analysis posed by dropouts in clinical trials. However, the power of the RRM tests for differences in rates of change can be seriously attenuated by presence of dropouts. The use of imputed scores and other modifications are examined in an attempt to render a simple growth-curve form of the RRM analysis more robust against dropouts. Methods that extrapolate from an individual's own performance were found effective, although inclusion of time-in-treatment as a covariate was documented to be important under identifiable conditions. Of the methods evaluated, those that used group data to impute missing values for dropouts produced nonconservative bias. The results suggest the importance of careful evaluation of potential bias when integrating any group-based imputation procedure into the RRM analyses.
Subjects
Clinical Trials as Topic/methods, Patient Dropouts, Humans, Statistical Models
ABSTRACT
Factor analysis methodology applied to Alzheimer's Disease Assessment Scale (ADAS) subtest profiles for patients in two large-scale clinical trials of the antidementia drug tacrine yielded three oblique factors interpreted as dysfunctions in memory, language, and praxis. The factor structures confirmed reliable assessment of primary dimensions of cognitive impairment in Alzheimer's disease that the original authors of the ADAS proposed to measure and that correspond well to that of the only previously reported factor analysis of the ADAS-COG. The presence of a strong general factor, supported by stable correlations among the oblique primary factors, justifies the recommendation to continue reliance on the ADAS-COG total score as a primary outcome measure in clinical trials, whereas the factor scores are recommended for evaluation of differential treatment effects on more specific aspects of the general cognitive decline. The stability of correlations across time appears to satisfy a primary requirement for application of repeated measures ANOVA to ADAS-COG total score and factor scores in longitudinal clinical trials.
Subjects
Alzheimer Disease/diagnosis, Cognition Disorders/diagnosis, Geriatric Assessment/statistics & numerical data, Neuropsychological Tests/statistics & numerical data, Aged, Alzheimer Disease/drug therapy, Alzheimer Disease/psychology, Cognition Disorders/drug therapy, Cognition Disorders/psychology, Drug Dose-Response Relationship, Drug Administration Schedule, Factor Analysis, Humans, Nootropic Agents/administration & dosage, Psychometrics, Tacrine/administration & dosage, Treatment Outcome, United States
ABSTRACT
Welch (1947) proposed an adjusted t test that can be used to correct the serious bias in Type I error protection that is otherwise present when both sample sizes and variances are unequal. The implications of the Welch adjustment for power of tests for the difference between two treatments across k levels of a concomitant factor are evaluated in this article for k x 2 designs with unequal sample sizes and unequal variances. Analyses confirm that, although Type I error is uniformly controlled, power of the Welch test of significance for the main effect of treatments remains rather seriously dependent on direction of the correlation between unequal variances and unequal sample sizes. Nevertheless, considering the fact that analysis of variance is not an acceptable option in such cases, the Welch t test appears to have an important role to play in the analysis of experimental data.
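For reference, the basic two-sample Welch statistic can be sketched as below, using the usual Welch-Satterthwaite approximation for the degrees of freedom. This shows only the two-group building block; how the article combines it across the k levels of the concomitant factor is not described in the abstract and is not attempted here. The data and function name are invented.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch t statistic and approximate (Welch-Satterthwaite) df."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical groups with unequal sizes and unequal variances.
group_a = [10, 12, 14, 16, 18]
group_b = [9, 10, 11]
t_stat, df = welch_t(group_a, group_b)
```

Unlike the pooled-variance t test, the standard error here uses each group's own variance, so the degrees of freedom are generally fractional and smaller than n1 + n2 - 2.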
Subjects
Analysis of Variance, Psychometrics/methods, Bias, Humans
ABSTRACT
Heterogeneity of variance produces serious bias in conventional analysis of variance tests of significance when cell frequencies are unequal. Welch in 1938 and 1947 proposed an adjusted t test for the difference between two means when cell frequencies and population variances are both unequal. This article describes two ways to use the Welch t to evaluate the significance of the main effect for two treatments across k levels of a concomitant factor in a two-way design. Monte Carlo results document the bias in conventional analysis of variance tests and the stable and appropriately conservative results from applications of the Welch t to evaluation of treatment effects in the two-way design.
Subjects
Analysis of Variance, Clinical Trials as Topic/statistics & numerical data, Bias, Humans, Monte Carlo Method
ABSTRACT
OBJECTIVE: To assess critically the efficacy and safety of lithium and replicate earlier findings in a larger sample of aggressive children with conduct disorder and to assess the utility of the Profile of Mood States (POMS) in this population. METHODS: Children hospitalized for treatment-refractory severe aggressiveness and explosiveness and with diagnosed conduct disorder were subjects in this double-blind, placebo-controlled clinical trial. After a 2-week placebo baseline period, children were randomly assigned to lithium or placebo treatment for 6 weeks of placebo. The main outcome measures were the Global Clinical Judgments (Consensus) Scale, Children's Psychiatric Rating Scale, Conners Teacher Questionnaire, Parent-Teacher Questionnaire, and the POMS. RESULTS: Fifty children (mean age 9.4 years) completed this study. The mean optimal daily dose of lithium was 1,248 mg and the mean serum level was 1.12 mEq/L. Lithium was superior to placebo, although the effects on some measures were more modest than in a previous study. CONCLUSIONS: Lithium appears to be an effective treatment for some severely aggressive children with conduct disorder. Although the POMS appeared to be reliable, it did not detect any response to lithium.
Subjects
Aggression/drug effects, Child Behavior Disorders/drug therapy, Lithium Carbonate/administration & dosage, Patient Admission, Aggression/psychology, Child, Child Behavior Disorders/psychology, Preschool Child, Drug Dose-Response Relationship, Double-Blind Method, Drug Administration Schedule, Female, Humans, Lithium Carbonate/adverse effects, Male, Personality Assessment, Treatment Outcome
ABSTRACT
Justification of a "fast acting" claim for antidepressant drugs is, first and foremost, a definition problem. The one controversial statistical issue that has been raised is whether time should enter the equation as an independent or dependent variable. Although alternative models provide essentially equivalent tests of significance and estimates of response latencies, a regression model--in which assessment times are recognized to be fixed independent variables and repeated quantitative measurements of clinical response are the dependent variable--is more congruent with the determinate experimental conditions and sources of error variation in antidepressant drug trials. The regression model gains power for tests of significance and precision for estimates of response times by utilizing fully the information contained in quantitative repeated measurements rather than considering only time to an artificially discrete endpoint.