ABSTRACT
This article investigates confidence interval (CI) construction for the difference between proportions of two independent partially validated series under a double-sampling scheme in which both classifiers are fallible. Several CIs are developed under two models by applying the method of variance estimates recovery (MOVER) to combine confidence limits from asymptotic, bootstrap, and Bayesian methods for two independent binomial proportions. Simulation results show that, under the independence model, all CIs except the bootstrap percentile-t CI and the Bayesian credible interval with a uniform prior generally perform well, and under the dependence model all CIs generally perform well; these intervals are therefore recommended. Two examples are used to illustrate the methodologies.
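As a rough illustration of the MOVER construction mentioned above, the sketch below combines Wilson score limits for two independent binomial proportions into an interval for their difference. It is a minimal sketch only: the article's double-sampling adjustment for fallible classifiers is omitted, and the function names and inputs are illustrative.

```python
from math import sqrt
from scipy.stats import norm

def wilson_limits(x, n, alpha=0.05):
    """Wilson score limits for a single binomial proportion."""
    z = norm.ppf(1 - alpha / 2)
    p = x / n
    centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return centre - half, centre + half

def mover_diff(x1, n1, x2, n2, alpha=0.05):
    """MOVER interval for p1 - p2, recovering variance estimates
    from the individual confidence limits of each proportion."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_limits(x1, n1, alpha)
    l2, u2 = wilson_limits(x2, n2, alpha)
    lower = p1 - p2 - sqrt((p1 - l1)**2 + (u2 - p2)**2)
    upper = p1 - p2 + sqrt((u1 - p1)**2 + (p2 - l2)**2)
    return lower, upper

print(mover_diff(35, 100, 28, 120))
```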
Subject(s)
Models, Statistical , Humans , Bayes Theorem , Confidence Intervals , Computer Simulation
ABSTRACT
New treatments that are noninferior or equivalent, but not necessarily superior, to the reference treatment may still be beneficial to patients because they have fewer side effects, are more convenient, take less time, or cost less. The noninferiority test is widely used in medical research to provide guidance in such situations. In addition, categorical variables are frequently encountered in medical research, such as in studies involving patient-reported outcomes. In this paper, we develop a noninferiority testing procedure for correlated ordinal categorical variables based on a paired design with a latent normal distribution approach. Misclassification is frequently encountered in the collection of ordinal categorical data; therefore, we further extend the procedure to account for misclassification using information in the partially validated data. Simulation studies are conducted to investigate the accuracy of the estimates, the type I error rates, and the power of the proposed procedure. Finally, we analyze one substantive example to demonstrate the utility of the proposed approach.
Subject(s)
Equivalence Trials as Topic , Models, Statistical , Biostatistics , Computer Simulation , Data Interpretation, Statistical , Humans , Malaria/parasitology , Malaria/prevention & control , Malaria/transmission , Treatment Outcome
ABSTRACT
A stratified study is often designed to adjust for a confounding effect or for the effect of different centers/groups in two treatments or diagnostic tests, and the risk difference is one of the most frequently used indices for comparing the efficacy of two treatments or diagnostic tests. This article presents five simultaneous confidence intervals (CIs) for risk differences in stratified bilateral designs that account for the intraclass correlation, and develops seven CIs for the common risk difference under the homogeneity assumption. The performance of the CIs is evaluated with respect to the empirical coverage probability, the empirical confidence width, and the ratio of the mesial noncoverage probability to the total noncoverage probability under various scenarios. Empirical results show that the Wald simultaneous CI, the Haldane simultaneous CI, the score simultaneous CI based on the Bonferroni method, and the simultaneous CI based on the bootstrap-resampling method perform satisfactorily and are hence recommended for applications; the CI based on the weighted least squares (WLS) estimator, the CIs based on the Mantel-Haenszel estimator, the CI based on the Cochran statistic, and the CI based on the score statistic for the common risk difference behave well even with small sample sizes. A real data example is used to demonstrate the proposed methodologies.
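For intuition about the simultaneous-coverage requirement, the following sketch computes Bonferroni-adjusted Wald simultaneous CIs for per-stratum risk differences. It ignores the intraclass-correlation adjustment for bilateral data described in the article and uses made-up counts; it sketches the general construction, not the article's exact procedure.

```python
import numpy as np
from scipy.stats import norm

def bonferroni_wald_simultaneous(x1, n1, x2, n2, alpha=0.05):
    """Level-(1-alpha) simultaneous Wald CIs for the per-stratum risk
    differences d_k = p1k - p2k, using the Bonferroni adjustment."""
    x1, n1, x2, n2 = map(np.asarray, (x1, n1, x2, n2))
    K = len(x1)
    z = norm.ppf(1 - alpha / (2 * K))   # Bonferroni-adjusted quantile
    p1, p2 = x1 / n1, x2 / n2
    se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p1 - p2
    return np.column_stack((d - z * se, d + z * se))

# Three strata: (successes, trials) for each treatment arm
print(bonferroni_wald_simultaneous([30, 25, 40], [50, 60, 70],
                                   [20, 30, 35], [50, 60, 70]))
```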
Subject(s)
Confidence Intervals , Models, Statistical , Randomized Controlled Trials as Topic/methods , Randomized Controlled Trials as Topic/statistics & numerical data , Research Design/statistics & numerical data , Computer Simulation , Humans , Least-Squares Analysis , Probability , Risk , Sample Size
ABSTRACT
In clinical studies, ordered categorical responses are common. To compare the efficacy of several treatments with a control for ordinal responses, the normal latent variable model has recently been proposed. This approach conceptualizes the responses as manifestations of an underlying continuous normal variable. In this article, we extend this idea to develop the multiple comparison method for use when there are two controls in the clinical trial. The proposed method is constructed such that the familywise type I error rate is controlled at a prespecified level. In addition, for a given level of test power, the procedure to evaluate the required sample size is provided. The proposed testing procedure is also illustrated by an example from a clinical study.
Subject(s)
Clinical Trials as Topic , Models, Statistical , Research Design , Humans , Sample Size
ABSTRACT
In clinical studies, the proportional odds model is widely used to compare treatment efficacies when the responses are categorically ordered. However, this model has been shown to be inappropriate when the proportional odds assumption is invalid, mainly because it is unable to control the type I error rate in such circumstances. To remedy this problem, the latent normal model was recently promoted and has been demonstrated to be superior to the proportional odds model. However, the application of the latent normal model is limited to comparing treatments whose underlying distributions are similar except possibly in their means and variances. When the underlying distributions differ greatly in skewness, both of the aforementioned procedures suffer from an undesirable inflation of the type I error rate. For clinical studies with ordinal responses, we provide a viable solution that relies on the latent Weibull distribution, which is a member of the log-location-scale family. The proposed model is able to control the type I error rate regardless of the degree of skewness of the treatment responses. In addition, the power of the test outperforms that of the latent normal model. The testing procedure draws on newly developed theoretical results for latent distributions from the location-scale family. The testing procedure is illustrated with two clinical examples.
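To make the latent-distribution idea concrete: if the latent variable is Weibull, its logarithm lies in the location-scale family with the smallest-extreme-value cdf G(z) = 1 - exp(-exp(z)), so ordinal cell probabilities follow from differences of G at the transformed cut-points. The sketch below computes these probabilities; the cut-points, mu, and sigma are illustrative assumptions, not values from the article.

```python
import numpy as np
from scipy.stats import gumbel_l   # smallest-extreme-value: cdf 1 - exp(-exp(z))

def latent_weibull_cell_probs(cutpoints, mu, sigma):
    """P(response = j) under a latent Weibull variable:
    P(Y = j) = G((log g_j - mu)/sigma) - G((log g_{j-1} - mu)/sigma),
    with g_0 = 0 and g_J = +infinity."""
    z = (np.log(cutpoints) - mu) / sigma
    cdf = np.concatenate(([0.0], gumbel_l.cdf(z), [1.0]))
    return np.diff(cdf)

# Four ordinal categories defined by three positive cut-points
print(latent_weibull_cell_probs(np.array([0.5, 1.0, 2.0]), mu=0.0, sigma=1.0))
```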
Subject(s)
Biostatistics/methods , Models, Statistical , Treatment Outcome , Analgesics/pharmacology , Computer Simulation , Humans , Ketamine/pharmacology , Logistic Models , Pain/prevention & control , Propofol/administration & dosage , Propofol/adverse effects , Retinal Diseases/etiology , Smoking/adverse effects , Statistical Distributions
ABSTRACT
In clinical studies, multiple comparisons of several treatments to a control with ordered categorical responses are often encountered. A popular statistical approach to analyzing the data is to use the logistic regression model with the proportional odds assumption. As discussed in several recent research papers, if the proportional odds assumption fails to hold, the undesirable consequence of an inflated familywise type I error rate may affect the validity of the clinical findings. To remedy the problem, a more flexible approach that uses the latent normal model with single-step and stepwise testing procedures has been recently proposed. In this paper, we introduce a step-up procedure that uses the correlation structure of test statistics under the latent normal model. A simulation study demonstrates the superiority of the proposed procedure to all existing testing procedures. Based on the proposed step-up procedure, we derive an algorithm that enables the determination of the total sample size and the sample size allocation scheme with a pre-determined level of test power before the onset of a clinical trial. A clinical example is presented to illustrate our proposed method.
Subject(s)
Algorithms , Clinical Trials as Topic/methods , Data Interpretation, Statistical , Models, Statistical , Computer Simulation , Fentanyl/administration & dosage , Humans , Lidocaine/administration & dosage , Pain/prevention & control , Sample Size
ABSTRACT
A sufficient number of participants should be included to adequately address the research interest in surveys with sensitive questions. In this paper, sample size formulas/iterative algorithms are developed from the perspective of controlling the confidence interval width of the prevalence of a sensitive attribute under four non-randomized response models: the crosswise model, the parallel model, the Poisson item count technique model and the negative binomial item count technique model. In contrast to the conventional approach to sample size determination, our sample size formulas/algorithms explicitly incorporate an assurance probability of controlling the width of a confidence interval within a pre-specified range. The performance of the proposed methods is evaluated with respect to the empirical coverage probability, the empirical assurance probability and the confidence width. Simulation results show that all formulas/algorithms are effective and hence are recommended for practical applications. A real example is used to illustrate the proposed methods.
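As an illustration of the assurance-probability idea, the sketch below finds, by simulation, the smallest sample size under the crosswise model for which the Wald CI width stays below a target with a given assurance probability. The model setup (known innocuous prevalence p, "yes" probability lam = pi*p + (1-pi)*(1-p)) is standard for the crosswise design, but the search-by-tens loop and all numeric inputs are illustrative assumptions rather than the article's algorithm.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2024)

def crosswise_width(lam_hat, n, p, alpha=0.05):
    """Wald CI width for the sensitive prevalence pi under the crosswise
    model: pi_hat = (lam_hat + p - 1)/(2p - 1), so the width scales the
    binomial width by 1/|2p - 1|."""
    z = norm.ppf(1 - alpha / 2)
    return 2 * z * np.sqrt(lam_hat * (1 - lam_hat) / n) / abs(2 * p - 1)

def sample_size_with_assurance(pi, p, omega, assurance=0.8, alpha=0.05,
                               n_sim=2000):
    """Smallest n whose CI width is <= omega with the given assurance
    probability, checked by simulating the 'yes' counts."""
    lam = pi * p + (1 - pi) * (1 - p)
    n = 10
    while True:
        lam_hat = rng.binomial(n, lam, size=n_sim) / n
        if np.mean(crosswise_width(lam_hat, n, p, alpha) <= omega) >= assurance:
            return n
        n += 10

print(sample_size_with_assurance(pi=0.3, p=0.75, omega=0.2))
```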
Subject(s)
Algorithms , Models, Statistical , Humans , Sample Size , Confidence Intervals , Psychometrics/methods , Psychometrics/statistics & numerical data , Poisson Distribution , Computer Simulation
ABSTRACT
Clinical trials frequently involve pairwise comparisons of different treatments to evaluate their relative efficacy. In this study, we examine methods for conducting pairwise tests of treatments with ordered categorical responses. A modified version of the Wilcoxon-Mann-Whitney test based on a logistic regression model assuming proportional odds is a popular choice for comparing two treatments. This paper discusses the extension of this test to pairwise comparisons involving more than two treatments. However, when the proportional odds assumption is not valid, the Wilcoxon-Mann-Whitney-type test procedure cannot control the overall type I error rate at the prespecified level of significance. We therefore propose a better strategy in which a latent normal model is employed. We present a simulation study of power and the overall type I error rate that illustrates the superiority of the latent normal model. Examples are also given for illustrative purposes.
Subject(s)
Clinical Trials as Topic/methods , Logistic Models , Alfentanil/pharmacology , Child , Child, Preschool , Computer Simulation , Humans , Pain/drug therapy , Piperidines/pharmacology , Propofol/adverse effects , Remifentanil
ABSTRACT
Investigating the prevalence of a disease is an important topic in medical studies. Such investigations are usually based on the classification results of a group of subjects according to whether they have the disease. To classify subjects, screening tests that are inexpensive and nonintrusive to the test subjects are frequently used to produce results in a timely manner. However, such screening tests may suffer from high levels of misclassification. Although it is often possible to design a gold-standard test or device that is not subject to misclassification, such devices are usually costly and time-consuming, and in some cases intrusive to the test subjects. As a compromise between these two approaches, it is possible to use data that are obtained by the method of double-sampling. In this article, we derive and investigate four test statistics for testing a hypothesis on disease prevalence with double-sampling data. The test statistics are implemented through both the asymptotic method, which is suitable for large samples, and the approximate unconditional method, which is suitable for small samples. Our simulation results show that the approximate unconditional method usually produces a more satisfactory empirical type I error rate and power than its asymptotic counterpart, especially for small to moderate sample sizes. The results also suggest that the score test and the Wald test based on an estimate of variance with parameters estimated under the null hypothesis outperform the others. A real example is used to illustrate the proposed methods.
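The approximate unconditional idea, maximizing a simulated p-value over a grid of the nuisance parameter, can be sketched in a setting simpler than the article's double-sampling models. The toy below tests equality of two binomial proportions, with the common proportion as the nuisance parameter; all names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

def z_stat(x1, n1, x2, n2):
    """Pooled-variance z statistic for H0: p1 = p2."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    p_hat = (x1 + x2) / (n1 + n2)
    se = np.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    se = np.where(se > 0, se, np.inf)   # degenerate tables get z = 0
    return (x1 / n1 - x2 / n2) / se

def approx_unconditional_pvalue(x1, n1, x2, n2, n_sim=5000, grid=50):
    """Approximate unconditional p-value: the simulated tail probability
    maximised over a grid of the nuisance parameter p."""
    t_obs = abs(z_stat(x1, n1, x2, n2))
    p_max = 0.0
    for p in np.linspace(0.01, 0.99, grid):
        y1 = rng.binomial(n1, p, n_sim)
        y2 = rng.binomial(n2, p, n_sim)
        p_max = max(p_max, np.mean(np.abs(z_stat(y1, n1, y2, n2)) >= t_obs))
    return p_max

print(approx_unconditional_pvalue(8, 20, 2, 25))
```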
Subject(s)
Data Interpretation, Statistical , Epidemiology/statistics & numerical data , Prevalence , Algorithms , Disease , Epidemiologic Studies , Humans , Likelihood Functions , Research Design , Sample Size
ABSTRACT
Comparing disease prevalence in two groups is an important topic in medical research, and prevalence rates are obtained by classifying subjects according to whether they have the disease. Either high-cost infallible gold-standard classifiers or low-cost fallible classifiers can be used to classify subjects. However, statistical analysis that is based on data sets with misclassifications leads to biased results. As a compromise between the two classification approaches, partially validated sets are often used, in which all individuals are classified by fallible classifiers and some of the individuals are validated by the accurate gold-standard classifiers. In this article, we develop several reliable test procedures and approximate sample size formulas for disease prevalence studies based on the difference between two disease prevalence rates with two independent partially validated series. Empirical studies show that (i) the score test produces close-to-nominal type I error rates and is preferred in practice; and (ii) the sample size formula based on the score test is also fairly accurate in terms of the empirical power and type I error rate, and is hence recommended. A real example from an aplastic anemia study is used to illustrate the proposed methodologies.
Subject(s)
Biometry/methods , Graft vs Host Disease/epidemiology , Humans , Prevalence , Sample Size , Young Adult
ABSTRACT
We develop a method for the analysis of multivariate ordinal categorical data with misclassification based on the latent normal variable approach. Misclassification arises if a subject has been classified into a category that does not truly reflect its actual state, and can occur in one or more variables. A basic framework is developed to enable the analysis of two types of data. The first corresponds to a single sample that is obtained from a fallible design that may lead to misclassified data. The other corresponds to data that are obtained by double sampling. Double-sampling data consist of two parts: a sample that is obtained by classifying subjects with the fallible design only, and a sample that is obtained by classifying subjects with both the fallible design and the true design, where the true design is assumed to be free of misclassification. A unified expectation-maximization approach is developed to find the maximum likelihood estimates of the model parameters. Simulation studies and examples that are based on real data are used to demonstrate the applicability and practicability of the proposed methods.
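A stripped-down version of the EM idea for double-sampling data, here for a single categorical variable rather than the article's multivariate latent normal setting: the validated sample identifies the misclassification pattern, and the E-step distributes the fallible-only counts over the true categories. All data below are artificial.

```python
import numpy as np

def em_double_sampling(N, m, n_iter=200):
    """EM for a categorical variable observed with misclassification.
    N[t, f]: validated counts (true category t, fallible reading f).
    m[f]:    counts from the fallible-only sample.
    Returns the true-category probabilities p and the misclassification
    matrix theta[t, f] = P(fallible = f | true = t)."""
    K = N.shape[0]
    p = np.full(K, 1.0 / K)
    theta = np.full((K, K), 1.0 / K)
    for _ in range(n_iter):
        # E-step: allocate each fallible-only count m[f] to true categories
        w = p[:, None] * theta              # w[t, f] ~ P(true = t, fallible = f)
        post = w / w.sum(axis=0)            # P(true = t | fallible = f)
        expected = N + post * m[None, :]    # expected complete-data counts
        # M-step: closed-form multinomial updates
        p = expected.sum(axis=1) / expected.sum()
        theta = expected / expected.sum(axis=1, keepdims=True)
    return p, theta

N = np.array([[40, 5, 2], [6, 30, 4], [1, 3, 25]])   # validated sample
m = np.array([120, 90, 60])                          # fallible-only sample
p, theta = em_double_sampling(N, m)
print(np.round(p, 3))
```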
Subject(s)
Classification/methods , Computer Simulation , Likelihood Functions , Selection Bias , Accidents/statistics & numerical data , Confidence Intervals , Data Collection/statistics & numerical data , Humans , Models, Statistical , Multivariate Analysis , Psychometrics/statistics & numerical data
ABSTRACT
Ordinal responses are common in clinical studies. Although the proportional odds model is a popular option for analyzing ordered-categorical data, it cannot control the type I error rate when the proportional odds assumption fails to hold. The latent Weibull model was recently shown to be a superior candidate for modeling ordinal data, with remarkably better performance than the latent normal model when the data are highly skewed. In clinical trials with ordinal responses, a balanced design with equal sample allocation for each treatment is common. However, a more ethical approach is to adopt a response-adaptive allocation scheme in which more patients receive the better treatment. In this paper, we propose the use of the doubly adaptive biased coin design to generate treatment allocations that benefit the trial participants. The proposed treatment allocation scheme not only allows more patients to receive the better treatment but also maintains comparable test power for the comparison of treatment efficacies. A clinical example is used to illustrate the proposed procedure.
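For intuition about the doubly adaptive biased coin design, the sketch below uses the Hu-Zhang allocation function, which nudges the next assignment toward a target allocation proportion. The target 2/3, gamma = 2, and the fixed target (rather than one re-estimated from accruing responses, as in a real response-adaptive trial) are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)

def dbcd_probability(x, rho, gamma=2.0):
    """Hu-Zhang doubly adaptive biased coin: probability of assigning the
    next patient to treatment 1, given the current allocation proportion x
    on treatment 1 and the target proportion rho."""
    if x <= 0.0:
        return 1.0
    if x >= 1.0:
        return 0.0
    num = rho * (rho / x) ** gamma
    den = num + (1 - rho) * ((1 - rho) / (1 - x)) ** gamma
    return num / den

# Sequentially allocate 20 patients toward a target of 2/3 on treatment 1
n1, n = 0, 0
for _ in range(20):
    prob = dbcd_probability(n1 / n if n else 0.5, rho=2 / 3)
    n1 += rng.random() < prob
    n += 1
print(n1, "of", n, "patients on treatment 1")
```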
Subject(s)
Bias , Clinical Protocols , Clinical Studies as Topic/statistics & numerical data , Models, Statistical , Humans , Outcome and Process Assessment, Health Care/statistics & numerical data , Treatment Outcome
ABSTRACT
A disease prevalence can be estimated by classifying subjects according to whether they have the disease. When gold-standard tests are too expensive to be applied to all subjects, partially validated data can be obtained by double-sampling, in which all individuals are classified by a fallible classifier and some of the individuals are validated by the gold-standard classifier. In practice, however, such an infallible classifier may not be available. In this article, we consider two models in which both classifiers are fallible and propose four asymptotic test procedures for comparing disease prevalence in two groups. The corresponding sample size formulae and validation ratios given the total sample size are also derived and evaluated. Simulation results show that (i) the score test performs well, and the corresponding sample size formula is accurate in terms of the empirical power and size under both models; (ii) the Wald test based on the variance estimator with parameters estimated under the null hypothesis outperforms the others even with small sample sizes in Model II, and the sample size estimated by this test is also accurate; and (iii) the estimated validation ratios based on all tests are accurate. A malaria data set is used to illustrate the proposed methodologies.
ABSTRACT
A Thurstonian-type approach is applied to modelling ranking data with ties. It uses a non-totally differentiable discriminal process instead of the conventional totally differentiable one to relate the observed rankings to the underlying subjective values. A Monte Carlo expectation-maximization algorithm is proposed to find the maximum likelihood estimates together with the standard errors of the parameters. The approach is examined numerically by means of an artificial example and a simulation study, and is applied to a study of attribute assessment.
Subject(s)
Decision Support Techniques , Factor Analysis, Statistical , Models, Statistical , Relative Value Scales , Social Marketing , Statistics as Topic , Algorithms , Data Interpretation, Statistical , Discriminant Analysis , Humans , Likelihood Functions , Monte Carlo Method
ABSTRACT
Double sampling is usually applied to collect necessary information in situations where an infallible classifier is available for validating a subset of the sample that has already been classified by a fallible classifier. Inference procedures have previously been developed based on the partially validated data obtained by the double-sampling process. In practice, however, such an infallible classifier or gold standard may not exist. In this article, we consider the case in which both classifiers are fallible and propose asymptotic and approximate unconditional test procedures based on six test statistics for a population proportion, together with five approximate sample size formulas based on the recommended test procedures, under two models. Our results suggest that both the asymptotic and approximate unconditional procedures based on the score statistic perform satisfactorily for small to large sample sizes and are highly recommended. When the sample size is moderate or large, the asymptotic procedures based on the Wald statistic with the variance estimated under the null hypothesis, the likelihood ratio statistic, and the log- and logit-transformation statistics under both models generally perform well and are hence recommended. The approximate unconditional procedures based on the log-transformation statistic under Model I, and on the Wald statistic with the variance estimated under the null hypothesis and the log- and logit-transformation statistics under Model II, are recommended when the sample size is small. In general, the sample size formulae based on the Wald statistic with the variance estimated under the null hypothesis, the likelihood ratio statistic and the score statistic are recommended in practical applications. The applicability of the proposed methods is illustrated by a real-data example.
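The logit-transformation statistic mentioned above has a simple form in the plain binomial case, shown below. The article's versions are built on the double-sampling likelihoods of the two models, so this is only a minimal illustration with made-up numbers.

```python
import numpy as np
from scipy.stats import norm

def logit_wald_test(x, n, pi0):
    """Wald test of H0: pi = pi0 on the logit scale.  By the delta
    method, Var(logit(pi_hat)) is approximately 1 / (n * pi * (1 - pi))."""
    pi_hat = x / n
    z = ((np.log(pi_hat / (1 - pi_hat)) - np.log(pi0 / (1 - pi0)))
         / np.sqrt(1 / (n * pi_hat * (1 - pi_hat))))
    p_value = 2 * norm.sf(abs(z))
    return z, p_value

print(logit_wald_test(x=62, n=200, pi0=0.25))
```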
Subject(s)
Models, Statistical , Sampling Studies , Algorithms , Humans , Likelihood Functions , Norway , Sample Size
ABSTRACT
Many variables that are used in social and behavioural science research are ordinal categorical or polytomous variables. When more than one polytomous variable is involved in an analysis, observations are classified in a contingency table, and a commonly used statistic for describing the association between two variables is the polychoric correlation. This paper investigates the estimation of the polychoric correlation when the data set consists of misclassified observations. Two approaches for estimating the polychoric correlation have been developed. One assumes that the probabilities in relation to misclassification are known, and the other uses a double sampling scheme to obtain information on misclassification. A parameter estimation procedure is developed, and statistical properties of the estimates are discussed. The practicability and applicability of the proposed approaches are illustrated by analysing data sets that are based on real and generated data. Excel programmes with Visual Basic for Applications (VBA) have been developed to compute the estimate of the polychoric correlation and its standard error. The use of the structural equation modelling programme Mx to find parameter estimates in the double sampling scheme is discussed.
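For reference, the two-step maximum likelihood estimator of the polychoric correlation without misclassification can be sketched as below: thresholds come from the marginal cumulative proportions, and rho is then found by maximizing the multinomial likelihood over bivariate normal rectangle probabilities. The paper's misclassification adjustments are not included, and the table is artificial.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal
from scipy.optimize import minimize_scalar

def polychoric_mle(table):
    """Two-step ML estimate of the polychoric correlation."""
    table = np.asarray(table, dtype=float)

    # Thresholds from marginal cumulative proportions (+/-8 stands in
    # for +/-infinity to keep the bivariate normal cdf finite)
    def thresholds(margin):
        cum = np.cumsum(margin) / margin.sum()
        return np.concatenate(([-8.0], norm.ppf(cum[:-1]), [8.0]))

    a = thresholds(table.sum(axis=1))
    b = thresholds(table.sum(axis=0))

    def neg_loglik(rho):
        bvn = multivariate_normal(mean=[0.0, 0.0],
                                  cov=[[1.0, rho], [rho, 1.0]])
        ll = 0.0
        for i in range(table.shape[0]):
            for j in range(table.shape[1]):
                # Rectangle probability via inclusion-exclusion on the cdf
                pij = (bvn.cdf([a[i + 1], b[j + 1]]) - bvn.cdf([a[i], b[j + 1]])
                       - bvn.cdf([a[i + 1], b[j]]) + bvn.cdf([a[i], b[j]]))
                ll += table[i, j] * np.log(max(pij, 1e-12))
        return -ll

    return minimize_scalar(neg_loglik, bounds=(-0.99, 0.99),
                           method="bounded").x

print(polychoric_mle([[20, 10, 5], [8, 25, 12], [2, 9, 30]]))
```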
Subject(s)
Behavioral Sciences/statistics & numerical data , Data Collection/classification , Mathematical Computing , Psychometrics/statistics & numerical data , Social Sciences/statistics & numerical data , Software , Data Collection/statistics & numerical data , Models, Statistical , Probability , Surveys and Questionnaires
ABSTRACT
Influence analysis is an important component of data analysis, and the local influence approach has been widely applied to many statistical models to identify influential observations and assess minor model perturbations since the pioneering work of Cook (1986). In this paper, the approach is adopted to develop influence analysis procedures for factor analysis models with ranking data. However, as this well-known approach is based on the observed-data likelihood, which involves multidimensional integrals, directly applying it to factor analysis models with ranking data is difficult. To address this difficulty, a Monte Carlo expectation-maximization (MCEM) algorithm is used to obtain the maximum likelihood estimates of the model parameters, and influence measures are then obtained on the basis of the conditional expectation of the complete-data log-likelihood at the E-step of the MCEM algorithm. Very little additional computation is needed to compute the influence measures, because it is possible to make use of the by-products of the estimation procedure. Influence measures that are based on several typical perturbation schemes are discussed in detail, and the proposed method is illustrated with two real examples and an artificial example.
Subject(s)
Data Collection/statistics & numerical data , Factor Analysis, Statistical , Models, Statistical , Psychological Tests/statistics & numerical data , Algorithms , Humans , Monte Carlo Method , Personality Tests/statistics & numerical data , Probability , Psychometrics/statistics & numerical data , Reproducibility of Results
ABSTRACT
Disease prevalence is an important topic in medical research, and its study is based on data that are obtained by classifying subjects according to whether a disease has been contracted. Classification can be conducted with high-cost gold-standard tests or low-cost screening tests, but the latter are subject to the misclassification of subjects. As a compromise between the two, many research studies use partially validated datasets in which all data points are classified by fallible tests, and some of the data points are validated in the sense that they are also classified by the completely accurate gold-standard test. In this article, we investigate the determination of sample sizes for disease prevalence studies with partially validated data. We use two approaches. The first is to find sample sizes that can achieve a pre-specified power of a statistical test at a chosen significance level, and the second is to find sample sizes that can control the width of a confidence interval with a pre-specified confidence level. Empirical studies have been conducted to demonstrate the performance of various testing procedures with the proposed sample sizes. The applicability of the proposed methods is illustrated by a real-data example.
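In the plain binomial case without a fallible classifier, the two approaches reduce to the familiar formulas sketched below; the partially-validated-data versions in the article replace the binomial variance with a model-based one. All inputs are illustrative.

```python
from math import ceil, sqrt
from scipy.stats import norm

def n_for_power(pi0, pi1, alpha=0.05, power=0.8):
    """Sample size so that a two-sided level-alpha test of H0: pi = pi0
    attains the given power at the alternative pi = pi1."""
    za, zb = norm.ppf(1 - alpha / 2), norm.ppf(power)
    num = za * sqrt(pi0 * (1 - pi0)) + zb * sqrt(pi1 * (1 - pi1))
    return ceil((num / (pi1 - pi0)) ** 2)

def n_for_width(pi, omega, alpha=0.05):
    """Sample size so that the Wald CI has expected width at most omega."""
    z = norm.ppf(1 - alpha / 2)
    return ceil(4 * z**2 * pi * (1 - pi) / omega**2)

print(n_for_power(0.20, 0.30), n_for_width(0.25, 0.10))
```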
Subject(s)
Databases, Factual/statistics & numerical data , Prevalence , Sample Size , Anemia, Aplastic/therapy , Biostatistics , Bone Marrow Transplantation/adverse effects , Computer Simulation , Confidence Intervals , Graft vs Host Disease/epidemiology , Graft vs Host Disease/etiology , Humans , Likelihood Functions , Models, Statistical , Validation Studies as Topic
ABSTRACT
Partially validated series are common when a gold-standard test is too expensive to be applied to all subjects, and hence a fallible device is used instead to measure the presence of a characteristic of interest. In this article, confidence interval construction for the proportion difference between two independent partially validated series is studied. Ten confidence intervals based on the method of variance estimates recovery (MOVER) are proposed, with each using the confidence limits for the two independent binomial proportions obtained by the asymptotic, Wilson, Logit-transformation, Agresti-Coull and Bayesian methods. The performances of the proposed confidence intervals and three likelihood-based intervals available in the literature are compared with respect to the empirical coverage probability, the confidence width and the ratio of the mesial non-coverage probability to the non-coverage probability. Our empirical results show that (1) all confidence intervals exhibit good performance in large samples; and (2) the MOVER confidence intervals that combine the confidence limits for binomial proportions based on the Wilson, Agresti-Coull, Logit-transformation and Bayesian (with three priors) methods perform satisfactorily from small to large samples, and hence can be recommended for practical applications. Two real data sets are analysed to illustrate the proposed methods.
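The three comparison criteria can be estimated by simulation, as in the sketch below, which evaluates a MOVER-Wilson interval for the difference of two plain binomial proportions (not the partially validated setting). The classification of a miss as "mesial" follows the convention that the interval lies on the zero side of the true difference; the numbers are illustrative.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
z = norm.ppf(0.975)

def wilson(x, n):
    """Vectorised Wilson score limits for a binomial proportion."""
    p = x / n
    c = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    h = z * np.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return c - h, c + h

def mover(x1, n1, x2, n2):
    """MOVER interval for p1 - p2 from Wilson limits."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson(x1, n1)
    l2, u2 = wilson(x2, n2)
    return (p1 - p2 - np.sqrt((p1 - l1)**2 + (u2 - p2)**2),
            p1 - p2 + np.sqrt((u1 - p1)**2 + (p2 - l2)**2))

def evaluate(p1, p2, n1, n2, n_sim=20000):
    """Coverage, mean width, and mesial/total non-coverage ratio."""
    d = p1 - p2                      # true difference (assume d > 0 here)
    x1 = rng.binomial(n1, p1, n_sim)
    x2 = rng.binomial(n2, p2, n_sim)
    lo, hi = mover(x1, n1, x2, n2)
    cover = (lo <= d) & (d <= hi)
    mesial = hi < d                  # interval lies on the zero side of d
    ncp = 1 - cover.mean()
    ratio = mesial.mean() / ncp if ncp > 0 else float("nan")
    return cover.mean(), (hi - lo).mean(), ratio

print(evaluate(0.4, 0.2, 80, 80))
```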
Subject(s)
Bayes Theorem , Confidence Intervals , Accidents, Traffic/statistics & numerical data , Anemia, Aplastic/epidemiology , Automobiles , Binomial Distribution , Female , Humans , Likelihood Functions , Male , Prevalence , Reproducibility of Results , Young Adult
ABSTRACT
Ordered categorical data are frequently encountered in clinical studies. A popular method for comparing the efficacy of treatments is to use logistic regression with the proportional odds assumption. The test statistic is based on the Wilcoxon-Mann-Whitney test. However, the proportional odds assumption may not be appropriate. In such cases, the probability of rejecting the null hypothesis is much inflated even when the treatments have the same mean efficacy. An alternative approach that does not rely on the proportional odds assumption is to conceptualize the responses as manifestations of underlying continuous variables. However, such statistical procedures have been developed only for the comparison of two treatments. In this article, we derive testing procedures that compare several treatments with a control, utilizing a latent variable model with a latent normal distribution. The proposed procedure is useful because multiple comparison with a control is very frequently an objective of a clinical study. Data from clinical trials are used to illustrate the proposed procedures.
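The latent normal conceptualization has a simple moment-style illustration: with the control's latent mean fixed at 0 and SD at 1, the probit of each group's cumulative proportions is linear in the latent thresholds, which yields rough estimates of a treatment's latent mean and SD. This sketch is for intuition only, not the article's maximum likelihood testing procedure; the counts are artificial.

```python
import numpy as np
from scipy.stats import norm

def probit_cum(counts):
    """Probit-transformed cumulative proportions (all but the last)."""
    cum = np.cumsum(counts) / np.sum(counts)
    return norm.ppf(cum[:-1])

def latent_normal_fit(control, treatment):
    """Rough latent-normal fit: thresholds come from the control group
    (mu = 0, sigma = 1); the treatment's mean and SD follow from a
    least-squares fit of its probit cumulative proportions on the
    thresholds, since probit F_t(j) = (alpha_j - mu_t) / sigma_t."""
    alpha = probit_cum(control)            # estimated thresholds
    q = probit_cum(treatment)
    b, a = np.polyfit(alpha, q, 1)         # slope, intercept
    sigma_t = 1.0 / b
    mu_t = -a / b
    return mu_t, sigma_t

control = np.array([10, 25, 40, 20, 5])
treatment = np.array([4, 15, 35, 30, 16])
print(latent_normal_fit(control, treatment))
```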