Results 1 - 20 of 55
1.
J R Stat Soc Series B Stat Methodol ; 86(2): 411-434, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38746015

ABSTRACT

Mediation analysis aims to assess if, and how, a certain exposure influences an outcome of interest through intermediate variables. This problem has recently gained a surge of attention due to the tremendous need for such analyses in scientific fields. Testing for the mediation effect (ME) is greatly challenged by the fact that the underlying null hypothesis (i.e. the absence of MEs) is composite. Most existing mediation tests are overly conservative and thus underpowered. To overcome this significant methodological hurdle, we develop an adaptive bootstrap testing framework that can accommodate different types of composite null hypotheses in the mediation pathway analysis. Applied to the product of coefficients test and the joint significance test, our adaptive testing procedures provide type I error control under the composite null, resulting in much improved statistical power compared to existing tests. Both theoretical properties and numerical examples of the proposed methodology are discussed.
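The adaptive bootstrap itself is not reproduced here, but the object it calibrates is easy to sketch. Below is a minimal, hypothetical Python illustration of the non-adaptive baseline on simulated data: estimate the product of coefficients a·b from two least-squares fits and form a percentile bootstrap interval for it (the paper's contribution is precisely to replace this naive calibration with an adaptive one that controls type I error under the composite null). All variable names and the toy data-generating model are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mediation data: exposure X -> mediator M -> outcome Y.
n = 300
X = rng.normal(size=n)
M = 0.5 * X + rng.normal(size=n)             # path a = 0.5
Y = 0.4 * M + 0.2 * X + rng.normal(size=n)   # path b = 0.4, direct effect 0.2

def indirect_effect(X, M, Y):
    """Product-of-coefficients estimate a*b from two least-squares fits."""
    a = np.polyfit(X, M, 1)[0]                       # slope of M on X
    Z = np.column_stack([M, X, np.ones_like(X)])     # Y on (M, X, intercept)
    b = np.linalg.lstsq(Z, Y, rcond=None)[0][0]      # coefficient of M
    return a * b

est = indirect_effect(X, M, Y)

# Plain (non-adaptive) percentile bootstrap of the indirect effect.
B = 500
boot = np.empty(B)
for i in range(B):
    idx = rng.integers(0, n, n)
    boot[i] = indirect_effect(X[idx], M[idx], Y[idx])
lo, hi = np.quantile(boot, [0.025, 0.975])
reject = not (lo <= 0.0 <= hi)   # reject H0: a*b = 0 if the CI excludes 0
```

Under the composite null (a = 0 or b = 0) this naive interval is known to be conservative, which is the motivation for the adaptive version in the paper.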

2.
J Stat Softw ; 105, 2023.
Article in English | MEDLINE | ID: mdl-38586564

ABSTRACT

Recurrent event analyses have found a wide range of applications in biomedicine, public health, and engineering, among others, where study subjects may experience a sequence of events of interest during follow-up. The R package reReg offers a comprehensive collection of practical and easy-to-use tools for regression analysis of recurrent events, possibly in the presence of an informative terminal event. The regression framework is a general scale-change model which encompasses the popular Cox-type model, the accelerated rate model, and the accelerated mean model as special cases. Informative censoring is accommodated through a subject-specific frailty without any need for parametric specification. Different regression models are allowed for the recurrent event process and the terminal event. Also included are visualization and simulation tools.
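reReg itself is an R package; as a language-neutral illustration of the kind of quantity such packages estimate and plot, here is a hedged Python sketch of the nonparametric Nelson-Aalen-type estimator of the mean cumulative function (expected cumulative number of recurrent events per subject), with function names of my own invention:

```python
import numpy as np

def mean_cumulative_function(event_times, followup):
    """Nelson-Aalen-type estimate of the mean cumulative number of
    recurrent events: dMCF(t) = (# events at t) / (# subjects at risk at t)."""
    times = np.unique(np.concatenate(event_times))
    followup = np.asarray(followup, dtype=float)
    mcf = np.zeros_like(times, dtype=float)
    cum = 0.0
    for j, t in enumerate(times):
        at_risk = np.sum(followup >= t)                       # still under observation
        d = sum(np.sum(np.asarray(e) == t) for e in event_times)  # events at t
        cum += d / at_risk
        mcf[j] = cum
    return times, mcf

# Three subjects with recurrent event times and censoring (follow-up) times.
events = [[1.0, 3.0], [2.0], [1.0, 2.0, 4.0]]
followup = [5.0, 3.0, 4.5]
times, mcf = mean_cumulative_function(events, followup)
```

The regression models in reReg generalize this marginal quantity by letting covariates change its scale and/or magnitude.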

3.
Multivariate Behav Res ; 58(2): 387-407, 2023.
Article in English | MEDLINE | ID: mdl-35086405

ABSTRACT

Differential item functioning (DIF) analysis refers to procedures that evaluate whether an item's characteristics differ across groups of persons after controlling for overall differences in performance. DIF is routinely evaluated as a screening step to ensure items behave the same across groups. Currently, most DIF studies focus predominantly on unidimensional IRT models, although multidimensional IRT (MIRT) models provide a powerful tool for enriching the information gained in modern assessment. In this study, we explore regularization methods for DIF detection in MIRT models and compare their performance to the classic likelihood ratio test. Regularization methods have recently emerged as a new family of methods for DIF detection due to their advantages: (1) they bypass the tedious iterative purification procedure that is often needed in other methods for identifying anchor items, and (2) they can handle multiple covariates simultaneously. The specific regularization methods considered in the study are: lasso with the expectation-maximization (EM) algorithm, lasso with the expectation-maximization-maximization (EMM) algorithm, and adaptive lasso with EM. Simulation results show that lasso EMM and adaptive lasso EM hold great promise when the sample size is large, and both outperform lasso EM. A real data example from the PROMIS depression and anxiety scales is presented at the end.


Subject(s)
Algorithms, Likelihood Functions
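The entry above embeds an L1 penalty on DIF parameters inside EM/EMM algorithms for MIRT models; the penalty machinery itself can be sketched independently. Below is a minimal Python illustration (not the authors' algorithm) of the soft-thresholding operator and plain coordinate-descent lasso on a linear model with a sparse truth, showing how the penalty zeroes out null coefficients — the same mechanism that zeroes out DIF parameters of invariant items:

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the building block of lasso updates."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Plain coordinate-descent lasso for (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual
            z = X[:, j] @ r / n
            beta[j] = soft_threshold(z, lam) / (X[:, j] @ X[:, j] / n)
    return beta

rng = np.random.default_rng(1)
n, p = 200, 10
X = rng.normal(size=(n, p))
true = np.zeros(p)
true[:2] = [1.5, -1.0]                 # sparse truth: only two active coefficients
y = X @ true + 0.1 * rng.normal(size=n)
beta = lasso_cd(X, y, lam=0.2)         # null coefficients shrink exactly to zero
```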
4.
Biometrics ; 78(1): 261-273, 2022 Mar.
Article in English | MEDLINE | ID: mdl-33215683

ABSTRACT

A central but challenging problem in genetic studies is to test for (usually weak) associations between a complex trait (e.g., a disease status) and sets of multiple genetic variants. Due to the lack of a uniformly most powerful test, data-adaptive tests, such as the adaptive sum of powered score (aSPU) test, are advantageous in maintaining high power against a wide range of alternatives. However, the p-values of many adaptive tests like aSPU often have no closed form that can be calculated accurately and analytically, so Monte Carlo (MC) simulations are used instead, which can be time-consuming at the stringent significance levels (e.g., 5e-8) used in genome-wide association studies (GWAS). To estimate such a small p-value, a huge number of MC simulations (e.g., 1e+10) is needed. As an alternative, we propose using importance sampling to speed up such calculations. We develop some theory to motivate a proposed algorithm for the aSPU test, and show that the proposed method is computationally more efficient than standard MC simulations. Using both simulated and real data, we demonstrate the superior performance of the new method over standard MC simulations.


Subject(s)
Genome-Wide Association Study, Polymorphism, Single Nucleotide, Algorithms, Genome-Wide Association Study/methods, Monte Carlo Method
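The aSPU-specific algorithm is not reproduced here, but the core trick — importance sampling for tiny tail probabilities — can be shown generically. A hedged Python sketch: estimating P(Z > 4) ≈ 3.2e-5 by sampling from a mean-shifted proposal N(4, 1) and reweighting by the likelihood ratio, where plain MC at this tail would waste almost every draw:

```python
import math
import numpy as np

rng = np.random.default_rng(2)

def tail_prob_is(c, n=100_000):
    """Importance-sampling estimate of P(Z > c) for Z ~ N(0, 1).
    Draw from the shifted proposal N(c, 1) and reweight each draw by the
    likelihood ratio phi(x) / phi(x - c) = exp(-c*x + c^2/2)."""
    x = rng.normal(loc=c, size=n)            # proposal draws, centered on the tail
    w = np.exp(-c * x + 0.5 * c ** 2)        # likelihood-ratio weights
    return np.mean((x > c) * w)

c = 4.0
est = tail_prob_is(c)
truth = 0.5 * math.erfc(c / math.sqrt(2))    # exact standard-normal tail
```

With 1e5 tilted draws the relative error is well under a percent, whereas plain MC would need on the order of 1e10 draws to estimate a 3e-5 probability with similar accuracy — the same economics that motivate the paper's algorithm.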
5.
Multivariate Behav Res ; 57(5): 840-858, 2022.
Article in English | MEDLINE | ID: mdl-33755507

ABSTRACT

Cognitive diagnosis models (CDMs) are useful statistical tools that provide rich information relevant for intervention and learning. As a popular approach to estimating CDMs and drawing inferences from them, the Markov chain Monte Carlo (MCMC) algorithm is widely used in practice. However, when the number of attributes, K, is large, the existing MCMC algorithm may become time-consuming, because O(2^K) calculations are usually needed during MCMC sampling to obtain the conditional distribution of each attribute profile. To overcome this computational issue, motivated by the earlier work of Culpepper and Hudson (2018), we propose a computationally efficient sequential Gibbs sampling method, which needs only O(K) calculations to sample each attribute profile. We use simulation and real data examples to show the good finite-sample performance of the proposed sequential Gibbs sampling and its advantage over existing methods.


Subject(s)
Algorithms, Cognition, Bayes Theorem, Computer Simulation, Markov Chains, Monte Carlo Method
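The O(K)-per-sweep idea can be illustrated without the full CDM machinery. Below is a hedged Python toy (a small Ising-style model over binary attributes, not the authors' sampler): each attribute is drawn from its full conditional in turn, so one sweep costs O(K) conditional evaluations instead of enumerating all 2^K profiles. The model, weights, and names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def gibbs_sweep(alpha, h, J):
    """One sequential Gibbs sweep over a binary attribute profile alpha:
    each attribute is resampled from its full conditional given the rest."""
    K = alpha.size
    for k in range(K):
        # log-odds of alpha_k = 1 given the other attributes (toy model)
        eta = h[k] + J[k] @ alpha - J[k, k] * alpha[k]
        p = 1.0 / (1.0 + np.exp(-eta))
        alpha[k] = rng.random() < p
    return alpha

K = 8
h = np.full(K, 0.5)        # each attribute mildly favours mastery
J = np.zeros((K, K))       # no attribute interactions in this toy run
alpha = np.zeros(K)
draws = np.zeros((2000, K))
for i in range(2000):
    draws[i] = gibbs_sweep(alpha, h, J)
rate = draws[500:].mean(axis=0)    # post-burn-in mastery rates per attribute
```

With no interactions, each mastery rate should settle near sigmoid(0.5) ≈ 0.62, which the run recovers; the point of the sketch is only the O(K) sweep structure.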
6.
Ann Stat ; 49(1): 154-181, 2021 Feb.
Article in English | MEDLINE | ID: mdl-34857975

ABSTRACT

Many high-dimensional hypothesis tests aim to globally examine marginal or low-dimensional features of a high-dimensional joint distribution, such as testing of mean vectors, covariance matrices and regression coefficients. This paper constructs a family of U-statistics as unbiased estimators of the ℓp-norms of those features. We show that under the null hypothesis, the U-statistics of different finite orders are asymptotically independent and normally distributed. Moreover, they are also asymptotically independent with the maximum-type test statistic, whose limiting distribution is an extreme value distribution. Based on the asymptotic independence property, we propose an adaptive testing procedure which combines p-values computed from the U-statistics of different orders. We further establish power analysis results and show that the proposed adaptive procedure maintains high power against various alternatives.
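The simplest member of this U-statistic family (p = 2, order 2) is easy to sketch. A hedged Python illustration of the unbiased pairwise estimator of the squared mean norm ||mu||^2: averaging x_i·x_j over all pairs i ≠ j removes the O(p/n) bias that the naive plug-in ||x-bar||^2 carries. The toy data and names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def squared_mean_norm_ustat(X):
    """Second-order U-statistic: unbiased estimator of ||mu||_2^2,
    averaging x_i . x_j over all pairs i != j (E[x_i . x_j] = ||mu||^2)."""
    n = X.shape[0]
    s = X.sum(axis=0)
    # sum over i != j of x_i . x_j = ||sum_i x_i||^2 - sum_i ||x_i||^2
    total = s @ s - np.einsum("ij,ij->", X, X)
    return total / (n * (n - 1))

n, p = 500, 50
mu = np.zeros(p)
mu[:5] = 0.3                        # a sparse mean shift, ||mu||^2 = 0.45
X = mu + rng.normal(size=(n, p))
est = squared_mean_norm_ustat(X)    # unbiased for 0.45
```

Higher-order U-statistics in the paper play the same role for ℓp-norms with larger p, and the adaptive procedure combines their p-values.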

7.
Stat Sin ; 30: 1773-1795, 2020.
Article in English | MEDLINE | ID: mdl-34385810

ABSTRACT

Two major challenges arise in regression analyses of recurrent event data: first, popular existing models, such as the Cox proportional rates model, may not fully capture the covariate effects on the underlying recurrent event process; second, the censoring time remains informative about the risk of experiencing recurrent events after accounting for covariates. We tackle both challenges by a general class of semiparametric scale-change models that allow a scale-change covariate effect as well as a multiplicative covariate effect. The proposed model is flexible and includes several existing models as special cases, such as the popular proportional rates model, the accelerated mean model, and the accelerated rate model. Moreover, it accommodates informative censoring through a subject-level latent frailty whose distribution is left unspecified. A robust estimation procedure which requires neither a parametric assumption on the distribution of the frailty nor a Poisson assumption on the recurrent event process is proposed to estimate the model parameters. The asymptotic properties of the resulting estimator are established, with the asymptotic variance estimated from a novel resampling approach. As a byproduct, the structure of the model provides a model selection approach among the submodels via hypothesis testing of model parameters. Numerical studies show that the proposed estimator and the model selection procedure perform well under both noninformative and informative censoring scenarios. The methods are applied to data from two transplant cohorts to study the risk of infections after transplantation.

8.
Int Stat Rev ; 87(1): 24-43, 2019 Apr.
Article in English | MEDLINE | ID: mdl-34366547

ABSTRACT

Panel count data arise in many applications when the event history of a recurrent event process is only examined at a sequence of discrete time points. In spite of the recent methodological developments, the availability of their software implementations has been rather limited. Focusing on a practical setting where the effects of some time-independent covariates on the recurrent events are of primary interest, we review semiparametric regression modelling approaches for panel count data that have been implemented in R package spef. The methods are grouped into two categories depending on whether the examination times are associated with the recurrent event process after conditioning on covariates. The reviewed methods are illustrated with a subset of the data from a skin cancer clinical trial.

9.
Genet Epidemiol ; 41(7): 599-609, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28714590

ABSTRACT

Testing for association between two random vectors is a common and important task in many fields; however, existing tests, such as Escoufier's RV test, are suitable only for low-dimensional data, not for high-dimensional data. In moderate to high dimensions, it is necessary to consider sparse signals, which are often expected, with only a few, rather than many, variables associated with each other. We generalize the RV test to moderate-to-high dimensions. The key idea is to data-adaptively weight each variable pair based on its empirical association. As a consequence, the proposed test is adaptive, alleviating the effects of noise accumulation in high-dimensional data and thus maintaining power for both dense and sparse alternative hypotheses. We show the connections between the proposed test and several existing tests, such as a generalized estimating equations-based adaptive test, multivariate kernel machine regression (KMR), and kernel distance methods. Furthermore, we modify the proposed adaptive test so that it can be powerful for nonlinear or nonmonotonic associations. We use both real data and simulated data to demonstrate the advantages and usefulness of the proposed new test. The new test is freely available in the R package aSPC on CRAN at https://cran.r-project.org/web/packages/aSPC/index.html and https://github.com/jasonzyx/aSPC.


Subject(s)
Computational Biology/methods, Gene Expression Regulation, Models, Statistical, Polymorphism, Single Nucleotide, Software, Computer Simulation, Humans, Transcriptome
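The adaptive-weighting idea behind tests of this family can be sketched without the aSPC package itself. Below is a hedged Python toy (my own simplification, not the package's algorithm): sum-of-powered-correlation statistics across several powers, with a permutation p-value for each power and the minimum p-value calibrated by the same permutations. Larger powers upweight the strongest pairs, which is what preserves power under sparse association.

```python
import numpy as np

rng = np.random.default_rng(5)

def spc_stats(X, Y, gammas=(1, 2, 4, 8)):
    """For each power gamma, sum corr(X_j, Y_k)**gamma over variable pairs."""
    Xc = (X - X.mean(0)) / X.std(0)
    Yc = (Y - Y.mean(0)) / Y.std(0)
    R = Xc.T @ Yc / X.shape[0]              # cross-correlation matrix
    return np.array([np.sum(R ** g) for g in gammas])

def adaptive_pvalue(X, Y, n_perm=199):
    """Permutation p-value per power, then calibrate the minimum p-value
    against the permutation distribution of minima."""
    obs = spc_stats(X, Y)
    perm = np.array([spc_stats(X[rng.permutation(len(X))], Y)
                     for _ in range(n_perm)])
    p_obs = (1 + (perm >= obs).sum(0)) / (1 + n_perm)
    # p-value of each permutation's statistic against all permutations
    p_perm = (1 + (perm[None, :, :] >= perm[:, None, :]).sum(1)) / (1 + n_perm)
    return (1 + (p_perm.min(1) <= p_obs.min()).sum()) / (1 + n_perm)

n = 150
X = rng.normal(size=(n, 8))
Y_dep = rng.normal(size=(n, 8))
Y_dep[:, 0] += X[:, 0]                      # a single sparsely associated pair
p_dep = adaptive_pvalue(X, Y_dep)           # should be small
p_null = adaptive_pvalue(X, rng.normal(size=(n, 8)))
```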
10.
Biometrics ; 74(3): 944-953, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29286532

ABSTRACT

Panel count data arise when the number of recurrent events experienced by each subject is observed intermittently at discrete examination times. The examination time process can be informative about the underlying recurrent event process even after conditioning on covariates. We consider a semiparametric accelerated mean model for the recurrent event process and allow the two processes to be correlated through a shared frailty. The regression parameters have a simple marginal interpretation of modifying the time scale of the cumulative mean function of the event process. A novel estimation procedure for the regression parameters and the baseline rate function is proposed based on a conditioning technique. In contrast to existing methods, the proposed method is robust in the sense that it requires neither the strong Poisson-type assumption for the underlying recurrent event process nor a parametric assumption on the distribution of the unobserved frailty. Moreover, the distribution of the examination time process is left unspecified, allowing for arbitrary dependence between the two processes. Asymptotic consistency of the estimator is established, and the variance of the estimator is estimated by a model-based smoothed bootstrap procedure. Numerical studies demonstrate that the proposed point estimator and variance estimator perform well with practical sample sizes. The methods are applied to data from a skin cancer chemoprevention trial.


Subject(s)
Statistics as Topic/methods, Time Factors, Chemoprevention/methods, Chemoprevention/statistics & numerical data, Clinical Trials as Topic, Computer Simulation, Recurrence, Regression Analysis, Sample Size, Skin Neoplasms/prevention & control
11.
Stat Med ; 37(7): 1086-1100, 2018 Mar 30.
Article in English | MEDLINE | ID: mdl-29205446

ABSTRACT

Various semiparametric regression models have recently been proposed for the analysis of gap times between consecutive recurrent events. Among them, the semiparametric accelerated failure time (AFT) model is especially appealing owing to its direct interpretation of covariate effects on the gap times. In general, estimation of the semiparametric AFT model is challenging because the rank-based estimating function is a nonsmooth step function. As a result, solutions to the estimating equations do not necessarily exist. Moreover, the popular resampling-based variance estimation for the AFT model requires solving rank-based estimating equations repeatedly and hence can be computationally cumbersome and unstable. In this paper, we extend the induced smoothing approach to the AFT model for recurrent gap time data. Our proposed smooth estimating function permits the application of standard numerical methods for both the regression coefficients estimation and the standard error estimation. Large-sample properties and an asymptotic variance estimator are provided for the proposed method. Simulation studies show that the proposed method outperforms the existing nonsmooth rank-based estimating function methods in both point estimation and variance estimation. The proposed method is applied to the data analysis of repeated hospitalizations for patients in the Danish Psychiatric Center Register.


Subject(s)
Biometry/methods, Recurrence, Regression Analysis, Computer Simulation, Denmark, Hospitalization, Humans, Mental Disorders, Patient Readmission, Registries, Time Factors
12.
Stat Med ; 37(6): 996-1008, 2018 Mar 15.
Article in English | MEDLINE | ID: mdl-29171035

ABSTRACT

Alternating recurrent event data arise frequently in clinical and epidemiologic studies, where 2 types of events such as hospital admission and discharge occur alternately over time. The 2 alternating states defined by these recurrent events could each carry important and distinct information about a patient's underlying health condition and/or the quality of care. In this paper, we propose a semiparametric method for evaluating covariate effects on the 2 alternating states jointly. The proposed methodology accounts for the dependence among the alternating states as well as the heterogeneity across patients via a frailty with unspecified distribution. Moreover, the estimation procedure, which is based on smooth estimating equations, not only properly addresses challenges such as induced dependent censoring and intercept sampling bias commonly confronted in serial event gap time data but also is more computationally tractable than the existing rank-based methods. The proposed methods are evaluated by simulation studies and illustrated by analyzing psychiatric contacts from the South Verona Psychiatric Case Register.


Subject(s)
Biometry/methods, Regression Analysis, Adolescent, Adult, Aged, Aged, 80 and over, Computer Simulation, Female, Hospitalization, Humans, Italy, Male, Mental Disorders, Middle Aged, Recurrence, Registries, Risk Factors, Time Factors, Young Adult
13.
Appl Psychol Meas ; 41(8): 579-599, 2017.
Article in English | MEDLINE | ID: mdl-29033476

ABSTRACT

Large-scale assessments are supported by a large item pool. An important task in test development is to assign items into scales that measure different characteristics of individuals, and a popular approach is cluster analysis of items. Classical methods in cluster analysis, such as hierarchical clustering, the K-means method, and latent class analysis, often induce a high computational overhead and have difficulty handling missing data, especially in the presence of high-dimensional responses. In this article, the authors propose a spectral clustering algorithm for exploratory item cluster analysis. The method is computationally efficient, effective for data with missing or incomplete responses, easy to implement, and often outperforms traditional clustering algorithms in the context of high dimensionality. The spectral clustering algorithm is based on graph theory, a branch of mathematics that studies the properties of graphs. The algorithm first constructs a graph of items, characterizing the similarity structure among items. It then extracts item clusters based on the graphical structure, grouping similar items together. The proposed method is evaluated through simulations and an application to the revised Eysenck Personality Questionnaire.
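The two-step recipe in this entry — build an item-similarity graph, then cluster from its spectral structure — can be sketched compactly. A hedged Python toy (my own simplification, not the authors' algorithm): simulate two blocks of items loading on two traits, use absolute inter-item correlations as edge weights, and split items by the sign of the Fiedler vector of the graph Laplacian. All names and the toy data model are assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)

def spectral_two_clusters(S):
    """Two-way item split from the sign pattern of the Fiedler vector
    (eigenvector of the second-smallest eigenvalue) of the unnormalized
    graph Laplacian L = D - S."""
    L = np.diag(S.sum(axis=1)) - S
    _, vecs = np.linalg.eigh(L)          # eigh returns ascending eigenvalues
    return (vecs[:, 1] > 0).astype(int)

# Toy item responses: items 0-4 load on trait 1, items 5-9 on trait 2.
n_persons = 400
theta = rng.normal(size=(n_persons, 2))
loadings = np.zeros((10, 2))
loadings[:5, 0] = 1.0
loadings[5:, 1] = 1.0
resp = theta @ loadings.T + 0.5 * rng.normal(size=(n_persons, 10))

# Similarity graph: absolute inter-item correlations, no self-loops.
S = np.abs(np.corrcoef(resp, rowvar=False))
np.fill_diagonal(S, 0.0)
labels = spectral_two_clusters(S)        # recovers the two item blocks
```

For more than two clusters one would keep several Laplacian eigenvectors and run k-means on the spectral embedding, which is the usual generalization.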

15.
Psychometrika ; 89(2): 717-740, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38517594

ABSTRACT

Cognitive diagnosis models (CDMs) provide a powerful statistical and psychometric tool for researchers and practitioners to learn fine-grained diagnostic information about respondents' latent attributes. There has been a growing interest in the use of CDMs for polytomous response data, as more and more items with multiple response options become widely used. Similar to many latent variable models, the identifiability of CDMs is critical for accurate parameter estimation and valid statistical inference. However, the existing identifiability results are primarily focused on binary response models and have not adequately addressed the identifiability of CDMs with polytomous responses. This paper addresses this gap by presenting sufficient and necessary conditions for the identifiability of the widely used DINA model with polytomous responses, with the aim to provide a comprehensive understanding of the identifiability of CDMs with polytomous responses and to inform future research in this field.


Subject(s)
Models, Statistical, Psychometrics, Humans, Psychometrics/methods, Cognition
16.
Psychometrika ; 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38429494

ABSTRACT

Multidimensional item response theory (MIRT) models have generated increasing interest in the psychometrics literature. Efficient approaches for estimating MIRT models with dichotomous responses have been developed, but constructing an equally efficient and robust algorithm for polytomous models has received limited attention. To address this gap, this paper presents a novel Gaussian variational estimation algorithm for the multidimensional generalized partial credit model. The proposed algorithm demonstrates both fast and accurate performance, as illustrated through a series of simulation studies and two real data analyses.

17.
Psychometrika ; 2024 May 30.
Article in English | MEDLINE | ID: mdl-38814412

ABSTRACT

With the growing attention on large-scale educational testing and assessment, the ability to process substantial volumes of response data becomes crucial. Current estimation methods within item response theory (IRT), despite their high precision, often pose considerable computational burdens with large-scale data, leading to reduced computational speed. This study introduces a novel "divide-and-conquer" parallel algorithm built on the Wasserstein posterior approximation concept, aiming to enhance computational speed while maintaining accurate parameter estimation. This algorithm enables drawing parameters from segmented data subsets in parallel, followed by an amalgamation of these parameters via Wasserstein posterior approximation. Theoretical support for the algorithm is established through asymptotic optimality under certain regularity assumptions. Practical validation is demonstrated using real-world data from the Programme for International Student Assessment. Ultimately, this research proposes a transformative approach to managing educational big data, offering a scalable, efficient, and precise alternative that promises to redefine traditional practices in educational assessments.
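The divide-and-conquer scheme is easiest to see in the one-dimensional Gaussian case, where the Wasserstein barycenter has a closed form (average the means and average the standard deviations). A hedged Python sketch — not the paper's IRT algorithm — for the posterior of a normal mean with known unit variance and a flat prior: each worker raises its shard likelihood to the power M (stochastic approximation), and the barycenter of the M subset posteriors recovers the full-data posterior exactly. The setup and names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# Full data: posterior of a normal mean (known sd = 1, flat prior).
N, M = 10_000, 10                   # N observations split over M workers
x = rng.normal(loc=2.0, scale=1.0, size=N)
shards = np.array_split(x, M)

# Each worker's stochastic-approximation subset posterior: likelihood
# raised to the power M, giving N(shard mean, 1 / (M * n_shard)).
sub_mean = np.array([s.mean() for s in shards])
sub_sd = np.array([np.sqrt(1.0 / (M * s.size)) for s in shards])

# 1-D Wasserstein barycenter of Gaussians: average means, average sds.
bary_mean, bary_sd = sub_mean.mean(), sub_sd.mean()

# Reference: the full-data posterior N(x.mean(), 1/N).
full_mean, full_sd = x.mean(), np.sqrt(1.0 / N)
```

In the Gaussian case the match is exact; the paper's contribution is showing that the barycenter combination remains asymptotically optimal for the non-Gaussian posteriors arising in IRT.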

18.
Stat Sin ; 23, 2013.
Article in English | MEDLINE | ID: mdl-24307816

ABSTRACT

Sellke and Siegmund (1983) developed the Brownian approximation to the Cox partial likelihood score as a process of calendar time, laying the foundation for group sequential analysis of survival studies. We extend their results to cover situations in which treatment allocations may depend on observed outcomes. The new development makes use of the entry time and calendar time along with the corresponding σ-filtrations to handle the natural information accumulation. Large sample properties are established under suitable regularity conditions.

19.
Bernoulli (Andover) ; 19(5A): 1790-1817, 2013 Nov 01.
Article in English | MEDLINE | ID: mdl-24812537

ABSTRACT

Cognitive assessment is a growing area in psychological and educational measurement, where tests are given to assess mastery/deficiency of attributes or skills. A key issue is the correct identification of attributes associated with items in a test. In this paper, we set up a mathematical framework under which theoretical properties may be discussed. We establish sufficient conditions to ensure that the attributes required by each item are learnable from the data.

20.
Psychometrika ; 88(4): 1407-1442, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35648266

ABSTRACT

In recent years, the four-parameter model (4PM) has received increasing attention in item response theory. The purpose of this article is to provide more efficient and more reliable computational tools for fitting the 4PM. In particular, this article focuses on the four-parameter normal ogive (4PNO) model and develops efficient stochastic approximation expectation maximization (SAEM) algorithms to compute the marginalized maximum a posteriori estimator. First, a data augmentation scheme is used for the 4PNO model, which makes the complete-data model an exponential family, and a basic SAEM algorithm is then developed for the 4PNO model. Second, to overcome a drawback of the basic SAEM algorithm, we develop an improved SAEM algorithm for the 4PNO model, called the mixed SAEM (MSAEM). Results from simulation studies demonstrate that: (1) the MSAEM provides estimates that are more accurate than or comparable to those of the other estimation methods while being computationally more efficient; and (2) the MSAEM is more robust to the choice of initial values and of priors for item parameters, which is a valuable property for practical use. Finally, a real data set is analyzed to show the good performance of the proposed methods.


Subject(s)
Algorithms, Psychometrics, Computer Simulation