Results 1 - 8 of 8
1.
Behav Res Methods ; 56(3): 1852-1862, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37326772

ABSTRACT

A popular approach to the simulation of multivariate, non-normal data in the social sciences is to define a multivariate normal distribution first and then alter its lower-dimensional marginals to achieve the distributional shape intended by the researchers. A consequence of this process is that the correlation structure is altered, so further methods are needed to specify an intermediate correlation matrix at the multivariate normal step. Most of the techniques available in the literature estimate this intermediate correlation matrix bivariately (i.e., correlation by correlation), risking the possibility of generating a non-positive definite matrix. The present article addresses this issue by offering an algorithm that estimates all elements of the intermediate correlation matrix simultaneously, through stochastic approximation. A small simulation study demonstrates the feasibility of the method for inducing the intended correlation structure in both simulated and empirical data.
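The attenuation problem the article targets can be seen in a few lines: impose a non-normal marginal on a correlated multivariate normal draw and the correlation shrinks, which is why an intermediate correlation matrix must be solved for in the first place. A minimal sketch (this is not the article's algorithm; the exponential marginal and the value 0.6 are illustrative):

```python
# Why an "intermediate" correlation is needed: transforming normal
# marginals to non-normal ones attenuates the correlation.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
rho_z = 0.6                      # correlation of the underlying normals
cov = np.array([[1.0, rho_z],
                [rho_z, 1.0]])
z = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)

# Impose exponential marginals via the inverse-CDF transform.
u = stats.norm.cdf(z)
x = stats.expon.ppf(u)

rho_x = np.corrcoef(x[:, 0], x[:, 1])[0, 1]
print(f"normal-scale correlation:    {rho_z:.2f}")
print(f"correlation after transform: {rho_x:.3f}")  # falls below 0.6
```

To hit a target correlation of 0.6 on the exponential scale, the normal-scale (intermediate) correlation would have to be set somewhat higher, and doing this one pair at a time is what can break positive definiteness.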


Subject(s)
Algorithms, Humans, Monte Carlo Method, Computer Simulation
2.
BMC Med Res Methodol ; 19(1): 97, 2019 05 09.
Article in English | MEDLINE | ID: mdl-31072299

ABSTRACT

BACKGROUND: Despite the popularity of multilevel logistic regression, estimating power for these models remains difficult because the calculation requires computer-simulation-based approaches. The difficulty is compounded by the fact that the distributions of the predictors influence the power to detect their effects. To address both matters, we present a sample of cases documenting the influence that predictor distributions have on statistical power, as well as a user-friendly, web-based application for conducting power analysis for multilevel logistic regression. METHOD: Computer simulations were implemented to estimate statistical power in multilevel logistic regression with varying numbers of clusters, varying cluster sample sizes, and non-normal, non-symmetrical distributions of the Level 1 and Level 2 predictors. Power curves were simulated to examine how non-normal or unbalanced distributions of a binary predictor and a continuous predictor affect the detection of population effect sizes for main effects, a cross-level interaction, and the variance of the random effects. RESULTS: Skewed continuous predictors and unbalanced binary ones require larger sample sizes at both levels than balanced binary predictors and normally distributed continuous ones. In the most extreme case of imbalance (10% incidence) combined with the skewness of a chi-square distribution with 1 degree of freedom, even 110 Level 2 units and 100 Level 1 units were not sufficient for all predictors to reach power of 80%; power mostly hovered around 50%, with the exception of the skewed, continuous Level 2 predictor.
CONCLUSIONS: Given the complex interactive influence among sample sizes, effect sizes and predictor distribution characteristics, it seems unwarranted to make generic rule-of-thumb sample size recommendations for multilevel logistic regression, aside from the fact that larger sample sizes are required when the distributions of the predictors are not symmetric or balanced. The more skewed or imbalanced the predictor is, the larger the sample size requirements. To assist researchers in planning research studies, a user-friendly web application that conducts power analysis via computer simulations in the R programming language is provided. With this web application, users can conduct simulations, tailored to their study design, to estimate statistical power for multilevel logistic regression models.
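The simulation logic that such an application wraps can be sketched in a deliberately simplified, single-level form. This is not the authors' code: it drops the multilevel structure, tests the binary predictor with a Wald test on the 2x2 log-odds ratio rather than fitting a multilevel model, and every parameter value is invented. It does, however, show the core finding in miniature: imbalance in a binary predictor erodes power at a fixed sample size.

```python
# Monte Carlo power for a binary predictor in logistic regression,
# comparing a balanced (50%) with an unbalanced (10% incidence) predictor.
import numpy as np

rng = np.random.default_rng(7)

def power_binary_predictor(n, p_x, beta0=-1.0, beta1=0.7,
                           reps=2000, alpha_z=1.959964):
    """Fraction of replications in which the Wald test on the
    2x2 log-odds ratio rejects at the two-sided 5% level."""
    hits = 0
    for _ in range(reps):
        x = rng.random(n) < p_x                      # binary predictor
        p = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))
        y = rng.random(n) < p                        # binary outcome
        # 2x2 cell counts; +0.5 continuity correction guards against zeros
        a = np.sum(y & x) + 0.5
        b = np.sum(~y & x) + 0.5
        c = np.sum(y & ~x) + 0.5
        d = np.sum(~y & ~x) + 0.5
        log_or = np.log(a * d / (b * c))
        se = np.sqrt(1/a + 1/b + 1/c + 1/d)
        hits += abs(log_or / se) > alpha_z
    return hits / reps

pow_balanced = power_binary_predictor(300, p_x=0.5)
pow_skewed = power_binary_predictor(300, p_x=0.1)   # 10% incidence
print(f"balanced predictor:   power = {pow_balanced:.2f}")
print(f"10% incidence:        power = {pow_skewed:.2f}")
```

With the same n and the same effect size, the unbalanced predictor loses a large share of its power, which is the pattern the power curves in the article trace out across designs.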


Subject(s)
Computer Simulation/statistics & numerical data, Data Interpretation, Statistical, Logistic Models, Models, Statistical, Humans, Sample Size
3.
Psychol Methods ; 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38095991

ABSTRACT

Polynomial regression is an old and commonly discussed modeling technique, though recommendations for its usage vary widely. Here, we make the case that polynomial regression with second- and third-order terms should be part of every applied practitioner's standard model-building toolbox and should be taught to new students of the subject as the default technique for modeling nonlinearity. We argue that polynomial regression is superior to nonparametric alternatives for nonstatisticians because of its ease of interpretation, its flexibility, and its nonreliance on sophisticated mathematical machinery such as knots and kernel smoothing. This makes it the ideal default for nonstatisticians interested in building realistic models that can capture global as well as local effects of predictors on a response variable. Low-order polynomial regression can effectively model compact floor and ceiling effects and local linearity, and it can prevent inferring the presence of spurious interaction effects between distinct predictors when none are present. We also argue that the case against polynomial regression is largely specious, relying on misconceptions about the method, strawman arguments, or historical artifacts. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
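Part of the article's appeal-to-simplicity is that fitting the recommended second- and third-order terms requires nothing beyond ordinary least squares. A minimal sketch with invented data (the generating coefficients and noise level are illustrative):

```python
# Third-order polynomial regression via OLS on an expanded design matrix.
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-2, 2, size=500)
y = 1.0 + 0.5 * x - 0.8 * x**2 + 0.3 * x**3 + rng.normal(0, 0.5, size=500)

# Design matrix: intercept, linear, quadratic, and cubic terms
X = np.column_stack([np.ones_like(x), x, x**2, x**3])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 2))  # close to the generating coefficients
```

No knots, no bandwidths, no smoothing parameters: the fitted curve is a single global equation whose coefficients can be read and reported directly.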

4.
Educ Psychol Meas ; 82(3): 517-538, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35444337

ABSTRACT

Setting cutoff scores is one of the most common practices when scales are used to aid classification. The process is usually carried out univariately, with each optimal cutoff value decided sequentially, subscale by subscale. While it is widely known that this process necessarily reduces the probability of "passing" such a test, what is not properly recognized is that the test loses power to meaningfully discriminate between target groups with each new subscale that is introduced. We quantify and describe this property via an analytical exposition highlighting the counterintuitive geometry implied by marginal threshold-setting in multiple dimensions. Recommendations are presented that encourage applied researchers to think jointly, rather than marginally, when setting cutoff scores to ensure an informative test.
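The geometry in question is easy to demonstrate by Monte Carlo: with a marginal cutoff at the 80th percentile on each subscale and moderately correlated subscales, the joint pass rate falls with every subscale added. A small sketch (the correlation of 0.5 and the cutoff are illustrative choices, not values from the article):

```python
# Joint pass rate under marginal 80th-percentile cutoffs, as a function
# of the number of correlated subscales.
import numpy as np

rng = np.random.default_rng(11)
rho, cutoff = 0.5, 0.8416        # z-score of the 80th percentile
n_draws = 400_000

rates = []
for k in (1, 2, 3, 5):
    cov = np.full((k, k), rho)   # equicorrelated subscales
    np.fill_diagonal(cov, 1.0)
    z = rng.multivariate_normal(np.zeros(k), cov, size=n_draws)
    p_pass = np.mean(np.all(z > cutoff, axis=1))
    rates.append(p_pass)
    print(f"{k} subscale(s): pass rate = {p_pass:.3f}")
```

Each cutoff looks reasonable marginally (20% pass), yet the joint pass region shrinks with every added dimension, so the battery as a whole classifies an ever-smaller and more extreme group.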

5.
Educ Psychol Meas ; 79(5): 813-826, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31488914

ABSTRACT

Within the context of moderated multiple regression, mean centering is recommended both to simplify the interpretation of the coefficients and to reduce the problem of multicollinearity. For almost 30 years, theoreticians and applied researchers have advocated for centering as an effective way to reduce the correlation between variables and thus produce more stable estimates of regression coefficients. By reviewing the theory on which this recommendation is based, this article presents three new findings. First, the original assumption of expectation independence among predictors on which this recommendation rests can be expanded to encompass many other joint distributions. Second, for many jointly distributed random variables, even some that enjoy considerable symmetry, the correlation between the centered main effects and their respective interaction can increase when compared with the correlation of the uncentered effects. Third, the higher-order moments of the joint distribution play as much of a role as the lower-order moments, such that symmetry of the lower-dimensional marginals is a necessary but not sufficient condition for a decrease in correlation between centered main effects and their interaction. Theoretical and simulation results are presented to help conceptualize the issues.
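The textbook case that motivates the recommendation is easy to reproduce; the article's contribution is in showing when this picture breaks down. A sketch of the standard demonstration (independent normal predictors with nonzero means; all values illustrative):

```python
# The classic centering demonstration: for independent normal predictors
# with nonzero means, centering removes the correlation between a
# predictor and its product (interaction) term.
import numpy as np

rng = np.random.default_rng(5)
n = 200_000
x = rng.normal(2.0, 1.0, size=n)
z = rng.normal(2.0, 1.0, size=n)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

r_raw = corr(x, x * z)                 # uncentered main effect vs product
xc, zc = x - x.mean(), z - z.mean()
r_centered = corr(xc, xc * zc)         # centered main effect vs product

print(f"uncentered: r = {r_raw:.3f}")   # substantial
print(f"centered:   r = {r_centered:.3f}")  # near zero
```

The article's point is that this tidy outcome depends on the joint distribution: for other (even fairly symmetric) joint distributions, the centered correlation can come out larger than the uncentered one.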

6.
Br J Math Stat Psychol ; 71(3): 437-458, 2018 11.
Article in English | MEDLINE | ID: mdl-29323414

ABSTRACT

The Fleishman third-order polynomial algorithm is one of the most frequently used non-normal data-generating methods in Monte Carlo simulations. At the crux of the Fleishman method is the solution of a non-linear system of equations needed to obtain the constants that transform data from normality to non-normality. A rarely acknowledged fact in the literature is that the solution to this system is not unique, and it is currently unknown what influence the different types of solutions have on the computer-generated data. To address this issue, analytical and empirical investigations were conducted, aimed at documenting the impact that each solution type has on the design of computer simulations. In the first study, it was found that certain types of solutions generate data with different multivariate properties and wider coverage of the theoretical range spanned by the population correlations. In the second study, it was found that previously published recommendations from Monte Carlo simulations could change if different types of solutions were used to generate the data. A mathematical description of the multiple solutions to the Fleishman polynomials is provided, as well as recommendations for users of the method.


Subject(s)
Algorithms, Models, Statistical, Monte Carlo Method, Computer Simulation, Data Interpretation, Statistical, Humans, Multivariate Analysis
7.
Early Hum Dev ; 115: 99-109, 2017 12.
Article in English | MEDLINE | ID: mdl-29049945

ABSTRACT

BACKGROUND: Very little research has examined the Ages and Stages Questionnaire both from a modern latent variable point of view and in terms of how its psychometric properties change over time. AIMS: To explore the latent factor structure of the ASQ using Exploratory Structural Equation Modeling techniques for ordinal data, and to investigate its change over time using the method of vertical scaling from multidimensional Item Response Theory. STUDY DESIGN: Longitudinal, with the same children assessed at multiple timepoints. SUBJECTS: Children measured using the 12-, 14-, 16-, 18-, 20-, 22-, 24-, 27-, 30-, 33-, 36-, 42-, and 48-month questionnaires of the ASQ. The initial sample (12 months) consisted of 2,219 children and, due to drop-out, the final sample (48 months) of 892 children. OUTCOME MEASURES: Ages and Stages Questionnaire, Third Edition (ASQ-3). RESULTS: All ASQ-3 age questionnaires examined showed the proposed 5-factor structure (except the 12-month version), but with different patterns over time. The Gross Motor domain had the fewest misfitting items from 12 months onwards, while the Personal-Social and Problem Solving domains had larger numbers of misfitting items. The vertical scaling analysis showed that the Problem Solving and Personal-Social dimensions also exhibited the most complex patterns of change over time. CONCLUSIONS: The psychometric properties of the ASQ-3 appear to be both time-dependent and domain-dependent. The earlier questionnaires reflect a latent structure that is not as well defined as in later versions, and domains such as Communication and Gross Motor appear to be much more reliably measured than others, such as Problem Solving and Personal-Social.


Subject(s)
Child Development, Neuropsychological Tests/standards, Surveys and Questionnaires/standards, Child, Preschool, Female, Humans, Infant, Male, Problem Solving, Psychometrics, Social Behavior
8.
Educ Psychol Meas ; 75(4): 541-567, 2015 Aug.
Article in English | MEDLINE | ID: mdl-29795832

ABSTRACT

To further understand the properties of data-generation algorithms for multivariate, non-normal data, two Monte Carlo simulation studies comparing the Vale and Maurelli method and the Headrick fifth-order polynomial method were implemented. Combinations of skewness and kurtosis found in four published articles were run, and attention was paid specifically to the quality of the sample estimates of univariate skewness and kurtosis. In the first study, it was found that the Vale and Maurelli algorithm yielded downward-biased estimates of skewness and kurtosis (particularly at small sample sizes) that were also highly variable. The method was also prone to generating extreme sample kurtosis values when the population kurtosis was high. The estimates obtained from Headrick's algorithm were also biased downward, but much less so, and were much less variable, than those obtained through Vale and Maurelli. The second study reproduced the first simulation in the Curran, West, and Finch article using both the Vale and Maurelli method and the Headrick method. The chi-square values and empirical rejection rates changed depending on which data-generation method was used, sometimes sufficiently so that some of the authors' original conclusions would no longer hold. In closing, recommendations are presented regarding the relative merits of each algorithm.
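The small-sample bias at issue is visible even without either data-generation algorithm: the conventional sample skewness understates the population skewness of a strongly skewed distribution. A quick illustration with chi-square data (the n, degrees of freedom, and replication count are invented, not the articles' design):

```python
# Downward bias of sample skewness at small n, for a chi-square(1)
# population whose skewness is known analytically.
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)
df, n, reps = 1, 50, 5000
pop_skew = np.sqrt(8 / df)          # chi-square skewness = sqrt(8/df)

samples = rng.chisquare(df, size=(reps, n))
mean_sample_skew = np.mean(stats.skew(samples, axis=1))
print(f"population skewness:          {pop_skew:.2f}")
print(f"mean sample skewness (n={n}): {mean_sample_skew:.2f}")  # well below
```

The articles' comparison is precisely about how much of this bias and variability each polynomial algorithm adds on top of the estimator's intrinsic small-sample behavior.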
