Results 1 - 20 of 281
1.
Genet Epidemiol ; 2024 Nov 05.
Article in English | MEDLINE | ID: mdl-39498871

ABSTRACT

We propose two novel one-sample Mendelian randomization (MR) approaches to causal inference from count-type health outcomes, tailored to both equidispersion and overdispersion conditions. Selecting valid single-nucleotide polymorphisms (SNPs) as instrumental variables (IVs) poses a key challenge for MR approaches, as it requires meeting the necessary IV assumptions. To bolster the proposed approaches against violations of the IV assumptions, we incorporate a procedure for removing invalid SNPs that violate them. In simulations, the proposed approaches are robust to such violations, delivering valid estimates together with interpretable type-I error rates and statistical power, which increases their practical applicability. We applied the proposed approaches to evaluate the causal effect of fetal hemoglobin (HbF) on vaso-occlusive crisis and acute chest syndrome (ACS) events in patients with sickle cell disease (SCD) and revealed a causal relation between HbF and ACS events in these patients. We also developed a user-friendly Shiny web application to facilitate researchers' exploration of causal relations.

2.
Biometrics ; 80(1)2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38497823

ABSTRACT

In longitudinal follow-up studies, panel count data arise from discrete observations of recurrent events. We investigate a more general situation in which a partly interval-censored failure event is informative for the recurrent events. Existing methods for an informative failure event are based on latent variable models, which provide only an indirect interpretation of the failure event's effect. To address this, we propose a failure-time-dependent proportional mean model for panel count data through an unspecified link function. For estimating the model parameters, we consider a conditional expectation of the least squares function to overcome the challenges arising from partly interval-censoring, and we develop a two-stage estimation procedure that treats the distribution function of the failure time as a functional nuisance parameter and uses B-spline functions to approximate the unknown baseline mean and link functions. Furthermore, we derive the overall convergence rate of the proposed estimators and establish the asymptotic normality of the finite-dimensional estimator and of functionals of the infinite-dimensional estimator. The proposed estimation procedure is evaluated through extensive simulation studies, in which the finite-sample performance coincides with the theoretical results. We further illustrate our method with a longitudinal healthy longevity study and draw some insightful conclusions.


Subject(s)
Health Status; Computer Simulation
3.
Biometrics ; 80(2)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38682464

ABSTRACT

Current Poisson factor models often assume that the factors are unknown, which overlooks the explanatory potential of certain observable covariates. This study focuses on high-dimensional settings, where the number of count response variables and/or covariates can diverge as the sample size increases. A covariate-augmented overdispersed Poisson factor model is proposed to jointly perform high-dimensional Poisson factor analysis and estimate a large coefficient matrix for overdispersed count data. A set of identifiability conditions is provided to theoretically guarantee computational identifiability. We incorporate the interdependence of both response variables and covariates by imposing a low-rank constraint on the large coefficient matrix. To address the computational challenges posed by the nonlinearity, the two high-dimensional latent matrices, and the low-rank constraint, we propose a novel variational estimation scheme that combines Laplace and Taylor approximations. We also develop a criterion based on a singular value ratio to determine the number of factors and the rank of the coefficient matrix. Comprehensive simulation studies demonstrate that the proposed method outperforms state-of-the-art methods in estimation accuracy and computational efficiency. The practical merit of our method is demonstrated by an application to a CITE-seq dataset. A flexible implementation of the proposed method is available in the R package COAP.


Subject(s)
Computer Simulation; Models, Statistical; Poisson Distribution; Humans; Sample Size; Biometry/methods; Factor Analysis, Statistical
4.
Biometrics ; 80(1)2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38465988

ABSTRACT

Mixed panel count data are a common complex data structure in longitudinal survey studies. A major challenge in analyzing such data is variable selection and estimation while efficiently incorporating both the panel count and panel binary data components. Analyses in the medical literature have often ignored the panel binary component and treated it as missing along with the unknown panel counts, but such a simplification clearly does not make full use of the information in the original data. In this research, we propose a penalized likelihood variable selection and estimation procedure under the proportional mean model. A computationally efficient EM algorithm is developed that ensures sparse estimation for variable selection, and the resulting estimator is shown to have the desirable oracle property. Simulation studies confirmed the good finite-sample properties of the proposed method, and the method is applied to analyze a motivating dataset from the Health and Retirement Study.


Subject(s)
Algorithms; Likelihood Functions; Computer Simulation; Longitudinal Studies
5.
Biometrics ; 80(3)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39073775

ABSTRACT

Recent breakthroughs in spatially resolved transcriptomics (SRT) technologies have enabled comprehensive molecular characterization at the spot or cellular level while preserving spatial information. Cells are the fundamental building blocks of tissues, organized into distinct yet connected components. Although many non-spatial and spatial clustering approaches have been used to partition the entire region into mutually exclusive spatial domains based on the SRT high-dimensional molecular profile, most require an ad hoc selection of less interpretable dimensional-reduction techniques. To overcome this challenge, we propose a zero-inflated negative binomial mixture model to cluster spots or cells based on their molecular profiles. To increase interpretability, we employ a feature selection mechanism to provide a low-dimensional summary of the SRT molecular profile in terms of discriminating genes that shed light on the clustering result. We further incorporate the SRT geospatial profile via a Markov random field prior. We demonstrate how this joint modeling strategy improves clustering accuracy, compared with alternative state-of-the-art approaches, through simulation studies and 3 real data applications.


Subject(s)
Bayes Theorem; Computer Simulation; Gene Expression Profiling; Cluster Analysis; Gene Expression Profiling/methods; Gene Expression Profiling/statistics & numerical data; Humans; Transcriptome; Markov Chains; Models, Statistical; Data Interpretation, Statistical
6.
BMC Med Res Methodol ; 24(1): 75, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38532325

ABSTRACT

BACKGROUND: Diabetes is one of the top four non-communicable diseases causing death and illness worldwide. This study aims to use an efficient count data model to estimate socio-environmental factors associated with diabetes incidence in mainland Tanzania, addressing the lack of evidence on efficient count data models for estimating factors associated with disparities in disease incidence. METHODS: This study analyzed diabetes counts collected in 2020 from 184 councils in mainland Tanzania. The study applied generalized Poisson (GP), negative binomial (NB), and Poisson count data models and evaluated their adequacy using information criteria and Pearson chi-square values. RESULTS: The data were over-dispersed, as evidenced by the mean and variance values and the positively skewed histograms. The results revealed an uneven distribution of diabetes incidence across geographical locations, with northern and urban councils having more cases. Factors such as population, GDP, and the number of hospitals were associated with diabetes counts. The GP model performed better than the NB and Poisson models. CONCLUSION: The occurrence of diabetes can be attributed to geographical location, so environmental interventions can be implemented to address this public health issue. Additionally, the generalized Poisson model is an effective tool for analyzing health information system count data across different population subgroups.
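
To make the model comparison concrete, here is a minimal, hedged sketch (not the study's code or data): simulated overdispersed council-level counts are fitted with Poisson, negative binomial, and generalized Poisson regressions in Python's statsmodels, and the fits are compared by AIC/BIC. The covariates and coefficient values are hypothetical.

import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.discrete_model import GeneralizedPoisson

rng = np.random.default_rng(0)
n = 184                                    # e.g., number of councils
population = rng.uniform(0.05, 3.0, n)     # hypothetical covariate (millions)
hospitals = rng.integers(1, 30, n)         # hypothetical covariate
mu = np.exp(0.5 + 0.6 * population + 0.03 * hospitals)
counts = rng.negative_binomial(2, 2 / (2 + mu))   # overdispersed outcome

X = sm.add_constant(np.column_stack([population, hospitals]))
fits = {
    "Poisson": sm.Poisson(counts, X).fit(disp=False),
    "NegBin": sm.NegativeBinomial(counts, X).fit(disp=False),
    "GenPoisson": GeneralizedPoisson(counts, X).fit(disp=False),
}
for name, res in fits.items():
    # Lower AIC/BIC indicates a better trade-off between fit and complexity.
    print(f"{name:11s} AIC={res.aic:8.1f}  BIC={res.bic:8.1f}")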


Subject(s)
Diabetes Mellitus; Models, Statistical; Humans; Incidence; Tanzania; Poisson Distribution
7.
Dev Sci ; 27(4): e13499, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38544371

ABSTRACT

Scale errors are intriguing phenomena in which a child tries to perform an object-specific action on a tiny object. Several viewpoints exist to explain the developmental mechanisms underlying scale errors; however, there is no unified account of how different factors interact to affect scale errors, and the statistical approaches used in previous research do not adequately capture the structure of the data. By conducting a secondary analysis of datasets aggregated across nine different studies (n = 528) and using more appropriate statistical methods, this study provides a more accurate description of the development of scale errors. We implemented zero-inflated Poisson (ZIP) regression, which directly handles count data with an excess of zero observations, and treated the developmental indices as continuous variables. The results suggested that the developmental trend of scale errors is better described by an inverted U-shaped curve than by a simple linear function, although nonlinearity captured different aspects of the scale errors in the laboratory and classroom data. We also found that repeated experience with scale error tasks reduced the number of scale errors, whereas girls made more scale errors than boys. Furthermore, a model comparison approach revealed that predicate vocabulary size (e.g., adjectives or verbs) predicted developmental changes in scale errors better than noun vocabulary size, particularly in terms of the presence or absence of scale errors. The application of the ZIP model enables researchers to discern how different factors affect scale error production, thereby providing new insights into the mechanisms underlying these phenomena. A video abstract of this article can be viewed at https://youtu.be/1v1U6CjDZ1Q
RESEARCH HIGHLIGHTS: We fit a large dataset, created by aggregating existing scale error data, with the zero-inflated Poisson (ZIP) model. Scale errors peaked along the different developmental indices, but the underlying statistical structure differed between the in-lab and classroom datasets. Repeated experience with scale error tasks and the children's gender affected the number of scale errors produced per session. Predicate vocabulary size (e.g., adjectives or verbs) better predicts developmental changes in scale errors than noun vocabulary size.
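
As a rough illustration of the modelling strategy (simulated data only; the predictors age_months and vocab are hypothetical stand-ins, not the study's variables), a zero-inflated Poisson regression with an inverted-U age trend in the count part can be fitted in Python's statsmodels as follows.

import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(1)
n = 528
age_months = rng.uniform(16, 40, n)            # hypothetical developmental index
vocab = rng.poisson(60, n)                     # hypothetical vocabulary size
# Inverted-U trend via age and age^2 in the count part of the model
mu = np.exp(-4.0 + 0.35 * age_months - 0.006 * age_months**2 + 0.004 * vocab)
p_structural_zero = 1 / (1 + np.exp(-(1.0 - 0.02 * vocab)))
y = np.where(rng.random(n) < p_structural_zero, 0, rng.poisson(mu))

X_count = sm.add_constant(np.column_stack([age_months, age_months**2, vocab]))
X_infl = sm.add_constant(vocab)                # covariate for the zero-inflation part
zip_fit = ZeroInflatedPoisson(y, X_count, exog_infl=X_infl,
                              inflation="logit").fit(method="bfgs",
                                                     maxiter=500, disp=False)
print(zip_fit.summary())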


Subject(s)
Vocabulary; Humans; Poisson Distribution; Child; Female; Male; Child Development/physiology; Child, Preschool; Models, Statistical
8.
BMC Public Health ; 24(1): 901, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38539086

ABSTRACT

BACKGROUND: Count time series (e.g., daily deaths) are a very common type of data in environmental health research. Such series are generally autocorrelated, whereas the widely used generalized linear model assumes independent outcomes. None of the existing methods for modelling parameter-driven count time series yields consistent and reliable standard errors of the parameter estimates, potentially inflating the type I error rate. METHODS: We propose a new maximum significant ρ correction (MSRC) method that uses information on significant autocorrelation coefficient (ρ) estimates within five orders, obtained by moment estimation. A Monte Carlo simulation was conducted to evaluate and compare the finite-sample performance of the MSRC and the classical unbiased correction (UB-corrected) method. We demonstrate a real-data analysis assessing the effect of drunk-driving regulations on the incidence of road traffic injuries (RTIs) in Shenzhen, China, using MSRC. To our knowledge, no previous paper has assessed a time-varying intervention effect while accounting for autocorrelation in daily RTI data. RESULTS: Both methods had a small bias in the regression coefficients. The autocorrelation coefficient estimated by the UB-corrected method is slightly underestimated at high autocorrelation (≥ 0.6), leading to inflation of the type I error rate. The new method controlled the type I error rate well when the sample size reached 340. Moreover, the power of MSRC increased with increasing sample size and effect size and with decreasing nuisance parameters; it approached that of the UB-corrected method when ρ was small (≤ 0.4) but became more reliable as autocorrelation increased further. The daily RTI data exhibited significant autocorrelation after controlling for potential confounding, so the MSRC was preferable to the UB-corrected method. The intervention contributed to decreases in the incidence of RTIs of 8.34% (95% CI, -5.69% to 20.51%), 45.07% (95% CI, 25.86% to 59.30%) and 42.94% (95% CI, 9.56% to 64.00%) at 1, 3 and 5 years after its implementation, respectively. CONCLUSIONS: The proposed MSRC method provides a reliable and consistent approach for modelling parameter-driven time series of autocorrelated count data, offering improved estimation compared with existing methods. Strict drunk-driving regulations can reduce the risk of RTIs.
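
The problem the MSRC method targets, residual autocorrelation in a parameter-driven count series, can be checked with a short diagnostic. The sketch below (simulated data; it does not implement MSRC itself) fits a naive Poisson GLM to a daily series driven by a latent AR(1) process and prints the autocorrelation of the Pearson residuals; non-negligible values at low lags are the situation in which naive standard errors understate uncertainty.

import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(2)
n_days = 3 * 365
t = np.arange(n_days)
season = np.sin(2 * np.pi * t / 365.25)        # simple seasonal covariate
latent = np.zeros(n_days)                      # latent AR(1) -> autocorrelation
for i in range(1, n_days):
    latent[i] = 0.7 * latent[i - 1] + rng.normal(0, 0.3)
y = rng.poisson(np.exp(1.5 + 0.3 * season + latent))

X = sm.add_constant(season)
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print("Pearson-residual ACF, lags 1-5:",
      np.round(acf(fit.resid_pearson, nlags=5)[1:], 3))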


Subject(s)
Time Factors; Humans; Linear Models; Computer Simulation; Bias; China
9.
Int J Biometeorol ; 68(3): 581-593, 2024 Mar.
Article in English | MEDLINE | ID: mdl-36607447

ABSTRACT

This study investigates empirically how natural snow depth and permanent snow affect the number of new second homes in Norway. One in four Norwegian municipalities is partly covered by glaciers and permanent snow. Over the winter seasons of 1983-2020, average snow depth declined from 50 to 35 cm (based on 41 popular second-home areas in the mountains). Results from a fixed-effects Poisson estimator with spatial elements show a significant and positive relationship between natural snow depth in a municipality and the number of second homes started there. There is also a significant and negative relationship between the number of new second homes in a municipality and scarcity of snow in the surrounding municipalities. However, the magnitude of both effects is small. The estimates also show a strong positive relationship between the proportion of surface covered by permanent snow or glaciers in a municipality and new second homes. This implies that a decline in permanent snow and glaciers may make these areas less attractive as locations for second homes.
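
A bare-bones version of a fixed-effects Poisson count regression of this kind (simulated data; the covariate names and the omission of the paper's spatial terms are simplifying assumptions) can be written with municipality dummy variables in statsmodels:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_munis, n_years = 41, 38
N = n_munis * n_years
df = pd.DataFrame({
    "municipality": np.repeat(np.arange(n_munis), n_years),
    "snow_depth": rng.uniform(20, 80, N),              # cm, hypothetical
    "neighbour_snow_scarce": rng.binomial(1, 0.3, N),  # hypothetical indicator
})
muni_effect = rng.normal(0, 0.4, n_munis)[df["municipality"]]
rate = np.exp(0.3 + 0.01 * df["snow_depth"]
              - 0.10 * df["neighbour_snow_scarce"] + muni_effect)
df["new_second_homes"] = rng.poisson(rate)

# C(municipality) adds one dummy per municipality, i.e., the fixed effects.
fe_fit = smf.poisson(
    "new_second_homes ~ snow_depth + neighbour_snow_scarce + C(municipality)",
    data=df).fit(disp=False)
print(fe_fit.params[["snow_depth", "neighbour_snow_scarce"]])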


Subject(s)
Environmental Monitoring; Snow; Environmental Monitoring/methods; Seasons; Ice Cover
10.
Multivariate Behav Res ; 59(3): 502-522, 2024.
Article in English | MEDLINE | ID: mdl-38348679

ABSTRACT

In psychology and education, tests (e.g., reading tests) and self-reports (e.g., clinical questionnaires) generate counts, but the corresponding Item Response Theory (IRT) methods are underdeveloped compared with those for binary data. Recent advances include the Two-Parameter Conway-Maxwell-Poisson Model (2PCMPM), which generalizes Rasch's Poisson Counts Model with item-specific difficulty, discrimination, and dispersion parameters. Explaining differences in model parameters informs item construction and selection but has received little attention. We introduce two 2PCMPM-based explanatory count IRT models: the Distributional Regression Test Model for item covariates and the Count Latent Regression Model for (categorical) person covariates. Estimation methods are provided, and satisfactory statistical properties are observed in simulations. Two examples illustrate how the models help to understand tests and the underlying constructs.


Subject(s)
Models, Statistical; Humans; Regression Analysis; Reproducibility of Results; Computer Simulation/statistics & numerical data; Poisson Distribution; Psychometrics/methods; Data Interpretation, Statistical
11.
Behav Sci Law ; 42(4): 385-400, 2024.
Article in English | MEDLINE | ID: mdl-38762888

ABSTRACT

This study explores the offender, victim, and environmental characteristics that significantly influence the number of days a sexual homicide victim remains undiscovered. Utilizing a sample of 269 cases from the Homicide Investigation Tracking System database, an in-depth analysis was conducted to unveil the factors contributing to the delay in the discovery of victims' bodies. The methodological approach applies negative binomial regression, which accommodates count data and specifically addresses the over-dispersion and excess zeros in the dependent variable, the number of days until the victim is found. The findings reveal that certain offender characteristics, victim traits, and spatio-temporal factors play a pivotal role in the time lag in locating the bodies of homicide victims. These findings have crucial implications for investigative efforts in homicide cases, offering insights that can inform and enhance the efficacy and efficiency of future investigative procedures and strategies.
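
For readers unfamiliar with the technique, the following is a minimal negative binomial regression sketch on simulated data; the predictors (an outdoor-disposal indicator and victim age) are hypothetical placeholders, not variables reported by the study.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 269
outdoor = rng.binomial(1, 0.5, n)            # hypothetical: body left outdoors
victim_age = rng.normal(30, 10, n)           # hypothetical victim age
mu = np.exp(1.0 + 0.8 * outdoor + 0.01 * victim_age)
days_until_found = rng.negative_binomial(1.2, 1.2 / (1.2 + mu))  # overdispersed

X = sm.add_constant(np.column_stack([outdoor, victim_age]))
nb_fit = sm.NegativeBinomial(days_until_found, X).fit(disp=False)
print("Rate ratios:", np.round(np.exp(nb_fit.params[1:3]), 3))   # outdoor, age
print("Dispersion alpha:", round(float(nb_fit.params[-1]), 3))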


Subject(s)
Crime Victims; Homicide; Sex Offenses; Humans; Male; Female; Adult; Sex Offenses/psychology; Criminals/psychology; Middle Aged; Time Factors; Young Adult; Adolescent; Aged; Autopsy
12.
Biom J ; 66(3): e2200342, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38616336

ABSTRACT

Quantitative trait locus (QTL) mapping of count data has attracted wide attention from researchers. In applied research, the conventional Poisson model is frequently limited in the analysis of count phenotypes by overdispersion and by excess zeros and ones. In this article, a novel model, the zero-and-one-inflated generalized Poisson (ZOIGP) model, is proposed to deal with these problems. Based on the proposed model, a score test is performed for the inflation parameter, comparing the ZOIGP model with a constant proportion of excess zeros and ones against a standard generalized Poisson model. To illustrate the practicability of the ZOIGP model, we extend it to QTL interval mapping for count phenotypes with excess zeros and ones. The genetic effects are estimated using an expectation-maximization algorithm with an embedded Newton-Raphson step, and a genome-wide scan with likelihood ratio tests is performed to map and test potential QTLs. The statistical properties of the proposed method are investigated through simulation. Finally, a real data example is used to illustrate the utility of the proposed method for QTL mapping.
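
For orientation, one standard way to write such a zero-and-one-inflated generalized Poisson probability mass function is sketched below in LaTeX; the inflation weights \varphi_0, \varphi_1, the generalized Poisson parameters \lambda, \theta, and this exact parameterization are illustrative and may differ from the article's notation.

P(Y = y) =
\begin{cases}
  \varphi_0 + (1 - \varphi_0 - \varphi_1)\, f_{\mathrm{GP}}(0;\lambda,\theta), & y = 0,\\
  \varphi_1 + (1 - \varphi_0 - \varphi_1)\, f_{\mathrm{GP}}(1;\lambda,\theta), & y = 1,\\
  (1 - \varphi_0 - \varphi_1)\, f_{\mathrm{GP}}(y;\lambda,\theta), & y \ge 2,
\end{cases}
\qquad
f_{\mathrm{GP}}(y;\lambda,\theta) = \frac{\lambda(\lambda + \theta y)^{y-1}}{y!}\, e^{-(\lambda + \theta y)}.

In this form, the score test for the inflation parameters corresponds to testing \varphi_0 = \varphi_1 = 0, i.e., reduction to the standard generalized Poisson model.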


Subject(s)
Algorithms; Quantitative Trait Loci; Computer Simulation; Data Analysis; Phenotype
13.
Lifetime Data Anal ; 30(4): 721-741, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38805094

ABSTRACT

Panel count regression is often required in recurrent event studies, where the interest lies in modelling the event rate. Existing rate models cannot handle time-varying covariate effects because of theoretical and computational difficulties. Mean models provide a viable alternative but are constrained by the monotonicity assumption, which tends to be violated when covariates fluctuate over time. In this paper, we present a new semiparametric rate model for panel count data along with the related theoretical results. For model fitting, we present an efficient EM algorithm with three different methods for variance estimation. The algorithm allows us to sidestep the challenges of numerical integration and the difficulties of the iterative convex minorant algorithm. We show that the estimators are consistent and asymptotically normally distributed. Simulation studies confirm excellent finite-sample performance. To illustrate, we analyze data from a real clinical study of behavioral risk factors for sexually transmitted infections.


Subject(s)
Algorithms; Computer Simulation; Models, Statistical; Humans; Sexually Transmitted Diseases; Regression Analysis; Time Factors
14.
Behav Res Methods ; 56(7): 7963-7984, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38987450

ABSTRACT

Generalized linear mixed models (GLMMs) have great potential for dealing with count data in single-case experimental designs (SCEDs). However, applied researchers face challenges in making various statistical decisions when using such advanced statistical techniques in their own research. This study focused on a critical issue: selecting an appropriate distribution to handle different types of count data in SCEDs affected by overdispersion and/or zero-inflation. To this end, I propose two model selection frameworks, one based on information criteria (AIC and BIC) and another based on a multistage model selection procedure. Four data scenarios were simulated: Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB). The same set of models (i.e., Poisson, NB, ZIP, and ZINB) was fitted for each scenario. In the simulation, I evaluated 10 model selection strategies within the two frameworks by assessing model selection bias and its consequences for the accuracy of the treatment effect estimates and inferential statistics. Based on the simulation results and previous work, I provide recommendations on which model selection methods to adopt in different scenarios. The implications, limitations, and future research directions are also discussed.
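
The information-criterion framework can be illustrated in a few lines. The sketch below (simulated single-case-style counts; the phase structure, sample size, and parameter values are assumptions, not the study's simulation design) fits Poisson, NB, ZIP, and ZINB models to the same series with Python's statsmodels and compares their AIC/BIC.

import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import (ZeroInflatedPoisson,
                                              ZeroInflatedNegativeBinomialP)

rng = np.random.default_rng(5)
n = 60
phase = np.r_[np.zeros(30), np.ones(30)]       # 0 = baseline, 1 = treatment
mu = np.exp(0.4 + 0.9 * phase)
y_nb = rng.negative_binomial(1.5, 1.5 / (1.5 + mu))   # overdispersed counts
y = np.where(rng.random(n) < 0.25, 0, y_nb)           # add structural zeros

X = sm.add_constant(phase)
const_infl = np.ones((n, 1))                   # constant-only inflation part
models = {
    "Poisson": sm.Poisson(y, X),
    "NB": sm.NegativeBinomial(y, X),
    "ZIP": ZeroInflatedPoisson(y, X, exog_infl=const_infl),
    "ZINB": ZeroInflatedNegativeBinomialP(y, X, exog_infl=const_infl),
}
for name, model in models.items():
    res = model.fit(method="bfgs", maxiter=1000, disp=False)
    print(f"{name:8s} AIC={res.aic:7.1f}  BIC={res.bic:7.1f}")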


Subject(s)
Monte Carlo Method; Linear Models; Humans; Single-Case Studies as Topic; Computer Simulation; Data Interpretation, Statistical; Models, Statistical; Poisson Distribution; Research Design
15.
Behav Res Methods ; 56(4): 2765-2781, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38383801

ABSTRACT

Count outcomes are frequently encountered in single-case experimental designs (SCEDs). Generalized linear mixed models (GLMMs) have shown promise in handling overdispersed count data. However, the presence of excessive zeros in the baseline phase of SCEDs introduces a more complex issue known as zero-inflation, often overlooked by researchers. This study aimed to deal with zero-inflated and overdispersed count data within a multiple-baseline design (MBD) in single-case studies. It examined the performance of various GLMMs (Poisson, negative binomial [NB], zero-inflated Poisson [ZIP], and zero-inflated negative binomial [ZINB] models) in estimating treatment effects and generating inferential statistics. Additionally, a real example was used to demonstrate the analysis of zero-inflated and overdispersed count data. The simulation results indicated that the ZINB model provided accurate estimates for treatment effects, while the other three models yielded biased estimates. The inferential statistics obtained from the ZINB model were reliable when the baseline rate was low. However, when the data were overdispersed but not zero-inflated, both the ZINB and ZIP models exhibited poor performance in accurately estimating treatment effects. These findings contribute to our understanding of using GLMMs to handle zero-inflated and overdispersed count data in SCEDs. The implications, limitations, and future research directions are also discussed.


Subject(s)
Single-Case Studies as Topic; Humans; Linear Models; Multilevel Analysis/methods; Data Interpretation, Statistical; Models, Statistical; Poisson Distribution; Computer Simulation; Research Design
16.
J Theor Biol ; 557: 111323, 2023 Jan 21.
Article in English | MEDLINE | ID: mdl-36273592

ABSTRACT

The dopamine D1 receptor (D1DR) has proved to be a promising target for preventing tumor metastasis, and our previous studies showed that QAP14, a potent anti-cancer agent, exerted an inhibitory effect on lung metastasis via D1DR activation. The purpose of this study was therefore to establish count data models that quantitatively characterize the disease progression of lung metastasis and assess the anti-metastatic effect of QAP14. Data on metastatic progression were collected in 4T1 tumor-bearing mice. A generalized Poisson distribution best described the variability of metastasis counts among individuals. An empirical PK/PD model was developed to establish mathematical relationships between steady-state plasma concentrations of QAP14 and metastasis growth dynamics. The latency period of metastasis was estimated to be 12 days after tumor implantation. Our model structure also fitted other D1DR agonists (fenoldopam and l-stepholidine) that likewise had an inhibitory effect on breast cancer lung metastasis. QAP14 at 40 mg/kg showed the best inhibitory efficacy, providing the longest prolongation of the metastasis-free period compared with fenoldopam or l-stepholidine. This study provides a quantitative method for describing the lung metastasis progression of 4T1 allografts, as well as an alternative PD model structure for evaluating anti-metastatic efficacy.


Subject(s)
Fenoldopam; Lung Neoplasms; Mice; Animals; Cell Line, Tumor; Lung Neoplasms/drug therapy; Lung Neoplasms/pathology; Allografts/pathology; Mice, Inbred BALB C; Neoplasm Metastasis/pathology
17.
Biometrics ; 79(3): 2171-2183, 2023 Sep.
Article in English | MEDLINE | ID: mdl-36065934

ABSTRACT

Wildlife monitoring of open populations can be performed using a number of different survey methods. Each survey method gives rise to a type of data, and over the last five decades a large number of associated statistical models have been developed for analyzing these data. Although these models have been parameterized and fitted using different approaches, they have all been designed to model the pattern with which individuals enter and/or exit the population, to estimate the population size by accounting for the corresponding observation process, or both. However, existing approaches rely on a predefined model structure and complexity, either by assuming that parameters linked to the entry and exit pattern (EEP) are specific to sampling occasions, or by employing parametric curves to describe the EEP. Instead, we propose a novel Bayesian nonparametric framework for modeling EEPs based on the Polya tree (PT) prior for densities. Our Bayesian nonparametric approach avoids overfitting when inferring EEPs, while allowing more flexibility than is possible with parametric curves. Finally, we introduce the replicate PT prior for defining classes of models for these data, allowing us to impose constraints on the EEPs when required. We demonstrate our new approach using capture-recapture, count, and ring-recovery data for two different case studies.


Subject(s)
Animals, Wild; Models, Statistical; Humans; Animals; Bayes Theorem; Population Density
18.
Biometrics ; 79(3): 2063-2075, 2023 Sep.
Article in English | MEDLINE | ID: mdl-36454666

ABSTRACT

In many applications of hierarchical models, there is often interest in evaluating the inherent heterogeneity in view of the observed data. When the underlying hypothesis involves parameters resting on the boundary of their support space, such as variances and mixture proportions, it is usual practice to entertain testing procedures that rely on common heterogeneity assumptions. Such procedures, albeit omnibus for general alternatives, may entail a substantial loss of power for specific alternatives, such as heterogeneity varying with covariates. We introduce a novel and flexible approach that uses covariate information to improve the power to detect heterogeneity, without imposing unnecessary restrictions. With continuous covariates, the approach does not impose a regression model relating heterogeneity parameters to covariates or rely on arbitrary discretizations. Instead, a scanning approach requiring continuous dichotomizations of the covariates is proposed. Empirical processes resulting from these dichotomizations are then used to construct the test statistics, with limiting null distributions shown to be functionals of tight random processes. We illustrate our proposals and results on a popular class of two-component mixture models, followed by simulation studies and applications to two real datasets in cancer and caries research.


Subject(s)
Models, Statistical; Research Design; Computer Simulation; Causality; Correlation of Data
19.
Stat Med ; 42(30): 5596-5615, 2023 Dec 30.
Article in English | MEDLINE | ID: mdl-37867199

ABSTRACT

Panel count data and interval-censored data are two types of incomplete data that often occur in event history studies. Almost all existing statistical methods have been developed for their separate analysis. In this paper, we investigate a more general situation in which a recurrent event process and an interval-censored failure event occur together. To intuitively and clearly explain the relationship between the recurrent event process and the failure event, we propose a failure-time-dependent mean model through a completely unspecified link function. To overcome the challenges arising from the blending of nonparametric components and parametric regression coefficients, we develop a two-stage conditional expected likelihood-based estimation procedure. We establish the consistency, convergence rate, and asymptotic normality of the proposed two-stage estimator. Furthermore, we construct a class of two-sample tests for comparing mean functions across groups. The proposed methods are evaluated by extensive simulation studies and are illustrated with the skin cancer data that motivated this study.


Subject(s)
Skin Neoplasms; Humans; Likelihood Functions; Regression Analysis; Computer Simulation; Time
20.
Int J Behav Nutr Phys Act ; 20(1): 57, 2023 May 05.
Article in English | MEDLINE | ID: mdl-37147664

ABSTRACT

BACKGROUND: Inference using standard linear regression models (LMs) relies on assumptions that are rarely satisfied in practice. Substantial departures, if not addressed, have serious impacts on any inference and conclusions, potentially rendering them invalid and misleading. Count, bounded, and skewed outcomes, common in physical activity research, can substantially violate LM assumptions. A common approach to handling these is to transform the outcome and apply an LM; however, a transformation may not suffice. METHODS: In this paper, we introduce the generalized linear model (GLM), a generalization of the LM, as an approach for appropriately modelling count and non-normally distributed (i.e., bounded and skewed) outcomes. Using data from a study of physical activity among older adults, we demonstrate appropriate methods for analysing count, bounded, and skewed outcomes. RESULTS: We show how fitting an LM when it is inappropriate, especially for the types of outcomes commonly encountered in physical activity research, substantially affects the analysis, inference, and conclusions compared with a GLM. CONCLUSIONS: GLMs, which more appropriately model non-normally distributed response variables, should be considered more suitable approaches for managing count, bounded, and skewed outcomes than simply relying on transformations. We recommend that physical activity researchers add the GLM to their statistical toolboxes and become aware of situations in which GLMs are a better method than traditional approaches for modelling count, bounded, and skewed outcomes.
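
The contrast between an LM and a count GLM can be seen in a few lines. The sketch below uses simulated data (the outcome "weekly activity sessions" and both covariates are hypothetical) to fit an ordinary least squares LM and a Poisson GLM to the same counts; the LM coefficients are additive changes in sessions, whereas exponentiated GLM coefficients are rate ratios.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 300
age = rng.uniform(65, 90, n)                   # older adults
intervention = rng.binomial(1, 0.5, n)         # hypothetical group indicator
mu = np.exp(2.5 - 0.02 * age + 0.4 * intervention)
sessions = rng.poisson(mu)                     # count outcome

X = sm.add_constant(np.column_stack([age, intervention]))
lm_fit = sm.OLS(sessions, X).fit()                                  # linear model
glm_fit = sm.GLM(sessions, X, family=sm.families.Poisson()).fit()   # count GLM
print("LM coefficients (additive scale):", np.round(lm_fit.params[1:], 3))
print("GLM coefficients (log scale):   ", np.round(glm_fit.params[1:], 3))
print("GLM rate ratios:                ", np.round(np.exp(glm_fit.params[1:]), 3))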


Subject(s)
Exercise; Aged; Humans; Linear Models