Results 1 - 20 of 45
1.
Stat Med ; 43(3): 578-605, 2024 02 10.
Article in English | MEDLINE | ID: mdl-38213277

ABSTRACT

Research on dynamic treatment regimes has attracted extensive interest. Many methods have been proposed in the literature; however, they are vulnerable to misclassification in covariates. In particular, although Q-learning has received considerable attention, its applicability to data with misclassified covariates is unclear. In this article, we investigate how ignoring misclassification in binary covariates can affect the determination of optimal decision rules in randomized treatment settings, and we demonstrate its deleterious effects on Q-learning through empirical studies. We present two correction methods to address misclassification effects on Q-learning. Numerical studies reveal that misclassification in covariates induces non-negligible estimation bias and that the correction methods successfully ameliorate that bias in parameter estimation.
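As context for the Q-learning procedure discussed in this abstract, the following is a minimal sketch of standard two-stage Q-learning with linear working models on simulated, randomized data. It shows only the baseline estimator that the paper argues is biased under covariate misclassification; the correction methods are not implemented, and the data-generating model and all variable names are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n = 2000
x1 = rng.binomial(1, 0.5, n)                 # stage-1 binary covariate
a1 = rng.binomial(1, 0.5, n)                 # randomized stage-1 treatment
x2 = rng.binomial(1, 0.4 + 0.2 * a1, n)      # stage-2 covariate influenced by a1
a2 = rng.binomial(1, 0.5, n)                 # randomized stage-2 treatment
y = 1 + x1 + a1 * (1 - 2 * x1) + x2 + a2 * (2 * x2 - 1) + rng.normal(0, 1, n)

def design(x, a):
    # working model: main effects plus treatment-covariate interaction
    return np.column_stack([x, a, a * x])

# Stage 2: fit Q2, then form the pseudo-outcome by maximizing over a2
q2 = LinearRegression().fit(design(x2, a2), y)
v2 = np.maximum(q2.predict(design(x2, np.zeros(n))),
                q2.predict(design(x2, np.ones(n))))

# Stage 1: fit Q1 to the stage-2 pseudo-outcome
q1 = LinearRegression().fit(design(x1, a1), v2)

# Estimated optimal rules: treat when the fitted treatment contrast is positive
rule2 = (q2.coef_[1] + q2.coef_[2] * x2 > 0).astype(int)
rule1 = (q1.coef_[1] + q1.coef_[2] * x1 > 0).astype(int)
print("fraction treated by stage-2 rule:", rule2.mean())
print("fraction treated by stage-1 rule:", rule1.mean())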


Subject(s)
Clinical Decision Rules , Machine Learning , Humans
2.
Biometrics ; 79(2): 1073-1088, 2023 06.
Article in English | MEDLINE | ID: mdl-35032335

ABSTRACT

Research on complex associations between a gene network and multiple responses has attracted increasing attention. A great challenge in analyzing genetic data is posed by the genetic network, which is typically unknown. Moreover, mismeasurement of responses introduces additional complexity that distorts usual inferential procedures. In this paper, we consider the problem of mixed binary and continuous responses that are subject to mismeasurement and associated with complex structured covariates. We first consider the case where data are precisely measured. We propose a generalized network structured model and develop a two-step inferential procedure. In the first step, we employ a Gaussian graphical model to characterize the network structure of the covariates, and in the second step, we incorporate the estimated graphical structure of the covariates and develop an estimating equation method. Furthermore, we extend the development to accommodate mismeasured responses. We consider two cases where the information on mismeasurement is either known or estimated from a validation sample. Theoretical results are established, and numerical studies are conducted to evaluate the finite sample performance of the proposed methods. We apply the proposed method to analyze the outbred Carworth Farms White mice data arising from a genome-wide association study.
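The first step of the two-step procedure described above, learning the covariate network with a Gaussian graphical model, can be sketched with the graphical lasso as below. The chain-structured precision matrix and the sizes are illustrative assumptions, and the estimating-equation second step of the paper is not shown.

import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
n, p = 300, 15
# simulate covariates whose precision matrix encodes a sparse chain-structured network
prec = np.eye(p) + np.diag(np.full(p - 1, 0.4), 1) + np.diag(np.full(p - 1, 0.4), -1)
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(prec), size=n)

gl = GraphicalLassoCV().fit(X)                       # cross-validated graphical lasso
edges = (np.abs(gl.precision_) > 1e-3) & ~np.eye(p, dtype=bool)
print("number of estimated edges:", int(edges.sum()) // 2)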


Subject(s)
Gene Regulatory Networks , Genome-Wide Association Study , Animals , Mice , Normal Distribution
3.
Biometrics ; 79(2): 1089-1102, 2023 06.
Article in English | MEDLINE | ID: mdl-35261029

ABSTRACT

Zero-inflated count data arise frequently in genomics studies. Analysis of such data is often based on a mixture model that accommodates excess zeros in combination with a Poisson distribution, and various inference methods have been proposed under such a model. Those analysis procedures, however, are challenged by the presence of measurement error in the count responses. In this article, we propose a new measurement error model to describe error-contaminated count data. We show that ignoring the measurement error effects in the analysis can lead to invalid inference results, and meanwhile we identify situations where ignoring measurement error can still yield consistent estimators. Furthermore, we propose a Bayesian method to address the effects of measurement error under the zero-inflated Poisson model and discuss the identifiability issues. We develop a data-augmentation algorithm that is easy to implement. Simulation studies are conducted to evaluate the performance of the proposed method. We apply our method to analyze data arising from a prostate adenocarcinoma genomic study.
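For reference, a minimal maximum-likelihood fit of an error-free zero-inflated Poisson model is sketched below on simulated counts. The Bayesian measurement-error correction and data-augmentation algorithm proposed in the abstract are not reproduced here, and all parameter values are illustrative assumptions.

import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

rng = np.random.default_rng(7)
n = 1000
pi_true, lam_true = 0.3, 2.5                      # zero-inflation probability and Poisson mean
structural_zero = rng.binomial(1, pi_true, n)
y = np.where(structural_zero == 1, 0, rng.poisson(lam_true, n))

def negloglik(theta):
    pi, lam = expit(theta[0]), np.exp(theta[1])   # keep pi in (0,1) and lambda positive
    ll_zero = np.log(pi + (1 - pi) * np.exp(-lam))                     # observed zeros
    ll_pos = np.log(1 - pi) - lam + y * np.log(lam) - gammaln(y + 1)   # Poisson part
    return -np.sum(np.where(y == 0, ll_zero, ll_pos))

fit = minimize(negloglik, x0=np.zeros(2), method="BFGS")
print(f"estimated zero-inflation prob. = {expit(fit.x[0]):.3f}, "
      f"Poisson mean = {np.exp(fit.x[1]):.3f}")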


Subject(s)
Algorithms , Statistical Models , Male , Humans , Bayes Theorem , Computer Simulation , Poisson Distribution
4.
J Med Virol ; 94(9): 4156-4169, 2022 09.
Article in English | MEDLINE | ID: mdl-35524338

ABSTRACT

Providing sensible estimates of the mean incubation time for COVID-19 is important yet complex. This study aims to provide synthetic estimates of the mean incubation time of COVID-19 by capitalizing on estimates reported in the literature and exploring different ways to accommodate the heterogeneity involved in the reported studies. Online databases are first searched for estimates of the mean incubation time of COVID-19 published between January 1, 2020 and May 20, 2021, and meta-analyses are then conducted to generate synthetic estimates. Heterogeneity of the studies is examined via Cochran's Q statistic and Higgins and Thompson's I² statistic, and subgroup analyses are conducted using mixed effects models. Publication bias is assessed using the funnel plot and Egger's test. Using all the reported mean incubation estimates for COVID-19, the synthetic mean incubation time is estimated to be 6.43 days with a 95% confidence interval (CI) of [5.90, 6.96]; using the reported mean incubation estimates together with the transformed median incubation estimates, the estimated mean incubation time is 6.07 days with a 95% CI of [5.70, 6.45]. The reported estimates of the mean incubation time of COVID-19 vary considerably for multiple reasons, including heterogeneity and publication bias. To alleviate these issues, we take different angles to provide a sensible estimate of the mean incubation time of COVID-19. Our analyses show that the mean incubation time of COVID-19 between January 1, 2020 and May 20, 2021 ranges from 5.68 to 8.30 days.
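The heterogeneity and pooling quantities mentioned above (Cochran's Q, the I² statistic, and a random-effects synthetic mean) can be computed as in the sketch below. The study-level estimates and standard errors are made-up placeholders, not the values analyzed in the paper, and a DerSimonian-Laird random-effects model is used as one common choice.

import numpy as np

# hypothetical study-level mean incubation estimates (days) and standard errors
est = np.array([5.2, 6.4, 5.9, 7.1, 6.0, 6.8])
se  = np.array([0.40, 0.55, 0.30, 0.80, 0.45, 0.60])
k = len(est)

w_fixed = 1 / se**2
mu_fixed = np.sum(w_fixed * est) / np.sum(w_fixed)

# Cochran's Q and Higgins-Thompson I^2
Q = np.sum(w_fixed * (est - mu_fixed) ** 2)
I2 = max(0.0, (Q - (k - 1)) / Q)

# DerSimonian-Laird between-study variance and random-effects pooled mean
c = np.sum(w_fixed) - np.sum(w_fixed**2) / np.sum(w_fixed)
tau2 = max(0.0, (Q - (k - 1)) / c)
w_re = 1 / (se**2 + tau2)
mu_re = np.sum(w_re * est) / np.sum(w_re)
se_re = np.sqrt(1 / np.sum(w_re))

print(f"Q = {Q:.2f}, I^2 = {100 * I2:.1f}%")
print(f"pooled mean = {mu_re:.2f} days, 95% CI = "
      f"[{mu_re - 1.96 * se_re:.2f}, {mu_re + 1.96 * se_re:.2f}]")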


Subject(s)
COVID-19 , Humans
5.
Biometrics ; 78(3): 894-907, 2022 09.
Article in English | MEDLINE | ID: mdl-33881782

ABSTRACT

Data of huge size present great challenges in modeling, inference, and computation. In handling big data, much attention has been directed to settings with "large p, small n", and relatively less work has been done to address problems where p and n are both large, though data with such a feature have now become more accessible than before; here p represents the number of variables and n the sample size. A big volume of data does not automatically ensure good quality of inferences, because a large number of unimportant variables may be collected in the process of gathering informative variables. To carry out valid statistical analysis, it is imperative to screen out noisy variables that have no predictive value for explaining the outcome variable. In this paper, we develop a screening method for handling large-sized survival data, where the sample size n is large and the dimension p of covariates is of non-polynomial order of the sample size n, the so-called NP-dimension. We rigorously establish theoretical results for the proposed method and conduct numerical studies to assess its performance. Our research offers multiple extensions of existing work and enlarges the scope of high-dimensional data analysis. The proposed method capitalizes on the connections among useful regression settings and offers a computationally efficient screening procedure. Our method can be applied to different situations with large-scale data, including genomic data.
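A generic, sure-independence-style screening step for survival data, of the kind the abstract refers to, could look like the sketch below: rank covariates by their marginal Cox score and retain only the top few. This is merely an illustration of marginal screening on simulated data under assumed sizes, not the authors' procedure; the lifelines dependency is an assumption of this sketch.

import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n, p, d = 400, 200, 10                                       # illustrative sizes only
X = rng.standard_normal((n, p))
linpred = X[:, :5] @ np.array([1.0, -1.0, 0.8, -0.8, 0.6])   # signal in first 5 covariates
T = rng.exponential(scale=np.exp(-linpred))                  # proportional-hazards event times
C = rng.exponential(scale=2 * np.median(T), size=n)          # independent censoring
time, event = np.minimum(T, C), (T <= C).astype(int)

# one-covariate-at-a-time Cox fits; rank by absolute standardized coefficient
score = np.empty(p)
for j in range(p):
    df = pd.DataFrame({"x": X[:, j], "time": time, "event": event})
    fit = CoxPHFitter().fit(df, duration_col="time", event_col="event")
    score[j] = abs(fit.params_["x"] / fit.standard_errors_["x"])

keep = np.sort(np.argsort(score)[::-1][:d])                  # indices of the d retained covariates
print("screened-in covariates:", keep)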


Subject(s)
Genome , Genomics , Proportional Hazards Models , Sample Size
6.
BMC Med Res Methodol ; 22(1): 15, 2022 01 14.
Article in English | MEDLINE | ID: mdl-35026998

ABSTRACT

BACKGROUND: The coronavirus disease 2019 (COVID-19) pandemic has had a significant impact on public mental health. Current efforts focus on alleviating the impacts of the disease on public health and the economy, while the psychological effects of COVID-19 have received relatively little attention. In this research, we are interested in a quantitative characterization of the pandemic's impact on public mental health by studying an online survey dataset from the United States. METHODS: The analyses are based on a large-scale online mental health survey conducted in the United States over 12 consecutive weeks from April 23, 2020 to July 21, 2020. We examine the risk factors that have a significant impact on mental health as well as their estimated effects over time. We employ the multiple imputation by chained equations (MICE) method to handle missing values and use logistic regression with the least absolute shrinkage and selection operator (Lasso) to identify risk factors for mental health. RESULTS: Our analysis shows that risk predictors for an individual experiencing mental health issues include the pandemic situation of the State where the individual resides, age, gender, race, marital status, health conditions, the number of household members, employment status, the level of confidence in future food affordability, availability of health insurance, mortgage status, and whether children are enrolled in school. The effects of most predictors appear to change over time, though the degree varies across risk factors. The effects of risk factors such as State and gender show noticeable change over time, whereas age exhibits seemingly unchanged effects over time. CONCLUSIONS: The analysis results provide evidence-based findings that identify the groups who are psychologically vulnerable to the COVID-19 pandemic. This study provides helpful evidence for assisting healthcare providers and policymakers in taking steps to mitigate the pandemic's effects on public mental health, especially in boosting public health care, improving public confidence in future food conditions, and creating more job opportunities. TRIAL REGISTRATION: This article does not report the results of a health care intervention on human participants.
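A compact sketch of the imputation-then-selection workflow described in the METHODS above, chained-equations imputation followed by L1-penalized logistic regression, is given below on simulated data. It uses scikit-learn's IterativeImputer as a single-imputation stand-in for MICE (a full MICE analysis would impute several times and pool the results), and all variables are hypothetical rather than the survey's actual predictors.

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(3)
n, p = 1500, 8
X = rng.standard_normal((n, p))
# hypothetical binary mental-health indicator driven by a few covariates
y = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * X[:, 0] - 0.6 * X[:, 1] + 0.5 * X[:, 3]))))
X_miss = X.copy()
X_miss[rng.random((n, p)) < 0.15] = np.nan            # ~15% missing completely at random

X_imp = IterativeImputer(max_iter=10, random_state=0).fit_transform(X_miss)   # imputation step

# L1-penalized (Lasso-type) logistic regression with cross-validated penalty
model = LogisticRegressionCV(penalty="l1", solver="saga", Cs=10, max_iter=5000).fit(X_imp, y)
print("covariates retained by the Lasso:", np.flatnonzero(model.coef_.ravel()).tolist())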


Subject(s)
COVID-19 , Pandemics , Humans , Mental Health , SARS-CoV-2 , Schools , United States/epidemiology
7.
Can J Stat ; 50(2): 395-416, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35573897

ABSTRACT

The coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has spread stealthily and presented a tremendous threat to the public. It is important to investigate the transmission dynamics of COVID-19 to help understand the impact of the disease on public health and the economy. In this article, we develop a new epidemic model that utilizes a set of ordinary differential equations with unknown parameters to delineate the transmission process of COVID-19. The model accounts for asymptomatic infections as well as the lag between symptom onset and the confirmation date of infection. To reflect the transmission potential of an infected case, we derive the basic reproduction number from the proposed model. Using the daily reported number of confirmed cases, we describe an estimation procedure for the model parameters, which involves adapting the iterated filter-ensemble adjustment Kalman filter (IF-EAKF) algorithm. To illustrate the use of the proposed model, we examine the COVID-19 data from Quebec for the period from 2 April 2020 to 10 May 2020 and carry out sensitivity studies under a variety of assumptions. Simulation studies are used to evaluate the performance of the proposed model under a variety of settings.
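The kind of ODE-based transmission model the abstract describes can be sketched with a simple susceptible-exposed-infectious-asymptomatic-removed system solved by scipy, as below. The compartment structure, parameter values, relative infectiousness of asymptomatic cases, and population size are all illustrative assumptions rather than the authors' fitted model, and the IF-EAKF parameter-estimation step is not shown.

import numpy as np
from scipy.integrate import solve_ivp

beta, sigma, gamma, p_asym = 0.6, 1 / 5.1, 1 / 7.0, 0.4   # assumed rates, not fitted values
N = 8.5e6                                                 # rough population size, illustrative

def seiar(t, y):
    S, E, I, A, R = y
    new_inf = beta * S * (I + 0.5 * A) / N     # asymptomatic cases assumed half as infectious
    return [-new_inf,
            new_inf - sigma * E,
            (1 - p_asym) * sigma * E - gamma * I,
            p_asym * sigma * E - gamma * A,
            gamma * (I + A)]

y0 = [N - 50, 30, 15, 5, 0]
sol = solve_ivp(seiar, (0, 120), y0, t_eval=np.arange(0, 121))

# basic reproduction number implied by this particular structure
R0 = beta / gamma * ((1 - p_asym) + 0.5 * p_asym)
print(f"assumed R0 = {R0:.2f}; peak symptomatic prevalence = {sol.y[2].max():.0f}")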


8.
Biometrics ; 77(3): 956-969, 2021 09.
Article in English | MEDLINE | ID: mdl-32687216

ABSTRACT

In survival data analysis, the Cox proportional hazards (PH) model is perhaps the most widely used model for describing the dependence of survival times on covariates. While many inference methods have been developed under this model or its variants, those models are not adequate for handling data with complex structured covariates. High-dimensional survival data often entail several features: (1) many covariates are inactive in explaining the survival information, (2) active covariates are associated in a network structure, and (3) some covariates are error-contaminated. To handle such survival data, we propose graphical PH measurement error models and develop inferential procedures for the parameters of interest. Our proposed models significantly enlarge the scope of the usual Cox PH model and have great flexibility in characterizing survival data. Theoretical results are established to justify the proposed methods. Numerical studies are conducted to assess the performance of the proposed methods.


Subject(s)
Statistical Models , Proportional Hazards Models , Survival Analysis
9.
J Med Virol ; 92(11): 2543-2550, 2020 11.
Article in English | MEDLINE | ID: mdl-32470164

ABSTRACT

The coronavirus disease 2019 (COVID-19) has been found to be caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, comprehensive knowledge of COVID-19 remains incomplete and many important features are still unknown. This manuscript conducts a meta-analysis and a sensitivity study to answer the questions: What is the basic reproduction number? How long is the incubation time of the disease on average? What proportion of infections is asymptomatic? And ultimately, what is the case fatality rate? Our studies estimate the basic reproduction number to be 3.15 with 95% CI (2.41-3.90), the average incubation time to be 5.08 days with 95% CI (4.77-5.39), the asymptomatic infection rate to be 46% with 95% CI (18.48%-73.60%), and the case fatality rate to be 2.72% with 95% CI (1.29%-4.16%) when asymptomatic infections are accounted for.


Subject(s)
Asymptomatic Infections/epidemiology , Basic Reproduction Number , COVID-19/mortality , COVID-19/virology , Infectious Disease Incubation Period , SARS-CoV-2/physiology , Humans
10.
Stat Med ; 39(26): 3700-3719, 2020 11 20.
Article in English | MEDLINE | ID: mdl-32914420

ABSTRACT

In genetic association studies, mixed effects models have been widely used to detect pleiotropic effects, which occur when one gene affects multiple phenotypic traits. In particular, bivariate mixed effects models are useful for describing the association of a gene with a continuous trait and a binary trait. However, such models are inadequate for data with response mismeasurement, a characteristic that is often overlooked. It is well documented that, in univariate settings, ignoring mismeasurement in variables usually results in biased estimation. In this paper, we consider a bivariate outcome vector containing a continuous component and a binary component that are both subject to mismeasurement. We propose an induced likelihood approach and an EM algorithm method to handle measurement error in the continuous response and misclassification in the binary response simultaneously. Simulation studies confirm that the proposed methods successfully remove the bias induced by the response mismeasurement.
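To convey the induced-likelihood idea for a misclassified binary response, the sketch below fits a logistic model using the observed-response probability sens·p(x) + (1−spec)·(1−p(x)), with sensitivity and specificity assumed known. It covers only the binary component; the bivariate (continuous plus binary) outcome model and the EM algorithm of the paper are not reproduced, and all numbers are illustrative.

import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(11)
n = 3000
x = rng.standard_normal(n)
y = rng.binomial(1, expit(-0.5 + 1.0 * x))        # true binary response
sens, spec = 0.9, 0.85                            # assumed known misclassification rates
flip_down = rng.random(n) < (1 - sens)            # false negatives among true 1s
flip_up = rng.random(n) < (1 - spec)              # false positives among true 0s
y_star = np.where(y == 1, np.where(flip_down, 0, 1), np.where(flip_up, 1, 0))

def negloglik(beta, corrected=True):
    p = expit(beta[0] + beta[1] * x)
    # induced probability of observing y* = 1 given x
    q = sens * p + (1 - spec) * (1 - p) if corrected else p
    q = np.clip(q, 1e-10, 1 - 1e-10)
    return -np.sum(y_star * np.log(q) + (1 - y_star) * np.log(1 - q))

naive = minimize(negloglik, [0.0, 0.0], args=(False,)).x
corr = minimize(negloglik, [0.0, 0.0], args=(True,)).x
print("naive slope:", round(naive[1], 3), " corrected slope:", round(corr[1], 3))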


Subject(s)
Bias , Genetic Association Studies , Computer Simulation , Likelihood Functions , Phenotype
11.
Stat Med ; 39(4): 456-468, 2020 02 20.
Article in English | MEDLINE | ID: mdl-31802532

ABSTRACT

Causal inference has been widely conducted in various fields and many methods have been proposed for different settings. However, for noisy data with both mismeasurements and missing observations, those methods often break down. In this paper, we consider the problem in which binary outcomes are subject to both missingness and misclassification, and interest lies in estimating the average treatment effect (ATE). We examine the asymptotic biases caused by ignoring missingness and/or misclassification and establish the intrinsic connections between missingness effects and misclassification effects on the estimation of the ATE. We develop valid weighted estimation methods to simultaneously correct for missingness and misclassification effects. To provide protection against model misspecification, we further propose a doubly robust correction method that yields consistent estimators when either the treatment model or the outcome model is misspecified. Simulation studies are conducted to assess the performance of the proposed methods. An application to smoking cessation data is reported to illustrate the use of the proposed methods.
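The doubly robust idea referenced above can be illustrated with a standard augmented inverse-probability-weighted (AIPW) estimator on fully observed, correctly measured simulated data; the missingness and misclassification corrections that are the paper's contribution are not included, and the data-generating model is an assumption of this sketch.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 5000
x = rng.standard_normal((n, 3))
a = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * x[:, 0] - 0.5 * x[:, 1]))))        # treatment
y = rng.binomial(1, 1 / (1 + np.exp(-(-0.3 + 1.0 * a + 0.8 * x[:, 0] - 0.5 * x[:, 2]))))

# propensity (treatment) model and outcome models fitted separately by arm
ps = LogisticRegression(max_iter=1000).fit(x, a).predict_proba(x)[:, 1]
m1 = LogisticRegression(max_iter=1000).fit(x[a == 1], y[a == 1]).predict_proba(x)[:, 1]
m0 = LogisticRegression(max_iter=1000).fit(x[a == 0], y[a == 0]).predict_proba(x)[:, 1]

# doubly robust (AIPW) estimate of the average treatment effect
mu1 = np.mean(a * (y - m1) / ps + m1)
mu0 = np.mean((1 - a) * (y - m0) / (1 - ps) + m0)
print(f"doubly robust ATE estimate: {mu1 - mu0:.3f}")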


Subject(s)
Statistical Models , Theoretical Models , Bias , Causality , Computer Simulation , Humans
12.
Lifetime Data Anal ; 26(3): 421-450, 2020 07.
Article in English | MEDLINE | ID: mdl-31432384

ABSTRACT

It is well established that measurement error has a drastically negative impact on data analysis. It can not only bias parameter estimates but also cause a loss of power in testing relationships between variables. Although survival analysis of error-contaminated data has attracted extensive interest, relatively little attention has been paid to survival data with error-contaminated covariates when the underlying population is characterized by a cured fraction. In this paper, we consider this problem, where the lifetimes of non-cured individuals follow the additive hazards model and the measurement error process is described by an additive model. Unlike estimating the relative risk in the proportional hazards model, the additive hazards model allows us to estimate the absolute risk difference associated with the covariates. To allow model flexibility, we incorporate time-dependent covariates in the model. We develop estimation methods for the two scenarios, without or with measurement error. The proposed methods are evaluated from both theoretical and numerical perspectives. Furthermore, a real-life data application is presented to illustrate the utility of the methodology.


Subject(s)
Proportional Hazards Models , Algorithms , Bias , Computer Simulation , Humans , Survival Analysis
13.
Lifetime Data Anal ; 26(2): 369-388, 2020 04.
Article in English | MEDLINE | ID: mdl-31372924

ABSTRACT

In survival analysis, accelerated failure time models are useful for modeling the relationship between failure times and the associated covariates, where covariate effects are assumed to appear in a linear form in the model. Such an assumption on covariate effects is, however, quite restrictive for many practical problems. To incorporate flexible nonlinear relationships between covariates and transformed failure times, we propose partially linear single-index models that accommodate complex relationships between transformed failure times and covariates. We develop two inference methods that handle the unknown nonlinear function in the model from different perspectives. The first approach is weakly parametric and approximates the nonlinear function globally, whereas the second is a semiparametric quasi-likelihood approach that focuses on picking up local features. We establish the asymptotic properties of the proposed methods. A real example is used to illustrate the usage of the proposed methods, and simulation studies are conducted to assess their performance in a broad variety of situations.


Subject(s)
Statistical Models , Survival Analysis , Algorithms , Humans , Research Design
14.
Stat Med ; 38(10): 1835-1854, 2019 05 10.
Article in English | MEDLINE | ID: mdl-30609095

ABSTRACT

Inverse probability weighting (IPW) estimation has been widely used in causal inference. Its validity relies on the important condition that the variables are precisely measured. This condition, however, is often violated, which distorts the IPW method and thus yields biased results. In this paper, we study the IPW estimation of average treatment effects for settings with mismeasured covariates and misclassified outcomes. We develop estimation methods to correct for measurement error and misclassification effects simultaneously. Our discussion covers a broad scope of treatment models, including typically assumed logistic regression models and general treatment assignment mechanisms. Satisfactory performance of the proposed methods is demonstrated by extensive numerical studies.


Subject(s)
Algorithms , Causality , Adult , Aged , Computer Simulation , Exercise , Female , Health Surveys/statistics & numerical data , Humans , Male , Middle Aged , Probability , Smokers/statistics & numerical data , Smoking Cessation/statistics & numerical data
15.
Biom J ; 61(6): 1507-1525, 2019 11.
Article in English | MEDLINE | ID: mdl-31449324

ABSTRACT

Inverse-probability-of-treatment weighted (IPTW) estimation has been widely used to consistently estimate the causal parameters in marginal structural models, with time-dependent confounding effects adjusted for. As with other causal inference methods, the validity of IPTW estimation typically requires the crucial condition that all variables are precisely measured. However, this condition is often violated in practice for various reasons. It has been well documented that ignoring measurement error often leads to biased inference results. In this paper, we consider IPTW estimation of the causal parameters in marginal structural models in the presence of error-contaminated and time-dependent confounders. We explore several methods to correct for the effects of measurement error on the estimation of causal parameters. Numerical studies are reported to assess the finite sample performance of the proposed methods.
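A bare-bones construction of stabilized inverse-probability-of-treatment weights for a marginal structural model, assuming precisely measured time-dependent confounders, is sketched below; the measurement-error corrections studied in the paper are not implemented, and the simulated data structure is hypothetical.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
n, K = 2000, 3                                  # subjects and time points
L = rng.standard_normal((n, K))                 # time-dependent confounder
A = np.zeros((n, K), dtype=int)
for k in range(K):
    prev = A[:, k - 1] if k > 0 else np.zeros(n)
    A[:, k] = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * L[:, k] + 0.8 * prev))))
y = A.sum(axis=1) + 0.5 * L[:, 0] + rng.normal(0, 1, n)

# stabilized weights: product over time of P(A_k | past A) / P(A_k | past A, L_k)
sw = np.ones(n)
for k in range(K):
    prev = A[:, k - 1].reshape(-1, 1) if k > 0 else np.zeros((n, 1))
    num = LogisticRegression(max_iter=1000).fit(prev, A[:, k]).predict_proba(prev)
    den_X = np.column_stack([prev, L[:, k]])
    den = LogisticRegression(max_iter=1000).fit(den_X, A[:, k]).predict_proba(den_X)
    sw *= num[np.arange(n), A[:, k]] / den[np.arange(n), A[:, k]]

# weighted least squares for a marginal structural model in cumulative treatment
cumA = A.sum(axis=1)
X = np.column_stack([np.ones(n), cumA])
beta = np.linalg.solve(X.T @ (sw[:, None] * X), X.T @ (sw * y))
print(f"MSM slope per additional treated interval: {beta[1]:.3f}")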


Subject(s)
Biometry/methods , Multivariate Analysis , Probability , Regression Analysis , Research Design , Time Factors
16.
Stat Med ; 36(20): 3231-3243, 2017 Sep 10.
Article in English | MEDLINE | ID: mdl-28766830

ABSTRACT

Analysis of panel data is often challenged by the presence of heterogeneity and state misclassification. In this paper, we propose a hidden mover-stayer model to accommodate heterogeneity in a population that consists of two subpopulations, movers and stayers, and to simultaneously account for state misclassification. We develop an inference procedure based on the expectation-maximization algorithm by treating the mover-stayer indicator and the underlying true states as latent variables. We evaluate the performance of the proposed method and investigate the impact of ignoring misclassification through simulation studies. The proposed method is applied to analyze data arising from the Waterloo Smoking Prevention Project.


Subject(s)
Statistical Data Interpretation , Statistical Models , Algorithms , Analysis of Variance , Biostatistics , Child , Computer Simulation , Humans , Longitudinal Studies , Markov Chains , Probability , Smoking , Smoking Prevention/statistics & numerical data
17.
Lifetime Data Anal ; 22(3): 321-42, 2016 07.
Article in English | MEDLINE | ID: mdl-26328545

ABSTRACT

Covariate measurement error occurs commonly in survival analysis. Under the proportional hazards model, measurement error effects have been well studied, and various inference methods have been developed to correct for error effects under such a model. In contrast, error-contaminated survival data under the additive hazards model have received relatively less attention. In this paper, we investigate this problem by exploring measurement error effects on parameter estimation and the change of the hazard function. New insights into measurement error effects are revealed, as opposed to the well-documented results for the Cox proportional hazards model. We propose a class of bias-correction estimators that embraces certain existing estimators as special cases. In addition, we exploit the regression calibration method to reduce measurement error effects. Theoretical results for the developed methods are established, and numerical assessments are conducted to illustrate the finite sample performance of our methods.
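The regression calibration step mentioned above can be illustrated in its simplest form: replace the error-prone measurement W by an estimate of E[X | W] and note that the attenuation disappears. The sketch below uses a linear outcome model purely for illustration (the paper works with the additive hazards model for survival data), and the measurement-error variance is treated as known, e.g., as if estimated from replicate measurements.

import numpy as np

rng = np.random.default_rng(13)
n = 5000
sigma_x, sigma_u = 1.0, 0.6               # true covariate SD and measurement-error SD (assumed known)
x = rng.normal(0, sigma_x, n)             # unobserved true covariate
w = x + rng.normal(0, sigma_u, n)         # observed error-prone surrogate (additive error model)
y = 2.0 + 1.5 * x + rng.normal(0, 1, n)   # outcome generated from the true covariate

lam = sigma_x**2 / (sigma_x**2 + sigma_u**2)        # reliability (attenuation) factor
x_rc = w.mean() + lam * (w - w.mean())              # regression-calibration substitute E[X | W]

slope = lambda z: np.cov(z, y)[0, 1] / np.var(z, ddof=1)
print(f"true slope 1.5 | naive slope on W: {slope(w):.3f} | calibrated slope: {slope(x_rc):.3f}")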


Subject(s)
Proportional Hazards Models , Survival Analysis , Humans , Regression Analysis
18.
Can J Stat ; 43(4): 498-518, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26877582

ABSTRACT

In contrast to the extensive attention on model selection for univariate data, model selection for longitudinal data remains largely unexplored. This is particularly the case when data are subject to missingness and measurement error. To address this important problem, we propose marginal methods that simultaneously carry out model selection and estimation for longitudinal data with missing responses and error-prone covariates. Our methods have several appealing features: the applicability is broad because the methods are developed for a unified framework with marginal generalized linear models; model assumptions are minimal in that no full distribution is required for the response process and the distribution of the mismeasured covariates is left unspecified; and the implementation is straightforward. To justify the proposed methods, we provide both theoretical properties and numerical assessments.

19.
Biom J ; 56(1): 69-85, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24123126

ABSTRACT

Marginal methods have been widely used for the analysis of longitudinal ordinal and categorical data. These models do not require full parametric assumptions on the joint distribution of repeated response measurements but specify only the marginal or association structures. However, inference results obtained from these methods often incur serious bias when variables are subject to error. In this paper, we tackle the problem in which misclassification exists in both the response and categorical covariate variables. We develop a marginal method for misclassification adjustment, which utilizes second-order estimating functions and a functional modeling approach, and can yield consistent estimates and valid inference for mean and association parameters. We propose a two-stage estimation approach for cases in which validation data are available. Our simulation studies show good performance of the proposed method under a variety of settings. Although the proposed method is phrased in terms of data with a longitudinal design, it also applies to correlated data arising from clustered and family studies, in which association parameters may be of scientific interest. The proposed method is applied to analyze a dataset from the Framingham Heart Study as an illustration.


Subject(s)
Biometry/methods , Adult , Aged , Heart Diseases/epidemiology , Humans , Longitudinal Studies , Male , Middle Aged , Multivariate Analysis
20.
Stat Med ; 32(5): 833-48, 2013 Feb 28.
Article in English | MEDLINE | ID: mdl-22833460

ABSTRACT

In longitudinal studies, missing observations occur commonly. It is well known that biased results can be produced if missingness is not properly handled in the analysis. Many methods have been developed with a focus on either incomplete responses or missing covariate observations, but rarely on both. The complexity of modeling and the computational difficulty are the major challenges in handling missingness in both response and covariate variables. In this paper, we develop methods using the pairwise likelihood formulation to handle longitudinal binary data with missing observations present in both response and covariate variables. We propose a unified framework to accommodate various types of missing data patterns. We evaluate the performance of the methods empirically under a variety of circumstances, investigating in particular issues of efficiency and robustness. We analyze longitudinal data from the National Population Health Study using our methods.


Subject(s)
Biostatistics/methods , Aged , Statistical Data Interpretation , Health Surveys/statistics & numerical data , Humans , Likelihood Functions , Longitudinal Studies , Male , Middle Aged , Statistical Models , Regression Analysis