Results 1 - 20 of 65
1.
Biometrics ; 80(3)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39101549

ABSTRACT

Many existing methodologies for analyzing spatiotemporal point patterns are developed based on the assumption of stationarity in both space and time for the second-order intensity or pair correlation. In practice, however, such an assumption often lacks validity or proves to be unrealistic. In this paper, we propose a novel and flexible nonparametric approach for estimating the second-order characteristics of spatiotemporal point processes, accommodating non-stationary temporal correlations. Our proposed method employs kernel smoothing and effectively accounts for spatial and temporal correlations differently. Under a spatially increasing-domain asymptotic framework, we establish consistency of the proposed estimators, which can be constructed using different first-order intensity estimators to enhance practicality. Simulation results reveal that our method, in comparison with existing approaches, significantly improves statistical efficiency. An application to a COVID-19 dataset further illustrates the flexibility and interpretability of our procedure.


Subjects
COVID-19 , Computer Simulation , Spatio-Temporal Analysis , Humans , Nonparametric Statistics , Statistical Models , SARS-CoV-2 , Biometry/methods , Statistical Data Interpretation
2.
J Am Stat Assoc ; 119(545): 297-307, 2024.
Article in English | MEDLINE | ID: mdl-38716406

ABSTRACT

The weighted nearest neighbors (WNN) estimator has been widely used as a flexible and easy-to-implement nonparametric tool for mean regression estimation. The bagging technique is an elegant way to form WNN estimators with weights automatically assigned to the nearest neighbors (Steele, 2009; Biau et al., 2010); we call the resulting estimator the distributional nearest neighbors (DNN) estimator for easy reference. Yet, there is a lack of distributional results for such an estimator, limiting its application to statistical inference. Moreover, when the mean regression function has higher-order smoothness, DNN does not achieve the optimal nonparametric convergence rate, mainly because of the bias issue. In this work, we provide an in-depth technical analysis of the DNN, based on which we suggest a bias reduction approach for the DNN estimator by linearly combining two DNN estimators with different subsampling scales, resulting in the novel two-scale DNN (TDNN) estimator. The two-scale DNN estimator has an equivalent representation as a WNN estimator with weights admitting explicit forms and some being negative. We prove that, thanks to the use of negative weights, the two-scale DNN estimator enjoys the optimal nonparametric rate of convergence in estimating the regression function under the fourth-order smoothness condition. We further go beyond estimation and establish that the DNN and two-scale DNN are both asymptotically normal as the subsampling scales and sample size diverge to infinity. For the practical implementation, we also provide variance estimators and a distribution estimator using the jackknife and bootstrap techniques for the two-scale DNN. These estimators can be exploited for constructing valid confidence intervals for nonparametric inference of the regression function. The theoretical results and appealing finite-sample performance of the suggested two-scale DNN method are illustrated with several simulation examples and a real data application.
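
The DNN and two-scale DNN estimators described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes scalar covariates, uses the closed-form weight that bagging the 1-nearest-neighbor rule over all subsamples of size s places on the i-th nearest neighbor, and assumes that the leading bias of DNN is proportional to s^(-2/d), which is what allows two scales to cancel it. The names dnn_weights, dnn_predict and tdnn_predict are hypothetical.

```python
from math import comb


def dnn_weights(n, s):
    """Weight on the i-th nearest neighbor (i = 1..n) when the 1-NN rule
    is bagged over all subsamples of size s drawn without replacement."""
    total = comb(n, s)
    # The i-th neighbor is the 1-NN of a subsample iff it is drawn and the
    # other s - 1 points all come from the n - i farther neighbors.
    return [comb(n - i, s - 1) / total for i in range(1, n + 1)]


def dnn_predict(x0, xs, ys, s):
    """DNN estimate of the regression function at x0 (scalar covariate)."""
    order = sorted(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    weights = dnn_weights(len(xs), s)
    return sum(w * ys[j] for w, j in zip(weights, order))


def tdnn_predict(x0, xs, ys, s1, s2, d=1):
    """Two-scale DNN: combine two DNN estimates so that the assumed
    leading bias term, proportional to s**(-2/d), cancels (needs s1 != s2)."""
    r1, r2 = s1 ** (-2 / d), s2 ** (-2 / d)
    # Solve w1 + w2 = 1 and w1 * r1 + w2 * r2 = 0.
    w1 = r2 / (r2 - r1)
    w2 = -r1 / (r2 - r1)
    return w1 * dnn_predict(x0, xs, ys, s1) + w2 * dnn_predict(x0, xs, ys, s2)
```

With s1 < s2 the smaller-scale estimate receives a negative coefficient, which is consistent with the abstract's remark that some of the equivalent WNN weights are negative.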

3.
Math Biosci Eng ; 20(11): 20345-20377, 2023 Nov 09.
Article in English | MEDLINE | ID: mdl-38052648

ABSTRACT

The existence and consistency of a maximum likelihood estimator for the joint probability distribution of random parameters in discrete-time abstract parabolic systems was established by taking a nonparametric approach in the context of a mixed effects statistical model using a Prohorov metric framework on a set of feasible measures. A theoretical convergence result for a finite dimensional approximation scheme for computing the maximum likelihood estimator was also established and the efficacy of the approach was demonstrated by applying the scheme to the transdermal transport of alcohol modeled by a random parabolic partial differential equation (PDE). Numerical studies included show that the maximum likelihood estimator is statistically consistent, demonstrated by the convergence of the estimated distribution to the "true" distribution in an example involving simulated data. The algorithm developed was then applied to two datasets collected using two different transdermal alcohol biosensors. Using the leave-one-out cross-validation (LOOCV) method, we found an estimate for the distribution of the random parameters based on a training set. The input from a test drinking episode was then used to quantify the uncertainty propagated from the random parameters to the output of the model in the form of a 95% error band surrounding the estimated output signal.


Subjects
Biosensing Techniques , Statistical Models , Probability , Algorithms , Ethanol
4.
Stat Med ; 42(12): 1995-2008, 2023 05 30.
Article in English | MEDLINE | ID: mdl-36945185

ABSTRACT

We consider nonparametrically estimating the joint distribution of a survival time and mark variable, where the survival time is subject to right censoring and the mark variable is only observed when the survival time is not censored. The possibility of dependent censoring is allowed for using inverse probability of censoring weights. The proposed estimator is shown to be consistent and asymptotically normal. Finite sample behavior of the proposed methods is investigated via a simulation study. Finally, we illustrate the nonparametric estimator using data from a recent HIV vaccine efficacy trial.


Subjects
Survival Analysis , Humans , Probability , Computer Simulation
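
The inverse probability of censoring weighting mentioned in the abstract of this entry can be sketched in its simplest form. This is an illustration under stated assumptions, not the authors' estimator: censoring is taken to be independent of everything else, so the censoring distribution is estimated by an unadjusted Kaplan-Meier curve, and observed times are assumed to have no ties between events and censorings. The names censoring_survival_left and ipcw_joint_cdf are hypothetical.

```python
def censoring_survival_left(times, events, t):
    """Kaplan-Meier estimate of G(t-) = P(C >= t), where a censored
    observation (event == 0) counts as an 'event' of the censoring time."""
    g = 1.0
    censor_times = sorted({ti for ti, d in zip(times, events) if d == 0 and ti < t})
    for u in censor_times:
        at_risk = sum(1 for ti in times if ti >= u)
        censored = sum(1 for ti, d in zip(times, events) if ti == u and d == 0)
        g *= 1.0 - censored / at_risk
    return g


def ipcw_joint_cdf(times, events, marks, t, v):
    """IPCW estimate of P(T <= t, V <= v). Marks of censored subjects are
    never read, matching the setting where the mark is unobserved for them."""
    total = 0.0
    for ti, di, mi in zip(times, events, marks):
        if di == 1 and ti <= t and mi <= v:
            # Each uncensored subject is up-weighted by 1 / G(T_i-).
            total += 1.0 / censoring_survival_left(times, events, ti)
    return total / len(times)
```

Without censoring every weight equals 1 and the estimate reduces to the empirical joint distribution function.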
5.
Biometrics ; 79(2): 788-798, 2023 06.
Article in English | MEDLINE | ID: mdl-35426444

ABSTRACT

Identifying effective and valid surrogate markers to make inference about a treatment effect on long-term outcomes is an important step in improving the efficiency of clinical trials. Replacing a long-term outcome with short-term and/or cheaper surrogate markers can potentially shorten study duration and reduce trial costs. There is sizable statistical literature on methods to quantify the effectiveness of a single surrogate marker. Both parametric and nonparametric approaches have been well developed for different outcome types. However, when there are multiple markers available, methods for combining markers to construct a composite marker with improved surrogacy remain limited. In this paper, building on top of the optimal transformation framework of Wang et al. (2020), we propose a novel calibrated model fusion approach to optimally combine multiple markers to improve surrogacy. Specifically, we obtain two initial estimates of optimal composite scores of the markers based on two sets of models with one set approximating the underlying data distribution and the other directly approximating the optimal transformation function. We then estimate an optimal calibrated combination of the two estimated scores which ensures both validity of the final combined score and optimality with respect to the proportion of treatment effect explained by the final combined score. This approach is unique in that it identifies an optimal combination of the multiple surrogates without strictly relying on parametric assumptions while borrowing modeling strategies to avoid fully nonparametric estimation which is subject to the curse of dimensionality. Our identified optimal transformation can also be used to directly quantify the surrogacy of this identified combined score. Theoretical properties of the proposed estimators are derived, and the finite sample performance of the proposed method is evaluated through simulation studies. We further illustrate the proposed method using data from the Diabetes Prevention Program study.


Subjects
Statistical Models , Computer Simulation , Biomarkers
6.
Psychometrika ; 88(1): 51-75, 2023 03.
Article in English | MEDLINE | ID: mdl-35972628

ABSTRACT

A number of parametric and nonparametric methods for estimating cognitive diagnosis models (CDMs) have been developed and applied in a wide range of contexts. However, in the literature, a wide chasm exists between these two families of methods, and their relationship to each other is not well understood. In this paper, we propose a unified estimation framework to bridge the divide between parametric and nonparametric methods in cognitive diagnosis to better understand their relationship. We also develop iterative joint estimation algorithms and establish consistency properties within the proposed framework. Lastly, we present comprehensive simulation results to compare different methods and provide practical recommendations on the appropriate use of the proposed framework in various CDM contexts.


Subjects
Algorithms , Cognition , Likelihood Functions , Psychometrics/methods , Computer Simulation
7.
Biostatistics ; 24(2): 518-537, 2023 04 14.
Article in English | MEDLINE | ID: mdl-34676400

ABSTRACT

Instrumental variable (IV) methods allow us the opportunity to address unmeasured confounding in causal inference. However, most IV methods are only applicable to discrete or continuous outcomes with very few IV methods for censored survival outcomes. In this article, we propose nonparametric estimators for the local average treatment effect on survival probabilities under both covariate-dependent and outcome-dependent censoring. We provide an efficient influence function-based estimator and a simple estimation procedure when the IV is either binary or continuous. The proposed estimators possess double-robustness properties and can easily incorporate nonparametric estimation using machine learning tools. In simulation studies, we demonstrate the flexibility and double robustness of our proposed estimators under various plausible scenarios. We apply our method to the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial for estimating the causal effect of screening on survival probabilities and investigate the causal contrasts between the two interventions under different censoring assumptions.


Subjects
Computer Simulation , Humans , Causality , Probability
8.
J Appl Stat ; 49(15): 3908-3927, 2022.
Article in English | MEDLINE | ID: mdl-36324481

ABSTRACT

In this study, we consider different poverty indexes in a dynamic framework where individuals change their rate of income randomly in time. The primary objective of this paper is to assess the accuracy of the approximation of the indexes that can be obtained by applying the strong law of large numbers to an economic system composed of an infinite number of agents. The main result is a multivariate central limit theorem for dynamic poverty measures, which is obtained by applying the theory of U-statistics. We also show how to obtain confidence sets for the considered dynamic indexes, which show the appropriateness of the model. An application to Italian income data from 1998 to 2012 confirms the effectiveness of the considered approach and the possibility of determining the evolution of poverty and inequality in real economies.

9.
Stat Med ; 41(26): 5290-5304, 2022 11 20.
Article in English | MEDLINE | ID: mdl-36062392

ABSTRACT

In comparative effectiveness research (CER), leveraging short-term surrogates to infer treatment effects on long-term outcomes can guide policymakers evaluating new treatments. Numerous statistical procedures for identifying surrogates have been proposed for randomized clinical trials (RCTs), but no methods currently exist to evaluate the proportion of treatment effect (PTE) explained by surrogates in real-world data (RWD), which have become increasingly common. To address this knowledge gap, we propose inverse probability weighted (IPW) and doubly robust (DR) estimators of an optimal transformation of the surrogate and the corresponding PTE measure. We demonstrate that the proposed estimators are consistent and asymptotically normal, and the DR estimator is consistent when either the propensity score model or outcome regression model is correctly specified. Our proposed estimators are evaluated through extensive simulation studies. In two RWD settings, we show that our method can identify and validate surrogate markers for inflammatory bowel disease (IBD).


Subjects
Comparative Effectiveness Research , Statistical Models , Humans , Computer Simulation , Propensity Score , Biomarkers
10.
Ann Stat ; 50(1): 487-510, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35813218

ABSTRACT

In long-term follow-up studies, data are often collected on repeated measures of multivariate response variables as well as on time to the occurrence of a certain event. To jointly analyze such longitudinal data and survival time, we propose a general class of semiparametric latent-class models that accommodates a heterogeneous study population with flexible dependence structures between the longitudinal and survival outcomes. We combine nonparametric maximum likelihood estimation with sieve estimation and devise an efficient EM algorithm to implement the proposed approach. We establish the asymptotic properties of the proposed estimators through novel use of modern empirical process theory, sieve estimation theory, and semiparametric efficiency theory. Finally, we demonstrate the advantages of the proposed methods through extensive simulation studies and provide an application to the Atherosclerosis Risk in Communities study.

11.
Land use policy ; 119: 106191, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35665311

ABSTRACT

The ongoing pandemic has led to substantial volatility in residential housing markets. However, relatively little is known about whether the volatility is dominated by housing demand or supply, and how differently priced markets contribute to the volatility. This article first examines the temporal effect of COVID-19 on house prices, housing demand, and supply in Los Angeles, and second explores the effect heterogeneity in luxury and low-end housing markets within the city. For identification, the article employs a revised difference-in-differences (DID) method that controls more rigorously for unobservables and improves on the traditional DID with smaller prior trends. Using individual-level data, the result first shows that, in response to the outbreak, house prices, demand, and supply all decreased in March to May 2020 and increased in July and August 2020, with demand dominating the process. Second, the heterogeneity exploration identifies diverging COVID-19 impacts in higher- and lower-priced markets. Particularly, the decline in overall price and demand before June originates mainly from the lower-priced market while the higher-priced one experienced limited changes in demand. After July, higher-priced markets led the housing market's surge in price, demand, and supply, whereas the lower-priced market has not fully recovered from decreases in house prices and housing demand. Finally, a larger price decline in lower-priced markets is found to be associated with higher service shares and lower homeownership rates. The results not only facilitate market participants in their decision making but also aid local governments in formulating policies and allocating subsidies to mitigate the effects of the outbreak.
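
For orientation, the traditional two-group, two-period DID that this abstract takes as its baseline can be written in a few lines. This sketch shows only the classical estimator, not the revised method the article proposes; the function name did_estimate and the numbers in the usage note are hypothetical and purely illustrative.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Classical difference-in-differences: the change in the treated group
    minus the change in the control group over the same two periods."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change
```

For example, did_estimate([10, 12], [15, 17], [8, 10], [9, 11]) subtracts the control group's change of 1 from the treated group's change of 5. The estimate has a causal reading only if both groups would have followed parallel trends without the intervention.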

12.
Stat Med ; 41(18): 3561-3578, 2022 Aug 15.
Article in English | MEDLINE | ID: mdl-35608143

ABSTRACT

We consider survival data that combine three types of observations: uncensored, right-censored, and left-censored. Such data arise from screening for a medical condition, in situations where self-detection arises naturally. Our goal is to estimate the failure-time distribution, based on these three observation types. We propose a novel methodology for distribution estimation using both semiparametric and nonparametric techniques. We then evaluate the performance of these estimators via simulated data. Finally, as a case study, we estimate the patience of patients who arrive at an emergency department and wait for treatment. Three categories of patients are observed: those who leave the system and announce it, and thus their patience time is observed; those who get service and thus their patience time is right-censored by the waiting time; and those who leave the system without announcing it. For this third category, the patients' absence is revealed only when they are called to service, which is after they have already left; formally, their patience time is left-censored. Other applications of our proposed methodology are discussed.

13.
BMC Public Health ; 22(1): 871, 2022 05 02.
Article in English | MEDLINE | ID: mdl-35501734

ABSTRACT

BACKGROUND: During a fast-moving epidemic, timely monitoring of case counts and other key indicators of disease spread is critical to an effective public policy response. METHODS: We describe a nonparametric statistical method, originally applied to the reporting of AIDS cases in the 1980s, to estimate the distribution of reporting delays of confirmed COVID-19 cases in New York City during the late summer and early fall of 2020. RESULTS: During August 15-September 26, the estimated mean delay in reporting was 3.3 days, with 87% of cases reported by 5 days from diagnosis. Relying upon the estimated reporting-delay distribution, we projected COVID-19 incidence during the most recent 3 weeks as if each case had instead been reported on the same day that the underlying diagnostic test had been performed. Applying our delay-corrected estimates to case counts reported as of September 26, we projected a surge in new diagnoses that had already occurred but had yet to be reported. Our projections were consistent with counts of confirmed cases subsequently reported by November 7. CONCLUSION: The projected estimate of recently diagnosed cases could have had an impact on timely policy decisions to tighten social distancing measures. While the recent advent of widespread rapid antigen testing has changed the diagnostic testing landscape considerably, delays in public reporting of SARS-CoV-2 case counts remain an important barrier to effective public health policy.


Subjects
Acquired Immunodeficiency Syndrome , COVID-19 , Acquired Immunodeficiency Syndrome/epidemiology , COVID-19/epidemiology , Humans , New York City/epidemiology , SARS-CoV-2 , Time Factors
14.
Automatica (Oxf) ; 140: 110265, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35400084

ABSTRACT

Quantitative assessment of the infection rate of a virus is key to monitoring the evolution of an epidemic. However, such a variable is not accessible to direct measurement and its estimation requires the solution of a difficult inverse problem. In particular, being the result not only of biological but also of social factors, the transmission dynamics can vary significantly in time. This makes questionable the use of parametric models, which may be unable to capture their full complexity. In this paper we exploit compartmental models which include important COVID-19 peculiarities (like the presence of asymptomatic individuals) and allow the infection rate to assume any continuous-time profile. We show that these models are universal, i.e. capable of reproducing exactly any epidemic evolution, and extract from them closed-form expressions of the infection rate time-course. Building upon such expressions, we then design a regularized estimator able to reconstruct COVID-19 transmission dynamics in continuous time. Using real data collected in Italy, our technique proves to be a useful tool to monitor COVID-19 transmission dynamics and to predict and assess the effect of lockdown restrictions.

15.
Test (Madr) ; 31(4): 931-949, 2022.
Article in English | MEDLINE | ID: mdl-35382496

ABSTRACT

The joint distribution of two or more variables could be influenced by the outcome of a conditioning variable. In this paper, we propose a flexible Wald-type statistic to test for such influence. The test is based on a conditioned multivariate Kendall's tau nonparametric estimator. The asymptotic properties of the test statistic are established under different null hypotheses, such as conditional independence or constant conditional dependence. Two simulation studies are presented: The first shows that the proposed estimator and the bandwidth selection procedure perform well. The second presents different bivariate and multivariate models to check the size and power of the test and runs comparisons with previous proposals when appropriate. The results support the contention that the test is accurate even in complex situations and that its computational cost is low. As an empirical application, we study the dependence between some pillars of European Regional Competitiveness when conditioned on the quality of regional institutions. We find interesting results, such as weaker links between innovation and higher education in regions with lower institutional quality. Supplementary Information: The online version contains supplementary material available at 10.1007/s11749-022-00806-1.
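
A kernel-weighted Kendall's tau gives a feel for the kind of conditional dependence estimator this abstract refers to. The sketch below covers only the bivariate case with a Gaussian kernel and a user-supplied bandwidth h. It is a common textbook form and is not claimed to be the paper's multivariate estimator or its bandwidth selection rule; the names gaussian_kernel and conditional_kendall_tau are hypothetical.

```python
from math import exp


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratio below."""
    return exp(-0.5 * u * u)


def conditional_kendall_tau(xs, ys, zs, z0, h):
    """Kernel-weighted Kendall's tau between X and Y given Z near z0.
    Each pair (i, j) is weighted by how close both z_i and z_j are to z0."""
    w = [gaussian_kernel((z - z0) / h) for z in zs]
    num = 0.0
    den = 0.0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            sign = (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
            num += w[i] * w[j] * sign
            den += w[i] * w[j]
    return num / den
```

As h grows the weights become nearly equal and the value approaches the ordinary, unconditional Kendall's tau, while a small h uses only observations with z close to z0.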

16.
BMC Med Res Methodol ; 22(1): 10, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34996366

ABSTRACT

When modelling the survival distribution of a disease for which the symptomatic progression of the associated condition is insidious, it is not always clear how to measure the failure/censoring times from some true date of disease onset. In a prevalent cohort study with follow-up, one approach for removing any potential influence from the uncertainty in the measurement of the true onset dates is through the utilization of only the residual lifetimes. As the residual lifetimes are measured from a well-defined screening date (prevalence day) to failure/censoring, these observed time durations are essentially error free. Using residual lifetime data, the nonparametric maximum likelihood estimator (NPMLE) may be used to estimate the underlying survival function. However, the resulting estimator can yield exceptionally wide confidence intervals. Alternatively, while parametric maximum likelihood estimation can yield narrower confidence intervals, it may not be robust to model misspecification. Using only right-censored residual lifetime data, we propose a stacking procedure to overcome the non-robustness of model misspecification; our proposed estimator comprises a linear combination of individual nonparametric/parametric survival function estimators, with optimal stacking weights obtained by minimizing a Brier Score loss function.


Subjects
Cohort Studies , Computer Simulation , Humans , Likelihood Functions , Survival Analysis , Uncertainty
17.
Biometrics ; 78(4): 1390-1401, 2022 12.
Article in English | MEDLINE | ID: mdl-34389985

ABSTRACT

There is often delayed entry into observational studies, which results in left truncation. In the estimation of the distribution of time-to-event from left-truncated data, standard survival analysis methods require quasi-independence between the truncation time and event time. Incorrectly assuming quasi-independence may lead to biased estimation. We address the problem of estimation of the survival distribution when dependence between the event time and its left truncation time is induced by shared covariates. We introduce propensity scores for truncated data and propose two inverse probability weighting methods that adjust for both truncation and dependence, if all of the shared covariates are measured. The proposed methods additionally allow for right censoring. We evaluate the proposed methods in simulations, conduct sensitivity analyses, and provide guidelines for use in practice. We illustrate our approach in application to data from a central nervous system lymphoma study. The proposed methods are implemented in the R package, depLT.


Subjects
Statistical Models , Survival Analysis , Propensity Score , Computer Simulation
18.
Biometrics ; 78(1): 165-178, 2022 03.
Article in English | MEDLINE | ID: mdl-33140426

ABSTRACT

A flexible class of semiparametric partly linear frailty transformation models is considered for analyzing clustered interval-censored data, which arise naturally in complex diseases and dental research. This class of models features two nonparametric components, resulting in a nonparametric baseline survival function and a potential nonlinear effect of a continuous covariate. The dependence among failure times within a cluster is induced by a shared, unobserved frailty term. A sieve maximum likelihood estimation method based on piecewise linear functions is proposed. The proposed estimators of the regression, dependence, and transformation parameters are shown to be strongly consistent and asymptotically normal, whereas the estimators of the two nonparametric functions are strongly consistent with optimal rates of convergence. An extensive simulation study is conducted to study the finite-sample performance of the proposed estimators. We provide an application to a dental study for illustration.


Subjects
Frailty , Computer Simulation , Humans , Likelihood Functions , Linear Models , Statistical Models
19.
Stat Methods Med Res ; 30(11): 2428-2446, 2021 11.
Article in English | MEDLINE | ID: mdl-34519231

ABSTRACT

Ultrahigh-dimensional gene features are often collected in modern cancer studies in which the number of gene features p is far larger than the sample size n. While gene expression patterns have been shown to be related to patients' survival in microarray-based gene expression studies, one has to deal with the challenges of ultrahigh-dimensional genetic predictors for survival prediction and genetic understanding of the disease in precision medicine. The problem becomes more complicated when two types of survival endpoints, distant metastasis-free survival and overall survival, are of interest in the study and outcome data can be subject to semi-competing risks due to the fact that distant metastasis-free survival is possibly censored by overall survival but not vice versa. Our focus in this paper is to extract important features, which have great impacts on both distant metastasis-free survival and overall survival jointly, from massive gene expression data in the semi-competing risks setting. We propose a model-free screening method based on the ranking of the correlation between gene features and the joint survival function of two endpoints. The method accounts for the relationship between two endpoints in a simply defined utility measure that is easy to understand and calculate. We show its favorable theoretical properties such as the sure screening and ranking consistency, and evaluate its finite sample performance through extensive simulation studies. Finally, an application to classifying breast cancer data clearly demonstrates the utility of the proposed method in practice.


Subjects
Breast Neoplasms , Breast Neoplasms/genetics , Computer Simulation , Early Detection of Cancer , Female , Humans , Mass Screening , Statistical Models
20.
Stat Med ; 40(28): 6321-6343, 2021 12 10.
Article in English | MEDLINE | ID: mdl-34474500

ABSTRACT

The potential benefit of using a surrogate marker in place of a long-term primary outcome is very attractive in terms of the impact on study length and cost. Many available methods for quantifying the effectiveness of a surrogate endpoint either rely on strict parametric modeling assumptions or require that the primary outcome and surrogate marker are fully observed, that is, not subject to censoring. Moreover, available methods for quantifying surrogacy typically provide a proportion of treatment effect explained (PTE) measure and do not directly address the important questions of whether and how the trial can be ended earlier using the surrogate marker. In this article, we specifically address these important questions by proposing a PTE measure to quantify the feasibility of ending trials early based on endpoint information collected at an earlier landmark point t0 in a time-to-event outcome setting. We provide a framework for deriving an optimally predicted outcome for individual patients at t0 based on a combination of surrogate marker and event time information in the presence of censoring. We propose a non-parametric estimator for the PTE measure and derive the asymptotic properties of our estimators. Finite sample performance of our estimators is illustrated via extensive simulation studies and a real data application examining the potential of hemoglobin A1c and fasting plasma glucose to predict treatment effects on long-term diabetes risk based on the Diabetes Prevention Program study.


Subjects
Clinical Trials as Topic , Biomarkers , Computer Simulation , Feasibility Studies , Humans