Results 1 - 8 of 8
1.
Lifetime Data Anal; 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39235702

ABSTRACT

The nested case-control (NCC) design is a cost-effective outcome-dependent design in epidemiology that collects all cases and, at the time of each case's diagnosis, a fixed number of controls from a large cohort. Because the design is inefficient relative to a full cohort study, previous research developed various estimation methodologies, but changes to the formulation of risk sets were considered only with respect to potential bias in partial likelihood estimation. In this paper, we study a modified design that excludes previously selected controls from risk sets, with a view to improving efficiency as well as controlling bias. To this end, we extend the inverse probability weighting method of Samuelsen, which was shown to outperform the partial likelihood estimator in the standard setting. We develop its asymptotic theory and variance estimators for both the regression coefficients and the cumulative baseline hazard function that account for the complex features of the modified sampling design. In addition to good finite-sample performance of the variance estimation, simulation studies show that the modified design with the proposed estimator is more efficient than the standard design. Examples are provided using data from the NIH-AARP Diet and Health Cohort Study.
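
As a point of reference for the weighting scheme being extended, here is a minimal sketch of Samuelsen-type inverse probability weighting under the *standard* NCC design: compute each control's probability of ever being sampled, then fit a weighted Cox model. The simulated cohort, all parameter values, and the crude sampling loop are invented for illustration; the paper's modified design would require different inclusion probabilities.

```python
# Sketch: Samuelsen-type IPW under the standard NCC design (illustrative).
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

def ncc_inclusion_probs(times, events, m):
    """Samuelsen's inclusion probability for a control censored at t_i:
    p_i = 1 - prod over case times t_j <= t_i of (1 - m / (n(t_j) - 1)),
    where n(t_j) is the number of subjects at risk at case time t_j."""
    case_idx = np.where(events == 1)[0]
    n_risk = np.array([(times >= times[j]).sum() for j in case_idx])
    frac = np.clip(1.0 - m / (n_risk - 1.0), 0.0, 1.0)
    p = np.ones(len(times))                  # cases are sampled w.p. 1
    for i in np.where(events == 0)[0]:
        eligible = times[case_idx] <= times[i]
        p[i] = 1.0 - np.prod(frac[eligible])
    return p

rng = np.random.default_rng(0)
n, m = 2000, 2                               # cohort size, controls per case
x = rng.normal(size=n)
t = rng.exponential(np.exp(-0.5 * x))        # true coefficient is 0.5
c = rng.exponential(2.0, size=n)
time, event = np.minimum(t, c), (t <= c).astype(int)

sampled = event.astype(bool).copy()          # all cases are included
for j in np.where(event == 1)[0]:            # draw m controls per case
    risk = np.where((time >= time[j]) & (np.arange(n) != j))[0]
    sampled[rng.choice(risk, size=min(m, len(risk)), replace=False)] = True

p = ncc_inclusion_probs(time, event, m)
df = pd.DataFrame({"time": time, "event": event, "x": x,
                   "w": 1.0 / np.maximum(p, 1e-12)})[sampled]
CoxPHFitter().fit(df, "time", "event", weights_col="w",
                  robust=True).print_summary()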

2.
Sci Rep; 11(1): 3203, 2021 Feb 05.
Article in English | MEDLINE | ID: mdl-33547332

ABSTRACT

Detecting prognostic factors associated with patients' survival outcomes helps gain insight into a disease and guide treatment decisions. The rapid advancement of high-throughput technologies has yielded plentiful genomic biomarkers as candidate prognostic factors, but most are of limited use in clinical applications. As the price of these technologies drops, many genomic studies are conducted to explore a common scientific question in different cohorts, with the aim of identifying more reproducible and credible biomarkers. However, new challenges arise from heterogeneity in study populations and designs when the multiple studies are analyzed jointly; for example, patients from different cohorts show different demographic characteristics and risk profiles. Existing high-dimensional variable selection methods for survival analysis are restricted to single-study analysis. We propose a novel Cox-model-based two-stage variable selection method, called "Cox-TOTEM", to detect survival-associated biomarkers common to multiple genomic studies. Simulations showed that our method greatly improves the sensitivity of variable selection compared with applying existing methods separately to each study, especially when the signals are weak or the studies are heterogeneous. An application of our method to TCGA transcriptomic data identified essential survival-associated genes related to the common disease mechanism of five Pan-Gynecologic cancers.


Subject(s)
Biomarkers, Tumor/genetics; Genomics; Neoplasms/genetics; Transcriptome; Gene Expression Profiling; Humans; Neoplasms/epidemiology; Prognosis; Proportional Hazards Models; Survival Analysis
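
The abstract does not spell out the two stages of Cox-TOTEM, so the following is only a hedged sketch of the general multi-study screening idea it builds on: per-gene Cox fits within each study, combined across studies before any joint selection. The function names, the Fisher combination rule, and the data layout (columns `time`, `event`, and one column per gene) are all assumptions for illustration, not the paper's algorithm.

```python
# Hedged sketch of a two-stage multi-study screen (not the real Cox-TOTEM).
import numpy as np
import pandas as pd
from scipy import stats
from lifelines import CoxPHFitter

def stage1_pvalue(study: pd.DataFrame, gene: str) -> float:
    """Univariate Cox p-value for one gene in one study."""
    cph = CoxPHFitter().fit(study[["time", "event", gene]], "time", "event")
    return float(cph.summary.loc[gene, "p"])

def combined_screen(studies, genes):
    """Combine per-study evidence with Fisher's method (one of many
    possible rules); returns a combined p-value per gene."""
    out = {}
    for g in genes:
        pvals = np.array([stage1_pvalue(s, g) for s in studies])
        out[g] = stats.chi2.sf(-2.0 * np.log(pvals).sum(), df=2 * len(pvals))
    return pd.Series(out).sort_values()

# Stage 2 (not sketched): a joint, penalized Cox fit restricted to the
# genes surviving the screen, encouraging a shared support across studies.
```
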
3.
Stat Biosci; 12(3): 376-398, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33796162

ABSTRACT

The threshold regression model is an effective alternative to the Cox proportional hazards model when the proportional hazards assumption is not met. This paper considers variable selection for threshold regression, which has separate regression functions for the initial health status and for the speed of degradation in health. This flexibility is an important advantage when screening risk factors for a complex time-to-event model, where one must decide which variables to include in the regression function for initial health status, in the function for the speed of degradation, or in both. In this paper, we extend the broken adaptive ridge (BAR) method, originally designed for variable selection in a single regression function, to simultaneous variable selection in both regression functions of the threshold regression model. We establish variable selection consistency of the proposed method and asymptotic normality of the estimator of the non-zero regression coefficients. Simulation results show that our method outperforms both threshold regression without variable selection and variable selection based on the Akaike information criterion. We apply the proposed method to data from an HIV drug adherence study in which electronic monitoring of drug intake is used to identify risk factors for non-adherence.
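
The BAR iteration itself is easy to state: repeatedly solve a ridge problem in which the penalty on each coefficient is scaled by the inverse square of the previous estimate, so small coefficients are driven to zero. A minimal sketch for a linear model follows; the paper applies the same scheme to the two regression functions of threshold regression, and the linear-model setting and all tuning values here are purely illustrative.

```python
# Minimal broken adaptive ridge (BAR) iteration for a linear model.
import numpy as np

def bar(X, y, lam=1.0, xi=1.0, tol=1e-8, max_iter=200):
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    beta = np.linalg.solve(XtX + xi * np.eye(p), Xty)   # ridge start
    for _ in range(max_iter):
        # Adaptive ridge weights 1 / beta_j^2 (floored for stability).
        D = np.diag(1.0 / np.maximum(beta**2, 1e-12))
        new = np.linalg.solve(XtX + lam * D, Xty)
        if np.max(np.abs(new - beta)) < tol:
            beta = new
            break
        beta = new
    beta[np.abs(beta) < 1e-6] = 0.0     # zero out numerically dead entries
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = X[:, 0] * 2.0 - X[:, 1] + rng.normal(size=200)
print(bar(X, y, lam=2.0))   # expect nonzeros only in positions 0 and 1
```
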

4.
Electron J Stat; 10(1): 1341-1392, 2016.
Article in English | MEDLINE | ID: mdl-28473876

ABSTRACT

We introduce a general framework for estimating inverse covariance, or precision, matrices from heterogeneous populations. The proposed framework uses a Laplacian shrinkage penalty to encourage similarity among estimates from disparate, but related, subpopulations, while allowing for differences among matrices. We propose an efficient alternating direction method of multipliers (ADMM) algorithm for parameter estimation, as well as an extension for faster computation in high dimensions that thresholds the empirical covariance matrix to identify the joint block-diagonal structure of the estimated precision matrices. We establish both variable selection and norm consistency of the proposed estimator for distributions with exponential or polynomial tails. Further, to extend the method to settings with unknown population structure, we propose a Laplacian penalty based on hierarchical clustering, and we discuss conditions under which this data-driven choice results in consistent estimation of precision matrices in heterogeneous populations. Extensive numerical studies and applications to gene expression data from cancer subtypes with distinct clinical outcomes indicate the potential advantages of the proposed method over existing approaches.
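
The thresholding step is straightforward to illustrate: threshold the empirical covariance into an adjacency matrix, split variables into connected components, and estimate each block separately. The sketch below does exactly that, but as a stand-in for the paper's Laplacian-penalized ADMM estimator it runs an ordinary single-population graphical lasso (scikit-learn) within each block, so the cross-population penalty is deliberately omitted; `tau` and `alpha` are illustrative tuning values.

```python
# Block-diagonal screening via covariance thresholding (illustrative).
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components
from sklearn.covariance import GraphicalLasso

def block_screen(X, tau):
    """Threshold |S| to an adjacency matrix and return its connected
    components; each component can be estimated as a separate block."""
    S = np.cov(X, rowvar=False)
    A = (np.abs(S) > tau).astype(int)
    np.fill_diagonal(A, 1)
    n_blocks, labels = connected_components(csr_matrix(A), directed=False)
    return [np.where(labels == b)[0] for b in range(n_blocks)]

def blockwise_precision(X, tau, alpha=0.1):
    """Stand-in estimator: plain graphical lasso within each block
    (the paper's method adds a Laplacian penalty across populations)."""
    Theta = np.zeros((X.shape[1], X.shape[1]))
    for idx in block_screen(X, tau):
        if len(idx) == 1:
            Theta[idx[0], idx[0]] = 1.0 / X[:, idx[0]].var()
        else:
            gl = GraphicalLasso(alpha=alpha).fit(X[:, idx])
            Theta[np.ix_(idx, idx)] = gl.precision_
    return Theta
```
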

5.
Cancer Inform; 13(Suppl 2): 55-66, 2014.
Article in English | MEDLINE | ID: mdl-25288880

ABSTRACT

Network reconstruction is an important yet challenging task in systems biology. While many methods have recently been proposed for reconstructing biological networks from diverse data types, the properties of the estimated networks and the differences between reconstruction methods are not well understood. In this paper, we conduct a comprehensive empirical evaluation of seven existing network reconstruction methods by comparing the estimated networks, at a range of sparsity levels, for both normal and tumor samples. The results suggest substantial heterogeneity among networks reconstructed with different methods. Our findings also provide evidence of significant differences between the networks of normal and tumor samples, even after accounting for the considerable variability in the structures estimated by different methods. These differences can offer new insight into changes in the mechanisms of genetic interaction associated with cancer initiation and progression.
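
One simple way to quantify the heterogeneity described above is to compare edge sets at a matched sparsity level. The helper below (an illustrative metric, not the paper's evaluation protocol) extracts the top-k edges from any method's score matrix and computes the Jaccard similarity between two reconstructions.

```python
# Compare two reconstructed networks at a matched sparsity level.
import numpy as np

def top_k_edges(score, k):
    """Edge set = the k largest off-diagonal |scores| (upper triangle)."""
    iu = np.triu_indices(score.shape[0], k=1)
    keep = np.argsort(np.abs(score[iu]))[-k:]
    return set(zip(iu[0][keep].tolist(), iu[1][keep].tolist()))

def jaccard(edges_a, edges_b):
    """Overlap of two edge sets; 1 = identical, 0 = disjoint."""
    return len(edges_a & edges_b) / max(len(edges_a | edges_b), 1)

# Usage: with score matrices from any two methods (partial correlations,
# mutual information, ...) estimated on normal and tumor samples,
#   jaccard(top_k_edges(S_normal, 100), top_k_edges(S_tumor, 100))
# compares the reconstructions at a common sparsity of 100 edges.
```
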

6.
Biometrics; 70(3): 619-28, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24888739

ABSTRACT

The log-rank test is widely used to test treatment effects under the Cox model for censored time-to-event outcomes, but it can lose substantial power when the model's proportional hazards assumption does not hold. In this article, we consider an extended Cox model that uses B-splines or smoothing splines to model a time-varying treatment effect, and we propose score test statistics for the treatment effect. The proposed tests combine statistical evidence from both the magnitude and the shape of the time-varying hazard ratio function, and are therefore omnibus and powerful against a wide range of alternatives. In addition, the testing framework is applicable to any choice of spline basis functions, including B-splines and smoothing splines. Simulation studies confirm that the proposed tests perform well in finite samples and are frequently more powerful than conventional tests. We apply the new methods to the HIVNET 012 Study, a randomized clinical trial conducted by the HIV Prevention Trials Network to assess the efficacy of single-dose nevirapine against mother-to-child HIV transmission.


Subject(s)
HIV Infections/mortality; HIV Infections/prevention & control; Infectious Disease Transmission, Vertical/prevention & control; Nevirapine/administration & dosage; Pregnancy Complications, Infectious/drug therapy; Proportional Hazards Models; Algorithms; Biometry/methods; Computer Simulation; Data Interpretation, Statistical; Female; Humans; Infectious Disease Transmission, Vertical/statistics & numerical data; Models, Statistical; Pregnancy; Pregnancy Complications, Infectious/mortality; Reproducibility of Results; Sensitivity and Specificity
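
The spline-based score tests are not, to my knowledge, packaged in standard Python libraries. A related but much simpler diagnostic for a time-varying treatment effect is the scaled-Schoenfeld-residual test in lifelines, shown here on lifelines' bundled Rossi recidivism data as a stand-in for trial data.

```python
# Stand-in diagnostic for a time-varying effect (not the paper's test).
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi
from lifelines.statistics import proportional_hazard_test

df = load_rossi()                   # bundled example data, not trial data
cph = CoxPHFitter().fit(df, duration_col="week", event_col="arrest")
# Small p-values flag covariates whose hazard ratio drifts with time --
# the regime where the omnibus spline tests are designed to keep power.
result = proportional_hazard_test(cph, df, time_transform="rank")
result.print_summary()
```
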
7.
Ann Stat; 41(1): 269-295, 2013 Feb 01.
Article in English | MEDLINE | ID: mdl-24563559

ABSTRACT

We develop asymptotic theory for weighted likelihood estimators (WLE) under two-phase stratified sampling without replacement, and we consider several variants of the WLE involving estimated weights and calibration. A set of empirical process tools is developed, including a Glivenko-Cantelli theorem, a theorem on rates of convergence of M-estimators, and a Donsker theorem for inverse probability weighted empirical processes under two-phase sampling with sampling without replacement at the second phase. Using these general results, we derive the asymptotic distributions of the WLE of a finite-dimensional parameter in a general semiparametric model in which a nuisance parameter is estimable at either a regular or a nonregular rate. We illustrate these results and methods in the Cox model with right-censored and with interval-censored data. We compare the methods via their asymptotic variances under both sampling without replacement and the more usual (and easier to analyze) assumption of Bernoulli sampling at the second phase.
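
The weighting construction is easy to sketch: draw a without-replacement subsample within each phase-1 stratum, weight each selected subject by the inverse sampling fraction N_h/n_h, and maximize the weighted (here, Cox partial) likelihood. The toy example below uses invented data and stratum sizes; note that the robust variance reported by lifelines corresponds to the Bernoulli-sampling approximation the abstract contrasts with exact without-replacement sampling.

```python
# Toy IPW fit under two-phase stratified sampling without replacement.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2)
N = 5000
z = rng.integers(0, 2, size=N)                 # cheap phase-1 stratifier
x = rng.normal(size=N) + 0.5 * z               # "expensive" covariate
t = rng.exponential(np.exp(-0.7 * x))          # true coefficient is 0.7
c = rng.exponential(1.5, size=N)
cohort = pd.DataFrame({"time": np.minimum(t, c),
                       "event": (t <= c).astype(int), "z": z, "x": x})

# Phase 2: sample without replacement within strata defined by (z, event),
# weighting each selected subject by the inverse sampling fraction N_h/n_h.
parts = []
for _, grp in cohort.groupby(["z", "event"]):
    n_h = min(len(grp), 150)
    sub = grp.sample(n=n_h, random_state=0).copy()
    sub["w"] = len(grp) / n_h
    parts.append(sub)
phase2 = pd.concat(parts)

CoxPHFitter().fit(phase2[["time", "event", "x", "w"]], "time", "event",
                  weights_col="w", robust=True).print_summary()
```
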

8.
J Stat Softw; 43(11), 2011 Aug.
Article in English | MEDLINE | ID: mdl-22545023

ABSTRACT

The two-phase design has recently received attention in the statistical literature as an extension of the traditional case-control study for settings where a predictor of interest is rare or subject to misclassification. Despite a thorough methodological treatment and the potential for substantial efficiency gains, the two-phase design has not been widely adopted, due in part, perhaps, to a lack of general-purpose, readily available software. The osDesign package for R provides a suite of functions for analyzing data from a two-phase and/or case-control design, as well as for evaluating operating characteristics, including bias, efficiency, and power. The evaluation is simulation-based, permitting flexible application of the package to a broad range of scientific settings. Using lung cancer mortality data from Ohio, the package is illustrated with a detailed case study addressing two statistical goals: (i) evaluating small-sample operating characteristics of two-phase and case-control designs, and (ii) planning and designing a future two-phase study.
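
osDesign itself is an R package, so the sketch below is a language-swapped illustration of the same simulation-based idea rather than the package's API: repeatedly draw a case-control sample from a simulated cohort and check the operating characteristics of the log odds-ratio estimate (statsmodels for the logistic fit; the cohort size, prevalence, and effect size are all invented).

```python
# Simulation-based operating characteristics of a case-control design.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
TRUE_BETA = np.log(2.0)                       # true log odds ratio

def one_rep(N=20000, n1=500, n0=500):
    """One cohort draw plus a case-control sample; returns the fitted
    log odds ratio (the slope is consistent under case-control sampling;
    only the intercept absorbs the outcome-dependent selection)."""
    x = rng.binomial(1, 0.1, size=N)          # fairly rare binary exposure
    p = 1.0 / (1.0 + np.exp(-(-3.0 + TRUE_BETA * x)))
    y = rng.binomial(1, p)                    # ~5% cases, so n1=500 is safe
    idx = np.concatenate([
        rng.choice(np.where(y == 1)[0], n1, replace=False),
        rng.choice(np.where(y == 0)[0], n0, replace=False)])
    X = sm.add_constant(x[idx].astype(float))
    return sm.Logit(y[idx], X).fit(disp=0).params[1]

est = np.array([one_rep() for _ in range(200)])
print("bias:", est.mean() - TRUE_BETA, " sd:", est.std(ddof=1))
```
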
