Results 1 - 20 of 61
1.
J Appl Stat ; 51(3): 430-450, 2024.
Article in English | MEDLINE | ID: mdl-38370272

ABSTRACT

The Early Childhood Longitudinal Study-Kindergarten Class of 2010-2011 (ECLS-K:2011) ascertained the timing of ear infections within age-specified intervals and the parent's or caregiver's report of medically diagnosed hearing loss. In this nationally representative, school-based sample of children followed from kindergarten entry through fifth grade, academic performance in reading, mathematics, and science was assessed longitudinally. Prior investigations of this ECLS-K:2011 cohort showed that age has a non-linear, monotonically increasing functional relationship with academic performance. Based on this knowledge, a semiparametric partial linear model is proposed, in which the effect of age is modeled by an unknown monotonically increasing function along with other regression parameters. The parameters are estimated by a semiparametric maximum likelihood estimator. A test of a constant effect of age is also proposed. Simulation studies are conducted to evaluate the performance of the proposed method, as compared with the commonly used linear model; the former outperforms the latter based on several criteria. We then analyze ECLS-K:2011 data to compare the estimates from the partial linear model with those from classical linear regression models.

2.
J Data Sci ; 21(4): 681-695, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38623143

ABSTRACT

Single-index models are becoming increasingly popular in many scientific applications as they offer the advantages of flexibility in regression modeling as well as interpretable covariate effects. In the context of survival analysis, the single-index hazards models are natural extensions of the Cox proportional hazards models. In this paper, we propose a novel estimation procedure for single-index hazard models under a monotone constraint of the index. We apply the profile likelihood method to obtain the semiparametric maximum likelihood estimator, where the novelty of the estimation procedure lies in estimating the unknown monotone link function by embedding the problem in isotonic regression with exponentially distributed random variables. The consistency of the proposed semiparametric maximum likelihood estimator is established under suitable regularity conditions. Numerical simulations are conducted to examine the finite-sample performance of the proposed method. An analysis of breast cancer data is presented for illustration.

3.
Int J Biostat ; 2022 Nov 28.
Article in English | MEDLINE | ID: mdl-36433631

ABSTRACT

With our increased ability to capture large data, causal inference has received renewed attention and is playing an ever-important role in biomedicine and economics. However, one major methodological hurdle is that existing methods rely on many unverifiable model assumptions. Thus robust modeling is a critically important approach complementary to sensitivity analysis, where it compares results under various model assumptions. The more robust a method is with respect to model assumptions, the more worthy it is. The doubly robust estimator (DRE) is a significant advance in this direction. However, in practice, many outcome measures are functionals of multiple distributions, and so are the associated estimands, which can only be estimated via U-statistics. Thus most existing DREs do not apply. This article proposes a broad class of highly robust U-statistic estimators (HREs), which use semiparametric specifications for both the propensity score and outcome models in constructing the U-statistic. Thus, the HRE is more robust than the existing DREs. We derive comprehensive asymptotic properties of the proposed estimators and perform extensive simulation studies to evaluate their finite sample performance and compare them with the corresponding parametric U-statistics and the naive estimators, which show significant advantages. Then we apply the method to analyze a clinical trial from the AIDS Clinical Trials Group.

4.
J Biopharm Stat ; 32(4): 627-640, 2022 07 04.
Article in English | MEDLINE | ID: mdl-35867402

ABSTRACT

Global clinical trials involving multiple regions are common in current drug development processes. Determining the regional treatment effects of a new therapy over an existing therapy is important to both the sponsors and the regulatory agencies in the regions. Existing methods are mainly for continuous primary endpoints and use subjectively specified models, which may deviate from the true model. Here, we consider trials that have ordinal responses as the primary endpoint. This article extends the recently developed robust semiparametric ordinal regression model to estimate regional treatment effects, in which the regression coefficients and regional effects are modeled parametrically for ease of interpretation, and the regression link function is specified nonparametrically for robustness. The model parameters are estimated by semiparametric maximum likelihood estimation, and the null hypothesis of no regional effect is tested by the Wald test. Simulation studies are conducted to evaluate the performance of the proposed method and compare it with the commonly used parametric model. The results of the former show an improved overall performance over the latter. In particular, the model yields much higher precision in estimation and prediction than the fixed-link model. This result is especially appealing since our interest is to estimate the treatment effect more efficiently and the estimand is of particular interest in multiregional clinical trials. We then apply the method by analyzing real multiregional clinical trials with ordinal responses as their primary endpoint.


Subject(s)
Research Design , Computer Simulation , Humans , Randomized Controlled Trials as Topic
5.
J Pain Res ; 15: 1719-1728, 2022.
Article in English | MEDLINE | ID: mdl-35734509

ABSTRACT

Purpose: This study aimed to investigate the use of the percutaneous intervertebral foramen lens technology for secondary molding of the intervertebral foramen in the treatment of calcified lumbar discs. Methods: The study included 104 patients who were divided into two groups. Group A comprised 50 patients with calcified lumbar disc herniation and group B comprised 54 patients with non-calcified lumbar disc herniation diagnosed by computed tomography and magnetic resonance imaging. Patients underwent a percutaneous endoscopic lumbar discectomy at our hospital from January 1, 2017, to December 31, 2019. Demographic characteristics before the surgery and perioperative outcomes were retrospectively reviewed. The treatment outcome was analyzed using the numerical rating scale (NRS) score, Oswestry Disability Index (ODI) score, and modified Macnab criteria. Results: Patients in groups A and B showed significant improvement in both the NRS and ODI scores after the surgery and maintained relatively low ODI and NRS scores during subsequent follow-ups. According to the modified Macnab criteria, the good-excellent rate of clinical efficacy was 94% in group A and 92.6% in group B at the 3-month follow-up. In group A, one patient developed neck pain during the surgery, which was diagnosed as spinal hypertension syndrome, and the surgery was suspended until the patient's condition improved. No similar complications occurred in group B. In both groups, no patient reported any dural leak, infection, or other related complications. Conclusion: The use of transforaminal remolding technology can significantly improve the symptoms and dysfunction of patients with calcified and non-calcified lumbar disc herniation. There were few intraoperative and postoperative complications, and the technique had little impact on vertebral stability. It can provide a reference for the treatment of special types of lumbar disc herniation.

6.
Stat Med ; 41(1): 180-193, 2022 01 15.
Article in English | MEDLINE | ID: mdl-34672000

ABSTRACT

Regression is a commonly used statistical model. It is the conditional mean of the response given covariates, µ(x) = E(Y | X = x). However, in some practical problems, the interest is the conditional mean of the response given that the covariates belong to some set A. Notably, in precision medicine and subgroup analysis in clinical trials, the aim is to identify subjects who benefit the most from the treatment, or to identify an optimal set in the covariate space that manifests treatment favoritism: if a subject's covariates fall in this set, the subject is classified to the favorable treatment subgroup. Existing methods for subgroup analysis achieve this indirectly by using classical regression. This motivates us to develop a new type of regression: set-regression, defined as µ(A) = E(Y | X ∈ A), which directly addresses the subgroup analysis problem. This not only extends the classical regression model but also improves on recursive partitioning and support vector machine approaches, and is particularly suitable for objectives involving optimization of the regression over sets, such as subgroup analysis. We show that the new versatile set-regression identifies the subgroup with increased accuracy. It is easy to use. Simulation studies also show superior performance of the proposed method in finite samples.
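
The definition µ(A) = E(Y | X ∈ A) can be illustrated with a minimal sketch. It is not the estimator of the article, whose details the abstract does not give. The function names, the one-dimensional covariate, and the restriction to threshold sets A_c = {x : x >= c} are all assumptions made for illustration.

```python
from statistics import mean


def set_regression_mean(xs, ys, in_set):
    """Empirical estimate of mu(A) = E(Y | X in A).

    `in_set` is a predicate describing the set A (hypothetical helper).
    """
    selected = [y for x, y in zip(xs, ys) if in_set(x)]
    if not selected:
        raise ValueError("no observations fall in the set A")
    return mean(selected)


def best_threshold_set(xs, ys):
    """Search the sets A_c = {x : x >= c} over observed values c and
    return the pair (c, mu(A_c)) with the largest empirical mean."""
    best = None
    for c in sorted(set(xs)):
        m = set_regression_mean(xs, ys, lambda x, c=c: x >= c)
        if best is None or m > best[1]:
            best = (c, m)
    return best


# Toy data: the response tends to be larger for large covariate values.
xs = [0.1, 0.4, 0.5, 0.8, 0.9]
ys = [1.0, 2.0, 1.0, 4.0, 6.0]
print(set_regression_mean(xs, ys, lambda x: x >= 0.5))
print(best_threshold_set(xs, ys))
```

With small samples the unconstrained search tends to pick a set containing very few observations, so a practical version would presumably restrict the minimum size of A.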


Subject(s)
Models, Statistical , Research Design , Clinical Trials as Topic , Computer Simulation , Humans , Regression Analysis , Support Vector Machine
7.
Biometrics ; 78(4): 1464-1474, 2022 12.
Article in English | MEDLINE | ID: mdl-34492116

ABSTRACT

In this paper, we propose a semiparametric regression model that is built upon an isotonic regression model with the assumption that the random error follows a skewed distribution. We develop an expectation-maximization algorithm for obtaining the maximum likelihood estimates of the model parameters, examine the asymptotic properties of the estimators, conduct simulation studies to explore the performance of the proposed model, and apply the method to evaluate the DNA-RNA-protein relationship and identify genes that are key factors in tumor progression.


Subject(s)
Algorithms , Models, Statistical , Likelihood Functions , Computer Simulation , DNA
8.
Pharm Stat ; 21(1): 133-149, 2022 01.
Article in English | MEDLINE | ID: mdl-34350678

ABSTRACT

In multiregional randomized clinical trials (MRCTs), determining the regional treatment effect of a new treatment over an existing one is important to both the sponsor and related regulatory agencies. Also of particular interest is testing the null hypothesis that the treatment benefit is the same among all the regions. Existing methods are mainly for continuous endpoints and use parametric models, which are not robust. MRCTs are known for facing increased variation and heterogeneity, and a robust model for their design and analysis would be desirable. We consider clinical trials with a binary primary endpoint and propose a robust semiparametric logistic model which has a known parametric and an unknown nonparametric component. The parametric component represents our prior knowledge about the model, and the nonparametric part reflects uncertainty. Compared to the classic logistic model for this problem, the proposed model has the following advantages: it is robust to model assumptions, more flexible and accurate in modeling the relationship between the response and covariates, and possibly yields more accurate parameter estimates. The model parameters are estimated by a profile maximum likelihood approach, and the null hypothesis of regional treatment difference being the same is tested by the profile likelihood ratio statistic. Asymptotic properties of the estimates are derived. Simulation studies are conducted to evaluate the performance of the proposed model, which demonstrates clear advantages over the classic logistic model. The method is then applied to analyze a real MRCT.


Subject(s)
Models, Statistical , Computer Simulation , Humans , Likelihood Functions , Logistic Models , Randomized Controlled Trials as Topic
9.
Biometrics ; 78(4): 1475-1488, 2022 12.
Article in English | MEDLINE | ID: mdl-34181761

ABSTRACT

Personalized medicine allows individuals to choose the best fit of their treatments based on their characteristics through an individualized treatment regime. In this paper, we develop a pool adjacent violators algorithm-assisted learning method to find the optimal individualized treatment regime under the monotone single-index outcome gain model. The proposed estimator is more efficient than peers, and it is robust to the misspecification of the propensity score model or the baseline regression model. The optimal treatment regime is also robust to the misspecification of the functional form of the expected outcome gain model. Simulation studies verified our theoretical results. We also provide an estimate of the expected outcome gain model. Plotting the expected outcome gain versus an individual's characteristics index can visualize how significant the treatment effect is over the control. We apply the proposed method to an AIDS study.


Subject(s)
Algorithms , Models, Statistical , Humans , Computer Simulation , Precision Medicine/methods , Propensity Score
10.
Biom J ; 64(3): 506-522, 2022 03.
Article in English | MEDLINE | ID: mdl-34897799

ABSTRACT

In clinical trials, treatment effects often vary from subject to subject. Some subjects may benefit more than others from a specific treatment. One of the aims of subgroup analysis is to identify if there are subgroups of subjects with differential treatment effects. As in standard analysis, we first test if subgroups with differential treatment effects exist; if they do, we classify the subjects into different subgroups based on their covariate profiles; otherwise, we conclude no subgroups have differential treatment effects in this population. Existing methods utilize regression models, particularly linear models, for such analysis. However, in practice, not all effects of covariates on responses are linear. To address this issue, the article proposes a more flexible model, the partial linear model with a nonlinear monotone function to describe some specific effects of covariates and with a linear component to describe the effects of other covariates, develops a model-fitting algorithm, and derives the model asymptotics. We then utilize the Wald statistic to test the existence of subgroups and the Neyman-Pearson rule to classify subjects into the subgroups. Simulation studies are conducted to evaluate the finite sample performance of the proposed method by comparing it with the commonly used linear models. Finally, we apply the methods to analyze a real clinical trial.


Subject(s)
Algorithms , Research Design , Computer Simulation , Humans , Linear Models
11.
Ann Appl Stat ; 15(3): 1291-1307, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34745408

ABSTRACT

For certain subtypes of breast cancer, study findings show that the level of estrogen receptor expression is associated with the risk of cancer death, and also suggest a non-linear effect on the hazard of death. A flexible form of the proportional hazards model, λ(t | x, z) = λ(t) exp(zᵀβ) q(x), is desirable to accommodate a rich class of covariate effects on a survival outcome and to provide meaningful insight, where the functional form of q(x) is not specified except for its shape. Prior biologic knowledge on the shape of the underlying distribution of the covariate effect in regression models can be used to enhance statistical inference. Despite recent progress, major challenges remain for semiparametric shape-restricted inference due to the lack of practical and efficient computational algorithms to accomplish non-convex optimization. We propose an alternative algorithm to maximize the full log-likelihood with two sets of parameters iteratively under monotone constraints. The first set consists of the non-parametric estimation of the monotone-restricted function q(x), while the second set includes estimating the baseline hazard function and other covariate coefficients. The iterative algorithm in conjunction with the pool-adjacent-violators algorithm makes the computation efficient and practical. Jackknife resampling effectively reduces the estimator bias when the sample size is small. Simulations show that the proposed method can accurately capture the underlying shape of q(x), and outperforms the estimators when q(x) in the Cox model is mis-specified. We apply the method to model the effect of estrogen receptor on breast cancer patients' survival.
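
The pool-adjacent-violators algorithm mentioned above can be sketched in its generic weighted least-squares form. This is only the standard building block, assumed here for illustration. The article embeds it in an iterative likelihood maximization whose details are not given in the abstract, and the function name `pava` is hypothetical.

```python
def pava(y, w=None):
    """Non-decreasing weighted least-squares fit to the sequence y
    using the pool-adjacent-violators algorithm."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total_w = w1 + w2
            pooled_mean = (m1 * w1 + m2 * w2) / total_w
            blocks.append([pooled_mean, total_w, c1 + c2])
    # Expand the blocks back to one fitted value per observation.
    fitted = []
    for block_mean, _, count in blocks:
        fitted.extend([block_mean] * count)
    return fitted


print(pava([1, 3, 2, 4]))  # the violating pair (3, 2) is pooled to 2.5
```

The input is assumed to be ordered by the covariate x, so that the fitted values are a monotone step-function estimate evaluated at the sorted observations.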

12.
Can J Stat ; 49(3): 659-677, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34690407

ABSTRACT

In the group testing procedure, several individual samples are grouped and the pooled samples, instead of each individual sample, are tested for outcome status (e.g., infectious disease status). Although this cost-effective strategy in data collection is both labor and time efficient, it poses statistical challenges in deriving statistically and computationally efficient estimators under semiparametric models. We consider semiparametric isotonic regression models for the simultaneous estimation of the conditional probability curve and covariate effects, in which a parametric form for combining the covariate information is assumed and the monotonic link function is left unspecified. We develop an expectation-maximization algorithm to overcome the computational challenge and embed the pool-adjacent-violators algorithm in the M-step to facilitate the computation. We establish the large sample behavior of the proposed estimators and examine their finite sample performance in simulation studies. We apply the proposed method to data from the National Health and Nutrition Examination Survey for illustration.
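
The reason pooled outcomes complicate estimation can be sketched as follows: only the pool result is observed, and its probability depends on all individual probabilities in the pool. The sketch assumes a perfect assay (no misclassification) and independent individuals within a pool, neither of which is stated in the abstract. The function names are hypothetical.

```python
import math


def pool_positive_probability(individual_probs):
    """P(pool tests positive) = 1 - prod(1 - p_i), assuming a perfect
    assay and independent individuals (assumptions of this sketch)."""
    prob_all_negative = 1.0
    for p in individual_probs:
        prob_all_negative *= 1.0 - p
    return 1.0 - prob_all_negative


def pooled_log_likelihood(pools, pool_results):
    """Log-likelihood of observed pool results (1 = positive, 0 = negative)
    given the individual-level probabilities in each pool."""
    total = 0.0
    for probs, z in zip(pools, pool_results):
        q = pool_positive_probability(probs)
        total += math.log(q) if z == 1 else math.log(1.0 - q)
    return total


# Two pools of two individuals each, with hypothetical probabilities.
pools = [[0.5, 0.5], [0.1, 0.2]]
print(pool_positive_probability(pools[0]))
print(pooled_log_likelihood(pools, [1, 0]))
```

In the model of the abstract, each individual probability would come from an unspecified monotone link applied to a parametric combination of covariates. Treating the unobserved individual statuses as missing data is what makes an expectation-maximization formulation natural.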

13.
J Am Stat Assoc ; 116(534): 531-545, 2021.
Article in English | MEDLINE | ID: mdl-34321704

ABSTRACT

Genetics plays a role in age-related macular degeneration (AMD), a common cause of blindness in the elderly. There is a need for powerful methods for carrying out region-based association tests between a dichotomous trait like AMD and genetic variants on family data. Here, we apply our new generalized functional linear mixed models (GFLMM), developed to test for gene-based association, in a set of AMD families. Using common and rare variants, we observe significant association with two known AMD genes: CFH and ARMS2. Using rare variants, we find suggestive signals in four genes: ASAH1, CLEC6A, TMEM63C, and SGSM1. Intriguingly, ASAH1 is down-regulated in AMD aqueous humor, and ASAH1 deficiency leads to retinal inflammation and increased vulnerability to oxidative stress. These findings were made possible by our GFLMM, which models the effect of a major gene as a fixed mean, the polygenic contributions as a random variation, and the correlation of pedigree members by kinship coefficients. Simulations indicate that the GFLMM likelihood ratio tests (LRTs) accurately control the Type I error rates. The LRTs have similar or higher power than existing retrospective kernel and burden statistics. Our GFLMM-based statistics provide a new tool for conducting family-based genetic studies of complex diseases. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.

14.
J Leukoc Biol ; 110(5): 987-998, 2021 11.
Article in English | MEDLINE | ID: mdl-33784425

ABSTRACT

High-mobility group box 1 (HMGB1) is an abundant architectural chromosomal protein that has multiple biologic functions: gene transcription, DNA replication, DNA-damage repair, and cell signaling for inflammation. HMGB1 can be released passively by necrotic cells or secreted actively by activated immune cells into the extracellular milieu after injury. Extracellular HMGB1 acts as a damage-associated molecular pattern to initiate the innate inflammatory response to infection and injury by communicating with neighboring cells through binding to specific cell-surface receptors, including Toll-like receptors (TLRs) and the receptor for advanced glycation end products (RAGE). Numerous studies have suggested HMGB1 to act as a key protein mediating the pathogenesis of chronic and acute liver diseases, including nonalcoholic fatty liver disease, hepatocellular carcinoma, and hepatic ischemia/reperfusion injury. Here, we provide a detailed review that focuses on the role of HMGB1 and HMGB1-mediated inflammatory signaling pathways in the pathogenesis of liver diseases.


Subject(s)
HMGB1 Protein/metabolism , Liver Diseases/metabolism , Liver Diseases/physiopathology , Animals , Humans
15.
Int J Biostat ; 17(2): 177-190, 2020 10 08.
Article in English | MEDLINE | ID: mdl-33027048

ABSTRACT

The precision medicine approach, which assigns treatment according to an individual's personal (including molecular) profile, is revolutionizing health care. Existing statistical methods for clinical trial design typically assume a known model to estimate characteristics of treatment outcomes, which may yield biased results if the true model deviates far from the assumed one. This article aims to achieve model robustness in a phase II multi-stage adaptive clinical trial design. We propose and study a semiparametric regression mixture model in which the mixing proportions are specified according to the subjects' profiles, and each sub-group distribution is only assumed to be unimodal for robustness. The regression parameters and the error density functions are estimated by semiparametric maximum likelihood and isotonic regression estimators. The asymptotic properties of the estimates are studied. Simulation studies are conducted to evaluate the performance of the method after a real data analysis.


Subject(s)
Clinical Trials as Topic , Research Design , Computer Simulation , Humans , Likelihood Functions , Models, Statistical
16.
Int J Biostat ; 17(1): 55-74, 2020 09 16.
Article in English | MEDLINE | ID: mdl-32936783

ABSTRACT

In integrative analysis, parametric or nonparametric methods are often used. The former are easier to interpret but not robust, while the latter are robust but do not make the relationships among the different types of variables easy to interpret. To combine the advantages of both methods and for flexibility, a system of semiparametric projection non-linear regression models is proposed here for integrative analysis, to model the innate coordinate structure of these different types of data, and a diagnostic tool is constructed to classify new subjects to the case or control group. Simulation studies are conducted to evaluate the performance of the proposed method, and show promising results. The method is then applied to analyze real omics data from The Cancer Genome Atlas study, and the results are compared with those from similarity network fusion, another integrative analysis method; the results from our method are more reasonable.


Subject(s)
Research Design , Computer Simulation , Humans
17.
J Appl Stat ; 47(3): 524-540, 2020.
Article in English | MEDLINE | ID: mdl-35706964

ABSTRACT

Batched data are a type of data in which each observed value is the sum of a number of grouped (batched) latent values obtained under different conditions. Batched data arise in various practical settings and are often found in social studies and the management sector. The analysis of such data is challenging due to their structural complexity. In this article, we describe how to analyze batched service time data and estimate the latent mean and variance of each batch. We focus in particular on the situation in which the observed total time includes an unknown proportion of non-service time. To address this problem, we propose a Gaussian model for efficiency as well as a semi-parametric kernel density model for robustness. We evaluate the performance of both proposed methods through simulation studies and then apply our methods to analyze a batched data set.

18.
Ann Hum Genet ; 83(6): 405-417, 2019 11.
Article in English | MEDLINE | ID: mdl-31206606

ABSTRACT

Genome-wide association studies (GWAS) are used to investigate genetic variants contributing to complex traits. Despite discovering many loci, a large proportion of "missing" heritability remains unexplained. Gene-gene interactions may help explain some of this gap. Traditionally, gene-gene interactions have been evaluated using parametric statistical methods such as linear and logistic regression, with multifactor dimensionality reduction (MDR) used to address sparseness of data in high dimensions. We propose a method for the analysis of gene-gene interactions across independent single-nucleotide polymorphisms (SNPs) in two genes. Typical methods for this problem use statistics based on an asymptotic chi-squared mixture distribution, which is not easy to use. Here, we propose a Kullback-Leibler-type statistic, which follows an asymptotic, positive, normal distribution under the null hypothesis of no relationship between SNPs in the two genes, and is normally distributed under the alternative hypothesis. The performance of the proposed method is evaluated by simulation studies, which show promising results. The method is also used to analyze real data and identifies gene-gene interactions among RAB3A, MADD, and PTPRN on type 2 diabetes (T2D) status.
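
The idea of measuring dependence between two SNPs with a Kullback-Leibler divergence can be sketched as follows. This is an assumption-laden illustration, not the statistic of the article. The abstract says only that the statistic is of Kullback-Leibler type, so the sketch uses the plain empirical divergence between the joint genotype distribution and the product of its marginals, and the function name is hypothetical.

```python
import math
from collections import Counter


def kl_dependence(g1, g2):
    """Empirical KL divergence between the joint distribution of two
    genotype vectors and the product of their marginals.

    It is 0 when the empirical joint distribution factorizes and grows
    with the strength of the dependence.
    """
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotype vectors must be non-empty and equal length")
    n = len(g1)
    joint = Counter(zip(g1, g2))
    marg1 = Counter(g1)
    marg2 = Counter(g2)
    total = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = marg1[a] / n
        p_b = marg2[b] / n
        total += p_ab * math.log(p_ab / (p_a * p_b))
    return total


# Genotypes coded as minor-allele counts (0, 1, 2); toy data only.
snp_a = [0, 0, 1, 1, 2, 2]
snp_b = [0, 0, 1, 1, 2, 2]
print(kl_dependence(snp_a, snp_b))
```

Turning such a quantity into a test requires its null distribution, which is the contribution described in the abstract and is not reproduced here.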


Subject(s)
Epistasis, Genetic , Genetic Variation , Genome-Wide Association Study , Models, Genetic , Models, Statistical , Multifactorial Inheritance , Algorithms , Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genetics, Population , Genome-Wide Association Study/methods , Humans , Polymorphism, Single Nucleotide
19.
Lifetime Data Anal ; 25(1): 26-51, 2019 01.
Article in English | MEDLINE | ID: mdl-29423775

ABSTRACT

Current status data occur in many biomedical studies where we only know whether the event of interest occurs before or after a particular time point. In practice, some subjects may never experience the event of interest, i.e., a certain fraction of the population is cured or is not susceptible to the event of interest. We consider a class of semiparametric transformation cure models for current status data with a survival fraction. This class includes both the proportional hazards and the proportional odds cure models as two special cases. We develop efficient likelihood-based estimation and inference procedures. We show that the maximum likelihood estimators for the regression coefficients are consistent, asymptotically normal, and asymptotically efficient. Simulation studies demonstrate that the proposed methods perform well in finite samples. For illustration, we provide an application of the models to a study on the calcification of the hydrogel intraocular lenses.
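
The structure of a current status likelihood with a cured fraction can be sketched as follows. The sketch substitutes a fully parametric exponential mixture cure model for the semiparametric transformation models of the article, purely to show the shape of the likelihood. The parameter names `rate` and `cure_fraction` are assumptions.

```python
import math


def event_probability(c, rate, cure_fraction):
    """P(event has occurred by inspection time c) when a fraction of the
    population is cured and susceptible subjects have an exponential
    event time (an assumption of this sketch)."""
    return (1.0 - cure_fraction) * (1.0 - math.exp(-rate * c))


def current_status_log_likelihood(times, deltas, rate, cure_fraction):
    """Log-likelihood for current status data: at each inspection time c
    only delta = 1{event time <= c} is observed."""
    total = 0.0
    for c, d in zip(times, deltas):
        f = event_probability(c, rate, cure_fraction)
        total += math.log(f) if d == 1 else math.log(1.0 - f)
    return total


# Toy inspection times and event indicators.
times = [0.5, 1.0, 2.0, 4.0]
deltas = [0, 1, 0, 1]
print(current_status_log_likelihood(times, deltas, rate=1.0, cure_fraction=0.3))
```

Because the event probability never exceeds one minus the cure fraction, a subject with no event at a late inspection time is compatible with either being cured or not yet having experienced the event, which is why the cure fraction and the event-time distribution have to be estimated jointly.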


Subject(s)
Computer Simulation , Models, Statistical , Proportional Hazards Models , Algorithms , Biometry/methods , Data Analysis , Female , Humans , Likelihood Functions , Male , Sensitivity and Specificity
20.
Genet Epidemiol ; 43(2): 189-206, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30537345

ABSTRACT

We develop linear mixed models (LMMs) and functional linear mixed models (FLMMs) for gene-based tests of association between a quantitative trait and genetic variants on pedigrees. The effects of a major gene are modeled as a fixed effect, the contributions of polygenes are modeled as a random effect, and the correlations of pedigree members are modeled via inbreeding/kinship coefficients. F-statistics and χ² likelihood ratio test (LRT) statistics based on the LMMs and FLMMs are constructed to test for association. We show empirically that the F-distributed statistics provide a good control of the type I error rate. The F-test statistics of the LMMs have similar or higher power than the FLMMs, kernel-based famSKAT (family-based sequence kernel association test), and burden test famBT (family-based burden test). The F-statistics of the FLMMs perform well when analyzing a combination of rare and common variants. For small samples, the LRT statistics of the FLMMs control the type I error rate well at the nominal levels α = 0.01 and 0.05. For moderate/large samples, the LRT statistics of the FLMMs control the type I error rates well. The LRT statistics of the LMMs can lead to inflated type I error rates. The proposed models are useful in whole genome and whole exome association studies of complex traits.


Subject(s)
Genetic Association Studies , High-Throughput Nucleotide Sequencing/methods , Models, Genetic , Quantitative Trait, Heritable , Computer Simulation , Family , Humans , Linear Models , Myopia/genetics