Results 1 - 20 of 132

1.
Biostatistics ; 2023 May 31.
Article in English | MEDLINE | ID: mdl-37257175

ABSTRACT

In complex tissues containing cells that are difficult to dissociate, single-nucleus RNA-sequencing (snRNA-seq) has become the preferred experimental technology over single-cell RNA-sequencing (scRNA-seq) to measure gene expression. To accurately model these data in downstream analyses, previous work has shown that droplet-based scRNA-seq data are not zero-inflated, but whether droplet-based snRNA-seq data follow the same probability distributions has not been systematically evaluated. Using pseudonegative control data from nuclei in mouse cortex sequenced with the 10x Genomics Chromium system and mouse kidney sequenced with the DropSeq system, we found that droplet-based snRNA-seq data follow a negative binomial distribution, suggesting that parametric statistical models applied to scRNA-seq are transferable to snRNA-seq. Furthermore, we found that the choices made in adapting quantification mapping strategies from scRNA-seq to snRNA-seq can play a significant role in downstream analyses and biological interpretation. In particular, reference transcriptomes that do not include intronic regions result in significantly smaller library sizes and incongruous cell type classifications. We also confirmed the presence of a gene length bias in snRNA-seq data, which we show is present in both exonic and intronic reads, and we investigate potential causes for the bias.

2.
Biometrics ; 80(1), 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38470256

ABSTRACT

Semicontinuous outcomes commonly arise in a wide variety of fields, such as insurance claims, healthcare expenditures, rainfall amounts, and alcohol consumption. Regression models, including Tobit, Tweedie, and two-part models, are widely employed to understand the relationship between semicontinuous outcomes and covariates. Given the potential detrimental consequences of model misspecification, after fitting a regression model, it is of prime importance to check the adequacy of the model. However, due to the point mass at zero, standard diagnostic tools for regression models (eg, deviance and Pearson residuals) are not informative for semicontinuous data. To bridge this gap, we propose a new type of residuals for semicontinuous outcomes that is applicable to general regression models. Under the correctly specified model, the proposed residuals converge to being uniformly distributed, and when the model is misspecified, they significantly depart from this pattern. In addition to in-sample validation, the proposed methodology can also be employed to evaluate predictive distributions. We demonstrate the effectiveness of the proposed tool using health expenditure data from the US Medical Expenditure Panel Survey.


Subject(s)
Health Expenditures
3.
Biom J ; 66(5): e202300182, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39001709

ABSTRACT

Spatial count data with an abundance of zeros arise commonly in disease mapping studies. Typically, these data are analyzed using zero-inflated models, which comprise a mixture of a point mass at zero and an ordinary count distribution, such as the Poisson or negative binomial. However, due to their mixture representation, conventional zero-inflated models are challenging to explain in practice because the parameter estimates have conditional latent-class interpretations. As an alternative, several authors have proposed marginalized zero-inflated models that simultaneously model the excess zeros and the marginal mean, leading to a parameterization that more closely aligns with ordinary count models. Motivated by a study examining predictors of COVID-19 death rates, we develop a spatiotemporal marginalized zero-inflated negative binomial model that directly models the marginal mean, thus extending marginalized zero-inflated models to the spatial setting. To capture the spatiotemporal heterogeneity in the data, we introduce region-level covariates, smooth temporal effects, and spatially correlated random effects to model both the excess zeros and the marginal mean. For estimation, we adopt a Bayesian approach that combines full-conditional Gibbs sampling and Metropolis-Hastings steps. We investigate features of the model and use the model to identify key predictors of COVID-19 deaths in the US state of Georgia during the 2021 calendar year.


Subject(s)
Bayes Theorem , Biometry , COVID-19 , Models, Statistical , Humans , COVID-19/mortality , COVID-19/epidemiology , Georgia/epidemiology , Biometry/methods , Spatial Analysis , Binomial Distribution
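
The mixture representation described in this abstract can be illustrated with a minimal sketch. It assumes the standard zero-inflated negative binomial form (a point mass at zero with weight `pi`, otherwise a negative binomial count with mean `mu` and dispersion `size`); the function names are hypothetical and the spatial and temporal effects of the paper are not modeled. The marginal mean is `(1 - pi) * mu`, the quantity that marginalized models target directly.

```python
import math


def nb_pmf(y, mu, size):
    """Negative binomial pmf parameterized by mean mu and dispersion size."""
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, pi, mu, size):
    """Mixture: structural zero with probability pi, NB count otherwise."""
    base = (1.0 - pi) * nb_pmf(y, mu, size)
    return pi + base if y == 0 else base


def marginal_mean(pi, mu):
    """Overall mean of the mixture, averaging over both latent classes."""
    return (1.0 - pi) * mu


if __name__ == "__main__":
    # Illustrative parameter values only (not taken from the paper).
    print(zinb_pmf(0, 0.3, 3.0, 2.0), marginal_mean(0.3, 3.0))
```

In the conventional parameterization, a regression coefficient on `mu` describes only the non-excess-zero class, which is why the abstract calls the estimates conditional on a latent class.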
4.
Behav Res Methods ; 56(4): 2765-2781, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38383801

ABSTRACT

Count outcomes are frequently encountered in single-case experimental designs (SCEDs). Generalized linear mixed models (GLMMs) have shown promise in handling overdispersed count data. However, the presence of excessive zeros in the baseline phase of SCEDs introduces a more complex issue known as zero-inflation, often overlooked by researchers. This study aimed to deal with zero-inflated and overdispersed count data within a multiple-baseline design (MBD) in single-case studies. It examined the performance of various GLMMs (Poisson, negative binomial [NB], zero-inflated Poisson [ZIP], and zero-inflated negative binomial [ZINB] models) in estimating treatment effects and generating inferential statistics. Additionally, a real example was used to demonstrate the analysis of zero-inflated and overdispersed count data. The simulation results indicated that the ZINB model provided accurate estimates for treatment effects, while the other three models yielded biased estimates. The inferential statistics obtained from the ZINB model were reliable when the baseline rate was low. However, when the data were overdispersed but not zero-inflated, both the ZINB and ZIP models exhibited poor performance in accurately estimating treatment effects. These findings contribute to our understanding of using GLMMs to handle zero-inflated and overdispersed count data in SCEDs. The implications, limitations, and future research directions are also discussed.


Subject(s)
Single-Case Studies as Topic , Humans , Linear Models , Multilevel Analysis/methods , Data Interpretation, Statistical , Models, Statistical , Poisson Distribution , Computer Simulation , Research Design
5.
Behav Res Methods ; 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38987450

ABSTRACT

Generalized linear mixed models (GLMMs) have great potential to deal with count data in single-case experimental designs (SCEDs). However, applied researchers have faced challenges in making various statistical decisions when using such advanced statistical techniques in their own research. This study focused on a critical issue by investigating the selection of an appropriate distribution to handle different types of count data in SCEDs due to overdispersion and/or zero-inflation. To achieve this, I proposed two model selection frameworks, one based on calculating information criteria (AIC and BIC) and another based on utilizing a multistage-model selection procedure. Four data scenarios were simulated including Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB). The same set of models (i.e., Poisson, NB, ZIP, and ZINB) was fitted for each scenario. In the simulation, I evaluated 10 model selection strategies within the two frameworks by assessing the model selection bias and its consequences on the accuracy of the treatment effect estimates and inferential statistics. Based on the simulation results and previous work, I provide recommendations regarding which model selection methods should be adopted in different scenarios. The implications, limitations, and future research directions are also discussed.
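
The information-criterion framework can be sketched in miniature. The example below is an assumption-laden illustration, not the study's procedure: it compares only an intercept-only Poisson model with an intercept-only ZIP model (no random effects, no NB or ZINB), fits the ZIP model with a simple EM loop, and uses made-up counts. All function names are hypothetical.

```python
import math


def poisson_loglik(ys, lam):
    """Log-likelihood of i.i.d. Poisson counts with rate lam."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in ys)


def zip_loglik(ys, pi, lam):
    """Log-likelihood of i.i.d. zero-inflated Poisson counts."""
    total = 0.0
    for y in ys:
        if y == 0:
            total += math.log(pi + (1.0 - pi) * math.exp(-lam))
        else:
            total += (math.log(1.0 - pi) + y * math.log(lam)
                      - lam - math.lgamma(y + 1))
    return total


def fit_zip(ys, iters=200):
    """EM for the ZIP model; returns (pi, lam)."""
    n, n0, total = len(ys), sum(1 for y in ys if y == 0), sum(ys)
    pi, lam = 0.5, total / n
    for _ in range(iters):
        # E-step: probability that an observed zero is a structural zero.
        w = pi / (pi + (1.0 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson rate.
        pi = n0 * w / n
        lam = total / (n - n0 * w)
    return pi, lam


def aic(loglik, k):
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    return k * math.log(n) - 2 * loglik


# Made-up counts with many zeros, for illustration only.
ys = [0] * 60 + [1] * 5 + [2] * 10 + [3] * 12 + [4] * 8 + [5] * 5
lam_hat = sum(ys) / len(ys)            # Poisson maximum likelihood estimate
pi_hat, lam_zip = fit_zip(ys)
aic_pois = aic(poisson_loglik(ys, lam_hat), k=1)
aic_zip = aic(zip_loglik(ys, pi_hat, lam_zip), k=2)
bic_pois = bic(poisson_loglik(ys, lam_hat), k=1, n=len(ys))
bic_zip = bic(zip_loglik(ys, pi_hat, lam_zip), k=2, n=len(ys))
```

The model with the smaller criterion value is preferred; BIC penalizes the extra zero-inflation parameter more heavily than AIC once the sample has more than about seven observations.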

6.
Biostatistics ; 23(1): 50-68, 2022 Jan 13.
Article in English | MEDLINE | ID: mdl-32282877

ABSTRACT

Joint models for a longitudinal biomarker and a terminal event have gained interest for evaluating cancer clinical trials because the tumor evolution directly reflects the state of the disease. A biomarker characterizing the tumor size evolution over time can be highly informative for assessing treatment options and could be taken into account in addition to the survival time. The biomarker often has a semicontinuous distribution, i.e., it is zero inflated and right skewed. An appropriate model is needed for the longitudinal biomarker as well as an association structure with the survival outcome. In this article, we propose a joint model for a longitudinal semicontinuous biomarker and a survival time. The semicontinuous nature of the longitudinal biomarker is specified by a two-part model, which splits its distribution into a binary outcome (first part) represented by the positive versus zero values and a continuous outcome (second part) with the positive values only. Survival times are modeled with a proportional hazards model for which we propose three association structures with the biomarker. Our simulation studies show some bias can arise in the parameter estimates when the semicontinuous nature of the biomarker is ignored, assuming the true model is a two-part model. An application to advanced metastatic colorectal cancer data from the GERCOR study is performed where our two-part model is compared to one-part joint models. Our results show that treatment arm B (FOLFOX6/FOLFIRI) is associated with higher SLD values over time and its positive association with the terminal event leads to an increased risk of death compared to treatment arm A (FOLFIRI/FOLFOX6).


Subject(s)
Colorectal Neoplasms , Models, Statistical , Biomarkers , Colorectal Neoplasms/drug therapy , Computer Simulation , Humans , Longitudinal Studies
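
The two-part decomposition described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: a logistic model for the binary part, a lognormal model for the positive part, a single covariate `x`, and no random effects or survival sub-model. The coefficient names and the lognormal choice are hypothetical and not confirmed by the abstract.

```python
import math


def prob_positive(x, a0, a1):
    """Binary part: logistic model for P(Y > 0 | x)."""
    return 1.0 / (1.0 + math.exp(-(a0 + a1 * x)))


def mean_positive(x, b0, b1, sigma):
    """Continuous part: lognormal mean of Y given Y > 0 (assumed form)."""
    return math.exp(b0 + b1 * x + 0.5 * sigma ** 2)


def two_part_mean(x, a0, a1, b0, b1, sigma):
    """Overall mean: E[Y | x] = P(Y > 0 | x) * E[Y | Y > 0, x]."""
    return prob_positive(x, a0, a1) * mean_positive(x, b0, b1, sigma)


if __name__ == "__main__":
    # Illustrative coefficients only.
    print(two_part_mean(1.0, a0=0.2, a1=0.5, b0=1.0, b1=-0.3, sigma=0.8))
```

Because a covariate can act on either part, a treatment may change the chance of a nonzero value, the size of the nonzero values, or both.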
7.
Biostatistics ; 24(1): 161-176, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-34520533

ABSTRACT

Single-cell RNA-sequencing (scRNAseq) data contain a high level of noise, especially in the form of zero-inflation, that is, the presence of an excessively large number of zeros. This is largely due to dropout events and amplification biases that occur in the preparation stage of single-cell experiments. Recent scRNAseq experiments have been augmented with unique molecular identifiers (UMI) and External RNA Control Consortium (ERCC) molecules which can be used to account for zero-inflation. However, most of the current methods on graphical models are developed under the assumption of the multivariate Gaussian distribution or its variants, and thus they are not able to adequately account for an excessively large number of zeros in scRNAseq data. In this article, we propose a single-cell latent graphical model (scLGM), a Bayesian hierarchical model for estimating the conditional dependency network among genes using scRNAseq data. Taking advantage of UMI and ERCC data, scLGM explicitly models the two sources of zero-inflation. Our simulation study and real data analysis demonstrate that the proposed approach outperforms several existing methods.


Subject(s)
RNA , Single-Cell Analysis , Humans , Sequence Analysis, RNA/methods , Bayes Theorem , RNA/genetics , Computer Simulation
8.
Biometrics ; 79(4): 3239-3251, 2023 Dec.
Article in English | MEDLINE | ID: mdl-36896642

ABSTRACT

The Dirichlet-multinomial (DM) distribution plays a fundamental role in modern statistical methodology development and application. Recently, the DM distribution and its variants have been used extensively to model multivariate count data generated by high-throughput sequencing technology in omics research due to its ability to accommodate the compositional structure of the data as well as overdispersion. A major limitation of the DM distribution is that it is unable to handle excess zeros typically found in practice which may bias inference. To fill this gap, we propose a novel Bayesian zero-inflated DM model for multivariate compositional count data with excess zeros. We then extend our approach to regression settings and embed sparsity-inducing priors to perform variable selection for high-dimensional covariate spaces. Throughout, modeling decisions are made to boost scalability without sacrificing interpretability or imposing limiting assumptions. Extensive simulations and an application to a human gut microbiome dataset are presented to compare the performance of the proposed method to existing approaches. We provide an accompanying R package with a user-friendly vignette to apply our method to other datasets.


Subject(s)
Gastrointestinal Microbiome , Microbiota , Humans , Models, Statistical , Bayes Theorem , Poisson Distribution
9.
Stat Med ; 42(25): 4632-4643, 2023 Nov 10.
Article in English | MEDLINE | ID: mdl-37607718

ABSTRACT

In this article, we present a flexible model for microbiome count data. We consider a quasi-likelihood framework, in which we do not make any assumptions on the distribution of the microbiome count except that its variance is an unknown but smooth function of the mean. By comparing our model to the negative binomial generalized linear model (GLM) and Poisson GLM in simulation studies, we show that our flexible quasi-likelihood method yields valid inferential results. Using a real microbiome study, we demonstrate the utility of our method by examining the relationship between adenomas and microbiota. We also provide an R package "fql" for the application of our method.


Subject(s)
Microbiota , Models, Statistical , Humans , Likelihood Functions , Computer Simulation , Poisson Distribution
10.
Stat Med ; 42(28): 5100-5112, 2023 Dec 10.
Article in English | MEDLINE | ID: mdl-37715594

ABSTRACT

Physical activity (PA) guidelines recommend that PA be accumulated in bouts of 10 minutes or more in duration. Recently, researchers have sought to better understand how participants in PA interventions increase their activity. Participants can increase their daily PA by increasing the number of PA bouts per day while keeping the duration of the bouts constant; they can keep the number of bouts constant but increase the duration of each bout; or participants can increase both the number of bouts and their duration. We propose a novel joint modeling framework for modeling PA bouts and their duration over time. Our joint model is comprised of two sub-models: a mixed-effects Poisson hurdle sub-model for the number of bouts per day and a mixed-effects location scale gamma regression sub-model to characterize the duration of the bouts and their variance. The model allows us to estimate how daily PA bouts and their duration vary together over the course of an intervention and by treatment condition and is specifically designed to capture the unique distributional features of bouted PA as measured by accelerometer: frequent measurements, zero-inflated bouts, and skewed bout durations. We apply our methods to the Make Better Choices study, a longitudinal lifestyle intervention trial to increase PA. We perform a simulation study to evaluate how well our model is able to estimate relationships between outcomes.


Subject(s)
Exercise , Life Style , Humans , Accelerometry/methods , Time Factors , Clinical Trials as Topic
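
The count sub-model named in this abstract is a Poisson hurdle, which differs from a zero-inflated mixture in that every zero comes from the hurdle and positive counts follow a zero-truncated Poisson. A minimal sketch, assuming fixed parameters `p_zero` and `lam` (both hypothetical names) and omitting the mixed effects and the gamma duration sub-model:

```python
import math


def hurdle_poisson_pmf(y, p_zero, lam):
    """Poisson hurdle pmf: P(Y = 0) = p_zero; positives are zero-truncated."""
    if y == 0:
        return p_zero
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    truncated = poisson / (1.0 - math.exp(-lam))
    return (1.0 - p_zero) * truncated


def hurdle_poisson_mean(p_zero, lam):
    """Mean count: chance of clearing the hurdle times the truncated mean."""
    return (1.0 - p_zero) * lam / (1.0 - math.exp(-lam))


if __name__ == "__main__":
    # Illustrative values: 40% of days with no bouts, rate 2.5 otherwise.
    print(hurdle_poisson_pmf(0, 0.4, 2.5), hurdle_poisson_mean(0.4, 2.5))
```

Unlike the zero-inflated form, `p_zero` here can be smaller than the Poisson zero probability, so a hurdle model can also describe data with fewer zeros than a Poisson would produce.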
11.
Stat Med ; 42(20): 3636-3648, 2023 Sep 10.
Article in English | MEDLINE | ID: mdl-37316997

ABSTRACT

Disease mapping is a research field that estimates spatial patterns of disease risk so that areas with elevated risk levels can be identified. This article is motivated by a study of dengue fever infection, which causes seasonal epidemics almost every summer in Taiwan. For the analysis of zero-inflated data with spatial correlation and covariates, current methods either impose a computational burden or miss associations between zero and non-zero responses. In this article, we develop estimating equations for a mixture regression model that accommodates spatial dependence and zero inflation for the study of disease propagation. Asymptotic properties of the proposed estimates are established. A simulation study is conducted to evaluate the performance of the mixture estimating equations, and a dengue dataset from southern Taiwan is used to illustrate the proposed method.


Subject(s)
Dengue , Epidemics , Humans , Computer Simulation , Spatial Analysis , Taiwan/epidemiology , Dengue/epidemiology , Dengue/prevention & control , Models, Statistical
12.
Biom J ; 65(4): e2200090, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36732909

ABSTRACT

Disease mapping models are widely used to model disease incidence with spatial correlation. In disease mapping models, zero inflation is an important issue, which often occurs in disease incidence datasets with high proportions of zero disease counts. It originates from limited survey coverage or insufficiently advanced testing equipment, which leaves some regions with no observed patients. The excessive zeros recorded in the disease incidence dataset then distort the true distribution of disease incidence and lead to inaccurate estimates. To address this issue, a zero-inflated disease mapping model is developed in this work. In this model, a zero-inflated process using Bernoulli indicators is assumed to characterize whether the zero inflation occurs for each region. For regions without zero inflation, a coherent and generative disease mapping model is applied for mapping the spatially correlated disease incidence. Independent spatial random effects are incorporated in both processes to account for the spatial patterns of zero inflation and disease incidence. External covariates are also considered in both processes to better explain the disease count data. To estimate the model, a Markov chain Monte Carlo algorithm is proposed. We evaluate model performance via a variety of simulation experiments. Finally, a Lyme disease dataset of Virginia is analyzed to illustrate the application of the proposed model.


Subject(s)
Algorithms , Models, Statistical , Humans , Incidence , Poisson Distribution , Computer Simulation , Monte Carlo Method
13.
Biom J ; 65(8): e2100408, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37439440

ABSTRACT

Count data with an excess of zeros are often encountered when modeling infectious disease occurrence. The degree of zero inflation can vary over time due to nonepidemic periods as well as by age group or region. A well-established approach to analyze multivariate incidence time series is the endemic-epidemic modeling framework, also known as the HHH approach. However, it assumes Poisson or negative binomial distributions and is thus not tailored to surveillance data with excess zeros. Here, we propose a multivariate zero-inflated endemic-epidemic model with random effects that extends HHH. Parameters of both the zero-inflation probability and the HHH part of this mixture model can be estimated jointly and efficiently via (penalized) maximum likelihood inference using analytical derivatives. We found proper convergence and good coverage of confidence intervals in simulation studies. An application to measles counts in the 16 German states, 2005-2018, showed that zero inflation is more pronounced in the Eastern states characterized by a higher vaccination coverage. Probabilistic forecasts of measles cases improved when accounting for zero inflation. We anticipate zero-inflated HHH models to be a useful extension also for other applications and provide an implementation in an R package.


Subject(s)
Measles , Models, Statistical , Humans , Time Factors , Computer Simulation , Measles/epidemiology , Measles/prevention & control , Germany/epidemiology , Poisson Distribution
14.
Ecol Lett ; 25(12): 2739-2752, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36269686

ABSTRACT

Species' responses to broad-scale environmental or spatial gradients are typically unimodal. Current models of species' responses along gradients tend to be overly simplistic (e.g., linear, quadratic or Gaussian GLMs), or are suitably flexible (e.g., splines, GAMs) but lack direct ecologically interpretable parameters. We describe a parametric framework for species-environment non-linear modelling ('senlm'). The framework has two components: (i) a non-linear parametric mathematical function to model the mean species response along a gradient that allows asymmetry, flattening/peakedness or bimodality; and (ii) a statistical error distribution tailored for ecological data types, allowing intrinsic mean-variance relationships and zero-inflation. We demonstrate the utility of this model framework, highlighting the flexibility of a range of possible mean functions and a broad range of potential error distributions, in analyses of fish species' abundances along a depth gradient, and how they change over time and at different latitudes.


Subject(s)
Environment , Nonlinear Dynamics , Animals , Spatial Analysis , Fishes
15.
Biometrics ; 78(4): 1686-1698, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34213763

ABSTRACT

Recent studies have suggested that the temporal dynamics of the human microbiome may have associations with human health and disease. An increasing number of longitudinal microbiome studies, which record time to disease onset, aim to identify candidate microbes as biomarkers for prognosis. Owing to the ultra-skewness and sparsity of microbiome proportion (relative abundance) data, directly applying traditional statistical methods may result in substantial power loss or spurious inferences. We propose a novel joint modeling framework [JointMM], which is comprised of two sub-models: a longitudinal sub-model called zero-inflated scaled-beta generalized linear mixed-effects regression to depict the temporal structure of microbial proportions among subjects; and a survival sub-model to characterize the occurrence of an event and its relationship with the longitudinal microbiome proportions. JointMM is specifically designed to handle the zero-inflated and highly skewed longitudinal microbial proportion data and examine whether the temporal pattern of microbial presence and/or the nonzero microbial proportions are associated with differences in the time to an event. The longitudinal sub-model of JointMM also provides the capacity to investigate how the (time-varying) covariates are related to the temporal microbial presence/absence patterns and/or the changing trend in nonzero proportions. Comprehensive simulations and real data analyses are used to assess the statistical efficiency and interpretability of JointMM.


Subject(s)
Gastrointestinal Microbiome , Microbiota , Humans , Models, Statistical , Linear Models , Longitudinal Studies
16.
Biometrics ; 78(2): 766-776, 2022 Jun.
Article in English | MEDLINE | ID: mdl-33720414

ABSTRACT

Interactions between biological molecules in a cell are tightly coordinated and often highly dynamic. As a result of these varying signaling activities, changes in gene coexpression patterns could often be observed. The advancements in next-generation sequencing technologies bring new statistical challenges for studying these dynamic changes of gene coexpression. In recent years, methods have been developed to examine genomic information from individual cells. Single-cell RNA sequencing (scRNA-seq) data are count-based, and often exhibit characteristics such as overdispersion and zero inflation. To explore the dynamic dependence structure in scRNA-seq data and other zero-inflated count data, new approaches are needed. In this paper, we consider overdispersion and zero inflation in count outcomes and propose a ZEro-inflated negative binomial dynamic COrrelation model (ZENCO). The observed count data are modeled as a mixture of two components: success amplifications and dropout events in ZENCO. A latent variable is incorporated into ZENCO to model the covariate-dependent correlation structure. We conduct simulation studies to evaluate the performance of our proposed method and to compare it with existing approaches. We also illustrate the implementation of our proposed approach using scRNA-seq data from a study of minimal residual disease in melanoma.


Subject(s)
High-Throughput Nucleotide Sequencing , Models, Statistical , Computer Simulation , Sequence Analysis, RNA/methods , Exome Sequencing
17.
Stat Med ; 41(16): 3180-3198, 2022 Jul 20.
Article in English | MEDLINE | ID: mdl-35429179

ABSTRACT

In many medical and social science studies, count responses with excess zeros are very common and often the primary outcome of interest. Such count responses are usually generated under some clustered correlation structures due to longitudinal observations of subjects. To model such longitudinal count data with excess zeros, the zero-inflated binomial (ZIB) models for bounded outcomes, and the zero-inflated negative binomial (ZINB) and zero-inflated Poisson (ZIP) models for unbounded outcomes, are all popular methods. To alleviate the effects of deviations from model assumptions, a semiparametric (or distribution-free) weighted generalized estimating equations approach has been proposed to estimate model parameters when data are subject to missingness. In this article, we further explore important covariates for the response variable. Without assumptions on the data distribution, a model selection criterion based on the expected weighted quadratic loss is proposed to select an appropriate subset of covariates, especially when count responses have excess zeros and data are subject to nonmonotone missingness in both responses and covariates. To understand the selection effects of the percentages of excess zeros and missingness, we design various scenarios for covariate selection in the mean model via simulation studies. A real data example from a study of cardiovascular disease is also presented for illustration.


Subject(s)
Cardiovascular Diseases , Models, Statistical , Computer Simulation , Humans , Poisson Distribution , Weight Loss
18.
Stat Med ; 41(18): 3492-3510, 2022 Aug 15.
Article in English | MEDLINE | ID: mdl-35656596

ABSTRACT

The performance of computational methods and software to identify differentially expressed features in single-cell RNA-sequencing (scRNA-seq) has been shown to be influenced by several factors, including the choice of the normalization method used and the choice of the experimental platform (or library preparation protocol) to profile gene expression in individual cells. Currently, it is up to the practitioner to choose the most appropriate differential expression (DE) method out of over 100 DE tools available to date, each relying on their own assumptions to model scRNA-seq expression features. To model the technological variability in cross-platform scRNA-seq data, here we propose to use Tweedie generalized linear models that can flexibly capture a large dynamic range of observed scRNA-seq expression profiles across experimental platforms induced by platform- and gene-specific statistical properties such as heavy tails, sparsity, and gene expression distributions. We also propose a zero-inflated Tweedie model that allows zero probability mass to exceed a traditional Tweedie distribution to model zero-inflated scRNA-seq data with excessive zero counts. Using both synthetic and published plate- and droplet-based scRNA-seq datasets, we perform a systematic benchmark evaluation of more than 10 representative DE methods and demonstrate that our method (Tweedieverse) outperforms the state-of-the-art DE approaches across experimental platforms in terms of statistical power and false discovery rate control. Our open-source software (R/Bioconductor package) is available at https://github.com/himelmallick/Tweedieverse.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Gene Expression Profiling/methods , Humans , RNA-Seq , Sequence Analysis, RNA , Software
19.
Entropy (Basel) ; 24(10), 2022 Oct 16.
Article in English | MEDLINE | ID: mdl-37420492

ABSTRACT

Numerous methods have been developed for longitudinal binomial data in the literature. These traditional methods are reasonable for longitudinal binomial data with a negative association between the number of successes and the number of failures over time; however, a positive association may occur between the number of successes and the number of failures over time in some behaviour, economic, disease aggregation and toxicological studies as the numbers of trials are often random. In this paper, we propose a joint Poisson mixed modelling approach to longitudinal binomial data with a positive association between longitudinal counts of successes and longitudinal counts of failures. This approach can accommodate both a random and zero number of trials. It can also accommodate overdispersion and zero inflation in the number of successes and the number of failures. An optimal estimation method for our model has been developed using the orthodox best linear unbiased predictors. Our approach not only provides robust inference against misspecified random effects distributions, but also consolidates the subject-specific and population-averaged inferences. The usefulness of our approach is illustrated with an analysis of quarterly bivariate count data of stock daily limit-ups and limit-downs.

20.
Biometrics ; 77(1): 125-135, 2021 Mar.
Article in English | MEDLINE | ID: mdl-32125699

ABSTRACT

Researchers are often interested in predicting outcomes, detecting distinct subgroups of their data, or estimating causal treatment effects. Pathological data distributions that exhibit skewness and zero-inflation complicate these tasks, requiring highly flexible, data-adaptive modeling. In this paper, we present a multipurpose Bayesian nonparametric model for continuous, zero-inflated outcomes that simultaneously predicts structural zeros, captures skewness, and clusters patients with similar joint data distributions. The flexibility of our approach yields predictions that capture the joint data distribution better than commonly used zero-inflated methods. Moreover, we demonstrate that our model can be coherently incorporated into a standardization procedure for computing causal effect estimates that are robust to such data pathologies. Uncertainty at all levels of this model flows through to the causal effect estimates of interest, allowing easy point estimation, interval estimation, and posterior predictive checks verifying positivity, a required causal identification assumption. Our simulation results show point estimates to have low bias and interval estimates to have close to nominal coverage under complicated data settings. Under simpler settings, these results hold while incurring lower efficiency loss than comparator methods. We use our proposed method to analyze zero-inflated inpatient medical costs among endometrial cancer patients receiving either chemotherapy or radiation therapy in the SEER-Medicare database.


Subject(s)
Medicare , Models, Statistical , Aged , Bayes Theorem , Causality , Cluster Analysis , Humans , United States