1.
Cell ; 187(5): 1255-1277.e27, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38359819

ABSTRACT

Despite the successes of immunotherapy in cancer treatment over recent decades, fewer than 10%-20% of cancer cases have demonstrated durable responses to immune checkpoint blockade. To enhance the efficacy of immunotherapies, combination therapies suppressing multiple immune evasion mechanisms are increasingly contemplated. To better understand immune cell surveillance and diverse immune evasion responses in tumor tissues, we comprehensively characterized the immune landscape of more than 1,000 tumors across ten different cancers using CPTAC pan-cancer proteogenomic data. We identified seven distinct immune subtypes based on integrative learning of cell type compositions and pathway activities. We then thoroughly categorized unique genomic, epigenetic, transcriptomic, and proteomic changes associated with each subtype. Further leveraging the deep phosphoproteomic data, we studied kinase activities in different immune subtypes, which revealed potential subtype-specific therapeutic targets. Insights from this work will facilitate the development of future immunotherapy strategies and enhance precision targeting with existing agents.


Subject(s)
Neoplasms , Proteogenomics , Humans , Combined Modality Therapy , Genomics , Neoplasms/genetics , Neoplasms/immunology , Neoplasms/therapy , Proteomics , Tumor Escape
3.
Cell ; 186(16): 3476-3498.e35, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37541199

ABSTRACT

To improve the understanding of chemo-refractory high-grade serous ovarian cancers (HGSOCs), we characterized the proteogenomic landscape of 242 (refractory and sensitive) HGSOCs, representing one discovery and two validation cohorts across two biospecimen types (formalin-fixed paraffin-embedded and frozen). We identified a 64-protein signature that predicts with high specificity a subset of HGSOCs refractory to initial platinum-based therapy and is validated in two independent patient cohorts. We detected a significant association between lack of Chr17 loss of heterozygosity (LOH) and chemo-refractoriness. Based on pathway protein expression, we identified 5 clusters of HGSOC, which were validated across two independent patient cohorts and patient-derived xenograft (PDX) models. These clusters may represent different mechanisms of refractoriness and implicate putative therapeutic vulnerabilities.


Subject(s)
Cystadenocarcinoma, Serous , Ovarian Neoplasms , Proteogenomics , Female , Humans , Cystadenocarcinoma, Serous/drug therapy , Cystadenocarcinoma, Serous/genetics , Ovarian Neoplasms/drug therapy , Ovarian Neoplasms/genetics
4.
Biometrics ; 79(4): 3294-3306, 2023 12.
Article in English | MEDLINE | ID: mdl-37479677

ABSTRACT

We consider a Bayesian functional data analysis for observations measured as extremely long sequences. When the sequence is split into several small windows of manageable length, the windows may not be independent, especially when they neighbor each other. We propose to utilize Bayesian smoothing splines to estimate individual functional patterns within each window and to establish transition models for the parameters involved in each window to address the dependence structure between windows. The functional difference between groups of individuals at each window can be evaluated by the Bayes factor based on Markov chain Monte Carlo samples in the analysis. In this paper, we examine the proposed method through simulation studies and apply it to identify differentially methylated genetic regions in TCGA lung adenocarcinoma data.


Subject(s)
Data Analysis , Humans , Bayes Theorem , Computer Simulation , Markov Chains , Monte Carlo Method
5.
Front Oncol ; 13: 1168710, 2023.
Article in English | MEDLINE | ID: mdl-37205196

ABSTRACT

Introduction: Immunotherapy is an effective treatment for a subset of cancer patients, and expanding the benefits of immunotherapy to all cancer patients will require predictive biomarkers of response and immune-related adverse events (irAEs). To support correlative studies in immunotherapy clinical trials, we are developing highly validated assays for quantifying immunomodulatory proteins in human biospecimens. Methods: Here, we developed a panel of novel monoclonal antibodies and incorporated them into a novel, multiplexed, immuno-multiple reaction monitoring mass spectrometry (MRM-MS)-based proteomic assay targeting 49 proteotypic peptides representing 43 immunomodulatory proteins. Results and discussion: The multiplex assay was validated in human tissue and plasma matrices, where the linearity of quantification was >3 orders of magnitude with median interday CVs of 8.7% (tissue) and 10.1% (plasma). Proof-of-principle demonstration of the assay was conducted in plasma samples collected in clinical trials from lymphoma patients receiving an immune checkpoint inhibitor. We provide the assays and novel monoclonal antibodies as a publicly available resource for the biomedical community.

6.
J Appl Stat ; 50(4): 848-870, 2023.
Article in English | MEDLINE | ID: mdl-36925904

ABSTRACT

Finding improved interventions in many legacy therapeutic areas is a high priority. Doing so has the potential to decrease the expense of medical care and reduce poor outcomes for many patients. Typically, clinical efficacy is the primary criterion for measuring any beneficial effect of a treatment. However, there can be situations in which several other factors (e.g., side effects, cost burden, less debilitating or less intensive regimens) make a slightly less efficacious treatment option favorable for a subgroup of patients. This often leads to non-inferiority (NI) testing. NI trials may or may not include a placebo arm, for ethical reasons. However, when a placebo arm is included, the resulting three-arm trial is more prudent since it requires less stringent assumptions than a two-arm placebo-free trial. In this article, we consider both Frequentist and Bayesian procedures for testing NI in the three-arm trial with binary outcomes when the functional of interest is the risk difference. An improved Frequentist approach is proposed first, which is then followed by a Bayesian counterpart. Bayesian methods have a natural advantage in many active-control trials, including NI trials, as they can seamlessly integrate substantial prior information. In addition, we discuss sample size calculation and draw an interesting connection between the two paradigms.
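
To make the testing problem in this abstract concrete, here is a minimal sketch of a Wald-type NI test for the risk difference in a three-arm trial. It assumes the common effect-retention formulation, in which the experimental arm must retain at least a fraction theta of the reference-versus-placebo effect and higher response rates are better. The function name, the default theta, and the unrestricted variance estimate are illustrative assumptions; this is not the improved procedure the article proposes.

```python
from math import sqrt
from statistics import NormalDist


def three_arm_ni_wald(x_e, n_e, x_r, n_r, x_p, n_p, theta=0.8):
    """Wald-type test of H0: (pE - pP) <= theta * (pR - pP).

    x_* are responder counts and n_* are arm sizes for the experimental (e),
    reference (r), and placebo (p) arms. theta is the assumed fraction of the
    reference effect that must be retained (hypothetical default).
    Returns the z statistic and the one-sided p-value.
    """
    p_e, p_r, p_p = x_e / n_e, x_r / n_r, x_p / n_p
    # H0 can be rewritten as: pE - theta*pR - (1 - theta)*pP <= 0
    estimate = p_e - theta * p_r - (1 - theta) * p_p
    variance = (
        p_e * (1 - p_e) / n_e
        + theta**2 * p_r * (1 - p_r) / n_r
        + (1 - theta) ** 2 * p_p * (1 - p_p) / n_p
    )
    if variance <= 0:
        raise ValueError("variance estimate is zero; the Wald test is undefined")
    z = estimate / sqrt(variance)
    p_value = 1 - NormalDist().cdf(z)  # one-sided: large z supports NI
    return z, p_value
```

A call such as `three_arm_ni_wald(70, 100, 72, 100, 40, 100)` returns the statistic and p-value for made-up counts; NI would be claimed when the p-value falls below the chosen one-sided level.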

7.
BMC Bioinformatics ; 23(1): 321, 2022 Aug 05.
Article in English | MEDLINE | ID: mdl-35931981

ABSTRACT

BACKGROUND: Applying directed acyclic graph (DAG) models to proteogenomic data has been shown to be effective for detecting causal biomarkers of complex diseases. However, there remain unsolved challenges in DAG learning to jointly model binary clinical outcome variables and continuous biomarker measurements. RESULTS: In this paper, we propose a new tool, DAGBagM, to learn DAGs with both continuous and binary nodes. By using appropriate models, DAGBagM allows for either continuous or binary nodes to be parent or child nodes. It employs a bootstrap aggregating strategy to reduce false positives in edge inference. At the same time, the aggregation procedure provides a flexible framework to robustly incorporate prior information on edges. CONCLUSIONS: Through extensive simulation experiments, we demonstrate that DAGBagM has superior performance compared to alternative strategies for modeling mixed types of nodes. In addition, DAGBagM is computationally more efficient than two competing methods. When applying DAGBagM to proteogenomic datasets from ovarian cancer studies, we identify potential protein biomarkers for platinum refractory/resistant response in ovarian cancer. DAGBagM is made available as a GitHub repository at https://github.com/jie108/dagbagM.
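
The idea of bootstrap aggregation for edge inference can be sketched as follows. This is a deliberately simplified frequency-voting scheme, assuming each bootstrap fit has already produced a set of directed edges; it is not DAGBagM's actual aggregation procedure, it does not check acyclicity, and the function name and `min_freq` threshold are hypothetical.

```python
from collections import Counter


def aggregate_edges(bootstrap_edge_sets, min_freq=0.5):
    """Keep directed edges that appear in at least min_freq of bootstrap fits.

    bootstrap_edge_sets: iterable of collections of (parent, child) tuples,
    one collection per bootstrap resample. Returns {edge: selection frequency}.
    """
    fits = [set(edges) for edges in bootstrap_edge_sets]
    if not fits:
        raise ValueError("at least one bootstrap fit is required")
    counts = Counter(edge for edges in fits for edge in edges)
    n = len(fits)
    # Edges seen only in a few resamples are treated as likely false positives.
    return {edge: c / n for edge, c in counts.items() if c / n >= min_freq}
```

Edges that survive the threshold are those that are stable under resampling, which is the intuition behind using aggregation to reduce false positives.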


Subject(s)
Ovarian Neoplasms , Biomarkers , Causality , Child , Computer Simulation , Confounding Factors, Epidemiologic , Female , Humans , Ovarian Neoplasms/drug therapy , Ovarian Neoplasms/genetics
8.
Stat Med ; 41(18): 3492-3510, 2022 08 15.
Article in English | MEDLINE | ID: mdl-35656596

ABSTRACT

The performance of computational methods and software to identify differentially expressed features in single-cell RNA-sequencing (scRNA-seq) has been shown to be influenced by several factors, including the choice of the normalization method used and the choice of the experimental platform (or library preparation protocol) to profile gene expression in individual cells. Currently, it is up to the practitioner to choose the most appropriate differential expression (DE) method out of over 100 DE tools available to date, each relying on its own assumptions to model scRNA-seq expression features. To model the technological variability in cross-platform scRNA-seq data, here we propose to use Tweedie generalized linear models that can flexibly capture a large dynamic range of observed scRNA-seq expression profiles across experimental platforms induced by platform- and gene-specific statistical properties such as heavy tails, sparsity, and gene expression distributions. We also propose a zero-inflated Tweedie model that allows the zero probability mass to exceed that of a traditional Tweedie distribution, in order to model zero-inflated scRNA-seq data with excessive zero counts. Using both synthetic and published plate- and droplet-based scRNA-seq datasets, we perform a systematic benchmark evaluation of more than 10 representative DE methods and demonstrate that our method (Tweedieverse) outperforms the state-of-the-art DE approaches across experimental platforms in terms of statistical power and false discovery rate control. Our open-source software (R/Bioconductor package) is available at https://github.com/himelmallick/Tweedieverse.
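
The role of zero inflation can be illustrated with a short sketch. It assumes the compound Poisson-gamma form of the Tweedie distribution (power parameter strictly between 1 and 2) with the standard mean/dispersion parametrization, under which the probability of an exact zero is exp(-lambda) with lambda = mu^(2-p) / (phi * (2-p)). The function names and the mixing weight `pi_zero` are illustrative assumptions and are not taken from the Tweedieverse package.

```python
from math import exp


def tweedie_zero_prob(mu, phi, power):
    """P(Y = 0) for a compound Poisson-gamma Tweedie with 1 < power < 2."""
    if not 1 < power < 2:
        raise ValueError("power must lie strictly between 1 and 2")
    if mu <= 0 or phi <= 0:
        raise ValueError("mu and phi must be positive")
    lam = mu ** (2 - power) / (phi * (2 - power))  # Poisson rate of the compound sum
    return exp(-lam)


def zero_inflated_zero_prob(pi_zero, mu, phi, power):
    """P(Y = 0) when an extra point mass pi_zero at zero is mixed in."""
    if not 0 <= pi_zero <= 1:
        raise ValueError("pi_zero must be a probability")
    return pi_zero + (1 - pi_zero) * tweedie_zero_prob(mu, phi, power)
```

For any positive `pi_zero`, the zero-inflated probability is at least as large as the plain Tweedie one, which is the sense in which the extra component accommodates excess zeros.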


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Gene Expression Profiling/methods , Humans , RNA-Seq , Sequence Analysis, RNA , Software
9.
Anal Chem ; 94(27): 9540-9547, 2022 07 12.
Article in English | MEDLINE | ID: mdl-35767427

ABSTRACT

Despite advances in proteomic technologies, clinical translation of plasma biomarkers remains low, partly due to a major bottleneck between the discovery of candidate biomarkers and costly clinical validation studies. Due to a dearth of multiplexable assays, generally only a few candidate biomarkers are tested, and the validation success rate is accordingly low. Previously, mass spectrometry-based approaches have been used to fill this gap but featured poor quantitative performance and were generally limited to hundreds of proteins. Here, we demonstrate the capability of an internal standard triggered-parallel reaction monitoring (IS-PRM) assay to greatly expand the numbers of candidates that can be tested with improved quantitative performance. The assay couples immunodepletion and fractionation with IS-PRM and was developed and implemented in human plasma to quantify 5176 peptides representing 1314 breast cancer biomarker candidates. Characterization of the IS-PRM assay demonstrated the precision (median CV of 7.7%), linearity (median R² > 0.999 over 4 orders of magnitude), and sensitivity (median LLOQ of approximately <1 fmol) to enable rank-ordering of candidate biomarkers for validation studies. Using three plasma pools from breast cancer patients and three control pools, 893 proteins were quantified, of which 162 candidate biomarkers were verified in at least one of the cancer pools and 22 were verified in all three cancer pools. The assay greatly expands capabilities for quantification of large numbers of proteins and is well suited for prioritization of viable candidate biomarkers.
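
The precision figure quoted above is a median percent coefficient of variation (CV) across peptides. A minimal sketch of that summary, assuming replicate measurements are available per peptide, is shown below; the function names and the dictionary layout are hypothetical and the data in any call would be the user's own.

```python
from statistics import mean, median, stdev


def percent_cv(values):
    """Percent coefficient of variation: 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * stdev(values) / m


def median_percent_cv(replicates_by_peptide):
    """Median of the per-peptide percent CVs.

    replicates_by_peptide maps a peptide label to its replicate measurements
    (at least two replicates per peptide).
    """
    return median([percent_cv(v) for v in replicates_by_peptide.values()])
```

Reporting the median rather than the mean keeps a handful of poorly behaved peptides from dominating the summary.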


Subject(s)
Breast Neoplasms , Proteomics , Biomarkers/analysis , Biomarkers, Tumor , Breast Neoplasms/diagnosis , Female , Humans , Mass Spectrometry/methods , Peptides/analysis , Proteins , Proteomics/methods
10.
Biostatistics ; 23(1): 136-156, 2022 01 13.
Article in English | MEDLINE | ID: mdl-32385495

ABSTRACT

With limited resources available, innovation in statistical methods for the design and analysis of randomized controlled trials (RCTs) is of paramount importance for the discovery of newer and better treatments in any therapeutic area. Although clinical efficacy is almost always the primary criterion for measuring any beneficial effect of a treatment, there are several other important factors (e.g., side effects, cost burden, less debilitating or less intensive regimens) that can make a less efficacious treatment option favorable for a subgroup of patients. This leads to non-inferiority (NI) testing. The objective of an NI trial is to show that an experimental treatment is not worse than an active reference treatment by more than a pre-specified margin. Traditional NI trials do not include a placebo arm for ethical reasons; however, this necessitates stringent and often unverifiable assumptions. On the other hand, three-arm NI trials consisting of placebo, reference, and experimental treatment can simultaneously test the superiority of the reference over placebo and NI of the experimental treatment to the reference. In this article, we propose both novel Frequentist and Bayesian procedures for testing NI in the three-arm trial with Poisson distributed count outcomes. RCTs with count data as the primary outcome are quite common in various disease areas, such as lesion counts in cancer trials, relapses in multiple sclerosis, dermatology, neurology, cardiovascular research, adverse event counts, etc. We first propose an improved Frequentist approach, which is then followed by its Bayesian version. Bayesian methods have a natural advantage in any active-control trial, including an NI trial, when substantial historical information is available for the placebo and the established reference treatment. In addition, we discuss sample size calculation and draw an interesting connection between the two paradigms.
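
A Bayesian NI assessment for Poisson counts can be sketched with conjugate gamma priors and Monte Carlo draws. The sketch below makes several assumptions that are not stated in the abstract: lower event rates are better, NI is expressed as retaining at least a fraction theta of the reference-versus-placebo reduction, and each arm has an independent Gamma(shape, rate) prior. The function name, defaults, and prior values are hypothetical, and the sketch does not separately check that the reference beats placebo.

```python
import random


def posterior_prob_ni(sum_e, n_e, sum_r, n_r, sum_p, n_p, theta=0.8,
                      prior_shape=0.5, prior_rate=0.001, draws=20000, seed=1):
    """Posterior probability that (lamP - lamE) > theta * (lamP - lamR).

    sum_* are total event counts and n_* are numbers of patients in the
    experimental (e), reference (r), and placebo (p) arms.
    """
    rng = random.Random(seed)

    def draw_rate(total, n):
        # Gamma(shape, rate) prior + Poisson likelihood -> Gamma posterior.
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        return rng.gammavariate(prior_shape + total, 1.0 / (prior_rate + n))

    hits = 0
    for _ in range(draws):
        lam_e = draw_rate(sum_e, n_e)
        lam_r = draw_rate(sum_r, n_r)
        lam_p = draw_rate(sum_p, n_p)
        if lam_p - lam_e > theta * (lam_p - lam_r):
            hits += 1
    return hits / draws
```

NI would be declared when this posterior probability exceeds a pre-specified threshold; informative historical data for the placebo and reference arms would enter through the prior shape and rate.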


Subject(s)
Research Design , Bayes Theorem , Humans , Treatment Outcome
11.
Clin Chem ; 67(7): 1008-1018, 2021 07 06.
Article in English | MEDLINE | ID: mdl-34136904

ABSTRACT

BACKGROUND: Conventional HER2-targeting therapies improve outcomes for patients with HER2-positive breast cancer (BC), defined as tumors showing HER2 protein overexpression by immunohistochemistry and/or ERBB2 gene amplification determined by in situ hybridization (ISH). Emerging HER2-targeting compounds show benefit in some patients with neither HER2 protein overexpression nor ERBB2 gene amplification, creating a need for new assays to select HER2-low tumors for treatment with these compounds. We evaluated the analytical performance of a targeted mass spectrometry-based assay for quantifying HER2 protein in formalin-fixed paraffin-embedded (FFPE) and frozen BC biopsies. METHODS: We used immunoaffinity-enrichment coupled to multiple reaction monitoring-mass spectrometry (immuno-MRM-MS) to quantify HER2 protein (as peptide GLQSLPTHDPSPLQR) in 96 frozen and 119 FFPE BC biopsies. We characterized linearity, lower limit of quantification (LLOQ), and intra- and inter-day variation of the assay in frozen and FFPE tissue matrices. We determined concordance between HER2 immuno-MRM-MS and predicate immunohistochemistry and ISH assays and examined the benefit of multiplexing the assay to include proteins expressed in tumor subcompartments (e.g., stroma, adipose, lymphocytes, epithelium) to account for tissue heterogeneity. RESULTS: HER2 immuno-MRM-MS assay linearity was ≥10³, assay coefficient of variation was 7.8% (FFPE) and 5.9% (frozen) for spiked-in analyte, and 7.7% (FFPE) and 7.9% (frozen) for endogenous measurements. Immuno-MRM-MS-based HER2 measurements strongly correlated with predicate assay HER2 determinations, and concordance was improved by normalizing to glyceraldehyde-3-phosphate dehydrogenase. HER2 was quantified above the LLOQ in all tumors. CONCLUSIONS: Immuno-MRM-MS can be used to quantify HER2 in FFPE and frozen BC biopsies, even at low HER2 expression levels.


Subject(s)
Breast Neoplasms , Biomarkers, Tumor/genetics , Breast Neoplasms/pathology , Female , Formaldehyde/chemistry , Humans , Mass Spectrometry/methods , Paraffin Embedding , Receptor, ErbB-2/analysis , Tissue Fixation/methods
12.
Cell Rep Med ; 2(12): 100471, 2021 12 21.
Article in English | MEDLINE | ID: mdl-35028612

ABSTRACT

Resistance to platinum compounds is a major determinant of patient survival in high-grade serous ovarian cancer (HGSOC). To understand mechanisms of platinum resistance and identify potential therapeutic targets in resistant HGSOC, we generated a data resource composed of dynamic (±carboplatin) protein, post-translational modification, and RNA sequencing (RNA-seq) profiles from intra-patient cell line pairs derived from 3 HGSOC patients before and after acquiring platinum resistance. These profiles reveal extensive responses to carboplatin that differ between sensitive and resistant cells. Higher fatty acid oxidation (FAO) pathway expression is associated with platinum resistance, and both pharmacologic inhibition and CRISPR knockout of carnitine palmitoyltransferase 1A (CPT1A), which represents a rate limiting step of FAO, sensitize HGSOC cells to platinum. The results are further validated in patient-derived xenograft models, indicating that CPT1A is a candidate therapeutic target to overcome platinum resistance. All multiomic data can be queried via an intuitive gene-query user interface (https://sites.google.com/view/ptrc-cell-line).


Subject(s)
Carboplatin/therapeutic use , Carnitine O-Palmitoyltransferase/metabolism , Cystadenocarcinoma, Serous/metabolism , Cystadenocarcinoma, Serous/pathology , Genomics , Molecular Targeted Therapy , Ovarian Neoplasms/metabolism , Ovarian Neoplasms/pathology , Acetyl-CoA Carboxylase/genetics , Acetyl-CoA Carboxylase/metabolism , Animals , Apoptosis/drug effects , Carboplatin/pharmacology , Carnitine O-Palmitoyltransferase/antagonists & inhibitors , Carnitine O-Palmitoyltransferase/genetics , Cell Line, Tumor , Cell Proliferation/drug effects , Cystadenocarcinoma, Serous/drug therapy , DNA Damage , Drug Resistance, Neoplasm/drug effects , Fatty Acids/metabolism , Female , Gene Expression Regulation, Neoplastic/drug effects , Humans , Mice, SCID , Neoplasm Grading , Ovarian Neoplasms/drug therapy , Oxidation-Reduction/drug effects , Oxidative Phosphorylation/drug effects , Phosphoproteins/metabolism , Proteomics , Reactive Oxygen Species/metabolism
13.
Cell ; 183(7): 1962-1985.e31, 2020 12 23.
Article in English | MEDLINE | ID: mdl-33242424

ABSTRACT

We report a comprehensive proteogenomics analysis, including whole-genome sequencing, RNA sequencing, and proteomics and phosphoproteomics profiling, of 218 tumors across 7 histological types of childhood brain cancer: low-grade glioma (n = 93), ependymoma (32), high-grade glioma (25), medulloblastoma (22), ganglioglioma (18), craniopharyngioma (16), and atypical teratoid rhabdoid tumor (12). Proteomics data identify common biological themes that span histological boundaries, suggesting that treatments used for one histological type may be applied effectively to other tumors sharing similar proteomics features. Immune landscape characterization reveals diverse tumor microenvironments across and within diagnoses. Proteomics data further reveal functional effects of somatic mutations and copy number variations (CNVs) not evident in transcriptomics data. Kinase-substrate association and co-expression network analysis identify important biological mechanisms of tumorigenesis. This is the first large-scale proteogenomics analysis across traditional histological boundaries to uncover foundational pediatric brain tumor biology and inform rational treatment selection.


Subject(s)
Brain Neoplasms/genetics , Brain Neoplasms/pathology , Proteogenomics , Brain Neoplasms/immunology , Child , DNA Copy Number Variations/genetics , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Genome, Human , Glioma/genetics , Glioma/pathology , Humans , Lymphocytes, Tumor-Infiltrating/immunology , Mutation/genetics , Neoplasm Grading , Neoplasm Recurrence, Local/pathology , Phosphoproteins/metabolism , Phosphorylation , RNA, Messenger/genetics , RNA, Messenger/metabolism , Transcriptome/genetics
15.
Cell ; 179(4): 964-983.e31, 2019 10 31.
Article in English | MEDLINE | ID: mdl-31675502

ABSTRACT

To elucidate the deregulated functional modules that drive clear cell renal cell carcinoma (ccRCC), we performed comprehensive genomic, epigenomic, transcriptomic, proteomic, and phosphoproteomic characterization of treatment-naive ccRCC and paired normal adjacent tissue samples. Genomic analyses identified a distinct molecular subgroup associated with genomic instability. Integration of proteogenomic measurements uniquely identified protein dysregulation of cellular mechanisms impacted by genomic alterations, including oxidative phosphorylation-related metabolism, protein translation processes, and phospho-signaling modules. To assess the degree of immune infiltration in individual tumors, we identified microenvironment cell signatures that delineated four immune-based ccRCC subtypes characterized by distinct cellular pathways. This study reports a large-scale proteogenomic analysis of ccRCC to discern the functional impact of genomic alterations and provides evidence for rational treatment selection stemming from ccRCC pathobiology.


Subject(s)
Carcinoma, Renal Cell/genetics , Neoplasm Proteins/genetics , Proteogenomics , Transcriptome/genetics , Adult , Aged , Aged, 80 and over , Biomarkers, Tumor/genetics , Biomarkers, Tumor/immunology , Carcinoma, Renal Cell/immunology , Carcinoma, Renal Cell/pathology , Disease-Free Survival , Exome/genetics , Female , Gene Expression Regulation, Neoplastic/genetics , Genome, Human/genetics , Humans , Male , Middle Aged , Neoplasm Proteins/immunology , Oxidative Phosphorylation , Phosphorylation/genetics , Signal Transduction/genetics , Transcriptome/immunology , Tumor Microenvironment/genetics , Tumor Microenvironment/immunology , Exome Sequencing
16.
Comput Stat Data Anal ; 132: 70-83, 2019 Apr.
Article in English | MEDLINE | ID: mdl-31749512

ABSTRACT

Three-arm non-inferiority (NI) trials including the experimental treatment, an active reference treatment, and a placebo, where the outcome of interest is binary, are considered. While the risk difference (RD) is the most common and well-explored functional form for testing efficacy (or effectiveness), a recent FDA guideline suggested measures such as relative risk (RR), odds ratio (OR), and number needed to treat (NNT), among others, on the basis of which NI can be claimed for a binary outcome. However, developing tests based on these different functions of a binary outcome is challenging, because the construction and interpretation of the NI margin for such functions are non-trivial extensions of the RD-based approach. A Frequentist test based on the traditional fraction margin approach for RR, OR, and NNT is proposed first. Furthermore, a conditional testing approach is developed by incorporating the assay sensitivity (AS) condition directly into NI testing. A detailed discussion of sample size/power calculation is also put forward, which could be readily used while designing such trials in practice. Data from a clinical trial are reanalyzed to demonstrate the presented approach.
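
For readers less familiar with the four functionals named in this abstract, the following sketch computes RD, RR, OR, and NNT from two response probabilities using their standard definitions. The function name and the convention that the first argument is the treatment arm and the second the comparator are illustrative assumptions.

```python
def effect_measures(p_trt, p_ctl):
    """Return RD, RR, OR, and NNT for two response probabilities in (0, 1)."""
    if not (0 < p_trt < 1 and 0 < p_ctl < 1):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    rd = p_trt - p_ctl                                   # risk difference
    rr = p_trt / p_ctl                                   # relative risk
    odds_ratio = (p_trt / (1 - p_trt)) / (p_ctl / (1 - p_ctl))
    nnt = float("inf") if rd == 0 else 1 / abs(rd)       # number needed to treat
    return {"RD": rd, "RR": rr, "OR": odds_ratio, "NNT": nnt}
```

Because RR, OR, and NNT are nonlinear in the arm-level probabilities, a margin stated on one scale does not translate directly to another, which is one reason the RD-based construction does not carry over trivially.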

17.
Stat Biopharm Res ; 11(1): 34-43, 2019.
Article in English | MEDLINE | ID: mdl-31602287

ABSTRACT

In this paper we consider a three-arm non-inferiority (NI) trial that includes an experimental, a reference, and a placebo arm. While for binary outcomes the risk difference (RD) is the most common and well-explored functional form for testing efficacy (or effectiveness), a recent FDA guideline suggested other measures, such as relative risk (RR) and odds ratio (OR), on the basis of which NI of an experimental treatment can be claimed. However, developing tests based on these different functions of binary outcomes is challenging, since the construction and interpretation of the NI margin for such functions are not trivial extensions of the RD-based approach. Recently, we have proposed Frequentist approaches for testing NI for these functionals. In this article we further develop Bayesian approaches for testing NI based on the effect retention approach for RR and OR. The Bayesian paradigm provides a natural path to integrate information from historical trials, and it allows the use of patients' and clinicians' opinions as prior information via sequential learning. In addition, we discuss in detail the sample size/power calculation, which could be readily used while designing such trials in practice.

18.
AIDS Patient Care STDS ; 33(9): 388-398, 2019 09.
Article in English | MEDLINE | ID: mdl-31517525

ABSTRACT

Dramatic decreases in HIV transmission are achievable with currently available biomedical and behavioral interventions, including antiretroviral therapy and pre-exposure prophylaxis. However, such decreases have not yet been realized among adolescents and young adults. The Adolescent Medicine Trials Network (ATN) for HIV/AIDS interventions is dedicated to research addressing the needs of youth at high risk for HIV acquisition as well as youth living with HIV. This article provides an overview of an array of efficient and effective designs across the translational spectrum that are utilized within the ATN. These designs maximize methodological rigor and real-world applicability of findings while minimizing resource use. Implementation science and cost-effectiveness methods are included. Utilizing protocol examples, we demonstrate the feasibility of such designs to balance rigor and relevance to shorten the science-to-practice gap and improve the youth HIV prevention and care continua.


Subject(s)
Antiretroviral Therapy, Highly Active , HIV Infections/prevention & control , Pre-Exposure Prophylaxis , Acquired Immunodeficiency Syndrome , Adolescent , Adolescent Behavior , HIV Infections/drug therapy , Humans , Young Adult
19.
J Biopharm Stat ; 29(3): 425-445, 2019.
Article in English | MEDLINE | ID: mdl-30744476

ABSTRACT

For an existing established drug regimen, active control trials are the de facto standard for ethical reasons as well as for clinical equipoise. However, when a superiority claim for a new drug against the active control is unlikely to be successful, researchers often address the issue in terms of non-inferiority (NI), provided the experimental drug demonstrates evidence of other benefits beyond efficacy. Such trials aim to demonstrate that an experimental treatment is not worse than an existing comparator by more than a pre-specified margin. The issue of choosing such a margin is complex. In this article, two-arm NI trials with binary outcomes are considered when the margin is defined in terms of relative risk or odds ratio. A Frequentist test based on the proposed NI margin is developed first. Since two-arm NI trials without a placebo arm depend upon historical information for accurate and meaningful interpretation of their results, a Bayesian approach is developed next. The Bayesian approach can flexibly incorporate the available information from the historical trial. The operating characteristics of the proposed methods are studied in terms of power and sample size for varying design factors. Data from a clinical trial are reanalyzed to study the properties of the proposed approach.
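
A two-arm NI test on the relative-risk scale can be sketched with the usual large-sample approximation for the log relative risk. The sketch assumes that higher response rates are better and that NI means the ratio pE/pR exceeds a fixed margin below 1; the function name and the default margin of 0.8 are hypothetical, and this is not the specific margin construction developed in the article.

```python
from math import log, sqrt
from statistics import NormalDist


def two_arm_ni_rr(x_e, n_e, x_r, n_r, margin=0.8):
    """Wald-type test of H0: pE / pR <= margin on the log scale.

    x_* are responder counts and n_* are arm sizes for the experimental (e)
    and reference (r) arms. Returns the z statistic and one-sided p-value.
    """
    if not 0 < margin < 1:
        raise ValueError("margin must lie strictly between 0 and 1")
    if x_e <= 0 or x_r <= 0:
        raise ValueError("both arms need at least one responder")
    log_rr = log((x_e / n_e) / (x_r / n_r))
    # Delta-method variance of log(p_hat) is (1 - p) / (n * p) = 1/x - 1/n.
    variance = (1 / x_e - 1 / n_e) + (1 / x_r - 1 / n_r)
    if variance <= 0:
        raise ValueError("variance estimate is zero; the Wald test is undefined")
    z = (log_rr - log(margin)) / sqrt(variance)
    return z, 1 - NormalDist().cdf(z)
```

The choice of margin drives the conclusion, which is why the abstract stresses that selecting it is complex and usually relies on historical evidence about the comparator.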


Subject(s)
Controlled Clinical Trials as Topic/statistics & numerical data , Models, Statistical , Research Design/statistics & numerical data , Bayes Theorem , Controlled Clinical Trials as Topic/methods , Data Interpretation, Statistical , Humans , Markov Chains , Monte Carlo Method , Odds Ratio , Research Design/standards , Risk , Sample Size
20.
Stat Med ; 37(20): 3012-3026, 2018 09 10.
Article in English | MEDLINE | ID: mdl-29900575

ABSTRACT

In many biomedical applications, covariates are naturally grouped, with variables in the same group being systematically related or statistically correlated. Under such settings, variable selection must be conducted at both the group and individual variable levels. Motivated by the widespread availability of zero-inflated count outcomes and grouped covariates in many practical applications, we consider group regularization for zero-inflated negative binomial regression models. Using a least squares approximation of the mixture likelihood and a variety of group-wise penalties on the coefficients, we propose a unified algorithm (Gooogle: Group Regularization for Zero-inflated Count Regression Models) to efficiently compute the entire regularization path of the estimators. We investigate the finite sample performance of these methods through extensive simulation experiments and the analysis of a German health care demand dataset. Finally, we derive theoretical properties of these methods under reasonable assumptions, which provide deeper insight into the asymptotic behavior of these approaches. The open source software implementation of this method is publicly available at: https://github.com/himelmallick/Gooogle.
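
To illustrate what a group-wise penalty does, here is a small sketch of the group lasso penalty and its group soft-thresholding step. The group lasso is used here only as one representative group-wise penalty (the abstract does not list the specific penalties), and the function names, the size-based group weights, and the dictionary layout are illustrative assumptions rather than the Gooogle implementation.

```python
from math import sqrt


def group_lasso_penalty(coefs_by_group, lam):
    """lam * sum over groups of sqrt(group size) * Euclidean norm of the group."""
    return lam * sum(
        sqrt(len(beta)) * sqrt(sum(b * b for b in beta))
        for beta in coefs_by_group.values()
    )


def group_soft_threshold(beta, threshold):
    """Shrink a whole coefficient group toward zero; drop it if its norm is small."""
    norm = sqrt(sum(b * b for b in beta))
    if norm <= threshold:
        return [0.0] * len(beta)       # the entire group is removed together
    shrink = 1 - threshold / norm
    return [shrink * b for b in beta]
```

Because the penalty acts on the norm of each group, coefficients in a group enter or leave the model together, which matches the grouped-covariate setting described in the abstract.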


Subject(s)
Health Services Needs and Demand/statistics & numerical data , Models, Statistical , Algorithms , Germany , Humans , Least-Squares Analysis , Likelihood Functions , Software