Results 1 - 20 of 37
1.
Proc Natl Acad Sci U S A ; 121(4): e2308942121, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38241441

ABSTRACT

In the Antibody Mediated Prevention (AMP) trials (HVTN 704/HPTN 085 and HVTN 703/HPTN 081), prevention efficacy (PE) of the monoclonal broadly neutralizing antibody (bnAb) VRC01 (vs. placebo) against HIV-1 acquisition diagnosis varied according to the HIV-1 Envelope (Env) neutralization sensitivity to VRC01, as measured by 80% inhibitory concentration (IC80). Here, we performed a genotypic sieve analysis, a complementary approach to gaining insight into correlates of protection that assesses how PE varies with HIV-1 sequence features. We analyzed HIV-1 Env amino acid (AA) sequences from the earliest available HIV-1 RNA-positive plasma samples from AMP participants diagnosed with HIV-1 and identified Env sequence features that associated with PE. The strongest Env AA sequence correlate in both trials was VRC01 epitope distance that quantifies the divergence of the VRC01 epitope in an acquired HIV-1 isolate from the VRC01 epitope of reference HIV-1 strains that were most sensitive to VRC01-mediated neutralization. In HVTN 704/HPTN 085, the Env sequence-based predicted probability that VRC01 IC80 against the acquired isolate exceeded 1 µg/mL also significantly associated with PE. In HVTN 703/HPTN 081, a physicochemical-weighted Hamming distance across 50 VRC01 binding-associated Env AA positions of the acquired isolate from the most VRC01-sensitive HIV-1 strain significantly associated with PE. These results suggest that incorporating mutation scoring by BLOSUM62 and weighting by the strength of interactions at AA positions in the epitope:VRC01 interface can optimize performance of an Env sequence-based biomarker of VRC01 prevention efficacy. Future work could determine whether these results extend to other bnAbs and bnAb combinations.
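The weighted-distance idea described above, scoring each epitope mismatch by BLOSUM62 and weighting positions by the strength of interactions at the epitope:VRC01 interface, can be sketched as follows. The position weights and the small substitution-score excerpt below are hypothetical placeholders, not the values used in the AMP sieve analysis:

```python
# Sketch of a physicochemical-weighted Hamming distance between an acquired
# Env epitope sequence and a VRC01-sensitive reference. Substitution scores
# and position weights are illustrative placeholders.

# Tiny excerpt of BLOSUM62-style substitution scores (symmetric pairs).
BLOSUM62_EXCERPT = {
    ("N", "N"): 6, ("N", "D"): 1, ("D", "D"): 6,
    ("G", "G"): 6, ("S", "S"): 4, ("T", "T"): 5,
}

def sub_score(a, b):
    # Look up a substitution score, handling the symmetry of the matrix.
    return BLOSUM62_EXCERPT.get((a, b), BLOSUM62_EXCERPT.get((b, a), -1))

def weighted_hamming(acquired, reference, weights):
    """Sum over positions of weight * (self-score - observed score), so
    identical sequences score 0 and dissimilar substitutions add more."""
    assert len(acquired) == len(reference) == len(weights)
    return sum(w * (sub_score(r, r) - sub_score(a, r))
               for a, r, w in zip(acquired, reference, weights))

# Hypothetical 4-residue epitope; first position weighted most heavily.
weights = [2.0, 1.0, 1.0, 0.5]
d = weighted_hamming("DGST", "NGST", weights)  # single N->D mismatch
```

An identical acquired sequence yields a distance of 0; conservative substitutions at lightly weighted positions add little, while disruptive substitutions at heavily weighted binding positions add much more.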


Subject(s)
HIV Infections , HIV Seropositivity , HIV-1 , Humans , Broadly Neutralizing Antibodies , Antibodies, Neutralizing , HIV Antibodies , Epitopes/genetics
2.
Prev Sci ; 2023 Oct 28.
Article in English | MEDLINE | ID: mdl-37897553

ABSTRACT

In research assessing the effect of an intervention or exposure, a key secondary objective often involves assessing differential effects of this intervention or exposure in subgroups of interest; this is often referred to as assessing effect modification or heterogeneity of treatment effects (HTE). Observed HTE can have important implications for policy, including intervention strategies (e.g., will some patients benefit more from intervention than others?) and prioritizing resources (e.g., to reduce observed health disparities). Analysis of HTE is well understood in studies where the independent unit is an individual. In contrast, in studies where the independent unit is a cluster (e.g., a hospital or school) and a cluster-level outcome is used in the analysis, it is less well understood how to proceed if the HTE analysis of interest involves an individual-level characteristic (e.g., self-reported race) that must be aggregated at the cluster level. Through simulations, we show that only individual-level models have power to detect HTE by individual-level variables; if outcomes must be defined at the cluster level, then there is often low power to detect HTE by the corresponding aggregated variables. We illustrate the challenges inherent to this type of analysis in a study assessing the effect of an intervention on increasing COVID-19 booster vaccination rates at long-term care centers.

3.
J Infect Dis ; 225(2): 332-340, 2022 01 18.
Article in English | MEDLINE | ID: mdl-34174082

ABSTRACT

BACKGROUND: In the CYD14 (NCT01373281) and CYD15 (NCT01374516) dengue vaccine efficacy trials, month 13 neutralizing antibody (nAb) titers correlated inversely with risk of symptomatic, virologically confirmed dengue (VCD) between month 13 (1 month after final dose) and month 25. We assessed nAb titer as a correlate of instantaneous risk of hospitalized VCD (HVCD), for which participants were continually surveilled for 72 months. METHODS: Using longitudinal nAb titers from the per-protocol immunogenicity subsets, we estimated hazard ratios (HRs) of HVCD by current nAb titer value for 3 correlate/endpoint pairs: average titer across all 4 serotypes/HVCD of any serotype (HVCD-Any), serotype-specific titer/homologous HVCD, and serotype-specific titer/heterologous HVCD. RESULTS: Baseline-seropositive placebo recipients with higher average titer had lower instantaneous risk of HVCD-Any in 2- to 16-year-olds and in 9- to 16-year-olds (HR, 0.26 or 0.15 per 10-fold increase in average titer by 2 methods [95% confidence interval (CI), .14-.45 and .07-.34, respectively]) pooled across both trials. Results were similar for homologous HVCD. There was evidence suggesting increased HVCD-Any risk in participants with low average titer (1:10 to 1:100) compared to seronegative participants (HR, 1.85 [95% CI, .93-3.68]). CONCLUSIONS: Natural infection-induced nAbs were inversely associated with hospitalized dengue, upon exceeding a relatively low threshold.
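Because the reported hazard ratios are per 10-fold increase in titer, i.e., per unit on the log10 scale, the implied hazard multiplier for any other fold-change follows by exponentiation. A minimal illustration (the function name is ours, not from the paper):

```python
import math

def hazard_multiplier(fold_change, hr_per_10fold):
    """Hazard multiplier implied by a given fold-change in nAb titer when
    the hazard ratio is reported per 10-fold (one log10-unit) increase."""
    return hr_per_10fold ** math.log10(fold_change)

# Under HR = 0.26 per 10-fold increase, a 100-fold rise in average titer
# corresponds to two 10-fold steps, i.e. a multiplier of 0.26 squared.
m = hazard_multiplier(100, 0.26)
```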


Subject(s)
Antibodies, Neutralizing , Dengue Vaccines/administration & dosage , Dengue Virus/immunology , Dengue/prevention & control , Vaccine Efficacy , Adolescent , Antibodies, Viral , Child , Child, Preschool , Female , Follow-Up Studies , Hospitalization , Humans , Male
4.
Bioinformatics ; 37(22): 4187-4192, 2021 11 18.
Article in English | MEDLINE | ID: mdl-34021743

ABSTRACT

MOTIVATION: A single monoclonal broadly neutralizing antibody (bnAb) regimen was recently evaluated in two randomized trials for prevention efficacy against HIV-1 infection. Subsequent trials will evaluate combination bnAb regimens (e.g. cocktails, multi-specific antibodies), which demonstrate higher potency and breadth in vitro compared to single bnAbs. Given the large number of potential regimens, methods for down-selecting these regimens into efficacy trials are of great interest. RESULTS: We developed Super LeArner Prediction of NAb Panels (SLAPNAP), a software tool for training and evaluating machine learning models that predict in vitro neutralization sensitivity of HIV Envelope (Env) pseudoviruses to a given single or combination bnAb regimen, based on Env amino acid sequence features. SLAPNAP also provides measures of variable importance of sequence features. By predicting bnAb coverage of circulating sequences, SLAPNAP can improve ranking of bnAb regimens by their potential prevention efficacy. In addition, SLAPNAP can improve sieve analysis by defining sequence features that impact bnAb prevention efficacy. AVAILABILITY AND IMPLEMENTATION: SLAPNAP is a freely available docker image that can be downloaded from DockerHub (https://hub.docker.com/r/slapnap/slapnap). Source code and documentation are available at GitHub (https://github.com/benkeser/slapnap and https://benkeser.github.io/slapnap/).


Subject(s)
HIV Infections , HIV-1 , Humans , Broadly Neutralizing Antibodies , HIV Antibodies , Antibodies, Neutralizing/chemistry
5.
Biometrics ; 78(3): 1181-1194, 2022 09.
Article in English | MEDLINE | ID: mdl-34048057

ABSTRACT

The absolute abundance of bacterial taxa in human host-associated environments plays a critical role in reproductive and gastrointestinal health. However, obtaining the absolute abundance of many bacterial species is typically prohibitively expensive. In contrast, relative abundance data for many species are comparatively cheap and easy to collect (e.g., with universal primers for the 16S rRNA gene). In this paper, we propose a method to jointly model relative abundance data for many taxa and absolute abundance data for a subset of taxa. Our method provides point and interval estimates for the absolute abundance of all taxa. Crucially, our proposal accounts for differences in the efficiency of taxon detection in the relative and absolute abundance data. We show that modeling taxon-specific efficiencies substantially reduces the estimation error for absolute abundance, and controls the coverage of interval estimators. We demonstrate the performance of our proposed method via a simulation study, a study of the effect of HIV acquisition on microbial abundances, and a sensitivity study where we jackknife the taxa with observed absolute abundances.
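The core idea, using the taxa with observed absolute abundances to calibrate the scale of the relative abundances for the remaining taxa, can be sketched with a deliberately simplified moment-style estimator. This toy version assumes equal detection efficiency across taxa, which is exactly the assumption the paper's efficiency-adjusted model relaxes:

```python
# Toy calibration of relative abundances to the absolute scale using the
# taxa with observed absolute abundances. Simplified: assumes equal
# detection efficiency, unlike the paper's efficiency-adjusted model.

def estimate_absolute(relative, absolute_observed):
    """relative: {taxon: relative abundance summing to 1 over all taxa};
    absolute_observed: {taxon: absolute abundance} for a subset of taxa.
    Returns point estimates of absolute abundance for every taxon."""
    # Each observed taxon implies a community total: absolute / relative.
    implied = [absolute_observed[t] / relative[t] for t in absolute_observed]
    total = sum(implied) / len(implied)  # average the implied totals
    return {t: (absolute_observed[t] if t in absolute_observed
                else relative[t] * total)
            for t in relative}

rel = {"A": 0.5, "B": 0.3, "C": 0.2}   # relative abundances (e.g., from 16S)
abs_obs = {"A": 1000.0, "B": 600.0}    # absolute abundances for A and B only
est = estimate_absolute(rel, abs_obs)  # estimates taxon C on the absolute scale
```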


Subject(s)
Bacteria , High-Throughput Nucleotide Sequencing , Bacteria/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , RNA, Ribosomal, 16S/genetics
6.
Stat Med ; 41(6): 1120-1136, 2022 03 15.
Article in English | MEDLINE | ID: mdl-35080038

ABSTRACT

In trials of oral HIV pre-exposure prophylaxis (PrEP), multiple approaches have been used to measure adherence, including self-report, pill counts, electronic dose monitoring devices, and biological measures such as drug levels in plasma, peripheral blood mononuclear cells, hair, and/or dried blood spots. No one of these measures is ideal and each has strengths and weaknesses. However, accurate estimates of adherence to oral PrEP are important as drug efficacy is closely tied to adherence, and secondary analyses of trial data within identified adherent/non-adherent subgroups may yield important insights into real-world drug effectiveness. We develop a statistical approach to combining multiple measures of adherence and show in simulated data that the proposed method provides a more accurate measure of true adherence than self-report. We then apply the method to estimate adherence in the ADAPT study (HPTN 067) in South African women.


Subject(s)
Anti-HIV Agents , HIV Infections , Pre-Exposure Prophylaxis , Anti-HIV Agents/therapeutic use , Female , HIV Infections/drug therapy , HIV Infections/prevention & control , Humans , Leukocytes, Mononuclear , Medication Adherence
7.
J Cardiovasc Electrophysiol ; 32(3): 639-646, 2021 03.
Article in English | MEDLINE | ID: mdl-33476459

ABSTRACT

INTRODUCTION: A weight-based heparin dosing policy adjusted for preprocedural oral anticoagulation was implemented to reduce the likelihood of subtherapeutic dosing during left atrial catheter ablation procedures. We hypothesized that initiation of the protocol would result in a greater prevalence of therapeutic activated clotting time (ACT) values and decreased time to therapeutic ACT during left atrial ablation procedures. METHODS: A departmental protocol was initiated for which subjects received intravenous unfractionated heparin (UFH) to achieve and maintain a goal of ACT >300 s. Initial bolus dose was adjusted for pre-procedure oral anticoagulation and weight as follows: 50 units/kg for those receiving warfarin, 75 units/kg for those not anticoagulated, and 120 units/kg for those on direct oral anticoagulants (DOACs). A UFH infusion was initiated at 10% of the bolus per hour. One hundred consecutive left atrial ablation procedures treated with Protocol Guided heparin dosing were compared with a retrospective consecutive cohort of Usual Care heparin dosing. RESULTS: When the Usual Care and Protocol Guided cohorts were compared, significant findings were limited to those on pre-procedure DOAC. The initial UFH bolus increased from 99.3 ± 24.8 to 118.2 ± 22.8 units/kg (p < .001), the proportion of therapeutic ACT on the first draw after heparin administration increased from 57.7% to 76.6% (p = .010), and the time to therapeutic ACT after UFH administration decreased from 37.8 ± 19.8 to 30.2 ± 16.4 min (p = .032). CONCLUSION: A weight-based protocol for periprocedural UFH administration resulted in a higher proportion of therapeutic ACT values and decreased the time to therapeutic ACT for those on pre-procedure DOAC.


Subject(s)
Atrial Fibrillation , Catheter Ablation , Anticoagulants/adverse effects , Atrial Fibrillation/diagnosis , Atrial Fibrillation/drug therapy , Atrial Fibrillation/surgery , Catheter Ablation/adverse effects , Heparin/adverse effects , Humans , Retrospective Studies
8.
Biometrics ; 77(1): 9-22, 2021 03.
Article in English | MEDLINE | ID: mdl-33043428

ABSTRACT

In a regression setting, it is often of interest to quantify the importance of various features in predicting the response. Commonly, the variable importance measure used is determined by the regression technique employed. For this reason, practitioners often only resort to one of a few regression techniques for which a variable importance measure is naturally defined. Unfortunately, these regression techniques are often suboptimal for predicting the response. Additionally, because the variable importance measures native to different regression techniques generally have a different interpretation, comparisons across techniques can be difficult. In this work, we study a variable importance measure that can be used with any regression technique, and whose interpretation is agnostic to the technique used. This measure is a property of the true data-generating mechanism. Specifically, we discuss a generalization of the analysis of variance variable importance measure and discuss how it facilitates the use of machine learning techniques to flexibly estimate the variable importance of a single feature or group of features. The importance of each feature or group of features in the data can then be described individually, using this measure. We describe how to construct an efficient estimator of this measure as well as a valid confidence interval. Through simulations, we show that our proposal has good practical operating characteristics, and we illustrate its use with data from a study of risk factors for cardiovascular disease in South Africa.
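The target quantity, an ANOVA-style importance defined as the drop in R-squared when a feature is removed, can be illustrated with a toy OLS fit. The paper's estimator uses flexible machine learning with efficient inference; this sketch only conveys the estimand:

```python
import statistics

def r_squared(y, yhat):
    ybar = statistics.fmean(y)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def ols_fit_predict(X, y):
    """Fit OLS (with intercept) via the normal equations and return
    in-sample predictions. X is a list of feature rows."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting, then back substitution.
    for col in range(p):
        piv = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for k in range(col + 1, p):
            f = A[k][col] / A[col][col]
            for c in range(col, p):
                A[k][c] -= f * A[col][c]
            b[k] -= f * b[col]
    beta = [0.0] * p
    for i in reversed(range(p)):
        beta[i] = (b[i] - sum(A[i][j] * beta[j]
                              for j in range(i + 1, p))) / A[i][i]
    return [sum(bk * xk for bk, xk in zip(beta, r)) for r in rows]

def importance(X_full, X_reduced, y):
    # ANOVA-style importance: drop in R^2 when the feature is removed.
    return (r_squared(y, ols_fit_predict(X_full, y))
            - r_squared(y, ols_fit_predict(X_reduced, y)))

# Toy data: y = 2*x1 + 3*x2 exactly, so the full model fits perfectly and
# the importance of x2 is the R^2 lost when x2 is dropped.
x1 = [0, 1, 2, 3, 4, 5, 6, 7]
x2 = [1, 0, 1, 0, 1, 0, 1, 0]
y = [2 * a + 3 * b for a, b in zip(x1, x2)]
imp_x2 = importance([[a, b] for a, b in zip(x1, x2)],
                    [[a] for a in x1], y)
```

The same contrast (predictiveness with all features vs. without the feature of interest) can be computed with any regression technique in place of OLS, which is the sense in which the measure is algorithm-agnostic.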


Subject(s)
Cardiovascular Diseases , Machine Learning , Humans , Regression Analysis , Risk Factors
9.
Proc Natl Acad Sci U S A ; 115(18): E4294-E4303, 2018 05 01.
Article in English | MEDLINE | ID: mdl-29654148

ABSTRACT

An individual malignant tumor is composed of a heterogeneous collection of single cells with distinct molecular and phenotypic features, a phenomenon termed intratumoral heterogeneity. Intratumoral heterogeneity poses challenges for cancer treatment, motivating the need for combination therapies. Single-cell technologies are now available to guide effective drug combinations by accounting for intratumoral heterogeneity through the analysis of the signaling perturbations of an individual tumor sample screened by a drug panel. In particular, Mass Cytometry Time-of-Flight (CyTOF) is a high-throughput single-cell technology that enables the simultaneous measurements of multiple (>40) intracellular and surface markers at the level of single cells for hundreds of thousands of cells in a sample. We developed a computational framework, entitled Drug Nested Effects Models (DRUG-NEM), to analyze CyTOF single-drug perturbation data for the purpose of individualizing drug combinations. DRUG-NEM optimizes drug combinations by choosing the minimum number of drugs that produce the maximal desired intracellular effects based on nested effects modeling. We demonstrate the performance of DRUG-NEM using single-cell drug perturbation data from tumor cell lines and primary leukemia samples.


Subject(s)
Antineoplastic Combined Chemotherapy Protocols/pharmacology , Antineoplastic Combined Chemotherapy Protocols/pharmacokinetics , Biomarkers, Tumor/metabolism , Computer Simulation , Precursor Cell Lymphoblastic Leukemia-Lymphoma/drug therapy , Precursor Cell Lymphoblastic Leukemia-Lymphoma/metabolism , HeLa Cells , Humans
10.
BMC Med Inform Decis Mak ; 21(1): 322, 2021 11 22.
Article in English | MEDLINE | ID: mdl-34809631

ABSTRACT

BACKGROUND: While random forests are one of the most successful machine learning methods, it is necessary to optimize their performance for use with datasets resulting from a two-phase sampling design with a small number of cases-a common situation in biomedical studies, which often have rare outcomes and covariates whose measurement is resource-intensive. METHODS: Using an immunologic marker dataset from a phase III HIV vaccine efficacy trial, we seek to optimize random forest prediction performance using combinations of variable screening, class balancing, weighting, and hyperparameter tuning. RESULTS: Our experiments show that while class balancing helps improve random forest prediction performance when variable screening is not applied, class balancing has a negative impact on performance in the presence of variable screening. The impact of the weighting similarly depends on whether variable screening is applied. Hyperparameter tuning is ineffective in situations with small sample sizes. We further show that random forests under-perform generalized linear models for some subsets of markers, and prediction performance on this dataset can be improved by stacking random forests and generalized linear models trained on different subsets of predictors, and that the extent of improvement depends critically on the dissimilarities between candidate learner predictions. CONCLUSION: In small datasets from two-phase sampling design, variable screening and inverse sampling probability weighting are important for achieving good prediction performance of random forests. In addition, stacking random forests and simple linear models can offer improvements over random forests.
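Stacking, in its simplest form, picks a convex combination of the candidate learners' cross-validated predictions that minimizes held-out loss. A minimal sketch, with made-up prediction vectors standing in for random forest and GLM outputs:

```python
# Minimal stacking sketch: choose the convex weight on two candidate
# learners' held-out predictions that minimizes squared error. The
# prediction vectors are hypothetical stand-ins for cross-validated
# random forest and GLM outputs.

def stack_weight(pred_a, pred_b, y, grid_steps=100):
    """Return w in [0, 1] minimizing sum((w*a + (1-w)*b - y)^2) on a grid."""
    best_w, best_loss = 0.0, float("inf")
    for i in range(grid_steps + 1):
        w = i / grid_steps
        loss = sum((w * a + (1 - w) * b - yi) ** 2
                   for a, b, yi in zip(pred_a, pred_b, y))
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w

rf_preds  = [0.80, 0.30, 0.90, 0.40]   # hypothetical CV predictions
glm_preds = [0.95, 0.05, 0.60, 0.00]
y_true    = [1.0, 0.0, 1.0, 0.0]
w = stack_weight(rf_preds, glm_preds, y_true)
```

In this toy example the optimal weight is interior to (0, 1): the blend outperforms either learner alone, echoing the abstract's point that the gain from stacking depends on the dissimilarity of the candidate learners' predictions.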


Subject(s)
Machine Learning , Vaccine Efficacy , Humans , Probability
11.
PLoS Comput Biol ; 15(4): e1006952, 2019 04.
Article in English | MEDLINE | ID: mdl-30933973

ABSTRACT

The broadly neutralizing antibody (bnAb) VRC01 is being evaluated for its efficacy to prevent HIV-1 infection in the Antibody Mediated Prevention (AMP) trials. A secondary objective of AMP utilizes sieve analysis to investigate how VRC01 prevention efficacy (PE) varies with HIV-1 envelope (Env) amino acid (AA) sequence features. An exhaustive analysis that tests how PE depends on every AA feature with sufficient variation would have low statistical power. To design an adequately powered primary sieve analysis for AMP, we modeled VRC01 neutralization as a function of Env AA sequence features of 611 HIV-1 gp160 pseudoviruses from the CATNAP database, with objectives: (1) to develop models that best predict the neutralization readouts; and (2) to rank AA features by their predictive importance with classification and regression methods. The dataset was split in half, and machine learning algorithms were applied to each half, each analyzed separately using cross-validation and hold-out validation. We selected Super Learner, a nonparametric ensemble-based cross-validated learning method, for advancement to the primary sieve analysis. This method predicted the dichotomous resistance outcome of whether the IC50 neutralization titer of VRC01 for a given Env pseudovirus is right-censored (indicating resistance) with an average validated AUC of 0.868 across the two hold-out datasets. Quantitative log IC50 was predicted with an average validated R2 of 0.355. Features predicting neutralization sensitivity or resistance included 26 surface-accessible residues in the VRC01 and CD4 binding footprints, the length of gp120, the length of Env, the number of cysteines in gp120, the number of cysteines in Env, and 4 potential N-linked glycosylation sites; the top features will be advanced to the primary sieve analysis. This modeling framework may also inform the study of VRC01 in the treatment of HIV-infected persons.


Subject(s)
Antibodies, Monoclonal/pharmacology , HIV Envelope Protein gp160/genetics , HIV Envelope Protein gp160/immunology , Amino Acid Sequence , Antibodies, Monoclonal/genetics , Antibodies, Monoclonal/immunology , Antibodies, Neutralizing/immunology , Binding Sites , Broadly Neutralizing Antibodies , CD4 Antigens , Computer Simulation , Forecasting/methods , Glycosylation , HIV Antibodies/immunology , HIV Infections/virology , HIV-1/immunology , Humans , Protein Binding
14.
Int J Biostat ; 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38348882

ABSTRACT

In many applications, it is of interest to identify a parsimonious set of features, or panel, from multiple candidates that achieves a desired level of performance in predicting a response. This task is often complicated in practice by missing data arising from the sampling design or other random mechanisms. Most recent work on variable selection in missing data contexts relies in some part on a finite-dimensional statistical model, e.g., a generalized or penalized linear model. In cases where this model is misspecified, the selected variables may not all be truly scientifically relevant and can result in panels with suboptimal classification performance. To address this limitation, we propose a nonparametric variable selection algorithm combined with multiple imputation to develop flexible panels in the presence of missing-at-random data. We outline strategies based on the proposed algorithm that achieve control of commonly used error rates. Through simulations, we show that our proposal has good operating characteristics and results in panels with higher classification and variable selection performance compared to several existing penalized regression approaches in cases where a generalized linear model is misspecified. Finally, we use the proposed method to develop biomarker panels for separating pancreatic cysts with differing malignancy potential in a setting where complicated missingness in the biomarkers arose due to limited specimen volumes.

15.
Article in English | MEDLINE | ID: mdl-38748991

ABSTRACT

OBJECTIVE: Present a general framework providing high-level guidance to developers of computable algorithms for identifying patients with specific clinical conditions (phenotypes) through a variety of approaches, including but not limited to machine learning and natural language processing methods to incorporate rich electronic health record data. MATERIALS/METHODS: Drawing on extensive prior phenotyping experiences and insights derived from three algorithm development projects conducted specifically for this purpose, our team with expertise in clinical medicine, statistics, informatics, pharmacoepidemiology, and healthcare data science methods conceptualized stages of development and corresponding sets of principles, strategies, and practical guidelines for improving the algorithm development process. RESULTS: We propose five stages of algorithm development and corresponding principles, strategies, and guidelines: 1) assessing fitness-for-purpose, 2) creating gold standard data, 3) feature engineering, 4) model development, and 5) model evaluation. DISCUSSION/CONCLUSION: This framework is intended to provide practical guidance and serve as a basis for future elaboration and extension.

16.
Contemp Clin Trials ; 136: 107403, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38052297

ABSTRACT

BACKGROUND: COVID-19 vaccination rates among long-term care center (LTCC) workers are among the lowest of all frontline health care workers. Current efforts to increase COVID-19 vaccine uptake generally focus on strategies that have proven effective for increasing influenza vaccine uptake among health care workers including educational and communication strategies. Experimental evidence is lacking on the comparative advantage of educational strategies to improve vaccine acceptance and uptake, especially in the context of COVID-19. Despite the lack of evidence, education and communication strategies are recommended to improve COVID-19 vaccination rates and decrease vaccine hesitancy (VH), especially strategies using tailored messaging for disproportionately affected populations. METHODS: We describe a cluster-randomized comparative effectiveness trial with 40 LTCCs and approximately 4000 LTCC workers in 2 geographically, culturally, and ethnically distinct states. We compare the effectiveness of two strategies for increasing COVID-19 booster vaccination rates and willingness to promote COVID-19 booster vaccination: co-design processes for tailoring educational messages vs. an enhanced usual care comparator. Our study focuses on the language and/or cultural groups that are most disproportionately affected by VH and low COVID-19 vaccine uptake in these LTCCs. CONCLUSION: Finding effective methods to increase COVID-19 vaccine uptake and decrease VH among LTCC staff is critical. Beyond COVID-19, better approaches are needed to improve vaccine uptake and decrease VH for a variety of existing vaccines as well as vaccines created to address novel viruses as they emerge.


Subject(s)
COVID-19 , Vaccines , Humans , COVID-19 Vaccines/therapeutic use , Long-Term Care , COVID-19/epidemiology , COVID-19/prevention & control , Vaccination
17.
J Am Med Inform Assoc ; 31(3): 574-582, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38109888

ABSTRACT

OBJECTIVES: Automated phenotyping algorithms can reduce development time and operator dependence compared to manually developed algorithms. One such approach, PheNorm, has performed well for identifying chronic health conditions, but its performance for acute conditions is largely unknown. Herein, we implement and evaluate PheNorm applied to symptomatic COVID-19 disease to investigate its potential feasibility for rapid phenotyping of acute health conditions. MATERIALS AND METHODS: PheNorm is a general-purpose automated approach to creating computable phenotype algorithms based on natural language processing, machine learning, and (low cost) silver-standard training labels. We applied PheNorm to cohorts of potential COVID-19 patients from 2 institutions and used gold-standard manual chart review data to investigate the impact on performance of alternative feature engineering options and implementing externally trained models without local retraining. RESULTS: Models at each institution achieved AUC, sensitivity, and positive predictive value of 0.853, 0.879, 0.851 and 0.804, 0.976, and 0.885, respectively, at quantiles of model-predicted risk that maximize F1. We report performance metrics for all combinations of silver labels, feature engineering options, and models trained internally versus externally. DISCUSSION: Phenotyping algorithms developed using PheNorm performed well at both institutions. Performance varied with different silver-standard labels and feature engineering options. Models developed locally at one site also worked well when implemented externally at the other site. CONCLUSION: PheNorm models successfully identified an acute health condition, symptomatic COVID-19. The simplicity of the PheNorm approach allows it to be applied at multiple study sites with substantially reduced overhead compared to traditional approaches.
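The thresholding step in the results, choosing the cutoff on model-predicted risk that maximizes F1, can be sketched directly. The scores and labels below are made up for illustration:

```python
# Sketch of choosing the classification threshold that maximizes F1 over
# model-predicted risk scores. Scores and labels below are made up.

def best_f1_threshold(scores, labels):
    """Scan observed scores as candidate thresholds; classify score >= t
    as positive; return the (threshold, F1) pair maximizing F1."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        fn = sum(1 for s, l in zip(scores, labels) if s < t and l == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]   # model-predicted risks
labels = [1, 1, 0, 1, 0, 0]               # gold-standard chart review
t_star, f1_star = best_f1_threshold(scores, labels)
```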


Subject(s)
Algorithms , COVID-19 , Humans , Electronic Health Records , Machine Learning , Natural Language Processing
18.
Nat Commun ; 15(1): 2175, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38467646

ABSTRACT

In the ENSEMBLE randomized, placebo-controlled phase 3 trial (NCT04505722), estimated single-dose Ad26.COV2.S vaccine efficacy (VE) was 56% against moderate to severe-critical COVID-19. SARS-CoV-2 Spike sequences were determined from 484 vaccine and 1,067 placebo recipients who acquired COVID-19. In this set of prespecified analyses, we show that in Latin America, VE was significantly lower against Lambda vs. Reference and against Lambda vs. non-Lambda [family-wise error rate (FWER) p < 0.05]. VE differed by residue match vs. mismatch to the vaccine-insert at 16 amino acid positions (4 FWER p < 0.05; 12 q-value ≤ 0.20); significantly decreased with physicochemical-weighted Hamming distance to the vaccine-strain sequence for Spike, receptor-binding domain, N-terminal domain, and S1 (FWER p < 0.001); differed (FWER ≤ 0.05) by distance to the vaccine strain measured by 9 antibody-epitope escape scores and 4 NTD neutralization-impacting features; and decreased (p = 0.011) with neutralization resistance level to vaccinee sera. VE against severe-critical COVID-19 was stable across most sequence features but lower against the most distant viruses.


Subject(s)
Ad26COVS1 , COVID-19 , Humans , COVID-19/prevention & control , SARS-CoV-2 , Vaccine Efficacy , Amino Acids , Antibodies, Viral , Antibodies, Neutralizing
19.
J Am Stat Assoc ; 118(543): 1645-1658, 2023.
Article in English | MEDLINE | ID: mdl-37982008

ABSTRACT

In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response - in other words, to gauge the variable importance of features. Most recent work on variable importance assessment has focused on describing the importance of features within the confines of a given prediction algorithm. However, such assessment does not necessarily characterize the prediction potential of features, and may provide a misleading reflection of the intrinsic value of these features. To address this limitation, we propose a general framework for nonparametric inference on interpretable algorithm-agnostic variable importance. We define variable importance as a population-level contrast between the oracle predictiveness of all available features versus all features except those under consideration. We propose a nonparametric efficient estimation procedure that allows the construction of valid confidence intervals, even when machine learning techniques are used. We also outline a valid strategy for testing the null importance hypothesis. Through simulations, we show that our proposal has good operating characteristics, and we illustrate its use with data from a study of an antibody against HIV-1 infection.

20.
bioRxiv ; 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38168308

ABSTRACT

Combination monoclonal broadly neutralizing antibodies (bnAbs) are currently being developed for preventing HIV-1 infection. Recent work has focused on predicting in vitro neutralization potency of both individual bnAbs and combination regimens against HIV-1 pseudoviruses using Env sequence features. To predict in vitro combination regimen neutralization potency against a given HIV-1 pseudovirus, previous approaches have applied mathematical models to combine individual-bnAb neutralization and have predicted this combined neutralization value; we call this the combine-then-predict (CP) approach. However, prediction performance for some individual bnAbs has exceeded that for the combination, leading to another possibility: combining the individual-bnAb predicted values and using these to predict combination regimen neutralization; we call this the predict-then-combine (PC) approach. We explore both approaches in both simulated data and data from the Los Alamos National Laboratory's Compile, Neutralize, and Tally NAb Panels repository. The CP approach is superior to the PC approach when the neutralization outcome of interest is binary (e.g., neutralization susceptibility, defined as inhibitory concentration < 1 µg/mL). For continuous outcomes, the CP approach performs at least as well as the PC approach, and is superior to the PC approach when the individual-bnAb prediction algorithms have poor performance. This knowledge may be used when building prediction models for novel antibody combinations in the absence of in vitro neutralization data for the antibody combination; this, in turn, will aid in the evaluation and down-selection of these antibody combinations into prevention efficacy trials.
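The two framings can be sketched with the additive titer-combination model commonly used for bnAb cocktails, 1/IC80_combo = sum of the individual 1/IC80 values. Whether this is the exact combination model used in the paper is an assumption here, and all titers and predictions below are hypothetical:

```python
# Sketch of the combine-then-predict (CP) vs. predict-then-combine (PC)
# framings for a two-bnAb cocktail, under the additive combination model
# 1/IC80_combo = 1/IC80_a + 1/IC80_b (an assumption of this sketch).

def combine_ic80(ic_a, ic_b):
    """Additive (Bliss-Hill-style) combination of individual IC80 titers;
    the combination is always at least as potent as either bnAb alone."""
    return 1.0 / (1.0 / ic_a + 1.0 / ic_b)

# CP: combine the measured individual titers into a single combination
# label, then train one model to predict that label from Env sequences.
cp_label = combine_ic80(2.0, 0.5)

# PC: predict each bnAb's IC80 separately from Env sequences, then
# combine the two predicted values with the same model.
pred_a, pred_b = 2.5, 0.4            # hypothetical individual predictions
pc_estimate = combine_ic80(pred_a, pred_b)
```

Under CP, errors enter once, in predicting the combined label; under PC, errors from each individual-bnAb model propagate through the combination formula, which is consistent with PC degrading when the individual predictors perform poorly.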
