ABSTRACT
The current approach to using machine learning (ML) algorithms in healthcare is to either require clinician oversight for every use case or use their predictions without any human oversight. We explore a middle ground that lets ML algorithms abstain from making a prediction to simultaneously improve their reliability and reduce the burden placed on human experts. To this end, we present a general penalized loss minimization framework for training selective prediction-set (SPS) models, which choose to either output a prediction set or abstain. The resulting models abstain when the outcome is difficult to predict accurately, such as on subjects who are too different from the training data, and achieve higher accuracy on those they do give predictions for. We then introduce a model-agnostic, statistical inference procedure for the coverage rate of an SPS model that ensembles individual models trained using K-fold cross-validation. We find that SPS ensembles attain prediction-set coverage rates closer to the nominal level and have narrower confidence intervals for their marginal coverage rates. We apply our method to train neural networks that abstain more for out-of-sample images on the MNIST digit prediction task and achieve higher predictive accuracy for ICU patients compared to existing approaches.
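To make the "output a prediction set or abstain" decision concrete, here is a minimal sketch. It assumes a simple post-hoc rule applied to predicted class probabilities, not the penalized loss minimization framework the abstract describes; the function name, the `coverage` target, and the `max_set_size` cutoff are all hypothetical.

```python
from typing import List, Optional, Sequence


def selective_prediction_set(probs: Sequence[float],
                             coverage: float = 0.9,
                             max_set_size: int = 2) -> Optional[List[int]]:
    """Return a prediction set of class indices, or None to abstain.

    Illustrative rule only (an assumption, not the paper's training method).
    """
    # Rank classes from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda k: probs[k], reverse=True)
    chosen: List[int] = []
    mass = 0.0
    for k in ranked:
        chosen.append(k)
        mass += probs[k]
        if mass >= coverage:
            break
    # Abstain when the set needed to reach the coverage target is too wide
    # to be useful, i.e. the outcome is hard to predict for this subject.
    if len(chosen) > max_set_size:
        return None
    return sorted(chosen)


confident = selective_prediction_set([0.02, 0.95, 0.03])
ambiguous = selective_prediction_set([0.30, 0.25, 0.25, 0.20])
```

With these toy inputs the first call returns a single-class set and the second abstains, which mirrors the behaviour the abstract describes for subjects that are easy or hard to predict.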
Subjects
Machine Learning , Neural Networks, Computer , Humans , Reproducibility of Results , Algorithms , Research Design
ABSTRACT
Databases derived from electronic health records (EHRs) are commonly subject to left truncation, a type of selection bias that occurs when patients need to survive long enough to satisfy certain entry criteria. Standard methods to adjust for left truncation bias rely on an assumption of marginal independence between entry and survival times, which may not always be satisfied in practice. In this work, we examine how a weaker assumption of conditional independence can result in unbiased estimation of common statistical parameters. In particular, we show the estimability of conditional parameters in a truncated dataset, and of marginal parameters that leverage reference data containing non-truncated data on confounders. The latter is complementary to observational causal inference methodology applied to real-world external comparators, which is a common use case for real-world databases. We implement our proposed methods in simulation studies, demonstrating unbiased estimation and valid statistical inference. We also illustrate estimation of a survival distribution under conditionally independent left truncation in a real-world clinico-genomic database.
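For orientation, the standard adjustment mentioned above can be sketched as a product-limit (Kaplan-Meier) estimator whose risk set only counts patients after their entry time. This is the conventional estimator that relies on independence between entry and survival times, not the conditional-independence methods proposed in the abstract; the function name and the toy records are hypothetical.

```python
from typing import Dict, List, Tuple


def km_left_truncated(records: List[Tuple[float, float, bool]]) -> Dict[float, float]:
    """Product-limit survival estimate with delayed entry.

    Each record is (entry_time, exit_time, event_observed), with entry < exit.
    """
    event_times = sorted({exit_t for _, exit_t, event in records if event})
    surv = 1.0
    curve: Dict[float, float] = {}
    for t in event_times:
        # At risk at t: already entered the cohort and not yet exited.
        at_risk = sum(1 for entry, exit_t, _ in records if entry < t <= exit_t)
        deaths = sum(1 for _, exit_t, event in records if event and exit_t == t)
        surv *= 1.0 - deaths / at_risk
        curve[t] = surv
    return curve


# Toy data: the third patient enters late (t = 2.5) and is censored at t = 4.
toy = [(0.0, 2.0, True), (1.0, 3.0, True), (2.5, 4.0, False), (0.5, 5.0, True)]
curve = km_left_truncated(toy)
```

Ignoring entry times would place the late entrant in the risk set at t = 2, before that patient was observable, which is the source of the left truncation bias.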
Subjects
Models, Statistical , Bias , Causality , Computer Simulation , Humans , Survival Analysis
ABSTRACT
BACKGROUND AND AIMS: EUS-directed transgastric ERCP (the EDGE procedure) is a simplified method of performing ERCP in Roux-en-Y gastric bypass patients. The EDGE procedure involves placement of a lumen-apposing metal stent (LAMS) into the excluded stomach to serve as a conduit for passage of the duodenoscope for pancreatobiliary intervention. Although EDGE was originally a multistep process, urgent indications for ERCP have led to the development of single-session EDGE (SS-EDGE), with LAMS placement and ERCP performed in the same session. The goal of this study was to identify predictive factors of intraprocedural LAMS migration in SS-EDGE. METHODS: We conducted a multicenter retrospective review that included 9 tertiary medical centers across the United States. Data were collected and analyzed from 128 SS-EDGE procedures. The primary outcome was intraprocedural LAMS migration. Secondary outcomes were other procedural adverse events such as bleeding and perforation. RESULTS: Eleven LAMS migrations were observed in 128 procedures (8.6%). Univariate analysis of clinically relevant variables was performed, as was a binary logistic regression analysis of stent diameter and stent dilation. This revealed that use of a smaller (15 mm) diameter LAMS was an independent predictor of intraprocedural stent migration (odds ratio, 5.36; 95% confidence interval, 1.29-22.24; P = .021). Adverse events included 3 patients who required surgery and 2 who experienced intraprocedural bleeding. CONCLUSIONS: Use of a larger-diameter LAMS is a predictive factor for a nonmigrated stent and improved procedural success in SS-EDGE. Although larger patient cohorts are needed to adequately assess these findings, performance of LAMS dilation and fixation may also decrease risk of intraprocedural LAMS migration and improve procedural success.
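As a reading aid, the reported odds ratio, confidence interval, and P value can be checked against one another. The sketch below assumes the interval is a Wald interval on the log-odds scale, which the abstract does not state; the helper name is hypothetical.

```python
import math


def wald_p_from_ci(estimate: float, lower: float, upper: float) -> float:
    """Two-sided Wald p-value implied by a ratio estimate and its 95% CI.

    Assumes the CI was built as exp(log(estimate) +/- 1.96 * SE).
    """
    z_crit = 1.959964  # 97.5th percentile of the standard normal
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided tail area of the standard normal: erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


p = wald_p_from_ci(5.36, 1.29, 22.24)
```

Under that assumption the implied p-value rounds to the reported .021, so the three reported quantities are mutually consistent.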
Subjects
Cholangiopancreatography, Endoscopic Retrograde , Gastric Bypass , Cholangiopancreatography, Endoscopic Retrograde/adverse effects , Gastric Bypass/adverse effects , Humans , Retrospective Studies , Stents , Stomach/surgery
ABSTRACT
BACKGROUND: Statistical inference based on small datasets, commonly found in precision oncology, is subject to low power and high uncertainty. In these settings, drawing strong conclusions about future research utility is difficult when using standard inferential measures. It is therefore important to better quantify the uncertainty associated with both significant and non-significant results based on small sample sizes. METHODS: We developed a new method, Bayesian Additional Evidence (BAE), that determines (1) how much additional supportive evidence is needed for a non-significant result to reach Bayesian posterior credibility, or (2) how much additional opposing evidence is needed to render a significant result non-credible. Although based in Bayesian analysis, a prior distribution is not needed; instead, the tipping point output is compared to reasonable effect ranges to draw conclusions. We demonstrate our approach in a comparative effectiveness analysis comparing two treatments in a real-world biomarker-defined cohort, and provide guidelines for how to apply BAE in practice. RESULTS: Our initial comparative effectiveness analysis results in a hazard ratio of 0.31 with 95% confidence interval (0.09, 1.1). Applying BAE to this result yields a tipping point of 0.54; thus, an observed hazard ratio of 0.54 or smaller in a replication study would result in posterior credibility for the treatment association. Given that effect sizes in this range are not extreme, and that supportive evidence exists from a similar published study, we conclude that this problem is worthy of further research. CONCLUSIONS: Our proposed method provides a useful framework for interpreting analytic results from small datasets. This can assist researchers in deciding how to interpret and continue their investigations based on an initial analysis that has high uncertainty.
Although we illustrated its use in estimating parameters based on time-to-event outcomes, BAE easily applies to any normally-distributed estimator, such as those used for analyzing binary or continuous outcomes.
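The reported tipping point can be reproduced with a short calculation. The sketch below is one reading of the method, assuming the additional evidence is normally distributed on the log hazard ratio scale with the same variance as the observed estimate; the closed form and the function name are assumptions for illustration, not quoted from the source.

```python
import math


def bae_tipping_point(ci_lower: float, ci_upper: float) -> float:
    """Hazard ratio a replicate would need so the pooled 95% interval excludes 1.

    Intended for a non-significant protective estimate (interval spans 1).
    Assumes equal variances for the observed and the additional evidence.
    """
    lo, hi = math.log(ci_lower), math.log(ci_upper)
    # Equal-variance normal pooling: the pooled mean is the average of the two
    # log hazard ratios and the pooled standard error shrinks by sqrt(2).
    # Setting the pooled upper bound to 0 and solving gives:
    log_tp = -(lo + hi) / 2 - (hi - lo) / math.sqrt(2)
    return math.exp(log_tp)


tipping_point = bae_tipping_point(0.09, 1.1)
```

With the reported interval (0.09, 1.1) this evaluates to roughly 0.54, in line with the tipping point stated in the RESULTS.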
Subjects
Neoplasms , Bayes Theorem , Humans , Precision Medicine , Sample Size , Uncertainty
ABSTRACT
In large-scale genetic studies, a primary aim is to test for an association between genetic variants and a disease outcome. The variants of interest are often rare and appear with low frequency among subjects. In this situation, statistical tests based on standard asymptotic results do not adequately control the type I error rate, especially if the case:control ratio is unbalanced. In this article, we propose the use of permutation and approximate unconditional tests for testing association with rare variants. We use novel analytical calculations to efficiently approximate the true type I error rate under common study designs, and in numerical studies show that the proposed classes of tests significantly improve upon standard testing methods. We also illustrate our methods in data from a recent case-control study for genetic causes of a severe side effect of a common drug treatment.
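A permutation test of this kind can be sketched in a few lines. The example below assumes a one-sided test with a simple statistic (the number of variant carriers among cases) and Monte Carlo sampling of permutations; the abstract's specific test statistics and analytical approximations are not reproduced, and the toy data and function name are hypothetical.

```python
import random
from typing import List


def permutation_p_value(carrier: List[int], case: List[int],
                        n_perm: int = 2000, seed: int = 0) -> float:
    """One-sided Monte Carlo permutation p-value for carrier enrichment in cases."""
    rng = random.Random(seed)
    observed = sum(c * y for c, y in zip(carrier, case))  # carriers among cases
    labels = list(case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between genotype and outcome
        stat = sum(c * y for c, y in zip(carrier, labels))
        hits += stat >= observed
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data with an unbalanced design: 10 cases, 190 controls;
# 4 carriers in total, 3 of them among the cases.
case = [1] * 10 + [0] * 190
carrier = [1, 1, 1] + [0] * 7 + [1] + [0] * 189
p_value = permutation_p_value(carrier, case)
```

Because the reference distribution is built from the data themselves, the test does not lean on the large-sample approximations that break down for rare variants and unbalanced case:control ratios.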
Subjects
Genetic Association Studies , Models, Genetic , Case-Control Studies , Humans , Hydroxymethylglutaryl-CoA Reductase Inhibitors/adverse effects , Likelihood Functions , Models, Statistical , Rhabdomyolysis/chemically induced , Rhabdomyolysis/genetics
Subjects
Abdominal Pain/etiology , Bile Ducts/pathology , Chemical and Drug Induced Liver Injury/diagnosis , Chemical and Drug Induced Liver Injury/pathology , Mitragyna/adverse effects , Adult , Analgesics/adverse effects , Chemical and Drug Induced Liver Injury/etiology , Humans , Male
Subjects
Cachexia/immunology , Colitis, Ulcerative/diagnosis , Neoplasms/immunology , Primary Immunodeficiency Diseases/diagnosis , Wasting Syndrome/immunology , Candidiasis, Chronic Mucocutaneous/diagnosis , Candidiasis, Chronic Mucocutaneous/immunology , Candidiasis, Chronic Mucocutaneous/microbiology , Class Ia Phosphatidylinositol 3-Kinase/genetics , Colitis, Ulcerative/immunology , Cytomegalovirus/immunology , Cytomegalovirus/isolation & purification , Cytomegalovirus Infections/diagnosis , Cytomegalovirus Infections/immunology , Cytomegalovirus Infections/virology , Diarrhea/diagnosis , Diarrhea/immunology , Esophagitis/diagnosis , Esophagitis/immunology , Esophagitis/microbiology , Female , Humans , Middle Aged , Mutation , Neoplasms/complications , Primary Immunodeficiency Diseases/genetics , Primary Immunodeficiency Diseases/immunology , Viremia/diagnosis , Viremia/immunology , Viremia/virology , Wasting Syndrome/diagnosis , Wasting Syndrome/virology
Subjects
Diarrhea/immunology , Gastrointestinal Agents/therapeutic use , Polyendocrinopathies, Autoimmune/immunology , Weight Loss , Biopsy , Diarrhea/diagnosis , Diarrhea/drug therapy , Endoscopy, Gastrointestinal , Female , Glucocorticoids/therapeutic use , Humans , Immunity, Humoral , Infliximab/therapeutic use , Intestinal Mucosa/diagnostic imaging , Intestinal Mucosa/immunology , Intestinal Mucosa/pathology , Middle Aged , Polyendocrinopathies, Autoimmune/diagnosis , Polyendocrinopathies, Autoimmune/drug therapy , Treatment Outcome
Subjects
Cholestasis, Intrahepatic/etiology , Photosensitivity Disorders/etiology , Protoporphyria, Erythropoietic/complications , Rare Diseases , Aged , Biopsy , Cholestasis, Intrahepatic/diagnosis , Cholestasis, Intrahepatic/therapy , Female , Hospice Care , Humans , Liver/pathology , Photosensitivity Disorders/diagnosis , Plasma Exchange , Protoporphyria, Erythropoietic/diagnosis , Protoporphyria, Erythropoietic/therapy , Skin/pathology
Subjects
Bile Duct Diseases , Endoscopes , Gallbladder , Humans , Prospective Studies
Subjects
Arthritis/etiology , Diarrhea/etiology , Duodenum/pathology , Tropheryma/isolation & purification , Whipple Disease/diagnosis , Anti-Bacterial Agents/therapeutic use , Ceftriaxone/therapeutic use , Diagnosis, Differential , Endoscopy, Digestive System , Fever/etiology , Humans , Male , Middle Aged , Weight Loss , Whipple Disease/complications , Whipple Disease/drug therapy
Subjects
Abdominal Pain/etiology , Intestine, Small/diagnostic imaging , Lupus Nephritis/complications , Lupus Nephritis/diagnostic imaging , Tomography, X-Ray Computed , Female , Humans , Lupus Nephritis/therapy , Predictive Value of Tests , Recurrence , Treatment Outcome , Young Adult
ABSTRACT
BACKGROUND: Certain populations have been historically underrepresented in clinical trials. Broadening eligibility criteria is one approach to inclusive clinical research and achieving enrollment goals. How broadened trial eligibility criteria affect the diversity of eligible participants is unknown. METHODS: Using a nationwide electronic health record-derived deidentified database, we identified a retrospective cohort of patients diagnosed with 22 cancer types between April 1, 2013 and December 31, 2022 who received systemic therapy (N=235,234) for cancer. We evaluated strict versus broadened eligibility criteria using performance status and liver, kidney, and hematologic function around first line of therapy. We performed logistic regression to estimate odds ratios for exclusion by strict criteria and their association with measures of patient diversity, including sex, age, race or ethnicity, and area-level socioeconomic status (SES); estimated the impact of broadening criteria on the number and distribution of eligible patients; and performed Cox regression to estimate hazard ratios for real-world overall survival (rwOS) comparing patients meeting strict versus broadened criteria. RESULTS: When applying common strict cutoffs for eligibility criteria to patients with complete data and weighting each cancer type equally, 48% of patients were eligible for clinical trials. Female (odds ratio, 1.30; 95% confidence interval [CI], 1.25 to 1.35), older (age 75+ vs. 18 to 49 years old: odds ratio, 3.04; 95% CI, 2.85 to 3.24), Latinx (odds ratio, 1.46; 95% CI, 1.39 to 1.54), non-Latinx Black (odds ratio, 1.11; 95% CI, 1.06 to 1.16), and lower-SES patients were more likely to be excluded using strict eligibility criteria. Broadening criteria increased the number of eligible patients by 78%, with the strongest impact for older, female, non-Latinx Black, and lower-SES patients. 
Patients who met only broadened criteria had worse rwOS versus those who met strict criteria (hazard ratio, 1.31; 95% CI, 1.27 to 1.34). CONCLUSIONS: Data-driven evaluation of clinical trial eligibility criteria may optimize the eligibility of certain historically underrepresented groups and promote access to more inclusive trials. (Sponsored by Flatiron Health.)
Subjects
Clinical Trials as Topic , Eligibility Determination , Neoplasms , Patient Selection , Humans , Female , Neoplasms/therapy , Neoplasms/ethnology , Neoplasms/mortality , Male , Retrospective Studies , Middle Aged , Aged , Adult , Adolescent , Young Adult
ABSTRACT
PURPOSE: Real-world data (RWD) derived from electronic health records (EHRs) are often used to understand population-level relationships between patient characteristics and cancer outcomes. Machine learning (ML) methods enable researchers to extract characteristics from unstructured clinical notes, and represent a more cost-effective and scalable approach than manual expert abstraction. These extracted data are then used in epidemiologic or statistical models as if they were abstracted observations. Analytical results derived from extracted data in this way may differ from those given by abstracted data, and the magnitude of this difference is not directly informed by standard ML performance metrics. METHODS: In this paper, we define the task of postprediction inference, which is to recover from an ML-extracted variable estimation and inference similar to what would be obtained by abstracting the variable. We consider fitting a Cox proportional hazards model that uses a binary ML-extracted variable as a covariate and evaluate four approaches for postprediction inference in this setting. The first two approaches only require the ML-predicted probability, while the latter two additionally require a labeled (human abstracted) validation data set. RESULTS: Our results for both simulated data and EHR-derived RWD from a national cohort demonstrate that we can improve inference from ML-extracted variables by leveraging a limited amount of labeled data. CONCLUSION: We describe and evaluate methods for fitting statistical models using ML-extracted variables subject to model error. We show that estimation and inference are generally valid when using extracted data from high-performing ML models. More complex methods that incorporate auxiliary labeled data provide further improvements.
Subjects
Benchmarking , Electronic Health Records , Humans , Machine Learning , Models, Statistical , Research Personnel
ABSTRACT
Real-world data derived from electronic health records often exhibit high levels of missingness in variables, such as laboratory results, presenting a challenge for statistical analyses. We developed a systematic workflow for gathering evidence of different missingness mechanisms and performing subsequent statistical analyses. We quantify evidence for missing completely at random (MCAR) or missing at random (MAR) mechanisms using Hotelling's multivariate t-test and random forest classifiers, respectively. We further illustrate how to apply sensitivity analyses using the not-at-random fully conditional specification procedure to examine changes in parameter estimates under missing not at random (MNAR) mechanisms. In simulation studies, we validated these diagnostics and compared analytic bias under different mechanisms. To demonstrate the application of this workflow, we applied it to two exemplary case studies with an advanced non-small cell lung cancer and a multiple myeloma cohort derived from a real-world oncology database. Here, we found strong evidence against MCAR, and some evidence of MAR, implying that imputation approaches that attempt to predict missing values by fitting a model to observed data may be suitable for use. Sensitivity analyses did not suggest meaningful departures of our analytic results under potential MNAR mechanisms; these results were also in line with results reported in clinical trials.
Subjects
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Multiple Myeloma , Humans , Electronic Health Records , Computer Simulation , Models, Statistical
ABSTRACT
Meaningful real-world evidence (RWE) generation requires unstructured data found in electronic health records (EHRs), which are often missing from administrative claims; however, obtaining relevant data from unstructured EHR sources is resource-intensive. In response, researchers are using natural language processing (NLP) with machine learning (ML) techniques (i.e., ML extraction) to extract real-world data (RWD) at scale. This study assessed the quality and fitness-for-use of EHR-derived oncology data curated using NLP with ML as compared to the reference standard of expert abstraction. Using a sample of 186,313 patients with lung cancer from a nationwide EHR-derived de-identified database, we performed a series of replication analyses demonstrating some common analyses conducted in retrospective observational research with complex EHR-derived data to generate evidence. Eligible patients were selected into biomarker- and treatment-defined cohorts, first with expert-abstracted then with ML-extracted data. We utilized the biomarker- and treatment-defined cohorts to perform analyses related to biomarker-associated survival and treatment comparative effectiveness, respectively. Across all analyses, the results differed by less than 8% between the data curation methods, and similar conclusions were reached. These results highlight that high-performance ML-extracted variables trained on expert-abstracted data can achieve results similar to those obtained with abstracted data, unlocking the ability to perform oncology research at scale.
ABSTRACT
Video 1. EUS-guided hepaticogastrostomy in a pregnant patient with Roux-en-Y hepaticojejunostomy anatomy.
ABSTRACT
Background and study aims Various techniques have been described for flexible endoscopic therapy for Zenker's diverticulum (ZD). Objective methods to assess myotomy effectiveness are lacking. We assessed the utility of impedance planimetry in flexible endoscopic ZD therapies and correlation with a validated symptom score. Patients and methods Patients undergoing endoscopic therapy for symptomatic ZD from February 2019 to March 2020 were included. Intraprocedural impedance planimetry was performed pre- and post-myotomy to assess esophageal diameter and distensibility index (DI). Eating Assessment Tool (EAT)-10 scores were assessed preintervention and post-intervention. Descriptive statistics were calculated. Results Thirteen patients (46% women; mean age 80 years; 77% peroral endoscopic myotomy technique) were included. Technical and clinical success was 100%. No adverse events occurred. At 40 mL and 50 mL, the diameter improved (mean 2.3 mm and 2.6 mm, respectively). At 40 mL and 50 mL, the DI improved (mean 1.0 mm²/mmHg and 1.8 mm²/mmHg, respectively). EAT-10 scores improved by a mean of 15 points. Mean follow-up was 97 days. Conclusions Intraprocedural impedance planimetry may provide objective data to define success for flexible endoscopic ZD. Further research is required to corroborate these results.