Results 1 - 20 of 41
1.
NEJM AI ; 1(6), 2024 Jun.
Article in English | MEDLINE | ID: mdl-38872809

ABSTRACT

BACKGROUND: In intensive care units (ICUs), critically ill patients are monitored with electroencephalography (EEG) to prevent serious brain injury. EEG monitoring is constrained by clinician availability, and EEG interpretation can be subjective and prone to interobserver variability. Automated deep-learning systems for EEG could reduce human bias and accelerate the diagnostic process. However, existing uninterpretable (black-box) deep-learning models are untrustworthy, difficult to troubleshoot, and lack accountability in real-world applications, leading to a lack of both trust and adoption by clinicians. METHODS: We developed an interpretable deep-learning system that accurately classifies six patterns of potentially harmful EEG activity - seizure, lateralized periodic discharges (LPDs), generalized periodic discharges (GPDs), lateralized rhythmic delta activity (LRDA), generalized rhythmic delta activity (GRDA), and other patterns - while providing faithful case-based explanations of its predictions. The model was trained on a total of 50,697 50-second continuous EEG samples collected from 2711 patients in the ICU between July 2006 and March 2020 at Massachusetts General Hospital. EEG samples were labeled as one of the six EEG patterns by 124 domain experts and trained annotators. To evaluate the model, we asked eight medical professionals with relevant backgrounds to classify 100 EEG samples into the six pattern categories - once with and once without artificial intelligence (AI) assistance - and we assessed the assistive power of this interpretable system by comparing the diagnostic accuracy of the two methods. The model's discriminatory performance was evaluated with area under the receiver-operating characteristic curve (AUROC) and area under the precision-recall curve. The model's interpretability was measured with task-specific neighborhood agreement statistics that interrogated the similarities of samples and features. In a separate analysis, the latent space of the neural network was visualized by using dimension reduction techniques to examine whether the ictal-interictal injury continuum hypothesis, which asserts that seizures and seizure-like patterns of brain activity lie along a spectrum, is supported by data. RESULTS: The performance of all users significantly improved when provided with AI assistance. Mean user diagnostic accuracy improved from 47% to 71% (P<0.04). The model achieved AUROCs of 0.87, 0.93, 0.96, 0.92, 0.93, and 0.80 for the classes seizure, LPD, GPD, LRDA, GRDA, and other patterns, respectively. This performance was significantly higher than that of a corresponding uninterpretable black-box model (with P<0.0001). Videos traversing the ictal-interictal injury manifold from dimension reduction (a two-dimensional representation of the original high-dimensional feature space) give insight into the layout of EEG patterns within the network's latent space and illuminate relationships between EEG patterns that were previously hypothesized but had not yet been shown explicitly. These results indicate that the ictal-interictal injury continuum hypothesis is supported by data. CONCLUSIONS: Users showed significant pattern classification accuracy improvement with the assistance of this interpretable deep-learning model. The interpretable design facilitates effective human-AI collaboration; this system may improve diagnosis and patient care in clinical settings. 
The model may also provide a better understanding of how EEG patterns relate to each other along the ictal-interictal injury continuum. (Funded by the National Science Foundation, National Institutes of Health, and others.)
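The case-based explanations suggest a simple mental model: retrieve the training EEG samples nearest to a query in the network's latent space and present them as precedents for the prediction. A minimal sketch of that retrieval step follows; the function name, embedding shapes, and toy data are invented for illustration and are not the paper's code.

```python
# Hypothetical sketch of case-based explanation by nearest neighbors in a
# model's latent space; names and shapes are illustrative only.
import numpy as np

def nearest_case_explanations(query_emb, train_embs, train_labels, k=3):
    """Return the k training cases closest to the query in latent space."""
    dists = np.linalg.norm(train_embs - query_emb, axis=1)
    idx = np.argsort(dists)[:k]
    return [(int(i), train_labels[i], float(dists[i])) for i in idx]

# Toy usage: 500 training samples embedded in a 64-d latent space.
rng = np.random.default_rng(0)
train_embs = rng.normal(size=(500, 64))
train_labels = rng.choice(["Seizure", "LPD", "GPD", "LRDA", "GRDA", "Other"], 500)
query = rng.normal(size=64)
print(nearest_case_explanations(query, train_embs, train_labels))
```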

2.
Article in English | MEDLINE | ID: mdl-38867375

ABSTRACT

BACKGROUND/OBJECTIVES: Epileptiform activity (EA), including seizures and periodic patterns, worsens outcomes in patients with acute brain injuries (e.g., aneurysmal subarachnoid hemorrhage [aSAH]). Randomized controlled trials (RCTs) assessing anti-seizure interventions are needed, but owing to scant drug efficacy data, ethical reservations about placebo use, and the complex physiology of acute brain injury, such trials are lacking or hindered by design constraints. We used a pharmacological model-guided simulator to design RCTs evaluating EA treatment and to determine their feasibility. METHODS: In a single-center cohort of adults (age >18) with aSAH and EA, we employed a mechanistic pharmacokinetic-pharmacodynamic framework to model treatment response using observational data. We subsequently simulated RCTs for levetiracetam and propofol, each with three treatment arms mirroring clinical practice and an additional placebo arm. Using our framework, we simulated EA trajectories across treatment arms. We predicted discharge modified Rankin Scale as a function of baseline covariates, EA burden, and drug doses using a double machine learning model learned from observational data. Differences in outcomes across arms were used to estimate the required sample size. RESULTS: Sample sizes ranged from 500 for levetiracetam 7 mg/kg versus placebo to >4000 for levetiracetam 15 mg/kg versus 7 mg/kg to achieve 80% power (5% type I error). For propofol 1 mg/kg/h versus placebo, 1200 participants were needed. Simulations comparing propofol at varying doses did not reach 80% power even at sample sizes >1200. CONCLUSIONS: Our simulations, using drug efficacy estimated from observational data, show that the required sample sizes are infeasible, even for potentially unethical placebo-controlled trials. We highlight the strength of simulations with observational data to inform null hypotheses and propose this simulation-based RCT paradigm for assessing the feasibility of future trials of anti-seizure treatment in acute brain injury.
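The sample-size estimates above come from simulation. As a rough illustration of simulation-based power analysis for a dichotomized outcome, the sketch below estimates power by Monte Carlo for a two-arm trial; all outcome rates and effect sizes are invented, not taken from the study.

```python
# A minimal Monte Carlo sketch of simulation-based power analysis for a
# two-arm trial with a binary (dichotomized mRS-style) outcome.
# Rates p_ctrl/p_trt are illustrative assumptions.
import numpy as np
from scipy.stats import norm

def two_prop_z(p1, n1, p2, n2):
    p = (p1 * n1 + p2 * n2) / (n1 + n2)       # pooled proportion
    se = np.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def power(n_per_arm, p_ctrl=0.50, p_trt=0.42, sims=2000, alpha=0.05, seed=1):
    rng = np.random.default_rng(seed)
    crit = norm.ppf(1 - alpha / 2)
    hits = 0
    for _ in range(sims):
        ctrl = rng.binomial(n_per_arm, p_ctrl) / n_per_arm
        trt = rng.binomial(n_per_arm, p_trt) / n_per_arm
        hits += abs(two_prop_z(ctrl, n_per_arm, trt, n_per_arm)) > crit
    return hits / sims

for n in (250, 500, 1000):
    print(n, power(n))
```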

3.
Article in English | MEDLINE | ID: mdl-38902848

ABSTRACT

Despite the success of antiretroviral therapy, human immunodeficiency virus (HIV) cannot be cured because of a reservoir of latently infected cells that evades therapy. To understand the mechanisms of HIV latency, we employed an integrated single-cell RNA sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq) approach to simultaneously profile the transcriptomic and epigenomic characteristics of ~125,000 latently infected primary CD4+ T cells after reactivation using three different latency-reversing agents. Differentially expressed genes and differentially accessible motifs were used to examine transcriptional pathways and transcription factor (TF) activities across the cell population. We identified cellular transcripts and TFs whose expression/activity was correlated with viral reactivation and demonstrated that a machine learning model trained on these data was 75%-79% accurate at predicting viral reactivation. Finally, we validated the role of two candidate HIV-regulating factors, FOXP1 and GATA3, in viral transcription. These data demonstrate the power of integrated multimodal single-cell analysis to uncover novel relationships between host cell factors and HIV latency.


Subject(s)
CD4-Positive T-Lymphocytes , GATA3 Transcription Factor , HIV-1 , Single-Cell Analysis , Virus Activation , Virus Latency , Virus Latency/genetics , Humans , Virus Activation/genetics , Single-Cell Analysis/methods , HIV-1/genetics , HIV-1/physiology , CD4-Positive T-Lymphocytes/virology , CD4-Positive T-Lymphocytes/metabolism , GATA3 Transcription Factor/metabolism , GATA3 Transcription Factor/genetics , Forkhead Transcription Factors/metabolism , Forkhead Transcription Factors/genetics , HIV Infections/virology , HIV Infections/genetics , HIV Infections/metabolism , Repressor Proteins/metabolism , Repressor Proteins/genetics , Transcriptome/genetics , Gene Expression Regulation, Viral
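As a schematic of the prediction task this abstract reports (a classifier mapping per-cell features to reactivation status), the sketch below trains a linear classifier on synthetic data; the feature matrix, labels, and accuracy are invented stand-ins for the scRNA-seq/scATAC-seq pipeline.

```python
# Illustrative-only sketch: predict viral reactivation from per-cell
# features. All data here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 200))                      # e.g., gene/motif features
w = rng.normal(size=200) * (rng.random(200) < 0.05)   # a few informative features
y = ((X @ w + rng.normal(size=5000)) > 0).astype(int) # reactivated vs. latent

clf = LogisticRegression(max_iter=1000)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```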
4.
Radiology ; 310(3): e232780, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38501952

ABSTRACT

Background Mirai, a state-of-the-art deep learning-based algorithm for predicting short-term breast cancer risk, outperforms standard clinical risk models. However, Mirai is a black box, risking overreliance on the algorithm and incorrect diagnoses. Purpose To identify whether bilateral dissimilarity underpins Mirai's reasoning process; create a simplified, intelligible model, AsymMirai, using bilateral dissimilarity; and determine if AsymMirai may approximate Mirai's performance in 1-5-year breast cancer risk prediction. Materials and Methods This retrospective study involved mammograms obtained from patients in the EMory BrEast imaging Dataset, known as EMBED, from January 2013 to December 2020. To approximate 1-5-year breast cancer risk predictions from Mirai, another deep learning-based model, AsymMirai, was built with an interpretable module: local bilateral dissimilarity (localized differences between left and right breast tissue). Pearson correlation coefficients were computed between the risk scores of Mirai and those of AsymMirai. Subgroup analysis was performed in patients for whom AsymMirai's year-over-year reasoning was consistent. AsymMirai and Mirai risk scores were compared using the area under the receiver operating characteristic curve (AUC), and 95% CIs were calculated using the DeLong method. Results Screening mammograms (n = 210 067) from 81 824 patients (mean age, 59.4 years ± 11.4 [SD]) were included in the study. Deep learning-extracted bilateral dissimilarity produced risk scores similar to those of Mirai (1-year risk prediction, r = 0.6832; 4-5-year prediction, r = 0.6988) and achieved performance similar to that of Mirai. For AsymMirai, the 1-year breast cancer risk AUC was 0.79 (95% CI: 0.73, 0.85) (Mirai, 0.84; 95% CI: 0.79, 0.89; P = .002), and the 5-year risk AUC was 0.66 (95% CI: 0.63, 0.69) (Mirai, 0.71; 95% CI: 0.68, 0.74; P < .001). In a subgroup of 183 patients for whom AsymMirai repeatedly highlighted the same tissue over time, AsymMirai achieved a 3-year AUC of 0.92 (95% CI: 0.86, 0.97). Conclusion Localized bilateral dissimilarity, an imaging marker for breast cancer risk, approximated the predictive power of Mirai and was a key to Mirai's reasoning. © RSNA, 2024 Supplemental material is available for this article. See also the editorial by Freitas in this issue.


Subject(s)
Breast Neoplasms , Deep Learning , Humans , Middle Aged , Female , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/epidemiology , Retrospective Studies , Mammography , Breast
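A hedged sketch of what "local bilateral dissimilarity" could look like computationally: mirror one side's feature map so the breasts align, then score windowed differences between the left and right maps. The shapes, window size, and max aggregation below are assumptions for illustration; AsymMirai's actual module is specified in the paper.

```python
# Schematic version of local bilateral dissimilarity between left/right
# feature maps (C, H, W). Not the published architecture.
import numpy as np

def local_bilateral_dissimilarity(feat_left, feat_right, window=4):
    """Max windowed L2 distance between one side's feature map and the
    mirrored feature map of the contralateral side."""
    mirrored = feat_right[:, :, ::-1]          # flip width axis so sides align
    diff = (feat_left - mirrored) ** 2
    C, H, W = diff.shape
    scores = []
    for i in range(0, H - window + 1, window):
        for j in range(0, W - window + 1, window):
            scores.append(np.sqrt(diff[:, i:i + window, j:j + window].sum()))
    return max(scores)

rng = np.random.default_rng(0)
print(local_bilateral_dissimilarity(rng.normal(size=(32, 16, 16)),
                                    rng.normal(size=(32, 16, 16))))
```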
5.
IEEE J Biomed Health Inform ; 28(5): 2650-2661, 2024 May.
Article in English | MEDLINE | ID: mdl-38300786

ABSTRACT

Atrial fibrillation (AF) is a common cardiac arrhythmia with serious health consequences if not detected and treated early. Detecting AF using wearable devices with photoplethysmography (PPG) sensors and deep neural networks has demonstrated some success using proprietary algorithms in commercial solutions. However, to improve continuous AF detection in ambulatory settings towards a population-wide screening use case, we face several challenges, one of which is the lack of large-scale labeled training data. To address this challenge, we propose to leverage AF alarms from bedside patient monitors to label concurrent PPG signals, resulting in the largest PPG-AF dataset so far (8.5 million 30-second records from 24,100 patients) and demonstrating a practical approach to building large labeled PPG datasets. Furthermore, we recognize that the AF labels thus obtained contain errors because of false AF alarms generated by the imperfect built-in algorithms of bedside monitors. Dealing with label noise with unknown distribution characteristics in this case requires advanced algorithms. We therefore introduce and open-source a novel loss design, the cluster membership consistency (CMC) loss, to mitigate label errors. By comparing CMC with state-of-the-art methods selected from a noisy label competition, we demonstrate its superiority in handling label noise in PPG data, resilience to poor-quality signals, and computational efficiency.


Subject(s)
Algorithms , Atrial Fibrillation , Photoplethysmography , Signal Processing, Computer-Assisted , Humans , Photoplethysmography/methods , Atrial Fibrillation/physiopathology , Atrial Fibrillation/diagnosis , Clinical Alarms , Machine Learning , Wearable Electronic Devices
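The labeling strategy (pairing monitor AF alarms with concurrent PPG windows) can be illustrated with a small interval-overlap routine. The window length, overlap threshold, and interval format below are assumptions for illustration, not the paper's exact rules.

```python
# Simplified sketch: mark each 30 s PPG window as AF if it sufficiently
# overlaps a bedside-monitor AF alarm interval. Thresholds are assumed.
def label_windows(window_starts, win_len, alarm_intervals, min_overlap=0.5):
    labels = []
    for t0 in window_starts:
        t1 = t0 + win_len
        overlap = sum(max(0.0, min(t1, b) - max(t0, a))
                      for a, b in alarm_intervals)
        labels.append(1 if overlap >= min_overlap * win_len else 0)
    return labels

# Windows every 30 s over 5 minutes; one AF alarm from 60 s to 150 s.
print(label_windows(range(0, 300, 30), 30, [(60, 150)]))
```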
6.
ArXiv ; 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-37808086

ABSTRACT

Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset. However, for a given dataset, there may be many models that explain the target outcome equally well; without accounting for all possible explanations, different researchers may arrive at many conflicting yet equally valid conclusions given the same data. Additionally, even when accounting for all possible explanations for a given dataset, these insights may not generalize because not all good explanations are stable across reasonable data perturbations. We propose a new variable importance framework that quantifies the importance of a variable across the set of all good models and is stable across the data distribution. Our framework is extremely flexible and can be integrated with most existing model classes and global variable importance metrics. We demonstrate through experiments that our framework recovers variable importance rankings for complex simulation setups where other methods fail. Further, we show that our framework accurately estimates the true importance of a variable for the underlying data distribution. We provide theoretical guarantees on the consistency and finite sample error rates for our estimator. Finally, we demonstrate its utility with a real-world case study exploring which genes are important for predicting HIV load in persons with HIV, highlighting an important gene that has not previously been studied in connection with HIV. Code is available at https://github.com/jdonnelly36/Rashomon_Importance_Distribution.
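As a conceptual stand-in for importance over a set of near-optimal models (not the authors' estimator; see the linked repository for that), one can train many models, keep those within epsilon of the best held-out loss, and inspect the spread of permutation importance across that set.

```python
# Coarse sketch of "variable importance across all good models": the epsilon
# threshold and model family are illustrative choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=8, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

models = [RandomForestClassifier(max_depth=d, random_state=s).fit(Xtr, ytr)
          for d in (3, 5, None) for s in range(5)]
losses = [1 - m.score(Xte, yte) for m in models]
eps = 0.02
rashomon = [m for m, l in zip(models, losses) if l <= min(losses) + eps]

imps = np.array([permutation_importance(m, Xte, yte, n_repeats=5,
                                        random_state=0).importances_mean
                 for m in rashomon])
print("per-feature importance ranges across near-optimal models:")
print(list(zip(imps.min(0).round(3), imps.max(0).round(3))))
```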

7.
bioRxiv ; 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-38014340

ABSTRACT

Antiretroviral therapy (ART) halts HIV replication; however, cellular/immune cell viral reservoirs persist despite ART. Understanding the interplay between the HIV reservoir, immune perturbations, and HIV-specific immune responses on ART may yield insights into HIV persistence. A cross-sectional study of peripheral blood samples from 115 people with HIV (PWH) on long-term ART was conducted. High-dimensional immunophenotyping, quantification of HIV-specific T cell responses, and the intact proviral DNA assay (IPDA) were performed. Total and intact HIV DNA was positively correlated with T cell activation and exhaustion. Years of ART and select bifunctional HIV-specific CD4 T cell responses were negatively correlated with the percentage of intact proviruses. A Leave-One-Covariate-Out (LOCO) inference approach identified specific HIV reservoir and clinical-demographic parameters that were particularly important in predicting select immunophenotypes. Dimension reduction revealed two main clusters of PWH with distinct reservoirs. Additionally, machine learning approaches identified specific combinations of immune and clinical-demographic parameters that predicted with approximately 70% accuracy whether a given participant had qualitatively high or low levels of total or intact HIV DNA. The techniques described here may be useful for assessing global patterns within the increasingly high-dimensional data used in HIV reservoir and other studies of complex biology.
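The LOCO idea itself is easy to sketch: refit the model with each covariate removed and record the increase in held-out error. A toy version on synthetic data follows (the study applies this to immunophenotype prediction; model, data, and scale here are invented).

```python
# Minimal Leave-One-Covariate-Out (LOCO) importance sketch.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
y = 2 * X[:, 0] - 1 * X[:, 3] + rng.normal(size=400)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

def mse(model, Xa, Xb):
    return float(np.mean((model.fit(Xa, ytr).predict(Xb) - yte) ** 2))

base = mse(Ridge(), Xtr, Xte)
for j in range(X.shape[1]):
    keep = [k for k in range(X.shape[1]) if k != j]
    loco = mse(Ridge(), Xtr[:, keep], Xte[:, keep]) - base
    print(f"LOCO importance of x{j}: {loco:.3f}")
```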

8.
Proc Natl Acad Sci U S A ; 120(41): e2301842120, 2023 10 10.
Article in English | MEDLINE | ID: mdl-37782786

ABSTRACT

One of the most troubling trends in criminal investigations is the growing use of "black box" technology, in which law enforcement relies on artificial intelligence (AI) models or algorithms that are either too complex for people to understand or that simply conceal how they function. In criminal cases, black box systems have proliferated in forensic areas such as DNA mixture interpretation, facial recognition, and recidivism risk assessments. Champions and critics of AI alike argue, mistakenly, that we face a catch-22: black box AI may not be understandable by people, but they assume it produces more accurate forensic evidence. In this Article, we question this assertion, which has so powerfully affected judges, policymakers, and academics. We describe a mature body of computer science research showing how "glass box" AI-designed to be interpretable-can be more accurate than black box alternatives. Indeed, black box AI performs predictably worse in settings like the criminal system. Debunking the black box performance myth has implications for forensic evidence, constitutional criminal procedure rights, and legislative policy. Absent some compelling-or even credible-government interest in keeping AI as a black box, and given the constitutional rights and public safety interests at stake, we argue that a substantial burden rests on the government to justify black box AI in criminal cases. We conclude by calling for judicial rulings and legislation to safeguard a right to interpretable forensic AI.


Subject(s)
Artificial Intelligence , Criminals , Humans , Forensic Medicine , Law Enforcement , Algorithms
9.
medRxiv ; 2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37662339

ABSTRACT

Objectives: Epileptiform activity (EA) worsens outcomes in patients with acute brain injuries (e.g., aneurysmal subarachnoid hemorrhage [aSAH]). Randomized controlled trials (RCTs) assessing anti-seizure interventions are needed. Owing to scant drug efficacy data and ethical reservations about placebo use, RCTs are lacking or hindered by design constraints. We used a pharmacological model-guided simulator to design RCTs evaluating EA treatment and to determine their feasibility. Methods: In a single-center cohort of adults (age >18) with aSAH and EA, we employed a mechanistic pharmacokinetic-pharmacodynamic framework to model treatment response using observational data. We subsequently simulated RCTs for levetiracetam and propofol, each with three treatment arms mirroring clinical practice and an additional placebo arm. Using our framework, we simulated EA trajectories across treatment arms. We predicted discharge modified Rankin Scale as a function of baseline covariates, EA burden, and drug doses using a double machine learning model learned from observational data. Differences in outcomes across arms were used to estimate the required sample size. Results: Sample sizes ranged from 500 for levetiracetam 7 mg/kg vs placebo to >4000 for levetiracetam 15 mg/kg vs 7 mg/kg to achieve 80% power (5% type I error). For propofol 1 mg/kg/h vs placebo, 1200 participants were needed. Simulations comparing propofol at varying doses did not reach 80% power even at sample sizes >1200. Interpretation: Our simulations, using drug efficacy estimated from observational data, show that the required sample sizes are infeasible, even for potentially unethical placebo-controlled trials. We highlight the strength of simulations with observational data to inform null hypotheses and assess the feasibility of future trials of EA treatment.

10.
Data Brief ; 49: 109396, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37600123

ABSTRACT

Additive manufacturing has provided the ability to manufacture complex structures using a wide variety of materials and geometries. Structures such as triply periodic minimal surface (TPMS) lattices have been incorporated into products across many fields due to their unique combinations of mechanical, geometric, and physical properties. Yet, the near limitless possibility of combining geometry and material into these lattices leaves much to be discovered. This article provides a dataset of experimentally gathered tensile stress-strain curves and measured porosity values for 389 unique gyroid lattice structures manufactured using vat photopolymerization 3D printing. The lattice samples were printed from one of twenty different photopolymer materials available from either Formlabs, LOCTITE AM, or ETEC that range from strong and brittle to elastic and ductile and were printed on commercially available 3D printers, specifically the Formlabs Form2, Prusa SL1, and ETEC Envision One cDLM Mechanical. The stress-strain curves were recorded with an MTS Criterion C43.504 mechanical testing apparatus following ASTM standards, and the void fraction or "porosity" of each lattice was measured using a calibrated scale. These data serve as a valuable resource for use in the development of novel printing materials and lattice geometries and provide insight into the influence of photopolymer material properties on the printability, geometric accuracy, and mechanical performance of 3D printed lattice structures. The data described in this article were used to train a machine learning model capable of predicting mechanical properties of 3D printed gyroid lattices based on the base mechanical properties of the printing material and porosity of the lattice in the research article [1].
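A sketch of the modeling task this dataset supports: regress a lattice-level mechanical property on base-material properties plus porosity. The feature names, value ranges, and the Gibson-Ashby-style scaling used to fabricate the synthetic targets below are illustrative assumptions, not the dataset itself.

```python
# Illustrative regression on synthetic stand-in data: lattice modulus from
# bulk-material properties and porosity.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 389
base_modulus = rng.uniform(0.5, 3.0, n)   # GPa, of the bulk photopolymer
base_strength = rng.uniform(10, 80, n)    # MPa
porosity = rng.uniform(0.5, 0.9, n)       # void fraction of the gyroid
X = np.c_[base_modulus, base_strength, porosity]
# Fabricated target with a Gibson-Ashby-like (1 - porosity)^2 scaling.
lattice_modulus = base_modulus * (1 - porosity) ** 2 + rng.normal(0, 0.02, n)

model = GradientBoostingRegressor(random_state=0)
print("CV R^2:", cross_val_score(model, X, lattice_modulus, cv=5).mean().round(3))
```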

11.
J Infect Dis ; 228(11): 1600-1609, 2023 11 28.
Article in English | MEDLINE | ID: mdl-37606598

ABSTRACT

BACKGROUND: Human immunodeficiency virus (HIV) infection remains incurable due to the persistence of a viral reservoir despite antiretroviral therapy (ART). Cannabis (CB) use is prevalent amongst people with HIV (PWH), but the impact of CB on the latent HIV reservoir has not been investigated. METHODS: Peripheral blood cells from a cohort of PWH who use CB and a matched cohort of PWH who do not use CB on ART were evaluated for expression of maturation/activation markers, HIV-specific T-cell responses, and intact proviral DNA. RESULTS: CB use was associated with increased abundance of naive T cells, reduced effector T cells, and reduced expression of activation markers. CB use was also associated with reduced levels of exhausted and senescent T cells compared to nonusing controls. HIV-specific T-cell responses were unaffected by CB use. CB use was not associated with intact or total HIV DNA frequency in CD4 T cells. CONCLUSIONS: This analysis is consistent with the hypothesis that CB use reduces activation, exhaustion, and senescence in the T cells of PWH, and does not impair HIV-specific CD8 T-cell responses. Longitudinal and interventional studies with evaluation of CB exposure are needed to fully evaluate the impact of CB use on the HIV reservoir.


Subject(s)
Cannabis , HIV Infections , HIV-1 , Humans , Cannabis/genetics , HIV-1/genetics , Virus Latency , CD4-Positive T-Lymphocytes , DNA , Viral Load , Anti-Retroviral Agents/therapeutic use , DNA, Viral/genetics
12.
Nat Commun ; 14(1): 4838, 2023 08 10.
Article in English | MEDLINE | ID: mdl-37563117

ABSTRACT

Polymers are ubiquitous to almost every aspect of modern society and their use in medical products is similarly pervasive. Despite this, the diversity in commercial polymers used in medicine is stunningly low. Considerable time and resources have been expended over the years towards the development of new polymeric biomaterials which address unmet needs left by the current generation of medical-grade polymers. Machine learning (ML) presents an unprecedented opportunity in this field to bypass the need for trial-and-error synthesis, thus reducing the time and resources invested into new discoveries critical for advancing medical treatments. Current efforts pioneering applied ML in polymer design have employed combinatorial and high throughput experimental design to address data availability concerns. However, the lack of available and standardized characterization of parameters relevant to medicine, including degradation time and biocompatibility, represents a nearly insurmountable obstacle to ML-aided design of biomaterials. Herein, we identify a gap at the intersection of applied ML and biomedical polymer design, highlight current works at this junction more broadly and provide an outlook on challenges and future directions.


Subject(s)
Biocompatible Materials , Polymers
13.
Lancet Digit Health ; 5(8): e495-e502, 2023 08.
Article in English | MEDLINE | ID: mdl-37295971

ABSTRACT

BACKGROUND: Epileptiform activity is associated with worse patient outcomes, including increased risk of disability and death. However, the effect of epileptiform activity on neurological outcome is confounded by the feedback between treatment with antiseizure medications and epileptiform activity burden. We aimed to quantify the heterogeneous effects of epileptiform activity with an interpretability-centred approach. METHODS: We did a retrospective, cross-sectional study of patients in the intensive care unit who were admitted to Massachusetts General Hospital (Boston, MA, USA). Participants were aged 18 years or older and had electrographic epileptiform activity identified by a clinical neurophysiologist or epileptologist. The outcome was the dichotomised modified Rankin Scale (mRS) at discharge and the exposure was epileptiform activity burden defined as mean or maximum proportion of time spent with epileptiform activity in 6 h windows in the first 24 h of electroencephalography. We estimated the change in discharge mRS if everyone in the dataset had experienced a specific epileptiform activity burden and were untreated. We combined pharmacological modelling with an interpretable matching method to account for confounding and epileptiform activity-antiseizure medication feedback. The quality of the matched groups was validated by the neurologists. FINDINGS: Between Dec 1, 2011, and Oct 14, 2017, 1514 patients were admitted to Massachusetts General Hospital intensive care unit, 995 (66%) of whom were included in the analysis. Compared with patients with a maximum epileptiform activity of 0 to less than 25%, patients with a maximum epileptiform activity burden of 75% or more when untreated had a mean 22·27% (SD 0·92) increased chance of a poor outcome (severe disability or death). Moderate but long-lasting epileptiform activity (mean epileptiform activity burden 2% to <10%) increased the risk of a poor outcome by mean 13·52% (SD 1·93). The effect sizes were heterogeneous depending on preadmission profile-eg, patients with hypoxic-ischaemic encephalopathy or acquired brain injury were more adversely affected compared with patients without these conditions. INTERPRETATION: Our results suggest that interventions should put a higher priority on patients with an average epileptiform activity burden 10% or greater, and treatment should be more conservative when maximum epileptiform activity burden is low. Treatment should also be tailored to individual preadmission profiles because the potential for epileptiform activity to cause harm depends on age, medical history, and reason for admission. FUNDING: National Institutes of Health and National Science Foundation.


Subject(s)
Critical Illness , Patient Discharge , United States , Humans , Retrospective Studies , Cross-Sectional Studies , Treatment Outcome
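The matching component of such an analysis can be caricatured as 1:1 nearest-neighbor matching on baseline covariates followed by a within-pair outcome comparison; the paper's actual procedure additionally models the antiseizure-medication feedback pharmacologically. A generic sketch with invented data:

```python
# Toy 1:1 nearest-neighbor matching on covariates, then a within-pair
# outcome comparison. Not the paper's pharmacology-aware method.
import numpy as np

def match_and_compare(X_hi, y_hi, X_lo, y_lo):
    """For each high-EA-burden patient, find the closest low-burden patient
    in covariate space and average the within-pair outcome differences."""
    diffs = []
    for x, y in zip(X_hi, y_hi):
        j = np.argmin(np.linalg.norm(X_lo - x, axis=1))
        diffs.append(y - y_lo[j])
    return float(np.mean(diffs))

rng = np.random.default_rng(0)
X_hi, X_lo = rng.normal(size=(50, 4)), rng.normal(size=(200, 4))
y_hi = (rng.random(50) < 0.6).astype(float)   # poor outcome more common
y_lo = (rng.random(200) < 0.4).astype(float)
print("estimated risk difference:", match_and_compare(X_hi, y_hi, X_lo, y_lo))
```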
14.
Adv Neural Inf Process Syst ; 36: 56673-56699, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38623077

ABSTRACT

In real applications, interaction between machine learning models and domain experts is critical; however, the classical machine learning paradigm that usually produces only a single model does not facilitate such interaction. Approximating and exploring the Rashomon set, i.e., the set of all near-optimal models, addresses this practical challenge by providing the user with a searchable space containing a diverse set of models from which domain experts can choose. We present algorithms to efficiently and accurately approximate the Rashomon set of sparse, generalized additive models with ellipsoids for fixed support sets and use these ellipsoids to approximate Rashomon sets for many different support sets. The approximated Rashomon set serves as a cornerstone to solve practical challenges such as (1) studying the variable importance for the model class; (2) finding models under user-specified constraints (monotonicity, direct editing); and (3) investigating sudden changes in the shape functions. Experiments demonstrate the fidelity of the approximated Rashomon set and its effectiveness in solving practical challenges.
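The geometric core of the ellipsoid idea can be sketched from a second-order Taylor expansion of the loss around the optimum: for a fixed support, {w : 0.5 (w - w_hat)^T H (w - w_hat) <= eps} approximates the set of near-optimal coefficient vectors. The schematic below uses plain logistic regression (the paper works with sparse generalized additive models, and its construction is more careful than this).

```python
# Hessian-based ellipsoid approximation of the Rashomon set for one support.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
fit = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)  # ~unregularized
w0 = np.r_[fit.intercept_, fit.coef_.ravel()]
Xb = np.c_[np.ones(len(X)), X]

p = 1 / (1 + np.exp(-Xb @ w0))
H = (Xb * (p * (1 - p))[:, None]).T @ Xb / len(X)   # Hessian of mean log-loss

def in_rashomon_ellipsoid(w, eps=0.01):
    d = w - w0
    return 0.5 * d @ H @ d <= eps

print(in_rashomon_ellipsoid(w0 + 0.05))   # small perturbation: likely inside
print(in_rashomon_ellipsoid(w0 + 1.0))    # large perturbation: likely outside
```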

15.
Proc AAAI Conf Artif Intell ; 37(9): 11270-11279, 2023 Jun.
Article in English | MEDLINE | ID: mdl-38650922

ABSTRACT

Regression trees are one of the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been little effort towards full provable optimization, mainly due to the computational hardness of the problem. This work proposes a dynamic-programming-with-bounds approach to the construction of provably-optimal sparse regression trees. We leverage a novel lower bound based on an optimal solution to the k-Means clustering problem on one-dimensional data. We are often able to find optimal sparse trees in seconds, even for challenging datasets that involve large numbers of samples and highly-correlated features.
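The lower bound can be sketched directly: any regression tree with k leaves assigns one constant per leaf, so its training SSE on a set of labels is at least the optimal k-cluster SSE of those labels, and in one dimension that optimum is computable exactly by dynamic programming because optimal 1-D clusters are contiguous in sorted order. A small self-contained version (a schematic, not the paper's optimized implementation):

```python
# Exact 1-D k-means by dynamic programming: a valid lower bound on the SSE
# of any regression tree with k leaves trained on labels y. O(k n^2) here.
import numpy as np

def optimal_1d_kmeans_sse(y, k):
    y = np.sort(np.asarray(y, dtype=float))
    n = len(y)
    s1 = np.r_[0.0, np.cumsum(y)]        # prefix sums of y
    s2 = np.r_[0.0, np.cumsum(y * y)]    # prefix sums of y^2

    def sse(a, b):                       # SSE of y[a:b] around its mean
        m = b - a
        return s2[b] - s2[a] - (s1[b] - s1[a]) ** 2 / m

    D = np.full((k + 1, n + 1), np.inf)  # D[j, i]: best cost, j clusters, i pts
    D[0, 0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            D[j, i] = min(D[j - 1, a] + sse(a, i) for a in range(j - 1, i))
    return float(D[k, n])

rng = np.random.default_rng(0)
y = np.r_[rng.normal(0, 1, 60), rng.normal(5, 1, 60)]
print([round(optimal_1d_kmeans_sse(y, k), 1) for k in (1, 2, 3)])
```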

16.
Adv Neural Inf Process Syst ; 36: 3362-3401, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38577617

ABSTRACT

The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabular datasets in criminal justice, healthcare, lending, education, and in other areas, which has practical implications about whether simpler models can attain the same level of accuracy as more complex models. An open question is why Rashomon ratios often tend to be large. In this work, we propose and study a mechanism of the data generation process, coupled with choices usually made by the analyst during the learning process, that determines the size of the Rashomon ratio. Specifically, we demonstrate that noisier datasets lead to larger Rashomon ratios through the way that practitioners train models. Additionally, we introduce a measure called pattern diversity, which captures the average difference in predictions between distinct classification patterns in the Rashomon set, and motivate why it tends to increase with label noise. Our results explain a key aspect of why simpler models often tend to perform as well as black box models on complex, noisier datasets.
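The flavor of the pattern diversity measure can be illustrated as the average pairwise disagreement between the distinct prediction patterns that near-optimal models produce on a dataset; the paper's exact definition and normalization may differ, so treat the following as schematic.

```python
# Schematic "pattern diversity": mean pairwise Hamming disagreement between
# distinct 0/1 prediction patterns of a set of models.
import numpy as np
from itertools import combinations

def pattern_diversity(patterns):
    """patterns: array of shape (n_models, n_samples) of 0/1 predictions."""
    distinct = np.unique(patterns, axis=0)
    if len(distinct) < 2:
        return 0.0
    pairs = combinations(range(len(distinct)), 2)
    return float(np.mean([np.mean(distinct[i] != distinct[j])
                          for i, j in pairs]))

rng = np.random.default_rng(0)
base = rng.integers(0, 2, 100)
patterns = np.array([np.where(rng.random(100) < 0.05, 1 - base, base)
                     for _ in range(10)])   # 10 models, ~5% label flips each
print(round(pattern_diversity(patterns), 3))
```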

17.
Adv Neural Inf Process Syst ; 36: 41076-41258, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38505104

ABSTRACT

We consider an important problem in scientific discovery, namely identifying sparse governing equations for nonlinear dynamical systems. This involves solving sparse ridge regression problems to provable optimality in order to determine which terms drive the underlying dynamics. We propose a fast algorithm, OKRidge, for sparse ridge regression, using a novel lower bound calculation involving, first, a saddle point formulation, and from there, either solving (i) a linear system or (ii) using an ADMM-based approach, where the proximal operators can be efficiently evaluated by solving another linear system and an isotonic regression problem. We also propose a method to warm-start our solver, which leverages a beam search. Experimentally, our methods attain provable optimality with run times that are orders of magnitude faster than those of the existing MIP formulations solved by the commercial solver Gurobi.
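The beam-search warm start is straightforward to sketch: grow candidate support sets greedily, keeping the best few at each sparsity level. The lower-bounding machinery (saddle-point formulation, ADMM) that makes the full method exact is the paper's contribution and is not reproduced here.

```python
# Beam-search warm start for sparse ridge regression; illustrative only.
import numpy as np

def ridge_loss(X, y, support, lam=1e-3):
    Xs = X[:, support]
    w = np.linalg.solve(Xs.T @ Xs + lam * np.eye(len(support)), Xs.T @ y)
    r = y - Xs @ w
    return float(r @ r + lam * w @ w)

def beam_search(X, y, k, beam=5):
    frontier = [()]
    for _ in range(k):
        cands = {tuple(sorted(s + (j,))) for s in frontier
                 for j in range(X.shape[1]) if j not in s}
        frontier = sorted(cands, key=lambda s: ridge_loss(X, y, list(s)))[:beam]
    return list(frontier[0])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = 3 * X[:, 2] - 2 * X[:, 7] + X[:, 11] + 0.1 * rng.normal(size=200)
print(beam_search(X, y, k=3))   # ideally recovers features {2, 7, 11}
```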

18.
Proc AAAI Conf Artif Intell ; 36(9): 9604-9613, 2022.
Article in English | MEDLINE | ID: mdl-36051654

ABSTRACT

Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960s, breakthroughs have been made only within the past few years, primarily on the problem of finding optimal sparse decision trees. However, current state-of-the-art algorithms often require impractical amounts of computation time and memory to find optimal or near-optimal trees for some real-world datasets, particularly those having several continuous-valued features. Given that the search spaces of these decision tree optimization problems are massive, can we practically hope to find a sparse decision tree that competes in accuracy with a black box machine learning model? We address this problem via smart guessing strategies that can be applied to any optimal branch-and-bound-based decision tree algorithm. The guesses come from knowledge gleaned from black box models. We show that by using these guesses, we can reduce the run time by multiple orders of magnitude while providing bounds on how far the resulting trees can deviate from the black box's accuracy and expressive power. Our approach enables guesses about how to bin continuous features, the size of the tree, and lower bounds on the error for the optimal decision tree. Our experiments show that in many cases we can rapidly construct sparse decision trees that match the accuracy of black box models. To summarize: when you are having trouble optimizing, just guess.
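One of the guessing strategies, gleaning feature-binning thresholds from a black box, can be sketched by harvesting the split points of a fitted gradient-boosted ensemble; the choice of ensemble and the rounding below are assumptions for illustration.

```python
# Harvest candidate binning thresholds from a black-box model's split points.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
gbm = GradientBoostingClassifier(n_estimators=20, max_depth=2,
                                 random_state=0).fit(X, y)

thresholds = {j: set() for j in range(X.shape[1])}
for stage in gbm.estimators_:          # each stage holds one regression tree
    tree = stage[0].tree_
    for feat, thr in zip(tree.feature, tree.threshold):
        if feat >= 0:                  # negative feature ids mark leaf nodes
            thresholds[int(feat)].add(round(float(thr), 3))

for j, ts in thresholds.items():
    print(f"feature {j}: {sorted(ts)[:5]}{' ...' if len(ts) > 5 else ''}")
```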

19.
Commun Biol ; 5(1): 719, 2022 07 19.
Article in English | MEDLINE | ID: mdl-35853932

ABSTRACT

Dimension reduction (DR) algorithms project data from high dimensions to lower dimensions to enable visualization of interesting high-dimensional structure. DR algorithms are widely used for analysis of single-cell transcriptomic data. Despite widespread use of DR algorithms such as t-SNE and UMAP, these algorithms have characteristics that lead to a lack of trust: they do not preserve important aspects of high-dimensional structure and are sensitive to arbitrary user choices. Given the importance of gaining insights from DR, DR methods should be evaluated carefully before trusting their results. In this paper, we introduce and perform a systematic evaluation of popular DR methods, including t-SNE, art-SNE, UMAP, PaCMAP, TriMap and ForceAtlas2. Our evaluation considers five components: preservation of local structure, preservation of global structure, sensitivity to parameter choices, sensitivity to preprocessing choices, and computational efficiency. This evaluation can help us to choose DR tools that align with the scientific goals of the user.


Subject(s)
Data Visualization , Transcriptome , Algorithms
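A generic version of the kind of local-structure check such an evaluation performs (distinct from the paper's exact metrics) is the fraction of each point's high-dimensional k-nearest neighbors that survive in the low-dimensional embedding.

```python
# k-NN neighborhood preservation between a high-dimensional dataset and its
# 2-D embedding; PCA is used here purely as a stand-in embedding.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def knn_overlap(X_high, X_low, k=10):
    nn_h = NearestNeighbors(n_neighbors=k + 1).fit(X_high)
    nn_l = NearestNeighbors(n_neighbors=k + 1).fit(X_low)
    idx_h = nn_h.kneighbors(X_high, return_distance=False)[:, 1:]  # drop self
    idx_l = nn_l.kneighbors(X_low, return_distance=False)[:, 1:]
    return np.mean([len(set(a) & set(b)) / k for a, b in zip(idx_h, idx_l)])

X = load_digits().data
X2 = PCA(n_components=2, random_state=0).fit_transform(X)
print("mean k-NN overlap (PCA):", round(knn_overlap(X, X2), 3))
```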
20.
Proc Mach Learn Res ; 151: 9304-9333, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35601052

ABSTRACT

We present fast classification techniques for sparse generalized linear and additive models. These techniques can handle thousands of features and thousands of observations in minutes, even in the presence of many highly correlated features. For fast sparse logistic regression, our computational speed-up over other best-subset search techniques owes to linear and quadratic surrogate cuts for the logistic loss that allow us to efficiently screen features for elimination, as well as use of a priority queue that favors a more uniform exploration of features. As an alternative to the logistic loss, we propose the exponential loss, which permits an analytical solution to the line search at each iteration. Our algorithms are generally 2 to 5 times faster than previous approaches. They produce interpretable models that have accuracy comparable to black box models on challenging datasets.
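The claim about the exponential loss can be made concrete: for a binary feature, the optimal coordinate step has the closed form gamma = 0.5 * ln(W+/W-), where W+ and W- are the summed exponential-loss sample weights of positive and negative samples with the feature active (the AdaBoost-style update). The data below are invented; the script checks the identity against a numeric line search.

```python
# Closed-form line search under the exponential loss for one binary feature.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.integers(0, 2, n)                            # one binary feature
y = np.where(rng.random(n) < 0.5 + 0.2 * x, 1, -1)   # feature is informative
f = np.zeros(n)                                      # current model scores
w = np.exp(-y * f)                                   # exponential-loss weights

Wp = w[(x == 1) & (y == 1)].sum()
Wm = w[(x == 1) & (y == -1)].sum()
gamma = 0.5 * np.log(Wp / Wm)                        # analytical step

# Check against a brute-force numeric line search.
grid = np.linspace(-2, 2, 4001)
losses = [np.exp(-y * (f + g * x)).sum() for g in grid]
print(round(gamma, 3), round(float(grid[int(np.argmin(losses))]), 3))
```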
