Search | VHL Regional Portal

Representational Gradient Boosting: Backpropagation in the Space of Functions.

Valdes, Gilmer; Friedman, Jerome H; Jiang, Fei; Gennatas, Efstathios D.

IEEE Trans Pattern Anal Mach Intell ; 44(12): 10186-10195, 2022 12.

Article in English | MEDLINE | ID: mdl-34941500

ABSTRACT

The estimation of nested functions (i.e., functions of functions) is one of the central reasons for the success and popularity of machine learning. Today, artificial neural networks are the predominant class of algorithms in this area, known as representational learning. Here, we introduce Representational Gradient Boosting (RGB), a nonparametric algorithm that estimates functions with multi-layer architectures obtained using backpropagation in the space of functions. RGB does not need to assume a functional form in the nodes or output (e.g., linear models or rectified linear units), but rather estimates these transformations. RGB can be seen as an optimized stacking procedure where a meta algorithm learns how to combine different classes of functions (e.g., Neural Networks (NN) and Gradient Boosting (GB)), while building and optimizing them jointly in an attempt to compensate each other's weaknesses. This highlights a stark difference with current approaches to meta-learning that combine models only after they have been built independently. We showed that providing optimized stacking is one of the main advantages of RGB over current approaches. Additionally, due to the nested nature of RGB we also showed how it improves over GB in problems that have several high-order interactions. Finally, we investigate both theoretically and in practice the problem of recovering nested functions and the value of prior knowledge.

Subject(s)

Algorithms , Neural Networks, Computer , Machine Learning

Contrast trees and distribution boosting.

Friedman, Jerome H.

Proc Natl Acad Sci U S A ; 117(35): 21175-21184, 2020 09 01.

Article in English | MEDLINE | ID: mdl-32817416

ABSTRACT

A method for decision tree induction is presented. Given a set of predictor variables [Formula: see text] and two outcome variables y and z associated with each x, the goal is to identify those values of x for which the respective distributions of [Formula: see text] and [Formula: see text], or selected properties of those distributions such as means or quantiles, are most different. Contrast trees provide a lack-of-fit measure for statistical models of such statistics, or for the complete conditional distribution [Formula: see text], as a function of x. They are easily interpreted and can be used as diagnostic tools to reveal and then understand the inaccuracies of models produced by any learning method. A corresponding contrast-boosting strategy is described for remedying any uncovered errors, thereby producing potentially more accurate predictions. This leads to a distribution-boosting strategy for directly estimating the full conditional distribution of y at each x under no assumptions concerning its shape, form, or parametric representation.

Reply to Nock and Nielsen: On the work of Nock and Nielsen and its relationship to the additive tree.

Valdes, Gilmer; Luna, José Marcio; Gennatas, Efstathios D; Ungar, Lyle H; Eaton, Eric; Diffenderfer, Eric S; Jensen, Shane T; Simone, Charles B; Friedman, Jerome H; Solberg, Timothy D.

Proc Natl Acad Sci U S A ; 117(16): 8694-8695, 2020 04 21.

Article in English | MEDLINE | ID: mdl-32265277

Subject(s)

Decision Trees

Expert-augmented machine learning.

Gennatas, Efstathios D; Friedman, Jerome H; Ungar, Lyle H; Pirracchio, Romain; Eaton, Eric; Reichmann, Lara G; Interian, Yannet; Luna, José Marcio; Simone, Charles B; Auerbach, Andrew; Delgado, Elier; van der Laan, Mark J; Solberg, Timothy D; Valdes, Gilmer.

Proc Natl Acad Sci U S A ; 117(9): 4571-4577, 2020 03 03.

Article in English | MEDLINE | ID: mdl-32071251

ABSTRACT

Machine learning is proving invaluable across disciplines. However, its success is often limited by the quality and quantity of available data, while its adoption is limited by the level of trust afforded by given models. Human vs. machine performance is commonly compared empirically to decide whether a certain task should be performed by a computer or an expert. In reality, the optimal learning strategy may involve combining the complementary strengths of humans and machines. Here, we present expert-augmented machine learning (EAML), an automated method that guides the extraction of expert knowledge and its integration into machine-learned models. We used a large dataset of intensive-care patient data to derive 126 decision rules that predict hospital mortality. Using an online platform, we asked 15 clinicians to assess the relative risk of the subpopulation defined by each rule compared to the total sample. We compared the clinician-assessed risk to the empirical risk and found that, while clinicians agreed with the data in most cases, there were notable exceptions where they overestimated or underestimated the true risk. Studying the rules with greatest disagreement, we identified problems with the training data, including one miscoded variable and one hidden confounder. Filtering the rules based on the extent of disagreement between clinician-assessed risk and empirical risk, we improved performance on out-of-sample data and were able to train with less data. EAML provides a platform for automated creation of problem-specific priors, which help build robust and dependable machine-learning models in critical applications.

Subject(s)

Expert Systems , Machine Learning/standards , Medical Informatics/methods , Data Management/methods , Database Management Systems , Medical Informatics/standards

Building more accurate decision trees with the additive tree.

Luna, José Marcio; Gennatas, Efstathios D; Ungar, Lyle H; Eaton, Eric; Diffenderfer, Eric S; Jensen, Shane T; Simone, Charles B; Friedman, Jerome H; Solberg, Timothy D; Valdes, Gilmer.

Proc Natl Acad Sci U S A ; 116(40): 19887-19893, 2019 10 01.

Article in English | MEDLINE | ID: mdl-31527280

ABSTRACT

The expansion of machine learning to high-stakes application domains such as medicine, finance, and criminal justice, where making informed decisions requires clear understanding of the model, has increased the interest in interpretable machine learning. The widely used Classification and Regression Trees (CART) have played a major role in health sciences, due to their simple and intuitive explanation of predictions. Ensemble methods like gradient boosting can improve the accuracy of decision trees, but at the expense of the interpretability of the generated model. Additive models, such as those produced by gradient boosting, and full interaction models, such as CART, have been investigated largely in isolation. We show that these models exist along a spectrum, revealing previously unseen connections between these approaches. This paper introduces a rigorous formalization for the additive tree, an empirically validated learning technique for creating a single decision tree, and shows that this method can produce models equivalent to CART or gradient boosted stumps at the extremes by varying a single parameter. Although the additive tree is designed primarily to provide both the model interpretability and predictive performance needed for high-stakes applications like medicine, it also can produce decision trees represented by hybrid models between CART and boosted stumps that can outperform either of these approaches.

Subject(s)

Algorithms , Decision Trees , Machine Learning , Databases, Factual , Models, Statistical , Programming Languages

SparseNet: Coordinate Descent With Nonconvex Penalties.

Mazumder, Rahul; Friedman, Jerome H; Hastie, Trevor.

J Am Stat Assoc ; 106(495): 1125-1138, 2011.

Article in English | MEDLINE | ID: mdl-25580042

ABSTRACT

We address the problem of sparse selection in linear models. A number of nonconvex penalties have been proposed in the literature for this purpose, along with a variety of convex-relaxation algorithms for finding good solutions. In this article we pursue a coordinate-descent approach for optimization, and study its convergence properties. We characterize the properties of penalties suitable for this approach, study their corresponding threshold functions, and describe a df-standardizing reparametrization that assists our pathwise algorithm. The MC+ penalty is ideally suited to this task, and we use it to demonstrate the performance of our algorithm. Certain technical derivations and experiments related to this article are included in the Supplementary Materials section.

Multiple additive regression trees with application in epidemiology.

Friedman, Jerome H; Meulman, Jacqueline J.

Stat Med ; 22(9): 1365-81, 2003 May 15.

Article in English | MEDLINE | ID: mdl-12704603

ABSTRACT

Predicting future outcomes based on knowledge obtained from past observational data is a common application in a wide variety of areas of scientific research. In the present paper, prediction will be focused on various grades of cervical preneoplasia and neoplasia. Statistical tools used for prediction should of course possess predictive accuracy, and preferably meet secondary requirements such as speed, ease of use, and interpretability of the resulting predictive model. A new automated procedure based on an extension (called 'boosting') of regression and classification tree (CART) models is described. The resulting tool is a fast 'off-the-shelf' procedure for classification and regression that is competitive in accuracy with more customized approaches, while being fairly automatic to use (little tuning), and highly robust especially when applied to less than clean data. Additional tools are presented for interpreting and visualizing the results of such multiple additive regression tree (MART) models.

Subject(s)

Epidemiologic Methods , Models, Statistical , Regression Analysis , Carcinoma, Squamous Cell/epidemiology , Female , Humans , Predictive Value of Tests , Uterine Cervical Neoplasms/epidemiology

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL