Results 1 - 3 of 3
1.
Big Data; 11(3): 181-198, 2023 Jun.
Article in English | MEDLINE | ID: mdl-34978896

ABSTRACT

The use of machine learning (ML) allows us to automate and scale decision-making processes. The key to this automation is the development of ML models that generalize from training data to unseen data. Such models can become extremely versatile and powerful, making the democratization of artificial intelligence (AI) possible, that is, providing ML to non-ML experts such as software engineers or domain experts. Automated ML (AutoML) is typically referred to as a key step toward this goal. From our perspective, however, democratizing the verification process of ML systems is a larger and even more crucial challenge for democratizing AI. Currently, the process of ensuring that an ML model works as intended is unstructured: it relies largely on experience and domain knowledge that cannot be automated. Existing approaches such as cross-validation or explainable AI are not enough to overcome the real challenges, as we discuss extensively in this article. Arguing for structured verification approaches, we present a set of guidelines to verify models, code, and data at each step of the ML lifecycle. These guidelines help to reliably measure and select an optimal solution while minimizing the risk of bugs and undesired behavior in edge cases.
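As a rough illustration of the kind of structured checks on data and model behavior the abstract argues for, the sketch below runs a few assert-based verifications before and after training. The library choice (scikit-learn), the thresholds, and the invariants are assumptions for illustration only and are not taken from the article.

```python
# Illustrative sketch of structured verification checks for data and model
# behavior; thresholds and invariants below are hypothetical examples.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Data checks: no missing values and only expected label values before training.
assert not np.isnan(X_train).any(), "training data contains NaNs"
assert set(np.unique(y_train)) <= {0, 1}, "unexpected label values"

model = LogisticRegression().fit(X_train, y_train)

# Model check: the gap between train and test accuracy stays small.
gap = model.score(X_train, y_train) - model.score(X_test, y_test)
assert gap < 0.05, f"possible overfitting, gap={gap:.3f}"

# Edge-case check: predictions remain valid class labels on extreme inputs.
extreme = np.full((1, 4), 1e6)
assert model.predict(extreme)[0] in (0, 1)
```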


Subject(s)
Artificial Intelligence; Machine Learning; Automation; Research Design; Software
2.
Comput Math Methods Med; 2018: 2430438, 2018.
Article in English | MEDLINE | ID: mdl-30073029

ABSTRACT

[This corrects the article DOI: 10.1155/2017/1421409.].

3.
Comput Math Methods Med; 2017: 1421409, 2017.
Article in English | MEDLINE | ID: mdl-28831289

ABSTRACT

We present a new variable selection method based on model-based gradient boosting and randomly permuted variables. Model-based boosting is a tool that fits a statistical model while performing variable selection at the same time. A drawback of the fitting procedure is the need for multiple model fits on slightly altered data (e.g., cross-validation or bootstrap) to find the optimal number of boosting iterations and prevent overfitting. In our proposed approach, we augment the data set with randomly permuted versions of the true variables, so-called shadow variables, and stop the stepwise fitting as soon as such a variable would be added to the model. This allows variable selection in a single model fit without further parameter tuning. We show that our probing approach can compete with state-of-the-art selection methods such as stability selection in a high-dimensional classification benchmark, and we apply it to three gene expression data sets.
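A minimal sketch of the probing idea described in the abstract, written as componentwise L2 gradient boosting with a shadow-variable stopping rule on a continuous response. The function name, step size, and least-squares base learner are illustrative assumptions, not the authors' implementation.

```python
# Sketch of the shadow-variable ("probing") stopping rule for componentwise
# L2 gradient boosting; names and defaults are illustrative assumptions.
import numpy as np

def probing_boost(X, y, step=0.1, max_iter=1000, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Shadow variables: column-wise random permutations of the real features.
    shadows = rng.permuted(X, axis=0)
    Z = np.hstack([X, shadows])          # columns 0..p-1 real, p..2p-1 shadow
    Z = (Z - Z.mean(0)) / Z.std(0)       # standardize so coefficients are comparable
    resid = y - y.mean()
    selected = set()
    for _ in range(max_iter):
        # Componentwise least-squares base learner: regress residuals on each column.
        coefs = Z.T @ resid / n
        best = int(np.argmax(np.abs(coefs)))
        if best >= p:                    # a shadow variable would be chosen: stop
            break
        selected.add(best)
        resid -= step * coefs[best] * Z[:, best]   # boosting update
    return sorted(selected)

# Example usage: 5 informative out of 50 variables on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
y = X[:, :5] @ np.array([2.0, -1.5, 1.0, 0.8, -0.6]) + rng.normal(size=200)
print(probing_boost(X, y))
```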


Subject(s)
Gene Expression Profiling/methods; Models, Statistical; Data Interpretation, Statistical