Results 1 - 6 of 6
1.
Front Artif Intell ; 6: 1124553, 2023.
Article in English | MEDLINE | ID: mdl-37565044

ABSTRACT

This article provides a bird's-eye view of the role of decision trees in machine learning and data science over roughly four decades. It sketches the evolution of decision tree research over the years, describes the broader context in which that research is situated, and summarizes the strengths and weaknesses of decision trees in this context. The main goal of the article is to clarify the broad relevance, both practical and theoretical, that decision trees still have for machine learning and artificial intelligence today.

2.
IEEE Trans Vis Comput Graph ; 29(1): 745-755, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36166539

ABSTRACT

A plethora of dimensionality reduction techniques has emerged over the past decades, leaving researchers and analysts with a wide variety of choices for reducing their data, all the more so given that some techniques come with additional hyper-parameters (e.g., t-SNE, UMAP, etc.). Recent studies show that people often use dimensionality reduction as a black box, regardless of the specific properties the method itself preserves. Hence, 2D embeddings are usually evaluated and compared qualitatively, by setting them side by side and letting human judgment decide which embedding is best. In this work, we propose a quantitative way of evaluating embeddings that nonetheless places human perception at the center. We run a comparative study in which we ask people to select "good" and "misleading" views among scatterplots of low-dimensional embeddings of image datasets, simulating the way people usually select embeddings. We use the study data as labels for a set of quality metrics feeding a supervised machine learning model whose purpose is to discover and quantify what exactly people are looking for when deciding between embeddings. With the model as a proxy for human judgments, we use it to rank embeddings on new datasets, explain why they are relevant, and quantify the degree of subjectivity in people's preferences.
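The metric-based comparison underlying this pipeline can be sketched in miniature: compute a quality score per candidate view, then compare views by score. The metric below (k-nearest-neighbor preservation) and the synthetic data are illustrative assumptions, not the quality metrics or image datasets used in the study.

```python
# Toy sketch of metric-based embedding comparison.
import numpy as np

rng = np.random.default_rng(0)

def knn_preservation(X, Y, k=5):
    """Mean fraction of each point's k nearest neighbors in X that are
    also among its k nearest neighbors in the embedding Y."""
    DX = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    DY = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
    nx = np.argsort(DX, axis=1)[:, 1:k + 1]   # skip self at index 0
    ny = np.argsort(DY, axis=1)[:, 1:k + 1]
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nx, ny)]
    return float(np.mean(overlap))

# Synthetic data with a true 2D structure embedded in 10 dimensions:
Z = rng.normal(size=(60, 2))
X = Z @ rng.normal(size=(2, 10)) + 0.01 * rng.normal(size=(60, 10))

good_view = Z                           # faithful view
bad_view = rng.normal(size=(60, 2))     # unrelated "misleading" view

s_good = knn_preservation(X, good_view)
s_bad = knn_preservation(X, bad_view)
```

In the paper's setting, such scores, together with human "good"/"misleading" labels, would train the supervised model that then ranks embeddings on new datasets.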

3.
IEEE Trans Cybern ; 46(12): 3351-3363, 2016 Dec.
Article in English | MEDLINE | ID: mdl-26685280

ABSTRACT

Extreme learning machines (ELMs) are fast methods that obtain state-of-the-art results in regression. However, they are not robust to outliers, and the selection of their meta-parameter (i.e., the number of neurons for standard ELMs and the regularization constant of the output weights for L2-regularized ELMs) is biased by such instances. This paper proposes a new robust inference algorithm for ELMs based on the pointwise probability reinforcement methodology. Experiments show that the proposed approach produces results comparable to the state of the art, while often being faster.
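For context, the standard (non-robust) ELM the paper builds on can be sketched as a random hidden layer followed by an L2-regularized least-squares fit of the output weights only; the two meta-parameters mentioned in the abstract are `n_hidden` and `reg`. Sizes and the toy target below are illustrative, and this is the baseline method, not the paper's robust variant.

```python
# Minimal ELM regression sketch: fixed random hidden layer,
# ridge-regularized least squares for the output weights.
import numpy as np

rng = np.random.default_rng(1)

def elm_fit(X, y, n_hidden=50, reg=1e-3):
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights (never trained)
    b = rng.normal(size=n_hidden)                 # random biases (never trained)
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    # L2-regularized least squares for the output weights:
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Toy 1D regression problem:
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0]) + 0.05 * rng.normal(size=200)
model = elm_fit(X, y)
mse = float(np.mean((elm_predict(model, X) - y) ** 2))
```

Training is a single linear solve, which is what makes ELMs fast; it is also why a few outliers can bias both the fit and the choice of `n_hidden`/`reg`, the problem the paper addresses.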

4.
IEEE Trans Neural Netw Learn Syst ; 25(5): 845-69, 2014 May.
Article in English | MEDLINE | ID: mdl-24808033

ABSTRACT

Label noise is an important issue in classification, with many potential negative consequences. For example, the accuracy of predictions may decrease, whereas the complexity of inferred models and the number of necessary training samples may increase. Many works in the literature have been devoted to the study of label noise and the development of techniques to deal with it. However, the field lacks a comprehensive survey of the different types of label noise, their consequences, and the algorithms that take label noise into account. This paper proposes to fill this gap. First, the definitions and sources of label noise are considered and a taxonomy of the types of label noise is proposed. Second, the potential consequences of label noise are discussed. Third, label noise-robust, label noise cleansing, and label noise-tolerant algorithms are reviewed. For each category of approaches, a short discussion is provided to help practitioners choose the most suitable technique for their own field of application. Finally, the design of experiments is also discussed, which may interest researchers who would like to test their own algorithms. In this paper, label noise consists of mislabeled instances: no additional information, such as confidences on labels, is assumed to be available.
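As a toy instance of the "label noise cleansing" category in this taxonomy, one can flag training points whose label disagrees with the plurality label of their nearest neighbors (an edited-nearest-neighbor-style filter). The data and parameters below are synthetic and illustrative, not from the survey.

```python
# Toy label-noise cleansing: drop points whose label disagrees with
# the plurality label among their k nearest neighbors.
import numpy as np

rng = np.random.default_rng(2)

def knn_label_filter(X, y, k=7):
    """Keep a point only if its label is a plurality label among
    its k nearest neighbors (ties are kept)."""
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    keep = np.ones(len(y), dtype=bool)
    for i in range(len(y)):
        nn = np.argsort(D[i])[1:k + 1]                 # neighbors, excluding self
        votes = np.bincount(y[nn], minlength=y.max() + 1)
        keep[i] = votes[y[i]] == votes.max()
    return keep

# Two well-separated classes, with 10% of the labels flipped:
n = 100
X = np.vstack([rng.normal(-2.0, 1.0, (n, 2)), rng.normal(2.0, 1.0, (n, 2))])
y = np.array([0] * n + [1] * n)
noisy = y.copy()
flipped = rng.choice(2 * n, size=20, replace=False)
noisy[flipped] = 1 - noisy[flipped]

keep = knn_label_filter(X, noisy)
```

On this easy synthetic problem the filter flags mostly the flipped instances; the survey's point is that on real data such cleansing methods trade off removing mislabeled points against discarding genuinely hard ones.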

5.
Neural Netw ; 50: 124-41, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24300550

ABSTRACT

Statistical inference using machine learning techniques may be difficult with small datasets because of abnormally frequent data (AFDs). AFDs are observations that are much more frequent in the training sample than they should be, with respect to their theoretical probability, and include, e.g., outliers. Estimates of parameters tend to be biased towards models which support such data. This paper proposes to introduce pointwise probability reinforcements (PPRs): the probability of each observation is reinforced by a PPR, and a regularisation allows controlling the amount of reinforcement which compensates for AFDs. The proposed solution is very generic, since it can be used to robustify any statistical inference method that can be formulated as a likelihood maximisation. Experiments show that PPRs can easily be used to tackle regression, classification and projection: models are freed from the influence of outliers. Moreover, outliers can be filtered manually, since an abnormality degree is obtained for each observation.


Subjects
Data Interpretation, Statistical; Probability; Reinforcement, Psychology; Databases, Factual; Humans; Regression Analysis; Statistics, Nonparametric
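The core idea, that each observation receives an abnormality degree which reduces its influence on the fit, can be illustrated with a simple reweighted mean estimate. This reweighting scheme is a simplified stand-in for illustration only, not the paper's PPR algorithm, and the data are synthetic.

```python
# Toy robust mean: per-point abnormality degrees downweight outliers.
import numpy as np

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0.0, 1.0, 95),    # inliers
                    rng.normal(10.0, 0.1, 5)])   # abnormally frequent outliers

mu = np.median(x)
sigma = 1.4826 * np.median(np.abs(x - mu))       # robust scale estimate (MAD)
for _ in range(5):
    a = ((x - mu) / sigma) ** 2                  # abnormality degree per observation
    w = np.exp(-0.5 * a)                         # low weight for abnormal points
    mu = float(np.sum(w * x) / np.sum(w))
```

The plain mean `x.mean()` is pulled upward by the outliers, while the reweighted `mu` stays close to the inlier center; the per-point `a` plays the role of the abnormality degree the abstract mentions, and could be thresholded to filter outliers manually.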
6.
Neural Netw ; 48: 1-7, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23892907

ABSTRACT

Feature selection is an important preprocessing step for many high-dimensional regression problems. One of the most common strategies is to select a relevant feature subset based on the mutual information criterion. However, no connection has yet been established in the machine learning literature between the use of mutual information and a regression error criterion. This is an important gap, since minimising such a criterion is ultimately the objective one is interested in. This paper demonstrates that, under some reasonable assumptions, features selected with the mutual information criterion are the ones minimising the mean squared error and the mean absolute error. Conversely, it is also shown that the mutual information criterion can fail to select optimal features in some situations, which we characterise. The theoretical developments presented in this work are expected to lead in practice to a more critical and efficient use of mutual information for feature selection.


Subjects
Regression Analysis; Algorithms; Artificial Intelligence; Entropy; Image Processing, Computer-Assisted; Informatics; Information Storage and Retrieval; Neural Networks, Computer; Normal Distribution; Signal-To-Noise Ratio
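The selection criterion under study can be sketched with a plug-in estimator: estimate the mutual information I(X_j; Y) between each feature and the target, then rank features by it. The histogram-based estimator, bin count, and toy data below are illustrative assumptions, not the paper's theoretical setting.

```python
# Mutual-information feature ranking via a 2D-histogram plug-in estimate.
import numpy as np

rng = np.random.default_rng(4)

def mi_hist(x, y, bins=8):
    """Plug-in mutual information estimate (in nats) from a 2D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)           # marginal of x
    py = pxy.sum(axis=0, keepdims=True)           # marginal of y
    nz = pxy > 0                                  # avoid log(0) on empty cells
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# Toy regression data: only feature 0 is relevant to the target.
n = 1000
X = rng.normal(size=(n, 3))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=n)

scores = [mi_hist(X[:, j], y) for j in range(3)]
best = int(np.argmax(scores))
```

Here feature 0 receives the highest score even though its relationship to the target is non-monotonic, which a correlation-based ranking would miss; the paper's contribution is characterising when such an MI ranking does, and does not, minimise the MSE or MAE.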