ABSTRACT
Demographic biases in source datasets have been shown to be one of the causes of unfairness and discrimination in the predictions of Machine Learning models. One of the most prominent types of demographic bias is statistical imbalance in the representation of demographic groups in the datasets. In this article, we study the measurement of these biases by reviewing the existing metrics, including those that can be borrowed from other disciplines. We develop a taxonomy for the classification of these metrics, providing a practical guide for the selection of appropriate metrics. To illustrate the utility of our framework, and to further understand the practical characteristics of the metrics, we conduct a case study of 20 datasets used in Facial Emotion Recognition (FER), analyzing the biases present in them. Our experimental results show that many metrics are redundant and that a reduced subset of metrics may be sufficient to measure the amount of demographic bias. The article provides insights that can help researchers in AI and related fields mitigate dataset bias and improve the fairness and accuracy of AI models.
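As a rough illustration of what measuring representational imbalance can look like, the sketch below computes two widely used quantities over a list of demographic group labels: the imbalance ratio (largest group over smallest group) and the normalized Shannon entropy, often called evenness. Both are chosen here as assumptions for illustration; the abstract does not name the metrics it reviews, and the function names and sample labels are hypothetical.

```python
from collections import Counter
from math import log


def imbalance_ratio(labels):
    """Size of the largest group divided by the size of the smallest group.

    1.0 means perfectly balanced; larger values mean more imbalance.
    Only groups that appear at least once in `labels` are counted.
    """
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def shannon_evenness(labels):
    """Shannon entropy of the group proportions, normalized to [0, 1].

    1.0 means all observed groups are equally represented.
    Requires at least two observed groups.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("evenness needs at least two observed groups")
    total = sum(counts.values())
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    return entropy / log(len(counts))


# Hypothetical labels: 90 samples of one group, 10 of another.
sample = ["group_a"] * 90 + ["group_b"] * 10
print(imbalance_ratio(sample))
print(round(shannon_evenness(sample), 4))
```

Note that both functions only see groups present in the data, so a group that is entirely missing from a dataset would need to be handled separately.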
Subjects
Databases, Factual; Facial Expression; Humans; Automated Facial Recognition/methods; Algorithms; Bias; Machine Learning; Image Processing, Computer-Assisted/methods; Demography; Face/anatomy & histology; Face/diagnostic imaging; Pattern Recognition, Automated/methods
ABSTRACT
BACKGROUND AND OBJECTIVE: Severe trauma patients are those with several injuries that carry a risk of death. Prediction systems consider the severity of these injuries to predict whether the patients are likely to survive. These systems allow one to objectively compare the quality of the emergency services of trauma centres across different hospitals. However, even the most accurate existing prediction systems are based on a single model. The aim of this paper is to combine several models to make the prediction, since this methodology usually improves on the performance of single models. MATERIALS AND METHODS: The two prediction systems currently used by the Hospital of Navarre, which are based on logistic regression models, are combined with the C4.5 decision tree to form our proposed multiple classifier system. The quality of the method is tested using the major trauma registry of Navarre, which stores information on 462 trauma patients. A 10x10-fold cross-validation scheme is applied, using as performance measures the sensitivity, the specificity, and the geometric mean of the two. The results are supported by the Mann-Whitney U statistical test. RESULTS: The proposed method provides 0.8908, 0.6703 and 0.7661 for sensitivity, specificity and geometric mean, respectively. It slightly decreases the sensitivity of the currently used systems but notably increases the specificity, which implies a large improvement in the geometric mean. The same behaviour is found when it is compared against four classical ensemble approaches and the random forest. The statistical analysis supports the quality of our proposal, since the obtained p-values are less than 0.01 in all cases. CONCLUSIONS: The obtained results show that the multiple classifier system is the best choice among the considered methods to obtain a trade-off between sensitivity and specificity.
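The three performance measures named above can be sketched as follows for a binary outcome. This is a minimal illustration, not the authors' evaluation code: the function name, the choice of which outcome counts as the positive class, and the toy labels are all assumptions.

```python
from math import sqrt


def sensitivity_specificity_gmean(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, geometric mean) for binary labels.

    Assumption: `positive` marks the class treated as positive; the
    abstract does not state which outcome plays that role.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    # Guard against division by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity, sqrt(sensitivity * specificity)


# Toy example with made-up labels (not data from the registry).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
sens, spec, gmean = sensitivity_specificity_gmean(y_true, y_pred)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} gmean={gmean:.4f}")
```

Because the geometric mean is a product under a square root, it drops sharply when either measure is low, which is why it is a common summary when a trade-off between sensitivity and specificity is wanted.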
Subjects
Wounds and Injuries/classification; Wounds and Injuries/mortality; Adult; Aged; Algorithms; Decision Trees; Emergency Medicine; Female; Humans; Logistic Models; Male; Middle Aged; Registries; Regression Analysis; Reproducibility of Results; Sensitivity and Specificity; Severity of Illness Index; Software; Spain; Treatment Outcome
ABSTRACT
In this paper we present a comparative study of different aggregation functions for combining the RGB color channels in the stereo matching problem. We introduce color information from the images into the stereo matching algorithm by aggregating the similarities of the RGB channels, which are calculated independently. We compare the accuracy of different stereo matching algorithms and aggregation functions. We show experimentally that the best function depends on the stereo matching algorithm considered, but the dual of the geometric mean stands out as the most robust aggregation.
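To make the idea of aggregating per-channel similarities concrete, the sketch below combines three similarity values in [0, 1], one per RGB channel, with a few aggregation functions. It assumes the dual is taken with respect to the standard negation n(x) = 1 - x, so that the dual of the geometric mean is 1 - ((1 - r)(1 - g)(1 - b))^(1/3); the abstract does not spell this out. Function names and the sample values are hypothetical.

```python
from math import prod


def arithmetic_mean(sims):
    """Plain average of the channel similarities."""
    return sum(sims) / len(sims)


def geometric_mean(sims):
    """n-th root of the product of the channel similarities."""
    return prod(sims) ** (1.0 / len(sims))


def dual_geometric_mean(sims):
    """Dual of the geometric mean.

    Assumption: duality with respect to the standard negation
    n(x) = 1 - x, i.e. M_dual(x) = 1 - M(1 - x_1, ..., 1 - x_n).
    """
    return 1.0 - prod(1.0 - s for s in sims) ** (1.0 / len(sims))


# Hypothetical similarities for the R, G and B channels of one window pair.
channel_sims = (0.9, 0.4, 0.0)
for name, fn in (
    ("arithmetic mean", arithmetic_mean),
    ("geometric mean", geometric_mean),
    ("dual of geometric mean", dual_geometric_mean),
):
    print(f"{name}: {fn(channel_sims):.4f}")
```

In this example a single channel with zero similarity drives the geometric mean to zero, while its dual stays positive; conversely, the dual reaches 1 as soon as any one channel similarity is 1.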
ABSTRACT
The stereo matching problem consists of finding corresponding locations between pairs of displaced images of the same scene. Correspondence estimation between pixels suffers from occlusions, noise, and bias. This paper introduces a novel approach to represent images by means of interval-valued fuzzy sets. These sets allow one to overcome the uncertainty due to the aforementioned problems. The aim is to take advantage of the new representation to develop a stereo matching algorithm. The interval-valued fuzzification process for images proposed here is based on image segmentation. Interval-valued fuzzy similarities are introduced to compare windows whose pixels are represented by intervals. To make use of color information, the similarities of the RGB channels are aggregated using the luminance formula. The experimental analysis compares the approach with other methods. The proposed representation, together with the new similarity measure, shows better overall behavior, providing more accurate correspondences, mainly near depth discontinuities and for images with a large amount of color.
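One possible reading of aggregating interval-valued channel similarities with the luminance formula is sketched below. Two assumptions are made that the abstract does not confirm: the weights are the common Rec. 601 luma coefficients (0.299, 0.587, 0.114), and the weighted sum is applied separately to the lower and upper bounds of each interval. All names and sample values are hypothetical.

```python
# Assumption: Rec. 601 luma weights for the R, G and B channels.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def luminance_aggregate(sim_r, sim_g, sim_b, weights=LUMA_WEIGHTS):
    """Aggregate three interval-valued similarities into one interval.

    Each similarity is a (lower, upper) pair with 0 <= lower <= upper <= 1.
    Assumption: the weighted sum is applied endpoint-wise. Since the
    weights are non-negative, the result still satisfies lower <= upper.
    """
    channels = (sim_r, sim_g, sim_b)
    for lower, upper in channels:
        if not 0.0 <= lower <= upper <= 1.0:
            raise ValueError("each similarity must be an interval in [0, 1]")
    lower = sum(w * s[0] for w, s in zip(weights, channels))
    upper = sum(w * s[1] for w, s in zip(weights, channels))
    return lower, upper


# Hypothetical interval similarities for one pair of windows.
result = luminance_aggregate((0.6, 0.8), (0.7, 0.9), (0.2, 0.5))
print(result)
```

With these weights the green channel dominates the aggregated similarity, mirroring its weight in the luminance of a color image.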