Results 1 - 6 of 6
1.
Med Image Anal; 95: 103188, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38718715

ABSTRACT

In medical image diagnosis, fairness has become increasingly crucial. Without bias mitigation, deploying unfair AI would harm the interests of the underprivileged population and potentially tear society apart. Recent research addresses prediction biases in deep learning models concerning demographic groups (e.g., gender, age, and race) by utilizing demographic (sensitive attribute) information during training. However, many sensitive attributes naturally exist in dermatological disease images. If the trained model only targets fairness for a specific attribute, it remains unfair for other attributes. Moreover, training a model that can accommodate multiple sensitive attributes is impractical due to privacy concerns. To overcome this, we propose a method enabling fair predictions for sensitive attributes during the testing phase without using such information during training. Inspired by prior work highlighting the impact of feature entanglement on fairness, we enhance the model features by capturing the features related to the sensitive and target attributes and regularizing the feature entanglement between corresponding classes. This ensures that the model can only classify based on the features related to the target attribute without relying on features associated with sensitive attributes, thereby improving fairness and accuracy. Additionally, we use disease masks from the Segment Anything Model (SAM) to enhance the quality of the learned feature. Experimental results demonstrate that the proposed method can improve fairness in classification compared to state-of-the-art methods in two dermatological disease datasets.
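The abstract describes the entanglement regularization only at a high level; below is a minimal Python/PyTorch sketch of one way a decorrelation-style penalty between target-attribute and sensitive-attribute features could look. This is not the authors' implementation, and all function and variable names are hypothetical.

    import torch
    import torch.nn.functional as F

    def entanglement_penalty(target_feats: torch.Tensor,
                             sensitive_feats: torch.Tensor) -> torch.Tensor:
        """Penalize correlation between target- and sensitive-attribute features.

        Both inputs are (batch, dim) outputs of two heads on a shared backbone.
        Driving their cross-correlation toward zero discourages the target head
        from relying on cues tied to sensitive attributes.
        """
        # Standardize each feature dimension across the batch.
        t = (target_feats - target_feats.mean(0)) / (target_feats.std(0) + 1e-6)
        s = (sensitive_feats - sensitive_feats.mean(0)) / (sensitive_feats.std(0) + 1e-6)
        # Cross-correlation matrix between the two feature spaces.
        corr = (t.T @ s) / target_feats.shape[0]
        return corr.abs().mean()

    # Sketch of a training step: task loss plus a weighted entanglement penalty.
    # loss = F.cross_entropy(logits, labels) + 0.1 * entanglement_penalty(t_feats, s_feats)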


Subjects
Skin Diseases, Humans, Skin Diseases/diagnostic imaging, Deep Learning, Image Interpretation, Computer-Assisted/methods, Demography
2.
Ophthalmol Sci; 4(4): 100471, 2024.
Article in English | MEDLINE | ID: mdl-38591048

ABSTRACT

Topic: This scoping review summarizes artificial intelligence (AI) reporting in the ophthalmology literature with respect to model development and validation. We characterize the state of transparency in reporting of studies prospectively validating models for disease classification. Clinical Relevance: Understanding what elements authors currently describe regarding their AI models may aid in the future standardization of reporting. This review highlights the need for transparency to facilitate the critical appraisal of models prior to clinical implementation, to minimize bias and inappropriate use. Transparent reporting can improve effective and equitable use in clinical settings. Methods: Eligible articles (as of January 2022) from PubMed, Embase, Web of Science, and CINAHL were independently screened by 2 reviewers. All observational and clinical trial studies evaluating the performance of an AI model for disease classification of ophthalmic conditions were included. Studies were evaluated for reporting of parameters derived from reporting guidelines (CONSORT-AI, MI-CLAIM) and our previously published editorial on model cards. The reporting of these factors, which included basic model and dataset details (source, demographics) and prospective validation outcomes, was summarized. Results: Thirty-seven prospective validation studies were included in the scoping review. Eleven additional associated training and/or retrospective validation studies were included if this information could not be determined from the primary articles. These 37 studies validated 27 unique AI models; multiple studies evaluated the same algorithms (EyeArt, IDx-DR, and Medios AI). Details of model development were variably reported; 18 of 27 models described training dataset annotation and 10 of 27 studies reported training data distribution. Demographic information of training data was rarely reported; 7 of the 27 unique models reported age and gender and only 2 reported race and/or ethnicity. At the level of prospective clinical validation, age and gender of populations were more consistently reported (29 and 28 of 37 studies, respectively), but only 9 studies reported race and/or ethnicity data. Scope of use was difficult to discern for the majority of models. Fifteen studies did not state or imply primary users. Conclusion: Our scoping review demonstrates variable reporting of information related to both model development and validation. The intention of our study was not to assess the quality of the factors we examined, but to characterize what information is, and is not, regularly reported. Our results suggest the need for greater transparency in the reporting of information necessary to determine the appropriateness and fairness of these tools prior to clinical use. Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

3.
Comput Methods Programs Biomed; 245: 108044, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38290289

ABSTRACT

BACKGROUND: The field of dermatological image analysis using deep neural networks includes the semantic segmentation of skin lesions, pivotal for lesion analysis, pathology inference, and diagnosis. While biases in neural network-based dermatoscopic image classification against darker skin tones due to dataset imbalance and contrast disparities are acknowledged, a comprehensive exploration of skin color bias in lesion segmentation models is lacking. It is imperative to address and understand the biases in these models. METHODS: Our study comprehensively evaluates skin tone bias within prevalent neural networks for skin lesion segmentation. Since no information about skin color exists in widely used datasets, we quantify the bias using three distinct skin color estimation methods: Fitzpatrick skin type estimation, Individual Typology Angle estimation, and manual grouping of images by skin color. We assess bias across common models by training a variety of U-Net-based models on three widely used datasets with 1758 different dermoscopic and clinical images. We also evaluate commonly suggested methods to mitigate bias. RESULTS: Our findings expose a large and significant correlation between segmentation performance and skin color, revealing consistent challenges in segmenting lesions for darker skin tones across diverse datasets. Using various methods of skin color quantification, we have found significant bias in skin lesion segmentation against darker-skinned individuals when evaluated both in-sample and out-of-sample. We also find that commonly used methods for bias mitigation do not result in any significant reduction in bias. CONCLUSIONS: Our findings suggest a pervasive bias in most published lesion segmentation methods, given our use of commonly employed neural network architectures and publicly available datasets. In light of our findings, we propose recommendations for unbiased dataset collection, labeling, and model development. This presents the first comprehensive evaluation of fairness in skin lesion segmentation.
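The Individual Typology Angle (ITA) mentioned above is a standard colorimetric measure of skin tone; the Python sketch below shows how ITA could be computed from the non-lesion skin pixels of a dermoscopic image. It is a generic illustration under assumed inputs, not the authors' pipeline.

    import numpy as np
    from skimage import color, io

    def individual_typology_angle(rgb_image: np.ndarray, skin_mask: np.ndarray) -> float:
        """Estimate skin tone as ITA = arctan((L* - 50) / b*) in degrees.

        rgb_image: (H, W, 3) image; skin_mask: boolean mask of non-lesion skin
        pixels. Higher ITA values correspond to lighter skin tones.
        """
        lab = color.rgb2lab(rgb_image)
        L = lab[..., 0][skin_mask]   # lightness channel
        b = lab[..., 2][skin_mask]   # blue-yellow channel
        return float(np.degrees(np.arctan2(L.mean() - 50.0, b.mean())))

    # Usage sketch: exclude the lesion, then estimate ITA on the remaining skin.
    # img = io.imread("example_dermoscopy.jpg")
    # ita = individual_typology_angle(img, ~lesion_mask)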


Subjects
Deep Learning, Skin Diseases, Humans, Skin Pigmentation, Dermoscopy/methods, Skin Diseases/diagnostic imaging, Skin/diagnostic imaging, Image Processing, Computer-Assisted/methods
4.
Res Sq; 2023 Oct 09.
Article in English | MEDLINE | ID: mdl-37886534

ABSTRACT

The clock drawing test (CDT) is a neuropsychological assessment tool used to evaluate a patient's cognitive ability. In this study, we developed a Fair and Interpretable Representation of Clock drawing tests (FaIRClocks) to evaluate and mitigate bias against people with lower education while predicting their cognitive status. We represented clock drawings with a 10-dimensional latent embedding using a Relevance Factor Variational Autoencoder (RF-VAE) network pretrained on publicly available clock drawings from the National Health and Aging Trends Study (NHATS) dataset. These embeddings were later fine-tuned for predicting three cognitive scores: the Mini-Mental State Examination (MMSE) total score, attention composite z-score (ATT-C), and memory composite z-score (MEM-C). The classifiers were initially tested to compare their performance between patients with low education (<= 8 years) and patients with higher education (> 8 years). Results indicated that the initial unweighted classifiers confounded lower education with cognitive impairment, resulting in a 100% type I error rate for this group. Therefore, the samples were re-weighted using multiple fairness metrics to achieve balanced performance. In summary, we report the FaIRClocks model, which a) can identify attention and memory deficits using clock drawings and b) exhibits identical performance between people with higher and lower education levels.
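The re-weighting step is described only in general terms; one common scheme is to assign inverse-frequency weights over (education group, label) cells and pass them to a downstream classifier, as in the minimal Python sketch below. Column and variable names are hypothetical, and this is not the authors' code.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def balanced_group_weights(groups: np.ndarray, labels: np.ndarray) -> np.ndarray:
        """Inverse-frequency weights so each (group, label) cell carries equal total weight."""
        weights = np.ones(len(labels), dtype=float)
        n_cells = len(np.unique(groups)) * len(np.unique(labels))
        for g in np.unique(groups):
            for y in np.unique(labels):
                cell = (groups == g) & (labels == y)
                if cell.any():
                    weights[cell] = len(labels) / (n_cells * cell.sum())
        return weights

    # Usage sketch: X holds clock-drawing embeddings, education_group is 0 for <= 8
    # years and 1 for > 8 years, impaired is the binary cognitive-impairment label.
    # w = balanced_group_weights(education_group, impaired)
    # LogisticRegression(max_iter=1000).fit(X, impaired, sample_weight=w)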

5.
bioRxiv; 2023 May 31.
Article in English | MEDLINE | ID: mdl-37293014

ABSTRACT

AlphaFold2 is reshaping biomedical research by enabling the prediction of a protein's 3D structure solely based on its amino acid sequence. This breakthrough reduces reliance on labor-intensive experimental methods traditionally used to obtain protein structures, thereby accelerating the pace of scientific discovery. Despite this promise, it remains unclear whether AlphaFold2 predicts the wide spectrum of proteins equally well, and the fairness of its predictions has yet to be systematically investigated. In this paper, we conducted an in-depth analysis of AlphaFold2's fairness using data comprising five million reported protein structures from its open-access repository. Specifically, we assessed the variability in the distribution of pLDDT scores, considering factors such as amino acid type, secondary structure, and sequence length. Our findings reveal a systematic discrepancy in AlphaFold2's predictive reliability, varying across different types of amino acids and secondary structures. Furthermore, we observed that the size of the protein exerts a notable impact on the reliability of the 3D structural prediction: AlphaFold2 demonstrates greater predictive power for medium-sized proteins than for smaller or larger ones. These systematic biases could potentially stem from inherent biases present in its training data and model architecture. These factors need to be taken into account when expanding the applicability of AlphaFold2.
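As one illustration of the kind of stratified analysis the abstract describes, the Python sketch below bins predicted structures by sequence length and compares mean pLDDT across bins. The summary file and column names are hypothetical and are not part of the AlphaFold repository.

    import pandas as pd

    # One row per predicted structure, with its sequence length and mean pLDDT.
    df = pd.read_csv("alphafold_plddt_summary.csv")   # hypothetical summary table

    # Bin proteins by length and compare confidence distributions per bin.
    bins = [0, 100, 300, 600, 1000, 3000]
    df["length_bin"] = pd.cut(df["length"], bins=bins)
    summary = df.groupby("length_bin", observed=True)["mean_plddt"].agg(["mean", "std", "count"])
    print(summary)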

6.
AI Soc; 1-25, 2023 Feb 09.
Article in English | MEDLINE | ID: mdl-36789242

ABSTRACT

Uncovering the world's ethnic inequalities is hampered by a lack of ethnicity-annotated datasets. Name-ethnicity classifiers (NECs) can help, as they are able to infer people's ethnicities from their names. However, since the latest generation of NECs relies on machine learning and artificial intelligence (AI), they may suffer from the same racist and sexist biases found in many AIs. Therefore, this paper offers an algorithmic fairness audit of three NECs. It finds that the UK-Census-trained EthnicityEstimator displays large accuracy biases with regard to ethnicity, but relatively smaller biases across gender and age groups. In contrast, the Twitter-trained NamePrism and the Wikipedia-trained Ethnicolr are more balanced across ethnicities, but less so across gender and age. We relate these biases to global power structures manifested in naming conventions and NECs' input distribution of names. To mitigate the uncovered biases, we develop a novel NEC, N2E, using fairness-aware AI techniques. We make N2E freely available at www.name-to-ethnicity.com. Supplementary Information: The online version contains supplementary material available at 10.1007/s00146-022-01619-4.
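A fairness audit of this kind amounts to comparing accuracy across annotated groups; the Python sketch below shows the basic computation on a hypothetical prediction table. Column and file names are assumed for illustration and are not taken from the paper.

    import pandas as pd

    # One row per name: classifier prediction, reference label, and group annotations.
    df = pd.read_csv("nec_predictions.csv")   # hypothetical file
    df["correct"] = df["predicted_ethnicity"] == df["true_ethnicity"]

    for attribute in ["true_ethnicity", "gender", "age_group"]:
        acc = df.groupby(attribute)["correct"].mean().sort_values()
        print(f"\nAccuracy by {attribute} (gap = {acc.max() - acc.min():.3f})")
        print(acc.round(3))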
