ABSTRACT
Correct interpretation of breast density is important in the assessment of breast cancer risk. AI has been shown to be capable of accurately predicting breast density; however, because imaging characteristics differ across mammography systems, models built using data from one system do not generalize well to other systems. Although federated learning (FL) has emerged as a way to improve the generalizability of AI without the need to share data, the best way to preserve features from all training data during FL is an active area of research. To explore FL methodology, the breast density classification FL challenge was hosted in partnership with the American College of Radiology, Harvard Medical School's Mass General Brigham, University of Colorado, NVIDIA, and the National Institutes of Health National Cancer Institute. Challenge participants were able to submit Docker containers capable of implementing FL on three simulated medical facilities, each containing a unique large mammography dataset. The breast density FL challenge ran from June 15 to September 5, 2022, attracting seven finalists from around the world. The winning FL submission reached a linear kappa score of 0.653 on the challenge test data and 0.413 on an external testing dataset, scoring comparably to a model trained on the same data in a central location.
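The linear kappa score reported above can be illustrated with a minimal sketch, assuming it refers to the standard linearly weighted Cohen's kappa and that density categories are encoded as integers 0 to n_classes - 1. The function name and the integer encoding are illustrative assumptions, not details taken from the challenge.

```python
from collections import Counter


def linear_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with linear disagreement weights |i - j| / (n_classes - 1).

    Assumes labels are integers in the range 0 .. n_classes - 1 (hypothetical encoding).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")

    n = len(y_true)
    # Observed (true, predicted) pair counts and the marginal class counts.
    pair_counts = Counter(zip(y_true, y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = abs(i - j) / (n_classes - 1)
            observed_disagreement += weight * pair_counts[(i, j)] / n
            # Expected disagreement if the two label sets were independent.
            expected_disagreement += weight * true_counts[i] * pred_counts[j] / (n * n)

    if expected_disagreement == 0:
        # Both label sets use one and the same class; kappa is undefined.
        return float("nan")
    return 1.0 - observed_disagreement / expected_disagreement
```

With linear weights, predicting an adjacent category is penalized less than predicting a distant one, which suits ordered labels such as density grades.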
Subjects
Algorithms, Breast Density, Breast Neoplasms, Mammography, Humans, Female, Mammography/methods, Breast Neoplasms/diagnostic imaging, Machine Learning
ABSTRACT
In the field of transportation and logistics, smart vision systems have been employed successfully to automate tasks such as number-plate recognition and vehicle identity recognition. Developing such automated systems depends on the availability of large, properly annotated image datasets. The TRODO dataset is a richly annotated collection of odometer displays that can enable automatic mileage reading from raw images. Initially, the dataset consisted of 2613 frames captured under different conditions in terms of resolution, quality, illumination, and vehicle type. After data pre-processing and cleaning, the number of images was reduced to 2389. The images were annotated using the CVAT image annotation tool. The dataset provides the following information for each frame: the type of odometer (analog or digital), the mileage value displayed on the odometer, the bounding box of the odometer, and the bounding boxes of the digits and characters displayed on the screen. Combined with machine learning and artificial intelligence, the TRODO dataset can be used to train odometer classifiers, digit recognition models, and number-reading models for odometers and similar types of displays.
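The per-frame information listed above could be represented roughly as follows. This is only a sketch: every class and field name here is hypothetical and does not reflect the dataset's actual file format, and the left-to-right reading helper assumes a single-row display.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Box:
    """Axis-aligned bounding box with a label (hypothetical layout)."""
    label: str  # "odometer", or a single digit/character such as "7"
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass
class FrameAnnotation:
    """One annotated frame, mirroring the fields described in the abstract."""
    image_name: str
    odometer_type: str  # "analog" or "digital"
    mileage: str  # value shown on the odometer, kept as text
    odometer_box: Box
    character_boxes: List[Box] = field(default_factory=list)


def reading_from_boxes(frame: FrameAnnotation) -> str:
    """Rebuild the displayed string by ordering character boxes left to right.

    Assumes all characters sit on a single row of the display.
    """
    ordered = sorted(frame.character_boxes, key=lambda box: box.x_min)
    return "".join(box.label for box in ordered)
```

A consistency check such as comparing reading_from_boxes(frame) with frame.mileage is one way annotations of this shape could be validated before training.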