ABSTRACT
The study and early detection of breast cancer are key to its treatment. We carried out an exhaustive analysis of the most widely used database for mastology research with infrared images, classifying the anomalies according to five quality dimensions: completeness, correctness, concordance, plausibility, and currency. We established control queries that search for these anomalies and can be used to ensure the quality of the database. Finally, we briefly review the more than 40 papers that use this database without mentioning any of these anomalies. In our analysis of the database, we found 365 anomalies related to personal and clinical data and to thermal images. The errors found in our research may alter the results and conclusions of articles in the literature, serve as a basis for improving the quality of the database, and help future researchers work with it.
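The idea of a control query can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the table names, columns, rows, and the three checks are hypothetical and are not taken from the database or from the study. The mapping of each check to a quality dimension is also only a loose reading of the dimensions named above.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (id INTEGER PRIMARY KEY, age INTEGER, diagnosis TEXT);
CREATE TABLE images (id INTEGER PRIMARY KEY, patient_id INTEGER, view TEXT);
INSERT INTO patients VALUES (1, 54, 'healthy'), (2, NULL, 'sick'), (3, 212, 'healthy');
INSERT INTO images VALUES (10, 1, 'front'), (11, 2, 'front'), (12, 9, 'front');
""")

# Each control query returns the ids of the rows that violate one rule.
CONTROL_QUERIES = {
    # Completeness: a required personal field is missing.
    "missing_age": "SELECT id FROM patients WHERE age IS NULL ORDER BY id",
    # Plausibility: the value exists but falls outside a believable range.
    "implausible_age": (
        "SELECT id FROM patients WHERE age NOT BETWEEN 0 AND 120 ORDER BY id"
    ),
    # Concordance (loosely): an image refers to a patient that does not exist.
    "orphan_image": (
        "SELECT i.id FROM images AS i "
        "LEFT JOIN patients AS p ON p.id = i.patient_id "
        "WHERE p.id IS NULL ORDER BY i.id"
    ),
}


def run_control_queries(connection):
    """Run every control query and collect the offending ids per rule."""
    return {
        name: [row[0] for row in connection.execute(sql)]
        for name, sql in CONTROL_QUERIES.items()
    }


report = run_control_queries(conn)
for rule, ids in report.items():
    print(f"{rule}: {len(ids)} anomalies, ids={ids}")
```

Keeping the queries in a named dictionary means the same checks can be re-run whenever the data change, which is one way such queries could support ongoing quality control.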
Subjects
Breast Neoplasms , Thermography , Humans , Female , Thermography/methods , Breast , Breast Neoplasms/diagnostic imaging , Databases, Factual
ABSTRACT
A sum-product network (SPN) is a probabilistic model, based on a rooted directed acyclic graph, in which terminal nodes represent probability distributions and non-terminal nodes represent convex sums (weighted averages) and products of probability distributions. SPNs are closely related to probabilistic graphical models, in particular to Bayesian networks with multiple context-specific independencies. Their main advantage is the possibility of building tractable models from data, i.e., models that can perform several inference tasks in time proportional to the number of edges in the graph. SPNs are somewhat similar to neural networks and can address the same kinds of problems, such as image processing and natural language understanding. This paper offers a survey of SPNs, including their definition, the main algorithms for inference and learning from data, several applications, a brief review of software libraries, and a comparison with related models.
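To make the definition concrete, here is a minimal sketch of a tiny SPN over two binary variables, evaluated bottom-up. The class names, the variable names `x1` and `x2`, and all weights and parameters are hypothetical choices for illustration; the sketch does not follow any particular library. Because this example network is a tree, the plain recursion visits each edge once; a graph with shared nodes would need the node values cached to keep the cost proportional to the number of edges.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Leaf:
    """Terminal node: a Bernoulli distribution over one binary variable."""
    var: str
    p_true: float  # P(var = True)

    def value(self, evidence):
        if self.var not in evidence:
            return 1.0  # unobserved variable is marginalised out
        return self.p_true if evidence[self.var] else 1.0 - self.p_true


@dataclass
class Product:
    """Product of child distributions over disjoint sets of variables."""
    children: list

    def value(self, evidence):
        return prod(child.value(evidence) for child in self.children)


@dataclass
class Sum:
    """Convex sum: (weight, child) pairs whose weights add up to 1."""
    weighted_children: list

    def value(self, evidence):
        return sum(w * child.value(evidence) for w, child in self.weighted_children)


# Illustrative network: a mixture of two fully factorised components.
spn = Sum([
    (0.6, Product([Leaf("x1", 0.9), Leaf("x2", 0.2)])),
    (0.4, Product([Leaf("x1", 0.1), Leaf("x2", 0.7)])),
])

joint = spn.value({"x1": True, "x2": True})  # P(x1=True, x2=True)
marginal = spn.value({"x1": True})           # P(x1=True), x2 summed out
print(joint, marginal)
```

The same single pass answers both the joint and the marginal query; only the evidence passed to the leaves changes.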
Subjects
Algorithms , Neural Networks, Computer , Bayes Theorem , Models, Statistical
ABSTRACT
BACKGROUND AND OBJECTIVE: Breast cancer is the most common cancer in women. While mammography is the most widely used screening technique for the early detection of this disease, it has several disadvantages, such as radiation exposure and high economic cost. Recently, multiple authors have studied the ability of machine learning algorithms to diagnose breast cancer early using thermal images, showing that thermography can be considered a complementary test to mammography, or even a primary test under certain circumstances. Moreover, although some personal and clinical data are considered risk factors for breast cancer, none of these works considered that information jointly with thermal images. METHODS: We propose a novel approach for the early detection of breast cancer that combines thermal images of different views with personal and clinical data, building a multi-input classification model that exploits the benefits of convolutional neural networks for image analysis. First, we searched for network structures using only thermal images. Next, we added the clinical data as a new branch of each of these structures, aiming to improve their performance. RESULTS: We applied our method to the most widely used public database of breast thermal images, the Database for Mastology Research with Infrared Image. The best model achieves an accuracy of 97% and an area under the ROC curve of 0.99, with a specificity of 100% and a sensitivity of 83%. CONCLUSIONS: After studying the impact of thermal images and personal and clinical data on multi-input convolutional neural networks for breast cancer diagnosis, we conclude that: (1) adding the lateral views to the front view improves the performance of the classification model, and (2) including personal and clinical data helps the model to recognize sick patients.
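For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, and specificity are computed from a confusion matrix. The function name and the counts are hypothetical; the counts are deliberately not those of the study and do not reproduce its figures.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp: sick patients classified as sick      fn: sick patients missed
    tn: healthy patients classified as healthy fp: healthy patients flagged
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # share of sick patients detected
        "specificity": tn / (tn + fp),  # share of healthy patients cleared
    }


# Hypothetical counts for illustration only (not the study's data).
metrics = classification_metrics(tp=8, fn=2, tn=38, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

When healthy cases outnumber sick ones, accuracy is dominated by specificity, so a high accuracy can coexist with a noticeably lower sensitivity.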