Búsqueda | BVS Bolivia

1.

An Overview of Public Retinal Optical Coherence Tomography Datasets: Access, Annotations, and Beyond.

Rozhyna, Anastasiia; Somfai, Gábor Márk; Atzori, Manfredo; Müller, Henning.

Stud Health Technol Inform ; 316: 1664-1668, 2024 Aug 22.

Artículo en Inglés | MEDLINE | ID: mdl-39176530

RESUMEN

In ophthalmology, Optical Coherence Tomography (OCT) has become a daily used tool in the diagnostics and therapeutic planning of various diseases. Publicly available datasets play a crucial role in advancing research by providing access to diverse imaging data for algorithm development. The accessibility, data format, annotations, and metadata are not consistent across OCT datasets, making it challenging to efficiently use the available resources. This article provides a comprehensive analysis of different OCT datasets, with particular attention to dataset properties, disease representation, accessibility, and aims to create a catalog of all publicly available OCT datasets. The goal is to improve accessibility to OCT data, increase openness about the availability, and give important new perspectives on the state of OCT imaging resources. Our findings reveal the need for improved data-sharing practices and standardized documentation.

Asunto(s)

Tomografía de Coherencia Óptica , Humanos , Enfermedades de la Retina/diagnóstico por imagen , Bases de Datos Factuales , Retina/diagnóstico por imagen , Difusión de la Información

2.

Exploring Publicly Accessible Optical Coherence Tomography Datasets: A Comprehensive Overview.

Rozhyna, Anastasiia; Somfai, Gábor Márk; Atzori, Manfredo; DeBuc, Delia Cabrera; Saad, Amr; Zoellin, Jay; Müller, Henning.

Diagnostics (Basel) ; 14(15)2024 Aug 01.

Artículo en Inglés | MEDLINE | ID: mdl-39125544

RESUMEN

Artificial intelligence has transformed medical diagnostic capabilities, particularly through medical image analysis. AI algorithms perform well in detecting abnormalities with a strong performance, enabling computer-aided diagnosis by analyzing the extensive amounts of patient data. The data serve as a foundation upon which algorithms learn and make predictions. Thus, the importance of data cannot be underestimated, and clinically corresponding datasets are required. Many researchers face a lack of medical data due to limited access, privacy concerns, or the absence of available annotations. One of the most widely used diagnostic tools in ophthalmology is Optical Coherence Tomography (OCT). Addressing the data availability issue is crucial for enhancing AI applications in the field of OCT diagnostics. This review aims to provide a comprehensive analysis of all publicly accessible retinal OCT datasets. Our main objective is to compile a list of OCT datasets and their properties, which can serve as an accessible reference, facilitating data curation for medical image analysis tasks. For this review, we searched through the Zenodo repository, Mendeley Data repository, MEDLINE database, and Google Dataset search engine. We systematically evaluated all the identified datasets and found 23 open-access datasets containing OCT images, which significantly vary in terms of size, scope, and ground-truth labels. Our findings indicate the need for improvement in data-sharing practices and standardized documentation. Enhancing the availability and quality of OCT datasets will support the development of AI algorithms and ultimately improve diagnostic capabilities in ophthalmology. By providing a comprehensive list of accessible OCT datasets, this review aims to facilitate better utilization and development of AI in medical image analysis.

3.

Improving quality control of whole slide images by explicit artifact augmentation.

Jurgas, Artur; Wodzinski, Marek; D'Amato, Marina; van der Laak, Jeroen; Atzori, Manfredo; Müller, Henning.

Sci Rep ; 14(1): 17847, 2024 08 01.

Artículo en Inglés | MEDLINE | ID: mdl-39090284

RESUMEN

The problem of artifacts in whole slide image acquisition, prevalent in both clinical workflows and research-oriented settings, necessitates human intervention and re-scanning. Overcoming this challenge requires developing quality control algorithms, that are hindered by the limited availability of relevant annotated data in histopathology. The manual annotation of ground-truth for artifact detection methods is expensive and time-consuming. This work addresses the issue by proposing a method dedicated to augmenting whole slide images with artifacts. The tool seamlessly generates and blends artifacts from an external library to a given histopathology dataset. The augmented datasets are then utilized to train artifact classification methods. The evaluation shows their usefulness in classification of the artifacts, where they show an improvement from 0.10 to 0.01 AUROC depending on the artifact type. The framework, model, weights, and ground-truth annotations are freely released to facilitate open science and reproducible research.

Asunto(s)

Algoritmos , Artefactos , Procesamiento de Imagen Asistido por Computador , Control de Calidad , Humanos , Procesamiento de Imagen Asistido por Computador/métodos

4.

BIDSAlign: a library for automatic merging and preprocessing of multiple EEG repositories.

Zanola, Andrea; Del Pup, Federico; Porcaro, Camillo; Atzori, Manfredo.

J Neural Eng ; 21(4)2024 Aug 20.

Artículo en Inglés | MEDLINE | ID: mdl-39094617

RESUMEN

Objective.This study aims to address the challenges associated with data-driven electroencephalography (EEG) data analysis by introducing a standardised library calledBIDSAlign. This library efficiently processes and merges heterogeneous EEG datasets from different sources into a common standard template. The goal of this work is to create an environment that allows to preprocess public datasets in order to provide data for the effective training of deep learning (DL) architectures.Approach.The library can handle both Brain Imaging Data Structure (BIDS) and non-BIDS datasets, allowing the user to easily preprocess multiple public datasets. It unifies the EEG recordings acquired with different settings by defining a common pipeline and a specified channel template. An array of visualisation functions is provided inside the library, together with a user-friendly graphical user interface to assist non-expert users throughout the workflow.Main results.BIDSAlign enables the effective use of public EEG datasets, providing valuable medical insights, even for non-experts in the field. Results from applying the library to datasets from OpenNeuro demonstrate its ability to extract significant medical knowledge through an end-to-end workflow, facilitating group analysis, visual comparison and statistical testing.Significance.BIDSAlign solves the lack of large EEG datasets by aligning multiple datasets to a standard template. This unlocks the potential of public EEG data for training DL models. It paves the way to promising contributions based on DL to clinical and non-clinical EEG research, offering insights that can inform neurological disease diagnosis and treatment strategies.

Asunto(s)

Electroencefalografía , Electroencefalografía/métodos , Humanos , Bases de Datos Factuales , Aprendizaje Profundo , Procesamiento de Señales Asistido por Computador

5.

Multimodal representations of biomedical knowledge from limited training whole slide images and reports using deep learning.

Marini, Niccolò; Marchesin, Stefano; Wodzinski, Marek; Caputo, Alessandro; Podareanu, Damian; Guevara, Bryan Cardenas; Boytcheva, Svetla; Vatrano, Simona; Fraggetta, Filippo; Ciompi, Francesco; Silvello, Gianmaria; Müller, Henning; Atzori, Manfredo.

Med Image Anal ; 97: 103303, 2024 Oct.

Artículo en Inglés | MEDLINE | ID: mdl-39154617

RESUMEN

The increasing availability of biomedical data creates valuable resources for developing new deep learning algorithms to support experts, especially in domains where collecting large volumes of annotated data is not trivial. Biomedical data include several modalities containing complementary information, such as medical images and reports: images are often large and encode low-level information, while reports include a summarized high-level description of the findings identified within data and often only concerning a small part of the image. However, only a few methods allow to effectively link the visual content of images with the textual content of reports, preventing medical specialists from properly benefitting from the recent opportunities offered by deep learning models. This paper introduces a multimodal architecture creating a robust biomedical data representation encoding fine-grained text representations within image embeddings. The architecture aims to tackle data scarcity (combining supervised and self-supervised learning) and to create multimodal biomedical ontologies. The architecture is trained on over 6,000 colon whole slide Images (WSI), paired with the corresponding report, collected from two digital pathology workflows. The evaluation of the multimodal architecture involves three tasks: WSI classification (on data from pathology workflow and from public repositories), multimodal data retrieval, and linking between textual and visual concepts. Noticeably, the latter two tasks are available by architectural design without further training, showing that the multimodal architecture that can be adopted as a backbone to solve peculiar tasks. The multimodal data representation outperforms the unimodal one on the classification of colon WSIs and allows to halve the data needed to reach accurate performance, reducing the computational power required and thus the carbon footprint. The combination of images and reports exploiting self-supervised algorithms allows to mine databases without needing new annotations provided by experts, extracting new information. In particular, the multimodal visual ontology, linking semantic concepts to images, may pave the way to advancements in medicine and biomedical analysis domains, not limited to histopathology.

Asunto(s)

Aprendizaje Profundo , Humanos , Algoritmos , Interpretación de Imagen Asistida por Computador/métodos , Almacenamiento y Recuperación de la Información/métodos , Procesamiento de Imagen Asistido por Computador/métodos

6.

The ACROBAT 2022 challenge: Automatic registration of breast cancer tissue.

Weitz, Philippe; Valkonen, Masi; Solorzano, Leslie; Carr, Circe; Kartasalo, Kimmo; Boissin, Constance; Koivukoski, Sonja; Kuusela, Aino; Rasic, Dusan; Feng, Yanbo; Pouplier, Sandra Sinius; Sharma, Abhinav; Eriksson, Kajsa Ledesma; Robertson, Stephanie; Marzahl, Christian; Gatenbee, Chandler D; Anderson, Alexander R A; Wodzinski, Marek; Jurgas, Artur; Marini, Niccolò; Atzori, Manfredo; Müller, Henning; Budelmann, Daniel; Weiss, Nick; Heldmann, Stefan; Lotz, Johannes; Wolterink, Jelmer M; De Santi, Bruno; Patil, Abhijeet; Sethi, Amit; Kondo, Satoshi; Kasai, Satoshi; Hirasawa, Kousuke; Farrokh, Mahtab; Kumar, Neeraj; Greiner, Russell; Latonen, Leena; Laenkholm, Anne-Vibeke; Hartman, Johan; Ruusuvuori, Pekka; Rantalainen, Mattias.

Med Image Anal ; 97: 103257, 2024 Oct.

Artículo en Inglés | MEDLINE | ID: mdl-38981282

RESUMEN

The alignment of tissue between histopathological whole-slide-images (WSI) is crucial for research and clinical applications. Advances in computing, deep learning, and availability of large WSI datasets have revolutionised WSI analysis. Therefore, the current state-of-the-art in WSI registration is unclear. To address this, we conducted the ACROBAT challenge, based on the largest WSI registration dataset to date, including 4,212 WSIs from 1,152 breast cancer patients. The challenge objective was to align WSIs of tissue that was stained with routine diagnostic immunohistochemistry to its H&E-stained counterpart. We compare the performance of eight WSI registration algorithms, including an investigation of the impact of different WSI properties and clinical covariates. We find that conceptually distinct WSI registration methods can lead to highly accurate registration performances and identify covariates that impact performances across methods. These results provide a comparison of the performance of current WSI registration methods and guide researchers in selecting and developing methods.

Asunto(s)

Algoritmos , Neoplasias de la Mama , Humanos , Neoplasias de la Mama/diagnóstico por imagen , Neoplasias de la Mama/patología , Femenino , Interpretación de Imagen Asistida por Computador/métodos , Inmunohistoquímica

7.

A systematic comparison of deep learning methods for Gleason grading and scoring.

Dominguez-Morales, Juan P; Duran-Lopez, Lourdes; Marini, Niccolò; Vicente-Diaz, Saturnino; Linares-Barranco, Alejandro; Atzori, Manfredo; Müller, Henning.

Med Image Anal ; 95: 103191, 2024 Jul.

Artículo en Inglés | MEDLINE | ID: mdl-38728903

RESUMEN

Prostate cancer is the second most frequent cancer in men worldwide after lung cancer. Its diagnosis is based on the identification of the Gleason score that evaluates the abnormality of cells in glands through the analysis of the different Gleason patterns within tissue samples. The recent advancements in computational pathology, a domain aiming at developing algorithms to automatically analyze digitized histopathology images, lead to a large variety and availability of datasets and algorithms for Gleason grading and scoring. However, there is no clear consensus on which methods are best suited for each problem in relation to the characteristics of data and labels. This paper provides a systematic comparison on nine datasets with state-of-the-art training approaches for deep neural networks (including fully-supervised learning, weakly-supervised learning, semi-supervised learning, Additive-MIL, Attention-Based MIL, Dual-Stream MIL, TransMIL and CLAM) applied to Gleason grading and scoring tasks. The nine datasets are collected from pathology institutes and openly accessible repositories. The results show that the best methods for Gleason grading and Gleason scoring tasks are fully supervised learning and CLAM, respectively, guiding researchers to the best practice to adopt depending on the task to solve and the labels that are available.

Asunto(s)

Aprendizaje Profundo , Clasificación del Tumor , Neoplasias de la Próstata , Humanos , Neoplasias de la Próstata/patología , Neoplasias de la Próstata/diagnóstico por imagen , Masculino , Algoritmos , Interpretación de Imagen Asistida por Computador/métodos

8.

RegWSI: Whole slide image registration using combined deep feature- and intensity-based methods: Winner of the ACROBAT 2023 challenge.

Wodzinski, Marek; Marini, Niccolò; Atzori, Manfredo; Müller, Henning.

Comput Methods Programs Biomed ; 250: 108187, 2024 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-38657383

RESUMEN

BACKGROUND AND OBJECTIVE: The automatic registration of differently stained whole slide images (WSIs) is crucial for improving diagnosis and prognosis by fusing complementary information emerging from different visible structures. It is also useful to quickly transfer annotations between consecutive or restained slides, thus significantly reducing the annotation time and associated costs. Nevertheless, the slide preparation is different for each stain and the tissue undergoes complex and large deformations. Therefore, a robust, efficient, and accurate registration method is highly desired by the scientific community and hospitals specializing in digital pathology. METHODS: We propose a two-step hybrid method consisting of (i) deep learning- and feature-based initial alignment algorithm, and (ii) intensity-based nonrigid registration using the instance optimization. The proposed method does not require any fine-tuning to a particular dataset and can be used directly for any desired tissue type and stain. The registration time is low, allowing one to perform efficient registration even for large datasets. The method was proposed for the ACROBAT 2023 challenge organized during the MICCAI 2023 conference and scored 1st place. The method is released as open-source software. RESULTS: The proposed method is evaluated using three open datasets: (i) Automatic Nonrigid Histological Image Registration Dataset (ANHIR), (ii) Automatic Registration of Breast Cancer Tissue Dataset (ACROBAT), and (iii) Hybrid Restained and Consecutive Histological Serial Sections Dataset (HyReCo). The target registration error (TRE) is used as the evaluation metric. We compare the proposed algorithm to other state-of-the-art solutions, showing considerable improvement. Additionally, we perform several ablation studies concerning the resolution used for registration and the initial alignment robustness and stability. The method achieves the most accurate results for the ACROBAT dataset, the cell-level registration accuracy for the restained slides from the HyReCo dataset, and is among the best methods evaluated on the ANHIR dataset. CONCLUSIONS: The article presents an automatic and robust registration method that outperforms other state-of-the-art solutions. The method does not require any fine-tuning to a particular dataset and can be used out-of-the-box for numerous types of microscopic images. The method is incorporated into the DeeperHistReg framework, allowing others to directly use it to register, transform, and save the WSIs at any desired pyramid level (resolution up to 220k x 220k). We provide free access to the software. The results are fully and easily reproducible. The proposed method is a significant contribution to improving the WSI registration quality, thus advancing the field of digital pathology.

Asunto(s)

Algoritmos , Aprendizaje Profundo , Procesamiento de Imagen Asistido por Computador , Humanos , Procesamiento de Imagen Asistido por Computador/métodos , Programas Informáticos , Interpretación de Imagen Asistida por Computador/métodos , Neoplasias de la Mama/diagnóstico por imagen , Neoplasias de la Mama/patología , Femenino , Coloración y Etiquetado

9.

Artifact Augmentation for Learning-based Quality Control of Whole Slide Images.

Jurgas, Artur; Wodzinski, Marek; Celniak, Weronika; Atzori, Manfredo; Muller, Henning.

Annu Int Conf IEEE Eng Med Biol Soc ; 2023: 1-4, 2023 07.

Artículo en Inglés | MEDLINE | ID: mdl-38082977

RESUMEN

The acquisition of whole slide images is prone to artifacts that can require human control and re-scanning, both in clinical workflows and in research-oriented settings. Quality control algorithms are a first step to overcome this challenge, as they limit the use of low quality images. Developing quality control systems in histopathology is not straightforward, also due to the limited availability of data related to this topic. We address the problem by proposing a tool to augment data with artifacts. The proposed method seamlessly generates and blends artifacts from an external library to a given histopathology dataset. The datasets augmented by the blended artifacts are then used to train an artifact detection network in a supervised way. We use the YOLOv5 model for the artifact detection with a slightly modified training pipeline. The proposed tool can be extended into a complete framework for the quality assessment of whole slide images.Clinical relevance- The proposed method may be useful for the initial quality screening of whole slide images. Each year, millions of whole slide images are acquired and digitized worldwide. Numerous of them contain artifacts affecting the following AI-oriented analysis. Therefore, a tool operating at the acquisition phase and improving the initial quality assessment is crucial to increase the performance of digital pathology algorithms, e.g., early cancer diagnosis.

Asunto(s)

Artefactos , Neoplasias , Humanos , Procesamiento de Imagen Asistido por Computador/métodos , Algoritmos

10.

Improving the classification of veterinary thoracic radiographs through inter-species and inter-pathology self-supervised pre-training of deep learning models.

Celniak, Weronika; Wodzinski, Marek; Jurgas, Artur; Burti, Silvia; Zotti, Alessandro; Atzori, Manfredo; Müller, Henning; Banzato, Tommaso.

Sci Rep ; 13(1): 19518, 2023 11 09.

Artículo en Inglés | MEDLINE | ID: mdl-37945653

RESUMEN

The analysis of veterinary radiographic imaging data is an essential step in the diagnosis of many thoracic lesions. Given the limited time that physicians can devote to a single patient, it would be valuable to implement an automated system to help clinicians make faster but still accurate diagnoses. Currently, most of such systems are based on supervised deep learning approaches. However, the problem with these solutions is that they need a large database of labeled data. Access to such data is often limited, as it requires a great investment of both time and money. Therefore, in this work we present a solution that allows higher classification scores to be obtained using knowledge transfer from inter-species and inter-pathology self-supervised learning methods. Before training the network for classification, pretraining of the model was performed using self-supervised learning approaches on publicly available unlabeled radiographic data of human and dog images, which allowed substantially increasing the number of images for this phase. The self-supervised learning approaches included the Beta Variational Autoencoder, the Soft-Introspective Variational Autoencoder, and a Simple Framework for Contrastive Learning of Visual Representations. After the initial pretraining, fine-tuning was performed for the collected veterinary dataset using 20% of the available data. Next, a latent space exploration was performed for each model after which the encoding part of the model was fine-tuned again, this time in a supervised manner for classification. Simple Framework for Contrastive Learning of Visual Representations proved to be the most beneficial pretraining method. Therefore, it was for this method that experiments with various fine-tuning methods were carried out. We achieved a mean ROC AUC score of 0.77 and 0.66, respectively, for the laterolateral and dorsoventral projection datasets. The results show significant improvement compared to using the model without any pretraining approach.

Asunto(s)

Aprendizaje Profundo , Humanos , Animales , Perros , Radiografía , Bases de Datos Factuales , Inversiones en Salud , Conocimiento , Aprendizaje Automático Supervisado

11.

Modelling digital health data: The ExaMode ontology for computational pathology.

Menotti, Laura; Silvello, Gianmaria; Atzori, Manfredo; Boytcheva, Svetla; Ciompi, Francesco; Di Nunzio, Giorgio Maria; Fraggetta, Filippo; Giachelle, Fabio; Irrera, Ornella; Marchesin, Stefano; Marini, Niccolò; Müller, Henning; Primov, Todor.

J Pathol Inform ; 14: 100332, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-37705689

RESUMEN

Computational pathology can significantly benefit from ontologies to standardize the employed nomenclature and help with knowledge extraction processes for high-quality annotated image datasets. The end goal is to reach a shared model for digital pathology to overcome data variability and integration problems. Indeed, data annotation in such a specific domain is still an unsolved challenge and datasets cannot be steadily reused in diverse contexts due to heterogeneity issues of the adopted labels, multilingualism, and different clinical practices. Material and methods: This paper presents the ExaMode ontology, modeling the histopathology process by considering 3 key cancer diseases (colon, cervical, and lung tumors) and celiac disease. The ExaMode ontology has been designed bottom-up in an iterative fashion with continuous feedback and validation from pathologists and clinicians. The ontology is organized into 5 semantic areas that defines an ontological template to model any disease of interest in histopathology. Results: The ExaMode ontology is currently being used as a common semantic layer in: (i) an entity linking tool for the automatic annotation of medical records; (ii) a web-based collaborative annotation tool for histopathology text reports; and (iii) a software platform for building holistic solutions integrating multimodal histopathology data. Discussion: The ontology ExaMode is a key means to store data in a graph database according to the RDF data model. The creation of an RDF dataset can help develop more accurate algorithms for image analysis, especially in the field of digital pathology. This approach allows for seamless data integration and a unified query access point, from which we can extract relevant clinical insights about the considered diseases using SPARQL queries.

12.

Data-driven color augmentation for H&E stained images in computational pathology.

Marini, Niccolò; Otalora, Sebastian; Wodzinski, Marek; Tomassini, Selene; Dragoni, Aldo Franco; Marchand-Maillet, Stephane; Morales, Juan Pedro Dominguez; Duran-Lopez, Lourdes; Vatrano, Simona; Müller, Henning; Atzori, Manfredo.

J Pathol Inform ; 14: 100183, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-36687531

RESUMEN

Computational pathology targets the automatic analysis of Whole Slide Images (WSI). WSIs are high-resolution digitized histopathology images, stained with chemical reagents to highlight specific tissue structures and scanned via whole slide scanners. The application of different parameters during WSI acquisition may lead to stain color heterogeneity, especially considering samples collected from several medical centers. Dealing with stain color heterogeneity often limits the robustness of methods developed to analyze WSIs, in particular Convolutional Neural Networks (CNN), the state-of-the-art algorithm for most computational pathology tasks. Stain color heterogeneity is still an unsolved problem, although several methods have been developed to alleviate it, such as Hue-Saturation-Contrast (HSC) color augmentation and stain augmentation methods. The goal of this paper is to present Data-Driven Color Augmentation (DDCA), a method to improve the efficiency of color augmentation methods by increasing the reliability of the samples used for training computational pathology models. During CNN training, a database including over 2 million H&E color variations collected from private and public datasets is used as a reference to discard augmented data with color distributions that do not correspond to realistic data. DDCA is applied to HSC color augmentation, stain augmentation and H&E-adversarial networks in colon and prostate cancer classification tasks. DDCA is then compared with 11 state-of-the-art baseline methods to handle color heterogeneity, showing that it can substantially improve classification performance on unseen data including heterogeneous color variations.

13.

Spatial and Temporal Muscle Synergies Provide a Dual Characterization of Low-dimensional and Intermittent Control of Upper-limb Movements.

Brambilla, Cristina; Atzori, Manfredo; Müller, Henning; d'Avella, Andrea; Scano, Alessandro.

Neuroscience ; 514: 100-122, 2023 03 15.

Artículo en Inglés | MEDLINE | ID: mdl-36708799

RESUMEN

Muscle synergy analysis investigates the neurophysiological mechanisms that the central nervous system employs to coordinate muscles. Several models have been developed to decompose electromyographic (EMG) signals into spatial and temporal synergies. However, using multiple approaches can complicate the interpretation of results. Spatial synergies represent invariant muscle weights modulated with variant temporal coefficients; temporal synergies are invariant temporal profiles that coordinate variant muscle weights. While non-negative matrix factorization allows to extract both spatial and temporal synergies, the comparison between the two approaches was rarely investigated targeting a large set of multi-joint upper-limb movements. Spatial and temporal synergies were extracted from two datasets with proximal (16 subjects, 10M, 6F) and distal upper-limb movements (30 subjects, 21M, 9F), focusing on their differences in reconstruction accuracy and inter-individual variability. We showed the existence of both spatial and temporal structure in the EMG data, comparing synergies with those from a surrogate dataset in which the phases were shuffled preserving the frequency content of the original data. The two models provide a compact characterization of motor coordination at the spatial or temporal level, respectively. However, a lower number of temporal synergies are needed to achieve the same reconstruction R2: spatial and temporal synergies may capture different hierarchical levels of motor control and are dual approaches to the characterization of low-dimensional coordination of the upper-limb. Last, a detailed characterization of the structure of the temporal synergies suggested that they can be related to intermittent control of the movement, allowing high flexibility and dexterity. These results improve neurophysiology understanding in several fields such as motor control, rehabilitation, and prosthetics.

Asunto(s)

Músculo Esquelético , Músculo Temporal , Humanos , Músculo Esquelético/fisiología , Electromiografía , Movimiento/fisiología , Extremidad Superior/fisiología

14.

Semantic wikis as flexible database interfaces for biomedical applications.

Falda, Marco; Atzori, Manfredo; Corbetta, Maurizio.

Sci Rep ; 13(1): 1095, 2023 01 19.

Artículo en Inglés | MEDLINE | ID: mdl-36658254

RESUMEN

Several challenges prevent extracting knowledge from biomedical resources, including data heterogeneity and the difficulty to obtain and collaborate on data and annotations by medical doctors. Therefore, flexibility in their representation and interconnection is required; it is also essential to be able to interact easily with such data. In recent years, semantic tools have been developed: semantic wikis are collections of wiki pages that can be annotated with properties and so combine flexibility and expressiveness, two desirable aspects when modeling databases, especially in the dynamic biomedical domain. However, semantics and collaborative analysis of biomedical data is still an unsolved challenge. The aim of this work is to create a tool for easing the design and the setup of semantic databases and to give the possibility to enrich them with biostatistical applications. As a side effect, this will also make them reproducible, fostering their application by other research groups. A command-line software has been developed for creating all structures required by Semantic MediaWiki. Besides, a way to expose statistical analyses as R Shiny applications in the interface is provided, along with a facility to export Prolog predicates for reasoning with external tools. The developed software allowed to create a set of biomedical databases for the Neuroscience Department of the University of Padova in a more automated way. They can be extended with additional qualitative and statistical analyses of data, including for instance regressions, geographical distribution of diseases, and clustering. The software is released as open source-code and published under the GPL-3 license at https://github.com/mfalda/tsv2swm .

Asunto(s)

Semántica , Programas Informáticos , Bases de Datos Factuales

15.

Empowering digital pathology applications through explainable knowledge extraction tools.

Marchesin, Stefano; Giachelle, Fabio; Marini, Niccolò; Atzori, Manfredo; Boytcheva, Svetla; Buttafuoco, Genziana; Ciompi, Francesco; Di Nunzio, Giorgio Maria; Fraggetta, Filippo; Irrera, Ornella; Müller, Henning; Primov, Todor; Vatrano, Simona; Silvello, Gianmaria.

J Pathol Inform ; 13: 100139, 2022.

Artículo en Inglés | MEDLINE | ID: mdl-36268087

RESUMEN

Exa-scale volumes of medical data have been produced for decades. In most cases, the diagnosis is reported in free text, encoding medical knowledge that is still largely unexploited. In order to allow decoding medical knowledge included in reports, we propose an unsupervised knowledge extraction system combining a rule-based expert system with pre-trained Machine Learning (ML) models, namely the Semantic Knowledge Extractor Tool (SKET). Combining rule-based techniques and pre-trained ML models provides high accuracy results for knowledge extraction. This work demonstrates the viability of unsupervised Natural Language Processing (NLP) techniques to extract critical information from cancer reports, opening opportunities such as data mining for knowledge extraction purposes, precision medicine applications, structured report creation, and multimodal learning. SKET is a practical and unsupervised approach to extracting knowledge from pathology reports, which opens up unprecedented opportunities to exploit textual and multimodal medical information in clinical practice. We also propose SKET eXplained (SKET X), a web-based system providing visual explanations about the algorithmic decisions taken by SKET. SKET X is designed/developed to support pathologists and domain experts in understanding SKET predictions, possibly driving further improvements to the system.

16.

Unleashing the potential of digital pathology data by training computer-aided diagnosis models without human annotations.

Marini, Niccolò; Marchesin, Stefano; Otálora, Sebastian; Wodzinski, Marek; Caputo, Alessandro; van Rijthoven, Mart; Aswolinskiy, Witali; Bokhorst, John-Melle; Podareanu, Damian; Petters, Edyta; Boytcheva, Svetla; Buttafuoco, Genziana; Vatrano, Simona; Fraggetta, Filippo; van der Laak, Jeroen; Agosti, Maristella; Ciompi, Francesco; Silvello, Gianmaria; Muller, Henning; Atzori, Manfredo.

NPJ Digit Med ; 5(1): 102, 2022 Jul 22.

Artículo en Inglés | MEDLINE | ID: mdl-35869179

RESUMEN

The digitalization of clinical workflows and the increasing performance of deep learning algorithms are paving the way towards new methods for tackling cancer diagnosis. However, the availability of medical specialists to annotate digitized images and free-text diagnostic reports does not scale with the need for large datasets required to train robust computer-aided diagnosis methods that can target the high variability of clinical cases and data produced. This work proposes and evaluates an approach to eliminate the need for manual annotations to train computer-aided diagnosis tools in digital pathology. The approach includes two components, to automatically extract semantically meaningful concepts from diagnostic reports and use them as weak labels to train convolutional neural networks (CNNs) for histopathology diagnosis. The approach is trained (through 10-fold cross-validation) on 3'769 clinical images and reports, provided by two hospitals and tested on over 11'000 images from private and publicly available datasets. The CNN, trained with automatically generated labels, is compared with the same architecture trained with manual labels. Results show that combining text analysis and end-to-end deep neural networks allows building computer-aided diagnosis tools that reach solid performance (micro-accuracy = 0.908 at image-level) based only on existing clinical data without the need for manual annotations.

17.

Evaluation of Methods for the Extraction of Spatial Muscle Synergies.

Zhao, Kunkun; Wen, Haiying; Zhang, Zhisheng; Atzori, Manfredo; Müller, Henning; Xie, Zhongqu; Scano, Alessandro.

Front Neurosci ; 16: 732156, 2022.

Artículo en Inglés | MEDLINE | ID: mdl-35720729

RESUMEN

Muscle synergies have been largely used in many application fields, including motor control studies, prosthesis control, movement classification, rehabilitation, and clinical studies. Due to the complexity of the motor control system, the full repertoire of the underlying synergies has been identified only for some classes of movements and scenarios. Several extraction methods have been used to extract muscle synergies. However, some of these methods may not effectively capture the nonlinear relationship between muscles and impose constraints on input signals or extracted synergies. Moreover, other approaches such as autoencoders (AEs), an unsupervised neural network, were recently introduced to study bioinspired control and movement classification. In this study, we evaluated the performance of five methods for the extraction of spatial muscle synergy, namely, principal component analysis (PCA), independent component analysis (ICA), factor analysis (FA), nonnegative matrix factorization (NMF), and AEs using simulated data and a publicly available database. To analyze the performance of the considered extraction methods with respect to several factors, we generated a comprehensive set of simulated data (ground truth), including spatial synergies and temporal coefficients. The signal-to-noise ratio (SNR) and the number of channels (NoC) varied when generating simulated data to evaluate their effects on ground truth reconstruction. This study also tested the efficacy of each synergy extraction method when coupled with standard classification methods, including K-nearest neighbors (KNN), linear discriminant analysis (LDA), support vector machines (SVM), and Random Forest (RF). The results showed that both SNR and NoC affected the outputs of the muscle synergy analysis. Although AEs showed better performance than FA in variance accounted for and PCA in synergy vector similarity and activation coefficient similarity, NMF and ICA outperformed the other three methods. Classification tasks showed that classification algorithms were sensitive to synergy extraction methods, while KNN and RF outperformed the other two methods for all extraction methods; in general, the classification accuracy of NMF and PCA was higher. Overall, the results suggest selecting suitable methods when performing muscle synergy-related analysis.

18.

Questioning Domain Adaptation in Myoelectric Hand Prostheses Control: An Inter- and Intra-Subject Study.

Marano, Giulio; Brambilla, Cristina; Mira, Robert Mihai; Scano, Alessandro; Müller, Henning; Atzori, Manfredo.

Sensors (Basel) ; 21(22)2021 Nov 11.

Artículo en Inglés | MEDLINE | ID: mdl-34833573

RESUMEN

One major challenge limiting the use of dexterous robotic hand prostheses controlled via electromyography and pattern recognition relates to the important efforts required to train complex models from scratch. To overcome this problem, several studies in recent years proposed to use transfer learning, combining pre-trained models (obtained from prior subjects) with training sessions performed on a specific user. Although a few promising results were reported in the past, it was recently shown that the use of conventional transfer learning algorithms does not increase performance if proper hyperparameter optimization is performed on the standard approach that does not exploit transfer learning. The objective of this paper is to introduce novel analyses on this topic by using a random forest classifier without hyperparameter optimization and to extend them with experiments performed on data recorded from the same patient, but in different data acquisition sessions. Two domain adaptation techniques were tested on the random forest classifier, allowing us to conduct experiments on healthy subjects and amputees. Differently from several previous papers, our results show that there are no appreciable improvements in terms of accuracy, regardless of the transfer learning techniques tested. The lack of adaptive learning is also demonstrated for the first time in an intra-subject experimental setting when using as a source ten data acquisitions recorded from the same subject but on five different days.

Asunto(s)

Amputados , Miembros Artificiales , Algoritmos , Electromiografía , Mano , Humanos , Reconocimiento de Normas Patrones Automatizadas

19.

Semi-supervised training of deep convolutional neural networks with heterogeneous data and few local annotations: An experiment on prostate histopathology image classification.

Marini, Niccolò; Otálora, Sebastian; Müller, Henning; Atzori, Manfredo.

Med Image Anal ; 73: 102165, 2021 10.

Artículo en Inglés | MEDLINE | ID: mdl-34303169

RESUMEN

Convolutional neural networks (CNNs) are state-of-the-art computer vision techniques for various tasks, particularly for image classification. However, there are domains where the training of classification models that generalize on several datasets is still an open challenge because of the highly heterogeneous data and the lack of large datasets with local annotations of the regions of interest, such as histopathology image analysis. Histopathology concerns the microscopic analysis of tissue specimens processed in glass slides to identify diseases such as cancer. Digital pathology concerns the acquisition, management and automatic analysis of digitized histopathology images that are large, having in the order of 100'0002 pixels per image. Digital histopathology images are highly heterogeneous due to the variability of the image acquisition procedures. Creating locally labeled regions (required for the training) is time-consuming and often expensive in the medical field, as physicians usually have to annotate the data. Despite the advances in deep learning, leveraging strongly and weakly annotated datasets to train classification models is still an unsolved problem, mainly when data are very heterogeneous. Large amounts of data are needed to create models that generalize well. This paper presents a novel approach to train CNNs that generalize to heterogeneous datasets originating from various sources and without local annotations. The data analysis pipeline targets Gleason grading on prostate images and includes two models in sequence, following a teacher/student training paradigm. The teacher model (a high-capacity neural network) automatically annotates a set of pseudo-labeled patches used to train the student model (a smaller network). The two models are trained with two different teacher/student approaches: semi-supervised learning and semi-weekly supervised learning. For each of the two approaches, three student training variants are presented. The baseline is provided by training the student model only with the strongly annotated data. Classification performance is evaluated on the student model at the patch level (using the local annotations of the Tissue Micro-Arrays Zurich dataset) and at the global level (using the TCGA-PRAD, The Cancer Genome Atlas-PRostate ADenocarcinoma, whole slide image Gleason score). The teacher/student paradigm allows the models to better generalize on both datasets, despite the inter-dataset heterogeneity and the small number of local annotations used. The classification performance is improved both at the patch-level (up to κ=0.6127±0.0133 from κ=0.5667±0.0285), at the TMA core-level (Gleason score) (up to κ=0.7645±0.0231 from κ=0.7186±0.0306) and at the WSI-level (Gleason score) (up to κ=0.4529±0.0512 from κ=0.2293±0.1350). The results show that with the teacher/student paradigm, it is possible to train models that generalize on datasets from entirely different sources, despite the inter-dataset heterogeneity and the lack of large datasets with local annotations.

Asunto(s)

Redes Neurales de la Computación , Neoplasias de la Próstata , Humanos , Masculino , Clasificación del Tumor , Neoplasias de la Próstata/diagnóstico por imagen , Aprendizaje Automático Supervisado

20.

Combining weakly and strongly supervised learning improves strong supervision in Gleason pattern classification.

Otálora, Sebastian; Marini, Niccolò; Müller, Henning; Atzori, Manfredo.

BMC Med Imaging ; 21(1): 77, 2021 05 08.

Artículo en Inglés | MEDLINE | ID: mdl-33964886

RESUMEN

BACKGROUND: One challenge to train deep convolutional neural network (CNNs) models with whole slide images (WSIs) is providing the required large number of costly, manually annotated image regions. Strategies to alleviate the scarcity of annotated data include: using transfer learning, data augmentation and training the models with less expensive image-level annotations (weakly-supervised learning). However, it is not clear how to combine the use of transfer learning in a CNN model when different data sources are available for training or how to leverage from the combination of large amounts of weakly annotated images with a set of local region annotations. This paper aims to evaluate CNN training strategies based on transfer learning to leverage the combination of weak and strong annotations in heterogeneous data sources. The trade-off between classification performance and annotation effort is explored by evaluating a CNN that learns from strong labels (region annotations) and is later fine-tuned on a dataset with less expensive weak (image-level) labels. RESULTS: As expected, the model performance on strongly annotated data steadily increases as the percentage of strong annotations that are used increases, reaching a performance comparable to pathologists ([Formula: see text]). Nevertheless, the performance sharply decreases when applied for the WSI classification scenario with [Formula: see text]. Moreover, it only provides a lower performance regardless of the number of annotations used. The model performance increases when fine-tuning the model for the task of Gleason scoring with the weak WSI labels [Formula: see text]. CONCLUSION: Combining weak and strong supervision improves strong supervision in classification of Gleason patterns using tissue microarrays (TMA) and WSI regions. Our results contribute very good strategies for training CNN models combining few annotated data and heterogeneous data sources. The performance increases in the controlled TMA scenario with the number of annotations used to train the model. Nevertheless, the performance is hindered when the trained TMA model is applied directly to the more challenging WSI classification problem. This demonstrates that a good pre-trained model for prostate cancer TMA image classification may lead to the best downstream model if fine-tuned on the WSI target dataset. We have made available the source code repository for reproducing the experiments in the paper: https://github.com/ilmaro8/Digital_Pathology_Transfer_Learning.

Asunto(s)

Clasificación del Tumor/métodos , Redes Neurales de la Computación , Neoplasias de la Próstata/patología , Aprendizaje Automático Supervisado , Conjuntos de Datos como Asunto , Diagnóstico por Computador/métodos , Humanos , Masculino , Clasificación del Tumor/clasificación , Próstata/patología , Prostatectomía/métodos , Neoplasias de la Próstata/cirugía , Análisis de Matrices Tisulares

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA