RESUMEN
Plane wave imaging persists as a focal point of research due to its high frame rate and low complexity. However, in spite of these advantages, its performance can be compromised by several factors such as noise, speckle, and artifacts that affect the image quality and resolution. In this paper, we propose an attention-based complex convolutional residual U-Net to reconstruct improved in-phase/quadrature complex data from a single insonification acquisition that matches diverging wave imaging. Our approach introduces an attention mechanism to the complex domain in conjunction with complex convolution to incorporate phase information and improve the image quality matching images obtained using coherent compounding imaging. To validate the effectiveness of this method, we trained our network on a simulated phased array dataset and evaluated it using in vitro and in vivo data. The experimental results show that our approach improved the ultrasound image quality by focusing the network's attention on critical aspects of the complex data to identify and separate different regions of interest from background noise.
Asunto(s)
Procesamiento de Imagen Asistido por Computador , Ultrasonografía , Ultrasonografía/métodos , Procesamiento de Imagen Asistido por Computador/métodos , Humanos , Algoritmos , Redes Neurales de la Computación , Fantasmas de ImagenRESUMEN
Fully automated approaches based on convolutional neural networks have shown promising performances on muscle segmentation from magnetic resonance (MR) images, but still rely on an extensive amount of training data to achieve valuable results. Muscle segmentation for pediatric and rare diseases cohorts is therefore still often done manually. Producing dense delineations over 3D volumes remains a time-consuming and tedious task, with significant redundancy between successive slices. In this work, we propose a segmentation method relying on registration-based label propagation, which provides 3D muscle delineations from a limited number of annotated 2D slices. Based on an unsupervised deep registration scheme, our approach ensures the preservation of anatomical structures by penalizing deformation compositions that do not produce consistent segmentation from one annotated slice to another. Evaluation is performed on MR data from lower leg and shoulder joints. Results demonstrate that the proposed few-shot multi-label segmentation model outperforms state-of-the-art techniques.
RESUMEN
SIGNIFICANCE: Screening for ocular anomalies using fundus photography is key to prevent vision impairment and blindness. With the growing and aging population, automated algorithms that can triage fundus photographs and provide instant referral decisions are relevant to scale-up screening and face the shortage of ophthalmic expertise. PURPOSE: This study aimed to develop a deep learning algorithm that detects any ocular anomaly in fundus photographs and to evaluate this algorithm for "normal versus anomalous" eye examination classification in the diabetic and general populations. METHODS: The deep learning algorithm was developed and evaluated in two populations: the diabetic and general populations. Our patient cohorts consist of 37,129 diabetic patients from the OPHDIAT diabetic retinopathy screening network in Paris, France, and 7356 general patients from the OphtaMaine private screening network, in Le Mans, France. Each data set was divided into a development subset and a test subset of more than 4000 examinations each. For ophthalmologist/algorithm comparison, a subset of 2014 examinations from the OphtaMaine test subset was labeled by a second ophthalmologist. First, the algorithm was trained on the OPHDIAT development subset. Then, it was fine-tuned on the OphtaMaine development subset. RESULTS: On the OPHDIAT test subset, the area under the receiver operating characteristic curve for normal versus anomalous classification was 0.9592. On the OphtaMaine test subset, the area under the receiver operating characteristic curve was 0.8347 before fine-tuning and 0.9108 after fine-tuning. On the ophthalmologist/algorithm comparison subset, the second ophthalmologist achieved a specificity of 0.8648 and a sensitivity of 0.6682. For the same specificity, the fine-tuned algorithm achieved a sensitivity of 0.8248. CONCLUSIONS: The proposed algorithm compares favorably with human performance for normal versus anomalous eye examination classification using fundus photography. Artificial intelligence, which previously targeted a few retinal pathologies, can be used to screen for ocular anomalies comprehensively.
Asunto(s)
Diabetes Mellitus , Retinopatía Diabética , Oftalmopatías , Anciano , Algoritmos , Inteligencia Artificial , Retinopatía Diabética/diagnóstico , Técnicas de Diagnóstico Oftalmológico , Fondo de Ojo , Humanos , Masculino , Tamizaje Masivo , Fotograbar , Sensibilidad y EspecificidadRESUMEN
In this paper, we propose a new collaborative process that aims to detect macrocalcifications from mammographic images while minimizing false negative detections. This process is made up of three main phases: suspicious area detection, candidate object identification, and collaborative classification. The main concept is to operate on the entire image divided into homogenous regions called superpixels which are used to identify both suspicious areas and candidate objects. The collaborative classification phase consists in making the initial results of different microcalcification detectors collaborate in order to produce a new common decision and reduce their initial disagreements. The detectors share the information about their detected objects and associated labels in order to refine their initial decisions based on those of the other collaborators. This refinement consists of iteratively updating the candidate object labels of each detector following local and contextual analyses based on prior knowledge about the links between super pixels and macrocalcifications. This process iteratively reduces the disagreement between different detectors and estimates local reliability terms for each super pixel. The final result is obtained by a conjunctive combination of the new detector decisions reached by the collaborative process. The proposed approach is evaluated on the publicly available INBreast dataset. Experimental results show the benefits gained in terms of improving microcalcification detection performances compared to existing detectors as well as ordinary fusion operators.
Asunto(s)
Enfermedades de la Mama , Calcinosis , Humanos , Reproducibilidad de los Resultados , Enfermedades de la Mama/diagnóstico por imagen , Calcinosis/diagnóstico por imagen , Mamografía/métodosRESUMEN
Autosomal-dominant polycystic kidney disease is a prevalent genetic disorder characterized by the development of renal cysts, leading to kidney enlargement and renal failure. Accurate measurement of total kidney volume through polycystic kidney segmentation is crucial to assess disease severity, predict progression and evaluate treatment effects. Traditional manual segmentation suffers from intra- and inter-expert variability, prompting the exploration of automated approaches. In recent years, convolutional neural networks have been employed for polycystic kidney segmentation from magnetic resonance images. However, the use of Transformer-based models, which have shown remarkable performance in a wide range of computer vision and medical image analysis tasks, remains unexplored in this area. With their self-attention mechanism, Transformers excel in capturing global context information, which is crucial for accurate organ delineations. In this paper, we evaluate and compare various convolutional-based, Transformers-based, and hybrid convolutional/Transformers-based networks for polycystic kidney segmentation. Additionally, we propose a dual-task learning scheme, where a common feature extractor is followed by per-kidney decoders, towards better generalizability and efficiency. We extensively evaluate various architectures and learning schemes on a heterogeneous magnetic resonance imaging dataset collected from 112 patients with polycystic kidney disease. Our results highlight the effectiveness of Transformer-based models for polycystic kidney segmentation and the relevancy of exploiting dual-task learning to improve segmentation accuracy and mitigate data scarcity issues. A promising ability in accurately delineating polycystic kidneys is especially shown in the presence of heterogeneous cyst distributions and adjacent cyst-containing organs. This work contribute to the advancement of reliable delineation methods in nephrology, paving the way for a broad spectrum of clinical applications.
Asunto(s)
Quistes , Enfermedades Renales Poliquísticas , Riñón Poliquístico Autosómico Dominante , Humanos , Riñón/diagnóstico por imagen , Riñón Poliquístico Autosómico Dominante/diagnóstico por imagen , Riñón Poliquístico Autosómico Dominante/patología , Enfermedades Renales Poliquísticas/patología , Imagen por Resonancia Magnética/métodos , Quistes/patologíaRESUMEN
The extraction of abdominal structures using deep learning has recently experienced a widespread interest in medical image analysis. Automatic abdominal organ and vessel segmentation is highly desirable to guide clinicians in computer-assisted diagnosis, therapy, or surgical planning. Despite a good ability to extract large organs, the capacity of U-Net inspired architectures to automatically delineate smaller structures remains a major issue, especially given the increase in receptive field size as we go deeper into the network. To deal with various abdominal structure sizes while exploiting efficient geometric constraints, we present a novel approach that integrates into deep segmentation shape priors from a semi-overcomplete convolutional auto-encoder (S-OCAE) embedding. Compared to standard convolutional auto-encoders (CAE), it exploits an over-complete branch that projects data onto higher dimensions to better characterize anatomical structures with a small spatial extent. Experiments on abdominal organs and vessel delineation performed on various publicly available datasets highlight the effectiveness of our method compared to state-of-the-art, including U-Net trained without and with shape priors from a traditional CAE. Exploiting a semi-overcomplete convolutional auto-encoder embedding as shape priors improves the ability of deep segmentation models to provide realistic and accurate abdominal structure contours.
Asunto(s)
Redes Neurales de la Computación , Tomografía Computarizada por Rayos X , Tomografía Computarizada por Rayos X/métodos , Abdomen/diagnóstico por imagen , Diagnóstico por ComputadorRESUMEN
Programmed death-ligand 1 (PD-L1) expressions play a crucial role in guiding therapeutic interventions such as the use of tyrosine kinase inhibitors (TKIs) and immune checkpoint inhibitors (ICIs) in lung cancer. Conventional determination of PD-L1 status includes careful surgical or biopsied tumor specimens. These specimens are gathered through invasive procedures, representing a risk of difficulties and potential challenges in getting reliable and representative tissue samples. Using a single center cohort of 189 patients, our objective was to evaluate various fusion methods that used non-invasive computed tomography (CT) and 18 F-FDG positron emission tomography (PET) images as inputs to various deep learning models to automatically predict PD-L1 in non-small cell lung cancer (NSCLC). We compared three different architectures (ResNet, DenseNet, and EfficientNet) and considered different input data (CT only, PET only, PET/CT early fusion, PET/CT late fusion without as well as with partially and fully shared weights to determine the best model performance. Models were assessed utilizing areas under the receiver operating characteristic curves (AUCs) considering their 95% confidence intervals (CI). The fusion of PET and CT images as input yielded better performance for PD-L1 classification. The different data fusion schemes systematically outperformed their individual counterparts when used as input of the various deep models. Furthermore, early fusion consistently outperformed late fusion, probably as a result of its capacity to capture more complicated patterns by merging PET and CT derived content at a lower level. When we looked more closely at the effects of weight sharing in late fusion architectures, we discovered that while it might boost model stability, it did not always result in better results. This suggests that although weight sharing could be beneficial when modality parameters are similar, the anatomical and metabolic information provided by CT and PET scans are too dissimilar to consistently lead to improved PD-L1 status predictions.
Asunto(s)
Antígeno B7-H1 , Carcinoma de Pulmón de Células no Pequeñas , Neoplasias Pulmonares , Tomografía Computarizada por Tomografía de Emisión de Positrones , Humanos , Antígeno B7-H1/metabolismo , Neoplasias Pulmonares/diagnóstico por imagen , Neoplasias Pulmonares/metabolismo , Neoplasias Pulmonares/patología , Tomografía Computarizada por Tomografía de Emisión de Positrones/métodos , Masculino , Femenino , Carcinoma de Pulmón de Células no Pequeñas/diagnóstico por imagen , Carcinoma de Pulmón de Células no Pequeñas/metabolismo , Carcinoma de Pulmón de Células no Pequeñas/patología , Persona de Mediana Edad , Anciano , Aprendizaje Profundo , Fluorodesoxiglucosa F18 , Adulto , Curva ROC , Anciano de 80 o más Años , Tomografía Computarizada por Rayos X/métodosRESUMEN
The domain shift, or acquisition shift in medical imaging, is responsible for potentially harmful differences between development and deployment conditions of medical image analysis techniques. There is a growing need in the community for advanced methods that could mitigate this issue better than conventional approaches. In this paper, we consider configurations in which we can expose a learning-based pixel level adaptor to a large variability of unlabeled images during its training, i.e. sufficient to span the acquisition shift expected during the training or testing of a downstream task model. We leverage the ability of convolutional architectures to efficiently learn domain-agnostic features and train a many-to-one unsupervised mapping between a source collection of heterogeneous images from multiple unknown domains subjected to the acquisition shift and a homogeneous subset of this source set of lower cardinality, potentially constituted of a single image. To this end, we propose a new cycle-free image-to-image architecture based on a combination of three loss functions : a contrastive PatchNCE loss, an adversarial loss and an edge preserving loss allowing for rich domain adaptation to the target image even under strong domain imbalance and low data regimes. Experiments support the interest of the proposed contrastive image adaptation approach for the regularization of downstream deep supervised segmentation and cross-modality synthesis models.
Asunto(s)
Diagnóstico por Imagen , Aprendizaje , Escolaridad , Procesamiento de Imagen Asistido por ComputadorRESUMEN
OBJECTIVES: Data scarcity and domain shifts lead to biased training sets that do not accurately represent deployment conditions. A related practical problem is cross-modal image segmentation, where the objective is to segment unlabelled images using previously labelled datasets from other imaging modalities. METHODS: We propose a cross-modal segmentation method based on conventional image synthesis boosted by a new data augmentation technique called Generative Blending Augmentation (GBA). GBA leverages a SinGAN model to learn representative generative features from a single training image to diversify realistically tumor appearances. This way, we compensate for image synthesis errors, subsequently improving the generalization power of a downstream segmentation model. The proposed augmentation is further combined to an iterative self-training procedure leveraging pseudo labels at each pass. RESULTS: The proposed solution ranked first for vestibular schwannoma (VS) segmentation during the validation and test phases of the MICCAI CrossMoDA 2022 challenge, with best mean Dice similarity and average symmetric surface distance measures. CONCLUSION AND SIGNIFICANCE: Local contrast alteration of tumor appearances and iterative self-training with pseudo labels are likely to lead to performance improvements in a variety of segmentation contexts.
RESUMEN
Multimodal medical imaging plays a pivotal role in clinical diagnosis and research, as it combines information from various imaging modalities to provide a more comprehensive understanding of the underlying pathology. Recently, deep learning-based multimodal fusion techniques have emerged as powerful tools for improving medical image classification. This review offers a thorough analysis of the developments in deep learning-based multimodal fusion for medical classification tasks. We explore the complementary relationships among prevalent clinical modalities and outline three main fusion schemes for multimodal classification networks: input fusion, intermediate fusion (encompassing single-level fusion, hierarchical fusion, and attention-based fusion), and output fusion. By evaluating the performance of these fusion techniques, we provide insight into the suitability of different network architectures for various multimodal fusion scenarios and application domains. Furthermore, we delve into challenges related to network architecture selection, handling incomplete multimodal data management, and the potential limitations of multimodal fusion. Finally, we spotlight the promising future of Transformer-based multimodal fusion techniques and give recommendations for future research in this rapidly evolving field.
Asunto(s)
Aprendizaje Profundo , Imagen Multimodal , Humanos , Imagen Multimodal/métodos , Interpretación de Imagen Asistida por Computador/métodos , Procesamiento de Imagen Asistido por Computador/métodosRESUMEN
Diabetic Retinopathy (DR), an ocular complication of diabetes, is a leading cause of blindness worldwide. Traditionally, DR is monitored using Color Fundus Photography (CFP), a widespread 2-D imaging modality. However, DR classifications based on CFP have poor predictive power, resulting in suboptimal DR management. Optical Coherence Tomography Angiography (OCTA) is a recent 3-D imaging modality offering enhanced structural and functional information (blood flow) with a wider field of view. This paper investigates automatic DR severity assessment using 3-D OCTA. A straightforward solution to this task is a 3-D neural network classifier. However, 3-D architectures have numerous parameters and typically require many training samples. A lighter solution consists in using 2-D neural network classifiers processing 2-D en-face (or frontal) projections and/or 2-D cross-sectional slices. Such an approach mimics the way ophthalmologists analyze OCTA acquisitions: (1) en-face flow maps are often used to detect avascular zones and neovascularization, and (2) cross-sectional slices are commonly analyzed to detect macular edemas, for instance. However, arbitrary data reduction or selection might result in information loss. Two complementary strategies are thus proposed to optimally summarize OCTA volumes with 2-D images: (1) a parametric en-face projection optimized through deep learning and (2) a cross-sectional slice selection process controlled through gradient-based attribution. The full summarization and DR classification pipeline is trained from end to end. The automatic 2-D summary can be displayed in a viewer or printed in a report to support the decision. We show that the proposed 2-D summarization and classification pipeline outperforms direct 3-D classification with the advantage of improved interpretability.
Asunto(s)
Diabetes Mellitus , Retinopatía Diabética , Humanos , Retinopatía Diabética/diagnóstico por imagen , Angiografía con Fluoresceína/métodos , Vasos Retinianos/diagnóstico por imagen , Tomografía de Coherencia Óptica/métodos , Estudios TransversalesRESUMEN
Over the last decade, convolutional neural networks have emerged and advanced the state-of-the-art in various image analysis and computer vision applications. The performance of 2D image classification networks is constantly improving and being trained on databases made of millions of natural images. Conversely, in the field of medical image analysis, the progress is also remarkable but has mainly slowed down due to the relative lack of annotated data and besides, the inherent constraints related to the acquisition process. These limitations are even more pronounced given the volumetry of medical imaging data. In this paper, we introduce an efficient way to transfer the efficiency of a 2D classification network trained on natural images to 2D, 3D uni- and multi-modal medical image segmentation applications. In this direction, we designed novel architectures based on two key principles: weight transfer by embedding a 2D pre-trained encoder into a higher dimensional U-Net, and dimensional transfer by expanding a 2D segmentation network into a higher dimension one. The proposed networks were tested on benchmarks comprising different modalities: MR, CT, and ultrasound images. Our 2D network ranked first on the CAMUS challenge dedicated to echo-cardiographic data segmentation and surpassed the state-of-the-art. Regarding 2D/3D MR and CT abdominal images from the CHAOS challenge, our approach largely outperformed the other 2D-based methods described in the challenge paper on Dice, RAVD, ASSD, and MSSD scores and ranked third on the online evaluation platform. Our 3D network applied to the BraTS 2022 competition also achieved promising results, reaching an average Dice score of 91.69% (91.22%) for the whole tumor, 83.23% (84.77%) for the tumor core and 81.75% (83.88%) for enhanced tumor using the approach based on weight (dimensional) transfer. Experimental and qualitative results illustrate the effectiveness of our methods for multi-dimensional medical image segmentation.
Asunto(s)
Aprendizaje Profundo , Imagenología Tridimensional , Humanos , Imagenología Tridimensional/métodos , Redes Neurales de la Computación , Procesamiento de Imagen Asistido por Computador/métodos , Tomografía Computarizada por Rayos X/métodosRESUMEN
Multi-modal medical image segmentation is a crucial task in oncology that enables the precise localization and quantification of tumors. The aim of this work is to present a meta-analysis of the use of multi-modal medical Transformers for medical image segmentation in oncology, specifically focusing on multi-parametric MR brain tumor segmentation (BraTS2021), and head and neck tumor segmentation using PET-CT images (HECKTOR2021). The multi-modal medical Transformer architectures presented in this work exploit the idea of modality interaction schemes based on visio-linguistic representations: (i) single-stream, where modalities are jointly processed by one Transformer encoder, and (ii) multiple-stream, where the inputs are encoded separately before being jointly modeled. A total of fourteen multi-modal architectures are evaluated using different ranking strategies based on dice similarity coefficient (DSC) and average symmetric surface distance (ASSD) metrics. In addition, cost indicators such as the number of trainable parameters and the number of multiply-accumulate operations (MACs) are reported. The results demonstrate that multi-path hybrid CNN-Transformer-based models improve segmentation accuracy when compared to traditional methods, but come at the cost of increased computation time and potentially larger model size.
Asunto(s)
Benchmarking , Tomografía Computarizada por Tomografía de Emisión de Positrones , Procesamiento de Imagen Asistido por ComputadorRESUMEN
Quantitative Gait Analysis (QGA) is considered as an objective measure of gait performance. In this study, we aim at designing an artificial intelligence that can efficiently predict the progression of gait quality using kinematic data obtained from QGA. For this purpose, a gait database collected from 734 patients with gait disorders is used. As the patient walks, kinematic data is collected during the gait session. This data is processed to generate the Gait Profile Score (GPS) for each gait cycle. Tracking potential GPS variations enables detecting changes in gait quality. In this regard, our work is driven by predicting such future variations. Two approaches were considered: signal-based and image-based. The signal-based one uses raw gait cycles, while the image-based one employs a two-dimensional Fast Fourier Transform (2D FFT) representation of gait cycles. Several architectures were developed, and the obtained Area Under the Curve (AUC) was above 0.72 for both approaches. To the best of our knowledge, our study is the first to apply neural networks for gait prediction tasks.
Asunto(s)
Inteligencia Artificial , Análisis de la Marcha , Humanos , Análisis de la Marcha/métodos , Marcha , Redes Neurales de la Computación , Análisis de Fourier , Fenómenos BiomecánicosRESUMEN
Optical coherence tomography angiography (OCTA) can deliver enhanced diagnosis for diabetic retinopathy (DR). This study evaluated a deep learning (DL) algorithm for automatic DR severity assessment using high-resolution and ultra-widefield (UWF) OCTA. Diabetic patients were examined with 6×6 mm2 high-resolution OCTA and 15×15 mm2 UWF-OCTA using PLEX®Elite 9000. A novel DL algorithm was trained for automatic DR severity inference using both OCTA acquisitions. The algorithm employed a unique hybrid fusion framework, integrating structural and flow information from both acquisitions. It was trained on data from 875 eyes of 444 patients. Tested on 53 patients (97 eyes), the algorithm achieved a good area under the receiver operating characteristic curve (AUC) for detecting DR (0.8868), moderate non-proliferative DR (0.8276), severe non-proliferative DR (0.8376), and proliferative/treated DR (0.9070). These results significantly outperformed detection with the 6×6 mm2 (AUC = 0.8462, 0.7793, 0.7889, and 0.8104, respectively) or 15×15 mm2 (AUC = 0.8251, 0.7745, 0.7967, and 0.8786, respectively) acquisitions alone. Thus, combining high-resolution and UWF-OCTA acquisitions holds the potential for improved early and late-stage DR detection, offering a foundation for enhancing DR management and a clear path for future works involving expanded datasets and integrating additional imaging modalities.
RESUMEN
Independent validation studies of automatic diabetic retinopathy screening systems have recently shown a drop of screening performance on external data. Beyond diabetic retinopathy, this study investigates the generalizability of deep learning (DL) algorithms for screening various ocular anomalies in fundus photographs, across heterogeneous populations and imaging protocols. The following datasets are considered: OPHDIAT (France, diabetic population), OphtaMaine (France, general population), RIADD (India, general population) and ODIR (China, general population). Two multi-disease DL algorithms were developed: a Single-Dataset (SD) network, trained on the largest dataset (OPHDIAT), and a Multiple-Dataset (MD) network, trained on multiple datasets simultaneously. To assess their generalizability, both algorithms were evaluated whenever training and test data originate from overlapping datasets or from disjoint datasets. The SD network achieved a mean per-disease area under the receiver operating characteristic curve (mAUC) of 0.9571 on OPHDIAT. However, it generalized poorly to the other three datasets (mAUC < 0.9). When all four datasets were involved in training, the MD network significantly outperformed the SD network (p = 0.0058), indicating improved generality. However, in leave-one-dataset-out experiments, performance of the MD network was significantly lower on populations unseen during training than on populations involved in training (p < 0.0001), indicating imperfect generalizability.
Asunto(s)
Retinopatía Diabética , Oftalmopatías , Humanos , Retinopatía Diabética/diagnóstico por imagen , Fondo de Ojo , Oftalmopatías/diagnóstico , Técnicas de Diagnóstico Oftalmológico , Curva ROC , AlgoritmosRESUMEN
Morphological and diagnostic evaluation of pediatric musculoskeletal system is crucial in clinical practice. However, most segmentation models do not perform well on scarce pediatric imaging data. We propose a new pre-trained regularized convolutional encoder-decoder network for the challenging task of segmenting heterogeneous pediatric magnetic resonance (MR) images. To this end, we have conceived a novel optimization scheme for the segmentation network which comprises additional regularization terms to the loss function. In order to obtain globally consistent predictions, we incorporate a shape priors based regularization, derived from a non-linear shape representation learnt by an auto-encoder. Additionally, an adversarial regularization computed by a discriminator is integrated to encourage precise delineations. The proposed method is evaluated for the task of multi-bone segmentation on two scarce pediatric imaging datasets from ankle and shoulder joints, comprising pathological as well as healthy examinations. The proposed method performed either better or at par with previously proposed approaches for Dice, sensitivity, specificity, maximum symmetric surface distance, average symmetric surface distance, and relative absolute volume difference metrics. We illustrate that the proposed approach can be easily integrated into various bone segmentation strategies and can improve the prediction accuracy of models pre-trained on large non-medical images databases. The obtained results bring new perspectives for the management of pediatric musculoskeletal disorders.
Asunto(s)
Procesamiento de Imagen Asistido por Computador , Redes Neurales de la Computación , Niño , Bases de Datos Factuales , Humanos , Procesamiento de Imagen Asistido por Computador/métodos , Imagen por Resonancia Magnética/métodosRESUMEN
Clinical diagnosis of the pediatric musculoskeletal system relies on the analysis of medical imaging examinations. In the medical image processing pipeline, semantic segmentation using deep learning algorithms enables an automatic generation of patient-specific three-dimensional anatomical models which are crucial for morphological evaluation. However, the scarcity of pediatric imaging resources may result in reduced accuracy and generalization performance of individual deep segmentation models. In this study, we propose to design a novel multi-task, multi-domain learning framework in which a single segmentation network is optimized over the union of multiple datasets arising from distinct parts of the anatomy. Unlike previous approaches, we simultaneously consider multiple intensity domains and segmentation tasks to overcome the inherent scarcity of pediatric data while leveraging shared features between imaging datasets. To further improve generalization capabilities, we employ a transfer learning scheme from natural image classification, along with a multi-scale contrastive regularization aimed at promoting domain-specific clusters in the shared representations, and multi-joint anatomical priors to enforce anatomically consistent predictions. We evaluate our contributions for performing bone segmentation using three scarce and pediatric imaging datasets of the ankle, knee, and shoulder joints. Our results demonstrate that the proposed approach outperforms individual, transfer, and shared segmentation schemes in Dice metric with statistically sufficient margins. The proposed model brings new perspectives towards intelligent use of imaging resources and better management of pediatric musculoskeletal disorders.
Asunto(s)
Algoritmos , Procesamiento de Imagen Asistido por Computador , Niño , Diagnóstico por Imagen , Humanos , Procesamiento de Imagen Asistido por Computador/métodos , Articulación de la RodillaRESUMEN
Automatic segmentation of abdominal organs in CT scans plays an important role in clinical practice. However, most existing benchmarks and datasets only focus on segmentation accuracy, while the model efficiency and its accuracy on the testing cases from different medical centers have not been evaluated. To comprehensively benchmark abdominal organ segmentation methods, we organized the first Fast and Low GPU memory Abdominal oRgan sEgmentation (FLARE) challenge, where the segmentation methods were encouraged to achieve high accuracy on the testing cases from different medical centers, fast inference speed, and low GPU memory consumption, simultaneously. The winning method surpassed the existing state-of-the-art method, achieving a 19× faster inference speed and reducing the GPU memory consumption by 60% with comparable accuracy. We provide a summary of the top methods, make their code and Docker containers publicly available, and give practical suggestions on building accurate and efficient abdominal organ segmentation models. The FLARE challenge remains open for future submissions through a live platform for benchmarking further methodology developments at https://flare.grand-challenge.org/.
Asunto(s)
Algoritmos , Tomografía Computarizada por Rayos X , Humanos , Tomografía Computarizada por Rayos X/métodos , Abdomen/diagnóstico por imagen , Benchmarking , Procesamiento de Imagen Asistido por Computador/métodosRESUMEN
In recent years, Artificial Intelligence (AI) has proven its relevance for medical decision support. However, the "black-box" nature of successful AI algorithms still holds back their wide-spread deployment. In this paper, we describe an eXplanatory Artificial Intelligence (XAI) that reaches the same level of performance as black-box AI, for the task of classifying Diabetic Retinopathy (DR) severity using Color Fundus Photography (CFP). This algorithm, called ExplAIn, learns to segment and categorize lesions in images; the final image-level classification directly derives from these multivariate lesion segmentations. The novelty of this explanatory framework is that it is trained from end to end, with image supervision only, just like black-box AI algorithms: the concepts of lesions and lesion categories emerge by themselves. For improved lesion localization, foreground/background separation is trained through self-supervision, in such a way that occluding foreground pixels transforms the input image into a healthy-looking image. The advantage of such an architecture is that automatic diagnoses can be explained simply by an image and/or a few sentences. ExplAIn is evaluated at the image level and at the pixel level on various CFP image datasets. We expect this new framework, which jointly offers high classification performance and explainability, to facilitate AI deployment.