ABSTRACT
PURPOSE: A retrospective study was performed to assess the effect of fetal surgery on brain development, measured by MRI, in fetuses with myelomeningocele (MMC). METHODS: MRI scans of 12 MMC fetuses before and after surgery were compared to those of 24 age-matched controls without central nervous system abnormalities. An automated super-resolution reconstruction technique generated isotropic brain volumes to mitigate fetal motion artefact in 2D MRI. Unmyelinated white matter, cerebellum and ventricles were automatically segmented, and cerebral volume, shape and cortical folding were then quantified. Biometric measures were calculated for cerebellar herniation level (CHL), clivus-supraocciput angle (CSO), transverse cerebellar diameter (TCD) and ventricular width (VW). The shape index (SI), a mathematical marker of gyrification, was derived. We compared cerebral volume, surface area and SI before and after MMC fetal surgery versus controls, and additionally examined the relationship between these outcomes and the biometric measurements. RESULTS: MMC ventricular volume growth (mm3/week) increased after fetal surgery (median: 3699, interquartile range (IQR): 1651-5395) compared to controls (median: 648, IQR: 371-896); P = 0.015. The MMC SI was higher pre-operatively in all cerebral lobes than in controls. The change in SI/week in MMC fetuses was higher in the left temporal lobe (median: 0.039, IQR: 0.021-0.054), left parietal lobe (median: 0.032, IQR: 0.023-0.039) and right occipital lobe (median: 0.027, IQR: 0.019-0.040) versus controls (P = 0.002 to 0.005). Ventricular volume (mm3) and VW (mm) (r = 0.64), and cerebellar volume and TCD (r = 0.56), were moderately correlated. CONCLUSIONS: Following fetal myelomeningocele repair, brain volume, shape and SI were significantly different from normal in most cerebral layers. Morphological brain changes after fetal surgery are not limited to hindbrain herniation reversal. These findings may have neurocognitive outcome implications and require further evaluation.
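The shape index used above as a gyrification marker is commonly derived from the two principal curvatures of the cortical surface (Koenderink's formulation); note that sign conventions vary between papers, and the function below is an illustrative sketch rather than the study's implementation:

```python
import math

def shape_index(k1, k2):
    """Koenderink-style shape index from principal curvatures.

    With k1 >= k2, SI = (2/pi) * atan((k1 + k2) / (k1 - k2)).
    Ranges over [-1, 1]: -1 is a spherical cup, 0 a saddle,
    +1 a spherical cap (convention-dependent).
    """
    if k1 < k2:
        k1, k2 = k2, k1
    if math.isclose(k1, k2):
        # Umbilic point (equal curvatures): limit is a perfect cap/cup.
        return math.copysign(1.0, k1)
    return (2.0 / math.pi) * math.atan((k1 + k2) / (k1 - k2))
```

In practice the principal curvatures would be estimated per vertex on a reconstructed cortical surface mesh, and SI aggregated per lobe.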
Subjects
Meningomyelocele, Spinal Dysraphism, Brain/diagnostic imaging, Brain/surgery, Fetus, Humans, Magnetic Resonance Imaging, Meningomyelocele/diagnostic imaging, Meningomyelocele/surgery, Retrospective Studies
ABSTRACT
Deep learning models for medical image segmentation can fail unexpectedly and spectacularly for pathological cases and for images acquired at centers other than those of the training images, with labeling errors that violate expert knowledge. Such errors undermine the trustworthiness of deep learning models for medical image segmentation. Mechanisms for detecting and correcting such failures are essential for safely translating this technology into clinics and are likely to be a requirement of future regulations on artificial intelligence (AI). In this work, we propose a trustworthy AI theoretical framework and a practical system that can augment any backbone AI system using a fallback method and a fail-safe mechanism based on Dempster-Shafer theory. Our approach relies on an actionable definition of trustworthy AI. Our method automatically discards voxel-level labels predicted by the backbone AI that violate expert knowledge, relying on a fallback for those voxels. We demonstrate the effectiveness of the proposed trustworthy AI approach on the largest reported annotated dataset of fetal MRI, consisting of 540 manually annotated fetal brain 3D T2w MRIs from 13 centers. Our trustworthy AI method improves the robustness of four backbone AI models for fetal brain MRIs acquired across various centers and for fetuses with various brain abnormalities.
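The fail-safe mechanism above builds on Dempster-Shafer theory. As a sketch of the underlying machinery only (not the paper's specific system), Dempster's rule of combination fuses two mass functions over subsets of a frame of discernment, renormalizing away conflicting mass:

```python
def dempster_combine(m1, m2):
    """Dempster's rule of combination for two belief mass functions.

    m1, m2: dicts mapping frozenset hypotheses to masses summing to 1.
    Mass assigned to pairs with empty intersection is conflict; the
    remaining combined mass is renormalized by (1 - conflict).
    """
    combined = {}
    conflict = 0.0
    for b, mb in m1.items():
        for c, mc in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mb * mc
            else:
                conflict += mb * mc
    if conflict >= 1.0:
        raise ValueError("total conflict: sources fully disagree")
    norm = 1.0 - conflict
    return {h: m / norm for h, m in combined.items()}
```

For voxel-wise segmentation, the hypotheses could be tissue-label sets from the backbone model and from an expert-knowledge source; how the paper constructs its mass functions is not specified in the abstract.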
Subjects
Algorithms, Artificial Intelligence, Magnetic Resonance Imaging, Fetus/diagnostic imaging, Brain/diagnostic imaging
ABSTRACT
In-utero fetal MRI is emerging as an important tool in the diagnosis and analysis of the developing human brain. Automatic segmentation of the developing fetal brain is a vital step in the quantitative analysis of prenatal neurodevelopment in both research and clinical contexts. However, manual segmentation of cerebral structures is time-consuming and prone to error and inter-observer variability. Therefore, we organized the Fetal Tissue Annotation (FeTA) Challenge in 2021 to encourage the development of automatic segmentation algorithms at an international level. The challenge used the FeTA Dataset, an open dataset of fetal brain MRI reconstructions segmented into seven tissues (external cerebrospinal fluid, gray matter, white matter, ventricles, cerebellum, brainstem, deep gray matter). Twenty international teams participated in this challenge, submitting a total of 21 algorithms for evaluation. In this paper, we provide a detailed analysis of the results from both a technical and a clinical perspective. All participants relied on deep learning methods, mainly U-Nets, with some variability in network architecture, optimization, and image pre- and post-processing. The majority of teams used existing medical imaging deep learning frameworks. The main differences between the submissions were the fine-tuning done during training and the specific pre- and post-processing steps performed. The challenge results showed that almost all submissions performed similarly. Four of the top five teams used ensemble learning methods. However, one team's algorithm, built on an asymmetrical U-Net architecture, significantly outperformed the other submissions. This paper provides a first-of-its-kind benchmark for future automatic multi-tissue segmentation algorithms for the developing human brain in utero.
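Segmentation challenges of this kind are typically scored per tissue class with overlap metrics such as the Dice similarity coefficient (the abstract does not name the exact metric set, so this is a generic sketch):

```python
def dice(a, b):
    """Dice similarity coefficient between two binary masks.

    a, b: equal-length flat sequences of 0/1 voxel labels.
    Returns 2|A intersect B| / (|A| + |B|); defined as 1.0 when
    both masks are empty.
    """
    inter = sum(1 for x, y in zip(a, b) if x and y)
    size = sum(a) + sum(b)
    return 1.0 if size == 0 else 2.0 * inter / size
```

A multi-class evaluation would compute this once per tissue label (one-vs-rest) and average across classes and cases.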
Subjects
Image Processing, Computer-Assisted, White Matter, Pregnancy, Female, Humans, Image Processing, Computer-Assisted/methods, Brain/diagnostic imaging, Head, Fetus/diagnostic imaging, Algorithms, Magnetic Resonance Imaging/methods
ABSTRACT
Background: Spina bifida aperta (SBA) is a birth defect associated with severe anatomical changes in the developing fetal brain. Brain magnetic resonance imaging (MRI) atlases are popular tools for studying neuropathology in brain anatomy, but previous fetal brain MRI atlases have focused on the normal fetal brain. We aimed to develop a spatio-temporal fetal brain MRI atlas for SBA. Methods: We developed a semi-automatic computational method to compute the first spatio-temporal fetal brain MRI atlas for SBA. We used 90 MRIs of fetuses with SBA with gestational ages ranging from 21 to 35 weeks. Isotropic and motion-free 3D reconstructed MRIs were obtained for all examinations. We propose a protocol for the annotation of anatomical landmarks in brain 3D MRI of fetuses with SBA, with the aim of making spatial alignment of abnormal fetal brain MRIs more robust. In addition, we propose a weighted generalized Procrustes method based on the anatomical landmarks for the initialization of the atlas. The proposed weighted generalized Procrustes method can handle temporal regularization and missing annotations. After initialization, the atlas is refined iteratively using non-linear image registration based on image intensity and the anatomical landmarks. A semi-automatic method is used to obtain a parcellation of our fetal brain atlas into eight tissue types: white matter, ventricular system, cerebellum, extra-axial cerebrospinal fluid, cortical gray matter, deep gray matter, brainstem, and corpus callosum. Results: An intra-rater variability analysis suggests that the seven anatomical landmarks are sufficiently reliable. We find that the proposed atlas outperforms a normal fetal brain atlas for the automatic segmentation of brain 3D MRI of fetuses with SBA. Conclusions: We make our spatio-temporal fetal brain MRI atlas for SBA publicly available at https://doi.org/10.7303/syn25887675. This atlas can support future research on automatic segmentation methods for brain 3D MRI of fetuses with SBA.
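The paper's method applies weighted generalized Procrustes alignment to 3D landmarks with temporal regularization; as a much-simplified illustration, the pairwise 2D case below (a hypothetical helper, not the authors' code) finds the weighted least-squares similarity transform between two landmark sets using complex arithmetic, with zero weights standing in for missing annotations:

```python
def weighted_procrustes_2d(src, dst, w):
    """Weighted pairwise Procrustes alignment of 2D landmark sets.

    src, dst: landmarks as complex numbers x + 1j*y.
    w: per-landmark weights (0.0 for a missing annotation).
    Returns src mapped by the weighted least-squares similarity
    transform (rotation + uniform scale + translation) onto dst.
    """
    tw = sum(w)
    mu_s = sum(wi * p for wi, p in zip(w, src)) / tw
    mu_d = sum(wi * q for wi, q in zip(w, dst)) / tw
    # Optimal complex scale-rotation after weighted centering.
    num = sum(wi * (p - mu_s).conjugate() * (q - mu_d)
              for wi, p, q in zip(w, src, dst))
    den = sum(wi * abs(p - mu_s) ** 2 for wi, p in zip(w, src))
    s = num / den
    return [s * (p - mu_s) + mu_d for p in src]
```

The generalized (group-wise) version iterates this alignment of every subject against an evolving mean shape; extending to 3D replaces the complex trick with an SVD-based rotation estimate.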
ABSTRACT
Producing manual, pixel-accurate image segmentation labels is tedious and time-consuming. This is often a rate-limiting factor when large amounts of labeled images are required, such as for training deep convolutional networks for instrument-background segmentation in surgical scenes. No large datasets comparable to industry standards in the computer vision community are available for this task. To circumvent this problem, we propose to automate the creation of a realistic training dataset by exploiting techniques stemming from special effects and harnessing them to target training performance rather than visual appeal. Foreground data is captured by placing sample surgical instruments over a chroma key (a.k.a. green screen) in a controlled environment, thereby making extraction of the relevant image segment straightforward. Multiple lighting conditions and viewpoints can be captured and introduced in the simulation by moving the instruments and camera and modulating the light source. Background data is captured by collecting videos that do not contain instruments. In the absence of pre-existing instrument-free background videos, minimal labeling effort is required, just to select frames that do not contain surgical instruments from videos of surgical interventions freely available online. We compare different methods to blend instruments over tissue and propose a novel data augmentation approach that takes advantage of the plurality of options. We show that by training a vanilla U-Net on semi-synthetic data only and applying simple post-processing, we are able to match the results of the same network trained on a publicly available, manually labeled real dataset.
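The chroma-key extraction step can be sketched as a per-pixel test for green-screen dominance followed by compositing over a background frame; the actual pipeline compares several, more sophisticated blending modes, so treat this hard-mask version (and its threshold) as illustrative only:

```python
def chroma_key_composite(fg, bg, g_thresh=1.3):
    """Composite a green-screen foreground over a background image.

    fg, bg: equal-length lists of (r, g, b) tuples (flattened images,
    channel values in 0..255). A foreground pixel whose green channel
    exceeds g_thresh times both other channels is treated as screen
    and replaced by the background pixel; all others are kept.
    """
    out = []
    for (fr, fgr, fb), bpx in zip(fg, bg):
        is_screen = fgr > g_thresh * max(fr, fb, 1)
        out.append(bpx if is_screen else (fr, fgr, fb))
    return out
```

The binary keep/replace decision doubles as a free segmentation label for the instrument, which is the point of the green-screen setup.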
Subjects
Image Processing, Computer-Assisted, Neural Networks, Computer, Surgical Instruments
ABSTRACT
A multitude of image-based machine learning segmentation and classification algorithms has recently been proposed, offering diagnostic decision support for the identification and characterization of glioma, Covid-19 and many other diseases. Even though these algorithms often outperform human experts in segmentation tasks, their limited reliability, and in particular the inability to detect failure cases, has hindered translation into clinical practice. To address this major shortcoming, we propose an unsupervised quality estimation method for segmentation ensembles. Our primitive solution examines discord in binary segmentation maps to automatically flag segmentation results that are particularly error-prone and therefore require special assessment by human readers. We validate our method both on segmentation of brain glioma in multi-modal magnetic resonance images and on segmentation of lung lesions in computed tomography images. Additionally, our method provides an adaptive prioritization mechanism to maximize efficacy in use of human expert time by enabling radiologists to focus on the most difficult, yet important cases while maintaining full diagnostic autonomy. Our method offers an intuitive and reliable uncertainty estimation from segmentation ensembles and thereby closes an important gap toward successful translation of automatic segmentation into clinical routine.
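The abstract does not specify how discord among the binary maps is quantified; one simple instantiation is the mean pairwise voxel disagreement rate across ensemble members, sketched here as a hypothetical scoring function:

```python
def ensemble_discord(masks):
    """Mean pairwise disagreement rate among binary segmentation masks.

    masks: list of equal-length 0/1 lists, one per ensemble member.
    Returns a score in [0, 1]; higher values indicate error-prone
    cases that could be flagged for human review.
    """
    n = len(masks)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            diff = sum(1 for a, b in zip(masks[i], masks[j]) if a != b)
            total += diff / len(masks[i])
            pairs += 1
    return total / pairs
```

Ranking cases by this score yields the kind of adaptive prioritization the abstract describes: human readers start with the highest-discord cases.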
ABSTRACT
PURPOSE: This study aims to evaluate the impact of key parameters on the quality of pseudo computed tomography (pCT) images generated from magnetic resonance imaging (MRI) with a 3-dimensional (3D) convolutional neural network (CNN). METHODS AND MATERIALS: Four hundred two brain tumor cases were retrieved, yielding associations between 182 computed tomography (CT) and T1-weighted MRI (T1) scans, 180 CT and contrast-enhanced T1-weighted MRI (T1-Gd) scans, and 40 CT, T1, and T1-Gd scans. A 3D CNN was used to map T1 or T1-Gd onto CT scans and to evaluate the importance of different components. First, the influence of training set size on testing set accuracy was assessed. Next, we evaluated the impact of the MRI sequence using T1-only and T1-Gd-only cohorts. We then investigated 4 MRI standardization approaches (histogram-based, zero-mean/unit-variance, white stripe, and no standardization), based on training, validation, and testing cohorts of 242, 81, and 79 patient cases, respectively, as well as the influence of bias field correction. Finally, 2 networks, HighResNet and 3D UNet, were compared to evaluate the impact of architecture on pCT quality. The mean absolute error, gamma indices, and dose-volume histograms were used as evaluation metrics. RESULTS: Generating models using all the available cases for training led to higher pCT quality. The T1 and T1-Gd models had a maximum difference in gamma index means of 0.07 percentage point. The mean absolute error obtained with white stripe was 78 ± 22 Hounsfield units, which slightly outperformed histogram-based, zero-mean/unit-variance, and no standardization (P < .0001). Regarding the network architectures, 3%/3 mm gamma indices of 99.83% ± 0.19% and 99.74% ± 0.24% were obtained for HighResNet and 3D UNet, respectively. CONCLUSIONS: Our best pCTs were generated using more than 200 samples in the training data set. Training with T1 only and T1-Gd only did not significantly affect performance. Regardless of the preprocessing applied, the dosimetry quality remained equivalent and relevant for potential use in clinical practice.
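Two of the building blocks named above are straightforward to sketch: zero-mean/unit-variance standardization of MRI intensities, and the mean absolute error between a pCT and the reference CT in Hounsfield units (illustrative helpers, not the study's code; white-stripe and histogram-based standardization are more involved):

```python
def zero_mean_unit_variance(values):
    """Zero-mean/unit-variance standardization of intensity values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = var ** 0.5 or 1.0  # guard against constant input
    return [(v - mean) / std for v in values]

def mae_hu(pred_ct, true_ct):
    """Mean absolute error (Hounsfield units) between pCT and CT."""
    return sum(abs(p - t) for p, t in zip(pred_ct, true_ct)) / len(pred_ct)
```

In a full pipeline, standardization would be fit on brain-masked voxels only, and the MAE likewise restricted to the patient volume.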
Subjects
Brain Neoplasms/diagnostic imaging, Deep Learning, Magnetic Resonance Imaging/methods, Tomography, X-Ray Computed/methods, Brain/diagnostic imaging, Brain Neoplasms/radiotherapy, Contrast Media, Humans, Magnetic Resonance Imaging/standards, Neural Networks, Computer, Radiometry, Radiotherapy/standards, Retrospective Studies, Skull/diagnostic imaging
ABSTRACT
BACKGROUND AND OBJECTIVES: Medical image analysis and computer-assisted intervention problems are increasingly being addressed with deep-learning-based solutions. Established deep-learning platforms are flexible but do not provide specific functionality for medical image analysis, and adapting them for this domain of application requires substantial implementation effort. Consequently, there has been substantial duplication of effort and incompatible infrastructure developed across many research groups. This work presents the open-source NiftyNet platform for deep learning in medical imaging. The ambition of NiftyNet is to accelerate and simplify the development of these solutions, and to provide a common mechanism for disseminating research outputs for the community to use, adapt and build upon. METHODS: The NiftyNet infrastructure provides a modular deep-learning pipeline for a range of medical imaging applications including segmentation, regression, image generation and representation learning. Components of the NiftyNet pipeline, including data loading, data augmentation, network architectures, loss functions and evaluation metrics, are tailored to, and take advantage of, the idiosyncrasies of medical image analysis and computer-assisted intervention. NiftyNet is built on the TensorFlow framework and supports features such as TensorBoard visualization of 2D and 3D images and computational graphs by default. RESULTS: We present three illustrative medical image analysis applications built using NiftyNet infrastructure: (1) segmentation of multiple abdominal organs from computed tomography; (2) image regression to predict computed tomography attenuation maps from brain magnetic resonance images; and (3) generation of simulated ultrasound images for specified anatomical poses. CONCLUSIONS: The NiftyNet infrastructure enables researchers to rapidly develop and distribute deep learning solutions for segmentation, regression, image generation and representation learning applications, or to extend the platform to new applications.