ABSTRACT
The phenotypic analysis of root system growth is important to inform efforts to enhance plant resource acquisition from soils; however, root phenotyping remains challenging because of the opacity of soil, requiring systems that facilitate root system visibility and image acquisition. Previously reported systems require costly or bespoke materials not available in most countries, where breeders need tools to select varieties best adapted to local soils and field conditions. Here, we report an affordable soil-based growth (rhizobox) and imaging system to phenotype root development in glasshouses or shelters. All parts of the system are made from locally available commodity components, facilitating the adoption of this affordable technology in low-income countries. The rhizobox is large enough (approximately 6,000 cm² of visible soil) to avoid restricting vertical root system growth for most, if not all, of the life cycle, yet light enough (approximately 21 kg when filled with soil) for routine handling. Support structures and an imaging station, with five cameras covering the whole soil surface, complement the rhizoboxes. Images are acquired via the Phenotiki sensor interface, collected, stitched and analysed. Root system architecture (RSA) parameters are quantified without intervention. The RSAs of a dicot species (Cicer arietinum, chickpea) and a monocot species (Hordeum vulgare, barley), exhibiting contrasting root systems, were analysed. Insights into root system dynamics during the vegetative and reproductive stages of the chickpea life cycle were obtained. This affordable system is relevant for efforts in Ethiopia and other low- and middle-income countries to enhance crop yields and climate resilience sustainably.
Subject(s)
Plant Roots/anatomy & histology , Aging , Cicer/anatomy & histology , Cicer/genetics , Genotype , Hordeum/anatomy & histology , Hordeum/genetics , Phenotype , Soil
ABSTRACT
Cycle-consistent generative adversarial network (CycleGAN) has been widely used for cross-domain medical image synthesis tasks, particularly because of its ability to deal with unpaired data. However, most CycleGAN-based synthesis methods cannot achieve good alignment between the synthesized images and data from the source domain, even with additional image alignment losses. This is because the CycleGAN generator network can encode the relative deformations and noise associated with different domains. This can be detrimental for downstream applications that rely on the synthesized images, such as generating pseudo-CT for PET-MR attenuation correction. In this paper, we present a deformation-invariant cycle-consistency model that can filter out these domain-specific deformations. The deformation is globally parameterized by a thin-plate spline (TPS) and locally learned by modified deformable convolutional layers. Robustness to domain-specific deformations has been evaluated through experiments on multi-sequence brain MR data and multi-modality abdominal CT and MR data. Experimental results demonstrate that our method achieves better alignment between the source and target data while maintaining superior image quality compared with several state-of-the-art CycleGAN-based methods.
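The core idea, namely not penalising reconstruction error that a plausible spatial deformation can explain, can be sketched in miniature. The toy below uses 1-D "images" and circular shifts in place of the model's TPS and deformable-convolution deformation; all names and values are illustrative, not the paper's implementation.

```python
# Toy sketch of a deformation-tolerant cycle-consistency loss. In the real
# model the deformation is TPS-parameterized and locally learned; here a
# circular shift of a 1-D "image" stands in for the domain-specific deformation.
def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def shift(img, s):
    # circular shift stands in for a global spatial deformation
    return img[-s:] + img[:-s] if s else img[:]

def cycle_loss_plain(x, x_rec):
    return l1(x, x_rec)

def cycle_loss_deform_invariant(x, x_rec, max_shift=2):
    # penalise only appearance differences, not the best-matching deformation
    return min(l1(x, shift(x_rec, s)) for s in range(-max_shift, max_shift + 1))

x     = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
x_rec = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0]   # same content, shifted by one pixel
assert cycle_loss_deform_invariant(x, x_rec) < cycle_loss_plain(x, x_rec)
```

The plain cycle loss penalises the one-pixel misalignment even though the content matches; the deformation-tolerant version scores it as a perfect reconstruction.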
ABSTRACT
Direct observation of morphological plant traits is tedious and a bottleneck for high-throughput phenotyping. Hence, interest in image-based analysis is increasing, with the requirement for software that can reliably extract plant traits, such as leaf count, preferably across a variety of species and growth conditions. However, current leaf counting methods do not work across species or conditions and therefore may lack broad utility. In this paper, we present Pheno-Deep Counter, a single deep network that can predict leaf count in two-dimensional (2D) plant images of different species with a rosette-shaped appearance. We demonstrate that our architecture can count leaves from multi-modal 2D images, such as visible light, fluorescence and near-infrared. Our network design is flexible, allowing for inputs to be added or removed to accommodate new modalities. Furthermore, our architecture can be used as is without requiring dataset-specific customization of the internal structure of the network, opening its use to new scenarios. Pheno-Deep Counter is able to produce accurate predictions in many plant species and, once trained, can count leaves in a few seconds. Through our universal and open source approach to deep counting we aim to broaden utilization of machine learning-based approaches to leaf counting. Our implementation can be downloaded at https://bitbucket.org/tuttoweb/pheno-deep-counter.
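The flexible multi-modal design can be sketched as late fusion: each modality contributes a feature vector, the vectors are concatenated, and a single regression head predicts the count. The hand-crafted "features" and weights below are placeholders for the network's learned CNN branches, not the actual architecture.

```python
# Minimal late-fusion sketch for multi-modal count regression (hypothetical
# stand-in for the deep network; real features come from per-modality branches).
def extract_features(image):
    # stand-in "features": mean intensity and foreground fraction
    n = len(image)
    return [sum(image) / n, sum(1 for v in image if v > 0.5) / n]

def fuse_and_predict(modalities, weights, bias):
    feats = []
    for image in modalities:        # modalities can be added or removed freely
        feats.extend(extract_features(image))
    return bias + sum(w * f for w, f in zip(weights, feats))

rgb  = [0.9, 0.8, 0.1, 0.0]        # toy flattened images, one per modality
nir  = [0.7, 0.6, 0.2, 0.1]
pred = fuse_and_predict([rgb, nir], weights=[4.0, 8.0, 2.0, 6.0], bias=0.5)
```

Adding a fluorescence modality would simply append another feature vector (and matching weights), which mirrors the paper's point that inputs can be added or removed without redesigning the head.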
Subject(s)
Deep Learning , Phenotype , Plant Leaves/anatomy & histology , Image Processing, Computer-Assisted/methods , Machine Learning , Plants , Software
ABSTRACT
Phenotyping is important to understand plant biology, but current solutions are costly, not versatile, or difficult to deploy. To solve this problem, we present Phenotiki, an affordable system for plant phenotyping that, relying on off-the-shelf parts, provides an easy-to-install and easy-to-maintain platform, offering an out-of-the-box experience for a well-established phenotyping need: imaging rosette-shaped plants. The accompanying software (with available source code) processes data originating from our device seamlessly and automatically. Our software relies on machine learning to devise robust algorithms, and includes an automated leaf count obtained from 2D images without the need for depth (3D) information. Our affordable device (~200) can be deployed in growth chambers or greenhouses to acquire optical 2D images of up to approximately 60 adult Arabidopsis rosettes concurrently. Data from the device are processed remotely on a workstation or via a cloud application (based on CyVerse). In this paper, we present a proof-of-concept validation experiment on top-view images of 24 Arabidopsis plants in a combination of genotypes that has not been compared previously. Phenotypic analysis with respect to morphology, growth, color and leaf count has not been performed comprehensively before now. We confirm the findings of others on some of the extracted traits, showing that we can phenotype at reduced cost. We also perform extensive validations with external measurements and with higher-fidelity equipment, and find no loss in statistical accuracy when we use the affordable setting that we propose. Device set-up instructions and analysis software are publicly available (http://phenotiki.com).
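As one example of the kind of morphological trait such software extracts, projected rosette area can be computed from a plant segmentation mask by counting foreground pixels and applying the camera's spatial calibration. This is a generic sketch, not Phenotiki's actual code.

```python
# Projected rosette area from a top-view binary segmentation mask.
def projected_rosette_area(mask, mm_per_pixel):
    # mask: 2D list of 0/1 values from plant segmentation; area in mm^2
    pixels = sum(sum(row) for row in mask)
    return pixels * mm_per_pixel ** 2

mask = [[0, 1, 1],
        [1, 1, 1],
        [0, 1, 0]]
area = projected_rosette_area(mask, mm_per_pixel=0.5)  # 6 px * 0.25 mm^2/px
```

Growth curves then follow by tracking this area per plant across the imaging time series.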
Subject(s)
Plants/anatomy & histology , Plants/classification , Software , Algorithms , Image Processing, Computer-Assisted , Machine Learning , Phenotype , Plant Leaves/anatomy & histology , Plant Leaves/metabolism
ABSTRACT
Genetic variations in catechol-O-methyltransferase (COMT) that modulate cortical dopamine have been associated with pleiotropic behavioral effects in humans and mice. Recent data suggest that some of these effects may vary between the sexes. However, the specific brain substrates underlying COMT sexual dimorphisms remain unknown. Here, we report that genetically driven reduction in COMT enzyme activity increased cortical thickness in the prefrontal cortex (PFC) and postero-parieto-temporal cortex of male, but not female, adult mice and humans. Dichotomous changes in PFC cytoarchitecture were also observed: reduced COMT increased a measure of neuronal density in males, while reducing it in female mice. Consistent with the neuroanatomical findings, COMT-dependent sex-specific morphological brain changes were paralleled by divergent effects on PFC-dependent working memory in both mice and humans. These findings emphasize a specific sex-gene interaction that can modulate brain morphological substrates with influence on behavioral outcomes in healthy subjects and, potentially, in neuropsychiatric populations.
Subject(s)
Catechol O-Methyltransferase/genetics , Cerebral Cortex/anatomy & histology , Memory, Short-Term/physiology , Sex Characteristics , Adolescent , Adult , Analysis of Variance , Animals , Association Learning/physiology , Brain Mapping , Catechol O-Methyltransferase/deficiency , Cerebral Cortex/cytology , Female , Genotype , Homeodomain Proteins/metabolism , Humans , Magnetic Resonance Imaging , Male , Maze Learning , Mice , Mice, Transgenic , Middle Aged , Mutation/genetics , Neurons/metabolism , Nuclear Proteins/metabolism , Phosphopyruvate Hydratase/metabolism , Repressor Proteins/metabolism , Young Adult
ABSTRACT
PURPOSE: To examine whether controlled and tolerable levels of hypercapnia may be an alternative to adenosine, a routinely used coronary vasodilator, in healthy human subjects and animals. MATERIALS AND METHODS: Human studies were approved by the institutional review board and were HIPAA compliant. Eighteen subjects had end-tidal partial pressure of carbon dioxide (PetCO2) increased by 10 mm Hg, and myocardial perfusion was monitored with myocardial blood oxygen level-dependent (BOLD) magnetic resonance (MR) imaging. Animal studies were approved by the institutional animal care and use committee. Anesthetized canines with (n = 7) and without (n = 7) induced stenosis of the left anterior descending artery (LAD) underwent vasodilator challenges with hypercapnia and adenosine. LAD coronary blood flow velocity and free-breathing myocardial BOLD MR responses were measured at each intervention. Appropriate statistical tests were performed to evaluate measured quantitative changes in all parameters of interest in response to changes in partial pressure of carbon dioxide. RESULTS: Changes in myocardial BOLD MR signal were equivalent to reported changes with adenosine (11.2% ± 10.6 [hypercapnia, 10 mm Hg] vs 12% ± 12.3 [adenosine]; P = .75). In intact canines, there was a sigmoidal relationship between BOLD MR response and PetCO2 with most of the response occurring over a 10 mm Hg span. BOLD MR (17% ± 14 [hypercapnia] vs 14% ± 24 [adenosine]; P = .80) and coronary blood flow velocity (21% ± 16 [hypercapnia] vs 26% ± 27 [adenosine]; P > .99) responses were similar to that of adenosine infusion. BOLD MR signal changes in canines with LAD stenosis during hypercapnia and adenosine infusion were not different (1% ± 4 [hypercapnia] vs 6% ± 4 [adenosine]; P = .12). CONCLUSION: Free-breathing T2-prepared myocardial BOLD MR imaging showed that hypercapnia of 10 mm Hg may provide a cardiac hyperemic stimulus similar to adenosine.
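The BOLD responses above are reported as percent signal change relative to the resting baseline, which is computed as follows; the numbers are illustrative, not values from the study.

```python
# Percent BOLD signal change relative to the resting baseline signal.
def percent_signal_change(baseline, stimulus):
    return 100.0 * (stimulus - baseline) / baseline

# illustrative values: a hypercapnic response of roughly 11 percent
delta = percent_signal_change(baseline=100.0, stimulus=111.2)
```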
Subject(s)
Coronary Circulation/physiology , Hypercapnia/physiopathology , Magnetic Resonance Imaging/methods , Adenosine/pharmacology , Animals , Dogs , Electrocardiography , Humans , Image Enhancement/methods , Oximetry , Reproducibility of Results , Vasodilator Agents/pharmacology
ABSTRACT
Ultrasound is a promising medical imaging modality benefiting from low-cost and real-time acquisition. Accurate tracking of an anatomical landmark is of high interest for various clinical workflows such as minimally invasive surgery and ultrasound-guided radiation therapy. However, tracking an anatomical landmark accurately in ultrasound video is very challenging, due to landmark deformation, visual ambiguity and partial observation. In this paper, we propose a long-short diffeomorphism memory network (LSDM), a multi-task framework with an auxiliary learnable deformation prior to support accurate landmark tracking. Specifically, we design a novel diffeomorphic representation, which contains both long- and short-term temporal information stored in separate memory banks for delineating motion margins and reducing cumulative errors. We further propose an expectation-maximization memory alignment (EMMA) algorithm to iteratively optimize both the long and short deformation memory, updating the memory queue to mitigate local anatomical ambiguity. The proposed multi-task system can be trained in a weakly supervised manner, requiring only a few landmark annotations for tracking and no annotation for deformation learning. We conduct extensive experiments on both public and private ultrasound landmark tracking datasets. Experimental results show that LSDM achieves better or competitive landmark tracking performance with strong generalization capability across different scanner types and ultrasound modalities, compared with other state-of-the-art methods.
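The separate long and short memory banks can be sketched as two bounded queues: the short bank keeps the most recent deformation estimates, while the long bank keeps a subsampled history to anchor against drift. This is a hypothetical simplification of LSDM's memory design; all parameters are illustrative.

```python
from collections import deque

# Hypothetical sketch of the two deformation memory banks: a short bank holds
# recent estimates, a long bank holds a sparse subsampled history.
class DeformationMemory:
    def __init__(self, short_len=4, long_every=5, long_len=8):
        self.short = deque(maxlen=short_len)   # recent frames only
        self.long = deque(maxlen=long_len)     # sparse long-term history
        self.long_every = long_every
        self.t = 0

    def update(self, deformation):
        self.short.append(deformation)
        if self.t % self.long_every == 0:      # subsample for long-term memory
            self.long.append(deformation)
        self.t += 1

mem = DeformationMemory()
for frame in range(12):
    mem.update(frame)   # a real system would store dense deformation fields
```

In the paper the two banks are additionally aligned by the EMMA algorithm rather than being simple FIFO queues; the sketch only shows the long/short split.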
Subject(s)
Algorithms , Humans , Ultrasonography/methods , Motion (Physics)
ABSTRACT
Deep learning models often need sufficient supervision (i.e., labelled data) to be trained effectively. By contrast, humans can swiftly learn to identify important anatomy in medical images such as MRI and CT scans, with minimal guidance. This recognition capability easily generalises to new images from different medical facilities and to new tasks in different settings. Such rapid and generalisable learning is largely attributed to the compositional structure of image patterns in the human brain, which is not well represented in current medical models. In this paper, we study the use of compositionality for learning more interpretable and generalisable representations for medical image segmentation. Overall, we propose that the underlying generative factors used to generate the medical images satisfy a compositional equivariance property, where each factor is compositional (e.g., corresponds to human anatomy) and also equivariant to the task. Hence, a good representation that approximates the ground-truth factors well has to be compositionally equivariant. By modelling the compositional representations with learnable von Mises-Fisher (vMF) kernels, we explore how different design and learning biases can be used to enforce the representations to be more compositionally equivariant under un-, weakly- and semi-supervised settings. Extensive results show that our methods achieve the best performance over several strong baselines on the task of semi-supervised domain-generalised medical image segmentation. Code will be made publicly available upon acceptance at https://github.com/vios-s.
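For a fixed concentration parameter, scoring a feature against vMF kernels reduces to cosine similarity between the (normalised) feature and each kernel's mean direction, with the feature assigned to the best-matching kernel. The sketch below illustrates only this assignment step, with made-up two-dimensional "features"; it is not the paper's implementation.

```python
import math

# Assign a feature vector to its closest vMF kernel mean direction.
def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def vmf_assign(feature, kernel_means):
    f = normalize(feature)
    scores = [sum(a * b for a, b in zip(f, normalize(mu)))  # cosine similarity
              for mu in kernel_means]
    return max(range(len(scores)), key=scores.__getitem__)

kernels = [[1.0, 0.0], [0.0, 1.0]]   # e.g. two anatomy-like factors
assert vmf_assign([0.9, 0.1], kernels) == 0
assert vmf_assign([0.2, 2.0], kernels) == 1
```

In the full method the kernel means are learned jointly with the feature extractor, and the per-kernel activations form the compositional representation used for segmentation.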
Subject(s)
Algorithms , Brain , Deep Learning , Image Processing, Computer-Assisted , Magnetic Resonance Imaging , Humans , Brain/diagnostic imaging , Magnetic Resonance Imaging/methods , Image Processing, Computer-Assisted/methods , Tomography, X-Ray Computed/methods
ABSTRACT
Deep neural networks (DNNs) have achieved high accuracy in diagnosing multiple diseases and conditions at a large scale. However, a number of concerns have been raised about safeguarding data privacy and the algorithmic bias of neural network models. We demonstrate that unique features (UFs), such as names, IDs or other patient information, can be memorised (and eventually leaked) by neural networks even when a UF occurs in only a single training sample within the dataset. We explain this memorisation phenomenon by showing that it is more likely to occur when UFs are instances of a rare concept. We propose methods to identify whether a given model does or does not memorise a given (known) feature. Importantly, our method does not require access to the training data and can therefore be deployed by an external entity. We conclude that memorisation has implications for model robustness, but it can also pose a risk to the privacy of patients who consent to the use of their data for training models.
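One way an external entity can probe a deployed model for memorisation without any training data is to test whether inserting a candidate unique feature into otherwise-matched inputs consistently shifts the model's output. The toy below is a hypothetical simplification of such a black-box test, not the paper's exact procedure.

```python
# Hypothetical black-box memorisation probe: if inserting a candidate unique
# feature (UF) into matched inputs consistently changes the output, the model
# may have memorised that feature.
def memorisation_score(model, inputs, insert_uf):
    diffs = [abs(model(insert_uf(x)) - model(x)) for x in inputs]
    return sum(diffs) / len(diffs)

# toy "model" that has memorised the token 99 appearing in its input
model = lambda x: 1.0 if 99 in x else 0.0
insert_uf = lambda x: x + [99]
score = memorisation_score(model, [[1, 2], [3, 4]], insert_uf)
```

A score near zero suggests the feature is ignored; a consistently large score flags it for further investigation.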
Subject(s)
Neural Networks, Computer , Privacy , Humans
ABSTRACT
Leakage of data from publicly available Machine Learning (ML) models is an area of growing significance since commercial and government applications of ML can draw on multiple sources of data, potentially including users' and clients' sensitive data. We provide a comprehensive survey of contemporary advances on several fronts, covering involuntary data leakage which is natural to ML models, potential malicious leakage which is caused by privacy attacks, and currently available defence mechanisms. We focus on inference-time leakage, as the most likely scenario for publicly available models. We first discuss what leakage is in the context of different data, tasks, and model architectures. We then propose a taxonomy across involuntary and malicious leakage, followed by description of currently available defences, assessment metrics, and applications. We conclude with outstanding challenges and open questions, outlining some promising directions for future research.
ABSTRACT
Pathological brain lesions exhibit diverse appearance in brain images, in terms of intensity, texture, shape, size and location. Comprehensive sets of data and annotations are difficult to acquire. Therefore, unsupervised anomaly detection approaches have been proposed using only normal data for training, with the aim of detecting outlier anomalous voxels at test time. Denoising methods, for instance classical denoising autoencoders (DAEs) and the more recently emerging diffusion models, are a promising approach; however, naive application of pixelwise noise leads to poor anomaly detection performance. We show that optimising the spatial resolution and magnitude of the noise improves the performance of different model training regimes, with similar noise parameter adjustments giving good performance for both DAEs and diffusion models. Visual inspection of the reconstructions suggests that the training noise influences the trade-off between the extent of the detail that is reconstructed and the extent to which anomalies are erased, both of which contribute to better anomaly detection performance. We validate our findings on two real-world datasets (tumor detection in brain MRI and hemorrhage/ischemia/tumor detection in brain CT), showing good detection on diverse anomaly appearances. Overall, we find that a DAE trained with coarse noise is a fast and simple method that gives state-of-the-art accuracy. Diffusion models applied to anomaly detection are as yet in their infancy and provide a promising avenue for further research. Code for our DAE model and coarse noise is provided at: https://github.com/AntanasKascenas/DenoisingAE.
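"Coarse" noise can be produced by sampling noise on a low-resolution grid and upsampling it with nearest neighbour, so each corruption covers a spatial region rather than a single pixel. The sketch below is consistent with this idea but is not copied from the released code; the grid factor and sigma are the tunable resolution and magnitude parameters.

```python
import random

# Coarse spatial noise: sample on a low-resolution grid, then upsample with
# nearest neighbour so corruption has spatial extent instead of being pixelwise.
def coarse_noise(height, width, factor, sigma, rng):
    lo_h, lo_w = height // factor, width // factor
    lo = [[rng.gauss(0.0, sigma) for _ in range(lo_w)] for _ in range(lo_h)]
    return [[lo[i // factor][j // factor] for j in range(width)]
            for i in range(height)]

rng = random.Random(0)
noise = coarse_noise(8, 8, factor=4, sigma=1.0, rng=rng)
# every 4x4 block shares a single noise value
assert noise[0][0] == noise[3][3] and noise[0][0] != noise[4][4]
```

During DAE training this noise field would be added to the input image, and anomalies are flagged at test time where the reconstruction error is large.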
ABSTRACT
Assessment of myocardial viability is essential in diagnosis and treatment management of patients suffering from myocardial infarction, and classification of pathology on the myocardium is the key to this assessment. This work defines a new task of medical image analysis, i.e., to perform myocardial pathology segmentation (MyoPS) combining three-sequence cardiac magnetic resonance (CMR) images, which was first proposed in the MyoPS challenge, in conjunction with MICCAI 2020. Note that MyoPS refers to both myocardial pathology segmentation and the challenge in this paper. The challenge provided 45 paired and pre-aligned CMR images, allowing algorithms to combine the complementary information from the three CMR sequences for pathology segmentation. In this article, we provide details of the challenge, survey the works from fifteen participants and interpret their methods according to five aspects, i.e., preprocessing, data augmentation, learning strategy, model architecture and post-processing. In addition, we analyze the results with respect to different factors, in order to examine the key obstacles and explore the potential of solutions, as well as to provide a benchmark for future research. The average Dice scores of submitted algorithms were 0.614±0.231 and 0.644±0.153 for myocardial scars and edema, respectively. We conclude that while promising results have been reported, the research is still in the early stage, and more in-depth exploration is needed before a successful application to the clinics. MyoPS data and evaluation tool continue to be publicly available upon registration via its homepage (www.sdspeople.fudan.edu.cn/zhuangxiahai/0/myops20/).
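The Dice score used to rank submissions measures overlap between a predicted and a reference binary mask:

```python
# Dice overlap between two binary masks (flattened to 0/1 lists).
def dice(pred, target):
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 2.0 * inter / denom if denom else 1.0   # both empty: perfect match

pred   = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
score = dice(pred, target)   # 2*2 / (3+3)
```

A score of 1 means perfect overlap; the challenge reports the average over cases, separately for scar and edema.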
Subject(s)
Benchmarking , Image Processing, Computer-Assisted , Humans , Image Processing, Computer-Assisted/methods , Heart/diagnostic imaging , Myocardium/pathology , Magnetic Resonance Imaging/methods
ABSTRACT
PURPOSE: To investigate whether a statistical analysis of myocardial blood-oxygen-level-dependent (mBOLD) signal intensities can lead to the identification and quantification of the ischemic area supplied by the culprit artery. MATERIALS AND METHODS: Cardiac BOLD images were acquired in a canine model (n = 9) with controllable LCX stenosis at rest and during adenosine infusion on a 1.5T clinical scanner. Statistical distributions of myocardial pixel intensities derived from BOLD images were used to compute an area metric (ischemic extent, IE). True myocardial perfusion was estimated from microsphere analysis. IE was compared against a standard metric (segment intensity response, SIR). Additional animals (n = 3) were used to investigate the feasibility of the approach for identifying ischemic territories due to LAD stenosis from mBOLD images. RESULTS: Regression analyses showed that IE and the myocardial flow ratio between rest and adenosine infusion (MFR) were exponentially related (R² > 0.70, P < 0.001, for end-systole and end-diastole), while SIR and MFR were linearly related at end-systole (R² = 0.51, P < 0.04) and unrelated at end-diastole (R² ≈ 0, P = 0.91). Receiver-operating-characteristic analysis showed that IE was superior to SIR for detecting critical stenosis (MFR ≤ 2) in end-systole and end-diastole. Feasibility studies on LAD narrowing demonstrated that the proposed approach could also identify oxygenation changes in the LAD territories. CONCLUSION: The proposed evaluation of cardiac BOLD magnetic resonance imaging (MRI) offers marked improvement in sensitivity and specificity for detecting critical coronary stenosis at 1.5T compared with the mean segmental intensity approach. Patient studies are now warranted to determine its clinical utility.
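The spirit of a distribution-based area metric can be sketched as the fraction of stress-image myocardial pixels falling below a threshold derived from the rest intensity distribution. The simple mean-minus-k-sigma threshold and all values below are an illustrative stand-in for the paper's statistical model.

```python
# Illustrative "ischemic extent"-style area metric from pixel intensities.
def mean(xs):
    return sum(xs) / len(xs)

def std(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

def ischemic_extent(stress_pixels, rest_pixels, k=1.0):
    # threshold derived from the rest intensity distribution (assumed rule)
    threshold = mean(rest_pixels) - k * std(rest_pixels)
    low = sum(1 for p in stress_pixels if p < threshold)
    return low / len(stress_pixels)

rest   = [100, 102, 98, 101, 99]
stress = [120, 118, 80, 82, 121]         # two pixels fail to augment
extent = ischemic_extent(stress, rest)   # fraction of "ischemic" pixels
```

Unlike a per-segment mean, a pixelwise fraction captures how much of the myocardium fails to respond, which is what an area metric is after.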
Subject(s)
Coronary Stenosis/blood , Coronary Stenosis/diagnosis , Magnetic Resonance Imaging/methods , Myocardial Ischemia/blood , Myocardial Ischemia/diagnosis , Oxygen/blood , Animals , Biomarkers/blood , Coronary Stenosis/complications , Dogs , Myocardial Ischemia/complications , Reproducibility of Results , Sensitivity and Specificity , Severity of Illness Index
ABSTRACT
Due to the limited availability of medical data, deep learning approaches for medical image analysis tend to generalise poorly to unseen data. Augmenting data during training with random transformations has been shown to help and has become a ubiquitous technique for training neural networks. Here, we propose a novel adversarial counterfactual augmentation scheme that aims to find the most effective synthesised images to improve downstream tasks, given a pre-trained generative model. Specifically, we construct an adversarial game in which we update the input conditional factor of the generator and the downstream classifier with gradient backpropagation, alternately and iteratively. This can be viewed as finding the 'weakness' of the classifier and purposely forcing it to overcome its weakness via the generative model. To demonstrate the effectiveness of the proposed approach, we validate the method with the classification of Alzheimer's Disease (AD) as a downstream task. The pre-trained generative model synthesises brain images using age as the conditional factor. Extensive experiments and ablation studies show that the proposed approach improves classification performance and has the potential to alleviate spurious correlations and catastrophic forgetting. Code: https://github.com/xiat0616/adversarial_counterfactual_augmentation.
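The adversarial game can be illustrated with a toy: hold the generator and classifier fixed and move only the conditioning factor uphill on the classifier's loss, so the generator is driven toward the counterfactuals the classifier handles worst. The finite-difference gradient and the quadratic "loss" below are illustrative stand-ins for backpropagation through real networks.

```python
# Toy adversarial search over the generator's conditioning factor ("age"):
# gradient *ascent* on the downstream classifier's loss via finite differences.
def hardest_condition(generate, clf_loss, cond, steps=20, lr=0.5, eps=1e-3):
    for _ in range(steps):
        up = clf_loss(generate(cond + eps))
        down = clf_loss(generate(cond - eps))
        cond += lr * (up - down) / (2 * eps)   # move toward higher loss
    return cond

generate = lambda age: age                     # identity "generator" for the toy
clf_loss = lambda x: -(x - 70.0) ** 2          # classifier weakest near age 70
cond = hardest_condition(generate, clf_loss, cond=50.0)
```

Starting from age 50, the search converges to the condition (near age 70) where the toy classifier is weakest; in the full method such hardest counterfactual images are then fed back as augmentation when updating the classifier.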
ABSTRACT
Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains achieving remarkable performance in the corresponding task. This alleviation of the data and annotation requirements offers tantalising prospects for applications in computer vision and healthcare. In this tutorial paper, we motivate the need for disentangled representations, revisit key concepts, and describe practical building blocks and criteria for learning such representations. We survey applications in medical imaging emphasising choices made in exemplar key works, and then discuss links to computer vision applications. We conclude by presenting limitations, challenges, and opportunities.
Subject(s)
Learning , Machine Learning , Humans , Software
ABSTRACT
Causal machine learning (CML) has experienced increasing popularity in healthcare. Beyond the inherent capability of adding domain knowledge to learning systems, CML provides a complete toolset for investigating how a system would react to an intervention (e.g., the outcome given a treatment). Quantifying the effects of interventions allows actionable decisions to be made while maintaining robustness in the presence of confounders. Here, we explore how causal inference can be incorporated into different aspects of clinical decision support systems by using recent advances in machine learning. Throughout this paper, we use Alzheimer's disease to create examples illustrating how CML can be advantageous in clinical scenarios. Furthermore, we discuss important challenges present in healthcare applications, such as processing high-dimensional and unstructured data, generalization to out-of-distribution samples and temporal relationships, which, despite great effort from the research community, remain to be solved. Finally, we review lines of research within causal representation learning, causal discovery and causal reasoning which offer potential for addressing these challenges.
ABSTRACT
Age has important implications for health, and understanding how age manifests in the human body is the first step for a potential intervention. This becomes especially important for cardiac health, since age is the main risk factor for development of cardiovascular disease. Data-driven modeling of age progression has been conducted successfully in diverse applications such as face or brain aging. While longitudinal data is the preferred option for training deep learning models, collecting such a dataset is usually very costly, especially in medical imaging. In this work, a conditional generative adversarial network is proposed to synthesize older and younger versions of a heart scan by using only cross-sectional data. We train our model with more than 14,000 different scans from the UK Biobank. The induced modifications focused mainly on the interventricular septum and the aorta, which is consistent with the existing literature in cardiac aging. We evaluate the results by measuring image quality, the mean absolute error for predicted age using a pre-trained regressor, and demonstrate the application of synthetic data for counter-balancing biased datasets. The results suggest that the proposed approach is able to model realistic changes in the heart using only cross-sectional data and that these data can be used to correct age bias in a dataset.
ABSTRACT
PURPOSE: To investigate the contribution of proton density (PD) in T2-STIR-based edema imaging in the setting of acute myocardial infarction (AMI). MATERIALS AND METHODS: Canines (n = 5), subjected to full occlusion of the left anterior descending artery for 3 hours, underwent serial magnetic resonance imaging (MRI) studies 2 hours postreperfusion (day 0) and on day 2. During each study, T1 and T2 maps, STIR (TE = 7.1 msec and 64 msec) and late gadolinium enhancement (LGE) images were acquired. Using the T1 and T2 maps, the relaxation and PD contributions to myocardial edema contrast (EC) in STIR images at both TEs were calculated. RESULTS: Edematous territories showed a significant increase in PD (20.3 ± 14.3%, P < 0.05) relative to healthy territories. The contributions of T1 changes and of T2 or PD changes toward EC were in opposite directions. A one-tailed t-test confirmed that the mean T2- and PD-based EC at both TEs were greater than zero. EC from STIR images at TE = 7.1 msec was dominated by PD rather than T2 effects (94.3 ± 11.3% vs. 17.6 ± 2.5%, P < 0.05), while at TE = 64 msec, T2 effects were significantly greater than PD effects (90.8 ± 20.3% vs. 12.5 ± 11.9%, P < 0.05). The contribution from PD in standard STIR acquisitions (TE = 64 msec) was significantly higher than 0 (P < 0.05). CONCLUSION: In addition to T2-weighting, edema detection in the setting of AMI with T2-weighted STIR imaging has a substantial contribution from PD changes, likely stemming from increased free-water content within the affected tissue. This suggests that imaging approaches that take advantage of both PD and T2 effects may provide optimal sensitivity for detecting myocardial edema.
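Why short-TE STIR contrast is PD-dominated while long-TE contrast is T2-dominated can be illustrated with a simplified STIR signal model. The TR recovery term is omitted, and the tissue parameter values are assumptions for illustration, not the study's measurements.

```python
import math

# Simplified STIR magnitude signal (TR recovery term omitted):
#   S ~ PD * |1 - 2*exp(-TI/T1)| * exp(-TE/T2)
def stir_signal(pd, t1, t2, ti=170.0, te=7.1):
    return pd * abs(1.0 - 2.0 * math.exp(-ti / t1)) * math.exp(-te / t2)

HEALTHY = dict(pd=1.00, t1=850.0, t2=50.0)   # ms; illustrative myocardium values

def edema_contrast(te, **changed):
    # relative signal change when only the named parameters take edema values
    s0 = stir_signal(te=te, **HEALTHY)
    s1 = stir_signal(te=te, **{**HEALTHY, **changed})
    return (s1 - s0) / s0

# short TE: PD-driven contrast dominates; long TE: T2-driven contrast dominates
assert edema_contrast(7.1, pd=1.20) > edema_contrast(7.1, t2=65.0)
assert edema_contrast(64.0, t2=65.0) > edema_contrast(64.0, pd=1.20)
```

With these illustrative values, a 20% PD increase contributes ~20% contrast at either TE, while the prolonged T2 contributes only a few percent at TE = 7.1 msec but dominates at TE = 64 msec, mirroring the reported dominance pattern.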
Subject(s)
Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Myocardium/pathology , Reperfusion Injury/pathology , Animals , Contrast Media/pharmacology , Dogs , Edema , Female , Gadolinium/pharmacology , Male , Models, Biological , Models, Statistical , Myocardial Infarction/pathology , Protons
ABSTRACT
Large, fine-grained image segmentation datasets, annotated at pixel-level, are difficult to obtain, particularly in medical imaging, where annotations also require expert knowledge. Weakly-supervised learning can train models by relying on weaker forms of annotation, such as scribbles. Here, we learn to segment using scribble annotations in an adversarial game. With unpaired segmentation masks, we train a multi-scale GAN to generate realistic segmentation masks at multiple resolutions, while we use scribbles to learn their correct position in the image. Central to the model's success is a novel attention gating mechanism, which we condition with adversarial signals to act as a shape prior, resulting in better object localization at multiple scales. Subject to adversarial conditioning, the segmentor learns attention maps that are semantic, suppress the noisy activations outside the objects, and reduce the vanishing gradient problem in the deeper layers of the segmentor. We evaluated our model on several medical (ACDC, LVSC, CHAOS) and non-medical (PPSS) datasets, and we report performance levels matching those achieved by models trained with fully annotated segmentation masks. We also demonstrate extensions in a variety of settings: semi-supervised learning; combining multiple scribble sources (a crowdsourcing scenario) and multi-task learning (combining scribble and mask supervision). We release expert-made scribble annotations for the ACDC dataset, and the code used for the experiments, at https://vios-s.github.io/multiscale-adversarial-attention-gates.
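Scribble supervision amounts to evaluating the segmentation loss only at annotated pixels; elsewhere, the adversarial mask prior provides the learning signal. A minimal sketch of such a partial cross-entropy follows (unlabelled pixels, marked `None`, are simply skipped):

```python
import math

# Partial cross-entropy: supervise only where the scribble provides a label.
def partial_cross_entropy(probs, scribble):
    terms = [-math.log(p[y]) for p, y in zip(probs, scribble) if y is not None]
    return sum(terms) / len(terms) if terms else 0.0

probs    = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]   # per-pixel class probabilities
scribble = [0, 1, None]                           # last pixel is unannotated
loss = partial_cross_entropy(probs, scribble)
```

This is the generic scribble-loss idea rather than the paper's full objective, which additionally conditions multi-scale attention gates with adversarial signals from unpaired masks.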
Subject(s)
Image Processing, Computer-Assisted , Supervised Machine Learning , Attention , Humans , Semantics
ABSTRACT
How will my face look when I get older? Or, for a more challenging question: how will my brain look when I get older? To answer this question one must devise (and learn from data) a multivariate auto-regressive function which, given an image and a desired target age, generates an output image. While collecting data for faces may be easier, collecting longitudinal brain data is not trivial. We propose a deep learning-based method that learns to simulate subject-specific brain ageing trajectories without relying on longitudinal data. Our method synthesises images conditioned on two factors: age (a continuous variable) and status of Alzheimer's Disease (AD, an ordinal variable). With an adversarial formulation we learn the joint distribution of brain appearance, age and AD status, and define reconstruction losses to address the challenging problem of preserving subject identity. We compare with several benchmarks using two widely used datasets. We evaluate the quality and realism of synthesised images using ground-truth longitudinal data and a pre-trained age predictor. We show that, despite the use of cross-sectional data, our model learns patterns of gray matter atrophy in the middle temporal gyrus in patients with AD. To demonstrate generalisation ability, we train on one dataset and evaluate predictions on the other. In conclusion, our model shows an ability to separate age, disease influence and anatomy using only 2D cross-sectional data, which should be useful in large studies of neurodegenerative disease that aim to combine several data sources. To facilitate such future studies by the community at large, our code is made available at https://github.com/xiat0616/BrainAgeing.