ABSTRACT
With the rise of open data, the identifiability of individuals from 3D renderings of routine structural magnetic resonance imaging (MRI) head scans has become a growing privacy concern. To protect subject privacy, several algorithms have been developed to de-identify imaging data by blurring, defacing or refacing. Completely removing facial structures provides the best protection against re-identification but can significantly impact post-processing steps such as brain morphometry. As an alternative, refacing methods that replace individual facial structures with generic templates have a smaller effect on the geometry and intensity distribution of the original scans and provide more consistent post-processing results, at the price of higher re-identification risk and computational complexity. In the current study, we propose a novel method for anonymized face generation for defaced 3D T1-weighted scans based on a 3D conditional generative adversarial network. To evaluate the performance of the proposed de-identification tool, we conducted a comparative study of several existing defacing and refacing tools, with two different segmentation algorithms (FAST and Morphobox). The aim was to evaluate (i) the impact on brain morphometry reproducibility, (ii) the re-identification risk, (iii) the balance between (i) and (ii), and (iv) the processing time. The proposed method takes 9 s for face generation and is suitable for recovering consistent post-processing results after defacing.
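As a concrete illustration of the core component, the sketch below shows a minimal pix2pix-style 3D generator that maps a defaced T1-weighted volume to a volume with a synthetic face; this is an assumption-laden toy (layer widths, depth, and the single skip connection are ours), not the authors' network.

```python
import torch
import torch.nn as nn

class Reface3DGenerator(nn.Module):
    """Toy pix2pix-style 3D generator: defaced volume in, refaced volume out."""
    def __init__(self, ch=16):
        super().__init__()
        # Encoder: downsample the defaced volume twice.
        self.enc1 = nn.Sequential(nn.Conv3d(1, ch, 4, stride=2, padding=1),
                                  nn.InstanceNorm3d(ch), nn.LeakyReLU(0.2))
        self.enc2 = nn.Sequential(nn.Conv3d(ch, ch * 2, 4, stride=2, padding=1),
                                  nn.InstanceNorm3d(ch * 2), nn.LeakyReLU(0.2))
        # Decoder: upsample back to the input resolution.
        self.dec1 = nn.Sequential(nn.ConvTranspose3d(ch * 2, ch, 4, stride=2, padding=1),
                                  nn.InstanceNorm3d(ch), nn.ReLU())
        self.dec2 = nn.ConvTranspose3d(ch * 2, 1, 4, stride=2, padding=1)

    def forward(self, defaced):
        e1 = self.enc1(defaced)
        e2 = self.enc2(e1)
        d1 = self.dec1(e2)
        # U-Net-style skip connection helps preserve anatomy outside the face.
        return torch.tanh(self.dec2(torch.cat([d1, e1], dim=1)))

g = Reface3DGenerator()
vol = torch.randn(1, 1, 64, 64, 64)   # toy defaced volume
refaced = g(vol)                      # same shape as the input
```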
Subjects
Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging/methods , Adult , Brain/diagnostic imaging , Brain/anatomy & histology , Male , Female , Neural Networks, Computer , Imaging, Three-Dimensional/methods , Neuroimaging/methods , Neuroimaging/standards , Data Anonymization , Young Adult , Image Processing, Computer-Assisted/methods , Image Processing, Computer-Assisted/standards , Algorithms
ABSTRACT
The modern aircraft cockpit is highly information-intensive. Pilots often need to take in a large amount of information and make correct judgments and decisions in a short time, yet cognitive load affects their ability to perceive, judge and decide accurately, and excessive cognitive load can induce incorrect operations and even lead to flight accidents. Research on cognitive load is therefore crucial to reducing errors and accidents caused by human factors. Using physiological acquisition systems for eye movement, ECG, and respiration, multi-source physiological signals were obtained from flight cadets performing different flight tasks in a flight simulation experiment. Based on the characteristic indexes extracted from the multi-source physiological data, a CGAN-DBN model is established by combining the conditional generative adversarial network (CGAN) model with the deep belief network (DBN) model to identify the flight cadets' cognitive load. The results show that cognitive load identification based on the established CGAN-DBN model achieves high accuracy and can effectively identify the cognitive load of flight cadets. This work has important practical significance for reducing flight accidents caused by high pilot cognitive load.
In our study, a highly accurate cognitive load identification model for flight cadets was established using multi-source physiological data. Moreover, it provides a theoretical basis for identifying the cognitive load of pilots through wearable physiological devices. Our intent is to catalyse further research and technological development.
ABSTRACT
Recent functional magnetic resonance imaging (fMRI) studies have made significant progress in reconstructing perceived visual content, advancing our understanding of the visual mechanism. However, reconstructing dynamic natural vision remains a challenge because of fMRI's limited temporal resolution. Here, we developed a novel fMRI-conditional video generative adversarial network (f-CVGAN) to reconstruct rapid video stimuli from evoked fMRI responses. In this model, a generator produces spatiotemporal reconstructions, while two separate discriminators (a spatial and a temporal discriminator) assess them. We trained and tested the f-CVGAN on two publicly available video-fMRI datasets, and the model produced pixel-level reconstructions of eight perceived video frames from each fMRI volume. Experimental results showed that the reconstructed videos were fMRI-related and captured important spatial and temporal information of the original stimuli. Moreover, we visualized the cortical importance map and found that the visual cortex is extensively involved in the reconstruction, with the low-level visual areas (V1/V2/V3/V4) contributing the most. Our work suggests that slow blood-oxygen-level-dependent signals carry neural representations of the fast perceptual process that can be decoded in practice.
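A minimal sketch of the dual-discriminator idea follows (the real f-CVGAN architectures are far larger; all module shapes here are illustrative assumptions): the generator is judged per frame by a spatial discriminator and per clip by a temporal discriminator.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the generator and the two discriminators.
G = nn.Sequential(nn.Linear(100, 8 * 32 * 32), nn.Tanh())            # fMRI features -> 8 frames
D_spatial = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 1))       # scores one frame
D_temporal = nn.Sequential(nn.Flatten(), nn.Linear(8 * 32 * 32, 1))  # scores the whole clip

bce = nn.BCEWithLogitsLoss()
fmri = torch.randn(4, 100)                        # batch of fMRI feature vectors
fake = G(fmri).view(4, 8, 32, 32)                 # (batch, frames, H, W)

# Generator loss combines both discriminators' judgments:
frame = fake[:, 0]                                # e.g., one sampled frame for the spatial D
g_loss = bce(D_spatial(frame), torch.ones(4, 1)) \
       + bce(D_temporal(fake), torch.ones(4, 1))
g_loss.backward()
```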
Subjects
Magnetic Resonance Imaging , Visual Cortex , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Visual Cortex/diagnostic imaging , Visual Cortex/physiology
ABSTRACT
In this paper, we propose a new model for conditional video generation (GammaGAN). Generally, it is challenging to generate a plausible video from a single image with a class label as a condition. Traditional methods based on conditional generative adversarial networks (cGANs) often encounter difficulties in effectively utilizing a class label, typically by concatenating a class label to the input or hidden layer. In contrast, the proposed GammaGAN adopts the projection method to effectively utilize a class label and proposes scaling class embeddings and normalizing outputs. Concretely, our proposed architecture consists of two streams: a class embedding stream and a data stream. In the class embedding stream, class embeddings are scaled to effectively emphasize class-specific differences. Meanwhile, the outputs in the data stream are normalized. Our normalization technique balances the outputs of both streams, ensuring a balance between the importance of feature vectors and class embeddings during training. This results in enhanced video quality. We evaluated the proposed method using the MUG facial expression dataset, which consists of six facial expressions. Compared with the prior conditional video generation model, ImaGINator, our model yielded relative improvements of 1.61%, 1.66%, and 0.36% in terms of PSNR, SSIM, and LPIPS, respectively. These results suggest potential for further advancements in conditional video generation.
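The projection conditioning with a scaled class embedding can be sketched as follows (a toy, assumption-based rendering of the two-stream idea, not the GammaGAN code; the feature extractor and the scalar scale parameter are ours):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionDiscriminator(nn.Module):
    """Projection-style conditioning with a learnable embedding scale."""
    def __init__(self, n_classes=6, feat_dim=128):
        super().__init__()
        self.feat = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, feat_dim), nn.ReLU())
        self.linear = nn.Linear(feat_dim, 1)          # unconditional score
        self.embed = nn.Embedding(n_classes, feat_dim)
        self.gamma = nn.Parameter(torch.ones(1))      # scales class embeddings

    def forward(self, x, y):
        phi = F.normalize(self.feat(x), dim=1)        # normalized data-stream output
        emb = self.gamma * self.embed(y)              # scaled class-embedding stream
        # Projection: inner product between data features and class embedding.
        return self.linear(phi) + (emb * phi).sum(dim=1, keepdim=True)

d = ProjectionDiscriminator()
score = d(torch.randn(4, 1, 64, 64), torch.tensor([0, 1, 2, 3]))
```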
ABSTRACT
Over long-term use, cracks appear on road surfaces, causing monetary losses and safety hazards. However, road surfaces with complex backgrounds contain various disturbances, so segmenting cracks accurately is challenging. We therefore propose a pavement crack segmentation method based on a conditional generative adversarial network. U-net3+ with an attention module is used in the generator to produce segmented images of pavement cracks. The attention module highlights crack features and suppresses noise features along the channel and spatial dimensions, then fuses the features from these two dimensions to obtain more complementary crack features. The original image is stitched with the manual crack annotation and the generated segmented image as the input of the discriminator, which uses the PatchGAN method. Moreover, we propose a weighted hybrid loss function that improves segmentation accuracy by exploiting the difference between the generated and annotated images. Through alternating adversarial training of the generator and the discriminator, the crack segmentation image produced by the generator becomes very close to the actual segmentation, thus achieving crack detection. Experimental results on the Crack500 dataset show that the proposed method can eliminate various disturbances and achieve superior performance in pavement crack detection with complex backgrounds.
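The PatchGAN discriminator mentioned above can be sketched as follows (channel counts and depths are illustrative assumptions): it scores overlapping patches rather than the whole image, which sharpens local crack boundaries.

```python
import torch
import torch.nn as nn

# Minimal PatchGAN-style discriminator: the road image concatenated with a
# segmentation map goes in, and a grid of per-patch real/fake logits comes out.
patch_d = nn.Sequential(
    nn.Conv2d(3 + 1, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 1, 4, stride=1, padding=1),    # per-patch logits
)

image = torch.randn(2, 3, 256, 256)               # RGB pavement image
mask = torch.rand(2, 1, 256, 256)                 # generated or annotated crack map
scores = patch_d(torch.cat([image, mask], dim=1))
print(scores.shape)                               # torch.Size([2, 1, 63, 63])
```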
Subjects
Algorithms , Image Processing, Computer-Assisted , Image Processing, Computer-Assisted/methods
ABSTRACT
This paper develops an approach to binary semantic segmentation of Arabidopsis thaliana root images for plant root phenotyping, using a conditional generative adversarial network (cGAN) to address pixel-wise class imbalance. Specifically, we use Pix2PixHD, an image-to-image translation cGAN, to generate realistic, high-resolution images of plant roots and annotations similar to the original dataset. We then use our trained cGAN to triple the size of our original root dataset to reduce pixel-wise class imbalance, and feed both the original and generated datasets into SegNet to semantically segment root pixels from the background. We further postprocess our segmentation results to close small, apparent gaps along the main and lateral roots. Lastly, we compare our binary semantic segmentation approach with the state of the art in root segmentation. Our efforts demonstrate that a cGAN can produce realistic, high-resolution root images and reduce pixel-wise class imbalance, and that our segmentation model yields high testing accuracy (over 99%), low cross-entropy error (under 2%), a high Dice score (near 0.80), and an inference time low enough for near real-time processing.
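The dataset-tripling step might look like the following sketch (function and variable names are ours; in practice varied input masks or added noise would be used so the synthetic copies differ):

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset

def augment_with_cgan(generator, real_images, real_masks, copies=2):
    """Append cGAN-generated (image, mask) pairs: originals + `copies` copies."""
    fake_images = []
    with torch.no_grad():
        for _ in range(copies):
            fake_images.append(generator(real_masks))  # mask -> synthetic root image
    synth = TensorDataset(torch.cat(fake_images),
                          real_masks.repeat(copies, 1, 1, 1))
    return ConcatDataset([TensorDataset(real_images, real_masks), synth])

# Example with a dummy generator standing in for a trained Pix2PixHD model:
g = torch.nn.Conv2d(1, 3, 3, padding=1)
data = augment_with_cgan(g, torch.randn(10, 3, 64, 64), torch.rand(10, 1, 64, 64))
```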
Subjects
Arabidopsis , Biological Phenomena , Image Processing, Computer-Assisted/methods , Semantics , Plant Roots
ABSTRACT
Constant monitoring of road surfaces helps reveal the urgency of deterioration or construction problems early and improves road safety. Conditional generative adversarial networks (cGAN) are a powerful tool for generating or transforming the images used for crack detection, with the advantage of highly accurate results in vector-based images that are convenient for later mathematical analysis of the detected cracks. However, images taken under controlled conditions differ from images in real-world contexts. Another potential problem of cGANs is that the shape of an object is difficult to detect when the resulting accuracy is low, which can seriously affect any further mathematical analysis of the detected crack. To tackle this issue, this paper proposes a method called improved cGAN with attention gate (ICGA) for roadway surface crack detection. To obtain a more accurate shape of the detected target object, ICGA establishes a multi-level model with independent stages. In the first stage, everything except the road is treated as noise and removed from the image, and these images are stored in a new dataset. In the second stage, ICGA determines the cracks, focusing on the distribution of cracks rather than on auxiliary elements in the image. ICGA adds two attention gates to a U-net architecture and improves the segmentation capabilities of the generator in pix2pix. Extensive experimental results on dashboard camera images from the Unsupervised Llamas dataset show that our method performs better than other state-of-the-art methods.
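An additive attention gate of the kind added to the U-net can be sketched as follows (a generic Attention U-Net-style gate under our own channel-size assumptions, not the ICGA code):

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Additive attention gate: re-weights an encoder skip connection."""
    def __init__(self, ch_skip, ch_gate, ch_mid):
        super().__init__()
        self.w_skip = nn.Conv2d(ch_skip, ch_mid, 1)
        self.w_gate = nn.Conv2d(ch_gate, ch_mid, 1)
        self.psi = nn.Sequential(nn.ReLU(), nn.Conv2d(ch_mid, 1, 1), nn.Sigmoid())

    def forward(self, skip, gate):
        # Attention coefficients highlight crack pixels, suppress background.
        alpha = self.psi(self.w_skip(skip) + self.w_gate(gate))
        return skip * alpha                        # re-weighted skip connection

gate = AttentionGate(64, 64, 32)
skip_feat = torch.randn(1, 64, 128, 128)           # encoder feature map
gate_feat = torch.randn(1, 64, 128, 128)           # decoder (gating) feature map
out = gate(skip_feat, gate_feat)                   # same shape as skip_feat
```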
Assuntos
Atenção , Processamento de Imagem Assistida por ComputadorRESUMO
Three-dimensional (3D) segmentation of the liver and its tumors in liver computed tomography (CT) has very important clinical value for assisting doctors in diagnosis and prognosis. This paper proposes a 3D conditional generative adversarial segmentation network for tumors (T3scGAN), based on the conditional generative adversarial network (cGAN), together with a coarse-to-fine 3D automatic segmentation framework to accurately segment liver and tumor regions. We use 130 cases from the public 2017 Liver and Tumor Segmentation Challenge (LiTS) dataset to train, validate and test the T3scGAN model. The average Dice coefficients for 3D liver segmentation on the validation and test sets were 0.963 and 0.961, respectively, while those for 3D tumor segmentation were 0.819 and 0.796. Experimental results show that the proposed T3scGAN model can effectively segment the 3D liver and its tumor regions, and can thus better assist doctors in the accurate diagnosis and treatment of liver cancer.
Subjects
Image Processing, Computer-Assisted , Liver Neoplasms , Humans , Liver Neoplasms/diagnostic imaging , Tomography, X-Ray Computed
ABSTRACT
PURPOSE: Unlike normal organ segmentation, automatic tumor segmentation is a more challenging task because of the visual similarity between tumors and their surroundings, especially on computed tomography (CT) images with severely low contrast resolution, as well as the diversity and individual characteristics of data acquisition procedures and devices. Consequently, many recently proposed methods are difficult to apply to a different tumor dataset with good results, and some tumor segmentors fail to generalize beyond the datasets and modalities used in their original evaluation experiments. METHODS: To alleviate some of these problems, we propose a novel unified, end-to-end adversarial learning framework for automatic segmentation of any kind of tumor from CT scans, called CTumorGAN, consisting of a Generator network and a Discriminator network. Specifically, the Generator attempts to generate segmentation results that are close to their corresponding gold standards, while the Discriminator aims to distinguish between generated samples and real tumor ground truths. More importantly, we deliberately design different modules to account for well-known obstacles, e.g., severe class imbalance, small tumor localization, and label noise stemming from poor expert annotation quality, and use these modules to guide the CTumorGAN training process by exploiting multi-level supervision more effectively. RESULTS: We conduct a comprehensive evaluation of diverse loss functions for tumor segmentation and find that mean square error is the most suitable for the CT tumor segmentation task. Furthermore, extensive experiments with multiple evaluation criteria on three well-established datasets, covering lung, kidney, and liver tumors, demonstrate that CTumorGAN achieves stable and competitive performance compared with state-of-the-art approaches for CT tumor segmentation. CONCLUSION: To overcome the key challenges arising from CT datasets and solve some of the main problems of current deep learning-based methods, we propose a novel unified CTumorGAN framework that can be effectively generalized to any kind of tumor dataset with superior performance.
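One adversarial training step with the mean-square-error objective the authors found suitable can be sketched as follows (the tiny networks are placeholders, not CTumorGAN):

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid())   # CT -> tumor mask
D = nn.Sequential(nn.Conv2d(2, 8, 3, stride=2, padding=1), nn.ReLU(),
                  nn.Flatten(), nn.LazyLinear(1))                # (CT, mask) -> score

mse = nn.MSELoss()
ct, gt = torch.randn(2, 1, 64, 64), torch.rand(2, 1, 64, 64).round()

pred = G(ct)
# Discriminator: push real pairs toward 1, generated pairs toward 0.
d_loss = mse(D(torch.cat([ct, gt], 1)), torch.ones(2, 1)) + \
         mse(D(torch.cat([ct, pred.detach()], 1)), torch.zeros(2, 1))
# Generator: fool the discriminator and stay close to the gold-standard mask.
g_loss = mse(D(torch.cat([ct, pred], 1)), torch.ones(2, 1)) + mse(pred, gt)
```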
Subjects
Liver Neoplasms , Lung Neoplasms , Databases, Factual , Humans , Image Processing, Computer-Assisted , Tomography, X-Ray Computed
ABSTRACT
PURPOSE: Subject motion in MRI remains an unsolved problem; motion during image acquisition may cause blurring and artifacts that severely degrade image quality. In this work, we approach motion correction as an image-to-image translation problem, i.e., training a deep neural network to predict an image in one domain from an image in another domain. Specifically, the purpose of this work was to develop and train a conditional generative adversarial network to predict artifact-free brain images from motion-corrupted data. METHODS: An open-source MRI dataset comprising T2*-weighted, FLASH magnitude, and phase brain images for 53 patients was used to generate complex image data for motion simulation. To simulate rigid motion, rotations and translations were applied to the image data based on randomly generated motion profiles. A conditional generative adversarial network, comprising generator and discriminator networks, was trained using the motion-corrupted and corresponding ground-truth (original) images as training pairs. RESULTS: The images predicted by the conditional generative adversarial network have improved image quality compared to the motion-corrupted images. The mean absolute error between the motion-corrupted and ground-truth images of the test set was 16.4% of the image mean value, whereas the mean absolute error between the network-predicted and ground-truth images was 10.8%. The network output also demonstrated improved peak SNR and structural similarity index for all test-set images. CONCLUSION: The images predicted by the conditional generative adversarial network have quantitatively and qualitatively improved image quality compared to the motion-corrupted images.
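The rigid-motion simulation can be sketched as follows (a simplified 2D magnitude-image version under our own assumptions; the study used complex data and randomly generated motion profiles): each segment of k-space rows is taken from a differently transformed copy of the image.

```python
import numpy as np
from scipy.ndimage import rotate, shift

def simulate_motion(image, max_rot=2.0, max_shift=2.0, n_segments=4):
    """Split k-space rows into segments, each 'acquired' at a random rigid pose."""
    rows = image.shape[0]
    kspace = np.zeros_like(image, dtype=complex)
    for seg in np.array_split(np.arange(rows), n_segments):
        rot = np.random.uniform(-max_rot, max_rot)          # degrees
        sft = np.random.uniform(-max_shift, max_shift, 2)   # pixels
        moved = shift(rotate(image, rot, reshape=False), sft)
        # Take this segment's k-space rows from the moved image.
        kspace[seg] = np.fft.fftshift(np.fft.fft2(moved))[seg]
    return np.abs(np.fft.ifft2(np.fft.ifftshift(kspace)))

corrupted = simulate_motion(np.random.rand(128, 128))
```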
Assuntos
Imageamento Tridimensional/métodos , Imageamento por Ressonância Magnética/métodos , Movimento/fisiologia , Redes Neurais de Computação , Encéfalo/diagnóstico por imagem , HumanosRESUMO
Low-dose computed tomography (LDCT) offers tremendous benefits in radiation-restricted applications, but the quantum noise resulting from an insufficient number of photons can harm diagnostic performance. Current image-based denoising methods tend to blur the final reconstructed results, especially at high noise levels. In this paper, a deep learning-based approach is proposed to mitigate this problem. An adversarially trained network and a sharpness detection network were used to guide the training process. Experiments on both simulated and real datasets show that the results of the proposed method suffer very little resolution loss and achieve better performance than state-of-the-art methods, both quantitatively and visually.
Subjects
Image Processing, Computer-Assisted/methods , Radiation Doses , Signal Processing, Computer-Assisted , Signal-To-Noise Ratio , Tomography, X-Ray Computed/methods , Algorithms , Deep Learning , Humans
ABSTRACT
As facial modification technology advances rapidly, it challenges the methods used to detect fake faces. Deep learning and AI-based technologies have led to counterfeit photographs that are harder to tell apart from real ones. Existing deepfake detection systems excel at spotting fake content of low visual quality that is easily recognized by its visual artifacts. This study employs an active forensic strategy, a Compact Ensemble-based Discriminator architecture using Deep Conditional Generative Adversarial Networks (CED-DCGAN), to identify real-time deepfakes in video conferencing. DCGAN focuses on video deepfake detection at the feature level, since technologies for creating convincing fakes are improving rapidly. As a first step toward recognizing DCGAN-generated images, real-time video is split into frames containing the essential elements, which are then used to train an ensemble-based discriminator as a classifier. Up-sampling operations, standard in GAN pipelines for producing large amounts of fake video data, leave spectral anomalies. The Compact Ensemble Discriminator (CED) concentrates on the features that most distinguish natural from synthetic images, giving the generators a robust training signal. Empirical results on publicly available datasets show that the proposed algorithm outperforms state-of-the-art methods: the CED-DCGAN technique successfully detects high-fidelity deepfakes in video conferencing and generalizes well in comparison with other techniques. The proposed study is implemented in Python, and the accuracy obtained is 98.23%.
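The spectral-anomaly cue can be sketched as follows (a common hand-crafted feature for GAN-image detection, given here under our own assumptions rather than as the CED-DCGAN pipeline): up-sampling leaves periodic artifacts visible in the azimuthally averaged power spectrum.

```python
import numpy as np

def radial_power_spectrum(gray_frame, n_bins=64):
    """1D spectral signature: power averaged over rings of equal radius."""
    f = np.fft.fftshift(np.fft.fft2(gray_frame))
    power = np.abs(f) ** 2
    h, w = power.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h / 2, x - w / 2).astype(int)
    counts = np.bincount(r.ravel())
    spectrum = np.bincount(r.ravel(), weights=power.ravel()) / np.maximum(counts, 1)
    return spectrum[:n_bins]

feat = radial_power_spectrum(np.random.rand(128, 128))  # feed to a classifier
```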
ABSTRACT
Background: Adolescent idiopathic scoliosis (AIS) is the most common spinal disorder in children, characterized by insidious onset and rapid progression, which can lead to severe consequences if not detected in a timely manner. Currently, the diagnosis of AIS relies primarily on X-ray imaging. However, due to limitations in healthcare access and concerns over radiation exposure, this diagnostic method cannot be widely adopted. We therefore developed and validated a screening system using deep learning technology, capable of generating virtual X-ray images (VXI) from two-dimensional Red Green Blue (2D-RGB) images captured by a smartphone or camera, to assist spine surgeons in the rapid, accurate, and non-invasive assessment of AIS. Methods: We included 2397 patients with AIS and 48 potential patients with AIS who visited four medical institutions in mainland China from June 11th 2014 to November 28th 2023. Participant data included standing full-spine X-ray images captured by radiology technicians and 2D-RGB images taken by spine surgeons using a camera. We developed a deep learning model based on conditional generative adversarial networks (cGAN), called Swin-pix2pix, to generate VXI on retrospective training (n = 1842) and validation (n = 100) datasets, then validated the performance of the VXI in quantifying the curve type and severity of AIS on retrospective internal (n = 100), external (n = 135), and prospective test datasets (n = 268). The prospective test dataset included 268 participants treated in Nanjing, China, from April 19th, 2023, to November 28th, 2023, comprising 220 patients with AIS and 48 potential patients with AIS. Their data underwent strict quality control to ensure optimal data quality and consistency. Findings: Our Swin-pix2pix model generated realistic VXI, with the mean absolute error (MAE) for predicting the main and secondary Cobb angles of AIS significantly lower than that of the other baseline cGAN models, at 3.2° and 3.1° on the prospective test dataset. The diagnostic accuracy for scoliosis severity grading exceeded that of two spine surgery experts, with accuracy of 0.93 (95% CI [0.91, 0.95]) for the main curve and 0.89 (95% CI [0.87, 0.91]) for the secondary curve. For main curve position and curve classification, the predictive accuracy of the Swin-pix2pix model also surpassed that of the baseline cGAN models, with accuracy of 0.93 (95% CI [0.90, 0.95]) for the thoracic curve and 0.97 (95% CI [0.96, 0.98]), achieving satisfactory results on three external datasets as well. Interpretation: Our Swin-pix2pix model holds promise for using a single photo taken with a smartphone or camera to rapidly assess AIS curve type and severity without radiation, enabling large-scale screening. However, limited data quality and quantity, a homogeneous participant population, and rotational errors during imaging may affect the applicability and accuracy of the system, requiring further improvement in the future. Funding: National Key R&D Program of China, Natural Science Foundation of Jiangsu Province, China Postdoctoral Science Foundation, Nanjing Medical Science and Technology Development Foundation, Jiangsu Provincial Key Research and Development Program, and Jiangsu Provincial Medical Innovation Centre of Orthopedic Surgery.
ABSTRACT
PURPOSE: To propose a novel deep learning-based dosimetry method that allows quick and accurate estimation of organ doses for individual patients, using only their computed tomography (CT) images as input. METHODS: Despite recent advances in medical dosimetry, personalized CT dosimetry remains a labour-intensive process. Current state-of-the-art methods use time-consuming Monte Carlo (MC) simulations for individual organ dose estimation in CT. The proposed method uses conditional generative adversarial networks (cGANs) to replace MC simulations with fast dose image generation, based on image-to-image translation. The pix2pix architecture in conjunction with a regression model was used to generate the synthetic dose images. The lungs, heart, breast, bone and skin were manually segmented to estimate and compare the organ doses calculated from the original and synthetic dose images, respectively. RESULTS: The average organ dose estimation error of the proposed method was 8.3% and did not exceed 20% for any of the organs considered. The performance of the method in the clinical environment was also assessed: using segmentation tools developed in-house, an automatic organ dose calculation pipeline was set up, and calculating heart and lung doses for each CT slice took about 2 s. CONCLUSIONS: This work shows that deep learning-enabled personalized CT dosimetry is feasible in real time, using only patient CT images as input.
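Once the cGAN has produced a synthetic dose image, the organ-dose readout reduces to averaging over the organ's segmentation mask, as in this sketch (the function name and toy mask are ours, for illustration):

```python
import numpy as np

def organ_dose(dose_image, organ_mask):
    """Mean dose inside a binary organ mask."""
    return float(dose_image[organ_mask > 0].mean())

dose_img = np.random.rand(512, 512) * 20           # synthetic dose map for one slice
lung_mask = np.zeros((512, 512))
lung_mask[100:300, 50:200] = 1                     # toy lung segmentation
print(organ_dose(dose_img, lung_mask))
```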
Subjects
Deep Learning , Precision Medicine , Radiometry , Tomography, X-Ray Computed , Humans , Tomography, X-Ray Computed/methods , Radiometry/methods , Image Processing, Computer-Assisted/methods , Feasibility Studies , Radiation Doses , Monte Carlo Method , Time Factors
ABSTRACT
OBJECTIVE: CT and MRI provide synergistic information for neurosurgical planning. While obtaining both types of images lends unique data from each, doing so adds cost and exposes patients to additional ionizing radiation after MRI has been performed. Cross-modal synthesis of high-resolution CT images from MRI sequences offers an appealing solution. The authors therefore sought to develop a deep learning conditional generative adversarial network (cGAN) that performs this synthesis. METHODS: Preoperative paired CT and contrast-enhanced MR images were collected for patients with meningioma, pituitary tumor, vestibular schwannoma, and cerebrovascular disease. CT and MR images were denoised, field corrected, and coregistered. MR images were fed to a cGAN that exported a "synthetic" CT scan. The accuracy of the synthetic CT images was assessed objectively using quantitative similarity metrics as well as clinical features such as sella and internal auditory canal (IAC) dimensions and mastoid/clinoid/sphenoid aeration. RESULTS: A total of 92,981 paired CT/MR images obtained in 80 patients were used for training/testing, and 10,068 paired images from 10 patients were used for external validation. Synthetic CT images reconstructed the bony skull base and convexity with relatively high accuracy. Measurements of the sella and IAC showed a median relative error of 6% between synthetic CT scans and ground-truth images, with greater variability in IAC reconstruction than in the sella. Aeration in the mastoid, clinoid, and sphenoid regions was generally captured, although there was heterogeneity in finer air cell septations. Performance varied with the pathology studied; the greatest limitation was observed in evaluating meningiomas with intratumoral calcifications or calvarial invasion. CONCLUSIONS: The generation of high-resolution CT scans from MR images through a cGAN offers promise for a wide range of applications in cranial and spinal neurosurgery, especially as an adjunct for preoperative evaluation. Optimizing cGAN performance on specific anatomical regions may increase its clinical viability.
Subjects
Magnetic Resonance Imaging , Neurosurgical Procedures , Tomography, X-Ray Computed , Humans , Tomography, X-Ray Computed/methods , Magnetic Resonance Imaging/methods , Neurosurgical Procedures/methods , Male , Female , Middle Aged , Deep Learning , Meningioma/diagnostic imaging , Meningioma/surgery , Adult , Meningeal Neoplasms/diagnostic imaging , Meningeal Neoplasms/surgery , Aged
ABSTRACT
Objective. This work proposes, for the first time, an image-based end-to-end self-normalization framework for positron emission tomography (PET) using conditional generative adversarial networks (cGANs). Approach. We evaluated different approaches by exploring each of the following three methodologies. First, we used images that were either unnormalized or corrected for geometric factors, which encompass all time-invariant factors, as input data types. Second, we set the input tensor shape as either a single axial slice (2D) or three contiguous axial slices (2.5D). Third, we chose either Pix2Pix or polarized self-attention (PSA) Pix2Pix, which we developed for this work, as the deep learning network. The targets for all approaches were the axial slices of images normalized using the direct normalization method. We performed Monte Carlo simulations of ten voxelized phantoms with the SimSET simulation tool and produced 26,000 pairs of axial image slices for training and testing. Main results. The results showed that 2.5D PSA Pix2Pix trained with geometric-factors-corrected input images achieved the best performance among all the methods we tested. All approaches improved the general image-quality figures of merit, peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), by ~15% to ~55%, and 2.5D PSA Pix2Pix showed the highest PSNR (28.074) and SSIM (0.921). Lesion detectability, measured with region-of-interest (ROI) PSNR, SSIM, normalized contrast recovery coefficient, and contrast-to-noise ratio, was generally improved for all approaches, and 2.5D PSA Pix2Pix trained with geometric-factors-corrected input images achieved the highest ROI PSNR (28.920) and SSIM (0.973). Significance. This study demonstrates the potential of an image-based end-to-end self-normalization framework using cGANs to improve PET image quality and lesion detectability without the need for separate normalization scans.
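The 2.5D input option can be sketched as follows (our own minimal rendering: three contiguous axial slices stacked as channels so a 2D network sees local 3D context):

```python
import torch

def make_25d_batch(volume, center_indices):
    """volume: (depth, H, W); returns (batch, 3, H, W) of slice triplets."""
    slices = []
    for z in center_indices:
        z0, z1 = max(z - 1, 0), min(z + 1, volume.shape[0] - 1)  # clamp at edges
        slices.append(torch.stack([volume[z0], volume[z], volume[z1]]))
    return torch.stack(slices)

vol = torch.randn(64, 128, 128)                    # toy PET volume
batch = make_25d_batch(vol, [0, 10, 63])           # -> torch.Size([3, 3, 128, 128])
```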
Subjects
Deep Learning , Image Processing, Computer-Assisted , Positron-Emission Tomography , Image Processing, Computer-Assisted/methods , Humans , Phantoms, Imaging , Monte Carlo Method
ABSTRACT
Objective. This paper proposes a conditional GAN (cGAN)-based method for data enhancement of ultrasound images and segmentation of tumors in breast ultrasound images, which improves the realism of the enhanced breast ultrasound images and yields more accurate segmentation results. Approach. We use the idea of generative adversarial training to accomplish two tasks: (1) we use generative adversarial networks to generate a batch of labeled samples, from the perspective of generating images from labels, to expand the dataset for data enhancement; (2) we use adversarial training instead of postprocessing steps such as conditional random fields to enforce higher-level spatial consistency. In addition, this work proposes a new network, EfficientUNet, based on U-Net, which combines ResNet18, an attention mechanism and a deep supervision technique. This segmentation model uses the residual network as an encoder to retain the information lost in the original encoder and to avoid the vanishing gradient problem, improving the feature extraction ability of the model; it also uses deep supervision to speed up convergence. The channel-by-channel weighting module of SENet is then used to enable the model to capture the tumor boundary more accurately. Main results. The paper concludes with experiments that verify the validity of these efforts by comparison with mainstream methods on Dataset B. The Dice and IoU scores reach 0.8856 and 0.8111, respectively. Significance. This study successfully combines a cGAN and the optimized EfficientUNet for the segmentation of breast tumor ultrasound images. The conditional generative adversarial network performs well in data enhancement, and the optimized EfficientUNet makes the segmentation more accurate.
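The SENet channel-by-channel weighting module can be sketched as follows (a generic squeeze-and-excitation block; the reduction ratio and sizes are assumptions, not the authors' code):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: learn one weight per channel, then re-weight."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = x.mean(dim=(2, 3))           # squeeze: global average pool per channel
        w = self.fc(w)                   # excite: per-channel weights in [0, 1]
        return x * w[:, :, None, None]   # re-weight feature maps channel by channel

se = SEBlock(64)
out = se(torch.randn(2, 64, 32, 32))     # same shape, channels re-weighted
```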
Subjects
Image Processing, Computer-Assisted , Neoplasms , Female , Humans , Image Processing, Computer-Assisted/methods , Ultrasonography , Ultrasonography, Mammary
ABSTRACT
Magnetic resonance imaging (MRI) is an efficient, non-invasive diagnostic imaging tool for a variety of disorders. In modern MRI systems, the scanning procedure is time-consuming, which compromises patient comfort and causes motion artifacts. Accelerated, or parallel, MRI has the potential to minimize patient stress as well as reduce scanning time and medical costs. In this paper, a new deep learning MR image reconstruction framework is proposed to provide more accurate reconstructed MR images from under-sampled or aliased data. The reconstruction model is based on conditional generative adversarial networks (CGANs), with the generator designed as an encoder-decoder U-Net. A hybrid spatial and k-space loss function is also proposed to improve reconstructed image quality by minimizing the L1 distance in the spatial and frequency domains simultaneously. The proposed framework is directly compared against CGAN and U-Net adopted individually, each trained with either the proposed hybrid loss function or the conventional L1-norm. Finally, the framework with the extended loss function is evaluated against the traditional SENSE reconstruction technique using the structural similarity (SSIM) and peak signal-to-noise ratio (PSNR) metrics. The public Multi-Coil k-Space OCMR dataset for cardiovascular MR imaging is used for fine-tuning and evaluation. The proposed framework achieves better image reconstruction quality than SENSE, improving PSNR by 6.84 and 9.57 when U-Net and CGAN are used, respectively, while the SSIM of the reconstructed MR images is comparable to that provided by the SENSE algorithm. Comparing the proposed hybrid loss function against the simple L1-norm, the reconstruction performance improves by 6.84 and 9.57 for U-Net and CGAN, respectively. In conclusion, the proposed framework using CGAN provides the best reconstruction performance compared with U-Net or the conventional SENSE technique, and seems useful for practical cardiac image reconstruction since it provides better image quality in terms of SSIM and PSNR.
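The hybrid spatial/k-space L1 loss can be sketched as follows (the weighting factor `alpha` is our assumption; the paper's exact weighting is not given here):

```python
import torch

def hybrid_l1_loss(pred, target, alpha=0.5):
    """L1 distance in the image domain plus L1 distance in k-space."""
    spatial = (pred - target).abs().mean()
    k_pred, k_tgt = torch.fft.fft2(pred), torch.fft.fft2(target)
    kspace = (k_pred - k_tgt).abs().mean()        # L1 on complex magnitudes
    return alpha * spatial + (1 - alpha) * kspace

loss = hybrid_l1_loss(torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64))
```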
ABSTRACT
Probabilistic regression is a statistical technique and a crucial problem in the machine learning domain that employs machine learning methods to forecast a continuous target variable from one or more predictor variables. COVID-19 is a virulent virus that has brought the whole world to a standstill, and its potential for human-to-human transmission makes the world a dangerous place. This article predicts the upcoming course of the coronavirus in order to help subdue its impact. We performed conditional GAN regression to forecast subsequent COVID-19 cases in five countries. The CGAN variant of the GAN is used to design the model and predict COVID-19 cases three months ahead with minimal error on the dataset provided. Each country is examined individually because of variations in population size, tradition, medical management and preventive measures. The analysis is based on confirmed-case data provided by the World Health Organization. This paper investigates how conditional generative adversarial networks (GANs) can be used to accurately model intricate conditional distributions. GANs have achieved spectacular success in generating convoluted high-dimensional data, but work on their use for regression problems is minimal. This paper shows how conditional GANs can be employed in probabilistic regression, demonstrating that they can approximate a wide range of distributions and are competitive with existing probabilistic regression models.
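The mechanism by which a conditional GAN performs probabilistic regression can be sketched as follows (a toy generator under our own assumptions): repeated noise draws for a fixed condition yield samples from the learned conditional distribution p(y | x).

```python
import torch
import torch.nn as nn

class CGANRegressor(nn.Module):
    """Generator mapping (condition, noise) to a continuous target value."""
    def __init__(self, cond_dim=3, noise_dim=8):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(nn.Linear(cond_dim + noise_dim, 32),
                                 nn.ReLU(), nn.Linear(32, 1))

    def forward(self, cond):
        z = torch.randn(cond.shape[0], self.noise_dim)  # fresh noise per draw
        return self.net(torch.cat([cond, z], dim=1))

g = CGANRegressor()
cond = torch.tensor([[0.1, 0.2, 0.3]]).repeat(1000, 1)  # one condition, 1000 draws
samples = g(cond)                                       # predictive distribution
mean, std = samples.mean().item(), samples.std().item()
```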
ABSTRACT
Magnetic resonance imaging (MRI) has become one of the most standardized and widely used neuroimaging protocols in the detection and diagnosis of neurodegenerative diseases. In clinical scenarios, multi-modality MR images can provide more comprehensive information than single-modality images. However, high-quality multi-modality MR images can be difficult to obtain in the actual diagnostic process due to various uncertainties. Efficient methods for modality completion and synthesis have therefore attracted increasing attention in the research community. In this article, style transfer is introduced into the conditional generative adversarial network (cGAN) architecture. A cGAN model with hierarchical feature mapping and fusion (ST-cGAN) is proposed to address cross-modality synthesis of MR images. To move beyond the sole focus on pixel-wise similarity common to most cGAN-based methods, the proposed ST-cGAN takes advantage of style information and applies it to the synthetic image's content structure. Taking images of two modalities as conditional input, ST-cGAN extracts different levels of style features and integrates them with the content features to form the style-enhanced synthetic image. Furthermore, the model is made robust to random noise by adding a noise input to the generator. A comprehensive analysis compares the proposed ST-cGAN with other state-of-the-art baselines on four representative evaluation metrics. Experimental results on the IXI (Information eXtraction from Images) dataset verify the validity of ST-cGAN from different evaluation perspectives.
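The style-injection idea can be sketched with an AdaIN-style fusion (our simplification; the paper's hierarchical feature mapping and fusion is more elaborate): statistics of a style feature map re-normalize a content feature map.

```python
import torch

def adain(content, style, eps=1e-5):
    """Replace the content feature statistics with the style statistics."""
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True)
    # Channel-wise re-normalization: content structure, style statistics.
    return s_std * (content - c_mean) / c_std + s_mean

fused = adain(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```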