ABSTRACT
Recent studies have extensively used deep learning algorithms to analyze gene expression to predict disease diagnosis, treatment effectiveness, and survival outcomes. Survival analysis studies on diseases with high mortality rates, such as cancer, are indispensable. However, deep learning models are plagued by overfitting owing to the limited sample size relative to the large number of genes. Consequently, the latest style-transfer deep generative models have been implemented to generate gene expression data. However, these models are limited in their applicability for clinical purposes because they generate only transcriptomic data. Therefore, this study proposes ctGAN, which enables the combined transformation of gene expression and survival data using a generative adversarial network (GAN). ctGAN improves survival analysis by augmenting data through style transformations between breast cancer and 11 other cancer types. We evaluated the concordance index (C-index) enhancements compared with previous models to demonstrate its superiority. Performance improvements were observed in nine of the 11 cancer types. Moreover, ctGAN outperformed previous models in seven out of the 11 cancer types, with colon adenocarcinoma (COAD) exhibiting the most significant improvement (median C-index increase of ~15.70%). Furthermore, integrating the generated COAD enhanced the log-rank p-value (0.041) compared with using only the real COAD (p-value = 0.797). Based on the data distribution, we demonstrated that the model generated highly plausible data. In clustering evaluation, ctGAN exhibited the highest performance in most cases (89.62%). These findings suggest that ctGAN can be meaningfully utilized to predict disease progression and select personalized treatments in the medical field.
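As a rough illustration of the two evaluation measures quoted above (the concordance index and the log-rank p-value), the following sketch uses the lifelines library on synthetic arrays; the variable names and the median-risk split are illustrative, not ctGAN's actual pipeline.

```python
# Minimal sketch of the survival-analysis metrics mentioned above (C-index and
# log-rank p-value), using lifelines; all data and names are illustrative.
import numpy as np
from lifelines.utils import concordance_index
from lifelines.statistics import logrank_test

rng = np.random.default_rng(0)
survival_times = rng.exponential(scale=24.0, size=200)      # months
events = rng.integers(0, 2, size=200)                        # 1 = event observed
risk_scores = -survival_times + rng.normal(0, 5, size=200)   # higher = riskier

# Concordance index: fraction of comparable pairs ordered correctly.
# lifelines expects scores where larger values mean longer survival.
cindex = concordance_index(survival_times, -risk_scores, events)

# Log-rank test between two groups split at the median predicted risk.
high = risk_scores >= np.median(risk_scores)
result = logrank_test(survival_times[high], survival_times[~high],
                      event_observed_A=events[high],
                      event_observed_B=events[~high])
print(f"C-index: {cindex:.3f}, log-rank p-value: {result.p_value:.3f}")
```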
Subjects
Deep Learning , Humans , Survival Analysis , Algorithms , Neoplasms/genetics , Neoplasms/mortality , Gene Expression Profiling/methods , Neural Networks, Computer , Computational Biology/methods , Breast Neoplasms/genetics , Breast Neoplasms/mortality , Female , Gene Expression Regulation, Neoplastic
ABSTRACT
Within research on the cross-view geolocation of UAVs, differences in image sources and interference from similar scenes pose major challenges. Inspired by multimodal machine learning, in this paper we design a single-stream pyramid transformer network (SSPT). The backbone of the model uses self-attention in the early stages to enrich its internal features and cross-attention in the later stages to refine and exchange information between different features, eliminating irrelevant interference. In addition, in the post-processing part of the model, a header module performs upsampling to generate heat maps, and a Gaussian weight window assigns label weights so that the model converges better. Together, these methods improve the positioning accuracy of UAV images within satellite images. Finally, we also use style-transfer technology to simulate various environmental changes in order to expand the experimental data, further demonstrating the environmental adaptability and robustness of the method. The final experimental results show that our method yields significant performance improvements: the relative distance score (RDS) of the SSPT-384 model on the benchmark UL14 dataset improves from 76.25% to 84.40%, while the meter-level accuracy (MA) at 3 m, 5 m, and 20 m increases by 12%, 12%, and 10%, respectively. For the SSPT-256 model, the RDS increases to 82.21%, and the MA at 3 m, 5 m, and 20 m increases by 5%, 5%, and 7%, respectively. The method also remains robust on the extended thermal infrared (TIR), nighttime, and rainy-day datasets.
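A minimal sketch of the Gaussian label-weighting idea for heat-map supervision: the pixel at the annotated UAV location gets weight 1 and nearby pixels decay with a Gaussian. The window size and sigma below are assumptions, not the paper's settings.

```python
# Gaussian weight window for heat-map labels; sigma and map size are illustrative.
import numpy as np

def gaussian_label_map(height, width, center_xy, sigma=8.0):
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = center_xy
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

label = gaussian_label_map(256, 256, center_xy=(120, 90))
print(label.shape, label.max())   # weight 1.0 at the annotated position
```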
ABSTRACT
Dongba characters are ancient ideographic scripts whose abstract forms differ greatly from modern Chinese characters, so directly applying existing methods cannot achieve font style transfer for Dongba characters. This paper proposes an Attention-based Font style transfer Generative Adversarial Network (AFGAN). Based on the characteristics of Dongba character images, two core modules are built into the proposed AFGAN, namely a void constraint and a font stroke constraint. In addition, to enhance the feature-learning ability of the network and improve the style-transfer effect, a Convolutional Block Attention Module (CBAM) is added in the down-sampling stage to help the network better adapt to input font images with different styles. Quantitative and qualitative analyses of the generated and real fonts were conducted in consultation with professional artists on a newly built dataset of small seal script, slender gold script, and Dongba characters, and the styles of small seal script and slender gold script were transferred to Dongba characters. The results indicate that the proposed AFGAN method outperforms existing networks in both evaluation metrics and visual quality. The method effectively learns the style features of small seal script and slender gold script and transfers them to Dongba characters, confirming its effectiveness.
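CBAM itself is a standard attention block (channel attention followed by spatial attention). The sketch below is a generic implementation of that block as it is typically inserted in a down-sampling path; the reduction ratio and 7x7 spatial kernel are the commonly used defaults, not necessarily AFGAN's exact values.

```python
# Generic CBAM block (channel + spatial attention) in PyTorch.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)        # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))                # spatial attention

feat = torch.randn(2, 64, 32, 32)
print(CBAM(64)(feat).shape)   # torch.Size([2, 64, 32, 32])
```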
ABSTRACT
Recent work within neuroimaging consortia has aimed to identify reproducible, and often subtle, brain signatures of psychiatric or neurological conditions. To allow for high-powered brain imaging analyses, it is often necessary to pool MR images that were acquired with different protocols across multiple scanners. Current retrospective harmonization techniques have shown promise in removing site-related image variation. However, most statistical approaches may over-correct for technical, scanning-related variation, as they cannot distinguish between confounded image-acquisition-based variability and site-related population variability. Such statistical methods often require that datasets contain subjects or patient groups with similar clinical or demographic information to isolate the acquisition-based variability. To overcome this limitation, we consider site-related magnetic resonance (MR) imaging harmonization as a style transfer problem rather than a domain transfer problem. Using a fully unsupervised deep-learning framework based on a generative adversarial network (GAN), we show that MR images can be harmonized by inserting the style information encoded from a single reference image, without knowing their site/scanner labels a priori. We trained our model using data from five large-scale multisite datasets with varied demographics. Results demonstrated that our style-encoding model can harmonize MR images, and match intensity profiles, without relying on traveling subjects. This model also avoids the need to control for clinical, diagnostic, or demographic information. We highlight the effectiveness of our method for clinical research by comparing extracted cortical and subcortical features, brain-age estimates, and case-control effect sizes before and after the harmonization. We showed that our harmonization removed site-related variance while preserving anatomical information and clinically meaningful patterns. We further demonstrated that, with a diverse training set, our method successfully harmonized MR images collected from unseen scanners and protocols, suggesting a promising tool for ongoing collaborative studies. Source code is released in USC-IGC/style_transfer_harmonization (github.com).
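A common way to inject a style code encoded from a single reference image is adaptive instance normalization (AdaIN), shown below on generic feature maps. This is only a sketch of the injection idea; the paper's actual encoders and injection points may differ.

```python
# Hedged sketch: re-scale content features with the statistics of features
# extracted from a single reference image (AdaIN-style injection).
import torch

def adain(content_feat, style_feat, eps=1e-5):
    # Match per-channel mean/std of the content features to the reference.
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
    return (content_feat - c_mean) / c_std * s_std + s_mean

content = torch.randn(1, 128, 48, 48)    # features of the scan to harmonize
reference = torch.randn(1, 128, 48, 48)  # features of the single reference scan
print(adain(content, reference).shape)
```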
Subjects
Image Processing, Computer-Assisted , Magnetic Resonance Imaging , Humans , Retrospective Studies , Magnetic Resonance Imaging/methods , Image Processing, Computer-Assisted/methods , Neuroimaging , Brain/diagnostic imaging
ABSTRACT
Neural Style Transfer (NST) has been a widely researched topic of late, enabling new forms of image manipulation. Here we perform an extensive study of NST algorithms and extend the existing methods with custom modifications for application to Indian art styles. In this paper, we aim to provide a comprehensive analysis of various methods, ranging from the seminal work of Gatys et al., which demonstrated the power of Convolutional Neural Networks (CNNs) in creating artistic imagery by separating and recombining image content and style, to state-of-the-art image-to-image translation models that use Generative Adversarial Networks (GANs) to learn the mapping between two domains of images. Based on the results produced by these models, we infer which approach is more suitable for Indian art styles, especially Tanjore paintings, which are unique compared with Western art styles. We then propose the method better suited to the Indian art domain, along with a custom architecture that includes an enhancement and evaluation module. Finally, we present qualitative and quantitative evaluation methods, including our proposed metric, to evaluate the results produced by the model.
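For reference, the Gatys-style separation of content and style mentioned above boils down to two losses over VGG features: a content loss on a deep layer and a style loss on Gram matrices of several layers. The sketch below uses torchvision's VGG-19 with the commonly used layer choices; it is a generic illustration, not the paper's configuration.

```python
# Minimal Gatys-style content/style losses with Gram matrices (VGG-19 features).
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

vgg = vgg19(weights="IMAGENET1K_V1").features.eval()
content_layers, style_layers = {21}, {0, 5, 10, 19, 28}   # conv4_2; conv1_1..conv5_1

def features(img):
    feats, x = {}, img
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in content_layers | style_layers:
            feats[i] = x
    return feats

def gram(f):
    b, c, h, w = f.shape
    f = f.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def nst_losses(generated, content_img, style_img):
    g, c, s = features(generated), features(content_img), features(style_img)
    content_loss = sum(F.mse_loss(g[i], c[i]) for i in content_layers)
    style_loss = sum(F.mse_loss(gram(g[i]), gram(s[i])) for i in style_layers)
    return content_loss, style_loss
```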
Subjects
Algorithms , Asian People , Culture , Humans , Learning , India , Art
ABSTRACT
Retrograde intrarenal surgery (RIRS) is a widely utilized diagnostic and therapeutic tool for multiple upper urinary tract pathologies. An image-guided navigation system can assist the surgeon in performing precise surgery by providing the relative position between the lesion and the instrument once the intraoperative image is registered with the preoperative model. However, owing to the structural complexity and diversity of multi-branched organs such as the kidneys and bronchi, the consistency of the intensity distributions of virtual and real images is challenged, which makes classical purely intensity-based registration methods prone to bias and random results over a wide search domain. In this paper, we propose a structural-feature-similarity-based method combined with a semantic style transfer network, which significantly improves registration accuracy when the initial state deviation is large. Furthermore, multi-view constraints are introduced to compensate for the collapse of spatial depth information and to improve the robustness of the algorithm. Experimental studies were conducted on two models generated from patient data to evaluate the performance of the method and competing algorithms. The proposed method obtains mean target registration errors (mTRE) of 0.971 ± 0.585 mm and 1.266 ± 0.416 mm on the two models, respectively, with better accuracy and robustness overall. Experimental results demonstrate that the proposed method has the potential to be applied to RIRS and extended to other organs with similar structures.
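The multi-view idea can be illustrated by summing a structure-based similarity over several views rendered at a candidate pose. Below, plain SSIM from scikit-image stands in for the paper's structural features; the cost function and view setup are assumptions for illustration only.

```python
# Illustrative multi-view structural cost (SSIM as a stand-in structural measure).
import numpy as np
from skimage.metrics import structural_similarity

def multiview_cost(real_views, virtual_views):
    # real_views / virtual_views: lists of 2D grayscale images, one per camera view.
    score = 0.0
    for real, virtual in zip(real_views, virtual_views):
        score += structural_similarity(real, virtual, data_range=1.0)
    return -score  # lower cost = better structural agreement across views

views_real = [np.random.rand(128, 128) for _ in range(2)]
views_virtual = [np.random.rand(128, 128) for _ in range(2)]
print(multiview_cost(views_real, views_virtual))
```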
Assuntos
Algoritmos , Imageamento Tridimensional , Humanos , Imageamento Tridimensional/métodos , Imagens de FantasmasRESUMO
This paper introduces a deep learning approach to photorealistic universal style transfer that extends the PhotoNet network architecture by adding extra feature-aggregation modules. Given a pair of images representing the content and the style reference, we augment the state-of-the-art solution mentioned above with deeper aggregation, to better fuse content and style information across the decoding layers. As opposed to the more flexible implementation of PhotoNet (i.e., PhotoNAS), which targets the minimization of inference time, our method aims to achieve better image reconstruction and a more pleasant stylization. We propose several deep layer aggregation architectures to be used as wrappers over PhotoNet, to enhance the stylization and quality of the output image.
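A rough sketch of what one such feature-aggregation node could look like: upsample deeper decoder features to a common resolution, concatenate, and fuse with a 1x1 convolution. Channel sizes and the fusion design are placeholders, not PhotoNet's actual configuration.

```python
# Placeholder feature-aggregation node fusing two decoder levels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AggregationNode(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.fuse = nn.Sequential(nn.Conv2d(in_channels, out_channels, 1),
                                  nn.ReLU(inplace=True))

    def forward(self, feats):
        target = feats[0].shape[-2:]
        feats = [F.interpolate(f, size=target, mode="bilinear", align_corners=False)
                 for f in feats]
        return self.fuse(torch.cat(feats, dim=1))

shallow = torch.randn(1, 64, 128, 128)
deep = torch.randn(1, 256, 32, 32)
node = AggregationNode(64 + 256, 64)
print(node([shallow, deep]).shape)   # torch.Size([1, 64, 128, 128])
```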
ABSTRACT
Unsupervised image-to-image translation has received considerable attention owing to the recent remarkable advances in generative adversarial networks (GANs). In image-to-image translation, state-of-the-art methods use unpaired image data to learn mappings between the source and target domains. However, despite their promising results, existing approaches often fail in challenging conditions, particularly when images contain multiple target instances, when a translation task involves significant shape transitions, or when translating low-level information rather than high-level semantics introduces visual artifacts. To tackle these problems, we propose a novel framework called Progressive Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization (PRO-U-GAT-IT) for the unsupervised image-to-image translation task. In contrast to existing attention-based models that fail to handle geometric transitions between the source and target domains, our model can translate images requiring extensive and holistic changes in shape. Experimental results show the superiority of the proposed approach over existing state-of-the-art models on different datasets.
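Adaptive Layer-Instance Normalization (AdaLIN), named in the framework's title, blends instance- and layer-normalized features with a learned ratio and applies gamma/beta predicted from the style code. The sketch follows the U-GAT-IT-style formulation; it is not claimed to match PRO-U-GAT-IT's exact implementation.

```python
# Sketch of Adaptive Layer-Instance Normalization (AdaLIN).
import torch
import torch.nn as nn

class AdaLIN(nn.Module):
    def __init__(self, channels, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.rho = nn.Parameter(torch.full((1, channels, 1, 1), 0.9))

    def forward(self, x, gamma, beta):
        in_mean, in_var = x.mean(dim=(2, 3), keepdim=True), x.var(dim=(2, 3), keepdim=True)
        ln_mean, ln_var = x.mean(dim=(1, 2, 3), keepdim=True), x.var(dim=(1, 2, 3), keepdim=True)
        x_in = (x - in_mean) / torch.sqrt(in_var + self.eps)
        x_ln = (x - ln_mean) / torch.sqrt(ln_var + self.eps)
        rho = self.rho.clamp(0.0, 1.0)
        out = rho * x_in + (1.0 - rho) * x_ln
        return out * gamma.view(*gamma.shape, 1, 1) + beta.view(*beta.shape, 1, 1)

x = torch.randn(2, 256, 64, 64)
gamma, beta = torch.ones(2, 256), torch.zeros(2, 256)   # predicted from style code
print(AdaLIN(256)(x, gamma, beta).shape)
```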
ABSTRACT
With the fourth industrial revolution, the scale of interactive applications has increased substantially. These interactive and animated applications are human-centric, making the representation of human motion ubiquitous. Animators strive to computationally process human motion so that the motions appear realistic in animated applications. Motion style transfer is an attractive technique that is widely used to create realistic motions in near real time. The motion style transfer approach employs existing captured motion data to generate realistic samples automatically and updates the motion data accordingly, eliminating the need to handcraft motions from scratch for every frame. The popularity of deep learning (DL) algorithms is reshaping motion style transfer, as such algorithms can predict subsequent motion styles. The majority of approaches use different variants of deep neural networks (DNNs) to accomplish the transfer. This paper provides a comprehensive comparative analysis of existing state-of-the-art DL-based motion style transfer approaches. The enabling technologies that facilitate motion style transfer are briefly presented. When employing DL-based methods for motion style transfer, the choice of training dataset plays a key role in performance; anticipating this vital aspect, the paper provides a detailed summary of existing well-known motion datasets. As an outcome of this extensive overview of the domain, the paper highlights the contemporary challenges faced by motion style transfer approaches.
ABSTRACT
In this work we introduce a novel medical image style transfer method, StyleMapper, that can transfer medical scans to an unseen style with access to limited training data. This is made possible by training our model on an unlimited variety of simulated random medical imaging styles on the training set, making our work more computationally efficient than other style transfer methods. Moreover, our method enables arbitrary style transfer: transferring images to styles unseen in training. This is useful for medical imaging, where images are acquired using different protocols and different scanner models, resulting in a variety of styles that data may need to be transferred between. Our model disentangles image content from style and can modify an image's style by simply replacing the style encoding with one extracted from a single image of the target style, with no additional optimization required. This also allows the model to distinguish between different styles of images, including those unseen in training. We propose a formal description of the proposed model. Experimental results on breast magnetic resonance images indicate the effectiveness of our method for style transfer. Our method allows medical images taken with different scanners to be aligned into a single unified-style dataset, on which downstream tasks such as classification and object detection can then be trained.
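Conceptually, the style-replacement step amounts to encoding content and style separately and decoding the source content with a style code taken from a single image of the target style. The tiny encoders and decoder below are placeholders that only illustrate this wiring, not the StyleMapper architecture.

```python
# Conceptual content/style-swap sketch with placeholder modules.
import torch
import torch.nn as nn

content_enc = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                            nn.Conv2d(32, 32, 3, padding=1))
style_enc = nn.Sequential(nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(32, 8))                      # style vector
decoder = nn.Conv2d(32 + 8, 1, 3, padding=1)

def transfer(source_img, target_style_img):
    c = content_enc(source_img)                                  # content map
    s = style_enc(target_style_img)                              # style code
    s_map = s.view(s.size(0), -1, 1, 1).expand(-1, -1, *c.shape[-2:])
    return decoder(torch.cat([c, s_map], dim=1))

src = torch.randn(1, 1, 64, 64)      # scan in the source style
ref = torch.randn(1, 1, 64, 64)      # single image of the target style
print(transfer(src, ref).shape)
```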
Subjects
Deep Learning , Humans , Magnetic Resonance Imaging , Radiography , Image Processing, Computer-Assisted/methods
ABSTRACT
A growing number of papers on style transfer for texts rely on information decomposition. The performance of the resulting systems is usually assessed empirically in terms of the output quality or requires laborious experiments. This paper suggests a straightforward information theoretical framework to assess the quality of information decomposition for latent representations in the context of style transfer. Experimenting with several state-of-the-art models, we demonstrate that such estimates could be used as a fast and straightforward health check for the models instead of more laborious empirical experiments.
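One simple instantiation of such a health check is to estimate the mutual information between the "content" latent dimensions and the style label: a well-decomposed representation should score low. The sketch uses scikit-learn's generic estimator on synthetic codes; it is not claimed to be the estimator used in the paper.

```python
# Quick information-leakage check between latent content codes and style labels.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
latent_content = rng.normal(size=(1000, 16))     # content codes, illustrative
style_labels = rng.integers(0, 2, size=1000)     # e.g., formal vs. informal

mi_per_dim = mutual_info_classif(latent_content, style_labels, random_state=0)
print("style leakage into content code:", mi_per_dim.sum())
```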
ABSTRACT
Molecular visualization is a cornerstone of structural biology, providing insights into the form and function of biomolecules that are difficult to achieve any other way. Scientific analysis, publication, education, and outreach often benefit from photorealistic molecular depictions rendered using advanced computer-graphics programs such as Maya, 3ds Max, and Blender. However, setting up molecular scenes in these programs is laborious even for expert users, and rendering often requires substantial time and computer resources. We have created a deep-learning model called Prot2Prot that quickly imitates photorealistic visualization styles, given a much simpler, easy-to-generate molecular representation. The resulting images are often indistinguishable from images rendered using industry-standard 3D graphics programs, but they can be created in a fraction of the time, even when running in a web browser. To the best of our knowledge, Prot2Prot is the first example of image-to-image translation applied to macromolecular visualization. Prot2Prot is available free of charge, released under the terms of the Apache License, Version 2.0. Users can access a Prot2Prot-powered web app without registration at http://durrantlab.com/prot2prot .
Subjects
Deep Learning , Computer Graphics , Macromolecular Substances , Software
ABSTRACT
Several applications of deep learning, such as image classification and retrieval, recommendation systems, and especially image synthesis, are of great interest to the fashion industry. Recently, image generation of clothes has gained a lot of popularity, as it is a very challenging task that is far from being solved, and it would open up many possibilities for designers and stylists, enhancing their creativity. For this reason, in this paper we tackle the problem of style transfer between two different people wearing different clothes. We draw inspiration from the recent StarGANv2 architecture, which achieved impressive results in transferring a target domain onto a source image, and adapt it to work with fashion images and to transfer clothes styles. In more detail, we modified the architecture to work without the need for a clear separation between multiple domains, added a perceptual loss between the target and the source clothes, and edited the style encoder to better represent the style information of the target clothes. We performed both qualitative and quantitative experiments on the recent DeepFashion2 dataset and proved the efficacy and novelty of our method.
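A perceptual loss of the kind added between target and source clothes is typically a distance between deep features of a fixed backbone. The sketch below uses VGG-16 features up to relu3_3 with an L1 distance; the layer choice, crops, and distance are assumptions for illustration.

```python
# Hedged sketch of a perceptual (feature-reconstruction) loss on clothes crops.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

backbone = vgg16(weights="IMAGENET1K_V1").features[:16].eval()  # up to relu3_3
for p in backbone.parameters():
    p.requires_grad_(False)

def perceptual_loss(generated_crop, target_crop):
    return F.l1_loss(backbone(generated_crop), backbone(target_crop))

gen = torch.rand(1, 3, 224, 224)     # generated person wearing the target clothes
tgt = torch.rand(1, 3, 224, 224)     # reference image of the target clothes
print(perceptual_loss(gen, tgt).item())
```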
Subjects
Clothing , Humans
ABSTRACT
Image style transfer is a challenging problem in computer vision that aims at rendering an image in different styles. Much progress has been made on transferring the style of a single painting by a representative artist in real time, whereas less attention has been paid to transferring an artist's style from a collection of their paintings, a task that requires capturing the artist's precise style from the painting collection. Existing methods pay little attention to the possible disruption of original content details and image structures by texture elements and noise, which leads to structure deformation or edge blurring in the generated images. To address this problem, we propose IFFMStyle, a high-quality image style transfer framework. Specifically, we introduce invalid-feature-filtering modules (IFFM) into the encoder-decoder architecture to filter content-independent features in the original and generated images. A content-consistency constraint is then used to enhance the model's content-preserving capability. We also introduce a style-perception consistency loss, jointly training the network with content loss and adversarial loss to maintain the distinction between different semantic content in the generated image. Additionally, our method does not require paired content and style images. The experimental results show that the proposed method significantly improves the quality of the generated images and can realize style transfer based on the semantic information of the content image. Compared with state-of-the-art methods, our method is more favored by users.
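One heavily hedged reading of a feature-filtering module is a learned gate that suppresses responses judged content-irrelevant before they reach the decoder. The paper's actual IFFM design may differ substantially; the block below only conveys the gating idea.

```python
# Placeholder feature-filtering gate in the spirit of IFFM (illustrative only).
import torch
import torch.nn as nn

class FeatureFilter(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.gate(x)   # responses with low gate values are filtered out

feat = torch.randn(1, 64, 64, 64)
print(FeatureFilter(64)(feat).shape)
```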
ABSTRACT
The development of recent image style transfer methods allows the quick transformation of an input content image into an arbitrary style. However, these methods share a limitation: the scale-across style pattern of a style image cannot be fully transferred to a content image. In this paper, we propose a new style transfer method, named total style transfer, that resolves this limitation by utilizing intra-/inter-scale statistics of multi-scaled feature maps without losing the merits of existing methods. First, we use a more general feature transform layer that employs intra-/inter-scale statistics of multi-scaled feature maps and transforms the multi-scaled style of a content image into that of a style image. Second, we generate a multi-scaled stylized image using only a single decoder network with skip-connections, in which multi-scaled features are merged. Finally, we optimize the style loss for the decoder network in the intra-/inter-scale statistics of image style. Our total style transfer can generate a stylized image with a scale-across style pattern from a pair of content and style images in a single forward pass. Our method achieves lower memory consumption and faster feed-forward speed than the recent cascade scheme, and the lowest style loss among recent style transfer methods.
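A simplified illustration of matching both kinds of statistics: per-scale (intra-scale) channel statistics are aligned first, then statistics computed jointly over the concatenation of all scales (inter-scale) are aligned. The paper's transform is richer; this sketch only conveys the idea with means and standard deviations and assumes equal channel counts across scales.

```python
# Simplified intra-/inter-scale statistic matching between content and style features.
import torch

def match_stats(x, y, dims, eps=1e-5):
    xm, xs = x.mean(dim=dims, keepdim=True), x.std(dim=dims, keepdim=True) + eps
    ym, ys = y.mean(dim=dims, keepdim=True), y.std(dim=dims, keepdim=True) + eps
    return (x - xm) / xs * ys + ym

def multi_scale_transfer(content_feats, style_feats):
    # Intra-scale: align each scale's statistics separately.
    out = [match_stats(c, s, dims=(2, 3)) for c, s in zip(content_feats, style_feats)]
    # Inter-scale: align statistics computed jointly over all scales.
    c_all = torch.cat([f.flatten(2) for f in out], dim=2)
    s_all = torch.cat([f.flatten(2) for f in style_feats], dim=2)
    c_all = match_stats(c_all, s_all, dims=(2,))
    sizes = [f.shape[2] * f.shape[3] for f in out]
    chunks = torch.split(c_all, sizes, dim=2)
    return [ch.reshape(f.shape) for ch, f in zip(chunks, out)]

c = [torch.randn(1, 64, 32, 32), torch.randn(1, 64, 16, 16)]
s = [torch.randn(1, 64, 32, 32), torch.randn(1, 64, 16, 16)]
print([f.shape for f in multi_scale_transfer(c, s)])
```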
Subjects
Research Design
ABSTRACT
The field of Neural Style Transfer (NST) has led to interesting applications that enable us to transform reality as human beings perceive it. In particular, NST for material translation aims to transform the material of an object into that of a target material taken from a reference image. Since the target material (style) usually comes from a different object, the quality of the synthesized result depends entirely on the reference image. In this paper, we propose a material translation method based on NST with automatic style-image retrieval. The proposed CNN-feature-based image retrieval aims to find the ideal reference image that best translates the material of an object. An ideal reference image must share semantic information with the original object while containing distinctive characteristics of the desired material (style). Thus, we refine the search by selecting the most discriminative images from the target material while focusing on object semantics by removing style information. To translate materials onto object regions, we combine a real-time material segmentation method with NST, so that the material of the retrieved style image is transferred to the segmented areas only. We evaluate our proposal with different state-of-the-art NST methods, including conventional and recently proposed approaches. Furthermore, through a human perceptual study with 100 participants, we demonstrate that synthesized images of stone, wood, and metal can be perceived as real and are even chosen over legitimate photographs of such materials.
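A minimal version of the retrieval step ranks candidate style images of the target material by cosine similarity between CNN features of the query object and each candidate. The ResNet-18 backbone and global-pooled embedding below are illustrative choices, not the paper's exact feature extractor or refinement strategy.

```python
# CNN-feature retrieval of a reference (style) image by cosine similarity.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

extractor = resnet18(weights="IMAGENET1K_V1")
extractor.fc = torch.nn.Identity()            # use pooled backbone features
extractor.eval()

@torch.no_grad()
def embed(images):                             # images: (N, 3, 224, 224)
    return F.normalize(extractor(images), dim=1)

query = torch.rand(1, 3, 224, 224)             # object to re-materialize
candidates = torch.rand(10, 3, 224, 224)       # images of the target material
scores = embed(candidates) @ embed(query).T    # cosine similarities
best = scores.squeeze(1).argmax().item()
print("retrieved reference index:", best)
```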
Subjects
Information Storage and Retrieval , Semantics , Humans
ABSTRACT
Recent image-style transfer methods use the structure of a VGG feature network to encode and decode the feature map of the image. Since the network is designed for the general image-classification task, it has a large number of channels and, accordingly, requires a huge amount of memory and high computational power, which is unnecessary for a relatively simple task such as image-style transfer. In this paper, we propose a new technique to slim down the previously used style transfer network, eliminating the redundancy of the VGG feature network in memory consumption and computational cost. Our method automatically finds a set of consistently inactive convolution channels during the network training phase by using two new losses, i.e., a channel loss and an xor loss. The former maximizes the number of inactive channels and the latter fixes the positions of these inactive channels to be the same across images. Our method improves the image generation speed by up to 49% and reduces the number of parameters by 20% while maintaining style transfer performance. Additionally, our losses are also effective in pruning the VGG16 classifier network, i.e., a parameter reduction of 26% and a top-1 accuracy improvement of 0.16% on CIFAR-10.
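As a heavily hedged reading of these two losses: a "channel loss" that pushes per-channel average activation toward zero (so more channels become inactive) and an "xor loss" that penalizes channels whose active/inactive status differs between images in a batch (so the inactive positions coincide). The paper's exact formulations may differ; this sketch is only a plausible soft version.

```python
# Plausible soft versions of a channel-sparsity loss and an activity-agreement
# ("xor") loss over a batch of feature maps; definitions are assumptions.
import torch

def channel_loss(feat):
    # feat: (B, C, H, W); smaller per-channel energy => more inactive channels.
    return feat.abs().mean(dim=(0, 2, 3)).sum()

def xor_loss(feat, threshold=1e-3):
    # Soft per-image, per-channel activity indicator; penalize disagreement
    # across images (the soft analogue of XOR) so inactive positions coincide.
    activity = torch.sigmoid(feat.abs().mean(dim=(2, 3)) / threshold - 1.0)  # (B, C)
    mean_act = activity.mean(dim=0, keepdim=True)
    return (activity * (1.0 - mean_act) + (1.0 - activity) * mean_act).mean()

feat = torch.randn(4, 128, 32, 32)
print(channel_loss(feat).item(), xor_loss(feat).item())
```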
Subjects
Image Enhancement , Neural Networks, Computer , Image Enhancement/methods
ABSTRACT
Here, we propose a CNN-based infrared image enhancement method that transforms pseudo-realistic regions of simulation-based infrared images into real infrared texture. The proposed algorithm consists of three steps. First, target infrared features based on a real infrared image are extracted through pretrained VGG-19 networks. Next, by applying a neural style-transfer algorithm to a simulated infrared image, fractal-nature features from the real infrared image are progressively applied to the image, improving the fractal characteristics of the simulated image. Finally, based on the results of fractal analysis, peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and natural image quality evaluator (NIQE) texture evaluations are performed to assess how well the simulated infrared image is transformed to contain the real infrared fractal features. We verified the proposed methodology using simulations under three different conditions together with a real mid-wave infrared (MWIR) image. As a result, the enhanced simulated infrared images based on the proposed algorithm have better NIQE and SSIM scores in both brightness and fractal characteristics, indicating the closest similarity to the given actual infrared image. The proposed image fractal feature analysis technique can be widely used not only for simulated infrared images but also for general synthetic images.
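Two of the quoted evaluations, PSNR and SSIM, can be computed directly with scikit-image, as sketched below on synthetic arrays standing in for the real and enhanced simulated infrared frames (NIQE and the fractal analysis are not included here).

```python
# PSNR/SSIM evaluation of an enhanced simulated image against a real reference.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

real_ir = np.random.rand(256, 256).astype(np.float32)                 # real MWIR frame
enhanced_sim = np.clip(real_ir + 0.05 * np.random.randn(256, 256), 0, 1).astype(np.float32)

psnr = peak_signal_noise_ratio(real_ir, enhanced_sim, data_range=1.0)
ssim = structural_similarity(real_ir, enhanced_sim, data_range=1.0)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.3f}")
```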
Subjects
Algorithms , Fractals , Computer Simulation , Image Enhancement , Image Processing, Computer-Assisted/methods
ABSTRACT
Generating images of artistic style from input images, also known as image style transfer, has improved in output style quality and generation speed since deep neural networks were applied in computer vision research. However, previous approaches used feature alignment techniques that were too simple in their transform layers to cover the characteristics of image style features. In addition, they used an inconsistent combination of transform layers and loss functions in the training phase to embed arbitrary styles in a decoder network. To overcome these shortcomings, the second-order statistics of the encoded features are exploited to build an optimal arbitrary image style transfer technique. First, a new correlation-aware loss and a correlation-aware feature alignment technique are proposed. Using this consistent combination of loss and feature alignment methods strongly matches the second-order statistics of content features to those of the target-style features and, accordingly, increases the style capacity of the decoder network. Second, a new component-wise style controlling method is proposed. This method can generate various styles from one or several style images by using style-specific components from second-order feature statistics. We experimentally show that the proposed method improves both the style capacity of the decoder network and the style variety without losing real-time processing (less than 200 ms) on Graphics Processing Unit (GPU) devices.
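Matching second-order feature statistics can be illustrated by comparing channel covariance matrices of the stylized and target-style features; the loss below is a generic covariance-matching term, not necessarily the paper's exact correlation-aware loss or alignment procedure.

```python
# Generic second-order (covariance-matching) style loss sketch.
import torch
import torch.nn.functional as F

def channel_covariance(feat):
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    f = f - f.mean(dim=2, keepdim=True)
    return f @ f.transpose(1, 2) / (h * w - 1)

def second_order_style_loss(stylized_feat, style_feat):
    return F.mse_loss(channel_covariance(stylized_feat),
                      channel_covariance(style_feat))

a, b = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
print(second_order_style_loss(a, b).item())
```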
Subjects
Neural Networks, Computer , Records , Image Processing, Computer-Assisted/methods
ABSTRACT
Recovering height information from a single aerial image is a key problem in computer vision and remote sensing. At present, supervised learning methods have achieved impressive results but, owing to domain bias, a trained model cannot be applied directly to a new scene. In this paper, we propose a novel semi-supervised framework, StyHighNet, for accurately estimating the height from a single aerial image in a new city using only a small amount of labeled data. The core idea is to transfer multi-source images to a unified style, so that unlabeled data provide the appearance distribution as additional supervision signals. The framework contains three sub-networks: (1) a style transferring sub-network that maps multi-source images into unified style distribution maps (USDMs); (2) a height regression sub-network that predicts height maps from USDMs; and (3) a style discrimination sub-network used to distinguish the sources of USDMs. The style transferring sub-network shoulders dual responsibilities: on the one hand, it must compute USDMs with distinctive characteristics so that the height regression sub-network can accurately estimate the height maps; on the other hand, the USDMs must have consistent distributions to confuse the style discrimination sub-network, thereby achieving domain adaptation. Unlike previous methods, our style distribution function is learned without supervision, giving it greater flexibility and better accuracy. Furthermore, when the style discrimination sub-network is disabled, the framework can also be used for purely supervised learning. We performed qualitative and quantitative evaluations on two public datasets, Vaihingen and Potsdam. Experiments show that the framework achieves superior performance in both supervised and semi-supervised learning modes.
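The adversarial interplay between the three sub-networks can be sketched as below: the style-transfer sub-network is trained both to support height regression on labeled data and to make its USDMs indistinguishable to the style discriminator (here via a uniform-target confusion loss). All modules are tiny placeholders, and the discriminator's own training step (cross-entropy on true source labels) is omitted.

```python
# Schematic adversarial domain-confusion step for the style-transfer sub-network.
import torch
import torch.nn as nn
import torch.nn.functional as F

style_net = nn.Conv2d(3, 8, 3, padding=1)                        # image -> USDM
height_net = nn.Conv2d(8, 1, 3, padding=1)                       # USDM -> height map
discriminator = nn.Sequential(nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
                              nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                              nn.Linear(16, 3))                   # 3 source cities

images = torch.rand(4, 3, 64, 64)
gt_height = torch.rand(4, 1, 64, 64)

usdm = style_net(images)
regression_loss = F.l1_loss(height_net(usdm), gt_height)          # labeled data only
logits = discriminator(usdm)
uniform = torch.full_like(logits, 1.0 / logits.size(1))
confusion_loss = F.kl_div(F.log_softmax(logits, dim=1), uniform, reduction="batchmean")
(regression_loss + confusion_loss).backward()                     # update style/height nets
```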