Pesquisa | Portal Regional da BVS

Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators.

Zheng, Haitian; Lin, Zhe; Lu, Jingwan; Cohen, Scott; Shechtman, Eli; Barnes, Connelly; Zhang, Jianming; Liu, Qing; Amirghodsi, Sohrab; Zhou, Yuqian; Luo, Jiebo.

IEEE Trans Pattern Anal Mach Intell ; PP2024 Apr 26.

Artigo em Inglês | MEDLINE | ID: mdl-38669165

RESUMO

Structure-guided image completion aims to inpaint a local region of an image according to an input guidance map from users. While such a task enables many practical applications for interactive editing, existing methods often struggle to hallucinate realistic object instances in complex natural scenes. Such a limitation is partially due to the lack of semantic-level constraints inside the hole region as well as the lack of a mechanism to enforce realistic object generation. In this work, we propose a learning paradigm that consists of semantic discriminators and object-level discriminators for improving the generation of complex semantics and objects. Specifically, the semantic discriminators leverage pretrained visual features to improve the realism of the generated visual concepts. Moreover, the object-level discriminators take aligned instances as inputs to enforce the realism of individual objects. Our proposed scheme significantly improves the generation quality and achieves state-of-the-art results on various tasks, including segmentation-guided completion, edge-guided manipulation and panoptically-guided manipulation on Places2 datasets. Furthermore, our trained model is flexible and can support multiple editing use cases, such as object insertion, replacement, removal and standard inpainting. In particular, our trained model combined with a novel automatic image completion pipeline achieves state-of-the-art results on the standard inpainting task.

Semantic Layout Manipulation With High-Resolution Sparse Attention.

Zheng, Haitian; Lin, Zhe; Lu, Jingwan; Cohen, Scott; Zhang, Jianming; Xu, Ning; Luo, Jiebo.

IEEE Trans Pattern Anal Mach Intell ; 45(3): 3768-3782, 2023 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-35696464

RESUMO

We tackle the problem of semantic image layout manipulation, which aims to manipulate an input image by editing its semantic label map. A core problem of this task is how to transfer visual details from the input images to the new semantic layout while making the resulting image visually realistic. Recent work on learning cross-domain correspondence has shown promising results for global layout transfer with dense attention-based warping. However, this method tends to lose texture details due to the resolution limitation and the lack of smoothness constraint on correspondence. To adapt this paradigm for the layout manipulation task, we propose a high-resolution sparse attention module that effectively transfers visual details to new layouts at a resolution up to 512x512. To further improve visual quality, we introduce a novel generator architecture consisting of a semantic encoder and a two-stage decoder for coarse-to-fine synthesis. Experiments on the ADE20k and Places365 datasets demonstrate that our proposed approach achieves substantial improvements over the existing inpainting and layout manipulation methods.

Construction and validation of a COVID-19 pandemic trend forecast model based on Google Trends data for smell and taste loss.

Chen, Jingguo; Mi, Hao; Fu, Jinyu; Zheng, Haitian; Zhao, Hongyue; Yuan, Rui; Guo, Hanwei; Zhu, Kang; Zhang, Ya; Lyu, Hui; Zhang, Yitong; She, Ningning; Ren, Xiaoyong.

Front Public Health ; 10: 1025658, 2022.

Artigo em Inglês | MEDLINE | ID: mdl-36530657

RESUMO

Aim: To explore the role of smell and taste changes in preventing and controlling the COVID-19 pandemic, we aimed to build a forecast model for trends in COVID-19 prediction based on Google Trends data for smell and taste loss. Methods: Data on confirmed COVID-19 cases from 6 January 2020 to 26 December 2021 were collected from the World Health Organization (WHO) website. The keywords "loss of smell" and "loss of taste" were used to search the Google Trends platform. We constructed a transfer function model for multivariate time-series analysis and to forecast confirmed cases. Results: From 6 January 2020 to 28 November 2021, a total of 99 weeks of data were analyzed. When the delay period was set from 1 to 3 weeks, the input sequence (Google Trends of loss of smell and taste data) and response sequence (number of new confirmed COVID-19 cases per week) were significantly correlated (P < 0.01). The transfer function model showed that worldwide and in India, the absolute error of the model in predicting the number of newly diagnosed COVID-19 cases in the following 3 weeks ranged from 0.08 to 3.10 (maximum value 100; the same below). In the United States, the absolute error of forecasts for the following 3 weeks ranged from 9.19 to 16.99, and the forecast effect was relatively accurate. For global data, the results showed that when the last point of the response sequence was at the midpoint of the uptrend or downtrend (25 July 2021; 21 November 2021; 23 May 2021; and 12 September 2021), the absolute error of the model forecast value for the following 4 weeks ranged from 0.15 to 5.77. When the last point of the response sequence was at the extreme point (2 May 2021; 29 August 2021; 20 June 2021; and 17 October 2021), the model could accurately forecast the trend in the number of confirmed cases after the extreme points. Our developed model could successfully predict the development trends of COVID-19. Conclusion: Google Trends for loss of smell and taste could be used to accurately forecast the development trend of COVID-19 cases 1-3 weeks in advance.

Assuntos

Ageusia , COVID-19 , Transtornos do Olfato , Estados Unidos , Humanos , Ageusia/epidemiologia , COVID-19/epidemiologia , Pandemias , Olfato , SARS-CoV-2 , Ferramenta de Busca/métodos

Pose Flow Learning From Person Images for Pose Guided Synthesis.

Zheng, Haitian; Chen, Lele; Xu, Chenliang; Luo, Jiebo.

IEEE Trans Image Process ; 30: 1898-1909, 2021.

Artigo em Inglês | MEDLINE | ID: mdl-33079660

RESUMO

Pose guided synthesis aims to generate a new image in an arbitrary target pose while preserving the appearance details from the source image. Existing approaches rely on either hard-coded spatial transformations or 3D body modeling. They often overlook complex non-rigid pose deformation or unmatched occluded regions, thus fail to effectively preserve appearance information. In this article, we propose a pose flow learning scheme that learns to transfer the appearance details from the source image without resorting to annotated correspondences. Based on such learned pose flow, we proposed GarmentNet and SynthesisNet, both of which use multi-scale feature-domain alignment for coarse-to-fine synthesis. Experiments on the DeepFashion, MVC dataset and additional real-world datasets demonstrate that our approach compares favorably with the state-of-the-art methods and generalizes to unseen poses and clothing styles.

CrossNet++: Cross-Scale Large-Parallax Warping for Reference-Based Super-Resolution.

Tan, Yang; Zheng, Haitian; Zhu, Yinheng; Yuan, Xiaoyun; Lin, Xing; Brady, David; Fang, Lu.

IEEE Trans Pattern Anal Mach Intell ; 43(12): 4291-4305, 2021 Dec.

Artigo em Inglês | MEDLINE | ID: mdl-32750771

RESUMO

The ability of camera arrays to efficiently capture higher space-bandwidth product than single cameras has led to various multiscale and hybrid systems. These systems play vital roles in computational photography, including light field imaging, 360 VR camera, gigapixel videography, etc. One of the critical tasks in multiscale hybrid imaging is matching and fusing cross-resolution images from different cameras under perspective parallax. In this paper, we investigate the reference-based super-resolution (RefSR) problem associated with dual-camera or multi-camera systems. RefSR consists of super-resolving a low-resolution (LR) image given an external high-resolution (HR) reference image, where they suffer both a significant resolution gap ( 8×) and large parallax ( â¼ 10% pixel displacement). We present CrossNet++, an end-to-end network containing novel two-stage cross-scale warping modules, image encoder and fusion decoder. The stage I learns to narrow down the parallax distinctively with the strong guidance of landmarks and intensity distribution consensus. Then the stage II operates more fine-grained alignment and aggregation in feature domain to synthesize the final super-resolved image. To further address the large parallax, new hybrid loss functions comprising warping loss, landmark loss and super-resolution loss are proposed to regularize training and enable better convergence. CrossNet++ significantly outperforms the state-of-art on light field datasets as well as real dual-camera data. We further demonstrate the generalization of our framework by transferring it to video super-resolution and video denoising.

Early melanoma diagnosis with mobile imaging.

Do, Thanh-Toan; Zhou, Yiren; Zheng, Haitian; Cheung, Ngai-Man; Koh, Dawn.

Annu Int Conf IEEE Eng Med Biol Soc ; 2014: 6752-7, 2014.

Artigo em Inglês | MEDLINE | ID: mdl-25571546

RESUMO

We research a mobile imaging system for early diagnosis of melanoma. Different from previous work, we focus on smartphone-captured images, and propose a detection system that runs entirely on the smartphone. Smartphone-captured images taken under loosely-controlled conditions introduce new challenges for melanoma detection, while processing performed on the smartphone is subject to computation and memory constraints. To address these challenges, we propose to localize the skin lesion by combining fast skin detection and fusion of two fast segmentation results. We propose new features to capture color variation and border irregularity which are useful for smartphone-captured images. We also propose a new feature selection criterion to select a small set of good features used in the final lightweight system. Our evaluation confirms the effectiveness of proposed algorithms and features. In addition, we present our system prototype which computes selected visual features from a user-captured skin lesion image, and analyzes them to estimate the likelihood of malignance, all on an off-the-shelf smartphone.

Assuntos

Detecção Precoce de Câncer , Interpretação de Imagem Assistida por Computador , Neoplasias Cutâneas/diagnóstico , Algoritmos , Telefone Celular , Diagnóstico por Imagem , Humanos , Melanoma/diagnóstico , Fotografação , Pele/patologia , Pigmentação da Pele

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA