Results 1 - 4 of 4
1.
IEEE Trans Image Process ; 32: 5167-5180, 2023.
Article in English | MEDLINE | ID: mdl-37695959

ABSTRACT

Visual grounding, which aims to align image regions with textual queries, is a fundamental task for cross-modal learning. We study weakly supervised visual grounding, where only coarse-grained image-text pairs are available. Due to the lack of fine-grained correspondence information, existing approaches often encounter matching ambiguity. To overcome this challenge, we introduce a cycle consistency constraint on region-phrase pairs, which strengthens correlated pairs and weakens unrelated ones. This cycle pairing exploits the bidirectional association between image regions and text phrases to alleviate matching ambiguity. Furthermore, we propose a parallel grounding framework in which backbone networks and subsequent relation modules extract individual and contextual representations to compute context-free and context-aware similarities between regions and phrases. These two representations characterize visual/linguistic individual concepts and inter-relationships, respectively, and complement each other to achieve cross-modal alignment. The whole framework is trained by minimizing an image-text contrastive loss and a cycle consistency loss. During inference, the two similarities are fused to give the final region-phrase matching score. Experiments on five popular visual grounding datasets demonstrate that our method achieves a noticeable improvement. The source code is available at https://github.com/Evergrow/WSVG.
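The cycle-pairing idea above can be illustrated with a minimal numpy sketch: region-to-phrase and phrase-to-region attention distributions should agree for correlated pairs, so a region-phrase-region round trip should land back on the starting region. This is only an illustration of the constraint, not the authors' implementation; all features and dimensions below are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
R, P, D = 4, 3, 8                      # regions, phrases, embedding dim
regions = rng.normal(size=(R, D))      # stand-in region features
phrases = rng.normal(size=(P, D))      # stand-in phrase features

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

sim = regions @ phrases.T              # raw region-phrase similarities
r2p = softmax(sim, axis=1)             # each region attends to phrases
p2r = softmax(sim, axis=0)             # each phrase attends to regions

# Round-trip matrix: probability of returning to the starting region after
# region -> phrase -> region; a strong diagonal means consistent pairing.
cycle = r2p @ p2r.T                    # shape (R, R)
cycle_consistency = float(np.trace(cycle)) / R   # scalar in (0, 1]
```

Maximizing the diagonal of `cycle` (equivalently, penalizing its off-diagonal mass) is one way to strengthen correlated region-phrase pairs and weaken unrelated ones.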

2.
Front Plant Sci ; 13: 806711, 2022.
Article in English | MEDLINE | ID: mdl-35734255

ABSTRACT

The traditional Chinese large-flowered chrysanthemum is a cultivar group of chrysanthemum (Chrysanthemum × morifolium Ramat.) exhibiting great morphological variation across its many cultivars. Several large-flowered chrysanthemum classification systems have been established using comparative morphology; however, accurate recognition and classification remain a problem for many cultivars. Combining the comparative morphological traits of selected samples, we propose a multi-information model based on deep learning to recognize and classify large-flowered chrysanthemums. In this study, we collected images of 213 large-flowered chrysanthemum cultivars in two consecutive years, 2018 and 2019. Based on the 2018 dataset, we constructed a multi-information classification model using a non-pre-trained ResNet18 as the backbone network. The model achieves 70.62% top-5 test accuracy on the 2019 dataset. We explored the ability of the image features to represent the characteristics of large-flowered chrysanthemums: affinity propagation (AP) clustering shows that the features are sufficient to discriminate flower colors, and principal component analysis (PCA) shows that petal type is interpreted better than flower type. The training sample processing, model training scheme, and learning rate adjustment method affected the convergence and generalization of the model. The non-pre-trained model avoids the bias toward texture over color observed with ImageNet pre-trained models. These results lay a foundation for the automated recognition and classification of large-flowered chrysanthemum cultivars based on image classification.
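Top-5 accuracy, the metric reported above, counts a prediction as correct when the true cultivar appears among the model's five highest-scoring classes. A minimal sketch of that computation, with random stand-in logits and labels rather than the authors' model outputs:

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_classes = 6, 10
logits = rng.normal(size=(n_samples, n_classes))     # stand-in model outputs
labels = rng.integers(0, n_classes, size=n_samples)  # stand-in ground truth

def top_k_accuracy(logits, labels, k=5):
    # Indices of the k highest-scoring classes for each sample.
    topk = np.argsort(logits, axis=1)[:, -k:]
    hits = [label in row for label, row in zip(labels, topk)]
    return sum(hits) / len(hits)

acc = top_k_accuracy(logits, labels, k=5)            # fraction in [0, 1]
```

Top-5 is a natural choice here because with 213 visually similar cultivars, many images have several plausible labels.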

3.
IEEE Trans Image Process ; 31: 2405-2420, 2022.
Article in English | MEDLINE | ID: mdl-35259102

ABSTRACT

Image inpainting has made remarkable progress with recent advances in deep learning. Popular networks mainly follow an encoder-decoder architecture (sometimes with skip connections) and possess a sufficiently large receptive field, i.e., larger than the image resolution. The receptive field refers to the set of input pixels that are path-connected to a neuron. For the image inpainting task, however, the size of the surrounding area needed to repair a missing region varies with the kind of region, and a very large receptive field is not always optimal, especially for local structures and textures. In addition, a large receptive field tends to involve more undesired completion results, which disturb the inpainting process. Based on these insights, we rethink image inpainting from the perspective of the receptive field and propose a novel three-stage inpainting framework with local and global refinement. Specifically, we first utilize an encoder-decoder network with skip connections to produce coarse initial results. Then, we introduce a shallow model with a small receptive field to conduct local refinement, which also weakens the influence of distant undesired completion results. Finally, we propose an attention-based encoder-decoder network with a large receptive field to conduct global refinement. Experimental results demonstrate that our method outperforms the state of the art on three popular publicly available image inpainting datasets. Our local and global refinement network can be appended to the end of any existing network to further improve its inpainting performance. Code is available at https://github.com/weizequan/LGNet.git.
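The trade-off between small and large receptive fields can be made concrete with the standard recurrence for the receptive field of stacked convolutions. This is a generic sketch of that arithmetic, not the architecture of the paper; the layer configurations below are illustrative.

```python
# Receptive-field size of a stack of conv layers, input to output, using
# the recurrence: rf += (kernel - 1) * jump; jump *= stride.
def receptive_field(layers):
    """layers: list of (kernel_size, stride) tuples."""
    rf, jump = 1, 1                    # jump = spacing of outputs in input pixels
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# Three 3x3 stride-1 convs (a shallow local-refinement stack) see only 7x7,
# while adding strided 3x3 layers grows the field geometrically.
small = receptive_field([(3, 1)] * 3)             # 7
large = receptive_field([(3, 2)] * 6 + [(3, 1)])  # 255
```

The shallow stack's 7-pixel field confines each output to nearby context, which is exactly why a small-receptive-field stage can ignore distant undesired completions during local refinement.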


Subject(s)
Image Processing, Computer-Assisted; Neural Networks, Computer; Neurons
4.
Plant Methods ; 17(1): 65, 2021 Jun 22.
Article in English | MEDLINE | ID: mdl-34158091

ABSTRACT

BACKGROUND: The study of plant phenotypes by deep learning has received increased interest in recent years, and impressive progress has been made in plant breeding. In plant phenotype classification and recognition tasks, deep learning relies heavily on a large amount of training data to extract and recognize target features. However, for flower cultivar identification tasks with a huge number of cultivars, it is difficult for traditional deep learning methods to achieve good recognition results with limited sample data. Thus, a method based on metric learning for flower cultivar identification is proposed to solve this problem. RESULTS: We added center loss to the classification network to make inter-class samples disperse and intra-class samples compact; ResNet18, ResNet50, and DenseNet121 were used for feature extraction. To evaluate the effectiveness of the proposed method, the public Oxford 102 Flowers dataset and two novel datasets constructed by us were chosen. With joint supervision by center loss and L2-softmax loss, the test accuracies are 91.88%, 97.34%, and 99.82% on the three datasets, respectively. The feature distributions observed with t-distributed stochastic neighbor embedding (t-SNE) verify the effectiveness of the method. CONCLUSIONS: An efficient metric learning method has been described for the flower cultivar identification task, which not only provides high recognition rates but also makes the features extracted by the recognition network interpretable. This study demonstrates that the proposed method provides new ideas for identification from small amounts of data and has important reference value for flower cultivar identification research.
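Center loss, the key ingredient above, penalizes the distance between each feature vector and its class center, pulling intra-class samples together while the softmax term keeps classes apart. A minimal numpy sketch of the loss term, with random stand-in embeddings rather than the paper's learned features (in training, the centers are updated alongside the network):

```python
import numpy as np

def center_loss(features, labels, centers):
    # L_c = (1/2N) * sum_i ||x_i - c_{y_i}||^2
    diffs = features - centers[labels]           # (N, D) feature-to-center gaps
    return 0.5 * float(np.mean(np.sum(diffs ** 2, axis=1)))

rng = np.random.default_rng(2)
features = rng.normal(size=(6, 4))               # stand-in embeddings
labels = np.array([0, 0, 1, 1, 2, 2])            # three classes, two each
centers = np.stack([features[labels == c].mean(axis=0) for c in range(3)])

loss = center_loss(features, labels, centers)    # small when classes are compact
```

The loss is zero exactly when every sample sits on its class center, which is why minimizing it alongside a softmax-type loss yields the compact, well-separated t-SNE clusters the authors report.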
