ABSTRACT
With growing demand for person re-identification (Re-ID), all-day retrieval has become a necessity. Single-modal Re-ID can no longer meet this requirement, which makes multi-modal data crucial in Re-ID. Consequently, the Visible-Infrared Person Re-Identification (VI Re-ID) task has been proposed, which aims to match person images across the visible and infrared modalities. The large discrepancy between the two modalities poses a major challenge. Existing VI Re-ID methods focus on cross-modal feature learning and modality transformation to alleviate the discrepancy but overlook the impact of person contour information. Contours exhibit modality invariance, which is vital for learning effective identity representations and for cross-modal matching. In addition, due to the low intra-modal diversity in the visible modality, it is difficult to distinguish the boundaries between some hard samples. To address these issues, we propose the Graph Sampling-based Multi-stream Enhancement Network (GSMEN). First, the Contour Expansion Module (CEM) incorporates the contour information of a person into the original samples, further reducing the modality discrepancy and improving matching stability between image pairs of different modalities. Second, to better distinguish cross-modal hard sample pairs during training, a Cross-modality Graph Sampler (CGS) is designed for sample selection before training. The CGS calculates the feature distance between samples from different modalities and groups similar samples into the same batch during training, effectively exploring the boundary relationships between hard classes in the cross-modal setting. Experiments on the SYSU-MM01 and RegDB datasets demonstrate the superiority of the proposed method. Specifically, in the VIS-to-IR task, the method achieves 93.69% Rank-1 and 92.56% mAP on the RegDB dataset.
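The batching idea behind the CGS can be illustrated with a minimal sketch. It assumes one feature vector per identity and modality, Euclidean distance, and a fixed number of identities per batch; the function name `cross_modal_batches` and all toy values are hypothetical and are not taken from the paper, which may compute distances and build batches differently.

```python
import math


def cross_modal_batches(vis_feats, ir_feats, ids_per_batch):
    """Group each identity with its closest cross-modal neighbours.

    vis_feats / ir_feats: dict mapping identity -> feature vector
    (hypothetical inputs: one vector per identity and modality).
    """
    ids = sorted(vis_feats)
    batches = []
    for anchor in ids:
        # Distance from the anchor's visible feature to every OTHER
        # identity's infrared feature (cross-modal comparison only).
        dists = sorted(
            (math.dist(vis_feats[anchor], ir_feats[other]), other)
            for other in ids
            if other != anchor
        )
        # The nearest identities are the hardest to separate, so they
        # share a batch with the anchor.
        neighbours = [other for _, other in dists[: ids_per_batch - 1]]
        batches.append([anchor] + neighbours)
    return batches


# Toy features: identities a/b are close to each other, as are c/d.
vis = {"a": [0.0, 0.0], "b": [0.1, 0.0], "c": [5.0, 5.0], "d": [5.0, 5.2]}
ir = {"a": [0.0, 0.1], "b": [0.2, 0.0], "c": [5.1, 5.0], "d": [4.9, 5.1]}
batches = cross_modal_batches(vis, ir, ids_per_batch=2)
```

With these toy features, each identity is batched with the one whose infrared feature lies closest to its visible feature, so confusable identities are trained against each other.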
ABSTRACT
Intelligent medicine seeks to generate radiology reports automatically to ease the tedious work of radiologists. Previous research has mainly focused on text generation with an encoder-decoder structure, while the CNNs used to extract visual features ignore the long-range dependencies that correlate with textual information. Moreover, few studies exploit cross-modal mappings to improve radiology report generation. To alleviate these problems, we propose a novel end-to-end radiology report generation model dubbed Self-Supervised dual-Stream Network (S3-Net). Specifically, a Dual-Stream Visual Feature Extractor (DSVFE) composed of ResNet and SwinTransformer is proposed to capture richer and more effective visual features, where the former focuses on local responses and the latter explores long-range dependencies. Then, we introduce the Fusion Alignment Module (FAM) to fuse the dual-stream visual features and facilitate alignment between visual features and text features. Furthermore, Self-Supervised Learning with Mask (SSLM) is introduced to further enhance the visual feature representation ability. Experimental results on two mainstream radiology reporting datasets (IU X-ray and MIMIC-CXR) show that the proposed approach outperforms previous models in terms of language generation metrics.
Subject(s)
Radiology; Self-Management; Humans; Radiography; Radiologists; Benchmarking; Image Processing, Computer-Assisted
ABSTRACT
Many dimensionality reduction methods in the manifold learning field suffer from the so-called small-sample-size (SSS) problem. Starting from the SSS problem, we first summarize the existing dimensionality reduction methods and construct a unified criterion function for them. Then, combining the unified criterion with the matrix function, we propose a general matrix function dimensionality reduction framework. This framework is configurable: one can select suitable functions to construct such a matrix transformation framework, and a series of new dimensionality reduction methods can then be derived from it. In this article, we discuss how to choose suitable functions from two aspects: 1) solving the SSS problem and 2) improving pattern classification ability. As an extension, with the inverse hyperbolic tangent function and the linear function, we propose a new matrix function dimensionality reduction framework. Compared with existing methods for solving the SSS problem, these new methods obtain better pattern classification ability and have lower computational complexity. Experimental results on handwritten digit and letter databases and on two face databases show the superiority of the new methods.
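For orientation, the standard way to apply a scalar function to a symmetric matrix goes through its eigendecomposition. The abstract does not state the unified criterion or how its chosen functions enter it, so the symbols $S$, $U$, $\lambda_i$ and the exponential used as an illustration below are assumptions, not the authors' formulation.

```latex
% Matrix function of a symmetric matrix S via its eigendecomposition
S = U \,\mathrm{diag}(\lambda_1,\dots,\lambda_n)\, U^{\top}
\quad\Longrightarrow\quad
f(S) = U \,\mathrm{diag}\bigl(f(\lambda_1),\dots,f(\lambda_n)\bigr)\, U^{\top}.

% Illustration with f = exp (an assumed example, not the paper's choice):
% every transformed eigenvalue is positive, so exp(S) is nonsingular
\exp(\lambda_i) > 0 \ \text{for all } i
\quad\Longrightarrow\quad
\det \exp(S) = \prod_{i=1}^{n} \exp(\lambda_i) > 0 .
```

If $S$ is a scatter matrix that is singular because there are fewer samples than dimensions, $\exp(S)$ is still invertible, which is one way a matrix function can sidestep the SSS problem.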
ABSTRACT
Traditional clustering methods often require the neighborhood parameters and the number of clusters to be selected in advance, and the optimal choice of these parameters varies with the shape of the data, which requires prior knowledge. To address this parameter selection problem, we propose an effective clustering algorithm based on an adaptive neighborhood, which obtains satisfactory clustering results without setting the neighborhood parameters or the number of clusters. The core idea of the algorithm is to first iterate adaptively to a logarithmic stable state and obtain neighborhood information according to the distribution characteristics of the dataset, then mark and peel the boundary points according to this neighborhood information, and finally form the clusters with the core points as the centers. We conducted extensive comparative experiments on datasets of different sizes and distributions and achieved satisfactory results.
Subject(s)
Algorithms; Cluster Analysis; Rotation
ABSTRACT
A semismooth Newton method, based on variational inequalities and generalized derivatives, is designed and analysed for the unilateral contact problem between two membranes. The problem is first formulated as a corresponding regularized problem with a nonlinear function, which can be solved by the semismooth Newton method. We prove the convergence of the method in the function space. To improve the performance of the semismooth Newton method, we use a path-following method to adjust the regularization parameter automatically. Finally, some numerical results are presented to illustrate the performance of the proposed method.
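The mechanics of a semismooth Newton iteration on a regularized contact problem can be sketched on a much simpler model. The example below is an assumption-laden simplification: a single 1D membrane over a rigid obstacle instead of two membranes, a finite-difference discretization, and a fixed penalty parameter instead of path-following. The function names and all numerical values are hypothetical. The regularized equation is $Au - f - c\,\max(0, \psi - u) = 0$, and the generalized derivative of the max term is the indicator of the set where $\psi - u > 0$.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] do not affect the result."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def semismooth_newton_obstacle(f, psi, h, penalty, max_iter=50):
    """Solve A u - f - penalty * max(0, psi - u) = 0 with A = -d^2/dx^2."""
    n = len(f)
    off = -1.0 / h**2
    u = [0.0] * n
    active = None
    for it in range(max_iter):
        # Generalized derivative of max(0, psi - u): indicator of psi - u > 0.
        new_active = [psi[i] - u[i] > 0.0 for i in range(n)]
        if new_active == active:
            return u, it  # active set unchanged: Newton iteration has converged
        active = new_active
        # Newton step: (A + penalty * D) u = f + penalty * D * psi.
        diag = [2.0 / h**2 + (penalty if a else 0.0) for a in active]
        rhs = [f[i] + (penalty * psi[i] if active[i] else 0.0) for i in range(n)]
        u = solve_tridiagonal([off] * n, diag, [off] * n, rhs)
    return u, max_iter


# Toy problem: membrane on (0, 1) pushed down by a constant load onto a
# flat obstacle at height -0.1 (all values are illustrative assumptions).
n, h = 49, 1.0 / 50
u, iters = semismooth_newton_obstacle(
    f=[-10.0] * n, psi=[-0.1] * n, h=h, penalty=1e6
)
```

Because the regularized equation is piecewise linear, each Newton step reduces to one linear solve on the current active set, and the iteration stops once that set no longer changes. A path-following strategy of the kind the abstract mentions would wrap this loop and increase `penalty` between solves.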