Results 1 - 20 of 97
1.
Phys Chem Chem Phys ; 25(35): 24097-24109, 2023 Sep 13.
Article in English | MEDLINE | ID: mdl-37655461

ABSTRACT

Polymers are known to effectively improve the toughness of inorganic matrices; however, the mechanism at the molecular level is still unclear. In this study, we used molecular dynamics simulations to unravel the effects and mechanisms of different molecular chain lengths of polyacrylic acid (PAA) on toughening calcium silicate hydrate (CSH), which is the basic building block of cement-based materials. Our simulation results indicate that an optimal polymer chain length yields the largest toughening effect on the matrix, leading to an increase of up to 60.98% in fracture energy. During uniaxial tensile tests along the x-axis and z-axis directions, the configuration evolution of the PAA molecule determines the toughening effect. As the polymer unfolds and its size matches the defects of CSH, the stress distribution of the system becomes more homogeneous, which favors an increase in toughness. Furthermore, based on our simulation results and a mathematical model, we propose a "strain rate/optimal chain length" theory. This theory suggests that the optimal toughening effect is achieved when the molecular chain length of the organic component is 1.3-1.5 times the largest defect size of the inorganic matrix. This work provides molecular-scale insights into the toughening mechanisms of an organic/inorganic system and may have practical implications for improving the toughness of cement-based materials.
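The closing design rule can be expressed numerically. A minimal sketch, assuming a hypothetical defect size; only the 1.3-1.5 factors come from the abstract:

```python
def optimal_chain_length_range(largest_defect_size):
    """Chain-length window predicted by the 'strain rate/optimal chain length'
    theory: 1.3-1.5 times the largest defect size of the inorganic matrix.
    The result is in whatever unit the defect size is given in."""
    return 1.3 * largest_defect_size, 1.5 * largest_defect_size

# Hypothetical largest CSH defect of 2.0 nm:
low, high = optimal_chain_length_range(2.0)
print(f"optimal PAA chain length: {low:.1f}-{high:.1f} nm")
```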

2.
Phys Chem Chem Phys ; 20(13): 8773-8789, 2018 Mar 28.
Article in English | MEDLINE | ID: mdl-29542793

ABSTRACT

Graphene oxide (GO) reinforced cement nanocomposites open up a new path for sustainable concrete design. In this paper, reactive force-field molecular dynamics was utilized to investigate the structure, reactivity and interfacial bonding of calcium silicate hydrate (C-S-H)/GO nanocomposites functionalized by hydroxyl (C-OH), epoxy (C-O-C), carboxyl (COOH) and sulfonic (SO3H) groups at a coverage of 10%. The silicate chains in the hydrophilic C-S-H substrate provided numerous non-bridging oxygen sites and counter ions (Ca ions) with high reactivity, which allowed interlayer water molecules to dissociate into Si-OH and Ca-OH. On the other hand, protons dissociated from the functional groups and transferred to non-bridging sites in C-S-H, producing carbonyl (C=O) and Si-OH. The degree of de-protonation of the different groups in the vicinity of the C-S-H surface followed the order COOH (SO3H) > C-OH > C-O-C. In the GO-COOH sheet, most COOH groups were de-protonated to COO- groups, which enhanced the polarity and hydrophilicity of the GO sheets and formed stable COOCa bonds with neighboring Ca ions. The de-protonated COO- could also accept H bonds from Si-OH in the C-S-H gel, which further strengthened the interfacial connection. On the contrary, in the GO-Oo sheet, only 8% of the epoxy groups were stretched open by the Ca ions and transformed into carbonyl groups, showing weak polarity and a weak connection with the C-S-H sheet. Furthermore, uniaxial tensile tests on different C-S-H/GO models revealed that C-S-H reinforced with GO-COOH and GO-OH exhibited better interfacial cohesive strength and ductility under tensile loading. Under the reactive force field, the dissociation of water, proton exchange between the C-S-H and GO structures, and Oc-Ca-Os bond breakage occurred to resist tensile loading. The weakest mechanical behavior, observed in the G/C-S-H, GO-Oo/C-S-H and GO-SO3H/C-S-H composites, was attributed to poor bonding, dissociation of functional groups and instability of atoms in the interface region. These molecular-scale strengthening mechanisms could provide a scientific guide for the sustainable design of cement composites.

3.
Lab Invest ; 95(8): 860-71, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26006021

ABSTRACT

Although the phosphatidyl-inositol-3-kinase (PI3K)/Akt pathway is essential for conferring cardioprotection in response to ischemic preconditioning (IP), the role of PI3K/Akt signaling in the infarcted heart for mediating the anti-arrhythmic effects in response to IP remains unclear. We explored the involvement of PI3K/Akt in the IP-like effect of connexin 43 and proangiogenic factors with particular regard to its role in protecting against ischemia-induced arrhythmia, heart failure, and myocardial remodeling. Groups of pigs were administered phosphate-buffered saline (PBS) or LY294002 solution. Before induction of myocardial infarction (MI), pigs were grouped according to whether or not they underwent IP. Next, all animals underwent MI induction by ligation of the left anterior descending (LAD) coronary artery. Myocardial tissues from the pig hearts at 7 days after MI were used to assess myocardial myeloperoxidase and reactive oxygen species, infarct size, collagen content, blood vascular density, and expression of Akt, connexin 43, and proangiogenic growth factors, using spectrophotometry, histology, immunohistochemistry, real-time RT-PCR, and western blot. At 7 days after MI, IP significantly reduced animal mortality and malignant ventricular arrhythmia, myocardial inflammation, infarct size, and collagen content, and improved cardiac function and remodeling; use of the PI3K inhibitor LY294002 diminished these effects. In parallel with a decline in Akt expression and phosphorylation by MI, LY294002 injection resulted in significant suppression of connexin 43 and proangiogenic factor expression, and a reduction of angiogenesis and collateral circulation. These findings demonstrate that the cardioprotective effects of IP on antiventricular arrhythmia and myocardial repair occur through upregulation of PI3K/Akt-mediated connexin 43 and growth factor signaling.


Subject(s)
Ischemic Preconditioning , Myocardial Infarction/metabolism , Phosphatidylinositol 3-Kinases/metabolism , Signal Transduction/physiology , Animals , Chromones , Morpholines , Myocardium/pathology , Phosphoinositide-3 Kinase Inhibitors , Swine , Vascular Endothelial Growth Factor A/metabolism
4.
Article in English | MEDLINE | ID: mdl-38683707

ABSTRACT

Knowledge distillation-based anomaly detection (KDAD) methods rely on the teacher-student paradigm to detect and segment anomalous regions by contrasting the unique features extracted by both networks. However, existing KDAD methods suffer from two main limitations: 1) the student network can effortlessly replicate the teacher network's representations and 2) the features of the teacher network serve solely as a "reference standard" and are not fully leveraged. Toward this end, we depart from the established paradigm and instead propose an innovative approach called asymmetric distillation postsegmentation (ADPS). Our ADPS employs an asymmetric distillation paradigm that takes distinct forms of the same image as the input of the teacher-student networks, driving the student network to learn discriminating representations for anomalous regions. Meanwhile, a customized Weight Mask Block (WMB) is proposed to generate a coarse anomaly localization mask that transfers the distilled knowledge acquired from the asymmetric paradigm to the teacher network. Equipped with WMB, the proposed postsegmentation module (PSM) can effectively detect and segment abnormal regions with fine structures and clear boundaries. Experimental results demonstrate that the proposed ADPS outperforms the state-of-the-art methods in detecting and segmenting anomalies. Notably, ADPS improves the average precision (AP) metric by 9% and 20% on the MVTec anomaly detection (AD) and KolektorSDD2 datasets, respectively.
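The teacher-student contrast underlying KDAD can be sketched generically. This is an illustrative scoring step (one minus cosine similarity between feature maps), not the ADPS architecture; all shapes are hypothetical:

```python
import numpy as np

def anomaly_map(teacher_feat, student_feat, eps=1e-8):
    """Per-location anomaly score for (C, H, W) feature maps: 1 - cosine
    similarity between teacher and student features at each spatial position.
    Regions the student fails to replicate score high."""
    t = teacher_feat / (np.linalg.norm(teacher_feat, axis=0, keepdims=True) + eps)
    s = student_feat / (np.linalg.norm(student_feat, axis=0, keepdims=True) + eps)
    return 1.0 - (t * s).sum(axis=0)  # (H, W)

rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 4, 4))
amap = anomaly_map(teacher, teacher.copy())  # identical features -> scores near 0
```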

5.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 3213-3229, 2024 May.
Article in English | MEDLINE | ID: mdl-38051621

ABSTRACT

Visual grounding (VG) aims to locate a specific target in an image based on a given language query. The discriminative information from context is important for distinguishing the target from other objects, particularly for targets that share a category with other objects. However, most previous methods underestimate such information. Moreover, they are usually designed for the standard scene (without any novel object), which limits their generalization to the open-vocabulary scene. In this paper, we propose a novel framework with context disentangling and prototype inheriting for robust visual grounding to handle both scenes. Specifically, the context disentangling separates the referent and context features, which achieves better discrimination between them. The prototype inheriting inherits the prototypes discovered from the disentangled visual features by a prototype bank to fully utilize the seen data, especially for the open-vocabulary scene. The fused features are obtained by applying the Hadamard product to the disentangled linguistic and visual features of prototypes, which avoids sharply re-weighting the two types of features; they are then attached with a special token and fed to a vision Transformer encoder for bounding box regression. Extensive experiments are conducted on both standard and open-vocabulary scenes. The performance comparisons indicate that our method outperforms the state-of-the-art methods in both scenarios.
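The Hadamard-product fusion step can be sketched in isolation. A toy version with made-up shapes, where `reg_token` stands in for the special token mentioned above:

```python
import numpy as np

def fuse_and_tokenize(linguistic, visual, reg_token):
    """Element-wise (Hadamard) product of disentangled linguistic and visual
    prototype features, then prepend a special regression token before the
    sequence would be fed to a Transformer encoder.
    Shapes: (N, D) features, (1, D) token."""
    fused = linguistic * visual  # soft, symmetric fusion of the two modalities
    return np.concatenate([reg_token, fused], axis=0)

lin = np.ones((3, 4))
vis = np.full((3, 4), 2.0)
tok = np.zeros((1, 4))
seq = fuse_and_tokenize(lin, vis, tok)  # (4, 4): token row + 3 fused rows
```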

6.
IEEE Trans Pattern Anal Mach Intell ; 46(4): 2461-2474, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38015702

ABSTRACT

Stereo matching is a fundamental building block for many vision and robotics applications. An informative and concise cost volume representation is vital for stereo matching of high accuracy and efficiency. In this article, we present a novel cost volume construction method, named attention concatenation volume (ACV), which generates attention weights from correlation clues to suppress redundant information and enhance matching-related information in the concatenation volume. The ACV can be seamlessly embedded into most stereo matching networks; the resulting networks can use a more lightweight aggregation network while achieving higher accuracy. We further design a fast version of ACV, named Fast-ACV, to enable real-time performance; it generates high-likelihood disparity hypotheses and the corresponding attention weights from low-resolution correlation clues to significantly reduce computational and memory costs while maintaining satisfactory accuracy. Furthermore, we design a highly accurate network, ACVNet, and a real-time network, Fast-ACVNet, based on our ACV and Fast-ACV respectively, which achieve state-of-the-art performance on several benchmarks.
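The core idea, attention weights derived from correlation clues modulating a concatenation volume, can be sketched on toy 1-D scanlines. This is an illustrative reduction with invented shapes, not the actual ACV network:

```python
import numpy as np

def attention_concat_volume(left, right, max_disp):
    """Build a concatenation volume over candidate disparities and modulate it
    with softmax attention weights derived from a correlation volume.
    left/right: (C, W) single-scanline features; returns (2C, max_disp, W)."""
    C, W = left.shape
    concat = np.zeros((2 * C, max_disp, W))
    corr = np.zeros((max_disp, W))
    for d in range(max_disp):
        shifted = np.roll(right, d, axis=1)       # crude stand-in for warping
        corr[d] = (left * shifted).mean(axis=0)   # correlation clue
        concat[:, d, :] = np.concatenate([left, shifted], axis=0)
    att = np.exp(corr) / np.exp(corr).sum(axis=0, keepdims=True)  # softmax over disparity
    return att[None] * concat  # attention-modulated concatenation volume

rng = np.random.default_rng(1)
feat = rng.normal(size=(4, 8))
vol = attention_concat_volume(feat, feat, max_disp=3)  # (8, 3, 8)
```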

7.
IEEE Trans Image Process ; 33: 297-309, 2024.
Article in English | MEDLINE | ID: mdl-38100340

ABSTRACT

Recognizing actions performed on unseen objects, known as Compositional Action Recognition (CAR), has attracted increasing attention in recent years. The main challenge is to overcome the distribution shift of "action-objects" pairs between the training and testing sets. Previous works for CAR usually introduce extra information (e.g., bounding boxes) to enhance the dynamic cues of video features. However, these approaches do not essentially eliminate the inherent inductive bias in the video, which can be regarded as a stumbling block for model generalization, because video features are usually extracted from visually cluttered areas in which many objects cannot be removed or masked explicitly. To this end, this work attempts to implicitly accomplish semantic-level decoupling of "object-action" in the high-level feature space. Specifically, we propose a novel Semantic-Decoupling Transformer framework, dubbed DeFormer, which contains two insightful sub-modules: Objects-Motion Decoupler (OMD) and Semantic-Decoupling Constrainer (SDC). In OMD, we initialize several learnable tokens incorporating annotation priors to learn an instance-level representation and then decouple it into the appearance feature and motion feature in high-level visual space. In SDC, we use textual information in the high-level language space to construct a dual-contrastive association to constrain the decoupled appearance feature and motion feature obtained in OMD. Extensive experiments verify the generalization ability of DeFormer. Specifically, compared to the baseline method, DeFormer achieves absolute improvements of 3%, 3.3%, and 5.4% under three different settings on STH-ELSE, while the corresponding improvements on EPIC-KITCHENS-55 are 4.7%, 9.2%, and 4.4%. Besides, DeFormer attains state-of-the-art results on either ground-truth or detected annotations.

8.
IEEE Trans Image Process ; 33: 1136-1148, 2024.
Article in English | MEDLINE | ID: mdl-38300774

ABSTRACT

The image-level label has prevailed in weakly supervised semantic segmentation tasks due to its easy availability. Since image-level labels can only indicate the existence or absence of specific categories of objects, visualization-based techniques have been widely adopted to provide object location clues. Considering that class activation maps (CAMs) can only locate the most discriminative part of objects, recent approaches usually adopt an expansion strategy to enlarge the activation area for more integral object localization. However, without proper constraints, the expanded activation easily intrudes into the background region. In this paper, we propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion. Specifically, we propose a CAM-driven reconstruction module to directly reconstruct the input image from deep CAM features, which constrains the diffusion of last-layer object attention by preserving the coarse spatial structure of the image content. Moreover, we propose an activation self-modulation module to refine CAMs with finer spatial structure details by enhancing regional consistency. Without external saliency models to provide background clues, our approach achieves 72.7% and 47.0% mIoU on the PASCAL VOC 2012 and COCO datasets, respectively, demonstrating the superiority of our proposed approach. The source codes and models have been made available at https://github.com/NUST-Machine-Intelligence-Laboratory/SSC.
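For reference, the vanilla CAM computation that the expansion strategies above start from can be sketched as follows. The SSC modules themselves are not reproduced here; shapes are illustrative:

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Standard CAM: weight the final conv feature maps (C, H, W) by the
    classifier weights for one class, sum over channels, apply ReLU, and
    normalize to [0, 1]. Highlights the most discriminative object parts."""
    cam = np.tensordot(fc_weights[class_idx], features, axes=1)  # (H, W)
    cam = np.maximum(cam, 0)                                     # keep positive evidence
    return cam / (cam.max() + 1e-8)

feats = np.random.default_rng(2).random((16, 7, 7))  # hypothetical conv features
w = np.random.default_rng(3).random((10, 16))        # hypothetical classifier weights
cam = class_activation_map(feats, w, class_idx=4)
```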

9.
Heliyon ; 10(7): e28552, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38560176

ABSTRACT

Introduction: Simultaneous involvement of the peripheral nervous system (PNS) and central nervous system (CNS) during the same period in diffuse large B-cell lymphoma (DLBCL) is rarely documented. In this particular case, the diagnosis of diffuse large B-cell lymphoma was pathologically confirmed, with invasion into the basal ganglia, diencephalon, and several peripheral nerves. The initial clinical manifestations were dyspnoea and hyperventilation. Case presentation: The patient presented to the hospital with fatigue, dyspnoea, and limb pain for over 7 months, accompanied by progressive breathlessness and unconsciousness in the last 6 days. Initial treatment with glucocorticoids for Guillain-Barre syndrome (GBS) proved ineffective in controlling the severe shortness of breath and hyperventilation, necessitating the use of ventilator-assisted ventilation. 18-Fluorodeoxyglucose positron emission tomography/computed tomography (18FDG PET/CT) showed that the basal ganglia, brainstem, and multiple peripheral nerves were thickened and metabolically active. There were atypical cells in the cerebrospinal fluid; the pathology indicated invasive B-cell lymphoma, demonstrating a propensity toward diffuse large B-cell lymphoma (DLBCL). After receiving chemotherapy, the patient regained consciousness and was successfully weaned off ventilator assistance but died of severe pneumonia. Discussion: The early clinical manifestations of DLBCL lack specificity, and multifocal DLBCL complicates the diagnostic process. When a single primary disease cannot explain multiple symptoms, the possibility of DLBCL should be considered, and nervous system invasion should be considered when nervous system symptoms are present. Once nervous system involvement occurs in DLBCL, whether the central or peripheral nervous system, it indicates a poor prognosis.

10.
IEEE Trans Pattern Anal Mach Intell ; 46(8): 5612-5624, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38416607

ABSTRACT

How to effectively explore the colors of exemplars and propagate them to colorize each frame is vital for exemplar-based video colorization. In this article, we present BiSTNet, which explores the colors of exemplars and utilizes them to help video colorization via bidirectional temporal feature fusion guided by a semantic image prior. We first establish the semantic correspondence between each frame and the exemplars in deep feature space to explore color information from the exemplars. Then, we develop a simple yet effective bidirectional temporal feature fusion module to propagate the colors of exemplars into each frame and avoid inaccurate alignment. We note that color-bleeding artifacts usually appear around the boundaries of important objects in videos. To overcome this problem, we develop a mixed expert block to extract semantic information for modeling the object boundaries of frames so that the semantic image prior can better guide the colorization process. In addition, we develop a multi-scale refinement block to progressively colorize frames in a coarse-to-fine manner. Extensive experimental results demonstrate that the proposed BiSTNet performs favorably against state-of-the-art methods on the benchmark datasets and real-world scenes. Moreover, BiSTNet won a championship in the NTIRE 2023 video colorization challenge (Kang et al. 2023).

11.
Adv Mater ; : e2405183, 2024 Jul 07.
Article in English | MEDLINE | ID: mdl-38973222

ABSTRACT

Biological materials relying on hierarchically ordered architectures inspire the emergence of advanced composites with mutually exclusive mechanical properties, but efficient topology optimization and large-scale manufacturing remain challenging. Herein, this work proposes a scalable bottom-up approach to fabricate a novel nacre-like cement-resin composite with a gradient brick-and-mortar (BM) structure, and demonstrates a machine learning-assisted method to optimize the gradient structure. The fabricated gradient composite exhibits an extraordinary combination of high flexural strength, toughness, and impact resistance. In particular, the toughness and impact resistance of the composite surpass those of the cement counterparts by factors of approximately 700 and 600, respectively, and even outperform natural rocks, fiber-reinforced cement-based materials, and some alloys. The strengthening and toughening mechanisms are clarified as regional-matrix densifying and crack-tip shielding effects caused by the gradient BM structure. The developed gradient composite not only provides a promising structural material for protective applications in harsh scenarios, but also paves a new way for the design of biomimetic metamaterials.

12.
IEEE Trans Neural Netw Learn Syst ; 34(4): 1838-1851, 2023 Apr.
Article in English | MEDLINE | ID: mdl-32502968

ABSTRACT

Hashing has been widely applied to multimodal retrieval on large-scale multimedia data due to its efficiency in computation and storage. In this article, we propose a novel deep semantic multimodal hashing network (DSMHN) for scalable image-text and video-text retrieval. The proposed deep hashing framework leverages 2-D convolutional neural networks (CNNs) as the backbone to capture spatial information for image-text retrieval, and 3-D CNNs as the backbone to capture spatial and temporal information for video-text retrieval. In the DSMHN, two sets of modality-specific hash functions are jointly learned by explicitly preserving both intermodality similarities and intramodality semantic labels. Specifically, with the assumption that the learned hash codes should be optimal for the classification task, two stream networks are jointly trained to learn the hash functions by embedding the semantic labels on the resultant hash codes. Moreover, a unified deep multimodal hashing framework is proposed to learn compact and high-quality hash codes by simultaneously exploiting feature representation learning, intermodality similarity-preserving learning, semantic label-preserving learning, and hash function learning with different types of loss functions. The proposed DSMHN method is a generic and scalable deep hashing framework for both image-text and video-text retrieval, which can be flexibly integrated with different types of loss functions. We conduct extensive experiments for both single-modal and cross-modal retrieval tasks on four widely used multimodal-retrieval data sets. Experimental results on both image-text and video-text retrieval tasks demonstrate that the DSMHN significantly outperforms the state-of-the-art methods.
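The retrieval step shared by such hashing methods can be sketched with a random (untrained) projection standing in for the learned hash functions; all names and shapes here are invented for illustration:

```python
import numpy as np

def hash_codes(features, projection):
    """Map real-valued features to {-1, +1} hash codes via a projection.
    In DSMHN the projection would be a learned, modality-specific network;
    a random matrix stands in here."""
    return np.sign(features @ projection)

def hamming_rank(query_code, gallery_codes):
    """Rank gallery items by Hamming distance to the query code (ascending)."""
    dists = (gallery_codes != query_code).sum(axis=1)
    return np.argsort(dists, kind="stable")

rng = np.random.default_rng(0)
proj = rng.normal(size=(32, 16))       # 32-d features -> 16-bit codes
gallery = rng.normal(size=(5, 32))
codes = hash_codes(gallery, proj)
order = hamming_rank(codes[2], codes)  # item 2's own code has distance 0
```

Hamming distance on binary codes is what makes retrieval cheap: it reduces to XOR and popcount at scale.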

13.
IEEE Trans Image Process ; 32: 2960-2971, 2023.
Article in English | MEDLINE | ID: mdl-37195845

ABSTRACT

Weakly supervised semantic segmentation (WSSS) models relying on class activation maps (CAMs) have achieved desirable performance compared to the non-CAM-based counterparts. However, to make the WSSS task feasible, pseudo labels must be generated by expanding the seeds from CAMs, which is complex and time-consuming, thus hindering the design of efficient end-to-end (single-stage) WSSS approaches. To tackle the above dilemma, we resort to off-the-shelf and readily accessible saliency maps for directly obtaining pseudo labels given the image-level class labels. Nevertheless, the salient regions may contain noisy labels and cannot seamlessly fit the target objects, and saliency maps can only be approximated as pseudo labels for simple images containing single-class objects. As such, a segmentation model trained on these simple images cannot generalize well to complex images containing multi-class objects. To this end, we propose an end-to-end multi-granularity denoising and bidirectional alignment (MDBA) model to alleviate the noisy-label and multi-class generalization issues. Specifically, we propose online noise filtering and progressive noise detection modules to tackle image-level and pixel-level noise, respectively. Moreover, a bidirectional alignment mechanism is proposed to reduce the data distribution gap at both input and output space with simple-to-complex image synthesis and complex-to-simple adversarial learning. MDBA can reach 69.5% and 70.2% mIoU on the validation and test sets of the PASCAL VOC 2012 dataset. The source codes and models have been made available at https://github.com/NUST-Machine-Intelligence-Laboratory/MDBA.

14.
IEEE Trans Pattern Anal Mach Intell ; 45(6): 7559-7576, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36395133

ABSTRACT

In the semi-supervised skeleton-based action recognition task, obtaining more discriminative information from both labeled and unlabeled data is a challenging problem. As the current mainstream approach, contrastive learning can learn more representations of augmented data, which can be considered the pretext task of action recognition. However, such a method still confronts three main limitations: 1) it usually learns global-granularity features that cannot well reflect local motion information; 2) the positive/negative pairs are usually pre-defined, some of which are ambiguous; 3) it generally measures the distance between positive/negative pairs only within the same granularity, which neglects the contrast between cross-granularity positive and negative pairs. To address these limitations, we propose a novel Multi-granularity Anchor-Contrastive representation Learning (dubbed MAC-Learning) to learn multi-granularity representations by conducting inter- and intra-granularity contrastive pretext tasks on the learnable and structural-link skeletons among three types of granularities covering local, context, and global views. To avoid the disturbance of ambiguous pairs from noisy and outlier samples, we design a more reliable Multi-granularity Anchor-Contrastive Loss (dubbed MAC-Loss) that measures the agreement/disagreement between high-confidence soft-positive/negative pairs based on the anchor graph instead of the hard-positive/negative pairs in the conventional contrastive loss. Extensive experiments on both NTU RGB+D and Northwestern-UCLA datasets show that the proposed MAC-Learning outperforms existing competitive methods in semi-supervised skeleton-based action recognition tasks.
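For contrast, the conventional contrastive loss with hard positive/negative pairs, the baseline that MAC-Loss replaces with high-confidence soft pairs from an anchor graph, can be sketched as:

```python
import numpy as np

def contrastive_loss(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style loss over L2-normalized feature vectors: pull the
    positive toward the anchor, push negatives away. This uses hard,
    pre-defined pairs; MAC-Loss instead weights soft pairs derived from
    an anchor graph."""
    pos = np.exp(anchor @ positive / tau)
    neg = np.exp(negatives @ anchor / tau).sum()
    return -np.log(pos / (pos + neg))

def unit(v):
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
a = unit(rng.normal(size=8))
negs = np.stack([unit(rng.normal(size=8)) for _ in range(4)])
loss = contrastive_loss(a, a, negs)  # positive identical to anchor
```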

15.
Article in English | MEDLINE | ID: mdl-37220053

ABSTRACT

Thanks to the advantages of friendly annotations and satisfactory performance, weakly-supervised semantic segmentation (WSSS) approaches have been extensively studied. Recently, single-stage WSSS (SS-WSSS) has attracted renewed attention as a way to alleviate the expensive computational costs and complicated training procedures of multistage WSSS. However, the results of such an immature model suffer from problems of background incompleteness and object incompleteness. We empirically find that these are caused by the insufficiency of the global object context and the lack of local regional contents, respectively. Under these observations, we propose an SS-WSSS model with only image-level class label supervision, termed weakly supervised feature coupling network (WS-FCN), which can capture the multiscale context formed from adjacent feature grids and encode fine-grained spatial information from low-level features into high-level ones. Specifically, a flexible context aggregation (FCA) module is proposed to capture the global object context in different granular spaces. Besides, a semantically consistent feature fusion (SF2) module is proposed in a bottom-up, parameter-learnable fashion to aggregate the fine-grained local contents. Based on these two modules, WS-FCN is trained in a self-supervised, end-to-end fashion. Extensive experimental results on the challenging PASCAL VOC 2012 and MS COCO 2014 demonstrate the effectiveness and efficiency of WS-FCN, which achieves state-of-the-art results of 65.02% and 64.22% mIoU on the PASCAL VOC 2012 val and test sets, and 34.12% mIoU on the MS COCO 2014 val set, respectively. The code and weights have been released at: WS-FCN.

16.
Article in English | MEDLINE | ID: mdl-37713222

ABSTRACT

Text-based person search (TBPS) is a challenging task that aims to search pedestrian images with the same identity from an image gallery given a query text. In recent years, TBPS has made remarkable progress, and state-of-the-art (SOTA) methods achieve superior performance by learning local fine-grained correspondence between images and texts. However, most existing methods rely on explicitly generated local parts to model fine-grained correspondence between modalities, which is unreliable due to the lack of contextual information or the potential introduction of noise. Moreover, the existing methods seldom consider the information inequality problem between modalities caused by image-specific information. To address these limitations, we propose an efficient joint multilevel alignment network (MANet) for TBPS, which can learn aligned image/text feature representations between modalities at multiple levels, and realize fast and effective person search. Specifically, we first design an image-specific information suppression (ISS) module, which suppresses image background and environmental factors by relation-guided localization (RGL) and channel attention filtration (CAF), respectively. This module effectively alleviates the information inequality problem and realizes the alignment of information volume between images and texts. Second, we propose an implicit local alignment (ILA) module to adaptively aggregate all pixel/word features of image/text to a set of modality-shared semantic topic centers and implicitly learn the local fine-grained correspondence between modalities without additional supervision and cross-modal interactions. Also, a global alignment (GA) is introduced as a supplement to the local perspective. The cooperation of global and local alignment modules enables better semantic alignment between modalities. Extensive experiments on multiple databases demonstrate the effectiveness and superiority of our MANet.

17.
IEEE Trans Image Process ; 32: 6032-6046, 2023.
Article in English | MEDLINE | ID: mdl-37910422

ABSTRACT

Text-Image Person Re-identification (TIReID) aims to retrieve the image corresponding to the given text query from a pool of candidate images. Existing methods employ prior knowledge from single-modality pre-training to facilitate learning, but lack multi-modal correspondence information. Vision-Language Pre-training, such as CLIP (Contrastive Language-Image Pretraining), can address the limitation. However, CLIP falls short in capturing fine-grained information, thereby not fully leveraging its powerful capacity in TIReID. Besides, the popular explicit local matching paradigm for mining fine-grained information heavily relies on the quality of local parts and cross-modal inter-part interaction/guidance, leading to intra-modal information distortion and ambiguity problems. Accordingly, in this paper, we propose a CLIP-driven Fine-grained information excavation framework (CFine) to fully utilize the powerful knowledge of CLIP for TIReID. To transfer the multi-modal knowledge effectively, we conduct fine-grained information excavation to mine modality-shared discriminative details for global alignment. Specifically, we propose a multi-level global feature learning (MGF) module that fully mines the discriminative local information within each modality, thereby emphasizing identity-related discriminative clues through enhanced interaction between global image (text) and informative local patches (words). MGF generates a set of enhanced global features for later inference. Furthermore, we design cross-grained feature refinement (CFR) and fine-grained correspondence discovery (FCD) modules to establish cross-modal correspondence at both coarse and fine-grained levels (image-word, sentence-patch, word-patch), ensuring the reliability of informative local patches/words. CFR and FCD are removed during inference to optimize computational efficiency. Extensive experiments on multiple benchmarks demonstrate the superior performance of our method in TIReID.

18.
Article in English | MEDLINE | ID: mdl-37995167

ABSTRACT

This article proposes a new hashing framework named relational consistency induced self-supervised hashing (RCSH) for large-scale image retrieval. To capture the potential semantic structure of data, RCSH explores the relational consistency between data samples in different spaces, which learns reliable data relationships in the latent feature space and then preserves the learned relationships in the Hamming space. The data relationships are uncovered by learning a set of prototypes that group similar data samples in the latent feature space. By uncovering the semantic structure of the data, meaningful data-to-prototype and data-to-data relationships are jointly constructed. The data-to-prototype relationships are captured by constraining the prototype assignments generated from different augmented views of an image to be the same. Meanwhile, these data-to-prototype relationships are preserved to learn informative compact hash codes by matching them with these reliable prototypes. To accomplish this, a novel dual prototype contrastive loss is proposed to maximize the agreement of prototype assignments in the latent feature space and Hamming space. The data-to-data relationships are captured by enforcing the distribution of pairwise similarities in the latent feature space and Hamming space to be consistent, which makes the learned hash codes preserve meaningful similarity relationships. Extensive experimental results on four widely used image retrieval datasets demonstrate that the proposed method significantly outperforms the state-of-the-art methods. Besides, the proposed method achieves promising performance in out-of-domain retrieval tasks, which shows its good generalization ability. The source code and models are available at https://github.com/IMAG-LuJin/RCSH.

19.
Materials (Basel) ; 16(22)2023 Nov 20.
Article in English | MEDLINE | ID: mdl-38005173

ABSTRACT

Alite dissolution plays a crucial role in cement hydration. However, quantitative investigations into alite powder dissolution are limited, especially regarding the influence of chemical admixtures. This study investigates the impact of particle size, temperature, saturation level, and mixing speed on the alite powder dissolution rate, considering the real-time evolution of specific surface area during the alite powder dissolution process. Furthermore, the study delves into the influence of two organic toughening agents, chitosan oligosaccharide (COS) and anionic/non-ionic polyester-based polyurethane (PU), on the kinetics of alite powder dissolution. The results demonstrate a specific-surface-area change formula during alite powder dissolution: S/S0 = 0.348·exp((1 - m/m0)/0.085) + 0.651. Notably, the temperature and saturation level significantly affect dissolution rates, whereas the effect of particle size is more complicated. COS shows dosage-dependent effects on alite dissolution, acting through both its acidic nature and surface coverage. On the other hand, PU inhibits alite dissolution by blocking the active sites of alite through electrostatic adsorption, which is particularly evident at high temperatures.
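The specific-surface-area relation can be transcribed as a small function. This reading of the flattened in-text expression, S/S0 = 0.348·exp((1 - m/m0)/0.085) + 0.651 with m/m0 the remaining mass fraction, is an interpretation and should be checked against the paper:

```python
from math import exp

def specific_surface_ratio(mass_ratio):
    """S/S0 as a function of the remaining mass fraction m/m0 during alite
    powder dissolution: S/S0 = 0.348 * exp((1 - m/m0) / 0.085) + 0.651.
    At m/m0 = 1 (no dissolution) the ratio is 0.348 + 0.651 ~= 1,
    which is a consistency check on this reading of the formula."""
    return 0.348 * exp((1.0 - mass_ratio) / 0.085) + 0.651

print(round(specific_surface_ratio(1.0), 3))  # 0.999
```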

20.
IEEE Trans Image Process ; 32: 4341-4354, 2023.
Article in English | MEDLINE | ID: mdl-37490376

ABSTRACT

The visual feature pyramid has shown its superiority in both effectiveness and efficiency in a variety of applications. However, current methods overly focus on inter-layer feature interactions while disregarding the importance of intra-layer feature regulation. Despite some attempts to learn a compact intra-layer feature representation with the use of attention mechanisms or vision transformers, they overlook the crucial corner regions that are essential for dense prediction tasks. To address this problem, we propose a Centralized Feature Pyramid (CFP) network for object detection, which is based on a globally explicit centralized feature regulation. Specifically, we first propose a spatially explicit visual center scheme, where a lightweight MLP is used to capture the global long-range dependencies, and a parallel learnable visual center mechanism is used to capture the local corner regions of the input images. Based on this, we then propose a globally centralized regulation for the commonly used feature pyramid in a top-down fashion, where the explicit visual center information obtained from the deepest intra-layer feature is used to regulate frontal shallow features. Compared to existing feature pyramids, CFP not only captures global long-range dependencies but also efficiently obtains an all-round yet discriminative feature representation. Experimental results on the challenging MS-COCO dataset validate that our proposed CFP can achieve consistent performance gains on the state-of-the-art YOLOv5 and YOLOX object detection baselines.
