Search | BVS España Search Portal

1.

A model of tertiary lymphatic structure-related prognosis for penile squamous cell carcinoma.

Tang, Han; Su, Zhengwei; Huang, Qingming; Li, Yongpeng; Chen, Rongchao; Ban, Chengjie; Liu, Chanzhen; Lu, Haoyuan; Yi, Xian-Lin; Tang, Yong.

BMC Urol ; 24(1): 165, 2024 Aug 02.

Article in English | MEDLINE | ID: mdl-39090582

ABSTRACT

BACKGROUND: We investigated the feasibility of the tertiary lymphoid structure (TLS) as a prognostic marker for penile squamous cell carcinoma (SCC). METHODS: We retrospectively collected data from 83 patients with penile squamous cell carcinoma. H&E-stained slides were reviewed for TLS density. In addition, clinical parameters were analyzed: their prognostic value for overall survival (OS) was evaluated using Kaplan-Meier survival curves, and the prognostic value of influencing factors was evaluated using Cox multifactor design nomogram analysis. RESULT: BMI, T, N, and M are significant in the survival curve with or without tertiary lymphoid structure. BMI, T, N, M and TLS were used to construct a prognostic model for penile squamous cell carcinoma, and the prediction accuracy reached a consensus of 0.884 (0.835-0.932), and the decision consensus reached 0.581 (0.508-0.655). CONCLUSION: TLS may be a positive prognostic factor for penile squamous cell carcinoma, and the combination of BMI, T, N and M can better evaluate the prognosis of patients.

Subject(s)

Carcinoma, Squamous Cell , Penile Neoplasms , Tertiary Lymphoid Structures , Male , Penile Neoplasms/pathology , Penile Neoplasms/mortality , Humans , Carcinoma, Squamous Cell/pathology , Carcinoma, Squamous Cell/mortality , Prognosis , Retrospective Studies , Middle Aged , Aged , Tertiary Lymphoid Structures/pathology , Adult , Survival Rate
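
The Kaplan-Meier step named in the METHODS of this record can be illustrated with a minimal sketch of the product-limit estimator, S(t) = Π (1 − d_i / n_i) over event times up to t. The function name and the follow-up values below are made-up toy data for illustration only; they are not taken from the study, and the sketch does not reproduce the authors' analysis.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time of each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve


# Toy data (hypothetical): months of follow-up and event indicators.
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
for t, s in toy_curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why the estimator suits retrospective cohorts with incomplete follow-up.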

2.

A 9-10-Bit Adjustable and Energy-Efficient Switching Scheme for Successive Approximation Register Analog-to-Digital Converter with One Least Significant Bit Common-Mode Voltage Variation.

Hu, Yunfeng; Chen, Chaoyi; Hu, Lexing; Huang, Qingming; Tang, Bin; Hu, Mengsi; Yuan, Bingbing; Wu, Zhaohui; Li, Bin.

Sensors (Basel) ; 24(11), 2024 May 21.

Article in English | MEDLINE | ID: mdl-38894065

ABSTRACT

A 9-10-bit adjustable and energy-efficient switching scheme for SAR ADC with one-LSB common-mode voltage variation is proposed. Based on capacitor-splitting technology and common-mode conversion techniques, the proposed switching scheme reduces the DAC switching energy by 96.41% compared to the conventional scheme. The low complexity and the one-LSB common-mode voltage offset of this scheme benefit from the simultaneous switching of the reference voltages of the capacitors corresponding to the positive array and the negative array throughout the entire reference voltage switching process, and the reference voltage of each capacitor in the scheme does not change more than two voltages. The post-layout result shows that the ADC achieves a 54.96 dB SNDR, a 61.73 dB SFDR, and 0.67 µW power consumption in the 10-bit mode and a 48.33 dB SNDR, a 54.17 dB SFDR, and 0.47 µW power consumption in the 9-bit mode in a 180 nm process with a 100 kS/s sampling frequency.

3.

[Leonurine inhibits ferroptosis in renal tubular epithelial cells by activating p62/Nrf2/HO-1 signaling pathway].

Wu, Ai-Jun; Chen, Nai-Qing; Huang, Li-Hua; Cheng, Ran; Wang, Xiao-Wan; Li, Chuang; Mao, Wei; Huang, Qing-Ming; Xu, Peng; Tian, Rui-Min.

Zhongguo Zhong Yao Za Zhi ; 48(8): 2176-2183, 2023 Apr.

Article in Chinese | MEDLINE | ID: mdl-37282905

ABSTRACT

To investigate the protective effect and the potential mechanism of leonurine (Leo) against erastin-induced ferroptosis in human renal tubular epithelial cells (HK-2 cells), an in vitro erastin-induced ferroptosis model was constructed to detect the cell viability as well as the expressions of ferroptosis-related indexes and signaling pathway-related proteins. HK-2 cells were cultured in vitro, and the effects of Leo on the viability of HK-2 cells at 10, 20, 40, 60, 80 and 100 µmol·L⁻¹ were examined by CCK-8 assay to determine the safe dose range of Leo administration. A ferroptosis cell model was induced by erastin, a common ferroptosis inducer, and the appropriate concentrations were screened. CCK-8 assay was used to detect the effects of Leo (20, 40, 80 µmol·L⁻¹) and the positive drug ferrostatin-1 (Fer-1; 1, 2 µmol·L⁻¹) on the viability of ferroptosis model cells, and the changes of cell morphology were observed by phase contrast microscopy. Then, the optimal concentration of Leo was obtained by Western blot for nuclear factor erythroid 2-related factor 2 (Nrf2) activation, and transmission electron microscopy was further used to detect the characteristic microscopic morphological changes during ferroptosis. Flow cytometry was performed to detect reactive oxygen species (ROS), and the level of glutathione (GSH) was measured using a GSH assay kit. The expressions of glutathione peroxidase 4 (GPX4), p62, and heme oxygenase 1 (HO-1) in each group were quantified by Western blot. The results showed that Leo had no side effects on the viability of normal HK-2 cells in the concentration range of 10-100 µmol·L⁻¹. The viability of HK-2 cells decreased as the concentration of erastin increased, and 5 µmol·L⁻¹ erastin significantly induced ferroptosis in the cells. Compared with the model group, Leo dose-dependently increased cell viability and improved cell morphology, and 80 µmol·L⁻¹ Leo promoted the translocation of Nrf2 from the cytoplasm to the nucleus. Further studies revealed that Leo remarkably alleviated the characteristic microstructural damage of ferroptosis cells caused by erastin, inhibited the release of intracellular ROS, elevated GSH and GPX4, promoted the nuclear translocation of Nrf2, and significantly upregulated the expression of p62 and HO-1 proteins. In conclusion, Leo exerted a protective effect on erastin-induced ferroptosis in HK-2 cells, which might be associated with its anti-oxidative stress by activating p62/Nrf2/HO-1 signaling pathway.

Subject(s)

Ferroptosis , Humans , Reactive Oxygen Species/metabolism , NF-E2-Related Factor 2/genetics , NF-E2-Related Factor 2/metabolism , Signal Transduction , Epithelial Cells/metabolism , Glutathione

4.

Enhancement of Upconversion Luminescence by the Construction of a 3Yb-Er-Hf Sublattice Energy Cluster and Surface Defect Elimination.

Yu, Han; Lin, Mingming; Lin, Hang; Liu, Changwei; Zhang, Xinqi; Huang, Qingming.

Inorg Chem ; 61(13): 5405-5412, 2022 Apr 04.

Article in English | MEDLINE | ID: mdl-35306822

ABSTRACT

Nanotetragonal LiYF4:RE (Tm, Er, Ho) is a kind of excellent upconversion luminescence (UCL) material potentially used in many fields, while the enhancement of UC emission and regulation of luminescence lifetime are still a challenge. Herein, a strategy is reported to enhance UCL performance with the aid of the construction of a 3Yb-Er-Hf sublattice energy cluster with the introduction of Hf⁴⁺ and the interception of surface defect fluorescence quenching. UCL was obviously decreased by Hf⁴⁺ doping without surface defect elimination, but after the interception of surface defect quenching, UCL was dramatically enhanced more than 300-fold with an Er³⁺/Hf⁴⁺ mole ratio of 1:1. The contribution of UCL enhancement by the construction of a 3Yb-Er-Hf sublattice energy cluster is about 1.5 times that of the sample without energy cluster construction. Interestingly, the lifetime of UCL can also be regulated by this strategy. According to the results of systematic microstructure analyses and UCL performance behaviors examined by X-ray powder diffraction (XRD), small-angle X-ray scattering (SAXS), transmission electron microscopy (TEM), nuclear magnetic resonance (NMR), and fluorescence spectrophotometry (FS) methods, a possible mechanism of UCL enhancement is proposed. This work may be an inspiration for researchers to design and develop high-performance UCL nanomaterials.

5.

Regulation pore size distribution for facilitating malachite green removal on carbon foam.

Zhang, Xinqi; Wang, Kang; He, Chong; Lin, Yun; Hu, Hui; Huang, Qingming; Yu, Han; Zhou, Tianhua; Lin, Qilang.

Environ Res ; 213: 113715, 2022 Oct.

Article in English | MEDLINE | ID: mdl-35718166

ABSTRACT

Malachite green (MG) is widely used as a textile dye and an aquacultural biocide and has become a serious pollutant of drinking water, but effectively isolating and removing it from wastewater remains a challenge. Here we report a new strategy to prepare a carbon foam with tunable pore size distribution by a one-pot lava foam process. We find that uniform micropore size is beneficial to the formation of C-OH coordination on the pore surface, increasing MG adsorption rates via H⁺ ionization. As a result, carbon foam with uniform pore size distribution demonstrates an optimum MG removal efficiency of 1812 mg g⁻¹ and a higher partition coefficient of 3.02 mg g⁻¹ µM⁻¹, which is twice that of carbon foams with irregular pore size distribution. The adsorption of MG onto these adsorbents was found to be an endothermic monolayer chemical adsorption process, and the Gibbs free energy of the adsorption process was decreased obviously by regulating micropore size distribution. The experimental results are in good agreement with pseudo-second-order kinetic and Langmuir isotherm models. These results reveal that pore size distribution is the critical factor in MG removal by carbon foam. This work should inspire the design and development of high-efficiency adsorbents for dye removal.

Subject(s)

Carbon , Water Pollutants, Chemical , Adsorption , Hydrogen-Ion Concentration , Kinetics , Rosaniline Dyes

6.

Resonance Emission Enhancement (REE) for Narrow Band Red-Emitting A₂GeF₆:Mn⁴⁺ (A = Na, K, Rb, Cs) Phosphors Synthesized via a Precipitation-Cation Exchange Route.

Lian, Hongzhou; Huang, Qingming; Chen, Yeqing; Li, Kai; Liang, Sisi; Shang, Mengmeng; Liu, Manman; Lin, Jun.

Inorg Chem ; 56(19): 11900-11910, 2017 Oct 02.

Article in English | MEDLINE | ID: mdl-28926231

ABSTRACT

Narrow band red-emitting A₂GeF₆:Mn⁴⁺ (A = Na, K, Rb, Cs) phosphors were prepared through a two-step precipitation-cation exchange route using a K₂MnF₆ precursor as the Mn⁴⁺ source. The phase purity, morphology, and constituent were characterized by X-ray diffraction (XRD), scanning electron microscopy (SEM), X-ray photoelectron spectroscopy (XPS), and electron paramagnetic resonance (EPR) examination. Optical properties were investigated by photoluminescence spectra (PL and PLE) and high-resolution PL. A temperature-dependent PL examination was performed to investigate the electron-phonon coupling emission mechanism of Mn⁴⁺ in these alkali fluorogermanates. The PL data show that both ordered distribution and appropriate distance between Mn⁴⁺ ions are propitious for enhancement of the emission intensity. A resonance emission enhancement (REE) mechanism has been proposed to explain the intensity increment among these products. These phosphors present bright red emission under blue light (467 nm) illumination, among which Cs₂GeF₆:0.03Mn⁴⁺ exhibits the most excellent optical properties with a quantum yield (QY) of 93%. A WLED (white light-emitting diode) fabricated with a blend of commercial YAG:Ce³⁺ and this phosphor emits intense warm white light with low color temperature (CCT = 3385 K) and high color rendering index (Ra = 90.5), implying its potential application as a red phosphor in WLEDs.

7.

Upconversion effective enhancement by producing various coordination surroundings of rare-Earth ions.

Huang, Qingming; Yu, Han; Ma, En; Zhang, Xinqi; Cao, Wenbing; Yang, Chengang; Yu, Jianchang.

Inorg Chem ; 54(6): 2643-51, 2015 Mar 16.

Article in English | MEDLINE | ID: mdl-25723777

ABSTRACT

In this manuscript, we present a simple route to enhance upconversion (UC) emission by producing two different coordination sites of trivalent cations in a matrix material and adjusting crystal field asymmetry by Hf⁴⁺ co-doping. A cubic phase, Y3.2Al0.32Yb0.4Er0.08F12, with these structural characteristics was synthesized successfully by introducing a small ion (Al³⁺) into YF3. X-ray diffraction (XRD), nuclear magnetic resonance (NMR), transmission electron microscopy (TEM), X-ray photoelectron spectroscopy (XPS), and fluorescence spectrophotometry (FS) were employed for its crystalline structure and luminescent property analysis. As a result, the coordination environments of the rare-earth ions were varied more obviously than in a hexagonal NaYF4 matrix with the same Hf⁴⁺ co-doping concentration. In vertical comparison, the UC luminescent intensities of cubic Y3.2Al0.32Yb0.4Er0.08F12 were largely enhanced (∼32-80 times greater than that of different band emissions), while the maximum enhancement of hexagonal NaYF4 was by a factor of ∼12. According to our experimental results, the mechanism has been demonstrated involving the crystalline structure, crystal field asymmetry, luminescence lifetime, hypersensitive transition, and so on. The study may be helpful for the design and fabrication of high-performance UC materials.

Subject(s)

Metals, Rare Earth/chemistry , Crystallography, X-Ray , Luminescent Measurements , Models, Molecular , Molecular Conformation

8.

Mitigating Confounding Bias in Practical Recommender Systems With Partially Inaccessible Exposure Status.

Cao, Tianwei; Xu, Qianqian; Yang, Zhiyong; Huang, Qingming.

IEEE Trans Pattern Anal Mach Intell ; 46(2): 957-974, 2024 Feb.

Article in English | MEDLINE | ID: mdl-37878433

ABSTRACT

To improve user experience, recommender systems have been widely used on many online platforms. In these systems, recommendation models are typically learned from positive/negative feedback that is collected automatically. Notably, recommender systems are a little different from general supervised learning tasks. In recommender systems, there are some factors (e.g., previous recommendation models or operation strategies of an online platform) that determine which items can be exposed to each individual user. Normally, the previous exposure results are not only relevant to the instances' features (i.e., user or item), but also affect their feedback ratings, thus leading to confounding bias in the recommendation models. To mitigate this bias, researchers have already provided a variety of strategies. However, there are still two issues that are underappreciated: 1) previous debiased RS approaches cannot effectively capture recommendation-specific, exposure-specific and their common knowledge simultaneously; 2) the true exposure results of the user-item pairs are partially inaccessible, so there would be some noise if we used their observability to approximate them, as existing approaches do. Motivated by this, we develop a novel debiasing recommendation approach. More specifically, we first propose a mutual information-based counterfactual learning framework based on the causal relationship among the instance features, exposure status, and ratings. This framework can 1) capture recommendation-specific, exposure-specific and their common knowledge by explicitly modeling the relationship among the causal factors, and 2) achieve robustness towards partially inaccessible exposure results by a pairwise learning strategy. Under such a framework, we implement an optimizable loss function with theoretical analysis. By minimizing this loss, we expect to obtain an unbiased recommendation model that reflects the users' real interests. Meanwhile, we also prove that our loss function is robust to the partial inaccessibility of the exposure status. Finally, extensive experiments on public datasets demonstrate the superiority of our proposed method in boosting the recommendation performance.

9.

Uncertainty-boosted Robust Video Activity Anticipation.

Qi, Zhaobo; Wang, Shuhui; Zhang, Weigang; Huang, Qingming.

IEEE Trans Pattern Anal Mach Intell ; PP, 2024 Apr 29.

Article in English | MEDLINE | ID: mdl-38683715

ABSTRACT

Video activity anticipation aims to predict what will happen in the future, embracing a broad application prospect ranging from robot vision to autonomous driving. Despite the recent progress, the data uncertainty issue, reflected as the content evolution process and dynamic correlation in event labels, has been somewhat ignored. This reduces the model generalization ability and deep understanding of video content, leading to serious error accumulation and degraded performance. In this paper, we address the uncertainty learning problem and propose an uncertainty-boosted robust video activity anticipation framework, which generates uncertainty values to indicate the credibility of the anticipation results. The uncertainty value is used to derive a temperature parameter in the softmax function to modulate the predicted target activity distribution. To guarantee the distribution adjustment, we construct a reasonable target activity label representation by incorporating the activity evolution from the temporal class correlation and the semantic relationship. Moreover, we quantify the uncertainty into relative values by comparing the uncertainty among sample pairs and their temporal lengths. This relative strategy provides a more accessible way of modeling uncertainty than quantifying the absolute uncertainty values on the whole dataset. Experiments on multiple backbones and benchmarks show our framework achieves promising performance and better robustness/interpretability. Source codes are available at https://github.com/qzhb/UbRV2A.
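
The temperature-scaled softmax mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the temperature is derived from the uncertainty value, so the mapping `uncertainty_to_temperature` (temperature = 1 + uncertainty) is a hypothetical choice, as are all function names and the example logits.

```python
import math


def temperature_softmax(logits, temperature):
    """Softmax of logits divided by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def uncertainty_to_temperature(uncertainty):
    # Hypothetical mapping (assumption): more uncertainty gives a higher
    # temperature, which flattens the predicted distribution.
    return 1.0 + max(0.0, uncertainty)


example_logits = [2.0, 1.0, 0.1]
confident = temperature_softmax(example_logits, uncertainty_to_temperature(0.0))
uncertain = temperature_softmax(example_logits, uncertainty_to_temperature(4.0))
print("low uncertainty :", [round(p, 3) for p in confident])
print("high uncertainty:", [round(p, 3) for p in uncertain])
```

Raising the temperature leaves the ranking of activities unchanged but lowers the peak probability, so a high-uncertainty sample yields a less committed prediction.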

10.

CenterNet++ for Object Detection.

Duan, Kaiwen; Bai, Song; Xie, Lingxi; Qi, Honggang; Huang, Qingming; Tian, Qi.

IEEE Trans Pattern Anal Mach Intell ; 46(5): 3509-3521, 2024 May.

Article in English | MEDLINE | ID: mdl-38090835

ABSTRACT

There are two mainstream approaches for object detection: top-down and bottom-up. The state-of-the-art approaches are mainly top-down methods. In this paper, we demonstrate that bottom-up approaches show competitive performance compared with top-down approaches and have higher recall rates. Our approach, named CenterNet, detects each object as a triplet of keypoints (top-left and bottom-right corners and the center keypoint). We first group the corners according to some designed cues and confirm the object locations based on the center keypoints. The corner keypoints allow the approach to detect objects of various scales and shapes and the center keypoint reduces the confusion introduced by a large number of false-positive proposals. Our approach is an anchor-free detector because it does not need to define explicit anchor boxes. We adapt our approach to backbones with different structures, including 'hourglass'-like networks and 'pyramid'-like networks, which detect objects in single-resolution and multi-resolution feature maps, respectively. On the MS-COCO dataset, CenterNet with Res2Net-101 and Swin-Transformer achieve average precisions (APs) of 53.7% and 57.1%, respectively, outperforming all existing bottom-up detectors and achieving state-of-the-art performance. We also design a real-time CenterNet model, which achieves a good trade-off between accuracy and speed, with an AP of 43.6% at 30.5 frames per second (FPS).
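
The triplet idea in this abstract, where a box proposed by a corner pair is confirmed by a center keypoint, can be sketched as below. This is a simplified illustration under stated assumptions: the central region is taken as the middle 1/n of the box with n = 3, class labels and keypoint scores are ignored, and the function names and coordinates are hypothetical. The abstract itself does not give these details.

```python
def central_region(box, n=3):
    """Return the middle 1/n portion of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    width, height = x2 - x1, y2 - y1
    margin = (n - 1) / (2 * n)  # fraction trimmed from each side
    return (
        x1 + width * margin,
        y1 + height * margin,
        x2 - width * margin,
        y2 - height * margin,
    )


def confirm_boxes(candidate_boxes, center_keypoints, n=3):
    """Keep boxes whose central region contains at least one center keypoint."""
    kept = []
    for box in candidate_boxes:
        rx1, ry1, rx2, ry2 = central_region(box, n)
        if any(rx1 <= cx <= rx2 and ry1 <= cy <= ry2 for cx, cy in center_keypoints):
            kept.append(box)
    return kept


# Two boxes proposed from corner pairs; only the first has a center
# keypoint near its middle, so the second is treated as a false positive.
candidates = [(0, 0, 30, 30), (100, 100, 160, 160)]
centers = [(15, 15), (105, 105)]
confirmed = confirm_boxes(candidates, centers)
print(confirmed)
```

The check shows why the center keypoint reduces false positives: a corner pair that happens to match geometrically is discarded unless something object-like is detected near the middle of the box.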

11.

Improved Diversity-Promoting Collaborative Metric Learning for Recommendation.

Bao, Shilong; Xu, Qianqian; Yang, Zhiyong; He, Yuan; Cao, Xiaochun; Huang, Qingming.

IEEE Trans Pattern Anal Mach Intell ; PP, 2024 Jun 11.

Article in English | MEDLINE | ID: mdl-38861429

ABSTRACT

Collaborative Metric Learning (CML) has recently emerged as a popular method in recommendation systems (RS), closing the gap between metric learning and collaborative filtering. Following the convention of RS, existing practices exploit unique user representation in their model design. This paper focuses on a challenging scenario where a user has multiple categories of interests. Under this setting, the unique user representation might induce preference bias, especially when the item category distribution is imbalanced. To address this issue, we propose a novel method called Diversity-Promoting Collaborative Metric Learning (DPCML), with the hope of considering the commonly ignored minority interest of the user. The key idea behind DPCML is to introduce a set of multiple representations for each user in the system where users' preference toward an item is aggregated by taking the minimum item-user distance among their embedding set. Specifically, we instantiate two effective assignment strategies to explore a proper quantity of vectors for each user. Meanwhile, a Diversity Control Regularization Scheme (DCRS) is developed to accommodate the multi-vector representation strategy better. Theoretically, we show that DPCML could induce a smaller generalization error than traditional CML. Furthermore, we notice that CML-based approaches usually require negative sampling to reduce the heavy computational burden caused by the pairwise objective therein. In this paper, we reveal the fundamental limitation of the widely adopted hard-aware sampling from the One-Way Partial AUC (OPAUC) perspective and then develop an effective sampling alternative for the CML-based paradigm. Finally, comprehensive experiments over a range of benchmark datasets speak to the efficacy of DPCML.
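
The key idea stated in this abstract, scoring an item by the minimum distance to any vector in a user's embedding set, can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and hand-picked two-dimensional embeddings; the function names and all numbers are hypothetical, and the learning procedure, assignment strategies, and DCRS regularizer are not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def user_item_distance(user_vectors, item_vector):
    """Minimum distance between the item and any of the user's vectors."""
    return min(euclidean(u, item_vector) for u in user_vectors)


def rank_items(user_vectors, items):
    """Item ids sorted from closest (most preferred) to farthest."""
    return sorted(items, key=lambda item_id: user_item_distance(user_vectors, items[item_id]))


# A user with two distinct interests, and three items (toy embeddings).
multi_vector_user = [[0.0, 0.0], [10.0, 10.0]]
single_vector_user = [[5.0, 5.0]]  # one averaged representation
toy_items = {"a": [0.5, 0.0], "b": [5.0, 5.0], "c": [10.0, 9.0]}

multi_ranking = rank_items(multi_vector_user, toy_items)
single_ranking = rank_items(single_vector_user, toy_items)
print("multi-vector :", multi_ranking)
print("single-vector:", single_ranking)
```

With a single averaged vector, the item lying between the two interests ranks first; with the embedding set, items near either interest rank ahead of it, which is the preference-bias effect the abstract describes.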

12.

Multi-Granularity Contrastive Cross-Modal Collaborative Generation for End-to-End Long-Term Video Question Answering.

Yu, Ting; Fu, Kunhao; Zhang, Jian; Huang, Qingming; Yu, Jun.

IEEE Trans Image Process ; 33: 3115-3129, 2024.

Article in English | MEDLINE | ID: mdl-38656836

ABSTRACT

Long-term Video Question Answering (VideoQA) is a challenging vision-and-language bridging task focusing on semantic understanding of untrimmed long-term videos and diverse free-form questions, simultaneously emphasizing comprehensive cross-modal reasoning to yield precise answers. The canonical approaches often rely on off-the-shelf feature extractors to detour the expensive computation overhead, but often result in domain-independent modality-unrelated representations. Furthermore, the inherent gradient blocking between unimodal comprehension and cross-modal interaction hinders reliable answer generation. In contrast, recent emerging successful video-language pre-training models enable cost-effective end-to-end modeling but fall short in domain-specific ratiocination and exhibit disparities in task formulation. Toward this end, we present an entirely end-to-end solution for long-term VideoQA: Multi-granularity Contrastive cross-modal collaborative Generation (MCG) model. To derive discriminative representations possessing high visual concepts, we introduce Joint Unimodal Modeling (JUM) on a clip-bone architecture and leverage Multi-granularity Contrastive Learning (MCL) to harness the intrinsically or explicitly exhibited semantic correspondences. To alleviate the task formulation discrepancy problem, we propose a Cross-modal Collaborative Generation (CCG) module to reformulate VideoQA as a generative task instead of the conventional classification scheme, empowering the model with the capability for cross-modal high-semantic fusion and generation so as to rationalize and answer. Extensive experiments conducted on six publicly available VideoQA datasets underscore the superiority of our proposed method.

13.

Algorithm-Dependent Generalization of AUPRC Optimization: Theory and Algorithm.

Wen, Peisong; Xu, Qianqian; Yang, Zhiyong; He, Yuan; Huang, Qingming.

IEEE Trans Pattern Anal Mach Intell ; 46(7): 5062-5079, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38315603

ABSTRACT

Stochastic optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning. Despite extensive studies on AUPRC optimization, generalization is still an open problem. In this work, we present the first trial in the algorithm-dependent generalization of stochastic AUPRC optimization. The obstacles to our destination are three-fold. First, according to the consistency analysis, the majority of existing stochastic estimators are biased with biased sampling strategies. To address this issue, we propose a stochastic estimator with sampling-rate-invariant consistency and reduce the consistency error by estimating the full-batch scores with score memory. Second, standard techniques for algorithm-dependent generalization analysis cannot be directly applied to listwise losses. To fill this gap, we extend the model stability from instance-wise losses to listwise losses. Third, AUPRC optimization involves a compositional optimization problem, which brings complicated computations. In this work, we propose to reduce the computational complexity by matrix spectral decomposition. Based on these techniques, we derive the first algorithm-dependent generalization bound for AUPRC optimization. Motivated by theoretical results, we propose a generalization-induced learning framework, which improves the AUPRC generalization by equivalently increasing the batch size and the number of valid training examples. Practically, experiments on image retrieval and long-tailed classification speak to the effectiveness and soundness of our framework.
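
For orientation, AUPRC is commonly estimated on a finite sample by average precision: the mean, over positive examples, of the precision at each positive's rank. The sketch below computes that plain evaluation metric on toy scores. It is not the stochastic estimator or the learning framework proposed in the paper, and the function name and data are illustrative assumptions.

```python
def average_precision(scores, labels):
    """Average precision of a ranking induced by scores.

    scores : predicted score per example (higher means more positive)
    labels : 1 for positive examples, 0 for negative examples
    """
    num_positive = sum(labels)
    if num_positive == 0:
        raise ValueError("at least one positive example is required")
    # Rank examples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, index in enumerate(order, start=1):
        if labels[index] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / num_positive


toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_labels = [1, 0, 1, 0]
ap_value = average_precision(toy_scores, toy_labels)
print(round(ap_value, 4))
```

Each positive's contribution depends on the ranks of all other examples, which illustrates why the abstract treats AUPRC as a listwise loss instead of an instance-wise one.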

14.

SMART: Syntax-Calibrated Multi-Aspect Relation Transformer for Change Captioning.

Tu, Yunbin; Li, Liang; Su, Li; Zha, Zheng-Jun; Huang, Qingming.

IEEE Trans Pattern Anal Mach Intell ; 46(7): 4926-4943, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38349824

ABSTRACT

Change captioning aims to describe the semantic change between two similar images. In this process, as the most typical distractor, viewpoint change leads to pseudo changes in the appearance and position of objects, thereby overwhelming the real change. Besides, since the visual signal of change appears in a local region with weak features, it is difficult for the model to directly translate the learned change features into the sentence. In this paper, we propose a syntax-calibrated multi-aspect relation transformer to learn effective change features under different scenes, and build reliable cross-modal alignment between the change features and linguistic words during caption generation. Specifically, a multi-aspect relation learning network is designed to 1) explore the fine-grained changes under irrelevant distractors (e.g., viewpoint change) by embedding the relations of semantics and relative position into the features of each image; 2) learn two view-invariant image representations by strengthening their global contrastive alignment relation, so as to help capture a stable difference representation; 3) provide the model with the prior knowledge about whether and where the semantic change happened by measuring the relation between the representations of captured difference and the image pair. In this way, the model can learn effective change features for caption generation. Further, we introduce the syntax knowledge of Part-of-Speech (POS) and devise a POS-based visual switch to calibrate the transformer decoder. The POS-based visual switch dynamically utilizes visual information during different word generation based on the POS of words. This enables the decoder to build reliable cross-modal alignment, so as to generate a high-level linguistic sentence about change. Extensive experiments show that the proposed method achieves state-of-the-art performance on three public datasets.

15.

Fine-Grained Accident Detection: Database and Algorithm.

Yu, Hongyang; Zhang, Xinfeng; Wang, Yaowei; Huang, Qingming; Yin, Baocai.

IEEE Trans Image Process ; 33: 1059-1069, 2024.

Article in English | MEDLINE | ID: mdl-38265894

ABSTRACT

This paper presents a novel fine-grained task for traffic accident analysis. Accident detection in surveillance or dashcam videos is a common task in the field of traffic accident analysis by using videos. However, common accident detection does not analyze the specific particulars of the accident, only identifies the accident's existence or occurrence time in a video. In this paper, we define the novel fine-grained accident detection task which contains fine-grained accident classification, temporal-spatial occurrence region localization, and accident severity estimation. A transformer-based framework combining the RGB and optical flow information of videos is proposed for fine-grained accident detection. Additionally, we introduce a challenging Fine-grained Accident Detection (FAD) database that covers multiple tasks in surveillance videos which places more emphasis on the overall perspective. Experimental results demonstrate that our model could effectively extract the video features for multiple tasks, indicating that current traffic accident analysis has limitations in dealing with the FAD task and that further research is indeed needed.

16.

Learning Hierarchical Modular Networks for Video Captioning.

Li, Guorong; Ye, Hanhua; Qi, Yuankai; Wang, Shuhui; Qing, Laiyun; Huang, Qingming; Yang, Ming-Hsuan.

IEEE Trans Pattern Anal Mach Intell ; 46(2): 1049-1064, 2024 Feb.

Article in English | MEDLINE | ID: mdl-37878438

ABSTRACT

Video captioning aims to generate natural language descriptions for a given video clip. Existing methods mainly focus on end-to-end representation learning via word-by-word comparison between predicted captions and ground-truth texts. Although significant progress has been made, such supervised approaches neglect semantic alignment between visual and linguistic entities, which may negatively affect the generated captions. In this work, we propose a hierarchical modular network to bridge video representations and linguistic semantics at four granularities before generating captions: entity, verb, predicate, and sentence. Each level is implemented by one module to embed corresponding semantics into video representations. Additionally, we present a reinforcement learning module based on the scene graph of captions to better measure sentence similarity. Extensive experimental results show that the proposed method performs favorably against the state-of-the-art models on three widely used benchmark datasets, including the Microsoft Research Video Description Corpus (MSVD), MSR-Video to Text (MSR-VTT), and Video-and-TEXt (VATEX).

17.

Stereo Image Restoration via Attention-Guided Correspondence Learning.

Zhang, Shengping; Yu, Wei; Jiang, Feng; Nie, Liqiang; Yao, Hongxun; Huang, Qingming; Tao, Dacheng.

IEEE Trans Pattern Anal Mach Intell ; 46(7): 4850-4865, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38261483

ABSTRACT

Although stereo image restoration has been extensively studied, most existing work focuses on restoring stereo images with limited horizontal parallax due to the binocular symmetry constraint. Stereo images with unlimited parallax (e.g., large ranges and asymmetrical types) are more challenging in real-world applications and have rarely been explored so far. To restore high-quality stereo images with unlimited parallax, this paper proposes an attention-guided correspondence learning method, which learns both self- and cross-views feature correspondence guided by parallax and omnidirectional attention. To learn cross-view feature correspondence, a Selective Parallax Attention Module (SPAM) is proposed to interact with cross-view features under the guidance of parallax attention that adaptively selects receptive fields for different parallax ranges. Furthermore, to handle asymmetrical parallax, we propose a Non-local Omnidirectional Attention Module (NOAM) to learn the non-local correlation of both self- and cross-view contexts, which guides the aggregation of global contextual features. Finally, we propose an Attention-guided Correspondence Learning Restoration Network (ACLRNet) upon SPAMs and NOAMs to restore stereo images by associating the features of two views based on the learned correspondence. Extensive experiments on five benchmark datasets demonstrate the effectiveness and generalization of the proposed method on three stereo image restoration tasks including super-resolution, denoising, and compression artifact reduction.

18.

Inductive State-Relabeling Adversarial Active Learning with Heuristic Clique Rescaling.

Zhang, Beichen; Li, Liang; Wang, Shuhui; Cai, Shaofei; Zha, Zheng-Jun; Tian, Qi; Huang, Qingming.

IEEE Trans Pattern Anal Mach Intell ; PP, 2024 Jul 23.

Article in English | MEDLINE | ID: mdl-39042533

ABSTRACT

Active learning (AL) aims to design label-efficient algorithms by labeling the most representative samples. It reduces annotation cost and has attracted increasing attention from the community. However, previous AL methods suffer from inadequate annotations and unreliable uncertainty estimation. Moreover, we find that they ignore the intra-diversity of selected samples, which leads to sampling redundancy. In view of these challenges, we propose an inductive state-relabeling adversarial AL model (ISRA) that consists of a unified representation generator, an inductive state-relabeling discriminator, and a heuristic clique rescaling module. The generator introduces contrastive learning to leverage unlabeled samples for self-supervised training, where mutual information is utilized to improve the representation quality for AL selection. Then, we design an inductive uncertainty indicator to learn the state score from labeled data and relabel unlabeled data with different importance for better discrimination of instructive samples. To solve the problem of sampling redundancy, the heuristic clique rescaling module measures the intra-diversity of candidate samples and recurrently rescales them to select the most informative samples. Experiments conducted on eight datasets and two imbalanced scenarios show that our model outperforms the previous state-of-the-art AL methods. As an extension to the cross-modal AL task, we apply ISRA to image captioning, where it also achieves superior performance.

19.

Attribute Prototype-guided Iterative Scene Graph for Explainable Radiology Report Generation.

Zhang, Ke; Yang, Yan; Yu, Jun; Fan, Jianping; Jiang, Hanliang; Huang, Qingming; Han, Weidong.

IEEE Trans Med Imaging ; PP, 2024 Jul 08.

Article in English | MEDLINE | ID: mdl-38976466

ABSTRACT

The potential of automated radiology report generation in alleviating the time-consuming tasks of radiologists is increasingly being recognized in medical practice. Existing report generation methods have evolved from using image-level features to the latest approach of utilizing anatomical regions, significantly enhancing interpretability. However, directly and simplistically using region features for report generation compromises the capability of relation reasoning and overlooks the common attributes potentially shared across regions. To address these limitations, we propose a novel region-based Attribute Prototype-guided Iterative Scene Graph generation framework (AP-ISG) for report generation, utilizing scene graph generation as an auxiliary task to further enhance interpretability and relational reasoning capability. The core components of AP-ISG are the Iterative Scene Graph Generation (ISGG) module and the Attribute Prototype-guided Learning (APL) module. Specifically, ISGG employs an autoregressive scheme for structural edge reasoning and a contextualization mechanism for relational reasoning. APL enhances intra-prototype matching and reduces inter-prototype semantic overlap in the visual space to fully model the potential attribute commonalities among regions. Extensive experiments on the MIMIC-CXR with Chest ImaGenome datasets demonstrate the superiority of AP-ISG across multiple metrics.

20.

Sequential Manipulation Against Rank Aggregation: Theory and Algorithm.

Ma, Ke; Xu, Qianqian; Zeng, Jinshan; Liu, Wei; Cao, Xiaochun; Sun, Yingfei; Huang, Qingming.

IEEE Trans Pattern Anal Mach Intell ; PP, 2024 Jun 19.

Article in English | MEDLINE | ID: mdl-38896521

ABSTRACT

Rank aggregation with pairwise comparisons is widely encountered in sociology, politics, economics, psychology, sports, etc. Given the enormous social impact and the consequent incentives, a potential adversary has a strong motivation to manipulate the ranking list. However, existing methods are impractical because they assume an ideal attack opportunity and excessive adversarial capability. To fully explore the potential risks, we leverage an online attack on the vulnerable data collection process. Since it is independent of rank aggregation and lacks effective protection mechanisms, we disrupt the data collection process by fabricating pairwise comparisons without knowledge of the future data or the true distribution. From the game-theoretic perspective, the confrontation scenario between the online manipulator and the ranker who takes control of the original data source is formulated as a distributionally robust game that deals with the uncertainty of knowledge. Then we demonstrate that the equilibrium in the above game is potentially favorable to the adversary by analyzing the vulnerability of sampling algorithms such as the Bernoulli and reservoir methods. According to the above theoretical analysis, different sequential manipulation policies are proposed under a Bayesian decision framework and a large class of parametric pairwise comparison models. For attackers with complete knowledge, we establish the asymptotic optimality of the proposed policies. To increase the success rate of the sequential manipulation with incomplete knowledge, a distributionally robust estimator, which replaces the maximum likelihood estimation in a saddle point problem, provides a conservative data generation solution. Finally, the corroborating empirical evidence shows that the proposed method manipulates the results of rank aggregation methods in a sequential manner.
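
For readers unfamiliar with the two sampling schemes named in the abstract above, the following is a minimal sketch of textbook Bernoulli sampling and reservoir sampling (Algorithm R) applied to a stream of pairwise comparisons. It is not taken from the paper and does not implement its attack or its estimator; the function names, the `(winner, loser)` tuple format, and the parameters `p` and `k` are assumptions chosen for illustration only.

```python
import random


def bernoulli_sample(stream, p, rng):
    """Keep each streamed item independently with probability p."""
    return [item for item in stream if rng.random() < p]


def reservoir_sample(stream, k, rng):
    """Algorithm R: uniform sample of k items from a stream of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream of pairwise comparisons, stored as (winner, loser).
comparisons = [(a, b) for a in range(5) for b in range(5) if a != b]
rng = random.Random(0)
kept_bernoulli = bernoulli_sample(comparisons, 0.3, rng)
kept_reservoir = reservoir_sample(comparisons, 8, rng)
```

Both schemes decide whether to keep an item at the moment it arrives, without knowing what comes later, which is the online setting the abstract describes for the data collection process.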
