Results 1 - 13 of 13
1.
IEEE Trans Image Process ; 33: 2759-2769, 2024.
Article in English | MEDLINE | ID: mdl-38530734

ABSTRACT

Video Visual Relation Detection (VidVRD) aims to detect visual relationship triplets in videos using spatial bounding boxes and temporal boundaries. Existing VidVRD methods can be broadly categorized into bottom-up and top-down paradigms, depending on their approach to classifying relations. Bottom-up methods follow a clip-based approach where they classify relations of short clip tubelet pairs and then merge them into long video relations. On the other hand, top-down methods directly classify long video tubelet pairs. While recent video-based methods utilizing video tubelets have shown promising results, we argue that the effective modeling of spatial and temporal context plays a more significant role than the choice between clip tubelets and video tubelets. This motivates us to revisit the clip-based paradigm and explore the key success factors in VidVRD. In this paper, we propose a Hierarchical Context Model (HCM) that enriches the object-based spatial context and relation-based temporal context based on clips. We demonstrate that using clip tubelets can achieve superior performance compared to most video-based methods. Additionally, using clip tubelets offers more flexibility in model designs and helps alleviate the limitations associated with video tubelets, such as the challenging long-term object tracking problem and the loss of temporal information in long-term tubelet feature compression. Extensive experiments conducted on two challenging VidVRD benchmarks validate that our HCM achieves a new state-of-the-art performance, highlighting the effectiveness of incorporating advanced spatial and temporal context modeling within the clip-based paradigm.
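The clip-based merging step mentioned above can be illustrated with a minimal sketch: clip-level relation predictions that share the same (subject, predicate, object) triplet and are temporally adjacent are greedily merged into video-level relation segments. The data layout and the averaged score below are illustrative assumptions, not the HCM implementation.

```python
# Hypothetical sketch: greedy temporal merging of clip-level relation
# predictions into video-level relation segments, in the spirit of the
# clip-based VidVRD paradigm. Field names and scoring are assumptions.
from collections import defaultdict

def merge_clip_relations(clip_preds):
    """clip_preds: list of dicts with keys 'clip' (index), 'subj', 'pred',
    'obj', 'score'. Returns video-level relations with (start, end) clip spans."""
    by_triplet = defaultdict(list)
    for p in clip_preds:
        by_triplet[(p['subj'], p['pred'], p['obj'])].append(p)

    merged = []
    for triplet, preds in by_triplet.items():
        preds.sort(key=lambda p: p['clip'])
        start = end = preds[0]['clip']
        scores = [preds[0]['score']]
        for p in preds[1:]:
            if p['clip'] <= end + 1:           # temporally adjacent or overlapping
                end = p['clip']
                scores.append(p['score'])
            else:                              # gap: close the current segment
                merged.append({'triplet': triplet, 'span': (start, end),
                               'score': sum(scores) / len(scores)})
                start, end, scores = p['clip'], p['clip'], [p['score']]
        merged.append({'triplet': triplet, 'span': (start, end),
                       'score': sum(scores) / len(scores)})
    return merged

# Example: two adjacent clips with the same triplet merge into one relation.
print(merge_clip_relations([
    {'clip': 0, 'subj': 'dog', 'pred': 'chase', 'obj': 'ball', 'score': 0.9},
    {'clip': 1, 'subj': 'dog', 'pred': 'chase', 'obj': 'ball', 'score': 0.8},
]))
```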

2.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 3388-3405, 2024 May.
Article in English | MEDLINE | ID: mdl-38090829

ABSTRACT

The training and inference of Graph Neural Networks (GNNs) are costly when scaling up to large-scale graphs. The Graph Lottery Ticket (GLT) presented the first attempt to accelerate GNN inference on large-scale graphs by jointly pruning the graph structure and the model weights. Though promising, GLT encounters robustness and generalization issues when deployed in real-world scenarios, which are also long-standing and critical problems in deep learning. In real-world scenarios, the distribution of unseen test data is typically diverse. We attribute the failures on out-of-distribution (OOD) data to an inability to discern causal patterns, which remain stable amidst distribution shifts. In traditional sparse graph learning, model performance deteriorates dramatically once graph/network sparsity exceeds a certain high level. Worse still, pruned GNNs struggle to generalize to unseen graph data due to the limited training set at hand. To tackle these issues, we propose the Resilient Graph Lottery Ticket (RGLT) to find more robust and generalizable GLTs in GNNs. Concretely, we reactivate a fraction of weights/edges using instantaneous gradient information at each pruning point. After sufficient pruning, we conduct environmental interventions to extrapolate potential test distributions. Finally, we perform model averaging over the last several rounds to further improve generalization. We provide multiple examples and theoretical analyses that underpin the universality and reliability of our proposal. Further, RGLT has been experimentally verified across various independent and identically distributed (IID) and out-of-distribution (OOD) graph benchmarks.
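The reactivation step can be sketched as follows: after magnitude pruning of a weight (or adjacency) matrix, a fraction of the pruned entries with the largest instantaneous gradient magnitude is switched back on. This is a minimal numpy illustration under assumed pruning and revival fractions, not the RGLT algorithm itself.

```python
# Minimal numpy sketch of magnitude pruning followed by gradient-based
# reactivation, illustrating the resilient lottery-ticket idea described
# above. Thresholds and fractions are illustrative assumptions.
import numpy as np

def prune_and_reactivate(weights, grads, prune_frac=0.2, revive_frac=0.05):
    """Zero out the smallest-magnitude weights, then re-enable pruned
    positions whose instantaneous gradient magnitude is largest."""
    flat_w = np.abs(weights).ravel()
    k_prune = int(prune_frac * flat_w.size)
    threshold = np.partition(flat_w, k_prune)[k_prune]
    mask = (np.abs(weights) >= threshold).astype(float)

    # Reactivation: among pruned entries, pick those with the largest gradients.
    pruned_idx = np.flatnonzero(mask.ravel() == 0)
    k_revive = int(revive_frac * weights.size)
    if k_revive > 0 and pruned_idx.size > 0:
        grad_mag = np.abs(grads).ravel()[pruned_idx]
        revive = pruned_idx[np.argsort(grad_mag)[-k_revive:]]
        mask.flat[revive] = 1.0
    return weights * mask, mask

rng = np.random.default_rng(0)
w, g = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))
sparse_w, mask = prune_and_reactivate(w, g)
print("kept fraction:", mask.mean())
```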

3.
Article in English | MEDLINE | ID: mdl-37478041

ABSTRACT

Sensors are the key to environmental monitoring, which benefits smart cities in many ways, such as providing real-time air quality information to assist human decision-making. However, it is impractical to deploy massive numbers of sensors due to their high cost, resulting in sparse data collection. Therefore, how to obtain fine-grained measurements has long been a pressing issue. In this article, we aim to infer values at non-sensor locations based on observations from available sensors (termed spatiotemporal inference), where capturing spatiotemporal relationships among the data plays a critical role. Our investigations reveal two significant insights that have not been explored by previous works. First, data exhibit distinct patterns at both long- and short-term temporal scales, which should be analyzed separately. Second, short-term patterns contain more delicate relations, including those across spatial and temporal dimensions simultaneously, while long-term patterns involve high-level temporal trends. Based on these observations, we propose to decouple the modeling of short- and long-term patterns. Specifically, we introduce a joint spatiotemporal graph attention network to learn the relations across space and time for short-term patterns. Furthermore, we propose a graph recurrent network with a time skip strategy to alleviate the gradient vanishing problem and model the long-term dependencies. Experimental results on four public real-world datasets demonstrate that our method effectively captures both long- and short-term relations, achieving state-of-the-art performance against existing methods.
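One possible reading of the time-skip strategy for long-term patterns is sketched below: a GRU processes the sequence step by step, and each hidden state is additionally fused with the state from `skip` steps earlier, shortening gradient paths. The module name, fusion layer, and dimensions are assumptions for illustration only, not the authors' exact model.

```python
# Hedged sketch of a "time skip" recurrent update: every `skip` steps the
# hidden state is fused with the state from `skip` steps back, which
# shortens gradient paths and helps with long-term dependencies.
import torch
import torch.nn as nn

class TimeSkipGRU(nn.Module):
    def __init__(self, in_dim, hid_dim, skip=4):
        super().__init__()
        self.cell = nn.GRUCell(in_dim, hid_dim)
        self.skip = skip
        self.mix = nn.Linear(2 * hid_dim, hid_dim)

    def forward(self, x):                      # x: (batch, time, in_dim)
        b, t, _ = x.shape
        h = x.new_zeros(b, self.cell.hidden_size)
        history = []
        for step in range(t):
            h = self.cell(x[:, step], h)
            if step >= self.skip:              # fuse with the state `skip` steps back
                h = torch.tanh(self.mix(torch.cat([h, history[step - self.skip]], dim=-1)))
            history.append(h)
        return torch.stack(history, dim=1)     # (batch, time, hid_dim)

out = TimeSkipGRU(in_dim=3, hid_dim=16)(torch.randn(2, 12, 3))
print(out.shape)  # torch.Size([2, 12, 16])
```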

4.
J Neurosci ; 31(41): 14542-50, 2011 Oct 12.
Article in English | MEDLINE | ID: mdl-21994371

ABSTRACT

Painful events in our environment are often accompanied by stimuli from other sensory modalities. These stimuli may influence the perception and processing of acute pain, in particular when they comprise emotional cues, like facial expressions of people surrounding us. In this whole-head magnetoencephalography (MEG) study, we examined the neuronal mechanisms underlying the influence of emotional (fearful, angry, or happy) compared to neutral facial expressions on the processing of pain in humans. Independent of their valence, subjective pain ratings for intracutaneous inputs were higher when pain stimuli were presented together with emotional facial expressions than when they were presented with a neutral facial expression. Source reconstruction using linear beamforming revealed pain-induced early (70-270 ms) oscillatory beta-band activity (BBA; 15-25 Hz) and gamma-band activity (GBA; 60-80 Hz) in the sensorimotor cortex. The presentation of faces with emotional expressions compared to faces with neutral expressions led to a stronger bilateral suppression of the pain-induced BBA, possibly reflecting enhanced response readiness of the sensorimotor system. Moreover, pain-induced GBA in the sensorimotor cortex was larger for faces expressing fear than for faces expressing anger, which might reflect the facilitation of avoidance-motivated behavior triggered by the concurrent presentation of faces with fearful expressions and painful stimuli. Thus, the presence of emotional cues, like facial expressions from people surrounding us, while receiving acute pain may facilitate neuronal processes involved in the preparation and execution of adequate protective motor responses.


Subject(s)
Brain Mapping , Brain Waves/physiology , Cerebral Cortex/physiopathology , Emotions , Facial Expression , Pain/pathology , Adult , Biological Clocks/physiology , Electroencephalography , Evoked Potentials, Somatosensory/physiology , Female , Humans , Magnetoencephalography , Male , Pain Measurement , Spectrum Analysis , Time Factors , Young Adult
5.
IEEE Trans Cybern ; 49(12): 4243-4252, 2019 Dec.
Article in English | MEDLINE | ID: mdl-30296245

ABSTRACT

This paper focuses on weakly supervised image understanding, in which semantic labels are available only at the image level, without specific object or scene locations within an image. Existing algorithms implicitly assume that image-level labels are error-free, which might be too restrictive. In practice, image labels obtained from pretrained predictors are easily contaminated. To solve this problem, we propose a novel algorithm for weakly supervised segmentation when only noisy image labels are available during training. More specifically, a semantic space is constructed first by encoding image labels through a graphlet (i.e., superpixel cluster) embedding process. Then, we observe that in the semantic space, the distribution of graphlets from images with the same label remains stable, regardless of noise in the image labels. Therefore, we propose a generative model, called latent stability analysis, to discover the stable patterns from images with noisy labels. Inferring graphlet semantics from these mid-level stable patterns is much more reliable and accurate than directly transferring noisy image-level labels onto different regions. Finally, we calculate the semantics of each superpixel by majority voting over its correlated graphlets. Comprehensive experimental results show that our algorithm performs impressively when the image labels are predicted by either hand-crafted or deeply learned image descriptors.
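The final voting step can be illustrated with a small sketch: every superpixel collects the labels inferred for the graphlets that contain it and keeps the majority label. The data structures are toy assumptions standing in for the paper's graphlet semantics.

```python
# Illustrative sketch of the final step described above: each superpixel
# takes the label that wins a majority vote among the graphlets that
# contain it. The (members, label) pairs are assumed example data.
from collections import Counter

def vote_superpixel_labels(graphlets):
    """graphlets: list of (member_superpixels, inferred_label) pairs."""
    votes = {}
    for members, label in graphlets:
        for sp in members:
            votes.setdefault(sp, Counter())[label] += 1
    return {sp: counter.most_common(1)[0][0] for sp, counter in votes.items()}

print(vote_superpixel_labels([
    ([1, 2, 3], 'sky'),
    ([2, 3, 4], 'sky'),
    ([3, 4, 5], 'building'),
]))   # superpixel 3 gets 'sky' (2 votes vs. 1)
```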

6.
IEEE Trans Cybern ; 49(6): 2156-2167, 2019 Jun.
Article in English | MEDLINE | ID: mdl-29993760

ABSTRACT

Accurately recognizing sophisticated sceneries from a rich variety of semantic categories is an indispensable component in many intelligent systems, e.g., scene parsing, video surveillance, and autonomous driving. Recently, a large number of deep architectures for scene categorization have emerged and achieved promising performance. However, these models cannot explicitly encode human visual perception of different sceneries, i.e., the sequence in which humans allocate their gaze across a scene. To solve this problem, we propose a deep gaze shifting kernel to distinguish sceneries from different categories. Specifically, we first project regions from each scenery into the so-called perceptual space, which is established by combining color, texture, and semantic features. Then, we develop a novel non-negative matrix factorization algorithm that decomposes the regions' feature matrix into the product of a basis matrix and sparse codes. The sparse codes indicate the saliency level of different regions. In this way, the gaze shifting path for each scenery is derived, and an aggregation-based convolutional neural network is designed accordingly to learn its deep representation. Finally, the deep representations of gaze shifting paths from all the scene images are incorporated into an image kernel, which is further fed into a kernel SVM for scene categorization. Comprehensive experiments on six scenery data sets have demonstrated the superiority of our method over a series of shallow/deep recognition models. In addition, eye-tracking experiments show that our predicted gaze shifting paths are 94.6% consistent with real human gaze allocations.
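A rough sketch of the saliency-ranking idea, substituting scikit-learn's standard NMF for the paper's custom factorization: the (regions x features) matrix is decomposed into a basis and sparse codes, and regions are ordered by the magnitude of their codes to form a gaze shifting path. Feature extraction is omitted and the inputs are random placeholders.

```python
# Minimal sketch: factorize a (regions x features) matrix with standard
# NMF (standing in for the paper's custom variant) and rank regions by
# the magnitude of their sparse codes to obtain a gaze shifting path.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
region_features = np.abs(rng.normal(size=(10, 64)))   # 10 regions, 64-dim perceptual features (placeholder)

nmf = NMF(n_components=5, init='nndsvda', max_iter=500, random_state=0)
codes = nmf.fit_transform(region_features)            # (regions x components) sparse codes

saliency = codes.sum(axis=1)                          # proxy saliency score per region
gaze_path = np.argsort(-saliency)                     # most salient region first
print("gaze shifting path over region indices:", gaze_path)
```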

7.
Proc Conf ; 2018: 2122-2132, 2018 Jun.
Article in English | MEDLINE | ID: mdl-32219222

ABSTRACT

Emotion recognition in conversations is crucial for the development of empathetic machines. Present methods mostly ignore the role of inter-speaker dependency relations while classifying emotions in conversations. In this paper, we address recognizing utterance-level emotions in dyadic conversational videos. We propose a deep neural framework, termed conversational memory network, which leverages contextual information from the conversation history. The framework takes a multimodal approach comprising audio, visual and textual features with gated recurrent units to model past utterances of each speaker into memories. Such memories are then merged using attention-based hops to capture inter-speaker dependencies. Experiments show an accuracy improvement of 3-4% over the state of the art.
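The memory-plus-attention mechanism can be sketched as a GRU that contextualizes a speaker's past utterances into memory slots, over which the current utterance attends in a single hop. The dimensions, residual fusion, and module name below are assumptions, not the published architecture.

```python
# Hedged sketch of the memory-plus-attention idea described above: a GRU
# summarizes one speaker's utterance history into memory slots, and the
# current utterance attends over them in one hop.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.encoder = nn.GRU(dim, dim, batch_first=True)

    def forward(self, history, query):
        # history: (batch, past_utterances, dim); query: (batch, dim)
        memories, _ = self.encoder(history)                       # contextualized memory slots
        scores = torch.bmm(memories, query.unsqueeze(-1)).squeeze(-1)
        attn = F.softmax(scores, dim=-1)                          # attention over past utterances
        read = torch.bmm(attn.unsqueeze(1), memories).squeeze(1)  # attended memory readout
        return query + read                                       # one attention "hop"

hop = MemoryAttention(dim=32)
print(hop(torch.randn(2, 6, 32), torch.randn(2, 32)).shape)  # torch.Size([2, 32])
```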

8.
Eur J Pain ; 10(8): 757-65, 2006 Nov.
Article in English | MEDLINE | ID: mdl-16439173

ABSTRACT

We investigated pain evoked activity in the human secondary somatosensory cortex (SII) following clonidine administration in six healthy volunteers using multi-channel magnetoencephalography (MEG). Pain was elicited by electrical shocks applied intracutaneously to the fingertip. Subjects rated pain intensity and perceptions of tiredness and passiveness on numerical ranking scales. Each subject underwent two investigations, one week apart, with clonidine doses of 1.5 or 3.0 µg/kg, administered intravenously in random order under double-blind conditions. We applied a total of seven blocks, each consisting of 60 painful stimuli: one adaptation block, one pre-medication block, four post-medication blocks and one recovery block at the end of the session. MEG data were analysed by dipole reconstruction using the CURRY® (Neuroscan, Hamburg) software package. Cortical activity in the contralateral SII cortex appeared with peak latencies of 118.5 ± 10 ms. This activity was significantly reduced by clonidine, in parallel with a reduction of pain intensity and an enhancement of subjective tiredness and passiveness. There was, however, no significant correlation between MEG and subjective effects. Although both clonidine doses had similar effects, the higher dose induced longer-lasting changes. The results indicate that intravenous clonidine is able to relieve pain, but the exact mechanism of clonidine at the level of the SII cortex remains unclear. It is possible that clonidine interacts with the ascending brainstem system regulating vigilance and arousal, which would explain the observed decrement of pain induced activity in SII. An additional, more specific analgesic action at the spinal level cannot be excluded.


Subject(s)
Adrenergic alpha-Agonists/administration & dosage , Clonidine/administration & dosage , Pain/physiopathology , Somatosensory Cortex/drug effects , Somatosensory Cortex/physiopathology , Adult , Dose-Response Relationship, Drug , Double-Blind Method , Electroencephalography , Female , Humans , Image Processing, Computer-Assisted , Infusions, Intravenous , Magnetoencephalography , Male
9.
Neurosci Lett ; 328(1): 29-32, 2002 Aug 02.
Article in English | MEDLINE | ID: mdl-12123852

ABSTRACT

The influence of attention on the processing of pain in the secondary somatosensory cortex (SII) was analyzed using magnetoencephalography in response to painful infra-red heat stimuli applied to the left hand in six healthy male subjects aged 22-28 years. Three experimental paradigms were chosen to deliver attention-dependent results under comparable levels of vigilance. Single moving dipole sources for the pain-evoked responses were calculated in the individual cortex anatomy determined by magnetic resonance imaging. Although pain stimuli followed the same intensity pattern in all paradigms, evoked SII activity increased markedly from the low attention task to the mid-level attention task (P < 0.001). In contrast, a further increase of attention from mid-level to high was not accompanied by an additional enhancement of SII activity. It is therefore concluded that SII activation follows a saturation function that cannot be increased further by maximizing the relevance of the painful stimuli.


Subject(s)
Afferent Pathways/physiology , Attention/physiology , Nociceptors/physiology , Pain Threshold/physiology , Pain/physiopathology , Somatosensory Cortex/physiology , Acoustic Stimulation , Adult , Brain Mapping , Cues , Functional Laterality/physiology , Humans , Infrared Rays/adverse effects , Magnetoencephalography , Male , Nerve Fibers, Myelinated/physiology , Neural Conduction/physiology , Neuropsychological Tests , Pain Measurement , Physical Stimulation/adverse effects , Reaction Time/physiology
10.
IEEE Trans Image Process ; 23(3): 1419-29, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24723537

ABSTRACT

Photo aesthetic quality evaluation is a fundamental yet under-addressed task in the computer vision and image processing fields. Conventional approaches suffer from two drawbacks. First, both the local and global spatial arrangements of image regions play an important role in photo aesthetics. However, existing rules, e.g., visual balance, only heuristically define which spatial distribution among the salient regions of a photo is aesthetically pleasing. Second, it is difficult to automatically weight visual cues from multiple channels in photo aesthetics assessment. To solve these problems, we propose a new photo aesthetics evaluation framework, focusing on learning image descriptors that characterize local and global structural aesthetics from multiple visual channels. In particular, to describe the spatial structure of the image's local regions, we construct graphlets (small-sized connected graphs) by connecting spatially adjacent atomic regions. Since spatially adjacent graphlets lie close together in feature space, we project them onto a manifold and propose an embedding algorithm. The embedding algorithm encodes the photo's global spatial layout into graphlets. Simultaneously, the importance of graphlets from multiple visual channels is dynamically adjusted. Finally, these post-embedding graphlets are integrated for photo aesthetics evaluation using a probabilistic model. Experimental results show that: 1) the visualized graphlets explicitly capture the aesthetically arranged atomic regions; 2) the proposed approach generalizes and improves four prominent aesthetic rules; and 3) our approach significantly outperforms state-of-the-art algorithms in photo aesthetics prediction.
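Graphlet construction can be illustrated with a toy sketch: from a region adjacency list, enumerate small connected subgraphs (all adjacent pairs plus connected triples). Superpixel segmentation and feature extraction are omitted; the adjacency data is a made-up example.

```python
# Illustrative sketch of graphlet construction: enumerate small connected
# subgraphs (pairs and triples) of a region adjacency graph built from
# spatially adjacent atomic regions. The adjacency dict is assumed data.
from itertools import combinations

def build_graphlets(adjacency, max_size=3):
    """adjacency: dict mapping region id -> set of neighboring region ids (symmetric)."""
    graphlets = [frozenset(e) for a in adjacency for e in
                 ((a, b) for b in adjacency[a] if a < b)]          # size-2 graphlets
    if max_size >= 3:
        for a, b, c in combinations(sorted(adjacency), 3):
            edges = sum((x in adjacency[y]) for x, y in ((a, b), (b, c), (a, c)))
            if edges >= 2:                                         # at least a connected path of 3
                graphlets.append(frozenset((a, b, c)))
    return graphlets

adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(build_graphlets(adj))   # pairs {1,2},{2,3},{3,4} plus triples {1,2,3},{2,3,4}
```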


Subject(s)
Algorithms , Biomimetics/methods , Cues , Esthetics , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Photography/methods , Image Enhancement/methods
11.
Exp Brain Res ; 180(2): 205-15, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17287993

ABSTRACT

Expectation of pain is an important adaptive process enabling individuals to avoid bodily harm. It reflects the linking of past experience and environmental cues with imminent threat. In the present study, we examined changes in perceived pain contingent upon variation of the interval between an auditory cue and a subsequent painful laser stimulus. The duration of the cue-to-stimulus delay was systematically varied between 2, 4 and 6 s. Pain intensity and evoked brain responses measured by EEG and MEG recordings were analysed. Pain ratings from 15 subjects increased with longer cue-to-pain delays, accompanied by an increase in activity of the midcingulate cortex (MCC), as modelled from evoked EEG potential maps. On the other hand, MEG-based source activity in the secondary somatosensory (SII) cortex remained unaffected by manipulation of the cue-to-stimulus interval. We conclude that activity in limbic structures such as the MCC plays a key role in the temporal dynamics of the build-up of pain expectation. Although this reaction is adaptive if the individual is able to avoid the stimulus, it is maladaptive if no such opportunity exists.


Subject(s)
Brain Mapping , Cues , Pain Threshold/physiology , Pain , Reaction Time/physiology , Adult , Analysis of Variance , Electroencephalography , Evoked Potentials, Somatosensory/physiology , Humans , Magnetoencephalography , Male , Pain/pathology , Pain/physiopathology , Pain/psychology , Pain Measurement/methods , Psychophysics , Somatosensory Cortex/physiopathology
12.
Brain Behav Immun ; 19(4): 283-95, 2005 Jul.
Article in English | MEDLINE | ID: mdl-15890494

ABSTRACT

We investigated the effects of expectation on intensity ratings and somatosensory evoked magnetic fields and electrical potentials following painful infrared laser stimuli in six healthy subjects. The stimulus series contained trials preceded by different auditory cues that contained valid, invalid, or no information about the upcoming laser intensity. High and low intensities occurred with equal probability across cue types. High intensity stimuli induced greater pain than low intensity stimuli across all cue types. Furthermore, laser intensity significantly interacted with cue validity: high intensity stimuli were perceived as less painful and low intensity stimuli as more painful following invalid compared with valid cues. The amplitude of the evoked magnetic field localized within the contralateral secondary somatosensory cortex (SII) at about 165 ms after laser stimuli also varied with both stimulus intensity and cue validity. The evoked electric potential peaked at about 300 ms after laser stimuli and yielded a single dipole source within a region encompassing the caudal anterior cingulate cortex and posterior cingulate cortex. Its amplitude also varied with stimulus intensity, but failed to show any cue validity effects. This result suggests a priming of early cortical nociceptive sensitivity by cues signaling pain severity. A possible contribution of the SII cortex to the manifestation of nocebo/placebo cognitions is discussed.


Subject(s)
Evoked Potentials, Somatosensory/physiology , Gyrus Cinguli/physiology , Judgment/physiology , Pain Threshold/physiology , Pain Threshold/psychology , Placebo Effect , Set, Psychology , Acoustic Stimulation , Adult , Association Learning/physiology , Cues , Electroencephalography , Functional Laterality/physiology , Humans , Magnetoencephalography , Male , Nociceptors/physiology , Pain Measurement , Reference Values
13.
Arzneimittelforschung ; 54(3): 143-51, 2004.
Article in English | MEDLINE | ID: mdl-15112860

ABSTRACT

AIM: The analgesic effects of morphine (CAS 57-27-2) in clinical use are well described. Sedation is discussed as a relevant side-effect, mostly based on data recorded in normal subjects without pain. The aim of this study was to quantify and to evaluate electrophysiologically the analgesic and sedative effects of morphine for the first time using an experimental pain model. METHODS: Analgesic and sedative effects of a low dose of morphine sulfate (CAS 6211-15-0; 10 mg i.v.) were determined using a standard phasic pain model (intracutaneously administered electrical pulses) in a placebo-controlled design with seven healthy subjects. Five blocks (1 block = 80 stimuli) of painful stimuli were applied, covering a period of 3 h. Analgesia was assessed by subjective pain ratings and by pain-related brain potentials. Sedation was determined by the power spectra of the spontaneous EEG, by auditory evoked potentials (AEP), reaction times and mood scales. RESULTS: In all subjects the pain-related variables were maximally suppressed 2 h after morphine administration (p < 0.01 versus placebo), indicated by a decrease of the pain ratings by about 45% and of the pain-related brain potentials by about 50%. Interestingly, no effect on any sedation variable was found (p > 0.05). CONCLUSION: The lack of sedative effects in the presence of marked analgesia was surprising in comparison with results of previous studies. It is concluded that the experimental pain increased the arousal level, thus counteracting morphine-induced sedation. This may explain why other studies found relevant sedation after morphine application in the absence of pain. It underlines that the sedative effects of analgesic drugs should be evaluated in the presence of pain. Compared with other analgesics (meperidine, pentazocine, nortilidine, flupirtine and tramadol) evaluated with exactly the same experimental protocol, morphine exhibited potent analgesia with the smallest sedative effects of all.


Subject(s)
Analgesics, Opioid/pharmacology , Electroencephalography/drug effects , Hypnotics and Sedatives , Morphine/pharmacology , Adult , Affect/drug effects , Analgesics, Opioid/pharmacokinetics , Arousal/drug effects , Biotransformation , Double-Blind Method , Electric Stimulation , Humans , Hypnotics and Sedatives/pharmacokinetics , Male , Mass Spectrometry , Morphine/pharmacokinetics , Pain Measurement/drug effects , Pain Threshold/drug effects , Reaction Time/drug effects