Results 1 - 20 of 95
1.
Nat Commun ; 15(1): 4464, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796464

ABSTRACT

By mimicking the neurons and synapses of the human brain and employing spiking neural networks on neuromorphic chips, neuromorphic computing offers a promising path to energy-efficient machine intelligence. How to borrow high-level dynamic mechanisms from the brain to help neuromorphic computing achieve energy advantages is a fundamental question. This work presents an application-oriented algorithm-software-hardware co-designed neuromorphic system to address it. First, we design and fabricate an asynchronous chip called "Speck", a sensing-computing neuromorphic system on chip. With a low processor resting power of 0.42 mW, Speck satisfies the hardware requirement of dynamic computing: no input consumes no energy. Second, we uncover a "dynamic imbalance" in spiking neural networks and develop an attention-based framework that meets the algorithmic requirement of dynamic computing: varied inputs consume energy with large variance. Together, these yield a neuromorphic system with real-time power as low as 0.70 mW. This work demonstrates the promise of neuromorphic computing with its asynchronous, event-driven, sparse, and dynamic nature.
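A minimal NumPy sketch of the event-driven principle described above: dynamic energy is spent only when input spikes arrive, so zero input incurs zero cost. All layer sizes, time constants, and the per-operation energy unit are illustrative assumptions, not Speck's actual figures.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(128, 256))  # 128 inputs -> 256 LIF neurons
v = np.zeros(256)                          # membrane potentials
V_TH, TAU = 1.0, 2.0
E_PER_SYNOP = 1.0                          # hypothetical energy unit per synaptic op

def step(in_spikes):
    """Advance the LIF layer one timestep; return (output spikes, energy spent)."""
    global v
    active = np.flatnonzero(in_spikes)     # only spiking inputs trigger work
    if active.size:                        # no input -> no synaptic ops -> no energy
        v += W[active].sum(axis=0)
    v *= 1.0 - 1.0 / TAU                   # leak
    out = (v >= V_TH).astype(np.uint8)
    v[out == 1] = 0.0                      # hard reset after a spike
    return out, active.size * 256 * E_PER_SYNOP

out, energy = step(np.zeros(128))            # silent input: energy == 0.0
out, energy = step(rng.integers(0, 2, 128))  # busy input: energy scales with spike count
```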


Subjects
Algorithms; Neural Networks, Computer; Neurons; Humans; Neurons/physiology; Models, Neurological; Action Potentials/physiology; Synapses/physiology; Brain/physiology; Software
2.
J Anal Methods Chem ; 2024: 9273705, 2024.
Article in English | MEDLINE | ID: mdl-38737631

ABSTRACT

The accurate determination of free nicotine content in cigarette smoke is crucial for assessing cigarette quality, studying harm and addiction, and reducing tar levels. Currently, the determination of free nicotine in tobacco products relies primarily on methods such as pH calculation, nuclear magnetic resonance (NMR) spectroscopy, headspace solid-phase microextraction (HS-SPME), and traditional solvent extraction. However, these methods have limitations that restrict their widespread application. In this study, free nicotine in cigarette smoke was directly extracted with cyclohexane, following the traditional solvent extraction method, and detected by gas chromatography-mass spectrometry. Compared with traditional two-phase solvent extraction, our experimental method is easy to execute and eliminates the influence of aqueous solutions on the original distribution of nicotine in cigarette smoke particulate matter. Furthermore, the presence of protonated nicotine in tobacco does not affect the determination. Compared with HS-SPME and NMR spectroscopy, our approach, which involves solvent extraction followed by chromatographic separation and instrumental detection, offers simplicity, improved precision, better detection limits, and reduced interference during the instrumental detection stage. The standard-addition recoveries in the conducted experiment ranged from 96.2% to 102.5%. The limit of detection was 2.8 µg/cig, and the coefficient of determination (R2) for the quadratic regression of the standard curve exceeded 0.999. The relative standard deviation for parallel samples was between 1.7% and 3.4% (n = 5), fully meeting the requirements for the determination of free nicotine in cigarette smoke. Analysis of cigarette samples from 38 commercially available brands revealed that free nicotine content ranged from 0.376 to 0.716 mg/cig, with an average of 0.540 mg/cig, and free nicotine accounted for 39.1%-88.8% of the total nicotine content.
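A brief sketch of the quantification statistics reported in this abstract (quadratic standard curve with R2, standard-addition recovery, and RSD over parallel samples). All concentrations and instrument responses below are made-up placeholders, not data from the study.

```python
import numpy as np

conc = np.array([0.1, 0.2, 0.4, 0.8, 1.6])     # standard levels (placeholder units)
resp = np.array([0.95, 2.05, 4.3, 9.1, 19.4])  # GC-MS peak areas (placeholder values)

coef = np.polyfit(conc, resp, deg=2)           # quadratic standard curve
pred = np.polyval(coef, conc)
ss_res = np.sum((resp - pred) ** 2)
ss_tot = np.sum((resp - resp.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot                     # should exceed 0.999 in practice

# Standard-addition recovery: re-measure a sample after spiking a known amount.
base, spiked, added = 0.50, 0.99, 0.50         # mg/cig (placeholder values)
recovery = (spiked - base) / added * 100.0     # percent

# Relative standard deviation over parallel samples (n = 5).
parallel = np.array([0.52, 0.53, 0.51, 0.54, 0.52])
rsd = parallel.std(ddof=1) / parallel.mean() * 100.0
print(f"R2={r2:.4f}  recovery={recovery:.1f}%  RSD={rsd:.1f}%")
```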

3.
Article in English | MEDLINE | ID: mdl-38787669

ABSTRACT

With the benefit of deep learning techniques, recent research has made significant progress in reducing image compression artifacts. Despite their improved performance, prevailing methods focus only on learning a mapping from the compressed image to the original one and ignore the intrinsic attributes of the given compressed images, which greatly harms the performance of downstream parsing tasks. Unlike these methods, we propose to decouple the intrinsic attributes into two complementary features for artifacts reduction, i.e., compression-insensitive features to regularize the high-level semantic representations during training and compression-sensitive features to be aware of the compression degree. To achieve this, we first employ adversarial training to regularize the compressed and original encoded features to retain high-level semantics, and we then develop a compression quality-aware feature encoder for the compression-sensitive features. Based on these dual complementary features, we propose a Dual Awareness Guidance Network (DAGN) that uses these awareness features as transformation guidance during the decoding phase. In DAGN, we develop a cross-feature fusion module that maintains the consistency of compression-insensitive features by fusing them into the artifacts reduction baseline. Our method achieves an average PSNR gain of 2.06 dB on BSD500, outperforming state-of-the-art methods, and requires only 29.7 ms to process one image on BSD500. Moreover, the experimental results on LIVE1 and LIU4K also demonstrate the efficiency, effectiveness, and superiority of the proposed method in terms of quantitative metrics, visual quality, and downstream machine vision tasks.
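For reference, the quality metric behind the reported 2.06 dB gain; a minimal implementation, assuming 8-bit images (a common but here unstated convention):

```python
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between a reference and a test image."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```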

4.
Article in English | MEDLINE | ID: mdl-38502629

ABSTRACT

PSNR-oriented models are a critical class of super-resolution models with applications across various fields. However, these models tend to generate over-smoothed images, a problem that has previously been analyzed from the perspective of models or loss functions, but without taking into account the impact of data properties. In this paper, we present a novel phenomenon that we term the center-oriented optimization (COO) problem, where a model's output converges toward the center point of similar high-resolution images rather than toward the ground truth. We demonstrate that the strength of this problem is related to the uncertainty of the data, which we quantify using entropy. We prove that as the entropy of high-resolution images increases, their center point moves further away from the clean image distribution, and the model generates more over-smoothed images. Perceptual-driven approaches, such as perceptual loss, model structure optimization, and GAN-based methods, can be viewed as implicitly optimizing the COO problem. We propose an explicit solution to the COO problem, called Detail Enhanced Contrastive Loss (DECLoss). DECLoss utilizes the clustering property of contrastive learning to directly reduce the variance of the potential high-resolution distribution and thereby decrease the entropy. We evaluate DECLoss on multiple super-resolution benchmarks and demonstrate that it improves the perceptual quality of PSNR-oriented models. Moreover, when applied to GAN-based methods such as RaGAN, DECLoss helps achieve state-of-the-art performance, e.g., 0.093 LPIPS with 24.51 PSNR on 4× downsampled Urban100, validating the effectiveness and generalization of our approach.
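A generic InfoNCE-style contrastive loss, sketched to show the clustering mechanism that DECLoss builds on; the paper's actual loss (its patch-level positive/negative construction and weighting) differs in detail:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor: torch.Tensor, positive: torch.Tensor,
                     negatives: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """anchor, positive: (B, D); negatives: (B, K, D). Pulls positives together,
    pushes negatives apart, which tightens clusters and lowers their entropy."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    n = F.normalize(negatives, dim=-1)
    pos = (a * p).sum(-1, keepdim=True) / tau        # (B, 1) positive similarity
    neg = torch.einsum("bd,bkd->bk", a, n) / tau     # (B, K) negative similarities
    logits = torch.cat([pos, neg], dim=1)            # positive sits at class index 0
    return F.cross_entropy(logits, torch.zeros(a.size(0), dtype=torch.long))
```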

5.
Article in English | MEDLINE | ID: mdl-38502633

ABSTRACT

Transformers have shown remarkable performance; however, their architecture design is a time-consuming process that demands expertise and trial-and-error. It is therefore worthwhile to investigate efficient methods for automatically searching high-performance Transformers via Transformer Architecture Search (TAS). To improve search efficiency, training-free proxy-based methods have been widely adopted in Neural Architecture Search (NAS). However, these proxies have been found to generalize poorly to Transformer search spaces, as confirmed by several studies and our own experiments. This paper presents an effective scheme for TAS called TRansformer Architecture search with ZerO-cost pRoxy guided evolution (T-Razor) that achieves exceptional efficiency. First, through theoretical analysis, we discover that the synaptic diversity of multi-head self-attention (MSA) and the saliency of the multi-layer perceptron (MLP) are correlated with the performance of the corresponding Transformers. These properties motivate us to introduce the ranks of synaptic diversity and saliency, denoted DSS++, for evaluating and ranking Transformers. DSS++ incorporates correlation information among sampled Transformers to provide unified scores for both synaptic diversity and synaptic saliency. We then propose a block-wise evolutionary search guided by DSS++ to find optimal Transformers, where DSS++ determines the positions for mutation and crossover, enhancing exploration. Experimental results demonstrate that T-Razor performs competitively against state-of-the-art manually and automatically designed Transformer architectures across four popular Transformer search spaces. Notably, T-Razor improves search efficiency across different Transformer search spaces, e.g., reducing the required GPU days from more than 24 to less than 0.4, and outperforms existing zero-cost approaches. We also apply T-Razor to the BERT search space and find that the searched Transformers achieve competitive GLUE results on several Natural Language Processing (NLP) datasets. This work provides insights into training-free TAS, revealing the usefulness of evaluating Transformers based on the properties of their different blocks.
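A sketch of a single-minibatch, training-free saliency proxy in the spirit of the synaptic-saliency scores that T-Razor ranks; the actual DSS++ score (separate MSA/MLP treatment and rank aggregation across sampled Transformers) is more involved:

```python
import torch

def synaptic_saliency(model: torch.nn.Module, x: torch.Tensor,
                      y: torch.Tensor) -> float:
    """Score a randomly initialized network without training: one forward/backward
    pass, then sum |theta * dL/dtheta| over all parameters."""
    model.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    score = 0.0
    for p in model.parameters():
        if p.grad is not None:
            score += (p * p.grad).abs().sum().item()
    return score
```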

6.
Nat Commun ; 15(1): 2179, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38467684

ABSTRACT

Metagenomic binning is an essential technique for genome-resolved characterization of uncultured microorganisms in various ecosystems, but it is hampered by the low efficiency of binning tools in adequately recovering metagenome-assembled genomes (MAGs). Here, we introduce BASALT (Binning Across a Series of Assemblies Toolkit) for binning and refinement of short- and long-read sequencing data. BASALT employs multiple binners with multiple thresholds to produce initial bins, then utilizes neural networks to identify core sequences, removing redundant bins and refining non-redundant bins. Using the same assemblies generated from Critical Assessment of Metagenome Interpretation (CAMI) datasets, BASALT produces up to twice as many MAGs as VAMB, DASTool, or metaWRAP. Processing assemblies from a lake sediment dataset, BASALT produces ~30% more MAGs than metaWRAP, including 21 unique class-level prokaryotic lineages. Functional annotations reveal that BASALT can retrieve 47.6% more non-redundant open reading frames than metaWRAP. These results highlight BASALT's robust handling of metagenomic sequencing data.


Subjects
Ecosystem; Metagenome; Silicates; Metagenome/genetics; Metagenomics/methods
7.
Front Neurosci ; 18: 1371290, 2024.
Article in English | MEDLINE | ID: mdl-38550564

ABSTRACT

Introduction: Spiking Neural Networks (SNNs), inspired by brain science, offer low energy consumption and high biological plausibility thanks to their event-driven nature. However, current SNNs still suffer from insufficient performance. Methods: Recognizing the brain's adeptness at processing information for various scenarios, with complex neuronal connections within and across regions as well as specialized neuronal architectures for specific functions, we propose a Spiking Global-Local-Fusion Transformer (SGLFormer) that significantly improves the performance of SNNs. This novel architecture enables efficient information processing on both global and local scales by integrating transformer and convolution structures in SNNs. In addition, we uncover a problem of inaccurate gradient backpropagation caused by max pooling in SNNs and address it by developing a new max-pooling module. Furthermore, we adopt a spatio-temporal block (STB) in the classification head instead of global average pooling, facilitating the aggregation of spatial and temporal features. Results: SGLFormer demonstrates superior performance on static datasets such as CIFAR10/CIFAR100 and ImageNet, as well as on dynamic vision sensor (DVS) datasets including CIFAR10-DVS and DVS128-Gesture. Notably, on ImageNet, SGLFormer achieves a top-1 accuracy of 83.73% with 64 M parameters, outperforming the current SOTA directly trained SNNs by a margin of 6.66%. Discussion: With its high performance, SGLFormer can support more computer vision tasks in the future. The code for this study is available at https://github.com/ZhangHanN1/SGLFormer.

8.
Article in English | MEDLINE | ID: mdl-38319762

ABSTRACT

With the help of special neuromorphic hardware, spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption. Combining SNNs with deep reinforcement learning (DRL) provides a promising energy-efficient way to tackle realistic control tasks. In this article, we focus on tasks where the agent must learn multidimensional deterministic policies, which are very common in real scenarios. Recently, the surrogate gradient method has been utilized for training multilayer SNNs, allowing SNNs to achieve performance comparable to the corresponding deep networks on this task. Most existing spike-based reinforcement learning (RL) methods take the firing rate as the output of the SNN and convert it to a continuous action space (i.e., the deterministic policy) through a fully connected (FC) layer. However, the decimal nature of the firing rate brings floating-point matrix operations into the FC layer, making the whole SNN impossible to deploy directly on neuromorphic hardware. To develop a fully spiking actor network (SAN) without any floating-point matrix operations, we draw inspiration from the nonspiking interneurons found in insects and employ the membrane voltage of nonspiking neurons to represent actions. Before the nonspiking neurons, multiple population neurons are introduced to decode the different dimensions of the action. Since each population decodes one action dimension, we argue that the neurons in each population should be connected in both the time and space domains; hence, intralayer connections are used in the output populations to enhance representation capacity. This mechanism exists extensively in animals and has been demonstrated to be effective. Finally, we propose a fully spiking actor network with intralayer connections (ILC-SAN). Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on continuous control tasks from OpenAI Gym. Moreover, we estimate the theoretical energy consumption when deploying ILC-SAN on neuromorphic chips to illustrate its high energy efficiency.
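A NumPy sketch of the readout idea: population spike trains are decoded by nonspiking neurons whose membrane voltage, rather than a firing rate, encodes each continuous action dimension. All sizes, constants, and the tanh bound are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
POP, DIMS, T = 10, 2, 16                   # neurons per population, action dims, timesteps
W = rng.normal(0, 0.3, size=(DIMS, POP))   # decoder weights, one population per dimension

def decode_actions(spikes: np.ndarray, tau: float = 5.0) -> np.ndarray:
    """spikes: (T, DIMS, POP) binary trains -> one continuous action per dimension."""
    v = np.zeros(DIMS)
    for t in range(T):
        # leaky integration of weighted spikes; no threshold, so v never resets
        v += (-v + np.einsum("dp,dp->d", W, spikes[t])) / tau
    return np.tanh(v)                      # bound the action like a tanh policy head

actions = decode_actions(rng.integers(0, 2, size=(T, DIMS, POP)).astype(float))
```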

9.
IEEE Trans Image Process ; 33: 1272-1284, 2024.
Article in English | MEDLINE | ID: mdl-38285574

ABSTRACT

To manipulate large-scale data, anchor-based multi-view clustering methods have grown in popularity owing to their linear complexity in the number of samples. However, existing approaches pay less attention to two aspects. 1) They aim to learn a shared affinity matrix using only the local information from each single view, ignoring the global information from all views, which may weaken the ability to capture complementary information. 2) They do not consider the removal of feature redundancy, which may affect the ability to depict the real sample relationships. To this end, we propose a novel fast multi-view clustering method via pick-and-place transform learning, named PPTL, which quickly captures insightful global features to characterize the sample relationships. Specifically, PPTL first concatenates all the views along the feature direction to produce a global matrix. Considering the redundancy of the global matrix, we design a pick-and-place transform with l2,p-norm regularization to discard the poor features and consequently construct a compact global representation matrix. Thus, by conducting anchor-based subspace clustering on the compact global representation matrix, PPTL can learn a consensus skinny affinity matrix with a discriminative clustering structure. Numerous experiments performed on small- to large-scale datasets demonstrate that our method is not only faster but also achieves superior clustering performance over state-of-the-art methods across a majority of the datasets.
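For concreteness, the row-sparsity regularizer named above; a sketch computing the p-th power of the l2,p-norm, the form that typically appears in such objectives (small p drives whole feature rows toward zero, which is what lets a method like PPTL discard poor features):

```python
import numpy as np

def l2p_norm_p(M: np.ndarray, p: float = 0.5) -> float:
    """p-th power of the l2,p-norm of M: sum over rows of (row l2 norm)**p."""
    row_norms = np.linalg.norm(M, axis=1)  # l2 norm of each row (feature)
    return float(np.sum(row_norms ** p))
```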

10.
J Assist Reprod Genet ; 41(3): 757-765, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38270748

ABSTRACT

PURPOSE: To investigate the prevalence of Y chromosome polymorphisms in Chinese men and analyze their associations with male infertility and female adverse pregnancy outcomes. METHODS: The clinical data of 32,055 Chinese men who underwent karyotype analysis from October 2014 to September 2019 were collected. Fisher's exact test, chi-square test, or Kruskal-Wallis test was used to analyze the effects of Y chromosome polymorphism on semen parameters, azoospermia factor (AZF) microdeletions, and female adverse pregnancy outcomes. RESULTS: The incidence of Y chromosome polymorphic variants was 1.19% (381/32,055) in Chinese men. The incidence of non-obstructive azoospermia (NOA) was significantly higher in men with the Yqh- variant than that in men with normal karyotype and other Y chromosome polymorphic variants (p < 0.050). The incidence of AZF microdeletions was significantly different among the normal karyotype and different Y chromosome polymorphic variant groups (p < 0.001). The detection rate of AZF microdeletions was 28.92% (24/83) in the Yqh- group and 2.50% (3/120) in the Y ≤ 21 group. The AZFb + c region was the most common AZF microdeletion (78.57%, 22/28), followed by AZFc microdeletion (7.14%, 2/28), in NOA patients with Yqh- variants. There was no significant difference in the distribution of female adverse pregnancy outcomes among the normal karyotype and different Y chromosome polymorphic variant groups (p = 0.528). CONCLUSIONS: Patients with the 46,XYqh- variant have a higher incidence of NOA and AZF microdeletions than patients with normal karyotype and other Y chromosome polymorphic variants. Y chromosome polymorphic variants do not affect female adverse pregnancy outcomes.
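Using the counts given in this abstract, the group comparison can be illustrated with a Fisher's exact test (a sketch of one pairwise contrast; the paper's full analysis spans more groups and test types):

```python
from scipy.stats import fisher_exact

# AZF microdeletions: 24/83 in the Yqh- group vs 3/120 in the Y<=21 group.
table = [[24, 83 - 24],   # Yqh-  : with / without AZF microdeletion
         [3, 120 - 3]]    # Y<=21 : with / without
odds_ratio, p_value = fisher_exact(table)
print(f"OR={odds_ratio:.1f}, p={p_value:.2e}")  # expect p << 0.001
```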


Subjects
Azoospermia; Infertility, Male; Oligospermia; Humans; Male; Female; Azoospermia/epidemiology; Azoospermia/genetics; Retrospective Studies; Chromosome Deletion; Infertility, Male/genetics; Chromosomes, Human, Y/genetics; China/epidemiology; Oligospermia/genetics
11.
Nucleic Acids Res ; 52(1): e3, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-37941140

ABSTRACT

Compared with proteins, DNA and RNA are more difficult languages to interpret because four-letter coded DNA/RNA sequences carry less information content than 20-letter coded protein sequences. While BERT (Bidirectional Encoder Representations from Transformers)-like language models have been developed for RNA, they are ineffective at capturing the evolutionary information from homologous sequences because, unlike proteins, RNA sequences are less conserved. Here, we have developed an unsupervised multiple-sequence-alignment-based RNA language model (RNA-MSM) that utilizes homologous sequences from an automatic pipeline, RNAcmap, which can provide significantly more homologous sequences than the manually annotated Rfam. We demonstrate that the resulting unsupervised two-dimensional attention maps and one-dimensional embeddings from RNA-MSM contain structural information; in fact, they can be directly mapped with high accuracy to 2D base-pairing probabilities and 1D solvent accessibilities, respectively. Further fine-tuning led to significantly improved performance on these two downstream tasks compared with existing state-of-the-art techniques, including SPOT-RNA2 and RNAsnap2. By comparison, RNA-FM, a BERT-based RNA language model, performs worse than one-hot encoding with its embedding in base pair and solvent-accessible surface area prediction. We anticipate that the pre-trained RNA-MSM model can be fine-tuned on many other tasks related to RNA structure and function.
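For reference, the one-hot baseline that RNA-FM's embeddings are compared against; a minimal encoder:

```python
import numpy as np

def one_hot_rna(seq: str) -> np.ndarray:
    """Map each of the four RNA bases to a length-4 indicator vector: (L, 4)."""
    alphabet = "ACGU"
    idx = np.array([alphabet.index(b) for b in seq.upper()])
    return np.eye(4, dtype=np.float32)[idx]

print(one_hot_rna("GAUC"))
```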


Subjects
Machine Learning; RNA; Sequence Alignment; DNA/chemistry; Proteins; RNA/chemistry; Solvents
12.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 2638-2657, 2024 May.
Article in English | MEDLINE | ID: mdl-37782582

ABSTRACT

Most existing learning-based deraining methods are trained with supervision on synthetic rainy-clean pairs. The domain gap between synthetic and real rain makes them generalize poorly to complex real rainy scenes. Moreover, existing methods mainly exploit properties of the image or rain layers independently, while few consider the mutually exclusive relationship between the two. To resolve this dilemma, we explore the intrinsic intra-similarity within each layer and the inter-exclusiveness between the two layers, and propose an unsupervised non-local contrastive learning (NLCL) deraining method. Non-local self-similar image patches, taken as positives, are pulled tightly together, while rain patches, taken as negatives, are pushed far apart, and vice versa. On one hand, the intrinsic self-similarity within the positive/negative samples of each layer helps discover a more compact representation; on the other hand, the mutually exclusive property between the two layers enriches the discriminative decomposition. Thus, the internal self-similarity within each layer (similarity) and the external exclusive relationship between the two layers (dissimilarity), serving as a generic image prior, jointly enable unsupervised separation of the rain from the clean image. We further discover that the intrinsic dimension of the non-local image patches is generally higher than that of the rain patches. This insight motivates us to design an asymmetric contrastive loss that precisely models the compactness discrepancy of the two layers, thereby improving the discriminative decomposition. In addition, recognizing the limited quality of existing real rain datasets, which are often small-scale or collected from the internet, we assemble a large-scale real dataset of high-resolution rainy images captured under various rainy weather conditions. Extensive experiments on different real rainy datasets demonstrate that the proposed method achieves state-of-the-art performance in real deraining.

13.
IEEE Trans Cybern ; 54(3): 1997-2010, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37824314

ABSTRACT

Unlike visible cameras, which record intensity images frame by frame, the biologically inspired event camera produces a stream of asynchronous, sparse events with much lower latency. In practice, visible cameras better perceive texture details and slow motion, while event cameras are free from motion blur and have a larger dynamic range, enabling them to work well under fast motion and low illumination (LI). The two sensors can therefore cooperate to achieve more reliable object tracking. In this work, we propose a large-scale Visible-Event benchmark (termed VisEvent), addressing the lack of a realistic, large-scale dataset for this task. Our dataset consists of 820 video pairs captured under LI, high-speed, and background-clutter scenarios, divided into training and testing subsets containing 500 and 320 videos, respectively. Based on VisEvent, we transform the event flows into event images and construct more than 30 baseline methods by extending current single-modality trackers into dual-modality versions. More importantly, we build a simple but effective tracking algorithm by proposing a cross-modality transformer to achieve more effective feature fusion between visible and event data. Extensive experiments on the proposed VisEvent dataset, FE108, COESOT, and two simulated datasets (i.e., OTB-DVS and VOT-DVS) validate the effectiveness of our model. The dataset and source code have been released at: https://github.com/wangxiao5791509/VisEvent_SOT_Benchmark.
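A sketch of the event-flow-to-event-image conversion mentioned above: polarity-signed events are accumulated into a 2D histogram over a fixed time window. The resolution and windowing are illustrative assumptions:

```python
import numpy as np

H, W = 260, 346  # DAVIS346-like sensor resolution (an assumption)

def events_to_image(x, y, p, t, t0, t1):
    """x, y: pixel coords; p: polarity in {0, 1}; t: timestamps (equal-length arrays)."""
    m = (t >= t0) & (t < t1)                         # keep events inside the window
    img = np.zeros((H, W), dtype=np.float32)
    np.add.at(img, (y[m], x[m]), 2.0 * p[m] - 1.0)   # +1 for ON, -1 for OFF events
    return img
```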

14.
Article in English | MEDLINE | ID: mdl-38090871

ABSTRACT

Data-dependent hashing methods aim to learn hash functions from pairwise or triplet relationships among the data, which often leads to low efficiency and a low collision rate because only the local distribution of the data is captured. To address this limitation, we propose central similarity, in which the hash codes of similar data pairs are encouraged to approach a common center and those of dissimilar pairs to converge to different centers. As a new global similarity metric, central similarity can improve the efficiency and retrieval accuracy of hash learning. By introducing a new concept, hash centers, we formulate the computation of the proposed central similarity metric, where the hash centers are a set of points scattered in the Hamming space with sufficient mutual distance between one another. To construct well-separated hash centers, we provide two efficient methods: 1) leveraging the Hadamard matrix and Bernoulli distributions to generate data-independent hash centers and 2) learning data-dependent hash centers from data representations. Based on the proposed similarity metric and hash centers, we propose central similarity quantization (CSQ), which optimizes the central similarity between data points with respect to their hash centers instead of optimizing local similarity, to generate a high-quality deep hash function. We further improve CSQ with data-dependent hash centers, dubbed CSQ with learnable centers (CSQ [Formula: see text]). The proposed CSQ and CSQ [Formula: see text] are generic and applicable to both image and video hashing scenarios. We conduct extensive experiments on large-scale image and video retrieval tasks, and the proposed CSQ yields noticeably boosted retrieval performance, i.e., 3%-20% in mean average precision (mAP) over previous state-of-the-art methods, which also demonstrates that our methods generate cohesive hash codes for similar data pairs and dispersed hash codes for dissimilar pairs.
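A sketch of the first, data-independent construction: rows of a Hadamard matrix (mapped from {-1, +1} to {0, 1}) give hash centers whose pairwise Hamming distance is exactly K/2 by orthogonality. The Bernoulli route covers cases this construction does not (e.g., more centers than rows):

```python
import numpy as np
from scipy.linalg import hadamard

K, n_classes = 64, 10                    # code length must be a power of 2 here
Hmat = hadamard(K)                       # K x K matrix with entries in {-1, +1}
centers = (Hmat[:n_classes] > 0).astype(np.uint8)   # (n_classes, K) binary codes

# Any two distinct rows are orthogonal, so they differ in exactly K/2 bits.
d = (centers[0] != centers[1]).sum()
assert d == K // 2
```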

15.
Sci Adv ; 9(40): eadi1480, 2023 10 06.
Article in English | MEDLINE | ID: mdl-37801497

ABSTRACT

Spiking neural networks (SNNs) aim to realize brain-inspired intelligence on neuromorphic chips with high energy efficiency by introducing neural dynamics and spike properties. As the emerging spiking deep learning paradigm attracts increasing interest, traditional programming frameworks cannot meet its demands for automatic differentiation, parallel computation acceleration, and integrated processing of neuromorphic datasets and deployment. In this work, we present the SpikingJelly framework to address this dilemma. We contribute a full-stack toolkit for preprocessing neuromorphic datasets, building deep SNNs, optimizing their parameters, and deploying SNNs on neuromorphic chips. Compared to existing methods, the training of deep SNNs can be accelerated 11×, and the superior extensibility and flexibility of SpikingJelly enable users to accelerate custom models at low cost through multilevel inheritance and semiautomatic code generation. SpikingJelly paves the way for synthesizing truly energy-efficient SNN-based machine intelligence systems, which will enrich the ecology of neuromorphic computing.
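A minimal surrogate-gradient spike function of the kind spiking deep learning frameworks such as SpikingJelly provide: the forward pass is a hard threshold, while the backward pass substitutes a smooth derivative so deep SNNs can train with standard autograd. The rectangular surrogate below is one common choice, not SpikingJelly's specific implementation:

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    @staticmethod
    def forward(ctx, v_minus_th):
        ctx.save_for_backward(v_minus_th)
        return (v_minus_th >= 0).float()         # Heaviside: spike or no spike

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        surrogate = (v.abs() < 0.5).float()      # rectangular window of width 1
        return grad_out * surrogate

v = torch.randn(8, requires_grad=True)
spikes = SurrogateSpike.apply(v - 1.0)           # threshold at 1.0
spikes.sum().backward()                          # gradients flow back through v
```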


Subjects
Algorithms; Neurons; Neural Networks, Computer; Machine Learning; Intelligence
16.
Nat Commun ; 14(1): 5798, 2023 09 18.
Article in English | MEDLINE | ID: mdl-37723170

ABSTRACT

Biophysically detailed multi-compartment models are powerful tools for exploring the computational principles of the brain and also serve as a theoretical framework for generating algorithms for artificial intelligence (AI) systems. However, their expensive computational cost severely limits applications in both neuroscience and AI. The major bottleneck in simulating detailed compartment models is the simulator's ability to solve large systems of linear equations. Here, we present a novel Dendritic Hierarchical Scheduling (DHS) method to markedly accelerate this process. We theoretically prove that the DHS implementation is computationally optimal and accurate. This GPU-based method runs 2-3 orders of magnitude faster than the classic serial Hines method on a conventional CPU platform. We build a DeepDendrite framework, which integrates the DHS method with the GPU computing engine of the NEURON simulator, and demonstrate applications of DeepDendrite in neuroscience tasks. We investigate how spatial patterns of spine inputs affect neuronal excitability in a detailed human pyramidal neuron model with 25,000 spines. Furthermore, we briefly discuss the potential of DeepDendrite for AI, specifically highlighting its ability to enable the efficient training of biophysically detailed models in typical image classification tasks.
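For intuition about the bottleneck DHS targets, a sketch of the serial baseline on the simplest case: each timestep of a compartmental model reduces to a linear solve, here a purely tridiagonal one handled by the Thomas algorithm. Real Hines matrices are tree-structured rather than a single chain:

```python
import numpy as np

def thomas_solve(a, b, c, d):
    """Solve Ax = d for tridiagonal A with sub-, main-, and super-diagonals
    a, b, c (length-n arrays; a[0] and c[-1] are unused)."""
    n = len(d)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                       # forward sweep, O(n) and serial
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```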


Subjects
Artificial Intelligence; Neurons; Humans; Algorithms; Pyramidal Cells; Brain
17.
Prog Neurobiol ; 230: 102521, 2023 11.
Article in English | MEDLINE | ID: mdl-37673370

ABSTRACT

Attention can be deployed among external sensory stimuli or internal working memory (WM) representations, and recent primate studies have revealed that these external and internal selections share a common neural basis in the prefrontal cortex (PFC). However, how PFC implements these selections, especially in humans, remains to be elucidated. The present study investigated whether PFC responds differentially to peripheral and central retrospective cues (retro-cues) that induce attentional selection among WM representations. To this end, we combined magnetoencephalography (MEG, Experiment 1) and transcranial magnetic stimulation (TMS, Experiment 2) with an orientation-recall paradigm. Experiment 1 found that a peripheral retro-cue with 100% reliability benefited WM performance more than a central retro-cue, while this advantage vanished when cue reliability dropped to 50% (non-informative). MEG source analysis indicated that the 100% peripheral retro-cue elicited earlier (∼125 ms) PFC responses than the central retro-cue (∼275 ms). Meanwhile, Granger causality analysis showed that PFC sent earlier (0-200 ms) top-down signals to the superior parietal lobule (SPL) and the lateral occipital cortex (LOC) after the onset of peripheral retro-cues, whereas these top-down signals appeared later (300-500 ms) after the onset of central retro-cues. Importantly, PFC activity within the 300-500 ms window correlated with the peripheral advantage in behavior. Moreover, Experiment 2 applied TMS at different time points to test the causal influence of brain activity on behavior and found that stimulating PFC at 100 ms abolished the behavioral benefit of the peripheral retro-cue, as well as its advantage over the central retro-cue. Taken together, our results suggest that the advantage of peripheral over central retro-cues in the mnemonic domain is realized through faster top-down control from PFC, challenging the traditional view that top-down attentional control over WM requires at least 300 ms to emerge. The present study highlights that, in addition to the causal role of PFC in attentional selection of WM representations, timing is critical: faster is better.


Subjects
Cues; Memory, Short-Term; Animals; Humans; Reproducibility of Results; Retrospective Studies; Cognition
18.
IEEE Trans Pattern Anal Mach Intell ; 45(11): 14020-14037, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37494161

ABSTRACT

The DAVIS camera, which streams two complementary sensing modalities (asynchronous events and frames), has gradually been adopted to address major object detection challenges (e.g., fast motion blur and low light). However, effectively leveraging rich temporal cues and fusing two heterogeneous visual streams remains a challenging endeavor. To address this challenge, we propose a novel streaming object detector with Transformer, namely SODFormer, which for the first time integrates events and frames to continuously detect objects in an asynchronous manner. Technically, we first build a large-scale multimodal neuromorphic object detection dataset (i.e., PKU-DAVIS-SOD) with over 1,080.1k manual labels. We then design a spatiotemporal Transformer architecture that detects objects via an end-to-end sequence prediction problem, where the novel temporal Transformer module leverages rich temporal cues from two visual streams to improve detection performance. Finally, an asynchronous attention-based fusion module is proposed to integrate the two heterogeneous sensing modalities and draw complementary advantages from each; it can be queried at any time to locate objects, breaking through the limited output frequency of synchronized frame-based fusion strategies. The results show that the proposed SODFormer outperforms four state-of-the-art methods and our eight baselines by a significant margin. We also show that our unifying framework works well even in cases where the conventional frame-based camera fails, e.g., high-speed motion and low-light conditions. Our dataset and code are available at https://github.com/dianzl/SODFormer.

19.
IEEE Trans Pattern Anal Mach Intell ; 45(11): 13553-13566, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37432804

ABSTRACT

Unsupervised domain adaptation has been widely adopted in tasks with scarce annotated data. Unfortunately, mapping the target-domain distribution to the source domain unconditionally may distort the essential structural information of the target-domain data, leading to inferior performance. To address this issue, we first propose introducing active sample selection to assist domain adaptation for the semantic segmentation task. By innovatively adopting multiple anchors instead of a single centroid, both the source and target domains can be better characterized as multimodal distributions, so that more complementary and informative samples are selected from the target domain. With only a small workload of manually annotating these active samples, the distortion of the target-domain distribution can be effectively alleviated, yielding a large performance gain. In addition, a powerful semi-supervised domain adaptation strategy is proposed to alleviate the long-tail distribution problem and further improve segmentation performance. Extensive experiments are conducted on public datasets, and the results demonstrate that the proposed approach outperforms state-of-the-art methods by large margins and achieves performance similar to the fully supervised upper bound, i.e., 71.4% mIoU on GTA5 and 71.8% mIoU on SYNTHIA. The effectiveness of each component is also verified by thorough ablation studies.
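A sketch of the multi-anchor idea: characterize the target domain with several cluster anchors instead of one centroid, then send for annotation the samples least covered by any anchor. The anchor count, budget, and scoring rule are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def select_active_samples(feats: np.ndarray, n_anchors: int = 8,
                          budget: int = 50) -> np.ndarray:
    """feats: (N, D) target-domain features -> indices of samples to annotate."""
    anchors = KMeans(n_clusters=n_anchors, n_init=10,
                     random_state=0).fit(feats).cluster_centers_
    dists = np.linalg.norm(feats[:, None, :] - anchors[None], axis=-1)
    score = dists.min(axis=1)             # distance to the nearest anchor
    return np.argsort(score)[-budget:]    # the most poorly covered samples
```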

20.
IEEE Trans Pattern Anal Mach Intell ; 45(10): 11720-11732, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37134032

ABSTRACT

Few-shot learning aims to recognize novel queries with limited support samples by learning from base knowledge. Recent progress in this setting assumes that the base knowledge and novel query samples are distributed in the same domains, an assumption that is usually infeasible in realistic applications. To address this issue, we propose to tackle the cross-domain few-shot learning problem, where only extremely few samples are available in the target domains. Under this realistic setting, we focus on the fast adaptation capability of meta-learners by proposing an effective dual adaptive representation alignment approach. In our approach, a prototypical feature alignment is first proposed to recalibrate support instances as prototypes and reproject these prototypes with a differentiable closed-form solution. The feature spaces of the learned knowledge can therefore be adaptively transformed to query spaces through the cross-instance and cross-prototype relations. Beyond feature alignment, we further present a normalized distribution alignment module, which exploits prior statistics of the query samples to address the covariate shifts between the support and query samples. With these two modules, a progressive meta-learning framework is constructed to perform fast adaptation with extremely few-shot samples while maintaining its generalization capabilities. Experimental evidence demonstrates that our approach achieves new state-of-the-art results on 4 CDFSL benchmarks and 4 fine-grained cross-domain benchmarks.
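A sketch of a differentiable closed-form reprojection in the spirit described above: a ridge-regression map fitted from support features to their prototypes, then applied to queries. The paper's exact recalibration differs in detail:

```python
import numpy as np

def reproject(support: np.ndarray, prototypes: np.ndarray,
              queries: np.ndarray, lam: float = 0.1) -> np.ndarray:
    """support: (N, D); prototypes: (N, D), each sample's class prototype;
    queries: (M, D). Returns queries mapped into the prototype space."""
    d = support.shape[1]
    # Ridge solution W = (X^T X + lam*I)^-1 X^T P: closed form, hence
    # differentiable with respect to the features when used inside a network.
    W = np.linalg.solve(support.T @ support + lam * np.eye(d),
                        support.T @ prototypes)
    return queries @ W
```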
