Results 1 - 20 of 58
1.
IEEE Trans Image Process ; 30: 9470-9481, 2021.
Article in English | MEDLINE | ID: mdl-34780327

ABSTRACT

Fine-grained visual recognition aims to classify objects with visually similar appearances into subcategories, and it has made great progress with the development of deep CNNs. However, handling subtle differences between subcategories remains a challenge. In this paper, we propose to solve this issue in one unified framework from two aspects, i.e., constructing feature-level interrelationships and capturing part-level discriminative features. This framework, namely PArt-guided Relational Transformers (PART), learns discriminative part features with an automatic part discovery module and explores the intrinsic correlations with a feature transformation module by adapting Transformer models from the field of natural language processing. The part discovery module efficiently discovers the discriminative regions, which correspond closely to the gradient descent procedure. The feature transformation module then builds correlations within the global embedding and multiple part embeddings, enhancing spatial interactions among semantic pixels. Moreover, our proposed approach does not rely on additional part branches at inference time and reaches state-of-the-art performance on three widely used fine-grained object recognition benchmarks. Experimental results and explainable visualizations demonstrate the effectiveness of our proposed approach.
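
The idea of building correlations between a global embedding and several part embeddings with a Transformer can be illustrated with plain scaled dot-product self-attention. The sketch below is not the PART architecture: it assumes a single attention head with identity query/key/value projections, and the function names and toy embeddings are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity projections (an assumption).

    tokens: list of equal-length embedding vectors (lists of floats).
    Returns (attention weights, mixed output embeddings).
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    weights, outputs = [], []
    for query in tokens:
        # Scaled dot product between this token and every token.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Each output is the attention-weighted average of all tokens.
        outputs.append(
            [sum(w * tok[j] for w, tok in zip(row, tokens)) for j in range(dim)]
        )
    return weights, outputs


# Hypothetical toy embeddings: one global token followed by two part tokens.
global_embedding = [1.0, 0.0, 1.0, 0.0]
part_embeddings = [[0.9, 0.1, 0.8, 0.0], [0.0, 1.0, 0.0, 1.0]]
attn_weights, mixed = self_attention([global_embedding] + part_embeddings)
```

In this toy setup the global token attends more strongly to the first part token, whose embedding is similar to it, than to the second, which is orthogonal to it.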

2.
Cereb Cortex ; 2021 Sep 02.
Article in English | MEDLINE | ID: mdl-34476462

ABSTRACT

The "sensory recruitment hypothesis" posits an essential role of sensory cortices in working memory, beyond the well-accepted frontoparietal areas. Yet this hypothesis has recently been challenged. In the present study, participants performed a delayed orientation recall task while high-spatial-resolution 3 T functional magnetic resonance imaging (fMRI) signals were measured in posterior cortices. A multivariate inverted encoding model approach was used to decode remembered orientations from blood oxygen level-dependent fMRI signals in visual cortices during the delay period. We found that not only did activity in the contralateral primary visual cortex (V1) retain high-fidelity representations of the visual stimuli, but activity in the ipsilateral V1 also contained such orientation tuning. Moreover, although the encoded tuning faded in the contralateral V1 during the late delay period, tuning information in the ipsilateral V1 was sustained. Furthermore, the ipsilateral representation was present in secondary visual cortex (V2) as well, but not in other higher-level visual areas. These results support the sensory recruitment hypothesis and extend it to the ipsilateral sensory areas, indicating a distributed involvement of visual areas in visual working memory.

3.
Article in English | MEDLINE | ID: mdl-34379596

ABSTRACT

Tracking-by-detection is a very popular framework for single-object tracking that searches for the target object within a local search window in each frame. Although such a local search mechanism works well on simple videos, it makes trackers sensitive to extremely challenging scenarios, such as heavy occlusion and fast motion. In this article, we propose a novel and general target-aware attention mechanism (termed TANet) and integrate it with a tracking-by-detection framework to conduct joint local and global search for robust tracking. Specifically, we extract the features of the target object patch and continuous video frames; then, we concatenate them and feed them into a decoder network to generate target-aware global attention maps. More importantly, we resort to adversarial training for better attention prediction. The appearance and motion discriminator networks are designed to ensure the consistency of the predicted attention in spatial and temporal views. In the tracking procedure, we integrate target-aware attention with multiple trackers by exploring candidate search regions for robust tracking. Extensive experiments on both short- and long-term tracking benchmark datasets validate the effectiveness of our algorithm.

4.
Article in English | MEDLINE | ID: mdl-34428133

ABSTRACT

Open set recognition (OSR), which aims to simultaneously classify the seen classes and identify the unseen classes as unknown, is essential for reliable machine learning. The key challenge of OSR is how to simultaneously reduce the empirical classification risk on the labeled known data and the open space risk on the potential unknown data. To handle this challenge, we formulate the open space risk problem from the perspective of multi-class integration and model the unexploited extra-class space with a novel concept, the Reciprocal Point. Following this, a novel Adversarial Reciprocal Point Learning framework is proposed to minimize the overlap between the known and unknown distributions without loss of known classification accuracy. Specifically, each reciprocal point is learned from the extra-class space of the corresponding known category, and the confrontation among multiple known categories is employed to reduce the empirical classification risk. An adversarial margin constraint is proposed to reduce the open space risk by limiting the latent open space constructed by reciprocal points. Moreover, an instantiated adversarial enhancement method is designed to generate diverse and confusing training samples. Extensive experimental results on various benchmark datasets indicate that the proposed method is significantly superior to existing approaches and achieves state-of-the-art performance.

5.
IEEE Trans Image Process ; 30: 6855-6868, 2021.
Article in English | MEDLINE | ID: mdl-34319875

ABSTRACT

Image-based salient object detection has made great progress over the past decades, especially after the revival of deep neural networks. With the aid of attention mechanisms that weight image features adaptively, recent deep learning-based models encourage the predicted results to approximate the ground-truth masks with as large predictable areas as possible, thus achieving state-of-the-art performance. However, these methods do not pay enough attention to small areas prone to misprediction. As a result, it remains difficult to accurately locate salient objects because of regions with indistinguishable foreground and background and regions with complex or fine structures. To address these problems, we propose a novel convolutional neural network with a purificatory mechanism and a structural similarity loss. Specifically, in order to better locate preliminary salient objects, we first introduce the promotion attention, which is based on spatial and channel attention mechanisms, to promote attention to salient regions. Subsequently, to restore the indistinguishable regions, which can be regarded as the error-prone regions of a model, we propose the rectification attention, which is learned from the areas of wrong prediction and guides the network to focus on error-prone regions, thus rectifying errors. Through these two attentions, we use the Purificatory Mechanism to impose strict weights on different regions of whole salient objects and purify results from hard-to-distinguish regions, thus accurately predicting the locations and details of salient objects. In addition to paying different attention to these hard-to-distinguish regions, we also consider the structural constraints on complex regions and propose the Structural Similarity Loss. The proposed loss models the region-level pair-wise relationship between regions to help these regions calibrate their own saliency values. In experiments, the proposed purificatory mechanism and structural similarity loss both effectively improve performance, and the proposed approach outperforms 19 state-of-the-art methods on six datasets by a notable margin. The proposed method is also efficient and runs at over 27 FPS on a single NVIDIA 1080Ti GPU.

6.
Article in English | MEDLINE | ID: mdl-34125685

ABSTRACT

We propose a novel network pruning approach that preserves the information of pretrained network weights (filters). Network pruning with information preservation is formulated as a matrix sketch problem, which is efficiently solved by the off-the-shelf frequent direction method. Our approach, referred to as FilterSketch, encodes the second-order information of pretrained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure. FilterSketch requires neither training from scratch nor data-driven iterative optimization, leading to a several-orders-of-magnitude reduction in the time cost of pruning optimization. Experiments on CIFAR-10 show that FilterSketch reduces floating-point operations (FLOPs) by 63.3% and prunes 59.9% of network parameters with negligible accuracy cost for ResNet-110. On ILSVRC-2012, it reduces FLOPs by 45.5% and removes 43.0% of parameters with only a 0.69% accuracy drop for ResNet-50. Our code and pruned models can be found at https://github.com/lmbxmu/FilterSketch.

7.
Pattern Recognit ; 118: 108006, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34002101

ABSTRACT

The rapidly spreading pandemic of coronavirus disease (COVID-19) has had a devastating impact on global public health. In treating the disease, medical imaging has emerged as a useful tool for diagnosis. However, the computed tomography (CT) diagnosis of COVID-19 requires extensive clinical experience from experts. Therefore, it is essential to achieve rapid and accurate segmentation and detection of COVID-19. This paper proposes a simple yet efficient and general-purpose network, called Sequential Region Generation Network (SRGNet), to jointly detect and segment the lesion areas of COVID-19. SRGNet makes full use of the supervised segmentation information and outputs multi-scale segmentation predictions. In this way, high-quality lesion-area suggestions can be generated on the predicted segmentation maps, reducing the diagnosis cost. At the same time, the detection results in turn refine the segmentation map through a post-processing procedure, which significantly improves the segmentation accuracy. The superiority of SRGNet over state-of-the-art methods is validated through extensive experiments on the COVID-19 database that we built.

8.
J Assist Reprod Genet ; 38(5): 1133-1141, 2021 May.
Article in English | MEDLINE | ID: mdl-33656621

ABSTRACT

PURPOSE: The sperm DNA fragmentation index (DFI) was quantitatively measured, and its relationship with age, semen quality, and infertility conditions was investigated. METHODS: Routine semen tests and sperm DFI measurements were performed in 2760 infertile men and 2354 men whose spouses had experienced at least one unexplained miscarriage, in order to analyze the correlation between sperm DNA damage, routine semen parameters, and age. RESULTS: Sperm DFI was significantly lower in patients whose wives had experienced unexplained miscarriage than in infertile men (p < 0.001). An inverse correlation between sperm DFI and sperm progressive motility was observed (rs = -0.465, p < 0.001), and sperm DFI was positively correlated with age (rs = 0.255, p < 0.001). However, correlations between sperm DFI and sperm concentration, semen volume, total sperm count, and motile sperm count were not demonstrated. CONCLUSIONS: Sperm DFI is an important indicator for evaluating semen quality. Sperm DNA integrity testing is preferentially recommended for those who have decreased sperm progressive motility, especially older men. An integrative analysis of sperm DFI, sperm progressive motility, age, and infertility conditions can provide a more comprehensive assessment of male fertility.
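
The rs values reported above are Spearman rank correlation coefficients. As a minimal sketch of how such a coefficient is computed, the code below ranks both variables (averaging tied ranks) and takes the Pearson correlation of the ranks. The sample values are hypothetical and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rs: the Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical illustrative values (percent), not taken from the study.
dfi = [8.0, 12.5, 15.0, 22.0, 30.0, 41.0]
progressive_motility = [55.0, 48.0, 50.0, 35.0, 28.0, 20.0]
rs_motility = spearman(dfi, progressive_motility)
```

A negative rs, as in this toy sample, means that higher DFI values tend to accompany lower progressive motility.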


Subjects
DNA Fragmentation; Infertility, Male/genetics; Reproduction/genetics; Semen Analysis; DNA Damage/genetics; Fertility/genetics; Humans; Infertility, Male/pathology; Male; Semen/cytology; Sperm Count; Sperm Motility/genetics; Spermatozoa/growth & development; Spermatozoa/pathology
9.
IEEE Trans Pattern Anal Mach Intell ; 43(9): 2936-2952, 2021 09.
Article in English | MEDLINE | ID: mdl-33710952

ABSTRACT

Neural architecture search (NAS) has achieved unprecedented performance in various computer vision tasks. However, most existing NAS methods are deficient in search efficiency and model generalizability. In this paper, we propose a novel NAS framework, termed MIGO-NAS, that aims to guarantee efficiency and generalizability in arbitrary search spaces. On the one hand, we formulate the search space as a multivariate probabilistic distribution, which is then optimized by a novel multivariate information-geometric optimization (MIGO). By approximating the distribution with a sampling, training, and testing pipeline, MIGO guarantees memory efficiency, training efficiency, and search flexibility. In addition, MIGO is the first method to reduce the estimation error of the natural gradient in multivariate distributions. On the other hand, for a set of specific constraints, the neural architectures are generated by a novel dynamic programming network generation (DPNG), which significantly reduces the training cost under various hardware environments. Experiments validate the advantages of our approach over existing methods by establishing superior accuracy and efficiency, i.e., 2.39 test error on the CIFAR-10 benchmark and 21.7 on the ImageNet benchmark, with only 1.5 GPU hours and 96 GPU hours for searching, respectively. Moreover, the searched architectures generalize well to computer vision tasks including object detection and semantic segmentation, i.e., 25× FLOPs compression with a 6.4 mAP gain on the Pascal VOC dataset, and 29.9× FLOPs compression with only a 1.41 percent performance drop on the Cityscapes dataset. The code is publicly available.

10.
Article in English | MEDLINE | ID: mdl-33684047

ABSTRACT

Event cameras, as bioinspired vision sensors, have shown great advantages in high dynamic range and high temporal resolution in vision tasks. Asynchronous spikes from event cameras can be described using marked spatiotemporal point processes (MSTPPs). However, how to measure the distance between asynchronous spikes in MSTPPs remains an open issue. To address this problem, we propose a general asynchronous spatiotemporal spike metric for event cameras that considers both spatiotemporal structural properties and polarity attributes. Technically, the conditional probability density function is first introduced to describe the spatiotemporal distribution and polarity prior in the MSTPPs. In addition, a spatiotemporal Gaussian kernel is defined to capture the spatiotemporal structure, which transforms discrete spikes into a continuous function in a reproducing kernel Hilbert space (RKHS). Finally, the distance between asynchronous spikes can be quantified by the inner product in the RKHS. The experimental results demonstrate that the proposed approach outperforms state-of-the-art methods and achieves a significant improvement in computational efficiency. In particular, it better captures changes involving spatiotemporal structural properties and polarity attributes.
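
The kernel-based distance described above can be sketched as follows. This is a simplified illustration, not the metric from the paper: it assumes a product of a spatiotemporal Gaussian kernel and a polarity-match indicator (the paper's polarity prior is more elaborate), and the kernel widths, function names, and spike trains are hypothetical. The squared distance between two spike trains is the usual RKHS expansion of inner products, <a, a> - 2<a, b> + <b, b>.

```python
import math


def spike_kernel(a, b, sigma_space=2.0, sigma_time=5.0):
    """Gaussian kernel between two spikes given as (x, y, t, polarity).

    Assumption: spikes of opposite polarity contribute zero similarity.
    """
    (xa, ya, ta, pa), (xb, yb, tb, pb) = a, b
    if pa != pb:
        return 0.0
    space = ((xa - xb) ** 2 + (ya - yb) ** 2) / (2 * sigma_space ** 2)
    time = (ta - tb) ** 2 / (2 * sigma_time ** 2)
    return math.exp(-(space + time))


def inner_product(train_a, train_b):
    """RKHS inner product of two spike trains: sum of pairwise kernel values."""
    return sum(spike_kernel(a, b) for a in train_a for b in train_b)


def spike_distance(train_a, train_b):
    """Distance induced by the inner product; zero for identical trains."""
    squared = (
        inner_product(train_a, train_a)
        - 2 * inner_product(train_a, train_b)
        + inner_product(train_b, train_b)
    )
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative rounding


# Hypothetical spike trains: (x pixel, y pixel, time in ms, polarity).
reference = [(10, 10, 0.0, 1), (11, 10, 2.0, 1), (12, 11, 4.0, -1)]
jittered = [(10, 10, 0.5, 1), (11, 10, 2.5, 1), (12, 11, 4.5, -1)]
shifted = [(30, 30, 0.0, 1), (31, 30, 2.0, 1), (32, 31, 4.0, -1)]
```

With these toy inputs, a train that is only slightly jittered in time stays closer to the reference than one that has been moved to a distant image region.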

11.
IEEE Trans Cybern ; PP, 2021 Jan 05.
Article in English | MEDLINE | ID: mdl-33400673

ABSTRACT

Neuronal circuits formed in the brain are complex, with intricate connection patterns. Such complexity is also observed in the retina, despite its relatively simple neuronal circuit. A retinal ganglion cell (GC) receives excitatory inputs from neurons in previous layers as driving forces to fire spikes. Analytical methods are required to decipher these components in a systematic manner. Recently, a method called spike-triggered non-negative matrix factorization (STNMF) has been proposed for this purpose. In this study, we extend the scope of the STNMF method. Using retinal GCs as a model system, we show that STNMF can detect various computational properties of upstream bipolar cells (BCs), including spatial receptive field, temporal filter, and transfer nonlinearity. In addition, we recover synaptic connection strengths from the weight matrix of STNMF. Furthermore, we show that STNMF can separate the spikes of a GC into a few subsets, where each subset is contributed by one presynaptic BC. Taken together, these results corroborate that STNMF is a useful method for deciphering the structure of neuronal circuits.

12.
IEEE Trans Pattern Anal Mach Intell ; 43(5): 1636-1648, 2021 05.
Article in English | MEDLINE | ID: mdl-31751267

ABSTRACT

Semantic object part segmentation is a fundamental task in object understanding and geometric analysis. A clear understanding of part relationships can be of great use to the segmentation process. In this work, we propose a novel Ordinal Multi-task Part Segmentation (OMPS) approach that explicitly models the part ordinal relationship to guide the segmentation process in a recurrent manner. Quantitative and qualitative experiments are first conducted to explore the mutual impacts among object parts, and an ordinal part inference algorithm is then formulated from the experimental observations. Specifically, our framework is mainly composed of two modules: the forward module, which segments multiple parts as individual subtasks with prior knowledge, and the recurrent module, which generates appropriate part priors with the ordinal inference algorithm. These two modules work iteratively to optimize the segmentation performance and the network parameters. Experimental results show that our approach outperforms state-of-the-art models on human and vehicle part parsing benchmarks. Comprehensive evaluations are conducted to demonstrate the effectiveness of our approach in object part segmentation.

13.
Article in English | MEDLINE | ID: mdl-33270558

ABSTRACT

Online image hashing, which processes large-scale data in a streaming fashion to update the hash functions on the fly, has recently received increasing research attention. Most existing works explore this problem under a supervised setting, i.e., using class labels to boost the hashing performance, and suffer from defects in both adaptivity and efficiency. First, large amounts of training batches are required to learn up-to-date hash functions, which leads to poor online adaptivity. Second, the training is time-consuming, which contradicts the core need of online learning. In this paper, a novel supervised online hashing scheme, termed Fast Class-wise Updating for Online Hashing (FCOH), is proposed to address the above two challenges by introducing a novel and efficient inner product operation. To achieve fast online adaptivity, a class-wise updating method is developed to decompose the binary code learning and alternately renew the hash functions in a class-wise fashion, which relieves the burden of large amounts of training batches. Quantitatively, such a decomposition further leads to at least 75% storage saving. To further achieve online efficiency, we propose a semi-relaxation optimization, which accelerates the online training by treating different binary constraints independently. Without additional constraints and variables, the time complexity is significantly reduced. Such a scheme is also quantitatively shown to preserve past information well while the hash functions are updated. We have quantitatively demonstrated that the collective effort of class-wise updating and semi-relaxation optimization provides superior performance compared to various state-of-the-art methods, which is verified through extensive experiments on three widely used datasets.
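
As background for the hash functions and inner product mentioned above, the sketch below shows a generic linear hash function that maps a feature vector to a ±1 code, together with the standard identity linking the inner product of two codes of length r to their Hamming distance (inner product = r - 2 × Hamming distance). It does not implement FCOH's class-wise updating or semi-relaxation optimization; the projection matrix and feature vectors are hypothetical.

```python
def hash_code(projections, features):
    """One bit per projection row: the sign of its inner product with the input."""
    return [
        1 if sum(w * v for w, v in zip(row, features)) >= 0 else -1
        for row in projections
    ]


def hamming_distance(code_a, code_b):
    """Number of positions where two codes disagree."""
    return sum(1 for p, q in zip(code_a, code_b) if p != q)


def code_inner_product(code_a, code_b):
    """Inner product of two +1/-1 codes."""
    return sum(p * q for p, q in zip(code_a, code_b))


# Hypothetical 4-bit hash function over 3-dimensional features.
projections = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, -1.0, 0.0],
]
code_a = hash_code(projections, [0.5, 0.2, -0.3])
code_b = hash_code(projections, [0.4, 0.6, -0.1])
```

Here the two codes differ only in the last bit, so their Hamming distance is 1 and their inner product is 4 - 2 × 1 = 2.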

14.
Opt Express ; 28(17): 25308-25318, 2020 Aug 17.
Article in English | MEDLINE | ID: mdl-32907054

ABSTRACT

We propose an effective scheme for high-precision three-dimensional (3D) atom localization by measuring the population of the excited state in a four-level atomic system driven by a probe field and three orthogonal standing-wave fields. In this scheme, the position-dependent multiphoton quantum destructive interference leads to multiphoton excitation of the excited state and enhances the fluorescence emission. We show that adjusting the frequency detuning and phase shifts associated with the standing-wave fields can modify the multiphoton quantum destructive interference and lead to a redistribution of the atoms. The maximal probability of finding the atom at a certain position in one period of the standing-wave fields can be 100%, and the highest spatial precision is about 0.02λ.

15.
Reprod Toxicol ; 94: 8-12, 2020 06.
Article in English | MEDLINE | ID: mdl-32259568

ABSTRACT

Sperm DNA fragmentation index (SDF), as an important supplement to routine semen parameters, has been proposed to discriminate between fertile and infertile men and to predict the outcomes of natural conception and in vitro fertilization. Unfortunately, there is uncertainty and contradictory evidence regarding the importance of SDF. An important reason is that significant and fundamental research on SDF is rare. This study was designed to characterize the microRNA (miRNA) expression profile in the seminal plasma of normospermic patients with different SDF and its implications for human fertility. Using next-generation sequencing (NGS), a total of 897 human miRNAs were detected in 10 seminal plasma samples, of which 431 were differentially expressed in 5 pairs of seminal plasma samples (each pair obtained from the same man), and 14 were identified in all the pairs. According to the fold change and expression level, 7 miRNAs, including miR-374b-5p, miR-429, hsa-miR-26b-5p, miR-21-5p, miR-4257, miR-135b-5p and miR-134-5p, were selected for further investigation. MiR-374b-5p and miR-26b-5p were significantly different in 3 sets of individual seminal plasma samples with different SDF from a total of 90 infertile patients (30 patients per set). Our results demonstrate that significantly decreased expression of miR-374b and miR-26b could be used as a first indication of increased SDF. Moreover, miR-374b and miR-26b could serve as adjunct biomarkers for the diagnosis of idiopathic male infertility.


Subjects
DNA Fragmentation; Infertility, Male/genetics; MicroRNAs; Semen/metabolism; Adult; Biomarkers/metabolism; High-Throughput Nucleotide Sequencing; Humans; Male; Middle Aged; Real-Time Polymerase Chain Reaction; Young Adult
16.
Front Genet ; 11: 319, 2020.
Article in English | MEDLINE | ID: mdl-32318099

ABSTRACT

The impact of aging on reproductive outcomes has received considerable critical attention; however, there is much less information available on the effects of paternal age than on the effects of maternal age. In this study, methylation levels of sperm rDNA promoter regions and Long Interspersed Nucleotide Element 1 (LINE-1) were measured using pyrosequencing, and fertilization, day 3 good-quality embryo, pregnancy, and implantation results were assessed. We observed significantly increasing levels of DNA methylation in the sperm rDNA promoter regions with age, based on stratifying the samples by age alone (P = 0.0001) and on linear regression analysis (P < 0.0001). Meanwhile, no statistically significant correlation was observed between global LINE-1 methylation and age. No statistically significant correlations were observed between sperm rDNA promoter methylation levels and either the day 3 good-quality embryo rate or the clinical pregnancy rate. In contrast, the correlation between sperm rDNA promoter methylation levels and fertilization (2 pronuclei) rate was nearly significant (P = 0.0707), especially for the methylation levels of some individual CpG units (CpG_10, P = 0.0176; CpG_11, P = 0.0438; CpG_14, P = 0.0232) and for rDNA promoter methylation levels measured using primerS2 (P = 0.0513). No significant correlation was found between sperm rDNA promoter methylation levels and fertilization rates (2 pronuclei, 1 pronucleus, and 1 polypronuclei). Our results demonstrate that sperm are susceptible to age-associated alterations in methylation levels of rDNA promoter regions, suggesting that sperm rDNA promoter methylation levels can be applied to DNA methylation-based age prediction, and that the aberrant methylation of rDNA promoters may be partially responsible for the enhanced disease susceptibility of offspring sired by older fathers. Methylation levels of sperm rDNA promoter regions may correlate with polypronuclei rates in IVF programs.

17.
Sci Rep ; 10(1): 7146, 2020 04 28.
Article in English | MEDLINE | ID: mdl-32346004

ABSTRACT

Most existing recognition algorithms are proposed for closed set scenarios, where all categories are known beforehand. In practice, however, recognition is essentially an open set problem. There are categories we know, called "knowns", and there are more that we do not know, called "unknowns". Enumerating all categories beforehand is never possible; consequently, it is infeasible to prepare sufficient training samples for those unknowns. Applying closed set recognition methods will naturally lead to unseen-category errors. To address this problem, we propose the prototype-based Open Deep Network (P-ODN) for open set recognition tasks. Specifically, we introduce prototype learning into open set recognition. Prototypes and prototype radiuses are trained jointly to guide a CNN to derive more discriminative features. P-ODN then detects the unknowns by applying a multi-class triplet thresholding method based on the distance metric between features and prototypes. The unknowns detected in the previous step are then manually labeled as new categories. Predictors for new categories are added to the classification layer to "open" the deep neural network so that it can incorporate new categories dynamically. The weights of the new predictors are carefully initialized by applying a distance-based algorithm to transfer the learned knowledge. Consequently, this initialization method speeds up the fine-tuning process and reduces the number of samples needed to train new predictors. Extensive experiments show that P-ODN can effectively detect unknowns and needs only a few samples with human intervention to recognize a new category. In real-world scenarios, our method achieves state-of-the-art performance on the UCF11, UCF50, UCF101 and HMDB51 datasets.
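
The use of prototypes and radii to reject unknowns can be sketched with a single distance threshold per class. This is a deliberately simplified stand-in for the multi-class triplet thresholding named in the abstract, not the P-ODN method itself: a feature is assigned to its nearest prototype only if it lies within that prototype's radius, and is otherwise flagged as unknown. The class names, prototypes, and radii are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def classify_open_set(feature, prototypes, radii):
    """Return the nearest class label, or "unknown" if outside its radius.

    Assumption: one fixed radius per class replaces the paper's
    multi-class triplet thresholding.
    """
    label, distance = min(
        ((name, euclidean(feature, proto)) for name, proto in prototypes.items()),
        key=lambda item: item[1],
    )
    return label if distance <= radii[label] else "unknown"


# Hypothetical 2-D prototypes and radii for two known classes.
prototypes = {"biking": [0.0, 0.0], "diving": [4.0, 4.0]}
radii = {"biking": 1.0, "diving": 1.5}

near_biking = classify_open_set([0.3, 0.4], prototypes, radii)
near_diving = classify_open_set([4.5, 3.5], prototypes, radii)
between = classify_open_set([2.0, 2.0], prototypes, radii)
```

Samples flagged as unknown in this way would then be candidates for manual labeling as new categories, as the abstract describes.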

18.
Neural Netw ; 126: 42-51, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32197212

ABSTRACT

Recent studies have suggested that the cognitive process of the human brain is realized as probabilistic inference and can be further modeled by probabilistic graphical models such as Markov random fields. Nevertheless, it remains unclear how probabilistic inference can be implemented by a network of spiking neurons in the brain. Previous studies have tried to relate the inference equation of binary Markov random fields to the dynamic equation of spiking neural networks through the belief propagation algorithm and reparameterization, but they are valid only for Markov random fields with limited network structure. In this paper, we propose a spiking neural network model that can implement inference for arbitrary binary Markov random fields. Specifically, we design a spiking recurrent neural network and prove that its neuronal dynamics are mathematically equivalent to the inference process of Markov random fields by adopting mean-field theory. Furthermore, our mean-field approach unifies previous works. Theoretical analysis and experimental results, together with an application to image denoising, demonstrate that our proposed spiking neural network achieves results comparable to those of mean-field inference.
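
For reference, mean-field inference on a binary Markov random field can be sketched as a fixed-point iteration. The code below is not the spiking network from the paper; it assumes an Ising-style parameterization with states in {-1, +1}, local fields h_i, and symmetric pairwise couplings J_ij, for which the mean-field equations are m_i = tanh(h_i + sum_j J_ij m_j). The damping factor, iteration count, and example parameters are hypothetical choices.

```python
import math


def mean_field(h, J, iterations=200, damping=0.5):
    """Damped fixed-point iteration of m_i = tanh(h_i + sum_j J[i][j] * m_j).

    h: list of local fields; J: symmetric coupling matrix (diagonal ignored).
    Returns the approximate mean of each binary variable, in [-1, 1].
    """
    n = len(h)
    m = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            # Damping blends the old and new value to stabilize convergence.
            m[i] = (1 - damping) * m[i] + damping * math.tanh(field)
    return m


# Hypothetical 3-node chain: only the first node has an external field.
h = [0.5, 0.0, 0.0]
J = [
    [0.0, 0.3, 0.0],
    [0.3, 0.0, 0.3],
    [0.0, 0.3, 0.0],
]
magnetization = mean_field(h, J)
# Approximate marginal probability that each variable equals +1.
marginals = [(1 + m) / 2 for m in magnetization]
```

In this chain the influence of the external field on the first node decays along the couplings, so the approximate means decrease from the first node to the last.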


Subjects
Models, Neurological; Neural Networks, Computer; Brain/physiology; Humans; Models, Statistical; Neurons/physiology
19.
IEEE Trans Cybern ; 2020 Mar 10.
Article in English | MEDLINE | ID: mdl-32167923

ABSTRACT

Deep convolutional neural networks (CNNs) have demonstrated impressive performance on many visual tasks. Recently, they have become useful models of the visual system in neuroscience. However, it is still not clear what CNNs learn in terms of neuronal circuits. When a deep CNN with many layers is used to model the visual system, it is not easy to compare the structural components of CNNs with possible neuroscience underpinnings because of the highly complex circuits from the retina to the higher visual cortex. Here, we address this issue by focusing on single retinal ganglion cells with biophysical models and recording data from animals. By training CNNs with white noise images to predict neuronal responses, we found that fine structures of the retinal receptive field can be revealed. Specifically, the learned convolutional filters resemble biological components of the retinal circuit. This suggests that a CNN learning from a single retinal cell reveals a minimal neural network carried out in this cell. Furthermore, when CNNs learned from different cells are transferred between cells, there is a diversity of transfer learning performance, which indicates that CNNs are cell specific. Moreover, when CNNs are transferred between different types of input images, here white noise versus natural images, transfer learning performs well, which implies that CNNs indeed capture the full computational ability of a single retinal cell for different inputs. Taken together, these results suggest that CNNs could be used to reveal structural components of neuronal circuits and provide a powerful model for neural system identification.

20.
Neural Netw ; 125: 19-30, 2020 May.
Article in English | MEDLINE | ID: mdl-32070853

ABSTRACT

Neural coding is one of the central questions in systems neuroscience for understanding how the brain processes stimuli from the environment. It is also a cornerstone for designing brain-machine interface algorithms, where decoding incoming stimuli is in high demand for better performance of physical devices. Traditionally, researchers have focused on functional magnetic resonance imaging (fMRI) data as the neural signals of interest for decoding visual scenes. However, our visual perception operates on a fast, millisecond time scale through events termed neural spikes. There are few studies of decoding using spikes. Here we address this gap by developing a novel decoding framework based on deep neural networks, named the spike-image decoder (SID), for reconstructing natural visual scenes, including static images and dynamic videos, from experimentally recorded spikes of a population of retinal ganglion cells. The SID is an end-to-end decoder with neural spikes at one end and images at the other, which can be trained directly such that visual scenes are reconstructed from spikes in a highly accurate fashion. Our SID also outperforms existing fMRI decoding models in reconstructing visual stimuli. In addition, with the aid of a spike encoder, we show that SID can be generalized to arbitrary visual scenes by using the image datasets of MNIST, CIFAR10, and CIFAR100. Furthermore, with a pre-trained SID, one can decode any dynamic video to achieve real-time encoding and decoding of visual scenes by spikes. Altogether, our results shed new light on neuromorphic computing for artificial visual systems, such as event-based visual cameras and visual neuroprostheses.


Subjects
Evoked Potentials, Visual; Models, Neurological; Neural Networks, Computer; Visual Cortex/physiology; Animals; Brain-Computer Interfaces; Magnetic Resonance Imaging; Retinal Ganglion Cells/physiology; Visual Cortex/diagnostic imaging; Visual Perception