Results 1 - 20 of 62
1.
Comput Biol Med ; 176: 108545, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38749325

ABSTRACT

Reliable classification of sleep stages is crucial in sleep medicine and neuroscience research for providing valuable insights, diagnoses, and understanding of brain states. The current gold standard method for sleep stage classification is polysomnography (PSG). Unfortunately, PSG is an expensive and cumbersome process involving numerous electrodes, often conducted in an unfamiliar clinic and annotated by a professional. Although commercial devices like smartwatches track sleep, their performance is well below PSG. To address these disadvantages, we present a feed-forward neural network that achieves gold-standard levels of agreement using only a single lead of electrocardiography (ECG) data. Specifically, the median five-stage Cohen's kappa is 0.725 on a large, diverse dataset of 5- to 90-year-old subjects. Comparisons with a comprehensive meta-analysis of between-human inter-rater agreement confirm the non-inferior performance of our model. Finally, we developed a novel loss function to align the training objective with Cohen's kappa. Our method offers an inexpensive, automated, and convenient alternative for sleep stage classification, further enhanced by a real-time scoring option. Cardiosomnography, or a sleep study conducted with ECG only, could take expert-level sleep studies outside the confines of clinics and laboratories and into realistic settings. This advancement democratizes access to high-quality sleep studies, considerably enhancing the field of sleep medicine and neuroscience. It makes less-expensive, higher-quality studies accessible to a broader community, enabling improved sleep research and more personalized, accessible sleep-related healthcare interventions.
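
For readers unfamiliar with the agreement metric quoted in this abstract, the following is a minimal sketch of Cohen's kappa for two raters. It is not the authors' code or their kappa-aligned loss function; the function name `cohens_kappa` and the ten example epochs are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of epochs where both raters give the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical five-stage labels (W, N1, N2, N3, REM); not data from the study.
human = ["W", "N1", "N2", "N2", "N3", "N3", "REM", "REM", "W", "N2"]
model = ["W", "N2", "N2", "N2", "N3", "N2", "REM", "REM", "W", "N2"]
kappa = cohens_kappa(human, model)
print(round(kappa, 3))
```

In this toy example the raters agree on 8 of 10 epochs, and the chance agreement from the marginals is 0.25, so kappa is (0.8 - 0.25) / 0.75.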


Subject(s)
Electrocardiography , Neural Networks, Computer , Sleep Stages , Humans , Electrocardiography/methods , Sleep Stages/physiology , Adult , Middle Aged , Male , Aged , Adolescent , Female , Aged, 80 and over , Child , Child, Preschool , Polysomnography/methods , Signal Processing, Computer-Assisted
2.
Sci Adv ; 10(3): eadk1525, 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38232159

ABSTRACT

Field-programmable gate arrays (FPGAs) are widely used in the acceleration of deep learning applications because of their reconfigurability, flexibility, and fast time-to-market. However, conventional FPGAs suffer from a trade-off between chip area and reconfiguration latency, making efficient FPGA accelerations that require switching between multiple configurations still elusive. Here, we propose a ferroelectric field-effect transistor (FeFET)-based context-switching FPGA supporting dynamic reconfiguration to break this trade-off, enabling loading of arbitrary configuration without interrupting the active configuration execution. Leveraging the intrinsic structure and nonvolatility of FeFETs, compact FPGA primitives are proposed and experimentally verified. The evaluation results show that our design achieves a 63.0%/74.7% reduction in look-up table (LUT)/connection block (CB) area and an 82.7%/53.6% reduction in CB/switch box power consumption with a minimal penalty in the critical path delay (9.6%). In addition, our design yields significant average time savings of 78.7% and 20.3% for context-switching and dynamic reconfiguration applications, respectively.

3.
Nat Biomed Eng ; 7(4): 546-558, 2023 04.
Article in English | MEDLINE | ID: mdl-34795394

ABSTRACT

For brain-computer interfaces (BCIs), obtaining sufficient training data for algorithms that map neural signals onto actions can be difficult, expensive or even impossible. Here we report the development and use of a generative model (a model that synthesizes a virtually unlimited number of new data distributions from a learned data distribution) that learns mappings between hand kinematics and the associated neural spike trains. The generative spike-train synthesizer is trained on data from one recording session with a monkey performing a reaching task and can be rapidly adapted to new sessions or monkeys by using limited additional neural data. We show that the model can be adapted to synthesize new spike trains, accelerating the training and improving the generalization of BCI decoders. The approach is fully data-driven, and hence applicable to BCI applications beyond motor control.


Subject(s)
Brain-Computer Interfaces , Humans , Algorithms , Neurons , Biomechanical Phenomena
4.
J Neurol ; 269(9): 4920-4938, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35501501

ABSTRACT

OBJECTIVES: This study (1) describes and compares saccade and pupil abnormalities in patients with manifest alpha-synucleinopathies (αSYN: Parkinson's disease (PD), Multiple System Atrophy (MSA)) and a tauopathy (progressive supranuclear palsy (PSP)); (2) determines whether patients with rapid-eye-movement sleep behaviour disorder (RBD), a prodromal stage of αSYN, already have abnormal responses that may indicate a risk for developing PD or MSA. METHODS: Ninety (46 RBD, 27 PD, 17 MSA) patients with an αSYN, 10 PSP patients, and 132 healthy age-matched controls (CTRL) were examined with a 10-min video-based eye-tracking task (Free Viewing). Participants were free to look anywhere on the screen while saccade and pupil behaviours were measured. RESULTS: PD, MSA, and PSP spent more time fixating the centre of the screen than CTRL. All patient groups made fewer macro-saccades (> 2° amplitude) with smaller amplitude than CTRL. Saccade frequency was greater in RBD than in other patients. Following clip change, saccades were temporarily suppressed, then rebounded at a slower pace than CTRL in all patient groups. RBD had distinct, although discrete saccade abnormalities that were more marked in PD, MSA, and even more in PSP. The vertical saccade rate was reduced in all patients and decreased most in PSP. Clip changes produced large increases or decreases in screen luminance requiring pupil constriction or dilation, respectively. PSP elicited smaller pupil constriction/dilation responses than CTRL, while MSA elicited the opposite. CONCLUSION: RBD patients already have discrete but less pronounced saccade abnormalities than PD and MSA patients. Vertical gaze palsy and altered pupil control differentiate PSP from αSYN.


Subject(s)
Multiple System Atrophy , Parkinson Disease , Supranuclear Palsy, Progressive , Synucleinopathies , Biomarkers , Eye-Tracking Technology , Humans , Multiple System Atrophy/diagnosis , Parkinson Disease/complications , Parkinson Disease/diagnosis , Supranuclear Palsy, Progressive/diagnosis
5.
Exp Brain Res ; 240(6): 1873-1885, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35445861

ABSTRACT

The pupil responds to a salient stimulus appearing in the environment, in addition to its modulation by global luminance. These pupillary responses can be evoked by visual or auditory stimuli, scaled with stimulus salience, and enhanced by multisensory presentation. In addition, pupil size is modulated by various visual stimulus attributes, such as color, area, and motion. However, research that concurrently examines the influence of different factors on pupillary responses is limited. To explore how presentation of multiple visual stimuli influences human pupillary responses, we presented arrays of visual stimuli and systematically varied their luminance, color, and set size. Saliency level, computed by the saliency model, systematically changed with set size across all conditions, with higher saliency levels in larger set sizes. Pupillary constriction responses were evoked by the appearance of visual stimuli, with larger pupillary responses observed in larger set size. These effects were pronounced even though the global luminance level was unchanged using isoluminant chromatic stimuli. Furthermore, larger pupillary constriction responses were obtained in the blue, compared to other color conditions. Together, we argue that both cortical and subcortical areas contribute to the observed pupillary constriction modulated by set size and color.


Subject(s)
Light , Pupil , Humans , Photic Stimulation , Pupil/physiology
6.
IEEE Trans Neural Netw Learn Syst ; 33(8): 3778-3791, 2022 08.
Article in English | MEDLINE | ID: mdl-33596177

ABSTRACT

The human brain is the gold standard of adaptive learning. It not only can learn and benefit from experience, but also can adapt to new situations. In contrast, deep neural networks only learn one sophisticated but fixed mapping from inputs to outputs. This limits their applicability to more dynamic situations, where the input-to-output mapping may change with different contexts. A salient example is continual learning: learning new independent tasks sequentially without forgetting previous tasks. Continual learning of multiple tasks in artificial neural networks using gradient descent leads to catastrophic forgetting, whereby a previously learned mapping of an old task is erased when learning new mappings for new tasks. Herein, we propose a new biologically plausible type of deep neural network with extra, out-of-network, task-dependent biasing units to accommodate these dynamic situations. This allows, for the first time, a single network to learn potentially unlimited parallel input-to-output mappings, and to switch on the fly between them at runtime. Biasing units are programmed by leveraging beneficial perturbations (opposite to well-known adversarial perturbations) for each task. Beneficial perturbations for a given task bias the network toward that task, essentially switching the network into a different mode to process that task. This largely eliminates catastrophic interference between tasks. Our approach is memory-efficient and parameter-efficient, can accommodate many tasks, and achieves state-of-the-art performance across different tasks and domains.


Subject(s)
Artificial Intelligence , Neural Networks, Computer , Brain , Humans , Learning
7.
Sci Rep ; 11(1): 19020, 2021 09 24.
Article in English | MEDLINE | ID: mdl-34561503

ABSTRACT

Motor brain-machine interfaces (BMIs) directly link the brain to artificial actuators and have the potential to mitigate severe body paralysis caused by neurological injury or disease. Most BMI systems involve a decoder that analyzes neural spike counts to infer movement intent. However, many classical BMI decoders (1) fail to take advantage of temporal patterns of spike trains, possibly over long time horizons; (2) are insufficient to achieve good BMI performance at high temporal resolution, as the underlying Gaussian assumption of decoders based on spike counts is violated. Here, we propose a new statistical feature that represents temporal patterns or temporal codes of spike events with a richer description, wavelet average coefficients (WAC), to be used as decoder input instead of spike counts. We constructed a wavelet decoder framework by using WAC features with a sliding-window approach, and compared the resulting decoder against classical decoders (Wiener and Kalman family) and new deep-learning-based decoders (Long Short-Term Memory) using spike count features. We found that the sliding-window approach boosts decoding temporal resolution, and using WAC features significantly improves decoding performance over using spike count features.


Subject(s)
Brain-Computer Interfaces , Motor Cortex/physiology , Animals , Haplorhini , Locomotion/physiology , Machine Learning , Neurons/physiology , Wavelet Analysis
8.
Eur J Neurosci ; 2019 May 11.
Article in English | MEDLINE | ID: mdl-31077473

ABSTRACT

The saliency map has played a long-standing role in models and theories of visual attention, and it is now supported by neurobiological evidence from several cortical and subcortical brain areas. While visual saliency is computed during moments of active fixation, it is not known whether the same is true while engaged in smooth pursuit of a moving stimulus, which is very common in real-world vision. Here, we examined extrafoveal saliency coding in the superior colliculus, a midbrain area associated with attention and gaze, during smooth pursuit eye movements. We found that SC neurons from the superficial visual layers showed a robust representation of peripheral saliency evoked by a conspicuous stimulus embedded in a wide-field array of goal-irrelevant stimuli. In contrast, visuomotor neurons from the intermediate saccade-related layers showed a poor saliency representation, even though most of these neurons were visually responsive during smooth pursuit. These results confirm and extend previous findings that place the SCs in a unique role as a saliency map that monitors peripheral vision during foveation of stationary and now moving objects.

9.
Front Neurol ; 10: 80, 2019.
Article in English | MEDLINE | ID: mdl-30833926

ABSTRACT

Background: Fetal alcohol spectrum disorder (FASD) is one of the most common causes of developmental disabilities and neurobehavioral deficits. Despite the high prevalence of FASD, the current diagnostic process is challenging and time- and money-consuming, with underreported profiles of the neurocognitive and neurobehavioral impairments because of limited clinical capacity. We assessed children/youth with FASD from a multimodal perspective and developed a high-performing, low-cost screening protocol using a machine learning framework. Methods and Findings: Participants with FASD and age-matched typically developing controls completed up to six assessments, including saccadic eye movement tasks (prosaccade, antisaccade, and memory-guided saccade), free viewing of videos, psychometric tests, and neuroimaging of the corpus callosum. We comparatively investigated new machine learning methods applied to these data, toward the acquisition of a quantitative signature of the neurodevelopmental deficits, and the development of an objective, high-throughput screening tool to identify children/youth with FASD. Our method provides a comprehensive profile of distinct measures in domains including sensorimotor and visuospatial control, visual perception, attention, inhibition, working memory, academic functions, and brain structure. We also showed that a combination of four to six assessments yields the best FASD vs. control classification accuracy; however, this protocol is expensive and time consuming. We conducted a cost/benefit analysis of the six assessments and developed a high-performing, low-cost screening protocol based on a subset of eye movement and psychometric tests that approached the best result under a range of constraints (time, cost, participant age, required administration, and access to neuroimaging facility). Using insights from the theory of value of information, we proposed an optimal annual screening procedure for children at risk of FASD.
Conclusions: We developed a high-capacity, low-cost screening procedure under constraints, with high expected monetary benefit, substantial impact on the referral and diagnostic process, and expected maximized long-term benefits to the tested individuals and to society. This annual screening procedure for children/youth at risk of FASD can be easily and widely deployed for early identification, potentially leading to earlier intervention and treatment. This is crucial for neurodevelopmental disorders, to mitigate the severity of the disorder and/or frequency of secondary comorbidities.

10.
J Vis ; 19(1): 11, 2019 01 02.
Article in English | MEDLINE | ID: mdl-30650434

ABSTRACT

Most visual saliency models that integrate top-down factors process task and context information using machine learning techniques. Although these methods have been successful in improving prediction accuracy for human attention, they require significant training data and are unable to provide an understanding of what makes information relevant to a task such that it will attract gaze. This means that we still lack a general theory for the interaction between task and attention or eye movements. Recently, Tanner and Itti (2017) proposed the theory of goal relevance to explain what makes information relevant to goals. In this work, we record eye movements of 80 participants who each played one of four variants of a Mario video game and construct a combined saliency model using features from three sources: bottom-up, learned top-down, and goal relevance. We use this model to predict the eye behavior and find that the addition of goal relevance significantly improves the Normalized Scanpath Saliency score of the model from 4.35 to 5.82 (p < 1 × 10^-100).
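
The Normalized Scanpath Saliency (NSS) score reported in this abstract can be sketched as follows, using the common definition: z-score the saliency map, then average the normalized values at the fixated pixels. This is an illustrative sketch, not the study's evaluation code; the function name `nss` and the 2 × 2 toy map are assumptions.

```python
from statistics import mean, pstdev


def nss(saliency_map, fixations):
    """Mean z-scored saliency at fixated (row, col) locations."""
    values = [v for row in saliency_map for v in row]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("saliency map is constant; NSS is undefined")
    # Average the normalized map value over all fixation locations.
    return mean((saliency_map[r][c] - mu) / sigma for r, c in fixations)


# Hypothetical toy map with one salient pixel; not data from the study.
toy_map = [[0.0, 0.0],
           [0.0, 4.0]]
score_on_peak = nss(toy_map, [(1, 1)])   # fixation lands on the salient pixel
score_off_peak = nss(toy_map, [(0, 0)])  # fixation lands on a non-salient pixel
```

A positive score means fixations fall on locations the model rates above its map average; a score near zero corresponds to chance.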


Subject(s)
Attention/physiology , Eye Movements/physiology , Goals , Visual Perception/physiology , Humans , Models, Theoretical , Pattern Recognition, Visual/physiology , Video Games
11.
Neural Comput ; 31(2): 344-387, 2019 02.
Article in English | MEDLINE | ID: mdl-30576615

ABSTRACT

This work lays the foundation for a framework of cortical learning based on the idea of a competitive column, which is inspired by the functional organization of neurons in the cortex. A column describes a prototypical organization for neurons that gives rise to an ability to learn scale, rotation, and translation-invariant features. This is empowered by a recently developed learning rule, conflict learning, which enables the network to learn over both driving and modulatory feedforward, feedback, and lateral inputs. The framework is further supported by introducing both a notion of neural ambiguity and an adaptive threshold scheme. Ambiguity, which captures the idea that too many decisions lead to indecision, gives the network a dynamic way to resolve locally ambiguous decisions. The adaptive threshold operates over multiple timescales to regulate neural activity under the varied arrival timings of input in a highly interconnected multilayer network with feedforward and feedback. The competitive column architecture is demonstrated on a large-scale (54,000 neurons and 18 million synapses), invariant model of border ownership. The model is trained on four simple, fixed-scale shapes: two squares, one rectangle, and one symmetric L-shape. Tested on 1899 synthetic shapes of varying scale and complexity, the model correctly assigned border ownership with 74% accuracy. The model's abilities were also illustrated on contours of objects taken from natural images. Combined with conflict learning, the competitive column and ambiguity give a better intuitive understanding of how feedback, modulation, and inhibition may interact in the brain to influence activation and learning.

12.
Proc Natl Acad Sci U S A ; 114(35): 9451-9456, 2017 08 29.
Article in English | MEDLINE | ID: mdl-28808026

ABSTRACT

Models of visual attention postulate the existence of a bottom-up saliency map that is formed early in the visual processing stream. Although studies have reported evidence of a saliency map in various cortical brain areas, determining the contribution of phylogenetically older pathways is crucial to understanding its origin. Here, we compared saliency coding from neurons in two early gateways into the visual system: the primary visual cortex (V1) and the evolutionarily older superior colliculus (SC). We found that, while the response latency to visual stimulus onset was earlier for V1 neurons than superior colliculus superficial visual-layer neurons (SCs), the saliency representation emerged earlier in SCs than in V1. Because the dominant input to the SCs arises from V1, these relative timings are consistent with the hypothesis that SCs neurons pool the inputs from multiple V1 neurons to form a feature-agnostic saliency map, which may then be relayed to other brain areas.


Subject(s)
Visual Cortex/physiology , Animals , Attention/physiology , Macaca mulatta , Male , Neurons/physiology , Photic Stimulation , Psychophysics , Reaction Time , Superior Colliculi , Visual Pathways/physiology
13.
Psychol Rev ; 124(2): 168-178, 2017 03.
Article in English | MEDLINE | ID: mdl-28221085

ABSTRACT

The concept of relevance is used ubiquitously in everyday life. However, a general quantitative definition of relevance has been lacking, especially as pertains to quantifying the relevance of sensory observations to one's goals. We propose a theoretical definition for the information value of data observations with respect to a goal, which we call "goal relevance." We consider the probability distribution of an agent's subjective beliefs over how a goal can be achieved. When new data are observed, their goal relevance is measured as the Kullback-Leibler divergence between belief distributions before and after the observation. Theoretical predictions about the relevance of different obstacles in simulated environments agreed with the majority response of 38 human participants in 83.5% of trials, beating multiple machine-learning models. Our new definition of goal relevance is general, quantitative, explicit, and allows one to put a number onto the previously elusive notion of relevance of observations to a goal.
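
As a rough illustration of the definition in this abstract, the sketch below computes a Kullback-Leibler divergence between a belief distribution before and after an observation. The abstract does not state the direction of the divergence or the logarithm base, so KL(posterior || prior) in bits is an assumption here, as are the function name `kl_divergence` and the four-path example.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits for two discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:
            if q_i == 0:
                return math.inf  # p puts mass where q has none
            total += p_i * math.log2(p_i / q_i)
    return total


# Hypothetical beliefs over four candidate paths to a goal.
prior = [0.25, 0.25, 0.25, 0.25]     # before the observation: all paths equally likely
posterior = [0.5, 0.5, 0.0, 0.0]     # after seeing an obstacle that blocks two paths
relevance = kl_divergence(posterior, prior)  # assumed direction: posterior vs. prior
```

An observation that leaves the beliefs unchanged has zero divergence and so, under this definition, zero goal relevance.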


Subject(s)
Attention , Goals , Intuition , Task Performance and Analysis , Humans , Models, Psychological , Models, Theoretical , Psychological Theory
14.
Neural Netw ; 88: 32-48, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28189041

ABSTRACT

Although Hebbian learning has long been a key component in understanding neural plasticity, it has not yet been successful in modeling modulatory feedback connections, which make up a significant portion of connections in the brain. We develop a new learning rule designed around the complications of learning modulatory feedback and composed of three simple concepts grounded in physiologically plausible evidence. Using border ownership as a prototypical example, we show that a Hebbian learning rule fails to properly learn modulatory connections, while our proposed rule correctly learns a stimulus-driven model. To the authors' knowledge, this is the first time a border ownership network has been learned. Additionally, we show that the rule can be used as a drop-in replacement for a Hebbian learning rule to learn a biologically consistent model of orientation selectivity, a network which lacks any modulatory connections. Our results predict that the mechanisms we use are integral for learning modulatory connections in the brain and furthermore that modulatory connections have a strong dependence on inhibition.


Subject(s)
Feedback , Machine Learning , Models, Neurological , Neural Networks, Computer , Pattern Recognition, Automated/methods , Brain/physiology , Humans , Learning/physiology , Neuronal Plasticity/physiology
15.
Nat Commun ; 8: 14263, 2017 01 24.
Article in English | MEDLINE | ID: mdl-28117340

ABSTRACT

Models of visual attention postulate the existence of a saliency map whose function is to guide attention and gaze to the most conspicuous regions in a visual scene. Although cortical representations of saliency have been reported, there is mounting evidence for a subcortical saliency mechanism, which pre-dates the evolution of neocortex. Here, we conduct a strong test of the saliency hypothesis by comparing the output of a well-established computational saliency model with the activation of neurons in the primate superior colliculus (SC), a midbrain structure associated with attention and gaze, while monkeys watched video of natural scenes. We find that the activity of SC superficial visual-layer neurons (SCs), specifically, is well-predicted by the model. This saliency representation is unlikely to be inherited from fronto-parietal cortices, which do not project to SCs, but may be computed in SCs and relayed to other areas via tectothalamic pathways.


Subject(s)
Attention/physiology , Models, Neurological , Neurons/physiology , Superior Colliculi/physiology , Visual Perception/physiology , Animals , Computer Simulation , Macaca mulatta , Male , Models, Animal , Neural Pathways , Photic Stimulation/methods , Saccades , Software , Superior Colliculi/cytology , Thalamus/physiology
16.
Behav Brain Sci ; 40: e140, 2017 01.
Article in English | MEDLINE | ID: mdl-29342619

ABSTRACT

Hulleman & Olivers (H&O) make a much-needed stride forward for a better understanding of visual search behavior by rejecting theories based on discrete stimulus items. I propose that the framework could be further enhanced by clearly delineating distinct mechanisms for attention guidance, selection, and enhancement during visual search, instead of conflating them into a single functional field of view.


Subject(s)
Attention , Photic Stimulation
17.
IEEE Trans Image Process ; 25(4): 1566-79, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26829792

ABSTRACT

A large number of saliency models, each based on a different hypothesis, have been proposed over the past 20 years. In practice, while subscribing to one hypothesis or computational principle makes a model that performs well on some types of images, it hinders the general performance of a model on arbitrary images and large-scale data sets. One natural approach to improve overall saliency detection accuracy would then be fusing different types of models. In this paper, inspired by the success of late-fusion strategies in semantic analysis and multi-modal biometrics, we propose to fuse the state-of-the-art saliency models at the score level in a para-boosting learning fashion. First, saliency maps generated by several models are used as confidence scores. Then, these scores are fed into our para-boosting learner (i.e., support vector machine, adaptive boosting, or probability density estimator) to generate the final saliency map. In order to explore the strength of para-boosting learners, traditional transformation-based fusion strategies, such as Sum, Min, and Max, are also explored and compared in this paper. To further reduce the computation cost of fusing too many models, only a few of them are considered in the next step. Experimental results show that score-level fusion outperforms each individual model and can further reduce the performance gap between the current models and the human inter-observer model.
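
The transformation-based fusion baselines named in this abstract (Sum, Min, and Max) can be sketched as pixel-wise combinations of normalized saliency maps. This is a minimal sketch of those baselines only, not of the para-boosting learner. The min-max normalization step, the function names `normalize` and `fuse`, and the two toy maps are assumptions.

```python
def normalize(saliency_map):
    """Min-max scale a 2D map to [0, 1] so maps from different models are comparable."""
    flat = [v for row in saliency_map for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in saliency_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in saliency_map]


def fuse(maps, rule="sum"):
    """Combine same-sized saliency maps pixel by pixel with the Sum, Min, or Max rule."""
    combine = {"sum": sum, "min": min, "max": max}[rule]
    normed = [normalize(m) for m in maps]
    rows, cols = len(normed[0]), len(normed[0][0])
    return [[combine([m[r][c] for m in normed]) for c in range(cols)]
            for r in range(rows)]


# Hypothetical outputs of two saliency models on a 2 x 2 image.
map_a = [[0, 2], [4, 8]]
map_b = [[1, 1], [3, 5]]
fused_sum = fuse([map_a, map_b], "sum")
fused_min = fuse([map_a, map_b], "min")
fused_max = fuse([map_a, map_b], "max")
```

Min keeps only locations that every model rates as salient, Max keeps locations that any model rates as salient, and Sum sits between the two.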

18.
Neuron ; 88(3): 442-4, 2015 Nov 04.
Article in English | MEDLINE | ID: mdl-26539886

ABSTRACT

Visually-guided behavior recruits a network of brain regions so extensive that it is often affected by neuropsychiatric disorders, producing measurable atypical oculomotor signatures. Wang et al. (2015) combine eye tracking with computational attention models to decipher the neurobehavioral signature of autism.


Subject(s)
Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/physiopathology , Eye Movements/physiology , Photic Stimulation/methods , Visual Perception/physiology , Female , Humans , Male
19.
Vision Res ; 116(Pt B): 113-26, 2015 Nov.
Article in English | MEDLINE | ID: mdl-25448115

ABSTRACT

Previous studies have shown that gaze direction of actors in a scene influences eye movements of passive observers during free-viewing (Castelhano, Wieth, & Henderson, 2007; Borji, Parks, & Itti, 2014). However, no computational model has been proposed to combine bottom-up saliency with actor's head pose and gaze direction for predicting where observers look. Here, we first learn probability maps that predict fixations leaving head regions (gaze following fixations), as well as fixations on head regions (head fixations), both dependent on the actor's head size and pose angle. We then learn a combination of gaze following, head region, and bottom-up saliency maps with a Markov chain composed of head region and non-head region states. This simple structure allows us to inspect the model and make comments about the nature of eye movements originating from heads as opposed to other regions. Here, we assume perfect knowledge of actor head pose direction (from an oracle). The combined model, which we call the Dynamic Weighting of Cues model (DWOC), explains observers' fixations significantly better than each of the constituent components. Finally, in a fully automatic combined model, we replace the oracle head pose direction data with detections from a computer vision model of head pose. Using these (imperfect) automated detections, we again find that the combined model significantly outperforms its individual components. Our work extends the engineering and scientific applications of saliency models and helps better understand mechanisms of visual attention.


Subject(s)
Eye Movements/physiology , Fixation, Ocular/physiology , Head , Pattern Recognition, Visual/physiology , Posture/physiology , Female , Humans , Imaging, Three-Dimensional , Male , Probability
20.
IEEE Trans Image Process ; 24(1): 163-75, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25420258

ABSTRACT

One of the major problems found when developing a 3D recognition system involves the choice of keypoint detector and descriptor. To help solve this problem, we present a new method for the detection of 3D keypoints on point clouds and we perform benchmarking between each pair of 3D keypoint detector and 3D descriptor to evaluate their performance on object and category recognition. These evaluations are performed on a public database of real 3D objects. Our keypoint detector is inspired by the behavior and neural architecture of the primate visual system. The 3D keypoints are extracted based on a bottom-up 3D saliency map, that is, a map that encodes the saliency of objects in the visual environment. The saliency map is determined by computing conspicuity maps (a combination across different modalities) of the orientation, intensity, and color information in a bottom-up and in a purely stimulus-driven manner. These three conspicuity maps are fused into a 3D saliency map and, finally, the focus of attention (or keypoint location) is sequentially directed to the most salient points in this map. Inhibiting this location automatically allows the system to attend to the next most salient location. The main conclusions are: with a similar average number of keypoints, our 3D keypoint detector outperforms the other eight 3D keypoint detectors evaluated by achieving the best result in 32 of the evaluated metrics in the category and object recognition experiments, whereas the second-best detector obtained the best result in only eight of these metrics. The only drawback is the computational time, since biologically inspired 3D keypoint based on bottom-up saliency is slower than the other detectors. Given that there are big differences in terms of recognition performance, size and time requirements, the selection of the keypoint detector and descriptor has to be matched to the desired task and we give some directions to facilitate this choice.
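
The selection step described in this abstract (attend to the most salient location, inhibit it, then move to the next most salient one) can be sketched as below. For brevity the sketch uses a 2D grid rather than a 3D point cloud, and a square inhibition neighborhood. Both are simplifying assumptions, as are the function name `select_keypoints`, the `radius` parameter, and the toy map; this is not the authors' detector.

```python
def select_keypoints(saliency, num_points, radius=1):
    """Pick keypoints by repeated winner-take-all with inhibition of return."""
    grid = [list(row) for row in saliency]  # work on a copy; the input is untouched
    rows, cols = len(grid), len(grid[0])
    suppressed = float("-inf")
    keypoints = []
    for _ in range(num_points):
        # Winner-take-all: find the currently most salient location.
        r, c = max(((i, j) for i in range(rows) for j in range(cols)),
                   key=lambda rc: grid[rc[0]][rc[1]])
        if grid[r][c] == suppressed:
            break  # every location has already been inhibited
        keypoints.append((r, c))
        # Inhibition of return: suppress the winner and its neighborhood.
        for i in range(max(0, r - radius), min(rows, r + radius + 1)):
            for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                grid[i][j] = suppressed
    return keypoints


# Hypothetical 4 x 4 saliency map with two separated peaks.
toy_saliency = [[0.1, 0.2, 0.1, 0.0],
                [0.2, 0.9, 0.3, 0.1],
                [0.1, 0.3, 0.2, 0.7],
                [0.0, 0.1, 0.6, 0.8]]
top_two = select_keypoints(toy_saliency, num_points=2)
```

The first keypoint is the global peak at (1, 1). Suppressing its neighborhood removes the nearby 0.7 value at (2, 3) from contention only if it lies within the radius, which it does not here, so the second keypoint is the remaining maximum at (3, 3).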


Subject(s)
Algorithms , Imaging, Three-Dimensional/methods , Models, Neurological , Pattern Recognition, Automated/methods , Software , Animals , Databases, Factual , Primates , ROC Curve , Visual Perception
...