Results 1 - 20 of 46
1.
IEEE Trans Pattern Anal Mach Intell ; 45(10): 12167-12178, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37339038

ABSTRACT

In zero-shot learning (ZSL), the task of recognizing unseen categories when no data for training is available, state-of-the-art methods generate visual features from semantic auxiliary information (e.g., attributes). In this work, we propose a valid alternative (simpler, yet better scoring) to fulfill the very same task. We observe that, if first- and second-order statistics of the classes to be recognized were known, sampling from Gaussian distributions would synthesize visual features that are almost identical to the real ones as per classification purposes. We propose a novel mathematical framework to estimate first- and second-order statistics, even for unseen classes: our framework builds upon prior compatibility functions for ZSL and does not require additional training. Endowed with such statistics, we take advantage of a pool of class-specific Gaussian distributions to solve the feature generation stage through sampling. We exploit an ensemble mechanism to aggregate a pool of softmax classifiers, each trained in a one-seen-class-out fashion to better balance the performance over seen and unseen classes. Neural distillation is finally applied to fuse the ensemble into a single architecture which can perform inference through one forward pass only. Our method, termed Distilled Ensemble of Gaussian Generators, scores favorably with respect to state-of-the-art works.
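
The feature generation stage described in this abstract (sampling visual features from class-specific Gaussians once means and covariances are known) can be illustrated with a minimal sketch. It assumes the first- and second-order statistics have already been estimated; the estimation framework itself is not reproduced here. The class names, the 2-D statistics, and the helpers `cholesky` and `sample_features` are hypothetical and chosen only for illustration.

```python
import math
import random


def cholesky(cov):
    """Return lower-triangular L with L @ L.T == cov (cov symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def sample_features(mean, cov, n_samples, rng):
    """Draw synthetic feature vectors x = mean + L z, with z ~ N(0, I)."""
    L = cholesky(cov)
    d = len(mean)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        samples.append(
            [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        )
    return samples


# Hypothetical (mean, covariance) pairs for two unseen classes.
class_stats = {
    "zebra": ([0.0, 0.0], [[1.0, 0.5], [0.5, 2.0]]),
    "okapi": ([5.0, 5.0], [[1.0, 0.0], [0.0, 1.0]]),
}

rng = random.Random(0)
synthetic = {
    name: sample_features(mean, cov, 2000, rng)
    for name, (mean, cov) in class_stats.items()
}
```

The synthetic features in `synthetic` could then be used to train an ordinary classifier over the unseen classes; the ensemble and distillation stages mentioned in the abstract are outside the scope of this sketch.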

2.
PLoS One ; 18(3): e0280987, 2023.
Article in English | MEDLINE | ID: mdl-36888612

ABSTRACT

Our brain constantly combines sensory information into a unitary percept to build coherent representations of the environment. Even though this process could appear smooth, integrating sensory inputs from various sensory modalities must overcome several computational issues, such as recoding and statistical inference problems. Following these assumptions, we developed a neural architecture replicating humans' ability to use audiovisual spatial representations. We considered the well-known ventriloquist illusion as a benchmark to evaluate its phenomenological plausibility. Our model closely replicated human perceptual behavior, proving to be a faithful approximation of the brain's ability to develop audiovisual spatial representations. Considering its ability to model audiovisual performance in a spatial localization task, we release our model in conjunction with the dataset we recorded for its validation. We believe it will be a powerful tool to model and better understand multisensory integration processes in experimental and rehabilitation environments.


Subject(s)
Illusions, Visual Perception, Humans, Auditory Perception, Brain, Computer Simulation, Acoustic Stimulation, Photic Stimulation
3.
Hum Brain Mapp ; 44(6): 2294-2306, 2023 04 15.
Article in English | MEDLINE | ID: mdl-36715247

ABSTRACT

Multiple sclerosis (MS) is a neurological condition characterized by severe structural brain damage and by functional reorganization of the main brain networks that try to limit the clinical consequences of structural burden. Resting-state (RS) functional connectivity (FC) abnormalities found in this condition were shown to be variable across different MS phases, according to the severity of clinical manifestations. The article describes a system exploiting machine learning on RS FC matrices to discriminate different MS phenotypes and to identify relevant functional connections for MS stage characterization. To this end, the system exploits some mathematical properties of covariance-based RS FC representation, which can be described by a Riemannian manifold. The classification performance of the proposed framework was significantly above the chance level for all MS phenotypes. Moreover, the proposed system was successful in identifying relevant RS FC alterations contributing to an accurate phenotype classification.


Subject(s)
Multiple Sclerosis, Humans, Multiple Sclerosis/diagnostic imaging, Brain Mapping, Artificial Intelligence, Magnetic Resonance Imaging, Neural Pathways/diagnostic imaging, Brain/diagnostic imaging, Phenotype
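
The abstract above states that covariance-based functional connectivity matrices can be described by a Riemannian manifold, without naming a metric. As a hedged illustration only, the sketch below computes the affine-invariant Riemannian distance, one common choice for symmetric positive definite (SPD) matrices, restricted to the 2x2 case so that the eigenvalues have a closed form. The function name `airm_distance_2x2` and the toy matrices are assumptions, not part of the described system.

```python
import math


def airm_distance_2x2(a, b):
    """Affine-invariant Riemannian distance between two 2x2 SPD matrices.

    d(a, b) = sqrt(sum_i log(lambda_i)^2), where lambda_i are the
    eigenvalues of inv(a) @ b.
    """
    det_a = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    det_b = b[0][0] * b[1][1] - b[0][1] * b[1][0]
    # Trace of inv(a) @ b, using the closed-form inverse of a 2x2 matrix.
    trace = (
        a[1][1] * b[0][0] - a[0][1] * b[1][0]
        - a[1][0] * b[0][1] + a[0][0] * b[1][1]
    ) / det_a
    det_ratio = det_b / det_a  # determinant of inv(a) @ b
    # Eigenvalues solve lambda^2 - trace * lambda + det_ratio = 0.
    disc = math.sqrt(max(trace * trace - 4.0 * det_ratio, 0.0))
    lam1 = (trace + disc) / 2.0
    lam2 = det_ratio / lam1
    return math.sqrt(math.log(lam1) ** 2 + math.log(lam2) ** 2)


# Toy two-region connectivity matrices (hypothetical values).
fc_weak = [[1.0, 0.2], [0.2, 1.0]]
fc_strong = [[1.0, 0.8], [0.8, 1.0]]
distance = airm_distance_2x2(fc_weak, fc_strong)
```

Real connectivity matrices have one row per brain region, so a practical version would need a general eigenvalue solver; distances of this kind can then feed any distance-based classifier.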
4.
Sci Rep ; 12(1): 19073, 2022 11 09.
Article in English | MEDLINE | ID: mdl-36351956

ABSTRACT

In this paper, we investigate brain activity associated with complex visual tasks, showing that electroencephalography (EEG) data can help computer vision in reliably recognizing actions from video footage that is used to stimulate human observers. Notably, we consider not only typical "explicit" video action benchmarks, but also more complex data sequences in which action concepts are only referred to implicitly. To this end, we consider a challenging action recognition benchmark dataset, Moments in Time, whose video sequences do not explicitly visualize actions, but only implicitly refer to them (e.g., fireworks in the sky as an extreme example of "flying"). We employ such videos as stimuli and involve a large sample of subjects to collect high-definition, multi-modal EEG and video data, designed for understanding action concepts. We discover an agreement among brain activities of different subjects stimulated by the same video footage. We name this agreement subjects consensus, and we design a computational pipeline to transfer knowledge from EEG to video, sharply boosting the recognition performance.


Subject(s)
Electroencephalography, Recognition (Psychology), Humans, Consensus, Brain
5.
IEEE Trans Image Process ; 31: 7102-7115, 2022.
Article in English | MEDLINE | ID: mdl-36346862

ABSTRACT

Acoustic images are an emergent data modality for multimodal scene understanding. Such images have the peculiarity of distinguishing the spectral signature of the sound coming from different directions in space, thus providing richer information than that derived from single or binaural microphones. However, acoustic images are typically generated by cumbersome and costly microphone arrays which are not as widespread as ordinary microphones. This paper shows that it is still possible to generate acoustic images from off-the-shelf cameras equipped with only a single microphone and how they can be exploited for audio-visual scene understanding. We propose three architectures inspired by Variational Autoencoder, U-Net and adversarial models, and we assess their advantages and drawbacks. Such models are trained to generate spatialized audio by conditioning them on the associated video sequence and its corresponding monaural audio track. Our models are trained using the data collected by a microphone array as ground truth. Thus they learn to mimic the output of an array of microphones in the very same conditions. We assess the quality of the generated acoustic images considering standard generation metrics and different downstream tasks (classification, cross-modal retrieval and sound localization). We also evaluate our proposed models by considering multimodal datasets containing acoustic images, as well as datasets containing just monaural audio signals and RGB video frames. In all of the addressed downstream tasks we obtain notable performance using the generated acoustic data, when compared to the state of the art and to the results obtained using real acoustic images as input.


Subject(s)
Acoustics, Sound Localization
6.
Neuroimage ; 239: 118288, 2021 10 01.
Article in English | MEDLINE | ID: mdl-34147631

ABSTRACT

The relationship between structure and function is of interest in many research fields involving the study of complex biological processes. In neuroscience in particular, the fusion of structural and functional data can help to understand the underlying principles of the operational networks in the brain. To address this issue, this paper proposes a constrained autoregressive model leading to a representation of effective connectivity that can be used to better understand how the structure modulates the function, or simply to find novel biomarkers characterizing groups of subjects. In practice, an initial structural connectivity representation is re-weighted to explain the functional co-activations. This is obtained by minimizing the reconstruction error of an autoregressive model constrained by the structural connectivity prior. The model has been designed to also include indirect connections, making it possible to separate direct and indirect components in the functional connectivity, and it can be used with raw and deconvolved BOLD signals. The derived representation of dependencies was compared to the well-known dynamic causal model, giving results closer to the known ground truth. Further evaluation of the proposed effective network was performed on two typical tasks. In a first experiment the direct functional dependencies were tested on a community detection problem, where the brain was partitioned using the effective networks across multiple subjects. In a second experiment the model was validated in a case-control task, which aimed at differentiating healthy subjects from individuals with autism spectrum disorder. Results showed that using effective connectivity leads to clusters better describing the functional interactions in the community detection task, while maintaining the original structural organization, and obtaining a better discrimination in the case-control classification task.


Subject(s)
Brain/anatomy & histology, Connectome, Neurological Models, Nerve Net/diagnostic imaging, Autism Spectrum Disorder/diagnostic imaging, Brain/diagnostic imaging, Causality, Computer Simulation, Datasets as Topic, Default Mode Network, Humans, Structure-Activity Relationship
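
The core idea in the abstract above (minimizing the reconstruction error of an autoregressive model constrained by a structural connectivity prior) can be sketched as follows. This is a simplified illustration, not the authors' model: it assumes a first-order model x[t] ≈ A x[t-1], treats the structural prior as a hard binary mask, and ignores indirect connections. The names `fit_masked_ar` and `mse`, the toy matrix `A_true`, and the step size are all hypothetical.

```python
import random


def fit_masked_ar(series, mask, lr=0.05, n_iter=200):
    """Fit x[t] ~ A x[t-1] by gradient descent on the mean squared error.

    Entries of A are forced to zero wherever mask[i][j] == 0, i.e. where
    the (assumed) structural prior reports no connection.
    """
    n = len(mask)
    A = [[0.0] * n for _ in range(n)]
    pairs = list(zip(series[:-1], series[1:]))
    for _ in range(n_iter):
        grad = [[0.0] * n for _ in range(n)]
        for prev, curr in pairs:
            pred = [sum(A[i][j] * prev[j] for j in range(n)) for i in range(n)]
            for i in range(n):
                err = pred[i] - curr[i]
                for j in range(n):
                    grad[i][j] += 2.0 * err * prev[j] / len(pairs)
        for i in range(n):
            for j in range(n):
                # Gradient step followed by projection onto the mask.
                A[i][j] = (A[i][j] - lr * grad[i][j]) * mask[i][j]
    return A


def mse(series, A):
    """Mean squared one-step reconstruction error of x[t] ~ A x[t-1]."""
    n = len(A)
    total, count = 0.0, 0
    for prev, curr in zip(series[:-1], series[1:]):
        for i in range(n):
            pred = sum(A[i][j] * prev[j] for j in range(n))
            total += (pred - curr[i]) ** 2
            count += 1
    return total / count


# Toy 3-region system whose true couplings respect the structural mask.
A_true = [[0.5, 0.3, 0.0],
          [0.0, 0.5, 0.3],
          [0.0, 0.0, 0.5]]
mask = [[1, 1, 0],
        [0, 1, 1],
        [0, 0, 1]]

rng = random.Random(0)
x = [0.0, 0.0, 0.0]
series = [x]
for _ in range(200):
    x = [sum(A_true[i][j] * x[j] for j in range(3)) + rng.gauss(0.0, 1.0)
         for i in range(3)]
    series.append(x)

A_hat = fit_masked_ar(series, mask)
```

In this sketch the fitted matrix `A_hat` plays the role of the re-weighted connectivity: non-zero only where the structure allows it, with weights chosen to explain the observed dynamics.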
7.
IEEE Trans Pattern Anal Mach Intell ; 43(11): 4196-4202, 2021 Nov.
Article in English | MEDLINE | ID: mdl-33493111

ABSTRACT

In state-of-the-art deep single-label classification models, the top-k (k = 2, 3, 4, ...) accuracy is usually significantly higher than the top-1 accuracy. This is more evident in fine-grained datasets, where differences between classes are quite subtle. Exploiting the information provided in the top k predicted classes boosts the final prediction of a model. We propose Guided Zoom, a novel way in which explainability could be used to improve model performance. We do so by making sure the model has "the right reasons" for a prediction. The reason/evidence upon which a deep neural network makes a prediction is defined to be the grounding, in the pixel space, for a specific class conditional probability in the model output. Guided Zoom examines how reasonable the evidence used to make each of the top-k predictions is. Test time evidence is deemed reasonable if it is coherent with evidence used to make similar correct decisions at training time. This leads to better informed predictions. We explore a variety of grounding techniques and study their complementarity for computing evidence. We show that Guided Zoom results in an improvement of a model's classification accuracy and achieves state-of-the-art classification performance on four fine-grained classification datasets. Our code is available at https://github.com/andreazuna89/Guided-Zoom.
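
The gap between top-1 and top-k accuracy that motivates this work can be made concrete with a small sketch. The helper `top_k_accuracy` and the toy scores below are illustrative assumptions and are not taken from the Guided Zoom code base.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


# Toy class scores for three samples over three classes (hypothetical).
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.5, 0.4],
]
labels = [0, 2, 0]

top1 = top_k_accuracy(scores, labels, 1)  # only the first sample is correct
top2 = top_k_accuracy(scores, labels, 2)  # the second sample is recovered
```

Here the second sample is misclassified at top-1 but its true class is the runner-up, which is the kind of case a re-ranking of the top-k candidates could correct.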

8.
Sci Rep ; 10(1): 16549, 2020 10 06.
Article in English | MEDLINE | ID: mdl-33024225

ABSTRACT

The retina is a complex circuit of the central nervous system whose aim is to encode visual stimuli prior to the higher-order processing performed in the visual cortex. Due to the importance of its role, modeling the retina to advance in interpreting its spiking activity output is a well-studied problem. In particular, it has been shown that latent variable models can be used to model the joint distribution of Retinal Ganglion Cells (RGCs). In this work, we validate the applicability of Restricted Boltzmann Machines to model the spiking activity responses of a large population of RGCs recorded with high-resolution electrode arrays. In particular, we show that latent variables can encode modes in the RGC activity distribution that are closely related to the visual stimuli. In contrast to previous work, we further validate our findings by comparing results associated with recordings from retinas under normal and altered encoding conditions obtained by pharmacological manipulation. In these conditions, we observe that the model reflects well-known physiological behaviors of the retina. Finally, we show that we can also discover temporal patterns, associated with distinct dynamics of the stimuli.


Subject(s)
Machine Learning, Neural Networks (Computer), Retina/physiology, Retinal Ganglion Cells/physiology, Algorithms, Animals, Mice, Photic Stimulation
9.
Sci Robot ; 5(46), 2020 09 30.
Article in English | MEDLINE | ID: mdl-32999049

ABSTRACT

The increasing presence of robots in society necessitates a deeper understanding of the attitudes people have toward robots. People may treat robots as mechanistic artifacts or may consider them to be intentional agents. This might result in explaining robots' behavior as stemming from operations of the mind (intentional interpretation) or as a result of mechanistic design (mechanistic interpretation). Here, we examined whether individual attitudes toward robots can be differentiated on the basis of the default neural activity pattern during resting state, measured with electroencephalography (EEG). Participants observed scenarios in which a humanoid robot was depicted performing various actions embedded in daily contexts. Before they were introduced to the task, we measured their resting state EEG activity. We found that resting state EEG beta activity differentiated people who were later inclined toward interpreting robot behaviors as either mechanistic or intentional. This pattern is similar to the pattern of activity in the default mode network, which was previously demonstrated to have a social role. In addition, gamma activity observed when participants were making decisions about a robot's behavior indicates a relationship between theory of mind and said attitudes. Thus, we provide evidence that individual biases toward treating robots as either intentional agents or mechanistic artifacts can be detected at the neural level, already in a resting state EEG signal.


Subject(s)
Attitude, Brain/physiology, Robotics/instrumentation, Adult, Beta Rhythm/physiology, Electroencephalography, Female, Gamma Rhythm/physiology, Humans, Male, Prejudice, Rest/physiology, Task Performance and Analysis, Young Adult
11.
IEEE Trans Pattern Anal Mach Intell ; 42(10): 2581-2593, 2020 10.
Article in English | MEDLINE | ID: mdl-31331879

ABSTRACT

Heterogeneous data modalities can provide complementary cues for several tasks, usually leading to more robust algorithms and better performance. However, while training data can be accurately collected to include a variety of sensory modalities, it is often the case that not all of them are available in real life (testing) scenarios, where a model has to be deployed. This raises the challenge of how to extract information from multimodal data in the training stage, in a form that can be exploited at test time, considering limitations such as noisy or missing modalities. This paper presents a new approach in this direction for RGB-D vision tasks, developed within the adversarial learning and privileged information frameworks. We consider the practical case of learning representations from depth and RGB videos, while relying only on RGB data at test time. We propose a new approach to train a hallucination network that learns to distill depth information via adversarial learning, resulting in a clean approach without several losses to balance or hyperparameters. We report state-of-the-art results for object classification on the NYUD dataset, and video action recognition on the largest multimodal dataset available for this task, the NTU RGB+D, as well as on the Northwestern-UCLA.

12.
Article in English | MEDLINE | ID: mdl-31535994

ABSTRACT

Small object tracking is becoming an increasingly important task, which, however, has been largely unexplored in computer vision. The main challenges stem from the facts that 1) small objects show extremely vague and variable appearances, and 2) they tend to be lost more easily than normal-sized ones due to the shaking of the lens. In this paper, we propose a novel aggregation signature suitable for small object tracking, especially aiming for the challenge of sudden and large drift. We make a three-fold contribution in this work. First, technically, we propose a new descriptor, named aggregation signature, based on saliency, able to represent highly distinctive features for small objects. Second, theoretically, we prove that the proposed signature matches the foreground object more accurately with a high probability. Third, experimentally, the aggregation signature achieves high performance on multiple datasets, outperforming the state-of-the-art methods by large margins. Moreover, we contribute two newly collected benchmark datasets, i.e., small90 and small112, for visually small object tracking. The datasets will be available at https://github.com/bczhangbczhang/.

13.
iScience ; 16: 242-249, 2019 Jun 28.
Article in English | MEDLINE | ID: mdl-31200114

ABSTRACT

Can social gaze behavior reveal the leader during real-world group interactions? To answer this question, we developed a novel tripartite approach combining (1) computer vision methods for remote gaze estimation, (2) a detailed taxonomy to encode the implicit semantics of multi-party gaze features, and (3) machine learning methods to establish dependencies between leadership and visual behaviors. We found that social gaze behavior distinctively identified group leaders. Crucially, the relationship between leadership and gaze behavior generalized across democratic and autocratic leadership styles under conditions of low and high time-pressure, suggesting that gaze can serve as a general marker of leadership. These findings provide the first direct evidence that group visual patterns can reveal leadership across different social behaviors and validate a new promising method for monitoring natural group interactions.

14.
Neuroimage ; 196: 1-15, 2019 08 01.
Article in English | MEDLINE | ID: mdl-30953833

ABSTRACT

In this paper, we present an automated approach for segmenting multiple sclerosis (MS) lesions from multi-modal brain magnetic resonance images. Our method is based on a deep end-to-end 2D convolutional neural network (CNN) for slice-based segmentation of 3D volumetric data. The proposed CNN includes a multi-branch downsampling path, which enables the network to encode information from multiple modalities separately. Multi-scale feature fusion blocks are proposed to combine feature maps from different modalities at different stages of the network. Then, multi-scale feature upsampling blocks are introduced to upsize combined feature maps to leverage information from lesion shape and location. We trained and tested the proposed model using orthogonal plane orientations of each 3D modality to exploit the contextual information in all directions. The proposed pipeline is evaluated on two different datasets: a private dataset including 37 MS patients and a publicly available dataset known as the ISBI 2015 longitudinal MS lesion segmentation challenge dataset, consisting of 14 MS patients. Considering the ISBI challenge, at the time of submission, our method was amongst the top performing solutions. On the private dataset, using the same array of performance metrics as in the ISBI challenge, the proposed approach shows high improvements in MS lesion segmentation compared with other publicly available tools.


Subject(s)
Brain/diagnostic imaging, Brain/pathology, Three-Dimensional Imaging/methods, Magnetic Resonance Imaging, Multiple Sclerosis/diagnostic imaging, Multiple Sclerosis/pathology, Adult, Female, Humans, Male, Middle Aged, Neural Networks (Computer)
15.
Sci Rep ; 9(1): 65, 2019 01 11.
Article in English | MEDLINE | ID: mdl-30635604

ABSTRACT

The analysis of the brain from a connectivity perspective is revealing novel insights into brain structure and function. Discovery is, however, hindered by the lack of prior knowledge used to make hypotheses. Additionally, exploratory data analysis is made complex by the high dimensionality of data. Indeed, to assess the effect of pathological states on brain networks, neuroscientists are often required to evaluate experimental effects in case-control studies, with hundreds of thousands of connections. In this paper, we propose an approach to identify the multivariate relationships in brain connections that characterize two distinct groups, hence permitting the investigators to immediately discover the subnetworks that contain information about the differences between experimental groups. In particular, we are interested in data discovery related to connectomics, where the connections that characterize differences between two groups of subjects are found. Nevertheless, those connections do not necessarily maximize the accuracy in classification since this does not guarantee reliable interpretation of specific differences between groups. In practice, our method exploits recent machine learning techniques employing sparsity to deal with weighted networks describing the whole-brain macro connectivity. We evaluated our technique on functional and structural connectomes from human and murine brain data. In our experiments, we automatically identified disease-relevant connections in datasets with supervised and unsupervised anatomy-driven parcellation approaches and by using high-dimensional datasets.


Subject(s)
Brain/anatomy & histology, Connectome/methods, Nerve Net/anatomy & histology, Neural Pathways/anatomy & histology, Animals, Brain/physiology, Humans, Mice, Nerve Net/physiology, Neural Pathways/physiology
16.
PLoS Biol ; 16(5): e2003663, 2018 05.
Article in English | MEDLINE | ID: mdl-29813050

ABSTRACT

Sleep science is entering a new era, thanks to new data-driven analysis approaches that, combined with mouse gene-editing technologies, show a promise in functional genomics and translational research. However, the investigation of sleep is time consuming and not suitable for large-scale phenotypic datasets, mainly due to the need for subjective manual annotations of electrophysiological states. Moreover, the heterogeneous nature of sleep, with all its physiological aspects, is not fully accounted for by the current system of sleep stage classification. In this study, we present a new data-driven analysis approach offering a plethora of novel features for the characterization of sleep. This novel approach allowed for identifying several substages of sleep that were hidden to standard analysis. For each of these substages, we report an independent set of homeostatic responses following sleep deprivation. By using our new substages classification, we have identified novel differences among various genetic backgrounds. Moreover, in a specific experiment with the Zfhx3 mouse line, a recent circadian mutant expressing both shortening of the circadian period and abnormal sleep architecture, we identified specific sleep states that account for genotypic differences at specific times of the day. These results add a further level of interaction between circadian clock and sleep homeostasis and indicate that dissecting sleep in multiple states is physiologically relevant and can lead to the discovery of new links between sleep phenotypes and genetic determinants. Therefore, our approach has the potential to significantly enhance the understanding of sleep physiology through the study of single mutations. Moreover, this study paves the way to systematic high-throughput analyses of sleep.


Subject(s)
Sleep Stages, Animals, Circadian Clocks/genetics, Electroencephalography, Genotype, Male, Mice (Inbred Strains), Unsupervised Machine Learning
17.
IEEE Trans Cybern ; 48(5): 1619-1632, 2018 May.
Article in English | MEDLINE | ID: mdl-28622682

ABSTRACT

A novel method is proposed for generic target tracking by audio measurements from a microphone array. To cope with noisy environments characterized by persistent and high energy interfering sources, a classification map (CM) based on spectral signatures is calculated by means of a machine learning algorithm. Next, the CM is combined with the acoustic map, describing the spatial distribution of sound energy, in order to obtain a cleaned joint map in which contributions from the disturbing sources are removed. A likelihood function is derived from this map and fed to a particle filter yielding the target location estimation on the acoustic image. The method is tested on two real environments, addressing both speaker and vehicle tracking. The comparison with a couple of trackers, relying on the acoustic map only, shows a sharp improvement in performance, paving the way to the application of audio tracking in real challenging environments.
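
The pipeline in this abstract (combine a classification map with the acoustic energy map, derive a likelihood, feed it to a particle filter) can be sketched in one dimension. The sketch makes several assumptions that the abstract does not confirm: the two maps are combined by element-wise product, the likelihood is the joint map value at the nearest cell, and the target follows a random-walk motion model. The names `joint_map` and `pf_step` and all numeric values are hypothetical.

```python
import random


def joint_map(acoustic, class_map):
    """Suppress interfering sources by weighting sound energy with the
    per-cell probability that the sound belongs to the target class."""
    return [a * c for a, c in zip(acoustic, class_map)]


def pf_step(particles, jmap, rng, motion_std=0.5):
    """One predict / weight / estimate / resample cycle of a bootstrap particle filter."""
    last = len(jmap) - 1.0
    # Predict: random-walk motion, clipped to the map extent.
    moved = [min(max(p + rng.gauss(0.0, motion_std), 0.0), last) for p in particles]
    # Weight: likelihood read from the joint map at the nearest cell.
    weights = [jmap[int(round(p))] + 1e-12 for p in moved]
    total = sum(weights)
    estimate = sum(w * p for w, p in zip(weights, moved)) / total
    # Resample in proportion to the weights.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return resampled, estimate


# A loud interferer sits at cell 1; the (quieter) target sits at cell 3.
acoustic = [0.1, 0.9, 0.2, 0.8, 0.1]
class_map = [0.1, 0.05, 0.3, 0.9, 0.2]
jmap = joint_map(acoustic, class_map)

rng = random.Random(0)
particles = [rng.uniform(0.0, 4.0) for _ in range(500)]
for _ in range(10):
    particles, estimate = pf_step(particles, jmap, rng)
```

On the acoustic map alone the strongest cell is the interferer, whereas on the joint map the target cell dominates, so the filter settles near the target.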

18.
Cell Rep ; 18(10): 2521-2532, 2017 03 07.
Article in English | MEDLINE | ID: mdl-28273464

ABSTRACT

We present a method for automated spike sorting for recordings with high-density, large-scale multielectrode arrays. Exploiting the dense sampling of single neurons by multiple electrodes, an efficient, low-dimensional representation of detected spikes consisting of estimated spatial spike locations and dominant spike shape features is exploited for fast and reliable clustering into single units. Millions of events can be sorted in minutes, and the method is parallelized and scales better than quadratically with the number of detected spikes. Performance is demonstrated using recordings with a 4,096-channel array and validated using anatomical imaging, optogenetic stimulation, and model-based quality control. A comparison with semi-automated, shape-based spike sorting exposes significant limitations of conventional methods. Our approach demonstrates that it is feasible to reliably isolate the activity of up to thousands of neurons and that dense, multi-channel probes substantially aid reliable spike sorting.


Subject(s)
Action Potentials/physiology, Electrophysiology/instrumentation, Animals, Electrodes, Three-Dimensional Imaging, Mice (Inbred C57BL), Neurological Models, Optogenetics, Reproducibility of Results, Retinal Ganglion Cells/physiology
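
The abstract above relies on estimated spatial spike locations obtained from the dense sampling of each neuron by several electrodes. One simple way to obtain such an estimate, used here purely as an assumption because the abstract does not specify the estimator, is an amplitude-weighted center of mass over neighboring electrodes. The helper `spike_location` and the toy layout are hypothetical.

```python
def spike_location(electrode_xy, amplitudes):
    """Amplitude-weighted center of mass of a spike across nearby electrodes.

    electrode_xy: list of (x, y) electrode positions.
    amplitudes:   non-negative spike amplitude magnitudes, one per electrode.
    """
    total = sum(amplitudes)
    x = sum(a * pos[0] for a, pos in zip(amplitudes, electrode_xy)) / total
    y = sum(a * pos[1] for a, pos in zip(amplitudes, electrode_xy)) / total
    return x, y


# Four electrodes on a unit square; the spike is stronger on the right-hand pair.
electrodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
amplitudes = [1.0, 3.0, 1.0, 3.0]
location = spike_location(electrodes, amplitudes)  # (0.75, 0.5)
```

Locations of this kind, combined with a few spike shape features, give the low-dimensional representation that a clustering step could then group into single units.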
19.
Artif Intell Med ; 70: 1-11, 2016 06.
Article in English | MEDLINE | ID: mdl-27431033

ABSTRACT

OBJECTIVE: High-throughput technologies have generated an unprecedented amount of high-dimensional gene expression data. Algorithmic approaches could be extremely useful to distill information and derive compact interpretable representations of the statistical patterns present in the data. This paper proposes a mining approach to extract an informative representation of gene expression profiles based on a generative model called the Counting Grid (CG). METHOD: Using the CG model, gene expression values are arranged on a discrete grid, learned in a way that "similar" co-expression patterns are arranged in close proximity, thus resulting in an intuitive visualization of the dataset. Moreover, the model makes it possible to identify the genes that distinguish between classes (e.g. different types of cancer). Finally, each sample can be characterized with a discriminative signature, extracted from the model, that can be effectively employed for classification. RESULTS: A thorough evaluation on several gene expression datasets demonstrates the suitability of the proposed approach from a twofold perspective: numerically, we reached state-of-the-art classification accuracies on 5 datasets out of 7, and similar results when the approach is tested in a gene selection setting (with a stability always above 0.87); clinically, by confirming that many of the genes highlighted by the model as significant also play a key role in cancer biology. CONCLUSION: The proposed framework can be successfully exploited to meaningfully visualize the samples, detect medically relevant genes, and properly classify samples.


Subject(s)
Algorithms, Data Mining, Gene Expression Profiling, Cluster Analysis, Neoplasm Genes, Humans, Neoplasms/genetics
20.
Springerplus ; 5(1): 2114, 2016.
Article in English | MEDLINE | ID: mdl-28090428

ABSTRACT

[This corrects the article DOI: 10.1186/s40064-016-2786-0.]
