ABSTRACT
Experimental design and computational modelling across the cognitive sciences often rely on measures of semantic similarity between concepts. Traditional measures of semantic similarity are typically derived from distance in taxonomic databases (e.g. WordNet), databases of participant-produced semantic features, or corpus-derived linguistic distributional similarity (e.g. CBOW), all of which are theoretically problematic in their lack of grounding in sensorimotor experience. We present a new measure of sensorimotor distance between concepts, based on multidimensional comparisons of their experiential strength across 11 perceptual and action-effector dimensions in the Lancaster Sensorimotor Norms. We demonstrate that, in modelling human similarity judgements, sensorimotor distance has comparable explanatory power to other measures of semantic similarity, explains variance in human judgements which is missed by other measures, and does so with the advantages of remaining both grounded and computationally efficient. Moreover, sensorimotor distance is equally effective for both concrete and abstract concepts. We further introduce a web-based tool (https://lancaster.ac.uk/psychology/smdistance) for easily calculating and visualising sensorimotor distance between words, featuring coverage of nearly 800 million word pairs. Supplementary materials are available at https://osf.io/d42q6/.
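As a rough illustration of a multidimensional comparison over 11 sensorimotor dimensions, the sketch below computes a cosine distance between strength vectors. The abstract does not name the distance metric, so cosine distance is an assumption here, and the three rating vectors are invented for illustration and are not taken from the Lancaster Sensorimotor Norms.

```python
import math

# The 11 dimensions of the Lancaster Sensorimotor Norms:
# six perceptual modalities followed by five action effectors.
DIMENSIONS = (
    "auditory", "gustatory", "haptic", "interoceptive", "olfactory", "visual",
    "foot_leg", "hand_arm", "head", "mouth_throat", "torso",
)


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two 11-dimensional strength vectors.

    Cosine distance is an assumed choice of metric, not one stated in the abstract.
    """
    if len(u) != len(DIMENSIONS) or len(v) != len(DIMENSIONS):
        raise ValueError("each vector needs one value per sensorimotor dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for an all-zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# Invented ratings for three hypothetical words (NOT values from the norms).
word_a = [4.0, 0.5, 1.0, 0.5, 0.2, 3.0, 0.3, 0.8, 2.5, 1.5, 0.4]
word_b = [3.8, 0.4, 1.2, 0.6, 0.3, 3.2, 0.2, 1.0, 2.4, 1.3, 0.5]
word_c = [0.3, 4.5, 1.0, 1.5, 4.0, 2.0, 0.1, 1.5, 2.0, 4.2, 0.3]

d_ab = cosine_distance(word_a, word_b)
d_ac = cosine_distance(word_a, word_c)
print(f"a-b: {d_ab:.3f}  a-c: {d_ac:.3f}")
```

With these made-up vectors, the two words with similar profiles (a and b) come out closer than the two with different profiles (a and c). Any other vector distance could be substituted for `cosine_distance` without changing the rest of the sketch.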
Subjects
Linguistics, Semantics, Humans, Concept Formation, Cognitive Science, Data Management
ABSTRACT
There is widespread interest in the relationship between the neurobiological systems supporting human cognition and emerging computational systems capable of emulating these capacities. Human speech comprehension, poorly understood as a neurobiological process, is an important case in point. Automatic Speech Recognition (ASR) systems with near-human levels of performance are now available, which provide a computationally explicit solution for the recognition of words in continuous speech. This research aims to bridge the gap between speech recognition processes in humans and machines, using novel multivariate techniques to compare incremental 'machine states', generated as the ASR analysis progresses over time, to the incremental 'brain states', measured using combined electro- and magneto-encephalography (EMEG), generated as the same inputs are heard by human listeners. This direct comparison of dynamic human and machine internal states, as they respond to the same incrementally delivered sensory input, revealed a significant correspondence between neural response patterns in human superior temporal cortex and the structural properties of ASR-derived phonetic models. Spatially coherent patches in human temporal cortex responded selectively to individual phonetic features defined on the basis of machine-extracted regularities in the speech-to-lexicon mapping process. These results demonstrate the feasibility of relating human and ASR solutions to the problem of speech recognition, and suggest the potential for further studies relating complex neural computations in human speech comprehension to the rapidly evolving ASR systems that address the same problem domain.
Subjects
Brain/physiology, Neurological Models, Computer Neural Networks, Speech Perception/physiology, Speech Recognition Software, Adult, Electroencephalography, Female, Humans, Magnetoencephalography, Male, Young Adult
ABSTRACT
Neuronal population codes are increasingly being investigated with multivariate pattern-information analyses. A key challenge is to use measured brain-activity patterns to test computational models of brain information processing. One approach to this problem is representational similarity analysis (RSA), which characterizes a representation in a brain or computational model by the distance matrix of the response patterns elicited by a set of stimuli. The representational distance matrix encapsulates what distinctions between stimuli are emphasized and what distinctions are de-emphasized in the representation. A model is tested by comparing the representational distance matrix it predicts to that of a measured brain region. RSA also enables us to compare representations between stages of processing within a given brain or model, between brain and behavioral data, and between individuals and species. Here, we introduce a Matlab toolbox for RSA. The toolbox supports an analysis approach that is simultaneously data- and hypothesis-driven. It is designed to help integrate a wide range of computational models into the analysis of multichannel brain-activity measurements as provided by modern functional imaging and neuronal recording techniques. Tools for visualization and inference enable the user to relate sets of models to sets of brain regions and to statistically test and compare the models using nonparametric inference methods. The toolbox supports searchlight-based RSA, to continuously map a measured brain volume in search of a neuronal population code with a specific geometry. Finally, we introduce the linear-discriminant t value as a measure of representational discriminability that bridges the gap between linear decoding analyses and RSA. In order to demonstrate the capabilities of the toolbox, we apply it to both simulated and real fMRI data. The key functions are equally applicable to other modalities of brain-activity measurement. 
The toolbox is freely available to the community under an open-source license agreement (http://www.mrc-cbu.cam.ac.uk/methods-and-resources/toolboxes/license/).
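The core RSA steps described in this abstract (build a representational distance matrix per system, then compare a model's matrix with the brain's) can be sketched in a few lines. This is a minimal standard-library Python illustration, not the Matlab toolbox's API: the function names are hypothetical, the data are invented, and correlation distance with Spearman rank comparison is one common choice that is assumed here rather than taken from the abstract.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm(patterns):
    """Upper triangle of the distance matrix: 1 - r for every stimulus pair."""
    pairs = combinations(range(len(patterns)), 2)
    return [1.0 - pearson(patterns[i], patterns[j]) for i, j in pairs]


def ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rank correlation, used to compare two RDMs."""
    return pearson(ranks(x), ranks(y))


# Invented response patterns: 4 stimuli, one row each (hypothetical data).
brain = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.1, 2.9, 4.2, 4.8],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [4.8, 4.1, 3.2, 1.9, 1.2],
]
# Model A groups stimuli {0, 1} and {2, 3}; model B groups {0, 2} and {1, 3}.
model_a = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.3], [0.0, 1.0, 0.8], [0.1, 0.9, 0.7]]
model_b = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.8], [0.9, 0.1, 0.3], [0.1, 0.9, 0.7]]

brain_rdm = rdm(brain)
fit_a = spearman(brain_rdm, rdm(model_a))
fit_b = spearman(brain_rdm, rdm(model_b))
print(f"model A fit: {fit_a:.2f}  model B fit: {fit_b:.2f}")
```

Because the comparison happens at the level of distance matrices, the brain patterns and model patterns need not share dimensionality (5 channels versus 3 model features here). Statistical inference on such fits, which the toolbox handles with nonparametric methods, is outside the scope of this sketch.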
Subjects
Electronic Data Processing, Brain/cytology, Brain/physiology, Computer Simulation, Humans, Theoretical Models
ABSTRACT
Current research suggests that language comprehension engages two joint but functionally distinguishable neurobiological processes: a distributed bilateral system, which supports general perceptual and interpretative processes underpinning speech comprehension, and a left hemisphere (LH) frontotemporal system, selectively tuned to the processing of combinatorial grammatical sequences, such as regularly inflected verbs in English [Marslen-Wilson, W. D., & Tyler, L. K. Morphology, language and the brain: The decompositional substrate for language comprehension. Philosophical Transactions of the Royal Society: Biological Sciences, 362, 823-836, 2007]. Here we investigated how English derivationally complex words engage these systems, asking whether they selectively activate the LH system in the same way as inflections or whether they primarily engage the bilateral system that supports nondecompositional access. In an fMRI study, we saw no evidence for selective activation of the LH frontotemporal system, even for highly transparent forms like bravely. Instead, a combination of univariate and multivariate analyses revealed the engagement of a distributed bilateral system, modulated by factors of perceptual complexity and semantic transparency. We discuss the implications for theories of the processing and representation of English derivational morphology and highlight the importance of neurobiological constraints in understanding these processes.
Subjects
Brain Mapping, Brain/physiology, Comprehension/physiology, Language, Acoustic Stimulation, Analysis of Variance, Female, Humans, Male, Reaction Time/physiology, Semantics, Speech/physiology
ABSTRACT
Introduction: In recent years, machines powered by deep learning have achieved near-human levels of performance in speech recognition. The fields of artificial intelligence and cognitive neuroscience have finally reached a similar level of performance, despite their huge differences in implementation, and so deep learning models can, in principle, serve as candidates for mechanistic models of the human auditory system. Methods: Utilizing high-performance automatic speech recognition systems, and advanced non-invasive human neuroimaging technology such as magnetoencephalography and multivariate pattern-information analysis, the current study aimed to relate machine-learned representations of speech to recorded human brain representations of the same speech. Results: In one direction, we found a quasi-hierarchical functional organization in human auditory cortex qualitatively matched with the hidden layers of deep artificial neural networks trained as part of an automatic speech recognizer. In the reverse direction, we modified the hidden layer organization of the artificial neural network based on neural activation patterns in human brains. The result was a substantial improvement in word recognition accuracy and learned speech representations. Discussion: We have demonstrated that artificial and brain neural networks can be mutually informative in the domain of speech recognition.
ABSTRACT
The human conceptual system comprises simulated information of sensorimotor experience and linguistic distributional information of how words are used in language. Moreover, the linguistic shortcut hypothesis predicts that people will use computationally cheaper linguistic distributional information where it is sufficient to inform a task response. In a pre-registered category production study, we asked participants to verbally name members of concrete and abstract categories and tested whether performance could be predicted by a novel measure of sensorimotor similarity (based on an 11-dimensional representation of sensorimotor strength) and linguistic proximity (based on word co-occurrence derived from a large corpus). As predicted, both measures predicted the order and frequency of category production but, critically, linguistic proximity had an effect above and beyond sensorimotor similarity. A follow-up study using typicality ratings as an additional predictor found that typicality was often the strongest predictor of category production variables, but it did not subsume sensorimotor and linguistic effects. Finally, we created a novel, fully grounded computational model of conceptual activation during category production, which best approximated typical human performance when conceptual activation was allowed to spread indirectly between concepts, and when candidate category members came from both sensorimotor and linguistic distributional representations. Critically, model performance was indistinguishable from typical human performance. Results support the linguistic shortcut hypothesis in semantic processing and provide strong evidence that both linguistic and grounded representations are inherent to the functioning of the conceptual system. All materials, data, and code are available at https://osf.io/vaq56/.
Subjects
Linguistics, Semantics, Follow-Up Studies, Humans, Knowledge, Language
ABSTRACT
In human visual processing, information from the visual field passes through numerous transformations before perceptual attributes such as colour are derived. The sequence of transforms involved in constructing perceptions of colour can be approximated by colour appearance models such as the CIE (2002) colour appearance model, abbreviated as CIECAM02. In this study, we test the plausibility of CIECAM02 as a model of colour processing by looking for evidence of its cortical entrainment. The CIECAM02 model predicts that colour is split into two opposing chromatic components, red-green and cyan-yellow (termed CIECAM02-a and CIECAM02-b respectively), and an achromatic component (termed CIECAM02-A). Entrainment of cortical activity to the outputs of these components was estimated using measurements of electro- and magnetoencephalographic (EMEG) activity, recorded while healthy subjects watched videos of dots changing colour. We find entrainment to chromatic component CIECAM02-a at approximately 35 ms latency bilaterally in occipital lobe regions, and entrainment to achromatic component CIECAM02-A at approximately 75 ms latency, also bilaterally in occipital regions. For comparison, transforms from a less physiologically plausible model (CIELAB) were also tested, with no significant entrainment found.