Results 1 - 20 of 44
1.
Brain Topogr; 37(5): 731-747, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38261272

ABSTRACT

Several studies have shown that mouth movements related to the pronunciation of individual phonemes are represented in the sensorimotor cortex. In theory, this would allow brain-computer interfaces to decode continuous speech by training classifiers on sensorimotor cortex activity related to the production of individual phonemes. To address this, we investigated the decodability of trials with individual and paired phonemes (pronounced consecutively with a one-second interval) using activity in the sensorimotor cortex. Fifteen participants pronounced 3 different phonemes and 3 pairwise combinations of these phonemes in a 7T functional MRI experiment. We confirmed that support vector machine (SVM) classification of single and paired phonemes was possible. Importantly, by combining classifiers trained on single phonemes, we were able to classify paired phonemes with an accuracy of 53% (33% chance level), demonstrating that the activity of isolated phonemes is present and distinguishable in combined phonemes. An SVM searchlight analysis showed that the phoneme representations are widely distributed in the ventral sensorimotor cortex. These findings provide insight into the neural representations of single and paired phonemes. Furthermore, they support the notion that speech BCI may be feasible based on machine learning algorithms trained on individual phonemes using intracranial electrode grids.
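The pair-classification step lends itself to a short illustration. The following is a minimal, hypothetical Python sketch (scikit-learn, synthetic voxel patterns in place of the 7T fMRI data) that scores each candidate phoneme pair by summing the log-probabilities a single-phoneme SVM assigns to the pair's two constituents; the study's actual features and combination rule may differ.

```python
# Minimal sketch: classify phoneme pairs by combining a single-phoneme SVM.
# Synthetic "voxel patterns" stand in for real fMRI trial data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_vox, noise = 200, 1.5
protos = rng.normal(size=(3, n_vox))         # one prototype pattern per phoneme

# Train on noisy single-phoneme trials.
y_single = rng.integers(0, 3, 300)
X_single = protos[y_single] + noise * rng.normal(size=(300, n_vox))
clf = SVC(kernel="linear", probability=True).fit(X_single, y_single)

# Paired trials: assume the pair's pattern mixes its two phoneme patterns.
pairs = [(0, 1), (1, 2), (0, 2)]
y_pair = rng.integers(0, 3, 150)
X_pair = np.array([(protos[pairs[k][0]] + protos[pairs[k][1]]) / 2 for k in y_pair])
X_pair += noise * rng.normal(size=X_pair.shape)

# Score each candidate pair by the summed log-probability of its constituents.
logp = clf.predict_log_proba(X_pair)         # (n_trials, 3)
scores = np.array([logp[:, a] + logp[:, b] for a, b in pairs]).T
print(f"pair accuracy: {np.mean(scores.argmax(1) == y_pair):.2f} (chance 0.33)")
```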


Subjects
Magnetic Resonance Imaging, Speech, Support Vector Machine, Humans, Magnetic Resonance Imaging/methods, Male, Female, Adult, Young Adult, Speech/physiology, Brain Mapping/methods, Brain-Computer Interfaces, Sensorimotor Cortex/physiology, Sensorimotor Cortex/diagnostic imaging, Phonetics, Brain/physiology, Brain/diagnostic imaging
2.
Sensors (Basel); 24(5), 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38475158

ABSTRACT

Since the advent of modern computing, researchers have striven to make the human-computer interface (HCI) as seamless as possible. Progress has been made on various fronts, e.g., the desktop metaphor (interface design) and natural language processing (input). One area receiving attention recently is voice activation and its corollary, computer-generated speech. Despite decades of research and development, most computer-generated voices remain easily identifiable as non-human. Prosody in speech has two primary components, intonation and rhythm, both of which are often lacking in computer-generated voices. This research aims to enhance computer-generated text-to-speech algorithms by incorporating melodic and prosodic elements of human speech. This study explores a novel approach to adding prosody by using machine learning, specifically an LSTM neural network, to add paralinguistic elements to a recorded or generated voice. The aim is to increase the realism of computer-generated text-to-speech algorithms, to enhance electronic reading applications, and to improve artificial voices for those who need artificial assistance to speak. A computer that can also convey meaning with a spoken audible announcement will further improve human-computer interaction. Applications of such an algorithm may include improving high-definition audio codecs for telephony, restoring old recordings, and lowering barriers to the utilization of computing. This research deployed a prototype modular platform for digital speech improvement, analyzing and generalizing algorithms into a modular system through laboratory experiments to optimize combinations and performance in edge cases. The results were encouraging, with the LSTM-based encoder able to produce realistic speech. Further work will involve optimizing the algorithm and comparing its performance against other approaches.
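As a rough illustration of the modelling idea, here is a minimal PyTorch sketch of an LSTM that maps per-frame phoneme features to a prosodic (F0) contour; the shapes, features, and training data are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch (assumed shapes): LSTM regressing a per-frame F0 contour
# from phoneme-level input features; random tensors stand in for real data.
import torch
import torch.nn as nn

class ProsodyLSTM(nn.Module):
    def __init__(self, n_feats=40, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(n_feats, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # per-frame F0 (pitch) prediction

    def forward(self, x):                  # x: (batch, frames, n_feats)
        h, _ = self.lstm(x)
        return self.head(h).squeeze(-1)    # (batch, frames)

model = ProsodyLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# One toy training step on random data standing in for features / F0 targets.
x = torch.randn(8, 100, 40)
f0 = torch.randn(8, 100)
loss = loss_fn(model(x), f0)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```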


Subjects
Speech Perception, Speech, Speech/physiology, Speech Perception/physiology, Computers, Machine Learning
3.
Sensors (Basel); 23(23), 2023 Nov 23.
Article in English | MEDLINE | ID: mdl-38067738

ABSTRACT

This paper proposes, analyzes, and evaluates a deep learning architecture based on transformers for generating sign language motion from sign phonemes (represented using HamNoSys, a notation system developed at the University of Hamburg). The sign phonemes provide information about sign characteristics such as hand configuration, localization, or movements. The use of sign phonemes is crucial for generating sign motion with a high level of detail (including finger extensions and flexions). The transformer-based approach also includes a stop-detection module for predicting the end of the generation process. Both aspects, motion generation and stop detection, are evaluated in detail. For motion generation, the dynamic time warping (DTW) distance is used to compute the similarity between two landmark sequences (ground truth and generated). The stop-detection module is evaluated using detection accuracy and ROC (receiver operating characteristic) curves. The paper proposes and evaluates several strategies to obtain the system configuration with the best performance, including different padding strategies, interpolation approaches, and data augmentation techniques. The best configuration of a fully automatic system obtains an average DTW distance per frame of 0.1057 and an area under the ROC curve (AUC) higher than 0.94.
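The evaluation metric is straightforward to sketch. Below is a plain dynamic-programming DTW between two landmark sequences, normalized per frame; the paper's exact distance and normalization may differ.

```python
# Sketch: DTW distance between ground-truth and generated landmark
# sequences, normalized by sequence length. Plain O(n*m) implementation.
import numpy as np

def dtw_per_frame(a, b):
    """a, b: (frames, landmarks*coords) arrays; returns DTW cost per frame."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / max(n, m)      # per-frame normalization (assumed form)

rng = np.random.default_rng(0)
truth = rng.normal(size=(60, 100))           # e.g., 50 landmarks x 2 coords
generated = truth + 0.1 * rng.normal(size=truth.shape)
print(dtw_per_frame(truth, generated))
```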


Assuntos
Algoritmos , Língua de Sinais , Humanos , Movimento (Física) , Movimento , Mãos
4.
Clin Linguist Phon; 36(4-5): 417-435, 2022 May 04.
Article in English | MEDLINE | ID: mdl-34460348

ABSTRACT

The current study investigated speech perception in children with ASD by directly comparing discrimination accuracy for phonemic contrasts in the native and non-native languages. The effect of speaker variability on phoneme perception was also examined, and we explored the relation between language impairment and phoneme-discrimination accuracy in children with ASD. Significant differences in performance were found between the ASD and TD groups on discrimination of the native phonemic contrasts. By contrast, no difference was found between the two groups on discrimination of the non-native phonemic contrasts. Further subgroup analysis revealed that the ALN group (ASD without language delay or impairment) showed significantly higher discrimination accuracy for the native syllable contrasts than for their non-native counterparts. No significant difference was found in discrimination accuracy between the native and non-native phonemic contrasts in the ALD group (ASD with language delay or impairment). The effect of speaker variability on phoneme discrimination was observed in the TD group but not in the ASD subgroups. Nonverbal reasoning ability was highly related to discrimination accuracy for both the native and non-native phonemic contrasts in children with ASD. The results of the present study suggest that speech perception in children with ASD is not as attuned to their native language as it is in their TD peers. Our findings also indicate that language delay or impairment is related to difficulty in perceiving native phonemes in children with ASD.


Subjects
Autism Spectrum Disorder, Language Development Disorders, Speech Perception, Child, Humans, Language
5.
J Psycholinguist Res; 49(3): 453-474, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32323122

ABSTRACT

Phonosemantics is a school of thought which holds that each sound or phoneme carries a specific psychological impression allotted by nature, and that these psychological impressions were used to evolve different languages. Work has been done in this area, but there is still scope for further research into the subject. The paper presents a new hypothesis explaining the psychological representations of all the important IPA symbols. It proposes a model of the psychological mind on which all the basic phonemes are placed, enabling us to understand the basic relationship between psychological semantic values and their phonetic values. To test the correctness of this allocation, the paper applies these semantic features to 245 words of different languages, along with some additional evidence. The paper addresses the confusion regarding the same name for different objects, different names for the same object, the question of arbitrariness, and other questions raised by modern linguists.


Assuntos
Idioma , Modelos Psicológicos , Fonética , Semântica , Humanos
6.
J Neurosci; 38(46): 9803-9813, 2018 Nov 14.
Article in English | MEDLINE | ID: mdl-30257858

ABSTRACT

Speech is a critical form of human communication and is central to our daily lives. Yet, despite decades of study, an understanding of the fundamental neural control of speech production remains incomplete. Current theories model speech production as a hierarchy from sentences and phrases down to words, syllables, speech sounds (phonemes), and the actions of vocal tract articulators used to produce speech sounds (articulatory gestures). Here, we investigate the cortical representation of articulatory gestures and phonemes in ventral precentral and inferior frontal gyri in men and women. Our results indicate that ventral precentral cortex represents gestures to a greater extent than phonemes, while inferior frontal cortex represents both gestures and phonemes. These findings suggest that speech production shares a common cortical representation with that of other types of movement, such as arm and hand movements. This has important implications both for our understanding of speech production and for the design of brain-machine interfaces to restore communication to people who cannot speak. SIGNIFICANCE STATEMENT: Despite being studied for decades, the production of speech by the brain is not fully understood. In particular, the most elemental parts of speech, speech sounds (phonemes) and the movements of vocal tract articulators used to produce these sounds (articulatory gestures), have both been hypothesized to be encoded in motor cortex. Using direct cortical recordings, we found evidence that primary motor and premotor cortices represent gestures to a greater extent than phonemes. Inferior frontal cortex (part of Broca's area) appears to represent both gestures and phonemes. These findings suggest that speech production shares a similar cortical organizational structure with the movement of other body parts.


Assuntos
Mapeamento Encefálico/métodos , Eletrocorticografia/métodos , Lobo Frontal/fisiologia , Gestos , Córtex Pré-Frontal/fisiologia , Fala/fisiologia , Adulto , Mapeamento Encefálico/instrumentação , Feminino , Humanos , Masculino , Movimento/fisiologia , Estimulação Luminosa/métodos
7.
Neuroimage; 196: 237-247, 2019 Aug 01.
Article in English | MEDLINE | ID: mdl-30991126

ABSTRACT

Humans comprehend speech despite various challenges, such as mispronunciations and noisy environments. Our auditory system is robust to these thanks to the integration of the sensory input with prior knowledge and expectations built on language-specific regularities. One such regularity concerns the permissible phoneme sequences, which determine the likelihood that a word belongs to a given language (phonotactic probability; "blick" is more likely to be an English word than "bnick"). Previous research demonstrated that violations of these rules modulate brain-evoked responses. However, several fundamental questions remain unresolved, especially regarding the neural encoding and integration strategy of phonotactics in naturalistic conditions, when there are no (or few) violations. Here, we used linear modelling to assess the influence of phonotactic probabilities on brain responses to narrative speech measured with non-invasive EEG. We found that the relationship between continuous speech and EEG responses is best described when the stimulus descriptor includes phonotactic probabilities. This indicates that low-frequency cortical signals (<9 Hz) reflect the integration of phonotactic information during natural speech perception, providing us with a measure of phonotactic processing at the individual-subject level. Furthermore, phonotactics-related signals showed the strongest speech-EEG interactions at latencies of 100-500 ms, supporting a pre-lexical role for phonotactic information.
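The modelling logic can be sketched as a lagged (temporal response function) regression that compares a model with and without a phonotactic regressor. This is a hypothetical illustration on synthetic data; the study used real narrative-speech EEG and dedicated TRF methods.

```python
# Sketch: lagged linear model of EEG from stimulus features, comparing
# predictive accuracy with vs. without a phonotactic-probability regressor.
import numpy as np
from sklearn.linear_model import Ridge

def lagged(X, lags):
    """Stack time-lagged copies of feature matrix X (time, features)."""
    return np.hstack([np.roll(X, lag, axis=0) for lag in lags])  # wraps; OK for a sketch

rng = np.random.default_rng(1)
T = 5000
envelope = rng.normal(size=(T, 1))
phonotactic = rng.normal(size=(T, 1))
lags = range(50)                          # ~0-500 ms at an assumed 100 Hz

# Synthetic EEG with a genuine phonotactic contribution built in.
eeg = (lagged(envelope, lags) @ rng.normal(size=50) * 0.1
       + lagged(phonotactic, lags) @ rng.normal(size=50) * 0.1
       + rng.normal(size=T))

for name, feats in [("envelope only", envelope),
                    ("envelope + phonotactics", np.hstack([envelope, phonotactic]))]:
    X = lagged(feats, lags)
    pred = Ridge(alpha=1.0).fit(X[:4000], eeg[:4000]).predict(X[4000:])
    print(name, np.corrcoef(pred, eeg[4000:])[0, 1])
```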


Subjects
Cerebral Cortex/physiology, Phonetics, Speech Perception/physiology, Acoustic Stimulation, Adult, Auditory Evoked Potentials, Female, Humans, Male, Young Adult
8.
J Psycholinguist Res; 48(6): 1391-1406, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31428902

ABSTRACT

This study investigated whether the phonological representation of a word is modulated by its orthographic representation in the case of a mismatch between the two representations. Such a mismatch is found in Persian, where short vowels are represented phonemically but not orthographically. Persian adult literates, Persian adult illiterates, and German adult literates were presented with two auditory tasks, an AX-discrimination task and a reversal task. We assumed that if orthographic representations influence phonological representations, Persian literates should perform worse than Persian illiterates or German literates on items with short vowels in these tasks. The results of the discrimination tasks showed that Persian literates and illiterates, as well as German literates, were approximately equally competent in discriminating short vowels in Persian words and pseudowords. Persian literates did not discriminate well between German words containing phonemes that differed only in vowel length. German literates performed relatively poorly in discriminating German homographic words that differed only in vowel length. Persian illiterates were unable to perform the reversal task in Persian. The results of the other two participant groups in the reversal task showed the predicted poorer performance of Persian literates on Persian items containing short vowels compared to items containing only long vowels. German literates did not show this effect in German. Our results suggest two distinct effects of orthography on phonemic representations: whereas the lack of orthographic representations seems to affect phonemic awareness, homography seems to affect the discriminability of phonemic representations.


Assuntos
Alfabetização , Psicolinguística , Percepção da Fala/fisiologia , Adulto , Feminino , Alemanha , Humanos , Irã (Geográfico) , Masculino , Pessoa de Meia-Idade , Fonética , Desempenho Psicomotor/fisiologia , Leitura , Redação , Adulto Jovem
9.
J Neurosci; 37(8): 2176-2185, 2017 Feb 22.
Article in English | MEDLINE | ID: mdl-28119400

ABSTRACT

Humans are unique in their ability to communicate using spoken language. However, it remains unclear how the speech signal is transformed and represented in the brain at different stages of the auditory pathway. In this study, we characterized electroencephalography responses to continuous speech by obtaining the time-locked responses to phoneme instances (phoneme-related potential). We showed that responses to different phoneme categories are organized by phonetic features. We found that each instance of a phoneme in continuous speech produces multiple distinguishable neural responses occurring as early as 50 ms and as late as 400 ms after the phoneme onset. Comparing the patterns of phoneme similarity in the neural responses and the acoustic signals confirms a repetitive appearance of acoustic distinctions of phonemes in the neural data. Analysis of the phonetic and speaker information in neural activations revealed that different time intervals jointly encode the acoustic similarity of both phonetic and speaker categories. These findings provide evidence for a dynamic neural transformation of low-level speech features as they propagate along the auditory pathway, and form an empirical framework to study the representational changes in learning, attention, and speech disorders. SIGNIFICANCE STATEMENT: We characterized the properties of evoked neural responses to phoneme instances in continuous speech. We show that each instance of a phoneme in continuous speech produces several observable neural responses at different times occurring as early as 50 ms and as late as 400 ms after the phoneme onset. Each temporal event explicitly encodes the acoustic similarity of phonemes, and linguistic and nonlinguistic information are best represented at different time intervals. Finally, we show a joint encoding of phonetic and speaker information, where the neural representation of speakers is dependent on phoneme category. These findings provide compelling new evidence for dynamic processing of speech sounds in the auditory pathway.
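The phoneme-related potential amounts to epoching continuous EEG around phoneme onsets and averaging within categories. A minimal sketch, assuming synthetic data and known onset times (in practice obtained from a forced alignment of the speech):

```python
# Sketch: compute a phoneme-related potential (PRP) by epoching EEG at
# phoneme onsets and averaging per category. All data here are synthetic.
import numpy as np

fs = 256                                   # sampling rate (Hz), assumed
eeg = np.random.randn(64, fs * 600)        # 64 channels, 10 min of "EEG"
onsets = {"p": np.random.randint(fs, fs * 599, 400),   # onset samples per category
          "a": np.random.randint(fs, fs * 599, 400)}

pre, post = int(0.1 * fs), int(0.4 * fs)   # -100 to +400 ms window
prp = {}
for phoneme, idx in onsets.items():
    epochs = np.stack([eeg[:, i - pre:i + post] for i in idx])
    prp[phoneme] = epochs.mean(axis=0)     # (channels, samples) average response

print({k: v.shape for k, v in prp.items()})
```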


Assuntos
Mapeamento Encefálico , Potenciais Evocados Auditivos/fisiologia , Fonética , Percepção da Fala/fisiologia , Fala/fisiologia , Estimulação Acústica , Acústica , Eletroencefalografia , Feminino , Humanos , Idioma , Masculino , Tempo de Reação , Estatística como Assunto , Fatores de Tempo
10.
Neuroimage; 180(Pt A): 301-311, 2018 Oct 15.
Article in English | MEDLINE | ID: mdl-28993231

ABSTRACT

For people who cannot communicate due to severe paralysis or involuntary movements, technology that decodes intended speech from the brain may offer an alternative means of communication. If decoding proves to be feasible, intracranial Brain-Computer Interface systems can be developed that translate decoded speech into computer-generated speech or into instructions for controlling assistive devices. Recent advances suggest that such decoding may be feasible from sensorimotor cortex, but it is not clear how this challenge can best be approached. One approach is to identify and discriminate elements of spoken language, such as phonemes. We investigated the feasibility of decoding four spoken phonemes from the sensorimotor face area, using electrocorticographic signals obtained with high-density electrode grids. Several decoding algorithms, including spatiotemporal matched filters, spatial matched filters, and support vector machines, were compared. Phonemes could be classified correctly at a level of over 75% with spatiotemporal matched filters. Support vector machine analysis reached a similar level, but spatial matched filters yielded significantly lower scores. The most informative electrodes were clustered along the central sulcus. The highest scores were achieved for time windows centered around voice onset time, but a 500 ms window before onset time could also be classified significantly above chance. The results suggest that phoneme production involves a sequence of robust and reproducible activity patterns on the cortical surface. Importantly, decoding requires the inclusion of temporal information to capture the rapid shifts of robust patterns associated with articulator muscle group contraction during production of a phoneme. The high classification scores are likely enabled by the use of high-density grids and by the use of discrete phonemes. Implications for use in Brain-Computer Interfaces are discussed.
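A spatiotemporal matched filter can be sketched as correlating a full channels-by-time trial against class-mean templates. The example below uses synthetic stand-ins for the ECoG trials; the study's filters and preprocessing were more elaborate.

```python
# Sketch: spatiotemporal matched-filter classification. Each class template
# is the mean training trial (channels x time); a test trial goes to the
# template it correlates with best. Synthetic data throughout.
import numpy as np

rng = np.random.default_rng(0)
n_ch, n_t, n_classes = 64, 200, 4
templates_true = rng.normal(size=(n_classes, n_ch, n_t))

def make_trials(n_per_class, noise=2.0):
    y = np.repeat(np.arange(n_classes), n_per_class)
    X = templates_true[y] + noise * rng.normal(size=(len(y), n_ch, n_t))
    return X, y

Xtr, ytr = make_trials(40)
Xte, yte = make_trials(20)

# One matched filter (class-mean spatiotemporal template) per phoneme.
filters = np.stack([Xtr[ytr == c].mean(axis=0) for c in range(n_classes)])

def classify(trial):
    scores = [np.corrcoef(trial.ravel(), f.ravel())[0, 1] for f in filters]
    return int(np.argmax(scores))

acc = np.mean([classify(t) == y for t, y in zip(Xte, yte)])
print(f"accuracy: {acc:.2f} (chance 0.25)")
```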


Assuntos
Mapeamento Encefálico/métodos , Córtex Sensório-Motor/fisiologia , Fala/fisiologia , Adolescente , Adulto , Algoritmos , Interfaces Cérebro-Computador , Eletrocorticografia/métodos , Feminino , Humanos , Idioma , Masculino , Fonética , Máquina de Vetores de Suporte , Adulto Jovem
11.
Lang Speech; 61(1): 71-83, 2018 Mar.
Article in English | MEDLINE | ID: mdl-28367672

ABSTRACT

Prosody is the pattern of inflection, pitch, and intensity that communicates emotional meaning above and beyond the individual meanings of lexical items and gestures during spoken language. Research has often addressed prosody as it extends across multiple speech chunks and carries properties specific to individual speakers and individual intents. However, prosody exerts effects on intended meaning even for relatively brief speech streams with minimal syntactic cues. The present work tests whether prosody can clarify the intended meaning of a two-word phrase even when the basic phonemic sequence of the words is distorted. Thirty-eight undergraduate participants attempted to correctly categorize auditorily presented two-word phrases as belonging to one of three categories: nonsensical phrases, sensible phrases, and spoonerisms. Mixed Poisson modeling of cumulative accuracy found a significant positive interaction of prosody with phrase type, indicating that conversational prosody made participants 8.27% more likely to accurately detect spoonerisms. Prosody makes spoken-language comprehension of two-word phrases more robust to distortions of phonemic sequence.
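To make the analysis structure concrete, here is a hypothetical statsmodels sketch of a Poisson regression of accuracy counts on prosody, phrase type, and their interaction. The study fit mixed Poisson models with participant-level random effects, which are omitted here for brevity, and the data frame below is synthetic.

```python
# Sketch: Poisson regression of cumulative accuracy counts with a
# prosody x phrase-type interaction. Synthetic data; no random effects.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "prosody": rng.integers(0, 2, 500),    # 0/1: conversational prosody present
    "phrase_type": rng.choice(["nonsense", "sensible", "spoonerism"], 500),
})
# Build in a positive prosody effect that is larger for spoonerisms.
rate = np.exp(1.0 + 0.1 * df.prosody
              + 0.3 * df.prosody * (df.phrase_type == "spoonerism"))
df["correct"] = rng.poisson(rate)          # accuracy count per cell

fit = smf.poisson("correct ~ prosody * phrase_type", data=df).fit(disp=False)
print(fit.summary().tables[1])
```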


Assuntos
Fonética , Acústica da Fala , Percepção da Fala , Qualidade da Voz , Estimulação Acústica , Adolescente , Adulto , Compreensão , Emoções , Feminino , Humanos , Masculino , Periodicidade , Percepção da Altura Sonora , Tempo de Reação , Inteligibilidade da Fala , Fatores de Tempo , Adulto Jovem
12.
Proc Natl Acad Sci U S A; 111(18): 6792-7, 2014 May 06.
Article in English | MEDLINE | ID: mdl-24753585

ABSTRACT

Humans and animals can reliably perceive behaviorally relevant sounds in noisy and reverberant environments, yet the neural mechanisms behind this phenomenon are largely unknown. To understand how neural circuits represent degraded auditory stimuli with additive and reverberant distortions, we compared single-neuron responses in ferret primary auditory cortex to speech and vocalizations in four conditions: clean, additive white and pink (1/f) noise, and reverberation. Despite substantial distortion, responses of neurons to the vocalization signal remained stable, maintaining the same statistical distribution in all conditions. Stimulus spectrograms reconstructed from population responses to the distorted stimuli resembled the original clean signals more than the distorted ones. To explore mechanisms contributing to this robustness, we simulated neural responses using several spectrotemporal receptive field models that incorporated either a static nonlinearity or subtractive synaptic depression and multiplicative gain normalization. The static model failed to suppress the distortions. A dynamic model incorporating feed-forward synaptic depression could account for the reduction of additive noise, but only the combined model with feedback gain normalization was able to predict the effects across both additive and reverberant conditions. Thus, both mechanisms can contribute to the abilities of humans and animals to extract relevant sounds in diverse noisy environments.
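The contrast between model classes can be caricatured in a few lines: a static nonlinearity applied to the STRF drive versus a dynamic variant with feed-forward synaptic depression and divisive gain normalization. The constants and functional forms below are illustrative assumptions, not the paper's fitted models.

```python
# Sketch: static nonlinearity vs. depression + gain-normalization dynamics
# applied to a stand-in for the linear STRF output ("drive").
import numpy as np

rng = np.random.default_rng(0)
T = 2000
drive = np.clip(rng.normal(0.5, 0.5, T), 0, None)   # nonnegative STRF drive

# (a) static nonlinearity: a simple threshold-linear output
static = np.maximum(drive - 0.2, 0.0)

# (b) dynamics: depression resource d, running gain g (illustrative constants)
d, g = 1.0, 1.0
dynamic = np.zeros(T)
for t in range(T):
    out = d * drive[t] / (0.1 + g)        # depressed, divisively normalized
    d += 0.01 * (1.0 - d) - 0.05 * d * drive[t]   # resources deplete with drive
    g += 0.005 * (out - g)                # feedback gain tracks recent output
    dynamic[t] = out

print(static.mean(), dynamic.mean())
```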


Assuntos
Córtex Auditivo/fisiologia , Percepção da Fala/fisiologia , Estimulação Acústica , Animais , Feminino , Furões/fisiologia , Humanos , Modelos Neurológicos , Neurônios/fisiologia , Ruído , Dinâmica não Linear , Vocalização Animal
13.
Clin EEG Neurosci; 55(6): 613-624, 2024 Nov.
Article in English | MEDLINE | ID: mdl-38755963

ABSTRACT

Abnormalities in auditory processing are believed to play a major role in autism and attention-deficit hyperactivity disorder (ADHD). The two conditions often co-occur in children, making it difficult to decide on the most promising intervention. Event-related potentials (ERPs) have been investigated and show promise as potential biomarkers for both conditions. This study investigated mismatch negativity (MMN) using a passive listening task and the P3b in an active auditory go/no-go discrimination task. Recordings were available from 103 children (24 females): 35 with ADHD, 27 autistic, 15 autistic children with co-occurring ADHD, and 26 neurotypical (NT) children. The age range considered was between 4 and 17 years but varied between groups. The results revealed increases in MMN and P3b amplitudes with age. Older children with ADHD exhibited smaller P3b amplitudes, while younger autistic children showed reduced MMN amplitudes in response to phoneme changes compared to their NT counterparts. Notably, children diagnosed with both autism and ADHD did not follow this pattern; instead, they were more similar to NT children. The reduced amplitudes of phonetically elicited MMN in children with autism and the reduced P3b in children with ADHD suggest that the two respective ERPs can act as potential biomarkers for each condition. However, optimisation and standardisation of the testing protocol, as well as longitudinal studies, are required to translate these findings into clinical practice.
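MMN is conventionally quantified as a deviant-minus-standard difference wave. A minimal sketch on synthetic one-channel epochs (real pipelines, e.g., MNE-Python, add filtering, artifact rejection, and baseline correction):

```python
# Sketch: quantify MMN as the mean amplitude of the deviant-minus-standard
# difference wave in a typical latency window. Synthetic epochs.
import numpy as np

fs = 250
t = np.arange(-0.1, 0.5, 1 / fs)                    # epoch time axis (s)
standards = np.random.randn(400, t.size)            # trials x samples, one channel
deviants = np.random.randn(80, t.size) - 0.5 * ((t > 0.1) & (t < 0.25))

diff_wave = deviants.mean(axis=0) - standards.mean(axis=0)
window = (t >= 0.1) & (t <= 0.25)                   # assumed MMN latency window
print(f"MMN mean amplitude: {diff_wave[window].mean():.2f} (a.u.)")
```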


Assuntos
Transtorno do Deficit de Atenção com Hiperatividade , Transtorno Autístico , Eletroencefalografia , Potenciais Evocados Auditivos , Humanos , Criança , Transtorno do Deficit de Atenção com Hiperatividade/fisiopatologia , Transtorno do Deficit de Atenção com Hiperatividade/diagnóstico , Feminino , Masculino , Potenciais Evocados Auditivos/fisiologia , Eletroencefalografia/métodos , Adolescente , Transtorno Autístico/fisiopatologia , Pré-Escolar , Percepção Auditiva/fisiologia , Atenção/fisiologia , Estimulação Acústica/métodos
14.
JMIRx Med; 5: e49969, 2024 Feb 09.
Article in English | MEDLINE | ID: mdl-38345294

ABSTRACT

Background: High-frequency hearing loss is one of the most common problems in the aging population and in those who have a history of exposure to loud noises. This type of hearing loss can be frustrating and disabling, making it difficult to understand speech communication and interact effectively with the world. Objective: This study aimed to examine the impact of spatially unique haptic vibrations representing high-frequency phonemes on the self-perceived ability to understand conversations in everyday situations. Methods: To address high-frequency hearing loss, a multi-motor wristband was developed that uses machine learning to listen for specific high-frequency phonemes. The wristband vibrates in spatially unique locations to represent which phoneme was present in real time. A total of 16 participants with high-frequency hearing loss were recruited and asked to wear the wristband for 6 weeks. The degree of disability associated with hearing loss was measured weekly using the Abbreviated Profile of Hearing Aid Benefit (APHAB). Results: By the end of the 6-week study, the average APHAB benefit score across all participants reached 12.39 points, from a baseline of 40.32 to a final score of 27.93 (SD 13.11; N=16; P=.002, 2-tailed dependent t test). Those without hearing aids showed a 10.78-point larger improvement in average APHAB benefit score at 6 weeks than those with hearing aids (t(14)=2.14; P=.10, 2-tailed independent t test). The average benefit score across all participants for ease of communication was 15.44 (SD 13.88; N=16; P<.001, 2-tailed dependent t test). The average benefit score across all participants for background noise was 10.88 (SD 17.54; N=16; P=.03, 2-tailed dependent t test). The average benefit score across all participants for reverberation was 10.84 (SD 16.95; N=16; P=.02, 2-tailed dependent t test). Conclusions: These findings show that vibrotactile sensory substitution delivered by a wristband that produces spatially distinguishable vibrations in correspondence with high-frequency phonemes helps individuals with high-frequency hearing loss improve their perceived understanding of verbal communication. Vibrotactile feedback provides benefits whether or not a person wears hearing aids, albeit in slightly different ways. Finally, individuals with the greatest perceived difficulty understanding speech experienced the greatest amount of perceived benefit from vibrotactile feedback.
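The headline statistic is a two-tailed dependent t-test on baseline versus week-6 APHAB scores. A sketch with synthetic stand-in data roughly matching the reported means (40.32 to 27.93, N=16), not the study's data:

```python
# Sketch: dependent t-test on baseline vs. week-6 APHAB scores (synthetic).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
baseline = rng.normal(40.32, 10.0, 16)
week6 = baseline - rng.normal(12.39, 13.11, 16)     # benefit = baseline - final

t, p = stats.ttest_rel(baseline, week6)             # 2-tailed dependent t test
print(f"mean benefit = {np.mean(baseline - week6):.2f}, t = {t:.2f}, p = {p:.3f}")
```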

15.
J Exp Child Psychol; 116(3): 728-37, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23827642

ABSTRACT

For both adults and children, acoustic context plays an important role in speech perception. For adults, both speech and nonspeech acoustic contexts influence perception of subsequent speech items, consistent with the argument that effects of context are due to domain-general auditory processes. However, prior research examining the effects of context on children's speech perception has focused on speech contexts; nonspeech contexts have not been explored previously. To better understand the developmental progression of children's use of contexts in speech perception and the mechanisms underlying that development, we created a novel experimental paradigm testing 5-year-old children's speech perception in several acoustic contexts. The results demonstrated that nonspeech context influences children's speech perception, consistent with claims that context effects arise from general auditory system properties rather than speech-specific mechanisms. This supports theoretical accounts of language development suggesting that domain-general processes play a role across the lifespan.


Assuntos
Acústica da Fala , Percepção da Fala , Estimulação Acústica , Adulto , Criança , Pré-Escolar , Feminino , Humanos , Masculino , Fonética , Inteligibilidade da Fala
16.
J Imaging; 9(12), 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38132680

ABSTRACT

Several sign language datasets are available in the literature, most of them designed for sign language recognition and translation. This paper presents a new sign language dataset for automatic motion generation. The dataset includes phonemes for each sign (specified in HamNoSys, a transcription system developed at the University of Hamburg, Germany) and the corresponding motion information. The motion information includes sign videos and the sequence of extracted landmarks associated with relevant points of the skeleton (including face, arms, hands, and fingers). The dataset includes signs from three different subjects in three different positions, performing 754 signs that include the entire alphabet, numbers from 0 to 100, numbers for hour specification, months and weekdays, and the most frequent signs used in Spanish Sign Language (LSE). In total, there are 6786 videos and their corresponding phonemes (HamNoSys annotations). From each video, a sequence of landmarks was extracted using MediaPipe. The dataset allows training an automatic system for motion generation from sign language phonemes. This paper also presents preliminary results in motion generation from sign phonemes, obtaining a dynamic time warping (DTW) distance per frame of 0.37.
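Landmark extraction of the kind described can be sketched with MediaPipe Holistic; the video path and settings below are placeholders, and the dataset's exact extraction configuration is not specified here.

```python
# Sketch: per-frame landmark extraction from a sign video with MediaPipe
# Holistic. "sign_video.mp4" is a placeholder path.
import cv2
import mediapipe as mp

holistic = mp.solutions.holistic.Holistic(static_image_mode=False)
cap = cv2.VideoCapture("sign_video.mp4")

frames_landmarks = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks:              # face/hand landmarks are analogous
        frames_landmarks.append(
            [(lm.x, lm.y, lm.z) for lm in results.pose_landmarks.landmark])

cap.release()
holistic.close()
print(f"extracted landmarks for {len(frames_landmarks)} frames")
```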

17.
Exp Psychol; 70(6): 336-343, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38288915

ABSTRACT

In this study, we re-examined the facilitation that occurs when auditorily presented monosyllabic primes and targets share their final phonemes, in particular the rime (e.g., /vɔʀd/-/kɔʀd/). More specifically, we asked whether this rime facilitation effect is also observed when the last two consonants of the rime are transposed (e.g., /vɔdʀ/-/kɔʀd/). In comparison to a control condition in which the primes and targets were unrelated (e.g., /pylt/-/kɔʀd/), we found significant priming effects in both the rime (/vɔʀd/-/kɔʀd/) and the transposed-phoneme "rime" (/vɔdʀ/-/kɔʀd/) conditions. We also observed a significantly greater priming effect in the former condition than in the latter. We use the theoretical framework of the TISK model (Hannagan et al., 2013) to propose a novel account of final-overlap phonological priming in terms of the activation of both position-independent phoneme representations and bi-phone representations.


Assuntos
Fonética , Percepção da Fala , Humanos , Percepção da Fala/fisiologia
18.
Front Psychol; 14: 1178427, 2023.
Article in English | MEDLINE | ID: mdl-37251015

ABSTRACT

Introduction: Spelling is an essential foundation for reading and writing; however, many children leave school with spelling difficulties. By understanding the processes children use when they spell, we can intervene with appropriate instruction tailored to their needs. Methods: Our study aimed to identify key processes (lexical-semantic and phonological) by using a spelling assessment that distinguishes different printed letter strings/word types (regular and irregular words, and pseudowords). Misspellings in the test from 641 pupils in Reception Year to Year 6 were scored using alternatives to binary correct-versus-incorrect scoring systems. The measures looked at phonological plausibility, phoneme representations, and letter distance. These have been used successfully in the past, but not with a spelling test that distinguishes irregularly spelled words from regular words and pseudowords. Results: The findings suggest that children in primary school rely on both lexical-semantic and phonological processes to spell all types of letter string, but this varies with the level of spelling experience (younger Foundation/Key Stage 1 versus older Key Stage 2). Children in younger year groups seemed to rely more on phonics, based on the strongest correlation coefficients for all word types; with further spelling experience, lexical processes became more evident, depending on the type of word examined. Discussion: The findings have implications for the way we teach and assess spelling and could prove valuable for educators.
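The "letter distance" measure can be illustrated as an edit (Levenshtein) distance between the attempt and the target, which credits near-miss spellings that binary scoring treats as simply wrong; the test's actual rubrics are more detailed than this.

```python
# Sketch: Levenshtein letter distance between a spelling attempt and target.
def letter_distance(attempt: str, target: str) -> int:
    m, n = len(attempt), len(target)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                        # deletions
    for j in range(n + 1):
        d[0][j] = j                        # insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if attempt[i - 1] == target[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[m][n]

# Both attempts are "wrong" under binary scoring; letter distance separates
# the phonologically plausible near-miss from the implausible one.
print(letter_distance("yot", "yacht"), letter_distance("xqz", "yacht"))
```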

19.
Neurobiol Lang (Camb); 3(1): 18-45, 2022.
Article in English | MEDLINE | ID: mdl-37215328

ABSTRACT

In models of silent reading, visual orthographic information is transduced into an auditory phonological code in a process of grapheme-to-phoneme conversion (GPC). This process is often identified with lateral temporal-parietal regions associated with auditory phoneme encoding. However, the role of articulatory phonemic representations and the precentral gyrus in GPC is ambiguous. Though the precentral gyrus is implicated in many functional MRI studies of reading, it is not clear whether the time course of activity in this region is consistent with involvement in GPC. We recorded cortical electrophysiology during a bimodal match/mismatch task from eight patients with perisylvian subdural electrodes to examine the time course of neural activity during a task that necessitated GPC. Patients made a match/mismatch decision between a 3-letter string and the following auditory bi-phoneme. We characterized the distribution and timing of evoked broadband high gamma (70-170 Hz) activity as well as phase-locking between electrodes. The precentral gyrus emerged with a high concentration of broadband high gamma responses to visual and auditory language, as well as mismatch effects. The pars opercularis, supramarginal gyrus, and superior temporal gyrus were also involved. The precentral gyrus showed strong phase-locking with the caudal fusiform gyrus during letter-string presentation and with surrounding perisylvian cortex during the bimodal visual-auditory comparison period. These findings hint at a role for precentral cortex in transducing visual codes into auditory codes during silent reading.
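Broadband high gamma extraction is commonly implemented as a band-pass filter followed by a Hilbert envelope. A minimal sketch on a synthetic channel (ECoG pipelines typically average several sub-bands and z-score to a baseline period):

```python
# Sketch: broadband high gamma (70-170 Hz) envelope via band-pass + Hilbert.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000                                       # sampling rate (Hz), assumed
x = np.random.randn(fs * 10)                    # 10 s synthetic channel

b, a = butter(4, [70, 170], btype="bandpass", fs=fs)
high_gamma = filtfilt(b, a, x)
envelope = np.abs(hilbert(high_gamma))          # high gamma amplitude proxy
print(envelope.mean(), envelope.std())
```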

20.
Philos Trans R Soc Lond B Biol Sci; 376(1824): 20200195, 2021 May 10.
Article in English | MEDLINE | ID: mdl-33745314

ABSTRACT

Evidence is reviewed for widespread phonological and phonetic tendencies in contemporary languages, based largely on the frequency of sound types in word lists and in phoneme inventories across the world's languages. The data reviewed point to likely tendencies in the languages of the Upper Palaeolithic, including reliance on specific nasal and voiceless stop consonants, a relative dispreference for posterior voiced consonants, and the use of peripheral vowels. More tenuous hypotheses related to prehistoric languages are also reviewed, including the propositions that such languages lacked labiodental consonants and relied more heavily on vowels when contrasted with many contemporary languages. Such hypotheses suggest speech has adapted to subtle pressures that may in some cases vary across populations. This article is part of the theme issue 'Reconstructing prehistoric languages'.


Subjects
Cultural Evolution, Language, Sound, Speech Perception, Speech, Humans