Results 1 - 11 of 11
1.
PLoS Biol ; 22(5): e3002631, 2024 May.
Article in English | MEDLINE | ID: mdl-38805517

ABSTRACT

Music and speech are complex and distinct auditory signals that are both foundational to the human experience. The mechanisms underpinning each domain are widely investigated. However, what perceptual mechanism transforms a sound into music or speech and what basic acoustic information is required to distinguish between them remain open questions. Here, we hypothesized that a sound's amplitude modulation (AM), an essential temporal acoustic feature driving the auditory system across processing levels, is critical for distinguishing music and speech. Specifically, in contrast to paradigms using naturalistic acoustic signals (which can be challenging to interpret), we used a noise-probing approach to untangle the auditory mechanism: if AM rate and regularity are critical for perceptually distinguishing music and speech, judgments of artificially noise-synthesized ambiguous audio signals should align with their AM parameters. Across 4 experiments (N = 335), signals with a higher peak AM frequency tended to be judged as speech and those with a lower peak AM frequency as music. Interestingly, this principle is consistently used by all listeners for speech judgments, but only by musically sophisticated listeners for music. In addition, signals with more regular AM are judged as music over speech, and this feature is more critical for music judgment, regardless of musical sophistication. The data suggest that the auditory system can rely on a low-level acoustic property as basic as AM to distinguish music from speech, a simple principle that provokes both neurophysiological and evolutionary experiments and speculations.
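A minimal sketch of the kind of stimulus this abstract describes: amplitude-modulated noise whose envelope has a chosen peak AM frequency and a chosen regularity (approximated here by the bandwidth of the modulation spectrum). This is not the authors' synthesis code; the function name, parameter values, and the bandwidth-as-regularity proxy are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def am_noise(peak_am_hz, am_bandwidth_hz, dur_s=2.0, fs=16000, seed=0):
    """Noise carrier modulated by a band-limited envelope centered at peak_am_hz."""
    rng = np.random.default_rng(seed)
    n = int(dur_s * fs)
    carrier = rng.standard_normal(n)                      # broadband noise carrier
    low = max(peak_am_hz - am_bandwidth_hz / 2, 0.1)
    high = peak_am_hz + am_bandwidth_hz / 2
    sos = butter(2, [low, high], btype="band", fs=fs, output="sos")
    mod = sosfiltfilt(sos, rng.standard_normal(n))        # band-limited modulator
    mod = np.clip(mod / (3 * mod.std()), -1, 1)           # keep within [-1, 1]
    env = 0.5 * (1 + mod)                                  # non-negative AM envelope
    return carrier * env

# Faster, broader AM leans "speech-like"; slower, narrower (more regular) AM leans "music-like".
speech_like = am_noise(peak_am_hz=5.0, am_bandwidth_hz=4.0)
music_like = am_noise(peak_am_hz=2.0, am_bandwidth_hz=0.5)
```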


Subject(s)
Acoustic Stimulation, Auditory Perception, Music, Speech Perception, Humans, Male, Female, Adult, Auditory Perception/physiology, Acoustic Stimulation/methods, Speech Perception/physiology, Young Adult, Speech/physiology, Adolescent
2.
J Neurosci ; 44(30)2024 Jul 24.
Article in English | MEDLINE | ID: mdl-38926087

ABSTRACT

Music, like spoken language, is often characterized by hierarchically organized structure. Previous experiments have shown neural tracking of notes and beats, but little work touches on the more abstract question: how does the brain establish high-level musical structures in real time? We presented Bach chorales to participants (20 females and 9 males) undergoing electroencephalogram (EEG) recording to investigate how the brain tracks musical phrases. We removed the main temporal cues to phrasal structures, so that listeners could only rely on harmonic information to parse a continuous musical stream. Phrasal structures were disrupted by locally or globally reversing the harmonic progression, so that our observations on the original music could be controlled and compared. We first replicated the findings on neural tracking of musical notes and beats, substantiating the positive correlation between musical training and neural tracking. Critically, we discovered a neural signature in the frequency range ∼0.1 Hz (modulations of EEG power) that reliably tracks musical phrasal structure. Next, we developed an approach to quantify the phrasal phase precession of the EEG power, revealing that phrase tracking is indeed an operation of active segmentation involving predictive processes. We demonstrate that the brain establishes complex musical structures online over long timescales (>5 s) and actively segments continuous music streams in a manner comparable to language processing. These two neural signatures, phrase tracking and phrasal phase precession, provide new conceptual and technical tools to study the processes underpinning high-level structure building using noninvasive recording techniques.
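As a rough illustration of the ~0.1 Hz power signature mentioned above, the sketch below extracts slow fluctuations of band-limited EEG power and their instantaneous phase, the quantity one would relate to phrase boundaries or test for phase precession. Band edges, filter orders, and the single-channel input are assumptions, not the authors' pipeline.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def slow_power_phase(eeg, fs, power_band=(1.0, 8.0), track_band=(0.05, 0.2)):
    """eeg: 1-D channel trace. Returns (~0.1 Hz power fluctuation, its phase)."""
    sos = butter(3, power_band, btype="band", fs=fs, output="sos")
    power = np.abs(hilbert(sosfiltfilt(sos, eeg))) ** 2      # instantaneous band power
    sos_slow = butter(2, track_band, btype="band", fs=fs, output="sos")
    slow = sosfiltfilt(sos_slow, power - power.mean())       # ~0.1 Hz power modulation
    phase = np.angle(hilbert(slow))                          # phase used for precession tests
    return slow, phase
```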


Subject(s)
Auditory Perception, Electroencephalography, Music, Humans, Female, Male, Electroencephalography/methods, Adult, Auditory Perception/physiology, Young Adult, Acoustic Stimulation/methods, Brain/physiology
3.
Cereb Cortex ; 30(4): 2600-2614, 2020 04 14.
Article in English | MEDLINE | ID: mdl-31761952

ABSTRACT

Natural sounds contain acoustic dynamics ranging from tens to hundreds of milliseconds. How does the human auditory system encode acoustic information over wide-ranging timescales to achieve sound recognition? Previous work (Teng et al. 2017) demonstrated a temporal coding preference for the theta and gamma ranges, but it remains unclear how acoustic dynamics between these two ranges are coded. Here, we generated artificial sounds with temporal structures over timescales from ~200 to ~30 ms and investigated temporal coding on different timescales. Participants discriminated sounds with temporal structures at different timescales while undergoing magnetoencephalography recording. Although considerable intertrial phase coherence can be induced by acoustic dynamics of all the timescales, classification analyses reveal that the acoustic information of all timescales is preferentially differentiated through the theta and gamma bands, but not through the alpha and beta bands; stimulus reconstruction shows that the acoustic dynamics in the theta and gamma ranges are preferentially coded. We demonstrate that the theta and gamma bands show the generality of temporal coding with comparable capacity. Our findings provide a novel perspective: acoustic information of all timescales is discretised into two discrete temporal chunks for further perceptual analysis.
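Intertrial phase coherence (ITPC), used in this abstract, is conventionally the length of the mean resultant vector of single-trial phases at each sample; a minimal version is sketched below. The array shapes and the band-filtering step are assumptions rather than the authors' code.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def itpc(trials, fs, band):
    """trials: (n_trials, n_samples); band: (lo, hi) Hz. Returns ITPC per sample."""
    sos = butter(3, band, btype="band", fs=fs, output="sos")
    phases = np.angle(hilbert(sosfiltfilt(sos, trials, axis=-1), axis=-1))
    return np.abs(np.mean(np.exp(1j * phases), axis=0))   # 1 = perfect locking, ~0 = random
```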


Asunto(s)
Estimulación Acústica/métodos , Corteza Auditiva/fisiología , Percepción Auditiva/fisiología , Ritmo Gamma/fisiología , Magnetoencefalografía/métodos , Ritmo Teta/fisiología , Adulto , Femenino , Humanos , Masculino , Sonido , Factores de Tiempo , Adulto Joven
4.
PLoS Biol ; 15(11): e2000812, 2017 Nov.
Article in English | MEDLINE | ID: mdl-29095816

ABSTRACT

Natural sounds convey perceptually relevant information over multiple timescales, and the necessary extraction of multi-timescale information requires the auditory system to work over distinct ranges. The simplest hypothesis suggests that temporal modulations are encoded in an equivalent manner within a reasonable intermediate range. We show that the human auditory system selectively and preferentially tracks acoustic dynamics concurrently at 2 timescales corresponding to the neurophysiological theta band (4-7 Hz) and gamma band ranges (31-45 Hz) but, contrary to expectation, not at the timescale corresponding to alpha (8-12 Hz), which has also been found to be related to auditory perception. Listeners heard synthetic acoustic stimuli with temporally modulated structures at 3 timescales (~190-, ~100-, and ~30-ms modulation periods) and identified the stimuli while undergoing magnetoencephalography recording. There was strong intertrial phase coherence in the theta band for stimuli of all modulation rates and in the gamma band for stimuli with corresponding modulation rates. The alpha band did not respond in a similar manner. Classification analyses also revealed that oscillatory phase reliably tracked temporal dynamics but not equivalently across rates. Finally, mutual information analyses quantifying the relation between phase and cochlear-scaled correlations also showed preferential processing in 2 distinct regimes, with the alpha range again yielding different patterns. The results support the hypothesis that the human auditory system employs (at least) a 2-timescale processing mode, in which lower and higher perceptual sampling scales are segregated by an intermediate temporal regime in the alpha band that likely reflects different underlying computations.
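The mutual information analysis mentioned here relates oscillatory phase to a stimulus measure. As a generic illustration only, the snippet below is a bare-bones histogram estimator of mutual information between two continuous variables; the bin count is arbitrary, bias correction is omitted, and the specific cochlear-scaled measure is not reproduced.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in MI estimate (in bits) between two 1-D variables via a joint histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))
```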


Asunto(s)
Estimulación Acústica , Percepción Auditiva/fisiología , Fenómenos Fisiológicos del Sistema Nervioso , Conducta , Biomarcadores/metabolismo , Electroencefalografía , Potenciales Evocados Auditivos/fisiología , Femenino , Ritmo Gamma/fisiología , Humanos , Masculino , Ritmo Teta/fisiología , Factores de Tiempo , Adulto Joven
5.
Neuroimage ; 202: 116152, 2019 11 15.
Article in English | MEDLINE | ID: mdl-31484039

ABSTRACT

Segmenting the continuous speech stream into units for further perceptual and linguistic analyses is fundamental to speech recognition. The speech amplitude envelope (SE) has long been considered a fundamental temporal cue for segmenting speech. Does the temporal fine structure (TFS), a significant part of speech signals often considered to contain primarily spectral information, contribute to speech segmentation? Using magnetoencephalography, we show that the TFS entrains cortical responses between 3 and 6 Hz and demonstrate, using mutual information analysis, that (i) the temporal information in the TFS can be reconstructed from a measure of frame-to-frame spectral change and correlates with the SE and (ii) spectral resolution is key to the extraction of such temporal information. Furthermore, we show behavioural evidence that, when the SE is temporally distorted, the TFS provides cues for speech segmentation and aids speech recognition significantly. Our findings show that it is insufficient to investigate solely the SE to understand temporal speech segmentation, as the SE and the TFS derived from a band-filtering method convey comparable, if not inseparable, temporal information. We argue for a more synthetic view of speech segmentation: the auditory system groups speech signals coherently in both temporal and spectral domains.
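As a sketch of the general idea rather than the paper's exact measure, the snippet below derives a temporal signal from frame-to-frame spectral change of a sound and correlates it with the broadband amplitude envelope; the STFT parameters are assumptions.

```python
import numpy as np
from scipy.signal import stft, hilbert

def spectral_change_vs_envelope(x, fs, nperseg=512):
    """Return (spectral-change trace, envelope at frame times, their correlation)."""
    f, t, Z = stft(x, fs=fs, nperseg=nperseg)
    flux = np.sqrt(np.sum(np.diff(np.abs(Z), axis=1) ** 2, axis=0))  # frame-to-frame change
    env = np.abs(hilbert(x))                                          # broadband envelope (SE)
    env_frames = np.interp(t[1:], np.arange(len(x)) / fs, env)        # align to frame times
    r = np.corrcoef(flux, env_frames)[0, 1]
    return flux, env_frames, r
```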


Asunto(s)
Señales (Psicología) , Acústica del Lenguaje , Inteligibilidad del Habla/fisiología , Percepción del Habla/fisiología , Adulto , Femenino , Humanos , Teoría de la Información , Magnetoencefalografía , Masculino , Reconocimiento en Psicología , Procesamiento de Señales Asistido por Computador , Factores de Tiempo , Adulto Joven
6.
Eur J Neurosci ; 48(8): 2770-2782, 2018 10.
Article in English | MEDLINE | ID: mdl-29044763

ABSTRACT

Parsing continuous acoustic streams into perceptual units is fundamental to auditory perception. Previous studies have uncovered a cortical entrainment mechanism in the delta and theta bands (~1-8 Hz) that correlates with formation of perceptual units in speech, music, and other quasi-rhythmic stimuli. Whether cortical oscillations in the delta-theta bands are passively entrained by regular acoustic patterns or play an active role in parsing the acoustic stream is debated. Here, we investigate cortical oscillations using novel stimuli with 1/f modulation spectra. These 1/f signals have no rhythmic structure but contain information over many timescales because of their broadband modulation characteristics. We chose 1/f modulation spectra with varying exponents of f, which simulate the dynamics of environmental noise, speech, vocalizations, and music. While undergoing magnetoencephalography (MEG) recording, participants listened to 1/f stimuli and detected embedded target tones. Tone detection performance varied across stimuli of different exponents and could be explained by the local signal-to-noise ratio computed using a temporal window of around 200 ms. Furthermore, theta band oscillations, surprisingly, were observed for all stimuli, but robust phase coherence was preferentially displayed by stimuli with exponents 1 and 1.5. We constructed an auditory processing model to quantify acoustic information on various timescales and correlated the model outputs with the neural results. We show that cortical oscillations reflect a chunking of the acoustic stream into segments longer than 200 ms. These results suggest an active auditory segmentation mechanism, complementary to entrainment, operating on a timescale of ~200 ms to organize acoustic information.
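A minimal sketch of how 1/f-modulated noise of the kind described here can be synthesized: shape the envelope's spectrum as f^(-exponent) in the frequency domain, then apply it to a noise carrier. Duration, sampling rate, and the modulation-band cutoff are illustrative assumptions, not the authors' stimulus code.

```python
import numpy as np

def one_over_f_am_noise(exponent, dur_s=4.0, fs=16000, f_max=64.0, seed=0):
    """Noise whose amplitude envelope has a 1/f**exponent modulation power spectrum."""
    rng = np.random.default_rng(seed)
    n = int(dur_s * fs)
    freqs = np.fft.rfftfreq(n, 1 / fs)
    spec = np.zeros(len(freqs), dtype=complex)
    band = (freqs > 0) & (freqs <= f_max)                 # modulation band
    amp = freqs[band] ** (-exponent / 2)                  # power falls off as f**(-exponent)
    spec[band] = amp * np.exp(1j * rng.uniform(0, 2 * np.pi, band.sum()))
    env = np.fft.irfft(spec, n)
    env = (env - env.min()) / (env.max() - env.min())     # non-negative envelope
    return rng.standard_normal(n) * env

stimulus = one_over_f_am_noise(exponent=1.5)   # exponents near 1-1.5 resemble speech/music dynamics
```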


Asunto(s)
Estimulación Acústica/métodos , Corteza Auditiva/fisiología , Percepción Auditiva/fisiología , Música , Percepción del Habla/fisiología , Ritmo Teta/fisiología , Adulto , Femenino , Humanos , Masculino , Persona de Mediana Edad , Factores de Tiempo , Adulto Joven
7.
PLoS One ; 19(8): e0309432, 2024.
Article in English | MEDLINE | ID: mdl-39213300

ABSTRACT

Building on research demonstrating the benefits of music training for emotional prosody recognition in nontonal languages, this study delves into its unexplored influence on tonal languages. In tonal languages, the acoustic similarity between lexical tones and music, along with the dual role of pitch in conveying lexical and affective meanings, creates a unique interplay. We evaluated 72 participants, half of whom had extensive instrumental music training, with the other half serving as demographically matched controls. All participants completed an online test consisting of 210 Chinese pseudosentences, each designed to express one of five emotions: happiness, sadness, fear, anger, or neutrality. Our robust statistical analyses, which included effect size estimates and Bayes factors, revealed that the music and nonmusic groups exhibit similar abilities in identifying the emotional prosody of various emotions. However, the music group attributed higher intensity ratings to emotional prosodies of happiness, fear, and anger compared to the nonmusic group. These findings suggest that while instrumental music training is not related to emotional prosody recognition, it does appear to be related to perceived emotional intensity. This dissociation between emotion recognition and intensity evaluation adds a new piece to the puzzle of the complex relationship between music training and emotion perception in tonal languages.


Asunto(s)
Emociones , Lenguaje , Música , Humanos , Música/psicología , Femenino , Masculino , Emociones/fisiología , Adulto , Adulto Joven , Reconocimiento en Psicología/fisiología , Percepción del Habla/fisiología
8.
eNeuro ; 8(1)2021.
Article in English | MEDLINE | ID: mdl-33272971

ABSTRACT

Speech signals have a long-term modulation spectrum whose shape is distinct from that of environmental noise, music, and non-speech vocalizations. Does the human auditory system adapt to the speech long-term modulation spectrum and efficiently extract critical information from speech signals? To answer this question, we tested whether neural responses to speech signals can be captured by specific modulation spectra of non-speech acoustic stimuli. We generated amplitude-modulated (AM) noise with the speech modulation spectrum and 1/f modulation spectra of different exponents to imitate the temporal dynamics of different natural sounds. We presented these AM stimuli and a 10-min piece of natural speech to 19 human participants undergoing electroencephalography (EEG) recording. We derived temporal response functions (TRFs) to the AM stimuli of different spectrum shapes and found distinct neural dynamics for each type of TRF. We then used the TRFs of the AM stimuli to predict neural responses to the speech signals and found that (1) the TRFs of AM modulation spectra with exponents 1, 1.5, and 2 preferentially captured EEG responses to speech signals in the δ band and (2) the θ band of the speech neural responses was captured by the AM stimuli with an exponent of 0.75. Our results suggest that the human auditory system shows specificity to the long-term modulation spectrum and is equipped with characteristic neural algorithms tailored to extract critical acoustic information from speech signals.
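Temporal response functions of the kind referred to here are commonly estimated as a regularized forward model mapping a lagged stimulus envelope to the EEG; a compact ridge-regression version is sketched below. The lag range, regularization strength, and wrap-around handling at the edges are assumptions, not the authors' implementation.

```python
import numpy as np

def lagged_design(stimulus, fs, tmin, tmax):
    """Columns hold the stimulus shifted so column j at time t equals stimulus[t - lag_j]."""
    lags = np.arange(int(round(tmin * fs)), int(round(tmax * fs)) + 1)
    X = np.column_stack([np.roll(stimulus, lag) for lag in lags])   # edges wrap (ignored here)
    return lags / fs, X

def estimate_trf(stimulus, eeg, fs, tmin=-0.1, tmax=0.4, ridge=1.0):
    """Ridge-regression TRF: weights w such that X @ w approximates the EEG."""
    lag_times, X = lagged_design(stimulus, fs, tmin, tmax)
    w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ eeg)
    return lag_times, w

def predict_eeg(stimulus, fs, w, tmin=-0.1, tmax=0.4):
    """Apply a previously estimated TRF to a new stimulus envelope."""
    _, X = lagged_design(stimulus, fs, tmin, tmax)
    return X @ w
```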


Asunto(s)
Corteza Auditiva , Percepción del Habla , Estimulación Acústica , Percepción Auditiva , Electroencefalografía , Humanos , Habla
9.
Curr Biol ; 30(7): 1299-1305.e7, 2020 04 06.
Article in English | MEDLINE | ID: mdl-32142700

ABSTRACT

Ancient Chinese poetry consists of structured language that deviates from ordinary language usage [1, 2]; its poetic genres impose unique combinatory constraints on linguistic elements [3]. How does the constrained poetic structure facilitate speech segmentation when common linguistic [4-8] and statistical cues [5, 9] are unreliable to listeners in poems? We generated artificial Jueju, which arguably has the most constrained structure in ancient Chinese poetry, and presented each poem twice as an isochronous sequence of syllables to native Mandarin speakers while conducting magnetoencephalography (MEG) recording. We found that listeners deployed their prior knowledge of Jueju to build the line structure and to establish the conceptual flow of Jueju. For the first time, we found a phase precession phenomenon indicating predictive processes of speech segmentation: the neural phase advanced faster after listeners acquired knowledge of incoming speech. The statistical co-occurrence of monosyllabic words in Jueju negatively correlated with speech segmentation, which provides an alternative perspective on how statistical cues facilitate speech segmentation. Our findings suggest that constrained poetic structures serve as a temporal map for listeners to group speech contents and to predict incoming speech signals. Listeners can parse speech streams by using not only grammatical and statistical cues but also their prior knowledge of the form of language. VIDEO ABSTRACT.
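In isochronous paradigms like this one, neural tracking of structure is often quantified as response power at the syllable, line, and couplet rates of the stimulus (frequency tagging); the sketch below computes those spectral peaks. The rates, the single-channel input, and the frequency-tagging framing are assumptions about the generic analysis, not the authors' exact method.

```python
import numpy as np

def tagged_power(meg, fs, syllable_rate=4.0, syllables_per_line=5, lines_per_couplet=2):
    """Return response power at the syllable, line, and couplet rates of an isochronous poem."""
    rates = {
        "syllable": syllable_rate,
        "line": syllable_rate / syllables_per_line,
        "couplet": syllable_rate / (syllables_per_line * lines_per_couplet),
    }
    spectrum = np.abs(np.fft.rfft(meg - meg.mean())) ** 2
    freqs = np.fft.rfftfreq(len(meg), 1 / fs)
    return {name: spectrum[np.argmin(np.abs(freqs - r))] for name, r in rates.items()}
```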


Asunto(s)
Señales (Psicología) , Poesía como Asunto , Percepción del Habla , Habla , Adulto , China , Femenino , Humanos , Magnetoencefalografía , Masculino , Adulto Joven
10.
Sci Rep ; 6: 34390, 2016 10 07.
Article in English | MEDLINE | ID: mdl-27713546

ABSTRACT

Natural sounds contain information on multiple timescales, so the auditory system must analyze and integrate acoustic information on those different scales to extract behaviorally relevant information. However, this multi-scale process in the auditory system is not widely investigated in the literature, and existing models of temporal integration are mainly built upon detection or recognition tasks on a single timescale. Here we use a paradigm requiring processing on relatively 'local' and 'global' scales and provide evidence suggesting that the auditory system extracts fine-detail acoustic information using short temporal windows and uses long temporal windows to abstract global acoustic patterns. Behavioral task performance that requires processing fine-detail information does not improve with longer stimulus length, contrary to predictions of previous temporal integration models such as the multiple-looks and the spectro-temporal excitation pattern models. Moreover, the perceptual construction of putatively 'unitary' auditory events requires more than hundreds of milliseconds. These findings support the hypothesis of dual-scale processing, likely implemented in the auditory cortex.


Asunto(s)
Percepción Auditiva , Modelos Psicológicos , Estimulación Acústica , Adulto , Conducta de Elección , Femenino , Humanos , Masculino , Modelos Neurológicos , Factores de Tiempo , Adulto Joven
11.
Hear Res ; 283(1-2): 136-43, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22101022

ABSTRACT

Presenting the early part of a nonsense sentence in quiet improves recognition of the last keyword of the sentence in a masker, especially a speech masker. This priming effect depends on higher-order processing of the prime information during target-masker segregation. This study investigated whether introducing irrelevant content information into the prime reduces the priming effect. The results showed that presenting the first four syllables (not including the second and third keywords) of the three-keyword target sentence in quiet significantly improved recognition of the second and third keywords in a two-talker-speech masker but not a noise masker, relative to the no-priming condition. Increasing the prime content from four to eight syllables (including the first and second keywords of the target sentence) further improved recognition of the third keyword in either the noise or speech masker. However, if the last four syllables of the eight-syllable prime were replaced by four irrelevant syllables (which did not occur in the target sentence), all the prime-induced speech-recognition improvements disappeared. Thus, knowing the early part of the target sentence mainly reduces informational masking of target speech, possibly by helping listeners attend to the target speech. Increasing the informative content of the prime further improves target-speech recognition probably by reducing the processing load. The reduction of the priming effect by adding irrelevant information to the prime is not due to introducing additional masking of the target speech.


Asunto(s)
Señales (Psicología) , Enmascaramiento Perceptual , Reconocimiento en Psicología , Percepción del Habla , Estimulación Acústica , Adulto , Análisis de Varianza , Audiometría de Tonos Puros , Audiometría del Habla , Umbral Auditivo , Femenino , Humanos , Masculino , Ruido/efectos adversos , Espectrografía del Sonido , Factores de Tiempo , Adulto Joven