Results 1 - 20 of 33
1.
J Acoust Soc Am ; 151(5): 3369, 2022 05.
Article in English | MEDLINE | ID: mdl-35649936

ABSTRACT

Lexical bias is the tendency to perceive an ambiguous speech sound as a phoneme completing a word; more ambiguity typically causes greater reliance on lexical knowledge. A speech sound ambiguous between /g/ and /k/ is more likely to be perceived as /g/ before /ɪft/ and as /k/ before /ɪs/. The magnitude of this difference (the Ganong shift) increases when high cognitive load limits available processing resources. The effects of stimulus naturalness and informational masking on Ganong shifts and reaction times were explored. Tokens between /gɪ/ and /kɪ/ were generated using morphing software, from which two continua were created ("giss"-"kiss" and "gift"-"kift"). In experiment 1, Ganong shifts were considerably larger for sine- than noise-vocoded versions of these continua, presumably because the spectral sparsity and unnatural timbre of the former increased cognitive load. In experiment 2, noise-vocoded stimuli were presented alone or accompanied by contralateral interferers with constant within-band amplitude envelope, or within-band envelope variation that was the same or different across bands. The latter, with its implied spectro-temporal variation, was predicted to cause the greatest cognitive load. Reaction-time measures matched this prediction; Ganong shifts showed some evidence of greater lexical bias for frequency-varying interferers, but were influenced by context effects and diminished over time.


Subject(s)
Speech Perception; Bias; Noise/adverse effects; Phonetics; Reaction Time
2.
J Acoust Soc Am ; 149(6): 3769, 2021 06.
Article in English | MEDLINE | ID: mdl-34241493

ABSTRACT

Three experiments explored the effects of abrupt changes in stimulus properties on streaming dynamics. Listeners monitored 20-s-long low- and high-frequency (LHL-) tone sequences and reported the number of streams heard throughout. Experiments 1 and 2 used pure tones and examined the effects of changing triplet base frequency and level, respectively. Abrupt changes in base frequency (±3-12 semitones) caused significant magnitude-related falls in segregation (resetting), regardless of transition direction, but an asymmetry occurred for changes in level (±12 dB). Rising-level transitions usually decreased segregation significantly, whereas falling-level transitions had little or no effect. Experiment 3 used pure tones (unmodulated) and narrowly spaced (±25 Hz) tone pairs (dyads); the two evoke similar excitation patterns, but dyads are strongly modulated with a distinctive timbre. Dyad-only sequences induced a strongly segregated percept, limiting scope for further build-up. Alternation between groups of pure tones and dyads produced large, asymmetric changes in streaming. Dyad-to-pure transitions caused substantial resetting, but pure-to-dyad transitions sometimes elicited even greater segregation than for the corresponding interval in dyad-only sequences (overshoot). The results indicate that abrupt changes in timbre can strongly affect the likelihood of stream segregation without introducing significant peripheral-channeling cues. These asymmetric effects of transition direction are reminiscent of subtractive adaptation in vision.


Subject(s)
Auditory Perception; Hearing; Acoustic Stimulation; Adaptation, Physiological; Cues; Time Factors
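
The step sizes quoted in this abstract (±3-12 semitones, ±12 dB) can be turned into frequency and amplitude ratios with the standard equal-temperament and decibel formulas. The sketch below is only an illustration: the 1000-Hz base frequency is a hypothetical value, not one reported in the abstract.

```python
import math


def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for an interval expressed in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def db_to_amplitude_ratio(db: float) -> float:
    """Amplitude (pressure) ratio for a level change expressed in dB."""
    return 10.0 ** (db / 20.0)


base_hz = 1000.0  # hypothetical base frequency, not taken from the abstract
for step in (3, 12):
    up = base_hz * semitones_to_ratio(step)
    down = base_hz * semitones_to_ratio(-step)
    print(f"+/-{step} semitones around {base_hz:.0f} Hz: {down:.1f} Hz to {up:.1f} Hz")

print(f"+12 dB is an amplitude ratio of {db_to_amplitude_ratio(12):.2f}")
```

A 12-semitone change is one octave (a factor of two in frequency), whereas a 12-dB change is roughly a factor of four in amplitude.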
3.
J Acoust Soc Am ; 150(5): 3693, 2021 11.
Article in English | MEDLINE | ID: mdl-34852626

ABSTRACT

Speech-on-speech informational masking arises because the interferer disrupts target processing (e.g., capacity limitations) or corrupts it (e.g., intrusions into the target percept); the latter should produce predictable errors. Listeners identified the consonant in monaural buzz-excited three-formant analogues of approximant-vowel syllables, forming a place of articulation series (/w/-/l/-/j/). There were two 11-member series; the vowel was either high-front or low-back. Series members shared formant-amplitude contours, fundamental frequency, and F1+F3 frequency contours; they were distinguished solely by the F2 frequency contour before the steady portion. Targets were always presented in the left ear. For each series, F2 frequency and amplitude contours were also used to generate interferers with altered source properties: sine-wave analogues of F2 (sine bleats) matched to their buzz-excited counterparts. Accompanying each series member with a fixed mismatched sine bleat in the contralateral ear produced systematic and predictable effects on category judgments; these effects were usually largest for bleats involving the fastest rate or greatest extent of frequency change. Judgments of isolated sine bleats using the three place labels were often unsystematic or arbitrary. These results indicate that informational masking by interferers involved corruption of target processing as a result of mandatory dichotic integration of F2 information, despite the grouping cues disfavoring this integration.


Subject(s)
Speech Intelligibility; Speech Perception; Acoustic Stimulation; Judgment; Phonetics; Speech Acoustics
4.
J Acoust Soc Am ; 148(4): 2416, 2020 10.
Article in English | MEDLINE | ID: mdl-33138537

ABSTRACT

The impact of an extraneous formant on intelligibility is affected by the extent (depth) of variation in its formant-frequency contour. Two experiments explored whether this impact also depends on masker spectro-temporal coherence, using a method ensuring that interference occurred only through informational masking. Targets were monaural three-formant analogues (F1+F2+F3) of natural sentences presented alone or accompanied by a contralateral competitor for F2 (F2C) that listeners must reject to optimize recognition. The standard F2C was created using the inverted F2 frequency contour and constant amplitude. Variants were derived by dividing F2C into abutting segments (100-200 ms, 10-ms rise/fall). Segments were presented either in the correct order (coherent) or in random order (incoherent), introducing abrupt discontinuities into the F2C frequency contour. F2C depth was also manipulated (0%, 50%, or 100%) prior to segmentation, and the frequency contour of each segment either remained time-varying or was set to constant at the geometric mean frequency of that segment. The extent to which F2C lowered keyword scores depended on segment type (frequency-varying vs constant) and depth, but not segment order. This outcome indicates that the impact on intelligibility depends critically on the overall amount of frequency variation in the competitor, but not its spectro-temporal coherence.


Subject(s)
Perceptual Masking; Speech Intelligibility; Speech Perception; Humans; Recognition, Psychology
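
The segment manipulations described in this abstract (abutting 100-200 ms segments, coherent or random order, time-varying or constant at the segment's geometric mean) can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' procedure: the 10-ms frame period, the function names, and the example contour are all assumptions, and the 10-ms rise/fall amplitude ramps are not modelled.

```python
import math
import random

FRAME_MS = 10  # hypothetical frame period of the sampled frequency contour


def split_into_segments(contour, rng, min_ms=100, max_ms=200):
    """Cut a contour (Hz per frame) into abutting segments of random duration.

    The final segment may be shorter than min_ms if the contour runs out.
    """
    segments, start = [], 0
    while start < len(contour):
        n_frames = rng.randint(min_ms // FRAME_MS, max_ms // FRAME_MS)
        segments.append(contour[start:start + n_frames])
        start += n_frames
    return segments


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def build_competitor(contour, coherent, constant, seed=0):
    """Return a competitor contour with the requested segment order and type."""
    rng = random.Random(seed)
    segments = split_into_segments(contour, rng)
    if constant:
        # Replace each segment with a flat line at its own geometric mean.
        segments = [[geometric_mean(s)] * len(s) for s in segments]
    if not coherent:
        # Random order introduces abrupt discontinuities at segment joins.
        rng.shuffle(segments)
    return [f for s in segments for f in s]


# Hypothetical 2-s contour wandering around 1500 Hz.
contour = [1500.0 * 2 ** (0.3 * math.sin(2 * math.pi * i / 50)) for i in range(200)]
incoherent = build_competitor(contour, coherent=False, constant=False)
stepped = build_competitor(contour, coherent=True, constant=True)
```

Shuffling only reorders the segments, so the overall set of frequencies visited is unchanged; that is the property that lets segment order be separated from the overall amount of frequency variation.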
5.
J Acoust Soc Am ; 147(2): 1113, 2020 02.
Article in English | MEDLINE | ID: mdl-32113320

ABSTRACT

Masking experienced when target speech is accompanied by a single interfering voice is often primarily informational masking (IM). IM is generally greater when the interferer is intelligible than when it is not (e.g., speech from an unfamiliar language), but the relative contributions of acoustic-phonetic and linguistic interference are often difficult to assess owing to acoustic differences between interferers (e.g., different talkers). Three-formant analogues (F1+F2+F3) of natural sentences were used as targets and interferers. Targets were presented monaurally either alone or accompanied contralaterally by interferers from another sentence (F0 = 4 semitones higher); a target-to-masker ratio (TMR) between ears of 0, 6, or 12 dB was used. Interferers were either intelligible or rendered unintelligible by delaying F2 and advancing F3 by 150 ms relative to F1, a manipulation designed to minimize spectro-temporal differences between corresponding interferers. Target-sentence intelligibility (keywords correct) was 67% when presented alone, but fell considerably when an unintelligible interferer was present (49%) and significantly further when the interferer was intelligible (41%). Changes in TMR produced neither a significant main effect nor an interaction with interferer type. Interference with acoustic-phonetic processing of the target can explain much of the impact on intelligibility, but linguistic factors, particularly interferer intrusions, also make an important contribution to IM.

6.
J Acoust Soc Am ; 145(3): 1230, 2019 03.
Article in English | MEDLINE | ID: mdl-31067923

ABSTRACT

Differences in ear of presentation and level do not prevent effective integration of concurrent speech cues such as formant frequencies. For example, presenting the higher formants of a consonant-vowel syllable in the opposite ear to the first formant protects them from upward spread of masking, allowing them to remain effective speech cues even after substantial attenuation. This study used three-formant (F1+F2+F3) analogues of natural sentences and extended the approach to include competitive conditions. Target formants were presented dichotically (F1+F3; F2), either alone or accompanied by an extraneous competitor for F2 (i.e., F1±F2C+F3; F2) that listeners must reject to optimize recognition. F2C was created by inverting the F2 frequency contour and using the F2 amplitude contour without attenuation. In experiment 1, F2C was always absent and intelligibility was unaffected until F2 attenuation exceeded 30 dB; F2 still provided useful information at 48-dB attenuation. In experiment 2, attenuating F2 by 24 dB caused considerable loss of intelligibility when F2C was present, but had no effect in its absence. Factors likely to contribute to this interaction include informational masking from F2C acting to swamp the acoustic-phonetic information carried by F2, and interaural inhibition from F2C acting to reduce the effective level of F2.

7.
J Acoust Soc Am ; 143(2): 891, 2018 02.
Article in English | MEDLINE | ID: mdl-29495741

ABSTRACT

This study explored the extent to which informational masking of speech depends on the frequency region and number of extraneous formants in an interferer. Target formants, monotonized three-formant (F1+F2+F3) analogues of natural sentences, were presented monaurally, with target ear assigned randomly on each trial. Interferers were presented contralaterally. In experiment 1, single-formant interferers were created using the time-reversed F2 frequency contour and constant amplitude, root-mean-square (RMS)-matched to F2. Interferer center frequency was matched to that of F1, F2, or F3, while maintaining the extent of formant-frequency variation (depth) on a log scale. Adding an interferer lowered intelligibility; the effect of frequency region was small and broadly tuned around F2. In experiment 2, interferers comprised either one formant (F1, the most intense) or all three, created using the time-reversed frequency contours of the corresponding targets and RMS-matched constant amplitudes. Interferer formant-frequency variation was scaled to 0%, 50%, or 100% of the original depth. Increasing the depth of formant-frequency variation and number of formants in the interferer had independent and additive effects. These findings suggest that the impact on intelligibility depends primarily on the overall extent of frequency variation in each interfering formant (up to ∼100% depth) and the number of extraneous formants.

8.
J Acoust Soc Am ; 144(6): 3409, 2018 12.
Article in English | MEDLINE | ID: mdl-30599694

ABSTRACT

Stream segregation for a test sequence comprising high-frequency (H) and low-frequency (L) pure tones, presented in a galloping rhythm, is much greater when preceded by a constant-frequency induction sequence matching one subset than by an inducer configured like the test sequence; this difference persists for several seconds. It has been proposed that constant-frequency inducers promote stream segregation by capturing the matching subset of test-sequence tones into an on-going, pre-established stream. This explanation was evaluated using 2-s induction sequences followed by longer test sequences (12-20 s). Listeners reported the number of streams heard throughout the test sequence. Experiment 1 used LHL- sequences and one or other subset of inducer tones was attenuated (0-24 dB in 6-dB steps, and ∞). Greater attenuation usually caused a progressive increase in segregation, towards that following the constant-frequency inducer. Experiment 2 used HLH- sequences and the L inducer tones were raised or lowered in frequency relative to their test-sequence counterparts (ΔfI = 0, 0.5, 1.0, or 1.5 × ΔfT ). Either change greatly increased segregation. These results are concordant with the notion of attention switching to new sounds but contradict the stream-capture hypothesis, unless a "proto-object" corresponding to the continuing subset is assumed to form during the induction sequence.

9.
J Acoust Soc Am ; 140(2): 1227, 2016 08.
Article in English | MEDLINE | ID: mdl-27586751

ABSTRACT

The role of source properties in across-formant integration was explored using three-formant (F1+F2+F3) analogues of natural sentences (targets). In experiment 1, F1+F3 were harmonic analogues (H1+H3) generated using a monotonous buzz source and second-order resonators; in experiment 2, F1+F3 were tonal analogues (T1+T3). F2 could take either form (H2 or T2). Target formants were always presented monaurally; the receiving ear was assigned randomly on each trial. In some conditions, only the target was present; in others, a competitor for F2 (F2C) was presented contralaterally. Buzz-excited or tonal competitors were created using the time-reversed frequency and amplitude contours of F2. Listeners must reject F2C to optimize keyword recognition. Whether or not a competitor was present, there was no effect of source mismatch between F1+F3 and F2. The impact of adding F2C was modest when it was tonal but large when it was harmonic, irrespective of whether F2C matched F1+F3. This pattern was maintained when harmonic and tonal counterparts were loudness-matched (experiment 3). Source type and competition, rather than acoustic similarity, governed the phonetic contribution of a formant. Contrary to earlier research using dichotic targets, requiring across-ear integration to optimize intelligibility, H2C was an equally effective informational masker for H2 as for T2.


Subject(s)
Speech Acoustics; Speech Intelligibility/physiology; Acoustic Stimulation; Humans; Phonetics; Random Allocation; Speech Perception
10.
J Vis ; 15(1): 15.1.12, 2015 Jan 14.
Article in English | MEDLINE | ID: mdl-25589296

ABSTRACT

To extend our understanding of the early visual hierarchy, we investigated the long-range integration of first- and second-order signals in spatial vision. In our first experiment we performed a conventional area summation experiment where we varied the diameter of (a) luminance-modulated (LM) noise and (b) contrast-modulated (CM) noise. Results from the LM condition replicated previous findings with sine-wave gratings in the absence of noise, consistent with long-range integration of signal contrast over space. For CM, the summation function was much shallower than for LM suggesting, at first glance, that the signal integration process was spatially less extensive than for LM. However, an alternative possibility was that the high spatial frequency noise carrier for the CM signal was attenuated by peripheral retina (or cortex), thereby impeding our ability to observe area summation of CM in the conventional way. To test this, we developed the "Swiss cheese" stimulus of Meese and Summers (2007) in which signal area can be varied without changing the stimulus diameter, providing some protection against inhomogeneity of the retinal field. Using this technique and a two-component subthreshold summation paradigm we found that (a) CM is spatially integrated over at least five stimulus cycles (possibly more), (b) spatial integration follows square-law signal transduction for both LM and CM and (c) the summing device integrates over spatially-interdigitated LM and CM signals when they are co-oriented, but not when cross-oriented. The spatial pooling mechanism that we have identified would be a good candidate component for a module involved in representing visual textures, including their spatial extent.


Subject(s)
Light; Postsynaptic Potential Summation/physiology; Space Perception/physiology; Cues; Humans; Models, Biological; Sensory Thresholds
11.
J Acoust Soc Am ; 137(5): 2726-36, 2015 May.
Article in English | MEDLINE | ID: mdl-25994702

ABSTRACT

Recent research suggests that the ability of an extraneous formant to impair intelligibility depends on the variation of its frequency contour. This idea was explored using a method that ensures interference cannot occur through energetic masking. Three-formant (F1 + F2 + F3) analogues of natural sentences were synthesized using a monotonous periodic source. Target formants were presented monaurally, with the target ear assigned randomly on each trial. A competitor for F2 (F2C) was presented contralaterally; listeners must reject F2C to optimize recognition. In experiment 1, F2Cs with various frequency and amplitude contours were used. F2Cs with time-varying frequency contours were effective competitors; constant-frequency F2Cs had far less impact. To a lesser extent, amplitude contour also influenced competitor impact; this effect was additive. In experiment 2, F2Cs were created by inverting the F2 frequency contour about its geometric mean and varying its depth of variation over a range from constant to twice the original (0%-200%). The impact on intelligibility was least for constant F2Cs and increased up to ∼100% depth, but little thereafter. The effect of an extraneous formant depends primarily on its frequency contour; interference increases as the depth of variation is increased until the range exceeds that typical for F2 in natural speech.


Subject(s)
Perceptual Masking; Speech Acoustics; Speech Intelligibility; Speech Perception; Acoustic Stimulation; Acoustics; Adolescent; Adult; Audiometry, Speech; Female; Humans; Male; Recognition, Psychology; Sound Spectrography; Time Factors; Young Adult
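
Experiment 2 of this abstract derives competitors by inverting the F2 frequency contour about its geometric mean and scaling its depth of variation from 0% to 200%. One way to express that, assuming the inversion and scaling are applied to log frequency (which is what inversion about a geometric mean implies), is sketched below. The function names and example values are hypothetical.

```python
import math


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def invert_and_scale(contour_hz, depth_percent):
    """Invert a contour about its geometric mean and scale its depth.

    On a log-frequency axis each sample's deviation from the geometric mean
    is negated (inversion) and multiplied by depth_percent / 100 (scaling).
    0% gives a constant contour, 100% a plain inversion.
    """
    gm = geometric_mean(contour_hz)
    scale = depth_percent / 100.0
    return [gm * (gm / f) ** scale for f in contour_hz]


f2 = [1200.0, 1500.0, 1800.0, 1500.0]  # hypothetical F2 samples in Hz
for depth in (0, 50, 100, 200):
    scaled = invert_and_scale(f2, depth)
    print(depth, [round(f, 1) for f in scaled])
```

Working in log frequency keeps the geometric mean of the competitor equal to that of the original contour at every depth, so only the extent of variation changes.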
12.
Adv Exp Med Biol ; 787: 323-31, 2013.
Article in English | MEDLINE | ID: mdl-23716238

ABSTRACT

How speech is separated perceptually from other speech remains poorly understood. In a series of experiments, perceptual organisation was probed by presenting three-formant (F1+F2+F3) analogues of target sentences dichotically, together with a competitor for F2 (F2C), or for F2+F3, which listeners must reject to optimise recognition. To control for energetic masking, the competitor was always presented in the opposite ear to the corresponding target formant(s). Sine-wave speech was used initially, and different versions of F2C were derived from F2 using separate manipulations of its amplitude and frequency contours. F2Cs with time-varying frequency contours were highly effective competitors, whatever their amplitude characteristics, whereas constant-frequency F2Cs were ineffective. Subsequent studies used synthetic-formant speech to explore the effects of manipulating the rate and depth of formant-frequency change in the competitor. Competitor efficacy was not tuned to the rate of formant-frequency variation in the target sentences; rather, the reduction in intelligibility increased with competitor rate relative to the rate for the target sentences. Therefore, differences in speech rate may not be a useful cue for separating the speech of concurrent talkers. Effects of competitors whose depth of formant-frequency variation was scaled by a range of factors were explored using competitors derived either by inverting the frequency contour of F2 about its geometric mean (plausibly speech-like pattern) or by using a regular and arbitrary frequency contour (triangle wave, not plausibly speech-like) matched to the average rate and depth of variation for the inverted F2C. Competitor efficacy depended on the overall depth of frequency variation, not depth relative to that for the other formants. Furthermore, the triangle-wave competitors were as effective as their more speech-like counterparts. Overall, the results suggest that formant-frequency variation is critical for the across-frequency grouping of formants but that this grouping does not depend on speech-specific constraints.


Subject(s)
Models, Biological; Phonetics; Sound Localization/physiology; Speech Intelligibility; Speech Perception/physiology; Acoustic Stimulation/methods; Cues; Dichotic Listening Tests; Humans; Male; Perceptual Masking/physiology; Speech Acoustics; Speech Discrimination Tests
13.
PLoS One ; 18(5): e0285423, 2023.
Article in English | MEDLINE | ID: mdl-37155632

ABSTRACT

One of the primary jobs of visual perception is to build a three-dimensional representation of the world around us from our flat retinal images. These are a rich source of depth cues but no single one of them can tell us about scale (i.e., absolute depth and size). For example, the pictorial depth cues in a (perfect) scale model are identical to those in the real scene that is being modelled. Here we investigate image blur gradients, which derive naturally from the limited depth of field available for any optical device and can be used to help estimate visual scale. By manipulating image blur artificially to produce what is sometimes called fake tilt shift miniaturization, we provide the first performance-based evidence that human vision uses this cue when making forced-choice judgements about scale (identifying which of an image pair was a photograph of a full-scale railway scene, and which was a 1:76 scale model). The orientation of the blur gradient (relative to the ground plane) proves to be crucial, though its rate of change is less important for our task, suggesting a fairly coarse visual analysis of this image parameter.


Subject(s)
Depth Perception; Visual Perception; Humans; Cues; Gravitation; Judgment
14.
J Vis ; 12(11), 2012 Oct 17.
Article in English | MEDLINE | ID: mdl-23077206

ABSTRACT

Contrast sensitivity improves with the area of a sine-wave grating, but why? Here we assess this phenomenon against contemporary models involving spatial summation, probability summation, uncertainty, and stochastic noise. Using a two-interval forced-choice procedure we measured contrast sensitivity for circular patches of sine-wave gratings with various diameters that were blocked or interleaved across trials to produce low and high extrinsic uncertainty, respectively. Summation curves were steep initially, becoming shallower thereafter. For the smaller stimuli, sensitivity was slightly worse for the interleaved design than for the blocked design. Neither area nor blocking affected the slope of the psychometric function. We derived model predictions for noisy mechanisms and extrinsic uncertainty that was either low or high. The contrast transducer was either linear (c^1.0) or nonlinear (c^2.0), and pooling was either linear or a MAX operation. There was either no intrinsic uncertainty, or it was fixed or proportional to stimulus size. Of these 10 canonical models, only the nonlinear transducer with linear pooling (the noisy energy model) described the main forms of the data for both experimental designs. We also show how a cross-correlator can be modified to fit our results and provide a contemporary presentation of the relation between summation and the slope of the psychometric function.


Subject(s)
Contrast Sensitivity/physiology; Probability; Psychometrics/methods; Sensory Thresholds/physiology; Space Perception/physiology; Uncertainty; Humans; Photic Stimulation/methods
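
The contrast between a linear transducer and a square-law transducer under linear pooling can be illustrated with a deliberately simplified signal-detection sketch. The assumptions here are mine rather than the paper's: n identical mechanisms, each adding independent Gaussian noise of equal standard deviation before pooling, a signal that drives all n equally, and threshold defined as d' = 1. Under those assumptions d' = sqrt(n) * c^p / sigma, where p is the transducer exponent.

```python
import math


def threshold(n_mechanisms, exponent, noise_sd=1.0, criterion=1.0):
    """Contrast at which d' reaches the criterion under linear pooling.

    Assumes d' = sqrt(n) * c**exponent / noise_sd (independent, equal noise).
    """
    return (criterion * noise_sd / math.sqrt(n_mechanisms)) ** (1.0 / exponent)


def ratio_to_db(ratio):
    return 20.0 * math.log10(ratio)


# Improvement in threshold when the signal area (n) quadruples.
for label, p in (("linear transducer", 1.0), ("square-law transducer", 2.0)):
    gain = threshold(1, p) / threshold(4, p)
    print(f"{label}: {ratio_to_db(gain):.2f} dB")
```

In this toy version, thresholds fall with the square root of area for the linear transducer and with the fourth root of area for the square-law (energy) transducer. It yields a single power-law slope only, so it does not reproduce the steep-then-shallow summation curves or the uncertainty effects described in the abstract.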
15.
PLoS One ; 17(5): e0267056, 2022.
Article in English | MEDLINE | ID: mdl-35511914

ABSTRACT

Image processing algorithms are used to improve digital image representations in either their appearance or storage efficiency. The merit of these algorithms depends, in part, on visual perception by human observers. However, in practice, most are assessed numerically, and the perceptual metrics that do exist are criterion sensitive with several shortcomings. Here we propose an objective performance-based perceptual measure of image quality and demonstrate this by comparing the efficacy of a denoising algorithm for a variety of filters. For baseline, we measured detection thresholds for a white noise signal added to one of a pair of natural images in a two-alternative forced-choice (2AFC) paradigm where each image was selected randomly from a set of n = 308 on each trial. In a series of experimental conditions, the stimulus image pairs were passed through various configurations of a denoising algorithm. The differences in noise detection thresholds with and without denoising are objective perceptual measures of the ability of the algorithm to render noise invisible. This was a factor of two (6 dB) in our experiment and consistent across a range of filter bandwidths and types. We also found that thresholds in all conditions converged on a common value of PSNR, offering support for this metric. We discuss how the 2AFC approach might be used for other algorithms including compression, deblurring and edge-detection. Finally, we provide a derivation for our Cartesian-separable log-Gabor filters, with polar parameters. For the biological vision community this has some advantages over the more typical (i) polar-separable variety and (ii) Cartesian-separable variety with Cartesian parameters.


Subject(s)
Data Compression; Image Processing, Computer-Assisted; Algorithms; Humans; Image Processing, Computer-Assisted/methods; Noise; Signal-To-Noise Ratio
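
PSNR, the metric mentioned in this abstract, has a standard definition: 10 * log10(peak^2 / MSE). The sketch below applies it to flat lists of pixel values; the 8-bit peak of 255 and the toy pixel data are assumptions for illustration and say nothing about how the authors computed it.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


clean = [100.0, 120.0, 140.0, 160.0]   # hypothetical pixel values
noise = [1.0, -1.0, 1.0, -1.0]         # hypothetical noise sample
noisy = [c + n for c, n in zip(clean, noise)]
noisier = [c + 2 * n for c, n in zip(clean, noise)]

print(f"{psnr(clean, noisy):.2f} dB")
print(f"{psnr(clean, noisier):.2f} dB")
```

Doubling the noise amplitude quadruples the mean squared error, so PSNR drops by about 6 dB, which matches the correspondence between a factor of two and 6 dB stated in the abstract.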
16.
Proc Biol Sci ; 278(1711): 1595-600, 2011 May 22.
Article in English | MEDLINE | ID: mdl-21068039

ABSTRACT

Noise-vocoded (NV) speech is often regarded as conveying phonetic information primarily through temporal-envelope cues rather than spectral cues. However, listeners may infer the formant frequencies in the vocal-tract output, a key source of phonetic detail, from across-band differences in amplitude when speech is processed through a small number of channels. The potential utility of this spectral information was assessed for NV speech created by filtering sentences into six frequency bands, and using the amplitude envelope of each band (≤30 Hz) to modulate a matched noise-band carrier (N). Bands were paired, corresponding to F1 (≈N1 + N2), F2 (≈N3 + N4) and the higher formants (F3' ≈ N5 + N6), such that the frequency contour of each formant was implied by variations in relative amplitude between bands within the corresponding pair. Three-formant analogues (F0 = 150 Hz) of the NV stimuli were synthesized using frame-by-frame reconstruction of the frequency and amplitude of each formant. These analogues were less intelligible than the NV stimuli or analogues created using contours extracted from spectrograms of the original sentences, but more intelligible than when the frequency contours were replaced with constant (mean) values. Across-band comparisons of amplitude envelopes in NV speech can provide phonetically important information about the frequency contours of the underlying formants.


Subject(s)
Speech Acoustics; Speech Intelligibility; Acoustic Stimulation; Adult; Female; Humans; Male; Middle Aged; Phonetics; Sound Spectrography; Speech Perception
17.
J Acoust Soc Am ; 128(6): 3667-77, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21218899

ABSTRACT

In an isolated syllable, a formant will tend to be segregated perceptually if its fundamental frequency (F0) differs from that of the other formants. This study explored whether similar results are found for sentences, and specifically whether differences in F0 (ΔF0) also influence across-formant grouping in circumstances where the exclusion or inclusion of the manipulated formant critically determines speech intelligibility. Three-formant (F1 + F2 + F3) analogues of almost continuously voiced natural sentences were synthesized using a monotonous glottal source (F0 = 150 Hz). Perceptual organization was probed by presenting stimuli dichotically (F1 + F2C + F3; F2), where F2C is a competitor for F2 that listeners must resist to optimize recognition. Competitors were created using time-reversed frequency and amplitude contours of F2, and F0 was manipulated (ΔF0 = ±8, ±2, or 0 semitones relative to the other formants). Adding F2C typically reduced intelligibility, and this reduction was greatest when ΔF0 = 0. There was an additional effect of absolute F0 for F2C, such that competitor efficacy was greater for higher F0s. However, competitor efficacy was not due to energetic masking of F3 by F2C. The results are consistent with the proposal that a grouping "primitive" based on common F0 influences the fusion and segregation of concurrent formants in sentence perception.


Subject(s)
Auditory Pathways/physiology; Speech Acoustics; Speech Perception; Acoustic Stimulation; Adolescent; Adult; Auditory Threshold; Comprehension; Dichotic Listening Tests; Female; Humans; Male; Middle Aged; Perceptual Masking; Pitch Perception; Sound Spectrography; Speech Intelligibility; Time Factors; Young Adult
18.
J Acoust Soc Am ; 128(2): 804-17, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20707450

ABSTRACT

Speech comprises dynamic and heterogeneous acoustic elements, yet it is heard as a single perceptual stream even when accompanied by other sounds. The relative contributions of grouping "primitives" and of speech-specific grouping factors to the perceptual coherence of speech are unclear, and the acoustical correlates of the latter remain unspecified. The parametric manipulations possible with simplified speech signals, such as sine-wave analogues, make them attractive stimuli to explore these issues. Given that the factors governing perceptual organization are generally revealed only where competition operates, the second-formant competitor (F2C) paradigm was used, in which the listener must resist competition to optimize recognition [Remez, R. E., et al. (1994). Psychol. Rev. 101, 129-156]. Three-formant (F1+F2+F3) sine-wave analogues were derived from natural sentences and presented dichotically (one ear=F1+F2C+F3; opposite ear=F2). Different versions of F2C were derived from F2 using separate manipulations of its amplitude and frequency contours. F2Cs with time-varying frequency contours were highly effective competitors, regardless of their amplitude characteristics. In contrast, F2Cs with constant frequency contours were completely ineffective. Competitor efficacy was not due to energetic masking of F3 by F2C. These findings indicate that modulation of the frequency, but not the amplitude, contour is critical for across-formant grouping.


Subject(s)
Auditory Pathways/physiology; Perceptual Masking; Pitch Perception; Signal Detection, Psychological; Speech Intelligibility; Speech Perception; Acoustic Stimulation; Auditory Threshold; Dichotic Listening Tests; Humans; Speech Acoustics; Time Factors
19.
J Vis ; 9(4): 7.1-16, 2009 Apr 06.
Article in English | MEDLINE | ID: mdl-19757916

ABSTRACT

We assessed summation of contrast across eyes and area at detection threshold (C_t). Stimuli were sine-wave gratings (2.5 c/deg) spatially modulated by cosine- and anticosine-phase raised plaids (0.5 c/deg components oriented at ±45°). When presented dichoptically the signal regions were interdigitated across eyes but produced a smooth continuous grating following their linear binocular sum. The average summation ratio (C_{t1}/C_{t1+2}) for this stimulus pair was 1.64 (4.3 dB). This was only slightly less than the binocular summation found for the same patch type presented to both eyes, and the area summation found for the two different patch types presented to the same eye. We considered 192 model architectures containing each of the following four elements in all possible orders: (i) linear summation or a MAX operator across eyes, (ii) linear summation or a MAX operator across area, (iii) linear or accelerating contrast transduction, and (iv) additive Gaussian, stochastic noise. Formal equivalences reduced this to 62 different models. The most successful four-element model was: linear summation across eyes followed by nonlinear contrast transduction, linear summation across area, and late noise. Model performance was enhanced when additional nonlinearities were placed before binocular summation and after area summation. The implications for models of probability summation and uncertainty are discussed.


Subject(s)
Contrast Sensitivity/physiology; Models, Neurological; Vision, Binocular/physiology; Visual Perception/physiology; Humans; Nonlinear Dynamics; Normal Distribution; Photic Stimulation/methods; Stochastic Processes
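
The summation ratio in this abstract is reported both as a plain ratio (1.64) and in decibels (4.3 dB). The two agree under the usual contrast convention of 20 * log10(ratio), as the short check below shows. The threshold values passed to `summation_ratio` are hypothetical numbers chosen only to give a ratio of 1.64.

```python
import math


def summation_ratio(threshold_single, threshold_combined):
    """Ratio of the threshold for one component to that for both together."""
    return threshold_single / threshold_combined


def ratio_to_db(ratio):
    """Express a contrast ratio in decibels (20 * log10)."""
    return 20.0 * math.log10(ratio)


ratio = summation_ratio(3.28, 2.0)  # hypothetical thresholds (% contrast)
print(f"ratio = {ratio:.2f}, {ratio_to_db(ratio):.1f} dB")  # ratio = 1.64, 4.3 dB
```

For reference, a ratio of 2 corresponds to about 6 dB and a ratio of sqrt(2) to about 3 dB, so the reported value lies between those two benchmarks.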
20.
Vis Neurosci ; 25(4): 585-601, 2008.
Article in English | MEDLINE | ID: mdl-18764960

ABSTRACT

Recent work has revealed multiple pathways for cross-orientation suppression in cat and human vision. In particular, ipsiocular and interocular pathways appear to assert their influence before binocular summation in human but have different (1) spatial tuning, (2) temporal dependencies, and (3) adaptation after-effects. Here we use mask components that fall outside the excitatory passband of the detecting mechanism to investigate the rules for pooling multiple mask components within these pathways. We measured psychophysical contrast masking functions for vertical 1 cycle/deg sine-wave gratings in the presence of left or right oblique ( 16%. We tested contrast gain control models involving two types of contrast combination on the denominator: (1) spatial pooling of the mask after a local nonlinearity (to calculate either root mean square contrast or energy) and (2) (Holmes & Meese, 2004, Journal of Vision 4, 1080-1089), involving the linear sum of the mask component contrasts. Monoptic and dichoptic masking were typically better fit by the spatial pooling models, but binocular masking was not: it demanded strict linear summation of the Michelson contrast across mask orientation. Another scheme, in which suppressive pooling followed compressive contrast responses to the mask components (e.g., oriented cortical cells), was ruled out by all of our data. We conclude that the different processes that underlie monoptic and dichoptic masking use the same type of contrast pooling within their respective suppressive fields, but the effects do not sum to predict the binocular case.


Subject(s)
Contrast Sensitivity/physiology; Ocular Physiological Phenomena; Perceptual Masking/physiology; Adult; Geniculate Bodies/physiology; Humans; Models, Psychological; Orientation/physiology; Psychophysics; Space Perception/physiology; Vision, Binocular/physiology; Vision, Monocular/physiology; Young Adult