Spectro-temporal acoustical markers differentiate speech from song across cultures.
Nat Commun; 15(1): 4835, 2024 Jun 06.
Article
MEDLINE | ID: mdl-38844457
ABSTRACT
Humans produce two forms of cognitively complex vocalizations: speech and song. It is debated whether these differ primarily in culturally specific, learned features, or whether acoustical features can reliably distinguish them. We study the spectro-temporal modulation patterns of vocalizations produced by 369 people living in 21 urban, rural, and small-scale societies across six continents. Specific ranges of spectral and temporal modulations, overlapping within categories and across societies, significantly differentiate speech from song. Machine-learning classification shows that this effect is cross-culturally robust: vocalizations are reliably classified solely from their spectro-temporal features across all 21 societies. Listeners unfamiliar with the cultures classify these vocalizations using spectro-temporal cues similar to those used by the machine-learning algorithm. Finally, spectro-temporal features discriminate song from speech better than a broad range of other acoustical variables, suggesting that spectro-temporal modulation, a key feature of auditory neuronal tuning, accounts for a fundamental difference between these categories.
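The spectro-temporal modulation features named in the abstract are commonly obtained as the 2D Fourier transform of a (log-)spectrogram, yielding power as a function of temporal modulation rate (Hz) and spectral modulation scale (cycles/channel). A minimal NumPy sketch of this idea follows; the function names, the 4 Hz cutoff, and the synthetic toy spectrograms are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def modulation_power(spectrogram):
    """Spectro-temporal modulation power: magnitude-squared 2D FFT of a
    (log-)spectrogram. Axis 0 = frequency channels, axis 1 = time frames."""
    mps = np.abs(np.fft.fft2(spectrogram - spectrogram.mean())) ** 2
    return np.fft.fftshift(mps)

def low_rate_fraction(spectrogram, frame_rate=100.0, cutoff_hz=4.0):
    """Fraction of modulation power at temporal rates below `cutoff_hz`.
    Song tends to concentrate energy at slow temporal modulations, speech
    at faster ones; the cutoff value here is an illustrative assumption."""
    n_time = spectrogram.shape[1]
    mps = modulation_power(spectrogram)
    rates = np.fft.fftshift(np.fft.fftfreq(n_time, d=1.0 / frame_rate))
    low = np.abs(rates) < cutoff_hz
    return mps[:, low].sum() / mps.sum()

# Synthetic toy spectrograms: 64 channels, 200 frames at 100 frames/s.
t = np.arange(200) / 100.0
env = np.exp(-np.arange(64)[:, None] / 20.0)          # spectral envelope
song_like = env * np.cos(2 * np.pi * 2.0 * t)[None, :]    # slow 2 Hz modulation
speech_like = env * np.cos(2 * np.pi * 6.0 * t)[None, :]  # faster 6 Hz modulation

print(low_rate_fraction(song_like) > low_rate_fraction(speech_like))  # True
```

Because each toy spectrogram is a product of a spectral envelope and a temporal modulation, its 2D FFT energy lands at the corresponding modulation rate, so a single low-rate energy fraction already separates the two toy classes; a realistic classifier would instead be trained on the full modulation power spectrum.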
Full text: 1
Collections: 01-internacional
Database: MEDLINE
Main subject: Speech / Machine Learning
Limits: Adult / Female / Humans / Male / Middle Aged
Language: En
Journal: Nat Commun
Publication year: 2024
Document type: Article