Results 1 - 3 of 3
1.
Sci Adv ; 9(23): eabq2969, 2023 06 09.
Article in English | MEDLINE | ID: mdl-37294764

ABSTRACT

The genetic basis of the human vocal system is largely unknown, as are the sequence variants that give rise to individual differences in voice and speech. Here, we couple data on diversity in the sequence of the genome with voice and vowel acoustics in speech recordings from 12,901 Icelanders. We show how voice pitch and vowel acoustics vary across the life span and correlate with anthropometric, physiological, and cognitive traits. We found that voice pitch and vowel acoustics have a heritable component and discovered correlated common variants in ABCC9 that associate with voice pitch. The ABCC9 variants also associate with adrenal gene expression and cardiovascular traits. By showing that voice and vowel acoustics are influenced by genetics, we have taken important steps toward understanding the genetics and evolution of the human vocal system.
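
Voice pitch in this context corresponds to the fundamental frequency (F0) of the speech signal. As an illustration only (the abstract does not describe the study's own acoustic pipeline), a minimal Python sketch for estimating a speaker's median F0 from a recording, assuming librosa and a hypothetical file "speech.wav":

```python
# Illustrative sketch: estimating median voice pitch (F0) from a recording.
# Not the study's pipeline; "speech.wav" is a hypothetical mono recording.
import numpy as np
import librosa

y, sr = librosa.load("speech.wav", sr=16000)

# Probabilistic YIN gives a frame-level F0 track plus a voicing decision.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
)

# Summarise pitch over voiced frames only; the median is robust to octave errors.
median_f0 = np.nanmedian(f0[voiced_flag])
print(f"Median voice pitch: {median_f0:.1f} Hz")
```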


Subjects
Speech Acoustics, Voice, Humans, Speech/physiology, Acoustics
2.
IEEE J Biomed Health Inform ; 26(7): 3418-3426, 2022 07.
Article in English | MEDLINE | ID: mdl-35294367

ABSTRACT

The diagnosis of sleep-disordered breathing depends on the detection of respiratory-related events in sleep studies: apneas, hypopneas, snores, or respiratory event-related arousals. While a number of automatic detection methods have been proposed, their reproducibility has been an issue, in part due to the absence of a generally accepted protocol for evaluating their results. In sleep measurement this is usually treated as a classification problem, and the accompanying issue of localization is not treated as equally critical. To address these problems we present a detection evaluation protocol that assesses the quality of the match between two annotations of respiratory-related events. The protocol relies on measuring the relative temporal overlap between two annotations in order to find an alignment that maximizes their F1-score at the sequence level. It can be used in applications that require a precise estimate of the number of events, of the total event duration, or of both jointly. We assess its application using a data set that contains over 10,000 manually annotated snore events from 9 subjects, and show that, when using the American Academy of Sleep Medicine Manual standard, two sleep technologists can achieve an F1-score of 0.88 when identifying the presence of snore events. In addition, we drafted rules for marking snore boundaries and showed that a single sleep technologist can achieve an F1-score of 0.94 on the same task. Finally, we compared this protocol against the protocol used to evaluate sleep spindle detection and highlighted the differences.
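
As an illustration of overlap-based event matching scored with an F1 measure, a minimal Python sketch follows; the greedy one-to-one matching rule and the 0.5 overlap threshold are assumptions for the example, not the paper's exact alignment procedure:

```python
# Sketch of overlap-based matching between two event annotations,
# scored as precision/recall/F1. Threshold and matching rule are assumptions.
from typing import List, Tuple

Event = Tuple[float, float]  # (start, end) in seconds

def relative_overlap(a: Event, b: Event) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

def event_f1(reference: List[Event], detected: List[Event],
             min_overlap: float = 0.5) -> float:
    """Greedy one-to-one matching of detected events to reference events."""
    matched_ref = set()
    tp = 0
    for d in detected:
        # Pick the unmatched reference event with the highest overlap.
        best = max(
            ((i, relative_overlap(r, d)) for i, r in enumerate(reference)
             if i not in matched_ref),
            key=lambda x: x[1], default=(None, 0.0),
        )
        if best[0] is not None and best[1] >= min_overlap:
            matched_ref.add(best[0])
            tp += 1
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(reference) if reference else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall > 0 else 0.0)

# Example: two snore annotations of the same recording.
ref = [(1.0, 2.5), (4.0, 5.0), (7.2, 8.0)]
det = [(1.1, 2.4), (4.1, 5.2), (9.0, 9.5)]
print(f"F1 = {event_f1(ref, det):.2f}")  # 2 of 3 events matched
```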


Subjects
Obstructive Sleep Apnea, Automation, Humans, Polysomnography/methods, Reproducibility of Results, Sleep, Obstructive Sleep Apnea/diagnosis, Snoring
3.
IEEE/ACM Trans Audio Speech Lang Process ; 25(12): 2281-2291, 2017 Dec.
Article in English | MEDLINE | ID: mdl-33748320

ABSTRACT

The goal of this study was to investigate the performance of different feature types for voice quality classification using multiple classifiers. The study compared the COVAREP feature set, which includes glottal source features, frequency-warped cepstral features, and harmonic model features, against mel-frequency cepstral coefficients (MFCCs) computed from the acoustic voice signal, the acoustic-based glottal inverse filtered (GIF) waveform, and the electroglottographic (EGG) waveform. Our hypothesis was that MFCCs can capture the perceived voice quality from any of these three voice signals. Experiments were carried out on recordings from 28 participants with normal vocal status who were prompted to sustain vowels with modal and non-modal voice qualities. Recordings were rated by an expert listener using the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V), and the ratings were transformed into a dichotomous label (presence or absence) for the prompted voice qualities of modal voice, breathiness, strain, and roughness. Classification was done using support vector machine, random forest, deep neural network, and Gaussian mixture model classifiers, all built to be speaker independent using a leave-one-speaker-out strategy. The best classification accuracy, 79.97%, was achieved with the full COVAREP set. The harmonic model features were the best-performing subset, with 78.47% accuracy, and the static+dynamic MFCCs scored 74.52%. A closer analysis showed that MFCC and dynamic MFCC features were able to classify the modal, breathy, and strained voice quality dimensions from the acoustic and GIF waveforms, while the EGG waveform yielded reduced classification performance.
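
As an illustration of the leave-one-speaker-out evaluation strategy, a minimal Python sketch with an SVM on placeholder features follows; the feature values, labels, and classifier settings are stand-ins, not the study's COVAREP pipeline or its tuned models:

```python
# Sketch of speaker-independent (leave-one-speaker-out) classification.
# Features and labels below are random placeholders, not real MFCC data.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# X: one feature vector per utterance (e.g. mean static + dynamic MFCCs),
# y: dichotomous voice-quality label (e.g. breathy vs. not breathy),
# speakers: speaker ID per utterance, used to hold out whole speakers.
rng = np.random.default_rng(0)
X = rng.normal(size=(280, 26))           # placeholder features
y = rng.integers(0, 2, size=280)         # placeholder labels
speakers = np.repeat(np.arange(28), 10)  # 28 speakers, 10 utterances each

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
accuracies = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=speakers):
    # Train on 27 speakers, test on the held-out speaker.
    clf.fit(X[train_idx], y[train_idx])
    accuracies.append(clf.score(X[test_idx], y[test_idx]))

print(f"Mean leave-one-speaker-out accuracy: {np.mean(accuracies):.3f}")
```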
