ABSTRACT
This study compares the fundamental frequency (fo) and fundamental frequency standard deviation (foSD) in the speech of COVID-19 patients with the same parameters in the speech of subjects without COVID-19, and tests whether age and sex have an effect within the patient group. Both groups consist of Brazilian Portuguese speakers. Speech samples were obtained from 100 patients with mild to severe symptoms of COVID-19 and from 100 healthy subjects. A single 31-syllable Portuguese sentence was used as the elicitation material for all subjects. The recordings were divided into four age groups. The acoustic measures were extracted semi-automatically and analyzed with a series of analyses of variance. Patients with COVID-19 differed from healthy subjects in fo-related parameters: patient voices showed higher fo and foSD than control voices. In addition, for patient voices, there was an effect of age and sex on foSD values. The voices of women and elderly subjects showed more marked differences in fo-related parameters, indicating that patient voices are higher-pitched and have a higher foSD. Consequently, fo-related parameters may be tested as vocal biomarkers for screening respiratory insufficiency by voice analysis in patients with severe symptoms of COVID-19.
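As a rough illustration of the two measures named in this abstract, the sketch below computes the mean fo and foSD from a per-frame fo track and a one-way ANOVA F statistic across groups. It is a minimal sketch under stated assumptions, not the study's pipeline: the function names `fo_summary` and `one_way_anova_f` are hypothetical, unvoiced frames are assumed to be marked as `0` or `None`, and a one-way design is assumed only because it is the simplest case of an analysis of variance.

```python
import statistics

def fo_summary(fo_track_hz):
    """Return (mean fo, foSD) in Hz, computed over voiced frames only.

    Assumption: unvoiced frames are encoded as 0 or None in the track.
    """
    voiced = [f for f in fo_track_hz if f is not None and f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    return statistics.mean(voiced), statistics.stdev(voiced)

def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA over a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of each value around its own group mean.
    ss_within = sum(
        (v - statistics.mean(g)) ** 2 for g in groups for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

In such a setup, `fo_summary` would be applied once per recording, and the resulting per-speaker values would be grouped (for example by patient versus control) before being passed to `one_way_anova_f`.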
Subject(s)
COVID-19, Voice, Humans, Female, Aged, Voice Quality, Brazil/epidemiology, Speech Acoustics
ABSTRACT
Pathology reports are a main source of information for cancer diagnosis and are commonly written following semi-structured templates that include tumour localisation and behaviour. In this work, we evaluated the performance of support vector machines (SVMs) in classifying pathology reports written in Portuguese according to the International Classification of Diseases for Oncology (ICD-O), a biaxial classification of cancer topography and morphology. A partnership program with the Brazilian hospital A.C. Camargo Cancer Center provided anonymised pathology reports and structured data from 94,980 patients, which were used for training and validation. We employed SVMs with a tf-idf weighting scheme in a bag-of-words approach and report F1 scores of 0.82 for 18 topography sites and 0.73 for 49 morphology classes. With the largest dataset ever used in such a task, our work provides reliable estimates for the classification of pathology reports in Portuguese and agrees with a few similar studies published on the same kind of data in other languages.
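The tf-idf bag-of-words representation mentioned in this abstract can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name `tfidf_vectors` is hypothetical, tokenisation is assumed to be a plain whitespace split, and the weighting uses the textbook form tf × log(N / df), whereas libraries often apply smoothed variants. An SVM would then be trained on vectors of this kind; that step is not shown.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Map each document to a {term: tf-idf weight} dictionary.

    Assumptions: whitespace tokenisation, lower-casing, and the
    unsmoothed weighting tf * log(N / df).
    """
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors
```

Under this weighting, a term that occurs in every report receives a weight of zero, so only terms that discriminate between reports contribute to the feature vector.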
Subject(s)
International Classification of Diseases/organization & administration, Neoplasms/pathology, Support Vector Machine, Brazil, Humans, Neoplasms/diagnosis, Registries
ABSTRACT
This work develops an automated classifier of pathology reports that infers the topography and morphology classes of a tumor using codes from the International Classification of Diseases for Oncology (ICD-O). Data from 94,980 patients of the A.C. Camargo Cancer Center were used for training and validation of Naive Bayes classifiers, evaluated by the F1-score. F1-scores greater than 74% in the topographic group and 61% in the morphologic group are reported. Our work provides a successful baseline for future research on the classification of medical documents written in Portuguese and in other domains.
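A Naive Bayes text classifier evaluated by F1-score, as described in this abstract, might look like the sketch below. It is a minimal sketch under stated assumptions rather than the authors' code: the multinomial variant with add-one (Laplace) smoothing and whitespace tokenisation are assumed, the names `NaiveBayesText` and `f1_score` are hypothetical, and the F1 shown is the per-class score, since the abstract does not say how scores were averaged across classes.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Multinomial Naive Bayes with add-one smoothing (assumed variant)."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for doc, label in zip(documents, labels):
            self.term_counts[label].update(doc.lower().split())
        self.vocabulary = {t for c in self.term_counts.values() for t in c}
        return self

    def predict(self, document):
        # Terms never seen in training are ignored.
        tokens = [t for t in document.lower().split() if t in self.vocabulary]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def f1_score(true_labels, predicted_labels, positive):
    """Per-class F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A classifier of this kind would be fitted once per axis, one model for topography codes and one for morphology codes, with `f1_score` computed for each class on held-out reports.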