Search | BVS - MINISTÉRIO DA SAÚDE

A review of social background profiling of speakers from speech accents.

Humayun, Mohammad Ali; Shuja, Junaid; Abas, Pg Emeroylariffion.

PeerJ Comput Sci ; 10: e1984, 2024.

Article in English | MEDLINE | ID: mdl-38660189

ABSTRACT

Social background profiling of speakers is widely used in areas such as speech forensics and the tuning of speech recognition systems for improved accuracy. This article surveys recent research on speaker background profiling through accent classification and analyses the datasets, speech features, and classification models used for the task. The aim is to provide a comprehensive overview of this research and a comparative analysis of the reported performance measures. The datasets, speech features, and classification models used in recent accent classification work are described in detail, and the performance measures of the different methods are compared. This analysis provides insights into the strengths and weaknesses of the different methods for accent classification. Finally, research gaps are identified, which serve as a useful resource for researchers looking to advance the field.

An Ensembled Framework for Human Breast Cancer Survivability Prediction Using Deep Learning.

Mustafa, Ehzaz; Jadoon, Ehtisham Khan; Khaliq-Uz-Zaman, Sardar; Humayun, Mohammad Ali; Maray, Mohammed.

Diagnostics (Basel) ; 13(10)2023 May 10.

Article in English | MEDLINE | ID: mdl-37238173

ABSTRACT

Breast cancer is categorized as an aggressive disease, and it is one of the leading causes of death. Accurate and timely survival predictions for both long-term and short-term survivors can help physicians make effective treatment decisions for their patients. There is therefore a pressing need for an efficient and rapid computational model for breast cancer prognosis. In this study, we propose an ensemble model for breast cancer survivability prediction (EBCSP) that utilizes multi-modal data and stacks the outputs of multiple neural networks. Specifically, we design a convolutional neural network (CNN) for clinical modalities, a deep neural network (DNN) for copy number variations (CNV), and a long short-term memory (LSTM) architecture for gene expression modalities to effectively handle multi-dimensional data. The outputs of the independent models are then used for binary classification of survivability (long term > 5 years and short term < 5 years) using the random forest method. The EBCSP model outperforms both models that use a single data modality for prediction and existing benchmarks.
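The stacking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the per-modality scores are made-up numbers standing in for the outputs of the CNN, DNN, and LSTM, and a simple mean-score rule is swapped in for the random forest meta-classifier so that the example stays dependency-free. The abstract does not say how a survival time of exactly 5 years is labelled; the sketch assumes it falls in the short-term class.

```python
from statistics import mean


def survival_label(years, threshold=5.0):
    """Return 1 for long-term (> threshold years), 0 for short-term.

    Assumption: exactly `threshold` years counts as short-term,
    since the abstract only defines > 5 and < 5 years.
    """
    return 1 if years > threshold else 0


def stack_features(clinical_scores, cnv_scores, expression_scores):
    """Build one meta-feature row per patient from the three base-model outputs."""
    if not (len(clinical_scores) == len(cnv_scores) == len(expression_scores)):
        raise ValueError("all modalities must score the same patients")
    return [list(row) for row in zip(clinical_scores, cnv_scores, expression_scores)]


def mean_score_meta_classifier(stacked_rows, cutoff=0.5):
    """Stand-in meta-classifier (NOT a random forest): threshold the mean score."""
    return [1 if mean(row) > cutoff else 0 for row in stacked_rows]


# Hypothetical long-term-survival probabilities from each base model.
clinical = [0.9, 0.2, 0.6]
cnv = [0.8, 0.4, 0.3]
expression = [0.7, 0.1, 0.4]

stacked = stack_features(clinical, cnv, expression)
predictions = mean_score_meta_classifier(stacked)
```

In a fuller version, the stacked rows would be the training input of a random forest fitted against labels produced by `survival_label`.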

A transformer fine-tuning strategy for text dialect identification.

Humayun, Mohammad Ali; Yassin, Hayati; Shuja, Junaid; Alourani, Abdullah; Abas, Pg Emeroylariffion.

Neural Comput Appl ; 35(8): 6115-6124, 2023.

Article in English | MEDLINE | ID: mdl-36408287

ABSTRACT

Online medical consultation can significantly improve the efficiency of primary health care. Recently, many online medical question-answer services have been developed that connect patients with relevant medical consultants based on their questions. Given the linguistic variety in these questions, identifying the social background of patients can improve the referral system by selecting a medical consultant with a similar social origin for efficient communication. This paper proposes a novel fine-tuning strategy for pre-trained transformers to identify the social origin of text authors. When fused with the existing adapter model, the proposed methods achieve an overall accuracy of 53.96% for the Arabic dialect identification task on the Nuanced Arabic Dialect Identification (NADI) dataset. The overall accuracy is 0.54% higher than the previous best for the same dataset, which establishes the utility of custom fine-tuning strategies for pre-trained transformer models.

Spatial position constraint for unsupervised learning of speech representations.

Humayun, Mohammad Ali; Yassin, Hayati; Abas, Pg Emeroylariffion.

PeerJ Comput Sci ; 7: e650, 2021.

Article in English | MEDLINE | ID: mdl-34395866

ABSTRACT

The success of supervised learning techniques for automatic speech processing does not always extend to problems with limited annotated speech. Unsupervised representation learning aims to use unlabelled data to learn a transformation that makes speech easily distinguishable for classification tasks, and deep auto-encoder variants have been the most successful at finding such representations. This paper proposes a novel mechanism to incorporate the geometric position of speech samples within the global structure of an unlabelled feature set. Regression to the geometric position is added as an additional constraint for the representation learning auto-encoder. The representation learnt by the proposed model has been evaluated on a supervised classification task for limited vocabulary keyword spotting, with the proposed representation outperforming the commonly used cepstral features by about 9% in terms of classification accuracy, despite using a limited number of labels during supervision. Furthermore, a small keyword dataset has been collected for Kadazan, an indigenous, low-resourced Southeast Asian language. Analysis of the Kadazan dataset also confirms the superiority of the proposed representation for limited annotation. The results are significant as they confirm that the proposed method can learn unsupervised speech representations effectively for classification tasks with scarce labelled data.
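The idea of adding position regression as an extra constraint on an auto-encoder can be sketched as a combined loss. This is only an illustration under stated assumptions: the abstract does not define how the geometric position is computed, so the offset from the feature-set centroid is used here as a hypothetical stand-in, and the function names and the `weight` parameter are likewise invented for the example.

```python
def centroid(samples):
    """Mean feature vector of an unlabelled feature set (list of equal-length lists)."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def relative_position(sample, center):
    """Hypothetical position target: offset of a sample from the set's centroid."""
    return [value - c for value, c in zip(sample, center)]


def mse(target, prediction):
    """Mean squared error between two equal-length vectors."""
    return sum((t - p) ** 2 for t, p in zip(target, prediction)) / len(target)


def combined_loss(x, x_reconstructed, position, position_predicted, weight=0.5):
    """Reconstruction loss plus a weighted position-regression penalty.

    Assumption: the two terms are simply summed with a scalar weight;
    the paper may combine them differently.
    """
    return mse(x, x_reconstructed) + weight * mse(position, position_predicted)


# Toy feature set of two 2-dimensional samples.
features = [[0.0, 0.0], [2.0, 4.0]]
center = centroid(features)
target_position = relative_position(features[1], center)

loss = combined_loss(
    x=[1.0, 2.0],
    x_reconstructed=[1.0, 4.0],
    position=[0.0],
    position_predicted=[1.0],
)
```

During training, an auto-encoder would minimise a loss of this shape over the unlabelled set, so that the learnt representation must both reconstruct the input and predict where the sample sits in the overall feature structure.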
