Results 1-6 of 6
1.
Eur Urol Focus; 9(1): 209-215, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35835694

ABSTRACT

BACKGROUND: Uroflowmetry remains an important tool for the assessment of patients with lower urinary tract symptoms (LUTS), but its accuracy can be limited by within-subject variation of urinary flow rates. Voiding acoustics appear to correlate well with conventional uroflowmetry and show promise as a convenient home-based alternative for the monitoring of urinary flows. OBJECTIVE: To evaluate the ability of a sound-based deep learning algorithm (Audioflow) to predict uroflowmetry parameters and identify abnormal urinary flow patterns. DESIGN, SETTING, AND PARTICIPANTS: In this prospective open-label study, 534 male participants recruited at Singapore General Hospital between December 1, 2017 and July 1, 2019 voided into a uroflowmetry machine while voiding acoustics were recorded on a smartphone in close proximity. The Audioflow algorithm consisted of two models: the first, which predicts flow parameters including maximum flow rate (Qmax), average flow rate (Qave), and voided volume (VV), was trained and validated using leave-one-out cross-validation; the second, which discriminates normal from abnormal urinary flows, was trained against a reference standard created by three senior urologists. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS: Lin's concordance correlation coefficient was used to evaluate the agreement between Audioflow predictions and conventional uroflowmetry for Qmax, Qave, and VV. Accuracy of the Audioflow algorithm in identifying abnormal urinary flows was assessed with sensitivity analyses and the area under the receiver operating characteristic curve (AUC); the algorithm was also compared with an external panel of six urology residents/general practitioners who separately graded flow patterns in the validation dataset. RESULTS AND LIMITATIONS: A total of 331 patients were included for analysis. Agreement between Audioflow and conventional uroflowmetry for Qmax, Qave, and VV was 0.77 (95% confidence interval [CI], 0.72-0.80), 0.85 (95% CI, 0.82-0.88), and 0.84 (95% CI, 0.80-0.87), respectively. For the identification of abnormal flows, Audioflow agreed with the reference standard in 83.8% of cases (95% CI, 77.5-90.1%) and was comparable with the external panel. AUC was 0.892 (95% CI, 0.834-0.951), with sensitivity of 87.3% (95% CI, 76.8-93.7%) and specificity of 77.5% (95% CI, 61.1-88.6%). CONCLUSIONS: The results of this study suggest that a deep learning algorithm can predict uroflowmetry parameters and identify abnormal urinary voids from voiding sounds alone, and that it shows promise as a simple home-based alternative to uroflowmetry in the management of patients with LUTS. PATIENT SUMMARY: In this study, we trained a deep learning-based algorithm to measure urinary flow rates and identify abnormal flow patterns based on voiding sounds. This may provide a convenient, home-based alternative to conventional uroflowmetry for the assessment and monitoring of patients with lower urinary tract symptoms.
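To make the agreement statistic concrete, the following is a minimal Python sketch of Lin's concordance correlation coefficient as it could be applied to predicted versus measured Qmax values; the paired numbers are invented for illustration, not study data:

```python
import numpy as np

def lins_ccc(x, y):
    """Lin's concordance correlation coefficient: agreement between
    two paired measurement series (1.0 = perfect concordance)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()   # population covariance
    return 2 * cov / (x.var() + y.var() + (mx - my) ** 2)

# Hypothetical paired maximum flow rates in mL/s:
qmax_audioflow = [18.2, 9.5, 25.1, 14.7, 11.3]   # smartphone prediction
qmax_uroflow   = [17.5, 10.2, 23.8, 15.9, 12.0]  # uroflowmetry machine
print(f"CCC = {lins_ccc(qmax_audioflow, qmax_uroflow):.3f}")
```

Unlike Pearson's correlation, the denominator term (mx - my)^2 penalises systematic bias between the two methods, which is why it suits method-agreement studies like this one.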


Subjects
Deep Learning, Lower Urinary Tract Symptoms, Humans, Male, Lower Urinary Tract Symptoms/diagnosis, Prospective Studies, Rheology/methods, Urodynamics
2.
Sensors (Basel); 21(24), 2021 Dec 14.
Article in English | MEDLINE | ID: mdl-34960450

ABSTRACT

In this paper, we tackle the problem of predicting the affective responses of movie viewers based on the content of the movies. Current studies on this topic focus on video representation learning and on fusion techniques that combine the extracted features for predicting affect. Yet they typically ignore both the correlation between multiple modality inputs and the correlation between temporal inputs (i.e., sequential features). To explore these correlations, we propose a neural network architecture, AttendAffectNet (AAN), that uses the self-attention mechanism to predict the emotions of movie viewers from different input modalities. In particular, visual, audio, and text features are considered for predicting emotions, expressed in terms of valence and arousal. We analyze three variants of our proposed AAN: Feature AAN, Temporal AAN, and Mixed AAN. The Feature AAN applies the self-attention mechanism to the features extracted from the different modalities (video, audio, and movie subtitles) of a whole movie, thereby capturing the relationships between them. The Temporal AAN takes the time domain of the movies and the sequential dependency of affective responses into account: self-attention is applied to the concatenated (multimodal) feature vectors representing subsequent movie segments. The Mixed AAN combines the strong points of the Feature AAN and the Temporal AAN by applying self-attention first to the vectors of features obtained from the different modalities in each movie segment, and then to the feature representations of all subsequent (temporal) movie segments. We extensively trained and validated our proposed AAN on both the MediaEval 2016 dataset for the Emotional Impact of Movies Task and the extended COGNIMUSE dataset. Our experiments demonstrate that audio features play a more influential role than video and subtitle features when predicting the emotions of movie viewers on these datasets. Models that use visual, audio, and text features simultaneously performed better than those using features from each modality separately. In addition, the Feature AAN outperformed the other AAN variants on the above datasets, highlighting the importance of treating different features as context for one another when fusing them. The Feature AAN also outperformed the baseline models when predicting the valence dimension.
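As a rough illustration of the Feature AAN idea, here is a minimal PyTorch sketch that treats each modality's feature vector as one token and applies self-attention across modalities before regressing valence and arousal. All dimensions, names, and the mean-pooling choice are assumptions for illustration, not the paper's implementation:

```python
import torch
import torch.nn as nn

class FeatureSelfAttention(nn.Module):
    """Sketch of feature-level multimodal fusion: project each
    modality to a shared dimension, let self-attention use the
    modalities as context for one another, then regress affect."""
    def __init__(self, dims=(2048, 128, 300), d_model=256, heads=4):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(d, d_model) for d in dims)
        self.attn = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.head = nn.Linear(d_model, 2)  # (valence, arousal)

    def forward(self, feats):  # feats: list of (batch, dim) tensors
        tokens = torch.stack([p(f) for p, f in zip(self.proj, feats)], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)  # (batch, 3, d_model)
        return self.head(fused.mean(dim=1))           # pool over modalities

video = torch.randn(8, 2048)  # e.g., CNN-derived video features
audio = torch.randn(8, 128)   # e.g., VGGish-style audio features
text  = torch.randn(8, 300)   # e.g., subtitle word embeddings
print(FeatureSelfAttention()([video, audio, text]).shape)  # (8, 2)
```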


Subjects
Emotions, Motion Pictures, Arousal, Neural Networks (Computer)
3.
Sensors (Basel); 21(16), 2021 Aug 18.
Article in English | MEDLINE | ID: mdl-34450996

ABSTRACT

Intelligent systems are transforming the world, including our healthcare system. We propose a deep learning-based cough sound classification model that can distinguish children with healthy coughs from those with pathological coughs caused by asthma, upper respiratory tract infection (URTI), or lower respiratory tract infection (LRTI). To train the deep neural network, we collected a new dataset of cough sounds labelled with a clinician's diagnosis. The chosen model is a bidirectional long short-term memory (BiLSTM) network operating on Mel-frequency cepstral coefficient (MFCC) features. When trained to classify two classes of coughs, healthy or pathological (in general or belonging to a specific respiratory pathology), the model reaches an accuracy exceeding 84% against the label provided by the physician's diagnosis. To classify a subject's respiratory condition, results from multiple cough epochs per subject were combined; the resulting prediction accuracy exceeds 91% for all three respiratory pathologies. When the model is instead trained to discriminate among four classes of coughs, overall accuracy drops because one class of pathological coughs is often misclassified as another. However, if a healthy cough classified as healthy and a pathological cough classified as any pathology are both counted as correct, the overall accuracy of the four-class model is above 84%. A longitudinal study of the MFCC feature space, comparing pathological and recovered coughs collected from the same subjects, revealed that pathological coughs occupy the same region of feature space irrespective of the underlying condition, making them harder to differentiate using MFCC features alone.
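A minimal sketch of the MFCC + BiLSTM pipeline described above, using librosa and PyTorch; the layer sizes, the 13-coefficient choice, and the file name are illustrative assumptions, not the authors' configuration:

```python
import librosa
import torch
import torch.nn as nn

class CoughBiLSTM(nn.Module):
    """Sketch of a BiLSTM cough classifier over MFCC frames;
    the final time step's hidden state feeds a linear classifier."""
    def __init__(self, n_mfcc=13, hidden=64, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_mfcc, hidden, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):              # x: (batch, frames, n_mfcc)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1, :])  # classify from last frame

# Hypothetical recording; replace with a real cough audio file.
signal, sr = librosa.load("cough.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)   # (13, frames)
x = torch.tensor(mfcc.T, dtype=torch.float32).unsqueeze(0)
logits = CoughBiLSTM()(x)  # (1, 2): healthy vs. pathological
```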


Subjects
Asthma, Cough, Asthma/diagnosis, Child, Cough/diagnosis, Humans, Longitudinal Studies, Neural Networks (Computer), Respiratory Sounds/diagnosis, Sound
4.
Acoust Aust; 49(3): 505-512, 2021.
Article in English | MEDLINE | ID: mdl-34099950

ABSTRACT

The widespread adoption of face masks is now a standard public health response to the 2020 pandemic. Although studies have shown that wearing a face mask interferes with speech and intelligibility, how the acoustic response of a mask relates to design parameters such as fabric choice, number of layers, and mask geometry is not well understood. Using a dummy head fitted with a loudspeaker at its mouth generating a broadband signal, we report the acoustic response of 10 masks of different materials and designs, and the effect of the number of material layers; a small number of masks were found to be almost acoustically transparent (minimal losses). While different mask materials and designs produce different frequency responses, we find that material selection has somewhat greater influence on transmission characteristics than mask design or geometry. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s40857-021-00245-2.
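One simple way to quantify such a response, sketched below in Python: estimate the power spectral density of the received broadband signal with and without the mask, and take their ratio as a per-frequency insertion loss. The synthetic signals and parameters are stand-ins, not the paper's measurement rig:

```python
import numpy as np
from scipy.signal import welch

def insertion_loss(ref, masked, fs):
    """Per-frequency insertion loss (dB) from two recordings at the
    same microphone position: bare source vs. mask fitted."""
    f, p_ref = welch(ref, fs=fs, nperseg=4096)
    _, p_mask = welch(masked, fs=fs, nperseg=4096)
    return f, 10 * np.log10(p_ref / p_mask)  # positive = attenuation

fs = 48000
ref = np.random.randn(2 * fs)                     # stand-in broadband signal
masked = 0.5 * ref + 0.01 * np.random.randn(2 * fs)  # toy attenuated copy
f, il = insertion_loss(ref, masked, fs)
print(f"mean insertion loss {il.mean():.1f} dB")  # ~6 dB for this toy case
```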

5.
Sci Rep; 9(1): 11242, 2019 Aug 2.
Article in English | MEDLINE | ID: mdl-31375742

ABSTRACT

The lateral-line system that has evolved in many aquatic animals enables them to navigate murky fluid environments and to locate and discriminate obstacles. Here, we present a data-driven model that uses artificial neural networks to process flow data originating from a stationary sensor array located away from an obstacle placed in a potential flow. The ability of neural networks to estimate complex underlying relationships between parameters, in the absence of any explicit mathematical description, is first assessed with two basic potential-flow problems: single source/sink identification and doublet detection. Subsequently, we address the inverse problem of identifying an obstacle's shape from distant measurements of the pressure or velocity field. Using the analytical solution to the forward problem, very large training datasets are generated, allowing us to obtain the synaptic weights by gradient-descent-based optimization. The resulting neural network is remarkably effective at predicting unknown obstacle shapes, especially at relatively large distances, where classical linear regression models are completely ineffectual. These results have far-reaching implications for the design and development of artificial passive hydrodynamic sensing technology.
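To illustrate the forward/inverse setup on the simplest case mentioned above (single source identification), the following Python sketch generates synthetic sensor readings from a 2-D potential-flow point source and trains a small network to invert them. The array geometry, network size, and the scikit-learn regressor are assumptions for illustration, not the authors' architecture:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Stationary 8-sensor array along the x-axis (the "lateral line").
sensors = np.stack([np.linspace(-1, 1, 8), np.zeros(8)], axis=1)

def source_velocity(src, m=1.0):
    """2-D point source: u = m/(2*pi) * r / |r|^2, sampled at the
    array; returns the flattened (u, v) readings of all sensors."""
    r = sensors - src                        # (8, 2)
    r2 = (r ** 2).sum(axis=1, keepdims=True)
    return (m / (2 * np.pi) * r / r2).ravel()

# Forward problem solved analytically -> large synthetic training set.
src = rng.uniform([-1.0, 0.5], [1.0, 2.0], size=(20000, 2))
X = np.array([source_velocity(s) for s in src])

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300).fit(X, src)
test = np.array([0.3, 1.2])
print("true", test, "predicted", net.predict(source_velocity(test)[None]))
```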

6.
Sci Justice; 55(5): 363-374, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26385720

ABSTRACT

Previous studies have shown that landline and mobile phone networks differ in how they handle the speech signal, and therefore in their impact on it. The same is also true of the different networks within the mobile phone arena. Two major mobile phone technologies are in use today, the Global System for Mobile Communications (GSM) and code division multiple access (CDMA), and they are fundamentally different in design. For example, the quality of the coded speech in the GSM network is a function of channel quality, whereas in the CDMA network it is determined by channel capacity (i.e., the number of users sharing a cell site). This paper examines the impact on the speech signal of a key feature of these networks, dynamic rate coding, and its subsequent effect on the task of likelihood-ratio-based forensic voice comparison (FVC). Surprisingly, both FVC accuracy and precision are found to be better for GSM- and CDMA-coded speech than for uncoded speech. Intuitively, one expects FVC accuracy to increase with increasing coded-speech quality. This trend holds for the CDMA network but, surprisingly, not for the GSM network. Further, with respect to comparisons between the two networks, FVC accuracy for CDMA-coded speech is slightly better than for GSM-coded speech, particularly when the coded-speech quality is high, while in terms of FVC precision the two networks are very similar.
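For readers unfamiliar with likelihood-ratio-based FVC, a minimal Python sketch of a score-based likelihood ratio with Gaussian score models follows; the scores are invented calibration data, and real FVC systems use more sophisticated modelling and calibration:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical calibration scores from known same-speaker and
# different-speaker comparisons; values are illustrative only.
same_scores = np.array([2.1, 1.8, 2.4, 2.0, 2.3])
diff_scores = np.array([0.4, 0.9, 0.6, 0.2, 0.7])

def likelihood_ratio(score):
    """Score-based LR: how much more probable the observed comparison
    score is under the same-speaker hypothesis than under the
    different-speaker hypothesis, with Gaussian score models."""
    p_same = norm.pdf(score, same_scores.mean(), same_scores.std(ddof=1))
    p_diff = norm.pdf(score, diff_scores.mean(), diff_scores.std(ddof=1))
    return p_same / p_diff

print(f"LR = {likelihood_ratio(1.9):.2f}")  # LR > 1 favours same speaker
```

In this framework, "accuracy" reflects how often the LR points toward the true hypothesis, while "precision" reflects the spread of LRs across repeated comparisons, which is why the two can respond differently to speech coding.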


Subjects
Biometric Identification, Cell Phone, Computer Communication Networks, Speech Acoustics, Voice, Forensic Sciences, Humans, Wireless Technology