Results 1 - 20 of 31
1.
Sensors (Basel); 22(14), 2022 Jul 14.
Article in English | MEDLINE | ID: mdl-35890943

ABSTRACT

Reinforcement learning (RL), which balances exploration and exploitation, has been applied to games to demonstrate that it can surpass human performance. This paper applies the Deep Q-Network (DQN), which combines reinforcement learning with deep learning, to real-time action response in the NS-SHAFT game, using Cheat Engine as the API for reading game information. On a personal computer, we build an experimental learning environment that automatically captures NS-SHAFT frames and feeds them to the DQN, which decides whether to move left, move right, or stay in place. We survey different parameters, such as the sampling frequency, the reward function, and the batch size. The experiments show that these parameter settings influence how well the DQN learns. Moreover, we use Cheat Engine to locate the relevant values in the NS-SHAFT game and then read them to operate the overall experimental platform and calculate the reward. Accordingly, we successfully establish a real-time learning environment and real-time game training for NS-SHAFT.


Subjects
Neural Networks, Computer, Reinforcement, Psychology, Humans, Reward
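As a rough illustration of the DQN decision step described in the abstract above, the sketch below shows epsilon-greedy selection over the three actions and the standard one-step Q-learning target. The action labels, the discount factor `gamma`, and the toy Q-values are assumptions made for illustration; the paper's actual network, reward function, and hyperparameters are not given in this record.

```python
import random

# The three actions named in the abstract; the string labels are illustrative.
ACTIONS = ["left", "right", "stay"]


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    q = [0.2, 0.7, 0.1]  # hypothetical Q-values for one captured frame
    print(ACTIONS[epsilon_greedy(q, epsilon=0.0)])
    print(td_target(reward=1.0, next_q_values=q, done=False, gamma=0.9))
```

In a full DQN the Q-values would come from a neural network evaluated on the captured frame, and the target would drive a regression loss over sampled batches.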
2.
Sensors (Basel); 22(22), 2022 Nov 14.
Article in English | MEDLINE | ID: mdl-36433387

ABSTRACT

Current surveillance systems frequently use fixed-angle cameras and record the feed from those cameras. Such systems have several disadvantages, including low resolution for faraway objects, a limited frame range and wasted disk space. This paper presents a novel algorithm for automatically detecting, tracking and zooming in on active targets. The object tracking system is connected to a camera that has a 360° horizontal and 90° vertical movement range. The combination of tracking, movement identification and zoom means that the system effectively improves the resolution of small or distant objects. The object detection system conserves disk space because the system stops recording when no valid targets are detected. An adaptive object segmentation algorithm makes it possible to detect the shape of moving objects efficiently. When processing multiple targets, each target is assigned a color and is treated separately. The tracking algorithm adapts to targets moving at different speeds and controls the camera according to a predictive formula to prevent the loss of image quality due to camera trail. In the test environment, the zoom can reliably lock onto the head of a moving human; however, simultaneous tracking and zooming occasionally causes tracking to fail. If this system is deployed with a facial recognition algorithm, the recognition accuracy can be effectively improved.


Subjects
Algorithms, Movement, Humans
3.
Sensors (Basel); 21(21), 2021 Nov 07.
Article in English | MEDLINE | ID: mdl-34770704

ABSTRACT

The aim of this paper is to detect vehicles and count the number in each class from the input. We propose a Fuzzy Guided Scale Choice (FGSC)-based SSD deep neural network architecture for vehicle detection and class counting with parameter optimization. The FGSC blocks are integrated into the convolutional layers of the model, where they emphasize essential features while ignoring less important ones. We created passing detection lines and class counting windows and connected them to the proposed FGSC-SSD deep neural network model. At the training stage, the FGSC blocks use the scale choice method to identify unnecessary features and eliminate them, which significantly speeds up the model. In addition, the FGSC blocks avoid many unusable parameters in the saturation interval and improve performance efficiency, and the Fuzzy Sigmoid Function (FSF) widens the activation interval through fuzzy logic. During operation, the FGSC-SSD model reduces the computational complexity of the convolutional layers and their parameters. Tested on an edge artificial intelligence (AI) platform, the model reached a real-time processing speed of 38.4 frames per second (FPS) and an accuracy of more than 94%. This work may therefore be considered an improvement to traffic monitoring using edge AI applications.


Subjects
Artificial Intelligence, Silver Sulfadiazine, Algorithms, Fuzzy Logic, Neural Networks, Computer
4.
Sensors (Basel); 20(24), 2020 Dec 17.
Article in English | MEDLINE | ID: mdl-33348665

ABSTRACT

In recent years, chip design technology and artificial intelligence (AI) have made significant progress. This pushes every field to investigate how machine learning technology can increase the competitiveness of its products. In this work, we use deep learning coupled with motor control to build a real-time interactive air hockey system and to verify the feasibility of machine learning in such a system. In particular, we use the convolutional neural network YOLO ("you only look once") to capture the puck's current position. At the same time, the law of reflection and a neural network are applied to predict the end position of the puck. Based on the predicted location, the system controls the stepping motor to move the linear slide, realizing the real-time interactive air hockey system. Finally, we discuss and verify the accuracy of the predicted puck end position and improve the system response time to meet the system requirements.
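The law-of-reflection prediction mentioned in the abstract above can be sketched with the usual "unfolding" trick: extend the straight-line path past the side walls, then mirror the result back into the table. This is a minimal sketch that assumes a point-sized puck, perfectly elastic side walls at `x = 0` and `x = width`, and constant velocity; the function name and coordinate convention are hypothetical, and the neural-network part of the paper's predictor is not represented.

```python
def predict_end_x(x, y, vx, vy, width, y_end):
    """Predict the x position where the puck crosses the line y = y_end.

    Side walls at x = 0 and x = width reflect the puck specularly
    (angle of incidence equals angle of reflection).
    """
    if vy == 0 or (y_end - y) / vy < 0:
        raise ValueError("puck is not moving toward the target line")
    t = (y_end - y) / vy               # time until the puck reaches y_end
    unfolded = x + vx * t              # x position if there were no walls
    m = unfolded % (2 * width)         # one mirror period spans 2 * width
    return 2 * width - m if m > width else m


if __name__ == "__main__":
    # Puck starts mid-table and moves diagonally toward the far end.
    print(predict_end_x(x=0.5, y=0.0, vx=1.0, vy=1.0, width=1.0, y_end=1.0))
```

The predicted coordinate could then be turned into a target position for the linear slide; that mapping depends on hardware details not given here.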

5.
Sensors (Basel); 20(2), 2020 Jan 12.
Article in English | MEDLINE | ID: mdl-31940932

ABSTRACT

In recent years, several cost-effective intelligent sensing systems, such as ultrasound imaging systems, have become available for visualizing the internal structures of the body. Such systems have been deployed by medical doctors around the globe for efficient detection of diseases and disorders in the human body. Although the ultrasound sensing system is a useful tool for imaging various body parts, the images can be inconsistent because of variation in the system parameter settings. To overcome this issue, this research devises an SVM-enabled intelligent genetic algorithmic model for choosing universal features across four distinct parameter settings. The distinguishing characteristics of these features are then assessed using the Sørensen-Dice coefficient, the t-test, and Pearson's R. The results show that this approach aids the effective selection of universal features for breast cyst images. It also achieves superior accuracy in classifying ultrasound images across the four parameter settings.


Subjects
Breast Cyst/diagnostic imaging, Image Processing, Computer-Assisted, Support Vector Machine, Ultrasonography, Female, Fourier Analysis, Humans, Wavelet Analysis
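For readers unfamiliar with the Sørensen-Dice coefficient named in the abstract above, here is a minimal sketch of its set form, 2|A ∩ B| / (|A| + |B|). How the paper applies it to its selected features is not stated in this record, so the feature names below are purely hypothetical.

```python
def dice_coefficient(a, b):
    """Sørensen-Dice coefficient of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets are identical
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Hypothetical feature subsets chosen under two parameter settings.
    setting_1 = {"contrast", "entropy", "energy"}
    setting_2 = {"entropy", "energy", "homogeneity"}
    print(dice_coefficient(setting_1, setting_2))
```

A value of 1 means the two sets coincide and 0 means they share nothing.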
6.
Sensors (Basel); 19(22), 2019 Nov 06.
Article in English | MEDLINE | ID: mdl-31698678

ABSTRACT

Present methods of diagnosing depression depend entirely on self-report ratings or clinical interviews. These traditional methods are subjective, since the individual may or may not answer the questions genuinely. In this paper, data were collected using self-report ratings and also using electronic smartwatches. This study aims to develop a weighted average ensemble machine learning model to predict major depressive disorder (MDD) with superior accuracy. The data were pre-processed, and the essential features were selected using a correlation-based feature selection method. With the selected features, machine learning approaches such as Logistic Regression, Random Forest, and the proposed Weighted Average Ensemble Model are applied. To assess the performance of the proposed model, the area under the receiver operating characteristic (ROC) curve is used. The results demonstrate that the proposed Weighted Average Ensemble model achieves better accuracy than the Logistic Regression and Random Forest approaches.


Subjects
Depressive Disorder, Major/diagnosis, Adult, Female, Humans, Logistic Models, Machine Learning, Male, Support Vector Machine
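A weighted average ensemble of the kind named in the abstract above can be sketched as a weighted mean of each base model's predicted probabilities. The weights and probabilities below are made up for illustration; the abstract does not say how the paper chooses its weights.

```python
def weighted_average_ensemble(probabilities, weights):
    """Combine per-model probability lists into one weighted-mean list.

    probabilities: one list per model, each with one probability per sample.
    weights: one non-negative weight per model (normalized internally).
    """
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(probabilities[0])
    return [
        sum(w * p[i] for w, p in zip(weights, probabilities)) / total
        for i in range(n_samples)
    ]


if __name__ == "__main__":
    logistic = [0.2, 0.8]  # hypothetical P(MDD) from a logistic regression
    forest = [0.6, 0.4]    # hypothetical P(MDD) from a random forest
    print(weighted_average_ensemble([logistic, forest], weights=[1, 3]))
```

One common design choice is to set each weight in proportion to the model's validation score, so that stronger base models contribute more to the final probability.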
7.
Sensors (Basel); 18(11), 2018 Nov 13.
Article in English | MEDLINE | ID: mdl-30428610

ABSTRACT

Cleaning robots have the highest penetration rate among service robots. This paper proposes a high-efficiency mechanism by which an intelligent cleaning robot automatically returns to its charging station in a short time when its power is insufficient. The proposed mechanism first combines the robot's own motor encoder with neural network linear regression to calculate the moving distance and rotation angle used to estimate the robot's location. At the same time, a self-rotating camera scans the number of infrared spots on the docking station to locate it, so that the cleaning robot returns to charge in two stages: the existing infrared range and the extended infrared range. In addition, six-axis acceleration sensing and ultrasound are both applied to correct the angle error caused by collisions. Experimental results show that the proposed recharging mechanism significantly improves the efficiency of recharging.

8.
Diagnostics (Basel); 14(9), 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38732366

ABSTRACT

We present a deep learning (DL) network-based approach for detecting and semantically segmenting two specific types of tuberculosis (TB) lesions in chest X-ray (CXR) images. In the proposed method, we use a basic U-Net model and its enhanced versions to detect, classify, and segment TB lesions in CXR images. The model architectures used in this study are U-Net, Attention U-Net, U-Net++, Attention U-Net++, and pyramid spatial pooling (PSP) Attention U-Net++, which are optimized and compared based on the test results of each model to find the best parameters. Finally, we use four ensemble approaches that combine the top five models to further improve lesion classification and segmentation results. In the training stage, we use data augmentation and preprocessing methods to increase the number and strength of lesion features in CXR images, respectively. Our dataset consists of 110 training, 14 validation, and 98 test images. The experimental results show that the proposed ensemble model achieves a maximum mean intersection-over-union (MIoU) of 0.70, a mean precision rate of 0.88, a mean recall rate of 0.75, a mean F1-score of 0.81, and an accuracy of 1.0, all of which are better than those obtained with a single-network model. The proposed method can be used by clinicians as a diagnostic tool to assist in the examination of TB lesions in CXR images.
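The metrics reported in the abstract above can be related with a short sketch. The F1-score is the harmonic mean of precision and recall, and plugging in the reported means (0.88 and 0.75) gives a value that rounds to the reported 0.81; note that a mean of per-image F1 scores need not equal the F1 of the mean precision and recall in general. The IoU helper treats masks as sets of pixel coordinates, which is an assumption made for brevity.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def iou(predicted, truth):
    """Intersection-over-union of two masks given as sets of pixel coordinates."""
    predicted, truth = set(predicted), set(truth)
    union = predicted | truth
    if not union:
        return 1.0  # convention: two empty masks match perfectly
    return len(predicted & truth) / len(union)


if __name__ == "__main__":
    print(round(f1_score(0.88, 0.75), 2))
    # Hypothetical 1-D "masks" with two shared pixels out of four in total.
    print(iou({(0, 1), (0, 2), (0, 3)}, {(0, 2), (0, 3), (0, 4)}))
```

A mean IoU would average the per-class or per-image `iou` values.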

9.
Biomimetics (Basel); 8(3), 2023 Jul 21.
Article in English | MEDLINE | ID: mdl-37504210

ABSTRACT

The objective of this paper is to present a novel design of intelligent neuro-supervised networks (INSNs) for studying the dynamics of a mathematical model of Parkinson's disease illness (PDI), governed by three differential classes that represent the rhythms of brain electrical activity measured at different locations in the cerebral cortex. The proposed INSNs are constructed by exploiting the strengths of multilayer neural networks trained by backpropagation with the Levenberg-Marquardt (LM) and Bayesian regularization (BR) optimization approaches. The reference data for the input grids and target samples of the INSNs were generated with a reliable numerical solver, the Adams method, for various scenarios of PDI models obtained by varying the sensor locations in order to measure the impact of the rhythms of brain electrical activity. The designed INSNs for both backpropagation procedures were applied to the created datasets, which were split arbitrarily into training, testing, and validation samples, by optimizing a fitness function based on the mean squared error (MSE). The outcomes of exhaustive simulations of the proposed INSNs with both the LM and BR methodologies were compared with the reference solutions of the PDI models by means of learning curves on MSE, adaptive control parameters of the algorithms, absolute error, error histogram plots, and regression index. The outcomes endorse the efficacy of both INSN solvers for the different PDI scenarios, but the accuracy of the BR-based method is relatively superior, albeit at the cost of slightly more computation.

10.
Diagnostics (Basel); 13(12), 2023 Jun 15.
Article in English | MEDLINE | ID: mdl-37370966

ABSTRACT

The ongoing fast-paced technology trend has brought ceaseless transformation. Cloud computing has long proven to be the paramount deliverer of services such as computing power, software, networking, storage, and databases on a pay-per-use basis. The cloud is a major enabler of the internet of things (IoT), furnishing the computation and storage that IoT applications require. With proliferating IoT devices triggering a continual data upsurge, cloud-IoT interaction encounters latency, bandwidth, and connectivity constraints. Inserting a decentralized, distributed fog computing layer between the cloud and IoT layers extends the cloud's processing, storage, and networking services close to end users. This hierarchical edge-fog-cloud model distributes computation and intelligence, yielding optimal solutions while tackling constraints such as massive data volume, latency, delay, and security vulnerability. The healthcare domain, which requires time-critical functionality, can benefit from the cloud-fog-IoT interplay. This paper proposes a fog-assisted smart healthcare system to diagnose heart or cardiovascular disease. It combines a fuzzy inference system (FIS) with the gated recurrent unit (GRU), a variant of the recurrent neural network model, for pre-processing and predictive analytics tasks. The proposed system shows substantially improved performance, with a classification accuracy of 99.125%. With most of the healthcare data analytics happening at the fog layer, the proposed work yields better latency, response time, and jitter than the cloud. Deep learning models are adept at handling sophisticated tasks, particularly predictive analytics. Time-critical healthcare applications benefit from deep learning's potential to furnish near-perfect results, coupled with the merits of the decentralized fog model, as revealed by the experimental results.

11.
Front Public Health; 11: 1091850, 2023.
Article in English | MEDLINE | ID: mdl-36817919

ABSTRACT

Brain tumor diagnosis has been a lengthy process, and automating steps such as brain tumor segmentation shortens the timeline. U-Nets are a commonly used solution for semantic segmentation; they use a downsampling-upsampling approach to segment tumors. U-Nets rely on residual connections to pass information during upsampling; however, an upsampling block only receives information from one downsampling block, which restricts its context and scope. In this paper, we propose SPP-U-Net, in which the residual connections are replaced with a combination of Spatial Pyramid Pooling (SPP) and Attention blocks. Here, SPP provides information from various downsampling blocks, which increases the scope of reconstruction, while attention provides the necessary context by incorporating local characteristics with their corresponding global dependencies. Existing literature uses heavy approaches such as nested and dense skip connections and transformers. These approaches increase the number of trainable parameters, and therefore the training time and complexity of the model. The proposed approach, on the other hand, attains results comparable to existing literature without changing the number of trainable parameters over larger dimensions such as 160 × 192 × 192. Overall, the proposed model achieves an average Dice score of 0.883 and a Hausdorff distance of 7.84 on BraTS 2021 cross-validation.


Subjects
Brain Neoplasms, Neural Networks, Computer, Humans, Image Processing, Computer-Assisted/methods, Magnetic Resonance Imaging/methods, Brain Neoplasms/pathology, Brain
12.
Front Public Health; 11: 1109236, 2023.
Article in English | MEDLINE | ID: mdl-36794074

ABSTRACT

Introduction: Cancer incidence in humans is gradually rising for a variety of reasons, and sensible detection and management are essential to decrease disease rates. The kidney is one of the vital organs in human physiology; kidney cancer is a medical emergency that needs accurate diagnosis and well-organized management. Methods: The proposed work aims to develop a framework to classify renal computed tomography (CT) images into healthy/cancer classes using pre-trained deep-learning schemes. To improve detection accuracy, this work suggests a threshold filter-based pre-processing scheme that removes the artefact in the CT slices. The stages of this scheme are: (i) image collection, resizing, and artefact removal, (ii) deep feature extraction, (iii) feature reduction and fusion, and (iv) binary classification using five-fold cross-validation. Results and discussion: The experimental investigation is executed separately for (i) CT slices with the artefact and (ii) CT slices without the artefact. In this study, the K-Nearest Neighbor (KNN) classifier achieves 100% detection accuracy on the pre-processed CT slices. This scheme can therefore be considered for examining clinical-grade renal CT images, as it is clinically significant.


Subjects
Neoplasms, Humans, Tomography, X-Ray Computed/methods, Diagnosis, Differential, Kidney/diagnostic imaging
13.
Diagnostics (Basel); 13(11), 2023 May 23.
Article in English | MEDLINE | ID: mdl-37296683

ABSTRACT

Several advances in computing facilities have followed from the advancement of science and technology, including the implementation of automation in multi-specialty hospitals. This research aims to develop an efficient deep-learning-based brain-tumor (BT) detection scheme to detect the tumor in FLAIR- and T2-modality magnetic-resonance-imaging (MRI) slices. Axial-plane brain MRI slices are used to test and verify the scheme. The reliability of the developed scheme is also verified on clinically collected MRI slices. The proposed scheme involves the following stages: (i) pre-processing the raw MRI image, (ii) deep-feature extraction using pretrained schemes, (iii) watershed-algorithm-based BT segmentation and mining of the shape features, (iv) feature optimization using the elephant-herding algorithm (EHA), and (v) binary classification and verification using three-fold cross-validation. The BT-classification task is performed using (a) individual features, (b) dual deep features, and (c) integrated features. Each experiment is conducted separately on the chosen BRATS and TCIA benchmark MRI slices. This research indicates that the integrated feature-based scheme achieves a classification accuracy of 99.6667% with a support-vector-machine (SVM) classifier. Further, the performance of this scheme is verified using noise-attacked MRI slices, and better classification results are achieved.

14.
Sci Rep; 12(1): 18197, 2022 Oct 28.
Article in English | MEDLINE | ID: mdl-36307444

ABSTRACT

Convolutional neural networks (CNNs) have been employed to classify COVID cases from lung CT scans with promising metrics. However, SARS-CoV-2 has mutated into several variants (B.1.1.7, B.1.135, and P.1), so there is a need for a more robust architecture that distinguishes COVID-positive from COVID-negative patients with less training. We have developed a neural network based on the number of channels present in the images. The CNN architecture is designed in accordance with the number of channels in the dataset and extracts features separately from the channels of the CT-scan dataset. In the tower architecture, the first tower is dedicated to only the first channel of the image; the second CNN tower is dedicated to the feature maps of the first and second channels; and finally the third tower takes account of the feature maps from all three channels. We used two datasets, one from Tongji Hospital, Wuhan, China, and another SARS-CoV-2 dataset, to train and evaluate our CNN architecture. The proposed model achieved an average accuracy of 99.4%, an F1 score of 0.988, and an AUC of 0.99.


Subjects
COVID-19, SARS-CoV-2, Humans, COVID-19/diagnostic imaging, Neural Networks, Computer, Tomography, X-Ray Computed
15.
Front Public Health; 10: 819865, 2022.
Article in English | MEDLINE | ID: mdl-35400062

ABSTRACT

Understanding the reason for an infant's cry is the most difficult thing for parents. There can be various reasons behind the cry, such as hunger, pain, sleep, or diaper-related problems. Identifying the reason is mainly based on the varying patterns of the crying audio. The audio file comprises many features, which are highly important for classification, so it is important to convert the audio signals into the required spectrograms. In this article, we seek efficient solutions to the problem of predicting the reason behind an infant's cry. We used the Mel-frequency cepstral coefficients algorithm to generate the spectrograms and analyzed the varying feature vectors. We then followed two approaches to obtain the experimental results. In the first approach, we used convolutional neural network (CNN) variants such as VGG16 and YOLOv4 to classify the infant cry signals. In the second approach, a multistage heterogeneous stacking ensemble model was used for infant cry classification. Its major advantage was the inclusion of various advanced boosting algorithms at various levels. The proposed multistage heterogeneous stacking ensemble model had the edge over the other neural network models, especially in terms of overall performance and computing power. Finally, after many comparisons, the proposed model delivered the best performance, with a mean classification accuracy of up to 93.7%.


Subjects
Crying, Neural Networks, Computer, Algorithms, Humans, Infant
16.
Behav Neurol; 2022: 6878783, 2022.
Article in English | MEDLINE | ID: mdl-35464043

ABSTRACT

Multimodal medical image fusion is a current technique used in medical applications to combine images from the same modality or from different modalities, improving the visual content of the image for further operations such as image segmentation. Biomedical research and medical image analysis rely heavily on medical image fusion to perform higher-level medical analysis. Multimodal medical fusion helps medical practitioners visualize internal organs and tissues. Multimodal fusion of brain images helps medical practitioners simultaneously visualize hard portions such as the skull and soft portions such as tissue. Brain tumor segmentation can be performed accurately using the image obtained after multimodal medical image fusion. The area of the tumor can be accurately located with the information obtained from both Positron Emission Tomography (PET) and Magnetic Resonance Imaging (MRI) in a single fused image. This approach increases the accuracy of tumor diagnosis and reduces the time needed to diagnose and locate the tumor. The functional information of the brain is available in the PET image, while the anatomy of the brain tissue is available in the MRI image. Thus, the spatial characteristics and functional information can be obtained from a single image using a robust multimodal medical image fusion model. The proposed approach uses a generative adversarial network (GAN) to fuse PET and MRI images into a single image. The results obtained from the proposed approach can be used for further medical analysis to locate the tumor and plan further surgical procedures. The performance of the GAN-based model is evaluated using two metrics, namely, structural similarity index and mutual information. The proposed approach achieved a structural similarity index of 0.8551 and a mutual information of 2.8059.


Subjects
Brain Neoplasms, Image Processing, Computer-Assisted, Brain/diagnostic imaging, Brain Neoplasms/diagnostic imaging, Humans, Image Processing, Computer-Assisted/methods, Magnetic Resonance Imaging/methods, Positron-Emission Tomography/methods
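Mutual information, one of the two fusion metrics named in the abstract above, can be estimated from the joint histogram of two images. The sketch below works on flattened lists of already-discretized intensities and reports the result in bits; the binning, the logarithm base, and the way the paper combines scores against the PET and MRI sources are not stated in this record, so all of those are assumptions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from two equal-length intensity sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # joint histogram of intensity pairs
    px = Counter(xs)              # marginal histogram of the first image
    py = Counter(ys)              # marginal histogram of the second image
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


if __name__ == "__main__":
    fused = [0, 0, 1, 1]   # hypothetical binarized pixels of a fused image
    source = [0, 0, 1, 1]  # hypothetical binarized pixels of a source image
    print(mutual_information(fused, source))
```

Higher values indicate that the fused image carries more of the source image's information; identical inputs give the entropy of the image, and statistically independent inputs give zero.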
17.
IEEE Trans Neural Netw Learn Syst; 33(11): 6129-6143, 2022 Nov.
Article in English | MEDLINE | ID: mdl-33900925

ABSTRACT

Underwater image processing has been shown to exhibit significant potential for exploring underwater environments. It has been applied to a wide variety of fields, such as underwater terrain scanning and applications driven by autonomous underwater vehicles (AUVs), for example image-based underwater object detection. However, underwater images often suffer from degradation due to attenuation, color distortion, and noise from artificial lighting sources, as well as the effects of possibly low-end optical imaging devices. Object detection performance is degraded accordingly. To tackle this problem, this article proposes a lightweight deep underwater object detection network. The key is a deep model that jointly learns color conversion and object detection for underwater images. The image color conversion module transforms color images into the corresponding grayscale images to address underwater color absorption and to enhance object detection performance with lower computational complexity. Experimental results from our implementation on the Raspberry Pi platform demonstrate the effectiveness of the proposed lightweight joint learning model for underwater object detection compared with state-of-the-art approaches.

18.
Front Public Health; 9: 798905, 2021.
Article in English | MEDLINE | ID: mdl-34938715

ABSTRACT

The exponential growth of social media users has changed the dynamics of retrieving potential information from user-generated content and transformed the paradigm of information-retrieval mechanisms with novel developments around the concept of the "web of data". In this regard, our proposed Ontology-Based Sentiment Analysis provides two novel approaches. First, emotion extraction on tweets related to COVID-19 is carried out using a well-formed taxonomy that comprises possible emotional concepts with fine-grained properties and polarized values. Second, the potential entities present in the tweet can be analyzed for semantic associativity. Emotions can be extracted in two cases: (i) words directly associated with the emotional concepts present in the taxonomy and (ii) words indirectly present in the emotional concepts. Although the latter case makes it very challenging to process the tweets, find the hidden patterns, and extract the meaningful facts associated with them, our proposed work is able to extract and detect almost 81% of true positives and is considerably able to detect the false negatives. Finally, the proposed approach's superior performance is shown by its comparison with other peer-level approaches.


Subjects
COVID-19, Social Media, Emotions, Humans, Pandemics, SARS-CoV-2, Sentiment Analysis
19.
Front Genet; 12: 784814, 2021.
Article in English | MEDLINE | ID: mdl-34868275

ABSTRACT

Alzheimer's is a progressive, irreversible, neurodegenerative brain disease. Even with prominent symptoms, it takes years to notice, decode, and reveal Alzheimer's. Advances in technologies such as imaging techniques help in early diagnosis. Still, the results are sometimes inaccurate, which delays treatment. Research in recent times has therefore focused on identifying molecular biomarkers that differentiate genotype and phenotype characteristics. However, gene expression datasets generate a huge number of features, from 1,000 to more than 10,000. To overcome this curse of dimensionality, feature selection techniques are introduced. We designed a gene selection pipeline combining a filter, a wrapper, and an unsupervised method to select the relevant genes: minimum Redundancy and maximum Relevance (mRmR), Wrapper-based Particle Swarm Optimization (WPSO), and an autoencoder, respectively. We used the GSE5281 Alzheimer's dataset from the Gene Expression Omnibus. After choosing the relevant genes, we implemented an Improved Deep Belief Network (IDBN) with simple stopping criteria. We used Bayesian optimization to tune the hyperparameters of the IDBN. The tabulated results show that the proposed pipeline is promising.

20.
J Healthc Eng; 2021: 7517313, 2021.
Article in English | MEDLINE | ID: mdl-34804460

ABSTRACT

The cry is a loud, high-pitched verbal communication of infants. A neonatal infant cry is characterized by a very high fundamental frequency and resonance frequency, with certain sudden variations. Furthermore, even within a single utterance of very short duration, the cry signal possesses both voiced and unvoiced features. Infants mostly communicate with their caretakers through cries, and it is sometimes difficult for caretakers to understand the reason behind a newborn's cry. This research therefore proposes a novel method for classifying newborn infant cries into three groups: hunger, sleep, and discomfort. For each crying frame, twelve time- and frequency-domain features are extracted through acoustic feature engineering, and variable selection using random forests is used to select the most discriminative among them. Subsequently, an extreme gradient boosting-powered grouped-support-vector network is deployed for neonate cry classification. The empirical results show that the proposed method can effectively classify neonate cries into the three groups. The best experimental results showed a mean accuracy of around 91% for most scenarios, which demonstrates the potential of the proposed extreme gradient boosting-powered grouped-support-vector network in neonate cry classification. The proposed method also has a fast recognition time of 27 seconds in identifying these emotional cries.


Subjects
Crying, Voice, Acoustics, Data Collection, Humans, Infant, Infant, Newborn, Research Design