1.
Sensors (Basel) ; 24(3)2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38339454

ABSTRACT

This paper discusses the problem of recognizing defective epoxy drop images for the purpose of performing vision-based die attachment inspection in integrated circuit (IC) manufacturing based on deep neural networks. Two supervised and two unsupervised recognition models are considered. The supervised models examined are an autoencoder (AE) network together with a multi-layer perceptron network (MLP) and a VGG16 network, while the unsupervised models examined are an autoencoder (AE) network together with k-means clustering and a VGG16 network together with k-means clustering. Since in practice very few defective epoxy drop images are available on an actual IC production line, the emphasis in this paper is placed on the impact of data augmentation on the recognition outcome. The data augmentation is achieved by generating synthesized defective epoxy drop images via our previously developed enhanced loss function CycleGAN generative network. The experimental results indicate that when using data augmentation, the supervised and unsupervised models of VGG16 generate perfect or near perfect accuracies for recognition of defective epoxy drop images for the dataset examined. More specifically, for the supervised models of AE+MLP and VGG16, the recognition accuracy is improved by 47% and 1%, respectively, and for the unsupervised models of AE+Kmeans and VGG+Kmeans, the recognition accuracy is improved by 37% and 15%, respectively, due to the data augmentation.

2.
Sensors (Basel) ; 24(5)2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38475083

ABSTRACT

This paper provides a review of various machine learning approaches that have appeared in the literature aimed at individualizing or personalizing the amplification settings of hearing aids. After stating the limitations associated with the current one-size-fits-all settings of hearing aid prescriptions, a spectrum of studies in engineering and hearing science are discussed. These studies involve making adjustments to prescriptive values in order to enable preferred and individualized settings for a hearing aid user in an audio environment of interest to that user. This review gathers, in one place, a comprehensive collection of works that have been conducted thus far with respect to achieving the personalization or individualization of the amplification function of hearing aids. Furthermore, it underscores the impact that machine learning can have on enabling an improved and personalized hearing experience for hearing aid users. This paper concludes by stating the challenges and future research directions in this area.


Subjects
Hearing Aids , Sensorineural Hearing Loss , Humans , Sensorineural Hearing Loss/rehabilitation , Machine Learning
3.
Sensors (Basel) ; 23(10)2023 May 18.
Article in English | MEDLINE | ID: mdl-37430778

ABSTRACT

In integrated circuit manufacturing, defects in epoxy drops for die attachments are required to be identified during production. Modern identification techniques based on vision-based deep neural networks require the availability of a very large number of defect and non-defect epoxy drop images. In practice, however, very few defective epoxy drop images are available. This paper presents a generative adversarial network solution to generate synthesized defective epoxy drop images as a data augmentation approach so that vision-based deep neural networks can be trained or tested using such images. More specifically, the so-called CycleGAN variation of the generative adversarial network is used by enhancing its cycle consistency loss function with two other loss functions consisting of learned perceptual image patch similarity (LPIPS) and a structural similarity index metric (SSIM). The results obtained indicate that when using the enhanced loss function, the quality of synthesized defective epoxy drop images is improved by 59%, 12%, and 131% for the metrics of the peak signal-to-noise ratio (PSNR), universal image quality index (UQI), and visual information fidelity (VIF), respectively, compared to the CycleGAN standard loss function. A typical image classifier is used to show the improvement in the identification outcome when using the synthesized images generated by the developed data augmentation approach.
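Of the three quality metrics named in this abstract, PSNR is the simplest to state. The sketch below uses the standard definition, 10·log10(MAX² / MSE), and is not taken from the paper. The function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images.

    Both images are given as flat sequences of pixel values of equal
    length (a hypothetical representation chosen for this sketch).
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("images must be non-empty and of the same size")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the synthesized image is numerically closer to the reference image.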

4.
Sensors (Basel) ; 22(9)2022 Apr 26.
Article in English | MEDLINE | ID: mdl-35590995

ABSTRACT

This paper presents an assistive hearing smartphone app mimicking the two main functions of hearing aids, consisting of compression and noise reduction. The app is designed to run in real time on smartphones or tablets. Appropriate levels of amplification or gain are activated by selecting a filter from a filter bank for six audio environment situations covering three sound pressure levels of speech and two sound pressure levels of noise. The results of this smartphone app for real-world audio environments are provided, indicating its effectiveness as a real-time platform for studying compression and noise reduction algorithms in the field or in realistic audio environments.
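The selection among six audio environments (three speech levels by two noise levels) can be pictured as a small lookup. The sketch below is only an illustration of that indexing idea. The threshold values, the constant names, and the function `select_filter_index` are hypothetical and do not come from the paper.

```python
# Hypothetical level boundaries in dB SPL; the paper's actual values are not given here.
SPEECH_THRESHOLDS_DB = (55.0, 70.0)  # split speech into low / medium / high
NOISE_THRESHOLD_DB = 60.0            # split noise into low / high


def select_filter_index(speech_spl_db, noise_spl_db):
    """Map measured speech and noise levels to one of six filter-bank slots (0..5)."""
    # Count how many speech thresholds are reached: 0, 1, or 2.
    speech_level = sum(speech_spl_db >= t for t in SPEECH_THRESHOLDS_DB)
    # Noise is either below (0) or at/above (1) its single threshold.
    noise_level = int(noise_spl_db >= NOISE_THRESHOLD_DB)
    return speech_level * 2 + noise_level
```

With this layout, each of the six (speech level, noise level) pairs maps to a distinct index, which a real implementation could use to pick a gain filter.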


Subjects
Hearing Aids , Sensorineural Hearing Loss , Hearing , Sensorineural Hearing Loss/rehabilitation , Humans , Noise , Smartphone
5.
Sensors (Basel) ; 22(16)2022 Aug 12.
Article in English | MEDLINE | ID: mdl-36015791

ABSTRACT

Adaptive dynamic range optimization (ADRO) is a hearing aid fitting rationale which involves adjusting the gains in a number of frequency bands by using a series of rules. The rules reflect the comparison of the estimated percentile occurrences of the sound levels with the audibility and comfort hearing levels of a person suffering from hearing loss. In the study reported in this paper, a previously developed machine learning method was utilized to personalize the ADRO fitting in order to provide an improved hearing experience as compared to the standard ADRO hearing aid fitting. The personalization was carried out based on the user preference model within the framework of maximum likelihood inverse reinforcement learning. The testing of ten subjects with hearing loss was conducted, which indicated that the personalized ADRO was preferred over the standard ADRO on average by about 10 times. Furthermore, a word recognition experiment was conducted, which showed that the personalized ADRO had no adverse impact on speech understanding as compared to the standard ADRO.


Subjects
Cochlear Implants , Deafness , Hearing Aids , Hearing Loss , Speech Perception , Hearing Loss/rehabilitation , Humans
6.
Sensors (Basel) ; 21(11)2021 May 27.
Article in English | MEDLINE | ID: mdl-34071736

ABSTRACT

The interest in contactless or remote heart rate measurement has been steadily growing in healthcare and sports applications. Contactless methods involve the utilization of a video camera and image processing algorithms. Recently, deep learning methods have been used to improve the performance of conventional contactless methods for heart rate measurement. After providing a review of the related literature, a comparison of the deep learning methods whose codes are publicly available is conducted in this paper. The public domain UBFC dataset is used to compare the performance of these deep learning methods for heart rate measurement. The results obtained show that the deep learning method PhysNet generates the best heart rate measurement outcome among these methods, with a mean absolute error value of 2.57 beats per minute and a mean square error value of 7.56 beats per minute.
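The two error measures reported above follow their usual definitions, which can be sketched as below. The function names and the sample heart rate values are hypothetical and serve only to show the arithmetic. They are not data from the paper or the UBFC dataset.

```python
def mean_absolute_error(truth, estimate):
    """Average absolute difference between reference and estimated values."""
    return sum(abs(t - e) for t, e in zip(truth, estimate)) / len(truth)


def mean_squared_error(truth, estimate):
    """Average squared difference between reference and estimated values."""
    return sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth)


# Hypothetical reference and camera-based heart rate readings, in beats per minute.
reference_bpm = [72.0, 75.0, 80.0, 68.0]
estimated_bpm = [70.0, 76.0, 83.0, 68.0]

print(mean_absolute_error(reference_bpm, estimated_bpm))  # 1.5
print(mean_squared_error(reference_bpm, estimated_bpm))   # 3.5
```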


Subjects
Deep Learning , Algorithms , Heart , Heart Rate , Photoplethysmography , Computer-Assisted Signal Processing
7.
Sensors (Basel) ; 20(10)2020 May 20.
Article in English | MEDLINE | ID: mdl-32443857

ABSTRACT

Existing public domain multi-modal datasets for human action recognition only include actions of interest that have already been segmented from action streams. These datasets cannot be used to study a more realistic action recognition scenario where actions of interest occur randomly and continuously among actions of non-interest or no actions. It is more challenging to recognize actions of interest in continuous action streams since the starts and ends of these actions are not known and need to be determined in an on-the-fly manner. Furthermore, there exists no public domain multi-modal dataset in which video and inertial data are captured simultaneously for continuous action streams. The main objective of this paper is to describe a dataset that is collected and made publicly available, named the Continuous Multimodal Human Action Dataset (C-MHAD), in which video and inertial data streams are captured simultaneously in a continuous way. This dataset is then used in an example recognition technique, and the results obtained indicate that the fusion of these two sensing modalities increases the F1 scores compared to using each sensing modality individually.
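The F1 score used to compare the modalities is the harmonic mean of precision and recall. A minimal sketch of the standard definition follows. The function name `f1_score` and the convention of returning 0.0 when nothing is detected are assumptions for this illustration, and how detections are matched to ground-truth actions in a continuous stream is left out.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 score computed from detection counts."""
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0  # assumed convention when there are no correct detections
    return 2 * precision * recall / (precision + recall)
```

For example, 8 correct detections with 2 false alarms and 2 missed actions give a precision and a recall of 0.8 each, and therefore an F1 score of 0.8.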


Subjects
Algorithms , Datasets as Topic , Human Activities , Video Recording , Humans
8.
Sensors (Basel) ; 20(12)2020 Jun 25.
Article in English | MEDLINE | ID: mdl-32630480

ABSTRACT

This paper addresses real-time moving object detection with high accuracy in high-resolution video frames. A previously developed framework for moving object detection is modified to enable real-time processing of high-resolution images. First, a computationally efficient method is employed, which detects moving regions on a resized image while maintaining moving regions on the original image with mapping coordinates. Second, a light backbone deep neural network in place of a more complex one is utilized. Third, the focal loss function is employed to alleviate the imbalance between positive and negative samples. The results of the extensive experiments conducted indicate that the modified framework developed in this paper achieves a processing rate of 21 frames per second with 86.15% accuracy on the SimitMovingDataset dataset, which contains high-resolution images of size 1920 × 1080.
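The focal loss mentioned above down-weights easy examples so that the many easy negatives do not dominate training. The sketch below shows the commonly used binary form, -alpha_t · (1 - p_t)^gamma · log(p_t), for a single prediction. The defaults gamma=2.0 and alpha=0.25 are widely used values assumed here for illustration, not settings reported in this abstract.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one sample.

    p is the predicted probability of the positive class and y is the
    true label (1 or 0). gamma and alpha are assumed default values.
    """
    # Probability assigned to the true class, and its class weight.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # guard against log(0)
    # The (1 - p_t) ** gamma factor shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to an alpha-weighted cross-entropy.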

9.
Sensors (Basel) ; 19(17)2019 Aug 24.
Article in English | MEDLINE | ID: mdl-31450609

ABSTRACT

This paper presents the simultaneous utilization of video images and inertial signals that are captured at the same time via a video camera and a wearable inertial sensor within a fusion framework in order to achieve a more robust human action recognition compared to the situations when each sensing modality is used individually. The data captured by these sensors are turned into 3D video images and 2D inertial images that are then fed as inputs into a 3D convolutional neural network and a 2D convolutional neural network, respectively, for recognizing actions. Two types of fusion are considered: decision-level fusion and feature-level fusion. Experiments are conducted using the publicly available dataset UTD-MHAD in which simultaneous video images and inertial signals are captured for a total of 27 actions. The results obtained indicate that both the decision-level and feature-level fusion approaches generate higher recognition accuracies compared to the approaches when each sensing modality is used individually. The highest accuracy of 95.6% is obtained for the decision-level fusion approach.
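Decision-level fusion combines the per-class outputs of the two networks after each has made its own prediction. The abstract does not state the combination rule, so the sketch below assumes a simple weighted average of class probabilities purely for illustration. The function name `fuse_decisions` and the `video_weight` parameter are hypothetical.

```python
def fuse_decisions(video_probs, inertial_probs, video_weight=0.5):
    """Fuse two per-class probability vectors and return (label, fused scores).

    A weighted average is an assumed rule; the paper may use a different one.
    """
    if len(video_probs) != len(inertial_probs):
        raise ValueError("both classifiers must score the same set of classes")
    w = video_weight
    fused = [w * v + (1.0 - w) * i for v, i in zip(video_probs, inertial_probs)]
    # Pick the class with the highest fused score.
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused
```

Feature-level fusion, by contrast, would concatenate the intermediate features of the two networks before a shared classifier, which is not shown here.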


Subjects
Video Recording , Ocular Vision/physiology , Wearable Electronic Devices , Algorithms , Deep Learning , Humans , Computer Neural Networks
10.
Speech Commun ; 55(4): 523-534, 2013 May 01.
Article in English | MEDLINE | ID: mdl-24610967

ABSTRACT

A computationally efficient speech enhancement pipeline in noisy environments based on a single-processor implementation is developed for utilization in bilateral cochlear implant systems. A two-channel joint objective function is defined and a closed-form solution is obtained based on the weighted-Euclidean distortion measure. The computational efficiency of this pipeline and the fact that it requires no synchronization make it a suitable solution for real-time deployment. A speech quality measure is used to show its effectiveness in six different noisy environments as compared to a similar one-channel enhancement pipeline when using two separate processors or when using independent sequential processing.

11.
IEEE Trans Pattern Anal Mach Intell ; 44(7): 3523-3542, 2022 07.
Article in English | MEDLINE | ID: mdl-33596172

ABSTRACT

Image segmentation is a key task in computer vision and image processing with important applications such as scene understanding, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression, among others, and numerous segmentation algorithms are found in the literature. Against this backdrop, the broad success of deep learning (DL) has prompted the development of new image segmentation approaches leveraging DL models. We provide a comprehensive review of this recent literature, covering the spectrum of pioneering efforts in semantic and instance segmentation, including convolutional pixel-labeling networks, encoder-decoder architectures, multiscale and pyramid-based approaches, recurrent networks, visual attention models, and generative models in adversarial settings. We investigate the relationships, strengths, and challenges of these DL-based segmentation models, examine the widely used datasets, compare performances, and discuss promising research directions.


Subjects
Deep Learning , Robotics , Algorithms , Computer-Assisted Image Processing/methods , Computer Neural Networks
12.
IEEE Access ; 6: 9017-9026, 2018.
Article in English | MEDLINE | ID: mdl-30250774

ABSTRACT

This paper presents a smartphone app that performs real-time voice activity detection based on a convolutional neural network. Real-time implementation issues are discussed, showing how the slow inference time associated with convolutional neural networks is addressed. The developed smartphone app is meant to act as a switch for noise reduction in the signal processing pipelines of hearing devices, enabling noise estimation or classification to be conducted in noise-only parts of noisy speech signals. The developed smartphone app is compared with a previously developed voice activity detection app as well as with two highly cited voice activity detection algorithms. The experimental results indicate that the developed app using a convolutional neural network outperforms the previously developed smartphone app.

13.
Annu Int Conf IEEE Eng Med Biol Soc ; 2018: 2837-2840, 2018 Jul.
Article in English | MEDLINE | ID: mdl-30440992

ABSTRACT

This paper presents the integration of three major modules of the signal processing pipeline that go into a typical digital hearing aid as a real-time smartphone app. These modules include voice activity detection, noise reduction, and compression. The steps taken to allow the real-time implementation of this integration or signal processing pipeline are discussed. These steps can be utilized to create similar signal processing pipelines or integrated apps to evaluate hearing improvement algorithms. The real-time characteristics of the developed integrated app are reported as well as an objective evaluation of its noise reduction.


Subjects
Hearing Aids , Computer-Assisted Signal Processing , Smartphone , Algorithms , Noise
14.
Comput Methods Programs Biomed ; 162: 139-148, 2018 Aug.
Article in English | MEDLINE | ID: mdl-29903480

ABSTRACT

BACKGROUND AND OBJECTIVE: The detection of the optic nerve head (ONH) in retinal fundus images plays a key role in identifying Diabetic Retinopathy (DR) as well as other abnormal conditions in eye examinations. This paper presents a method and its associated software towards the development of an Android smartphone app based on a previously developed ONH detection algorithm. The development of this app and the use of the d-Eye lens, which can be snapped onto a smartphone, provide a mobile and cost-effective computer-aided diagnosis (CAD) system in ophthalmology. In particular, this CAD system would allow eye examination to be conducted in remote locations with limited access to clinical facilities. METHODS: A pre-processing step is first carried out to enable the ONH detection on the smartphone platform. Then, the optimization steps taken to run the algorithm in a computationally and memory efficient manner on the smartphone platform are discussed. RESULTS: The smartphone code of the ONH detection algorithm was applied to the STARE and DRIVE databases, resulting in about 96% and 100% detection rates, respectively, with average execution times of about 2 s and 1.3 s. In addition, two other databases captured by the d-Eye and iExaminer snap-on lenses for smartphones were considered, resulting in about 93% and 91% detection rates, respectively, with average execution times of about 2.7 s and 2.2 s, respectively.


Subjects
Diabetic Retinopathy/diagnostic imaging , Computer-Assisted Diagnosis , Computer-Assisted Image Interpretation , Computer-Assisted Image Processing , Mobile Applications , Algorithms , Computer Systems , Cost-Benefit Analysis , Fundus Oculi , Humans , Statistical Models , Ophthalmology/instrumentation , Optic Disk , Smartphone
15.
IEEE Trans Med Imaging ; 26(4): 427-51, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17427731

ABSTRACT

Functional localization is a concept which involves the application of a sequence of geometrical and statistical image processing operations in order to define the location of brain activity or to produce functional/parametric maps with respect to the brain structure or anatomy. Considering that functional brain images do not normally convey detailed structural information and, thus, do not present an anatomically specific localization of functional activity, various image registration techniques are introduced in the literature for the purpose of mapping functional activity into an anatomical image or a brain atlas. The problems addressed by these techniques differ depending on the application and the type of analysis, i.e., single-subject versus group analysis. Functional to anatomical brain image registration is the core part of functional localization in most applications and is accompanied by intersubject and subject-to-atlas registration for group analysis studies. Cortical surface registration and automatic brain labeling are some of the other tools towards establishing a fully automatic functional localization procedure. While several previous survey papers have reviewed and classified general-purpose medical image registration techniques, this paper provides an overview of brain functional localization along with a survey and classification of the image registration techniques related to this problem.


Subjects
Brain Mapping/methods , Brain/anatomy & histology , Brain/physiology , Computer-Assisted Image Interpretation/methods , Magnetic Resonance Imaging/methods , Automated Pattern Recognition/methods , Subtraction Technique , Algorithms , Artificial Intelligence , Cluster Analysis , Evoked Potentials/physiology , Humans , Image Enhancement/methods , Three-Dimensional Imaging/methods , Reproducibility of Results , Sensitivity and Specificity
16.
Annu Int Conf IEEE Eng Med Biol Soc ; 2016: 736-739, 2016 Aug.
Article in English | MEDLINE | ID: mdl-28268433

ABSTRACT

This paper presents a voice activity detector (VAD) for automatic switching between a noise classifier and a speech enhancer as part of the signal processing pipeline of hearing aid devices. The developed VAD consists of a computationally efficient feature extractor and a random forest classifier. Previously used signal features as well as two newly introduced signal features are extracted and fed into the classifier to perform automatic switching. This switching approach is compared to two popular VADs. The results obtained indicate that the introduced approach outperforms these existing approaches in terms of both detection rate and processing time.


Subjects
Hearing Aids , Noise , Computer-Assisted Signal Processing , Speech Perception , Humans , Speech , Voice
17.
Annu Int Conf IEEE Eng Med Biol Soc ; 2016: 85-88, 2016 Aug.
Article in English | MEDLINE | ID: mdl-28268287

ABSTRACT

It is well established that the presence of environmental noises has a negative impact on the performance of hearing aid devices. This paper addresses a noise adaptive speech enhancement solution for the purpose of improving the performance of hearing aid devices in noisy environments. Depending on which of three noise types (babble, machinery, or driving car) is present, the parameters of a recently developed speech enhancement algorithm are appropriately adjusted to gain improved speech understanding performance in noisy environments. This solution is implemented on smartphone platforms as an app and interfaced with a hearing aid device. A clinical testing protocol is devised to evaluate the performance of the app in participants with normal hearing and hearing impairments. The clinical testing results indicate that a statistically significant improvement in speech understanding is gained between the unprocessed and processed conditions using the developed noise adaptive speech enhancement solution.


Subjects
Algorithms , Hearing Aids , Sensorineural Hearing Loss/therapy , Smartphone , Speech Intelligibility , Adult , Aged , Aged 80 and over , Equipment Design , Sensorineural Hearing Loss/rehabilitation , Humans , Middle Aged , Noise , Signal-to-Noise Ratio
18.
Annu Int Conf IEEE Eng Med Biol Soc ; 2016: 5885-5888, 2016 Aug.
Article in English | MEDLINE | ID: mdl-28269593

ABSTRACT

In this paper, the development of a speech processing pipeline on smartphones for hearing aid devices (HADs) is presented. This pipeline is used for noise suppression and speech enhancement (SE) to improve speech quality and intelligibility. The proposed method is implemented to run in real-time on Android smartphones. The results of the testing conducted indicate that the proposed method suppresses the noise and improves the perceptual quality of speech in terms of three objective measures of perceptual evaluation of speech quality (PESQ), noise attenuation level (NAL), and the coherent speech intelligibility index (CSII).


Subjects
Hearing Aids , Smartphone , Speech Perception/physiology , Algorithms , Humans , Speech Intelligibility
20.
Article in English | MEDLINE | ID: mdl-25570902

ABSTRACT

This paper presents a home-based Senior Fitness Test (SFT) measurement system that uses an inertial sensor and a depth camera in a collaborative way. The depth camera is used to monitor the correct pose of a subject for a fitness test and any deviation from the correct pose, while the inertial sensor is used to count the number of fitness test actions performed by the subject within the time duration specified by the fitness protocol. The results indicate that this collaborative approach leads to high success rates in providing the SFT measurements under realistic conditions.


Subjects
Exercise Test/instrumentation , Exercise Test/methods , Physical Fitness , Equipment Design , Female , Humans , Male , Nontherapeutic Human Experimentation , Video Recording/instrumentation