Results 1 - 9 of 9
1.
Int J Neural Syst ; 34(3): 2450011, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38231046

ABSTRACT

Linear assignment problems are well-known combinatorial optimization problems involving domains such as logistics, robotics and telecommunications. In general, obtaining an optimal solution to such problems is computationally infeasible even in small settings, so heuristic algorithms are often used to find near-optimal solutions. In order to attain the right assignment permutation, this study investigates a general-purpose learning strategy that uses a bipartite graph to describe the problem structure and a message passing Graph Neural Network (GNN) model to learn the correct mapping. Comparing the proposed structure with two existing DNN solutions, simulation results show that the proposed approach significantly improves classification accuracy, proving to be very efficient in terms of processing time and memory requirements, due to its inherent parameter sharing capability. Among the many practical uses that require solving allocation problems in everyday scenarios, we decided to apply the proposed approach to address the scheduling of electric smart meters access within an electricity distribution smart grid infrastructure, since near-real-time energy monitoring is a key element of the green transition that has become increasingly important in recent times. The results obtained show that the proposed graph-based solver, although sub-optimal, exhibits the highest scalability, compared with other state-of-the-art heuristic approaches. To foster the reproducibility of the results, we made the code available at https://github.com/aircarlo/GNN_LSAP.
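To make the problem statement concrete, here is a minimal sketch of a linear sum assignment instance. It is not the GNN solver from the paper: the cost matrix is made up, and the two solvers (an exhaustive search and a simple greedy heuristic) are stand-ins chosen only to show what "the right assignment permutation" means and how a heuristic can miss it.

```python
from itertools import permutations

def assignment_cost(cost, perm):
    """Total cost when row i is assigned to column perm[i]."""
    return sum(cost[i][j] for i, j in enumerate(perm))

def solve_exact(cost):
    """Exhaustive search over all n! permutations (only viable for tiny n)."""
    n = len(cost)
    return min(permutations(range(n)), key=lambda p: assignment_cost(cost, p))

def solve_greedy(cost):
    """Heuristic: each row in turn takes its cheapest still-free column."""
    free = set(range(len(cost)))
    perm = []
    for row in cost:
        j = min(free, key=lambda c: row[c])
        perm.append(j)
        free.remove(j)
    return tuple(perm)

# Hypothetical 3x3 cost matrix: cost[i][j] is the cost of giving task j to agent i.
cost = [
    [1, 2, 8],
    [1, 9, 7],
    [6, 5, 1],
]
best = solve_exact(cost)
greedy = solve_greedy(cost)
print(best, assignment_cost(cost, best))
print(greedy, assignment_cost(cost, greedy))
```

On this toy instance the greedy pass commits the first row to its cheapest column and ends up with a more expensive assignment than the exhaustive search, which is the kind of gap a learned solver is meant to narrow.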


Subjects
Algorithms , Learning , Reproducibility of Results , Computer Simulation , Neural Networks, Computer
2.
Comput Methods Programs Biomed ; 242: 107840, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37832429

ABSTRACT

BACKGROUND AND OBJECTIVES: Timely identification of dysarthria progression in patients with bulbar-onset amyotrophic lateral sclerosis (ALS) is relevant to a comprehensive assessment of the disease evolution. To this end, the literature recognizes the utmost importance of assessing the number of syllables uttered by a subject during the oral diadochokinesis (DDK) test. METHODS: To support clinicians, this work proposes a remote deep learning-based system, which consists of (i) a web application to acquire audio tracks of bulbar-onset ALS patients and healthy control subjects while performing the oral DDK test (i.e., repeating the /pa/, /pa-ta-ka/ and /oo-ee/ syllables) and (ii) a DDK-AID network designed to process the acquired audio signals, which have different durations, and to output the number of per-task syllables repeated by the subject. RESULTS: The DDK-AID network outperforms the comparative method, achieving a mean Accuracy of 90.23 in counting syllables repeated by the eleven bulbar-onset ALS patients while performing the oral DDK test. CONCLUSIONS: The proposed remote monitoring system, in light of the achieved performance, represents an important step towards the implementation of self-service telemedicine systems which may ensure customised care plans.


Subjects
Amyotrophic Lateral Sclerosis , Deep Learning , Humans , Amyotrophic Lateral Sclerosis/diagnosis , Software
3.
Data Brief ; 48: 109146, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37128585

ABSTRACT

Accurate perception and awareness of the environment surrounding the automobile is a challenge in automotive research. This article presents A3CarScene, a dataset recorded while driving a research vehicle equipped with audio and video sensors on public roads in the Marche Region, Italy. The sensor suite includes eight microphones installed inside and outside the passenger compartment and two dashcams mounted on the front and rear windows. Approximately 31 h of data for each device were collected during October and November 2022 by driving about 1500 km along diverse roads and landscapes, in variable weather conditions, in daytime and nighttime hours. All key information for the scene understanding process of automated vehicles has been accurately annotated. For each route, annotations with beginning and end timestamps report the type of road traveled (motorway, trunk, primary, secondary, tertiary, residential, and service roads), the degree of urbanization of the area (city, town, suburban area, village, exurban and rural areas), the weather conditions (clear, cloudy, overcast, and rainy), the level of lighting (daytime, evening, night, and tunnel), the type (asphalt or cobblestones) and moisture status (dry or wet) of the road pavement, and the state of the windows (open or closed). This large-scale dataset is valuable for developing new driving assistance technologies based on audio or video data alone or in a multimodal manner and for improving the performance of systems currently in use. The data acquisition process with sensors in multiple locations allows for the assessment of the best installation placement concerning the task. Deep learning engineers can use this dataset to build new baselines, as a comparative benchmark, and to extend existing databases for autonomous driving.
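As an illustration of annotations delimited by beginning and end timestamps, the sketch below models each annotation track as a list of labeled time segments. The class name, field names, and all timestamp values are hypothetical; the dataset's actual file layout is not described in the abstract. Only the label vocabulary (for example "motorway" or "rainy") comes from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    start_s: float  # beginning timestamp in seconds (hypothetical unit)
    end_s: float    # end timestamp in seconds, exclusive
    label: str      # one value from the annotation category

def label_at(segments, t):
    """Return the label of the segment covering time t, or None if none does."""
    for seg in segments:
        if seg.start_s <= t < seg.end_s:
            return seg.label
    return None

# Made-up tracks for one route: road type and weather conditions.
road_type = [Segment(0.0, 120.0, "residential"), Segment(120.0, 900.0, "motorway")]
weather = [Segment(0.0, 900.0, "rainy")]

print(label_at(road_type, 60.0), label_at(weather, 60.0))
```

Keeping one track per annotation category lets categories change at different times along the same route.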

5.
Sensors (Basel) ; 22(12), 2022 Jun 08.
Article in English | MEDLINE | ID: mdl-35746120

ABSTRACT

It is a well-established practice to build a robust system for sound event detection by training supervised deep learning models on large datasets, but audio data collection and labeling are often challenging and require large amounts of effort. This paper proposes a workflow based on few-shot metric learning for emergency siren detection performed in steps: prototypical networks are trained on publicly available sources or synthetic data in multiple combinations, and at inference time, the best knowledge learned in associating a sound with its class representation is transferred to identify ambulance sirens, given only a few instances for the prototype computation. Performance is evaluated on siren recordings acquired by sensors inside and outside the cabin of an equipped car, investigating the contribution of filtering techniques for background noise reduction. The results show the effectiveness of the proposed approach, achieving AUPRC scores equal to 0.86 and 0.91 in unfiltered and filtered conditions, respectively, outperforming a convolutional baseline model with and without fine-tuning for domain adaptation. Extensive experiments conducted on several recording sensor placements prove that few-shot learning is a reliable technique even in real-world scenarios and gives valuable insights for developing an in-car emergency vehicle detection system.
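The prototype computation mentioned above can be sketched as follows. This is a minimal illustration of nearest-prototype classification, assuming embeddings have already been produced by some trained network; the two-dimensional vectors, the class names, and the use of Euclidean distance are assumptions made for the example, not details taken from the paper.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_prototypes(support):
    """support maps class name -> list of embedding vectors (the few shots)."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}

def classify(query, prototypes):
    """Assign the query embedding to the class with the closest prototype."""
    return min(prototypes, key=lambda label: euclidean(query, prototypes[label]))

# Toy embeddings standing in for the output of a trained embedding network.
support = {
    "siren": [[0.9, 0.1], [1.1, -0.1], [1.0, 0.0]],
    "noise": [[-1.0, 0.2], [-0.8, -0.2], [-1.2, 0.0]],
}
prototypes = build_prototypes(support)
print(classify([0.7, 0.3], prototypes))
print(classify([-0.5, 0.0], prototypes))
```

Because only the class means are needed at inference time, a new target class can be added from a handful of labeled recordings without retraining the embedding network.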


Subjects
Neural Networks, Computer , Sound , Ambulances , Data Collection , Time
6.
J Acoust Soc Am ; 148(5): 3052, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33261386

ABSTRACT

The Rhodes piano is an electromechanical keyboard instrument, released for the first time in 1946 and subsequently manufactured for at least four decades, reaching an iconic status and being now generally referred to as the electric piano. A few academic works discuss its operating principle and propose different physical modeling strategies; however, the inharmonic modes that characterize the attack transient have not been the subject of a dedicated study before. This study addresses this topic by first observing the spectrum at the pickup output, applying a psychoacoustic model to assess perceptual relevance, and then conducting a series of scanning laser Doppler vibrometry (SLDV) experiments on the Rhodes asymmetric tuning fork. This study compares the modes of the Rhodes piano to those of its individual parts, allowing for the extraction of important information regarding their role and natural modes. On the basis of this study, numerical experiments are conducted that show the intermodulation of the modes due to the magnetic pickup and allow the tones produced by the Rhodes to be closely matched from the collected data. Finally, this study is able to extract the distribution of the most important modes found on the whole keyboard range of a Rhodes piano, which can be useful for sound synthesis.

7.
Comput Intell Neurosci ; 2017: 1512670, 2017.
Article in English | MEDLINE | ID: mdl-28638405

ABSTRACT

Falls are the primary cause of injury-related death among the elderly. The scientific community has devoted particular attention to them, since injuries can be limited by early detection of the event. The solution proposed in this paper is based on a combined One-Class SVM (OCSVM) and template-matching classifier that discriminates human falls from nonfalls in a semisupervised framework. Acoustic signals are captured by means of a Floor Acoustic Sensor; then Mel-Frequency Cepstral Coefficients and Gaussian Mean Supervectors (GMSs) are extracted for the fall/nonfall discrimination. Here we propose a single-sensor two-stage user-aided approach: in the first stage, the OCSVM detects abnormal acoustic events. In the second, the template-matching classifier produces the final decision exploiting a set of template GMSs related to the events marked as false positives by the user. The performance of the algorithm has been evaluated on a corpus containing human falls and nonfall sounds. Compared to the OCSVM-only approach, the proposed algorithm improves the performance by 10.14% in clean conditions and 4.84% in noisy conditions. Compared to Popescu and Mahnot (2009), the performance improvement is 19.96% in clean conditions and 8.08% in noisy conditions.
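The two-stage decision flow can be sketched as below. This is one plausible reading of the abstract, not the authors' implementation: the first stage is replaced by a simple distance-from-center test standing in for the OCSVM, the feature vectors are toy two-dimensional points rather than GMSs, and every name, radius, and tolerance is hypothetical.

```python
import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_abnormal(feature, normal_center, radius):
    """Stage 1 stand-in for the OCSVM: flag points far from 'normal' sounds."""
    return distance(feature, normal_center) > radius

def matches_template(feature, templates, tolerance):
    """Stage 2: True if the feature is close to any user-marked false positive."""
    return any(distance(feature, t) <= tolerance for t in templates)

def detect_fall(feature, normal_center, radius, templates, tolerance):
    if not is_abnormal(feature, normal_center, radius):
        return False  # ordinary sound, rejected by stage 1
    # Abnormal event: report a fall unless it resembles a known false positive.
    return not matches_template(feature, templates, tolerance)

normal_center = [0.0, 0.0]             # hypothetical center of nonfall sounds
false_positive_templates = [[3.0, 0.0]]  # event the user marked as a false alarm

for feature in ([0.2, 0.1], [3.1, 0.0], [0.0, 4.0]):
    print(feature, detect_fall(feature, normal_center, 1.0,
                               false_positive_templates, 0.5))
```

Each time the user marks a false alarm, its feature vector would be appended to the template list, so the second stage improves without retraining the first.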


Subjects
Accidental Falls , Acoustics , Algorithms , Support Vector Machine , Accidental Falls/mortality , Aged , Humans , Normal Distribution
8.
Comput Intell Neurosci ; 2017: 4694860, 2017.
Article in English | MEDLINE | ID: mdl-28182121

ABSTRACT

In the emerging field of acoustic novelty detection, most research efforts are devoted to probabilistic approaches such as mixture models or state-space models. Only recent studies introduced (pseudo-)generative models for acoustic novelty detection with recurrent neural networks in the form of an autoencoder. In these approaches, auditory spectral features of the next short-term frame are predicted from the previous frames by means of Long Short-Term Memory recurrent denoising autoencoders. The reconstruction error between the input and the output of the autoencoder is used as an activation signal to detect novel events. There is no evidence of studies focused on comparing previous efforts to automatically recognize novel events from audio signals and giving a broad and in-depth evaluation of recurrent neural network-based autoencoders. The present contribution aims to consistently evaluate our recent novel approaches to fill this gap in the literature and provide insight by extensive evaluations carried out on three databases: A3Novelty, PASCAL CHiME, and PROMETHEUS. Besides providing an extensive analysis of novel and state-of-the-art methods, the article shows how RNN-based autoencoders outperform statistical approaches up to an absolute improvement of 16.4% average F-measure over the three databases.
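The use of reconstruction error as an activation signal can be sketched as follows. The autoencoder itself is not modeled here: the input frames and the reconstructed frames are made-up toy values, and the fixed threshold is an assumption, since the abstract does not say how the decision threshold is chosen.

```python
def frame_errors(inputs, outputs):
    """Mean squared error per frame between input and reconstruction."""
    return [
        sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
        for a, b in zip(inputs, outputs)
    ]

def detect_novel(errors, threshold):
    """Flag frames whose reconstruction error exceeds the threshold."""
    return [e > threshold for e in errors]

# Toy spectral frames and their (hypothetical) autoencoder reconstructions.
inputs = [[0.0, 1.0], [1.0, 1.0], [4.0, 0.0]]
outputs = [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]

errors = frame_errors(inputs, outputs)
flags = detect_novel(errors, threshold=1.0)
print(errors, flags)
```

A model trained only on normal sounds reconstructs them well, so a frame it reconstructs poorly is a candidate novel event.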


Subjects
Acoustics , Data Compression , Neural Networks, Computer , Databases, Factual , Humans , Models, Statistical
9.
Cogn Neurodyn ; 5(3): 253-64, 2011 Sep.
Article in English | MEDLINE | ID: mdl-22942915

ABSTRACT

Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today's automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and models. Histogram Equalization is an efficient method to reduce the mismatch between clean and noisy conditions by normalizing all moments of the probability distribution of the feature vector components. In this article, we propose to combine histogram equalization and multi-condition training for robust keyword detection in noisy speech. To better cope with conversational speaking styles, we show how contextual information can be effectively exploited in a multi-stream ASR framework that dynamically models context-sensitive phoneme estimates generated by a long short-term memory neural network. The proposed techniques are evaluated on the SEMAINE database, a corpus containing emotionally colored conversations with a cognitive system for "Sensitive Artificial Listening".
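Histogram equalization of a single feature component can be sketched as a rank-based mapping onto a reference distribution. The sketch below assumes a standard normal reference and a mid-rank estimate of the empirical CDF; both choices, and the sample values, are assumptions made for illustration and are not taken from the article.

```python
from statistics import NormalDist

def equalize(values, reference=NormalDist()):
    """Map each value to the reference quantile of its empirical CDF rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        cdf = (rank + 0.5) / n  # mid-rank estimate, strictly inside (0, 1)
        out[i] = reference.inv_cdf(cdf)
    return out

# One feature component over five frames, with an outlier caused by noise.
feature = [5.0, 100.0, 7.0, 6.0, 8.0]
equalized = equalize(feature)
print([round(v, 3) for v in equalized])
```

Because the mapping depends only on ranks, clean and noisy versions of the same component end up with the same target distribution, which is how the mismatch between conditions is reduced.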
