Results 1 - 20 of 29
1.
Graefes Arch Clin Exp Ophthalmol ; 262(3): 975-982, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37747539

ABSTRACT

PURPOSE: This narrative review aims to provide an overview of the dangers, controversial aspects, and implications of artificial intelligence (AI) use in ophthalmology and other medical-related fields. METHODS: We conducted a decade-long comprehensive search (January 2013-May 2023) of both academic and grey literature, focusing on the application of AI in ophthalmology and healthcare. This search included key web-based academic databases, non-traditional sources, and targeted searches of specific organizations and institutions. We reviewed and selected documents for relevance to AI, healthcare, ethics, and guidelines, aiming for a critical analysis of ethical, moral, and legal implications of AI in healthcare. RESULTS: Six main issues were identified, analyzed, and discussed. These include bias and clinical safety, cybersecurity, health data and AI algorithm ownership, the "black-box" problem, medical liability, and the risk of widening inequality in healthcare. CONCLUSION: Solutions to address these issues include collecting high-quality data of the target population, incorporating stronger security measures, using explainable AI algorithms and ensemble methods, and making AI-based solutions accessible to everyone. With careful oversight and regulation, AI-based systems can be used to supplement physician decision-making and improve patient care and outcomes.


Subject(s)
Artificial Intelligence, Ophthalmology, Humans, Algorithms, Artificial Intelligence/ethics, Databases, Factual, Morals
2.
Sensors (Basel) ; 23(5)2023 Feb 28.
Article in English | MEDLINE | ID: mdl-36904859

ABSTRACT

During flight, unmanned aerial vehicles (UAVs) need several sensors to follow a predefined path and reach a specific destination. To this end, they generally rely on an inertial measurement unit (IMU) for pose estimation. In the UAV context, an IMU usually comprises a three-axis accelerometer and a three-axis gyroscope. However, as with many physical devices, IMUs can exhibit a misalignment between the true value and the recorded one. These systematic or occasional errors can stem from different sources, related either to the sensor itself or to external noise in its operating environment. Hardware calibration requires special equipment, which is not always available; even when it is, it only addresses the physical sources of error and sometimes requires removing the sensor from its mount, which is not always feasible. Mitigating external noise, by contrast, usually calls for software procedures. Moreover, as reported in the literature, even two IMUs of the same brand from the same production chain can produce different measurements under identical conditions. This paper proposes a soft calibration procedure that reduces the misalignment caused by systematic errors and noise, relying on the grayscale or RGB camera built into the drone. The strategy is based on a transformer neural network trained in a supervised fashion on pairs of short videos shot by the UAV's camera and the corresponding IMU measurements, so it requires no special equipment. It is easily reproducible and can be used to increase the trajectory accuracy of the UAV during flight.
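The soft-calibration idea can be sketched numerically. Below, a least-squares affine correction stands in for the paper's transformer model, and all values are synthetic; the sketch only illustrates how a mapping learned from reference data can remove systematic IMU misalignment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth accelerometer samples (stand-in for camera-derived references).
true = rng.normal(size=(500, 3))

# Simulated systematic misalignment: axis cross-coupling plus a constant bias.
M = np.array([[1.02, 0.03, 0.00],
              [0.01, 0.98, 0.02],
              [0.00, 0.01, 1.05]])
bias = np.array([0.10, -0.05, 0.20])
raw = true @ M.T + bias + rng.normal(scale=0.01, size=true.shape)

# "Soft" calibration: fit an affine correction raw -> true by least squares.
X = np.hstack([raw, np.ones((len(raw), 1))])   # extra column for the bias term
W, *_ = np.linalg.lstsq(X, true, rcond=None)   # (4, 3) correction matrix
corrected = X @ W

rmse_before = float(np.sqrt(np.mean((raw - true) ** 2)))
rmse_after = float(np.sqrt(np.mean((corrected - true) ** 2)))
print(rmse_before, rmse_after)   # the learned correction shrinks the error
```

The same pattern, replacing least squares with a sequence model over video/IMU pairs, is what lets the paper avoid special calibration hardware.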

3.
Int J Mol Sci ; 23(16)2022 Aug 15.
Article in English | MEDLINE | ID: mdl-36012423

ABSTRACT

The persistence of long-term coronavirus disease 2019 (COVID-19) sequelae demands better insights into its natural history. It is therefore crucial to discover biomarkers of disease outcome to improve clinical practice. In this study, 160 COVID-19 patients were enrolled, of whom 80 had a "non-severe" and 80 a "severe" outcome. Sera were analyzed by proximity extension assay (PEA) to assess 274 unique proteins associated with inflammation and with cardiometabolic and neurologic diseases. The main clinical and hematochemical data associated with disease outcome were combined with the serological data to form a dataset for supervised machine learning. We identified nine proteins (i.e., CD200R1, MCP1, MCP3, IL6, LTBP2, MATN3, TRANCE, α2-MRAP, and KIT) that contributed to the correct classification of COVID-19 disease severity when combined with relative neutrophil and lymphocyte counts. By analyzing the PEA, clinical, and hematochemical data with statistical methods able to handle many variables with a relatively small sample size, we identified nine potential serum biomarkers of a "severe" outcome, most of which were confirmed by literature data. Importantly, we found three biomarkers associated with central nervous system pathologies and protective factors, which were downregulated in the most severe cases.
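As a toy illustration of the classification setup: feature names, values, and the nearest-centroid model below are all invented stand-ins for the study's actual PEA panel and learners, chosen only to show how a simple classifier copes with a small, 80-per-class cohort:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented stand-ins: four features per patient, e.g. two serum protein levels
# plus relative neutrophil and lymphocyte counts; 80 patients per outcome.
n = 80
severe = rng.normal(loc=[8.0, 6.5, 80.0, 12.0], scale=1.0, size=(n, 4))
mild = rng.normal(loc=[6.0, 5.0, 60.0, 25.0], scale=1.0, size=(n, 4))
X = np.vstack([severe, mild])
y = np.array([1] * n + [0] * n)

# Standardize, then use a nearest-centroid rule -- a deliberately simple
# classifier suited to many variables and a relatively small sample.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
c1, c0 = Z[y == 1].mean(axis=0), Z[y == 0].mean(axis=0)
pred = (np.linalg.norm(Z - c1, axis=1) < np.linalg.norm(Z - c0, axis=1)).astype(int)
accuracy = float((pred == y).mean())
print(accuracy)
```

On real data one would of course use cross-validation rather than the resubstitution accuracy printed here.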


Subject(s)
COVID-19, Proteomics, Biomarkers/blood, COVID-19/diagnosis, Humans, Lymphocyte Count, Machine Learning
4.
Sensors (Basel) ; 20(18)2020 Sep 18.
Article in English | MEDLINE | ID: mdl-32962168

ABSTRACT

Person re-identification is concerned with matching people across disjoint camera views at different places and times. The task is of great interest in computer vision, especially in video surveillance applications where people must be re-identified and tracked across uncontrolled crowded spaces and after long time periods. These aspects are responsible for most of the currently unsolved problems of person re-identification: the presence of many people in a location, as well as the passing of hours or days, gives rise to significant changes in people's visual appearance, for example in clothes, lighting, and occlusions, making person re-identification a very hard task. In this paper, for the first time in the state of the art, a meta-feature-based Long Short-Term Memory (LSTM) hashing model for person re-identification is presented. Starting from 2D skeletons extracted from RGB video streams, the proposed method computes a set of novel meta-features based on movement, gait, and bone proportions. These features are analyzed by a network composed of a single LSTM layer and two dense layers; the first creates a pattern of the person's identity, and the latter two generate a bodyprint hash through binary coding. The effectiveness of the proposed method is tested on three challenging datasets, namely iLIDS-VID, PRID 2011, and MARS. In particular, the reported results show that the proposed method, which is not based on the visual appearance of people, is fully competitive with methods based on visual features. In addition, thanks to its skeleton-model abstraction, the method is a concrete contribution to addressing open problems, such as long-term re-identification and severe illumination changes, which tend to heavily affect the visual appearance of persons.
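The binary-coding step can be sketched as follows; a temporal mean plus one random projection stands in for the paper's LSTM and dense layers, and all dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def bodyprint_hash(meta_features, W):
    """Map per-frame meta-features (movement, gait, bone proportions) to a
    binary hash. A temporal mean plus one random projection stands in for
    the paper's LSTM layer and two dense layers."""
    h = meta_features.mean(axis=0)        # crude temporal pooling
    return (h @ W > 0).astype(np.uint8)   # binary coding by thresholding

W = rng.normal(size=(16, 32))             # 16-dim features -> 32-bit hash

person_a = rng.normal(size=(50, 16))      # 50 frames of one identity
person_b = rng.normal(size=(50, 16))      # a different identity
person_a2 = person_a + rng.normal(scale=0.05, size=person_a.shape)  # same person, new clip

ha, hb, ha2 = (bodyprint_hash(p, W) for p in (person_a, person_b, person_a2))
dist_same = int(np.count_nonzero(ha != ha2))   # Hamming distance, same identity
dist_diff = int(np.count_nonzero(ha != hb))    # Hamming distance, different identity
print(dist_same, dist_diff)
```

Matching by Hamming distance on such hashes is what makes retrieval cheap compared with comparing raw appearance features.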


Subject(s)
Algorithms, Long-Term Memory, Gait, Humans
5.
J Biomed Inform ; 89: 81-100, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30521854

ABSTRACT

Strokes, surgeries, and degenerative diseases can impair motor abilities and balance. Long-term rehabilitation is often the only way to recover these lost skills as completely as possible. To be effective, this type of rehabilitation should follow three main rules. First, rehabilitation exercises should keep the patient's motivation high. Second, each exercise should be customizable to the patient's needs. Third, the patient's performance should be evaluated objectively, i.e., by measuring the patient's movements against an optimal reference model. To meet these requirements, this paper proposes an interactive and low-cost full-body rehabilitation framework for the generation of 3D immersive serious games. The framework combines two Natural User Interfaces (NUIs), for hand and body modeling, respectively, with a Head-Mounted Display (HMD) to provide the patient with an interactive, highly detailed Virtual Environment (VE) for playing stimulating rehabilitation exercises. The paper presents the overall architecture of the framework, including the environment for generating the pilot serious games and the main features of the hand and body models used. The effectiveness of the proposed system is shown on a group of ninety-two patients. In a first stage, a pool of seven rehabilitation therapists evaluated the patients' results on three reference rehabilitation exercises, confirming a significant gradual recovery of the patients' skills. Moreover, the feedback received from the therapists and patients who used the system points to remarkable results in terms of motivation, usability, and customization. In a second stage, comparing the current state of the art in rehabilitation with the proposed system showed that the latter is a concrete contribution in terms of versatility, immersivity, and novelty.
In a final stage, by training a Gated Recurrent Unit Recurrent Neural Network (GRU-RNN) on healthy subjects (i.e., a baseline), we also provide a reference model to objectively evaluate the patients' performance. To estimate the effectiveness of this last aspect of the proposed approach, we used the NTU RGB+D Action Recognition dataset, obtaining results comparable with the current literature in action recognition.


Subject(s)
Exercise Therapy/methods, Rehabilitation/methods, Video Games, Virtual Reality Exposure Therapy/methods, Humans
6.
Int J Neural Syst ; 34(4): 2450019, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38414421

ABSTRACT

Data privacy and security is an essential challenge in clinical settings, where each hospital holds its own sensitive patient data. With recent advances in decentralized machine learning through Federated Learning (FL), each hospital can keep its private data and learning models while collaborating with other trusted participating hospitals. Heterogeneous data and models among different hospitals raise major challenges for robust FL, such as gradient leakage, where participants can exploit model weights to infer data. Here, we propose a robust FL method to efficiently tackle data and model heterogeneity, training the model with knowledge distillation and a novel weighted client confidence score on hematological cytomorphology data in clinical settings. In the knowledge distillation, each participant learns from the others through a weighted confidence score, so that knowledge from clean models is distributed rather than that from clients possessing noisy data. Moreover, we use a symmetric loss to reduce the negative impact of data heterogeneity and label diversity by limiting overfitting to noisy labels. Compared with current approaches, our proposed method performs best, and it is the first demonstration of addressing both data and model heterogeneity in end-to-end FL, laying the foundation for robust FL in laboratories and clinical applications.
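The symmetric loss mentioned in this abstract can be illustrated with a minimal sketch; the weights and exact formulation used in the paper may differ:

```python
import numpy as np

def symmetric_cross_entropy(p_true, p_pred, alpha=0.1, beta=1.0, eps=1e-7):
    """Symmetric cross entropy: standard CE plus reverse CE, a combination
    known to reduce overfitting to noisy labels. alpha/beta are illustrative."""
    p_pred_c = np.clip(p_pred, eps, 1.0)
    p_true_c = np.clip(p_true, eps, 1.0)
    ce = -np.sum(p_true * np.log(p_pred_c), axis=-1)    # H(q, p)
    rce = -np.sum(p_pred * np.log(p_true_c), axis=-1)   # H(p, q), the reverse term
    return alpha * ce + beta * rce

# The reverse term bounds how hard a single (possibly mislabeled) example can
# pull the model, since log of the clipped one-hot label saturates.
label = np.array([[1.0, 0.0]])
good = np.array([[0.9, 0.1]])   # confident, agrees with the label
bad = np.array([[0.1, 0.9]])    # confident, disagrees with the label
loss_good = float(symmetric_cross_entropy(label, good)[0])
loss_bad = float(symmetric_cross_entropy(label, bad)[0])
print(loss_good, loss_bad)
```

Down-weighting the standard CE term (small alpha) is what softens the penalty when labels, rather than predictions, are wrong.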


Subject(s)
Machine Learning, Mental Processes, Humans
7.
Comput Methods Programs Biomed ; 245: 108037, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38271793

ABSTRACT

BACKGROUND: aortic stenosis is a common heart valve disease that mainly affects older people in developed countries. Its early detection is crucial to prevent irreversible disease progression and, eventually, death. A typical screening technique to detect stenosis uses echocardiograms; however, variations introduced by other tissues, camera movements, and uneven lighting can hamper visual inspection, leading to misdiagnosis. To address these issues, effective solutions employ deep learning algorithms to assist clinicians in detecting and classifying stenosis, with models that predict this pathology from single heart views. Although promising, the visual information conveyed by a single image may not be sufficient for an accurate diagnosis, especially in an automatic system, which indicates that different solutions should be explored. METHODOLOGY: following this rationale, this paper proposes a novel deep learning architecture, composed of a multi-view, multi-scale feature extractor and a transformer encoder (MV-MS-FETE), to predict stenosis from parasternal long- and short-axis views. Starting from these views, the designed model extracts relevant features at multiple scales along its feature extractor component and leverages a transformer encoder to perform the final classification. RESULTS: experiments were performed on the recently released Tufts medical echocardiogram public dataset, which comprises 27,788 images split into training, validation, and test sets. Given the recent release of this collection, tests were also conducted on several state-of-the-art models to create multi-view and single-view benchmarks. For all models, standard classification metrics were computed (e.g., precision, F1-score). The results show that the proposed approach outperforms other multi-view methods in accuracy and F1-score and has more stable performance throughout the training procedure.
Furthermore, the experiments highlight that multi-view methods generally perform better than their single-view counterparts. CONCLUSION: this paper introduces a novel multi-view and multi-scale model for aortic stenosis recognition, together with three benchmarks to evaluate it, providing multi-view and single-view comparisons that highlight the model's effectiveness in aiding clinicians while also producing several baselines for the aortic stenosis recognition task.
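The fusion performed by the transformer encoder can be sketched with bare scaled dot-product attention over hypothetical multi-view, multi-scale tokens; the token layout and dimensions below are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(3)

def scaled_dot_product_attention(Q, K, V):
    """Core operation of a transformer encoder layer (single head, no
    learned projections, for illustration)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Hypothetical tokens: features pooled at three scales from each of the
# parasternal long-axis and short-axis views.
d = 32
long_axis = rng.normal(size=(3, d))    # 3 scales, long-axis view
short_axis = rng.normal(size=(3, d))   # 3 scales, short-axis view
tokens = np.vstack([long_axis, short_axis])   # (6, d) joint token sequence

# Self-attention lets every scale of every view attend to all the others;
# mean-pooling the result yields one joint embedding for classification.
fused = scaled_dot_product_attention(tokens, tokens, tokens)
embedding = fused.mean(axis=0)
print(embedding.shape)
```

The point is architectural: once both views are tokenized, a single attention stack handles the cross-view, cross-scale interactions.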


Subject(s)
Aortic Valve Stenosis, Humans, Aged, Pathologic Constriction, Aortic Valve Stenosis/diagnostic imaging, Echocardiography, Heart, Algorithms
8.
Int J Neural Syst ; 34(5): 2450024, 2024 May.
Article in English | MEDLINE | ID: mdl-38533631

ABSTRACT

Emotion recognition plays an essential role in human-human interaction, since it is key to understanding the emotional states and reactions of human beings to the events and engagements of everyday life. Moving toward human-computer interaction, the study of emotions becomes fundamental because it underpins the design of advanced systems supporting a broad spectrum of application areas, including forensics, rehabilitation, education, and many others. An effective method for discriminating emotions is based on ElectroEncephaloGraphy (EEG) data analysis, used as input for classification systems. However, collecting brain signals over several channels and for a wide range of emotions produces cumbersome datasets that are hard to manage, transmit, and use in varied applications. In this context, the paper introduces the Empátheia system, which explores a different EEG representation by encoding EEG signals into images prior to classification. In particular, the proposed system extracts spatio-temporal image encodings, or atlases, from EEG data through the Processing and transfeR of Interaction States and Mappings through Image-based eNcoding (PRISMIN) framework, thus obtaining a compact representation of the input signals. The atlases are then classified through the Empátheia architecture, which comprises branches based on convolutional, recurrent, and transformer models designed and tuned to capture the spatial and temporal aspects of emotions. Extensive experiments were conducted on the Shanghai Jiao Tong University (SJTU) Emotion EEG Dataset (SEED), where the proposed system significantly reduced the data size while retaining high performance. The results highlight the effectiveness of the proposed approach and suggest new avenues for data representation in emotion recognition from EEG signals.
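A minimal sketch of encoding an EEG segment as a compact image follows; this simple channel-by-window averaging is only a stand-in for the PRISMIN encodings, which the paper does not reduce to such a scheme:

```python
import numpy as np

rng = np.random.default_rng(4)

def eeg_to_atlas(eeg, width=64):
    """Encode a (channels, samples) EEG segment as a small grayscale image:
    average consecutive samples into `width` columns, one row per channel,
    then rescale to 8-bit. Illustrative stand-in for a real encoding."""
    cols = np.array_split(eeg, width, axis=1)
    img = np.stack([c.mean(axis=1) for c in cols], axis=1)   # (channels, width)
    img = (img - img.min()) / (img.max() - img.min() + 1e-12)
    return (img * 255).astype(np.uint8)

segment = rng.normal(size=(62, 4000))   # 62 channels, as in SEED recordings
atlas = eeg_to_atlas(segment)
print(atlas.shape, atlas.nbytes, segment.nbytes)   # far smaller than the raw segment
```

The compactness argument is visible directly: the 8-bit atlas occupies a small fraction of the raw float segment's memory.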


Subject(s)
Brain, Emotions, Humans, China, Electroencephalography/methods, Compulsive Behavior
9.
Int J Neural Syst ; 33(2): 2350003, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36585854

ABSTRACT

In view of the United Nations' 2030 vision of a connected world, Cyber Intelligence is becoming the main area of the human dimension capable of shaping geopolitical dynamics. In cyberspace, the new battlefield is the human mind, with new weapons such as the abuse of social media for information manipulation, deception by activists, and misinformation. In this paper, a Sentiment Analysis system with Anomaly Detection (SAAD) capability is proposed. The system, scalable and modular, uses an OSINT deep learning approach to investigate social media sentiment in order to predict suspicious anomalous trends in Twitter posts. Anomaly detection is handled with a new semi-supervised process able to detect potentially dangerous situations in critical areas. The main contributions of the paper are the system's suitability for different areas and domains, the anomaly detection procedure in a sentiment context, and a time-dependent confusion matrix for model evaluation on unbalanced datasets. Real experiments and tests were performed on the Sahel region. The detected anomalies in negative sentiment were checked by experts on the Sahel area, confirming genuine links between the model's results and real situations observable from the tweets.


Subject(s)
Sentiment Analysis, Social Media, Humans, Intelligence, Intention, Attitude
10.
Int J Neural Syst ; 33(8): 2350033, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37246573

ABSTRACT

Swarm Learning (SL) is a promising approach to distributed, collaborative model training without any central server. However, data sensitivity is the main privacy concern when collaborative training requires data sharing. A neural network, especially a Generative Adversarial Network (GAN), can reproduce the original data from model parameters, i.e., the gradient leakage problem. To address this, SL provides a framework for secure aggregation using blockchain methods. In this paper, we consider the scenario of compromised and malicious participants in the SL environment, where a participant can violate the privacy of other participants in collaborative training. We propose Swarm-FHE, Swarm Learning with Fully Homomorphic Encryption (FHE), which encrypts model parameters before sharing them with participants that are registered and authenticated through blockchain technology. Each participant shares its encrypted parameters (i.e., ciphertexts) with the other participants during SL training. We evaluate our method by training convolutional neural networks on the CIFAR-10 and MNIST datasets. Across a considerable number of experiments with different hyperparameter settings, our method performs better than existing methods.


Subject(s)
Computer Security, Neural Networks (Computer), Humans
11.
Int J Neural Syst ; 33(10): 2350052, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37567858

ABSTRACT

Over the years, the humanities community has increasingly called for artificial intelligence frameworks to support the study of cultural heritage. Document layout segmentation, which aims at identifying the different structural components of a document page, is a particularly interesting task in this trend, especially for handwritten texts. While there are many effective approaches to this problem, they all rely on large amounts of data to train the underlying models, which is rarely possible in a real-world scenario: producing ground-truth segmentations with pixel-level precision is very time-consuming and often requires a certain degree of domain knowledge about the documents at hand. For this reason, in this paper we propose an effective few-shot learning framework for document layout segmentation relying on two novel components, namely a dynamic instance generation module and a segmentation refinement module. This approach achieves performance comparable to the current state of the art on the popular Diva-HisDB dataset while relying on just a fraction of the available data.


Subject(s)
Artificial Intelligence, Computer-Assisted Image Processing
12.
Neural Netw ; 153: 386-398, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35785610

ABSTRACT

Improving existing neural network architectures can involve several design choices, such as manipulating the loss functions, employing a diverse learning strategy, exploiting gradient evolution at training time, optimizing the network hyper-parameters, or increasing the architecture depth. The latter approach is a straightforward solution, since it directly enhances the representation capabilities of a network; however, the increased depth generally incurs the well-known vanishing gradient problem. In this paper, borrowing from different methods addressing this issue, we introduce an interlaced multi-task learning strategy, named SIRe, to reduce the vanishing gradient in the object classification task. The presented methodology directly improves a convolutional neural network (CNN) by preserving information from the input image through interlaced auto-encoders (AEs), and further refines the base network architecture by means of skip and residual connections. To validate the presented methodology, a simple CNN and various implementations of well-known networks are extended via the SIRe strategy and extensively tested on five collections, i.e., MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100, and Caltech-256. The SIRe-extended architectures achieve significantly increased performance across all models and datasets, confirming the effectiveness of the presented approach.
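The vanishing-gradient effect that skip and residual connections counteract can be demonstrated on plain linear layers (a simplification of the nonlinear networks the paper actually modifies):

```python
import numpy as np

rng = np.random.default_rng(5)

# Frobenius norm of the backpropagated Jacobian through 30 linear layers,
# with and without the identity (skip/residual) path that SIRe-style
# connections preserve.
depth, d = 30, 16
layers = [rng.normal(scale=0.2 / np.sqrt(d), size=(d, d)) for _ in range(depth)]

grad_plain = np.eye(d)
grad_skip = np.eye(d)
for W in layers:
    grad_plain = grad_plain @ W              # plain chain: gradient shrinks fast
    grad_skip = grad_skip @ (np.eye(d) + W)  # residual: identity path preserved

norm_plain = float(np.linalg.norm(grad_plain))
norm_skip = float(np.linalg.norm(grad_skip))
print(norm_plain, norm_skip)   # plain-chain gradient is orders of magnitude smaller
```

With small layer weights, the plain product of Jacobians decays geometrically with depth, while the identity term in each residual factor keeps the gradient at a usable scale.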


Subject(s)
Learning, Neural Networks (Computer)
13.
Int J Neural Syst ; 32(10): 2250040, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35881015

ABSTRACT

Human feelings expressed through verbal (e.g., voice) and non-verbal communication channels (e.g., face or body) can influence both human actions and interactions. In the literature, most attention has been given to facial expressions for analyzing emotions conveyed through non-verbal behavior. Yet psychology highlights that the body is an important indicator of the human affective state in daily life activities. Therefore, this paper presents a novel method for affective action and interaction recognition from videos, exploiting multi-view representation learning and only full-body handcrafted characteristics selected following psychological and proxemic studies. Specifically, 2D skeletal data are extracted from RGB video sequences to derive diverse low-level skeleton features, i.e., multi-views, modeled through the bag-of-visual-words clustering approach to generate a condition-related codebook. In this way, each affective action and interaction within a video can be represented as a frequency histogram of codewords. During the learning phase, for each affective class, training samples are used to compute a global histogram of codewords, which is stored in a database and later used for the recognition task. In the recognition phase, the video's frequency histogram representation is matched against the database of class histograms and classified as the closest affective class in terms of Euclidean distance. The effectiveness of the proposed system is evaluated on a specifically collected dataset containing 6 emotions for both actions and interactions, on which it obtains 93.64% and 90.83% accuracy, respectively. In addition, the devised strategy achieves performance in line with other literature works based on deep learning when tested on a public collection containing 6 emotions plus a neutral state, demonstrating the effectiveness of the presented approach and confirming the findings of psychological and proxemic studies.
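The codebook-histogram pipeline can be sketched end to end; the features below are synthetic, whereas the real system uses psychology-driven skeleton descriptors:

```python
import numpy as np

rng = np.random.default_rng(6)

def kmeans(X, k, iters=20):
    """Plain k-means to build the codebook of codewords."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def codeword_histogram(descriptors, centers):
    """Represent a video as a normalized frequency histogram of codewords."""
    labels = np.argmin(((descriptors[:, None] - centers) ** 2).sum(-1), axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()

# Illustrative low-level "skeleton features" for two affective classes.
happy_train = rng.normal(loc=0.0, size=(200, 8))
angry_train = rng.normal(loc=2.0, size=(200, 8))
centers = kmeans(np.vstack([happy_train, angry_train]), k=16)

class_hists = {"happy": codeword_histogram(happy_train, centers),
               "angry": codeword_histogram(angry_train, centers)}

# Recognition: match a new video's histogram to the closest class histogram.
query = codeword_histogram(rng.normal(loc=2.0, size=(60, 8)), centers)
pred = min(class_hists, key=lambda c: np.linalg.norm(query - class_hists[c]))
print(pred)
```

The Euclidean matching of histograms mirrors the recognition phase described in the abstract.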


Subject(s)
Algorithms, Facial Expression, Cluster Analysis, Human Activities, Humans, Skeleton
14.
Int J Neural Syst ; 32(7): 2250030, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35730477

ABSTRACT

Image anomaly detection consists in detecting images or image portions that are visually different from the majority of the samples in a dataset. The task is of practical importance for various real-life applications such as biomedical image analysis, visual inspection in industrial production, banking, and traffic management. Most current deep learning approaches rely on image reconstruction: the input image is projected into some latent space and then reconstructed, under the assumption that a network trained mostly on normal data will not be able to reconstruct the anomalous portions. However, this assumption does not always hold. We thus propose a new model based on the Vision Transformer architecture with patch masking: the input image is split into several patches, and each patch is reconstructed only from the surrounding data, thus ignoring the potentially anomalous information contained in the patch itself. We then show that multi-resolution patches and their collective embeddings provide a large improvement in the model's performance compared to the exclusive use of traditional square patches. The proposed model has been tested on popular anomaly detection datasets such as MVTec and head CT, achieving good results compared to other state-of-the-art approaches.
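The patch-masking intuition, reconstructing each patch only from its surroundings, can be illustrated with a mean-of-neighbors predictor standing in for the Vision Transformer:

```python
import numpy as np

rng = np.random.default_rng(7)

def anomaly_map(img, ps=8):
    """Score each patch by how poorly it is predicted from its surroundings.
    The paper masks patches and reconstructs them with a Vision Transformer;
    here the "reconstruction" is just the mean of the neighboring patches."""
    H, W = img.shape[0] // ps, img.shape[1] // ps
    patches = img.reshape(H, ps, W, ps).swapaxes(1, 2)   # (H, W, ps, ps)
    scores = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            neigh = [patches[a, b]
                     for a in range(max(0, i - 1), min(H, i + 2))
                     for b in range(max(0, j - 1), min(W, j + 2))
                     if (a, b) != (i, j)]
            pred = np.mean(neigh, axis=0)                # masked prediction
            scores[i, j] = np.mean((patches[i, j] - pred) ** 2)
    return scores

# Smooth "normal" image with one anomalous bright square.
img = rng.normal(scale=0.05, size=(64, 64))
img[24:32, 24:32] += 3.0                                 # the anomaly
scores = anomaly_map(img)
print(np.unravel_index(scores.argmax(), scores.shape))   # anomalous patch index
```

Because the anomalous patch cannot "explain itself", its reconstruction error stands out even though its pixels never enter its own prediction.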


Subject(s)
Computer-Assisted Image Processing, X-Ray Computed Tomography, Computer-Assisted Image Processing/methods, X-Ray Computed Tomography/methods
15.
Int J Neural Syst ; 32(5): 2250015, 2022 May.
Article in English | MEDLINE | ID: mdl-35209810

ABSTRACT

The increasing availability of wireless access points (APs) is leading toward human sensing applications based on Wi-Fi signals, as support for or alternatives to widespread visual sensors, since these signals make it possible to address well-known vision-related problems such as illumination changes or occlusions. Indeed, using image synthesis techniques to translate radio frequencies into the visible spectrum can become essential to obtain otherwise unavailable visual data. This domain-to-domain translation is feasible because both objects and people affect electromagnetic waves, causing frequency variations in both the radio and optical domains. In the literature, models capable of inferring radio-to-visual feature mappings have gained momentum in the last few years, since frequency changes can be observed in the radio domain through the channel state information (CSI) of Wi-Fi APs, enabling signal-based feature extraction, e.g., of amplitude. Accordingly, this paper presents a novel two-branch generative neural network that effectively maps radio data into visual features, following a teacher-student design that exploits a cross-modality supervision strategy. The latter conditions signal-based features in the visual domain to completely replace visual data. Once trained, the proposed method synthesizes human silhouette and skeleton videos using exclusively Wi-Fi signals. The approach is evaluated on publicly available data, where it obtains remarkable results for both silhouette and skeleton video generation, demonstrating the effectiveness of the proposed cross-modality supervision strategy.


Subject(s)
Radio Waves, Wireless Technology, Humans, Skeleton
16.
Int J Neural Syst ; 31(12): 2150060, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34779358

ABSTRACT

Network intrusion detection is becoming a challenging task as cyberattacks grow more and more sophisticated. Failing to prevent or detect such intrusions can have serious consequences. Machine learning approaches try to recognize network connection patterns to classify unseen and known intrusions, but they also require periodic re-training to keep performance high. In this paper, a novel continuous-learning intrusion detection system, called Soft-Forgetting Self-Organizing Incremental Neural Network (SF-SOINN), is introduced. Besides providing continuous learning capabilities, SF-SOINN performs fast classification, is robust to noise, and achieves good performance with respect to existing approaches. Its main characteristic is the ability to remove nodes from the neural network based on their utility estimate. SF-SOINN has been validated on the well-known NSL-KDD and CIC-IDS-2017 intrusion detection datasets, as well as on some artificial data, to show its classification capability on more general tasks.
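The utility-based node removal at the heart of SF-SOINN can be sketched as follows; the actual SF-SOINN update rules (edges, thresholds, insertion) are richer than this decay-and-prune loop:

```python
import numpy as np

rng = np.random.default_rng(8)

# "Soft forgetting" sketch: prototype nodes carry a utility estimate that
# decays at every step and is refreshed when the node wins; nodes whose
# utility falls below a threshold are removed from the network.
nodes = rng.normal(size=(20, 2))        # initial prototypes, old distribution
utility = np.ones(len(nodes))
decay, threshold = 0.95, 0.3

stream = rng.normal(loc=[3.0, 3.0], size=(200, 2))   # drifted data stream
for x in stream:
    winner = np.argmin(np.linalg.norm(nodes - x, axis=1))
    nodes[winner] += 0.1 * (x - nodes[winner])   # move the winner toward input
    utility *= decay                             # forget...
    utility[winner] = 1.0                        # ...unless recently useful
    keep = utility > threshold
    nodes, utility = nodes[keep], utility[keep]

print(len(nodes))   # stale prototypes have been pruned away
```

Pruning by utility is what keeps an incremental network compact when the input distribution drifts, as it does with evolving attack traffic.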


Subject(s)
Machine Learning, Neural Networks (Computer), Continuing Education
17.
IEEE Trans Cybern ; 51(5): 2587-2600, 2021 May.
Article in English | MEDLINE | ID: mdl-31021784

ABSTRACT

This paper discusses the problem of tracking a moving target by means of a cluster of mobile agents that is able to sense the acoustic emissions of the target, with the aim of improving the target localization and tracking performance with respect to conventional fixed-array acoustic localization. We handle the acoustic part of the problem by modeling the cluster as a sensor network, and we propose a centralized control strategy for the agents that exploits the spatial sensitivity pattern of the sensor network to estimate the best possible cluster configuration with respect to the expected target position. In order to take into account the position estimation delay due to the frame-based nature of the processing, the possible positions of the acoustic target in a given future time interval are represented in terms of a compatible set, that is, the set of all possible future positions of the target, given its dynamics and its present state. A frame-by-frame cluster reconfiguration algorithm is presented, which adapts the position of each sensing agent with the goal of pursuing the maximum overlap between the region of high acoustic sensitivity of the entire cluster and the compatible set of the sound-emitting target. The tracking scheme iterates, at each observation frame, the computation of the target compatible set, the reconfiguration of the cluster, and the target acoustic localization. The reconfiguration step makes use of an opportune cost function proportional to the difference of the compatibility set and the acoustic sensitivity spatial pattern determined by the mobile agent positions. Simulations under different geometric configurations and positioning constraints demonstrate the ability of the proposed approach to effectively localize and track a moving target based on its acoustic emission. The Doppler effect related to moving sources and sensors is taken into account, and its impact on performance is analyzed. 
We compare the localization results with conventional static-array localization and with positioning of acoustic sensors through genetic algorithm optimization, and the results demonstrate appreciable improvements in localization and tracking performance. Although the method is discussed here for acoustic target tracking, it can be effectively adapted to video-based localization and tracking, or to multimodal settings (e.g., audio and video).

18.
Int J Neural Syst ; 31(2): 2050068, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33200620

ABSTRACT

Deception detection is a relevant ability in high-stakes situations such as police interrogations or court trials, where the outcome is highly influenced by the interviewed person's behavior. With specific devices, e.g., a polygraph or magnetic resonance imaging, the subject is aware of being monitored and can change their behavior, compromising the interrogation result. For this reason, video analysis-based methods for automatic deception detection are receiving ever-increasing interest. In this paper, a deception detection approach based on RGB videos, leveraging both facial features and a stacked generalization ensemble, is proposed. First, the face, well known to present several meaningful cues for deception detection, is identified, aligned, and masked to build video signatures. These signatures are constructed from five different descriptors, which allow the system to capture both static and dynamic facial characteristics. Then, the video signatures are given as input to four base-level algorithms, which are subsequently fused by stacked generalization, resulting in a more robust meta-level classifier used to predict deception. By exploiting relevant cues via specific features, the proposed system achieves improved performance on a public dataset of famous court trials with respect to other state-of-the-art methods based on facial features, highlighting the effectiveness of the proposed method.
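Stacked generalization in miniature: base learners are trained on one half of the data, and their predictions on the held-out half become the features of a meta-level model. All data and learners below are toy stand-ins for the paper's descriptors and four base algorithms:

```python
import numpy as np

rng = np.random.default_rng(9)

n = 400
X = rng.normal(size=(n, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)   # synthetic binary labels

def centroid_score(Xtr, ytr, Xte, cols):
    """Base learner: signed distance between class centroids on some columns."""
    c1 = Xtr[ytr == 1][:, cols].mean(axis=0)
    c0 = Xtr[ytr == 0][:, cols].mean(axis=0)
    return (np.linalg.norm(Xte[:, cols] - c0, axis=1)
            - np.linalg.norm(Xte[:, cols] - c1, axis=1))

half = n // 2
Xa, ya, Xb, yb = X[:half], y[:half], X[half:], y[half:]

# Level 0: two base learners on different feature subsets; their held-out
# scores on the second half form the meta-level training set.
meta_X = np.column_stack([centroid_score(Xa, ya, Xb, [0, 1]),   # informative
                          centroid_score(Xa, ya, Xb, [2, 3])])  # uninformative

# Level 1: a least-squares meta-model learns how much to trust each learner.
A = np.column_stack([meta_X, np.ones(len(meta_X))])
w, *_ = np.linalg.lstsq(A, yb, rcond=None)
pred = ((A @ w) > 0.5).astype(float)
accuracy = float((pred == yb).mean())
print(accuracy, w[:2])   # the informative base learner earns the larger weight
```

The meta-level fusion is what makes the ensemble robust: weak or noisy base learners are automatically down-weighted rather than averaged in blindly.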


Asunto(s)
Señales (Psicología) , Decepción , Algoritmos , Humanos
19.
Int J Neural Syst ; 30(10): 2050060, 2020 Oct.
Artículo en Inglés | MEDLINE | ID: mdl-32938260

RESUMEN

Image anomaly detection is an application-driven problem whose aim is to identify novel samples that differ significantly from normal ones. We propose the Pyramidal Image Anomaly DEtector (PIADE), a deep reconstruction-based pyramidal approach in which image features are extracted at different scale levels to better capture the peculiarities that help discriminate between normal and anomalous data. The features are dynamically routed to a reconstruction layer, and anomalies can be identified by comparing the input image with its reconstruction. Unlike similar approaches, the comparison relies on structural similarity and perceptual loss rather than a simple pixel-by-pixel comparison. The proposed method performed on par with or better than state-of-the-art methods when tested on publicly available datasets such as CIFAR-10, COIL-100, and MVTec.
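The structural-similarity comparison mentioned above can be sketched as follows: a minimal single-window SSIM over flattened grayscale images, turned into an anomaly score by comparing an input with its reconstruction. A full SSIM uses local sliding windows, and the paper additionally uses a perceptual loss, both omitted here; the constants follow the common SSIM defaults (K1=0.01, K2=0.03, dynamic range 1).

```python
# Single-window SSIM between two equal-size grayscale images,
# flattened to lists of pixel values in [0, 1].
def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n                      # means
    vx = sum((a - mx) ** 2 for a in x) / n               # variances
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return (((2 * mx * my + c1) * (2 * cov + c2))
            / ((mx * mx + my * my + c1) * (vx + vy + c2)))

# Anomaly score: structural dissimilarity between an input image
# and its reconstruction; a faithful reconstruction scores near 0.
def anomaly_score(image, reconstruction):
    return 1.0 - ssim(image, reconstruction)
```

Unlike a pixel-by-pixel loss, SSIM compares local statistics (luminance, contrast, covariance), so small shifts or smooth intensity changes are penalized less than genuine structural deviations.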


Asunto(s)
Aprendizaje Profundo , Interpretación de Imagen Asistida por Computador , Procesamiento de Imagen Asistido por Computador , Aprendizaje Automático Supervisado , Humanos
20.
Neural Netw ; 124: 20-38, 2020 Apr.
Artículo en Inglés | MEDLINE | ID: mdl-31962232

RESUMEN

Classification of high-dimensional data suffers from the curse of dimensionality and over-fitting. The neural tree is a powerful method that combines local feature selection and recursive partitioning to address these problems, but it leads to very deep trees when classifying high-dimensional data. On the other hand, if shallower trees are used, classification accuracy decreases or over-fitting increases. This paper introduces a novel Neural Tree exploiting Expert Nodes (NTEN) to classify high-dimensional data. It is based on a decision-tree structure whose internal nodes are expert nodes performing multi-dimensional splitting. Each expert node has three decision-making abilities: first, it can select the most eligible neural network with respect to the data complexity; second, it can evaluate over-fitting; third, it can cluster the features to jointly minimize redundancy and overlap. To this aim, metaheuristic optimization algorithms including GA, NSGA-II, PSO, and ACO are applied. Based on these concepts, each expert node splits a class when over-fitting is low and clusters the features when over-fitting is high. Some theoretical results on NTEN are derived, and experiments on 35 standard datasets show that NTEN achieves good classification results and reduces tree depth without over-fitting or degrading accuracy.
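The expert-node decision rule described above (split a class when over-fitting is low, cluster features when it is high) can be sketched as follows. The gap-based over-fitting estimate, the threshold value, and the greedy correlation-based clustering are illustrative assumptions; the paper instead applies metaheuristics such as GA, NSGA-II, PSO, and ACO for the clustering step.

```python
from math import sqrt

OVERFIT_THRESHOLD = 0.1  # hypothetical tolerance on the train/validation gap

def expert_node_action(train_acc, val_acc):
    # Split when the estimated over-fitting (train/validation accuracy
    # gap) is low; otherwise fall back to feature clustering.
    if train_acc - val_acc <= OVERFIT_THRESHOLD:
        return "split"
    return "cluster_features"

def correlation(a, b):
    # Pearson correlation between two feature columns
    # (assumes non-constant columns).
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

def cluster_features(columns, threshold=0.9):
    # Greedy grouping of highly correlated feature columns to reduce
    # redundancy; each new column joins the first cluster whose
    # representative it correlates with, else starts a new cluster.
    clusters = []
    for col in columns:
        for cluster in clusters:
            if abs(correlation(col, cluster[0])) >= threshold:
                cluster.append(col)
                break
        else:
            clusters.append([col])
    return clusters
```

In this sketch, perfectly correlated columns collapse into one cluster while weakly correlated ones stay separate, mimicking the redundancy reduction an expert node performs before further splitting.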


Asunto(s)
Redes Neurales de la Computación , Manejo de Datos/métodos , Árboles de Decisión