Results 1 - 4 of 4




1.
Comput Methods Programs Biomed ; 245: 108037, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38271793

ABSTRACT

BACKGROUND: Aortic stenosis is a common heart valve disease that mainly affects older people in developed countries. Its early detection is crucial to prevent irreversible disease progression and, eventually, death. A typical screening technique for detecting stenosis uses echocardiograms; however, variations introduced by other tissues, camera movements, and uneven lighting can hamper visual inspection, leading to misdiagnosis. To address these issues, effective solutions employ deep learning algorithms to assist clinicians in detecting and classifying stenosis, developing models that can predict this pathology from single heart views. Although promising, the visual information conveyed by a single image may not be sufficient for an accurate diagnosis, especially when using an automatic system, indicating that different solutions should be explored. METHODOLOGY: Following this rationale, this paper proposes a novel deep learning architecture, composed of a multi-view, multi-scale feature extractor and a transformer encoder (MV-MS-FETE), to predict stenosis from parasternal long- and short-axis views. In particular, starting from these views, the designed model extracts relevant features at multiple scales along its feature extractor component and takes advantage of a transformer encoder to perform the final classification. RESULTS: Experiments were performed on the recently released Tufts medical echocardiogram public dataset, which comprises 27,788 images split into training, validation, and test sets. Due to the recent release of this collection, tests were also conducted on several state-of-the-art models to create multi-view and single-view benchmarks. Standard classification metrics (e.g., precision, F1-score) were computed for all models. The obtained results show that the proposed approach outperforms other multi-view methods in terms of accuracy and F1-score and has more stable performance throughout the training procedure. Furthermore, the experiments also highlight that multi-view methods generally perform better than their single-view counterparts. CONCLUSION: This paper introduces a novel multi-view, multi-scale model for aortic stenosis recognition, together with three benchmarks to evaluate it, providing multi-view and single-view comparisons that highlight the model's effectiveness in aiding clinicians' diagnoses while also producing several baselines for the aortic stenosis recognition task.
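The multi-view, multi-scale idea described above can be illustrated with a toy sketch (this is not the authors' MV-MS-FETE implementation): each view is pooled at several scales, the per-view features are concatenated, and a stand-in linear head replaces the transformer encoder. All array sizes, the pooling scales, and the linear classifier are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(42)

def multi_scale_features(image, scales=(1, 2, 4)):
    """Average-pool a 2D image at several scales and flatten the results,
    a toy stand-in for a multi-scale feature extractor."""
    feats = []
    for s in scales:
        h, w = image.shape[0] // s, image.shape[1] // s
        pooled = image[:h * s, :w * s].reshape(h, s, w, s).mean(axis=(1, 3))
        feats.append(pooled.ravel())
    return np.concatenate(feats)

# Two echocardiogram views (toy 8x8 "images"); features from both views are
# concatenated before classification -- the essence of a multi-view model.
long_axis = rng.random((8, 8))
short_axis = rng.random((8, 8))
fused = np.concatenate([multi_scale_features(long_axis),
                        multi_scale_features(short_axis)])

# Toy linear classifier head standing in for the transformer encoder;
# 2 classes: stenosis / no stenosis.
weights = rng.normal(size=(2, fused.size))
logits = weights @ fused
print("predicted class:", int(np.argmax(logits)))
```

Each 8x8 view contributes 64 + 16 + 4 = 84 features across the three scales, so the fused vector has 168 entries before classification.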


Subject(s)
Aortic Valve Stenosis, Humans, Aged, Pathologic Constriction, Aortic Valve Stenosis/diagnostic imaging, Echocardiography, Heart, Algorithms
2.
Int J Neural Syst ; 32(10): 2250040, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35881015

ABSTRACT

Human feelings expressed through verbal (e.g., voice) and non-verbal communication channels (e.g., face or body) can influence both human actions and interactions. In the literature, most attention has been given to facial expressions in the analysis of emotions conveyed through non-verbal behaviors. Despite this, psychology highlights that the body is an important indicator of the human affective state when performing daily life activities. Therefore, this paper presents a novel method for affective action and interaction recognition from videos, exploiting multi-view representation learning and only full-body handcrafted characteristics selected following psychological and proxemic studies. Specifically, 2D skeletal data are extracted from RGB video sequences to derive diverse low-level skeleton features, i.e., multi-views, modeled through the bag-of-visual-words clustering approach, which generates a condition-related codebook. In this way, each affective action and interaction within a video can be represented as a frequency histogram of codewords. During the learning phase, for each affective class, training samples are used to compute its global histogram of codewords, which is stored in a database and later used for the recognition task. In the recognition phase, the video frequency histogram representation is matched against the database of class histograms and classified as the closest affective class in terms of Euclidean distance. The effectiveness of the proposed system is evaluated on a specifically collected dataset containing six emotions for both actions and interactions, on which the proposed system obtains 93.64% and 90.83% accuracy, respectively. In addition, when tested on a public collection containing six emotions plus a neutral state, the devised strategy achieves performance in line with other deep learning-based works in the literature, demonstrating the effectiveness of the presented approach and confirming the findings of psychological and proxemic studies.
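The learning and recognition phases described above can be sketched as a toy bag-of-visual-words pipeline. The k-means codebook, the two hypothetical class labels, and the synthetic 2D features below are all illustrative assumptions, not the paper's actual skeleton descriptors.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(points, k, iters=20):
    """Plain k-means to build the visual codebook from low-level features."""
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(np.linalg.norm(points[:, None] - centroids[None], axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids

def histogram(features, codebook):
    """Represent a video as a normalized frequency histogram of codewords."""
    words = np.argmin(np.linalg.norm(features[:, None] - codebook[None], axis=2), axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# Toy data: two affective classes with well-separated feature clusters.
class_a = rng.normal(0.0, 0.3, size=(200, 2))
class_b = rng.normal(3.0, 0.3, size=(200, 2))
codebook = kmeans(np.vstack([class_a, class_b]), k=4)

# Learning phase: one global histogram of codewords per affective class.
class_hists = {"happy": histogram(class_a, codebook),
               "angry": histogram(class_b, codebook)}

# Recognition phase: classify a new video by the nearest class histogram
# in terms of Euclidean distance.
query = histogram(rng.normal(0.0, 0.3, size=(50, 2)), codebook)
pred = min(class_hists, key=lambda c: np.linalg.norm(query - class_hists[c]))
print(pred)
```

The query video's features lie near the first class's cluster, so its codeword histogram matches that class's global histogram most closely.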


Subject(s)
Algorithms, Facial Expression, Cluster Analysis, Human Activities, Humans, Skeleton
3.
Int J Neural Syst ; 32(5): 2250015, 2022 May.
Article in English | MEDLINE | ID: mdl-35209810

ABSTRACT

The increasing availability of wireless access points (APs) is leading toward human sensing applications based on Wi-Fi signals as support or alternative tools to widespread visual sensors, since these signals make it possible to address well-known vision-related problems such as illumination changes or occlusions. Indeed, using image synthesis techniques to translate radio frequencies into the visible spectrum can become essential for obtaining otherwise unavailable visual data. This domain-to-domain translation is feasible because both objects and people affect electromagnetic waves, causing variations in radio and optical frequencies. In the literature, models capable of inferring radio-to-visual feature mappings have gained momentum in the last few years, since frequency changes can be observed in the radio domain through the channel state information (CSI) of Wi-Fi APs, enabling signal-based feature extraction, e.g., of amplitude. On this account, this paper presents a novel two-branch generative neural network that effectively maps radio data into visual features, following a teacher-student design that exploits a cross-modality supervision strategy. The latter conditions signal-based features in the visual domain so that they can completely replace visual data. Once trained, the proposed method synthesizes human silhouette and skeleton videos using exclusively Wi-Fi signals. The approach is evaluated on publicly available data, where it obtains remarkable results for both silhouette and skeleton video generation, demonstrating the effectiveness of the proposed cross-modality supervision strategy.
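The cross-modality supervision idea, a student branch regressing teacher-provided visual features from CSI amplitudes, can be sketched in a heavily simplified linear form. The dimensions, the linear student, and the synthetic data below are assumptions for illustration only; the paper's model is a two-branch generative network, not a linear map.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: the "teacher" provides visual features extracted from silhouettes;
# the "student" must predict those same features from Wi-Fi CSI amplitude alone.
n, csi_dim, feat_dim = 256, 30, 8
true_map = rng.normal(size=(csi_dim, feat_dim))
csi = rng.normal(size=(n, csi_dim))        # CSI amplitude samples
teacher_feats = csi @ true_map             # stand-in for visual features

# Student: a linear map trained by gradient descent on the cross-modality
# supervision loss ||student(csi) - teacher_feats||^2.
W = np.zeros((csi_dim, feat_dim))
lr = 0.1
for _ in range(300):
    pred = csi @ W
    grad = csi.T @ (pred - teacher_feats) / n
    W -= lr * grad

mse = float(np.mean((csi @ W - teacher_feats) ** 2))
print(f"final MSE: {mse:.6f}")
```

Once the supervision loss is driven near zero, the student can produce the visual features from radio data alone, which is the sense in which the signal-based branch "replaces" visual input at inference time.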


Subject(s)
Radio Waves, Wireless Technology, Humans, Skeleton
4.
Int J Neural Syst ; 31(2): 2050068, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33200620

ABSTRACT

Deception detection is a relevant ability in high-stakes situations such as police interrogations or court trials, where the outcome is highly influenced by the interviewee's behavior. With specific devices, e.g., a polygraph or magnetic resonance imaging, the subject is aware of being monitored and can alter their behavior, thus compromising the interrogation result. For this reason, video analysis-based methods for automatic deception detection are receiving ever-increasing interest. In this paper, a deception detection approach based on RGB videos, leveraging both facial features and a stacked generalization ensemble, is proposed. First, the face, which is well known to present several meaningful cues for deception detection, is identified, aligned, and masked to build video signatures. These signatures are constructed from five different descriptors, which allow the system to capture both static and dynamic facial characteristics. Then, the video signatures are given as input to four base-level algorithms, which are subsequently fused by applying the stacked generalization technique, resulting in a more robust meta-level classifier used to predict deception. By exploiting relevant cues via specific features, the proposed system achieves improved performance on a public dataset of famous court trials compared with other state-of-the-art methods based on facial features, highlighting the effectiveness of the proposed method.
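Stacked generalization itself, base-level predictions fed to a meta-level combiner, can be sketched on toy data. The single-feature threshold "classifiers" and the least-squares meta-learner below are illustrative stand-ins, not the paper's four base algorithms or its facial descriptors.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy binary task standing in for deception detection from facial descriptors.
n, d = 400, 5
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

def train_threshold(feature):
    """Base-level 'classifier': threshold one descriptor at its class-wise midpoint."""
    return (feature[y == 0].mean() + feature[y == 1].mean()) / 2

# Base level: one weak classifier per descriptor, each emitting a 0/1 prediction.
thresholds = [train_threshold(X[:, j]) for j in range(d)]
base_preds = np.stack([(X[:, j] > thresholds[j]).astype(float) for j in range(d)],
                      axis=1)

# Meta level: a least-squares combiner over the base predictions
# (the stacked generalization step).
design = np.hstack([base_preds, np.ones((n, 1))])
w, *_ = np.linalg.lstsq(design, y, rcond=None)
meta_pred = (design @ w > 0.5).astype(int)

acc = float((meta_pred == y).mean())
print(f"stacked accuracy: {acc:.2f}")
```

The meta-level combiner learns to weight the informative base classifiers (those thresholding the first two descriptors) and ignore the noise ones, which is the robustness argument behind stacking. In practice the meta-learner is trained on held-out base predictions to avoid overfitting, a detail omitted from this sketch.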


Subject(s)
Cues, Deception, Algorithms, Humans