Results 1 - 7 of 7
1.
Ann Clin Transl Neurol ; 10(8): 1314-1325, 2023 08.
Article in English | MEDLINE | ID: mdl-37292032

ABSTRACT

OBJECTIVE: Myasthenia gravis (MG) is an autoimmune disease leading to fatigable muscle weakness. Extra-ocular and bulbar muscles are most commonly affected. We aimed to investigate whether facial weakness can be quantified automatically and used for diagnosis and disease monitoring.

METHODS: In this cross-sectional study, we analyzed video recordings of 70 MG patients and 69 healthy controls (HC) with two different methods. Facial weakness was first quantified with facial expression recognition software. Subsequently, a deep learning (DL) model was trained for the classification of diagnosis and disease severity using multiple cross-validations on videos of 50 patients and 50 controls. Results were validated using unseen videos of 20 MG patients and 19 HC.

RESULTS: Expression of anger (p = 0.026), fear (p = 0.003), and happiness (p < 0.001) was significantly decreased in MG compared to HC. Specific patterns of decreased facial movement were detectable in each emotion. For diagnosis, the DL model achieved an area under the receiver operating characteristic curve (AUC) of 0.75 (95% CI 0.65-0.85), sensitivity 0.76, specificity 0.76, and accuracy 76%; for disease severity, AUC 0.75 (95% CI 0.60-0.90), sensitivity 0.93, specificity 0.63, and accuracy 80%. On the unseen validation videos, diagnosis: AUC 0.82 (95% CI 0.67-0.97), sensitivity 1.0, specificity 0.74, and accuracy 87%; disease severity: AUC 0.88 (95% CI 0.67-1.0), sensitivity 1.0, specificity 0.86, and accuracy 94%.

INTERPRETATION: Patterns of facial weakness can be detected with facial expression recognition software. Furthermore, this study delivers a proof of concept for a DL model that can distinguish MG from HC and classify disease severity.


Subject(s)
Deep Learning; Facial Paralysis; Facial Recognition; Myasthenia Gravis; Humans; Cross-Sectional Studies; Myasthenia Gravis/complications; Myasthenia Gravis/diagnosis; Software
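For readers who want to reproduce the kind of performance figures reported in the abstract above, the following is a minimal sketch (toy labels and scores, not the study's data or model) of how AUC, sensitivity, specificity, and accuracy are computed from binary classifier outputs with scikit-learn; the 0.5 decision threshold is an assumption for illustration.

```python
# Hypothetical sketch: computing AUC, sensitivity, specificity, and accuracy
# from binary classifier scores (1 = MG, 0 = healthy control). Toy data only.
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])                      # toy labels
y_score = np.array([0.9, 0.7, 0.4, 0.2, 0.6, 0.3, 0.8, 0.55])     # toy model probabilities

auc = roc_auc_score(y_true, y_score)
y_pred = (y_score >= 0.5).astype(int)                              # illustrative threshold
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"AUC={auc:.2f} sens={sensitivity:.2f} spec={specificity:.2f} acc={accuracy:.2f}")
```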
2.
IEEE Trans Image Process ; 30: 8342-8353, 2021.
Article in English | MEDLINE | ID: mdl-34587011

ABSTRACT

Resolution in deep convolutional neural networks (CNNs) is typically bounded by the receptive field size through filter sizes, and subsampling layers or strided convolutions on feature maps. The optimal resolution may vary significantly depending on the dataset. Modern CNNs hard-code their resolution hyper-parameters in the network architecture, which makes tuning such hyper-parameters cumbersome. We propose to do away with hard-coded resolution hyper-parameters and aim to learn the appropriate resolution from data. We use scale-space theory to obtain a self-similar parametrization of filters and make use of the N-Jet: a truncated Taylor series to approximate a filter by a learned combination of Gaussian derivative filters. The parameter σ of the Gaussian basis controls both the amount of detail the filter encodes and the spatial extent of the filter. Since σ is a continuous parameter, we can optimize it with respect to the loss. The proposed N-Jet layer achieves comparable performance when used in state-of-the-art architectures, while learning the correct resolution in each layer automatically. We evaluate our N-Jet layer on both classification and segmentation, and we show that learning σ is especially beneficial when dealing with inputs at multiple sizes.
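As a rough illustration of the idea (my own sketch, not the authors' implementation), the code below builds a single 2-D filter as a learned combination of Gaussian derivative basis filters with a learnable σ in PyTorch; the kernel size, derivative order, and initialization are assumptions made for this example.

```python
# Illustrative sketch: a filter expressed as a learned combination of Gaussian
# derivative basis filters, with sigma as a continuous learnable parameter.
import torch
import torch.nn.functional as F

class NJetFilter(torch.nn.Module):
    def __init__(self, order=2, kernel_size=7):
        super().__init__()
        self.log_sigma = torch.nn.Parameter(torch.tensor(0.0))   # sigma = exp(log_sigma), optimized by the loss
        n_basis = (order + 1) * (order + 2) // 2                 # derivatives up to the given order (N-Jet truncation)
        self.coeffs = torch.nn.Parameter(torch.randn(n_basis) * 0.1)
        self.order = order
        self.kernel_size = kernel_size

    def basis(self):
        sigma = self.log_sigma.exp()
        k = self.kernel_size // 2
        x = torch.arange(-k, k + 1, dtype=torch.float32)
        g = torch.exp(-x ** 2 / (2 * sigma ** 2))
        g = g / g.sum()
        gx = -x / sigma ** 2 * g                                  # 1st Gaussian derivative
        gxx = (x ** 2 / sigma ** 4 - 1 / sigma ** 2) * g          # 2nd Gaussian derivative
        d = [g, gx, gxx]
        filters = []
        for i in range(self.order + 1):                           # outer products give 2-D derivative filters
            for j in range(self.order + 1 - i):
                filters.append(torch.outer(d[j], d[i]))
        return torch.stack(filters)                               # (n_basis, k, k)

    def forward(self, x):                                         # x: (N, 1, H, W)
        kernel = (self.coeffs[:, None, None] * self.basis()).sum(0)
        return F.conv2d(x, kernel[None, None], padding=self.kernel_size // 2)

# usage: out = NJetFilter()(torch.randn(1, 1, 32, 32))
```

Because σ enters the basis analytically, gradients flow to it through the convolution, which is what allows the resolution to be learned rather than hard-coded.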

3.
Article in English | MEDLINE | ID: mdl-30307867

ABSTRACT

We propose a general object counting method that does not use any prior category information. We learn from local image divisions to predict global image-level counts without using any form of local annotations. Our method separates the input image into a set of image divisions, each fully covering the image. Each image division is composed of a set of region proposals or uniform grid cells. Our approach learns, in an end-to-end deep learning architecture, to predict global image-level counts from local image divisions. The method incorporates a counting layer which predicts object counts in the complete image by enforcing consistency in counts when dealing with overlapping image regions. Our counting layer is based on the inclusion-exclusion principle from set theory. We analyze the individual building blocks of our proposed approach on Pascal VOC2007 and evaluate our method on the MS-COCO large-scale generic object dataset as well as on three class-specific counting datasets: the UCSD pedestrian dataset, and the CARPK and PUCPR+ car datasets.
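The inclusion-exclusion idea behind the counting layer can be made concrete with a toy consistency term; the sketch below is my own illustration with made-up counts, not the paper's architecture.

```python
# Toy sketch: penalizing inconsistency between predicted counts of overlapping
# regions using the inclusion-exclusion principle |A ∪ B| = |A| + |B| - |A ∩ B|.
import numpy as np

def consistency_loss(count_a, count_b, count_intersection, count_union):
    predicted_union = count_a + count_b - count_intersection
    return (predicted_union - count_union) ** 2

# toy predicted counts for two overlapping image divisions
print(consistency_loss(count_a=3.2, count_b=2.9, count_intersection=1.1, count_union=5.0))
```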

4.
IEEE Trans Image Process ; 26(8): 3965-3980, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28541898

ABSTRACT

This paper focuses on fine-grained object classification using recognized scene text in natural images. While the state of the art relies on visual cues only, this paper is the first work to propose combining textual and visual cues. Another novelty is the textual cue extraction. Unlike state-of-the-art text detection methods, we focus more on the background instead of text regions. Once text regions are detected, they are further processed by two methods to perform text recognition, i.e., the ABBYY commercial OCR engine and a state-of-the-art character recognition algorithm. Then, to perform textual cue encoding, bi- and trigrams are formed between the recognized characters by considering the proposed spatial pairwise constraints. Finally, extracted visual and textual cues are combined for fine-grained classification. The proposed method is validated on four publicly available data sets: ICDAR03, ICDAR13, Con-Text, and Flickr-logo. We improve the state-of-the-art end-to-end character recognition by a large margin of 15% on ICDAR03. We show that textual cues are useful in addition to visual cues for fine-grained classification, and that textual cues are also useful for logo retrieval. Adding textual cues outperforms visual-only and textual-only approaches in fine-grained classification (70.7% vs. 60.3%) and logo retrieval (57.4% vs. 54.8%).
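To make the spatially constrained n-gram encoding concrete, here is a hedged sketch that forms character bigrams only between spatially close characters; the character boxes and distance threshold are invented for illustration and do not reproduce the paper's exact constraints.

```python
# Illustrative sketch: character bigrams formed under a simple spatial
# pairwise constraint on the recognized characters' positions (toy data).
import itertools

chars = [('c', (10, 20)), ('o', (18, 20)), ('l', (26, 21)), ('a', (34, 20))]  # (char, box center)

def bigrams(chars, max_dist=12):
    pairs = []
    for (c1, (x1, y1)), (c2, (x2, y2)) in itertools.combinations(chars, 2):
        if abs(x1 - x2) <= max_dist and abs(y1 - y2) <= max_dist:  # keep only spatially close pairs
            pairs.append(c1 + c2)
    return pairs

print(bigrams(chars))   # ['co', 'ol', 'la']
```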

5.
IEEE Trans Image Process ; 23(4): 1569-80, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24577192

ABSTRACT

This paper considers the recognition of realistic human actions in videos based on spatio-temporal interest points (STIPs). Existing STIP-based action recognition approaches operate on intensity representations of the image data. Because of this, these approaches are sensitive to disturbing photometric phenomena, such as shadows and highlights. In addition, valuable information is neglected by discarding chromaticity from the photometric representation. These issues are addressed by color STIPs. Color STIPs are multichannel reformulations of STIP detectors and descriptors, for which we consider a number of chromatic and invariant representations derived from the opponent color space. Color STIPs are shown to outperform their intensity-based counterparts on the challenging UCF sports, UCF11 and UCF50 action recognition benchmarks by more than 5% on average, where most of the gain is due to the multichannel descriptors. In addition, the results show that color STIPs are currently the single best low-level feature choice for STIP-based approaches to human action recognition.


Subject(s)
Image Processing, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Sports/classification; Algorithms; Color; Humans; Signal Processing, Computer-Assisted; Spatio-Temporal Analysis; Video Recording
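The opponent color space from which the chromatic and invariant multichannel representations above are derived can be computed as follows; this is the standard RGB-to-opponent conversion, shown as an illustrative sketch rather than the paper's exact pipeline.

```python
# Illustrative sketch: converting an RGB image to the opponent color space.
import numpy as np

def rgb_to_opponent(img):
    """img: float array of shape (H, W, 3) with R, G, B channels."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    o1 = (r - g) / np.sqrt(2)            # red-green opponent channel
    o2 = (r + g - 2 * b) / np.sqrt(6)    # yellow-blue opponent channel
    o3 = (r + g + b) / np.sqrt(3)        # intensity channel
    return np.stack([o1, o2, o3], axis=-1)

print(rgb_to_opponent(np.random.rand(4, 4, 3)).shape)   # (4, 4, 3)
```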
6.
IEEE Trans Image Process ; 23(12): 5698-706, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25373082

ABSTRACT

Many computer vision applications, including image classification, matching, and retrieval, use global image representations, such as the Fisher vector, to encode a set of local image patches. To describe these patches, many local descriptors have been designed to be robust against lighting changes and noise. However, local image descriptors are unstable when the underlying image signal is low. Such low-signal patches are sensitive to small image perturbations, which may arise, for example, from camera noise or lighting effects. In this paper, we first quantify the relation between the signal strength of a patch and the instability of that patch, and second, we extend the standard Fisher vector framework to explicitly take the descriptor instabilities into account. In comparison to common approaches to dealing with descriptor instabilities, our results show that modeling local descriptor instability is beneficial for object matching, image retrieval, and classification.
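As a hedged stand-in for the instability modeling (the paper extends the Fisher vector framework itself), the sketch below scores patches by a simple signal-strength measure, mean gradient magnitude, and down-weights low-signal descriptors before aggregation; the weighting scheme and toy data are my own assumptions.

```python
# Illustrative sketch: down-weighting descriptors from low-signal patches.
import numpy as np

def signal_strength(patch):
    gy, gx = np.gradient(patch.astype(float))
    return np.sqrt(gx ** 2 + gy ** 2).mean()       # mean gradient magnitude

def weighted_aggregate(descriptors, patches, eps=1e-3):
    strengths = np.array([signal_strength(p) for p in patches])
    weights = strengths / (strengths.sum() + eps)   # low-signal patches contribute less
    return (weights[:, None] * descriptors).sum(axis=0)

patches = [np.random.rand(16, 16) for _ in range(10)]
descriptors = np.random.rand(10, 64)                # toy local descriptors
print(weighted_aggregate(descriptors, patches).shape)   # (64,)
```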

7.
IEEE Trans Pattern Anal Mach Intell ; 32(7): 1271-83, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20489229

ABSTRACT

This paper studies automatic image classification by modeling soft assignment in the popular codebook model. The codebook model describes an image as a bag of discrete visual words selected from a vocabulary, where the frequency distributions of visual words in an image allow classification. One inherent component of the codebook model is the assignment of discrete visual words to continuous image features. Despite the clear mismatch of this hard assignment with the nature of continuous features, the approach has been successfully applied for some years. In this paper, we investigate four types of soft assignment of visual words to image features. We demonstrate that explicitly modeling visual word assignment ambiguity improves classification performance compared to the hard assignment of the traditional codebook model. The traditional codebook model is compared against our method for five well-known data sets: 15 natural scenes, Caltech-101, Caltech-256, and Pascal VOC 2007/2008. We demonstrate that large codebook vocabulary sizes completely deteriorate the performance of the traditional model, whereas the proposed model performs consistently. Moreover, we show that our method profits in high-dimensional feature spaces and reaps higher benefits when increasing the number of image categories.
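A minimal sketch of the difference between hard assignment and kernel-based soft assignment of a descriptor to visual words is shown below; the toy codebook and kernel bandwidth are invented, and this is only one of several soft-assignment variants such a codebook model can use.

```python
# Illustrative sketch: hard vs. Gaussian-kernel soft assignment of one local
# descriptor to codebook visual words (toy codebook, descriptor, and bandwidth).
import numpy as np

codebook = np.random.rand(5, 8)          # 5 visual words, 8-D features
descriptor = np.random.rand(8)

dists = np.linalg.norm(codebook - descriptor, axis=1)

# hard assignment: all weight on the single nearest visual word
hard = np.zeros(len(codebook))
hard[np.argmin(dists)] = 1.0

# soft assignment: kernel-weighted responsibility spread over all words
sigma = 0.5
soft = np.exp(-dists ** 2 / (2 * sigma ** 2))
soft /= soft.sum()

print(hard, soft.round(3))
```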
