Results 1 - 8 of 8
1.
IEEE Trans Pattern Anal Mach Intell ; 42(9): 2306-2320, 2020 09.
Article in English | MEDLINE | ID: mdl-30990421

ABSTRACT

In this work we present a new approach to the field of weakly supervised learning in the video domain. Our method is relevant to sequence learning problems which can be split up into sub-problems that occur in parallel. Here, we experiment with sign language data. The approach exploits sequence constraints within each independent stream and combines them by explicitly imposing synchronisation points to make use of parallelism that all sub-problems share. We do this with multi-stream HMMs while adding intermediate synchronisation constraints among the streams. We embed powerful CNN-LSTM models in each HMM stream following the hybrid approach. This allows the discovery of attributes which on their own lack sufficient discriminative power to be identified. We apply the approach to the domain of sign language recognition exploiting the sequential parallelism to learn sign language, mouth shape and hand shape classifiers. We evaluate the classifiers on three publicly available benchmark data sets featuring challenging real-life sign language with over 1,000 classes, full sentence based lip-reading and articulated hand shape recognition on a fine-grained hand shape taxonomy featuring over 60 different hand shapes. We clearly outperform the state-of-the-art on all data sets and observe significantly faster convergence using the parallel alignment approach.

2.
IEEE Trans Pattern Anal Mach Intell ; 41(2): 502-514, 2019 02.
Article in English | MEDLINE | ID: mdl-29990282

ABSTRACT

In this work, fundamental analytic results in the form of error bounds are presented that quantify the effect of feature omission and selection for pattern classification in general, as well as the effect of context reduction in string classification, like automatic speech recognition, printed/handwritten character recognition, or statistical machine translation. A general simulation framework is introduced that supports discovery and proof of error bounds, which lead to the error bounds presented here. Initially derived tight lower and upper bounds for feature omission are generalized to feature selection, followed by another extension to context reduction of string class priors (aka language models) in string classification. For string classification, the quantitative effect of string class prior context reduction on symbol-level Bayes error is presented. The tightness of the original feature omission bounds seems lost in this case, as further simulations indicate. However, combining both feature omission and context reduction, the tightness of the bounds is retained. A central result of this work is the proof of the existence, and the amount of a statistical threshold w.r.t. the introduction of additional features in general pattern classification, or the increase of context in string classification beyond which a decrease in Bayes error is guaranteed.
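As an illustration of the quantity these bounds concern, the following minimal sketch computes the Bayes error of a small discrete classification problem before and after a feature is omitted. The toy distribution and the function name `bayes_error` are assumptions made for this example; the sketch does not reproduce the bounds derived in the paper.

```python
import numpy as np

def bayes_error(joint):
    """Bayes error of a discrete problem given joint[c, ...] = p(c, features).

    Axis 0 indexes the class; all remaining axes index feature values.
    For every feature configuration the optimal rule picks the class with
    the largest joint probability, so the error is the remaining mass.
    """
    joint = np.asarray(joint, dtype=float)
    return 1.0 - joint.max(axis=0).sum()

# Hypothetical toy distribution: 2 classes, two binary features (x1, x2).
# joint[c, x1, x2] = p(c, x1, x2); all entries sum to 1.
joint = np.array([
    [[0.30, 0.05], [0.05, 0.10]],   # class 0
    [[0.05, 0.10], [0.05, 0.30]],   # class 1
])

full = bayes_error(joint)                        # both features used
without_x2 = bayes_error(joint.sum(axis=2))      # feature x2 omitted
prior_only = bayes_error(joint.sum(axis=(1, 2))) # all features omitted

print(full, without_x2, prior_only)
```

For this toy table the three values are approximately 0.25, 0.30, and 0.50: omitting a feature can only keep or increase the Bayes error, and the size of that increase is what error bounds of this kind quantify.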

3.
J Digit Imaging ; 21(3): 280-9, 2008 Sep.
Article in English | MEDLINE | ID: mdl-17497197

ABSTRACT

The impact of image pattern recognition on accessing large databases of medical images has recently been explored, and content-based image retrieval (CBIR) in medical applications (IRMA) is researched. At present, however, the impact of image retrieval on diagnosis is limited, and practical applications are scarce. One reason is the lack of suitable mechanisms for query refinement, in particular, the ability to (1) restore previous session states, (2) combine individual queries by Boolean operators, and (3) provide continuous-valued query refinement. This paper presents a powerful user interface for CBIR that provides all three mechanisms for extended query refinement. The various mechanisms of man-machine interaction during a retrieval session are grouped into four classes: (1) output modules, (2) parameter modules, (3) transaction modules, and (4) process modules, all of which are controlled by detailed query logging. The query logging is linked to a relational database. Nested loops for interaction provide a maximum of flexibility within a minimum of complexity, as the entire data flow is still controlled within a single Web page. Our approach is implemented to support various modalities, orientations, and body regions using global features that model gray scale, texture, structure, and global shape characteristics. The resulting extended query refinement has a significant impact on medical CBIR applications.


Subjects
Information Storage and Retrieval/methods; Internet/statistics & numerical data; Radiographic Image Interpretation, Computer-Assisted; Radiology Information Systems/instrumentation; User-Computer Interface; Computer Graphics; Databases, Factual; Diagnostic Imaging/methods; Humans; Medical Informatics Applications; Pattern Recognition, Automated; Sensitivity and Specificity; Software Design
4.
IEEE Trans Pattern Anal Mach Intell ; 29(8): 1422-35, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17568145

ABSTRACT

We present the application of different nonlinear image deformation models to the task of image recognition. The deformation models are especially suited for local changes as they often occur in the presence of image object variability. We show that, among the discussed models, there is one approach that combines simplicity of implementation, low-computational complexity, and highly competitive performance across various real-world image recognition tasks. We show experimentally that the model performs very well for four different handwritten digit recognition tasks and for the classification of medical images, thus showing high generalization capacity. In particular, an error rate of 0.54 percent on the MNIST benchmark is achieved, as well as the lowest reported error rate, specifically 12.6 percent, in the 2005 international ImageCLEF evaluation of medical image categorization.
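The abstract does not spell out the deformation models, so the following is only a minimal sketch of one simple kind of local deformation-tolerant distance: each pixel of the first image may match any pixel of the second image within a small displacement range. The function name `local_deformation_distance` and the parameter `warp` are assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np

def local_deformation_distance(a, b, warp=1):
    """Sum over pixels of `a` of the smallest squared difference to any
    pixel of `b` displaced by at most `warp` pixels in each direction.

    With warp=0 this reduces to the squared Euclidean distance.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    h, w = a.shape
    # Repeat border values so that displaced windows stay inside the array.
    padded = np.pad(b, warp, mode="edge")
    best = np.full(a.shape, np.inf)
    for dy in range(2 * warp + 1):
        for dx in range(2 * warp + 1):
            shifted = padded[dy:dy + h, dx:dx + w]
            best = np.minimum(best, (a - shifted) ** 2)
    return float(best.sum())

# Two toy images that differ only by a one-pixel shift of a single dot.
img_a = np.zeros((5, 5))
img_a[2, 2] = 1.0
img_b = np.zeros((5, 5))
img_b[2, 3] = 1.0

rigid = local_deformation_distance(img_a, img_b, warp=0)
tolerant = local_deformation_distance(img_a, img_b, warp=1)
print(rigid, tolerant)
```

In this toy case the rigid distance is 2 while the deformation-tolerant distance is 0, which shows how allowing small local displacements absorbs the kind of variability the abstract refers to. Such a distance could be used, for example, inside a nearest-neighbour classifier.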


Subjects
Image Processing, Computer-Assisted; Pattern Recognition, Automated; Algorithms; Artificial Intelligence; Computer Simulation; Humans; Image Interpretation, Computer-Assisted; Nonlinear Dynamics
5.
Comput Med Imaging Graph ; 29(2-3): 143-55, 2005.
Article in English | MEDLINE | ID: mdl-15755534

ABSTRACT

Categorization of medical images means selecting the appropriate class for a given image out of a set of pre-defined categories. This is an important step for data mining and content-based image retrieval (CBIR). So far, published approaches are capable of distinguishing up to 10 categories. In this paper, we evaluate automatic categorization into more than 80 categories describing the imaging modality and direction as well as the body part and biological system examined. Based on 6231 reference images from hospital routine, 85.5% correctness is obtained combining global texture features with scaled images. With a frequency of 97.7%, the correct class is within the best ten matches, which is sufficient for medical CBIR applications.
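The two figures reported above correspond to what is commonly called top-1 and top-10 correctness. The following sketch shows how such rates can be computed from per-class scores; the function name `top_k_rate` and the toy scores are assumptions made for illustration, not the authors' evaluation code or data.

```python
import numpy as np

def top_k_rate(scores, labels, k):
    """Fraction of items whose true class is among the k best-scoring classes.

    scores: array of shape (n_items, n_classes), higher means better match.
    labels: array of shape (n_items,) with the true class index per item.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Indices of the k highest-scoring classes for every item.
    top_k = np.argsort(-scores, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Hypothetical scores for 4 images and 3 categories.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.2, 0.3, 0.5],
]
labels = [0, 1, 0, 0]

top1 = top_k_rate(scores, labels, k=1)
top2 = top_k_rate(scores, labels, k=2)
print(top1, top2)
```

On this toy data the top-1 rate is 0.25 and the top-2 rate is 0.75. In a retrieval setting, a high top-k rate matters because the user inspects a short list of candidates rather than a single decision.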


Subjects
Diagnostic Imaging; Information Storage and Retrieval; Automation; Germany
6.
IEEE Trans Pattern Anal Mach Intell ; 26(2): 269-74, 2004 Feb.
Article in English | MEDLINE | ID: mdl-15376902

ABSTRACT

We integrate the tangent method into a statistical framework for classification analytically and practically. The resulting consistent framework for adaptation allows us to efficiently estimate the tangent vectors representing the variability. The framework improves classification results on two real-world pattern recognition tasks from the domains of handwritten character recognition and automatic speech recognition.


Subjects
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Information Storage and Retrieval/methods; Pattern Recognition, Automated; Subtraction Technique; Cluster Analysis; Electronic Data Processing; Feedback; Image Enhancement/methods; Models, Statistical; Natural Language Processing; Numerical Analysis, Computer-Assisted; Reading; Reproducibility of Results; Sensitivity and Specificity; Signal Processing, Computer-Assisted; Speech Perception
7.
IEEE Trans Pattern Anal Mach Intell ; 34(6): 1105-17, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22064798

ABSTRACT

We present latent log-linear models, an extension of log-linear models incorporating latent variables, and we propose two applications thereof: log-linear mixture models and image deformation-aware log-linear models. The resulting models are fully discriminative, can be trained efficiently, and the model complexity can be controlled. Log-linear mixture models offer additional flexibility within the log-linear modeling framework. Unlike previous approaches, the image deformation-aware model directly considers image deformations and allows for a discriminative training of the deformation parameters. Both are trained using alternating optimization. For certain variants, convergence to a stationary point is guaranteed and, in practice, even variants without this guarantee converge and find models that perform well. We tune the methods on the USPS data set and evaluate on the MNIST data set, demonstrating the generalization capabilities of our proposed models. Our models, although using significantly fewer parameters, are able to obtain competitive results with models proposed in the literature.
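The abstract gives no formulas, so the following is a minimal sketch under an assumed parameterisation: each class owns several log-linear components, and the class posterior sums the exponentiated component scores per class before normalising over all classes. The names `log_linear_mixture_posterior`, `weights`, and `biases` are hypothetical, and training by alternating optimization is not shown.

```python
import numpy as np

def log_linear_mixture_posterior(x, weights, biases):
    """Class posterior of a log-linear mixture (assumed form).

    x:       feature vector of shape (D,)
    weights: array of shape (C, I, D), one weight vector per class and component
    biases:  array of shape (C, I)

    p(c | x) is proportional to sum_i exp(weights[c, i] . x + biases[c, i]).
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    biases = np.asarray(biases, dtype=float)
    scores = weights @ x + biases          # shape (C, I)
    # Subtract the maximum before exponentiating for numerical stability.
    unnormalised = np.exp(scores - scores.max()).sum(axis=1)
    return unnormalised / unnormalised.sum()

# Toy example: 2 classes, 2 components each, 2-dimensional input.
x = np.array([1.0, -1.0])
weights = np.zeros((2, 2, 2))              # scores depend on the biases only
biases = np.log(np.array([[1.0, 1.0],
                          [2.0, 4.0]]))

posterior = log_linear_mixture_posterior(x, weights, biases)
print(posterior)
```

With these toy parameters the per-class sums are 2 and 6, so the posterior is approximately [0.25, 0.75]. With a single component per class the expression reduces to an ordinary log-linear (softmax) model.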


Subjects
Linear Models; Pattern Recognition, Automated/methods; Algorithms; Image Interpretation, Computer-Assisted/methods; Natural Language Processing
8.
IEEE Trans Pattern Anal Mach Intell ; 34(2): 292-301, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21844628

ABSTRACT

In many tasks in pattern recognition, such as automatic speech recognition (ASR), optical character recognition (OCR), part-of-speech (POS) tagging, and other string recognition tasks, we are faced with a well-known inconsistency: The Bayes decision rule is usually used to minimize string (symbol sequence) error, whereas, in practice, we want to minimize symbol (word, character, tag, etc.) error. When comparing different recognition systems, we do indeed use symbol error rate as an evaluation measure. The topic of this work is to analyze the relation between string (i.e., 0-1) and symbol error (i.e., metric, integer valued) cost functions in the Bayes decision rule, for which fundamental analytic results are derived. Simple conditions are derived for which the Bayes decision rule with integer-valued metric cost function and with 0-1 cost gives the same decisions or leads to classes with limited cost. The corresponding conditions can be tested with complexity linear in the number of classes. The results obtained do not make any assumption w.r.t. the structure of the underlying distributions or the classification problem. Nevertheless, the general analytic results are analyzed via simulations of string recognition problems with Levenshtein (edit) distance cost function. The results support earlier findings that considerable improvements are to be expected when initial error rates are high.
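The inconsistency described above can be made concrete with a small sketch that applies the Bayes decision rule twice to the same posterior: once with 0-1 (string) cost and once with Levenshtein (symbol) cost. The toy posterior, the function names, and the restriction of candidate decisions to the strings that carry posterior mass are assumptions made for this illustration; the sketch does not implement the conditions derived in the paper.

```python
def levenshtein(a, b):
    """Edit distance between sequences a and b (unit-cost insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, 1):
        cur = [i]
        for j, sym_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                      # deletion
                           cur[j - 1] + 1,                   # insertion
                           prev[j - 1] + (sym_a != sym_b)))  # substitution or match
        prev = cur
    return prev[-1]

def decide_zero_one(posterior):
    """Bayes decision with 0-1 cost: pick the string with maximum posterior."""
    return max(posterior, key=posterior.get)

def decide_min_risk(posterior, cost=levenshtein):
    """Bayes decision with a general cost: pick the string with minimum
    expected cost, searching only among strings with posterior mass."""
    def expected_cost(hyp):
        return sum(p * cost(hyp, ref) for ref, p in posterior.items())
    return min(posterior, key=expected_cost)

# Hypothetical posterior over three strings for one observation.
posterior = {"aaa": 0.40, "bbc": 0.35, "bbd": 0.25}

string_decision = decide_zero_one(posterior)
symbol_decision = decide_min_risk(posterior)
print(string_decision, symbol_decision)
```

Here the 0-1 rule selects "aaa", whereas the Levenshtein rule selects "bbc": the two similar strings "bbc" and "bbd" together hold more probability mass, so "bbc" has the lower expected symbol error (1.45 versus 1.80).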


Subjects
Bayes Theorem; Pattern Recognition, Automated/methods; Algorithms; Computer Simulation; Speech Recognition Software