Brain-inspired automated visual object discovery and detection.

Chen, Lichao; Singh, Sudhir; Kailath, Thomas; Roychowdhury, Vwani

Affiliations

Chen L; Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095.
Singh S; Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095.
Kailath T; Department of Electrical Engineering, Stanford University, Stanford, CA 94305; kailath@stanford.edu.
Roychowdhury V; Department of Electrical and Computer Engineering, University of California, Los Angeles, CA 90095; vwani@ucla.edu.

Proc Natl Acad Sci U S A; 116(1): 96-105, 2019 Jan 02.

Article in English | MEDLINE | ID: mdl-30559207

ABSTRACT

Despite significant recent progress, machine vision systems lag considerably behind their biological counterparts in performance, scalability, and robustness. A distinctive hallmark of the brain is its ability to automatically discover and model objects, at multiscale resolutions, from repeated exposures to unlabeled contextual data, and then to robustly detect the learned objects under various nonideal circumstances, such as partial occlusion and different view angles. Replicating such capabilities in a machine would require three key ingredients: (i) access to large-scale perceptual data of the kind that humans experience, (ii) flexible representations of objects, and (iii) an efficient unsupervised learning algorithm. The Internet fortunately provides unprecedented access to vast amounts of visual data. This paper leverages the availability of such data to develop a scalable framework for unsupervised learning of object prototypes: brain-inspired, flexible, scale- and shift-invariant representations of deformable objects (e.g., humans, motorcycles, cars, airplanes) composed of parts, their different configurations and views, and their spatial relationships. Computationally, the object prototypes are represented as geometric associative networks using probabilistic constructs such as Markov random fields. We apply our framework to various datasets and show that our approach is computationally scalable and can construct accurate and operational part-aware object models much more efficiently than much of the recent computer vision literature. We also present efficient algorithms for detecting and localizing objects and their partial views in new scenes.

Subjects

Artificial Intelligence; Unsupervised Machine Learning; Algorithms; Brain/physiology; Computer Simulation; Facial Recognition; Geographic Information Systems; Humans; Pattern Recognition, Visual; Visual Perception

Keywords

brain memory models; brain-inspired learning; brain-inspired object models; computer vision; machine learning

Full text: 1; Collections: 01-internacional; Database: MEDLINE; Main subject: Artificial Intelligence / Unsupervised Machine Learning; Type of study: Diagnostic_studies; Limits: Humans; Language: En; Journal: Proc Natl Acad Sci U S A; Year of publication: 2019; Document type: Article