1.
Arch Pathol Lab Med ; 147(10): 1178-1185, 2023 Oct 01.
Article in English | MEDLINE | ID: mdl-36538386

ABSTRACT

CONTEXT: Prostate cancer diagnosis rests on accurate assessment of tissue by a pathologist. The application of artificial intelligence (AI) to digitized whole slide images (WSIs) can aid pathologists in cancer diagnosis, but robust, diverse evidence in a simulated clinical setting is lacking. OBJECTIVE: To compare the diagnostic accuracy of pathologists reading WSIs of prostatic biopsy specimens with and without AI assistance. DESIGN: Eighteen pathologists, 2 of whom were genitourinary subspecialists, evaluated 610 prostate needle core biopsy WSIs prepared at 218 institutions, with the option for deferral. Two evaluations were performed sequentially for each WSI: initially without assistance, and immediately thereafter aided by Paige Prostate (PaPr), a deep learning-based system that provides a WSI-level binary classification of suspicious for cancer or benign and pinpoints the location with the greatest probability of harboring cancer on suspicious WSIs. Pathologists' changes in sensitivity and specificity between the assisted and unassisted modalities were assessed, together with the impact of PaPr output on the assisted reads. RESULTS: Using PaPr, pathologists improved their sensitivity and specificity across all histologic grades and tumor sizes. Accuracy gains on both benign and cancerous WSIs could be attributed to PaPr, which correctly classified 100% of the WSIs whose diagnoses were corrected in the PaPr-assisted phase. CONCLUSIONS: This study demonstrates the effectiveness and safety of an AI tool for pathologists in simulated diagnostic practice, bridging the gap between computational pathology research and its clinical application; it led to the first US Food and Drug Administration authorization of an AI system in pathology.
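For readers who want to reproduce the headline metrics, below is a minimal sketch of the paired sensitivity/specificity computation; the reads and helper names are invented for illustration and are not the study's data.

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP); label 1 = cancer."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

# Invented ground truth and the two sequential reads of each WSI.
truth      = [1, 1, 0, 0, 1, 0, 1, 0]
unassisted = [1, 0, 0, 0, 1, 0, 0, 0]
assisted   = [1, 1, 0, 0, 1, 0, 1, 0]

for name, reads in [("unassisted", unassisted), ("assisted", assisted)]:
    sens, spec = sensitivity_specificity(truth, reads)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```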


Subject(s)
Artificial Intelligence; Prostatic Neoplasms; Male; Humans; Prostate/pathology; Image Interpretation, Computer-Assisted/methods; Prostatic Neoplasms/diagnosis; Prostatic Neoplasms/pathology; Biopsy, Needle
2.
Diagnostics (Basel) ; 12(3), 2022 Mar 10.
Article in English | MEDLINE | ID: mdl-35328225

ABSTRACT

We map single-energy CT (SECT) scans to synthetic dual-energy CT (synth-DECT) material density iodine (MDI) scans using deep learning (DL) and demonstrate their value for liver segmentation. A 2D pix2pix (P2P) network was trained on 100 abdominal DECT scans to infer synth-DECT MDI scans from SECT scans. The source and target domains were the paired DECT 70 keV monochromatic scans and MDI scans, respectively. The trained P2P algorithm then transformed 140 public SECT scans to synth-DECT scans. We split 131 scans into 60% train, 20% tune, and 20% held-out test sets to train four existing liver segmentation frameworks. The remaining nine low-dose SECT scans tested system generalization. Segmentation accuracy was measured with the Dice similarity coefficient (DSC). The DSC per slice was computed to identify sources of error. With synth-DECT (and SECT) scans, average DSC scores of 0.93±0.06 (0.89±0.01) and 0.89±0.01 (0.81±0.02) were achieved on the held-out and generalization test sets, respectively. Synth-DECT-trained systems required less data to perform as well as SECT-trained systems. Low DSC scores were primarily observed around the scan margin, over non-liver tissue, or at distortions within the ground-truth annotations. In general, training with synth-DECT scans resulted in improved segmentation performance with less data.
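The accuracy metric used here is standard; the sketch below computes a per-slice Dice similarity coefficient on illustrative masks, not the study's scans.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b, eps=1e-8):
    """DSC = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    intersection = np.logical_and(a, b).sum()
    return 2.0 * intersection / (a.sum() + b.sum() + eps)

# Per-slice DSC over a toy 3D volume, as used to localize segmentation errors.
pred = np.zeros((4, 64, 64), dtype=bool); pred[1:3, 10:40, 10:40] = True
gt   = np.zeros((4, 64, 64), dtype=bool); gt[1:3, 12:42, 12:42] = True
per_slice = [dice_coefficient(pred[z], gt[z]) for z in range(pred.shape[0])]
print([f"{d:.2f}" for d in per_slice])
```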

3.
Front Digit Health ; 3: 671015, 2021.
Article in English | MEDLINE | ID: mdl-34713144

ABSTRACT

Artificial intelligence (AI) has been successful at solving numerous problems in machine perception. In radiology, AI systems are rapidly evolving and show progress in guiding treatment decisions, diagnosing, localizing disease on medical images, and improving radiologists' efficiency. A critical component of deploying AI in radiology is to gain confidence in a developed system's efficacy and safety. The current gold standard approach is to conduct an analytical validation of performance on a generalization dataset from one or more institutions, followed by a clinical validation study of the system's efficacy during deployment. Clinical validation studies are time-consuming, and best practices dictate limited re-use of analytical validation data, so it is ideal to know ahead of time if a system is likely to fail analytical or clinical validation. In this paper, we describe a series of sanity tests to identify when a system performs well on development data for the wrong reasons. We illustrate the sanity tests' value by designing a deep learning system to classify pancreatic cancer seen in computed tomography scans.
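To give a flavor of what such a sanity test can look like, here is a hedged sketch of an occlusion check, assuming a hypothetical `model.predict` interface: it asks whether performance collapses when the organ of interest is masked out, as it should if the system is using the right evidence.

```python
import numpy as np

def occlusion_sanity_check(model, images, labels, organ_masks):
    """Sanity test: if accuracy barely drops when the organ of interest is
    masked out, the classifier is likely exploiting confounds elsewhere in
    the scan rather than pancreatic appearance."""
    baseline_acc = np.mean(model.predict(images) == labels)
    occluded = images.copy()
    occluded[organ_masks] = images.mean()  # replace organ voxels with mean intensity
    occluded_acc = np.mean(model.predict(occluded) == labels)
    # The 0.10 margin is arbitrary; the point is that accuracy should collapse.
    return occluded_acc < baseline_acc - 0.10
```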

4.
Neural Comput ; 33(11): 2908-2950, 2021 Oct 12.
Article in English | MEDLINE | ID: mdl-34474476

ABSTRACT

Replay is the reactivation of one or more neural patterns that are similar to the activation patterns experienced during past waking experiences. Replay was first observed in biological neural networks during sleep, and it is now thought to play a critical role in memory formation, retrieval, and consolidation. Replay-like mechanisms have been incorporated in deep artificial neural networks that learn over time to avoid catastrophic forgetting of previous knowledge. Replay algorithms have been successfully used in a wide range of deep learning methods within supervised, unsupervised, and reinforcement learning paradigms. In this letter, we provide the first comprehensive comparison between replay in the mammalian brain and replay in artificial neural networks. We identify multiple aspects of biological replay that are missing in deep learning systems and hypothesize how they could be used to improve artificial neural networks.
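A minimal sketch of one such replay-like mechanism, a reservoir-sampled experience replay buffer, is shown below; it is a generic illustration, not one of the specific algorithms surveyed in the letter.

```python
import random

class ReplayBuffer:
    """Memory of past examples, interleaved with new data so a network
    keeps rehearsing earlier tasks (experience replay)."""
    def __init__(self, capacity=1000):
        self.capacity, self.seen, self.memory = capacity, 0, []

    def add(self, example):
        self.seen += 1
        if len(self.memory) < self.capacity:
            self.memory.append(example)
        else:  # reservoir sampling keeps a uniform sample over everything seen
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.memory[j] = example

    def mixed_batch(self, new_batch, k):
        """Current examples plus k replayed old ones."""
        return new_batch + random.sample(self.memory, min(k, len(self.memory)))
```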


Subject(s)
Deep Learning; Algorithms; Animals; Hippocampus; Neural Networks, Computer; Reinforcement, Psychology; Sleep
5.
J Med Imaging (Bellingham) ; 8(3): 033505, 2021 May.
Article in English | MEDLINE | ID: mdl-34222557

ABSTRACT

Purpose: The lack of standardization in quantitative radiomic measures of tumors seen on computed tomography (CT) scans is generally recognized as an unresolved issue. To develop reliable clinical applications, radiomics must be robust across different CT scan modes, protocols, software, and systems. We demonstrate how custom-designed phantoms, imprinted with human-derived patterns, can provide a straightforward approach to validating longitudinally stable radiomic signature values in a clinical setting. Approach: Described herein is a prototype process to design an anatomically informed 3D-printed radiomic phantom. We used a multimaterial, ultra-high-resolution 3D printer with voxel printing capabilities. Multiple tissue regions of interest (ROIs), from four pancreas tumors, one lung tumor, and a liver background, were extracted from Digital Imaging and Communications in Medicine (DICOM) CT exam files and merged to form a multipurpose, circular radiomic phantom (18 cm diameter, 4 cm width). The phantom was scanned 30 times using standard clinical CT protocols to test repeatability. Features that have been found to be prognostic for various diseases were then investigated for their repeatability and reproducibility across different CT scan modes. Results: The structural similarity index between the segments used from the patients' DICOM images and the phantom CT scan was 0.71. The coefficient of variation for all assessed radiomic features was <1.0% across 30 repeat scans of the phantom. The percent deviation (pDV) from the baseline value, defined as the mean feature value determined from repeat scans, increased with the application of the lung convolution kernel, changes to the voxel size, and increases in image noise. Gray-level co-occurrence features (contrast, dissimilarity, and entropy) were particularly affected by different scan modes, presenting with pDV > ±15%. Conclusions: Previously discovered prognostic and popular radiomic features are variable in practice and need to be interpreted with caution or excluded from clinical implementation. Voxel-based 3D printing can reproduce tissue morphology seen on CT exams. We believe that this is a flexible yet practical way to design custom phantoms with which to validate and compare radiomic metrics longitudinally and across systems.
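The repeatability statistics reported here are simple to compute; below is a short sketch of the coefficient of variation and the percent deviation from baseline, using invented feature values rather than the study's measurements.

```python
import numpy as np

def coefficient_of_variation(values):
    """CV (%) = 100 * sample std / mean across repeat scans."""
    values = np.asarray(values, dtype=float)
    return 100.0 * values.std(ddof=1) / values.mean()

def percent_deviation(value, baseline):
    """pDV (%) relative to the baseline (mean of the repeat-scan values)."""
    return 100.0 * (value - baseline) / baseline

# Invented repeat-scan values for a single radiomic feature.
repeats = np.array([4.61, 4.59, 4.63, 4.60, 4.62])
baseline = repeats.mean()
print(f"CV = {coefficient_of_variation(repeats):.2f}%")
print(f"pDV under a different kernel = {percent_deviation(5.40, baseline):+.1f}%")
```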

6.
J Pathol ; 254(2): 147-158, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33904171

ABSTRACT

Artificial intelligence (AI)-based systems applied to histopathology whole-slide images have the potential to improve patient care by mitigating challenges posed by diagnostic variability, histopathology caseload, and the shortage of pathologists. We sought to define the performance of an AI-based automated prostate cancer detection system, Paige Prostate, when applied to independent real-world data. The algorithm was employed to classify slides into two categories: benign (no further review needed) or suspicious (additional histologic and/or immunohistochemical analysis required). We assessed the sensitivity, specificity, positive predictive values (PPVs), and negative predictive values (NPVs) of a local pathologist, two central pathologists, and Paige Prostate in the diagnosis of 600 transrectal ultrasound-guided prostate needle core biopsy regions ('part-specimens') from 100 consecutive patients, and ascertained the impact of Paige Prostate on diagnostic accuracy and efficiency. Paige Prostate displayed high sensitivity (0.99; CI 0.96-1.0), NPV (1.0; CI 0.98-1.0), and specificity (0.93; CI 0.90-0.96) at the part-specimen level. At the patient level, Paige Prostate displayed optimal sensitivity (1.0; CI 0.93-1.0) and NPV (1.0; CI 0.91-1.0) at a specificity of 0.78 (CI 0.64-0.89). The 27 part-specimens classified as suspicious by Paige Prostate but given a final benign diagnosis comprised atrophy (n = 14), atrophy and apical prostate tissue (n = 1), apical/benign prostate tissue (n = 9), adenosis (n = 2), and post-atrophic hyperplasia (n = 1). Paige Prostate resulted in the identification of four additional patients whose diagnoses were upgraded from benign/suspicious to malignant. Additionally, this AI-based test provided an estimated 65.5% reduction in diagnostic time for the material analyzed. Given its optimal sensitivity and NPV, Paige Prostate has the potential to be employed for the automated identification of patients whose histologic slides could forgo full histopathologic review. In addition to providing incremental improvements in diagnostic accuracy and efficiency, this AI-based system identified patients whose prostate cancers were not initially diagnosed by three experienced histopathologists. © 2021 The Authors. The Journal of Pathology published by John Wiley & Sons, Ltd. on behalf of The Pathological Society of Great Britain and Ireland.
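A quick sketch of how the four reported diagnostic metrics derive from confusion-matrix counts is given below; the counts are illustrative stand-ins, not the study's raw data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Invented counts at the part-specimen level (label: suspicious vs. benign).
print(diagnostic_metrics(tp=200, fp=27, tn=370, fn=3))
```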


Subject(s)
Artificial Intelligence; Prostatic Neoplasms/diagnosis; Aged; Aged, 80 and over; Biopsy; Biopsy, Large-Core Needle; Humans; Machine Learning; Male; Middle Aged; Pathologists; Prostate/pathology; Prostatic Neoplasms/pathology
7.
PLoS One ; 15(9): e0238302, 2020.
Article in English | MEDLINE | ID: mdl-32886692

ABSTRACT

Supervised classification methods often assume that the train and test data distributions are the same and that all classes in the test set are present in the training set. However, deployed classifiers often require the ability to recognize inputs from outside the training set as unknowns. This problem has been studied under multiple paradigms, including out-of-distribution detection and open set recognition. For convolutional neural networks, there have been two major approaches: 1) inference methods to separate knowns from unknowns and 2) feature space regularization strategies to improve model robustness to novel inputs. Up to this point, little attention has been paid to exploring the relationship between the two approaches and to directly comparing performance on large-scale datasets that have more than a few dozen categories. Using the ImageNet ILSVRC-2012 large-scale classification dataset, we identify novel combinations of regularization and specialized inference methods that perform best across multiple open set classification problems of increasing difficulty. We find that input perturbation and temperature scaling yield significantly better performance on large-scale datasets than the other inference methods tested, regardless of the feature space regularization strategy. Conversely, advanced regularization schemes during training yield better performance when baseline inference techniques are used; however, when advanced inference methods are used to detect open set classes, the utility of these cumbersome training paradigms is less evident.
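For concreteness, here is a hedged PyTorch sketch of the inference method the abstract highlights, temperature scaling combined with input perturbation (in the style of ODIN); the hyperparameter values are illustrative, not the paper's.

```python
import torch
import torch.nn.functional as F

def odin_score(model, x, temperature=1000.0, epsilon=0.0014):
    """Temperature-scaled max softmax after a small input perturbation
    toward the predicted class; low scores flag likely unknowns."""
    model.eval()
    x = x.clone().requires_grad_(True)
    logits = model(x) / temperature
    # Gradient step that increases confidence in the predicted class.
    loss = F.cross_entropy(logits, logits.argmax(dim=1))
    loss.backward()
    x_perturbed = (x - epsilon * x.grad.sign()).detach()
    with torch.no_grad():
        probs = F.softmax(model(x_perturbed) / temperature, dim=1)
    return probs.max(dim=1).values
```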


Subject(s)
Algorithms; Classification/methods; Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Pattern Recognition, Automated/methods; Datasets as Topic; Humans; Machine Learning
8.
Mod Pathol ; 33(10): 2058-2066, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32393768

ABSTRACT

Prostate cancer (PrCa) is the second most common cancer among men in the United States. The gold standard for detecting PrCa is the examination of prostate needle core biopsies. Diagnosis can be challenging, especially for small, well-differentiated cancers. Recently, machine learning algorithms have been developed for detecting PrCa in whole slide images (WSIs) with high test accuracy. However, the impact of these artificial intelligence systems on pathologic diagnosis is not known. To address this, we investigated how pathologists interact with Paige Prostate Alpha, a state-of-the-art PrCa detection system, in WSIs of prostate needle core biopsies stained with hematoxylin and eosin. Three AP board-certified pathologists assessed 304 anonymized prostate needle core biopsy WSIs in 8 hours. The pathologists classified each WSI as benign or cancerous. After ~4 weeks, pathologists were tasked with re-reviewing each WSI with the aid of Paige Prostate Alpha. For each WSI, Paige Prostate Alpha was used to perform cancer detection and, for WSIs where cancer was detected, the system marked the area with the highest probability of cancer. The original diagnosis for each slide was rendered by genitourinary pathologists and incorporated any ancillary studies requested during the original diagnostic assessment. The pathologists and Paige Prostate Alpha were measured against this ground truth. Without Paige Prostate Alpha, pathologists had an average sensitivity of 74% and an average specificity of 97%. With Paige Prostate Alpha, the average sensitivity for pathologists significantly increased to 90% with no statistically significant change in specificity. With Paige Prostate Alpha, pathologists more often correctly classified smaller, lower-grade tumors, and spent less time analyzing each WSI. Future studies will investigate whether similar benefits are obtained when such a system is used to detect other forms of cancer in a setting that more closely emulates real practice.
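Paired designs like this one are commonly analyzed with McNemar's test on the discordant reads; the sketch below shows an exact version via a binomial test, with invented booleans. The abstract does not state which test the authors actually used.

```python
from scipy.stats import binomtest

def mcnemar_exact(correct_before, correct_after):
    """Exact McNemar test for paired reads: only slides where the two
    modalities disagree (discordant pairs) carry information."""
    b = sum(u and not a for u, a in zip(correct_before, correct_after))
    c = sum(a and not u for u, a in zip(correct_before, correct_after))
    if b + c == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    return binomtest(min(b, c), n=b + c, p=0.5).pvalue

# Invented booleans: was each slide read correctly without / with AI?
before = [True, False, False, True, False, True, False, False]
after  = [True, True,  True,  True, False, True, True,  False]
print(f"p = {mcnemar_exact(before, after):.3f}")
```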


Subject(s)
Deep Learning; Diagnosis, Computer-Assisted/methods; Image Interpretation, Computer-Assisted/methods; Pathology, Clinical/methods; Prostatic Neoplasms/diagnosis; Biopsy, Large-Core Needle; Humans; Male
9.
Sci Rep ; 10(1): 2539, 2020 Feb 13.
Article in English | MEDLINE | ID: mdl-32054884

ABSTRACT

The study of gaze behavior has primarily been constrained to controlled environments in which the head is fixed. Consequently, little effort has been invested in the development of algorithms for the categorization of gaze events (e.g., fixations, pursuits, saccades, gaze shifts) while the head is free and thus contributes to the velocity signals upon which classification algorithms typically operate. Our approach was to collect a novel, naturalistic, and multimodal dataset of eye + head movements as subjects performed everyday tasks while wearing a mobile eye tracker equipped with an inertial measurement unit and a 3D stereo camera. This Gaze-in-the-Wild dataset (GW) includes eye + head rotational velocities (deg/s), infrared eye images, and scene imagery (RGB + D). A portion was labelled by coders into gaze motion events with a mutual agreement of 0.74 sample-based Cohen's κ. This labelled data was used to train and evaluate two machine learning algorithms, a Random Forest and a Recurrent Neural Network model, for gaze event classification. Assessment involved the application of established and novel event-based performance metrics. Classifiers achieve ~87% of human performance in detecting fixations and saccades but fall short (50%) in detecting pursuit movements. Moreover, pursuit classification is far worse in the absence of head movement information. A subsequent analysis of feature significance in our best-performing model revealed that classification can be done using only the magnitudes of eye and head movements, potentially removing the need for calibration between the head and eye tracking systems. The GW dataset, trained classifiers, and evaluation metrics will be made publicly available with the intention of facilitating growth in the emerging area of head-free gaze event classification.
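Sample-based Cohen's κ, the agreement measure quoted above, can be computed directly from two coders' label sequences; a small sketch with toy labels follows.

```python
import numpy as np

def cohens_kappa(labels_a, labels_b):
    """Sample-based Cohen's κ: observed agreement corrected for chance."""
    a, b = np.asarray(labels_a), np.asarray(labels_b)
    classes = np.union1d(a, b)
    p_observed = np.mean(a == b)
    # Chance agreement from each coder's marginal label frequencies.
    p_chance = sum(np.mean(a == c) * np.mean(b == c) for c in classes)
    return (p_observed - p_chance) / (1.0 - p_chance)

coder1 = ["fix", "sac", "pur", "fix", "fix", "sac"]
coder2 = ["fix", "sac", "fix", "fix", "pur", "sac"]
print(f"kappa = {cohens_kappa(coder1, coder2):.2f}")
```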


Subject(s)
Eye Movements/physiology; Fixation, Ocular/physiology; Head Movements/physiology; Head/physiology; Activities of Daily Living; Adult; Algorithms; Female; Humans; Male; Motion; Psychomotor Performance/physiology; Pursuit, Smooth/physiology; Reflex, Vestibulo-Ocular/physiology
10.
Neural Netw ; 113: 54-71, 2019 May.
Article in English | MEDLINE | ID: mdl-30780045

ABSTRACT

Humans and animals have the ability to continually acquire, fine-tune, and transfer knowledge and skills throughout their lifespan. This ability, referred to as lifelong learning, is mediated by a rich set of neurocognitive mechanisms that together contribute to the development and specialization of our sensorimotor skills as well as to long-term memory consolidation and retrieval. Consequently, lifelong learning capabilities are crucial for computational learning systems and autonomous agents interacting in the real world and processing continuous streams of information. However, lifelong learning remains a long-standing challenge for machine learning and neural network models since the continual acquisition of incrementally available information from non-stationary data distributions generally leads to catastrophic forgetting or interference. This limitation represents a major drawback for state-of-the-art deep neural network models that typically learn representations from stationary batches of training data, thus without accounting for situations in which information becomes incrementally available over time. In this review, we critically summarize the main challenges linked to lifelong learning for artificial learning systems and compare existing neural network approaches that alleviate, to different extents, catastrophic forgetting. Although significant advances have been made in domain-specific learning with neural networks, extensive research efforts are required for the development of robust lifelong learning on autonomous agents and robots. We discuss well-established and emerging research motivated by lifelong learning factors in biological systems such as structural plasticity, memory replay, curriculum and transfer learning, intrinsic motivation, and multisensory integration.
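One family of approaches covered by such reviews is regularization that anchors parameters important to earlier tasks, as in elastic weight consolidation (EWC); the sketch below shows only the penalty term, with invented parameter and Fisher values.

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher_diag, lam=1.0):
    """EWC regularizer: parameters important to old tasks (large Fisher
    values) are anchored near their old optimum theta_star."""
    theta, theta_star = np.asarray(theta), np.asarray(theta_star)
    return 0.5 * lam * np.sum(fisher_diag * (theta - theta_star) ** 2)

# Total loss on a new task = task loss + EWC anchor to old-task weights.
theta_old = np.array([0.8, -1.2, 0.3])
fisher    = np.array([5.0, 0.1, 2.5])  # diagonal Fisher information estimate
theta_new = np.array([0.7, -0.2, 0.4])
print(f"penalty = {ewc_penalty(theta_new, theta_old, fisher):.3f}")
```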


Subject(s)
Machine Learning/trends; Neural Networks, Computer; Pattern Recognition, Automated/trends; Animals; Humans; Memory; Pattern Recognition, Automated/methods
11.
Front Artif Intell ; 2: 28, 2019.
Article in English | MEDLINE | ID: mdl-33733117

ABSTRACT

Language-grounded image understanding tasks have often been proposed as a method for evaluating progress in artificial intelligence. Ideally, these tasks should test a plethora of capabilities that integrate computer vision, reasoning, and natural language understanding. However, the datasets and evaluation procedures used in these tasks are replete with flaws that allow vision and language (V&L) algorithms to achieve good performance without a robust understanding of vision and language. We argue for this position based on several recent studies in the V&L literature and our own observations of dataset bias, robustness, and spurious correlations. Finally, we propose that several of these challenges can be mitigated by the creation of carefully designed benchmarks.

12.
PLoS One ; 13(10): e0205341, 2018.
Article in English | MEDLINE | ID: mdl-30335767

ABSTRACT

Adult aging is associated with difficulties in recognizing negative facial expressions such as fear and anger. However, recognition of happiness and disgust is generally found to be less affected. Eye-tracking studies indicate that the diagnostic features of fearful and angry faces are situated in the upper regions of the face (the eyes), and for happy and disgusted faces in the lower regions (nose and mouth). These studies also indicate age differences in visual scanning behavior, suggesting a role for attention in the emotion recognition deficits of older adults. However, because facial features can be processed extrafoveally, and expression recognition occurs rapidly, eye tracking has been questioned as a measure of attention during emotion recognition. In this study, the Moving Window Technique (MWT) was used as an alternative to conventional eye-tracking technology. By restricting the visual field to a movable window, this technique provides a more direct measure of attention. We found a strong bias to explore the mouth across both age groups. Relative to young adults, older adults focused less on the left eye and marginally more on the mouth and nose. Despite these different exploration patterns, older adults were most impaired in recognition accuracy for disgusted expressions. Correlation analysis revealed that among older adults, more mouth exploration was associated with faster recognition of both disgusted and happy expressions. As a whole, these findings suggest that in aging there are both attentional differences and perceptual deficits contributing to less accurate emotion recognition.


Subject(s)
Aging/physiology; Attention/physiology; Eye; Facial Recognition/physiology; Adult; Aged; Aging/psychology; Anger/physiology; Expressed Emotion/physiology; Face; Facial Expression; Fear/psychology; Female; Happiness; Humans; Male; Middle Aged; Photic Stimulation/methods
13.
Vision Res ; 108: 67-76, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25641371

ABSTRACT

Since Yarbus's seminal work, vision scientists have argued that our eye movement patterns differ depending upon our task. This has recently motivated the creation of multi-fixation pattern analysis algorithms that try to infer a person's task (or mental state) from their eye movements alone. Here, we introduce new algorithms for multi-fixation pattern analysis, and we use them to argue that people have scanpath routines for judging faces. We tested our methods on the eye movements of subjects as they made six distinct judgments about faces. We found that our algorithms could detect whether a participant is trying to distinguish angriness, happiness, trustworthiness, tiredness, attractiveness, or age. However, our algorithms were more accurate at inferring a subject's task when only trained on data from that subject than when trained on data gathered from other subjects, and we were able to infer the identity of our subjects using the same algorithms. These results suggest that (1) individuals have scanpath routines for judging faces, and that (2) these are diagnostic of that subject, but that (3) at least for the tasks we used, subjects do not converge on the same "ideal" scanpath pattern. Whether universal scanpath patterns exist for a task, we suggest, depends on the task's constraints and the level of expertise of the subject.
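Multi-fixation pattern analysis can be as simple as turning each trial's fixations into a spatial histogram and training a standard classifier; the sketch below does this on synthetic scanpaths and is a generic illustration, not the paper's algorithms.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def fixation_histogram(fix_xy, bins=4):
    """Represent one trial's scanpath as a normalized spatial histogram."""
    h, _, _ = np.histogram2d(fix_xy[:, 0], fix_xy[:, 1],
                             bins=bins, range=[[0, 1], [0, 1]])
    return (h / h.sum()).ravel()

rng = np.random.default_rng(0)
# Synthetic trials: one task concentrates fixations low on the face,
# the other high, mimicking task-specific scanpath routines.
trials = {"age":   [rng.normal([0.5, 0.3], 0.1, (30, 2)).clip(0, 1) for _ in range(20)],
          "anger": [rng.normal([0.5, 0.7], 0.1, (30, 2)).clip(0, 1) for _ in range(20)]}
X = [fixation_histogram(t) for task in trials for t in trials[task]]
y = [task for task in trials for _ in trials[task]]
clf = GaussianNB().fit(X[::2], y[::2])  # train on every other trial
print(f"held-out accuracy: {clf.score(X[1::2], y[1::2]):.2f}")
```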


Subject(s)
Attention/physiology; Eye Movements/physiology; Face; Facial Recognition/physiology; Adolescent; Adult; Algorithms; Facial Expression; Female; Fixation, Ocular/physiology; Humans; Judgment/physiology; Male; Pattern Recognition, Visual/physiology; Photic Stimulation/methods; Reaction Time; Recognition, Psychology/physiology; Young Adult
14.
PLoS One ; 8(1): e54088, 2013.
Article in English | MEDLINE | ID: mdl-23365648

ABSTRACT

Mammals rely on vision, audition, and olfaction to remotely sense stimuli in their environment. Determining how the mammalian brain uses this sensory information to recognize objects has been one of the major goals of psychology and neuroscience. Likewise, researchers in computer vision, machine audition, and machine olfaction have endeavored to discover good algorithms for stimulus classification. Almost 50 years ago, the neuroscientist Jerzy Konorski proposed a theoretical model in his final monograph in which competing sets of "gnostic" neurons sitting atop sensory processing hierarchies enabled stimuli to be robustly categorized despite variations in their presentation. Much of what Konorski hypothesized has proved remarkably accurate, and neurons with gnostic-like properties have been discovered in visual, aural, and olfactory brain regions. Surprisingly, there have been no attempts to directly transform his theoretical model into a computational one. Here, I describe the first computational implementation of Konorski's theory. The model is not domain specific, and it surpasses the best machine learning algorithms on challenging image, music, and olfactory classification tasks while also being simpler. My results suggest that criticisms of exemplar-based models of object recognition as computationally intractable due to limited neural resources are unfounded.
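The abstract does not specify the implementation, but the gnostic-unit idea can be caricatured as an exemplar model in which stored patterns compete via their similarity to the input; the toy sketch below is that caricature, not the paper's exact formulation.

```python
import numpy as np

class GnosticLayer:
    """Toy exemplar model in the spirit of competing 'gnostic' units: each
    stored exemplar acts as a unit, and the label of the most strongly
    responding unit wins."""
    def __init__(self):
        self.units, self.labels = [], []

    def learn(self, x, label):
        x = np.asarray(x, dtype=float)
        self.units.append(x / np.linalg.norm(x))
        self.labels.append(label)

    def classify(self, x):
        x = np.asarray(x, dtype=float)
        x = x / np.linalg.norm(x)
        responses = [u @ x for u in self.units]  # cosine similarity
        return self.labels[int(np.argmax(responses))]

layer = GnosticLayer()
layer.learn([1.0, 0.1, 0.0], "dog")
layer.learn([0.0, 0.2, 1.0], "bell")
print(layer.classify([0.9, 0.2, 0.1]))  # -> "dog"
```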


Subject(s)
Algorithms; Models, Neurological; Pattern Recognition, Physiological; Animals; Artificial Intelligence; Auditory Perception/physiology; Brain/physiology; Computer Simulation; Humans; Neurons/physiology; Smell/physiology; Visual Perception/physiology
15.
Child Dev ; 84(4): 1407-1424, 2013.
Article in English | MEDLINE | ID: mdl-23252761

ABSTRACT

The strategies children employ to selectively attend to different parts of the face may reflect important developmental changes in facial emotion recognition. Using the Moving Window Technique (MWT), children aged 5-12 years and adults (N = 129) explored faces with a mouse-controlled window in an emotion recognition task. An age-related increase in attention to the left eye emerged at age 11-12 years and reached significance in adulthood. This left-eye bias is consistent with previous eye tracking research and findings of a perceptual bias for the left side of faces. These results suggest that a strategic attentional bias to the left eye begins to emerge at age 11-12 years and is likely established sometime in adolescence.


Subject(s)
Attention; Emotions/physiology; Facial Expression; Functional Laterality/physiology; Child; Eye Movement Measurements; Female; Humans; Male; Pattern Recognition, Visual/physiology; Photic Stimulation; Psychological Tests; Reaction Time; Young Adult
16.
Nature ; 483(7389): 275, 2012 Mar 14.
Article in English | MEDLINE | ID: mdl-22422250
17.
PLoS One ; 7(1): e29740, 2012.
Article in English | MEDLINE | ID: mdl-22253768

ABSTRACT

In image recognition it is often assumed that the method used to convert color images to grayscale has little impact on recognition performance. We compare thirteen different grayscale algorithms with four types of image descriptors and demonstrate that this assumption is wrong: not all color-to-grayscale algorithms work equally well, even when using descriptors that are robust to changes in illumination. These methods are tested using a modern descriptor-based image recognition framework on face, object, and texture datasets with relatively few training instances. We identify a simple method that generally works best for face and object recognition, and two that work well for recognizing textures.
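Typical color-to-grayscale mappings differ only in how they weight the channels; the sketch below shows three common ones, with generic names that do not necessarily match the paper's thirteen-method taxonomy.

```python
import numpy as np

def to_grayscale(rgb, method="luma"):
    """Three common color-to-grayscale mappings; rgb is an HxWx3 array in [0, 1]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    if method == "luma":        # Rec. 601 weighted sum
        return 0.299 * r + 0.587 * g + 0.114 * b
    if method == "intensity":   # unweighted channel mean
        return (r + g + b) / 3.0
    if method == "lightness":   # midpoint of strongest and weakest channels
        mx = np.maximum(np.maximum(r, g), b)
        mn = np.minimum(np.minimum(r, g), b)
        return (mx + mn) / 2.0
    raise ValueError(f"unknown method: {method}")

gray = to_grayscale(np.random.rand(8, 8, 3), method="luma")
print(gray.shape)  # (8, 8)
```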


Subject(s)
Algorithms; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Animals; Color; Databases as Topic; Humans
19.
Vis cogn ; 17(6-7): 979-1003, 2009 Aug 01.
Article in English | MEDLINE | ID: mdl-21052485

ABSTRACT

When people try to find particular objects in natural scenes they make extensive use of knowledge about how and where objects tend to appear in a scene. Although many forms of such "top-down" knowledge have been incorporated into saliency map models of visual search, surprisingly, the role of object appearance has been infrequently investigated. Here we present an appearance-based saliency model derived in a Bayesian framework. We compare our approach with both bottom-up saliency algorithms as well as the state-of-the-art Contextual Guidance model of Torralba et al. (2006) at predicting human fixations. Although both top-down approaches use very different types of information, they achieve similar performance; each substantially better than the purely bottom-up models. Our experiments reveal that a simple model of object appearance can predict human fixations quite well, even making the same mistakes as people.
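An appearance-based saliency term of this kind can be sketched as a per-location likelihood of local features under a target appearance model; the Gaussian version below is a generic illustration under that assumption, not the paper's exact model.

```python
import numpy as np
from scipy.stats import multivariate_normal

def appearance_saliency(feature_map, target_mean, target_cov):
    """Per-location saliency as the log-likelihood of local features under a
    Gaussian appearance model of the target class."""
    h, w, d = feature_map.shape
    flat = feature_map.reshape(-1, d)
    log_lik = multivariate_normal(target_mean, target_cov).logpdf(flat)
    return log_lik.reshape(h, w)  # fixate the argmax first, then in descending order

# Toy features: 2-D descriptors on a 16x16 grid, target modeled from examples.
features = np.random.rand(16, 16, 2)
saliency = appearance_saliency(features, target_mean=[0.8, 0.2],
                               target_cov=np.eye(2) * 0.05)
print(np.unravel_index(saliency.argmax(), saliency.shape))
```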
