Results 1 - 20 of 41
1.
Cell ; 133(2): 364-74, 2008 Apr 18.
Article in English | MEDLINE | ID: mdl-18423206

ABSTRACT

To fully understand animal transcription networks, it is essential to accurately measure the spatial and temporal expression patterns of transcription factors and their targets. We describe a registration technique that takes image-based data from hundreds of Drosophila blastoderm embryos, each costained for a reference gene and one of a set of genes of interest, and builds a model VirtualEmbryo. This model captures in a common framework the average expression patterns for many genes in spite of significant variation in morphology and expression between individual embryos. We establish the method's accuracy by showing that relationships between a pair of genes' expression inferred from the model are nearly identical to those measured in embryos costained for the pair. We present a VirtualEmbryo containing data for 95 genes at six time cohorts. We show that known gene-regulatory interactions can be automatically recovered from this data set and predict hundreds of new interactions.


Subjects
Drosophila melanogaster/genetics ; Gene Regulatory Networks ; Models, Genetic ; Animals ; Blastoderm ; Drosophila melanogaster/metabolism ; Embryo, Nonmammalian/metabolism ; Gene Expression Regulation, Developmental ; Genes, Insect
2.
Proc Natl Acad Sci U S A ; 116(45): 22737-22745, 2019 11 05.
Article in English | MEDLINE | ID: mdl-31636195

ABSTRACT

Computed tomography (CT) of the head is used worldwide to diagnose neurologic emergencies. However, expertise is required to interpret these scans, and even highly trained experts may miss subtle life-threatening findings. For head CT, a unique challenge is to identify, with perfect or near-perfect sensitivity and very high specificity, often small subtle abnormalities on a multislice cross-sectional (three-dimensional [3D]) imaging modality that is characterized by poor soft tissue contrast, low signal-to-noise using current low radiation-dose protocols, and a high incidence of artifacts. We trained a fully convolutional neural network with 4,396 head CT scans performed at the University of California at San Francisco and affiliated hospitals and compared the algorithm's performance to that of 4 American Board of Radiology (ABR) certified radiologists on an independent test set of 200 randomly selected head CT scans. Our algorithm demonstrated the highest accuracy to date for this clinical application, with a receiver operating characteristic (ROC) area under the curve (AUC) of 0.991 ± 0.006 for identification of examinations positive for acute intracranial hemorrhage, and also exceeded the performance of 2 of 4 radiologists. We demonstrate an end-to-end network that performs joint classification and segmentation with examination-level classification comparable to experts, in addition to robust localization of abnormalities, including some that are missed by radiologists, both of which are critically important elements for this application.
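The headline metric in this abstract is the ROC area under the curve (AUC), which equals the probability that a randomly chosen positive examination receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise formulation, using hypothetical labels and scores rather than the study's data:

```python
import numpy as np

def roc_auc(labels, scores):
    # AUC as the Mann-Whitney statistic: fraction of (positive, negative)
    # pairs where the positive is scored higher; ties count one half.
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    equal = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * equal) / (len(pos) * len(neg))

labels = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])
print(roc_auc(labels, scores))  # 0.75: three of four pairs ranked correctly
```

Production code would typically use a rank-based implementation (O(n log n)) rather than this O(n²) pairwise comparison, but the pairwise form matches the probabilistic definition directly.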


Subjects
Deep Learning ; Intracranial Hemorrhages/diagnostic imaging ; Tomography, X-Ray Computed/methods ; Acute Disease ; Algorithms ; Humans ; Neural Networks, Computer
3.
J Struct Biol ; 187(1): 66-75, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24694675

ABSTRACT

Tilted electron microscope images are routinely collected for an ab initio structure reconstruction as a part of the Random Conical Tilt (RCT) or Orthogonal Tilt Reconstruction (OTR) methods, as well as for various applications using the "free-hand" procedure. These procedures all require identification of particle pairs in two corresponding images as well as accurate estimation of the tilt-axis used to rotate the electron microscope (EM) grid. Here we present a computational approach, PCT (particle correspondence from tilted pairs), based on tilt-invariant context and projection matching that addresses both problems. The method benefits from treating the two problems as a single optimization task. It automatically finds corresponding particle pairs and accurately computes the tilt-axis direction even in cases where the EM grid is not perfectly planar.


Subjects
IMP Dehydrogenase/ultrastructure ; Image Processing, Computer-Assisted/statistics & numerical data ; Imaging, Three-Dimensional/statistics & numerical data ; Ribosomes/ultrastructure ; Cryoelectron Microscopy/instrumentation ; Desulfovibrio vulgaris/chemistry ; Escherichia coli/chemistry ; Imaging, Three-Dimensional/instrumentation ; Imaging, Three-Dimensional/methods
4.
Sci Robot ; 9(89): eadi9579, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38630806

ABSTRACT

Humanoid robots that can autonomously operate in diverse environments have the potential to help address labor shortages in factories, assist the elderly at home, and colonize new planets. Although classical controllers for humanoid robots have shown impressive results in a number of settings, they are challenging to generalize and adapt to new environments. Here, we present a fully learning-based approach for real-world humanoid locomotion. Our controller is a causal transformer that takes the history of proprioceptive observations and actions as input and predicts the next action. We hypothesized that the observation-action history contains useful information about the world that a powerful transformer model can use to adapt its behavior in context, without updating its weights. We trained our model with large-scale model-free reinforcement learning on an ensemble of randomized environments in simulation and deployed it to the real world zero-shot. Our controller could walk over various outdoor terrains, was robust to external disturbances, and could adapt in context.
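The causal transformer described here conditions each predicted action only on past observations and actions. The core mechanism is causal self-attention, sketched below in NumPy for a single head; the actual controller's architecture, dimensions, and training setup are not specified in this abstract, so this is illustrative only:

```python
import numpy as np

def causal_attention(x):
    # Single-head self-attention over a (T, d) history sequence.
    # A causal mask blocks attention to future timesteps, so the output
    # at step t depends only on inputs 0..t -- the property that lets a
    # policy predict the next action from history alone.
    T, d = x.shape
    scores = x @ x.T / np.sqrt(d)                      # (T, T) similarities
    future = np.triu(np.ones((T, T)), k=1).astype(bool)
    scores[future] = -np.inf                           # mask the future
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                  # row-wise softmax
    return w @ x
```

Because the first timestep can attend only to itself, its output is exactly its input; later timesteps mix progressively more history.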


Subjects
Robotics ; Humans ; Aged ; Robotics/methods ; Locomotion ; Walking ; Learning ; Reinforcement, Psychology
5.
Sci Robot ; 8(79): eadf6991, 2023 Jun 28.
Article in English | MEDLINE | ID: mdl-37379376

ABSTRACT

Semantic navigation is necessary to deploy mobile robots in uncontrolled environments such as homes or hospitals. Many learning-based approaches have been proposed in response to the lack of semantic understanding of the classical pipeline for spatial navigation, which builds a geometric map using depth sensors and plans to reach point goals. Broadly, end-to-end learning approaches reactively map sensor inputs to actions with deep neural networks, whereas modular learning approaches enrich the classical pipeline with learning-based semantic sensing and exploration. However, learned visual navigation policies have predominantly been evaluated in sim, with little known about what works on a robot. We present a large-scale empirical study of semantic visual navigation methods comparing representative methods with classical, modular, and end-to-end learning approaches across six homes with no prior experience, maps, or instrumentation. We found that modular learning works well in the real world, attaining a 90% success rate. In contrast, end-to-end learning does not, dropping from 77% sim to a 23% real-world success rate because of a large image domain gap between sim and reality. For practitioners, we show that modular learning is a reliable approach to navigate to objects: Modularity and abstraction in policy design enable sim-to-real transfer. For researchers, we identify two key issues that prevent today's simulators from being reliable evaluation benchmarks-a large sim-to-real gap in images and a disconnect between sim and real-world error modes-and propose concrete steps forward.

6.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 8754-8765, 2022 Dec.
Article in English | MEDLINE | ID: mdl-30762530

ABSTRACT

We study the notion of consistency between a 3D shape and a 2D observation and propose a differentiable formulation which allows computing gradients of the 3D shape given an observation from an arbitrary view. We do so by reformulating view consistency using a differentiable ray consistency (DRC) term. We show that this formulation can be incorporated in a learning framework to leverage different types of multi-view observations e.g., foreground masks, depth, color images, semantics etc. as supervision for learning single-view 3D prediction. We present empirical analysis of our technique in a controlled setting. We also show that this approach allows us to improve over existing techniques for single-view reconstruction of objects from the PASCAL VOC dataset.

7.
J Struct Biol ; 175(3): 319-28, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21640190

ABSTRACT

The goal of this study is to evaluate the performance of software for automated particle-boxing, and in particular the performance of a new tool (TextonSVM) that recognizes the characteristic texture of particles of interest. As part of a high-throughput protocol, we use human editing that is based solely on class-average images to create final data sets that are enriched in what the investigator considers to be true-positive particles. The Fourier shell correlation (FSC) function is then used to characterize the homogeneity of different single-particle data sets that are derived from the same micrographs by two or more alternative methods. We find that the homogeneity is generally quite similar for class-edited data sets obtained by the texture-based method and by SIGNATURE, a cross-correlation-based method. The precision-recall characteristics of the texture-based method are, on the other hand, significantly better than those of the cross-correlation based method; that is to say, the texture-based approach produces a smaller fraction of false positives in the initial set of candidate particles. The computational efficiency of the two approaches is generally within a factor of two of one another. In situations when it is helpful to use a larger number of templates (exemplars), however, TextonSVM scales in a much more efficient way than do boxing programs that are based on localized cross-correlation.


Subjects
Algorithms ; Software ; Cryoelectron Microscopy
8.
J Struct Biol ; 170(1): 98-108, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20085819

ABSTRACT

Biological macromolecules can adopt multiple conformational and compositional states due to structural flexibility and alternative subunit assemblies. This structural heterogeneity poses a major challenge in the study of macromolecular structure using single-particle electron microscopy. We propose a fully automated, unsupervised method for the three-dimensional reconstruction of multiple structural models from heterogeneous data. As a starting reference, our method employs an initial structure that does not account for any heterogeneity. Then, a multi-stage clustering is used to create multiple models representative of the heterogeneity within the sample. The multi-stage clustering combines an existing approach based on Multivariate Statistical Analysis to perform clustering within individual Euler angles, and a newly developed approach to sort out class averages from individual Euler angles into homogeneous groups. Structural models are computed from individual clusters. The whole data classification is further refined using an iterative multi-model projection-matching approach. We tested our method on one synthetic and three distinct experimental datasets. The tests include the cases where a macromolecular complex exhibits structural flexibility and cases where a molecule is found in ligand-bound and unbound states. We propose the use of our approach as an efficient way to reconstruct distinct multiple models from heterogeneous data.


Subjects
Algorithms ; Chemistry Techniques, Analytical/methods ; Image Processing, Computer-Assisted/methods ; Macromolecular Substances/chemistry ; Microscopy, Electron/methods ; Models, Molecular ; Eukaryotic Initiation Factor-3/chemistry ; RNA Polymerase II/chemistry ; Ribosomes/chemistry
9.
IEEE Trans Pattern Anal Mach Intell ; 42(6): 1348-1361, 2020 Jun.
Article in English | MEDLINE | ID: mdl-30714908

ABSTRACT

Recently, Convolutional Neural Networks have shown promising results for 3D geometry prediction. They can make predictions from very little input data such as a single color image. A major limitation of such approaches is that they only predict a coarse resolution voxel grid, which does not capture the surface of the objects well. We propose a general framework, called hierarchical surface prediction (HSP), which facilitates prediction of high resolution voxel grids. The main insight is that it is sufficient to predict high resolution voxels around the predicted surfaces. The exterior and interior of the objects can be represented with coarse resolution voxels. This allows us to predict significantly higher resolution voxel grids around the surface, from which triangle meshes can be extracted. Additionally it allows us to predict properties such as surface color which are only defined on the surface. Our approach is not dependent on a specific input type. We show results for geometry prediction from color images and depth images. Our analysis shows that our high resolution predictions are more accurate than low resolution predictions.

10.
J Vis ; 9(6): 19.1-8, 2009 Jun 29.
Article in English | MEDLINE | ID: mdl-19761310

ABSTRACT

Rapid category detection, as discovered by S. Thorpe, D. Fize, and C. Marlot (1996), demonstrated that the human visual system can detect object categories in natural images in as little as 150 ms. To gain insight into this phenomenon and to determine its relevance to naturally occurring conditions, we degrade the stimulus set along various image dimensions and investigate the effects on perception. To investigate how well modern-day computer vision algorithms cope with degradations, we conduct an analog of this same experiment with state-of-the-art object recognition algorithms. We discover that rapid category detection in humans is quite robust to naturally occurring degradations and is mediated by a non-linear interaction of visual features. In contrast, modern-day object recognition algorithms are not as robust.


Subjects
Classification ; Pattern Recognition, Visual ; Photic Stimulation/methods ; Algorithms ; Artificial Intelligence ; Humans ; Image Enhancement/methods ; Pattern Recognition, Automated ; Pattern Recognition, Visual/physiology ; Reaction Time ; Saccades ; Time Factors
11.
J Vis ; 7(8): 2, 2007 Jun 08.
Article in English | MEDLINE | ID: mdl-17685809

ABSTRACT

Figure-ground organization refers to the visual perception that a contour separating two regions belongs to one of the regions. Recent studies have found neural correlates of figure-ground assignment in V2 as early as 10-25 ms after response onset, providing strong support for the role of local bottom-up processing. How much information about figure-ground assignment is available from locally computed cues? Using a large collection of natural images, in which neighboring regions were assigned a figure-ground relation by human observers, we quantified the extent to which figural regions locally tend to be smaller, more convex, and lie below ground regions. Our results suggest that these Gestalt cues are ecologically valid, and we quantify their relative power. We have also developed a simple bottom-up computational model of figure-ground assignment that takes image contours as input. Using parameters fit to natural image statistics, the model is capable of matching human-level performance when scene context is limited.


Subjects
Cues ; Field Dependence-Independence ; Nature ; Photic Stimulation/methods ; Visual Perception ; Humans ; Light ; Models, Psychological
12.
IEEE Trans Pattern Anal Mach Intell ; 39(4): 627-639, 2017 04.
Article in English | MEDLINE | ID: mdl-27295654

ABSTRACT

Recognition algorithms based on convolutional networks (CNNs) typically use the output of the last layer as a feature representation. However, the information in this layer may be too coarse spatially to allow precise localization. On the contrary, earlier layers may be precise in localization but will not capture semantics. To get the best of both worlds, we define the hypercolumn at a pixel as the vector of activations of all CNN units above that pixel. Using hypercolumns as pixel descriptors, we show results on three fine-grained localization tasks: simultaneous detection and segmentation, where we improve state-of-the-art from 49.7 mean APr to 62.4, keypoint localization, where we get a 3.3 point boost over a strong regression baseline using CNN features, and part labeling, where we show a 6.6 point gain over a strong baseline.
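The hypercolumn construction above — stacking the activations of all CNN units above a pixel — amounts to upsampling each layer's feature map to a common resolution and concatenating along the channel axis. A minimal NumPy illustration with nearest-neighbor upsampling (the paper's interpolation scheme and layer selection may differ):

```python
import numpy as np

def upsample_nn(fmap, size):
    # Nearest-neighbor upsampling of a (C, h, w) feature map to (C, size, size).
    c, h, w = fmap.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return fmap[:, rows][:, :, cols]

def hypercolumns(feature_maps, size):
    # Per-pixel descriptor: concatenate every layer's activations above
    # that pixel, after bringing all maps to the same spatial resolution.
    return np.concatenate([upsample_nn(f, size) for f in feature_maps], axis=0)

# Two hypothetical layers: one fine (2x4x4), one coarse (3x2x2).
f1 = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
f2 = np.arange(3 * 2 * 2, dtype=float).reshape(3, 2, 2)
hc = hypercolumns([f1, f2], 4)   # shape (5, 4, 4): 2 + 3 channels per pixel
```

Each spatial location of `hc` is then a single vector combining coarse semantic and fine localization information, as the abstract describes.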

13.
IEEE Trans Pattern Anal Mach Intell ; 39(4): 719-731, 2017 04.
Article in English | MEDLINE | ID: mdl-27254860

ABSTRACT

We address the problem of fully automatic object localization and reconstruction from a single image. This is both a very challenging and very important problem which has, until recently, received limited attention due to difficulties in segmenting objects and predicting their poses. Here we leverage recent advances in learning convolutional networks for object detection and segmentation and introduce a complementary network for the task of camera viewpoint prediction. These predictors are very powerful, but still not perfect given the stringent requirements of shape reconstruction. Our main contribution is a new class of deformable 3D models that can be robustly fitted to images based on noisy pose and silhouette estimates computed upstream and that can be learned directly from 2D annotations available in object detection datasets. Our models capture top-down information about the main global modes of shape variation within a class providing a "low-frequency" shape. In order to capture fine instance-specific shape details, we fuse it with a high-frequency component recovered from shading cues. A comprehensive quantitative analysis and ablation study on the PASCAL 3D+ dataset validates the approach as we show fully automatic reconstructions on PASCAL VOC as well as large improvements on the task of viewpoint prediction.

14.
IEEE Trans Pattern Anal Mach Intell ; 39(1): 128-140, 2017 01.
Article in English | MEDLINE | ID: mdl-26955014

ABSTRACT

We propose a unified approach for bottom-up hierarchical image segmentation and object proposal generation for recognition, called Multiscale Combinatorial Grouping (MCG). For this purpose, we first develop a fast normalized cuts algorithm. We then propose a high-performance hierarchical segmenter that makes effective use of multiscale information. Finally, we propose a grouping strategy that combines our multiscale regions into highly-accurate object proposals by exploring efficiently their combinatorial space. We also present Single-scale Combinatorial Grouping (SCG), a faster version of MCG that produces competitive proposals in under five seconds per image. We conduct an extensive and comprehensive empirical validation on the BSDS500, SegVOC12, SBD, and COCO datasets, showing that MCG produces state-of-the-art contours, hierarchical regions, and object proposals.

15.
IEEE Trans Pattern Anal Mach Intell ; 39(3): 546-560, 2017 03.
Article in English | MEDLINE | ID: mdl-27101598

ABSTRACT

Light-field cameras are quickly becoming commodity items, with consumer and industrial applications. They capture many nearby views simultaneously using a single image with a micro-lens array, thereby providing a wealth of cues for depth recovery: defocus, correspondence, and shading. In particular, apart from conventional image shading, one can refocus images after acquisition, and shift one's viewpoint within the sub-apertures of the main lens, effectively obtaining multiple views. We present a principled algorithm for dense depth estimation that combines defocus and correspondence metrics. We then extend our analysis to the additional cue of shading, using it to refine fine details in the shape. By exploiting an all-in-focus image, in which pixels are expected to exhibit angular coherence, we define an optimization framework that integrates photo consistency, depth consistency, and shading consistency. We show that combining all three sources of information: defocus, correspondence, and shading, outperforms state-of-the-art light-field depth estimation algorithms in multiple scenarios.

16.
IEEE Trans Pattern Anal Mach Intell ; 28(7): 1052-62, 2006 Jul.
Article in English | MEDLINE | ID: mdl-16792095

ABSTRACT

The problem we consider in this paper is to take a single two-dimensional image containing a human figure, locate the joint positions, and use these to estimate the body configuration and pose in three-dimensional space. The basic approach is to store a number of exemplar 2D views of the human body in a variety of different configurations and viewpoints with respect to the camera. On each of these stored views, the locations of the body joints (left elbow, right knee, etc.) are manually marked and labeled for future use. The input image is then matched to each stored view, using the technique of shape context matching in conjunction with a kinematic chain-based deformation model. Assuming that there is a stored view sufficiently similar in configuration and pose, the correspondence process will succeed. The locations of the body joints are then transferred from the exemplar view to the test shape. Given the 2D joint locations, the 3D body configuration and pose are then estimated using an existing algorithm. We can apply this technique to video by treating each frame independently--tracking just becomes repeated recognition. We present results on a variety of data sets.


Subjects
Image Interpretation, Computer-Assisted/methods ; Imaging, Three-Dimensional/methods ; Joints/anatomy & histology ; Joints/physiology ; Pattern Recognition, Automated/methods ; Posture/physiology ; Whole Body Imaging/methods ; Algorithms ; Artificial Intelligence ; Cluster Analysis ; Computer Simulation ; Humans ; Image Enhancement/methods ; Information Storage and Retrieval/methods ; Models, Biological ; Reproducibility of Results ; Sensitivity and Specificity ; Subtraction Technique
17.
IEEE Trans Pattern Anal Mach Intell ; 38(4): 690-703, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26959674

ABSTRACT

In this paper, we present a technique for recovering a model of shape, illumination, reflectance, and shading from a single image taken from an RGB-D sensor. To do this, we extend the SIRFS ("shape, illumination and reflectance from shading") model, which recovers intrinsic scene properties from a single image. Though SIRFS works well on neatly segmented images of objects, it performs poorly on images of natural scenes which often contain occlusion and spatially-varying illumination. We therefore present Scene-SIRFS, a generalization of SIRFS in which we model a scene using a mixture of shapes and a mixture of illuminations, where those mixture components are embedded in a "soft" segmentation-like representation of the input image. We use the noisy depth maps provided by RGB-D sensors (such as the Microsoft Kinect) to guide and improve shape estimation. Our model takes as input a single RGB-D image and produces as output an improved depth map, a set of surface normals, a reflectance image, a shading image, and a spatially varying model of illumination. The output of our model can be used for graphics applications such as relighting and retargeting, or for more broad applications (recognition, segmentation) involving RGB-D images.

18.
IEEE Trans Pattern Anal Mach Intell ; 38(1): 142-58, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26656583

ABSTRACT

Object detection performance, as measured on the canonical PASCAL VOC Challenge datasets, plateaued in the final years of the competition. The best-performing methods were complex ensemble systems that typically combined multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 50 percent relative to the previous best result on VOC 2012-achieving a mAP of 62.4 percent. Our approach combines two ideas: (1) one can apply high-capacity convolutional networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data are scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, boosts performance significantly. Since we combine region proposals with CNNs, we call the resulting model an R-CNN or Region-based Convolutional Network. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.

19.
IEEE Trans Pattern Anal Mach Intell ; 38(6): 1155-69, 2016 06.
Article in English | MEDLINE | ID: mdl-26372203

ABSTRACT

Light-field cameras have now become available in both consumer and industrial applications, and recent papers have demonstrated practical algorithms for depth recovery from a passive single-shot capture. However, current light-field depth estimation methods are designed for Lambertian objects and fail or degrade for glossy or specular surfaces. The standard Lambertian photoconsistency measure considers the variance of different views, effectively enforcing point-consistency, i.e., that all views map to the same point in RGB space. This variance or point-consistency condition is a poor metric for glossy surfaces. In this paper, we present a novel theory of the relationship between light-field data and reflectance from the dichromatic model. We present a physically-based and practical method to estimate the light source color and separate specularity. We present a new photo consistency metric, line-consistency, which represents how viewpoint changes affect specular points. We then show how the new metric can be used in combination with the standard Lambertian variance or point-consistency measure to give us results that are robust against scenes with glossy surfaces. With our analysis, we can also robustly estimate multiple light source colors and remove the specular component from glossy objects. We show that our method outperforms current state-of-the-art specular removal and depth estimation algorithms in multiple real world scenarios using the consumer Lytro and Lytro Illum light field cameras.

20.
IEEE Trans Pattern Anal Mach Intell ; 27(11): 1832-7, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16285381

ABSTRACT

We demonstrate that shape contexts can be used to quickly prune a search for similar shapes. We present two algorithms for rapid shape retrieval: representative shape contexts, performing comparisons based on a small number of shape contexts, and shapemes, using vector quantization in the space of shape contexts to obtain prototypical shape pieces.
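Shapemes, as described, are prototypical shape pieces obtained by vector quantization in the space of shape-context descriptors. A minimal sketch using plain k-means (deterministic initialization chosen for illustration; the paper's quantization details may differ):

```python
import numpy as np

def shapemes(descriptors, k, iters=20):
    # Quantize (n, d) shape-context descriptors into k prototype "shapemes".
    # Returns the k cluster centers and the shapeme label of each descriptor.
    n = len(descriptors)
    # Sketch-only deterministic init: evenly spaced sample descriptors.
    centers = descriptors[:: max(n // k, 1)][:k].astype(float)
    for _ in range(iters):
        dists = ((descriptors[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = dists.argmin(1)              # assign to nearest center
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(0)  # update center
    return centers, labels
```

A shape can then be summarized by its histogram of shapeme labels, which makes comparing shapes a cheap histogram comparison rather than a full shape-context match.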


Subjects
Algorithms ; Artificial Intelligence ; Image Interpretation, Computer-Assisted/methods ; Imaging, Three-Dimensional/methods ; Information Storage and Retrieval/methods ; Pattern Recognition, Automated/methods ; Subtraction Technique ; Computer Graphics ; Image Enhancement/methods ; Numerical Analysis, Computer-Assisted ; Signal Processing, Computer-Assisted ; User-Computer Interface