1.
IEEE Trans Image Process ; 33: 2936-2949, 2024.
Article in English | MEDLINE | ID: mdl-38619939

ABSTRACT

Depth estimation is a fundamental task in many vision applications. With the growing popularity of omnidirectional cameras, tackling this problem in spherical space has become a new trend. In this paper, we propose a learning-based method for predicting dense depth values of a scene from a monocular omnidirectional image. An omnidirectional image has a full field of view and describes the scene far more completely than a perspective image. However, the fully convolutional networks that most current solutions rely on fail to capture rich global context from the panorama. To address this issue, as well as the distortion that equirectangular projection introduces in the panorama, we propose Cubemap Vision Transformers (CViT), a new transformer-based architecture that can model long-range dependencies and extract distortion-free global features from the panorama. We show that cubemap vision transformers have a global receptive field at every stage and can provide globally coherent predictions for spherical signals. As a general architecture, it removes the restrictions that many other monocular panoramic depth estimation methods impose on the panorama. To preserve important local features, we further design a convolution-based branch in our pipeline (dubbed GLPanoDepth) and fuse global features from cubemap vision transformers at multiple scales. This global-to-local strategy allows us to fully exploit useful global and local features in the panorama, achieving state-of-the-art performance in panoramic depth estimation.
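
The cubemap projection mentioned in this abstract can be illustrated with a minimal NumPy sketch that resamples an equirectangular panorama onto one cube face. It is not the authors' CViT code. The axis convention (x right, y up, z forward), the face names, the nearest-neighbour sampling, and the function names `cube_face_directions` and `equirect_to_face` are all assumptions made for this example.

```python
import numpy as np

def cube_face_directions(face: str, size: int) -> np.ndarray:
    """Unit view directions, shape (size, size, 3), for the pixel centers of one cube face."""
    # Pixel centers on the face plane, spanning (-1, 1).
    t = (np.arange(size) + 0.5) / size * 2.0 - 1.0
    u, v = np.meshgrid(t, -t)  # u grows to the right, v grows upward (row 0 is the top)
    one = np.ones_like(u)
    # Assumed convention: x right, y up, z forward.
    faces = {
        "front": (u, v, one),
        "back": (-u, v, -one),
        "right": (one, v, -u),
        "left": (-one, v, u),
        "up": (u, one, -v),
        "down": (u, -one, v),
    }
    d = np.stack(faces[face], axis=-1)
    return d / np.linalg.norm(d, axis=-1, keepdims=True)

def equirect_to_face(pano: np.ndarray, face: str, size: int) -> np.ndarray:
    """Sample one cube face from an equirectangular image (nearest neighbour)."""
    h, w = pano.shape[:2]
    d = cube_face_directions(face, size)
    lon = np.arctan2(d[..., 0], d[..., 2])          # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(d[..., 1], -1.0, 1.0))  # latitude in [-pi/2, pi/2]
    col = ((lon / (2 * np.pi) + 0.5) * w).astype(int) % w
    row = np.clip(((0.5 - lat / np.pi) * h).astype(int), 0, h - 1)
    return pano[row, col]
```

Each face covers a 90-degree field of view, so it behaves like an ordinary perspective image. This is the usual reason cubemaps are used to sidestep the stretching that equirectangular images show near the poles.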

2.
IEEE Trans Vis Comput Graph ; 29(11): 4405-4416, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37782598

ABSTRACT

Predicting panoramic indoor lighting from a single perspective image is a fundamental but highly ill-posed problem in computer vision and graphics. To achieve locale-aware and robust prediction, this problem can be decomposed into three sub-tasks: depth-based image warping, panorama inpainting, and high-dynamic-range (HDR) reconstruction, among which the success of panorama inpainting plays a key role. Recent methods mostly rely on convolutional neural networks (CNNs) to fill in the missing content in the warped panorama. However, they usually achieve suboptimal performance, since the missing content occupies a very large portion of the panoramic space while CNNs are limited by their receptive fields. The spatially varying distortion in spherical signals makes the task even harder for conventional CNNs. To address these issues, we propose a local-to-global strategy for large-scale panorama inpainting. In our method, depth-guided local inpainting is first applied to the warped panorama to fill small but dense holes. Then, a transformer-based network, dubbed PanoTransformer, is designed to hallucinate reasonable global structures in the large holes. To avoid distortion, we further employ cubemap projection in our design of PanoTransformer. The high-quality panorama recovered at any locale helps us to capture spatially varying indoor illumination with physically plausible global structures and fine details.
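
The spatially varying distortion mentioned in this abstract can be quantified with a short sketch. It computes the solid angle that a single pixel covers in each row of an equirectangular image. The abstract does not describe this computation; the function name `pixel_solid_angles` and the row layout (top row at the north pole) are assumptions made for illustration.

```python
import numpy as np

def pixel_solid_angles(height: int, width: int) -> np.ndarray:
    """Solid angle in steradians of one pixel in each row of an equirectangular image."""
    # Latitude of the row boundaries, from +pi/2 (top) down to -pi/2 (bottom).
    edges = np.linspace(np.pi / 2, -np.pi / 2, height + 1)
    # A latitude band between two boundaries has area 2*pi*(sin(top) - sin(bottom)).
    band = 2 * np.pi * (np.sin(edges[:-1]) - np.sin(edges[1:]))
    # Every row splits its band evenly across `width` pixels.
    return band / width

weights = pixel_solid_angles(512, 1024)
# Pixels near the equator cover far more of the sphere than pixels near the poles.
ratio = weights[256] / weights[0]
```

Because every row has the same number of pixels, rows near the poles are heavily oversampled. A convolution kernel of fixed size therefore sees a different angular extent at each latitude.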

3.
Biomed Res Int ; 2022: 7921922, 2022.
Article in English | MEDLINE | ID: mdl-36457339

ABSTRACT

Accurate nuclear instance segmentation and classification in histopathologic images are the foundation of cancer diagnosis and prognosis. Several challenges restrict the development of accurate simultaneous nuclear instance segmentation and classification. First, nuclei of different categories can look similar, making it difficult to distinguish between types. Second, it is difficult to separate highly clustered nuclear instances. Third, few current studies have considered the global dependencies among diverse nuclear instances. In this article, we propose a novel deep learning framework named TSHVNet, which integrates multiattention modules (i.e., Transformer and SimAM) into the state-of-the-art HoVer-Net for more accurate nuclear instance segmentation and classification. Specifically, the Transformer attention module is employed on the trunk of the HoVer-Net to model the long-distance relationships of diverse nuclear instances. The SimAM attention modules are deployed on both the trunk and branches to apply 3D channel and spatial attention and assign appropriate weights to neurons. Finally, we validate the proposed method on two public datasets: PanNuke and CoNSeP. The comparison results show that the proposed TSHVNet performs outstandingly among state-of-the-art methods. In particular, compared with the original HoVer-Net, nuclear instance segmentation performance evaluated by the PQ index increased by 1.4% and 2.8% on the CoNSeP and PanNuke datasets, respectively, and nuclear classification performance measured by F1_score increased by 2.4% and 2.5% on the CoNSeP and PanNuke datasets, respectively. Therefore, the proposed multiattention-based TSHVNet has great potential for simultaneous nuclear instance segmentation and classification.
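
The PQ index cited in this abstract is commonly read as panoptic quality. The sketch below assumes the standard definition: a predicted and a ground-truth instance match when their IoU exceeds 0.5, and PQ is the sum of the matched IoUs divided by TP + 0.5 FP + 0.5 FN. The abstract does not give the paper's exact evaluation protocol, so the function name `panoptic_quality` and the label-map input format (0 for background, positive integers for instances) are assumptions.

```python
import numpy as np

def panoptic_quality(gt: np.ndarray, pred: np.ndarray) -> float:
    """Panoptic quality for two instance label maps (0 = background)."""
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [i for i in np.unique(pred) if i != 0]
    matched_ious = []
    for g in gt_ids:
        g_mask = gt == g
        # Only predictions that overlap this ground-truth instance can match it.
        for p in np.unique(pred[g_mask]):
            if p == 0:
                continue
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            iou = inter / union
            # With a threshold of 0.5, each instance has at most one match.
            if iou > 0.5:
                matched_ious.append(iou)
                break
    tp = len(matched_ious)
    fp = len(pred_ids) - tp  # predictions with no matching ground truth
    fn = len(gt_ids) - tp    # ground-truth instances that were missed
    denom = tp + 0.5 * fp + 0.5 * fn
    return float(sum(matched_ious) / denom) if denom > 0 else 0.0
```

For example, one perfectly segmented nucleus, one missed nucleus, and one spurious detection give TP = 1, FP = 1, FN = 1, so PQ = 1 / (1 + 0.5 + 0.5) = 0.5.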


Subjects
Cell Nucleus, Electric Power Supplies, Cluster Analysis, Neurons