Results 1 - 20 of 96
1.
J Biol Chem ; 289(25): 17895-908, 2014 Jun 20.
Article in English | MEDLINE | ID: mdl-24828504

ABSTRACT

The fibrillar assembly and deposition of amyloid β (Aβ) protein, a key pathology of Alzheimer disease, can occur in the form of parenchymal amyloid plaques and cerebral amyloid angiopathy (CAA). Familial forms of CAA exist in the absence of appreciable parenchymal amyloid pathology. The molecular interplay between parenchymal amyloid plaques and CAA is unclear. Here we investigated how early-onset parenchymal amyloid plaques impact the development of microvascular amyloid in transgenic mice. Tg-5xFAD mice, which produce non-mutated human Aβ and develop early-onset parenchymal amyloid plaques, were bred to Tg-SwDI mice, which produce familial CAA mutant human Aβ and develop cerebral microvascular amyloid. The bigenic mice presented with an elevated accumulation of Aβ and fibrillar amyloid in the brain compared with either single transgenic line. Tg-SwDI/Tg-5xFAD mice were devoid of microvascular amyloid, the prominent pathology of Tg-SwDI mice, but exhibited larger parenchymal amyloid plaques compared with Tg-5xFAD mice. The larger parenchymal amyloid deposits were associated with a higher loss of cortical neurons and elevated activated microglia in the bigenic Tg-SwDI/Tg-5xFAD mice. The periphery of parenchymal amyloid plaques was largely composed of CAA mutant Aβ. Non-mutated Aβ fibril seeds promoted CAA mutant Aβ fibril formation in vitro. Further, intrahippocampal administration of biotin-labeled CAA mutant Aβ peptide accumulated on and adjacent to pre-existing parenchymal amyloid plaques in Tg-5xFAD mice. These findings indicate that early-onset parenchymal amyloid plaques can serve as a scaffold to capture CAA mutant Aβ peptides and prevent their accumulation in cerebral microvessels.


Subjects
Amyloid beta-Peptides/metabolism , Cerebral Amyloid Angiopathy/metabolism , Cerebral Amyloid Angiopathy/physiopathology , Cerebral Cortex/blood supply , Cerebral Cortex/metabolism , Cerebrovascular Circulation , Plaque, Amyloid/metabolism , Amyloid beta-Peptides/genetics , Animals , Cerebral Amyloid Angiopathy/genetics , Cerebral Amyloid Angiopathy/pathology , Cerebral Cortex/pathology , Humans , Mice , Mice, Transgenic , Mutation , Plaque, Amyloid/genetics , Plaque, Amyloid/pathology
2.
Article in English | MEDLINE | ID: mdl-38776190

ABSTRACT

Although face swapping has attracted much attention in recent years, it remains a challenging problem. Existing methods leverage a large number of data samples to explore the intrinsic properties of face swapping without considering the semantic information of face images. Moreover, the representation of the identity information tends to be fixed, leading to suboptimal face swapping. In this paper, we present a simple yet efficient method named FaceSwapper, for one-shot face swapping based on Generative Adversarial Networks. Our method consists of a disentangled representation module and a semantic-guided fusion module. The disentangled representation module comprises an attribute encoder and an identity encoder, which aims to achieve the disentanglement of the identity and attribute information. The identity encoder is more flexible, and the attribute encoder contains more attribute details than its competitors. Benefiting from the disentangled representation, FaceSwapper can swap face images progressively. In addition, semantic information is introduced into the semantic-guided fusion module to control the swapped region and model the pose and expression more accurately. Experimental results show that our method achieves state-of-the-art results on benchmark datasets with fewer training samples. Our code is publicly available at https://github.com/liqi-casia/FaceSwapper.

3.
IEEE Trans Pattern Anal Mach Intell ; 46(8): 5541-5555, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38412089

ABSTRACT

Optical aberration is a ubiquitous degeneration in realistic lens-based imaging systems. Optical aberrations are caused by the differences in the optical path length when light travels through different regions of the camera lens with different incident angles. The blur and chromatic aberrations manifest significant discrepancies when the optical system changes. This work designs a transferable and effective image simulation system of simple lenses via multi-wavelength, depth-aware, spatially-variant four-dimensional point spread functions (4D-PSFs) estimation by changing a small amount of lens-dependent parameters. The image simulation system can alleviate the overhead of dataset collecting and exploiting the principle of computational imaging for effective optical aberration correction. With the guidance of domain knowledge about the image formation model provided by the 4D-PSFs, we establish a multi-scale optical aberration correction network for degraded image reconstruction, which consists of a scene depth estimation branch and an image restoration branch. Specifically, we propose to predict adaptive filters with the depth-aware PSFs and carry out dynamic convolutions, which facilitate the model's generalization in various scenes. We also employ convolution and self-attention mechanisms for global and local feature extraction and realize a spatially-variant restoration. The multi-scale feature extraction complements the features across different scales and provides fine details and contextual features. Extensive experiments demonstrate that our proposed algorithm performs favorably against state-of-the-art restoration methods.

4.
IEEE Trans Pattern Anal Mach Intell ; 46(2): 1049-1064, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37878438

ABSTRACT

Video captioning aims to generate natural language descriptions for a given video clip. Existing methods mainly focus on end-to-end representation learning via word-by-word comparison between predicted captions and ground-truth texts. Although significant progress has been made, such supervised approaches neglect semantic alignment between visual and linguistic entities, which may negatively affect the generated captions. In this work, we propose a hierarchical modular network to bridge video representations and linguistic semantics at four granularities before generating captions: entity, verb, predicate, and sentence. Each level is implemented by one module to embed corresponding semantics into video representations. Additionally, we present a reinforcement learning module based on the scene graph of captions to better measure sentence similarity. Extensive experimental results show that the proposed method performs favorably against the state-of-the-art models on three widely-used benchmark datasets, including Microsoft Research Video Description Corpus (MSVD), MSR-Video to Text (MSR-VTT), and Video-And-TEXt (VATEX).

5.
IEEE Trans Pattern Anal Mach Intell ; 46(4): 2533-2544, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37995157

ABSTRACT

This article targets the task of novel category discovery (NCD), which aims to discover unknown categories when a certain number of classes are already known. The NCD task is challenging due to its closeness to real-world scenarios, where we have only encountered some partial classes and corresponding images. Unlike previous approaches to NCD, we propose a novel adaptive prototype learning method that leverages prototypes to emphasize category discrimination and alleviate the issue of missing annotations for novel classes. Concretely, the proposed method consists of two main stages: prototypical representation learning and prototypical self-training. In the first stage, we develop a robust feature extractor that could effectively handle images from both base and novel categories. This ability of instance and category discrimination of the feature extractor is boosted by self-supervised learning and adaptive prototypes. In the second stage, we utilize the prototypes again to rectify offline pseudo labels and train a final parametric classifier for category clustering. We conduct extensive experiments on four benchmark datasets, demonstrating our method's effectiveness and robustness with state-of-the-art performance.

6.
IEEE Trans Med Imaging ; PP, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38722726

ABSTRACT

Owing to the success of transformer models, recent works study their applicability in 3D medical segmentation tasks. Within the transformer models, the self-attention mechanism is one of the main building blocks that strives to capture long-range dependencies, compared to the local convolutional-based design. However, the self-attention operation has quadratic complexity which proves to be a computational bottleneck, especially in volumetric medical imaging, where the inputs are 3D with numerous slices. In this paper, we propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks as well as efficiency in terms of parameters, compute cost, and inference speed. The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features using a pair of inter-dependent branches based on spatial and channel attention. Our spatial attention formulation is efficient and has linear complexity with respect to the input. To enable communication between spatial and channel-focused branches, we share the weights of query and key mapping functions that provide a complementary benefit (paired attention), while also reducing the complexity. Our extensive evaluations on five benchmarks, Synapse, BTCV, ACDC, BraTS, and Decathlon-Lung, reveal the effectiveness of our contributions in terms of both efficiency and accuracy. On Synapse, our UNETR++ sets a new state-of-the-art with a Dice Score of 87.2%, while significantly reducing parameters and FLOPs by over 71%, compared to the best method in the literature. Our code and models are available at: https://tinyurl.com/2p87x5xn.
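
For readers unfamiliar with the Dice Score quoted in this abstract, the following is a minimal sketch of the standard Dice overlap between two binary masks. It is not taken from the UNETR++ code; the function name `dice_score`, the flat-list mask representation, and the empty-mask convention are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 values (hypothetical representation;
    a real 3D volume would be flattened first).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
print(dice_score(predicted, reference))
```

A reported value such as 87.2% would typically be this quantity averaged over organs and test cases, although the exact averaging protocol is not stated in the abstract.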

7.
Article in English | MEDLINE | ID: mdl-38905085

ABSTRACT

A desirable objective in self-supervised learning (SSL) is to avoid feature collapse. Whitening loss guarantees collapse avoidance by minimizing the distance between embeddings of positive pairs under the conditioning that the embeddings from different views are whitened. In this paper, we propose a framework with an informative indicator to analyze whitening loss, which provides a clue to demystify several interesting phenomena and a pivoting point connecting to other SSL methods. We show that batch whitening (BW) based methods do not impose whitening constraints on the embedding but only require the embedding to be full-rank. This full-rank constraint is also sufficient to avoid dimensional collapse. We further demonstrate that the stable rank of the embedding is invariant during training by gradient descent, given the assumption that embedding is updated with an infinitely small learning rate. Based on our analysis, we propose channel whitening with random group partition (CW-RGP), which exploits the advantages of BW-based methods in preventing collapse and avoids their disadvantages requiring large batch size. Experimental results on ImageNet classification and COCO object detection reveal that the proposed CW-RGP possesses a promising potential for learning good representations.
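
As a reading aid, the two quantities this abstract contrasts can be written out using their standard textbook definitions. The notation is an assumption and is not taken from the paper: $Z \in \mathbb{R}^{d \times m}$ denotes a centered, nonzero embedding matrix for a batch of $m$ samples, with singular values $\sigma_1 \ge \sigma_2 \ge \dots$

```latex
% Whitening constraint (assumed notation): unit covariance of the embedding
\frac{1}{m} Z Z^{\top} = I_d

% Stable rank: a smooth surrogate for rank, bounded by the true rank
\operatorname{srank}(Z)
  = \frac{\lVert Z \rVert_F^{2}}{\lVert Z \rVert_2^{2}}
  = \frac{\sum_{i} \sigma_i^{2}}{\sigma_1^{2}},
\qquad
1 \le \operatorname{srank}(Z) \le \operatorname{rank}(Z)
```

A whitened embedding has all singular values equal, so its stable rank equals $d$ and it is full-rank. The converse does not hold, which is why a full-rank requirement is a weaker condition than whitening.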

8.
Article in English | MEDLINE | ID: mdl-38949945

ABSTRACT

Few-Shot Instance Segmentation (FSIS) requires detecting and segmenting novel classes with limited support examples. Existing methods based on Region Proposal Networks (RPNs) face two issues: 1) Overfitting suppresses novel class objects; 2) Dual-branch models require complex spatial correlation strategies to prevent spatial information loss when generating class prototypes. We introduce a unified framework, Reference Twice (RefT), to exploit the relationship between support and query features for FSIS and related tasks. Our three main contributions are: 1) A novel transformer-based baseline that avoids overfitting, offering a new direction for FSIS; 2) Demonstrating that support object queries encode key factors after base training, allowing query features to be enhanced twice at both feature and query levels using simple cross-attention, thus avoiding complex spatial correlation interaction; 3) Introducing a class-enhanced base knowledge distillation loss to address the issue of DETR-like models struggling with incremental settings due to the input projection layer, enabling easy extension to incremental FSIS. Extensive experimental evaluations on the COCO dataset under three FSIS settings demonstrate that our method performs favorably against existing approaches across different shots, e.g., +8.2/+9.4 performance gain over state-of-the-art methods with 10/30-shots. Source code and models will be available at this github site.

9.
Article in English | MEDLINE | ID: mdl-38683713

ABSTRACT

Crowd localization aims to predict the positions of humans in images of crowded scenes. While existing methods have made significant progress, two primary challenges remain: (i) a fixed number of evenly distributed anchors can cause excessive or insufficient predictions across regions in an image with varying crowd densities, and (ii) ranking inconsistency of predictions between the testing and training phases leads to the model being sub-optimal in inference. To address these issues, we propose a Consistency-Aware Anchor Pyramid Network (CAAPN) comprising two key components: an Adaptive Anchor Generator (AAG) and a Localizer with Augmented Matching (LAM). The AAG module adaptively generates anchors based on estimated crowd density in local regions to alleviate the anchor deficiency or excess problem. It also considers the spatial distribution prior to heads for better performance. The LAM module is designed to augment the predictions which are used to optimize the neural network during training by introducing an extra set of target candidates and correctly matching them to the ground truth. The proposed method achieves favorable performance against state-of-the-art approaches on five challenging datasets: ShanghaiTech A and B, UCF-QNRF, JHU-CROWD++, and NWPU-Crowd. The source code and trained models will be released at https://github.com/ucasyan/CAAPN.

10.
J Neuroinflammation ; 10: 134, 2013 Nov 05.
Article in English | MEDLINE | ID: mdl-24188129

ABSTRACT

BACKGROUND: Abnormal accumulation of amyloid β-protein (Aβ) in the brain plays an important role in the pathogenesis of Alzheimer's disease (AD). Aβ monomers assemble into oligomers and fibrils that promote neuronal dysfunction. This assembly pathway is influenced by naturally occurring brain molecules, the Aβ chaperone proteins, which bind to Aβ and modulate its aggregation. Myelin basic protein (MBP) was previously identified as a novel Aβ chaperone protein and a potent inhibitor for Aβ fibril assembly in vitro. METHODS: In this study, we determined whether the absence of MBP would influence Aβ pathology in vivo by breeding MBP knockout mice (MBP-/-) with Tg-5xFAD mice, a model of AD-like parenchymal Aβ pathology. RESULTS: Through biochemical and immunohistochemical experiments, we found that bigenic Tg-5xFAD/MBP-/- mice had a significant decrease of insoluble Aβ and parenchymal plaque deposition at an early age. The expression of transgene encoded human AβPP, the levels of C-terminal fragments generated during Aβ production and the intracellular Aβ were unaffected in the absence of MBP. Likewise, we did not find a significant difference in plasma Aβ or cerebrospinal fluid Aβ, suggesting these clearance routes were unaltered in bigenic Tg-5xFAD/MBP-/- mice. However, MBP-/- mice and bigenic Tg-5xFAD/MBP-/- mice exhibited elevated reactive astrocytes and activated microglia compared with Tg-5xFAD mice. The Aβ degrading enzyme matrix metalloproteinase 9 (MMP-9), which is expressed by activated glial cells, was significantly increased in the Tg-5xFAD/MBP-/- mice. CONCLUSIONS: These findings indicate that the absence of MBP decreases Aβ deposition in transgenic mice and that this consequence may result from increased glial activation and expression of MMP-9, an Aβ degrading enzyme.


Subjects
Alzheimer Disease/metabolism , Amyloid beta-Peptides/metabolism , Brain/metabolism , Inflammation/metabolism , Myelin Basic Protein/deficiency , Alzheimer Disease/pathology , Animals , Brain/pathology , Disease Models, Animal , Enzyme-Linked Immunosorbent Assay , Humans , Immunoblotting , Immunohistochemistry , Mice , Mice, Knockout , Mice, Transgenic
11.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 14590-14610, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37494159

ABSTRACT

Facial Attribute Manipulation (FAM) aims to aesthetically modify a given face image to render desired attributes, which has received significant attention due to its broad practical applications ranging from digital entertainment to biometric forensics. In the last decade, with the remarkable success of Generative Adversarial Networks (GANs) in synthesizing realistic images, numerous GAN-based models have been proposed to solve FAM with various problem formulation approaches and guiding information representations. This paper presents a comprehensive survey of GAN-based FAM methods with a focus on summarizing their principal motivations and technical details. The main contents of this survey include: (i) an introduction to the research background and basic concepts related to FAM, (ii) a systematic review of GAN-based FAM methods in three main categories, and (iii) an in-depth discussion of important properties of FAM methods, open issues, and future research directions. This survey not only builds a good starting point for researchers new to this field but also serves as a reference for the vision community.

12.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 8920-8935, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37015400

ABSTRACT

Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available. To facilitate GAN training, current methods propose to use data-specific augmentation techniques. Despite the effectiveness, it is difficult for these methods to scale to practical applications. In this article, we present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks. We first produce augmented samples using the convex combinations of the real samples. Then, we optimize the augmented samples by minimizing the norms of the data scores, i.e., the gradients of the log-density functions. This procedure enforces the augmented samples close to the data manifold. To estimate the scores, we train a deep estimation network with multi-scale score matching. For different image synthesis tasks, we train the score estimation network using different data. We do not require the tuning of the hyperparameters or modifications to the network architecture. The ScoreMix method effectively increases the diversity of data and reduces the overfitting problem. Moreover, it can be easily incorporated into existing GAN models with minor modifications. Experimental results on numerous tasks demonstrate that GAN models equipped with the ScoreMix method achieve significant improvements.
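
The first step described in this abstract, forming augmented samples as convex combinations of real samples, can be sketched as follows. This is only an illustration under stated assumptions: the function name `convex_combination`, the way the weights are drawn, and the list-of-floats sample format are hypothetical and are not taken from the ScoreMix paper. The second step, refining the mixed sample by minimizing the norm of the estimated data score, needs a trained score network and is deliberately left out.

```python
import random


def convex_combination(samples, rng=random):
    """Mix equal-length samples with non-negative weights that sum to 1.

    `samples` is a list of equal-length lists of floats (e.g. flattened
    images). The weight distribution below is an assumption for this sketch.
    """
    raw = [rng.random() for _ in samples]
    total = sum(raw)
    if total == 0.0:
        # Degenerate draw: fall back to uniform weights.
        weights = [1.0 / len(samples)] * len(samples)
    else:
        weights = [w / total for w in raw]
    dim = len(samples[0])
    # Weighted sum per coordinate; the result lies in the convex hull.
    mixed = [sum(w * s[d] for w, s in zip(weights, samples)) for d in range(dim)]
    return mixed, weights


real_samples = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
mixed, weights = convex_combination(real_samples, random.Random(0))
print(mixed, weights)
```

Because the weights are non-negative and sum to one, every coordinate of the mixed sample stays between the smallest and largest value of that coordinate among the inputs.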

13.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 9411-9425, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37022839

ABSTRACT

We present compact and effective deep convolutional neural networks (CNNs) by exploring properties of videos for video deblurring. Motivated by the non-uniform blur property that not all the pixels of the frames are blurry, we develop a CNN to integrate a temporal sharpness prior (TSP) for removing blur in videos. The TSP exploits sharp pixels from adjacent frames to facilitate the CNN for better frame restoration. Observing that the motion field is related to latent frames instead of blurry ones in the image formation model, we develop an effective cascaded training approach to solve the proposed CNN in an end-to-end manner. As videos usually contain similar contents within and across frames, we propose a non-local similarity mining approach based on a self-attention method with the propagation of global features to constrain CNNs for frame restoration. We show that exploring the domain knowledge of videos can make CNNs more compact and efficient, where the CNN with the non-local spatial-temporal similarity is 3× smaller than the state-of-the-art methods in terms of model parameters while its performance gains are at least 1 dB higher in terms of PSNRs. Extensive experimental results show that our method performs favorably against state-of-the-art approaches on benchmarks and real-world videos.
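
To put the "1 dB higher in terms of PSNRs" figure in context, here is a minimal sketch of the standard peak signal-to-noise ratio. It is not the authors' evaluation code; the function name `psnr`, the flat-list image format, and the default peak value of 255 are assumptions.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE).

    Both images are flat sequences of pixel intensities (hypothetical
    representation); `max_value` is the assumed peak intensity.
    """
    if len(reference) != len(restored) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


sharp = [10, 20, 30, 40]
deblurred = [12, 18, 33, 40]
print(psnr(sharp, deblurred))

# A gain of 1 dB corresponds to dividing the MSE by 10 ** 0.1 (about 1.26).
print(10 ** 0.1)
```

Since PSNR is logarithmic in the mean squared error, a 1 dB improvement means the restored frames have roughly 21% lower mean squared error against the ground truth.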


Subjects
Algorithms , Neural Networks, Computer
14.
IEEE Trans Pattern Anal Mach Intell ; 45(2): 2430-2444, 2023 Feb.
Article in English | MEDLINE | ID: mdl-35412972

ABSTRACT

The softmax cross-entropy loss function has been widely used to train deep models for various tasks. In this work, we propose a Gaussian mixture (GM) loss function for deep neural networks for visual classification. Unlike the softmax cross-entropy loss, our method explicitly shapes the deep feature space towards a Gaussian Mixture distribution. With a classification margin and a likelihood regularization, the GM loss facilitates both high classification performance and accurate modeling of the feature distribution. The GM loss can be readily used to distinguish the adversarial examples based on the discrepancy between feature distributions of clean and adversarial examples. Furthermore, theoretical analysis shows that a symmetric feature space can be achieved by using the GM loss, which enables the models to perform robustly against adversarial attacks. The proposed model can be implemented easily and efficiently without introducing more trainable parameters. Extensive evaluations demonstrate that the method with the GM loss performs favorably on image classification, face recognition, and detection as well as recognition of adversarial examples generated by various attacks.

15.
IEEE Trans Pattern Anal Mach Intell ; 45(6): 7457-7476, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36315550

ABSTRACT

Empowered by large datasets, e.g., ImageNet and MS COCO, unsupervised learning on large-scale data has enabled significant advances for classification tasks. However, whether the large-scale unsupervised semantic segmentation can be achieved remains unknown. There are two major challenges: i) we need a large-scale benchmark for assessing algorithms; ii) we need to develop methods to simultaneously learn category and shape representation in an unsupervised manner. In this work, we propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to help the research progress. Building on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 50k high-quality semantic segmentation annotations for evaluation. Our benchmark has a high data diversity and a clear task objective. We also present a simple yet effective method that works surprisingly well for LUSS. In addition, we benchmark related un/weakly/fully supervised methods accordingly, identifying the challenges and possible directions of LUSS. The benchmark and source code are publicly available at https://github.com/LUSSeg.

16.
Article in English | MEDLINE | ID: mdl-37018296

ABSTRACT

While deep-learning-based tracking methods have achieved substantial progress, they entail large-scale and high-quality annotated data for sufficient training. To eliminate expensive and exhaustive annotation, we study self-supervised (SS) learning for visual tracking. In this work, we develop the crop-transform-paste operation, which is able to synthesize sufficient training data by simulating various appearance variations during tracking, including appearance variations of objects and background interference. Since the target state is known in all synthesized data, existing deep trackers can be trained in routine ways using the synthesized data without human annotation. The proposed target-aware data-synthesis method adapts existing tracking approaches within a SS learning framework without algorithmic changes. Thus, the proposed SS learning mechanism can be seamlessly integrated into existing tracking frameworks to perform training. Extensive experiments show that our method: 1) achieves favorable performance against supervised (Su) learning schemes under the cases with limited annotations; 2) helps deal with various tracking challenges such as object deformation, occlusion (OCC), or background clutter (BC) due to its manipulability; 3) performs favorably against the state-of-the-art unsupervised tracking methods; and 4) boosts the performance of various state-of-the-art Su learning frameworks, including SiamRPN++, DiMP, and TransT.

17.
IEEE Trans Pattern Anal Mach Intell ; 45(3): 3121-3138, 2023 Mar.
Article in English | MEDLINE | ID: mdl-37022469

ABSTRACT

GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model so that the image can be faithfully reconstructed from the inverted code by the generator. As an emerging technique to bridge the real and fake image domains, GAN inversion plays an essential role in enabling pretrained GAN models, such as StyleGAN and BigGAN, for applications of real image editing. Moreover, GAN inversion interprets GAN's latent space and examines how realistic images can be generated. In this paper, we provide a survey of GAN inversion with a focus on its representative algorithms and its applications in image restoration and image manipulation. We further discuss the trends and challenges for future research. A curated list of GAN inversion methods, datasets, and other related information can be found at https://github.com/weihaox/awesome-gan-inversion.

18.
IEEE Trans Pattern Anal Mach Intell ; 45(2): 1934-1948, 2023 Feb.
Article in English | MEDLINE | ID: mdl-35417348

ABSTRACT

Given a degraded input image, image restoration aims to recover the missing high-quality image content. Numerous applications demand effective image restoration, e.g., computational photography, surveillance, autonomous vehicles, and remote sensing. Significant advances in image restoration have been made in recent years, dominated by convolutional neural networks (CNNs). The widely-used CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatial details are preserved but the contextual information cannot be precisely encoded. In the latter case, generated outputs are semantically reliable but spatially less accurate. This paper presents a new architecture with a holistic goal of maintaining spatially-precise high-resolution representations through the entire network, and receiving complementary contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing the following key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) non-local attention mechanism for capturing contextual information, and (d) attention based multi-scale feature aggregation. Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details. Extensive experiments on six real image benchmark datasets demonstrate that our method, named as MIRNet-v2, achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement. The source code and pre-trained models are available at https://github.com/swz30/MIRNetv2.

19.
IEEE Trans Neural Netw Learn Syst ; 34(12): 9806-9820, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35349456

ABSTRACT

The study of mouse social behaviors has been increasingly undertaken in neuroscience research. However, automated quantification of mouse behaviors from the videos of interacting mice is still a challenging problem, where object tracking plays a key role in locating mice in their living spaces. Artificial markers are often applied for multiple mice tracking, which are intrusive and consequently interfere with the movements of mice in a dynamic environment. In this article, we propose a novel method to continuously track several mice and individual parts without requiring any specific tagging. First, we propose an efficient and robust deep-learning-based mouse part detection scheme to generate part candidates. Subsequently, we propose a novel Bayesian-inference integer linear programming (BILP) model that jointly assigns the part candidates to individual targets with necessary geometric constraints while establishing pair-wise association between the detected parts. There is no publicly available dataset in the research community that provides a quantitative test bed for part detection and tracking of multiple mice, and we here introduce a new challenging Multi-Mice PartsTrack dataset that is made of complex behaviors. Finally, we evaluate our proposed approach against several baselines on our new datasets, where the results show that our method outperforms the other state-of-the-art approaches in terms of accuracy. We also demonstrate the generalization ability of the proposed approach on tracking zebra and locust.


Subjects
Algorithms , Neural Networks, Computer , Bayes Theorem , Movement
20.
IEEE Trans Pattern Anal Mach Intell ; 44(4): 1905-1921, 2022 Apr.
Article in English | MEDLINE | ID: mdl-33079657

ABSTRACT

Super-resolution is a fundamental problem in computer vision which aims to overcome the spatial limitation of camera sensors. While significant progress has been made in single image super-resolution, most algorithms only perform well on synthetic data, which limits their applications in real scenarios. In this paper, we study the problem of real-scene single image super-resolution to bridge the gap between synthetic data and real captured images. We focus on two issues of existing super-resolution algorithms: lack of realistic training data and insufficient utilization of visual information obtained from cameras. To address the first issue, we propose a method to generate more realistic training data by mimicking the imaging process of digital cameras. For the second issue, we develop a two-branch convolutional neural network to exploit the radiance information originally-recorded in raw images. In addition, we propose a dense channel-attention block for better image restoration as well as a learning-based guided filter network for effective color correction. Our model is able to generalize to different cameras without deliberately training on images from specific camera types. Extensive experiments demonstrate that the proposed algorithm can recover fine details and clear structures, and achieve high-quality results for single image super-resolution in real scenes.
