ABSTRACT
Terahertz (THz) tomographic imaging based on time-resolved THz signals has attracted significant attention because it is non-invasive, non-destructive, and non-ionizing, supports material classification, and offers ultrafast frame rates for object exploration and inspection. However, the material and geometric information of the tested objects is inherently embedded in highly distorted THz time-domain signals, so extracting it entails substantial computational complexity and intricate multi-physics models. To address this challenge, we present a THz multi-dimensional tomographic framework and a multi-scale spatio-spectral fusion Unet (MS3-Unet) capable of fusing THz signals across diverse signal domains. MS3-Unet employs multi-scale branches to extract spatio-spectral features, which are subsequently processed through element-wise adaptive filters and fused to achieve high-quality THz image restoration. Evaluated on geometry-variant objects, MS3-Unet outperforms peer methods in PSNR and SSIM. In addition to its superior performance, the proposed framework provides a highly scalable, adjustable, and accessible interface for collaborating with different user-defined models or methods.
ABSTRACT
The BTBR T+Itpr3tf/J (BTBR/J) strain is one of the most valid models of idiopathic autism, serving as a potent forward genetics tool to dissect the complexity of autism. We found that a sister strain with an intact corpus callosum, BTBR TF/ArtRbrc (BTBR/R), showed more prominent autism core symptoms but moderate ultrasonic communication and normal hippocampus-dependent memory, which may mimic autism at the high-functioning end of the spectrum. Intriguingly, a disturbed epigenetic silencing mechanism leads to hyperactive endogenous retrovirus (ERV), a mobile genetic element left by ancient retroviral infection, which increases de novo copy number variation (CNV) formation in the two BTBR strains. This feature makes the BTBR strain a still-evolving multiple-loci model moving toward higher ASD susceptibility. Furthermore, active ERV, analogous to virus infection, evades the integrated stress response (ISR) of host defense and hijacks the transcriptional machinery during embryonic development in the BTBR strains. These results suggest dual roles of ERV in the pathogenesis of ASD: driving host genome evolution on a long-term scale, and managing cellular pathways in response to viral infection, which has immediate effects on embryonic development. The wild-type Draxin expression in BTBR/R also makes this substrain a more precise model for investigating the core etiology of autism without the interference of the impaired forebrain bundles found in BTBR/J.
Subjects
Autism Spectrum Disorder , Autistic Disorder , Endogenous Retroviruses , Pregnancy , Female , Humans , Animals , Mice , Endogenous Retroviruses/genetics , DNA Copy Number Variations , Autistic Disorder/etiology , Prosencephalon/metabolism , Corpus Callosum/pathology , Animal Disease Models , Inbred C57BL Mice , Autism Spectrum Disorder/genetics , Autism Spectrum Disorder/complications , Inbred Mice
ABSTRACT
Immune dysregulation plays a key role in the pathogenesis of autism. Changes occurring at the systemic level, from brain inflammation to disturbed innate and adaptive immunity in the periphery, are frequently observed in patients with autism; however, the intrinsic mechanisms behind them remain elusive. We hypothesize that a common etiology underlying this widespread immune dysregulation may lie in progenitors of different types. Using single-cell RNA sequencing (sc-RNA seq), we trace the developmental origins of immune dysregulation in a mouse model of idiopathic autism. We find that in both aorta-gonad-mesonephros (AGM) and yolk sac (YS) progenitors, dysregulation of HDAC1-mediated epigenetic machinery alters definitive hematopoiesis during embryogenesis and downregulates the expression of the AP-1 complex for microglia development. Subsequently, these changes result in dysregulation of the immune system, leading to gut dysbiosis and hyperactive microglia in the brain. We further confirm that dysregulated immune profiles are associated with specific microbiota composition, which may serve as a biomarker to identify immune-dysregulated subtypes of autism. Our findings elucidate a shared mechanism for the origin of immune dysregulation from the brain to the gut in autism and provide new insight into the heterogeneity of autism, as well as the therapeutic potential of targeting immune-dysregulated autism subtypes.
Subjects
Autistic Disorder , Mice , Animals , Autistic Disorder/genetics , Mesonephros , Yolk Sac/physiology , Gonads , Genetic Epigenesis/genetics , Animal Disease Models
ABSTRACT
Terahertz (THz) tomographic imaging has recently attracted significant attention because it is non-invasive, non-destructive, non-ionizing, capable of material classification, and ultra-fast for object exploration and inspection. However, strong water absorption and low noise tolerance lead to undesired blurs and distortions in reconstructed THz images, and the diffraction-limited THz signals severely constrain the performance of existing restoration methods. To address the problem, we propose a novel multi-view Subspace-Attention-guided Restoration Network (SARNet) that fuses multi-view and multi-spectral features of THz images for effective image restoration and 3D tomographic reconstruction. To this end, SARNet uses multi-scale branches to extract intra-view spatio-spectral amplitude and phase features and fuses them via shared subspace projection and self-attention guidance. We then perform inter-view fusion to further improve the restoration of individual views by leveraging the redundancies between neighboring views. We experimentally construct a THz time-domain spectroscopy (THz-TDS) system covering a broad frequency range from 0.1 to 4 THz to build a temporal/spectral/spatial/material THz database of hidden 3D objects. Complementary to a quantitative evaluation, we demonstrate the effectiveness of our SARNet model on 3D THz tomographic reconstruction applications. Supplementary Information: The online version contains supplementary material available at 10.1007/s11263-023-01812-y.
ABSTRACT
Chemotherapy efficacy in glioblastoma (GBM) is limited by intrinsic and acquired resistance; hence, novel tactics are crucial. Survivin has been demonstrated to be a key resistance factor in GBM because of its role in inhibiting apoptosis, regulating autophagy, and promoting G2/M cell cycle transition. Parthenolide has been reported to be an effective antitumor agent in a variety of tumor cells and to decrease survivin levels in leukemia cells. However, the effect of parthenolide on survivin and the cell death process in GBM is still unknown. The aim of this study was to examine whether parthenolide has the potential to inhibit cell proliferation in the GBM cell line U373. The parthenolide-induced effects in relation to survivin suppression and cell death were further investigated. Our results showed that parthenolide substantially inhibited cell viability, with an IC50 value of approximately 16 µM. Treatment with parthenolide at a dose of 16 µM led to considerable downregulation of survivin, G2/M cell cycle arrest, and Chk2 upregulation in cells. Parthenolide induced apoptosis in only a few cells and a slight increase in activated caspase 3 levels. By contrast, parthenolide induced a significant increase in intracellular autophagosomes and in the expression of autophagy-related proteins, including ULK1 and LC3 I/LC3 II, in the treated cells. These results suggest that parthenolide induces survivin inhibition and G2/M cell cycle arrest, and triggers cell death through autophagy in the GBM cell line.
Subjects
G2 Phase Cell Cycle Checkpoints/drug effects , Glioblastoma/drug therapy , Glioblastoma/pathology , Inhibitor of Apoptosis Proteins/metabolism , M Phase Cell Cycle Checkpoints/drug effects , Sesquiterpenes/administration & dosage , Antineoplastic Agents/administration & dosage , Apoptosis/drug effects , Autophagy/drug effects , Tumor Cell Line , Drug Dose-Response Relationship , Humans , Inhibitor of Apoptosis Proteins/antagonists & inhibitors , Survivin , Treatment Outcome
ABSTRACT
Impaired neurodevelopment leads to several psychiatric disorders, including autism, schizophrenia, and attention deficit hyperactivity disorder. Our prior study showed that sterile alpha and TIR motif-containing 1 protein (Sarm1) regulates neuronal morphogenesis through at least two pathways. Sarm1 controls neuronal morphogenesis, including dendritic arborization, axonal outgrowth, and establishment of neuronal polarity, through the MKK-JNK pathway. Neuronally expressed Sarm1 also regulates the expression of inflammatory cytokines in the brain, which have also been shown to affect brain development and function. Because the reduction of Sarm1 expression negatively influences neuronal development, here we investigated whether Sarm1 controls mouse behaviors. We analyzed two independent Sarm1 transgenic mouse lines using a series of behavioral assays and found that the reduction of Sarm1 protein levels had a limited effect on locomotion and anxiety. However, Sarm1 knockdown mice exhibited impairments in cued and contextual fear conditioning as well as in cognitive flexibility. Moreover, the three-chambered social test, reciprocal social interaction, and social transmission of food preference further revealed deficits in social interaction in Sarm1 knockdown mice. These findings suggest that Sarm1, a molecule that regulates innate immunity and neuronal morphogenesis, regulates social behaviors and cognition. We conclude that Sarm1 is involved in immune response, neural development, and psychiatric disorders.
Subjects
Armadillo Domain Proteins/physiology , Cognition/physiology , Psychological Conditioning/physiology , Cytoskeletal Proteins/physiology , Interpersonal Relations , Animals , Armadillo Domain Proteins/genetics , Animal Behavior/physiology , Cytoskeletal Proteins/genetics , Fear/physiology , Locomotion/physiology , Male , Mice , Inbred C57BL Mice , Transgenic Mice
ABSTRACT
In image watermark removal, popular methods rely on reference non-watermarked images in a supervised way to remove watermarks. However, reference non-watermarked images are difficult to obtain in the real world. At the same time, they often suffer from noise when captured by digital devices. To resolve these issues, we present a self-supervised network for image denoising and watermark removal (SSNet). SSNet uses a parallel network trained in a self-supervised way to remove noise and watermarks. Specifically, each sub-network contains two sub-blocks. The upper sub-network uses its first sub-block to remove noise, following the noise-to-noise principle. Its second sub-block is then used to remove watermarks, according to the distributions of watermarks. To prevent the loss of important information, the lower sub-network simultaneously learns noise and watermarks in a self-supervised way. Moreover, the two sub-networks interact via attention to extract more complementary salient information. The proposed method does not depend on paired images to learn a blind denoising and watermark removal model, which is very meaningful for real applications. It is also more effective than popular image watermark removal methods on public datasets. Codes can be found at https://github.com/hellloxiaotian/SSNet.
ABSTRACT
Transformer-based methods have demonstrated promising performance in image super-resolution tasks, due to their long-range and global aggregation capability. However, the existing Transformer brings two critical challenges when applied to large-area earth observation scenes: (1) redundant token representation, since most tokens are irrelevant; (2) single-scale representation, which ignores scale correlation modeling of similar ground observation targets. To this end, this paper proposes to adaptively eliminate the interference of irrelevant tokens for a more compact self-attention calculation. Specifically, we devise a Residual Token Selective Group (RTSG) to grasp the most crucial tokens by dynamically selecting the top-k keys in terms of score ranking for each query. For better feature aggregation, a Multi-scale Feed-forward Layer (MFL) is developed to generate an enriched representation of multi-scale feature mixtures during the feed-forward process. Moreover, we propose a Global Context Attention (GCA) to fully explore the most informative components, thus introducing more inductive bias to the RTSG for an accurate reconstruction. In particular, multiple cascaded RTSGs form our final Top-k Token Selective Transformer (TTST) to achieve progressive representation. Extensive experiments on simulated and real-world remote sensing datasets demonstrate that TTST performs favorably against state-of-the-art CNN-based and Transformer-based methods, both qualitatively and quantitatively. In brief, TTST outperforms the state-of-the-art approach (HAT-L) in terms of PSNR by 0.14 dB on average, while requiring only 47.26% and 46.97% of its computational cost and parameters, respectively. The code and pre-trained TTST will be available at https://github.com/XY-boy/TTST for validation.
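The per-query top-k key selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' RTSG implementation: the function name `topk_attention`, the scaled dot-product scoring, and the single-query, pure-Python layout are assumptions made for illustration only.

```python
import math


def topk_attention(query, keys, values, k):
    """Attend from one query to only its k highest-scoring keys (illustrative)."""
    scale = math.sqrt(len(query))
    # Scaled dot-product score between the query and every key.
    scores = [sum(q * x for q, x in zip(query, key)) / scale for key in keys]
    # Indices of the k largest scores; all other keys are ignored.
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the kept scores only, shifted by the max for stability.
    top = max(scores[i] for i in keep)
    weights = {i: math.exp(scores[i] - top) for i in keep}
    total = sum(weights.values())
    dim = len(values[0])
    out = [sum(weights[i] / total * values[i][d] for i in keep) for d in range(dim)]
    return out, sorted(keep)


keys = [[10.0, 0.0], [0.0, 10.0], [9.0, 0.0], [-5.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [5.0, 5.0]]
output, kept = topk_attention([1.0, 0.0], keys, values, k=2)
print(kept)  # indices of the two keys most aligned with the query
```

Restricting the softmax to the selected keys means that low-scoring tokens contribute nothing to the output, which is the compaction the abstract refers to.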
ABSTRACT
Convolutional neural networks (CNNs) and self-attention (SA) have demonstrated remarkable success in low-level vision tasks, such as image super-resolution, deraining, and dehazing. The former excels at acquiring local connections with translation equivariance, while the latter is better at capturing long-range dependencies. However, both CNNs and Transformers suffer from individual limitations: CNNs have a limited receptive field and weak diversity representation, while SA has low efficiency and weak local relation learning. To this end, we propose a multi-scale fusion and decomposition network (MFDNet) for rain perturbation removal, which unifies the merits of these two architectures while maintaining both effectiveness and efficiency. To achieve the decomposition and association of rain and rain-free features, we introduce an asymmetrical scheme designed as a dual-path mutual representation network that enables iterative refinement. Additionally, we incorporate high-efficiency convolutions throughout the network and use resolution rescaling to balance computational complexity with performance. Comprehensive evaluations show that the proposed approach outperforms most of the latest SOTA deraining methods and is versatile and robust in various image restoration tasks, including underwater image enhancement, image dehazing, and low-light image enhancement. The source code and pretrained models are available at https://github.com/qwangg/MFDNet.
ABSTRACT
Multi-view action recognition aims to identify action categories from given clues. Existing studies ignore the negative influence of fuzzy views when disentangling view and action, which commonly leads to mistaken recognition results. To this end, we regard the observed image as the composition of view and action components, and exploit the advantages of multiple views via adaptive cooperative representation between these two components, forming a Dual-Recommendation Disentanglement Network (DRDN) for multi-view action recognition. Specifically, 1) for the action, we leverage a multi-level Specific Information Recommendation (SIR) to enhance the interaction among intricate activities and views. SIR offers a more comprehensive representation of activities, measuring the trade-off between global and local information. 2) For the view, we utilize a Pyramid Dynamic Recommendation (PDR) to learn a complete and detailed global representation by transferring features from different views. It is explicitly restricted to resist the influence of fuzzy noise, focusing on positive knowledge from other views. Our DRDN aims for complete action and view representation, where PDR directly guides the action to disentangle from view features and SIR considers the mutual exclusivity of view and action clues. Extensive experiments indicate that the proposed DRDN achieves state-of-the-art performance over powerful competitors on several standard benchmarks. The code will be available at https://github.com/51cloud/DRDN.
ABSTRACT
Binary neural networks (BNNs) have attracted broad research interest due to their efficient storage and computational ability. Nevertheless, a significant challenge of BNNs lies in handling discrete constraints while ensuring bit entropy maximization, which typically makes their weight optimization very difficult. Existing methods relax the learning using the sign function, which simply encodes positive weights into +1s, and -1s otherwise. Alternatively, we formulate an angle alignment objective to constrain the weight binarization to {0, +1} to solve the challenge. In this article, we show that our weight binarization provides an analytical solution by encoding high-magnitude weights into +1s, and 0s otherwise. Therefore, a high-quality discrete solution is established in a computationally efficient manner without the sign function. We prove that the learned weights of binarized networks roughly follow a Laplacian distribution that does not allow entropy maximization, and further demonstrate that this can be effectively solved by simply removing the l2 regularization during network training. Our method, dubbed sign-to-magnitude network binarization (SiMaN), is evaluated on CIFAR-10 and ImageNet, demonstrating its superiority over sign-based state-of-the-art methods. Our source code, experimental settings, training logs, and binary models are available at https://github.com/lmbxmu/SiMaN.
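The magnitude-based encoding described in this abstract can be sketched as follows. This is one reading of the angle alignment objective, assumed here to mean maximizing the cosine between the magnitude vector |w| and a binary vector b in {0, 1}^n; the function name `magnitude_binarize` is hypothetical and the code is not taken from the SiMaN repository. Under that assumption, for any fixed number k of ones the best choice is the k largest magnitudes, so scanning k over the sorted magnitudes finds the optimum.

```python
import math


def magnitude_binarize(weights):
    """Return b in {0, 1}^n maximizing cos(|w|, b) by scanning the count of ones."""
    # Positions sorted from largest to smallest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    best_k, best_score, running = 1, -1.0, 0.0
    for k, i in enumerate(order, start=1):
        running += abs(weights[i])
        # cos(|w|, b) = (sum of the k kept magnitudes) / (||w|| * sqrt(k));
        # the constant 1 / ||w|| does not affect which k is best.
        score = running / math.sqrt(k)
        if score > best_score:
            best_k, best_score = k, score
    ones = set(order[:best_k])
    return [1 if i in ones else 0 for i in range(len(weights))]


print(magnitude_binarize([0.9, -0.8, 0.05, -0.02]))  # → [1, 1, 0, 0]
```

Note that the sign of each weight plays no role in the result, in contrast to sign-based binarization.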
ABSTRACT
Channel pruning has long been studied as a way to compress convolutional neural networks (CNNs) and significantly reduce overall computation. Prior works implement channel pruning in an unexplainable manner, focusing on reducing the final classification error while failing to consider the internal influence of each channel. In this article, we conduct channel pruning in a white box. Through deep visualization of feature maps activated by different channels, we observe that different channels contribute differently to different categories in image classification. Inspired by this, we choose to preserve channels that contribute to most categories. Specifically, to model the contribution of each channel to differentiating categories, we develop a class-wise mask for each channel, implemented in a dynamic training manner with respect to the input image's category. On the basis of the learned class-wise mask, we perform a global voting mechanism to remove channels with less category discrimination. Lastly, a fine-tuning process is conducted to recover the performance of the pruned model. To the best of our knowledge, this is the first time that CNN interpretability theory has been used to guide channel pruning. Extensive experiments on representative image classification tasks demonstrate the superiority of our White-Box over many state-of-the-art (SOTA) methods. For instance, on CIFAR-10, it reduces floating-point operations (FLOPs) by 65.23% while improving accuracy by 0.62% for ResNet-110. On ILSVRC-2012, White-Box achieves a 45.6% FLOP reduction with only a small loss of 0.83% in top-1 accuracy for ResNet-50. Code is available at https://github.com/zyxxmu/White-Box.
ABSTRACT
This article focuses on filter-level network pruning. A novel pruning method, termed CLR-RNF, is proposed. We first reveal a "long-tail" pruning problem in magnitude-based weight pruning methods and then propose a computation-aware measurement of individual weight importance, followed by a cross-layer ranking (CLR) of weights to identify and remove the bottom-ranked weights. The resulting per-layer sparsity determines the pruned network structure in our filter pruning. Then, we introduce a recommendation-based filter selection scheme in which each filter recommends a group of its closest filters. To pick the preserved filters from these recommended groups, we further devise a k-reciprocal nearest filter (RNF) selection scheme in which the selected filters fall into the intersection of these recommended groups. Both our pruned network structure and the filter selection are non-learning processes, which significantly reduces the pruning complexity and differentiates our method from existing works. We conduct image classification on CIFAR-10 and ImageNet to demonstrate the superiority of our CLR-RNF over state-of-the-art methods. For example, on CIFAR-10, CLR-RNF removes 74.1% of FLOPs and 95.0% of parameters from VGGNet-16 while improving accuracy by 0.3%. On ImageNet, it removes 70.2% of FLOPs and 64.8% of parameters from ResNet-50 with only a 1.7% drop in top-five accuracy. Our project is available at https://github.com/lmbxmu/CLR-RNF.
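The recommendation-based selection described in this abstract can be approximated with a simplified sketch. It is a vote-count stand-in, not the authors' RNF procedure: each filter recommends its k nearest filters, and the filters that appear in the most recommendation groups are kept. The function name `select_filters`, the squared Euclidean distance, and the tie-breaking by index are all assumptions.

```python
def select_filters(filters, k, n_keep):
    """Keep the n_keep filters recommended most often by their neighbors."""

    def dist2(a, b):
        # Squared Euclidean distance between two flattened filters.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    votes = [0] * len(filters)
    for i, current in enumerate(filters):
        others = [j for j in range(len(filters)) if j != i]
        others.sort(key=lambda j: dist2(current, filters[j]))
        # Filter i recommends its k closest filters.
        for j in others[:k]:
            votes[j] += 1
    # Rank by number of recommendations; ties are broken by lower index.
    ranked = sorted(range(len(filters)), key=lambda j: (-votes[j], j))
    return sorted(ranked[:n_keep])


filters = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
print(select_filters(filters, k=2, n_keep=3))  # → [0, 1, 2]
```

No training is involved in this selection, which matches the abstract's point that the procedure is a non-learning process.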
ABSTRACT
Because physicians' labor is costly, it is difficult to collect a large number of manually labeled medical images for developing learning-based computer-aided diagnosis (CADx) systems or segmentation algorithms. To tackle this issue, we recast the image segmentation task as an image-to-image (I2I) translation problem and propose a retinal vascular segmentation network that can achieve good cross-domain generalizability even with a small amount of training data. We devise two main components to facilitate this I2I-based segmentation method. The first is the constraint provided by the proposed gradient-vector-flow (GVF) loss, and the second is a two-stage Unet (2Unet) generator with a skip connection. This configuration makes 2Unet's first stage play a role similar to a conventional Unet, but forces 2Unet's second stage to learn to be a refinement module. Extensive experiments show that by recasting retinal vessel segmentation as an image-to-image translation problem, our I2I translator-based segmentation subnetwork achieves better cross-domain generalizability than existing segmentation methods. Our model, trained on one dataset, e.g., DRIVE, can produce stable segmentation results on datasets from other domains, e.g., CHASE-DB1, STARE, HRF, and DIARETDB1, even in low-shot circumstances.
Subjects
Algorithms , Retina , Humans , Retina/diagnostic imaging , Retinal Vessels/diagnostic imaging , Fundus Oculi , Computer-Assisted Diagnosis , Computer-Assisted Image Processing/methods
ABSTRACT
Referring expression comprehension (REC) is an emerging research topic in computer vision, which refers to the detection of a target region in an image given a text description. Most existing REC methods follow a multistage pipeline, which is computationally expensive and greatly limits the applications of REC. In this article, we propose a one-stage model toward real-time REC, termed real-time global inference network (RealGIN). RealGIN addresses the issues of expression diversity and complexity of REC with two innovative designs: adaptive feature selection (AFS) and Global Attentive ReAsoNing (GARAN). Expression diversity concerns varying expression content, which includes information such as colors, attributes, locations, and fine-grained categories. To address this issue, AFS adaptively fuses features of different semantic levels to tackle the changes in expression content. In contrast, expression complexity concerns the complex relational conditions in expressions that are used to identify the referent. To this end, GARAN uses the textual feature as a pivot to collect expression-aware visual information from all regions and then diffuses this information back to each region, which provides sufficient context for modeling the relational conditions in expressions. On five benchmark datasets, i.e., RefCOCO, RefCOCO+, RefCOCOg, ReferIT, and Flickr30k, the proposed RealGIN outperforms most existing methods and achieves very competitive performance against the most advanced one, i.e., MAttNet. More importantly, under the same hardware, RealGIN can boost the processing speed by 10-20 times over the existing methods.
ABSTRACT
This study investigates age-specific prostate-specific antigen (PSA) distributions in Taiwanese men and recommends reference ranges for this population after comparison with other studies. From January 1999 to December 2016, a total of 213,986 Taiwanese men older than 19 years without a history of prostate cancer, urinary tract infection, or prostate infection were recruited from the Taiwan MJ cohort, an ongoing prospective cohort of health examinations conducted by the MJ Health Screening Center in Taiwan. Participants were divided into seven age groups. Simple descriptive statistical analyses were carried out, and quartiles and 95th percentiles were calculated for each group as reference ranges for serum PSA in screening for prostate cancer in Taiwanese men. Serum PSA concentration correlated with age (r = 0.274, p<0.001). The median serum PSA concentration (5th to 95th percentile) ranged from 0.7 ng/ml (0.3 to 1.8) for men 20-29 years old (n = 6,382) to 1.6 ng/ml (0.4 to 8.4) for men over 79 years old (n = 504). The age-specific PSA reference ranges are as follows: 20-29 years, 1.80 ng/ml; 30-39 years, 1.80 ng/ml; 40-49 years, 2.0 ng/ml; 50-59 years, 3.20 ng/ml; 60-69 years, 5.60 ng/ml; 70-79 years, 7.40 ng/ml; over 80 years, 8.40 ng/ml. Almost no change occurred in the median serum PSA value in men 50 years old or younger, while a gradual increase was observed in men over 50. Taiwanese men aged 60 years and above showed higher 95th percentile serum PSA values than Caucasian men and men in other Asian countries, but their values were closer to those of Asian American and African American men. The results indicate significantly different PSA levels across ethnicities, suggesting that Oesterling's age-specific PSA reference ranges might not be appropriate for Taiwanese men. Further studies are needed to validate the age-specific PSA reference ranges for Taiwanese men presented in this study.
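The age-specific reference ranges listed in this abstract can be encoded as a small lookup, shown here only to make the banding explicit. The values come from the abstract; the names `PSA_UPPER_NG_ML`, `psa_reference_limit`, and `is_above_reference` are hypothetical, and treating age 80 and above as the last band is an assumption, since the abstract refers both to men "over 79" and "over 80" years old.

```python
# (lower age bound of the band, upper reference limit in ng/ml), from the abstract.
PSA_UPPER_NG_ML = [
    (20, 1.80),
    (30, 1.80),
    (40, 2.0),
    (50, 3.20),
    (60, 5.60),
    (70, 7.40),
    (80, 8.40),  # assumed to cover every age from 80 upward
]


def psa_reference_limit(age):
    """Return the upper reference limit (ng/ml) for the band containing age."""
    if age < 20:
        raise ValueError("the study cohort only covers men aged 20 and older")
    limit = None
    for lower_age, value in PSA_UPPER_NG_ML:
        if age >= lower_age:
            limit = value  # keep the last band whose lower bound is reached
    return limit


def is_above_reference(age, psa_ng_ml):
    """True when a serum PSA value exceeds the band's upper reference limit."""
    return psa_ng_ml > psa_reference_limit(age)


print(psa_reference_limit(45))      # → 2.0
print(is_above_reference(65, 4.0))  # → False
```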
Subjects
Prostate-Specific Antigen , Prostatic Neoplasms , Adult , Aged , Humans , Male , Middle Aged , Young Adult , Age Distribution , Age Factors , Black or African American , East Asian People , Prospective Studies , Prostate-Specific Antigen/blood , Prostate-Specific Antigen/chemistry , Prostatic Neoplasms/epidemiology , Reference Values , White People
ABSTRACT
Microwave radiation is encountered regularly in daily life. Since the WHO announced that microwave radiation is a kind of environmental energy that interferes with the physiological functions of the human body, great concern has been raised over the damage microwave frequencies can do to human physiology. In monocytes, immunological performance is closely related to the activity of the cellular inflammatory factor NFκB. Under the effect of phorbol 12-myristate 13-acetate (PMA), THP-1 monocytes differentiate into macrophages, which then react with lipopolysaccharides (LPS), and the amount of NFκB in the cells increases. Cytokine expression is affected when cells are exposed to microwaves at a frequency of 2450 MHz and a power of 900 W. In our experiments, we observed that when THP-1 monocytes were stimulated with PMA and LPS to differentiate into macrophages, the amount of NFκB in the cells increased exponentially, and that the levels of NFκB expression were decreased by exposure to microwave radiation. In conclusion, microwave radiation was found to inhibit the activity of THP-1 monocytes stimulated with PMA and LPS.
Subjects
Lipopolysaccharides , Monocytes , Humans , Lipopolysaccharides/pharmacology , Macrophages , Microwaves , NF-kappa B/metabolism , Tetradecanoylphorbol Acetate/pharmacology
ABSTRACT
Convolutional neural networks (CNNs) have obtained remarkable performance via deep architectures. However, these CNNs often show poor robustness for image super-resolution (SR) in complex scenes. In this article, we present a heterogeneous group SR CNN (HGSRCNN) that leverages structure information of different types to obtain a high-quality image. Specifically, each heterogeneous group block (HGB) of HGSRCNN uses a heterogeneous architecture containing a symmetric group convolutional block and a complementary convolutional block in parallel to enhance the internal and external relations of different channels, yielding richer low-frequency structure information of different types. To suppress redundant features, a refinement block (RB) with serial signal enhancements is designed to filter out useless information. To prevent the loss of original information, a multilevel enhancement mechanism guides the CNN toward a symmetric architecture that promotes the expressive ability of HGSRCNN. In addition, a parallel upsampling mechanism is developed to train a blind SR model. Extensive experiments show that the proposed HGSRCNN obtains excellent SR performance in both quantitative and qualitative analysis. Codes can be accessed at https://github.com/hellloxiaotian/HGSRCNN.
ABSTRACT
CNNs with strong learning abilities are widely chosen to address the super-resolution problem. However, CNNs depend on deeper network architectures to improve image super-resolution performance, which generally increases computational cost. In this paper, we present an enhanced super-resolution group CNN (ESRGCNN) with a shallow architecture that fully fuses deep and wide channel features to extract more accurate low-frequency information, in terms of correlations between different channels, in single image super-resolution (SISR). In addition, a signal enhancement operation in the ESRGCNN helps inherit more long-distance contextual information for resolving long-term dependency. An adaptive up-sampling operation is integrated into the CNN to obtain an image super-resolution model that handles low-resolution images of different sizes. Extensive experiments show that our ESRGCNN surpasses state-of-the-art methods in terms of SISR performance, complexity, execution speed, image quality evaluation, and visual effect. Code is available at https://github.com/hellloxiaotian/ESRGCNN.
Subjects
Computer-Assisted Image Processing , Computer Neural Networks , Computer-Assisted Image Processing/methods
ABSTRACT
Understanding foggy image sequences in driving scenes is critical for autonomous driving, but it remains a challenging task due to the difficulty of collecting and annotating real-world images of adverse weather. Recently, self-training has been considered a powerful solution for unsupervised domain adaptation; it iteratively adapts the model from the source domain to the target domain by generating target pseudo labels and re-training the model. However, the selection of confident pseudo labels inevitably suffers from the conflict between sparsity and accuracy, both of which lead to suboptimal models. To tackle this problem, we exploit the characteristics of foggy image sequences of driving scenes to densify the confident pseudo labels. Specifically, based on the two discoveries of local spatial similarity and adjacent temporal correspondence in sequential image data, we propose a novel Target-Domain driven pseudo label Diffusion (TDo-Dif) scheme. It employs superpixels and optical flows to identify spatial similarity and temporal correspondence, respectively, and then diffuses the confident but sparse pseudo labels within a superpixel or a temporally corresponding pair linked by the flow. Moreover, to ensure the feature similarity of the diffused pixels, we introduce a local spatial similarity loss and a temporal contrastive loss in the model re-training stage. Experimental results show that our TDo-Dif scheme helps the adaptive model achieve 51.92% and 53.84% mean intersection-over-union (mIoU) on two publicly available natural foggy datasets (Foggy Zurich and Foggy Driving), exceeding state-of-the-art unsupervised domain adaptive semantic segmentation methods. The proposed method can also be applied to non-sequential images in the target domain by considering only spatial similarity.
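The spatial half of the pseudo-label diffusion described in this abstract can be sketched in a few lines. This is an illustration under stated assumptions rather than the TDo-Dif implementation: pixels are given as flat lists, `None` marks a pixel without a confident pseudo label, superpixel IDs are supplied by the caller, and the majority-vote rule with a `min_agreement` threshold is a hypothetical choice. The function name `diffuse_labels` is also hypothetical.

```python
from collections import Counter


def diffuse_labels(labels, superpixels, min_agreement=1.0):
    """Spread confident labels to unlabeled pixels of the same superpixel."""
    # Count the confident labels observed inside each superpixel.
    by_region = {}
    for label, region in zip(labels, superpixels):
        if label is not None:
            by_region.setdefault(region, Counter())[label] += 1
    # A superpixel gets a fill label only if its confident pixels agree enough.
    fill = {}
    for region, counts in by_region.items():
        label, count = counts.most_common(1)[0]
        if count / sum(counts.values()) >= min_agreement:
            fill[region] = label
    # Confident pixels keep their label; the others take the fill label, if any.
    return [
        label if label is not None else fill.get(region)
        for label, region in zip(labels, superpixels)
    ]


labels = ["road", None, None, "sky", "car", None, None]
superpixels = [0, 0, 0, 1, 1, 1, 2]
print(diffuse_labels(labels, superpixels))
# → ['road', 'road', 'road', 'sky', 'car', None, None]
```

Superpixel 1 contains conflicting confident labels and superpixel 2 contains none, so their unlabeled pixels stay unlabeled; only superpixel 0 is densified.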