ABSTRACT
Infrared and visible image fusion aims to generate a single fused image that not only contains rich texture details and salient objects, but also facilitates downstream tasks. However, existing works mainly focus on learning different modality-specific or shared features, and ignore the importance of modeling cross-modality features. To address these challenges, we propose Dual-branch Progressive learning for infrared and visible image fusion with a complementary self-Attention and Convolution (DPACFuse) network. On the one hand, we propose Cross-Modality Feature Extraction (CMEF) to enhance information interaction and the extraction of common features across modalities. In addition, we introduce a high-frequency gradient convolution operation to extract fine-grained information and suppress high-frequency information loss. On the other hand, to alleviate the insufficient global information extraction of CNNs and the computational overhead of self-attention, we introduce ACmix, which can fully extract local and global information in the source image with a smaller computational overhead than pure convolution or pure self-attention. Extensive experiments demonstrate that the fused images generated by DPACFuse not only contain rich texture information, but also effectively highlight salient objects. Additionally, our method achieves an improvement of approximately 3% over state-of-the-art methods on the MI, Qabf, SF, and AG evaluation metrics. More importantly, our fused images improve object detection and semantic segmentation by approximately 10% compared with using infrared and visible images separately.
ABSTRACT
Video super-resolution aims to generate high-resolution frames from low-resolution counterparts. It can be regarded as a specialized application of image super-resolution, serving various purposes, such as video display and surveillance. This paper proposes a novel method for real-time video super-resolution. It effectively exploits spatial information by utilizing the capabilities of an image super-resolution model and leverages the temporal information inherent in videos. Specifically, the method incorporates a pre-trained image super-resolution network as its foundational framework, allowing it to leverage existing expertise for super-resolution. A fast temporal information aggregation module is presented to further aggregate temporal cues across frames. By using deformable convolution to align features of neighboring frames, this module takes advantage of inter-frame dependency. In addition, it employs hierarchical fast spatial offset feature extraction and channel attention-based temporal fusion. A redundancy-aware inference algorithm is developed to reduce computational redundancy by reusing intermediate features, achieving real-time inference speed. Extensive experiments on several benchmarks demonstrate that the proposed method can reconstruct satisfactory results with strong quantitative performance and visual quality. Its real-time inference capability makes it suitable for real-world deployment.
ABSTRACT
The quality of videos varies due to the different capabilities of sensors. Video super-resolution (VSR) is a technology that improves the quality of captured video. However, the development of a VSR model is very costly. In this paper, we present a novel approach for adapting single-image super-resolution (SISR) models to the VSR task. To achieve this, we first summarize a common architecture of SISR models and perform a formal analysis of adaptation. Then, we propose an adaptation method that incorporates a plug-and-play temporal feature extraction module into existing SISR models. The proposed temporal feature extraction module consists of three submodules: offset estimation, spatial aggregation, and temporal aggregation. In the spatial aggregation submodule, the features obtained from the SISR model are aligned to the center frame based on the offset estimation results. The aligned features are fused in the temporal aggregation submodule. Finally, the fused temporal feature is fed to the SISR model for reconstruction. To evaluate the effectiveness of our method, we adapt five representative SISR models and evaluate these models on two popular benchmarks. The experiment results show the proposed method is effective on different SISR models. In particular, on the Vid4 benchmark, the VSR-adapted models achieve at least 1.26 dB and 0.067 improvement over the original SISR models in terms of PSNR and SSIM metrics, respectively. Additionally, these VSR-adapted models achieve better performance than the state-of-the-art VSR models.
Subjects
Acclimatization, Benchmarking, Technology
ABSTRACT
BACKGROUND: Tumor histomorphology analysis plays a crucial role in predicting the prognosis of resectable lung adenocarcinoma (LUAD). Computer-extracted image texture features have been previously shown to be correlated with outcome. However, a comprehensive, quantitative, and interpretable predictor remains to be developed. METHODS: In this multi-center study, we included patients with resectable LUAD from four independent cohorts. An automated pipeline was designed for extracting texture features from the tumor region in hematoxylin and eosin (H&E)-stained whole slide images (WSIs) at multiple magnifications. A multi-scale pathology image texture signature (MPIS) was constructed with the discriminative texture features in terms of overall survival (OS) selected by the LASSO method. The prognostic value of MPIS for OS was evaluated through univariable and multivariable analysis in the discovery set (n = 111) and the three external validation sets (V1, n = 115; V2, n = 116; and V3, n = 246). We constructed a Cox proportional hazards model incorporating clinicopathological variables and MPIS to assess whether MPIS could improve prognostic stratification. We also performed histo-genomics analysis to explore the associations between texture features and biological pathways. RESULTS: A set of eight texture features was selected to construct MPIS. In multivariable analysis, a higher MPIS was associated with significantly worse OS in the discovery set (HR 5.32, 95%CI 1.72-16.44; P = 0.0037) and the three external validation sets (V1: HR 2.63, 95%CI 1.10-6.29, P = 0.0292; V2: HR 2.99, 95%CI 1.34-6.66, P = 0.0075; V3: HR 1.93, 95%CI 1.15-3.23, P = 0.0125). The model that integrated clinicopathological variables and MPIS had better discrimination for OS compared to the clinicopathological variables-based model in the discovery set (C-index, 0.837 vs. 0.798) and the three external validation sets (V1: 0.704 vs. 0.679; V2: 0.728 vs. 0.666; V3: 0.696 vs. 0.669). 
Furthermore, the identified texture features were associated with biological pathways, such as cytokine activity, structural constituent of cytoskeleton, and extracellular matrix structural constituent. CONCLUSIONS: MPIS was an independent prognostic biomarker that was robust and interpretable. Integration of MPIS with clinicopathological variables improved prognostic stratification in resectable LUAD and might help enhance the quality of individualized postoperative care.
Subjects
Adenocarcinoma of Lung, Lung Neoplasms, Humans, Prognosis, Retrospective Studies, Proportional Hazards Models, Lung Neoplasms/diagnostic imaging, Lung Neoplasms/surgery
ABSTRACT
Low-illumination images exhibit low brightness, blurry details, and color casts, which create an unnatural visual experience and degrade other visual applications. Data-driven approaches show tremendous potential for brightening images while preserving their visual naturalness. However, these methods introduce hand-crafted holes and noise enlargement or over/under enhancement and color deviation. To mitigate these issues, this paper presents a frequency division and multiscale learning network named FDMLNet, which includes two subnets, DetNet and StruNet. This design first applies the guided filter to separate the high and low frequencies of authentic images; DetNet and StruNet are then developed to process them, respectively, so as to fully explore the information at each frequency. In StruNet, a feasible feature extraction module (FFEM), composed of a multiscale learning block (MSL) and a dual-branch channel attention mechanism (DCAM), is introduced to improve its multiscale representation ability. In addition, three FFEMs are connected through a new dense connectivity scheme to exploit multilevel features. Extensive quantitative and qualitative experiments on public benchmarks demonstrate that FDMLNet outperforms state-of-the-art approaches, benefiting from its stronger multiscale feature expression and extraction ability.
Subjects
Algorithms, Image Enhancement, Image Enhancement/methods
ABSTRACT
The combination of near-infrared spectroscopy and pattern recognition methods has broad application prospects in the rapid, nondestructive supervision and management of drugs. Traditional identification methods aim to minimize the overall error rate and ignore class imbalance. As a result, the positive class is overwhelmed by the negative class and has little influence on the classifier, so the classifier tends to recognize the negative class correctly, which severely affects identification accuracy. In this paper, we study the class imbalance problem in discriminating genuine from counterfeit drugs using their infrared spectral data, and propose a balance cascading and sparse representation-based classification method (BC-SRC) that combines Balance Cascading with SRC. We sample the majority class several times; each sampled subset has the same size as the minority class, and together the subsets cover all majority-class samples (the number of sampling rounds is the ceiling of the number of majority samples divided by the number of minority samples). Each round yields a set of results, and the final predicted labels are obtained from these results. Experiments on three databases, carried out in Matlab 2012a, show that the method is effective. The experimental results show that the method is superior to the commonly used Partial Least Squares (PLS), Extreme Learning Machine (ELM), and BP methods. In particular, for imbalanced databases with an imbalance factor greater than 10, the proposed method performs more stably and with higher classification accuracy than the existing methods mentioned above.
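The sampling scheme in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `balanced_subsets` and `majority_vote` are hypothetical, the last (short) subset is topped up at random as an assumption, and majority voting is only one plausible way to combine the per-round results, since the abstract does not say how the final labels are obtained. The SRC classifier itself is not shown.

```python
import math
import random
from collections import Counter


def balanced_subsets(n_majority, n_minority, seed=0):
    """Split majority-class indices into ceil(n_majority / n_minority) subsets.

    Each subset has exactly n_minority indices, and the subsets together
    cover every majority-class index at least once.
    """
    if not 0 < n_minority <= n_majority:
        raise ValueError("expected 0 < n_minority <= n_majority")
    rng = random.Random(seed)
    indices = list(range(n_majority))
    rng.shuffle(indices)
    rounds = math.ceil(n_majority / n_minority)
    subsets = []
    for r in range(rounds):
        chunk = indices[r * n_minority:(r + 1) * n_minority]
        if len(chunk) < n_minority:
            # Assumption: top up the last, short subset with random
            # majority samples that are not already in it.
            chosen = set(chunk)
            pool = [i for i in indices if i not in chosen]
            chunk = chunk + rng.sample(pool, n_minority - len(chunk))
        subsets.append(chunk)
    return subsets


def majority_vote(per_round_predictions):
    """Combine per-round label lists into one label per test sample.

    Assumption: simple majority voting across rounds.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*per_round_predictions)]
```

With 23 majority and 5 minority samples, `balanced_subsets(23, 5)` returns five subsets of five indices each. One classifier would then be trained per subset together with the full minority class, and the per-round predictions passed to `majority_vote`.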
ABSTRACT
As an effective technique for identifying counterfeit drugs, near-infrared spectroscopy (NIRS), combined with pattern-recognition classifier modeling, has been successfully used in drug management at grass-roots units. However, the spectral signals exhibit overlapping and complex characteristics, wide bandwidth, and weak absorption, which makes it difficult to obtain a satisfactory solution to the modeling problem. To address these problems, this paper adopts a summation wavelet extreme learning machine algorithm combined with Cuckoo Search (SWELM(CS)) for drug discrimination by NIRS. Specifically, the Extreme Learning Machine (ELM) is selected as the classifier model because of its fast learning and insensitivity, which improves the accuracy and generalization performance of the classifier model. An inverse hyperbolic sine and a Morlet wavelet are used as dual activation functions to improve convergence speed, and the combination of activation functions makes the network better suited to dynamic systems. Because ELM's weights and hidden-layer thresholds are generated randomly, which leads to network instability, Cuckoo Search is adopted to optimize the model parameters, so SWELM(CS) improves the stability of the classifier model. In summary, SWELM(CS) builds on the ELM algorithm for fast learning and insensitivity; the dual activation functions and a proper choice of activation functions enhance the network's ability to handle low- and high-frequency signals simultaneously; and Cuckoo Search gives it high classification stability. The compact structure of the dual activation functions constitutes a kernel framework that extracts signal features and the signal simultaneously, and it can be generalized to other machine learning fields to obtain good accuracy and generalization performance. Near-infrared spectra of drug samples produced by Xian-Janssen Pharmaceutical Ltd were used as the main objects of study in this paper. 
Experiments on binary classification and multi-label classification were conducted, and the results show that the proposed method has more stable performance, higher classification accuracy, and lower sensitivity to training samples than existing methods such as the BP neural network, ELM, and ELM optimized by particle swarm optimization.
Subjects
Artificial Intelligence, Pharmaceutical Preparations/analysis, Near-Infrared Spectroscopy, Algorithms, Theoretical Models, Computer Neural Networks, Software, Wavelet Analysis
ABSTRACT
BACKGROUND AND OBJECTIVE: Liver cancer seriously threatens human health. In clinical diagnosis, contrast-enhanced computed tomography (CECT) images provide important supplementary information for accurate liver tumor segmentation. However, most existing methods for automatic liver tumor segmentation focus only on single-phase image features. Moreover, existing multi-modal methods achieve limited segmentation performance because of redundancy in the fused features. In addition, the spatial misalignment of multi-phase images causes feature interference. METHODS: In this paper, we propose a phase attention network (PA-Net) to adequately aggregate multi-phase information from CT images and improve segmentation performance for liver tumors. Specifically, we design a PA module that generates attention weight maps voxel by voxel to efficiently fuse multi-phase CT image features and avoid feature redundancy. To solve the problem of feature interference in the multi-phase image segmentation task, we design a new learning strategy and demonstrate its effectiveness experimentally. RESULTS: We conduct comparative experiments on an in-house clinical dataset and achieve state-of-the-art segmentation performance among multi-phase methods. In addition, our method improves the mean Dice score by 3.3% compared with the single-phase method based on nnUNet, and our learning strategy improves the mean Dice score by 1.51% compared with the ML strategy. CONCLUSION: The experimental results show that our method is superior to the existing multi-phase liver tumor segmentation method, and provides a scheme for dealing with missing modalities in multi-modal tasks. In addition, our proposed learning strategy makes more effective use of arterial phase image information and is shown to be the most effective in liver tumor segmentation tasks using thick-layer CT images. The source code is released at https://github.com/Houjunfeng203934/PA-Net.
Subjects
Liver Neoplasms, Humans, Liver Neoplasms/diagnostic imaging, Veins, Arteries, X-Ray Computed Tomography, Computer-Assisted Image Processing
ABSTRACT
Background and objective: Spatial interaction between tumor-infiltrating lymphocytes (TILs) and tumor cells is valuable in predicting the effectiveness of immune response and prognosis amongst patients with lung adenocarcinoma (LUAD). Recent evidence suggests that the spatial distance between tumor cells and lymphocytes also influences the immune responses, but the distance analysis based on Hematoxylin and Eosin (H&E)-stained whole-slide images (WSIs) remains insufficient. To address this issue, we aim to explore the relationship between distance and prognosis prediction of patients with LUAD in this study. Methods: We recruited patients with resectable LUAD from three independent cohorts in this multi-center study. We proposed a simple but effective deep learning-driven workflow to automatically segment different cell types in the tumor region using the HoVer-Net model, and quantified the spatial distance (DIST) between tumor cells and lymphocytes based on H&E-stained WSIs. The association of DIST with disease-free survival (DFS) was explored in the discovery set (D1, n = 276) and the two validation sets (V1, n = 139; V2, n = 115). Results: In multivariable analysis, the low DIST group was associated with significantly better DFS in the discovery set (D1, HR, 0.61; 95% CI, 0.40-0.94; p = 0.027) and the two validation sets (V1, HR, 0.54; 95% CI, 0.32-0.91; p = 0.022; V2, HR, 0.44; 95% CI, 0.24-0.81; p = 0.009). By integrating the DIST with clinicopathological factors, the integrated model (full model) had better discrimination for DFS in the discovery set (C-index, D1, 0.745 vs. 0.723) and the two validation sets (V1, 0.621 vs. 0.596; V2, 0.671 vs. 0.650). Furthermore, the computerized DIST was associated with immune phenotypes such as immune-desert and inflamed phenotypes. 
Conclusions: The integration of DIST with clinicopathological factors could improve the stratification performance of patients with resectable LUAD, was beneficial for the prognosis prediction of LUAD patients, and was also expected to assist physicians in individualized treatment.
ABSTRACT
An increased number of tertiary lymphoid structures (TLSs) is associated with a favorable prognosis in patients with lung adenocarcinoma (LUAD). However, evaluating TLSs manually is an experience-dependent and time-consuming process, which limits its clinical application. In this multi-center study, we developed an automated computational workflow for quantifying the TLS density in the tumor region of routine hematoxylin and eosin (H&E)-stained whole-slide images (WSIs). The association between the computerized TLS density and disease-free survival (DFS) was further explored in 802 patients with resectable LUAD from three cohorts. Additionally, a Cox proportional hazards regression model, incorporating clinicopathological variables and the TLS density, was established to assess its prognostic ability. The computerized TLS density was an independent prognostic biomarker in patients with resectable LUAD. The integration of the TLS density with clinicopathological variables could support individualized clinical decision-making by improving prognostic stratification.
ABSTRACT
BACKGROUND AND OBJECTIVE: A high degree of lymphocyte infiltration is related to superior outcomes amongst patients with lung adenocarcinoma. Recent evidence indicates that the spatial interactions between tumours and lymphocytes also influence the anti-tumour immune responses, but the spatial analysis at the cellular level remains insufficient. METHODS: We proposed an artificial intelligence-quantified Tumour-Lymphocyte Spatial Interaction score (TLSI-score), calculated as the ratio between the number of spatially adjacent tumour-lymphocyte pairs and the number of tumour cells, based on a topological cell graph constructed from H&E-stained whole-slide images. The association of TLSI-score with disease-free survival (DFS) was explored in 529 patients with lung adenocarcinoma across three independent cohorts (D1, 275; V1, 139; V2, 115). RESULTS: After adjusting for pTNM stage and other clinicopathologic risk factors, a higher TLSI-score was independently associated with longer DFS than a low TLSI-score in the three cohorts [D1, adjusted hazard ratio (HR), 0.674; 95% confidence interval (CI) 0.463-0.983; p = 0.040; V1, adjusted HR, 0.408; 95% CI 0.223-0.746; p = 0.004; V2, adjusted HR, 0.294; 95% CI 0.130-0.666; p = 0.003]. By integrating the TLSI-score with clinicopathologic risk factors, the integrated model (full model) improves the prediction of DFS in three independent cohorts (C-index, D1, 0.716 vs. 0.701; V1, 0.666 vs. 0.645; V2, 0.708 vs. 0.662). CONCLUSIONS: TLSI-score shows the second highest relative contribution to the prognostic prediction model, next to the pTNM stage. TLSI-score can assist in characterising the tumour microenvironment and is expected to promote individualized treatment and follow-up decision-making in clinical practice.
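The ratio behind the TLSI-score can be sketched in a few lines of Python. This is an illustrative reading of the abstract, not the authors' pipeline: it assumes the cell graph is available as a hypothetical mapping `cell_types` (cell id to type label) plus a list of undirected `edges`, and it counts every tumour-lymphocyte edge as one adjacent pair. How the graph is built from the whole-slide image is not shown.

```python
def tlsi_score(cell_types, edges):
    """Ratio of tumour-lymphocyte edges to the number of tumour cells.

    cell_types: dict mapping a cell id to "tumour", "lymphocyte", or
                any other label (hypothetical input format).
    edges:      iterable of (cell_id_a, cell_id_b) pairs from the
                cell graph, each undirected edge listed once.
    """
    n_tumour = sum(1 for t in cell_types.values() if t == "tumour")
    if n_tumour == 0:
        return 0.0  # assumption: define the score as 0 without tumour cells
    adjacent_pairs = sum(
        1 for a, b in edges
        if {cell_types[a], cell_types[b]} == {"tumour", "lymphocyte"}
    )
    return adjacent_pairs / n_tumour


# Toy graph: two tumour cells, one lymphocyte, one stromal cell.
toy_types = {0: "tumour", 1: "tumour", 2: "lymphocyte", 3: "stroma"}
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
toy_score = tlsi_score(toy_types, toy_edges)  # 2 pairs / 2 tumour cells
```

In the toy graph, both tumour cells touch the single lymphocyte, so two adjacent pairs are divided by two tumour cells.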
Subjects
Adenocarcinoma of Lung, Adenocarcinoma, Lung Neoplasms, Humans, Disease-Free Survival, Artificial Intelligence, Adenocarcinoma of Lung/surgery, Adenocarcinoma/surgery, Lymphocytes, Prognosis, Lung Neoplasms/surgery, Retrospective Studies, Tumor Microenvironment
ABSTRACT
High-throughput nuclear segmentation and classification of whole slide images (WSIs) is crucial to biological analysis, clinical diagnosis and precision medicine. With advances in CNN algorithms and continuously growing datasets, considerable progress has been made in nuclear segmentation and classification. However, few works consider how to deal reasonably with nuclear heterogeneity in the following two aspects: imbalanced data distribution and diversified morphology characteristics. The minority classes might be dominated by the majority classes due to the imbalanced data distribution, and the diversified morphology characteristics may lead to fragile segmentation results. In this study, a cost-Sensitive MultI-task LEarning (SMILE) framework is proposed to tackle the data heterogeneity problem. Based on the most popular multi-task learning backbone in nuclei segmentation and classification, we propose a multi-task correlation attention (MTCA) to perform feature interaction among multiple highly relevant tasks and learn better feature representations. A cost-sensitive learning strategy is proposed to address the imbalanced data distribution by increasing the penalty for misclassifying the minority classes. Furthermore, we propose a novel post-processing step based on a coarse-to-fine marker-controlled watershed scheme to alleviate fragile segmentation when nuclei are large and have unclear contours. Extensive experiments show that the proposed method achieves state-of-the-art performance on the CoNSeP and MoNuSAC 2020 datasets. The code is available at: https://github.com/panxipeng/nuclear_segandcls.
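As a rough illustration of the cost-sensitive idea, the sketch below weights a cross-entropy loss so that errors on rare classes are penalized more. It is a generic example, not the SMILE loss: the inverse-frequency weighting and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are assumptions made here, and the abstract does not specify the actual cost scheme.

```python
import math


def inverse_frequency_weights(labels, n_classes):
    """Give each class a weight inversely proportional to its frequency.

    Assumption: weight_c = N / (n_classes * count_c), a common heuristic.
    """
    counts = [0] * n_classes
    for y in labels:
        counts[y] += 1
    total = len(labels)
    return [total / (n_classes * c) if c else 0.0 for c in counts]


def weighted_cross_entropy(probs, labels, weights, eps=1e-12):
    """Class-weighted mean of the negative log-likelihood.

    probs:   one list of predicted class probabilities per sample.
    labels:  true class index per sample.
    weights: per-class weights, e.g. from inverse_frequency_weights.
    """
    num = sum(-weights[y] * math.log(max(p[y], eps))
              for p, y in zip(probs, labels))
    den = sum(weights[y] for y in labels)
    return num / den


# Three majority-class samples (class 0) and one minority sample (class 1).
demo_labels = [0, 0, 0, 1]
demo_weights = inverse_frequency_weights(demo_labels, 2)
# Case A: only the minority sample is misclassified.
loss_minority_wrong = weighted_cross_entropy(
    [[0.9, 0.1]] * 4, demo_labels, demo_weights)
# Case B: only one majority sample is misclassified.
loss_majority_wrong = weighted_cross_entropy(
    [[0.1, 0.9], [0.9, 0.1], [0.9, 0.1], [0.1, 0.9]],
    demo_labels, demo_weights)
```

With these weights, misclassifying the single minority sample (case A) costs more than misclassifying one of the three majority samples (case B), which is the behaviour a cost-sensitive strategy aims for.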
Subjects
Algorithms, Learning, Humans, Cell Nucleus, Computer-Assisted Image Processing, Precision Medicine
ABSTRACT
Automatic tissue segmentation in whole-slide images (WSIs) is a critical task in hematoxylin and eosin- (H&E-) stained histopathological images for accurate diagnosis and risk stratification of lung cancer. Classifying patches and stitching the classification results together allows fast tissue segmentation of WSIs. However, tumour heterogeneity causes large intraclass variability and small interclass variability, which makes the classification task challenging. In this paper, we propose a novel bilinear convolutional neural network- (Bilinear-CNN-) based model with a bilinear convolutional module and a soft attention module to tackle this problem. This method investigates the intraclass semantic correspondence and focuses on the more distinguishable features, which make the feature output variations between classes relatively large. The performance of the Bilinear-CNN-based model is compared with other state-of-the-art methods on a histopathological classification dataset consisting of 107.7k patches of lung cancer. We further evaluate our proposed algorithm on an additional dataset from colorectal cancer. Extensive experiments show that the performance of our proposed method is superior to that of previous state-of-the-art methods, and its interpretability is demonstrated by Grad-CAM.
Subjects
Computer-Assisted Image Processing, Lung Neoplasms, Algorithms, Attention, Humans, Computer-Assisted Image Processing/methods, Lung Neoplasms/diagnostic imaging, Computer Neural Networks
ABSTRACT
A high abundance of tumor-infiltrating lymphocytes (TILs) has a positive impact on the prognosis of patients with lung adenocarcinoma (LUAD). We aimed to develop and validate an artificial intelligence-driven pathological scoring system for assessing TILs on H&E-stained whole-slide images of LUAD. Deep learning-based methods were applied to calculate the densities of lymphocytes in cancer epithelium (DLCE) and cancer stroma (DLCS), and a risk score (WELL score) was built through linear weighting of DLCE and DLCS. The association between WELL score and patient outcome was explored in 793 patients with stage I-III LUAD in four cohorts. WELL score was an independent prognostic factor for overall survival and disease-free survival in the discovery cohort and validation cohorts. The prognostic prediction model integrating the WELL score demonstrated better discrimination performance than the clinicopathologic model in the four cohorts. This artificial intelligence-based workflow and scoring system could promote risk stratification for patients with resectable LUAD.
ABSTRACT
Background: No treatment has shown a definite effect on myocardial ischemia/reperfusion (I/R) injury in patients with acute ST-segment elevation myocardial infarction (STEMI). We evaluated the protective effect of Shexiang Baoxin Pill (SBP) on I/R injury in STEMI patients. Methods: STEMI patients were randomly divided into a primary percutaneous coronary intervention (PPCI) group (n = 52) and a PPCI + SBP group (n = 51). The area at risk of infarction (AAR) and final infarct size (FIS) were examined by single-photon emission computed tomography (SPECT). I/R injury was assessed using myocardial salvage (MS) and salvage index (SI) calculated from AAR and FIS. Results: The ST-segment resolution (STR) in the PPCI + SBP group was significantly higher than that in the PPCI group (p = 0.036), and the peak value of high-sensitivity troponin T (hsTNT) was lower than that in the PPCI group (p = 0.048). FIS in the PPCI + SBP group was smaller than that in the PPCI group (p = 0.047). MS (p = 0.023) and SI (p = 0.006) in the PPCI + SBP group were larger than those in the PPCI group. The left ventricular ejection fraction (LVEF) in the PPCI + SBP group was higher than that in the PPCI group (p = 0.049), and the N-terminal pro-B type natriuretic peptide (NT-proBNP) level in the PPCI + SBP group was lower than that in the PPCI group (p = 0.048). Conclusions: SBP can alleviate I/R injury (MS and SI), decrease myocardial infarction area (peak value of hsTNT and FIS), and improve myocardial reperfusion (MBG and STR) and cardiac function (LVEF and NT-proBNP).
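The abstract states only that MS and SI are calculated from AAR and FIS. The sketch below uses the definitions commonly applied in SPECT salvage studies, MS = AAR - FIS and SI = MS / AAR, which is an assumption about this particular trial. The function names and the example values are hypothetical, with both inputs expressed in the same unit (for example, percent of the left ventricle).

```python
def myocardial_salvage(aar, fis):
    """Myocardial salvage: area at risk minus final infarct size."""
    return aar - fis


def salvage_index(aar, fis):
    """Salvage index: fraction of the area at risk that was salvaged."""
    if aar <= 0:
        raise ValueError("area at risk must be positive")
    return (aar - fis) / aar


# Hypothetical patient: AAR of 30% and FIS of 12% of the left ventricle.
example_ms = myocardial_salvage(30.0, 12.0)
example_si = salvage_index(30.0, 12.0)
```

Under these definitions, a larger MS or SI for the same AAR means that less of the threatened myocardium ended up infarcted.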
ABSTRACT
In this article, a simple yet effective method, called a two-phase learning-based swarm optimizer (TPLSO), is proposed for large-scale optimization. Inspired by the cooperative learning behavior in human society, mass learning and elite learning are involved in TPLSO. In the mass learning phase, TPLSO randomly selects three particles to form a study group and then adopts a competitive mechanism to update the members of the study group. Then, we sort all of the particles in the swarm and pick out the elite particles that have better fitness values. In the elite learning phase, the elite particles learn from each other to further search for more promising areas. The theoretical analysis of TPLSO exploration and exploitation abilities is performed and compared with several popular particle swarm optimizers. Comparative experiments on two widely used large-scale benchmark datasets demonstrate that the proposed TPLSO achieves better performance on diverse large-scale problems than several state-of-the-art algorithms.
Subjects
Algorithms, Humans
ABSTRACT
In recent years, hashing learning has received increasing attention in supervised video retrieval. However, most existing supervised video hashing approaches design hash functions based on pairwise similarity or triple relationships and focus on local information, which results in low retrieval accuracy. In this work, we propose a novel supervised framework called discriminative codebook hashing (DCH) for large-scale video retrieval. The proposed DCH encourages samples within the same category to converge to the same code word and maximizes the mutual distances among different categories. Specifically, we first propose the discriminative codebook via a predefined distance among intercode words and Bernoulli distributions to handle each hash bit. Then, we use the composite Kullback-Leibler (KL) divergence to align the neighborhood structures between the high-dimensional space and the Hamming space. The proposed DCH is optimized via the gradient descent algorithm. Experimental results on three widely used video datasets verify that our proposed DCH performs better than several state-of-the-art methods.
Subjects
Algorithms
ABSTRACT
Recently, deep convolutional neural networks (CNNs) have been successfully applied to the single-image super-resolution (SISR) task with great improvement in terms of both peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). However, most of the existing CNN-based SR models require high computing power, which considerably limits their real-world applications. In addition, most CNN-based methods rarely explore the intermediate features that are helpful for final image recovery. To address these issues, in this article, we propose a dense lightweight network, called MADNet, for stronger multiscale feature expression and feature correlation learning. Specifically, a residual multiscale module with an attention mechanism (RMAM) is developed to enhance the informative multiscale feature representation ability. Furthermore, we present a dual residual-path block (DRPB) that utilizes the hierarchical features from original low-resolution images. To take advantage of the multilevel features, dense connections are employed among blocks. The comparative results demonstrate the superior performance of our MADNet model while employing considerably fewer multiadds and parameters.
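For readers unfamiliar with the PSNR metric named in this abstract, the following Python sketch computes it from its standard definition, PSNR = 10 log10(MAX^2 / MSE). It is a plain reference computation on nested lists, not the evaluation code of the paper. The function name `psnr` and the default peak value of 255 (8-bit images) are choices made here, and practical SR benchmarks often add steps such as border cropping or luminance-only evaluation that are not shown.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as lists of rows of pixel intensities.
    """
    ref = [v for row in reference for v in row]
    res = [v for row in restored for v in row]
    if len(ref) != len(res) or not ref:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(ref, res)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 example: one pixel differs by 10 intensity levels (MSE = 25).
toy_reference = [[0, 255], [255, 0]]
toy_restored = [[0, 255], [255, 10]]
toy_psnr = psnr(toy_reference, toy_restored)
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the mean squared error.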
ABSTRACT
Deep convolutional neural networks (CNNs) have contributed to significant progress in the single-image super-resolution (SISR) field. However, the majority of existing CNN-based models achieve high performance only with massive numbers of parameters and exceedingly deep structures. Moreover, several algorithms underuse the low-level features, which results in relatively low performance. In this article, we address these problems by exploring two strategies based on novel local wider residual blocks (LWRBs) to effectively extract image features for SISR. We propose a cascading residual network (CRN) that contains several locally sharing groups (LSGs), in which the cascading mechanism not only promotes the propagation of features and gradients but also eases model training. In addition, we present an enhanced residual network (ERN) for image resolution enhancement. ERN employs a dual global pathway structure that incorporates nonlocal operations to capture long-distance spatial features from the original low-resolution (LR) input. To obtain the feature representation of the input at different scales, we further introduce a multiscale block (MSB) to directly detect low-level features from the LR image. The experimental results on four benchmark datasets demonstrate that our models outperform most of the advanced methods while still retaining a reasonable number of parameters.
ABSTRACT
Collaborative representation is an effective way to design classifiers for many practical applications. In this paper, we propose a novel classifier, called the prior knowledge-based probabilistic collaborative representation-based classifier (PKPCRC), for visual recognition. Compared with existing classifiers which use the collaborative representation strategy, the proposed PKPCRC further includes characteristics of training samples of each class as prior knowledge. Four types of prior knowledge are developed from the perspectives of image distance and representation capacity. They adaptively accommodate the contribution of each class and result in an accurate representation to classify a query sample. Experiments and comparisons on four challenging databases demonstrate that PKPCRC outperforms several state-of-the-art classifiers.