ABSTRACT
Scene text detection is an important research field in computer vision and plays a crucial role in many application scenarios. However, existing scene text detection methods often fail to achieve satisfactory results on text instances with varied sizes and shapes or against complex backgrounds. To address the challenge of detecting such diverse text in natural scenes, this paper proposes a multi-scale natural scene text detection method based on attention feature extraction and cascaded feature fusion. The method combines global and local attention through an improved attention feature fusion module (DSAF) to capture text features at different scales, enhancing the network's perception of text regions and strengthening its feature extraction capability. An improved cascaded feature fusion module (PFFM) then fully integrates the extracted feature maps, expanding their receptive field and enriching their expressive power. Finally, a lightweight subspace attention module (SAM) partitions the concatenated feature maps into several subspace feature maps, facilitating spatial information interaction among features of different scales. Comparative experiments are conducted on the ICDAR2015, Total-Text, and MSRA-TD500 datasets against existing scene text detection methods. The results show that the proposed method achieves good performance in terms of precision, recall, and F-score, verifying its effectiveness and practicality.
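To make the subspace attention idea concrete, the following is a minimal PyTorch sketch of a module that splits a concatenated feature map into channel subspaces and lets each subspace weight itself with a lightweight attention branch. The group count, layer choices, and attention design are illustrative assumptions, not the exact SAM configuration used in the paper.

```python
# Minimal sketch of a subspace attention module (assumed design, for illustration only).
import torch
import torch.nn as nn

class SubspaceAttention(nn.Module):
    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        assert channels % groups == 0, "channels must divide evenly into subspaces"
        self.groups = groups
        sub = channels // groups
        # One lightweight attention branch per subspace (depthwise + pointwise conv).
        self.attn = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(sub, sub, kernel_size=3, padding=1, groups=sub, bias=False),
                nn.Conv2d(sub, sub, kernel_size=1, bias=False),
                nn.Sigmoid(),
            )
            for _ in range(groups)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split the concatenated feature map along channels into subspaces,
        # reweight each subspace by its own attention map, then re-concatenate.
        chunks = torch.chunk(x, self.groups, dim=1)
        out = [c * branch(c) for c, branch in zip(chunks, self.attn)]
        return torch.cat(out, dim=1)

if __name__ == "__main__":
    fused = torch.randn(1, 256, 40, 40)   # e.g. cascaded feature maps after fusion
    print(SubspaceAttention(256, groups=4)(fused).shape)  # torch.Size([1, 256, 40, 40])
```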
ABSTRACT
Medical images such as CT and X-ray are widely used to detect chest infections and lung diseases. However, these images are susceptible to several types of noise whose complex distributions make them difficult to remove. Such noise significantly degrades image quality and impairs diagnostic performance. An effective denoising technique is therefore essential to remove noise from chest CT and X-ray images prior to further processing. Deep learning methods, mainly CNNs, have shown tremendous progress on denoising tasks. However, existing CNN-based models estimate the noise from the final layers, which may not carry adequate image detail. To tackle this issue, this paper proposes a deep multi-level semantic fusion network, called DMF-Net, for removing noise from chest CT and X-ray images. DMF-Net mainly comprises a dilated convolutional feature extraction block, a cascaded feature learning block (CFLB), and a noise fusion block (NFB), followed by a prominent feature extraction block (PFEB). The CFLB cascades features from different levels (convolutional layers), which are then fed to the NFB to obtain an accurate noise prediction. Finally, the PFEB produces the clean image. To validate the proposed denoising technique, separate and mixed datasets containing high-resolution CT and X-ray images with specific and blind noise are used. Experimental results indicate the effectiveness of DMF-Net compared with other state-of-the-art methods in terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), while drastically reducing the required processing power.
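As a rough sketch of the multi-level idea described above, the PyTorch snippet below extracts features with a stack of dilated convolutions, cascades the intermediate (not only final) feature maps, fuses them into a noise estimate, and subtracts that estimate from the noisy input. The layer widths, dilation rates, and residual formulation are assumptions for illustration, not the published DMF-Net architecture.

```python
# Minimal sketch of multi-level feature cascading for noise estimation
# (assumed design, for illustration only; not the authors' DMF-Net).
import torch
import torch.nn as nn

class MultiLevelDenoiser(nn.Module):
    def __init__(self, channels: int = 64, levels: int = 4):
        super().__init__()
        self.head = nn.Conv2d(1, channels, 3, padding=1)
        # Dilated convolutional feature extraction: one level per dilation rate.
        self.levels = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                nn.ReLU(inplace=True),
            )
            for d in range(1, levels + 1)
        ])
        # Fuse the cascaded multi-level features into a single-channel noise estimate.
        self.fuse = nn.Sequential(
            nn.Conv2d(channels * levels, channels, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, noisy: torch.Tensor) -> torch.Tensor:
        x = self.head(noisy)
        feats = []
        for level in self.levels:
            x = level(x)
            feats.append(x)                         # keep intermediate features, not just the last layer
        noise = self.fuse(torch.cat(feats, dim=1))  # cascade along channels, predict noise
        return noisy - noise                        # residual formulation yields the clean image

if __name__ == "__main__":
    ct_patch = torch.randn(1, 1, 128, 128)          # single-channel CT/X-ray patch
    print(MultiLevelDenoiser()(ct_patch).shape)     # torch.Size([1, 1, 128, 128])
```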