Results 1 - 20 of 25
1.
Sensors (Basel) ; 24(11)2024 May 27.
Article in English | MEDLINE | ID: mdl-38894254

ABSTRACT

Human emotions are complex psychological and physiological responses to external stimuli. Correctly identifying and providing feedback on emotions is an important goal in human-computer interaction research. Compared to facial expressions, speech, or other physiological signals, using electroencephalogram (EEG) signals for emotion recognition has advantages in authenticity, objectivity, and reliability; thus, it is attracting increasing attention from researchers. However, current methods leave significant room for improvement in combining information exchange between brain regions with time-frequency feature extraction. Therefore, this paper proposes an EEG emotion recognition network, self-organized graph pseudo-3D convolution (SOGPCN), based on attention and spatiotemporal convolution. Unlike previous methods that directly construct graph structures for brain channels, the proposed SOGPCN method considers that the spatial relationships between electrodes differ in each frequency band. First, a self-organizing map is constructed for each channel in each frequency band to obtain the 10 channels most relevant to the current channel, and graph convolution is employed to capture the spatial relationships between all channels in each self-organizing map. Then, pseudo-three-dimensional convolution combined with partial dot-product attention is implemented to extract the temporal features of the EEG sequence. Finally, an LSTM is employed to learn the contextual information between adjacent time-series data. Subject-dependent and subject-independent experiments conducted on the SEED dataset show that the proposed SOGPCN achieves recognition accuracies of 95.26% and 94.22%, respectively, outperforming several baseline methods.
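
The parameter saving that motivates pseudo-3D factorizations like the one above can be sketched with simple arithmetic: a full Kt×Kh×Kw 3D convolution is replaced by a 1×Kh×Kw spatial convolution followed by a Kt×1×1 temporal convolution. The channel counts below are illustrative, not taken from the paper.

```python
# Parameter-count sketch of a pseudo-3D (P3D) factorization (bias omitted).
def conv3d_params(c_in, c_out, kt, kh, kw):
    """Weights of a full 3D convolution."""
    return c_in * c_out * kt * kh * kw

def p3d_params(c_in, c_out, kt, kh, kw):
    """Pseudo-3D: a 1xKhxKw spatial conv followed by a Ktx1x1 temporal conv."""
    spatial = c_in * c_out * 1 * kh * kw
    temporal = c_out * c_out * kt * 1 * 1
    return spatial + temporal

full = conv3d_params(64, 64, 3, 3, 3)   # 110_592 weights
p3d = p3d_params(64, 64, 3, 3, 3)       # 49_152 weights
print(full, p3d)
```

The factorized form keeps the spatio-temporal receptive field while cutting the weight count by more than half at these sizes.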


Subject(s)
Electroencephalography ; Emotions ; Neural Networks, Computer ; Electroencephalography/methods ; Humans ; Emotions/physiology ; Attention/physiology ; Algorithms ; Brain/physiology ; Signal Processing, Computer-Assisted
2.
Sensors (Basel) ; 24(14)2024 Jul 14.
Article in English | MEDLINE | ID: mdl-39065960

ABSTRACT

End-to-end disparity estimation algorithms based on cost volumes face a structural-adaptation problem when deployed on edge neural-network accelerators and must preserve accuracy using only the operators those accelerators support. Therefore, this paper proposes a novel disparity calculation algorithm that uses low-rank approximation to replace 3D convolution and transposed 3D convolution, WReLU to reduce the data compression caused by the activation function, and unimodal cost volume filtering together with a confidence estimation network to regularize the cost volume. The method alleviates the problem of the disparity-matching cost distribution deviating from the true distribution and greatly reduces the computational complexity and parameter count of the algorithm while improving accuracy. Experimental results show that, compared with a typical disparity estimation network, the absolute error of the proposed algorithm is reduced by 38.3%, the three-pixel error is reduced to 1.41%, and the number of parameters is reduced by 67.3%. The calculation accuracy is better than that of other algorithms, the method is easier to deploy, and it has strong structural adaptability and better practicability.
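
The low-rank idea can be illustrated with a truncated SVD of a flattened kernel bank; the paper's exact factorization of the 3D and transposed-3D convolutions may differ, so this is only a sketch with made-up sizes.

```python
import numpy as np

# Replace a dense kernel matrix with a rank-r factorization via truncated SVD.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 27))            # 64 filters, flattened 3x3x3 taps
U, s, Vt = np.linalg.svd(W, full_matrices=False)

r = 8                                        # chosen rank (assumption)
W_lr = (U[:, :r] * s[:r]) @ Vt[:r]           # rank-r reconstruction

params_full = W.size                         # 1728
params_lr = U[:, :r].size + Vt[:r].size + r  # 512 + 216 + 8 = 736
err = np.linalg.norm(W - W_lr) / np.linalg.norm(W)
print(params_full, params_lr, round(float(err), 3))
```

A random matrix is nearly full rank, so the relative error here is sizeable; trained convolution kernels tend to be far more compressible, which is what makes the approach practical.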

3.
Sensors (Basel) ; 24(4)2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38400419

ABSTRACT

Traffic congestion prediction has become an indispensable component of intelligent transport systems. However, one limitation of existing methods is that they treat the effects of spatio-temporal correlations on traffic prediction as fixed while modeling spatio-temporal features, which results in inadequate modeling. In this paper, we propose an attention-based spatio-temporal 3D residual neural network, named AST3DRNet, to directly forecast the congestion levels of road networks in a city. AST3DRNet combines a 3D residual network and a self-attention mechanism to efficiently model the spatial and temporal information in traffic congestion data. Specifically, by stacking 3D residual units and 3D convolutions, we propose a 3D convolution module that can simultaneously capture various spatio-temporal correlations. Furthermore, a novel spatio-temporal attention module is proposed to explicitly model the different contributions of spatio-temporal correlations in both spatial and temporal dimensions through the self-attention mechanism. Extensive experiments are conducted on a real-world traffic congestion dataset in Kunming, and the results demonstrate that AST3DRNet outperforms the baselines in short-term (5/10/15 min) traffic congestion predictions with average accuracy improvements of 59.05%, 64.69%, and 48.22%, respectively.
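
The self-attention operation at the heart of such a spatio-temporal attention module can be sketched in a few lines; the sequence length and feature dimension below are illustrative.

```python
import numpy as np

# Minimal scaled dot-product self-attention over a sequence of feature vectors.
def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise similarities
    A = softmax(scores, axis=-1)              # each row sums to 1
    return A @ V, A

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 16))              # 6 time steps, 16-dim features
Wq, Wk, Wv = (rng.standard_normal((16, 16)) for _ in range(3))
out, A = self_attention(X, Wq, Wk, Wv)
print(out.shape)                              # (6, 16)
```

Each output vector is a data-dependent weighted mix of all time steps, which is how the module lets the contribution of each spatio-temporal position vary rather than stay fixed.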

4.
Sensors (Basel) ; 23(16)2023 Aug 08.
Article in English | MEDLINE | ID: mdl-37631559

ABSTRACT

Shot boundary detection is the process of identifying and locating the boundaries between individual shots in a video sequence. A shot is a continuous sequence of frames captured by a single camera, without any cuts or edits. Recent investigations have shown the effectiveness of 3D convolutional networks for this task due to their high capacity to extract spatiotemporal features of the video and determine in which frame a transition or shot change occurs. When this task is used as part of a scene segmentation use case with the aim of improving the experience of viewing content from streaming platforms, the speed of segmentation is very important for live and near-live use cases such as start-over. The problem with models based on 3D convolutions is the large number of parameters they entail. Standard 3D convolutions impose much higher CPU and memory requirements than do the same 2D operations. In this paper, we rely on depthwise separable convolutions to address the problem with a scheme that significantly reduces the number of parameters. To compensate for the slight loss of performance, we analyze and propose the use of visual self-attention as a mechanism of improvement.
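
The parameter reduction that depthwise separable convolutions buy over standard 3D convolutions is easy to quantify; the channel counts below are illustrative, not the paper's.

```python
# Standard 3D convolution vs. a depthwise-separable variant:
# a per-channel KxKxK depthwise filter plus a 1x1x1 pointwise mixing layer.
def standard_3d(c_in, c_out, k):
    return c_in * c_out * k ** 3

def depthwise_separable_3d(c_in, c_out, k):
    depthwise = c_in * k ** 3        # one KxKxK filter per input channel
    pointwise = c_in * c_out         # 1x1x1 conv mixing across channels
    return depthwise + pointwise

std = standard_3d(64, 128, 3)                  # 221_184
sep = depthwise_separable_3d(64, 128, 3)       # 1_728 + 8_192 = 9_920
print(std, sep, round(std / sep, 1))
```

At these sizes the separable form uses roughly 22× fewer weights, which is the "much higher CPU and memory requirements" gap the paragraph refers to.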

5.
Sensors (Basel) ; 23(18)2023 Sep 21.
Article in English | MEDLINE | ID: mdl-37766058

ABSTRACT

Today, hyperspectral imaging plays an integral part in the remote sensing and precision agriculture fields. Identifying matching key points between hyperspectral images is an important step in tasks such as image registration, localization, object recognition, and object tracking. Low-pixel-resolution hyperspectral imaging is a recent introduction to the field, bringing benefits such as lower cost and smaller form factor compared to traditional systems. However, the limited pixel resolution challenges even state-of-the-art feature detection and matching methods, leading to difficulties in generating robust feature matches for images with repeated textures, low texture, low sharpness, and low contrast. Moreover, the narrower optics in these cameras add to the challenges during the feature-matching stage, particularly for images captured during low-altitude flight missions. To enhance the robustness of feature detection and matching in low-pixel-resolution images, in this study we propose a novel approach utilizing 3D Convolution-based Siamese networks. Compared to state-of-the-art methods, this approach takes advantage of all the spectral information available in hyperspectral imaging to filter out incorrect matches and produce a robust set of matches. The proposed method initially generates feature matches through a combination of Phase Stretch Transformation-based edge detection and SIFT features. Subsequently, a 3D Convolution-based Siamese network is utilized to filter out inaccurate matches, producing a highly accurate set of feature matches. Evaluation of the proposed method demonstrates its superiority over state-of-the-art approaches in cases where they fail to produce feature matches, and it competes effectively with the other evaluated methods when generating feature matches in low-pixel-resolution hyperspectral images. This research contributes to the advancement of low-pixel-resolution hyperspectral imaging techniques, and we believe it can specifically aid in mosaic generation of low-pixel-resolution hyperspectral images.

6.
Sensors (Basel) ; 23(23)2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38067989

ABSTRACT

With the recent rise in violent crime, the real-time situation analysis capabilities of the prevalent closed-circuit television have been employed for the deterrence and resolution of criminal activities. Anomaly detection can identify abnormal instances such as violence within the patterns of a specified dataset; however, it faces the challenge that the dataset for abnormal situations is smaller than that for normal situations. Herein, using datasets such as UBI-Fights, RWF-2000, and UCSD Ped1 and Ped2, anomaly detection was approached as a binary classification problem. Frames extracted from each annotated video were reconstructed into a limited number of grid images of size 3×3, 4×3, 4×4, and 5×3 using the method proposed in this paper, forming an input data structure similar to a light field and the patches of a vision transformer. The model was constructed by applying a convolutional block attention module, comprising channel and spatial attention modules, to residual neural networks with depths of 10, 18, 34, and 50 in the form of three-dimensional convolutions. The proposed model performed better than existing models in detecting abnormal behavior such as violent acts in videos. For instance, with the undersampled UBI-Fights dataset, our network achieved an accuracy of 0.9933, a loss value of 0.0010, an area under the curve of 0.9973, and an equal error rate of 0.0027. These results may contribute significantly to solving real-world issues such as the detection of violent behavior in artificial intelligence systems using computer vision and real-time video monitoring.
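
The frame-to-grid reconstruction described above can be sketched as a simple tiling operation; here a 3×3 grid of tiny 4×4 grayscale frames stands in for full-resolution video frames.

```python
import numpy as np

# Pack N sampled frames into one RxC grid image (light-field / ViT-patch style).
def tile_frames(frames, rows, cols):
    n, h, w = frames.shape
    assert n == rows * cols
    grid = frames.reshape(rows, cols, h, w)   # (R, C, H, W)
    grid = grid.transpose(0, 2, 1, 3)         # (R, H, C, W)
    return grid.reshape(rows * h, cols * w)   # one composite image

frames = np.arange(9 * 4 * 4, dtype=float).reshape(9, 4, 4)
img = tile_frames(frames, 3, 3)
print(img.shape)                              # (12, 12)
```

Frame k lands in grid cell (k // cols, k % cols), so temporal order is preserved as a spatial layout that 2D/3D attention blocks can then operate on.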

7.
J Appl Clin Med Phys ; 21(1): 144-157, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31793212

ABSTRACT

PURPOSE: The liver is one of the organs with the highest incidence of tumors in the human body, and malignant liver tumors seriously threaten human life and health. The difficulties of liver tumor segmentation from computed tomography (CT) images are: (a) the contrast between liver tumors and healthy tissue in CT images is low and the boundaries are blurred; (b) liver tumors are complex and diverse in size, shape, and location. METHODS: To solve these problems, this paper focused on liver and liver tumor segmentation based on convolutional neural networks (CNNs) and specially designed a three-dimensional dual-path multiscale convolutional neural network (TDP-CNN). To balance segmentation performance against computational resources, a dual path was used in the network, and the feature maps from both paths were fused at the end of the paths. To refine the segmentation results, we used conditional random fields (CRF) to eliminate false segmentation points and improve accuracy. RESULTS: In the experiments, we used the public Liver Tumor Segmentation (LiTS) dataset to analyze the segmentation results qualitatively and quantitatively. Ground truth segmentation of the liver and liver tumors was manually labeled by an experienced radiologist. Quantitative metrics were Dice, Hausdorff distance, and average distance. For liver tumor segmentation, Dice was 0.689, Hausdorff distance was 7.69, and average distance was 1.07; for liver segmentation, Dice was 0.965, Hausdorff distance was 29.162, and average distance was 0.197. Compared with other liver and liver tumor segmentation algorithms in the Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 competition, our method ranked first in liver segmentation and second in liver tumor segmentation.
CONCLUSIONS: The experimental results showed that the proposed algorithm had good performance in both liver and liver tumor segmentation.
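
The Dice metric reported above is straightforward to compute from binary masks: Dice = 2|A ∩ B| / (|A| + |B|). A minimal sketch on toy 2D masks:

```python
import numpy as np

def dice(pred, gt):
    """Dice overlap of two binary masks; 1.0 when both masks are empty."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

gt = np.zeros((8, 8), dtype=int);   gt[2:6, 2:6] = 1     # 16 px square
pred = np.zeros((8, 8), dtype=int); pred[3:7, 3:7] = 1   # same size, shifted
print(dice(pred, gt))   # overlap 3x3 = 9 -> 18/32 = 0.5625
```

The same formula applies voxel-wise in 3D, which is how segmentation scores such as the 0.965 liver Dice above are computed.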


Subject(s)
Algorithms ; Image Processing, Computer-Assisted/methods ; Liver Neoplasms/pathology ; Neural Networks, Computer ; Phantoms, Imaging ; Radiotherapy Planning, Computer-Assisted/methods ; Tomography, X-Ray Computed/methods ; Humans ; Imaging, Three-Dimensional/methods ; Liver Neoplasms/diagnostic imaging ; Liver Neoplasms/radiotherapy ; Radiotherapy Dosage
8.
Sci Rep ; 14(1): 10560, 2024 05 08.
Article in English | MEDLINE | ID: mdl-38720020

ABSTRACT

Research on video analytics, especially in the area of human behavior recognition, has become increasingly popular recently. It is widely applied in virtual reality, video surveillance, and video retrieval. With the advancement of deep learning algorithms and computer hardware, the conventional two-dimensional convolution technique for training video models has been replaced by three-dimensional convolution, which enables the extraction of spatio-temporal features. Specifically, the use of 3D convolution in human behavior recognition has been the subject of growing interest. However, the increased dimensionality has led to challenges such as a dramatic increase in the number of parameters, increased time complexity, and a strong dependence on GPUs for effective spatio-temporal feature extraction; training can be considerably slow without powerful GPU hardware. To address these issues, this study proposes an Adaptive Time Compression (ATC) module. Functioning as an independent component, ATC can be seamlessly integrated into existing architectures and achieves data compression by eliminating redundant frames within video data. The ATC module effectively reduces GPU computing load and time complexity with negligible loss of accuracy, thereby facilitating real-time human behavior recognition.
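
The idea of compressing time by discarding redundant frames can be sketched with a simple difference threshold; the actual ATC module is learned, so this only illustrates the principle.

```python
import numpy as np

# Keep a frame only if it differs from the last kept frame by more than
# a mean-absolute-difference threshold; returns the indices kept.
def compress(frames, thresh):
    kept = [0]
    for i in range(1, len(frames)):
        if np.abs(frames[i] - frames[kept[-1]]).mean() > thresh:
            kept.append(i)
    return kept

video = np.zeros((6, 4, 4))
video[3:] = 1.0                      # content changes at frame 3
print(compress(video, 0.5))          # [0, 3]
```

Frames 1-2 and 4-5 duplicate their predecessors and are dropped, shrinking the temporal axis the 3D convolutions must process.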


Subject(s)
Algorithms ; Data Compression ; Video Recording ; Humans ; Data Compression/methods ; Human Activities ; Deep Learning ; Image Processing, Computer-Assisted/methods ; Pattern Recognition, Automated/methods
9.
Math Biosci Eng ; 21(3): 3594-3617, 2024 Feb 05.
Article in English | MEDLINE | ID: mdl-38549297

ABSTRACT

A Multiscale-Motion Embedding Pseudo-3D (MME-P3D) gesture recognition algorithm has been proposed to tackle the issues of excessive parameters and high computational complexity encountered by existing gesture recognition algorithms deployed in mobile and embedded devices. The algorithm initially takes into account the characteristics of gesture motion information, integrating the channel attention (CE) mechanism into the pseudo-3D (P3D) module, thereby constructing a P3D-C feature extraction network that can efficiently extract spatio-temporal feature information while reducing the complexity of the algorithmic model. To further enhance the understanding and learning of the global gesture movement's dynamic information, a Multiscale Motion Embedding (MME) mechanism is subsequently designed. The experimental findings reveal that the MME-P3D model achieves recognition accuracies reaching up to 91.12% and 83.06% on the self-constructed conference gesture dataset and the publicly available Chalearn 2013 dataset, respectively. In comparison with the conventional 3D convolutional neural network, the MME-P3D model demonstrates a significant advantage in terms of parameter count and computational requirements, which are reduced by as much as 82% and 83%, respectively. This effectively addresses the limitations of the original algorithms, making them more suitable for deployment on embedded and mobile devices and providing a more effective means for the practical application of hand gesture recognition technology.


Subject(s)
Endrin/analogs & derivatives ; Gestures ; Pattern Recognition, Automated ; Algorithms ; Neural Networks, Computer
10.
Polymers (Basel) ; 16(11)2024 May 31.
Article in English | MEDLINE | ID: mdl-38891506

ABSTRACT

Ultrasonic testing is widely used for defect detection in polymer composites owing to advantages such as fast processing speed, simple operation, high reliability, and real-time monitoring. However, defect information in ultrasound images is not easily detectable because of the influence of ultrasound echoes and noise. In this study, a stable three-dimensional deep convolutional autoencoder (3D-DCA) was developed to identify defects in polymer composites. Through 3D convolutional operations, it can synchronously learn the spatiotemporal properties of the data volume. Subsequently, the depth receptive field (RF) of the hidden layer in the autoencoder maps the defect information to the original depth location, thereby mitigating the effects of the defect surface and bottom echoes. In addition, a dual-layer encoder was designed to improve the hidden layer visualization results. Consequently, the size, shape, and depth of the defects can be accurately determined. The feasibility of the method was demonstrated through its application to defect detection in carbon-fiber-reinforced polymers.

11.
J Imaging Inform Med ; 37(1): 280-296, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38343216

ABSTRACT

Cervical cancer is a significant health problem worldwide, and early detection and treatment are critical to improving patient outcomes. To address this challenge, a deep learning (DL)-based cervical classification system is proposed using a 3D convolutional neural network and a Vision Transformer (ViT) module. The proposed model leverages the capability of the 3D CNN to extract spatiotemporal features from cervical images and employs the ViT model to capture and learn complex feature representations. The model consists of an input layer that receives cervical images, followed by a 3D convolution block that extracts features from the images. The generated feature maps are down-sampled using a max-pooling block to eliminate redundant information while preserving important features. Four Vision Transformer models are employed to extract efficient feature maps at different levels of abstraction; the output of each is an efficient set of feature maps capturing spatiotemporal information at a specific level of abstraction. The feature maps generated by the Vision Transformer models are then fed into the 3D feature pyramid network (FPN) module for feature concatenation. A 3D squeeze-and-excitation (SE) block is employed to recalibrate the feature responses of the network based on the interdependencies between different feature maps, thereby improving the discriminative power of the model. Finally, the feature-map dimensionality is reduced using a 3D average pooling layer, and the output is fed into a kernel extreme learning machine (KELM) for classification into one of five classes. The KELM uses a radial basis function (RBF) kernel to map features into a high-dimensional feature space and classify the input samples. 
Simulation results demonstrate the superiority of the proposed model, which achieves an accuracy of 98.6%, showing its potential as an effective tool for cervical cancer classification. It can also serve as a diagnostic support tool to assist medical experts in accurately identifying cervical cancer in patients.
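
The RBF kernel the KELM classifier relies on, k(x, z) = exp(−‖x − z‖² / 2σ²), maps samples into a high-dimensional feature space implicitly. A minimal sketch of the kernel matrix:

```python
import numpy as np

def rbf_kernel(X, Z, sigma=1.0):
    """Gram matrix of the RBF kernel between rows of X and rows of Z."""
    sq = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)  # squared distances
    return np.exp(-sq / (2.0 * sigma ** 2))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel(X, X)
print(K.shape)   # (3, 3), ones on the diagonal
```

In a KELM this Gram matrix replaces the random hidden layer of a standard ELM, and the output weights are obtained in closed form from a regularized linear solve against it.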

12.
Front Plant Sci ; 14: 1304962, 2023.
Article in English | MEDLINE | ID: mdl-38186591

ABSTRACT

Introduction: Efficient and accurate varietal classification of wheat grains is crucial for maintaining varietal purity and reducing susceptibility to pests and diseases, thereby enhancing crop yield. Traditional manual and machine learning methods for wheat grain identification often suffer from inefficiency and large models. In this study, we propose a novel classification and recognition model called SCGNet, designed for rapid and efficient wheat grain classification. Methods: Our proposed model incorporates several modules that enhance information exchange and feature multiplexing between group convolutions. This mechanism enables the network to gather feature information from each subgroup of the previous layer, facilitating effective utilization of upper-layer features. Additionally, we introduce sparsity in channel connections between groups to further reduce computational complexity without compromising accuracy. Furthermore, we design a novel classification output layer based on 3-D convolution, replacing the traditional maximum pooling and fully connected layers of conventional convolutional neural networks (CNNs), resulting in more efficient classification output generation. Results: We conduct extensive experiments on a curated wheat grain dataset, demonstrating the superior performance of our method, which achieves an accuracy of 99.56%, precision of 99.59%, recall of 99.55%, and an F1-score of 99.57%. Discussion: Notably, our method also exhibits the lowest number of floating-point operations (FLOPs) and parameters, making it a highly efficient solution for wheat grain classification.

13.
Comput Biol Med ; 166: 107517, 2023 Sep 25.
Article in English | MEDLINE | ID: mdl-37778214

ABSTRACT

Electroencephalogram (EEG) signals contain important information about abnormal brain activity and have become an important basis for epilepsy diagnosis. Recent epilepsy EEG classification methods mainly extract features from a single domain, which cannot effectively utilize the spatial-domain information in EEG signals. Moreover, redundant information in EEG signals degrades the learned features as convolution layers and multi-domain features accumulate, resulting in inefficient learning and a lack of discriminative features. To tackle these issues, we propose 3D-CBAMNet, an end-to-end 3D convolutional multiband seizure-type classification model based on attention mechanisms. Specifically, a multilevel wavelet decomposition is applied to the preprocessed EEG data to obtain the joint distribution information in the two-dimensional time-frequency domain across multiple frequency bands. Subsequently, this information is transformed into three-dimensional spatial data based on the electrode configuration. Discriminative joint activity features in the time, frequency, and spatial domains are then extracted by a series of parallel 3D convolutional sub-networks, where 3D channel and spatial attention mechanisms improve the ability to learn critical global and local information. A multi-layer perceptron finally integrates the extracted features and maps them to the classification results. Experimental results on the TUSZ dataset, the world's largest publicly available seizure corpus, show that 3D-CBAMNet significantly outperforms the state-of-the-art methods, indicating effectiveness in the seizure-type classification task.
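
One level of the wavelet decomposition used to obtain time-frequency bands can be sketched with the Haar wavelet; real pipelines typically use a library such as PyWavelets and several decomposition levels.

```python
import numpy as np

# One Haar decomposition level: averages give the coarse (low-frequency)
# band, differences give the detail (high-frequency) band.
def haar_level(x):
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)   # low-pass band
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)   # high-pass band
    return approx, detail

sig = np.array([4.0, 4.0, 2.0, 0.0])
a, d = haar_level(sig)
print(a, d)
```

Recursing on the approximation band yields the multilevel, multiband representation the model feeds into its 3D sub-networks; the transform is orthonormal, so signal energy is preserved across the bands.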

14.
Front Neurorobot ; 17: 1050167, 2023.
Article in English | MEDLINE | ID: mdl-37033413

ABSTRACT

Modern action recognition techniques frequently employ two networks: the spatial stream, which accepts input from RGB frames, and the temporal stream, which accepts input from optical flow. Recent research uses 3D convolutional neural networks that employ spatiotemporal filters on both streams. Although mixing flow with RGB enhances performance, accurate optical flow computation is expensive and adds delay to action recognition. In this study, we present a method for training a 3D CNN using RGB frames that replicates the motion stream and, as a result, does not require flow calculation during testing. First, in contrast to the SE block, we suggest a channel excitation module (CE module). Experiments have shown that the CE module can improve the feature extraction capabilities of a 3D network and that its effect is superior to the SE block. Second, for action recognition training, we adopt a linear mix of a loss based on knowledge distillation and the standard cross-entropy loss to effectively leverage appearance and motion information. The stream trained with this combined loss is the Intensified Motion RGB Stream (IMRS). IMRS surpasses RGB or Flow as a single stream; for example, on HMDB51 it achieves 73.5% accuracy, while the RGB and Flow streams score 65.6% and 69.1%, respectively. Extensive experiments confirm the effectiveness of our proposed method, and the comparison with other models shows that it is competitive in behavior recognition.
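
The combined objective, cross-entropy on hard labels plus a temperature-softened distillation term, can be sketched as follows; alpha and T are illustrative hyperparameters, not the paper's values.

```python
import numpy as np

def softmax(z, T=1.0):
    e = np.exp((z - z.max()) / T)
    return e / e.sum()

def combined_loss(student_logits, teacher_logits, label, alpha=0.5, T=4.0):
    p = softmax(student_logits)
    ce = -np.log(p[label])                        # hard-label cross-entropy
    ps, pt = softmax(student_logits, T), softmax(teacher_logits, T)
    kd = (pt * np.log(pt / ps)).sum() * T * T     # KL to teacher, scaled by T^2
    return (1 - alpha) * ce + alpha * kd

s = np.array([2.0, 0.5, -1.0])   # student logits (made up)
t = np.array([1.5, 1.0, -0.5])   # teacher (e.g., flow-stream) logits (made up)
print(combined_loss(s, t, label=0) > 0)
```

The T² factor keeps the gradient magnitudes of the soft term comparable to the hard term as the temperature grows, the standard convention from knowledge distillation.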

15.
Front Plant Sci ; 14: 1248598, 2023.
Article in English | MEDLINE | ID: mdl-37711294

ABSTRACT

The viability of Zea mays seed plays a critical role in determining corn yield. Therefore, developing a fast and non-destructive method is essential for rapid, large-scale seed viability detection and is of great significance for agriculture, breeding, and germplasm preservation. In this study, hyperspectral imaging (HSI) technology was used to obtain images and spectral information of maize seeds at different aging stages. To reduce data input and improve model detection speed while obtaining more stable prediction results, the successive projections algorithm (SPA) was used to extract key wavelengths that characterize seed viability; key-wavelength images of maize seeds were then divided into small blocks of 5 × 5 pixels and fed into a multi-scale 3D convolutional neural network (3DCNN) to further optimize the discrimination of single-seed viability. The final viability result for each seed was determined by comprehensively evaluating the results of all small blocks belonging to that seed with a voting algorithm. The results showed that the multi-scale 3DCNN model achieved an accuracy of 90.67% for single-seed viability discrimination on the test set. Furthermore, in an effort to reduce labor and avoid misclassification caused by human subjectivity, a YOLOv7 model and a Mask R-CNN model were constructed for germination judgment and bud-length detection, respectively; the mean average precision (mAP) of the YOLOv7 model reached 99.7%, and the determination coefficient of the Mask R-CNN model was 0.98. Overall, this study provides a feasible solution for detecting maize seed viability using HSI technology and a multi-scale 3DCNN, which is crucial for large-scale screening of viable seeds, and provides theoretical support for improving planting quality and crop yield.

16.
Comput Methods Programs Biomed ; 225: 107053, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35964421

ABSTRACT

OBJECTIVE: Nowadays, COVID-19 is spreading rapidly worldwide and seriously threatening lives. From the perspective of security and the economy, effective control of COVID-19 has a profound impact on the entire society. An effective strategy is to diagnose earlier to prevent the spread of the disease and to treat severe cases promptly to improve the chance of survival. METHODS: The method of this paper is as follows. First, the collected dataset undergoes chest radiograph processing, with bone removal carried out in the rib subtraction module. Then, the preprocessing stage applies histogram equalization, sharpening, and other operations to the chest radiographs. Finally, the backbone network extracts shallow and high-level feature maps from the processed radiographs. We implement a self-attention mechanism in Inception-Resnet, perform standard classification, and identify chest radiograph diseases through the classifier to realize a medical-level auxiliary COVID-19 diagnosis process, further enhancing the classification performance of the convolutional neural network. Numerous computer simulations demonstrate that the Inception-Resnet convolutional neural network performs CT image categorization and enhancement with greater efficiency and flexibility than conventional segmentation techniques. RESULTS: The experimental COVID-19 CT dataset obtained in this paper consists of new CT scans and medical imaging of normal subjects, early COVID-19 patients, and severe COVID-19 patients from Jinyintan hospital. The experiments plot the relationship between model accuracy, model loss, and epoch, using ACC, TPR, SPE, F1 score, and G-mean to measure the image maps of patients with and without the disease. The statistical measurements obtained by Inception-Resnet are 88.23%, 83.45%, 89.72%, 95.53%, and 88.74%, respectively. 
The experimental results show that Inception-Resnet outperforms other image classification methods on these evaluation indicators, and the method has higher robustness, accuracy, and intuitiveness. CONCLUSION: With CT images widely used in the clinical diagnosis of COVID-19 and the number of applied samples continuously increasing, the method in this paper is expected to become an additional diagnostic tool that can effectively improve the diagnostic accuracy of clinical COVID-19 imaging.
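
Histogram equalization, one of the preprocessing steps mentioned above, maps intensities through the image's cumulative distribution function so that the used gray levels spread over the full range. A minimal 8-bit sketch:

```python
import numpy as np

def hist_equalize(img):
    """Classic CDF-based histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255)
    return lut[img].astype(np.uint8)

img = np.array([[52, 52, 60], [60, 60, 70], [70, 70, 70]], dtype=np.uint8)
eq = hist_equalize(img)
print(eq.min(), eq.max())   # stretched to the full 0..255 range
```

The narrow 52-70 input range is remapped to 0-255, which is why equalization improves the visibility of low-contrast structures in chest radiographs.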


Subject(s)
COVID-19 ; COVID-19/diagnostic imaging ; COVID-19 Testing ; Humans ; Image Processing, Computer-Assisted/methods ; Lung/diagnostic imaging ; Neural Networks, Computer
17.
Front Aging Neurosci ; 14: 871706, 2022.
Article in English | MEDLINE | ID: mdl-35557839

ABSTRACT

Numerous artificial intelligence (AI)-based approaches have been proposed for automatic Alzheimer's disease (AD) prediction with brain structural magnetic resonance imaging (sMRI). Previous studies extract features from the whole brain or from individual slices separately, ignoring the properties of multi-view slices and feature complementarity. For this reason, we present a novel AD diagnosis model based on multiview-slice attention and a 3D convolutional neural network (3D-CNN). Specifically, we begin by extracting local slice-level characteristics in various dimensions using multiple sub-networks. We then propose a slice-level attention mechanism to emphasize specific 2D slices and exclude redundant features. After that, a 3D-CNN is employed to capture the global subject-level structural changes. Finally, all these 2D and 3D features are fused to obtain more discriminative representations. We conduct experiments on 1,451 subjects from the ADNI-1 and ADNI-2 datasets. Experimental results showed the superiority of our model over state-of-the-art approaches for dementia classification. Specifically, our model achieves accuracies of 91.1% and 80.1% on ADNI-1 for AD diagnosis and mild cognitive impairment (MCI) conversion prediction, respectively.

18.
Comput Biol Med ; 148: 105937, 2022 09.
Article in English | MEDLINE | ID: mdl-35985188

ABSTRACT

Behavioral variant frontotemporal dementia (bvFTD) is a neurodegenerative syndrome whose clinical diagnosis remains a challenging task, especially in the early stage of the disease. Currently, the presence of frontal and anterior temporal lobe atrophies on magnetic resonance imaging (MRI) is part of the diagnostic criteria for bvFTD. However, MRI data processing is usually dependent on the acquisition device and mostly requires human-assisted crafting of feature extraction. Following the impressive improvements of deep architectures, in this study we report on bvFTD identification using various classes of artificial neural networks, and present the results achieved on classification accuracy and independence from acquisition devices using an extensive hyperparameter search. In particular, we demonstrate the stability and generalization of different deep networks based on the attention mechanism, where data intra-mixing confers on models the ability to identify the disorder even on MRI data in inter-device settings, i.e., on data produced by different acquisition devices and without model fine-tuning, as shown by very encouraging performance evaluations that reach and exceed 90% on the AUROC and balanced-accuracy metrics.


Subject(s)
Alzheimer Disease, Frontotemporal Dementia, Atrophy, Humans, Magnetic Resonance Imaging
19.
Comput Struct Biotechnol J ; 20: 6360-6374, 2022.
Article in English | MEDLINE | ID: mdl-36420156

ABSTRACT

G protein-coupled receptors (GPCRs) are promising drug targets because they play a large role in physiological processes by modulating diverse signaling pathways in the human body. GPCR-mediated signaling pathways are regulated by four types of ligands: agonists, neutral antagonists, partial agonists, and inverse agonists. Once bound to the binding site, each type of ligand activates, deactivates, or does not perturb signaling by shifting the conformational ensemble of the GPCR. Predicting a ligand's effect on the conformation at the binding moment could be a powerful screening tool for rational GPCR drug design. Here, we detected conformational differences by capturing the spatiotemporal residue-pair pattern of the ligand-bound ß2-adrenergic receptor (ß2AR) using a 3-dimensional residual network (3D-ResNet). The network was trained with time series of protein distance maps extracted from hundreds of molecular dynamics (MD) simulation trajectories of ten ß2AR-ligand complexes. The MD system was constructed from an inactive ß2AR X-ray crystal structure embedded in a lipid bilayer and solvated with explicit water molecules. Three hyperparameters were tested during training, and we found that the number of MD trajectories in the training set significantly affected the model's accuracy. The classification of agonists and neutral antagonists was successful, but that of inverse agonists was not. Between the agonists and antagonists, different residue-pair patterns were spotted on the extracellular loop segment. This result demonstrates the potential application of a 3D neural network in GPCR drug screening, as well as an analysis tool for protein functional dynamics.
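The per-frame network input described above is a protein distance map. A minimal sketch of computing one such map from residue coordinates follows; the coordinate source and residue selection are illustrative, and trajectory parsing is not shown:

```python
import numpy as np

def distance_map(coords):
    """Pairwise residue distance map for one MD frame.

    coords: (n_residues, 3) representative-atom coordinates
    (illustrative input; real frames would come from an MD trajectory).
    Returns a symmetric (n_residues, n_residues) matrix with zero diagonal.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# toy frame: three residues placed so distances are easy to check
coords = np.array([[0.0, 0.0, 0.0],
                   [3.0, 4.0, 0.0],
                   [0.0, 0.0, 12.0]])
d = distance_map(coords)
print(d[0, 1], d[0, 2])   # 5.0 12.0
```

Stacking such maps over consecutive frames yields the spatiotemporal volume that a 3D residual network can convolve over, which matches the abstract's description at a high level.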

20.
Comput Methods Programs Biomed ; 212: 106439, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34695734

ABSTRACT

BACKGROUND AND OBJECTIVE: Early diagnosis and rational therapeutics of glomerulopathy can control progression and improve prognosis. The gold standard for the diagnosis of glomerulopathy is pathology by renal biopsy, which is invasive and has many contraindications. We aim to use renal ultrasonography for histologic classification of glomerulopathy. METHODS: Ultrasonography can present multi-view sections of the kidney; we therefore propose a multi-view and cross-domain integration strategy (CD-ConcatNet) to obtain more effective features and improve diagnostic accuracy. We apply 2D group convolution and 3D convolution to process multiple 2D ultrasound images and extract multi-view features of renal ultrasound images. Cross-domain concatenation at each spatial resolution of the feature maps is applied for more informative feature learning. RESULTS: A total of 76 adult patients were collected and divided into a training dataset (56 cases with 515 images) and a validation dataset (20 cases with 180 images). We obtained a best mean accuracy of 0.83 and an AUC of 0.8667 on the validation dataset. CONCLUSION: Comparison experiments demonstrate that our CD-ConcatNet achieves the best classification performance and is clearly superior for histologic type diagnosis. The results also show that integrating multi-view ultrasound images benefits histologic classification and that ultrasound images can indeed provide discriminating information for histologic diagnosis.
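The cross-domain concatenation step can be sketched as joining 2D-branch and 3D-branch feature maps along the channel axis at a shared spatial resolution. All shapes below, and the depth-pooling used to align the 3D map with the 2D grid, are assumptions for illustration:

```python
import numpy as np

def cross_domain_concat(feat2d, feat3d):
    """Concatenate multi-view 2D features with 3D features at one
    spatial resolution (shapes are illustrative assumptions).

    feat2d: (c2, h, w) map from the 2D group-convolution branch,
            with views folded into the channel axis.
    feat3d: (c3, d, h, w) map from the 3D-convolution branch.
    """
    # collapse the depth axis so both maps share an (h, w) grid
    feat3d_2d = feat3d.mean(axis=1)                 # (c3, h, w)
    return np.concatenate([feat2d, feat3d_2d], axis=0)

f2 = np.zeros((8, 16, 16))    # assumed 2D-branch output
f3 = np.ones((4, 6, 16, 16))  # assumed 3D-branch output
fused = cross_domain_concat(f2, f3)
print(fused.shape)            # (12, 16, 16)
```

Applying such a fusion at every spatial resolution, as the abstract describes, lets later layers see both per-view detail and cross-view volumetric context.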


Subject(s)
Ultrasonography, Humans