Results 1 - 20 of 44
1.
Neural Netw ; 179: 106555, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39068676

ABSTRACT

Lossy image coding techniques usually result in various undesirable compression artifacts. Recently, deep convolutional neural networks have seen encouraging advances in compression artifact reduction. However, most of them focus on restoring the luma channel without considering the chroma components. Moreover, most deep convolutional neural networks are hard to deploy in practical applications because of their high model complexity. In this article, we propose a dual-stage feedback network (DSFN) for lightweight color image compression artifact reduction. Specifically, we propose a novel curriculum learning strategy that drives the DSFN to reduce color image compression artifacts in a luma-to-RGB manner. In the first stage, the DSFN is dedicated to reconstructing the luma channel, whose high-level features, containing rich structural information, are then rerouted to the second stage by a feedback connection to guide RGB image restoration. Furthermore, we present a novel enhanced feedback block for efficient high-level feature extraction, in which an adaptive iterative self-refinement module progressively refines the low-level features, and an enhanced separable convolution is introduced to fully exploit multiscale image information. Extensive experiments show the notable advantage of our DSFN over several state-of-the-art methods in both quantitative indices and visual quality, with lower model complexity.
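The luma-to-RGB, feedback-style data flow described above can be sketched as follows. This is a minimal, runnable illustration with toy stand-in "networks"; `stage1_luma`, `stage2_rgb`, and the feature computation are our inventions, not the authors' DSFN.

```python
import numpy as np

# Toy sketch of the two-stage idea: stage 1 restores the luma channel and its
# high-level features are fed back to guide stage 2, which restores full RGB.
# The "networks" are stand-in linear ops so only the data flow is shown.

def stage1_luma(y):
    """Restore luma; return (restored_luma, high_level_features)."""
    feats = y.mean(axis=(0, 1), keepdims=True) * np.ones_like(y)  # toy features
    restored = 0.5 * y + 0.5 * feats
    return restored, feats

def stage2_rgb(rgb, luma_feats):
    """Restore RGB, guided by the fed-back luma features."""
    return 0.5 * rgb + 0.5 * luma_feats[..., None]

rgb = np.random.rand(8, 8, 3)                 # a degraded 8x8 RGB patch
y = rgb @ np.array([0.299, 0.587, 0.114])     # ITU-R BT.601 luma
restored_y, feats = stage1_luma(y)
restored_rgb = stage2_rgb(rgb, feats)
print(restored_rgb.shape)  # (8, 8, 3)
```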

2.
Front Aging Neurosci ; 16: 1341227, 2024.
Article in English | MEDLINE | ID: mdl-39081395

ABSTRACT

Objective: Early identification of cognitive impairment in older adults could reduce the burden of age-related disabilities. Gait parameters are associated with and predictive of cognitive decline. Although a variety of sensors and machine learning analysis methods have been used in cognitive studies, a deeply optimized, machine vision-based method for analyzing gait to identify cognitive decline is needed. Methods: This study used a walking footage dataset of 158 adults named West China Hospital Elderly Gait, which was labelled by performance on the Short Portable Mental Status Questionnaire. We proposed a novel recognition network, Deep Optimized GaitPart (DO-GaitPart), based on silhouette and skeleton gait images. Three improvements were applied: a short-term temporal template generator (STTG) in the template generation stage to decrease computational cost and minimize loss of temporal information; a depth-wise spatial feature extractor (DSFE) to extract both global and local fine-grained spatial features from gait images; and multi-scale temporal aggregation (MTA), an attention-based temporal modeling method, to improve the distinguishability of gait patterns. Results: An ablation test showed that each component of DO-GaitPart was essential. DO-GaitPart outperformed the comparison methods GaitSet, GaitPart, MT3D, 3D Local, TransGait, CSTL, GLN, GaitGL, and SMPLGait in the backpack walking scene of the CASIA-B dataset and on the Gait3D dataset. The proposed machine vision gait feature identification method achieved an area under the receiver operating characteristic curve (ROC-AUC) of 0.876 (0.852-0.900) on the cognitive state classification task. Conclusion: The proposed method performed well in identifying cognitive decline from the gait video datasets, making it a prospective prototype tool for cognitive assessment.
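The reported ROC-AUC can be computed directly as a rank statistic: the probability that a positive case outranks a negative one, counting ties as half. This generic sketch is independent of the paper's model; the scores and labels are made up.

```python
def roc_auc(scores, labels):
    """AUC as the probability a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfect separation: every positive scores above every negative.
print(roc_auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # 1.0
# One positive outranked by one negative: 3 of 4 pairs correct.
print(roc_auc([0.9, 0.2, 0.8, 0.3], [1, 0, 0, 1]))  # 0.75
```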

3.
IEEE J Biomed Health Inform ; 28(5): 3055-3066, 2024 May.
Article in English | MEDLINE | ID: mdl-38381639

ABSTRACT

Chinese medical machine reading comprehension question answering (cMed-MRCQA) is a critical component of the intelligent question-answering task, focusing on question answering in the Chinese medical domain. Its purpose is to enable machines to analyze and understand a given text and question and then extract the accurate answer. To enhance cMed-MRCQA performance, it is essential to comprehend and analyze the context deeply, deduce concealed information from the textual content, and subsequently determine the answer's span precisely. The answer span has predominantly been defined by language units, with sentences employed in most instances. However, sentences may be improperly split to varying degrees in different languages, making it challenging for a model to predict the answer zone. To alleviate this issue, this paper presents a novel architecture, HCT, based on a Hierarchically Collaborative Transformer. Specifically, we present a hierarchically collaborative method to locate the boundaries of sentence and answer spans separately. First, we designed a hierarchical encoding module to obtain the local semantic features of the corpus; second, we proposed a sentence-level self-attention module and a fused interaction-attention module to capture the global information of the text. Finally, the model is trained with a combination of loss functions. Extensive experiments were conducted on the public dataset CMedMRC and the reconstructed dataset eMedicine to validate the effectiveness of the proposed method. Experimental results showed that the proposed method performed better than state-of-the-art methods. Using the F1 metric, our model scored 90.4% on CMedMRC and 73.2% on eMedicine.
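The "trained with a combination of loss functions" step can be illustrated generically: cross-entropy terms for the sentence boundary and for the answer start/end positions, summed. All logits, targets, and the equal weighting below are illustrative, not taken from the paper.

```python
import math

def cross_entropy(logits, target):
    """Numerically stable cross-entropy of one target index over raw logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

# Toy logits: one distribution over candidate sentences, two over token
# positions for the answer start and end boundaries.
sentence_logits, sentence_gold = [2.0, 0.1, -1.0], 0
start_logits, start_gold = [0.5, 3.0, 0.2, 0.1], 1
end_logits, end_gold = [0.1, 0.4, 2.5, 0.3], 2

loss = (cross_entropy(sentence_logits, sentence_gold)
        + cross_entropy(start_logits, start_gold)
        + cross_entropy(end_logits, end_gold))
print(round(loss, 3))
```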


Subjects
Comprehension , Humans , Comprehension/physiology , China , Natural Language Processing , Reading , Semantics , East Asian People
4.
Article in English | MEDLINE | ID: mdl-38083225

ABSTRACT

Structural MRI and PET imaging play an important role in the diagnosis of Alzheimer's disease (AD), showing morphological changes and glucose metabolism changes in the brain, respectively. The manifestations in the brain images of some cognitive impairment patients are relatively inconspicuous; for example, accurate diagnosis through sMRI alone remains difficult in clinical practice. With the emergence of deep learning, the convolutional neural network (CNN) has become a valuable method for AD-aided diagnosis, but some CNN methods cannot effectively learn the features of brain images, so AD diagnosis still presents some challenges. In this work, we propose an end-to-end 3D CNN framework for AD diagnosis based on ResNet, which integrates multi-layer features obtained through an attention mechanism to better capture subtle differences in brain images. The attention maps showed that our model can focus on key brain regions related to the disease diagnosis. Our method was verified in ablation experiments with two modality images on 792 subjects from the ADNI database, where AD diagnostic accuracies of 89.71% and 91.18% were achieved based on sMRI and PET respectively, outperforming several state-of-the-art methods.
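Attention-weighted integration of multi-layer features, as described above, can be sketched generically: pool each layer's features, score them, softmax-normalize, and combine. The scoring function and shapes here are illustrative, not the paper's module.

```python
import numpy as np

# Generic attention-weighted fusion over features from several network depths.
def fuse(features):
    """features: list of (C,) vectors; returns (fused_vector, layer_weights)."""
    scores = np.array([f.mean() for f in features])   # toy scoring function
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # softmax over layers
    fused = sum(w * f for w, f in zip(weights, features))
    return fused, weights

f1, f2, f3 = np.ones(4), 2 * np.ones(4), 3 * np.ones(4)
fused, w = fuse([f1, f2, f3])
print(round(float(w.sum()), 6))  # 1.0
```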


Subjects
Alzheimer Disease , Humans , Alzheimer Disease/diagnostic imaging , Neural Networks, Computer , Magnetic Resonance Imaging/methods , Neuroimaging/methods , Brain/diagnostic imaging
5.
Comput Biol Med ; 164: 107328, 2023 09.
Article in English | MEDLINE | ID: mdl-37573721

ABSTRACT

In recent years, deep learning models have been applied to neuroimaging data for early diagnosis of Alzheimer's disease (AD). Structural magnetic resonance imaging (sMRI) and positron emission tomography (PET) images provide structural and functional information about the brain, respectively. Combining these features leads to better performance than using a single modality alone when building predictive models for AD diagnosis. However, current multi-modal approaches in deep learning, based on sMRI and PET, are mostly limited to convolutional neural networks, which do not facilitate integration of both image and phenotypic information of subjects. We propose to use graph neural networks (GNN), which are designed to deal with problems in non-Euclidean domains. In this study, we demonstrate how brain networks are created from sMRI or PET images and can be used in a population graph framework that combines phenotypic information with imaging features of the brain networks. Then, we present a multi-modal GNN framework in which each modality has its own branch of GNN, together with a technique that combines the multi-modal data at the level of both node vectors and adjacency matrices. Finally, we perform late fusion to combine the preliminary decisions made in each branch and produce a final prediction. As more multi-modal data become available, multi-source, multi-modal approaches are the trend in AD diagnosis. We conducted explorative experiments based on multi-modal imaging data combined with non-imaging phenotypic information for AD diagnosis and analyzed the impact of phenotypic information on diagnostic performance. Results from the experiments demonstrate that our proposed multi-modal approach improves performance for AD diagnosis, and our study also provides technical reference and support for the need for multivariate multi-modal diagnosis methods.
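A minimal sketch of one GNN branch plus late fusion, under our simplifying assumptions: a plain row-normalized graph convolution per modality and an averaging fusion of branch outputs. The adjacency and feature values are toy data, not the authors' population graph.

```python
import numpy as np

# One graph-convolution layer: H' = tanh(D^{-1} (A + I) H W).
def gcn_layer(A, H, W):
    A_hat = A + np.eye(len(A))                  # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))    # row-normalize
    return np.tanh(D_inv @ A_hat @ H @ W)

rng = np.random.default_rng(0)
# 3 subjects; edges could encode phenotypic agreement + imaging similarity.
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], float)
H_mri, H_pet = rng.random((3, 4)), rng.random((3, 4))  # per-modality features
W = rng.random((4, 2))                                 # shared toy weights

p_mri = gcn_layer(A, H_mri, W)                  # MRI branch output
p_pet = gcn_layer(A, H_pet, W)                  # PET branch output
late_fused = 0.5 * (p_mri + p_pet)              # late fusion of branch outputs
print(late_fused.shape)  # (3, 2)
```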


Subjects
Alzheimer Disease , Humans , Alzheimer Disease/diagnostic imaging , Magnetic Resonance Imaging/methods , Neural Networks, Computer , Positron-Emission Tomography/methods , Neuroimaging/methods , Early Diagnosis
6.
Phys Rev E ; 107(5-2): 055309, 2023 May.
Article in English | MEDLINE | ID: mdl-37329045

ABSTRACT

Digital cores can characterize the true internal structure of rocks at the pore scale, and the digital-core method has become one of the most effective ways to quantitatively analyze the pore structure and other properties of rocks in rock physics and petroleum science. Deep learning can precisely extract features from training images for rapid reconstruction of digital cores. Usually, the reconstruction of three-dimensional (3D) digital cores is performed by optimization using generative adversarial networks, and the training data required for 3D reconstruction are 3D training images. In practice, two-dimensional (2D) imaging devices are widely used because they can achieve faster imaging, higher resolution, and easier identification of different rock phases, so replacing 3D images with 2D ones avoids the difficulty of acquiring 3D images. In this paper, we propose a method, named EWGAN-GP, for the reconstruction of 3D structures from a 2D image. Our proposed method includes an encoder, a generator, and three discriminators. The main purpose of the encoder is to extract the statistical features of a 2D image. The generator extends the extracted features into 3D data structures. Meanwhile, the three discriminators are designed to gauge the similarity of morphological characteristics between cross sections of the reconstructed 3D structure and the real image. A porosity loss function is used to control the distribution of each phase in general. In the entire optimization process, a strategy using the Wasserstein distance with gradient penalty makes the convergence of the training process faster and the reconstruction result more stable; it also avoids the problems of gradient disappearance and mode collapse. Finally, the reconstructed 3D structure and the target 3D structure are visualized to confirm their similar morphology, the morphological parameter indicators of the reconstructed 3D structure were consistent with those of the target 3D structure, and the microstructure parameters of the 3D structure were also compared and analyzed. The proposed method achieves accurate and stable 3D reconstruction compared with classical stochastic methods of image reconstruction.
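The porosity loss controlling the phase distribution can be read as a penalty on the gap between generated and target pore fractions. This is a hedged one-line version of that reading, not necessarily the paper's exact formula.

```python
import numpy as np

def porosity_loss(volume, target_porosity):
    """volume: binary array, 1 = pore phase; squared gap to target porosity."""
    return (volume.mean() - target_porosity) ** 2

vol = np.zeros((4, 4, 4))
vol[:2] = 1.0                                   # half the voxels are pore
print(round(porosity_loss(vol, 0.5), 6))  # 0.0   (porosity matches target)
print(round(porosity_loss(vol, 0.3), 6))  # 0.04  (porosity off by 0.2)
```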


Subjects
Deep Learning , Image Processing, Computer-Assisted , Image Processing, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Porosity
7.
Comput Biol Med ; 162: 107050, 2023 08.
Article in English | MEDLINE | ID: mdl-37269680

ABSTRACT

Alzheimer's disease (AD) is a neurodegenerative disorder and the most common cause of dementia, so accurate diagnosis of AD and its prodromal stage, mild cognitive impairment (MCI), is significant. Recent studies have demonstrated that multiple neuroimaging and biological measures contain complementary information for diagnosis. Many existing multi-modal models based on deep learning simply concatenate each modality's features despite substantial differences in their representation spaces. In this paper, we propose a novel multi-modal cross-attention AD diagnosis (MCAD) framework that learns the interaction between modalities to better exploit their complementary roles in AD diagnosis with multi-modal data, including structural magnetic resonance imaging (sMRI), fluorodeoxyglucose-positron emission tomography (FDG-PET), and cerebrospinal fluid (CSF) biomarkers. Specifically, the imaging and non-imaging representations are learned by an image encoder based on cascaded dilated convolutions and a CSF encoder, respectively. Then, a multi-modal interaction module is introduced, which takes advantage of cross-modal attention to integrate imaging and non-imaging information and reinforce the relationships between these modalities. Moreover, an extensive objective function is designed to reduce the discrepancy between modalities and effectively fuse the features of multi-modal data, which can further improve diagnostic performance. We evaluate the effectiveness of our proposed method on the ADNI dataset, and extensive experiments demonstrate that MCAD achieves superior performance on multiple AD-related classification tasks compared with several competing methods. We also investigate the importance of cross-attention and the contribution of each modality to diagnostic performance. The experimental results demonstrate that combining multi-modal data via cross-attention is helpful for accurate AD diagnosis.
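The cross-modal attention at the heart of the interaction module can be sketched in its generic scaled-dot-product form, with imaging features as queries and CSF-derived features as keys/values; the paper's exact module may differ, and the shapes and features here are toy.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(Q, K, V):
    """Generic scaled dot-product attention: each query aggregates V."""
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d))
    return weights @ V

img_feats = np.random.rand(6, 8)   # 6 imaging tokens (queries)
csf_feats = np.random.rand(3, 8)   # 3 CSF biomarker tokens (keys/values)
out = cross_attention(img_feats, csf_feats, csf_feats)
print(out.shape)  # (6, 8) — each image token now carries CSF information
```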


Subjects
Alzheimer Disease , Cognitive Dysfunction , Humans , Alzheimer Disease/diagnostic imaging , Neuroimaging/methods , Magnetic Resonance Imaging/methods , Positron-Emission Tomography/methods , Cognitive Dysfunction/diagnostic imaging
8.
Math Biosci Eng ; 20(3): 4912-4939, 2023 01 05.
Article in English | MEDLINE | ID: mdl-36896529

ABSTRACT

Chinese medical knowledge-based question answering (cMed-KBQA) is a vital component of the intelligent question-answering task. Its purpose is to enable the model to comprehend questions and then deduce the proper answer from the knowledge base. Previous methods solely considered how questions and knowledge-base paths were represented, disregarding their significance; due to entity and path sparsity, question-answering performance could not be effectively enhanced. To address this challenge, this paper presents a structured methodology for cMed-KBQA based on the dual-systems theory of cognitive science, synchronizing an observation stage (System 1) and an expressive reasoning stage (System 2). System 1 learns the question's representation and queries the associated simple path. System 2 then retrieves complicated paths for the question from the knowledge base by using the simple path provided by System 1. Specifically, System 1 is implemented by the entity extraction module, entity linking module, simple path retrieval module, and simple path-matching model, while System 2 comprises the complex path retrieval module and complex path-matching model. The public CKBQA2019 and CKBQA2020 datasets were extensively studied to evaluate the suggested technique. Using the average F1-score metric, our model achieved 78.12% on CKBQA2019 and 86.60% on CKBQA2020.
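For reference, the F1-score used above is the harmonic mean of precision and recall over predicted versus gold answer sets; the toy answer strings below are illustrative, not from the datasets.

```python
def f1(pred, gold):
    """F1 over two answer sets: harmonic mean of precision and recall."""
    pred, gold = set(pred), set(gold)
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

print(f1({"insulin"}, {"insulin"}))          # 1.0  (exact match)
print(round(f1({"a", "b"}, {"b", "c"}), 3))  # 0.5  (one of two items shared)
```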


Subjects
Knowledge Bases , Semantics , Information Storage and Retrieval , Problem Solving
9.
Comput Methods Programs Biomed ; 228: 107249, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36423486

ABSTRACT

BACKGROUND AND OBJECTIVE: The Chinese medical question answer matching (cMedQAM) task is an essential branch of the medical question answering system. Its goal is to accurately choose the correct response from a pool of candidate answers. The relatively effective methods are based on deep neural networks and attention mechanisms to obtain rich question-and-answer representations. However, those methods overlook two crucial characteristics of Chinese characters: glyphs and pinyin. Furthermore, they lose the local semantic information of phrases by generating attention information using only relevant medical keywords. To address this challenge, we propose the multi-scale context-aware interaction approach based on multi-granularity embedding (MAGE) in this paper. METHODS: We adapted ChineseBERT, which integrates Chinese character glyph and pinyin information into the language model, and fine-tuned it on a medical corpus. This addresses the common phenomenon of homonyms in Chinese. Moreover, we propose a context-aware interactive module to correctly align question and answer sequences and infer semantic relationships. Finally, we utilize a multi-view fusion method to combine local semantic features and attention representations. RESULTS: We conducted validation experiments on three publicly available datasets, namely cMedQA V1.0, cMedQA V2.0, and cEpilepsyQA. The proposed approach is validated by top-1 accuracy: on cMedQA V1.0, cMedQA V2.0, and cEpilepsyQA, the top-1 accuracy on the test set reached 74.1%, 82.7%, and 60.9%, respectively. Experimental results on the three datasets demonstrate that MAGE achieves superior performance over state-of-the-art methods for Chinese medical question answer matching tasks. CONCLUSIONS: The experimental results indicate that the proposed model can improve the accuracy of the Chinese medical question answer matching task. Therefore, it may be considered a potential intelligent assistant tool for future Chinese medical question answering systems.
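For reference, top-1 accuracy is the fraction of questions whose highest-scored candidate answer is the labelled correct one; the candidate scores and gold labels below are made up.

```python
def top1_accuracy(score_lists, gold_indices):
    """Fraction of examples where the argmax-scored candidate is the gold one."""
    hits = sum(max(range(len(s)), key=s.__getitem__) == g
               for s, g in zip(score_lists, gold_indices))
    return hits / len(gold_indices)

scores = [[0.2, 0.9, 0.1],   # argmax 1, gold 1 -> hit
          [0.6, 0.3, 0.1],   # argmax 0, gold 0 -> hit
          [0.1, 0.2, 0.7]]   # argmax 2, gold 0 -> miss
gold = [1, 0, 0]
print(round(top1_accuracy(scores, gold), 3))  # 0.667 (2 of 3 correct)
```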


Subjects
East Asian People , Language , Humans
10.
Heliyon ; 8(10): e11038, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36267375

ABSTRACT

Visual-based social group detection aims to cluster pedestrians in crowd scenes according to social interactions and spatio-temporal position relations using surveillance video data. It is a basic technique for crowd behaviour analysis and group-based activity understanding. According to proxemics theory, the interpersonal relationship between individuals determines the scope of their self-space, while spatial distance can reflect the closeness of their interpersonal relationship. In this paper, we propose a new unsupervised approach to interaction recognition and social group detection in public spaces, which removes the need for time-consuming, labour-intensive labelling of training data. First, based on pedestrians' spatio-temporal trajectories, the interpersonal distances among individuals were measured from static and dynamic perspectives. Combined with proxemics theory, a social interaction recognition scheme was designed to judge whether there is a social interaction between pedestrians. On this basis, the pedestrians are clustered to identify whether they form a social group. Extensive experiments on our pedestrian dataset "SCU-VSD-Social", annotated with multi-group labels, demonstrated that the proposed method has outstanding performance in both accuracy and complexity.
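The clustering step, distance-thresholded linking followed by connected components, can be sketched as follows. This is a simplification of the static/dynamic distance measurement; the threshold and pedestrian positions are illustrative.

```python
# Link pedestrians whose pairwise distance falls below a proxemics-style
# threshold, then read off connected components (union-find) as social groups.
def social_groups(positions, threshold):
    n = len(positions)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            dist = ((positions[i][0] - positions[j][0]) ** 2
                    + (positions[i][1] - positions[j][1]) ** 2) ** 0.5
            if dist < threshold:
                parent[find(i)] = find(j)  # union the two components

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Two pedestrians walking together, one far away.
print(social_groups([(0, 0), (1, 0), (10, 10)], threshold=1.5))  # [[0, 1], [2]]
```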

11.
Phys Rev E ; 106(2-2): 025310, 2022 Aug.
Article in English | MEDLINE | ID: mdl-36109946

ABSTRACT

Modeling the three-dimensional (3D) structure from a given 2D image is of great importance for analyzing and studying the physical properties of porous media. As an intractable inverse problem, this fundamental problem has been addressed by many methods developed over the past decades. Among them, deep learning (DL)-based methods show great advantages in terms of accuracy, diversity, and efficiency. Usually, 3D reconstruction from a 2D slice with a larger field of view is more conducive to accurately simulating and analyzing the physical properties of porous media. However, due to limited reconstruction ability, the reconstruction size of the most widely used generative adversarial network-based models is constrained to 64^{3} or 128^{3}. Recently, a 3D porous media recurrent neural network-based method (3D-PMRNN) was proposed to improve the reconstruction ability, expanding the reconstruction size to 256^{3}. Nevertheless, to train these models, existing DL-based methods need to down-sample the original computed tomography (CT) image first so that the convolutional kernel can capture the morphological features of the training images; thus, the detailed information of the original CT image is lost. Besides, 3D reconstruction from an optical thin section is not available because of the large size of the cut slice. In this paper, we propose an improved recurrent generative model to further enhance the reconstruction ability (512^{3}). Benefiting from the RNN-based architecture, the proposed model requires as few as one 3D training sample and generates the 3D structures layer by layer. There are three further improvements: first, a hybrid receptive field is adopted for the convolutional kernels; second, an attention-based module is merged into the proposed model; finally, a section loss is proposed to enhance continuity along the Z direction. Three experiments were carried out to verify the effectiveness of the proposed model. Experimental results indicate the good reconstruction ability of the proposed model in terms of accuracy, diversity, and generalization, and the effectiveness of the section loss is also demonstrated by visual inspection and statistical comparison.
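The section loss along the Z direction can be read as a penalty on discontinuity between adjacent generated slices. Below is a hedged sketch using per-slice statistics; the paper's exact definition may differ.

```python
import numpy as np

def section_loss(volume):
    """volume: (Z, H, W); mean absolute change between neighbouring slices."""
    slice_means = volume.mean(axis=(1, 2))
    return np.abs(np.diff(slice_means)).mean()

smooth = np.ones((8, 16, 16))         # identical slices: perfectly continuous
rough = np.ones((8, 16, 16))
rough[::2] = 0.0                      # alternating empty/full slices
print(section_loss(smooth))  # 0.0
print(section_loss(rough))   # 1.0
```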

12.
Artif Intell Med ; 131: 102346, 2022 09.
Article in English | MEDLINE | ID: mdl-36100340

ABSTRACT

Medical visual question answering (Med-VQA) aims to accurately answer clinical questions about medical images. Despite its enormous potential for application in the medical domain, the current technology is still in its infancy. Compared with the general visual question answering task, Med-VQA involves more demanding challenges. First, clinical questions about medical images are usually diverse due to different clinicians and the complexity of diseases; consequently, noise is inevitably introduced when extracting question features. Second, Med-VQA has usually been regarded as a classification problem over predefined answers, ignoring the relationships between candidate responses, so the Med-VQA model pays equal attention to all candidate answers when predicting. In this paper, a novel Med-VQA framework is proposed to alleviate these problems. Specifically, we apply a question-type reasoning module separately to closed-ended and open-ended questions, extracting the important information contained in the questions through an attention mechanism and filtering out noise to obtain more valuable question features. To take advantage of the relational information between answers, we designed a semantic constraint space to calculate the similarity between answers and assign higher attention to answers with high correlation. To evaluate the effectiveness of the proposed method, extensive experiments were conducted on a public dataset, VQA-RAD. Experimental results showed that the proposed method achieved better performance than other state-of-the-art methods: the overall accuracy, closed-ended accuracy, and open-ended accuracy reached 74.1%, 82.7%, and 60.9%, respectively. It is worth noting that the absolute accuracy of the proposed method improved by 5.5% for closed-ended questions.
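The semantic-constraint idea, letting similarity between candidate answers redistribute attention, can be sketched as follows. This is our reading of the abstract; the embeddings, classifier scores, and the correlation-weighting scheme are all illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy candidate-answer embeddings: answers 0 and 1 are semantically close,
# answer 2 is unrelated. The classifier initially favours answer 0.
answers = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
logits = np.array([2.0, 0.0, 0.0])

norms = np.linalg.norm(answers, axis=1)
sim = (answers @ answers.T) / np.outer(norms, norms)   # cosine similarity
adjusted = softmax(sim @ softmax(logits))              # correlation-aware weights

# The answer correlated with the top-scored one receives more attention
# than the uncorrelated one.
print(bool(adjusted[1] > adjusted[2]))  # True
```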


Subjects
Semantics , Algorithms , Attention , Image Interpretation, Computer-Assisted/methods , Image Processing, Computer-Assisted/methods
13.
Neural Netw ; 155: 155-167, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36058021

ABSTRACT

Text-to-image synthesis is a fundamental and challenging task in computer vision, which aims to synthesize realistic images from given descriptions. Recently, text-to-image synthesis methods have achieved great improvements in the quality of synthesized images. However, very few works have explored its application to face synthesis, which has great potential in face-related applications and the public safety domain. Moreover, the faces generated by existing methods are generally of poor quality and have low consistency with the given text. To tackle this issue, we build a novel end-to-end dual-channel-generator-based generative adversarial network, named DualG-GAN, to improve the quality of the generated images and their consistency with the text description. In DualG-GAN, to improve the consistency between the synthesized image and the input description, a dual-channel generator block is introduced, and a novel loss is designed to improve the similarity between the generated image and the ground truth at three different semantic levels. Extensive experiments demonstrate that DualG-GAN achieves state-of-the-art results on the SCU-Text2face dataset. To further verify the performance of DualG-GAN, we compare it with the current best methods on text-to-image synthesis tasks, where quantitative and qualitative results show that the proposed DualG-GAN achieves optimal performance in both Fréchet inception distance (FID) and R-precision metrics. As only a few works focus on text-to-face synthesis, this work can serve as a baseline for future research.
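For reference, the FID reported above is a Fréchet distance between Gaussian feature statistics; restricted to one dimension it has the closed form d² = (μ₁ − μ₂)² + σ₁² + σ₂² − 2σ₁σ₂. The sketch below shows only this univariate special case, not the full matrix FID.

```python
def frechet_1d(mu1, s1, mu2, s2):
    """Squared Frechet distance between two 1-D Gaussians N(mu, s^2)."""
    return (mu1 - mu2) ** 2 + s1 ** 2 + s2 ** 2 - 2 * s1 * s2

print(frechet_1d(0.0, 1.0, 0.0, 1.0))  # 0.0  (identical distributions)
print(frechet_1d(0.0, 1.0, 3.0, 1.0))  # 9.0  (means differ by 3)
```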


Subjects
Image Processing, Computer-Assisted , Image Processing, Computer-Assisted/methods
14.
Math Biosci Eng ; 19(10): 10192-10212, 2022 07 20.
Article in English | MEDLINE | ID: mdl-36031991

ABSTRACT

Medical visual question answering (Med-VQA) aims to leverage a pre-trained artificial intelligence model to answer clinical questions raised by doctors or patients regarding radiology images. However, owing to the high professional requirements in the medical field and the difficulty of annotating medical data, Med-VQA lacks sufficient large-scale, well-annotated radiology images for training. Researchers have mainly focused on improving the model's visual feature extractor to address this problem; little research has focused on textual feature extraction, and most existing work underestimates the interactions between corresponding visual and textual features. In this study, we propose a corresponding feature fusion (CFF) method to strengthen the interactions of specific features from corresponding radiology images and questions. In addition, we designed a semantic attention (SA) module for textual feature extraction. This helps the model consciously focus on the meaningful words in various questions while reducing the attention spent on insignificant information. Extensive experiments demonstrate that the proposed method achieves competitive results on two benchmark datasets and outperforms existing state-of-the-art methods in answer prediction accuracy. Experimental results also show that our model is capable of semantic understanding during answer prediction, which offers certain advantages in Med-VQA.
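The semantic-attention idea, concentrating weight on the meaningful question words, can be sketched generically; the query construction and word embeddings below are illustrative stand-ins, not the paper's SA module.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

words = ["is", "there", "a", "fracture", "in", "the", "rib"]
emb = np.random.default_rng(1).random((len(words), 8))  # toy word embeddings

# Pretend a learned query favours the clinically salient words.
query = emb[3] + emb[6]                 # "fracture" + "rib"
weights = softmax(emb @ query)          # attention over question words
attended = weights @ emb                # question feature focused on key words
print(attended.shape)  # (8,)
```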


Subjects
Artificial Intelligence , Semantics , Algorithms , Attention , Humans
15.
Sensors (Basel) ; 22(16)2022 Aug 10.
Article in English | MEDLINE | ID: mdl-36015738

ABSTRACT

The new-generation video coding standard Versatile Video Coding (VVC) has adopted many novel technologies to improve compression performance, and consequently remarkable results have been achieved. In practical applications, less data, in terms of bitrate, reduces the burden on the sensors and improves their performance. Hence, to further enhance the intra compression performance of VVC, we propose a fusion-based intra prediction algorithm in this paper. Specifically, to better predict areas with similar texture information, we propose a fusion-based adaptive template matching method, which directly takes the error between the reference and objective templates into account. Furthermore, to better utilize the correlation between reference pixels and the pixels to be predicted, we propose a fusion-based linear prediction method, which can compensate for the deficiency of single linear prediction. We implemented our algorithm on top of the VVC Test Model (VTM) 9.1. Compared with the VVC, our proposed fusion-based algorithm saves bitrates of 0.89%, 0.84%, and 0.90% on average for the Y, Cb, and Cr components, respectively. In addition, compared with some other existing works, our algorithm shows superior performance in bitrate savings.
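One way to read "taking the error between reference and objective templates into account" is an error-inverse weighting when fusing candidate predictions. The sketch below illustrates that reading; the weights, values, and blend are ours, not the algorithm's exact formulation.

```python
def fuse_predictions(pred_a, err_a, pred_b, err_b, eps=1e-9):
    """Blend two candidate pixel predictions, weighting each by the inverse
    of its template-matching error (lower error -> higher weight)."""
    w_a = 1.0 / (err_a + eps)
    w_b = 1.0 / (err_b + eps)
    return (w_a * pred_a + w_b * pred_b) / (w_a + w_b)

# Predictor A matched its template much better (error 1 vs 3), so the fused
# value lands closer to A's prediction.
print(round(fuse_predictions(100.0, 1.0, 120.0, 3.0), 2))  # 105.0
```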


Subjects
Data Compression , Algorithms , Data Compression/methods , Video Recording/methods
16.
Sensors (Basel) ; 22(15)2022 Jul 25.
Article in English | MEDLINE | ID: mdl-35898027

ABSTRACT

Although Versatile Video Coding (VVC) achieves a superior coding performance to High-Efficiency Video Coding (HEVC), it takes a long time to encode video sequences due to the high computational complexity of its tools. Among these tools, Multiple Transform Selection (MTS) requires the best of several transforms to be obtained using the Rate-Distortion Optimization (RDO) process, which increases the time spent on video encoding, meaning that VVC is not suited to real-time sensor application networks. In this paper, a low-complexity multiple transform selection, combined with the multi-type tree partition algorithm, is proposed to address this issue. First, to skip the MTS process, we introduce a method to estimate the Rate-Distortion (RD) cost of the last Coding Unit (CU) based on the relationship between the RD costs of transform candidates and the correlation between Sub-Coding Units' (sub-CUs') information entropy under binary splitting. When the sum of the RD costs of the sub-CUs is greater than or equal to that of their parent CU, the RD checking of MTS is skipped. Second, we make full use of the coding information of neighboring CUs to terminate MTS early. The experimental results show that, compared with the VVC, the proposed method achieves a 26.40% reduction in encoding time, with a 0.13% increase in Bjøntegaard Delta Bitrate (BDBR).
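The early-skip rule is stated explicitly in the abstract and reduces to a one-line comparison; the function and variable names below are ours.

```python
def skip_mts(parent_rd_cost, sub_cu_rd_costs):
    """Skip the MTS RD check when the sub-CUs' summed RD cost under binary
    splitting is greater than or equal to the parent CU's RD cost."""
    return sum(sub_cu_rd_costs) >= parent_rd_cost

print(skip_mts(100.0, [60.0, 55.0]))  # True  -> skip MTS RD checking
print(skip_mts(100.0, [40.0, 45.0]))  # False -> run the full MTS search
```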


Subjects
Algorithms , Entropy , Video Recording/methods
17.
J Neural Eng ; 19(4)2022 08 09.
Article in English | MEDLINE | ID: mdl-35882218

ABSTRACT

Objective: Alzheimer's disease (AD) is a degenerative brain disorder and one of the leading causes of death in elderly people, so early diagnosis of AD is vital for prompt access to medication and medical care. Fluorodeoxyglucose positron emission tomography (FDG-PET) has proven effective for understanding neurological changes by measuring glucose uptake. Our aim is to explore information-rich regions of FDG-PET imaging that enhance the accuracy and interpretability of AD-related diagnosis. Approach: We develop a novel method for early diagnosis of AD based on multi-scale discriminative regions in FDG-PET imaging, which takes diagnostic interpretability into account. Specifically, a multi-scale region localization module is proposed to automatically identify disease-related discriminative regions in full-volume FDG-PET images in an unsupervised manner, upon which a confidence score is designed to prioritize regions according to the density distribution of anomalies. The proposed multi-scale region classification module then adaptively fuses multi-scale region representations and performs decision fusion, which not only discards useless information but also offers complementary information. Most previous methods concentrate on discriminating AD from cognitively normal (CN) subjects, whereas mild cognitive impairment, a transitional state, facilitates early diagnosis. Therefore, our method is further applied to multiple AD-related diagnosis tasks, not limited to AD vs. CN. Main results: Experimental results on the Alzheimer's Disease Neuroimaging Initiative dataset show that the proposed method achieves superior performance over state-of-the-art FDG-PET-based approaches. Moreover, some cerebral cortices highlighted by the extracted regions cohere with medical research, further demonstrating the method's superiority. Significance: This work offers an effective method to achieve AD diagnosis and detect disease-affected regions in FDG-PET imaging.
Our results could provide an additional opinion to support clinical diagnosis.
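The region-prioritization step, scoring candidate regions by the density of anomalies they contain, can be illustrated with a minimal NumPy sketch. The function name, the bounding-box representation, and the toy anomaly map are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def region_confidence(anomaly_map, regions):
    """Rank candidate regions by the density of anomalous voxels they contain.

    anomaly_map: 3D array of per-voxel anomaly scores in [0, 1].
    regions: list of (z0, z1, y0, y1, x0, x1) bounding boxes.
    Returns (region, score) pairs sorted by descending mean anomaly density.
    """
    scores = []
    for z0, z1, y0, y1, x0, x1 in regions:
        patch = anomaly_map[z0:z1, y0:y1, x0:x1]
        # Mean density rather than raw sum, so region size does not bias the ranking.
        scores.append(float(patch.mean()))
    order = np.argsort(scores)[::-1]
    return [(regions[i], scores[i]) for i in order]

# Toy volume with a single "hot" anomalous region.
vol = np.zeros((8, 8, 8))
vol[2:4, 2:4, 2:4] = 1.0
ranked = region_confidence(vol, [(0, 2, 0, 2, 0, 2), (2, 4, 2, 4, 2, 4)])
```

In a full pipeline, the top-scoring regions would then feed the multi-scale region classification module for decision fusion.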


Subjects
Alzheimer Disease , Cognitive Dysfunction , Aged , Alzheimer Disease/diagnostic imaging , Brain , Cognitive Dysfunction/diagnostic imaging , Early Diagnosis , Fluorodeoxyglucose F18 , Humans , Positron-Emission Tomography/methods
18.
Comput Methods Programs Biomed ; 217: 106676, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35167997

ABSTRACT

BACKGROUND AND OBJECTIVE: Multi-modal medical images, such as magnetic resonance imaging (MRI) and positron emission tomography (PET), have been widely used for the diagnosis of brain disorders like Alzheimer's disease (AD) because they provide complementary information. PET scans can detect cellular changes in organs and tissues earlier than MRI, but unlike MRI, PET data are difficult to acquire due to cost, radiation exposure, and other limitations. Moreover, PET data are missing for many subjects in the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. To solve this problem, a 3D end-to-end generative adversarial network (named BPGAN) is proposed to synthesize brain PET from MRI scans, which can serve as a data completion scheme for multi-modal medical image research. METHODS: We propose BPGAN, which learns an end-to-end mapping function that transforms input MRI scans into their underlying PET scans. First, we design a 3D multiple convolution U-Net (MCU) generator architecture to improve the visual quality of the synthetic results while preserving the diverse brain structures of different subjects. By further employing a 3D gradient profile (GP) loss and a structural similarity index measure (SSIM) loss, the synthetic PET scans match the ground truth more closely. We also explore alternative data-partitioning schemes to study their impact on the performance of the proposed method in different medical scenarios. RESULTS: We conduct experiments on the publicly available ADNI database. The proposed BPGAN is evaluated by mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and SSIM, and outperforms the compared models on all of these quantitative metrics. Qualitative evaluations also validate the effectiveness of our approach.
Additionally, when MRI is combined with our synthetic PET scans, the accuracies of multi-class AD diagnosis on dataset-A and dataset-B are 85.00% and 56.47%, an improvement of about 1% each over stand-alone MRI. CONCLUSIONS: The quantitative measures, qualitative displays, and classification evaluation demonstrate that the PET images synthesized by BPGAN are plausible and of high quality, providing complementary information that improves AD diagnosis. This work provides a valuable reference for multi-modal medical image analysis.
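A gradient profile loss of this kind is typically an L1 penalty on the difference between the spatial gradients of the synthetic and real volumes, which pushes the generator to reproduce sharp anatomical boundaries rather than only matching voxel intensities. The NumPy sketch below is one plausible form under that assumption; the exact formulation in BPGAN may differ:

```python
import numpy as np

def gradient_profile_loss(fake, real):
    """L1 distance between spatial gradients of two 3D volumes, averaged over axes.

    Penalizing gradient differences (rather than only voxel-wise error)
    discourages over-smoothed, blurry synthetic scans.
    """
    loss = 0.0
    for axis in range(3):
        g_fake = np.diff(fake, axis=axis)  # finite-difference gradient along one axis
        g_real = np.diff(real, axis=axis)
        loss += np.abs(g_fake - g_real).mean()
    return loss / 3.0

real = np.zeros((4, 4, 4))
real[:, :2, :] = 1.0                    # volume with one sharp edge
blurry = np.full((4, 4, 4), 0.5)        # same global mean, but no edge at all
flat_loss = gradient_profile_loss(blurry, real)
self_loss = gradient_profile_loss(real, real)
```

The blurry volume matches the real one in mean intensity yet is penalized because it lacks the edge, which is exactly the behavior a voxel-wise loss alone would miss.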


Subjects
Alzheimer Disease , Alzheimer Disease/diagnostic imaging , Brain/diagnostic imaging , Humans , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Positron-Emission Tomography
19.
IEEE Trans Neural Netw Learn Syst ; 33(1): 430-444, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34793307

ABSTRACT

The amount of multimedia data, such as images and videos, has been increasing rapidly with the development of various imaging devices and the Internet, placing ever greater demands on information storage and transmission. The redundancy in images can be reduced to decrease data size via lossy compression, such as the most widely used standard, Joint Photographic Experts Group (JPEG). However, decompressed images generally suffer from various artifacts (e.g., blocking, banding, ringing, and blurring) due to the loss of information, especially at high compression ratios. This article presents a feature-enriched deep convolutional neural network for compression artifact reduction (FeCarNet, for short). Taking a dense network as the backbone, FeCarNet enriches features to gain valuable information by introducing multi-scale dilated convolutions, along with efficient 1×1 convolutions that lower both parameter complexity and computation cost. Meanwhile, to make full use of the different levels of features in FeCarNet, a fusion block consisting of attention-based channel recalibration and dimension reduction is developed for local and global feature fusion. Furthermore, short and long residual connections in both the feature and pixel domains are combined to build a multi-level residual structure, benefiting network training and performance. In addition, to further reduce computational complexity, pixel-shuffle-based image downsampling and upsampling layers are arranged at the head and tail of FeCarNet, respectively, which also enlarges the receptive field of the whole network. Experimental results show the superiority of FeCarNet over state-of-the-art compression artifact reduction approaches in terms of both restoration capacity and model complexity.
Applying FeCarNet to several computer vision tasks, including image deblurring, edge detection, image segmentation, and object detection, further demonstrates its effectiveness.
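Pixel-shuffle layers rearrange pixels between the spatial and channel dimensions, so resolution can be reduced losslessly at the network head (cutting computation and enlarging the receptive field) and restored exactly at the tail. A minimal NumPy sketch of the two inverse operations follows; the function names are ours, not FeCarNet's:

```python
import numpy as np

def space_to_depth(x, r):
    """Pixel-unshuffle: (C, H, W) -> (C*r*r, H/r, W/r). Lossless downsampling."""
    c, h, w = x.shape
    x = x.reshape(c, h // r, r, w // r, r)
    # Move the r x r sub-pixel offsets into the channel dimension.
    return x.transpose(0, 2, 4, 1, 3).reshape(c * r * r, h // r, w // r)

def depth_to_space(x, r):
    """Pixel-shuffle: (C, H, W) -> (C/(r*r), H*r, W*r). Exact inverse of the above."""
    c, h, w = x.shape
    x = x.reshape(c // (r * r), r, r, h, w)
    return x.transpose(0, 3, 1, 4, 2).reshape(c // (r * r), h * r, w * r)

img = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
down = space_to_depth(img, 2)       # (8, 2, 2): spatial size halved, channels x4
restored = depth_to_space(down, 2)  # exact round trip, no information lost
```

Because the transform is a pure rearrangement, intermediate convolutions operate on quarter-size feature maps while the full-resolution image remains recoverable.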

20.
J Neurosci Methods ; 365: 109376, 2022 01 01.
Article in English | MEDLINE | ID: mdl-34627926

ABSTRACT

BACKGROUND: Alzheimer's disease (AD) is the most common form of progressive, irreversible dementia, impairing people's ability to carry out daily activities. At present, neuroimaging technology plays an important role in the evaluation and early diagnosis of AD. With the widespread application of artificial intelligence in the medical field, deep learning has shown great potential for computer-aided AD diagnosis based on MRI. NEW METHOD: In this study, we propose a deep learning framework based on structural MRI (sMRI) gray-matter slices for AD diagnosis. Compared with previous deep learning methods, our method enhances gray-matter feature information more effectively by combining slice regions with an attention mechanism, which improves the accuracy of AD diagnosis. RESULTS: To assess the performance of the proposed method, experiments were conducted on T1-weighted sMRI images from the ADNI database with leakage-free data splitting. Our method achieves 0.90 accuracy in AD/NC classification and 0.825 accuracy in AD/MCI classification, outperforming other competitive single-modality methods based on sMRI. Furthermore, we identified the most discriminative brain MRI slice region for AD diagnosis. COMPARISON WITH EXISTING METHODS: Our proposed method, based on regional attention over gray-matter slices, improves accuracy by 1%-8% compared with several state-of-the-art methods for AD diagnosis. CONCLUSIONS: The experimental results indicate that our method focuses on more effective features in the gray matter of coronal slices, achieving a more accurate diagnosis of Alzheimer's disease. This study provides an effective approach and a more objective evaluation for AD diagnosis based on sMRI slice images.
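An attention mechanism of the kind described here often takes the form of squeeze-and-excitation-style channel recalibration: globally pooled features gate each feature channel before further processing. The NumPy sketch below illustrates that general pattern under our own assumptions (the weight shapes and reduction ratio are illustrative, not the paper's architecture):

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation-style channel recalibration.

    features: (C, H, W) feature maps.
    w1: (C, C//ratio) and w2: (C//ratio, C) form the two-layer bottleneck.
    Channels whose global context is informative are amplified; others are suppressed.
    """
    squeezed = features.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)          # ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))     # sigmoid gates in (0, 1)
    return features * gates[:, None, None]           # per-channel rescaling

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 5, 5))   # 8 channels of 5x5 feature maps
w1 = rng.standard_normal((8, 2))         # reduction ratio 4 (8 -> 2)
w2 = rng.standard_normal((2, 8))
out = channel_attention(feats, w1, w2)
```

Since every gate lies strictly between 0 and 1, the recalibration can only rescale channels, never change their sign or enlarge them, which makes it a cheap, stable add-on to a slice-based backbone.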


Subjects
Alzheimer Disease , Cognitive Dysfunction , Alzheimer Disease/diagnostic imaging , Artificial Intelligence , Cognitive Dysfunction/diagnosis , Gray Matter/diagnostic imaging , Humans , Magnetic Resonance Imaging/methods , Neuroimaging/methods