ABSTRACT
Aging in an individual refers to the temporal change, mostly decline, in the body's ability to meet physiological demands. Biological age (BA) is a biomarker of aging and can be used to stratify populations and predict certain age-related chronic diseases. BA can be predicted from biomedical features such as brain MRI, retinal, or facial images, but the inherent heterogeneity of the aging process limits the usefulness of BA predicted from individual body systems. In this paper, we developed a multimodal Transformer-based architecture with cross-attention that combines facial, tongue, and retinal images to estimate BA. We trained our model on facial, tongue, and retinal images from 11,223 healthy subjects and demonstrated that fusing the three image modalities achieved the most accurate BA predictions. We validated our approach on a test population of 2,840 individuals with six chronic diseases and observed significantly larger differences between chronological age and BA (AgeDiff) than in healthy subjects. We showed that AgeDiff has the potential to be used as a standalone biomarker, or alongside other known factors, for risk stratification and progression prediction in chronic diseases. Our results therefore highlight the feasibility of using multimodal images to estimate and interrogate the aging process.
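As a rough illustration of the fusion mechanism described above, the sketch below stacks one embedding per modality and lets the modalities attend to one another before regressing age. It is a minimal PyTorch sketch, not the authors' architecture: the embedding dimension, upstream encoders, mean pooling, and regression head are all assumptions.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse per-modality embeddings with attention across modalities (illustrative)."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, 1)  # regress biological age

    def forward(self, face, tongue, retina):
        # Stack modality tokens: (batch, 3, dim)
        tokens = torch.stack([face, tongue, retina], dim=1)
        # Each modality token attends to the other modalities (and itself)
        fused, _ = self.attn(tokens, tokens, tokens)
        fused = self.norm(fused + tokens)     # residual connection
        pooled = fused.mean(dim=1)            # average over modalities
        return self.head(pooled).squeeze(-1)  # predicted BA in years

# Toy usage with random stand-ins for upstream per-modality encoders
model = CrossAttentionFusion()
face, tongue, retina = (torch.randn(8, 256) for _ in range(3))
print(model(face, tongue, retina).shape)  # torch.Size([8])
```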
Subjects
Aging; Electric Power Supplies; Humans; Face; Biomarkers; Chronic Disease
ABSTRACT
Drug-target interactions (DTIs) are a key part of the drug development process, and their accurate and efficient prediction can significantly boost development efficiency and reduce development time. Recent years have witnessed the rapid advancement of deep learning, resulting in an abundance of deep learning-based models for DTI prediction. However, most of these models use a single representation of drugs and proteins, making it difficult to comprehensively represent their characteristics. Multimodal data fusion can effectively compensate for the limitations of single-modal data. However, existing multimodal models for DTI prediction do not account for intra- and inter-modal interactions simultaneously, which limits the representational capability of the fused features and reduces DTI prediction accuracy. To address multimodal feature fusion, we propose a hierarchical multimodal self-attention-based graph neural network for DTI prediction, called HMSA-DTI. HMSA-DTI takes drug SMILES strings, drug molecular graphs, protein sequences, and protein 2-mer sequences as inputs and uses a hierarchical multimodal self-attention mechanism to deeply fuse the multimodal features of drugs and proteins, capturing both intra- and inter-modal interactions. We demonstrate that HMSA-DTI has significant advantages over baseline methods on multiple evaluation metrics across five benchmark datasets.
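The hierarchical idea, intra-modal attention within each input stream followed by inter-modal attention across the stream summaries, can be sketched as below. This is a loose PyTorch illustration under assumed shapes and shared attention weights, not the HMSA-DTI specification.

```python
import torch
import torch.nn as nn

class HierarchicalFusion(nn.Module):
    """Two-level attention: intra-modal tokens first, inter-modal second."""
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.intra = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, modalities):
        # modalities: list of (batch, tokens_i, dim), e.g. SMILES tokens,
        # graph-node embeddings, protein residues, 2-mer embeddings
        pooled = []
        for x in modalities:
            h, _ = self.intra(x, x, x)    # intra-modal interactions
            pooled.append(h.mean(dim=1))  # one summary token per modality
        stack = torch.stack(pooled, dim=1)           # (batch, n_modalities, dim)
        fused, _ = self.inter(stack, stack, stack)   # inter-modal interactions
        return torch.sigmoid(self.classifier(fused.mean(dim=1)))  # P(interaction)

fusion = HierarchicalFusion()
batch = [torch.randn(4, 40, 128), torch.randn(4, 30, 128),
         torch.randn(4, 200, 128), torch.randn(4, 100, 128)]
print(fusion(batch).shape)  # torch.Size([4, 1])
```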
Subjects
Deep Learning; Neural Networks, Computer; Proteins/chemistry; Proteins/metabolism; Humans; Algorithms; Computational Biology/methods
ABSTRACT
Adolescents are a high-risk population for major depressive disorder. Executive dysfunction is a common feature of depression and exerts a significant influence on the social functioning of adolescents. This study aimed to identify the multimodal co-varying brain network related to executive function in adolescents with major depressive disorder. A total of 24 adolescent patients with major depressive disorder and 43 healthy controls were included and completed the Intra-Extra Dimensional Set Shift task. Multimodal neuroimaging data, including the amplitude of low-frequency fluctuations from resting-state functional magnetic resonance imaging and gray matter volume from structural magnetic resonance imaging, were combined with executive function using a supervised fusion method, multimodal canonical correlation analysis with reference plus joint independent component analysis. The major depressive disorder group made more total errors than the healthy controls in the Intra-Extra Dimensional Set Shift task, and their performance on the task was negatively related to the 14-item Hamilton Rating Scale for Anxiety score. We discovered an executive function-related multimodal fronto-occipito-temporal network with lower amplitude of low-frequency fluctuation and gray matter volume loadings in major depressive disorder. The gray matter component of the identified network was negatively related to errors made in the Intra-Extra Dimensional Set Shift task and positively related to stages completed. These findings may deepen our understanding of the pathophysiological mechanisms of cognitive dysfunction in adolescent depression.
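The fusion method above builds on canonical correlation between modalities. The toy sketch below shows only the unsupervised two-view CCA core on synthetic ALFF and gray matter features; the actual mCCAR+jICA method additionally constrains components with a behavioral reference and applies joint ICA, which is not reproduced here.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_subjects = 67                               # 24 MDD patients + 43 controls
alff = rng.standard_normal((n_subjects, 30))  # toy ALFF features per subject
gmv = rng.standard_normal((n_subjects, 30))   # toy gray matter volume features

# Two-view CCA: find projections of each modality that are maximally
# correlated across subjects, i.e. a co-varying multimodal pattern
cca = CCA(n_components=2)
alff_scores, gmv_scores = cca.fit_transform(alff, gmv)
r = np.corrcoef(alff_scores[:, 0], gmv_scores[:, 0])[0, 1]
print(f"canonical correlation of component 1: {r:.2f}")
```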
Subjects
Depressive Disorder, Major; Executive Function; Magnetic Resonance Imaging; Multimodal Imaging; Humans; Depressive Disorder, Major/diagnostic imaging; Depressive Disorder, Major/physiopathology; Adolescent; Executive Function/physiology; Male; Female; Magnetic Resonance Imaging/methods; Multimodal Imaging/methods; Brain/diagnostic imaging; Brain/physiopathology; Gray Matter/diagnostic imaging; Gray Matter/pathology; Neuroimaging/methods; Cognition/physiology; Nerve Net/diagnostic imaging; Nerve Net/physiopathology; Neuropsychological Tests; Brain Mapping/methods
ABSTRACT
INTRODUCTION: Previous brain studies of growth hormone deficiency (GHD) often used single-modal neuroimaging, missing the complexity captured by multimodal data. Growth hormone affects gut microbiota and metabolism in GHD. However, from a gut-brain axis (GBA) perspective, the relationship between abnormal brain development in GHD and microbiota alterations remains unclear. The ultimate goal is to uncover the manifestations underlying GBA abnormalities in GHD and idiopathic short stature (ISS). METHODS: Participants included 23 children with GHD and 25 with ISS. Fusion independent component analysis was applied to integrate multimodal brain data (high-resolution structural, diffusion tensor, and resting-state functional MRI) covering regional homogeneity (ReHo), amplitude of low-frequency fluctuations (ALFF), and white matter fractional anisotropy (FA). Gut microbiome diversity and metabolites were analyzed using 16S sequencing and proton nuclear magnetic resonance (1H-NMR). Associations between multimodal neuroimaging and cognition were assessed using moderation analysis. RESULTS: Six independent components (ICs) of ReHo, ALFF, and FA differed significantly between GHD and ISS patients, with three functional components linked to the processing speed index. GHD individuals showed higher levels of acetate, nicotinate, and lysine in microbiota metabolism. Higher alpha diversity in GHD strengthened the connections between ReHo-IC1, ReHo-IC5, ALFF-IC1, and the processing speed index, while increasing Agathobacter levels in ISS weakened the link between ALFF-IC1 and the speech comprehension index. CONCLUSIONS: Our findings uncover differences in fused structural and functional brain features in GHD, alongside altered microbial metabolism of short-chain fatty acids. Additionally, the microbiome moderates the connections between neuroimaging and cognition, offering insight into distinct GBA patterns in GHD and ISS and enhancing our understanding of the disease's pathophysiology and potential interventions.
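Moderation analysis, as used above to test whether microbiome measures alter the brain-cognition association, amounts to a regression with an interaction term. Below is a minimal statsmodels sketch on synthetic data; the variable names (IC loading, alpha diversity, processing speed index) are placeholders, not the study's actual variables.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 48  # 23 GHD + 25 ISS participants
df = pd.DataFrame({
    "ic_loading": rng.standard_normal(n),  # e.g. ALFF-IC1 loading
    "alpha_div": rng.standard_normal(n),   # gut alpha diversity (moderator)
})
# Outcome built with a true interaction so the model has something to find
df["psi"] = (0.5 * df.ic_loading + 0.3 * df.alpha_div
             + 0.6 * df.ic_loading * df.alpha_div + rng.standard_normal(n))

# Moderation = regression with a product term: a significant
# ic_loading:alpha_div coefficient means the moderator changes
# the strength of the brain-cognition association
model = smf.ols("psi ~ ic_loading * alpha_div", data=df).fit()
print(model.summary().tables[1])
```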
Subjects
Brain; Cognition; Gastrointestinal Microbiome; Magnetic Resonance Imaging; Humans; Gastrointestinal Microbiome/physiology; Male; Child; Female; Brain/diagnostic imaging; Brain/metabolism; Cognition/physiology; Adolescent; Brain-Gut Axis/physiology; Human Growth Hormone/deficiency; Human Growth Hormone/metabolism; Diffusion Tensor Imaging
ABSTRACT
Acute ischemic stroke is a leading cause of mortality and morbidity worldwide. Timely identification of the extent of a stroke is crucial for effective treatment, and spatio-temporal (4D) computed tomography perfusion (CTP) imaging plays a critical role in this process. Recently, the first deep learning-based methods that leverage the full spatio-temporal nature of perfusion imaging to predict stroke lesion outcomes have been proposed. However, clinical information is typically not integrated into the learning process, even though it could improve tissue outcome prediction given the known influence of physiological, demographic, and treatment factors on lesion growth. Cross-attention, a multimodal fusion strategy, has been used successfully to combine information from multiple sources, but it has yet to be applied to stroke lesion outcome prediction. This work therefore aimed to develop and evaluate a novel multimodal spatio-temporal deep learning model that uses cross-attention to combine information from 4D CTP and clinical metadata to predict stroke lesion outcomes. The proposed model was evaluated on a dataset of 70 acute ischemic stroke patients, demonstrating significantly improved volume estimates (mean error = 19 ml) compared to a baseline unimodal approach (mean error = 35 ml, p < 0.05). The model can generate attention maps and counterfactual outcome scenarios to investigate the relevance of clinical variables in predicting stroke lesion outcomes at the patient level, helping to provide a better understanding of its decision-making process.
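Counterfactual outcome scenarios of the kind mentioned above can be probed by holding the imaging input fixed and re-running the model with an altered clinical variable. The sketch below does this with a hypothetical linear surrogate in place of a trained network; the feature names and weights are invented purely for illustration.

```python
import numpy as np

def predict_volume(ctp_features, clinical):
    """Stand-in for a trained multimodal model; returns lesion volume in ml.
    A hypothetical linear surrogate, not the paper's network."""
    w = np.array([10.0, -4.0, 2.5])  # e.g. [onset-to-imaging h, recanalized, age/10]
    return float(40 + w @ clinical + 0.01 * ctp_features.sum())

ctp = np.zeros(16)                # placeholder for learned 4D CTP features
base = np.array([3.0, 1.0, 7.2])  # observed patient metadata

# Counterfactual scenario: same imaging, but no recanalization (flag 1 -> 0)
cf = base.copy(); cf[1] = 0.0
print("factual volume  :", predict_volume(ctp, base))
print("counterfactual  :", predict_volume(ctp, cf))
print("estimated effect:", predict_volume(ctp, cf) - predict_volume(ctp, base))
```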
Subjects
Brain Ischemia; Ischemic Stroke; Stroke; Humans; Brain Ischemia/diagnostic imaging; Brain Ischemia/therapy; Four-Dimensional Computed Tomography; Stroke/diagnostic imaging; Stroke/therapy; Spatio-Temporal Analysis; Perfusion
ABSTRACT
Neurovascular compression syndrome (NVCS), characterized by cranial nerve compression by adjacent blood vessels at the root entry zone, frequently presents as trigeminal neuralgia (TN), hemifacial spasm (HFS), or glossopharyngeal neuralgia (GN). Although magnetic resonance tomographic angiography (MRTA) is widely used for NVCS assessment, its limited sensitivity to small vessels and veins poses challenges. This study aimed to refine vessel localization and surgical planning for NVCS patients using a novel 3D multimodal fusion imaging (MFI) technique incorporating computed tomography angiography and venography (CTA/CTV). A retrospective analysis was conducted on 76 patients diagnosed with single-site primary TN, HFS, or GN who underwent microvascular decompression (MVD) surgery. Imaging was obtained from MRTA and CTA/CTV sequences, followed by image processing and 3D-MFI using FastSurfer and 3DSlicer. CTA/CTV-3D-MFI showed higher sensitivity than MRTA-3D-MFI in predicting responsible vessels (98.6% vs. 94.6%) and NVC severity (98.6% vs. 90.8%). Kappa coefficients revealed strong agreement for MRTA-3D-MFI (0.855 for vessels, 0.835 for NVC severity) and excellent agreement for CTA/CTV-3D-MFI (0.951 for vessels, 0.952 for NVC severity). Resident neurosurgeons significantly preferred CTA/CTV-3D-MFI because of its better correspondence with surgical reality, clearer depiction of surgical anatomy, and optimized visualization of approaches (p < 0.001). Implementing CTA/CTV-3D-MFI significantly enhanced diagnostic accuracy and surgical planning for NVCS, outperforming MRTA-3D-MFI in identifying responsible vessels and assessing NVC severity. This imaging modality can potentially improve outcomes by guiding safer and more targeted surgeries, particularly where MRTA does not adequately visualize crucial neurovascular structures.
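The agreement statistics above are Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal scikit-learn sketch on made-up per-patient vessel calls:

```python
from sklearn.metrics import cohen_kappa_score

# Toy ratings: responsible vessel identified on 3D-MFI vs. a reference
# standard, coded per patient (labels here are invented examples)
reference = [0, 0, 1, 2, 1, 0, 2, 1, 0, 0]
mfi_call  = [0, 0, 1, 2, 1, 0, 2, 2, 0, 0]

# Kappa near 0.85 is commonly read as "strong" agreement,
# above 0.9 as "almost perfect"
print(f"kappa = {cohen_kappa_score(reference, mfi_call):.3f}")
```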
Assuntos
Angiografia por Tomografia Computadorizada , Angiografia por Ressonância Magnética , Cirurgia de Descompressão Microvascular , Síndromes de Compressão Nervosa , Neuralgia do Trigêmeo , Humanos , Cirurgia de Descompressão Microvascular/métodos , Feminino , Masculino , Pessoa de Meia-Idade , Idoso , Síndromes de Compressão Nervosa/cirurgia , Síndromes de Compressão Nervosa/diagnóstico por imagem , Adulto , Estudos Retrospectivos , Neuralgia do Trigêmeo/cirurgia , Neuralgia do Trigêmeo/diagnóstico por imagem , Angiografia por Ressonância Magnética/métodos , Angiografia por Tomografia Computadorizada/métodos , Espasmo Hemifacial/cirurgia , Espasmo Hemifacial/diagnóstico por imagem , Imageamento Tridimensional/métodos , Doenças do Nervo Glossofaríngeo/cirurgia , Idoso de 80 Anos ou mais , Flebografia/métodosRESUMO
This review examines recent developments in deep learning (DL) techniques applied to multimodal fusion image segmentation for liver cancer. Hepatocellular carcinoma is a highly dangerous malignant tumor that requires accurate image segmentation for effective treatment and disease monitoring. Multimodal image fusion has the potential to offer more comprehensive information and more precise segmentation, and DL techniques have achieved remarkable progress in this domain. This paper starts with an introduction to liver cancer, then explains the preprocessing and fusion methods for multimodal images, and then explores the application of DL methods in this area. Various DL architectures, such as convolutional neural networks (CNNs) and U-Net, are discussed along with their benefits for multimodal image fusion segmentation. Furthermore, the evaluation metrics and datasets currently used to measure the performance of segmentation models are reviewed. Alongside this progress, the challenges of current research, such as data imbalance, model generalization, and model interpretability, are emphasized, and future research directions are suggested. The application of DL to multimodal image segmentation for liver cancer is transforming the field of medical imaging and is expected to further enhance the accuracy and efficiency of clinical decision making. This review provides useful insights and guidance for medical practitioners.
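Among the evaluation metrics such a review covers, the Dice similarity coefficient is the most common measure of segmentation overlap. A small self-contained sketch, with 2D toy masks standing in for 3D liver segmentations:

```python
import numpy as np

def dice(pred, truth, eps=1e-8):
    """Dice similarity coefficient between two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return (2 * inter + eps) / (pred.sum() + truth.sum() + eps)

# Toy "tumor" masks; real use would pass 3D CT/MR segmentations
truth = np.zeros((64, 64)); truth[20:40, 20:40] = 1
pred = np.zeros((64, 64)); pred[22:42, 22:42] = 1
print(f"Dice = {dice(pred, truth):.3f}")  # 0.810 for this toy overlap
```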
ABSTRACT
PURPOSE: Diagnosing renal artery stenosis (RAS) presents challenges. This research aimed to develop a deep learning model for the computer-aided diagnosis of RAS, using multimodal fusion of ultrasound scanning images, spectral waveforms, and clinical information. METHODS: A total of 1485 patients who received renal artery ultrasonography at Peking Union Medical College Hospital were included, and their color Doppler sonography (CDS) images were classified according to anatomical site and left-right orientation. RAS diagnosis was modeled as a process of feature extraction and multimodal fusion. Three deep learning (DL) models (ResNeSt, ResNet, and XCiT) were trained on a multimodal dataset consisting of CDS images, spectrum waveform images, and basic patient information. The predictive performance of the models was compared with that of a senior physician and evaluated on a test dataset (N = 117 patients) with renal artery angiography results. RESULTS: The sample sizes of the training and validation datasets were 3292 and 169, respectively. On the test data (N = 676 samples), all three DL models achieved accuracies above 80%; ResNeSt achieved an accuracy of 83.49% ± 0.45%, a precision of 81.89% ± 3.00%, and a recall of 76.97% ± 3.7%. There was no significant difference between the accuracies of ResNeSt and ResNet (82.84% ± 1.52%), while ResNeSt was more accurate than XCiT (80.71% ± 2.23%, p < 0.05). Compared against the gold standard, renal artery angiography, the accuracy of the ResNeSt model was 78.25% ± 1.62%, inferior to that of the senior physician (90.09%). In addition, a single-modal model trained on spectrum waveform images performed worse than the multimodal fusion model. CONCLUSION: The DL multimodal fusion model shows promising results in assisting RAS diagnosis.
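Feature-level fusion of the kind described, image embeddings concatenated with clinical variables before classification, can be sketched as below. Logistic regression on synthetic features stands in for the paper's CNN backbones and fusion head; all shapes and the label model are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(2)
n = 600
cds_feat = rng.standard_normal((n, 64))   # embedding of the CDS image
wave_feat = rng.standard_normal((n, 32))  # embedding of the spectral waveform
clinical = rng.standard_normal((n, 5))    # basic patient information
y = (cds_feat[:, 0] + wave_feat[:, 0] + clinical[:, 0]
     + 0.5 * rng.standard_normal(n) > 0).astype(int)  # synthetic RAS label

# Feature-level fusion: concatenate per-modality representations
X = np.hstack([cds_feat, wave_feat, clinical])
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
print(f"fused-feature accuracy: {accuracy_score(yte, clf.predict(Xte)):.2f}")
```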
Subjects
Deep Learning; Renal Artery Obstruction; Humans; Renal Artery Obstruction/diagnostic imaging; Angiography; Ultrasonography, Doppler, Color/methods
ABSTRACT
The maturity of fruits and vegetables such as tomatoes significantly affects quality indicators such as taste, nutritional value, and shelf life, making maturity determination vital in agricultural production and the food processing industry. Tomatoes ripen from the inside out, so internal and external maturity can be uneven, which makes it very challenging to judge maturity from a single modality. In this paper, we propose a deep learning-assisted multimodal data fusion technique combining color imaging, spectroscopy, and haptic sensing for the maturity assessment of tomatoes. The method uses feature fusion to integrate information from the image, near-infrared spectral, and haptic modalities into a unified feature set and then classifies tomato maturity through deep learning. Each modality independently extracts features: exterior color from color images, internal and surface spectral features linked to chemical composition in the visible and near-infrared range (350 nm to 1100 nm), and physical firmness from haptic sensing. By combining the preprocessed and extracted features from the multiple modalities, data fusion creates a comprehensive representation of all three modalities as a feature vector in a feature space suited to tomato maturity assessment. A fully connected neural network is then constructed to process these fused data. This neural network model achieves 99.4% accuracy in tomato maturity classification, surpassing single-modal methods (color imaging: 94.2%; spectroscopy: 87.8%; haptics: 87.2%). For uneven internal and external maturity, the classification accuracy reaches 94.4%, demonstrating effective results. A comparative analysis of multimodal fusion against single-modal methods validates the stability and applicability of the multimodal fusion technique. These findings demonstrate the key benefits of multimodal fusion for improving the accuracy of tomato ripeness classification and provide a strong theoretical and practical basis for applying multimodal fusion technology to classify the quality and maturity of other fruits and vegetables. Using deep learning (a fully connected neural network) to process multimodal data provides a new and efficient non-destructive approach for the large-scale classification of agricultural and food products.
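A compact way to mimic this pipeline, concatenating per-modality features, projecting them into a compact subspace, and classifying with a fully connected network, is sketched below with scikit-learn on synthetic data. The feature counts, the PCA step standing in for the paper's feature-space projection, and the network size are all assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n = 300
color = rng.standard_normal((n, 20))   # color-image features (e.g. HSV stats)
nir = rng.standard_normal((n, 50))     # 350-1100 nm spectral features
haptic = rng.standard_normal((n, 10))  # firmness-related haptic features
ripeness = rng.integers(0, 4, n)       # four synthetic maturity stages

# Feature-level fusion: concatenate, project to a compact subspace,
# then classify with a small fully connected network
X = np.hstack([color, nir, haptic])
model = make_pipeline(StandardScaler(), PCA(n_components=15),
                      MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500))
print("CV accuracy:", cross_val_score(model, X, ripeness, cv=3).mean())
```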
Subjects
Fruit; Neural Networks, Computer; Solanum lycopersicum; Solanum lycopersicum/growth & development; Solanum lycopersicum/physiology; Fruit/growth & development; Deep Learning; Spectroscopy, Near-Infrared/methods; Color
ABSTRACT
To enhance the accuracy of detecting objects in front of intelligent vehicles in urban road scenarios, this paper proposes a dual-layer voxel feature fusion augmentation network (DL-VFFA), which aims to address object misrecognition caused by local occlusion or a limited field of view. The network employs a point cloud voxelization architecture, using the Mahalanobis distance to associate similar point clouds within neighborhood voxel units. It integrates local and global information through weight sharing to extract boundary point information within each voxel unit. The relative position encoding of voxel features is computed using an improved attention Gaussian deviation matrix in point cloud space to focus on the relative positions of different voxel sequences within channels. During the fusion of point cloud and image features, learnable weight parameters are designed to decouple fine-grained regions, enabling two-layer feature fusion from voxel to voxel and from point cloud to image. Extensive experiments on the KITTI dataset demonstrate the strong performance of DL-VFFA. Compared to the baseline network SECOND, DL-VFFA performs better in medium- and high-difficulty scenarios. Furthermore, compared to the voxel fusion module in MVX-Net, the voxel feature fusion in this paper is more accurate, effectively capturing fine-grained object features after voxelization. Ablation experiments provide in-depth analyses of how the three voxel fusion modules in DL-VFFA enhance the performance of the baseline detector.
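The Mahalanobis association step above measures distance relative to a point cloud's own covariance rather than with a fixed Euclidean sphere. A minimal SciPy sketch, with an invented cloud and threshold:

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

rng = np.random.default_rng(4)
voxel_pts = rng.standard_normal((200, 3)) * [2.0, 1.0, 0.5]  # anisotropic cloud

# Mahalanobis distance whitens by the cloud's covariance, so "near"
# follows the local point distribution
cov_inv = np.linalg.inv(np.cov(voxel_pts.T))
center = voxel_pts.mean(axis=0)
query = np.array([1.5, 0.2, 0.1])
d = mahalanobis(query, center, cov_inv)
print(f"Mahalanobis distance to voxel centroid: {d:.2f}")
# Associate the query point with this voxel's cloud if d is below a threshold
print("associate" if d < 2.0 else "reject")
```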
ABSTRACT
In autonomous driving, the fusion of multiple sensors is considered essential for improving the accuracy and safety of 3D object detection. Currently, a fusion scheme combining low-cost cameras with highly robust radars can counteract the performance degradation caused by harsh environments. In this paper, we propose the IRBEVF-Q model, which mainly consists of a BEV (Bird's Eye View) fusion encoding module and an object decoder module. The BEV fusion encoding module solves the problem of unified representation of different modal information by fusing image and radar features through 3D spatial reference points as a medium. The queries in the object decoder are a core component and play an important role in detection. In query construction, we propose Heat Map-Guided Query Initialization (HGQI) and Dynamic Position Encoding (DPE) to increase the prior information carried by the queries, while an Auxiliary Noise Query (ANQ) helps to stabilize matching. The experimental results demonstrate that the proposed fusion model IRBEVF-Q achieves an NDS of 0.575 and a mAP of 0.476 on the nuScenes test set. Compared to recent state-of-the-art methods, our model shows significant advantages, indicating that our approach contributes to improved detection accuracy.
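Heat-map-guided query initialization can be illustrated by seeding object queries at the hottest cells of a fused BEV heat map. The PyTorch sketch below uses top-k selection; the grid size, k, and the toy heat map are assumptions, not the IRBEVF-Q design.

```python
import torch

def init_queries_from_heatmap(heatmap, k=4):
    """Pick the k hottest BEV cells as initial object-query positions."""
    H, W = heatmap.shape
    scores, idx = torch.topk(heatmap.flatten(), k)  # top-k activations
    ys, xs = idx // W, idx % W                      # back to 2D grid coords
    return torch.stack([xs, ys], dim=1).float(), scores

# Toy BEV heat map with two hot spots standing in for fused camera-radar output
bev = torch.zeros(8, 8)
bev[2, 5], bev[6, 1] = 0.9, 0.7
positions, scores = init_queries_from_heatmap(bev, k=2)
print(positions)  # tensor([[5., 2.], [1., 6.]])
print(scores)     # tensor([0.9000, 0.7000])
```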
ABSTRACT
Scene graphs can enhance the scene-understanding capability of intelligent ships in navigation. However, the complex entity relationships and significant noise in contextual information within navigation scenes pose challenges for navigation scene graph generation (NSGG). To address these issues, this paper proposes a novel NSGG network named SGK-Net, which comprises three innovative modules. The Semantic-Guided Multimodal Fusion (SGMF) module uses prior information on relationship semantics to fuse multimodal information and construct relationship features, elucidating the relationships between entities and reducing the semantic ambiguity caused by complex relationships. The Graph Structure Learning-based Structure Evolution (GSLSE) module, based on graph structure learning, reduces redundancy in relationship features and optimizes the computational complexity of subsequent contextual message passing. The Key Entity Message Passing (KEMP) module takes full advantage of contextual information to refine relationship features, reducing noise interference from non-key nodes. Furthermore, this paper constructs the first Ship Navigation Scene Graph Simulation dataset, SNSG-Sim, which provides a foundational dataset for research on ship navigation SGG. Experimental results on the SNSG-Sim dataset demonstrate that our method improves on the baseline by 8.31% (R@50) in the PredCls task and 7.94% (R@50) in the SGCls task, validating its effectiveness for navigation scene graph generation.
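The R@50 numbers above are recall-at-K over predicted relationship triplets. A minimal sketch of the metric on invented navigation triplets (the entity and predicate names are made up):

```python
def recall_at_k(gt_triplets, pred_triplets, k=50):
    """R@K: fraction of ground-truth (subject, predicate, object)
    triplets found among the top-k scored predictions."""
    topk = set(pred_triplets[:k])  # predictions assumed sorted by score
    hits = sum(1 for t in gt_triplets if t in topk)
    return hits / len(gt_triplets)

# Toy navigation-scene triplets
gt = [("ship_1", "ahead_of", "ship_2"), ("ship_1", "near", "buoy_3"),
      ("ship_2", "heading_to", "berth_4")]
pred = [("ship_1", "ahead_of", "ship_2"), ("ship_2", "near", "buoy_3"),
        ("ship_1", "near", "buoy_3")]
print(f"R@50 = {recall_at_k(gt, pred):.2f}")  # 2 of 3 matched -> 0.67
```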
ABSTRACT
Anomaly detection plays a critical role in ensuring the safe, smooth, and efficient operation of machinery and equipment in industrial environments. With the wide deployment of multimodal sensors and the rapid development of the Internet of Things (IoT), the data generated in modern industrial production have become increasingly diverse and complex. However, traditional anomaly detection methods based on a single data source cannot fully exploit multimodal data to capture anomalies in industrial systems. To address this challenge, we propose a new model for anomaly detection in industrial environments using multimodal temporal data. The model integrates an attention-based autoencoder (AAE) and a generative adversarial network (GAN) to capture and fuse rich information from different data sources. Specifically, the AAE captures time-series dependencies and relevant features in each modality, and the GAN introduces adversarial regularization to enhance the model's ability to reconstruct normal time-series data. We conduct extensive experiments on real industrial data containing both measurements from a distributed control system (DCS) and acoustic signals, and the results demonstrate that the proposed model outperforms the state-of-the-art TimesNet for anomaly detection, with an improvement of 5.6% in F1 score.
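Autoencoder-based detectors of this kind typically score each window by its reconstruction error and flag the largest errors. The sketch below reproduces just that thresholding step on synthetic error values; the gamma-distributed errors and the 95% quantile threshold are assumptions, not the paper's procedure.

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(5)
# Per-window reconstruction errors from an autoencoder trained on normal
# data: anomalies reconstruct poorly, so their errors are larger
err_normal = rng.gamma(2.0, 0.05, 950)
err_anomaly = rng.gamma(2.0, 0.05, 50) + 0.5
errors = np.concatenate([err_normal, err_anomaly])
labels = np.concatenate([np.zeros(950, int), np.ones(50, int)])

# Flag windows whose error exceeds a high quantile of the error distribution
threshold = np.quantile(errors, 0.95)
flags = (errors > threshold).astype(int)
print(f"F1 = {f1_score(labels, flags):.2f}")
```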
ABSTRACT
Human activity recognition (HAR) is pivotal in advancing applications ranging from healthcare monitoring to interactive gaming. Traditional HAR systems, which rely primarily on single data sources, are limited in capturing the full spectrum of human activities. This study introduces a comprehensive approach to HAR by integrating two critical modalities: RGB imaging and advanced pose estimation features. Our methodology leverages the strengths of each modality to overcome the drawbacks of unimodal systems, providing a richer and more accurate representation of activities. We propose a two-stream network that processes skeletal and RGB data in parallel, enhanced by pose estimation techniques for refined feature extraction. The integration of these modalities through advanced fusion algorithms significantly improves recognition accuracy. Extensive experiments on the UTD multimodal human action dataset (UTD-MHAD) demonstrate that the proposed approach exceeds the performance of existing state-of-the-art algorithms. This study not only sets a new benchmark for HAR systems but also highlights the importance of feature engineering in capturing the complexity of human movements and integrating optimal features. Our findings pave the way for more sophisticated, reliable, and applicable HAR systems in real-world scenarios.
Subjects
Algorithms; Human Activities; Humans; Image Processing, Computer-Assisted/methods; Movement/physiology; Posture/physiology; Pattern Recognition, Automated/methods
ABSTRACT
The development of deep learning-based multimodal learning is advancing rapidly and is widely used in artificial intelligence-generated content, such as image-text conversion and image-text generation. Electronic health records are digital information, such as numbers, charts, and text, generated by medical staff using information systems in the course of medical activities. Deep learning-based multimodal fusion of electronic health records can help medical staff comprehensively analyze the large volume of multimodal medical data generated during diagnosis and treatment, enabling accurate diagnosis and timely intervention for patients. In this article, we first introduce the methods and development trends of deep learning-based multimodal data fusion. Second, we summarize and compare the fusion of structured electronic medical records with other medical data, such as images and text, focusing on the clinical application types, sample sizes, and fusion methods involved in the research. Our analysis of the literature shows that deep learning methods for fusing different medical data modalities typically either select an appropriate pre-trained model for each modality to produce feature representations that are fused afterwards, or fuse the modalities using an attention mechanism. Lastly, the difficulties encountered in multimodal medical data fusion and its future directions, including modeling methods and the evaluation and application of models, are discussed. Through this review, we hope to provide reference information for building models that can comprehensively use multimodal medical data.
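The attention-based fusion pattern mentioned above can be reduced to learning a weight per modality embedding and taking the weighted sum. A minimal PyTorch sketch, where the embedding size and the three modality examples are assumptions:

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Learn a scalar weight per modality and take the weighted sum."""
    def __init__(self, dim=64):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # scores each modality embedding

    def forward(self, embeddings):
        # embeddings: (batch, n_modalities, dim), one vector per modality
        w = torch.softmax(self.score(embeddings), dim=1)  # (batch, n, 1)
        return (w * embeddings).sum(dim=1)                # fused (batch, dim)

pool = AttentionPooling()
# e.g. structured-EHR, radiology-image, and clinical-text embeddings
modalities = torch.randn(2, 3, 64)
print(pool(modalities).shape)  # torch.Size([2, 64])
```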
Subjects
Deep Learning; Electronic Health Records; Humans; Artificial Intelligence; Algorithms
ABSTRACT
The brain functions as a precise circuit that regulates information so that it is sequentially propagated and processed in a hierarchical manner. However, it is still unknown how the brain is hierarchically organized and how information is dynamically propagated during high-level cognition. In this study, we developed a new scheme for quantifying the information transmission velocity (ITV) by combining electroencephalography (EEG) and diffusion tensor imaging (DTI), and then mapped the cortical ITV network (ITVN) to explore the information transmission mechanism of the human brain. Application to MRI-EEG data from a P300 paradigm revealed bottom-up and top-down ITVN interactions subserving P300 generation, comprising four hierarchical modules. Among these modules, information exchange between visual- and attention-activated regions occurred at high velocity, so the related cognitive processes could be accomplished efficiently owing to the heavy myelination of these regions. Moreover, inter-individual variability in the P300 was found to be attributable to differences in the brain's information transmission efficiency, which may provide new insight into cognitive decline in neurodegenerative disorders, such as Alzheimer's disease, from a transmission velocity perspective. Together, these findings confirm the capacity of ITV to effectively characterize the efficiency of information propagation in the brain.
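Conceptually, a transmission velocity couples a structural distance with a functional delay. The sketch below illustrates only that core idea, dividing a DTI tract length by an EEG-derived lag, with invented numbers; the paper's actual ITV quantification is more involved.

```python
import numpy as np

# Conceptual ITV estimate for one cortical connection: divide the DTI
# tract length between two regions by the EEG-derived transmission delay
tract_length_mm = 85.0  # streamline length from DTI tractography
delay_ms = 12.5         # inter-regional lag, e.g. from EEG latency
itv_m_per_s = (tract_length_mm / 1000) / (delay_ms / 1000)
print(f"ITV = {itv_m_per_s:.1f} m/s")  # 6.8 m/s, within axonal ranges

# A whole-cortex ITV network would repeat this for every connected
# region pair, giving a velocity-weighted adjacency matrix
lengths = np.array([[0, 85, 120], [85, 0, 60], [120, 60, 0]], float)
delays = np.array([[np.inf, 12.5, 20], [12.5, np.inf, 10], [20, 10, np.inf]])
print(np.round(lengths / delays, 2))  # m/s, since mm/ms == m/s
```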
Subjects
Brain; Diffusion Tensor Imaging; Humans; Brain/physiology; Cognition/physiology; Electroencephalography/methods; Brain Mapping/methods
ABSTRACT
Concomitant neuropsychiatric symptoms (NPS) are associated with accelerated Alzheimer's disease (AD) progression. Identifying multimodal brain imaging patterns associated with NPS may help elucidate the pathophysiological correlates of AD. Based on the AD continuum, a supervised learning strategy was used to guide four-way multimodal neuroimaging fusion (amyloid, tau, gray matter volume, and brain function) using the NPS total score as the reference. Loadings of the identified multimodal patterns were compared across the AD continuum. Regression analyses were then performed to investigate whether the patterns predicted longitudinal cognitive performance. Furthermore, the fusion analysis was repeated for the four NPS subsyndromes. An NPS-associated pathological-structural-functional covarying pattern was observed in the frontal-subcortical limbic circuit and in occipital and sensorimotor regions. The loading of this multimodal pattern increased progressively with the development of AD. The pattern correlated significantly with multiple cognitive domains and could also predict longitudinal cognitive decline. Notably, repeating the fusion analysis with subsyndrome scores as references identified similar patterns, with some unique variations associated with the different syndromes. In conclusion, NPS was associated with a multimodal imaging pattern involving complex neuropathologies, which could effectively predict longitudinal cognitive decline. These results highlight a possible neural substrate of NPS in AD and may provide guidance for clinical management.
Subjects
Alzheimer Disease; Cognitive Dysfunction; Humans; Alzheimer Disease/pathology; Brain; Gray Matter/pathology; Neuroimaging
ABSTRACT
In this article, we focus on estimating the joint relationship between structural magnetic resonance imaging (sMRI) gray matter (GM) and multiple functional MRI (fMRI) intrinsic connectivity networks (ICNs). To achieve this, we propose a multilink joint independent component analysis (ml-jICA) method using the same core algorithm as jICA. To relax the jICA assumption, we propose another extension called parallel multilink jICA (pml-jICA), which allows for a more balanced weight distribution than ml-jICA/jICA. We assume a shared mixing matrix for both the sMRI and fMRI modalities, while allowing for different mixing matrices linking the sMRI data to the different ICNs. We introduce the model and then apply this approach to study the differences in resting fMRI and sMRI data between patients with Alzheimer's disease (AD) and controls. The results of the pml-jICA yield significant differences with large effect sizes that include regions in overlapping portions of the default mode network, as well as the hippocampus and thalamus. Importantly, we identify two joint components with partially overlapping regions that show opposite effects for AD versus controls but could be separated because they were linked to distinct functional and structural patterns. This highlights the unique strength of our approach, and of multimodal fusion approaches generally, in revealing potential biomarkers of brain disorders that would likely be missed by a unimodal approach. These results represent the first work linking multiple fMRI ICNs to GM components within a multimodal data fusion model and challenge the typical view that brain structure is more sensitive to AD than fMRI.
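The jICA core that ml-jICA and pml-jICA build on can be sketched by concatenating each subject's modality features and decomposing them so that both modalities share one subject-wise mixing matrix. Below is a plain-jICA illustration with scikit-learn's FastICA on synthetic data; it does not include the multilink or parallel extensions.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(6)
n_sub, v_smri, v_fmri = 60, 400, 300

# jICA idea: concatenate each subject's sMRI and fMRI feature vectors and
# decompose jointly, so both modalities share one subject-wise mixing matrix
smri = rng.standard_normal((n_sub, v_smri))
fmri = rng.standard_normal((n_sub, v_fmri))
joint = np.hstack([smri, fmri])  # (subjects, sMRI+fMRI voxels)

# Treat voxels as observations: sources become joint spatial maps and
# the mixing matrix holds the shared per-subject loadings
ica = FastICA(n_components=8, random_state=0)
maps = ica.fit_transform(joint.T)  # (voxels, 8): joint spatial components
loadings = ica.mixing_             # (subjects, 8): shared mixing matrix
smri_maps, fmri_maps = maps[:v_smri], maps[v_smri:]
print(smri_maps.shape, fmri_maps.shape, loadings.shape)
```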
Subjects
Functional Neuroimaging; Gray Matter; Alzheimer Disease/diagnostic imaging; Alzheimer Disease/physiopathology; Rest; Magnetic Resonance Imaging/methods; Humans; Gray Matter/diagnostic imaging; Male; Female; Middle Aged; Aged; Aged, 80 and over; Hippocampus/diagnostic imaging; Thalamus/diagnostic imaging; Functional Neuroimaging/methods
ABSTRACT
This work proposes a novel generative multimodal approach that jointly analyzes multimodal data while linking the multimodal information to colors. We apply the proposed framework, which disentangles multimodal data into private and shared sets of features, to pairs of structural (sMRI), functional (sFNC and ICA), and diffusion MRI data (FA maps). With our approach, we find that heterogeneity in schizophrenia is potentially a function of modality pairs. The results show that (1) schizophrenia is highly multimodal and involves changes in specific networks; (2) non-linear relationships with schizophrenia are observed when interpolating among shared latent dimensions; and (3) for schizophrenia patients, the modularity of functional connectivity decreases in the FA-sFNC modality pair and visual-sensorimotor connectivity decreases in the sMRI-sFNC pair. Additionally, our results generally indicate decreased fractional anisotropy in the corpus callosum and decreased spatial ICA map and voxel-based morphometry strength in the superior frontal lobe, as found in the FA-sFNC, sMRI-FA, and sMRI-ICA modality pair clusters. In sum, we introduce a powerful new multimodal neuroimaging framework designed to provide a rich and intuitive understanding of the data, which we hope challenges the reader to think differently about how modalities interact.
Subjects
Schizophrenia; Humans; Schizophrenia/diagnostic imaging; Magnetic Resonance Imaging/methods; Brain/diagnostic imaging; Neuroimaging; Diffusion Magnetic Resonance Imaging
ABSTRACT
Extracerebral tumors often occur on the surface of the brain or at the skull base, making it important to identify the peritumoral sulci, gyri, and nerve fibers. Preoperative visualization with three-dimensional (3D) multimodal fusion imaging (MFI) is crucial for surgery. However, traditional 3D-MFI brain models are monochromatic and do not allow easy identification of anatomical functional areas. In this study, 33 patients with extracerebral tumors without peritumoral edema were retrospectively recruited. They underwent 3D T1-weighted MRI, diffusion tensor imaging (DTI), and CT angiography (CTA) sequence scans. 3DSlicer, FreeSurfer, and BrainSuite were used to explore 3D-color-MFI and preoperative planning. To determine the effectiveness of 3D-color-MFI as an augmented reality (AR) teaching tool for neurosurgeons and as a patient education and communication tool, questionnaires were administered to 15 neurosurgery residents and to all patients, respectively. For neurosurgical residents, 3D-color-MFI provided a better understanding of surgical anatomy and more efficient techniques for removing extracerebral tumors than traditional 3D-MFI (P < 0.001). For patients, 3D-color-MFI significantly improved their understanding of the surgical approach and its risks (P < 0.005). 3D-color-MFI is thus a promising AR tool for extracerebral tumors, better supporting the learning of surgical anatomy, the development of surgical strategies, and communication with patients.