Results 1 - 20 of 44

1.
Nanotechnology ; 32(1): 012002, 2021 Jan 01.
Article in English | MEDLINE | ID: mdl-32679577

ABSTRACT

Recent progress in artificial intelligence is largely attributed to the rapid development of machine learning, especially in algorithms and neural network models. However, it is the performance of the hardware, in particular the energy efficiency of the computing system, that sets the fundamental limit on the capability of machine learning. Data-centric computing requires a revolution in hardware systems, since traditional digital computers based on transistors and the von Neumann architecture were not purposely designed for neuromorphic computing. A hardware platform based on emerging devices and new architectures is the hope for future computing with dramatically improved throughput and energy efficiency. Building such a system, nevertheless, faces a number of challenges, including materials selection, device optimization, circuit fabrication, and system integration, to name a few. The aim of this Roadmap is to present a snapshot of emerging hardware technologies that are potentially beneficial for machine learning, providing Nanotechnology readers with a perspective on the challenges and opportunities in this burgeoning field.

2.
Proc Natl Acad Sci U S A ; 114(20): 5107-5112, 2017 05 16.
Article in English | MEDLINE | ID: mdl-28461459

ABSTRACT

Increasing performance demands and shorter use lifetimes of consumer electronics have resulted in the rapid growth of electronic waste. Currently, consumer electronics are typically made with nondecomposable, nonbiocompatible, and sometimes even toxic materials, leading to serious ecological challenges worldwide. Here, we report an example of totally disintegrable and biocompatible semiconducting polymers for thin-film transistors. The polymer consists of reversible imine bonds and building blocks that can be easily decomposed under mild acidic conditions. In addition, an ultrathin (800-nm) biodegradable cellulose substrate with high chemical and thermal stability is developed. Coupled with iron electrodes, we have successfully fabricated fully disintegrable and biocompatible polymer transistors. Furthermore, disintegrable and biocompatible pseudo-complementary metal-oxide-semiconductor (CMOS) flexible circuits are demonstrated. These flexible circuits are ultrathin (<1 µm) and ultralightweight (∼2 g/m2) with low operating voltage (4 V), yielding potential applications of these disintegrable semiconducting polymers in low-cost, biocompatible, and ultralightweight transient electronics.


Subjects
Biocompatible Materials/chemistry; Biodegradable Plastics/chemistry; Cellulose/chemistry; Semiconductors; Electrodes
3.
Opt Express ; 23(12): 15545-54, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-26193534

ABSTRACT

We propose compact DC and small-signal models for carrier-injection microring modulators that accurately describe both the DC characteristics (resonance wavelength, quality factor, and extinction ratio) and the high-frequency performance. The proposed theoretical models provide physical insight into carrier-injection microring modulators across a variety of designs. The DC and small-signal models are implemented in Verilog-A for SPICE-compatible simulations.
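
For readers skimming the listing, the DC quantities named above (resonance wavelength, quality factor, extinction ratio) follow from the standard all-pass ring-resonator relations; this is textbook background, not the compact model proposed in the paper, with r the self-coupling coefficient, a the single-pass amplitude transmission, phi the round-trip phase, n_g the group index, and L the ring circumference:

```latex
% Standard all-pass microring relations (textbook background, not the paper's Verilog-A model)
T(\phi) = \frac{a^{2} - 2ra\cos\phi + r^{2}}{1 - 2ra\cos\phi + (ra)^{2}},
\qquad
\mathrm{ER} = \left[\frac{(a+r)(1-ra)}{(a-r)(1+ra)}\right]^{2},
\qquad
Q \approx \frac{\pi n_{g} L \sqrt{ra}}{\lambda_{\mathrm{res}}\,(1-ra)} .
```

Carrier injection changes the effective index and the loss, shifting lambda_res and degrading a, which is the behavior a DC model of such a modulator has to capture.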

4.
Opt Express ; 23(20): 25653-60, 2015 Oct 05.
Article in English | MEDLINE | ID: mdl-26480081

ABSTRACT

We investigate the athermal characteristics of silicon waveguides clad with TiO2 designed for 1.3 µm wavelength operation. Using CMOS-compatible fabrication processes, we realize and experimentally demonstrate silicon photonic ring resonators with resonant wavelengths that vary by less than 6 pm/°C at 1.3 µm. The measured ring resonance wavelengths across the 20-50°C temperature range show nearly complete cancellation of the first-order thermo-optical effects and exhibit second-order thermo-optical effects expected from the combination of TiO2 and Si.
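
As background for the athermal behavior described above (a standard first-order relation, not a result derived in the paper), the temperature sensitivity of a ring resonance and the cancellation condition can be written as:

```latex
% First-order thermo-optic drift of a ring resonance and the athermal condition (standard relation)
\frac{d\lambda_{\mathrm{res}}}{dT} \approx \frac{\lambda_{\mathrm{res}}}{n_{g}}\,
\frac{\partial n_{\mathrm{eff}}}{\partial T},
\qquad
\frac{\partial n_{\mathrm{eff}}}{\partial T} \approx
\Gamma_{\mathrm{Si}}\frac{dn_{\mathrm{Si}}}{dT} + \Gamma_{\mathrm{TiO_2}}\frac{dn_{\mathrm{TiO_2}}}{dT} \approx 0 ,
```

where the confinement-factor-weighted sum vanishes because the negative thermo-optic coefficient of TiO2 offsets the positive coefficient of silicon; the residual <6 pm/°C drift then reflects the uncancelled second-order terms mentioned in the abstract.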

5.
Opt Express ; 22(1): 661-6, 2014 Jan 13.
Article in English | MEDLINE | ID: mdl-24515025

ABSTRACT

Ring resonators with TiO2 core confinement factors from 0.07 to 0.42 are fabricated and measured for thermal sensitivity, achieving -2.9 pm/K thermal drift in the best case. The materials used (TiO2, SiO2, and Si3N4) are CMOS-compatible on a Si substrate. The under-discussed role of stress in thermo-optic behavior is clearly observed when contrasting waveguides buried in SiO2 with those whose etched sidewalls are exposed to air. Multiphysics simulations are conducted to provide a theoretical explanation of this phenomenon, in contrast to the more widely reported theories in which thermo-optic behavior is dominated by the confinement factor.

6.
Sci Rep ; 14(1): 13893, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38886528

ABSTRACT

We present a new learning-based framework, S-3D-RCNN, that can recover accurate object orientation in SO(3) and simultaneously predict implicit rigid shapes from stereo RGB images. For orientation estimation, in contrast to previous studies that map local appearance to observation angles, we propose a progressive approach that extracts meaningful Intermediate Geometrical Representations (IGRs). This approach features a deep model that transforms perceived intensities from one or two views to object part coordinates to achieve direct egocentric object orientation estimation in the camera coordinate system. To further achieve a finer description inside 3D bounding boxes, we investigate the implicit shape estimation problem from stereo images. We model visible object surfaces by designing a point-based representation, augmenting IGRs to explicitly address the unseen surface hallucination problem. Extensive experiments validate the effectiveness of the proposed IGRs, and S-3D-RCNN achieves superior 3D scene understanding performance. We also design new metrics on the KITTI benchmark for evaluating implicit shape estimation.

7.
IEEE Trans Med Imaging ; 43(6): 2137-2147, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38231818

ABSTRACT

Nuclei segmentation is a fundamental prerequisite in the digital pathology workflow. The development of automated methods for nuclei segmentation enables quantitative analysis of the wide existence and large variance of nuclei morphometry in histopathology images. However, manual annotation of tens of thousands of nuclei is tedious and time-consuming, and requires a significant amount of human effort and domain-specific expertise. To alleviate this problem, in this paper, we propose a weakly-supervised nuclei segmentation method that only requires partial point labels of nuclei. Specifically, we propose a novel boundary mining framework for nuclei segmentation, named BoNuS, which simultaneously learns nuclei interior and boundary information from the point labels. To achieve this goal, we propose a novel boundary mining loss, which guides the model to learn boundary information by exploring the pairwise pixel affinity in a multiple-instance learning manner. Then, we consider a more challenging problem, i.e., partial point labels, where we propose a nuclei detection module with curriculum learning to detect the missing nuclei with prior morphological knowledge. The proposed method is validated on three public datasets: MoNuSeg, CPM, and CoNIC. Experimental results demonstrate the superior performance of our method compared to state-of-the-art weakly-supervised nuclei segmentation methods. Code: https://github.com/hust-linyi/bonus.


Subjects
Algorithms; Cell Nucleus; Humans; Image Processing, Computer-Assisted/methods; Databases, Factual; Image Interpretation, Computer-Assisted/methods
8.
Med Image Anal ; 99: 103333, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39244795

ABSTRACT

Partially-supervised multi-organ medical image segmentation aims to develop a unified semantic segmentation model by utilizing multiple partially-labeled datasets, with each dataset providing labels for a single class of organs. However, the limited availability of labeled foreground organs and the absence of supervision to distinguish unlabeled foreground organs from the background pose a significant challenge, which leads to a distribution mismatch between labeled and unlabeled pixels. Although existing pseudo-labeling methods can be employed to learn from both labeled and unlabeled pixels, they are prone to performance degradation in this task, as they rely on the assumption that labeled and unlabeled pixels have the same distribution. In this paper, to address the problem of distribution mismatch, we propose a labeled-to-unlabeled distribution alignment (LTUDA) framework that aligns feature distributions and enhances discriminative capability. Specifically, we introduce a cross-set data augmentation strategy, which performs region-level mixing between labeled and unlabeled organs to reduce distribution discrepancy and enrich the training set. Besides, we propose a prototype-based distribution alignment method that implicitly reduces intra-class variation and increases the separation between the unlabeled foreground and background. This can be achieved by encouraging consistency between the outputs of two prototype classifiers and a linear classifier. Extensive experimental results on the AbdomenCT-1K dataset and a union of four benchmark datasets (including LiTS, MSD-Spleen, KiTS, and NIH82) demonstrate that our method outperforms the state-of-the-art partially-supervised methods by a considerable margin, and even surpasses the fully-supervised methods. The source code is publicly available at LTUDA.
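
A minimal sketch of what region-level mixing between a labeled and an unlabeled scan might look like (illustrative only; the array names and the box-sampling policy are assumptions, not the authors' implementation):

```python
import numpy as np

def cross_set_region_mix(labeled_img, labeled_mask, unlabeled_img, pseudo_mask,
                         patch_frac=0.4, rng=np.random.default_rng(0)):
    """CutMix-style region swap between a labeled and an unlabeled 2D slice.

    Returns a mixed image whose pasted region carries the annotation of its
    source image, reducing the distribution gap between the two pixel sets.
    """
    h, w = labeled_img.shape
    ph, pw = int(h * patch_frac), int(w * patch_frac)
    top, left = rng.integers(0, h - ph), rng.integers(0, w - pw)

    mixed_img = unlabeled_img.copy()
    mixed_lbl = pseudo_mask.copy()
    # Paste a labeled region into the unlabeled image, and its labels into the pseudo labels.
    mixed_img[top:top + ph, left:left + pw] = labeled_img[top:top + ph, left:left + pw]
    mixed_lbl[top:top + ph, left:left + pw] = labeled_mask[top:top + ph, left:left + pw]
    return mixed_img, mixed_lbl
```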

9.
IEEE Trans Cybern ; 54(10): 5795-5805, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38728131

ABSTRACT

Radiation therapy treatment planning requires balancing the delivery of the target dose while sparing normal tissues, making it a complex process. To streamline the planning process and enhance its quality, there is a growing demand for knowledge-based planning (KBP). Ensemble learning has shown impressive power in various deep learning tasks, and it has great potential to improve the performance of KBP. However, the effectiveness of ensemble learning heavily depends on the diversity and individual accuracy of the base learners. Moreover, the complexity of model ensembles is a major concern, as it requires maintaining multiple models during inference, leading to increased computational cost and storage overhead. In this study, we propose a novel learning-based ensemble approach named LENAS, which integrates neural architecture search with knowledge distillation for 3-D radiotherapy dose prediction. Our approach starts by exhaustively searching each block from an enormous architecture space to identify multiple architectures that exhibit promising performance and significant diversity. To mitigate the complexity introduced by the model ensemble, we adopt the teacher-student paradigm, leveraging the diverse outputs from multiple learned networks as supervisory signals to guide the training of the student network. Furthermore, to preserve high-level semantic information, we design a hybrid loss to optimize the student network, enabling it to recover the knowledge embedded within the teacher networks. The proposed method has been evaluated on two public datasets: 1) OpenKBP and 2) AIMIS. Extensive experimental results demonstrate the effectiveness of our method and its superior performance to the state-of-the-art methods. Code: github.com/hust-linyi/LENAS.
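
The teacher-student idea can be illustrated with a toy loss that mixes a ground-truth term with a term matching the ensemble of teacher dose maps (a simplification under assumed names; the paper's hybrid loss additionally preserves high-level semantic information):

```python
import numpy as np

def distillation_loss(student_pred, teacher_preds, target_dose, alpha=0.5):
    """Toy hybrid loss: MSE to the ground-truth dose plus MSE to the mean teacher prediction.

    student_pred : (D, H, W) predicted dose volume
    teacher_preds: list of (D, H, W) dose maps from the diverse searched teacher networks
    target_dose  : (D, H, W) planned dose
    """
    teacher_mean = np.mean(teacher_preds, axis=0)          # ensemble supervisory signal
    gt_term = np.mean((student_pred - target_dose) ** 2)   # fidelity to ground truth
    kd_term = np.mean((student_pred - teacher_mean) ** 2)  # knowledge-distillation term
    return alpha * gt_term + (1.0 - alpha) * kd_term
```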


Subjects
Neural Networks, Computer; Radiotherapy Dosage; Radiotherapy Planning, Computer-Assisted; Humans; Radiotherapy Planning, Computer-Assisted/methods; Algorithms; Deep Learning; Machine Learning
10.
Med Image Anal ; 98: 103311, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39217674

ABSTRACT

Optical Coherence Tomography Angiography (OCTA) is a crucial tool in the clinical screening of retinal diseases, allowing for accurate 3D imaging of blood vessels through non-invasive scanning. However, the hardware-based approach for acquiring OCTA images presents challenges due to the need for specialized sensors and expensive devices. In this paper, we introduce a novel method called TransPro, which can translate readily available 3D Optical Coherence Tomography (OCT) images into 3D OCTA images without requiring any additional hardware modifications. Our TransPro method is primarily driven by two novel ideas that have been overlooked by prior work. The first idea is derived from a critical observation that the OCTA projection map is generated by averaging pixel values from its corresponding B-scans along the Z-axis. Hence, we introduce a hybrid architecture incorporating a 3D adversarial generative network and a novel Heuristic Contextual Guidance (HCG) module, which effectively maintains the consistency of the generated OCTA images between 3D volumes and projection maps. The second idea is to improve the vessel quality in the translated OCTA projection maps. To this end, we propose a novel Vessel Promoted Guidance (VPG) module to enhance the network's attention to retinal vessels. Experimental results on two datasets demonstrate that TransPro outperforms state-of-the-art approaches, with relative improvements of around 11.4% in MAE, 2.7% in PSNR, 2% in SSIM, 40% in VDE, and 9.1% in VDC compared to the baseline method. The code is available at: https://github.com/ustlsh/TransPro.
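
The first observation above, that the OCTA projection map is the Z-axis average of its B-scans, is simple enough to state directly (a sketch with an assumed volume layout, not the authors' code):

```python
import numpy as np

def octa_projection_map(volume):
    """Project a 3D OCTA volume to a 2D en-face map by averaging along the Z (depth) axis.

    volume: array of shape (Z, H, W), one B-scan slice per Z index.
    Returns an (H, W) map of the kind the HCG module keeps consistent with the 3D output.
    """
    return volume.mean(axis=0)
```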


Subjects
Imaging, Three-Dimensional; Retinal Vessels; Tomography, Optical Coherence; Tomography, Optical Coherence/methods; Humans; Retinal Vessels/diagnostic imaging; Imaging, Three-Dimensional/methods; Heuristics; Retinal Diseases/diagnostic imaging; Algorithms; Angiography/methods
11.
IEEE Rev Biomed Eng ; PP, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38265911

ABSTRACT

Since 2020, breast cancer has had the highest incidence rate among all malignancies worldwide. Breast imaging plays a significant role in early diagnosis and intervention to improve the outcomes of breast cancer patients. In the past decade, deep learning has shown remarkable progress in breast cancer imaging analysis, holding great promise in interpreting the rich information and complex context of breast imaging modalities. Considering the rapid improvement of deep learning technology and the increasing severity of breast cancer, it is critical to summarize past progress and identify the future challenges to be addressed. This paper provides an extensive review of deep learning-based breast cancer imaging research, covering studies on mammograms, ultrasound, magnetic resonance imaging, and digital pathology images over the past decade. The major deep learning methods and applications in imaging-based screening, diagnosis, treatment response prediction, and prognosis are elaborated and discussed. Drawing on the findings of this survey, we present a comprehensive discussion of the challenges and potential avenues for future research in deep learning-based breast cancer imaging.

12.
Sci Adv ; 10(33): eado1058, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39141720

ABSTRACT

The brain is dynamic, associative, and efficient. It reconfigures by associating inputs with past experiences, with fused memory and processing. In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing. We propose a hardware-software co-design: a semantic memory-based dynamic neural network using memristors. The network associates incoming data with past experience stored as semantic vectors. The network and the semantic memory are physically implemented on noise-robust ternary memristor-based computing-in-memory (CIM) and content-addressable memory (CAM) circuits, respectively. We validate our co-design, using a 40-nm memristor macro, on ResNet and PointNet++ for classifying images and three-dimensional points from the MNIST and ModelNet datasets, achieving not only accuracy on par with software but also 48.1% and 15.9% reductions in computational budget, respectively. Moreover, it delivers 77.6% and 93.3% reductions in energy consumption.
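
To make the "associate incoming data with stored semantic vectors" step concrete, here is a software analogue of a content-addressable-memory lookup (a sketch with assumed names; the actual chip performs this in analog CAM/CIM circuits with ternary weights):

```python
import numpy as np

def semantic_memory_lookup(query, memory_keys, memory_values):
    """Nearest-neighbor association: return the stored experience whose semantic key
    is most similar (cosine similarity) to the query feature vector.

    query        : (d,) feature vector of the incoming sample
    memory_keys  : (n, d) stored semantic vectors
    memory_values: length-n list of associated past experiences (e.g., class ids or sub-networks)
    """
    q = query / (np.linalg.norm(query) + 1e-12)
    k = memory_keys / (np.linalg.norm(memory_keys, axis=1, keepdims=True) + 1e-12)
    best = int(np.argmax(k @ q))          # index of the most similar stored semantic vector
    return memory_values[best], best
```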

13.
Article in English | MEDLINE | ID: mdl-37506015

ABSTRACT

This article presents a simple yet effective two-stage framework for semi-supervised medical image segmentation. Unlike prior state-of-the-art semi-supervised segmentation methods that predominantly rely on pseudo supervision applied directly to predictions, such as consistency regularization and pseudo labeling, our key insight is to explore feature representation learning with labeled and unlabeled (i.e., pseudo-labeled) images to regularize a more compact and better-separated feature space, which paves the way for low-density decision boundary learning and therefore enhances segmentation performance. A stage-adaptive contrastive learning method is proposed, containing a boundary-aware contrastive loss that takes advantage of the labeled images in the first stage, as well as a prototype-aware contrastive loss to optimize both labeled and pseudo-labeled images in the second stage. To obtain more accurate prototype estimation, which plays a critical role in prototype-aware contrastive learning, we present an aleatoric uncertainty-aware method to generate higher-quality pseudo labels. The aleatoric-uncertainty-adaptive (AUA) method adaptively regularizes prediction consistency by taking advantage of image ambiguity, which, given its significance, is underexplored by existing works. Our method achieves the best results on three public medical image segmentation benchmarks.

14.
Article in English | MEDLINE | ID: mdl-38090872

ABSTRACT

This article addresses the problem of few-shot skin disease classification by introducing a novel approach called the subcluster-aware network (SCAN) that enhances accuracy in diagnosing rare skin diseases. The key insight motivating the design of SCAN is the observation that skin disease images within a class often exhibit multiple subclusters, characterized by distinct variations in appearance. To improve the performance of few-shot learning (FSL), we focus on learning a high-quality feature encoder that captures the unique subclustered representations within each disease class, enabling better characterization of feature distributions. Specifically, SCAN follows a dual-branch framework, where the first branch learns classwise features to distinguish different skin diseases, and the second branch aims to learn features, which can effectively partition each class into several groups so as to preserve the subclustered structure within each class. To achieve the objective of the second branch, we present a cluster loss to learn image similarities via unsupervised clustering. To ensure that the samples in each subcluster are from the same class, we further design a purity loss to refine the unsupervised clustering results. We evaluate the proposed approach on two public datasets for few-shot skin disease classification. The experimental results validate that our framework outperforms the state-of-the-art methods by around 2%-5% in terms of sensitivity, specificity, accuracy, and F1-score on the SD-198 and Derm7pt datasets.

15.
IEEE J Biomed Health Inform ; 27(7): 3501-3512, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37053058

ABSTRACT

OBJECTIVE: Transformers, born to remedy the inadequate receptive fields of CNNs, have drawn explosive attention recently. However, the daunting computational complexity of global representation learning, together with rigid window partitioning, hinders their deployment in medical image segmentation. This work aims to address the above two issues in transformers for better medical image segmentation. METHODS: We propose a boundary-aware lightweight transformer (BATFormer) that can build cross-scale global interaction with lower computational complexity and generate windows flexibly under the guidance of entropy. Specifically, to fully explore the benefits of transformers in long-range dependency establishment, a cross-scale global transformer (CGT) module is introduced to jointly utilize multiple small-scale feature maps for richer global features with lower computational complexity. Given the importance of shape modeling in medical image segmentation, a boundary-aware local transformer (BLT) module is constructed. Different from rigid window partitioning in vanilla transformers which would produce boundary distortion, BLT adopts an adaptive window partitioning scheme under the guidance of entropy for both computational complexity reduction and shape preservation. RESULTS: BATFormer achieves the best performance in Dice of 92.84 %, 91.97 %, 90.26 %, and 96.30 % for the average, right ventricle, myocardium, and left ventricle respectively on the ACDC dataset and the best performance in Dice, IoU, and ACC of 90.76 %, 84.64 %, and 96.76 % respectively on the ISIC 2018 dataset. More importantly, BATFormer requires the least amount of model parameters and the lowest computational complexity compared to the state-of-the-art approaches. CONCLUSION AND SIGNIFICANCE: Our results demonstrate the necessity of developing customized transformers for efficient and better medical image segmentation. We believe the design of BATFormer is inspiring and extendable to other applications/frameworks.
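
As a rough illustration of "window partitioning under the guidance of entropy" (the names and split rule are assumptions, not the BLT module itself), one can score candidate windows by the Shannon entropy of their predicted foreground probabilities and refine only the high-entropy, boundary-rich windows:

```python
import numpy as np

def window_entropy(prob_window, eps=1e-12):
    """Mean per-pixel binary Shannon entropy of a window of foreground probabilities."""
    p = np.clip(prob_window, eps, 1.0 - eps)
    return float(np.mean(-p * np.log(p) - (1 - p) * np.log(1 - p)))

def flag_high_entropy_windows(prob_map, win=32, thresh=0.5):
    """Return window coordinates, flagging those whose entropy exceeds a threshold
    as candidates for finer, boundary-aware partitioning."""
    h, w = prob_map.shape
    flagged = []
    for top in range(0, h - win + 1, win):
        for left in range(0, w - win + 1, win):
            e = window_entropy(prob_map[top:top + win, left:left + win])
            flagged.append(((top, left), e > thresh))
    return flagged
```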


Subjects
Electric Power Supplies; Heart Ventricles; Humans; Entropy; Image Processing, Computer-Assisted
16.
IEEE Trans Med Imaging ; 42(8): 2325-2337, 2023 08.
Article in English | MEDLINE | ID: mdl-37027664

ABSTRACT

Vision transformers have recently set off a new wave in the field of medical image analysis due to their remarkable performance on various computer vision tasks. However, recent hybrid-/transformer-based approaches mainly focus on the benefits of transformers in capturing long-range dependency while ignoring the issues of their daunting computational complexity, high training costs, and redundant dependency. In this paper, we propose to employ adaptive pruning to transformers for medical image segmentation and propose a lightweight and effective hybrid network, APFormer. To the best of our knowledge, this is the first work on transformer pruning for medical image analysis tasks. The key features of APFormer are self-regularized self-attention (SSA) to improve the convergence of dependency establishment, Gaussian-prior relative position embedding (GRPE) to foster the learning of position information, and adaptive pruning to eliminate redundant computations and perception information. Specifically, SSA and GRPE use the well-converged dependency distribution and the Gaussian heatmap distribution, respectively, as prior knowledge for self-attention and position embedding to ease the training of transformers and lay a solid foundation for the following pruning operation. Then, adaptive transformer pruning, both query-wise and dependency-wise, is performed by adjusting the gate control parameters for both complexity reduction and performance improvement. Extensive experiments on two widely used datasets demonstrate the prominent segmentation performance of APFormer against state-of-the-art methods with far fewer parameters and lower GFLOPs. More importantly, we prove, through ablation studies, that adaptive pruning can work as a plug-and-play module for performance improvement on other hybrid-/transformer-based methods. Code is available at https://github.com/xianlin7/APFormer.


Subjects
Diagnostic Imaging; Normal Distribution
17.
IEEE Trans Med Imaging ; 42(5): 1446-1461, 2023 05.
Article in English | MEDLINE | ID: mdl-37015560

ABSTRACT

Left-ventricular ejection fraction (LVEF) is an important indicator of heart failure. Existing methods for LVEF estimation from video require large amounts of annotated data to achieve high performance, e.g., using 10,030 labeled echocardiogram videos to achieve a mean absolute error (MAE) of 4.10. However, labeling these videos is time-consuming and limits potential downstream applications to other heart diseases. This paper presents the first semi-supervised approach for LVEF prediction. Unlike general video prediction tasks, LVEF prediction is specifically related to changes in the left ventricle (LV) in echocardiogram videos. By incorporating knowledge learned from predicting LV segmentations into LVEF regression, we can provide additional context to the model for better predictions. To this end, we propose a novel Cyclical Self-Supervision (CSS) method for learning video-based LV segmentation, which is motivated by the observation that the heartbeat is a cyclical process with temporal repetition. Prediction masks from our segmentation model can then be used as additional input for LVEF regression to provide spatial context for the LV region. We also introduce teacher-student distillation to distill the information from LV segmentation masks into an end-to-end LVEF regression model that only requires video inputs. Results show our method outperforms alternative semi-supervised methods and can achieve an MAE of 4.17, which is competitive with state-of-the-art supervised performance, using half the number of labels. Validation on an external dataset also shows improved generalization ability from using our method.
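
The quantity being regressed can be written down directly from the segmentation masks mentioned above (the standard definition, not the paper's model): with end-diastolic and end-systolic LV volumes V_ED and V_ES, LVEF = (V_ED - V_ES) / V_ED. A toy version using frame areas as a volume proxy (a simplification of the clinical volume estimate):

```python
import numpy as np

def lvef_from_masks(lv_masks):
    """Estimate LVEF from per-frame LV segmentation masks of one echocardiogram clip.

    lv_masks: (T, H, W) binary masks; frame areas stand in for LV volume here,
    which is a deliberate simplification for illustration.
    """
    areas = lv_masks.reshape(lv_masks.shape[0], -1).sum(axis=1).astype(float)
    v_ed, v_es = areas.max(), areas.min()   # end-diastole = largest LV, end-systole = smallest
    return (v_ed - v_es) / v_ed
```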


Subjects
Heart Diseases; Ventricular Function, Left; Humans; Stroke Volume; Echocardiography/methods; Heart Ventricles/diagnostic imaging
18.
IEEE Trans Med Imaging ; 42(11): 3244-3255, 2023 11.
Article in English | MEDLINE | ID: mdl-37220039

ABSTRACT

This study investigates barely-supervised medical image segmentation, where only a few labeled cases (i.e., a single-digit number) are available. We observe that the key limitation of the existing state-of-the-art semi-supervised solution, cross pseudo supervision, is the unsatisfactory precision of foreground classes, leading to degenerated results under barely-supervised learning. In this paper, we propose a novel Compete-to-Win method (ComWin) to enhance pseudo-label quality. In contrast to directly using one model's predictions as pseudo labels, our key idea is that high-quality pseudo labels should be generated by comparing multiple confidence maps produced by different networks and selecting the most confident one (a compete-to-win strategy). To further refine pseudo labels in near-boundary areas, an enhanced version of ComWin, namely ComWin+, is proposed by integrating a boundary-aware enhancement module. Experiments show that our method achieves the best performance on three public medical image datasets for cardiac structure segmentation, pancreas segmentation, and colon tumor segmentation, respectively. The source code is now available at https://github.com/Huiimin5/comwin.
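
The compete-to-win idea can be sketched in a few lines (illustrative, with assumed tensor shapes): each pixel's pseudo label is taken from whichever network is most confident there, rather than from a single fixed network.

```python
import numpy as np

def compete_to_win_pseudo_labels(prob_maps):
    """Fuse pseudo labels from several networks by per-pixel confidence competition.

    prob_maps: (M, C, H, W) softmax outputs of M networks over C classes.
    Returns (H, W) pseudo labels and (H, W) winning confidences.
    """
    confidences = prob_maps.max(axis=1)          # (M, H, W) best-class confidence per network
    winners = confidences.argmax(axis=0)         # (H, W) which network wins each pixel
    labels_per_net = prob_maps.argmax(axis=1)    # (M, H, W) class choice of each network
    h_idx, w_idx = np.indices(winners.shape)
    pseudo = labels_per_net[winners, h_idx, w_idx]
    conf = confidences[winners, h_idx, w_idx]
    return pseudo, conf
```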


Subjects
Colonic Neoplasms; Humans; Heart; Pancreas; Software; Image Processing, Computer-Assisted; Supervised Machine Learning
19.
Med Image Anal ; 86: 102794, 2023 05.
Article in English | MEDLINE | ID: mdl-36934507

ABSTRACT

Medical anomaly detection is a crucial yet challenging task aimed at recognizing abnormal images to assist in diagnosis. Due to the high-cost annotations of abnormal images, most methods utilize only known normal images during training and identify samples deviating from the normal profile as anomalies in the testing phase. Many readily available unlabeled images containing anomalies are thus ignored in the training phase, restricting the performance. To solve this problem, we introduce one-class semi-supervised learning (OC-SSL) to utilize known normal and unlabeled images for training, and propose Dual-distribution Discrepancy for Anomaly Detection (DDAD) based on this setting. Ensembles of reconstruction networks are designed to model the distribution of normal images and the distribution of both normal and unlabeled images, deriving the normative distribution module (NDM) and unknown distribution module (UDM). Subsequently, the intra-discrepancy of NDM and inter-discrepancy between the two modules are designed as anomaly scores. Furthermore, we propose a new perspective on self-supervised learning, which is designed to refine the anomaly scores rather than directly detect anomalies. Five medical datasets, including chest X-rays, brain MRIs and retinal fundus images, are organized as benchmarks for evaluation. Experiments on these benchmarks comprehensively compare a wide range of anomaly detection methods and demonstrate that our method achieves significant gains and outperforms the state-of-the-art. Code and organized benchmarks are available at https://github.com/caiyu6666/DDAD-ASR.
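
A minimal sketch of the two discrepancy scores (function names and callables are assumptions; DDAD itself uses ensembles of trained reconstruction networks rather than the toy callables here):

```python
import numpy as np

def ddad_anomaly_scores(image, ndm_models, udm_models):
    """Per-pixel anomaly scores from two reconstruction ensembles.

    ndm_models / udm_models: lists of callables image -> reconstruction, trained on
    normal-only data (NDM) and on normal + unlabeled data (UDM), respectively.
    """
    ndm_recons = np.stack([m(image) for m in ndm_models])   # (K, H, W)
    udm_recons = np.stack([m(image) for m in udm_models])   # (K, H, W)
    intra = ndm_recons.std(axis=0)                                      # disagreement inside NDM
    inter = np.abs(ndm_recons.mean(axis=0) - udm_recons.mean(axis=0))   # NDM vs. UDM shift
    return intra, inter
```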


Subjects
Benchmarking; Neuroimaging; Humans; Fundus Oculi; Supervised Machine Learning
20.
Article in English | MEDLINE | ID: mdl-37862279

ABSTRACT

Brain tumor segmentation is a fundamental task and existing approaches usually rely on multi-modality magnetic resonance imaging (MRI) images for accurate segmentation. However, the common problem of missing/incomplete modalities in clinical practice would severely degrade their segmentation performance, and existing fusion strategies for incomplete multi-modality brain tumor segmentation are far from ideal. In this work, we propose a novel framework named M2FTrans to explore and fuse cross-modality features through modality-masked fusion transformers under various incomplete multi-modality settings. Considering vanilla self-attention is sensitive to missing tokens/inputs, both learnable fusion tokens and masked self-attention are introduced to stably build long-range dependency across modalities while being more flexible to learn from incomplete modalities. In addition, to avoid being biased toward certain dominant modalities, modality-specific features are further re-weighted through spatial weight attention and channel-wise fusion transformers for feature redundancy reduction and modality re-balancing. In this way, the fusion strategy in M2FTrans is more robust to missing modalities. Experimental results on the widely-used BraTS2018, BraTS2020, and BraTS2021 datasets demonstrate the effectiveness of M2FTrans, outperforming the state-of-the-art approaches with large margins under various incomplete modalities for brain tumor segmentation. Code is available at https://github.com/Jun-Jie-Shi/M2FTrans.
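
The role of masked self-attention under missing modalities can be illustrated with a small NumPy sketch (an assumption-laden simplification of the fusion transformer): tokens of absent modalities receive -inf attention logits, so they contribute nothing after the softmax.

```python
import numpy as np

def masked_self_attention(tokens, present):
    """Single-head self-attention that ignores tokens of missing modalities.

    tokens : (N, d) one token per modality (plus any learnable fusion tokens)
    present: (N,) boolean mask, False where the modality is missing
    """
    present = np.asarray(present, dtype=bool)
    assert present.any(), "at least one modality/token must be present"
    d = tokens.shape[1]
    logits = tokens @ tokens.T / np.sqrt(d)          # (N, N) attention logits
    logits[:, ~present] = -np.inf                    # missing keys cannot be attended to
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ tokens                          # fused tokens built only from present inputs
```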
