Multi-modal disease segmentation with continual learning and adaptive decision fusion
Multi-modal disease segmentation is essential for patient diagnosis and treatment. Although advanced algorithms have been proposed, two challenging issues remain unsolved, i.e., insufficient knowledge sharing and limited modeling of inter-modal relations. To this end, we develop a novel framework for multi-modal disease segmentation based on improved continual learning and adaptive decision fusion. Specifically, continual learning with k-means sampling is developed to promote knowledge sharing across multi-modal medical images. In addition, we propose an adaptive decision fusion technique that uses the Naive Bayes algorithm to better exploit the relationships between different modalities. To evaluate the proposed model, we choose two typical tasks, i.e., myocardial pathology segmentation and brain tumor segmentation. Four benchmark datasets, i.e., the myocardial pathology segmentation challenge 2020 (MyoPS 2020), the brain tumor segmentation challenge 2018 (BraTS 2018), BraTS 2019, and BraTS 2020, are used to train and test the framework. Both qualitative and quantitative results demonstrate that the proposed model is effective and compares favorably with peer state-of-the-art (SOTA) methods.
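As a rough illustration of the k-means sampling idea mentioned above, the sketch below selects cluster-representative exemplars for a continual-learning replay buffer. The function name, feature dimensions, and buffer size are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch: k-means exemplar selection for a continual-learning replay buffer.
import numpy as np
from sklearn.cluster import KMeans

def select_exemplars(features: np.ndarray, n_exemplars: int) -> np.ndarray:
    """Pick one representative sample per k-means cluster.

    features: (N, D) array of per-image feature vectors from the current task/modality.
    Returns indices of the selected exemplars.
    """
    kmeans = KMeans(n_clusters=n_exemplars, n_init=10, random_state=0).fit(features)
    exemplar_idx = []
    for c in range(n_exemplars):
        # Keep the sample closest to each cluster centre as the cluster's representative.
        members = np.where(kmeans.labels_ == c)[0]
        dists = np.linalg.norm(features[members] - kmeans.cluster_centers_[c], axis=1)
        exemplar_idx.append(members[np.argmin(dists)])
    return np.array(exemplar_idx)

# Example: keep 16 exemplars from 500 encoded images before moving to the next task.
feats = np.random.randn(500, 128).astype(np.float32)
buffer_indices = select_exemplars(feats, n_exemplars=16)
```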
Multi-modality medical image segmentation via adversarial learning with CV energy functional
Medical image processing methods based on deep learning have gradually become mainstream. Automatic segmentation of brain tumors from multi-modality magnetic resonance images (MRI) with deep learning is key to the diagnosis of gliomas. The proposed hybrid framework consists of a Segmentor and a Critic. A new Transformer-CV-Unet (TCUnet) is introduced to extract richer semantic features, and it is employed as the generator of a GAN to perform the segmentation task with improved robustness and efficiency. With the generator segmenting the target images, the Critic is built to tightly merge the latent representation with hierarchical characteristics from each modality. Moreover, a hybrid adversarial loss with a multi-phase CV energy functional is introduced. The resulting hybrid network, AdvTCUnet, combines the advantages of both approaches. Extensive experiments on BraTS 2019–2021 show that the proposed model outperforms existing state-of-the-art techniques for brain tumor MRI segmentation (e.g., the Dice Similarity Coefficients of ET, WT and TC on BraTS 2021 reach 0.8642, 0.9303 and 0.9060, respectively).
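The toy PyTorch sketch below shows one adversarial training step with a Segmentor acting as generator and a Critic scoring (image, mask) pairs, which is the general scheme the abstract describes. The stand-in networks, tensor shapes, and loss weights are assumptions and do not reproduce TCUnet/AdvTCUnet.

```python
# Hedged sketch: one adversarial step with a segmentor (generator) and a critic.
# Small CNNs stand in for the real TCUnet / AdvTCUnet architectures.
import torch
import torch.nn as nn

segmentor = nn.Sequential(nn.Conv3d(4, 8, 3, padding=1), nn.ReLU(),
                          nn.Conv3d(8, 3, 1))               # 4 MRI modalities -> 3 tumour classes
critic = nn.Sequential(nn.Conv3d(4 + 3, 8, 3, padding=1), nn.ReLU(),
                       nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(8, 1))

opt_s = torch.optim.Adam(segmentor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

images = torch.randn(2, 4, 32, 32, 32)                      # toy multi-modal MRI batch
labels = torch.randint(0, 2, (2, 3, 32, 32, 32)).float()

# Critic step: real (image, ground truth) pairs vs. fake (image, prediction) pairs.
with torch.no_grad():
    pred = torch.sigmoid(segmentor(images))
loss_c = bce(critic(torch.cat([images, labels], 1)), torch.ones(2, 1)) + \
         bce(critic(torch.cat([images, pred], 1)), torch.zeros(2, 1))
opt_c.zero_grad(); loss_c.backward(); opt_c.step()

# Segmentor step: segmentation loss plus a weighted adversarial term.
logits = segmentor(images)
pred = torch.sigmoid(logits)
loss_s = bce(logits, labels) + 0.1 * bce(critic(torch.cat([images, pred], 1)), torch.ones(2, 1))
opt_s.zero_grad(); loss_s.backward(); opt_s.step()
```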
Entropy-aware dynamic path selection network for multi-modality medical image fusion
Shanghai University
Multiscale Frequency-Guided Image Analyses for Mixed-Modality Medical Image Segmentation
Guangdong University of Technology
A Novel 3D Unsupervised Domain Adaptation Framework for Cross-Modality Medical Image Segmentation
We consider the problem of volumetric (3D) unsupervised domain adaptation (UDA) in cross-modality medical image segmentation, aiming to perform segmentation on an unannotated target domain (e.g., MRI) with the help of a labeled source domain (e.g., CT). Previous UDA methods in medical image analysis usually suffer from two challenges: 1) they process and analyze data only at the 2D level, thus missing semantic information at the depth level; 2) a one-to-one mapping is adopted during the style-transfer process, leading to insufficient alignment in the target domain. Different from existing methods, we conduct a first-of-its-kind investigation of multi-style image translation for complete image alignment to alleviate the domain-shift problem, and also introduce 3D segmentation into domain adaptation tasks to maintain semantic consistency at the depth level. In particular, we develop an unsupervised domain adaptation framework incorporating a novel quartet self-attention module to efficiently enhance relationships between widely separated spatial features in a higher dimension, leading to a substantial improvement in segmentation accuracy on the unlabeled target domain. On two challenging cross-modality tasks, namely brain structure and multi-organ abdominal segmentation, our model outperforms current state-of-the-art methods by a significant margin, demonstrating its potential as a benchmark resource for the biomedical and health informatics research community.
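The quartet self-attention module is not specified in the abstract. As a hedged stand-in, the sketch below shows a plain 3D self-attention block over volumetric features, which only illustrates how long-range spatial relationships can be modelled in a 3D segmentation backbone.

```python
# Illustrative sketch only: generic 3D self-attention over volumetric features.
# This is NOT the paper's quartet self-attention module.
import torch
import torch.nn as nn

class VolumeSelfAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Conv3d(channels, channels // 2, 1)
        self.k = nn.Conv3d(channels, channels // 2, 1)
        self.v = nn.Conv3d(channels, channels, 1)
        self.scale = (channels // 2) ** -0.5

    def forward(self, x):                              # x: (B, C, D, H, W)
        b, c, d, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)       # (B, DHW, C/2)
        k = self.k(x).flatten(2)                       # (B, C/2, DHW)
        v = self.v(x).flatten(2).transpose(1, 2)       # (B, DHW, C)
        attn = torch.softmax(q @ k * self.scale, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, c, d, h, w)
        return x + out                                 # residual connection

feat = torch.randn(1, 16, 8, 8, 8)
print(VolumeSelfAttention(16)(feat).shape)             # torch.Size([1, 16, 8, 8, 8])
```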
MATR: Multimodal Medical Image Fusion via Multiscale Adaptive Transformer
Owing to the limitations of imaging sensors, it is challenging to obtain a medical image that simultaneously contains functional metabolic information and structural tissue details. Multimodal medical image fusion, an effective way to merge the complementary information in different modalities, has become a significant technique to facilitate clinical diagnosis and surgical navigation. With powerful feature representation ability, deep learning (DL)-based methods have improved such fusion results but still have not achieved satisfactory performance. Specifically, existing DL-based methods generally depend on convolutional operations, which can well extract local patterns but have limited capability in preserving global context information. To compensate for this defect and achieve accurate fusion, we propose a novel unsupervised method to fuse multimodal medical images via a multiscale adaptive Transformer termed MATR. In the proposed method, instead of directly employing vanilla convolution, we introduce an adaptive convolution for adaptively modulating the convolutional kernel based on the global complementary context. To further model long-range dependencies, an adaptive Transformer is employed to enhance the global semantic extraction capability. Our network architecture is designed in a multiscale fashion so that useful multimodal information can be adequately acquired from the perspective of different scales. Moreover, an objective function composed of a structural loss and a region mutual information loss is devised to construct constraints for information preservation at both the structural level and the feature level. Extensive experiments on a mainstream database demonstrate that the proposed method outperforms other representative and state-of-the-art methods in terms of both visual quality and quantitative evaluation. We also extend the proposed method to address other biomedical image fusion issues, and the pleasing fusion results illustrate that MATR has good generalization capability. The code of the proposed method is available at https://github.com/tthinking/MATR.
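As one plausible (assumed) reading of the adaptive convolution idea, the sketch below modulates a convolution kernel per sample using a globally pooled context vector; it is not MATR's exact operator, and all layer sizes are placeholders.

```python
# Minimal sketch (an assumption, not MATR's design): a convolution whose kernel is
# modulated per-sample by a global context vector from global average pooling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextModulatedConv2d(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.02)
        self.gate = nn.Sequential(nn.Linear(in_ch, out_ch), nn.Sigmoid())
        self.k = k

    def forward(self, x):                               # x: (B, C_in, H, W)
        b = x.size(0)
        ctx = x.mean(dim=(2, 3))                        # global context, (B, C_in)
        scale = self.gate(ctx).view(b, -1, 1, 1, 1)     # per-output-channel modulation
        w = self.weight.unsqueeze(0) * scale            # (B, C_out, C_in, k, k)
        # Grouped-convolution trick: fold the batch into groups for per-sample kernels.
        out = F.conv2d(x.reshape(1, -1, *x.shape[2:]),
                       w.reshape(-1, w.size(2), self.k, self.k),
                       padding=self.k // 2, groups=b)
        return out.reshape(b, -1, *x.shape[2:])

# e.g. two stacked modality slices (MRI + PET) fused into 16 feature maps.
fused = ContextModulatedConv2d(2, 16)(torch.randn(4, 2, 64, 64))
print(fused.shape)                                      # torch.Size([4, 16, 64, 64])
```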
Hybrid cross-modality fusion network for medical image segmentation with contrastive learning
Medical image segmentation has been widely adopted in artificial intelligence-based clinical applications. Integrating medical texts into image segmentation models has significantly improved segmentation performance, so it is crucial to design an effective fusion scheme for the paired image and text features. Existing multi-modal medical image segmentation methods fuse the paired image and text features through a non-local attention mechanism, which lacks local interaction. Besides, they lack a mechanism to enhance the relevance of paired features and keep unpaired features discriminative during training, which limits segmentation performance. To solve these problems, we propose a hybrid cross-modality fusion network (HCFNet) based on contrastive learning for medical image segmentation. The key designs of our method are a multi-stage cross-modality contrastive loss and a hybrid cross-modality feature decoder. The multi-stage cross-modality contrastive loss is used to enhance the discriminability of paired features and separate unpaired features. Furthermore, the hybrid cross-modality feature decoder conducts local and non-local cross-modality feature interaction through a local cross-modality fusion module and a non-local cross-modality fusion module, respectively. Experimental results show that our method achieves state-of-the-art results on two public medical image segmentation datasets.
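A minimal, assumed sketch of a cross-modality contrastive objective in the spirit described above (paired image/text features pulled together, unpaired ones pushed apart) is given below; HCFNet's multi-stage formulation is not reproduced, and the embedding size is a placeholder.

```python
# Hedged sketch: a CLIP-style InfoNCE loss between paired image and text features.
import torch
import torch.nn.functional as F

def cross_modality_contrastive(img_feat, txt_feat, temperature: float = 0.07):
    """img_feat, txt_feat: (B, D) embeddings of paired images and text reports."""
    img = F.normalize(img_feat, dim=-1)
    txt = F.normalize(txt_feat, dim=-1)
    logits = img @ txt.t() / temperature            # (B, B) similarity matrix
    targets = torch.arange(img.size(0), device=img.device)
    # Symmetric loss: image-to-text and text-to-image directions.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

loss = cross_modality_contrastive(torch.randn(8, 256), torch.randn(8, 256))
```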
Cross-Modality Interaction Network for Medical Image Fusion
Multi-modal medical image fusion maximizes the complementary information from diverse modality images by integrating the source images. The fused medical image can offer enhanced richness and improved accuracy compared with the source images. Unfortunately, existing deep learning-based medical image fusion methods generally rely on convolutional operations, which may not effectively capture global information such as spatial relationships or shape features within and across image modalities. To address this problem, we propose a unified AI-Generated Content (AIGC)-based medical image fusion method, termed Cross-Modal Interactive Network (CMINet). CMINet integrates a recursive transformer with an interactive Convolutional Neural Network (CNN). Specifically, the recursive transformer is designed to capture extended spatial and temporal dependencies within modalities, while the interactive CNN extracts and merges local features across modalities. Benefiting from cross-modality interaction learning, the proposed method can generate fused images with rich structural and functional information. Additionally, the recursive architecture reduces the parameter count, which is beneficial for deployment on resource-constrained devices. Comprehensive experiments on multi-modal medical images (MRI and CT, MRI and PET, and MRI and SPECT) demonstrate that the proposed method outperforms state-of-the-art fusion methods both subjectively and objectively.
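The sketch below illustrates, under assumed channel sizes, how an interactive CNN block can exchange local features between two modality branches at each stage. It is an illustration of the general idea, not CMINet's architecture.

```python
# Illustrative sketch (assumed design, not CMINet): two CNN branches that exchange
# features at each stage so the modalities interact locally during fusion.
import torch
import torch.nn as nn

class InteractiveBlock(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.conv_a = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv_b = nn.Conv2d(ch, ch, 3, padding=1)
        self.mix = nn.Conv2d(2 * ch, ch, 1)     # merges the two branches' features

    def forward(self, a, b):                     # a: e.g. MRI features, b: CT/PET/SPECT features
        a2 = torch.relu(self.conv_a(a))
        b2 = torch.relu(self.conv_b(b))
        shared = self.mix(torch.cat([a2, b2], dim=1))
        return a2 + shared, b2 + shared          # each branch receives the other's information

blk = InteractiveBlock(16)
a, b = blk(torch.randn(1, 16, 64, 64), torch.randn(1, 16, 64, 64))
```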
A cascaded framework with cross-modality transfer learning for whole heart segmentation
Automatic and accurate segmentation of the whole heart structure from 3D cardiac images plays an important role in helping physicians diagnose and treat cardiovascular disease. However, the time-consuming and laborious manual labeling of heart images limits how efficiently existing CT or MRI data can be used to train deep networks, which decreases the accuracy of whole heart segmentation. At the same time, multi-modality data contains multi-level information about cardiac images due to the different imaging mechanisms, which can be exploited to improve segmentation accuracy. Therefore, this paper proposes a cascaded framework with cross-modality transfer learning for whole heart segmentation (CM-TranCaF), which consists of three key modules: a modality transfer network (MTN), a U-shaped multi-attention network (MAUNet) and a spatial configuration network (SCN). In the MTN, MRI images are translated from the MRI domain to the CT domain to increase the data volume, adopting the idea of adversarial training. MAUNet is designed based on UNet, with attention gates (AGs) integrated into the skip connections to reduce the weight of background pixels. Moreover, to address boundary blur, a position attention block (PAB) is integrated into the bottom layer to aggregate similar features. Finally, the SCN fine-tunes the segmentation results by exploiting the anatomical relations between different cardiac substructures. Evaluated on the MM-WHS challenge dataset, CM-TranCaF achieves a Dice score of 91.1% on the testing set. Extensive experimental results demonstrate the effectiveness of the proposed method compared to other state-of-the-art methods.
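As a hedged example of the attention-gated skip connections mentioned above (in the spirit of Attention U-Net), the sketch below down-weights background voxels in a skip path using a gating signal from the coarser decoder level; channel sizes and the 3D setting are assumptions, not MAUNet's exact configuration.

```python
# Sketch of an attention gate on a U-Net skip connection; not the exact MAUNet module.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, skip_ch: int, gate_ch: int, inter_ch: int):
        super().__init__()
        self.w_skip = nn.Conv3d(skip_ch, inter_ch, 1)
        self.w_gate = nn.Conv3d(gate_ch, inter_ch, 1)
        self.psi = nn.Sequential(nn.Conv3d(inter_ch, 1, 1), nn.Sigmoid())

    def forward(self, skip, gate):
        # The gate comes from the coarser decoder level; upsample it to the skip resolution.
        gate = nn.functional.interpolate(gate, size=skip.shape[2:], mode="trilinear",
                                         align_corners=False)
        attn = self.psi(torch.relu(self.w_skip(skip) + self.w_gate(gate)))
        return skip * attn                       # down-weight background voxels in the skip path

ag = AttentionGate(skip_ch=32, gate_ch=64, inter_ch=16)
out = ag(torch.randn(1, 32, 32, 32, 32), torch.randn(1, 64, 16, 16, 16))
```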
Diff-IF: Multi-modality image fusion via diffusion model with fusion knowledge prior
Wuhan University

Contents