Results 1 - 20 of 377
1.
Front Bioeng Biotechnol ; 12: 1392807, 2024.
Article in English | MEDLINE | ID: mdl-39104626

ABSTRACT

Radiologists face significant challenges in segmenting and characterizing brain tumors, yet this information is essential for treatment planning. Artificial intelligence (AI), especially deep learning (DL), has emerged as a useful tool in healthcare, aiding radiologists in their diagnostic processes and helping them better understand tumor biology and provide personalized care to patients with brain tumors. The segmentation of brain tumors using multi-modal magnetic resonance imaging (MRI) has received considerable attention. In this survey, we first discuss the available MRI modalities and their properties. We then review the most recent DL-based models for brain tumor segmentation using multi-modal MRI, dividing them into three groups by architecture: models built on a convolutional neural network (CNN) backbone, vision-transformer-based models, and hybrid models that combine CNNs and transformers. In addition, we perform an in-depth statistical analysis of recent publications, frequently used datasets, and evaluation metrics for segmentation tasks. Finally, we identify open research challenges and suggest promising future directions for brain tumor segmentation to improve diagnostic accuracy and treatment outcomes for patients with brain tumors. This aligns with public health goals of using health technologies for better healthcare delivery and population health management.

2.
Comput Med Imaging Graph ; 116: 102422, 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39116707

ABSTRACT

Reliability learning and interpretable decision-making are crucial for multi-modality medical image segmentation. Although many works have attempted multi-modality medical image segmentation, they rarely explore how much reliability each modality contributes to the segmentation. Moreover, existing decision-making mechanisms such as the softmax function lack interpretability for multi-modality fusion. In this study, we propose a novel approach named the contextual discounted evidential network (CDE-Net) for reliability learning and interpretable decision-making in multi-modality medical image segmentation. Specifically, CDE-Net first models semantic evidence by uncertainty measurement using the proposed evidential decision-making module. It then leverages a contextual discounted fusion layer to learn the reliability provided by each modality. Finally, a multi-level loss function is deployed to optimize evidence modeling and reliability learning. We also examine the framework's interpretability by discussing the consistency between pixel attribution maps and the learned reliability coefficients. Extensive experiments are conducted on both multi-modality brain and liver datasets. CDE-Net achieves high performance, with an average Dice score of 0.914 for brain tumor segmentation and 0.913 for liver tumor segmentation, demonstrating its potential to facilitate the interpretation of AI-based multi-modality medical image fusion.
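
For readers unfamiliar with the Dice scores quoted above, here is a minimal NumPy sketch of the standard Dice similarity coefficient used to evaluate segmentation overlap (the paper's own implementation is not shown, so this is a generic reference, not CDE-Net code):

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Dice similarity coefficient between two binary segmentation masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy example: two overlapping square "tumor" masks.
a = np.zeros((64, 64)); a[10:30, 10:30] = 1
b = np.zeros((64, 64)); b[15:35, 15:35] = 1
print(f"Dice = {dice_score(a, b):.3f}")
```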

3.
Neuroradiology ; 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39102087

ABSTRACT

BACKGROUND: Tuberculomas are prevalent in developing countries and show variable signal on MRI, so their conventional imaging phenotype overlaps with other entities, including glioma and brain metastasis. An accurate MRI diagnosis allows early institution of anti-tubercular therapy, decreases patient morbidity and mortality, and prevents unnecessary neurosurgical excision. This study assesses the potential of radiomics features from routine images, including T1W, T2W, T2W FLAIR, and T1W post-contrast images and ADC maps, to differentiate tuberculomas, high-grade gliomas, and metastases, the most common intraparenchymal mass lesions encountered in clinical practice. METHODS: This retrospective study included 185 subjects. Images were resampled, co-registered, skull-stripped, and z-score-normalized. Automated lesion segmentation was performed, followed by radiomics feature extraction, a train-test split, and feature reduction. All machine learning algorithms that natively support multiclass classification were trained and assessed on features extracted from individual modalities as well as combined modalities. Explainability of the best-performing model was assessed using the summary plot of SHAP values. RESULTS: An extra-trees classifier trained on features from ADC maps was the best classifier for discriminating tuberculoma from high-grade glioma and metastasis, with an AUC of 0.96, accuracy of 0.923, and Brier score of 0.23. CONCLUSION: Radiomics features are effective in discriminating between tuberculoma, metastasis, and high-grade glioma with notable accuracy and AUC scores. Features extracted from ADC maps surfaced as the most robust predictors of the target variable.
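
As a rough illustration of the classification stage described in METHODS, the sketch below trains scikit-learn's extra-trees classifier on synthetic stand-in features and scores it with a one-vs-rest multiclass AUC. The actual radiomics feature extraction from ADC maps and the study's patient data are omitted; all dimensions and labels here are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, accuracy_score

# Synthetic stand-in for radiomics features (the study extracted features
# from ADC maps of 185 subjects; 100 features is an arbitrary choice).
rng = np.random.default_rng(0)
X = rng.normal(size=(185, 100))
y = rng.integers(0, 3, size=185)   # 0=tuberculoma, 1=high-grade glioma, 2=metastasis

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

clf = ExtraTreesClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
print("macro AUC (one-vs-rest):", roc_auc_score(y_te, proba, multi_class="ovr"))
```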

4.
Cureus ; 16(6): e61606, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38962619

ABSTRACT

We present the case of a 56-year-old female with a significant medical history of cholelithiasis and recurrent choledocholithiasis. Following an elective cholecystectomy, an obstructing gallstone in the common bile duct led to a series of interventions, including endoscopic retrograde cholangiopancreatography and stent placement. The patient was scheduled for a robot-assisted laparoscopic common bile duct exploration. Due to severe adhesions, the procedure was converted to an open approach with a large right upper quadrant incision. An intraoperative continuous external oblique block and catheter placement were performed at the end of surgery in the OR. Peripheral nerve blocks have become an integral part of multimodal pain management strategies. This case report describes the successful implementation of an ultrasound-guided right external oblique intercostal block and catheter placement for postoperative pain control and minimization of opioid use. The case highlights the efficacy and safety of ultrasound-guided peripheral nerve blocks for postoperative pain management; successful pain control contributed to the patient's overall postoperative recovery.

5.
Disabil Rehabil Assist Technol ; : 1-8, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38967320

ABSTRACT

Multi-Modality Aphasia Treatment (M-MAT) is an effective group intervention for post-stroke aphasia. M-MAT employs interactive card games and the modalities of gesture, drawing, reading, and writing to improve spoken language. However, there are challenges to implementing group interventions such as M-MAT, particularly for those who cannot travel or who live in rural areas. To maximise access to this effective treatment, we aimed to adapt M-MAT to a telehealth format (M-MAT Tele). The Human-Centred Design Framework guided the adaptation. We identified the intended context of use (outpatient/community rehabilitation) and the stakeholders (clinicians, people with aphasia, health service funders). People with aphasia and practising speech pathologists were invited to co-design M-MAT Tele in a series of iterative workshops to ensure the end product was user-friendly and clinically feasible. Co-design allowed us to understand the hardware, software, and other constraints and preferences of end users. In particular, clinicians (n = 3) required software compatible with a range of telehealth platforms, and people with aphasia (n = 3) valued solutions with minimal technical demands and costs for participants. Co-design within the Human-Centred Design Framework led to a telehealth solution compatible with all major telehealth platforms, with minimal hardware or software requirements. Pilot testing is underway to confirm the acceptability of M-MAT Tele to clinicians and people with aphasia, with the aim of providing an effective, accessible tool for aphasia therapy in telehealth settings.

6.
Comput Med Imaging Graph ; 116: 102414, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38981250

ABSTRACT

The use of multi-modality non-contrast images (i.e., T1FS, T2FS, and DWI) for segmenting liver tumors eliminates the need for contrast agents and is crucial for clinical diagnosis. However, discovering the most useful information with which to fuse multi-modality images for accurate segmentation remains challenging due to inter-modal interference. In this paper, we propose a dual-stream multi-level fusion framework (DM-FF) that, for the first time, accurately segments liver tumors directly from non-contrast multi-modality images. DM-FF first uses an attention-based encoder-decoder to extract multi-level feature maps corresponding to a specified representation of each modality. It then creates two types of fusion modules: one fuses learned features into a representation shared across modalities to exploit commonalities and improve performance, and the other fuses the segmentation decision evidence to discover differences between modalities and prevent interference caused by modality conflict. By integrating these components, DM-FF enables multi-modality non-contrast images to cooperate with each other and achieves accurate segmentation. Evaluated on 250 patients with different tumor types imaged on two MRI scanners, DM-FF achieves a Dice score of 81.20% and improves performance (Dice by at least 11%) compared with eight state-of-the-art segmentation architectures. These results indicate that DM-FF can significantly promote the development and deployment of non-contrast liver tumor segmentation technology.
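
The abstract distinguishes a feature-level fusion module (shared representation) from a decision-level fusion module (per-modality evidence). The PyTorch sketch below is a schematic interpretation of that two-module split, not the authors' DM-FF code; the module names, learned-reliability weighting, and tensor shapes are all assumptions:

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Fuse per-modality feature maps into one shared representation (commonalities)."""
    def __init__(self, channels: int, n_modalities: int = 3):
        super().__init__()
        self.mix = nn.Conv2d(channels * n_modalities, channels, kernel_size=1)

    def forward(self, feats):                       # list of (B, C, H, W)
        return self.mix(torch.cat(feats, dim=1))    # shared representation

class DecisionFusion(nn.Module):
    """Fuse per-modality segmentation evidence with learned per-modality weights."""
    def __init__(self, n_modalities: int = 3):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(n_modalities))

    def forward(self, logits):                      # list of (B, K, H, W)
        w = torch.softmax(self.weights, dim=0)
        return sum(wi * li for wi, li in zip(w, logits))

# Toy pass: three non-contrast modalities (T1FS, T2FS, DWI), 2-class output.
feats  = [torch.randn(1, 16, 64, 64) for _ in range(3)]
logits = [torch.randn(1, 2, 64, 64) for _ in range(3)]
print(FeatureFusion(16)(feats).shape, DecisionFusion()(logits).shape)
```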

7.
Medicina (Kaunas) ; 60(7)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39064511

ABSTRACT

Mitral regurgitation (MR) is a widely prevalent valvular heart disease (VHD) with a significant impact on the healthcare system and patient prognosis. Transcatheter mitral valve interventions (TMVI) are now well-established techniques in the therapeutic armamentarium for managing patients with either primary or functional MR. Although guidelines indicate how to manage this VHD, the wide heterogeneity of patients' clinical backgrounds and valvular and cardiac anatomies makes each patient a unique case, in which appropriate device selection requires multimodal imaging evaluation and multidisciplinary discussion. Proper pre-procedural evaluation plays a pivotal role in judging the feasibility of TMVI, and cooperation between imagers and interventionalists is crucial for procedural success. This manuscript provides an exhaustive overview of the main parameters that need to be evaluated for appropriate device selection, pre-procedural planning, intra-procedural guidance, and post-operative assessment in the setting of TMVI, and offers insights into future perspectives for structural cardiovascular imaging.


Subject(s)
Cardiac Catheterization , Heart Valve Prosthesis Implantation , Mitral Valve Insufficiency , Mitral Valve , Multimodal Imaging , Humans , Mitral Valve Insufficiency/surgery , Mitral Valve Insufficiency/diagnostic imaging , Multimodal Imaging/methods , Heart Valve Prosthesis Implantation/methods , Heart Valve Prosthesis Implantation/instrumentation , Heart Valve Prosthesis Implantation/standards , Mitral Valve/surgery , Mitral Valve/diagnostic imaging , Cardiac Catheterization/methods , Cardiac Catheterization/instrumentation
8.
Article in English | MEDLINE | ID: mdl-39060655

ABSTRACT

To evaluate left atrial (LA) function and strain parameters by cardiac magnetic resonance imaging (CMR) in patients with non-ischemic cardiomyopathy (NICM) and to assess the association of these parameters with long-term clinical outcomes, we retrospectively included 92 patients with NICM and 50 subjects with no significant cardiovascular disease (control group). We calculated LA volumes using the Simpson area-length method to derive LA ejection fraction and expansion index. LA reservoir (ƐR), conduit (ƐCD), and contractile (ƐCT) strain were measured using dedicated CMR software (cvi42, Circle Cardiovascular Imaging Inc., version 5.14). An adjusted multivariate regression analysis was performed to determine the association of LA parameters with death and heart failure hospitalization (HFH). NICM patients were older, with a male preponderance: mean age 59.6 ± 15.9 years, 64% male, and 73% white, versus 52.2 ± 12.4 years, 34% male, and 64% white for controls. LA strain values were significantly lower in NICM patients than in controls. During a median follow-up of 58.9 months, 12 patients (13%) died and 33 (35.9%) had an HFH. None of the clinical or CMR factors were significantly associated with death. On multivariate analysis, after adjusting for age and significant univariate variables, ƐR was the only variable significantly associated with HFH (OR 0.98, CI 0.96-1.0). Unadjusted and adjusted Cox proportional hazards models stratified by the median ƐR (~18%) showed a significant difference in HFH over time (χ2 = 21.1; P = 0.03). In NICM patients, all LA strain components were reduced, and ƐR was significantly associated with HFH.
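
To make the survival-analysis step concrete, here is a hedged lifelines sketch fitting a Cox proportional hazards model of HFH against reservoir strain on synthetic data; the study's patient data, covariate adjustments, and exact model specification are not reproduced, and every number below is invented for illustration:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Synthetic stand-in: reservoir strain ƐR (%), follow-up time (months), HFH flag.
rng = np.random.default_rng(1)
n = 92
strain = rng.normal(18, 6, n)                      # median ƐR ~18% in the study
time = rng.exponential(60, n).clip(1, 120)
event = (rng.random(n) < 1 / (1 + np.exp(0.15 * (strain - 18)))).astype(int)

df = pd.DataFrame({"ER_strain": strain, "time": time, "HFH": event})
cph = CoxPHFitter().fit(df, duration_col="time", event_col="HFH")
cph.print_summary()   # hazard ratio per 1% increase in reservoir strain
```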

9.
Med Phys ; 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39042362

ABSTRACT

BACKGROUND: Cardiac applications in radiation therapy are rapidly expanding, including magnetic resonance guided radiation therapy (MRgRT) for real-time gating, for targeting and avoidance near the heart, and for treating ventricular tachycardia (VT). PURPOSE: This work describes the development and implementation of a novel multi-modality, magnetic resonance (MR)-compatible cardiac phantom. METHODS: The patient-informed 3D model was derived from manual contouring of a contrast-enhanced coronary computed tomography angiography scan, exported as a stereolithography model, and post-processed to simulate a female heart of average volume. The model was 3D-printed in Elastic50A to provide MR contrast against the water background. Two rigid acrylic modules containing cardiac structures were designed and assembled, retrofitted to an MR-safe programmable motor that supplies cardiac and respiratory motion in the superior-inferior direction. One module contained a cavity for an ion chamber (IC); the other was equipped with multiple interchangeable cavities for plastic scintillation detectors (PSDs). Images were acquired on a 0.35 T MR-linac to validate phantom geometry and motion and to simulate online treatment planning and delivery. Three motion profiles were prescribed: patient-derived cardiac (sine waveform, 4.3 mm peak-to-peak, 60 beats/min), respiratory (cos⁴ waveform, 30 mm peak-to-peak, 12 breaths/min), and a superposition of cardiac (sine waveform, 4 mm peak-to-peak, 70 beats/min) and respiratory (cos⁴ waveform, 24 mm peak-to-peak, 12 breaths/min). The amplitude of the motion profiles was evaluated from sagittal cine images at eight frames/s with a resolution of 2.4 mm × 2.4 mm. Gated dosimetry experiments were performed using the two module configurations to calculate dose relative to the stationary case. A CT-based VT treatment plan was delivered twice under cone-beam CT guidance, and cumulative stationary doses to multi-point PSDs were evaluated. RESULTS: No artifacts were observed in any images acquired during phantom operation. Phantom excursions measured 49.3 ± 25.8%/66.9 ± 14.0%, 97.0 ± 2.2%/96.4 ± 1.7%, and 90.4 ± 4.8%/89.3 ± 3.5% of prescription for the cardiac, respiratory, and cardio-respiratory motion profiles for the 2-chamber (PSD) and 12-substructure (IC) phantom modules, respectively. In the gated experiments, the cumulative dose measured with the IC module was within 2% of expected. Real-time dose measured by the PSDs at a 10 Hz acquisition rate demonstrated the ability to detect the dosimetric consequences of cardiac, respiratory, and cardio-respiratory motion when sampling different locations during a single delivery, and the stability of the phantom's dosimetric results over repeated cycles in the high-dose and high-gradient regions. For the VT delivery, the high-dose PSD was within 1% of expected (5-6 cGy deviation on 5.9 Gy/fraction), and high-gradient/low-dose regions had deviations <3.6% (6.3 cGy below the expected 1.73 Gy/fraction). CONCLUSIONS: A novel multi-modality modular heart phantom was designed, constructed, and used for gated radiotherapy experiments on a 0.35 T MR-linac. The phantom can mimic cardiac, respiratory, and cardio-respiratory motion while supporting dosimetric evaluation of gated procedures in IC and PSD configurations. Time-resolved PSDs with small sensitive volumes appear promising for low-amplitude/high-frequency motion and for multi-point data acquisition with advanced dosimetric capabilities. Illustrating VT planning and delivery further extends the phantom to address unmet needs of cardiac applications in radiotherapy.
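
The prescribed motion profiles are fully specified in METHODS (sine for cardiac, cos⁴ for respiratory), so they can be reproduced directly; a small NumPy sketch generating the superimposed cardio-respiratory trace (sampling rate and duration are arbitrary choices):

```python
import numpy as np

def cardiac(t, p2p=4.3, bpm=60):
    """Sine cardiac waveform (mm) with the given peak-to-peak amplitude."""
    return (p2p / 2) * np.sin(2 * np.pi * (bpm / 60) * t)

def respiratory(t, p2p=30.0, brpm=12):
    """cos^4 respiratory waveform (mm); period = 60/brpm seconds."""
    return p2p * np.cos(np.pi * (brpm / 60) * t) ** 4

t = np.linspace(0, 10, 2000)                       # 10 s sampled at 200 Hz
combined = cardiac(t, p2p=4.0, bpm=70) + respiratory(t, p2p=24.0, brpm=12)
print(f"excursion range: {combined.min():.2f} to {combined.max():.2f} mm")
```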

10.
Photoacoustics ; 38: 100630, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39040971

ABSTRACT

A comprehensive understanding of a tumor is required for accurate diagnosis and effective treatment. However, no single imaging modality currently provides sufficient information. Photoacoustic (PA) imaging is a hybrid imaging technique with high spatial resolution and detection sensitivity that can be combined with ultrasound (US) imaging to provide both optical and acoustic contrast. Elastography can noninvasively map the elasticity distribution of biological tissue, which reflects pathological conditions. In this study, we incorporated PA elastography into a commercial US/PA imaging system to develop a tri-modality imaging system, which was tested for tumor detection in four mice with different physiological conditions. The results show that this tri-modality system provides complementary information on acoustic, optical, and mechanical properties. The resulting visualization and dimension estimation of tumors can enable more comprehensive tissue characterization for diagnosis and treatment.

12.
PeerJ Comput Sci ; 10: e2077, 2024.
Article in English | MEDLINE | ID: mdl-38983227

ABSTRACT

Background: Dyslexia is a neurological disorder that affects an individual's language-processing abilities. Early care and intervention can help dyslexic individuals succeed academically and socially. Recent developments in deep learning (DL) motivate researchers to build dyslexia detection models (DDMs), and DL approaches facilitate the integration of multi-modality data; however, few multi-modality-based DDMs exist. Methods: In this study, the authors built a DL-based DDM using multi-modality data. A squeeze-and-excitation (SE)-integrated MobileNet V3 model, a self-attention (SA)-based EfficientNet B7 model, and an early-stopping, SA-based bidirectional long short-term memory (Bi-LSTM) model were developed to extract features from magnetic resonance imaging (MRI), functional MRI (fMRI), and electroencephalography (EEG) data. In addition, the authors fine-tuned a LightGBM model using the Hyperband optimization technique to detect dyslexia from the extracted features. Three datasets containing fMRI, MRI, and EEG data were used to evaluate the performance of the proposed DDM. Results: The findings support the significance of the proposed DDM in detecting dyslexia with limited computational resources. The proposed model outperformed existing DDMs, producing accuracies of 98.9%, 98.6%, and 98.8% on the fMRI, MRI, and EEG datasets, respectively. Healthcare centers and educational institutions can benefit from the proposed model for identifying dyslexia in its initial stages. The interpretability of the proposed model could be further improved by integrating vision-transformer-based feature extraction.
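
As an illustration of the final detection stage, the sketch below tunes a LightGBM classifier with Optuna's Hyperband pruner on synthetic stand-in features. The deep feature extractors (MobileNet V3, EfficientNet B7, Bi-LSTM) and the real datasets are omitted, and the search space is an assumption rather than the authors' configuration:

```python
import lightgbm as lgb
import numpy as np
import optuna
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))     # stand-in for fused MRI/fMRI/EEG deep features
y = rng.integers(0, 2, size=500)   # dyslexia vs. control (synthetic labels)

def objective(trial):
    params = {
        "num_leaves": trial.suggest_int("num_leaves", 15, 255),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
    }
    clf = lgb.LGBMClassifier(**params)
    return cross_val_score(clf, X, y, cv=3, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize",
                            pruner=optuna.pruners.HyperbandPruner())
study.optimize(objective, n_trials=20)
print(study.best_params)
```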

13.
Cureus ; 16(5): e59935, 2024 May.
Article in English | MEDLINE | ID: mdl-38854259

ABSTRACT

BACKGROUND: The routine use of multimodal analgesia results in lower pain scores with minimal side effects and opioid utilization. MATERIALS AND METHODS: A prospective, cross-sectional, observational study was conducted among orthopedicians practicing across India to assess professional opinions on using analgesics to manage orthopedic pain effectively. RESULTS: A total of 530 orthopedicians participated in this survey. Over 50% of participants responded that tramadol, with or without paracetamol, was the therapy of choice for acute pain. Nearly 50% mentioned that multimodal interventions can sometimes help to manage pain. A total of 55.6% of participants reported that non-steroidal anti-inflammatory drugs were the analgesics they used most in clinical practice, while 25.7% used tramadol most commonly. By clinical efficacy ranking, the combination of tramadol plus paracetamol (44.3%) was ranked first among analgesic combinations, followed by aceclofenac plus paracetamol (40.0%). Severity of pain (62.6%), followed by age (60.6%) and duration of therapy (52.6%), were the most common factors considered when prescribing the tramadol plus paracetamol combination. Gastrointestinal and renal effects were reported as the most common safety concerns with analgesics. CONCLUSION: The combination of tramadol and paracetamol was identified as the most preferred analgesic choice for prolonged orthopedic pain management.

14.
Precis Clin Med ; 7(2): pbae012, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38912415

ABSTRACT

Background: The prognosis of breast cancer is often unfavorable, emphasizing the need for early metastasis-risk detection and accurate treatment prediction. This study aimed to develop a novel multi-modal deep learning model using preoperative data to predict disease-free survival (DFS). Methods: We retrospectively collected pathology imaging, molecular, and clinical data from The Cancer Genome Atlas and one independent institution in China. We developed a novel Deep Learning Clinical Medicine Based Pathological Gene Multi-modal (DeepClinMed-PGM) model for DFS prediction, integrating clinicopathological data with molecular insights. Patients were divided into a training cohort (n = 741), an internal validation cohort (n = 184), and an external testing cohort (n = 95). Results: Integrating multi-modal data into the DeepClinMed-PGM model significantly improved area under the receiver operating characteristic curve (AUC) values. In the training cohort, AUC values for 1-, 3-, and 5-year DFS predictions increased to 0.979, 0.957, and 0.871, while in the external testing cohort the values reached 0.851, 0.878, and 0.938 for 1-, 2-, and 3-year DFS predictions, respectively. The model's robust discriminative capability was consistent across cohorts, including the training cohort [hazard ratio (HR) 0.027, 95% confidence interval (CI) 0.0016-0.046, P < 0.0001], the internal validation cohort (HR 0.117, 95% CI 0.041-0.334, P < 0.0001), and the external cohort (HR 0.061, 95% CI 0.017-0.218, P < 0.0001). The DeepClinMed-PGM model demonstrated C-index values of 0.925, 0.823, and 0.864 in the three cohorts, respectively. Conclusion: This study introduces an approach to breast cancer prognosis that integrates imaging, molecular, and clinical data for enhanced predictive accuracy, offering promise for personalized treatment strategies.
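
The C-index values quoted above measure how well a model's predicted risk orders patients' disease-free survival times; a minimal check using lifelines' concordance_index on toy numbers (not study data):

```python
from lifelines.utils import concordance_index

# Toy check: higher predicted risk should pair with shorter disease-free survival.
times  = [12, 30, 45, 60, 72]       # months to recurrence or censoring
events = [1, 1, 0, 1, 0]            # 1 = recurrence observed
risk   = [0.9, 0.7, 0.4, 0.5, 0.1]  # model risk scores

# concordance_index expects scores that rise with survival, so negate risk.
print(concordance_index(times, [-r for r in risk], events))
```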

15.
Echocardiography ; 41(6): e15859, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38853624

ABSTRACT

Aortic stenosis (AS) stands as the most common valvular heart disease in developed countries and is characterized by progressive narrowing of the aortic valve orifice resulting in elevated transvalvular flow resistance, left ventricular hypertrophy, and progressive increased risk of heart failure and sudden death. This narrative review explores clinical challenges and evolving perspectives in moderate AS, where discrepancies between aortic valve area and pressure gradient measurements may pose diagnostic and therapeutic quandaries. Transthoracic echocardiography is the first-line imaging modality for AS evaluation, yet cases of discordance may require the application of ancillary noninvasive diagnostic modalities. This review underscores the importance of accurate grading of AS severity, especially in low-gradient phenotypes, emphasizing the need for vigilant follow-up. Current clinical guidelines primarily recommend aortic valve replacement for severe AS, potentially overlooking latent risks in moderate disease stages. The noninvasive multimodality imaging approach-including echocardiography, cardiac magnetic resonance, computed tomography, and nuclear techniques-provides unique insights into adaptive and maladaptive cardiac remodeling in AS and offers a promising avenue to deliver precise indications and exact timing for intervention in moderate AS phenotypes and asymptomatic patients, potentially improving long-term outcomes. Nevertheless, what we may have gleaned from a large amount of observational data is still insufficient to build a robust framework for clinical decision-making in moderate AS. Future research will prioritize randomized clinical trials designed to weigh the benefits and risks of preemptive aortic valve replacement in the management of moderate AS, as directed by specific imaging and nonimaging biomarkers.


Subject(s)
Aortic Valve Stenosis , Aortic Valve , Echocardiography , Humans , Aortic Valve Stenosis/physiopathology , Aortic Valve Stenosis/surgery , Echocardiography/methods , Aortic Valve/diagnostic imaging , Aortic Valve/surgery , Aortic Valve/physiopathology , Severity of Illness Index
16.
bioRxiv ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38826413

ABSTRACT

Background: Volumetry of medial temporal lobe (MTL) subregions computed from automatic segmentation of MRI can track neurodegeneration in Alzheimer's disease. However, image quality varies in MRI, and poor-quality MR images can lead to unreliable segmentation of MTL subregions. Because different MRI contrast mechanisms and field strengths (jointly referred to as "modalities" here) offer distinct advantages in imaging different parts of the MTL, we developed a multi-modality segmentation model using both 7 tesla (7T) and 3 tesla (3T) structural MRI to obtain robust segmentation of poor-quality images. Method: MRI modalities including 3T T1-weighted, 3T T2-weighted, 7T T1-weighted, and 7T T2-weighted (7T-T2w) scans of 197 participants were collected from a longitudinal aging study at the Penn Alzheimer's Disease Research Center. 7T-T2w was used as the primary modality, and all other modalities were rigidly registered to it. A model derived from nnU-Net took these registered modalities as input and output subregion segmentations in 7T-T2w space. The multi-modality model was trained on manually segmented 7T-T2w images, most of which had high quality, from 25 selected training participants. Modality augmentation, which randomly replaced certain modalities with Gaussian noise, was applied during training to guide the model to extract information from all modalities. To compare the proposed model with a baseline single-modality model on the full dataset of mixed high/poor image quality, we evaluated the ability of derived volume/thickness measures to discriminate amyloid-positive mild cognitive impairment (A+MCI) from amyloid-negative cognitively unimpaired (A-CU) groups, as well as the stability of these measurements in longitudinal data. Results: The multi-modality model delivered good performance regardless of 7T-T2w quality, whereas the single-modality model under-segmented subregions in poor-quality images. The multi-modality model generally demonstrated stronger discrimination of A+MCI versus A-CU. Intra-class correlations and Bland-Altman plots show that the multi-modality model had higher longitudinal segmentation consistency in all subregions, while the single-modality model had low consistency for poor-quality images. Conclusion: The multi-modality MRI segmentation model provides an improved biomarker of neurodegeneration in the MTL that is robust to image quality, and it provides a framework for other studies that may benefit from multimodal imaging.
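
The modality-augmentation step is described concretely enough to sketch: during training, whole modality channels are randomly replaced with Gaussian noise. A hedged PyTorch interpretation follows; details such as the drop probability and the keep-one-modality guard are assumptions, not the authors' settings:

```python
import torch

def modality_augment(x: torch.Tensor, p: float = 0.25) -> torch.Tensor:
    """Randomly replace whole modality channels with Gaussian noise.

    x: (batch, n_modalities, D, H, W) stack of co-registered MRI modalities.
    Keeps at least one modality intact so the input is never all noise.
    """
    x = x.clone()
    b, m = x.shape[:2]
    for i in range(b):
        drop = torch.rand(m) < p
        if drop.all():                          # guard: keep one real modality
            drop[torch.randint(m, (1,))] = False
        x[i, drop] = torch.randn_like(x[i, drop])
    return x

# e.g. channels = 3T-T1w, 3T-T2w, 7T-T1w, 7T-T2w registered to 7T-T2w space
batch = torch.randn(2, 4, 32, 64, 64)
print(modality_augment(batch).shape)
```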

17.
Neural Netw ; 178: 106406, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38838393

ABSTRACT

Low-light conditions pose significant challenges for vision tasks such as salient object detection (SOD) due to insufficient photons. Light-insensitive RGB-T SOD models mitigate this problem to some extent, but their performance is limited because they focus only on spatial feature fusion while ignoring the frequency discrepancy. To this end, we propose SFMNet, an RGB-T SOD model for low-light scenes that mines spatial-frequency cues. SFMNet consists of spatial-frequency feature exploration (SFFE) modules and spatial-frequency feature interaction (SFFI) modules. Specifically, the SFFE module separates spatial-frequency features and adaptively extracts high- and low-frequency features, while the SFFI module integrates cross-modality and cross-domain information to capture effective feature representations. By deploying both modules in a top-down pathway, our method generates high-quality saliency predictions. Furthermore, we construct the first low-light RGB-T SOD dataset as a benchmark for evaluating performance. Extensive experiments demonstrate that SFMNet achieves higher accuracy than existing models for low-light scenes.
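
As a rough sketch of what separating high- and low-frequency features can look like in practice, the code below applies an FFT low-pass mask to a feature map in PyTorch. This is a generic frequency split, not the authors' SFFE module, and the cutoff radius is an arbitrary choice:

```python
import torch

def split_frequencies(feat: torch.Tensor, radius: float = 0.25):
    """Split a feature map into low- and high-frequency parts via an FFT mask.

    feat: (B, C, H, W). `radius` is the normalized cutoff of the low-pass disk.
    """
    B, C, H, W = feat.shape
    spec = torch.fft.fftshift(torch.fft.fft2(feat), dim=(-2, -1))
    yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    lowpass = ((yy**2 + xx**2).sqrt() <= radius).to(spec.dtype)
    low = torch.fft.ifft2(torch.fft.ifftshift(spec * lowpass, dim=(-2, -1))).real
    return low, feat - low              # low- and high-frequency components

low, high = split_frequencies(torch.randn(1, 8, 64, 64))
print(low.shape, high.shape)
```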


Subject(s)
Cues , Humans , Light , Neural Networks, Computer , Visual Perception/physiology , Photic Stimulation/methods , Algorithms
18.
Med Phys ; 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38896829

ABSTRACT

BACKGROUND: Head and neck (HN) gross tumor volume (GTV) auto-segmentation is challenging due to the morphological complexity and low image contrast of targets. Multi-modality images, including computed tomography (CT) and positron emission tomography (PET), are used in routine clinical practice to assist radiation oncologists in accurate GTV delineation. However, the availability of PET imaging may not always be guaranteed. PURPOSE: To develop a deep learning segmentation framework for automated GTV delineation of HN cancers using combined PET/CT images while addressing the challenge of missing PET data. METHODS: Two datasets were included in this study. Dataset I: 524 (training) and 359 (testing) oropharyngeal cancer patients from different institutions, with PET/CT pairs provided by the HECKTOR Challenge; Dataset II: 90 HN patients (testing) from a local institution, with planning CT and PET/CT pairs. To handle potentially missing PET images, a model training strategy named the "Blank Channel" method was implemented. To simulate the absence of a PET image, a blank array with the same dimensions as the CT image was generated to meet the dual-channel input requirement of the deep learning model. During training, the model was randomly presented with either a real PET/CT pair or a blank/CT pair, allowing it to learn the relationship between the CT image and the corresponding GTV delineation based on the available modalities. As a result, our model can handle flexible inputs during prediction, making it suitable for cases where PET images are missing. To evaluate performance, we trained the proposed model on the training patients from Dataset I and tested it on Dataset II. We compared our model (Model 1) with two models trained for specific modality segmentations: Model 2, trained with only CT images, and Model 3, trained with real PET/CT pairs. Performance was evaluated using the Dice similarity coefficient (DSC), mean surface distance (MSD), and 95% Hausdorff distance (HD95). In addition, we evaluated Model 1 and Model 3 on the 359 test cases in Dataset I. RESULTS: Our proposed model (Model 1) achieved promising results for GTV auto-segmentation using PET/CT images, with the flexibility to handle missing PET images. When assessed with only CT images in Dataset II, Model 1 achieved a DSC of 0.56 ± 0.16, MSD of 3.4 ± 2.1 mm, and HD95 of 13.9 ± 7.6 mm. When PET images were included, performance improved to a DSC of 0.62 ± 0.14, MSD of 2.8 ± 1.7 mm, and HD95 of 10.5 ± 6.5 mm. These results are comparable to those achieved by Model 2 and Model 3, illustrating Model 1's effectiveness with flexible input modalities. Further analysis on the Dataset I test set showed that Model 1 achieved an average DSC of 0.77, surpassing the overall average DSC of 0.72 among all participants in the HECKTOR Challenge. CONCLUSIONS: We successfully developed a multi-modal segmentation tool for accurate GTV delineation in HN cancer. Our method addresses missing PET images by allowing flexible data input, providing a practical solution for clinical settings where access to PET imaging is limited.
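
The "Blank Channel" idea is described explicitly: substitute a blank array for the missing PET channel and randomize real versus blank pairs during training. A minimal PyTorch sketch of that input-building step (using zeros for the blank array, which is an assumption about its exact contents):

```python
import torch

def make_input(ct: torch.Tensor, pet: torch.Tensor | None, p_blank: float = 0.5):
    """Build a dual-channel CT/PET input, substituting a blank channel when
    PET is missing, or randomly during training per the 'Blank Channel' idea."""
    if pet is None or torch.rand(1).item() < p_blank:
        pet = torch.zeros_like(ct)          # blank array with CT's dimensions
    return torch.cat([ct, pet], dim=1)      # (B, 2, D, H, W)

ct  = torch.randn(1, 1, 32, 96, 96)
pet = torch.randn(1, 1, 32, 96, 96)
print(make_input(ct, pet).shape)                 # training: sometimes blank/CT
print(make_input(ct, None, p_blank=0.0).shape)   # prediction: PET missing
```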

19.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38801702

ABSTRACT

Self-supervised learning plays an important role in molecular representation learning because labeled molecular data are usually limited in many tasks, such as chemical property prediction and virtual screening. However, most existing molecular pre-training methods focus on a single modality of molecular data, and the complementary information of two important modalities, SMILES and graph, is not fully explored. In this study, we propose an effective multi-modality self-supervised learning framework for molecular SMILES and graph data. Specifically, SMILES data and graph data are first tokenized so that they can be processed by a unified Transformer-based backbone network, which is trained with a masked-reconstruction strategy. In addition, we introduce a specialized non-overlapping masking strategy to encourage fine-grained interaction between the two modalities. Experimental results show that our framework achieves state-of-the-art performance on a series of molecular property prediction tasks, and a detailed ablation study demonstrates the efficacy of both the multi-modality framework and the masking strategy.
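
A hedged sketch of the non-overlapping masking idea: sample disjoint mask sets for the two views so that tokens hidden in one modality remain visible in the other, forcing reconstruction to draw on cross-modal information. For illustration this assumes aligned token indices between the SMILES and graph views, which real molecules would not give you for free:

```python
import torch

def non_overlapping_masks(n_tokens: int, ratio: float = 0.3):
    """Sample disjoint boolean mask sets for two aligned token sequences.

    Positions masked in the SMILES view stay visible in the graph view
    (and vice versa), so each reconstruction must use the other modality.
    Requires ratio <= 0.5 so the two sets fit without overlap.
    """
    perm = torch.randperm(n_tokens)
    k = int(n_tokens * ratio)
    smiles_mask = torch.zeros(n_tokens, dtype=torch.bool)
    graph_mask = torch.zeros(n_tokens, dtype=torch.bool)
    smiles_mask[perm[:k]] = True
    graph_mask[perm[k:2 * k]] = True        # disjoint by construction
    return smiles_mask, graph_mask

s_mask, g_mask = non_overlapping_masks(20)
assert not (s_mask & g_mask).any()          # never mask the same position twice
print(s_mask.int().tolist(), g_mask.int().tolist())
```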


Subject(s)
Supervised Machine Learning , Algorithms , Computational Biology/methods
20.
Sensors (Basel) ; 24(10)2024 May 18.
Article in English | MEDLINE | ID: mdl-38794076

ABSTRACT

Object detection is one of the core technologies for autonomous driving. Current road object detection mainly relies on visible light, which is prone to missed detections and false alarms in rainy, night-time, and foggy scenes. Multispectral object detection based on the fusion of RGB and infrared images can effectively address the challenges of complex and changing road scenes, improving the detection performance of current algorithms in complex scenarios. However, previous multispectral detection algorithms suffer from issues such as poor fusion of dual-mode information, poor detection performance for multi-scale objects, and inadequate utilization of semantic information. To address these challenges and enhance the detection performance in complex road scenes, this paper proposes a novel multispectral object detection algorithm called MRD-YOLO. In MRD-YOLO, we utilize interaction-based feature extraction to effectively fuse information and introduce the BIC-Fusion module with attention guidance to fuse different modal information. We also incorporate the SAConv module to improve the model's detection performance for multi-scale objects and utilize the AIFI structure to enhance the utilization of semantic information. Finally, we conduct experiments on two major public datasets, FLIR_Aligned and M3FD. The experimental results demonstrate that compared to other algorithms, the proposed algorithm achieves superior detection performance in complex road scenes.
