Results 1 - 20 of 109
1.
IEEE Trans Med Imaging ; PP, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38949934

ABSTRACT

Deep learning approaches for multi-label chest X-ray (CXR) image classification usually require large-scale datasets. However, acquiring such datasets with full annotations is costly, time-consuming, and prone to noisy labels. Therefore, we introduce a weakly supervised learning problem called Single Positive Multi-label Learning (SPML) into CXR image classification (abbreviated as SPML-CXR), in which only one positive label is annotated per image. A simple solution to the SPML-CXR problem is to assume that all unannotated pathological labels are negative; however, this may introduce false negative labels and degrade model performance. To this end, we present a Multi-level Pseudo-label Consistency (MPC) framework for SPML-CXR. First, inspired by pseudo-labeling and consistency regularization in semi-supervised learning, we construct a weak-to-strong consistency framework, where the model prediction on a weakly-augmented image is treated as the pseudo label for supervising the model prediction on a strongly-augmented version of the same image, and define an Image-level Perturbation-based Consistency (IPC) regularization to recover potentially mislabeled positive labels. In addition, we incorporate Random Elastic Deformation (RED) as an additional strong augmentation to enhance the perturbation. Second, to expand the perturbation space, we add a feature-level perturbation stream to the consistency framework and introduce a Feature-level Perturbation-based Consistency (FPC) regularization as a supplement. Third, we design a Transformer-based encoder module to explore the sample relationships within each mini-batch through a Batch-level Transformer-based Correlation (BTC) regularization. Extensive experiments on the CheXpert and MIMIC-CXR datasets have shown the effectiveness of our MPC framework for solving the SPML-CXR problem.
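
To make the image-level weak-to-strong consistency idea concrete, here is a minimal Python sketch of a FixMatch-style pseudo-label loss adapted to the multi-label case, assuming a generic classifier `model` that returns per-pathology logits; the function name, confidence threshold, and masking rule are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def ipc_loss(model, weak_img, strong_img, threshold=0.7):
    # Pseudo-labels come from the weakly augmented view (no gradient flows).
    with torch.no_grad():
        pseudo = torch.sigmoid(model(weak_img))
    # Keep only confident pseudo-labels (very high or very low probability).
    mask = (pseudo > threshold) | (pseudo < 1 - threshold)
    # The strongly augmented view is supervised to agree with the pseudo-labels.
    logits = model(strong_img)
    loss = F.binary_cross_entropy_with_logits(
        logits, (pseudo > 0.5).float(), reduction="none")
    return (loss * mask.float()).mean()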

2.
Article in English | MEDLINE | ID: mdl-38083369

ABSTRACT

[18F]-Fluorodeoxyglucose (FDG) positron emission tomography - computed tomography (PET-CT) has become the imaging modality of choice for diagnosing many cancers. Co-learning complementary PET-CT imaging features is a fundamental requirement for automatic tumor segmentation and for developing computer-aided cancer diagnosis systems. In this study, we propose a hyper-connected transformer (HCT) network that integrates a transformer network (TN) with hyper-connected fusion for multi-modality PET-CT images. The TN was leveraged for its ability to provide global dependencies in image feature learning, achieved by using image patch embeddings with a self-attention mechanism to capture image-wide contextual information. We extended the single-modality definition of the TN with multiple TN-based branches to separately extract image features. We also introduced a hyper-connected fusion to fuse the contextual and complementary image features across multiple transformers in an iterative manner. Our results on two clinical datasets show that HCT achieved better segmentation accuracy than existing methods. Clinical Relevance - We anticipate that our approach can be an effective and supportive tool to aid physicians in tumor quantification and in identifying image biomarkers for cancer treatment.
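
The following toy sketch shows the general shape of two transformer branches exchanging features iteratively, which is the spirit of the hyper-connected fusion described above; the class, layer sizes, and the simple additive exchange are illustrative stand-ins for the paper's actual wiring.

import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    def __init__(self, dim=64, heads=4, iters=2):
        super().__init__()
        self.pet = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.ct = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.iters = iters

    def forward(self, pet_tokens, ct_tokens):
        # Iteratively exchange contextual features between the two branches.
        for _ in range(self.iters):
            pet_tokens = self.pet(pet_tokens + ct_tokens)
            ct_tokens = self.ct(ct_tokens + pet_tokens)
        return pet_tokens + ct_tokens

pet = torch.randn(1, 196, 64)   # e.g., 14 x 14 patch embeddings per modality
ct = torch.randn(1, 196, 64)
fused = TwoBranchFusion()(pet, ct)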


Subjects
Neoplasms; Positron Emission Tomography Computed Tomography; Humans; Positron Emission Tomography Computed Tomography/methods; Neoplasms/diagnostic imaging; Image Interpretation, Computer-Assisted/methods; Fluorodeoxyglucose F18; Diagnosis, Computer-Assisted
3.
Eur J Nucl Med Mol Imaging ; 50(13): 3996-4009, 2023 11.
Article in English | MEDLINE | ID: mdl-37596343

ABSTRACT

PURPOSE: Prognostic prediction is crucial for guiding individual treatment of locoregionally advanced nasopharyngeal carcinoma (LA-NPC) patients. Recently, multi-task deep learning has been explored for joint prognostic prediction and tumor segmentation in various cancers, with promising performance. This study aims to evaluate the clinical value of multi-task deep learning for prognostic prediction in LA-NPC patients. METHODS: A total of 886 LA-NPC patients from two medical centers were enrolled, including clinical data, [18F]FDG PET/CT images, and progression-free survival (PFS) follow-up. We adopted a deep multi-task survival model (DeepMTS) to jointly perform prognostic prediction (DeepMTS-Score) and tumor segmentation from FDG-PET/CT images. The DeepMTS-derived segmentation masks were leveraged to extract handcrafted radiomics features, which were also used for prognostic prediction (AutoRadio-Score). Finally, we developed a multi-task deep learning-based radiomic (MTDLR) nomogram by integrating the DeepMTS-Score, the AutoRadio-Score, and clinical data. Harrell's concordance index (C-index) and time-independent receiver operating characteristic (ROC) analysis were used to evaluate the discriminative ability of the proposed MTDLR nomogram. For patient stratification, the PFS rates of high- and low-risk patients were calculated using the Kaplan-Meier method and compared with the observed PFS probability. RESULTS: Our MTDLR nomogram achieved C-indices of 0.818 (95% confidence interval (CI): 0.785-0.851), 0.752 (95% CI: 0.638-0.865), and 0.717 (95% CI: 0.641-0.793) and areas under the curve (AUC) of 0.859 (95% CI: 0.822-0.895), 0.769 (95% CI: 0.642-0.896), and 0.730 (95% CI: 0.634-0.826) in the training, internal validation, and external validation cohorts, a statistically significant improvement over conventional radiomic nomograms. Our nomogram also divided patients into significantly different high- and low-risk groups. CONCLUSION: Our study demonstrated that the MTDLR nomogram can perform reliable and accurate prognostic prediction in LA-NPC patients and enables better patient stratification, which could facilitate personalized treatment planning.
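
Harrell's C-index, used above to evaluate discrimination, can be computed directly from risk scores and censored follow-up times; below is a plain NumPy sketch of the pairwise definition (illustrative only, not the authors' evaluation code).

import numpy as np

def harrell_c_index(risk, time, event):
    # Pair (i, j) is comparable when i has the earlier time and an observed
    # event; it is concordant when the earlier failure has the higher risk.
    concordant, comparable = 0.0, 0.0
    for i in range(len(time)):
        for j in range(len(time)):
            if time[i] < time[j] and event[i] == 1:
                comparable += 1
                concordant += (risk[i] > risk[j]) + 0.5 * (risk[i] == risk[j])
    return concordant / comparable

# Toy check: shorter survival paired with higher risk gives C = 1.0.
print(harrell_c_index(np.array([0.9, 0.4, 0.1]),
                      np.array([5.0, 12.0, 30.0]),
                      np.array([1, 1, 0])))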


Subjects
Deep Learning; Nasopharyngeal Neoplasms; Humans; Prognosis; Nomograms; Nasopharyngeal Carcinoma/diagnostic imaging; Positron Emission Tomography Computed Tomography/methods; Fluorodeoxyglucose F18; Nasopharyngeal Neoplasms/diagnostic imaging; Retrospective Studies
4.
Front Public Health ; 11: 1143947, 2023.
Article in English | MEDLINE | ID: mdl-37033028

ABSTRACT

Virtual Reality (VR) has emerged as a new, safe, and efficient tool for the rehabilitation of many childhood and adulthood illnesses. VR-based therapies have the potential to improve both motor and functional skills in a wide range of age groups through cortical reorganization and the activation of various neuronal connections. Recently, the potential for using serious VR-based games that combine perceptual learning and dichoptic stimulation has been explored for the rehabilitation of ophthalmological and neurological disorders. In ophthalmology, several clinical studies have demonstrated the ability of VR training to enhance stereopsis, contrast sensitivity, and visual acuity. VR technology provides a significant advantage in training each eye individually without requiring occlusion or penalization. In neurological disorders, the majority of patients undergo recurrent episodes (relapses) of neurological impairment; in many cases (60-80%), the illness progresses over time and becomes chronic, resulting in accumulated motor disability and cognitive deficits. Current research on memory restoration has been spurred by theories about brain plasticity and findings concerning the nervous system's capacity to reconstruct cellular synapses as a result of interaction with enriched environments. Therefore, VR training can play an important role in improving cognitive function and motor disability. Although several reviews in the community have examined the use of Artificial Intelligence in healthcare, VR has not yet been thoroughly examined in this regard. In this systematic review, we examine the key ideas of VR-based training for prevention and control in ocular diseases such as myopia, amblyopia, presbyopia, and Age-related Macular Degeneration (AMD), and in neurological disorders such as Alzheimer's disease, Multiple Sclerosis (MS), epilepsy, and Autism Spectrum Disorder. This review highlights the fundamentals of VR technologies with regard to their clinical research in healthcare. Moreover, these findings will raise community awareness of VR training and help researchers learn new techniques to prevent and treat different diseases. We further discuss the current challenges of using VR devices, as well as the future prospects of human training.


Subjects
Autism Spectrum Disorder; Disabled Persons; Motor Disorders; Nervous System Diseases; Virtual Reality; Humans; Child; Artificial Intelligence
5.
IEEE Trans Med Imaging ; 42(10): 2842-2852, 2023 10.
Article in English | MEDLINE | ID: mdl-37043322

ABSTRACT

Dynamic PET imaging provides richer physiological information than conventional static PET imaging. However, the dynamic information is gained at the cost of a long scanning protocol, which limits the clinical application of dynamic PET imaging. We developed a modified Logan reference plot model to shorten the acquisition procedure in dynamic PET imaging by omitting the early-time information required by the conventional reference Logan model. The proposed model is theoretically accurate, but the straightforward approach raises a sampling problem in implementation and results in noisy parametric images. We therefore designed a self-supervised convolutional neural network to improve the noise performance of parametric imaging, with dynamic images of only a single subject used for training. The proposed method was validated on simulated and real dynamic [18F]-fallypride PET data. Results showed that it accurately estimated the distribution volume ratio (DVR) in dynamic PET with a shortened scanning protocol, e.g., 20 minutes, where the estimates were comparable with those obtained from a standard dynamic PET study of 120 minutes of acquisition. Further comparisons illustrated that our method outperformed the shortened Logan model implemented with Gaussian filtering, regularization, BM4D, and the 4D deep image prior methods in terms of the trade-off between bias and variance. Since the proposed method uses data acquired in a short period after equilibrium, it has the potential to add clinical value by providing both the DVR and the Standardized Uptake Value (SUV) simultaneously. It thus promotes clinical application of dynamic PET studies of neuronal receptor function.
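
For intuition, a simplified reference-Logan fit looks like the sketch below, which estimates the DVR as the late-time slope of the transformed curves; the toy TACs, the function name, and the omission of the Cref/k2' intercept term are simplifying assumptions for illustration.

import numpy as np
from scipy.integrate import cumulative_trapezoid

def logan_ref_dvr(ct, cref, t, t_star):
    # Reference-Logan transform: y = intg(Ct)/Ct, x = intg(Cref)/Ct; for
    # t > t* the plot becomes linear and its slope approximates the DVR.
    x = cumulative_trapezoid(cref, t, initial=0.0) / ct
    y = cumulative_trapezoid(ct, t, initial=0.0) / ct
    late = t >= t_star
    slope, _ = np.polyfit(x[late], y[late], 1)
    return slope

t = np.linspace(0.5, 120, 240)          # minutes
cref = t * np.exp(-0.08 * t)            # toy reference-region TAC
ct = 2.0 * cref                         # toy target TAC with DVR = 2
print(logan_ref_dvr(ct, cref, t, t_star=20.0))  # -> 2.0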


Subjects
Neural Networks, Computer; Positron-Emission Tomography; Positron-Emission Tomography/methods
6.
IEEE Trans Biomed Eng ; 70(9): 2592-2603, 2023 09.
Article in English | MEDLINE | ID: mdl-37030751

ABSTRACT

In this article, we propose a novel wavelet convolution unit for image-oriented neural networks that integrates wavelet analysis with a vanilla convolution operator to extract deep abstract features more efficiently. On the one hand, to acquire non-local receptive fields and avoid information loss, we define a new convolution operation by composing a traditional convolution function with the approximation and detail representations obtained from single-scale wavelet decomposition of the source images. On the other hand, multi-scale wavelet decomposition is introduced to obtain more comprehensive multi-scale feature information. We then fuse all these cross-scale features to alleviate the inaccurate localization of singular points. Given the novel wavelet convolution unit, we further design a network based on it for fine-grained Alzheimer's disease classification (i.e., Alzheimer's disease, normal controls, early mild cognitive impairment, late mild cognitive impairment). To date, only a few methods have studied one or several fine-grained classifications, and even fewer can achieve both fine-grained and multi-class classification. We adopt the novel network and diffusion tensor images to achieve fine-grained classification, attaining state-of-the-art accuracy for all eight kinds of fine-grained classifications, up to 97.30%, 95.78%, 95.00%, 94.00%, 97.89%, 95.71%, 95.07%, and 93.79%. To build a reference standard for Alzheimer's disease classification, we implemented all twelve coarse-grained and fine-grained classifications. The results show that the proposed method achieves consistently high accuracy across them, and its classification ability greatly exceeds that of existing Alzheimer's disease classification methods.
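
A minimal sketch of the single-scale wavelet-convolution idea follows, stacking DWT sub-bands as channels of a vanilla convolution; the Haar wavelet, channel counts, and function name are illustrative choices, not the authors' unit.

import numpy as np
import pywt
import torch
import torch.nn as nn

def wavelet_conv(img, conv):
    # Single-scale 2D DWT: one approximation (LL) and three detail bands.
    ll, (lh, hl, hh) = pywt.dwt2(img, "haar")
    bands = np.stack([ll, lh, hl, hh])                # 4 x H/2 x W/2
    x = torch.from_numpy(bands).float().unsqueeze(0)  # 1 x 4 x H/2 x W/2
    return conv(x)                                    # vanilla conv on bands

conv = nn.Conv2d(4, 8, kernel_size=3, padding=1)
out = wavelet_conv(np.random.rand(64, 64), conv)
print(out.shape)  # torch.Size([1, 8, 32, 32])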


Subjects
Alzheimer Disease; Cognitive Dysfunction; Humans; Alzheimer Disease/diagnostic imaging; Neural Networks, Computer; Brain; Databases, Factual
7.
Comput Biol Med ; 154: 106576, 2023 03.
Article in English | MEDLINE | ID: mdl-36736097

ABSTRACT

The spatial architecture of the tumour microenvironment and the phenotypic heterogeneity of tumour cells have been shown to be associated with cancer prognosis and clinical outcomes, including survival. Recent advances in highly multiplexed imaging, including imaging mass cytometry (IMC), capture spatially resolved, high-dimensional maps that quantify dozens of disease-relevant biomarkers at single-cell resolution and have the potential to inform patient-specific prognosis. Existing automated methods for predicting survival, on the other hand, typically do not leverage the spatial phenotype information captured at the single-cell level. Furthermore, there is no end-to-end method designed to leverage the rich information in whole IMC images and all marker channels, and to aggregate this information with clinical data in a complementary manner to predict survival with enhanced accuracy. To that end, we present a deep multimodal graph-based network (DMGN) with two modules: (1) a multimodal graph-based module that adaptively considers the relationships between spatial phenotype information in all image regions and all clinical variables, and (2) a clinical embedding module that automatically generates embeddings specialised for each clinical variable to enhance multimodal aggregation. We demonstrate that our modules are consistently effective at improving survival prediction performance on two public breast cancer datasets, and that our new approach can outperform state-of-the-art methods in survival prediction.
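
As a toy stand-in for the multimodal graph-based module, the sketch below treats image-region features and clinical-variable embeddings as nodes of one fully connected graph and lets attention model their pairwise relationships; all shapes and names are assumptions, not the DMGN design.

import torch
import torch.nn as nn

regions = torch.randn(1, 40, 32)            # 40 image regions, 32-d features
clinical = torch.randn(1, 5, 32)            # 5 embedded clinical variables
nodes = torch.cat([regions, clinical], dim=1)

# Self-attention over all nodes models adaptive pairwise relationships
# between every image region and every clinical variable.
attn = nn.MultiheadAttention(32, num_heads=4, batch_first=True)
fused, _ = attn(nodes, nodes, nodes)
risk = nn.Linear(32, 1)(fused.mean(dim=1))  # pooled graph -> survival risk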


Subjects
Neoplasms; Tumor Microenvironment; Humans; Phenotype; Upper Extremity; Neoplasms/diagnostic imaging
8.
Artif Intell Med ; 132: 102374, 2022 10.
Article in English | MEDLINE | ID: mdl-36207084

ABSTRACT

OBJECTIVE: The accurate classification of mass lesions in the adrenal glands ('adrenal masses'), detected with computed tomography (CT), is important for diagnosis and patient management. Adrenal masses can be benign or malignant, and benign masses have varying prevalence. Classification methods based on convolutional neural networks (CNNs) are the state-of-the-art in maximizing inter-class differences in large medical imaging training datasets. The application of CNNs to adrenal masses is challenging due to large intra-class variations, large inter-class similarities, and training data imbalanced by the size of the mass lesions. METHODS: We developed a deep multi-scale resemblance network (DMRN) to overcome these limitations, leveraging paired CNNs to evaluate intra-class similarities. We used multi-scale feature embedding to improve inter-class separability by iteratively combining complementary information produced at different scales of the input to create structured feature descriptors. We augmented the training data with randomly sampled paired adrenal masses to reduce the influence of imbalanced training data. RESULTS: We used 229 CT scans of patients with adrenal masses for evaluation. In five-fold cross-validation, our method had the best results (89.52% accuracy) compared to state-of-the-art methods (p < 0.05). We conducted a generalizability analysis of our method on the ImageCLEF 2016 medical subfigure classification competition dataset, which consists of a training set of 6776 images and a test set of 4166 images across 30 classes. Our method achieved better classification performance (85.90% accuracy) than the existing methods and was competitive with methods that require additional training data (1.47% lower in accuracy). CONCLUSION: Our DMRN sub-classified adrenal masses on CT and was superior to state-of-the-art approaches.
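
A bare-bones sketch of the paired-CNN idea follows, in which a shared encoder scores the resemblance of two mass patches; the encoder depth, patch sizes, and cosine score are illustrative, not the DMRN architecture.

import torch
import torch.nn as nn

# Shared encoder applied to both patches of a sampled pair.
encoder = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())

a = torch.randn(8, 1, 64, 64)    # batch of adrenal-mass patches
b = torch.randn(8, 1, 64, 64)    # their randomly sampled pair partners
fa, fb = encoder(a), encoder(b)
resemblance = nn.functional.cosine_similarity(fa, fb)  # one score per pair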


Subjects
Neural Networks, Computer; Tomography, X-Ray Computed; Humans; Tomography, X-Ray Computed/methods
9.
Front Oncol ; 12: 899351, 2022.
Article in English | MEDLINE | ID: mdl-35965589

ABSTRACT

Objective: Deep learning-based radiomics (DLR) has achieved great success in medical image analysis and has been considered a replacement for conventional radiomics that relies on handcrafted features. In this study, we aimed to explore the capability of DLR for predicting 5-year progression-free survival (PFS) in advanced nasopharyngeal carcinoma (NPC) using pretreatment PET/CT images. Methods: A total of 257 patients (170/87 patients in the internal/external cohorts) with advanced NPC (TNM stage III or IVa) were enrolled. We developed an end-to-end multi-modality DLR model, in which a 3D convolutional neural network was optimized to extract deep features from pretreatment PET/CT images and predict the probability of 5-year PFS. The TNM stage, as a high-level clinical feature, could be integrated into our DLR model to further improve prognostic performance. For comparison between conventional radiomics and DLR, 1,456 handcrafted features were extracted, and optimal conventional radiomics methods were selected from 54 cross-combinations of six feature selection methods and nine classification methods. In addition, risk group stratification was performed with the clinical signature, the conventional radiomics signature, and the DLR signature. Results: Our multi-modality DLR model using both PET and CT achieved higher prognostic performance (area under the receiver operating characteristic curve (AUC) = 0.842 ± 0.034 and 0.823 ± 0.012 for the internal and external cohorts) than the optimal conventional radiomics method (AUC = 0.796 ± 0.033 and 0.782 ± 0.012). Furthermore, the multi-modality DLR model outperformed single-modality DLR models using only PET (AUC = 0.818 ± 0.029 and 0.796 ± 0.009) or only CT (AUC = 0.657 ± 0.055 and 0.645 ± 0.021). For risk group stratification, the conventional radiomics signature and the DLR signature enabled significant separation between the high- and low-risk patient groups in both the internal and external cohorts (p < 0.001), while the clinical signature failed in the external cohort (p = 0.177). Conclusion: Our study identified potential prognostic tools for survival prediction in advanced NPC, suggesting that DLR could provide complementary value to the current TNM staging.
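
A minimal sketch of the multi-modality idea follows: PET and CT volumes enter as two input channels and the TNM stage is concatenated before the prediction head; layer sizes and names are illustrative assumptions, not the authors' network.

import torch
import torch.nn as nn

class TinyDLR(nn.Module):
    def __init__(self):
        super().__init__()
        # 3D CNN over a 2-channel (PET + CT) volume.
        self.cnn = nn.Sequential(
            nn.Conv3d(2, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())
        # TNM stage appended as one extra scalar feature.
        self.head = nn.Linear(8 + 1, 1)

    def forward(self, petct, tnm):
        feat = self.cnn(petct)
        return torch.sigmoid(self.head(torch.cat([feat, tnm], dim=1)))

prob = TinyDLR()(torch.randn(2, 2, 32, 64, 64), torch.tensor([[3.], [4.]]))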

10.
Neuroimage ; 259: 119444, 2022 10 01.
Article in English | MEDLINE | ID: mdl-35792292

ABSTRACT

Deformable image registration is fundamental for many medical image analyses. A key obstacle to accurate image registration lies in image appearance variations, such as variations in texture, intensity, and noise. These variations are readily apparent in medical images, especially in brain images, where registration is frequently used. Recently, deep learning-based registration methods (DLRs), using deep neural networks, have shown computational efficiency that is several orders of magnitude faster than traditional optimization-based registration methods (ORs). DLRs rely on a globally optimized network that is trained with a set of training samples to achieve faster registration. DLRs tend, however, to disregard the target-pair-specific optimization inherent in ORs and thus have degraded adaptability to variations in testing samples. This limitation is severe for registering medical images with large appearance variations, especially since few existing DLRs explicitly take appearance variations into account. In this study, we propose an Appearance Adjustment Network (AAN) to enhance the adaptability of DLRs to appearance variations. Our AAN, when integrated into a DLR, provides appearance transformations to reduce the appearance variations during registration. In addition, we propose an anatomy-constrained loss function through which our AAN generates anatomy-preserving transformations. Our AAN has been purposely designed to be readily inserted into a wide range of DLRs and can be trained cooperatively in an unsupervised and end-to-end manner. We evaluated our AAN with three state-of-the-art DLRs - Voxelmorph (VM), Diffeomorphic Voxelmorph (DifVM), and Laplacian Pyramid Image Registration Network (LapIRN) - on three well-established public datasets of 3D brain magnetic resonance imaging (MRI) - IBSR18, Mindboggle101, and LPBA40. The results show that our AAN consistently improved existing DLRs and outperformed state-of-the-art ORs in registration accuracy, while adding only a fractional computational load to existing DLRs.
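
Conceptually, the AAN sits in front of a registration network and supplies a residual appearance transformation, as in this toy sketch; both networks here are tiny stand-ins, and all names and shapes are assumptions.

import torch
import torch.nn as nn

# Appearance-adjustment net: predicts a residual intensity transformation.
aan = nn.Sequential(nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
                    nn.Conv3d(8, 1, 3, padding=1))
# Registration net stand-in: maps the image pair to a 3-channel flow field.
reg_net = nn.Conv3d(2, 3, 3, padding=1)

moving = torch.randn(1, 1, 16, 16, 16)
fixed = torch.randn(1, 1, 16, 16, 16)
adjusted = moving + aan(moving)              # appearance-adjusted moving image
flow = reg_net(torch.cat([adjusted, fixed], dim=1))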


Subjects
Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Algorithms; Brain/diagnostic imaging; Humans; Image Processing, Computer-Assisted/methods; Neural Networks, Computer
11.
IEEE J Biomed Health Inform ; 26(9): 4497-4507, 2022 09.
Article in English | MEDLINE | ID: mdl-35696469

ABSTRACT

Nasopharyngeal Carcinoma (NPC) is a malignant epithelial cancer arising from the nasopharynx. Survival prediction is a major concern for NPC patients, as it provides early prognostic information for treatment planning. Recently, deep survival models based on deep learning have demonstrated the potential to outperform traditional radiomics-based survival prediction models. Deep survival models usually use image patches covering the whole target regions (e.g., the nasopharynx for NPC) or containing only segmented tumor regions as input. However, models using the whole target regions also include non-relevant background information, while models using segmented tumor regions disregard potentially prognostic information existing outside the primary tumors (e.g., local lymph node metastasis and adjacent tissue invasion). In this study, we propose a 3D end-to-end Deep Multi-Task Survival model (DeepMTS) for joint survival prediction and tumor segmentation in advanced NPC from pretreatment PET/CT. Our novelty is the introduction of a hard-sharing segmentation backbone to guide the extraction of local features related to the primary tumors, which reduces interference from non-relevant background information. In addition, we introduce a cascaded survival network to capture the prognostic information existing outside the primary tumors and further leverage the global tumor information (e.g., tumor size, shape, and location) derived from the segmentation backbone. Our experiments on two clinical datasets demonstrate that DeepMTS consistently outperforms traditional radiomics-based survival prediction models and existing deep survival models.
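
A hard-sharing sketch of the joint design: one encoder feeds both a segmentation head and a survival head, so segmentation guides the shared features; everything below is an illustrative miniature, not DeepMTS itself.

import torch
import torch.nn as nn

class TinyMTS(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared (hard-sharing) encoder over a PET + CT volume.
        self.enc = nn.Sequential(nn.Conv3d(2, 8, 3, padding=1), nn.ReLU())
        self.seg = nn.Conv3d(8, 1, 1)                   # tumor mask logits
        self.surv = nn.Sequential(nn.AdaptiveAvgPool3d(1),
                                  nn.Flatten(), nn.Linear(8, 1))

    def forward(self, petct):
        f = self.enc(petct)
        return self.seg(f), self.surv(f)   # joint outputs, shared features

mask_logits, risk = TinyMTS()(torch.randn(1, 2, 16, 32, 32))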


Subjects
Deep Learning; Nasopharyngeal Neoplasms; Humans; Nasopharyngeal Carcinoma/diagnostic imaging; Nasopharyngeal Carcinoma/pathology; Nasopharyngeal Neoplasms/diagnostic imaging; Nasopharyngeal Neoplasms/pathology; Positron Emission Tomography Computed Tomography/methods; Prognosis
12.
IEEE J Biomed Health Inform ; 26(7): 3261-3271, 2022 07.
Article in English | MEDLINE | ID: mdl-35377850

ABSTRACT

Positron Emission Tomography (PET) has become a preferred imaging modality for cancer diagnosis, radiotherapy planning, and treatment response monitoring. Accurate and automatic tumor segmentation is a fundamental requirement for these clinical applications. Deep convolutional neural networks have become the state-of-the-art in PET tumor segmentation. The normalization process is one of the key components for accelerating network training and improving network performance. However, existing normalization methods either introduce batch noise into individual PET images by computing statistics at the batch level, or introduce background noise into every single pixel by sharing the same learnable parameters spatially. In this paper, we propose an attentive transformation (AT)-based normalization method for PET tumor segmentation. We exploit the distinguishability of breast tumors in PET images and dynamically generate dedicated, pixel-dependent learnable parameters in normalization via a transformation on a combination of channel-wise and spatial-wise attentive responses. The attentive learnable parameters allow features to be re-calibrated pixel by pixel to focus on the high-uptake area while attenuating the background noise of PET images. Our experimental results on two real clinical datasets show that the AT-based normalization method improves breast tumor segmentation performance compared with existing normalization methods.
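
The following sketch captures the flavor of pixel-dependent normalization parameters: features are instance-normalized and then re-scaled by gamma/beta maps predicted per pixel from the features themselves; the 1x1-conv attentive maps are a simplifying assumption, not the paper's exact AT module.

import torch
import torch.nn as nn

class AttentiveNorm(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.norm = nn.InstanceNorm2d(ch, affine=False)
        self.gamma = nn.Conv2d(ch, ch, 1)   # pixel-dependent scale map
        self.beta = nn.Conv2d(ch, ch, 1)    # pixel-dependent shift map

    def forward(self, x):
        a = torch.sigmoid(self.gamma(x))    # attentive re-calibration weights
        return self.norm(x) * a + self.beta(x)

y = AttentiveNorm(16)(torch.randn(2, 16, 32, 32))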


Subjects
Breast Neoplasms; Neural Networks, Computer; Breast Neoplasms/diagnostic imaging; Female; Humans; Image Processing, Computer-Assisted/methods; Positron-Emission Tomography
13.
IEEE Trans Biomed Eng ; 69(8): 2557-2568, 2022 08.
Article in English | MEDLINE | ID: mdl-35148261

ABSTRACT

OBJECTIVE: The m6A modification is the most common ribonucleic acid (RNA) modification, playing a role in promoting gene mutations and protein structure changes in the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). Nanopore single-molecule direct RNA sequencing (DRS) provides data support for RNA modification detection and can preserve the potential m6A signature, unlike second-generation sequencing. However, due to insufficient DRS data, methods to find m6A RNA modifications in DRS are lacking. Our purpose is to identify m6A modifications in DRS precisely. METHODS: We present a methodology for identifying m6A modifications that incorporates read mapping and feature extraction from DRS data. To detect m6A modifications, we introduce an ensemble method called mixed-weight neural bagging (MWNB), trained with synthetic 5-base RNA DRS data containing m6A-modified and unmodified reads. RESULTS: Our MWNB model achieved the highest classification accuracy of 97.85% and an AUC of 0.9968. Additionally, we applied the MWNB model to the COVID-19 dataset; the experimental results reveal a strong association with biomedical experiments. CONCLUSION: Our strategy enables the prediction of m6A modifications using DRS data and completes the identification of m6A modifications on the SARS-CoV-2 genome. SIGNIFICANCE: The Coronavirus Disease 2019 (COVID-19) outbreak, caused by SARS-CoV-2, has had a significant global impact. An RNA modification called m6A is connected with viral infections. The appearance of m6A modifications related to several essential proteins affects the structure and function of those proteins. Therefore, finding the location and number of m6A RNA modifications is crucial for subsequent analysis of the protein expression profile.
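
As a rough stand-in for neural bagging, the sketch below bootstrap-aggregates small MLPs over synthetic per-base features; the feature layout and data are invented for illustration, and the mixed-weight scheme of MWNB is omitted.

import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic per-site features, e.g. current mean/sd/dwell time for each of
# the 5 bases around a candidate site (12 columns chosen arbitrarily here).
rng = np.random.default_rng(0)
X = rng.random((200, 12))
y = rng.integers(0, 2, 200)          # 1 = m6A-modified site (synthetic)

clf = BaggingClassifier(MLPClassifier(hidden_layer_sizes=(16,), max_iter=500),
                        n_estimators=10, random_state=0)
clf.fit(X, y)
print(clf.predict_proba(X[:3]))      # ensemble-averaged probabilities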


Subjects
COVID-19; SARS-CoV-2; Humans; RNA, Viral/analysis; RNA, Viral/genetics; SARS-CoV-2/genetics; Sequence Analysis, RNA
14.
IEEE Trans Image Process ; 31: 1789-1804, 2022.
Article in English | MEDLINE | ID: mdl-35100116

ABSTRACT

Video Summarization (VS) has become one of the most effective solutions for quickly understanding a large volume of video data. Dictionary selection with self-representation and sparse regularization has demonstrated promise for VS by formulating the VS problem as a sparse selection task on video frames. However, existing dictionary selection models are generally designed only for data reconstruction, which neglects the inherent structured information among video frames. In addition, the sparsity commonly enforced by the L2,1 norm is not strong enough, causing redundancy among keyframes (i.e., similar keyframes are selected). To address these two issues, in this paper we propose a general framework called graph convolutional dictionary selection with L2,p (0 < p ≤ 1) norm (GCDS2,p) for both keyframe selection and skimming-based summarization. First, we incorporate graph embedding into dictionary selection to generate a graph embedding dictionary that takes the structured information depicted in videos into account. Second, we propose the L2,p (0 < p ≤ 1) norm constrained row sparsity, in which p can be flexibly set for the two forms of video summarization: for keyframe selection, 0 < p < 1 can be utilized to select diverse and representative keyframes; for skimming, p = 1 can be utilized to select key shots. In addition, an efficient iterative algorithm is devised to optimize the proposed model, and its convergence is theoretically proven. Experimental results on both keyframe selection and skimming-based summarization across four benchmark datasets demonstrate the effectiveness and superiority of the proposed method.
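
The row-sparsity regularizer at the heart of the model is easy to state in NumPy; smaller p drives more rows of the self-representation matrix toward zero, as this sketch (illustrative names only) shows.

import numpy as np

def l2p_row_norm(W, p):
    # sum_i ||w_i||_2^p over rows: p = 1 gives group-lasso-style sparsity,
    # while p < 1 penalizes small-but-nonzero rows more aggressively,
    # encouraging fewer, more diverse selected frames.
    return np.sum(np.linalg.norm(W, axis=1) ** p)

W = np.random.rand(100, 100)      # self-representation coefficients
print(l2p_row_norm(W, 0.5), l2p_row_norm(W, 1.0))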

15.
IEEE Trans Med Imaging ; 41(2): 347-359, 2022 02.
Article in English | MEDLINE | ID: mdl-34520350

ABSTRACT

68Ga-DOTATATE PET-CT is routinely used for imaging neuroendocrine tumor (NET) somatostatin receptor subtype 2 (SSTR2) density in patients, and is complementary to FDG PET-CT for improving the accuracy of NET detection, characterization, grading, staging, and predicting/monitoring NET responses to treatment. Performing sequential 18F-FDG and 68Ga-DOTATATE PET scans would require 2 or more days and can delay patient care. To align the temporal and spatial measurements of 18F-FDG and 68Ga-DOTATATE PET, and to reduce scan time and CT radiation exposure to patients, we propose a single-session dual-tracer dynamic PET acquisition protocol in this study. A recurrent extreme gradient boosting (rXGBoost) machine learning algorithm was proposed to separate the mixed 18F-FDG and 68Ga-DOTATATE time activity curves (TACs) for region-of-interest (ROI) based quantification with tracer kinetic modeling. A conventional parallel multi-tracer compartment modeling method was also implemented for reference. Single-scan dual-tracer dynamic PET was simulated from 12 NET patient studies with 18F-FDG and 68Ga-DOTATATE 45-min dynamic PET scans separately obtained within 2 days. Our experimental results suggest that a protocol injecting 18F-FDG first, followed by 68Ga-DOTATATE after a delay of at least 5 min, is highly feasible for separating the mixed 18F-FDG and 68Ga-DOTATATE TACs with the rXGBoost algorithm followed by tracer kinetic modeling.
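
A conceptual sketch of the separation task follows: regress one tracer's component from window features of the mixed TAC with gradient boosting. The synthetic TACs and the plain (non-recurrent) XGBoost regressor are simplifying assumptions, not the rXGBoost algorithm.

import numpy as np
from xgboost import XGBRegressor

t = np.linspace(0, 45, 90)                              # minutes, 30-s frames
fdg = np.exp(-0.05 * t)                                 # toy 18F-FDG component
dota = np.where(t >= 5, np.exp(-0.02 * (t - 5)), 0.0)   # injected 5 min later
mixed = fdg + dota                                      # what the scanner sees

# Window features: the mixed activity over 5 neighbouring frames.
X = np.stack([np.roll(mixed, k) for k in range(5)], axis=1)
model = XGBRegressor(n_estimators=50, max_depth=3).fit(X, fdg)
fdg_hat = model.predict(X)                              # recovered component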


Subjects
Fluorodeoxyglucose F18; Positron Emission Tomography Computed Tomography; Gallium Radioisotopes; Humans; Machine Learning; Positron-Emission Tomography; Radionuclide Imaging
16.
IEEE Trans Image Process ; 31: 880-893, 2022.
Article in English | MEDLINE | ID: mdl-34951844

ABSTRACT

Automatic vertebra segmentation from computed tomography (CT) images is the first and a decisive stage of vertebra analysis in computer-based spinal diagnosis and therapy support systems. However, automatic segmentation of vertebrae remains challenging for several reasons, including the anatomic complexity of the spine and the unclear boundaries of the vertebrae associated with spongy and soft bones. Based on the 2D U-Net, we propose an Embedded Clustering Sliced U-Net (ECSU-Net). ECSU-Net comprises three modules: segmentation, intervertebral disc extraction (IDE), and fusion. The segmentation module follows an instance-embedding clustering approach, where three sliced sub-nets use the axes of the CT images to generate a coarse 2D segmentation along with an embedding space of the same size as the input slices. The IDE module is designed to classify vertebrae and find the inter-space between two slices of the segmented spine. The fusion module takes the coarse 2D segmentation and outputs refined 3D vertebra results. A novel adaptive discriminative loss (ADL) function is introduced to train the embedding space for clustering. In the fusion strategy, the three modules are integrated via a learnable weight-control component, which adaptively sets their contributions. We evaluated classical and deep learning methods on the SpineWeb dataset-2. ECSU-Net provided performance comparable to previous neural-network-based algorithms, achieving the best segmentation Dice score of 95.60% and classification accuracy of 96.20%, while requiring less time and fewer computational resources.
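
The adaptive discriminative loss builds on the familiar pull/push embedding loss for instance clustering; a toy fixed-margin version is sketched below (the adaptive margin scheme is omitted, and all names and margins are illustrative).

import torch

def disc_loss(emb, labels, d_pull=0.5, d_push=1.5):
    # Pull pixel embeddings toward the mean of their own vertebra instance...
    means, pull = [], 0.0
    for k in labels.unique():
        e = emb[labels == k]
        mu = e.mean(dim=0)
        means.append(mu)
        pull = pull + torch.clamp((e - mu).norm(dim=1) - d_pull,
                                  min=0).pow(2).mean()
    # ...and push the means of different vertebra instances apart.
    push, M = 0.0, torch.stack(means)
    for i in range(len(M)):
        for j in range(i + 1, len(M)):
            push = push + torch.clamp(d_push - (M[i] - M[j]).norm(),
                                      min=0).pow(2)
    return pull + push

loss = disc_loss(torch.randn(500, 8), torch.randint(0, 4, (500,)))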


Subjects
Image Processing, Computer-Assisted; Intervertebral Disc; Cluster Analysis; Neural Networks, Computer; Tomography, X-Ray Computed
17.
Comput Med Imaging Graph ; 91: 101952, 2021 07.
Article in English | MEDLINE | ID: mdl-34144318

ABSTRACT

Automated segmentation of the left ventricular cavity (LVC) in temporal cardiac image sequences (consisting of multiple time-points) is a fundamental requirement for quantitative analysis of cardiac structural and functional changes. Deep learning methods for segmentation are the state-of-the-art in performance; however, these methods are generally formulated to work on a single time-point and thus disregard the complementary information available from temporal image sequences, which can aid segmentation accuracy and consistency across time-points. In particular, single time-point segmentation methods perform poorly in segmenting the end-systole (ES) phase image in the cardiac sequence, where the left ventricle deforms to its smallest, most irregular shape and the boundary between the blood chamber and the myocardium becomes inconspicuous and ambiguous. To overcome these limitations in automatically segmenting temporal LVCs, we present a spatial sequential network (SS-Net) to learn the deformation and motion characteristics of the LVCs in an unsupervised manner; these characteristics are then integrated with sequential context information derived from bi-directional learning (BL), in which both the chronological and reverse-chronological directions of the image sequence are used. Our experimental results on a cardiac computed tomography (CT) dataset demonstrate that our spatial sequential network with bi-directional learning (SS-BL-Net) outperforms existing methods for spatiotemporal LVC segmentation.
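
Bi-directional temporal context can be sketched with a bidirectional recurrent layer over per-phase features, as below; the GRU, feature sizes, and toy head are assumptions, not the SS-BL-Net design.

import torch
import torch.nn as nn

frames = torch.randn(1, 10, 32)   # 10 cardiac phases, 32-d features per phase

# Each phase (including end-systole) sees both chronological and
# reverse-chronological context through the bidirectional pass.
bigru = nn.GRU(32, 16, batch_first=True, bidirectional=True)
context, _ = bigru(frames)        # shape: 1 x 10 x 32 (both directions)

seg_head = nn.Linear(32, 1)       # per-phase LVC logits (toy stand-in)
logits = seg_head(context)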


Subjects
Four-Dimensional Computed Tomography; Heart Ventricles; Heart; Heart Ventricles/diagnostic imaging; Image Processing, Computer-Assisted
18.
IEEE Trans Med Imaging ; 40(3): 840-851, 2021 03.
Article in English | MEDLINE | ID: mdl-33180721

ABSTRACT

Short-term monitoring of lesion changes has been a widely accepted clinical guideline for melanoma screening. When there is a significant change in a melanocytic lesion at three months, the lesion is excised to exclude melanoma. However, the decision of change versus no change depends heavily on the experience and bias of individual clinicians, and is therefore subjective. In this paper, for the first time, a novel deep learning based method is developed for automatically detecting short-term lesion changes in melanoma screening. Lesion change detection is formulated as a task of measuring the similarity between two dermoscopy images of a lesion taken within a short time-frame, and a novel Siamese-structure-based deep network is proposed to produce the decision: changed (i.e., not similar) or unchanged (i.e., similar enough). Under the Siamese framework, a novel structure, namely the Tensorial Regression Process, is proposed to extract the global features of lesion images, in addition to deep convolutional features. To mimic the decision-making process of clinicians, who often focus more on regions with specific patterns when comparing a pair of lesion images, a segmentation loss (SegLoss) is further devised and incorporated into the proposed network as a regularization term. To evaluate the proposed method, an in-house dataset of 1,000 pairs of lesion images taken within short time-frames at a clinical melanoma centre was established. Experimental results on this first-of-its-kind large dataset indicate that the proposed model is promising for detecting short-term lesion change for objective melanoma screening.
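
The Siamese decision can be sketched as a shared encoder plus a classifier on the absolute feature difference; the encoder depth, image sizes, and difference-based head are illustrative assumptions, not the paper's network.

import torch
import torch.nn as nn

# Shared encoder applied to both dermoscopy images of the pair.
encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
head = nn.Linear(16, 2)              # 0 = unchanged, 1 = changed

img0 = torch.randn(4, 3, 128, 128)   # baseline images
img1 = torch.randn(4, 3, 128, 128)   # three-month follow-up images
logits = head(torch.abs(encoder(img0) - encoder(img1)))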


Subjects
Melanoma; Skin Neoplasms; Dermoscopy; Humans; Melanoma/diagnostic imaging; Neural Networks, Computer; Skin Neoplasms/diagnostic imaging
19.
J Biomed Inform ; 106: 103430, 2020 06.
Article in English | MEDLINE | ID: mdl-32371232

ABSTRACT

Laparoscopic liver surgery is challenging to perform because the surgeon's ability to localize subsurface anatomy is compromised by the limited visibility of minimally invasive surgery. While image guidance has the potential to address this barrier, intraoperative factors, such as insufflation and variable degrees of organ mobilization from supporting ligaments, may generate substantial deformation. Navigation ability in terms of searching and tagging within liver views has not been characterized, and current object detection methods do not account for how these features could be applied to liver images. In this research, we propose spatial-pyramid-based searching and tagging of the liver's intraoperative views using a convolutional neural network (SPST-CNN). By exploiting a hybrid combination of an image pyramid at the input and a spatial pyramid pooling layer at the deeper stages of SPST-CNN, we reveal the gains of full-image representations for searching and tagging variably scaled live liver views. SPST-CNN provides pinpoint searching and tagging of intraoperative liver views to obtain up-to-date information about the location and shape of the area of interest. Downsampling the input using an image pyramid enables the SPST-CNN framework to process input images at a diversity of resolutions to achieve scale invariance. We compared the proposed approach to four recent state-of-the-art approaches, and our method achieved a better mAP of up to 85.9%.
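
Spatial pyramid pooling, the ingredient that turns variable-resolution inputs into fixed-length descriptors, can be sketched as follows; the pool sizes are conventional choices, not necessarily the paper's.

import torch
import torch.nn.functional as F

def spp(feat, levels=(1, 2, 4)):
    # Pool the feature map at several grid sizes and concatenate, giving a
    # fixed-length vector regardless of the input spatial dimensions.
    pooled = [F.adaptive_max_pool2d(feat, l).flatten(1) for l in levels]
    return torch.cat(pooled, dim=1)     # B x C*(1 + 4 + 16)

v = spp(torch.randn(1, 8, 37, 53))      # odd input sizes still work
print(v.shape)                          # torch.Size([1, 168])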


Subjects
Image Processing, Computer-Assisted; Neural Networks, Computer; Liver/diagnostic imaging; Liver/surgery
20.
IEEE J Biomed Health Inform ; 24(10): 2833-2843, 2020 10.
Article in English | MEDLINE | ID: mdl-32149700

ABSTRACT

Sleep staging scores the sleep state of a subject into different sleep stages, such as Wake and Rapid Eye Movement (REM). It plays an indispensable role in the diagnosis and treatment of sleep disorders. Because manual sleep staging by well-trained sleep experts is time consuming, tedious, and subjective, many automatic methods have been developed for accurate, efficient, and objective sleep staging. Recently, deep learning based methods have been successfully applied to electroencephalogram (EEG) based sleep staging with promising results. However, most of these methods feed raw EEG signals directly into convolutional neural networks (CNNs) without considering the domain knowledge of EEG staging. In addition, to capture temporal information, most existing methods use recurrent neural networks such as LSTMs (Long Short-Term Memory), which are not effective at modelling global temporal context and are difficult to train. Therefore, inspired by clinical sleep-staging guidelines such as the AASM (American Academy of Sleep Medicine) rules, in which different stages are generally characterized by EEG waveforms of various frequencies, we propose a multi-scale deep architecture that decomposes an EEG signal into different frequency bands as input to the CNNs. To model global temporal context, we use the multi-head self-attention module of the transformer model, which not only improves performance but also shortens training time. In addition, we choose a residual-based architecture, which makes training end-to-end. Experimental results on two widely used sleep staging datasets, the Montreal Archive of Sleep Studies (MASS) and sleep-EDF datasets, demonstrate the effectiveness and significant efficiency (up to 12 times less training time) of our proposed method over the state-of-the-art.
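
The frequency-band decomposition that feeds the multi-scale branches can be sketched with standard band-pass filtering; the band edges below follow common EEG convention and may differ from the paper's exact split.

import numpy as np
from scipy.signal import butter, sosfiltfilt

def bands(eeg, fs=100):
    # Split a raw EEG epoch into the classic frequency bands, each of which
    # could feed its own CNN branch in a multi-scale architecture.
    edges = {"delta": (0.5, 4), "theta": (4, 8),
             "alpha": (8, 13), "beta": (13, 30)}
    out = {}
    for name, (lo, hi) in edges.items():
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        out[name] = sosfiltfilt(sos, eeg)
    return out

epoch = np.random.randn(30 * 100)    # one 30-s epoch sampled at 100 Hz
parts = bands(epoch)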


Subjects
Electroencephalography/methods; Neural Networks, Computer; Signal Processing, Computer-Assisted; Sleep Stages/physiology; Adult; Aged; Databases, Factual; Deep Learning; Female; Humans; Male; Middle Aged; Young Adult