Results 1 - 20 of 22
1.
Med Phys ; 2021 Feb 17.
Article in English | MEDLINE | ID: mdl-33595900

ABSTRACT

PURPOSE: High-performance computed tomography (CT) plays a vital role in clinical decision-making. However, CT image quality is adversely affected by a nonideal focal spot size of the x-ray source, and degrades further as the focal spot enlarges with aging. In this work, we aim to develop a deep learning-based strategy to mitigate this problem so that high-spatial-resolution CT images can be obtained even with a nonideal x-ray source. METHODS: To reconstruct high-quality CT images from blurred sinograms via joint image and sinogram learning, a cross-domain hybrid model is formulated via deep learning into a modularized data-driven reconstruction (MDR) framework. The proposed MDR framework comprises several blocks, all of which share the same network architecture and network parameters. In essence, each block uses two sub-models to simultaneously generate an estimated blur kernel and a high-quality CT image. In this way, the framework produces not only a final high-quality CT image but also a series of intermediate images with gradually improved anatomical detail, enhancing visual perception for clinicians through this dynamic process. We trained the model end-to-end on simulated training datasets and tested it on both simulated and real experimental datasets. RESULTS: On the simulated testing datasets, our approach increases the information fidelity criterion (IFC) by up to 34.2%, the universal quality index (UQI) by up to 20.3%, and the signal-to-noise ratio (SNR) by up to 6.7%, and reduces the root mean square error (RMSE) by up to 10.5%, compared with FBP. Compared with an iterative deconvolution method (NSM), MDR increases IFC by up to 24.7%, UQI by up to 16.7%, and SNR by up to 6.0%, and reduces RMSE by up to 9.4%. In the modulation transfer function (MTF) experiment, our method improves MTF50% by 34.5% and MTF10% by 18.7% compared with FBP, and improves MTF50% by 14.3% and MTF10% by 0.9% compared with NSM. Our method also better resolves the edges of bony structures and other fine structures in experiments using a phantom consisting of a ham and a bottle of peanuts. CONCLUSIONS: A modularized data-driven CT reconstruction framework is established to mitigate the blurring caused by a nonideal x-ray source with a relatively large focal spot. The proposed method enables high-resolution imaging with a less-than-ideal x-ray source.
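A minimal PyTorch sketch of the unrolled, weight-shared block idea described above: each block re-estimates a blur kernel from the current image and refines the image using the residual against the blurred measurement. Layer sizes, the kernel parameterization, and the update rule are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn


class MDRBlock(nn.Module):
    """One weight-shared block: refine the image and re-estimate a blur kernel."""

    def __init__(self, channels: int = 32, kernel_size: int = 15):
        super().__init__()
        # Sub-model 1: estimate a (flattened, normalized) blur kernel from the current image.
        self.kernel_net = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, kernel_size * kernel_size), nn.Softmax(dim=1),
        )
        # Sub-model 2: refine the image given (image, residual) pairs.
        self.image_net = nn.Sequential(
            nn.Conv2d(2, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )
        self.kernel_size = kernel_size

    def forward(self, image, blurred):
        b = image.shape[0]
        k = self.kernel_net(image).view(b, 1, self.kernel_size, self.kernel_size)
        # Re-blur the current estimate with the estimated kernel (one kernel per sample).
        reblurred = torch.cat([
            nn.functional.conv2d(image[i:i + 1], k[i:i + 1], padding=self.kernel_size // 2)
            for i in range(b)
        ])
        residual = blurred - reblurred
        refined = image + self.image_net(torch.cat([image, residual], dim=1))
        return refined, k


# Unrolled framework: the same block (shared weights) is applied several times,
# yielding a sequence of intermediate images with gradually restored detail.
block = MDRBlock()
blurred = torch.randn(2, 1, 64, 64)
image, kernels = blurred, []
for _ in range(4):
    image, k = block(image, blurred)
    kernels.append(k)
```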

2.
IEEE Trans Med Imaging ; PP, 2021 Jan 28.
Article in English | MEDLINE | ID: mdl-33507867

ABSTRACT

To better understand early brain development in health and disorder, it is critical to accurately segment infant brain magnetic resonance (MR) images into white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF). Deep learning-based methods have achieved state-of-the-art performance; however, a major limitation is that they may suffer from the multi-site issue: models trained on a dataset from one site may not be applicable to datasets acquired at other sites with different imaging protocols or scanners. To promote methodological development in the community, the iSeg-2019 challenge (http://iseg2019.web.unc.edu) provides a set of 6-month-old infant subjects from multiple sites with different protocols/scanners to the participating methods. Training/validation subjects are from UNC (MAP), and testing subjects are from UNC/UMN (BCP), Stanford University, and Emory University. By the time of writing, 30 automatic segmentation methods had participated in iSeg-2019. In this article, we review the 8 top-ranked methods by detailing their pipelines/implementations, presenting experimental results, and evaluating performance across different sites in terms of the whole brain, regions of interest, and gyral landmark curves. We further point out their limitations and possible directions for addressing the multi-site issue. We find that multi-site consistency remains an open problem. We hope that the multi-site dataset in iSeg-2019 and this review article will attract more researchers to address this challenging and critical issue in practice.
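The cross-site comparison above relies on the Dice similarity coefficient computed per tissue class and per site. A minimal sketch, assuming integer label maps with a hypothetical encoding (CSF=1, GM=2, WM=3):

```python
import numpy as np


def dice(pred: np.ndarray, truth: np.ndarray, label: int) -> float:
    """Dice similarity coefficient for one tissue label (e.g., CSF=1, GM=2, WM=3)."""
    p, t = (pred == label), (truth == label)
    denom = p.sum() + t.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(p, t).sum() / denom


def evaluate_site(preds, truths, labels=(1, 2, 3)):
    """Average Dice per tissue over all subjects of one site."""
    scores = {lab: [] for lab in labels}
    for pred, truth in zip(preds, truths):
        for lab in labels:
            scores[lab].append(dice(pred, truth, lab))
    return {lab: float(np.mean(v)) for lab, v in scores.items()}
```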

3.
IEEE Trans Med Imaging ; PP, 2020 Nov 17.
Article in English | MEDLINE | ID: mdl-33201808

ABSTRACT

Annotation scarcity is a long-standing problem in medical image analysis. To leverage limited annotations efficiently, semi-supervised learning additionally exploits abundant unlabeled data, while domain adaptation exploits well-established cross-modality data. In this paper, we explore the feasibility of concurrently leveraging both unlabeled data and cross-modality data for annotation-efficient cardiac segmentation. To this end, we propose a semi-supervised domain adaptation framework, namely Dual-Teacher++. Besides directly learning from limited labeled target-domain data (e.g., CT) via a student model, as in previous literature, we design novel dual teacher models: an inter-domain teacher that explores cross-modality priors from the source domain (e.g., MR), and an intra-domain teacher that investigates the knowledge beneath the unlabeled target domain. The dual teachers transfer the acquired inter- and intra-domain knowledge to the student model for further integration and exploitation. Moreover, to encourage reliable dual-domain knowledge transfer, we strengthen inter-domain transfer on samples with higher similarity to the target domain after appearance alignment, and strengthen intra-domain transfer on unlabeled target data with higher prediction confidence. In this way, the student model obtains reliable dual-domain knowledge and yields improved performance on target-domain data. We extensively evaluated our method on the MM-WHS 2017 challenge dataset. The experiments demonstrate the superiority of our framework over other semi-supervised learning and domain adaptation methods. Moreover, performance gains are obtained in both directions, i.e., adapting from MR to CT and from CT to MR. Our code will be available at https://github.com/kli-lalala/Dual-Teacher-.
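A minimal sketch of one training step in the dual-teacher spirit: supervised loss on labeled target data, pseudo-labels from an inter-domain teacher on appearance-aligned images (the alignment network is assumed to exist elsewhere), and confidence-gated consistency with an EMA intra-domain teacher. All names, thresholds, and loss weights are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F


def ema_update(teacher, student, alpha: float = 0.99):
    """Intra-domain teacher tracks the student with an exponential moving average."""
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(alpha).add_(s, alpha=1.0 - alpha)


def dual_teacher_step(student, inter_teacher, intra_teacher,
                      x_lab, y_lab, x_src_like, x_unl, conf_thresh=0.8):
    # 1) Supervised loss on the few labeled target-domain (e.g., CT) samples.
    sup = F.cross_entropy(student(x_lab), y_lab)

    # 2) Inter-domain knowledge: pseudo-labels from a teacher trained on the source
    #    modality (e.g., MR), applied to appearance-aligned, source-like images.
    with torch.no_grad():
        p_inter = F.softmax(inter_teacher(x_src_like), dim=1)
    inter = F.cross_entropy(student(x_src_like), p_inter.argmax(1))

    # 3) Intra-domain knowledge: consistency with the EMA teacher on unlabeled
    #    target data, kept only where the teacher is confident.
    with torch.no_grad():
        p_intra = F.softmax(intra_teacher(x_unl), dim=1)
    conf, _ = p_intra.max(dim=1)
    mask = (conf > conf_thresh).float()
    intra = (F.mse_loss(F.softmax(student(x_unl), dim=1), p_intra,
                        reduction="none").mean(dim=1) * mask).mean()

    return sup + inter + intra
```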

4.
Med Phys ; 2020 Oct 04.
Article in English | MEDLINE | ID: mdl-33012016

ABSTRACT

PURPOSE: Contouring intraprostatic lesions is a prerequisite for dose-escalating these lesions in radiotherapy to improve local cancer control. In this study, a deep learning-based approach was developed for automatic intraprostatic lesion segmentation in multiparametric magnetic resonance imaging (mpMRI), with a view to clinical use. METHODS: mpMRI images from 136 patient cases were collected at our institution; all cases contained suspicious lesions with a Prostate Imaging Reporting and Data System (PI-RADS) score ≥ 4. Contours of the lesion and prostate were manually created on axial T2-weighted (T2W), apparent diffusion coefficient (ADC), and high b-value diffusion-weighted imaging (DWI) images to provide the ground truth. A multiple-branch UNet (MB-UNet) was then proposed for segmenting this indistinct target in multi-modality MRI. The encoder module was designed with three branches, one per MRI modality, to fully extract the high-level features provided by the different modalities; an input module used three sub-branches for three consecutive image slices, to account for contour consistency across slices; and a deep supervision strategy was integrated into the network to speed up convergence and improve performance. The network outputs probability maps of background, normal prostate, and lesion to generate the lesion segmentation, and performance was evaluated with the Dice similarity coefficient (DSC) as the main metric. RESULTS: A total of 162 lesions were contoured on 652 image slices, with 119 lesions in the peripheral zone, 38 in the transition zone, four in the central zone, and one in the anterior fibromuscular stroma. All prostates were also contoured, on 1,264 image slices. For lesion segmentation on the testing set, MB-UNet achieved a per-case DSC of 0.6333, specificity of 0.9993, and sensitivity of 0.7056, and a global DSC of 0.7205, specificity of 0.9993, and sensitivity of 0.7409. All three deep learning strategies adopted in this study contributed to the performance of MB-UNet. Omitting the DWI modality degraded segmentation performance more markedly than omitting either of the other two modalities. CONCLUSIONS: A deep learning-based approach using the proposed MB-UNet was developed to automatically segment suspicious lesions in mpMRI. This study makes it more feasible to adopt dose boosting of intraprostatic lesions in clinical practice to achieve better outcomes.
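A minimal sketch of the multi-branch encoder idea: one branch per modality, each fed three consecutive slices, fused into a shared decoder with an auxiliary deeply supervised head. The depth, channel widths, and fusion scheme are illustrative assumptions; the real MB-UNet is a full UNet.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))


class MBUNetSketch(nn.Module):
    """Three encoder branches (T2W / ADC / DWI), each fed 3 consecutive slices,
    fused into one decoder with an auxiliary deeply supervised output."""

    def __init__(self, n_classes=3, width=32):
        super().__init__()
        self.branches = nn.ModuleList([conv_block(3, width) for _ in range(3)])
        self.down = nn.Sequential(nn.MaxPool2d(2), conv_block(3 * width, 2 * width))
        self.up = conv_block(2 * width, width)
        self.head = nn.Conv2d(width, n_classes, 1)
        self.aux_head = nn.Conv2d(2 * width, n_classes, 1)   # deep supervision head

    def forward(self, t2w, adc, dwi):
        feats = [b(x) for b, x in zip(self.branches, (t2w, adc, dwi))]
        fused = torch.cat(feats, dim=1)
        deep = self.down(fused)
        up = self.up(F.interpolate(deep, scale_factor=2, mode="bilinear",
                                   align_corners=False))
        return self.head(up), self.aux_head(deep)            # main + auxiliary logits


net = MBUNetSketch()
t2w = adc = dwi = torch.randn(1, 3, 128, 128)                # 3 consecutive slices each
main_logits, aux_logits = net(t2w, adc, dwi)
```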

5.
IEEE Trans Med Imaging ; PP, 2020 Sep 21.
Article in English | MEDLINE | ID: mdl-32956044

ABSTRACT

Computed tomography (CT) is widely used for medical diagnosis, assessment, and therapy planning and guidance. In practice, CT images can be adversely affected by the presence of metallic objects, which lead to severe metal artifacts and can compromise clinical diagnosis or dose calculation in radiation therapy. In this paper, we propose a generalizable framework for metal artifact reduction (MAR) that simultaneously leverages the advantages of image-domain and sinogram-domain MAR techniques. We formulate our framework as a sinogram completion problem and train a neural network (SinoNet) to restore the metal-affected projections. To improve the continuity of the completed projections at the boundary of the metal trace, and thus alleviate new artifacts in the reconstructed CT images, we train another neural network (PriorNet) to generate a good prior image to guide sinogram learning, and further design a novel residual sinogram learning strategy to effectively utilize the prior image information for better sinogram completion. The two networks are jointly trained end-to-end with a differentiable forward projection (FP) operation, so that prior image generation and deep sinogram completion benefit from each other. Finally, artifact-reduced CT images are reconstructed by filtered back-projection (FBP) from the completed sinogram. Extensive experiments on simulated and real artifact data demonstrate that our method produces superior artifact-reduced results while preserving anatomical structures, and outperforms other MAR methods.
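A rough sketch of the pipeline's data flow, assuming tiny stand-in CNNs for PriorNet/SinoNet and a crude rotate-and-sum parallel-beam projector in place of the paper's differentiable FP operator. It only illustrates residual sinogram learning and metal-trace replacement, not the authors' networks or geometry.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def forward_project(img, angles):
    """Crude differentiable parallel-beam projection: rotate, then sum columns."""
    b = img.shape[0]
    views = []
    for theta in angles:
        cos, sin = torch.cos(theta), torch.sin(theta)
        rot = torch.stack([torch.stack([cos, -sin, torch.zeros(())]),
                           torch.stack([sin, cos, torch.zeros(())])]).unsqueeze(0).expand(b, 2, 3)
        grid = F.affine_grid(rot, img.shape, align_corners=False)
        views.append(F.grid_sample(img, grid, align_corners=False).sum(dim=2))
    return torch.stack(views, dim=2)          # (B, 1, n_angles, detector_bins)


class TinyCNN(nn.Module):
    def __init__(self, cin, cout):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(cin, 32, 3, padding=1), nn.ReLU(inplace=True),
                                 nn.Conv2d(32, cout, 3, padding=1))

    def forward(self, x):
        return self.net(x)


prior_net = TinyCNN(1, 1)     # stands in for PriorNet (image domain)
sino_net = TinyCNN(2, 1)      # stands in for SinoNet (sinogram domain)
angles = torch.linspace(0.0, 3.141592653589793, 60)

metal_ct = torch.randn(1, 1, 64, 64)               # artifact-affected reconstruction
sino_in = torch.randn(1, 1, 60, 64)                # measured sinogram
trace = (torch.rand_like(sino_in) > 0.9).float()   # metal-trace mask

prior_img = prior_net(metal_ct)                    # prior image guiding completion
prior_sino = forward_project(prior_img, angles)
# Residual sinogram learning: predict a correction on top of the prior sinogram,
# then keep the measured data everywhere outside the metal trace.
residual = sino_net(torch.cat([sino_in, prior_sino], dim=1))
completed = trace * (prior_sino + residual) + (1.0 - trace) * sino_in
```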

6.
IEEE Trans Med Imaging ; 39(12): 4237-4248, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32776876

ABSTRACT

Deep convolutional neural networks have significantly boosted the performance of fundus image segmentation when test datasets share the distribution of the training datasets. However, in clinical practice, medical images often exhibit variations in appearance for various reasons, e.g., different scanner vendors and image quality. Such distribution discrepancies can lead deep networks to overfit the training datasets and lack generalization ability on unseen test datasets. To alleviate this issue, we present a novel Domain-oriented Feature Embedding (DoFE) framework that improves the generalization ability of CNNs on unseen target domains by exploiting knowledge from multiple source domains. Our DoFE framework dynamically enriches the image features with additional domain prior knowledge learned from multi-source domains to make the semantic features more discriminative. Specifically, we introduce a Domain Knowledge Pool to learn and memorize the prior information extracted from multi-source domains. The original image features are then augmented with domain-oriented aggregated features, which are induced from the knowledge pool based on the similarity between the input image and multi-source domain images. We further design a novel domain code prediction branch to infer this similarity and employ an attention-guided mechanism to dynamically combine the aggregated features with the semantic features. We comprehensively evaluate our DoFE framework on two fundus image segmentation tasks, optic cup/disc segmentation and vessel segmentation. Our DoFE framework produces satisfactory segmentation results on unseen datasets and surpasses other domain generalization and network regularization methods.
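A minimal sketch of the knowledge-pool idea: a learnable per-domain prior, a predicted domain-similarity code that aggregates it, and an attention gate that mixes the aggregated prior back into the semantic features. Pool size, gating form, and where the module sits in the network are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DoFEHead(nn.Module):
    """Augment semantic features with a domain-oriented aggregation of a learned
    knowledge pool, weighted by a predicted domain-similarity code."""

    def __init__(self, channels=64, n_domains=3):
        super().__init__()
        self.pool = nn.Parameter(torch.randn(n_domains, channels))   # Domain Knowledge Pool
        self.domain_code = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                         nn.Linear(channels, n_domains))
        self.attn = nn.Sequential(nn.Linear(2 * channels, channels), nn.Sigmoid())

    def forward(self, feat):                                  # feat: (B, C, H, W)
        code = F.softmax(self.domain_code(feat), dim=1)       # similarity to each source domain
        agg = code @ self.pool                                # (B, C) aggregated domain prior
        gap = feat.mean(dim=(2, 3))
        gate = self.attn(torch.cat([gap, agg], dim=1))        # attention-guided combination
        aug = agg * gate
        return feat + aug.unsqueeze(-1).unsqueeze(-1)         # domain-enriched features


head = DoFEHead()
enriched = head(torch.randn(2, 64, 32, 32))
```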

7.
IEEE Trans Med Imaging ; 39(11): 3429-3440, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32746096

ABSTRACT

Training deep neural networks usually requires a large amount of labeled data to obtain good performance. However, in medical image analysis, obtaining high-quality labels is laborious and expensive, as accurately annotating medical images demands the expert knowledge of clinicians. In this paper, we present a novel relation-driven semi-supervised framework for medical image classification. It is a consistency-based method that exploits unlabeled data by encouraging consistent predictions for a given input under perturbations, and it leverages a self-ensembling model to produce high-quality consistency targets for the unlabeled data. Considering that human diagnosis often refers to previous analogous cases to make reliable decisions, we introduce a novel sample relation consistency (SRC) paradigm to effectively exploit unlabeled data by modeling the relationship information among different samples. Unlike existing consistency-based methods, which simply enforce consistency of individual predictions, our framework explicitly enforces consistency of the semantic relations among different samples under perturbations, encouraging the model to explore additional semantic information from unlabeled data. We have conducted extensive experiments to evaluate our method on two public benchmark medical image classification datasets: skin lesion diagnosis from the ISIC 2018 challenge and thorax disease classification on ChestX-ray14. Our method outperforms many state-of-the-art semi-supervised learning methods in both single-label and multi-label image classification scenarios.
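A minimal sketch of sample relation consistency: compare the Gram (pairwise-similarity) matrices of a batch's features under two perturbations, rather than the individual predictions. The feature dimension and loss form are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def sample_relation_consistency(feat_student, feat_teacher):
    """Enforce consistency of the *relation* (Gram) matrix among samples in a batch,
    rather than of individual predictions."""
    def relation(f):
        f = F.normalize(f.flatten(1), dim=1)       # (B, D) unit-normalized activations
        return f @ f.t()                           # (B, B) pairwise similarities
    return F.mse_loss(relation(feat_student), relation(feat_teacher))


# Usage: features of the same unlabeled batch under two perturbations
# (student vs. self-ensembling/EMA teacher).
fs, ft = torch.randn(8, 256), torch.randn(8, 256)
loss = sample_relation_consistency(fs, ft)
```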

8.
IEEE Trans Med Imaging ; 39(11): 3583-3594, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32746106

ABSTRACT

Deep learning approaches have demonstrated remarkable progress in automatic chest X-ray analysis. The data-driven nature of deep models requires training data that cover a large distribution; it is therefore essential to integrate knowledge from multiple datasets, especially for medical images. However, learning a disease classification model with extra chest X-ray (CXR) data remains challenging. Recent studies have demonstrated that a performance bottleneck exists in joint training on different CXR datasets, and few efforts have been made to address this obstacle. In this paper, we argue that incorporating an external CXR dataset leads to imperfect training data, which raises the challenges. Specifically, the imperfection is twofold: domain discrepancy, as image appearance varies across datasets, and label discrepancy, as different datasets are only partially labeled. To this end, we formulate multi-label thoracic disease classification as weighted independent binary tasks according to the categories. For common categories shared across domains, we adopt task-specific adversarial training to alleviate feature differences. For categories that exist in only a single dataset, we present uncertainty-aware temporal ensembling of model predictions to further mine the information from the missing labels. In this way, our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability. We conduct extensive experiments on three datasets with more than 360,000 chest X-ray images. Our method outperforms other competing models and sets state-of-the-art performance on the official NIH test set with 0.8349 AUC, demonstrating the effectiveness of utilizing an external dataset to improve the internal classification.
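A minimal sketch of the label-discrepancy part: supervised binary losses only where a dataset actually provides labels, plus an uncertainty-gated temporal-ensembling target for the missing categories. The masking, threshold, and loss weights are illustrative assumptions; the adversarial domain branch is omitted.

```python
import torch
import torch.nn.functional as F


def partially_labeled_loss(logits, labels, label_mask, ema_logits, uncertainty_thresh=0.2):
    """Multi-label CXR loss with missing labels: supervised BCE where labels exist,
    and a temporal-ensembling target, gated by prediction uncertainty, where they don't."""
    prob = torch.sigmoid(logits)
    target_prob = torch.sigmoid(ema_logits).detach()          # temporally ensembled predictions

    # Supervised part: only categories actually annotated in this sample's dataset.
    sup = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    sup = (sup * label_mask).sum() / label_mask.sum().clamp(min=1)

    # Unsupervised part: for missing labels, regress toward the ensembled target,
    # but only where the target is confident (far from 0.5).
    uncertainty = 1.0 - 2.0 * (target_prob - 0.5).abs()
    confident = (uncertainty < uncertainty_thresh).float() * (1.0 - label_mask)
    unsup = (F.mse_loss(prob, target_prob, reduction="none") * confident).sum() \
            / confident.sum().clamp(min=1)
    return sup + unsup


logits = torch.randn(4, 14)                         # 14 thoracic disease categories
ema_logits = torch.randn(4, 14)                     # running average of past predictions
labels = torch.randint(0, 2, (4, 14)).float()
label_mask = torch.randint(0, 2, (4, 14)).float()   # 1 where the dataset provides this label
loss = partially_labeled_loss(logits, labels, label_mask, ema_logits)
```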

9.
IEEE Trans Med Imaging ; 39(12): 4023-4033, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32746140

ABSTRACT

Automatic diagnosis of retinal diseases from fundus images is important to support clinical decision-making. However, developing such automatic solutions is challenging because of the large amount of human-annotated data required. Recently, unsupervised/self-supervised feature learning techniques have received considerable attention, as they do not need massive annotations. Most current self-supervised methods operate on a single imaging modality, and no existing method exploits multi-modal images for better results. Considering that the diagnosis of various vitreoretinal diseases can greatly benefit from another imaging modality, e.g., fundus fluorescein angiography (FFA), this paper presents a novel self-supervised feature learning method that effectively exploits multi-modal data for retinal disease diagnosis. To achieve this, we first synthesize the corresponding FFA modality and then formulate a patient feature-based softmax embedding objective. Our objective learns both modality-invariant features and patient-similarity features. Through this mechanism, the neural network captures the semantically shared information across different modalities and the apparent visual similarity between patients. We evaluate our method on two public benchmark datasets for retinal disease diagnosis. The experimental results demonstrate that our method clearly outperforms other self-supervised feature learning methods and is comparable to the supervised baseline. Our code is available on GitHub.
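A minimal sketch of a patient-level softmax embedding objective: each fundus feature should be closest to the same patient's (synthesized) FFA feature among all patients in the batch. The temperature, feature dimension, and exact positive/negative construction are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def patient_softmax_embedding(feat_cfp, feat_ffa, temperature=0.1):
    """Softmax embedding over patients: row i of the similarity matrix should peak at
    column i (the same patient's other-modality feature), encouraging
    modality-invariant, patient-discriminative representations."""
    z1 = F.normalize(feat_cfp, dim=1)
    z2 = F.normalize(feat_ffa, dim=1)
    logits = z1 @ z2.t() / temperature                 # (B, B) patient-to-patient similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)


loss = patient_softmax_embedding(torch.randn(16, 128), torch.randn(16, 128))
```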

10.
Med Image Anal ; 65: 101766, 2020 10.
Article in English | MEDLINE | ID: mdl-32623276

ABSTRACT

Although deep learning-based approaches have achieved great success in medical image segmentation, they usually require large amounts of well-annotated data, which can be extremely expensive to obtain in medical image analysis. Unlabeled data, on the other hand, are much easier to acquire. Semi-supervised learning and unsupervised domain adaptation both take advantage of unlabeled data, and the two tasks are closely related. In this paper, we propose uncertainty-aware multi-view co-training (UMCT), a unified framework that addresses both tasks for volumetric medical image segmentation. Our framework efficiently utilizes unlabeled data for better performance. We first rotate and permute the 3D volumes into multiple views and train a 3D deep network on each view. We then apply co-training by enforcing multi-view consistency on unlabeled data, where an uncertainty estimate for each view is used to achieve accurate labeling. Experiments on the NIH pancreas segmentation dataset and a multi-organ segmentation dataset show state-of-the-art performance of the proposed framework for semi-supervised medical image segmentation. Under unsupervised domain adaptation settings, we validate the effectiveness of this work by adapting our multi-organ segmentation model to two pathological organs from the Medical Segmentation Decathlon datasets. Additionally, we show that our UMCT-DA model can effectively handle the challenging situation where labeled source data are inaccessible, demonstrating strong potential for real-world applications.
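A minimal sketch of the uncertainty-weighted co-training target: for each view, build a pseudo-label from the *other* views, weighted by a simple confidence proxy (negative predictive entropy). The uncertainty estimator, number of views, and loss form are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def view_confidence(prob, eps=1e-8):
    """Per-voxel confidence as negative predictive entropy (a simple uncertainty proxy)."""
    return (prob * torch.log(prob + eps)).sum(dim=1, keepdim=True)


def umct_pseudo_label(probs):
    """For each view, build a confidence-weighted pseudo-label from the other views."""
    targets = []
    for i in range(len(probs)):
        others = [p for j, p in enumerate(probs) if j != i]
        conf = torch.stack([view_confidence(p) for p in others])   # (V-1, B, 1, D, H, W)
        weights = F.softmax(conf, dim=0)
        targets.append((weights * torch.stack(others)).sum(dim=0).detach())
    return targets


# Three "views" of the same unlabeled volume (e.g., axial/coronal/sagittal permutations),
# assumed already mapped back to a common orientation.
probs = [F.softmax(torch.randn(2, 4, 16, 32, 32), dim=1) for _ in range(3)]
pseudo = umct_pseudo_label(probs)
cotrain_loss = sum(F.mse_loss(p, t) for p, t in zip(probs, pseudo)) / len(probs)
```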

11.
Article in English | MEDLINE | ID: mdl-32479407

ABSTRACT

A common shortfall of supervised deep learning for medical imaging is the lack of labeled data, which is often expensive and time-consuming to collect. This article presents a new semi-supervised method for medical image segmentation, in which the network is optimized by a weighted combination of a common supervised loss on the labeled inputs only and a regularization loss on both labeled and unlabeled data. To utilize the unlabeled data, our method encourages consistent predictions of the network-in-training for the same input under different perturbations. For the semi-supervised segmentation tasks, we introduce a transformation-consistent strategy into the self-ensembling model to enhance the regularization effect for pixel-level predictions. To further improve the regularization, we extend the transformation to a more general form, including scaling, and optimize the consistency loss with a teacher model whose weights are an average of the student model weights. We extensively validated the proposed semi-supervised method on three typical yet challenging medical image segmentation tasks: 1) skin lesion segmentation from dermoscopy images in the International Skin Imaging Collaboration (ISIC) 2017 dataset; 2) optic disc (OD) segmentation from fundus images in the Retinal Fundus Glaucoma Challenge (REFUGE) dataset; and 3) liver segmentation from volumetric CT scans in the Liver Tumor Segmentation Challenge (LiTS) dataset. Compared with the state of the art, our method shows superior performance on these challenging 2D/3D medical images, demonstrating the effectiveness of our semi-supervised method for medical image segmentation.
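A minimal sketch of transformation consistency with a mean-teacher setup, using 90-degree rotations as the transformation: segmenting a rotated input should match rotating the (EMA) teacher's segmentation of the original. Restricting to rotations and using an MSE on softmaxed outputs are simplifying assumptions.

```python
import torch
import torch.nn.functional as F


def transformation_consistency_loss(student, teacher, x_unlabeled, k: int = None):
    """Transformation-consistent regularization: segment(transform(x)) should equal
    transform(segment(x)), with the teacher being an EMA of the student."""
    if k is None:
        k = int(torch.randint(0, 4, (1,)))             # random multiple of 90 degrees
    x_rot = torch.rot90(x_unlabeled, k, dims=(2, 3))
    student_pred = student(x_rot)                      # segment(transform(x))
    with torch.no_grad():
        teacher_pred = torch.rot90(teacher(x_unlabeled), k, dims=(2, 3))   # transform(segment(x))
    return F.mse_loss(F.softmax(student_pred, dim=1), F.softmax(teacher_pred, dim=1))


# Usage with simple stand-in models (any dense-prediction network works).
student = torch.nn.Conv2d(3, 2, 3, padding=1)
teacher = torch.nn.Conv2d(3, 2, 3, padding=1)
loss = transformation_consistency_loss(student, teacher, torch.randn(2, 3, 64, 64))
```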

12.
IEEE Trans Med Imaging ; 39(9): 2713-2724, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32078543

ABSTRACT

Automated prostate segmentation in MRI is in high demand for computer-assisted diagnosis. Recently, a variety of deep learning methods have achieved remarkable progress in this task, usually relying on large amounts of training data. Given the scarcity of medical images, it is important to effectively aggregate data from multiple sites for robust model training, to alleviate the insufficiency of single-site samples. However, prostate MRIs from different sites are heterogeneous owing to differences in scanners and imaging protocols, which raises challenges for effectively aggregating multi-site data for network training. In this paper, we propose a novel multi-site network (MS-Net) that improves prostate segmentation by learning robust representations from multiple sources of data. To compensate for the inter-site heterogeneity of different MRI datasets, we develop Domain-Specific Batch Normalization layers in the network backbone, enabling the network to estimate statistics and perform feature normalization for each site separately. Considering the difficulty of capturing the knowledge shared across datasets, a novel learning paradigm, Multi-site-guided Knowledge Transfer, is proposed to enhance the kernels so that they extract more generic representations from multi-site data. Extensive experiments on three heterogeneous prostate MRI datasets demonstrate that our MS-Net consistently improves performance across all datasets and outperforms state-of-the-art methods for multi-site learning.
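A minimal sketch of domain-specific batch normalization: shared convolution weights, but one BatchNorm instance per site so that each site's statistics are estimated separately. The layer placement and number of domains are illustrative assumptions.

```python
import torch
import torch.nn as nn


class DomainSpecificBN2d(nn.Module):
    """One BatchNorm layer per site: shared convolution kernels learn generic
    representations, while per-site BN absorbs inter-site statistics."""

    def __init__(self, num_features: int, num_domains: int):
        super().__init__()
        self.bns = nn.ModuleList([nn.BatchNorm2d(num_features) for _ in range(num_domains)])

    def forward(self, x: torch.Tensor, domain: int) -> torch.Tensor:
        return self.bns[domain](x)


# Usage: the same conv weights, different normalization statistics per site.
conv = nn.Conv2d(1, 16, 3, padding=1)
dsbn = DomainSpecificBN2d(16, num_domains=3)
x_site0, x_site2 = torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64)
y0 = dsbn(conv(x_site0), domain=0)
y2 = dsbn(conv(x_site2), domain=2)
```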

13.
IEEE Trans Med Imaging ; 39(5): 1483-1493, 2020 05.
Article in English | MEDLINE | ID: mdl-31714219

ABSTRACT

Diabetic retinopathy (DR) and diabetic macular edema (DME) are the leading causes of permanent blindness in the working-age population. Automatic grading of DR and DME helps ophthalmologists design tailored treatments for patients and is therefore of vital importance in clinical practice. However, prior works grade either DR or DME and ignore the correlation between DR and its complication, DME. Moreover, location information, e.g., macula and soft/hard exudate annotations, is widely used as a prior for grading. Such annotations are costly to obtain, so it is desirable to develop automatic grading methods with only image-level supervision. In this article, we present a novel cross-disease attention network (CANet) that jointly grades DR and DME by exploring the internal relationship between the diseases with only image-level supervision. Our key contributions include a disease-specific attention module that selectively learns useful features for each disease, and a disease-dependent attention module that further captures the internal relationship between the two diseases. We integrate these two attention modules in a deep network to produce disease-specific and disease-dependent features, and to jointly maximize the overall performance of grading DR and DME. We evaluate our network on two public benchmark datasets, the ISBI 2018 IDRiD challenge dataset and the Messidor dataset. Our method achieves the best result on the ISBI 2018 IDRiD challenge dataset and outperforms other methods on the Messidor dataset. Our code is publicly available at https://github.com/xmengli999/CANet.
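A minimal sketch of the two-attention idea: a channel-attention block per disease (disease-specific), plus a second block that lets each disease branch attend to the other branch's features (disease-dependent). The attention form (SE-style channel gating), grade counts, and fusion by concatenation are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
                                nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                          # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))
        return x * w.unsqueeze(-1).unsqueeze(-1)


class CrossDiseaseHead(nn.Module):
    """Disease-specific attention selects features per disease; disease-dependent
    attention lets the DME branch exploit the DR branch's features, and vice versa."""

    def __init__(self, channels=64, dr_grades=5, dme_grades=3):
        super().__init__()
        self.dr_attn = ChannelAttention(channels)       # disease-specific (DR)
        self.dme_attn = ChannelAttention(channels)      # disease-specific (DME)
        self.dr_from_dme = ChannelAttention(channels)   # disease-dependent
        self.dme_from_dr = ChannelAttention(channels)   # disease-dependent
        self.dr_fc = nn.Linear(2 * channels, dr_grades)
        self.dme_fc = nn.Linear(2 * channels, dme_grades)

    def forward(self, feat):                       # shared backbone feature map: (B, C, H, W)
        dr, dme = self.dr_attn(feat), self.dme_attn(feat)
        dr_dep, dme_dep = self.dr_from_dme(dme), self.dme_from_dr(dr)
        dr_logits = self.dr_fc(torch.cat([dr.mean(dim=(2, 3)), dr_dep.mean(dim=(2, 3))], dim=1))
        dme_logits = self.dme_fc(torch.cat([dme.mean(dim=(2, 3)), dme_dep.mean(dim=(2, 3))], dim=1))
        return dr_logits, dme_logits


dr_logits, dme_logits = CrossDiseaseHead()(torch.randn(2, 64, 16, 16))
```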

14.
Med Image Anal ; 58: 101549, 2019 12.
Article in English | MEDLINE | ID: mdl-31499320

ABSTRACT

Whole-slide histopathology images (WSIs) play a critical role in gastric cancer diagnosis. However, because of the large scale of WSIs and the varying sizes of abnormal areas, selecting informative regions and analyzing them is quite challenging during automatic diagnosis. Multi-instance learning based on the most discriminative instances can be of great benefit for whole-slide gastric image diagnosis. In this paper, we design a recalibrated multi-instance deep learning method (RMDL) to address this challenging problem. We first select the discriminative instances and then use them to diagnose disease with the proposed RMDL approach. The designed RMDL network captures instance-wise dependencies and recalibrates instance features according to importance coefficients learned from the fused features. Furthermore, we build a large whole-slide gastric histopathology image dataset with detailed pixel-level annotations. Experimental results on the constructed gastric dataset demonstrate a significant improvement in accuracy for our proposed framework compared with other state-of-the-art multi-instance learning methods. Moreover, our method is general and can be extended to diagnosis tasks for other cancer types based on WSIs.
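A minimal sketch of instance recalibration: importance coefficients are computed from each instance feature combined with a fused (bag-level) feature, and used to reweight the instances before slide-level classification. The scoring network, fusion by mean pooling, and feature sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn


class InstanceRecalibration(nn.Module):
    """Recalibrate instance (image-region) features with importance coefficients
    learned from the fused bag-level feature, before bag-level classification."""

    def __init__(self, dim=256, n_classes=2):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim // 4), nn.ReLU(inplace=True),
                                   nn.Linear(dim // 4, 1))
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, instances):                  # (B, N, D): N discriminative instances per slide
        fused = instances.mean(dim=1, keepdim=True)                   # bag-level context
        coeff = torch.softmax(self.score(instances + fused), dim=1)   # importance per instance
        bag = (coeff * instances).sum(dim=1)                          # recalibrated bag feature
        return self.classifier(bag), coeff.squeeze(-1)


model = InstanceRecalibration()
logits, weights = model(torch.randn(2, 8, 256))    # 8 selected instances per WSI
```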


Subjects
Deep Learning , Stomach Neoplasms/pathology , Calibration , Datasets as Topic , Diagnosis, Differential , Humans , Staining and Labeling , Stomach Neoplasms/diagnostic imaging
15.
IEEE Trans Med Imaging ; 38(11): 2485-2495, 2019 11.
Article in English | MEDLINE | ID: mdl-30794170

ABSTRACT

Glaucoma is a leading cause of irreversible blindness. Accurate segmentation of the optic disc (OD) and optic cup (OC) from fundus images is beneficial to glaucoma screening and diagnosis. Recently, convolutional neural networks have demonstrated promising progress in joint OD and OC segmentation. However, affected by the domain shift among different datasets, deep networks are severely hindered in generalizing across different scanners and institutions. In this paper, we present a novel patch-based output space adversarial learning framework (pOSAL) to jointly and robustly segment the OD and OC from different fundus image datasets. We first devise a lightweight and efficient segmentation network as a backbone. Considering the specific morphology of the OD and OC, a novel morphology-aware segmentation loss is proposed to guide the network to generate accurate and smooth segmentations. Our pOSAL framework then exploits unsupervised domain adaptation to address the domain shift challenge by encouraging the segmentation in the target domain to be similar to that in the source domain. Since a whole-segmentation-based adversarial loss is insufficient to drive the network to capture segmentation details, we further design pOSAL in a patch-based fashion to enable fine-grained discrimination on local segmentation details. We extensively evaluate the pOSAL framework and demonstrate its effectiveness in improving segmentation performance on three public retinal fundus image datasets, Drishti-GS, RIM-ONE-r3, and REFUGE. Furthermore, our pOSAL framework achieved first place in the OD and OC segmentation tasks of the MICCAI 2018 Retinal Fundus Glaucoma Challenge.
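A minimal sketch of the patch-based output-space discriminator: a small fully convolutional network that classifies each spatial patch of the segmentation probability maps as source-like or target-like, giving the segmenter a fine-grained adversarial signal. Layer widths and the adversarial loss form are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PatchDiscriminator(nn.Module):
    """Patch-level discriminator on segmentation probability maps (output space):
    each spatial location is classified as source-like or target-like."""

    def __init__(self, in_channels=2, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, width, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(width, 2 * width, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(2 * width, 1, 4, stride=2, padding=1),   # patch map of real/fake logits
        )

    def forward(self, prob_map):
        return self.net(prob_map)


disc = PatchDiscriminator()
tgt_prob = torch.sigmoid(torch.randn(2, 2, 128, 128))   # OD/OC probability maps, target domain
# Adversarial loss for the segmenter: make target outputs look source-like, patch by patch.
tgt_out = disc(tgt_prob)
adv = F.binary_cross_entropy_with_logits(tgt_out, torch.ones_like(tgt_out))
```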


Subjects
Diagnostic Techniques, Ophthalmological , Image Interpretation, Computer-Assisted/methods , Neural Networks, Computer , Optic Disk/diagnostic imaging , Databases, Factual , Fundus Oculi , Glaucoma/diagnostic imaging , Humans , ROC Curve
16.
IEEE Trans Med Imaging ; 38(1): 180-193, 2019 01.
Article in English | MEDLINE | ID: mdl-30040635

ABSTRACT

Volumetric ultrasound is rapidly emerging as a viable imaging modality for routine prenatal examinations. Biometrics obtained from volumetric segmentation shed light on precise maternal and fetal health monitoring. However, poor image quality, low contrast, boundary ambiguity, and complex anatomical shapes conspire to leave a great lack of efficient segmentation tools, which makes 3-D ultrasound difficult to interpret and hinders its widespread use in obstetrics. In this paper, we address the problem of semantic segmentation in prenatal ultrasound volumes. Our contribution is threefold: 1) we propose the first fully automatic framework to simultaneously segment multiple anatomical structures of intensive clinical interest, including the fetus, gestational sac, and placenta, which remains a rarely studied and arduous challenge; 2) we propose a composite architecture for dense labeling, in which a customized 3-D fully convolutional network explores spatial intensity concurrency for initial labeling, while a multi-directional recurrent neural network (RNN) encodes spatial sequentiality to combat boundary ambiguity for significant refinement; and 3) we introduce a hierarchical deep supervision mechanism to boost the information flow within the RNN, fit the latent sequence hierarchy at fine scales, and further improve the segmentation results. Extensively verified on large in-house datasets, our method shows superior segmentation performance, decent agreement with expert measurements, and high reproducibility against scanning variations, and is thus promising for advancing prenatal ultrasound examinations.


Subjects
Imaging, Three-Dimensional/methods , Ultrasonography, Prenatal/methods , Algorithms , Female , Fetus/diagnostic imaging , Humans , Neural Networks, Computer , Pregnancy
17.
IEEE Trans Med Imaging ; 37(5): 1114-1126, 2018 05.
Article in English | MEDLINE | ID: mdl-29727275

ABSTRACT

We propose an analysis of surgical videos based on a novel recurrent convolutional network (SV-RCNet) for automatic online workflow recognition from surgical videos, a key component of context-aware computer-assisted intervention systems. Unlike previous methods, which harness visual and temporal information separately, the proposed SV-RCNet seamlessly integrates a convolutional neural network (CNN) and a recurrent neural network (RNN) into a recurrent convolutional architecture, in order to take full advantage of the complementary visual and temporal features learned from surgical videos. We train the SV-RCNet end-to-end so that the visual representations and sequential dynamics are jointly optimized during learning. To produce more discriminative spatio-temporal features, we exploit a deep residual network (ResNet) to extract visual features and a long short-term memory (LSTM) network to model temporal dependencies, and integrate them into the SV-RCNet. Moreover, based on the phase-transition-sensitive predictions of the SV-RCNet, we propose a simple yet effective inference scheme, prior knowledge inference (PKI), which leverages the natural characteristics of surgical video. This strategy further improves the consistency of the results and largely boosts recognition performance. Extensive experiments on the MICCAI 2016 Modeling and Monitoring of Computer Assisted Interventions Workflow Challenge dataset and the Cholec80 dataset validate SV-RCNet. Our approach not only achieves superior performance on these two datasets but also outperforms state-of-the-art methods by a significant margin.
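A minimal sketch of the CNN-plus-LSTM pattern: per-frame visual features (a tiny CNN stands in for the ResNet backbone) followed by an LSTM over the frame sequence, producing per-frame phase logits. Layer sizes, sequence length, and the number of phases are illustrative assumptions, and the PKI post-processing is omitted.

```python
import torch
import torch.nn as nn


class RecurrentConvNet(nn.Module):
    """CNN per frame (stand-in for the ResNet visual backbone) followed by an LSTM
    over the frame sequence, for online surgical-phase recognition."""

    def __init__(self, n_phases=7, feat_dim=128, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_phases)

    def forward(self, frames):                     # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)   # visual features per frame
        seq, _ = self.lstm(feats)                               # temporal dependencies
        return self.head(seq)                                   # per-frame phase logits


logits = RecurrentConvNet()(torch.randn(2, 16, 3, 112, 112))    # (B, T, n_phases)
```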


Subjects
Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Software , Video-Assisted Surgery/classification , Algorithms , Databases, Factual , Humans , Workflow
18.
Neuroimage ; 170: 446-455, 2018 04 15.
Article in English | MEDLINE | ID: mdl-28445774

ABSTRACT

Segmentation of key brain tissues from 3D medical images is of great significance for brain disease diagnosis, progression assessment, and monitoring of neurologic conditions. While manual segmentation is time-consuming, laborious, and subjective, automated segmentation is quite challenging due to the complicated anatomical environment of the brain and the large variations of brain tissues. We propose a novel voxelwise residual network (VoxResNet) with a set of effective training schemes to cope with this challenging problem. The main merit of residual learning is that it alleviates the degradation problem when training a deep network, so that the performance gains from increased network depth can be fully leveraged. With this technique, our VoxResNet is built with 25 layers and hence can generate more representative features to handle the large variations of brain tissues than rivals using hand-crafted features or shallower networks. To effectively train such a deep network with limited training data for brain segmentation, we seamlessly integrate multi-modality and multi-level contextual information into the network, so that the complementary information of different modalities can be harnessed and features of different scales can be exploited. Furthermore, an auto-context version of VoxResNet is proposed that combines low-level image appearance features, implicit shape information, and high-level context to further improve segmentation performance. Extensive experiments on the well-known MRBrainS benchmark for brain segmentation from 3D magnetic resonance (MR) images corroborate the efficacy of the proposed VoxResNet. Our method achieved first place in the challenge out of 37 competitors, including several state-of-the-art brain segmentation methods. Our method is inherently general and can be readily applied as a powerful tool to many brain-related studies where accurate segmentation of brain structures is critical.
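A minimal sketch of the basic building unit, a voxelwise residual block of 3-D convolutions with an identity skip connection, fed by a multi-channel (multi-modality) input. Channel widths, normalization placement, and the three-channel input are illustrative assumptions.

```python
import torch
import torch.nn as nn


class VoxResBlock(nn.Module):
    """Voxelwise residual block: two 3-D convolutions with a skip connection, the
    basic unit stacked to build a deep volumetric segmentation network."""

    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm3d(channels), nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, 3, padding=1),
            nn.BatchNorm3d(channels), nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)   # residual learning eases optimization of deep 3-D nets


# Multi-modality input: stack the registered MR sequences (e.g., T1, T1-IR, FLAIR) as channels.
stem = nn.Conv3d(3, 64, 3, padding=1)
block = VoxResBlock(64)
out = block(stem(torch.randn(1, 3, 32, 64, 64)))
```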


Subjects
Brain/anatomy & histology , Brain/diagnostic imaging , Deep Learning , Image Processing, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Magnetic Resonance Imaging/methods , Neuroimaging/methods , Brain/pathology , Humans
19.
Med Image Anal ; 41: 40-54, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28526212

ABSTRACT

While deep convolutional neural networks (CNNs) have achieved remarkable success in 2D medical image segmentation, it is still difficult for CNNs to segment important organs or structures from 3D medical images owing to several interrelated challenges, including the complicated anatomical environments in volumetric images, the optimization difficulties of 3D networks, and the inadequacy of training samples. In this paper, we present a novel and efficient 3D fully convolutional network equipped with a 3D deep supervision mechanism to comprehensively address these challenges; we call it 3D DSN. Our 3D DSN is capable of volume-to-volume learning and inference, which eliminates redundant computation and reduces the risk of over-fitting on limited training data. More importantly, the 3D deep supervision mechanism effectively copes with the optimization problem of vanishing or exploding gradients when training a deep 3D model, accelerating convergence while improving discrimination capability. The mechanism is developed by deriving an objective function that directly guides the training of both lower and upper layers of the network, so that the adverse effects of unstable gradient changes are counteracted during training. We also employ a fully connected conditional random field model as a post-processing step to refine the segmentation results. We extensively validated the proposed 3D DSN on two typical yet challenging volumetric medical image segmentation tasks: (i) liver segmentation from 3D CT scans and (ii) whole heart and great vessels segmentation from 3D MR images, by participating in two grand challenges held in conjunction with MICCAI. We achieved segmentation results competitive with state-of-the-art approaches in both challenges, at a much faster speed, corroborating the effectiveness of our proposed 3D DSN.
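A minimal sketch of the deep supervision objective: auxiliary classifiers attached to lower layers are upsampled to full resolution and contribute weighted cross-entropy terms alongside the main loss, providing extra gradient paths. The auxiliary weight, number of side outputs, and binary label setup are illustrative assumptions; the CRF post-processing is omitted.

```python
import torch
import torch.nn.functional as F


def deeply_supervised_loss(main_logits, aux_logits_list, target, aux_weight=0.3):
    """3-D deep supervision: auxiliary heads from lower layers are upsampled to full
    resolution and add extra gradient paths during training."""
    loss = F.cross_entropy(main_logits, target)
    for aux in aux_logits_list:
        up = F.interpolate(aux, size=target.shape[1:], mode="trilinear", align_corners=False)
        loss = loss + aux_weight * F.cross_entropy(up, target)
    return loss


target = torch.randint(0, 2, (1, 32, 64, 64))        # voxel-wise labels
main = torch.randn(1, 2, 32, 64, 64)                 # full-resolution logits
aux = [torch.randn(1, 2, 16, 32, 32), torch.randn(1, 2, 8, 16, 16)]   # hidden-layer heads
loss = deeply_supervised_loss(main, aux, target)
```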


Subjects
Imaging, Three-Dimensional/methods , Neural Networks, Computer , Humans , Reproducibility of Results , Sensitivity and Specificity , Supervised Machine Learning
20.
IEEE Trans Biomed Eng ; 64(7): 1558-1567, 2017 07.
Article in English | MEDLINE | ID: mdl-28113302

ABSTRACT

OBJECTIVE: False positive reduction is one of the most crucial components in an automated pulmonary nodule detection system, which plays an important role in lung cancer diagnosis and early treatment. The objective of this paper is to effectively address the challenges of this task and thereby accurately discriminate true nodules from a large number of candidates. METHODS: We propose a novel method employing three-dimensional (3-D) convolutional neural networks (CNNs) for false positive reduction in automated pulmonary nodule detection from volumetric computed tomography (CT) scans. Compared with their 2-D counterparts, 3-D CNNs encode richer spatial information and extract more representative features via a hierarchical architecture trained on 3-D samples. More importantly, we propose a simple yet effective strategy to encode multilevel contextual information, to meet the challenges posed by the large variations and hard mimics of pulmonary nodules. RESULTS: The proposed framework was extensively validated in the LUNA16 challenge held in conjunction with ISBI 2016, where we achieved the highest competition performance metric (CPM) score in the false positive reduction track. CONCLUSION: Experimental results demonstrate the importance and effectiveness of integrating multilevel contextual information into a 3-D CNN framework for automated pulmonary nodule detection in volumetric CT data. SIGNIFICANCE: While our method is tailored for pulmonary nodule detection, the proposed framework is general and can easily be extended to many other 3-D object detection tasks in volumetric medical images where the target objects exhibit large variations and are accompanied by numerous hard mimics.
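A minimal sketch of one way to encode multilevel contextual information: three 3-D patches of different physical extents around a candidate, each processed by its own small 3-D CNN, with features concatenated for nodule/non-nodule classification. The branch architecture, fusion by concatenation, and the assumption that all contexts are resampled to the same voxel grid are illustrative, not the paper's exact design.

```python
import torch
import torch.nn as nn


class MultiContext3DCNN(nn.Module):
    """Fuse three 3-D patches of different receptive fields around a nodule candidate,
    each processed by its own small 3-D CNN, for false-positive reduction."""

    def __init__(self, width=16):
        super().__init__()
        def branch():
            return nn.Sequential(nn.Conv3d(1, width, 3, padding=1), nn.ReLU(inplace=True),
                                 nn.MaxPool3d(2),
                                 nn.Conv3d(width, width, 3, padding=1), nn.ReLU(inplace=True),
                                 nn.AdaptiveAvgPool3d(1), nn.Flatten())
        self.branches = nn.ModuleList([branch() for _ in range(3)])
        self.classifier = nn.Linear(3 * width, 2)      # nodule vs. non-nodule

    def forward(self, patches):                        # list of 3 patches at increasing context
        feats = [b(p) for b, p in zip(self.branches, patches)]
        return self.classifier(torch.cat(feats, dim=1))


net = MultiContext3DCNN()
# All contexts resampled to the same voxel grid (an assumption of this sketch).
patches = [torch.randn(4, 1, 20, 20, 20) for _ in range(3)]
logits = net(patches)
```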


Subjects
Diagnostic Errors/prevention & control , Imaging, Three-Dimensional/methods , Machine Learning , Neural Networks, Computer , Solitary Pulmonary Nodule/diagnostic imaging , Tomography, X-Ray Computed/methods , Algorithms , False Positive Reactions , Humans , Pattern Recognition, Automated/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Reproducibility of Results , Sensitivity and Specificity