Results 1 - 20 of 199
1.
IEEE Trans Med Imaging ; PP, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38564346

ABSTRACT

Diabetic retinopathy (DR) is a serious ocular condition that requires effective monitoring and treatment by ophthalmologists. However, constructing a reliable DR grading model remains a challenging and costly task, heavily reliant on high-quality training sets and adequate hardware resources. In this paper, we investigate the knowledge transferability of large-scale pre-trained models (LPMs) to fundus images based on prompt learning to construct a DR grading model efficiently. Unlike full fine-tuning, which updates all parameters of LPMs, prompt learning involves only a minimal number of additional learnable parameters while achieving performance competitive with full fine-tuning. Inspired by visual prompt tuning, we propose Semantic-oriented Visual Prompt Learning (SVPL) to enhance the semantic perception ability for better extracting task-specific knowledge from LPMs, without any additional annotations. Specifically, SVPL assigns a group of learnable prompts for each DR level to fit the complex pathological manifestations and then aligns each prompt group to task-specific semantic space via a contrastive group alignment (CGA) module. We also propose a plug-and-play adapter module, Hierarchical Semantic Delivery (HSD), which allows the semantic transition of prompt groups from shallow to deep layers to facilitate efficient knowledge mining and model convergence. Our extensive experiments on three public DR grading datasets demonstrate that SVPL achieves superior results compared to other transfer tuning and DR grading methods. Further analysis suggests that the generalized knowledge from LPMs is advantageous for constructing the DR grading model on fundus images.
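The contrastive group alignment idea can be illustrated numerically. The sketch below is not the authors' implementation; the toy prompt groups, anchors, and the InfoNCE-style loss form are illustrative assumptions. It pulls each DR level's mean prompt embedding toward its own semantic anchor and away from the other levels' anchors:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def group_alignment_loss(prompt_groups, anchors, tau=0.1):
    """InfoNCE-style loss: each prompt group's mean embedding should be
    closest to its own class anchor and far from the other anchors."""
    loss = 0.0
    for k, group in enumerate(prompt_groups):
        # mean-pool the learnable prompts assumed to belong to DR level k
        mean = [sum(col) / len(group) for col in zip(*group)]
        sims = [math.exp(cosine(mean, a) / tau) for a in anchors]
        loss += -math.log(sims[k] / sum(sims))
    return loss / len(prompt_groups)
```

Aligning each group with its own anchor drives the loss toward zero; swapping the groups makes it large, which is the behaviour the CGA module exploits during training.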

2.
Transl Psychiatry ; 14(1): 150, 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38499546

ABSTRACT

There is emerging potential for digital assessment of depression. In this study, Chinese patients with major depressive disorder (MDD) and controls underwent a week of multimodal measurement including actigraphy and app-based measures (D-MOMO) to record rest-activity, facial expression, voice, and mood states. Seven machine-learning models (Random Forest [RF], Logistic Regression [LR], Support Vector Machine [SVM], K-Nearest Neighbors [KNN], Decision Tree [DT], Naive Bayes [NB], and Artificial Neural Networks [ANN]) with leave-one-out cross-validation were applied to detect lifetime diagnosis of MDD and non-remission status. Eighty MDD subjects and 76 age- and sex-matched controls completed the actigraphy, while 61 MDD subjects and 47 controls completed the app-based assessment. MDD subjects had lower mobile time (P = 0.006), later sleep midpoint (P = 0.047) and acrophase (P = 0.024) than controls. For app measurement, MDD subjects had more frequent brow lowering (P = 0.023), less lip corner pulling (P = 0.007), higher pause variability (P = 0.046), more frequent self-reference (P = 0.024) and negative emotion words (P = 0.002), and lower articulation rate (P < 0.001) and happiness level (P < 0.001) than controls. With the fusion of all digital modalities, the predictive performance (F1-score) of ANN was 0.81 for lifetime diagnosis of MDD and 0.70 for non-remission status when combined with the HADS-D item score. Multimodal digital measurement is a feasible diagnostic tool for depression in Chinese patients. The combination of multimodal measurement and a machine-learning approach enhanced the performance of digital markers in phenotyping and diagnosis of MDD.
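Leave-one-out cross-validation, as used for these seven classifiers, can be sketched with a stand-in 1-nearest-neighbour model; the classifier and toy data below are illustrative assumptions, not the study's models or subjects:

```python
def one_nn(train, query):
    """1-nearest-neighbour by squared distance; train is [(features, label)]."""
    def d2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(train, key=lambda fl: d2(fl[0], query))[1]

def loocv_accuracy(data):
    """Leave-one-out CV: each subject is held out once and predicted
    from a model fitted on all remaining subjects."""
    hits = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += int(one_nn(train, x) == y)
    return hits / len(data)
```

With n subjects the model is fitted n times, which is why LOOCV suits the modest sample sizes reported here.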


Subjects
Major Depressive Disorder, Mobile Applications, Humans, Major Depressive Disorder/diagnosis, Bayes Theorem, Actigraphy, Depression/diagnosis, Hong Kong
3.
IEEE Trans Med Imaging ; PP, 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38345948

ABSTRACT

Building on remarkable progress in cardiac image segmentation, contemporary studies are dedicated to further upgrading model functionality by progressively exploring sequentially delivered datasets over time through domain incremental learning. Existing works mainly concentrate on addressing heterogeneous style variations, but overlook the critical shape variations across domains hidden behind the discrepancy in sub-disease composition. To prevent the updated model from catastrophically forgetting sub-diseases that were learned in past domains but are no longer present in subsequent domains, we propose a dual enrichment synergistic strategy to incrementally broaden model competence for a growing number of sub-diseases. The data-enriched scheme aims to diversify the shape composition of the current training data via displacement-aware shape encoding and decoding, to gradually build up robustness against cross-domain shape variations. Meanwhile, the model-enriched scheme strengthens model capabilities by progressively appending and consolidating the latest expertise into a dynamically expanded multi-expert network, to gradually cultivate generalization ability over style-varied domains. The two schemes work in synergy to collaboratively upgrade model capabilities in a two-pronged manner. We extensively evaluated our network with the ACDC and M&Ms datasets in single-domain and compound-domain incremental learning settings. Our approach outperformed other competing methods and achieved comparable results to the upper bound.

4.
IEEE Trans Med Imaging ; 43(5): 1972-1982, 2024 May.
Article in English | MEDLINE | ID: mdl-38215335

ABSTRACT

Deep learning (DL)-based rib fracture detection has shown promise in playing an important role in preventing mortality and improving patient outcomes. Normally, developing DL-based object detection models requires a huge amount of bounding-box annotation. However, annotating medical data is time-consuming and expertise-demanding, making a large amount of fine-grained annotations infeasible to obtain. This poses a pressing need for developing label-efficient detection models to alleviate radiologists' labeling burden. To tackle this challenge, the literature on object detection has witnessed an increase in weakly-supervised and semi-supervised approaches, yet still lacks a unified framework that leverages various forms of fully-labeled, weakly-labeled, and unlabeled data. In this paper, we present a novel omni-supervised object detection network, ORF-Netv2, to leverage as much available supervision as possible. Specifically, a multi-branch omni-supervised detection head is introduced, with each branch trained with a specific type of supervision. A co-training-based dynamic label assignment strategy is then proposed to enable flexible and robust learning from the weakly-labeled and unlabeled data. Extensive evaluation was conducted for the proposed framework with three rib fracture datasets on both chest CT and X-ray. By leveraging all forms of supervision, ORF-Netv2 achieves mAPs of 34.7, 44.7, and 19.4 on the three datasets, respectively, surpassing the baseline detector, which uses only box annotations, by mAP gains of 3.8, 4.8, and 5.0, respectively. Furthermore, ORF-Netv2 consistently outperforms other competitive label-efficient methods over various scenarios, showing a promising framework for label-efficient fracture detection. The code is available at: https://github.com/zhizhongchai/ORF-Net.
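The mAP figures quoted above average per-class average precision. A minimal sketch of AP for one class, computed from ranked detections, is given below; this is an illustration of the metric, not the paper's evaluation code, and the scores/labels are toy inputs:

```python
def average_precision(scores, labels):
    """AP as the mean of precision values at each true-positive rank,
    from detection confidence scores and 1/0 ground-truth match labels."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(labels)
    tp = fp = 0
    ap = 0.0
    for i in order:
        if labels[i]:
            tp += 1
            ap += tp / (tp + fp)  # precision at this recall step
        else:
            fp += 1
    return ap / total_pos
```

mAP then averages this quantity over classes (here, fracture categories) and, in some protocols, over IoU thresholds.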


Subjects
Deep Learning, Thoracic Radiography, Rib Fractures, Supervised Machine Learning, Humans, Rib Fractures/diagnostic imaging, Thoracic Radiography/methods, Computer-Assisted Radiographic Image Interpretation/methods, X-Ray Computed Tomography/methods, Algorithms
6.
Nat Commun ; 14(1): 7434, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37973874

ABSTRACT

Inverse Protein Folding (IPF) is an important task of protein design, which aims to design sequences compatible with a given backbone structure. Despite the rapid development of algorithms for this task, existing methods tend to rely on noisy predicted residues located in the local neighborhood when generating sequences. To address this limitation, we propose an entropy-based residue selection method to remove noise in the input residue context. Additionally, we introduce ProRefiner, a memory-efficient global graph attention model that fully utilizes the denoised context. Our proposed method achieves state-of-the-art performance on multiple sequence design benchmarks in different design settings. Furthermore, we demonstrate the applicability of ProRefiner in redesigning Transposon-associated transposase B, where six out of the 20 variants we propose exhibit improved gene editing activity.
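Entropy-based residue selection can be sketched as follows. This is an illustration under assumed per-residue amino-acid distributions; the top-k selection rule is a simplification of the paper's criterion:

```python
import math

def entropy(probs):
    """Shannon entropy (nats) of one residue's amino-acid distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_confident_residues(residue_probs, k):
    """Keep the k residues whose predicted distributions have the lowest
    entropy, i.e. the least noisy context positions."""
    order = sorted(range(len(residue_probs)),
                   key=lambda i: entropy(residue_probs[i]))
    return sorted(order[:k])
```

A peaked distribution (low entropy) signals a confident prediction worth keeping in the context; a flat one (high entropy) is treated as noise and removed.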


Subjects
Algorithms, Proteins, Entropy, Proteins/genetics, Proteins/chemistry, Protein Folding
7.
Br J Ophthalmol ; 2023 Oct 19.
Article in English | MEDLINE | ID: mdl-37857452

ABSTRACT

BACKGROUND: Deep learning (DL) is promising for detecting glaucoma. However, patients' privacy and data security are major concerns when pooling all data for model development. We developed a privacy-preserving DL model using the federated learning (FL) paradigm to detect glaucoma from optical coherence tomography (OCT) images. METHODS: This is a multicentre study. The FL paradigm consisted of a 'central server' and seven eye centres in Hong Kong, the USA and Singapore. Each centre first trained a model locally with its own OCT optic disc volumetric dataset and then uploaded its model parameters to the central server. The central server used the FedProx algorithm to aggregate all centres' model parameters. Subsequently, the aggregated parameters were redistributed to each centre for its local model optimisation. We experimented with three three-dimensional (3D) networks to evaluate the stability of the FL paradigm. Lastly, we tested the FL model on two prospectively collected unseen datasets. RESULTS: We used 9326 volumetric OCT scans from 2785 subjects. The FL model performed consistently well with the different networks in the 7 centres (accuracies 78.3%-98.5%, 75.9%-97.0%, and 78.3%-97.5%, respectively) and stably in the 2 unseen datasets (accuracies 84.8%-87.7%, 81.3%-84.8%, and 86.0%-87.8%, respectively). The FL model achieved non-inferior performance in classifying glaucoma compared with the traditional model and significantly outperformed the individual models. CONCLUSION: The 3D FL model could leverage all the datasets and achieve generalisable performance, without data exchange across centres. This study demonstrated an OCT-based FL paradigm for glaucoma identification with ensured patient privacy and data security, charting another course toward the real-world transition of artificial intelligence in ophthalmology.
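The server-side step of this paradigm can be sketched as a dataset-size-weighted average of the uploaded parameter vectors. This is a simplification: FedProx's proximal term regularizes each centre's local training and is omitted here, and the flat parameter lists stand in for real network weights:

```python
def aggregate(client_params, client_sizes):
    """Server-side aggregation: dataset-size-weighted mean of the
    parameter vectors uploaded by each centre."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [sum(p[j] * n for p, n in zip(client_params, client_sizes)) / total
            for j in range(dim)]
```

Each round, the server would redistribute this averaged vector to the seven centres for further local optimisation, so raw OCT scans never leave a centre.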

8.
Asia Pac J Ophthalmol (Phila) ; 12(5): 468-476, 2023.
Article in English | MEDLINE | ID: mdl-37851564

ABSTRACT

PURPOSE: The purpose of this study was to develop an artificial intelligence (AI) system for identifying disease status and recommending treatment modalities for retinopathy of prematurity (ROP). METHODS: This retrospective cohort study included a total of 24,495 RetCam images from 1075 eyes of 651 preterm infants who received RetCam examination at the Shenzhen Eye Hospital in Shenzhen, China, from January 2003 to August 2021. The three tasks were ROP identification, severe ROP identification, and treatment modality identification (retinal laser photocoagulation or intravitreal injections). The AI system was developed for the 3 tasks, especially the identification of ROP treatment modalities. The performance of the AI system and ophthalmologists was compared using an additional 200 RetCam images. RESULTS: The AI system exhibited favorable performance in the 3 tasks, including ROP identification [area under the receiver operating characteristic curve (AUC), 0.9531], severe ROP identification (AUC, 0.9132), and treatment modality identification with laser photocoagulation or intravitreal injections (AUC, 0.9360). The AI system achieved an accuracy of 0.8627, a sensitivity of 0.7059, and a specificity of 0.9412 for identifying the treatment modalities of ROP. External validation results confirmed the good performance of the AI system, with an accuracy of 92.0% in all 3 tasks, better than that of 4 experienced ophthalmologists, who scored 56%, 65%, 71%, and 76%, respectively. CONCLUSIONS: The described AI system achieved promising outcomes in the automated identification of ROP severity and treatment modalities. Using such algorithmic approaches as accessory tools in the clinic may improve ROP screening in the future.
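The accuracy, sensitivity, and specificity figures above follow the standard confusion-matrix definitions, sketched below on toy labels (not the study's data):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives) and specificity
    (recall on negatives) from binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return ((tp + tn) / len(y_true), tp / (tp + fn), tn / (tn + fp))
```

The reported pattern (sensitivity 0.7059 well below specificity 0.9412) means the system misses more treatment-requiring cases than it over-calls, which matters for a screening tool.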


Subjects
Premature Infant, Retinopathy of Prematurity, Infant, Newborn, Humans, Angiogenesis Inhibitors/therapeutic use, Retinopathy of Prematurity/therapy, Retinopathy of Prematurity/drug therapy, Vascular Endothelial Growth Factor A, Retrospective Studies, Artificial Intelligence, Gestational Age
9.
Radiol Artif Intell ; 5(5): e220185, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37795135

ABSTRACT

Purpose: To evaluate the diagnostic performance of a deep learning (DL) model for breast US across four hospitals and assess its value to readers with different levels of experience. Materials and Methods: In this retrospective study, a dual attention-based convolutional neural network was built and validated to discriminate malignant tumors from benign tumors by using B-mode and color Doppler US images (n = 45 909, March 2011-August 2018), acquired with 42 types of US machines, of 9895 pathologic analysis-confirmed breast lesions in 8797 patients (27 men and 8770 women; mean age, 47 years ± 12 [SD]). With and without assistance from the DL model, three novice readers with less than 5 years of US experience and two experienced readers with 8 and 18 years of US experience, respectively, interpreted 1024 randomly selected lesions. Differences in the areas under the receiver operating characteristic curves (AUCs) were tested using the DeLong test. Results: The DL model using both B-mode and color Doppler US images demonstrated expert-level performance at the lesion level, with an AUC of 0.94 (95% CI: 0.92, 0.95) for the internal set. In external datasets, the AUCs were 0.92 (95% CI: 0.90, 0.94) for hospital 1, 0.91 (95% CI: 0.89, 0.94) for hospital 2, and 0.96 (95% CI: 0.94, 0.98) for hospital 3. DL assistance led to improved AUCs (P < .001) for one experienced and three novice radiologists and improved interobserver agreement. The average false-positive rate was reduced by 7.6% (P = .08). Conclusion: The DL model may help radiologists, especially novice readers, improve accuracy and interobserver agreement of breast tumor diagnosis using US. Keywords: Ultrasound, Breast, Diagnosis, Breast Cancer, Deep Learning, Ultrasonography Supplemental material is available for this article. © RSNA, 2023.
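The reported AUCs have a useful probabilistic reading: the chance that a randomly chosen malignant lesion scores higher than a randomly chosen benign one. A minimal rank-based sketch of that equivalence (illustrative toy scores, not the study's evaluation code):

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive (malignant) score exceeds
    a negative (benign) score, counting ties as 0.5 (Mann-Whitney form)."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

The DeLong test used in the study compares two such AUCs on the same lesions while accounting for their correlation.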

10.
Radiother Oncol ; 186: 109793, 2023 09.
Article in English | MEDLINE | ID: mdl-37414254

ABSTRACT

BACKGROUND AND PURPOSE: Immunotherapy is a standard treatment for many tumor types. However, only a small proportion of patients derive clinical benefit, and reliable predictive biomarkers of immunotherapy response are lacking. Although deep learning has made substantial progress in improving cancer detection and diagnosis, there is limited success in the prediction of treatment response. Here, we aim to predict the immunotherapy response of gastric cancer patients using routinely available clinical and image data. MATERIALS AND METHODS: We present a multi-modal deep learning radiomics approach to predict immunotherapy response using both clinical data and computed tomography images. The model was trained using 168 advanced gastric cancer patients treated with immunotherapy. To overcome the limitations of small training data, we leverage an additional dataset of 2,029 patients who did not receive immunotherapy in a semi-supervised framework to learn intrinsic imaging phenotypes of the disease. We evaluated model performance in two independent cohorts of 81 patients treated with immunotherapy. RESULTS: The deep learning model achieved an area under the receiver operating characteristic curve (AUC) of 0.791 (95% CI 0.633-0.950) and 0.812 (95% CI 0.669-0.956) for predicting immunotherapy response in the internal and external validation cohorts. When combined with PD-L1 expression, the integrative model further improved the AUC by 4-7% in absolute terms. CONCLUSION: The deep learning model achieved promising performance for predicting immunotherapy response from routine clinical and image data. The proposed multi-modal approach is general and can incorporate other relevant information to further improve prediction of immunotherapy response.


Subjects
Deep Learning, Stomach Neoplasms, Humans, Immunotherapy, Phenotype, ROC Curve, Retrospective Studies
11.
BMC Med Imaging ; 23(1): 91, 2023 07 08.
Article in English | MEDLINE | ID: mdl-37422639

ABSTRACT

PURPOSE: Segmentation of liver vessels from CT images is indispensable prior to surgical planning and has attracted broad interest in the medical image analysis community. Due to the complex structure and low-contrast background, automatic liver vessel segmentation remains particularly challenging. Most related research adopts FCN, U-Net, and V-Net variants as a backbone. However, these methods mainly focus on capturing multi-scale local features, which may produce misclassified voxels due to the convolutional operator's limited locality reception field. METHODS: We propose a robust end-to-end vessel segmentation network called Inductive Biased Multi-Head Attention Vessel Net (IBIMHAV-Net) by expanding the Swin Transformer to 3D and employing an effective combination of convolution and self-attention. In practice, we introduce voxel-wise embedding rather than patch-wise embedding to locate precise liver vessel voxels and adopt multi-scale convolutional operators to gain local spatial information. On the other hand, we propose an inductive biased multi-head self-attention that learns inductive-biased relative positional embeddings from initialized absolute position embeddings. Based on this, we can gain more reliable queries and key matrices. RESULTS: We conducted experiments on the 3DIRCADb dataset. The average Dice and sensitivity of the four tested cases were 74.8% and 77.5%, which exceed the results of existing deep learning methods and an improved graph-cuts method. The Branches Detected (BD) and Tree-length Detected (TD) indices also proved its global/local feature capture ability better than other methods. CONCLUSION: The proposed model IBIMHAV-Net provides automatic, accurate 3D liver vessel segmentation with an interleaved architecture that better utilizes both global and local spatial features in CT volumes. It can be further extended to other clinical data.
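Dice and sensitivity, the two metrics reported above, reduce to simple set overlap when segmentations are treated as sets of voxel indices. A toy example (not the paper's evaluation pipeline):

```python
def dice_and_sensitivity(pred, truth):
    """Dice coefficient and sensitivity for a segmentation, with the
    predicted and ground-truth vessels given as sets of voxel indices."""
    overlap = len(pred & truth)
    dice = 2 * overlap / (len(pred) + len(truth))
    sensitivity = overlap / len(truth)
    return dice, sensitivity
```

Dice penalizes both missed vessel voxels and over-segmentation, while sensitivity only measures how much of the true vessel tree is recovered, which is why the two values can diverge.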


Subjects
Head, Liver, Humans, Liver/diagnostic imaging, Attention, Computer-Assisted Image Processing/methods
12.
Article in English | MEDLINE | ID: mdl-37307178

ABSTRACT

Due to individual differences, EEG signals from other subjects (source) can hardly be used to decode the mental intentions of the target subject. Although transfer learning methods have shown promising results, they still suffer from poor feature representation or neglect long-range dependencies. In light of these limitations, we propose the Global Adaptive Transformer (GAT), a domain adaptation method that utilizes source data for cross-subject enhancement. Our method uses parallel convolution to capture temporal and spatial features first. Then, we employ a novel attention-based adaptor that implicitly transfers source features to the target domain, emphasizing the global correlation of EEG features. We also use a discriminator to explicitly drive the reduction of marginal distribution discrepancy by learning against the feature extractor and the adaptor. In addition, an adaptive center loss is designed to align the conditional distribution. With the aligned source and target features, a classifier can be optimized to decode EEG signals. Experiments on two widely used EEG datasets demonstrate that our method outperforms state-of-the-art methods, primarily due to the effectiveness of the adaptor. These results indicate that GAT has good potential to enhance the practicality of BCI.


Subjects
Electroencephalography, Learning, Humans, Electroencephalography/methods, Machine Learning, Software, Electric Power Supplies
13.
J Cheminform ; 15(1): 43, 2023 Apr 10.
Article in English | MEDLINE | ID: mdl-37038222

ABSTRACT

Artificial intelligence has deeply revolutionized the field of medicinal chemistry with many impressive applications, but the success of these applications requires a massive amount of training samples with high-quality annotations, which seriously limits the wide usage of data-driven methods. In this paper, we focus on the reaction yield prediction problem, which assists chemists in selecting high-yield reactions in a new chemical space with only a few experimental trials. To address this challenge, we first put forth MetaRF, an attention-based random forest model specially designed for few-shot yield prediction, where the attention weights of the random forest are automatically optimized by a meta-learning framework and can be quickly adapted to predict the performance of new reagents given a few additional samples. To improve few-shot learning performance, we further introduce a dimension-reduction-based sampling method to determine valuable samples to be experimentally tested and then learned. Our methodology is evaluated on three different datasets and achieves satisfactory performance on few-shot prediction. On high-throughput experimentation (HTE) datasets, the average yield of our methodology's top 10 high-yield reactions is relatively close to the results of ideal yield selection.
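The attention-weighted random forest idea can be sketched as a softmax-weighted combination of individual tree outputs. This is not the authors' code: the logits below are plain inputs, whereas in MetaRF they would be meta-learned and adapted from a few support samples:

```python
import math

def softmax(w):
    """Numerically stable softmax over a list of logits."""
    m = max(w)
    e = [math.exp(x - m) for x in w]
    s = sum(e)
    return [x / s for x in e]

def attention_forest_predict(tree_preds, attention_logits):
    """Yield prediction as an attention-weighted mean of the individual
    tree predictions, instead of the forest's usual uniform average."""
    weights = softmax(attention_logits)
    return sum(w * p for w, p in zip(weights, tree_preds))
```

With uniform logits this recovers the ordinary random-forest average; sharpening the logits lets the model trust the trees that transfer best to the new reagent space.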

14.
IEEE Trans Cybern ; 53(10): 6363-6375, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37015538

ABSTRACT

Automated detection of lung infections from computed tomography (CT) data plays an important role in combating coronavirus disease 2019 (COVID-19). However, there are still some challenges in developing such AI systems: 1) most current COVID-19 infection segmentation methods mainly rely on 2-D CT images, which lack a 3-D sequential constraint; 2) existing 3-D CT segmentation methods focus on single-scale representations, which do not achieve multiple receptive field sizes on the 3-D volume; and 3) the sudden outbreak of COVID-19 made it hard to annotate sufficient CT volumes for training deep models. To address these issues, we first build a multiple dimensional-attention convolutional neural network (MDA-CNN) to aggregate multiscale information along different dimensions of the input feature maps and impose supervision on multiple predictions from different convolutional neural network (CNN) layers. Second, we assign this MDA-CNN as a basic network into a novel dual multiscale mean teacher network (DM [Formula: see text]-Net) for semi-supervised COVID-19 lung infection segmentation on CT volumes, leveraging unlabeled data and exploring multiscale information. Our DM [Formula: see text]-Net encourages multiple predictions at different CNN layers from the student and teacher networks to be consistent for computing a multiscale consistency loss on unlabeled data, which is then added to the supervised loss on the labeled data from multiple predictions of the MDA-CNN. Third, we collect two COVID-19 segmentation datasets to evaluate our method. The experimental results show that our network consistently outperforms the compared state-of-the-art methods.
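The mean-teacher machinery behind the multiscale consistency loss can be sketched in two pieces: an exponential-moving-average (EMA) teacher update, and an MSE consistency term on unlabeled predictions. The scalar lists below are illustrative stand-ins for network weight and prediction tensors, not the paper's implementation:

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher update: teacher weights track a smoothed (EMA)
    copy of the student weights after each training step."""
    return [alpha * t + (1 - alpha) * s for t, s in zip(teacher, student)]

def consistency_loss(student_preds, teacher_preds):
    """Mean squared error between student and teacher predictions on
    unlabeled scans; summing this term across several CNN layers gives
    a multiscale consistency loss."""
    n = len(student_preds)
    return sum((s - t) ** 2 for s, t in zip(student_preds, teacher_preds)) / n
```

The consistency term carries gradient only through the student; the teacher is updated by EMA, which is what lets unlabeled CT volumes contribute to training.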


Subjects
COVID-19, Humans, COVID-19/diagnostic imaging, X-Ray Computed Tomography/methods, Neural Networks (Computer), Computer-Assisted Image Processing/methods
15.
Med Image Anal ; 86: 102770, 2023 05.
Article in English | MEDLINE | ID: mdl-36889206

ABSTRACT

PURPOSE: Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance, or improve training of surgeons via data-driven feedback. In surgical workflow analysis, up to 91% average precision has been reported for phase recognition on an open-data single-center video dataset. In this work we investigated the generalizability of phase recognition algorithms in a multicenter setting, including more difficult recognition tasks such as surgical action and surgical skill. METHODS: To achieve this goal, a dataset with 33 laparoscopic cholecystectomy videos from three surgical centers with a total operation time of 22 h was created. Labels included framewise annotation of seven surgical phases with 250 phase transitions, 5514 occurrences of four surgical actions, 6980 occurrences of 21 surgical instruments from seven instrument categories, and 495 skill classifications in five skill dimensions. The dataset was used in the 2019 international Endoscopic Vision challenge, sub-challenge for surgical workflow and skill analysis. Here, 12 research teams trained and submitted their machine learning algorithms for recognition of phase, action, instrument and/or skill assessment. RESULTS: F1-scores were achieved for phase recognition between 23.9% and 67.7% (n = 9 teams), for instrument presence detection between 38.5% and 63.8% (n = 8 teams), but for action recognition only between 21.8% and 23.3% (n = 5 teams). The average absolute error for skill assessment was 0.78 (n = 1 team). CONCLUSION: Surgical workflow and skill analysis are promising technologies to support the surgical team, but there is still room for improvement, as shown by our comparison of machine learning algorithms. This novel HeiChole benchmark can be used for comparable evaluation and validation of future work. In future studies, it is of utmost importance to create more open, high-quality datasets in order to allow the development of artificial intelligence and cognitive robotics in surgery.
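The F1-scores reported for phase recognition combine precision and recall per phase over the video's frames; a minimal sketch on toy frame labels (the phase names are illustrative, and this is not the challenge's evaluation code):

```python
def f1_score(y_true, y_pred, phase):
    """Framewise F1 for one surgical phase: harmonic mean of precision
    and recall over a video's frame-level labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == phase and p == phase)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != phase and p == phase)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == phase and p != phase)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Averaging this over the seven phases (and then over videos) yields a benchmark-level score comparable across teams.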


Subjects
Artificial Intelligence, Benchmarking, Humans, Workflow, Algorithms, Machine Learning
16.
Med Image Anal ; 86: 102772, 2023 05.
Article in English | MEDLINE | ID: mdl-36822050

ABSTRACT

Multi-label classification (MLC) can attach multiple labels to a single image and has achieved promising results on medical images. However, existing MLC methods still face challenging clinical realities in practical use, such as: (1) medical risks arising from misclassification, (2) the sample imbalance problem among different diseases, and (3) the inability to classify diseases that are not pre-defined (unseen diseases). Here, we design a hybrid label to improve the flexibility of MLC methods and alleviate the sample imbalance problem. Specifically, in the labeled training set, we retain independent labels for high-frequency diseases with enough samples and use a hybrid label to merge low-frequency diseases with fewer samples. The hybrid label can also be used to handle unseen diseases in practical use. In this paper, we propose Triplet Attention and Dual-pool Contrastive Learning (TA-DCL) for multi-label medical image classification based on the aforementioned label representation. The TA-DCL architecture is a triplet attention network (TAN), which combines category-attention, self-attention and cross-attention together to learn high-quality label embeddings for all disease labels by mining effective information from medical images. DCL includes dual-pool contrastive training (DCT) and dual-pool contrastive inference (DCI). DCT optimizes the clustering centers of label embeddings belonging to different disease labels to improve the discrimination of label embeddings. DCI reduces the misclassification of sick cases, lowering the clinical risk and improving the ability to detect unseen diseases by contrast of differences. TA-DCL is validated on two public medical image datasets, ODIR and NIH-ChestXray14, showing superior performance over other state-of-the-art MLC methods. Code is available at https://github.com/ZhangYH0502/TA-DCL.
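The hybrid-label representation can be sketched as a multi-hot vector with one extra shared slot. The disease names and mapping below are illustrative assumptions, not the paper's label set:

```python
def hybrid_encode(diseases, frequent, n_frequent):
    """Multi-hot label vector: one slot per high-frequency disease plus
    a final hybrid slot shared by all low-frequency or unseen diseases.
    `frequent` maps a disease name to its slot index."""
    vec = [0] * (n_frequent + 1)
    for d in diseases:
        if d in frequent:
            vec[frequent[d]] = 1
        else:
            vec[-1] = 1  # merged low-frequency / unseen bucket
    return vec
```

Merging the long tail into one slot gives that slot enough positive samples to train on, which is how the scheme addresses imbalance while still flagging unseen diseases.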


Subjects
Computer-Assisted Image Processing, Learning, Humans
17.
Med Image Anal ; 83: 102673, 2023 01.
Article in English | MEDLINE | ID: mdl-36403310

ABSTRACT

Supervised deep learning has achieved prominent success in various diabetic macular edema (DME) recognition tasks from optical coherence tomography (OCT) volumetric images. A common problematic issue in this field is the shortage of labeled data due to the expensive fine-grained annotations, which substantially increases the difficulty of accurate analysis by supervised learning. The morphological changes in the retina caused by DME might be distributed sparsely in the B-scan images of an OCT volume, and OCT data is often coarsely labeled at the volume level. Hence, the DME identification task can be formulated as a multiple instance classification problem that could be addressed by multiple instance learning (MIL) techniques. Nevertheless, no previous study has simultaneously utilized unlabeled data to promote classification accuracy, which is particularly significant for a high quality of analysis at the minimum annotation cost. To this end, we present a novel deep semi-supervised multiple instance learning framework to explore the feasibility of leveraging a small amount of coarsely labeled data and a large amount of unlabeled data to tackle this problem. Specifically, we come up with several modules to further improve the performance according to the availability and granularity of labels. To warm up the training, we propagate the bag labels to the corresponding instances as the supervision of training, and propose a self-correction strategy to handle the label noise in the positive bags. This strategy is based on confidence-based pseudo-labeling with consistency regularization. The model uses its prediction to generate a pseudo-label for each weakly augmented input only if it is highly confident about the prediction; the pseudo-label is subsequently used to supervise the same input in a strongly augmented version. This learning scheme is also applicable to unlabeled data. To enhance the discrimination capability of the model, we introduce the Student-Teacher architecture and impose consistency constraints between the two models. For demonstration, the proposed approach was evaluated on two large-scale DME OCT image datasets. Extensive results indicate that the proposed method improves DME classification with the incorporation of unlabeled data and significantly outperforms competing MIL methods, confirming the feasibility of deep semi-supervised multiple instance learning at a low annotation cost.
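The confidence-based pseudo-labeling rule described above can be sketched as follows; the 0.95 threshold is an illustrative assumption, not a value reported in the abstract:

```python
def pseudo_label(probs, threshold=0.95):
    """Keep a pseudo-label for a weakly augmented instance only when the
    model's top class probability clears the confidence threshold;
    return None otherwise, so the instance is skipped in the
    unsupervised loss."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None
```

The retained label then supervises the strongly augmented view of the same B-scan, which is the consistency-regularization step of the self-correction strategy.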


Subjects
Diabetic Retinopathy, Macular Edema, Humans, Macular Edema/diagnostic imaging, Diabetic Retinopathy/diagnostic imaging, Optical Coherence Tomography, Supervised Machine Learning, Retina/diagnostic imaging
18.
IEEE Trans Pattern Anal Mach Intell ; 45(3): 3259-3273, 2023 Mar.
Article in English | MEDLINE | ID: mdl-35737621

ABSTRACT

This article formulates a new problem, instance shadow detection, which aims to detect shadow instances and the associated object instances that cast the shadows in the input image. To approach this task, we first compile a new dataset with masks for shadow instances, object instances, and shadow-object associations. We then design an evaluation metric for quantitative evaluation of instance shadow detection performance. Further, we design a single-stage detector to perform instance shadow detection in an end-to-end manner, where a bidirectional relation learning module and a deformable maskIoU head are proposed in the detector to directly learn the relation between shadow instances and object instances and to improve the accuracy of the predicted masks. Finally, we quantitatively and qualitatively evaluate our method on the benchmark dataset of instance shadow detection and show the applicability of our method to light direction estimation and photo editing.

19.
Med Image Anal ; 84: 102680, 2023 02.
Article in English | MEDLINE | ID: mdl-36481607

ABSTRACT

In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with varied sizes, appearances, and lesion-to-background contrast levels (hyper-/hypo-dense), created in collaboration with seven hospitals and research institutions. Seventy-five submitted liver and liver tumor segmentation algorithms were trained on a set of 131 computed tomography (CT) volumes and were tested on 70 unseen test images acquired from different patients. We found that not a single algorithm performed best for both liver and liver tumors in the three events. The best liver segmentation algorithm achieved a Dice score of 0.963, whereas, for tumor segmentation, the best algorithms achieved Dice scores of 0.674 (ISBI 2017), 0.702 (MICCAI 2017), and 0.739 (MICCAI 2018). Retrospectively, we performed additional analysis on liver tumor detection and revealed that not all top-performing segmentation algorithms worked well for tumor detection. The best liver tumor detection method achieved a lesion-wise recall of 0.458 (ISBI 2017), 0.515 (MICCAI 2017), and 0.554 (MICCAI 2018), indicating the need for further research. LiTS remains an active benchmark and resource for research, e.g., contributing liver-related segmentation tasks to http://medicaldecathlon.com/. In addition, both data and online evaluation are accessible via https://competitions.codalab.org/competitions/17094.
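The Dice scores reported by LiTS follow the standard Dice coefficient for binary segmentation masks. A minimal sketch of that formula (a common formulation with a small epsilon to guard empty masks; the exact benchmark implementation may differ in edge-case handling):

```python
import numpy as np

def dice_score(pred, target, eps=1e-8):
    """Dice coefficient: 2*|P & T| / (|P| + |T|), on binary masks.
    `pred` and `target` are arrays of the same shape; nonzero = foreground."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    inter = np.logical_and(pred, target).sum()
    return float((2 * inter + eps) / (pred.sum() + target.sum() + eps))
```

A Dice of 0.963 (the best liver result above) therefore means predicted and reference masks overlap almost completely, while tumor Dice scores near 0.7 reflect the much harder lesion boundaries.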


Subjects
Benchmarking , Liver Neoplasms , Humans , Retrospective Studies , Liver Neoplasms/diagnostic imaging , Liver Neoplasms/pathology , Liver/diagnostic imaging , Liver/pathology , Algorithms , Image Processing, Computer-Assisted/methods
20.
IEEE Trans Med Imaging ; 42(3): 570-581, 2023 03.
Article in English | MEDLINE | ID: mdl-36191115

ABSTRACT

Contemporary methods have shown promising results on cardiac image segmentation, but only in static learning, i.e., optimizing the network once and for all, ignoring potential needs for model updating. In real-world scenarios, new data continues to be gathered from multiple institutions over time, and demands keep growing for more satisfying performance. The desired model should incrementally learn from each incoming dataset and progressively update with improved functionality as time goes by. As the datasets sequentially delivered from multiple sites are normally heterogeneous with domain discrepancy, each updated model should not catastrophically forget previously learned domains while generalizing well to currently arrived domains or even unseen domains. In medical scenarios, this is particularly challenging, as accessing or storing past data is commonly not allowed due to data privacy. To this end, we propose a novel domain-incremental learning framework to recover past domain inputs first and then regularly replay them during model optimization. Particularly, we first present a style-oriented replay module to enable structure-realistic and memory-efficient reproduction of past data, and then incorporate the replayed past data to jointly optimize the model with current data to alleviate catastrophic forgetting. During optimization, we additionally perform domain-sensitive feature whitening to suppress the model's dependency on features that are sensitive to domain changes (e.g., domain-distinctive style features) to assist domain-invariant feature exploration and gradually improve the generalization performance of the network. We have extensively evaluated our approach with the M&Ms Dataset in single-domain and compound-domain incremental learning settings. Our approach outperforms other comparison methods with less forgetting on past domains and better generalization on current domains and unseen domains.
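Feature whitening of the kind mentioned above removes correlations between feature channels so the network cannot lean on domain-distinctive second-order statistics. The sketch below is a generic ZCA-style whitening of a feature matrix, standing in for the paper's domain-sensitive variant; shapes and the epsilon are illustrative assumptions:

```python
import numpy as np

def whiten_features(feats, eps=1e-5):
    """ZCA-style whitening: center an (N, C) feature matrix and transform it
    so the channel covariance becomes (approximately) the identity.
    `eps` regularizes the covariance before the inverse square root."""
    feats = np.asarray(feats, dtype=float)
    mu = feats.mean(axis=0, keepdims=True)
    centered = feats - mu
    cov = centered.T @ centered / max(len(feats) - 1, 1)
    # inverse square root of the (regularized) covariance via eigendecomposition
    vals, vecs = np.linalg.eigh(cov + eps * np.eye(cov.shape[1]))
    w = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T
    return centered @ w
```

After this transform the whitened channels are decorrelated with unit variance, which is the property a domain-sensitive variant would selectively enforce on style-carrying features.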


Subjects
Heart , Image Processing, Computer-Assisted , Heart/diagnostic imaging