Results 1 - 15 of 15
1.
Med Image Anal ; 94: 103111, 2024 May.
Article in English | MEDLINE | ID: mdl-38401271

ABSTRACT

Semi-supervised medical image segmentation has garnered significant interest as a way to alleviate the burden of dense data annotation. Substantial advancements have been achieved by integrating consistency-regularization and pseudo-labeling techniques. The quality of the pseudo-labels is crucial in this regard: unreliable pseudo-labels introduce noise and lead the model to converge to suboptimal solutions. To address this issue, we propose learning from reliable pseudo-labels. In this paper, we tackle two critical questions in learning from reliable pseudo-labels: which pseudo-labels are reliable, and how reliable are they? Specifically, we conduct a comparative analysis of two subnetworks to address both questions. First, we compare the prediction confidence of the two subnetworks; a higher confidence score indicates a more reliable pseudo-label. Second, we use intra-class similarity to assess how reliable the pseudo-labels are: the greater the intra-class similarity of the predicted classes, the more reliable the pseudo-label. Each subnetwork selectively incorporates knowledge imparted by the other subnetwork, contingent on the reliability of the pseudo-labels. By reducing the noise introduced by unreliable pseudo-labels, we improve segmentation performance. To demonstrate the superiority of our approach, we conducted an extensive set of experiments on three datasets: Left Atrium, Pancreas-CT and BraTS-2019. The experimental results demonstrate that our approach achieves state-of-the-art performance. Code is available at: https://github.com/Jiawei0o0/mutual-learning-with-reliable-pseudo-labels.


Subjects
Heart Atria , Supervised Machine Learning , Humans , Reproducibility of Results , Tomography, X-Ray Computed , Image Processing, Computer-Assisted
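
The confidence comparison described in the abstract of record 1 can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked in the abstract). The function name, the list-of-lists probability format, and the tie-breaking rule are assumptions made for illustration, and the intra-class similarity weighting is left out.

```python
def select_reliable_pseudo_labels(probs_a, probs_b):
    """Per voxel, keep the class predicted by the more confident subnetwork.

    probs_a, probs_b: per-voxel class probabilities from subnetworks A and B
    (hypothetical input format: one list of probabilities per voxel).
    Returns the chosen pseudo-labels and which subnetwork supplied each one.
    """
    labels, teachers = [], []
    for pa, pb in zip(probs_a, probs_b):
        conf_a, conf_b = max(pa), max(pb)
        # Assumption: ties go to subnetwork A.
        if conf_a >= conf_b:
            labels.append(pa.index(conf_a))
            teachers.append("A")
        else:
            labels.append(pb.index(conf_b))
            teachers.append("B")
    return labels, teachers


# Toy example: three voxels, two classes.
probs_a = [[0.9, 0.1], [0.4, 0.6], [0.55, 0.45]]
probs_b = [[0.7, 0.3], [0.2, 0.8], [0.30, 0.70]]
labels, teachers = select_reliable_pseudo_labels(probs_a, probs_b)
print(labels, teachers)
```

In this toy case subnetwork A is more confident only on the first voxel, so it supplies that pseudo-label and subnetwork B supplies the other two.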
2.
IEEE Trans Neural Netw Learn Syst ; 34(4): 1958-1971, 2023 Apr.
Article in English | MEDLINE | ID: mdl-34464275

ABSTRACT

Visible-infrared person reidentification (VI-ReID) is a challenging matching problem due to large modality variations between visible and infrared images. Existing approaches usually bridge the modality gap with only feature-level constraints, ignoring pixel-level variations. Some methods employ a generative adversarial network (GAN) to generate style-consistent images, but this destroys structural information and introduces a considerable level of noise. In this article, we explicitly consider these challenges and formulate a novel spectrum-aware feature augmentation network, named SFANet, for the cross-modality matching problem. Specifically, we propose employing grayscale-spectrum images to fully replace RGB images for feature learning. By learning with grayscale-spectrum images, our model can clearly reduce the modality discrepancy and detect inner structure relations across the different modalities, making it robust to color variations. At the feature level, we improve the conventional two-stream network by balancing the number of specific and sharable convolutional blocks, which preserves the spatial structure information of features. Additionally, a bidirectional tri-constrained top-push ranking loss (BTTR) is embedded in the proposed network to improve discriminability, which further boosts the matching accuracy. Meanwhile, we introduce an effective dual-linear with batch normalization identification (ID) embedding method to model the identity-specific information and help stabilize the magnitude of the BTTR loss. We conducted extensive experiments on the SYSU-MM01 and RegDB datasets to demonstrate that the proposed components are indispensable and that the framework achieves very competitive VI-ReID performance.

3.
IEEE Trans Neural Netw Learn Syst ; 34(9): 6081-6095, 2023 Sep.
Article in English | MEDLINE | ID: mdl-34928806

ABSTRACT

Class imbalance is a common issue in the machine learning and data mining community. A class-imbalanced distribution can make most classical classification algorithms neglect the minority class and favor the majority class. In this article, we propose a graph-based label enhancement method to solve the class-imbalance problem, which estimates the numerical label and trains the inductive model simultaneously. It gives a new perspective on class-imbalance learning based on the numerical label rather than the original logical label. We also present an iterative optimization algorithm and analyze its computational complexity and convergence. To demonstrate the superiority of the proposed method, several single-label and multilabel datasets are used in the experiments. The experimental results show that the proposed method achieves promising performance and outperforms some state-of-the-art single-label and multilabel class-imbalance learning methods.

4.
IEEE Trans Pattern Anal Mach Intell ; 45(4): 5218-5235, 2023 Apr.
Article in English | MEDLINE | ID: mdl-35969571

ABSTRACT

Recent studies show that deep person re-identification (re-ID) models are vulnerable to adversarial examples, so it is critical to improve the robustness of re-ID models against attacks. To achieve this goal, we explore the strengths and weaknesses of existing re-ID models by designing learning-based attacks and training robust models that defend against the learned attacks. The contributions of this paper are three-fold. First, we build a holistic attack-defense framework to study the relationship between attack and defense for person re-ID. Second, we introduce a combinatorial adversarial attack that is adaptive to unseen domains and unseen model types. It consists of distortions in pixel and color space (i.e., mimicking camera shifts). Third, we propose a novel virtual-guided meta-learning algorithm for our attack-defense system. We leverage a virtual dataset to conduct experiments under our meta-learning framework, which can explore cross-domain constraints to enhance the generalization of the attack and the robustness of the re-ID model. Comprehensive experiments on three large-scale re-ID benchmarks demonstrate that: 1) our combinatorial attack is effective and highly universal in cross-model and cross-dataset scenarios; 2) our meta-learning algorithm can be readily applied to different attack and defense approaches and yields consistent improvement; 3) the defense model trained with the learning-to-learn framework is robust to recent SOTA attacks that were not used during training.

5.
Article in English | MEDLINE | ID: mdl-36215379

ABSTRACT

Information-theoretic methods have attracted great attention in recent years and have achieved promising results for multilabel feature selection (MLFS). Nevertheless, most existing methods rely on a heuristic grid search for important features, and they may also fail to fully utilize labeling information. Thus, they are likely to deliver a suboptimal result with a heavy computational burden. In this article, we propose a general optimization framework, global relevance and redundancy optimization (GRRO), to solve the learning problem. The main technical contribution of GRRO is a formulation for MLFS that takes feature relevance, label relevance (i.e., label correlation), and feature redundancy into account, which avoids repetitive entropy calculations and obtains a globally optimal solution efficiently. To further improve efficiency, we extend GRRO to filter out inessential labels and features, thus facilitating fast MLFS. We call the extension GRROfast, in which the key insights are twofold: 1) promising labels and related relevant features are investigated to reduce ineffective calculations in terms of features and even labels, and 2) the framework of GRRO is reconstructed to generate the optimal result with an ensemble. Moreover, our proposed algorithms have an excellent mechanism for exploiting the inherent properties of multilabel data; specifically, we provide a formulation to enhance the proposal with label-specific features. Extensive experiments clearly reveal the effectiveness and efficiency of our proposed algorithms.

6.
IEEE Trans Image Process ; 31: 3780-3792, 2022.
Article in English | MEDLINE | ID: mdl-35604972

ABSTRACT

In this paper, we study the cross-view geo-localization problem of matching images from different viewpoints. The key motivation underpinning this task is to learn a discriminative viewpoint-invariant visual representation. Inspired by the human visual system for mining local patterns, we propose a new framework called RK-Net to jointly learn the discriminative Representation and detect salient Keypoints with a single Network. Specifically, we introduce a Unit Subtraction Attention Module (USAM) that can automatically discover representative keypoints from feature maps and draw attention to the salient regions. USAM contains very few learnable parameters but yields significant performance improvement and can be easily plugged into different networks. We demonstrate through extensive experiments that (1) by incorporating USAM, RK-Net facilitates end-to-end joint learning without requiring extra annotations. Representation learning and keypoint detection are two highly related tasks: representation learning aids keypoint detection, and keypoint detection, in turn, strengthens the model against large appearance changes caused by viewpoint variations. (2) USAM is easy to implement and can be integrated with existing methods, further improving the state-of-the-art performance. We achieve competitive geo-localization accuracy on three challenging datasets, i.e., University-1652, CVUSA and CVACT. Our code is available at https://github.com/AggMan96/RK-Net.

7.
IEEE Trans Cybern ; 52(5): 3841-3854, 2022 May.
Article in English | MEDLINE | ID: mdl-32877346

ABSTRACT

In multilabel learning, each training example is represented by a single instance that is relevant to multiple class labels simultaneously. Generally, all relevant labels are assumed to be available for labeled data. However, instances with a full label set are difficult to obtain in real-world applications. This leads to the weakly multilabel learning problem: the relevant labels of the training data are only partially known, many relevant labels are missing, and a large share of the training data may even be associated with an empty label set. To address the problem, we propose a new multilabel method to learn from weakly labeled data. Specifically, an optimization framework is constructed based on the manifold regularized sparse model, in which the correlations among labels and the feature structure are used to model global and local label correlations, thereby achieving discriminative feature analysis for mapping training data to the ground-truth label space. Moreover, the proposed method has an excellent mechanism for semisupervised multilabel learning that exploits the predicted label sets of unlabeled training data. Experiments on various real-world tasks reveal that the proposed method outperforms some state-of-the-art methods.


Subjects
Supervised Machine Learning
8.
Med Image Anal ; 71: 102040, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33789178

ABSTRACT

Carotid artery lumen diameter (CALD) and carotid artery intima-media thickness (CIMT) are essential factors for estimating the risk of many cardiovascular diseases. Measuring them automatically in ultrasound (US) images is an efficient diagnostic aid. Despite recent advances, existing methods still suffer from low measuring accuracy and poor prediction stability, mainly because they (1) ignore the anatomical prior and are prone to give anatomically inaccurate estimates; (2) require carefully designed post-processing, which may introduce more estimation errors; (3) rely on massive pixel-wise annotations during training; and (4) cannot estimate the uncertainty of their predictions. In this study, we propose the Anatomical Prior-guided ReInforcement Learning model (APRIL), which formulates the measurement of CALD and CIMT as a reinforcement learning (RL) problem and dynamically incorporates the anatomical prior (AP) into the system through a novel reward. With the guidance of the AP, the designed keypoints in APRIL avoid various anatomically impossible mislocations and accurately measure CALD and CIMT based on their corresponding locations. Moreover, this formulation significantly reduces human annotation effort by using only several keypoints, and it can help eliminate extra post-processing steps. Further, we introduce an uncertainty module for measuring the prediction variance, which guides the adaptive rectification of estimates for frames with considerable uncertainty. Experiments on a challenging carotid US dataset show that APRIL achieves an MAE (in pixel/mm) of 3.02±2.23 / 0.18±0.13 for CALD and 0.96±0.70 / 0.06±0.04 for CIMT, significantly surpassing popular approaches that use more annotations.


Subjects
Cardiovascular Diseases , Carotid Intima-Media Thickness , Carotid Arteries/diagnostic imaging , Humans , Tumor Necrosis Factor Ligand Superfamily Member 13 , Ultrasonography
9.
IEEE Trans Pattern Anal Mach Intell ; 43(8): 2723-2738, 2021 Aug.
Article in English | MEDLINE | ID: mdl-32142418

ABSTRACT

This work considers the problem of unsupervised domain adaptation in person re-identification (re-ID), which aims to transfer knowledge from the source domain to the target domain. Existing methods primarily aim to reduce the inter-domain shift between the domains, but they usually overlook the relations among target samples. This paper investigates the intra-domain variations of the target domain and proposes a novel adaptation framework with respect to three types of underlying invariance, i.e., Exemplar-Invariance, Camera-Invariance, and Neighborhood-Invariance. Specifically, an exemplar memory is introduced to store features of samples, which can effectively and efficiently enforce the invariance constraints over the global dataset. We further present the Graph-based Positive Prediction (GPP) method to explore reliable neighbors for the target domain, which is built upon the memory and is trained on the source samples. Experiments demonstrate that 1) the three invariance properties are complementary and indispensable for effective domain adaptation, 2) the memory plays a key role in implementing invariance learning and improves performance with limited extra computation cost, 3) GPP facilitates invariance learning and thus significantly improves the results, and 4) our approach produces new state-of-the-art adaptation accuracy on three large-scale re-ID benchmarks.

10.
Article in English | MEDLINE | ID: mdl-31095493

ABSTRACT

Retinal vessel segmentation is a critical procedure for the accurate visualization, diagnosis, early treatment, and surgery planning of ocular diseases. Recent deep learning-based approaches have achieved impressive performance in retinal vessel segmentation. However, they usually apply global image pre-processing and take whole retinal images as input during network training, which has two drawbacks for accurate retinal vessel segmentation. First, these methods do not exploit local patch information. Second, they overlook the geometric constraint that the retina occupies only a specific area within the whole image or the extracted patch. As a consequence, these global-based methods struggle with details, such as recognizing small thin vessels and discriminating the optic disk. To address these drawbacks, this study proposes a Global and Local enhanced residual U-nEt (GLUE) for accurate retinal vessel segmentation, which benefits from both the globally and locally enhanced information inside the retinal region. Experimental results on two benchmark datasets demonstrate the effectiveness of the proposed method, which consistently improves the segmentation accuracy over a conventional U-Net and achieves competitive performance compared to the state of the art.


Subjects
Deep Learning , Diagnostic Techniques, Ophthalmological , Image Interpretation, Computer-Assisted/methods , Retinal Vessels/diagnostic imaging , Algorithms , Databases, Factual , Humans , Retinal Diseases/diagnostic imaging
11.
Article in English | MEDLINE | ID: mdl-33294001

ABSTRACT

BACKGROUND: Metabolic syndrome (MS) is a complex multisystem disease. Traditional Chinese medicine (TCM) is effective in preventing and treating MS. Syndrome differentiation, which is composed of location and/or nature syndrome elements, is the basis of TCM treatment. At present, objective and comprehensive syndrome differentiation in MS still faces some problems. This study mainly proposes a solution to two of them. First, TCM syndromes are concurrent, that is, multiple TCM syndromes may develop in the same patient. Second, there is a lack of holistic exploration of the relationship between microscopic indexes and TCM syndromes. Multilabel learning (MLL), a machine learning method, can be used to address both problems and to build a microcosmic syndrome differentiation model, which can provide a foundation for establishing a subsequent multidimensional syndrome differentiation model in MS. METHODS: A standardized scale of TCM four-diagnostic information for MS was designed and used to obtain the TCM diagnosis results. The microcosmic syndrome differentiation model was constructed from 39 physicochemical indexes using an MLL technique called ML-kNN. First, the multilabel learning method was compared with three commonly used single-label learning algorithms. Then, the ML-kNN results based on physicochemical indexes were compared with those based on TCM information. Finally, the influence of the parameter k on the diagnostic model was investigated and the best k value was chosen for TCM diagnosis. RESULTS: A total of 698 cases were collected for modeling the microcosmic diagnosis of MS. The overall performance of the ML-kNN model was clearly better than that of the others, with an average diagnostic precision of 71.4%. The results from ML-kNN based on physicochemical indexes were similar to the results based on TCM information. In addition, the k value had little influence on the prediction results from ML-kNN. CONCLUSIONS: In the present study, the microcosmic syndrome differentiation model of MS built with MLL techniques was good at predicting syndrome elements and could be used to solve diagnosis problems involving multiple labels. Moreover, the results suggest a complex correlation between TCM syndrome elements and physicochemical indexes, which is worth future investigation to promote the development of objective syndrome differentiation in MS.

12.
J Biomed Inform ; 106: 103435, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32360988

ABSTRACT

The task of electronic medical record named entity recognition (NER) is to automatically identify all kinds of named entities in medical record text. Chinese clinical NER remains a major challenge. One of the main reasons is that errors in Chinese word segmentation propagate to downstream tasks. In addition, existing methods use only general-domain information and do not consider knowledge from the medical domain. To address these issues, we propose a dynamic embedding method based on dynamic attention, which combines character and word features in the embedding layer. Domain knowledge is provided by word vectors trained on a domain dataset. In addition, spatial attention is added to enable the model to obtain more, and more effective, context encoding information. Finally, we conduct extensive experiments to demonstrate the effectiveness of our proposed algorithm. Experiments on the CCKS2017 and Common datasets show that the proposed method outperforms the baseline.


Subjects
Electronic Health Records , Text Messaging , Algorithms , Attention , China
13.
IEEE Trans Image Process ; 28(3): 1176-1190, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30296233

ABSTRACT

Person re-identification (re-ID) is a cross-camera retrieval task that suffers from image style variations caused by different cameras. Existing methods implicitly address this problem by learning a camera-invariant descriptor subspace. In this paper, we explicitly consider this challenge by introducing camera style (CamStyle). CamStyle can serve as a data augmentation approach that reduces the risk of deep network overfitting and smooths the CamStyle disparities. Specifically, with a style transfer model, labeled training images can be style-transferred to each camera and, along with the original training samples, form the augmented training set. This method, while increasing data diversity against overfitting, also incurs a considerable level of noise. To alleviate the impact of this noise, label smooth regularization (LSR) is adopted. The vanilla version of our method (without LSR) performs reasonably well on few-camera systems, in which overfitting often occurs. With LSR, we demonstrate consistent improvement in all systems regardless of the extent of overfitting. We also report competitive accuracy compared with the state of the art on Market-1501 and DukeMTMC-reID. Importantly, CamStyle can be applied to the challenging problems of one-view learning and unsupervised domain adaptation (UDA) in person re-ID, both of which have critical research and application significance. The former has labeled data in only one camera view, and the latter has labeled data in only the source domain. Experimental results show that CamStyle significantly improves the performance of the baseline in both problems. In particular, for UDA, CamStyle achieves state-of-the-art accuracy based on a baseline deep re-ID model on Market-1501 and DukeMTMC-reID. Our code is available at: https://github.com/zhunzhong07/CamStyle.
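
The label smooth regularization (LSR) mentioned in the abstract of record 13 can be sketched as follows. This is the standard label-smoothing formulation, in which a fraction epsilon of the target mass is spread uniformly over all classes. The function names and the value epsilon = 0.1 are assumptions chosen for illustration, and the abstract does not say how LSR is applied to real versus style-transferred images.

```python
import math


def smoothed_targets(true_class, num_classes, epsilon):
    """Standard label smoothing: (1 - epsilon) on the true class plus
    epsilon / num_classes spread uniformly over every class."""
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == true_class else uniform
        for k in range(num_classes)
    ]


def cross_entropy(targets, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(targets, probs) if t > 0.0)


# Toy prediction over three identities; identity 0 is the labeled one.
probs = [0.7, 0.2, 0.1]
hard = smoothed_targets(0, 3, 0.0)  # epsilon = 0 recovers the one-hot target
soft = smoothed_targets(0, 3, 0.1)  # epsilon = 0.1 is an illustrative value
print(cross_entropy(hard, probs), cross_entropy(soft, probs))
```

With the smoothed target, a less-than-certain prediction is penalized on every class rather than only on the labeled one, which is the usual motivation for using it with noisy samples.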

14.
Article in English | MEDLINE | ID: mdl-29994117

ABSTRACT

The ability to train on a large dataset of labeled samples is critical to the success of deep learning in many domains. In this paper, we focus on motor vehicle classification and localization from a single video frame and introduce the "MIOvision Traffic Camera Dataset" (MIO-TCD) in this context. MIO-TCD is the largest dataset for motorized traffic analysis to date. It includes 11 traffic object classes such as cars, trucks, buses, motorcycles, bicycles, and pedestrians. It contains 786,702 annotated images acquired at different times of the day and different periods of the year by hundreds of traffic surveillance cameras deployed across Canada and the United States. The dataset consists of two parts: a "localization dataset", containing 137,743 full video frames with bounding boxes around traffic objects, and a "classification dataset", containing 648,959 crops of traffic objects from the 11 classes. We also report results from the 2017 CVPR MIO-TCD Challenge, which leveraged this dataset, and compare them with results for state-of-the-art deep learning architectures. These results demonstrate the viability of deep learning methods for vehicle localization and classification from a single video frame in real-life traffic scenarios. The top-performing methods achieve both accuracy and Kappa score above 96% on the classification dataset and mean-average precision of 77% on the localization dataset. We also identify scenarios in which state-of-the-art methods still fail and suggest avenues to address these challenges. Both the dataset and detailed results are publicly available on-line [1].
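
The abstract of record 14 reports both accuracy and a Kappa score for the classification dataset. Assuming this refers to Cohen's kappa (the abstract does not spell it out), the two metrics can be computed as in the sketch below. The function name and the toy labels are made up for illustration and are not taken from MIO-TCD.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected by chance from the label marginals.
    Undefined (division by zero) when p_e == 1.
    """
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Toy example with two classes and one misclassified bus.
y_true = ["car", "car", "bus", "bus"]
y_pred = ["car", "car", "bus", "car"]
acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(acc, kappa)
```

Kappa discounts the agreement a classifier would reach by chance given the class frequencies, which is why it is often reported alongside accuracy on datasets with unevenly sized classes.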

15.
IEEE Trans Biomed Eng ; 59(12): 3348-56, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22929363

ABSTRACT

This paper presents a technique for fully automatic detection of myocardial infarction from patients' ECGs. Due to the large number of heartbeats constituting an ECG and the high cost of having all the heartbeats manually labeled, supervised learning techniques have achieved limited success in ECG classification. In this paper, we first discuss the rationale for applying multiple instance learning (MIL) to automated ECG classification and then propose a new MIL strategy called latent topic MIL, in which ECGs are mapped into a topic space defined by a number of topics identified over all the unlabeled training heartbeats, and a support vector machine is applied directly to the ECG-level topic vectors. Our experimental results on real ECG datasets from the PTB diagnostic database demonstrate that, compared with existing MIL and supervised learning algorithms, the proposed algorithm is able to automatically detect ECGs with myocardial ischemia without labeling any heartbeats. Moreover, it improves classification quality in terms of both sensitivity and specificity.


Subjects
Electrocardiography/methods , Myocardial Infarction/diagnosis , Signal Processing, Computer-Assisted , Support Vector Machine , Adolescent , Adult , Aged , Aged, 80 and over , Female , Heart Rate/physiology , Humans , Male , Middle Aged , Myocardial Infarction/physiopathology , ROC Curve