Results 1 - 12 of 12
1.
IEEE Trans Pattern Anal Mach Intell ; 46(2): 1231-1242, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37910406

ABSTRACT

Scene graph generation is a structured prediction task aiming to explicitly model objects and their relationships via constructing a visually-grounded scene graph for an input image. Currently, the message passing neural network based mean field variational Bayesian methodology is the ubiquitous solution for such a task, in which the variational inference objective is often assumed to be the classical evidence lower bound. However, the variational approximation inferred from such a loose objective generally underestimates the underlying posterior, which often leads to inferior generation performance. In this paper, we propose a novel importance weighted structure learning method aiming to approximate the underlying log-partition function with a tighter importance weighted lower bound, which is computed from multiple samples drawn from a reparameterizable Gumbel-Softmax sampler. A generic entropic mirror descent algorithm is applied to solve the resulting constrained variational inference task. The proposed method achieves state-of-the-art performance on various popular scene graph generation benchmarks.
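
The relationship between the classical evidence lower bound and an importance weighted bound can be illustrated with a small sketch. It is not the paper's implementation: the function names and the toy log importance weights are assumptions made for illustration, and the Gumbel-Softmax sampler that would produce the samples is not shown.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def elbo_estimate(log_weights):
    """Average of the log importance weights (an ELBO-style estimate)."""
    return sum(log_weights) / len(log_weights)


def importance_weighted_bound(log_weights):
    """Log of the average importance weight over K samples."""
    return log_sum_exp(log_weights) - math.log(len(log_weights))


# Hypothetical log importance weights log p(x, z_k) - log q(z_k) for K = 4 samples.
log_w = [-2.3, -1.1, -3.0, -0.7]
elbo = elbo_estimate(log_w)
iw_bound = importance_weighted_bound(log_w)
print(f"ELBO-style estimate: {elbo:.4f}, importance weighted bound: {iw_bound:.4f}")
```

By Jensen's inequality the second value is never smaller than the first for the same samples, which is the sense in which the importance weighted bound is tighter.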

2.
IEEE Trans Pattern Anal Mach Intell ; 45(10): 11588-11599, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37276097

ABSTRACT

As a structured prediction task, scene graph generation aims to build a visually-grounded scene graph to explicitly model objects and their relationships in an input image. Currently, the mean field variational Bayesian framework is the de facto methodology used by the existing methods, in which the unconstrained inference step is often implemented by a message passing neural network. However, such a formulation fails to explore other inference strategies, and largely ignores the more general constrained optimization models. In this paper, we present a constrained structure learning method, for which an explicit constrained variational inference objective is proposed. Instead of applying the ubiquitous message-passing strategy, a generic constrained optimization method, entropic mirror descent, is utilized to solve the constrained variational inference step. We validate the proposed generic model on various popular scene graph generation benchmarks and show that it outperforms the state-of-the-art methods.
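
Entropic mirror descent on the probability simplex reduces to a multiplicative (exponentiated gradient) update followed by renormalisation. The sketch below shows that generic update on a toy linear objective; the function names, step size, and cost vector are assumptions for illustration and do not reproduce the paper's inference objective.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def entropic_mirror_descent(grad_fn, dim, step_size=0.5, num_steps=200):
    """Minimise a function over the probability simplex.

    grad_fn maps a point x (list of floats summing to 1) to its gradient.
    The iterate is kept in log space so the update stays numerically stable.
    """
    log_x = [-math.log(dim)] * dim          # start from the uniform distribution
    x = [1.0 / dim] * dim
    for _ in range(num_steps):
        g = grad_fn(x)
        # Multiplicative update: x_i <- x_i * exp(-step_size * g_i) ...
        log_x = [lx - step_size * gi for lx, gi in zip(log_x, g)]
        # ... followed by renormalisation back onto the simplex.
        norm = log_sum_exp(log_x)
        log_x = [lx - norm for lx in log_x]
        x = [math.exp(lx) for lx in log_x]
    return x


# Toy objective f(x) = <c, x>, whose gradient is the constant cost vector c.
cost = [0.9, 0.2, 0.5]
solution = entropic_mirror_descent(lambda x: cost, dim=3)
print([round(v, 4) for v in solution])
```

For this linear objective the iterate concentrates on the coordinate with the smallest cost while remaining a valid probability vector at every step, so no separate projection is needed.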

3.
Commun Biol ; 6(1): 489, 2023 05 05.
Article in English | MEDLINE | ID: mdl-37147530

ABSTRACT

Unravelling protein distributions within individual cells is vital to understanding their function and state and indispensable to developing new treatments. Here we present the Hybrid subCellular Protein Localiser (HCPL), which learns from weakly labelled data to robustly localise single-cell subcellular protein patterns. It comprises innovative DNN architectures exploiting wavelet filters and learnt parametric activations that successfully tackle drastic cell variability. HCPL features correlation-based ensembling of novel architectures that boosts performance and aids generalisation. Large-scale data annotation is made feasible by our AI-trains-AI approach, which determines the visual integrity of cells and emphasises reliable labels for efficient training. In the Human Protein Atlas context, we demonstrate that HCPL is best performing in the single-cell classification of protein localisation patterns. To better understand the inner workings of HCPL and assess its biological relevance, we analyse the contributions of each system component and dissect the emergent features from which the localisation predictions are derived.

4.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 10161-10172, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37022845

ABSTRACT

Scene graph generation aims to interpret an input image by explicitly modelling the objects contained therein and their relationships. In existing methods, the problem is predominantly solved by message passing neural network models. Unfortunately, in such models, the variational distributions generally ignore the structural dependencies among the output variables, and most of the scoring functions only consider pairwise dependencies. This can lead to inconsistent interpretations. In this article, we propose a novel neural belief propagation method seeking to replace the traditional mean field approximation with a structural Bethe approximation. To find a better bias-variance trade-off, higher-order dependencies among three or more output variables are also incorporated into the relevant scoring function. The proposed method achieves state-of-the-art performance on various popular scene graph generation benchmarks.


Subjects
Algorithms, Benchmarking, Computer Neural Networks
5.
Vet Sci ; 10(1)2023 Jan 08.
Article in English | MEDLINE | ID: mdl-36669046

ABSTRACT

The definitive diagnosis of canine soft-tissue sarcomas (STSs) is based on histological assessment of formalin-fixed tissues. Assessment of parameters, such as degree of differentiation, necrosis score and mitotic score, gives rise to a final tumour grade, which is important in determining prognosis and subsequent treatment modalities. However, grading discrepancies are reported to occur in human and canine STSs, which can result in complications regarding treatment plans. The introduction of digital pathology has the potential to help improve STS grading via automated determination of the presence and extent of necrosis. The detected necrotic regions can be factored into the grading scheme or excluded before analysing the remaining tissue. Here we describe a method to detect tumour necrosis in histopathological whole-slide images (WSIs) of STSs using machine learning. Annotated areas of necrosis were extracted from WSIs and the patches containing necrotic tissue fed into a pre-trained DenseNet161 convolutional neural network (CNN) for training, testing and validation. The proposed CNN architecture reported favourable results, with an overall validation accuracy of 92.7% for necrosis detection, which represents the number of correctly classified data instances over the total number of data instances. The proposed method, when rigorously validated, represents a promising tool to assist pathologists in evaluating necrosis in canine STS tumours, by increasing efficiency and accuracy and reducing inter-rater variation.
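
The accuracy definition stated in the abstract (correctly classified instances over total instances) can be written out directly. The labels and predictions below are made-up patch-level values for illustration only, not data from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one instance is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical patch labels: 1 = necrotic, 0 = non-necrotic.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]
patch_accuracy = accuracy(labels, predictions)
print(patch_accuracy)  # 6 of 8 patches match, so 0.75
```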

6.
Article in English | MEDLINE | ID: mdl-36197864

ABSTRACT

The process of aggregation is ubiquitous in almost all deep network models. It functions as an important mechanism for consolidating deep features into a more compact representation while increasing the robustness to overfitting and providing spatial invariance in deep nets. In particular, the proximity of global aggregation layers to the output layers of DNNs means that aggregated features directly influence the performance of a deep net. A better understanding of this relationship can be obtained using information theoretic methods. However, this requires knowledge of the distributions of the activations of aggregation layers. To achieve this, we propose a novel mathematical formulation for analytically modeling the probability distributions of output values of layers involved with deep feature aggregation. An important outcome is our ability to analytically predict the Kullback-Leibler (KL)-divergence of output nodes in a DNN. We also experimentally verify our theoretical predictions against empirical observations across a broad range of different classification tasks and datasets.
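
For reference, the KL-divergence between two discrete distributions can be computed as in the sketch below. This is only the standard definition applied to toy histograms; it does not reproduce the paper's analytical model of aggregation-layer outputs, and the function name is an assumption.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue            # terms with p_i = 0 contribute nothing
        if qi == 0.0:
            return math.inf     # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Toy distributions over two outcomes.
p = [0.5, 0.5]
q = [0.9, 0.1]
divergence = kl_divergence(p, q)
print(round(divergence, 4))
```

The divergence is zero only when the two distributions coincide, and it is not symmetric in its arguments.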

7.
Sci Rep ; 12(1): 10634, 2022 06 23.
Article in English | MEDLINE | ID: mdl-35739267

ABSTRACT

Necrosis seen in histopathology Whole Slide Images is a major criterion that contributes towards scoring tumour grade, which then determines treatment options. However, conventional manual assessment suffers from limited inter-operator reproducibility, which impacts grading precision. To address this, automatic necrosis detection using AI may be used to assess necrosis for final scoring that contributes towards the final clinical grade. Using deep learning AI, we describe a novel approach for automating necrosis detection in Whole Slide Images, tested on a canine Soft Tissue Sarcoma (cSTS) data set consisting of canine Perivascular Wall Tumours (cPWTs). A patch-based deep learning approach was developed where different variations of training a DenseNet-161 Convolutional Neural Network architecture were investigated as well as a stacking ensemble. An optimised DenseNet-161 with post-processing produced a hold-out test F1-score of 0.708, demonstrating state-of-the-art performance. This represents a novel first-time automated necrosis detection method in the cSTS domain, as well as specifically in detecting necrosis in cPWTs, demonstrating a significant step forward in reproducible and reliable necrosis assessment for improving the precision of tumour grading.


Subjects
Deep Learning, Connective and Soft Tissue Neoplasms, Animals, Dogs, Necrosis, Computer Neural Networks, Reproducibility of Results
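
The F1-score reported in entry 7 is the harmonic mean of precision and recall for the positive class. A minimal sketch of that computation follows; the labels are invented for illustration and are unrelated to the study's hold-out set.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0              # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical patch labels: 1 = necrotic, 0 = non-necrotic.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
predictions = [1, 0, 1, 0, 1, 0, 1, 0]
score = f1_score(labels, predictions)
print(score)  # 3 true positives, 1 false positive, 1 false negative
```

Unlike plain accuracy, this measure ignores true negatives, which makes it more informative when necrotic patches are a small share of the data.
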
8.
Phys Med Biol ; 66(10)2021 05 14.
Article in English | MEDLINE | ID: mdl-33765674

ABSTRACT

A Machine Learning approach to the problem of calculating the proton paths inside a scanned object in proton Computed Tomography is presented. The method is developed in order to mitigate the loss in both spatial resolution and quantitative integrity of the reconstructed images caused by multiple Coulomb scattering of protons traversing the matter. Two Machine Learning models were used: a forward neural network (NN) and the XGBoost method. A heuristic approach, based on track averaging, was also implemented in order to evaluate the accuracy limits on track calculation imposed by the statistical nature of the scattering. Synthetic data from anthropomorphic voxelized phantoms, generated by the Monte Carlo (MC) Geant4 code, were utilized to train the models and evaluate their accuracy, in comparison to a widely used analytical method that is based on likelihood maximization and the Fermi-Eyges scattering model. Both the NN and XGBoost models were found to perform very close to or at the accuracy limit, further improving the accuracy of the analytical method (by 12% in the typical case of 200 MeV protons on a 20 cm water object), especially for protons scattered at large angles. Inclusion of the material information along the path in terms of radiation length did not show improvement in accuracy for the phantoms simulated in the study. A NN was also constructed to predict the error in path calculation, thus enabling a criterion to filter out proton events that may have a negative effect on the quality of the reconstructed image. By parametrizing a large set of synthetic data, the Machine Learning models proved capable of bringing, in an indirect and time-efficient way, the accuracy of the MC method into the problem of proton tracking.


Subjects
Algorithms, Protons, Machine Learning, Monte Carlo Method, Imaging Phantoms, X-Ray Computed Tomography
9.
IEEE Trans Pattern Anal Mach Intell ; 43(4): 1404-1422, 2021 04.
Article in English | MEDLINE | ID: mdl-31675316

ABSTRACT

Visual semantic information comprises two important parts: the meaning of each visual semantic unit and the coherent visual semantic relation conveyed by these visual semantic units. Essentially, the former is a visual perception task while the latter corresponds to visual context reasoning. Remarkable advances in visual perception have been achieved due to the success of deep learning. In contrast, visual semantic information pursuit, a visual scene semantic interpretation task combining visual perception and visual context reasoning, is still in its early stage. It is the core task of many different computer vision applications, such as object detection, visual semantic segmentation, visual relationship detection, or scene graph generation. Since it helps to enhance the accuracy and the consistency of the resulting interpretation, visual context reasoning is often incorporated with visual perception in current deep end-to-end visual semantic information pursuit methods. Surprisingly, a comprehensive review of this exciting area is still lacking. In this survey, we present a unified theoretical paradigm for all these methods, followed by an overview of the major developments and the future trends in each potential direction. The common benchmark datasets, the evaluation metrics and the comparisons of the corresponding methods are also introduced.

10.
Article in English | MEDLINE | ID: mdl-31135362

ABSTRACT

This paper addresses the problem of very large-scale image retrieval, focusing on improving its accuracy and robustness. We target enhanced robustness of search to factors such as variations in illumination, object appearance and scale, partial occlusions, and cluttered backgrounds, which is particularly important when search is performed across very large datasets with significant variability. We propose a novel CNN-based global descriptor, called REMAP, which learns and aggregates a hierarchy of deep features from multiple CNN layers, and is trained end-to-end with a triplet loss. REMAP explicitly learns discriminative features which are mutually-supportive and complementary at various semantic levels of visual abstraction. These dense local features are max-pooled spatially at each layer, within multi-scale overlapping regions, before aggregation into a single image-level descriptor. To identify the semantically useful regions and layers for retrieval, we propose to measure the information gain of each region and layer using KL-divergence. Our system effectively learns during training how useful various regions and layers are and weights them accordingly. We show that such relative entropy-guided aggregation outperforms classical CNN-based aggregation controlled by SGD. The entire framework is trained in an end-to-end fashion, outperforming the latest state-of-the-art results. On image retrieval datasets Holidays, Oxford and MPEG, the REMAP descriptor achieves mAP of 95.5%, 91.5% and 80.1% respectively, outperforming any results published to date. REMAP also formed the core of the winning submission to the Google Landmark Retrieval Challenge on Kaggle.
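
The triplet loss mentioned in the abstract can be sketched in its common margin-based form. The abstract does not state the distance, margin, or mining strategy used for REMAP, so the Euclidean distance and the margin value here are assumptions chosen only to illustrate the idea.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D descriptors (hypothetical values).
anchor = [0.0, 0.0]
matching = [0.0, 1.0]       # distance 1 from the anchor
non_matching = [3.0, 4.0]   # distance 5 from the anchor

satisfied = triplet_loss(anchor, matching, non_matching)
violated = triplet_loss(anchor, non_matching, matching)
print(satisfied, violated)
```

The loss is zero once the matching descriptor is closer to the anchor than the non-matching one by at least the margin, and grows linearly when that ordering is violated.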

11.
IEEE Trans Pattern Anal Mach Intell ; 39(9): 1783-1796, 2017 09.
Article in English | MEDLINE | ID: mdl-28114059

ABSTRACT

Visual search and image retrieval underpin numerous applications; however, the task is still challenging, predominantly due to the variability of object appearance and the ever-increasing size of the databases, often exceeding billions of images. Prior art methods rely on aggregation of local scale-invariant descriptors, such as SIFT, via mechanisms including Bag of Visual Words (BoW), Vector of Locally Aggregated Descriptors (VLAD) and Fisher Vectors (FV). However, their performance is still short of what is required. This paper presents a novel method for deriving a compact and distinctive representation of image content called Robust Visual Descriptor with Whitening (RVD-W). It significantly advances the state of the art and delivers world-class performance. In our approach local descriptors are rank-assigned to multiple clusters. Residual vectors are then computed in each cluster, normalized using a direction-preserving normalization function and aggregated based on the neighborhood rank. Importantly, the residual vectors are de-correlated and whitened in each cluster before aggregation, leading to a balanced energy distribution in each dimension and significantly improved performance. We also propose a new post-PCA normalization approach which improves separability between the matching and non-matching global descriptors. This new normalization benefits not only our RVD-W descriptor but also improves existing approaches based on FV and VLAD aggregation. Furthermore, we show that the aggregation framework developed using hand-crafted SIFT features also performs exceptionally well with Convolutional Neural Network (CNN) based features. The RVD-W pipeline outperforms state-of-the-art global descriptors on both the Holidays and Oxford datasets. On the large-scale datasets, Holidays1M and Oxford1M, the SIFT-based RVD-W representation obtains a mAP of 45.1 and 35.1 percent, while the CNN-based RVD-W achieves a mAP of 63.5 and 44.8 percent, all yielding superior performance to the state-of-the-art.
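
The first two steps described above, rank assignment of a local descriptor to several clusters and normalisation of the residuals, can be sketched as follows. This is a loose illustration rather than the RVD-W pipeline: the function names, the number of ranks, and the use of plain L2 normalisation as the direction-preserving function are assumptions, and the whitening, rank-based aggregation, and post-PCA normalisation steps are not shown.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length, keeping its direction."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def rank_assigned_residuals(descriptor, centroids, num_ranks=2):
    """Assign a descriptor to its num_ranks nearest centroids.

    Returns a list of (cluster_index, rank, normalised_residual) tuples,
    where rank 1 is the nearest centroid.
    """
    by_distance = sorted(
        (math.dist(descriptor, c), k) for k, c in enumerate(centroids)
    )
    assignments = []
    for rank, (_, k) in enumerate(by_distance[:num_ranks], start=1):
        residual = [d - c for d, c in zip(descriptor, centroids[k])]
        assignments.append((k, rank, l2_normalize(residual)))
    return assignments


# Toy 2-D codebook and descriptor (hypothetical values).
centroids = [[0.0, 0.0], [4.0, 0.0], [0.0, 10.0]]
descriptor = [1.0, 0.0]
assignments = rank_assigned_residuals(descriptor, centroids)
for cluster, rank, residual in assignments:
    print(cluster, rank, residual)
```

Each descriptor therefore contributes a unit-length residual to more than one cluster, with the rank available to weight its contribution in a later aggregation step.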

12.
Mach Vis Appl ; 28(3): 393-407, 2017.
Article in English | MEDLINE | ID: mdl-32103860

ABSTRACT

Images of the kidneys acquired using dynamic contrast-enhanced magnetic resonance renography (DCE-MRR) contain unwanted complex organ motion due to respiration. This gives rise to motion artefacts that hinder the clinical assessment of kidney function. However, due to the rapid change in contrast agent within the DCE-MR image sequence, commonly used intensity-based image registration techniques are likely to fail. While semi-automated approaches involving human experts are a possible alternative, they pose significant drawbacks including inter-observer variability, and the bottleneck introduced through manual inspection of the multiplicity of images produced during a DCE-MRR study. To address this issue, we present a novel automated, registration-free movement correction approach based on windowed and reconstruction variants of dynamic mode decomposition (WR-DMD). Our proposed method is validated on ten different healthy volunteers' kidney DCE-MRI data sets. The results, using block-matching-block evaluation on the image sequence produced by WR-DMD, show the elimination of 99% of mean motion magnitude when compared to the original data sets, thereby demonstrating the viability of automatic movement correction using WR-DMD.
